Neetcode 150

Intuition-first study bank — 150 problems.

18 categories · 150 solutions

Arrays & Hashing (9)

Contains Duplicate #217
Hash Set / Bloom Filter detection

Intuition

Think of this like a bouncer at a club checking the guest list. As each person arrives, you check if their name is already in your log. If it is - duplicate! If not, you add it to the log and let them in. The 'log' is our set, and checking it is essentially instant (O(1)) because hash tables work like a perfect filing system - no need to flip through pages, you jump directly to where the name would be.

Why This Pattern?

We only need to detect IF a duplicate exists, not count them or find which ones. A hash set gives O(1) average-case lookup - we can check each number as we go and know instantly if we've seen it before. This is the classic 'membership testing' pattern.

Solution

from typing import List

class Solution:
    def containsDuplicate(self, nums: List[int]) -> bool:
        seen = set()  # Our "guest list" - tracks what we've encountered
        
        for num in nums:
            if num in seen:  # O(1) lookup - is this name already on the list?
                return True   # Found a duplicate!
            seen.add(num)     # Add to our set for future checks
        
        return False  # Made it through - all unique

Complexity

Time: O(n)
Space: O(n)

We must look at each of the n elements at least once (worst case, the duplicate is at the very end). For each element, we do O(1) set operations. Can't get faster than O(n) because we need to 'touch' each input element. The space is O(n) in the worst case - if all elements are unique, we store all n of them in our set.
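
If the early exit isn't important, a common one-line variant (the helper name here is just illustrative) expresses the same idea by comparing sizes - still O(n) time and space, but it always builds the entire set:

def containsDuplicateOneLiner(nums):
    # Deduplicating shrinks the list exactly when some value appears more than once
    return len(set(nums)) < len(nums)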

Common Mistakes

Edge Cases

Connections

Encode and Decode Strings #271
Length-delimited serialization. Self-describing format where each item encodes its own size, making the data immune to whatever characters it contains.

Intuition

Think of this like packaging boxes for shipping. If you just wrote each box's contents and then stacked them, you couldn't tell where one box ends and another begins if the contents happened to match your separator. The smart solution: label each box with its exact length FIRST, then pack the contents. When unpacking, you read the label, know exactly how many items are in that box, and move to the next. The '#' delimiter is just the label separator — the actual string content can be anything because you're reading a fixed number of characters based on the length prefix, not scanning for a special character.

Why This Pattern?

This pattern exploits a key structural property: by embedding the length in the encoded format, we eliminate dependence on 'special characters' that can't appear in the payload. The format is compositional — each string is independently decodable, which is both efficient and robust.

Solution

from typing import List

class Codec:
    def encode(self, strs: List[str]) -> str:
        """Encodes a list of strings to a single string."""
        # For each string: prefix with length + delimiter, then append string
        # The delimiter '#' marks where the length ends
        encoded = []
        for s in strs:
            # Length tells decoder exactly how many chars to read
            encoded.append(str(len(s)) + '#' + s)
        return ''.join(encoded)
    
    def decode(self, s: str) -> List[str]:
        """Decodes a single string to a list of strings."""
        result = []
        i = 0
        while i < len(s):
            # Find delimiter: where length ends
            j = i
            while s[j] != '#':
                j += 1
            
            # Extract length (everything between i and j)
            length = int(s[i:j])
            
            # Skip delimiter, read exactly 'length' characters
            result.append(s[j + 1 : j + 1 + length])
            
            # Move pointer to start of next length-prefixed string
            i = j + 1 + length
        
        return result

Complexity

Time: O(N) where N is total characters in output string. Each character is visited exactly once during encode and once during decode — no backtracking or re-scanning.
Space: O(N) for the encoded string and output list. We must store all characters somewhere; you can't compress below the information-theoretic minimum.

The string length serves as a 'forward pointer' — we never revisit characters. It's like a linked list in memory: once you process a node, you move on. The bottleneck is physically reading/writing each character, which is unavoidable since every character in the input must appear in the output.
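
A quick round-trip check of the codec above, using a string that contains the '#' delimiter and an empty string (the sample values are just for illustration):

codec = Codec()
encoded = codec.encode(["leet", "co#de", ""])
print(encoded)                # 4#leet5#co#de0#
print(codec.decode(encoded))  # ['leet', 'co#de', '']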

Common Mistakes

Edge Cases

Connections

Group Anagrams #49
Keyed grouping / Canonical form hashing

Intuition

Think of each string as a recipe - if two recipes have the exact same ingredients in the exact same quantities, they're the same 'type' of recipe regardless of the order ingredients were added. The challenge is: how do we efficiently identify which recipes share ingredients? We need a 'fingerprint' for each string that all its anagrams share. Sorting the characters creates this fingerprint: 'eat', 'tea', and 'ate' all become 'aet' when sorted - they now have the same identity and can be grouped together. This is like organizing a deck of cards by suit and rank rather than by the order they were dealt.

Why This Pattern?

We're partitioning strings into equivalence classes where the equivalence relation is 'are anagrams of each other'. The structural property that makes hashing the natural choice is that anagrams are indistinguishable when you ignore order - they form a natural group. By transforming each string into a canonical representation (sorted characters), we create a perfect hash key where strings with the same key MUST be anagrams (the converse also holds).

Solution

def groupAnagrams(strs):
    # The key insight: anagrams become identical when sorted
    # "eat" -> "aet", "tea" -> "aet", "ate" -> "aet" all map to same bucket
    
    groups = {}
    
    for s in strs:
        # Canonical form: sorted characters
        # This is the "fingerprint" that all anagrams share
        key = tuple(sorted(s))
        
        if key not in groups:
            groups[key] = []
        groups[key].append(s)
    
    return list(groups.values())

Complexity

Time: O(n * k log k)
Space: O(n * k)

We process n strings, and for each one we sort k characters (where k is the average string length). Sorting dominates at O(k log k) per string. We must examine all k characters of every string - you can't know if two strings are anagrams without checking every character - so O(n * k) is a hard lower bound; replacing the sort with a fixed-size character count (sketched below) actually reaches that bound. The space comes from storing all n strings in the hash map, plus the sorted representation of each string as a key.
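
When the input is restricted to lowercase English letters (the usual constraint for this problem), a 26-slot count works as the canonical key and avoids the log factor. A minimal sketch, with an illustrative function name:

from collections import defaultdict

def groupAnagramsCounted(strs):
    groups = defaultdict(list)
    for s in strs:
        counts = [0] * 26                    # frequency fingerprint, assuming 'a'-'z'
        for ch in s:
            counts[ord(ch) - ord('a')] += 1
        groups[tuple(counts)].append(s)      # tuple keys are hashable; lists are not
    return list(groups.values())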

Common Mistakes

Edge Cases

Connections

Longest Consecutive Sequence #128
Set-based sequence detection with the 'start point' optimization. This leverages a hash set to achieve O(1) membership checking.

Intuition

Imagine you're looking at a bunch of numbered blocks scattered on a table. Some of them happen to form continuous chains - like 4,5,6,7 sitting next to each other. Your job is to find the longest chain. The key insight: you only need to start counting from the BEGINNING of each chain - the block that doesn't have its left neighbor. If you start counting from the middle of a chain (like 5), you just redo work that counting from the start already covers. So the algorithm is: put all numbers in a hash set for O(1) lookups, then iterate through and only start counting when a number has NO left neighbor in the set. This is like finding all the 'headwaters' of streams and tracing each one downstream to its end.

Why This Pattern?

The structural property that makes this pattern work: consecutive sequences have a unique 'start' element (the one with no predecessor). By filtering to only process these starts, we avoid redundant work. If we processed every element, we'd re-count the same sequences multiple times. The set gives us constant-time lookup to check if a predecessor exists.

Solution

def longestConsecutive(nums):
    if not nums:
        return 0
    
    # Put all numbers in a set for O(1) lookups
    num_set = set(nums)
    longest = 0
    
    for num in num_set:
        # Only start counting if this is the beginning of a sequence
        # (i.e., num-1 doesn't exist in the set)
        if num - 1 not in num_set:
            current = num
            streak = 1
            
            # Keep counting forward while consecutive numbers exist
            while current + 1 in num_set:
                current += 1
                streak += 1
            
            longest = max(longest, streak)
    
    return longest

Complexity

Time: O(n) - Here's why: We visit each number at most twice in the worst case. Each number enters the set once, and at most one number from each sequence triggers the forward-counting loop. Within that loop, we visit each element of that sequence exactly once total across all starts. Since sequences don't overlap, the total is bounded by O(2n) = O(n).
Space: O(n) - The hash set stores all n numbers. Without this, we couldn't do O(1) predecessor lookups.

Think of it like tracking rivers: we're only starting new 'expeditions' from headwaters (numbers with no predecessors). Each river's length gets counted exactly once because we don't start from midstream. The set lookup is O(1) like checking if a key exists in a dictionary - instant regardless of size.
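
A small sanity check with the classic example - the longest run is 1,2,3,4:

print(longestConsecutive([100, 4, 200, 1, 3, 2]))  # 4
print(longestConsecutive([]))                      # 0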

Common Mistakes

Edge Cases

Connections

Product of Array Except Self #238
Prefix/Suffix Product (Two-Pass Accumulation)

Intuition

Think of this like a river flowing through checkpoints. At each position i, you need to know the total 'flow' (product) from both upstream (all elements left of i) AND downstream (all elements right of i), but NOT the flow through position i itself. It's like calculating what the product would be if you magically 'removed' each element one at a time. The key insight: multiplication is associative and commutative, so we can break the problem into two independent sweeps - left-to-right for prefix products, then right-to-left for suffix products - and multiply them together at each position.

Why This Pattern?

The answer at each position depends on ALL other positions, not just neighbors. This screams prefix/suffix pattern because: (1) we can precompute a running product as we sweep, (2) the left contributions and right contributions are independent and can be combined via multiplication, (3) we need O(n) with O(1) extra space - exactly what two linear passes achieve.

Solution

def productExceptSelf(nums):
    n = len(nums)
    answer = [1] * n
    
    # First pass: compute prefix products (all elements to the LEFT of i)
    # At each step, 'prefix' holds product of nums[0] through nums[i-1]
    prefix = 1
    for i in range(n):
        answer[i] = prefix  # Store product of everything LEFT of i
        prefix *= nums[i]   # Update prefix to include current element
    
    # Second pass: compute suffix products (all elements to the RIGHT of i)
    # At each step, 'suffix' holds product of nums[i+1] through nums[n-1]
    suffix = 1
    for i in range(n - 1, -1, -1):
        answer[i] *= suffix  # Multiply left product by right product
        suffix *= nums[i]   # Update suffix to include current element
    
    return answer

Complexity

Time: O(n) - We make exactly two passes through the array, each doing constant work per element.
Space: O(1) extra space - We only use a few variables (prefix, suffix) regardless of input size. The output array doesn't count toward space complexity since it's required by the problem.

We can't do better than O(n) because every element (except itself) must contribute to every output position - that's n×n total multiplications conceptually. The two-pass approach is optimal because we must touch each element at least twice (once from left, once from right). The O(1) extra space comes from reusing the answer array as our accumulator - we never need to store intermediate products for all indices simultaneously.
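
A worked trace on a tiny input makes the two passes concrete:

# nums = [1, 2, 3, 4]
# After the prefix pass:  answer = [1, 1, 2, 6]    (product of everything to the LEFT)
# After the suffix pass:  answer = [24, 12, 8, 6]  (each entry times everything to the RIGHT)
print(productExceptSelf([1, 2, 3, 4]))  # [24, 12, 8, 6]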

Common Mistakes

Edge Cases

Connections

Top K Frequent Elements #347
Bucket Sort by Frequency / Counting Sort Adaptation

Intuition

Think of this like finding the most popular items in a store inventory. You have a list of what sold, and you want to know which k products were purchased most often. The intuitive approach: (1) First, count how many times each product appeared — this is your frequency map. (2) Then, find the top k by frequency. Here's the key insight: the frequency can't exceed the array length (if every element is the same, frequency = n). So we can use the frequency as a 'bucket index' — bucket[i] holds all elements that appeared exactly i times. This is like organizing books on a shelf by how many times you've read them, then grabbing from the most-read shelf first.

Why This Pattern?

The problem has a special property: frequency is bounded between 1 and n (array length). This is perfect for bucket sort because we can directly index into an array using frequency values. Instead of comparing elements, we use their frequency as a direct address — O(1) insertion into buckets rather than O(log n) heap operations. It's the natural choice when: (1) we know the range of the 'key' we're sorting by, and (2) that range is small relative to the data size.

Solution

def topKFrequent(nums: list[int], k: int) -> list[int]:
    # Step 1: Build frequency map - count how often each element appears
    # This is like tallying votes in an election
    freq = {}
    for num in nums:
        freq[num] = freq.get(num, 0) + 1
    
    # Step 2: Create buckets where bucket[i] = all elements with frequency i
    # We need n+1 buckets so that index n (the maximum possible frequency) is valid
    n = len(nums)
    buckets = [[] for _ in range(n + 1)]
    
    # Place each element in its frequency bucket
    for num, count in freq.items():
        buckets[count].append(num)
    
    # Step 3: Collect top k elements from highest frequency bucket down
    # This is like grabbing books from the 'most-read' shelf first
    result = []
    for i in range(n, 0, -1):  # go from high frequency to low
        result.extend(buckets[i])  # add all elements with this frequency
        if len(result) >= k:  # once we have k elements, we're done
            break
    
    return result[:k]

Complexity

Time: O(n)
Space: O(n)

We make three linear passes: (1) building the frequency map visits each of n elements once, O(n); (2) populating buckets visits each unique element once, O(n); (3) collecting results visits at most n buckets, O(n). Total is O(n). Space is O(n) for the frequency map plus O(n) for the buckets — we need to store all elements somewhere. This is optimal because we must at least examine every element to determine frequency.
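
For comparison with the heap approach mentioned above, here is a brief sketch of the O(n log k) alternative built on the standard library (the function name is illustrative):

import heapq
from collections import Counter

def topKFrequentHeap(nums, k):
    freq = Counter(nums)
    # nlargest keeps a heap of size k internally: O(n log k) instead of O(n)
    return heapq.nlargest(k, freq.keys(), key=freq.get)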

Common Mistakes

Edge Cases

Connections

Two Sum #1
Hash table complement lookup - for each element, compute what you need (target - current) and check if it's already been seen.

Intuition

Imagine you're balancing a scale. You have a target weight (the sum you need), and as you place each number on the scale, you're instantly checking if its 'counterpart' is already there. If you see a 5 and need sum 9, you immediately ask 'do I already have a 4?' If yes, done. If not, save the 5 for later. It's like a matchmaking service - you're constantly asking 'is my complement already here?' as you scan through.

Why This Pattern?

The problem requires finding ANY pair that sums to target. Checking all pairs is O(n²). By using a hash table, we can check 'has this complement been seen?' in O(1) time, reducing overall complexity to O(n). This works because we only need ONE valid pair, not all pairs, so we can stop as soon as we find a match.

Solution

def twoSum(nums, target):
    # Dictionary stores: number value -> its index
    # We need the index, not just the value, to return the answer
    seen = {}
    
    for i, num in enumerate(nums):
        complement = target - num  # What number would pair with num?
        
        # If we've seen the complement, we found our pair!
        if complement in seen:
            return [seen[complement], i]
        
        # Otherwise, remember this number and its index for future lookups
        seen[num] = i
    
    # Problem guarantees a solution exists
    return []

Complexity

Time: O(n) - We traverse the array once. For each element, computing the complement is O(1) and hash table lookup is O(1) average case.
Space: O(n) - In the worst case (no solution found until the very end, which doesn't happen here since solution is guaranteed), we store all n elements in the hash table.

You can't do better than O(n) because in the worst case, you might need to look at every element before finding the pair. The hash table trades space for speed - by storing what we've seen, we avoid the nested loops that would check every possible pair (which would be n² comparisons).
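
Quick check with the standard example - 2 + 7 = 9, so we expect the indices of those two values:

print(twoSum([2, 7, 11, 15], 9))  # [0, 1]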

Common Mistakes

Edge Cases

Connections

Valid Anagram #242
Frequency Count / Hash Map

Intuition

Think of each string as a 'chemical composition' - you're checking if two substances have the exact same atoms in the exact same quantities, just arranged differently. The ORDER doesn't matter, only the COUNT. This is like a fingerprint: if two strings have identical letter fingerprints, they're anagrams. You could physically sort letters (like sorting cards), but there's a faster way: just count what you have.

Why This Pattern?

An anagram problem is fundamentally about comparing multisets of characters. The ONLY structural property that matters is how many of each character exists. A hash map naturally captures 'how many of X' for any X, making it the perfect tool. We're checking if two sequences are equivalent under permutation, which is exactly what frequency counting detects.

Solution

from collections import Counter

class Solution:
    def isAnagram(self, s: str, t: str) -> bool:
        # Quick win: different lengths can't be anagrams
        if len(s) != len(t):
            return False
        
        # Count characters in first string, subtract for second
        # If they're anagrams, every count will return to zero
        count = Counter(s)
        
        for char in t:
            count[char] -= 1
            if count[char] < 0:  # More of this char in t than s
                return False
        
        return True

# Alternative one-liner (same logic, less explicit):
# return Counter(s) == Counter(t)

Complexity

Time: O(n)
Space: O(k) where k = 26 (alphabet size)
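
The O(26) bound is easiest to see in an array-based variant. A minimal sketch, assuming lowercase English letters (the function name is illustrative):

def isAnagramArray(s: str, t: str) -> bool:
    if len(s) != len(t):
        return False
    counts = [0] * 26                    # one slot per lowercase letter
    for cs, ct in zip(s, t):
        counts[ord(cs) - ord('a')] += 1  # add occurrences from s
        counts[ord(ct) - ord('a')] -= 1  # cancel occurrences from t
    return all(c == 0 for c in counts)   # anagrams cancel out exactly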

Common Mistakes

Edge Cases

Connections

Valid Sudoku #36
Multi-dimensional Uniqueness Validation with Hash Sets

Intuition

Think of Sudoku validation as checking three independent 'conservation laws' simultaneously. Each digit 1-9 is like a unique 'energy packet' that can only exist once per row, once per column, and once per 3x3 subgrid. You're essentially verifying that no digit is 'overlapping' with itself in any of these three dimensions. Imagine three transparent overlays on the board - one showing rows, one showing columns, one showing boxes - a digit appearing at position (i,j) must not already appear in that row's overlay, that column's overlay, or that box's overlay. The moment you spot a duplicate, the board is invalid.

Why This Pattern?

Sudoku has three separate constraint domains (rows, columns, subgrids) that must all be satisfied independently. Each domain requires uniqueness checking, which is exactly what hash sets excel at - O(1) lookup to check if an element was already seen. The problem's structure (fixed 9x9 grid with exactly 3 constraints per cell) makes sets the natural choice over more complex data structures.

Solution

from typing import List

def isValidSudoku(board: List[List[str]]) -> bool:
    # Three sets of sets: one for each row, column, and 3x3 subgrid
    row_sets = [set() for _ in range(9)]
    col_sets = [set() for _ in range(9)]
    box_sets = [set() for _ in range(9)]
    
    for i in range(9):
        for j in range(9):
            num = board[i][j]
            
            # Skip empty cells - they impose no constraint
            if num == '.':
                continue
            
            # Calculate which 3x3 box we're in
            # Box index formula: row_group * 3 + col_group
            box_index = (i // 3) * 3 + (j // 3)
            
            # Check ALL three constraints simultaneously
            # If num exists in ANY of the three sets, we have a duplicate
            if (num in row_sets[i] or 
                num in col_sets[j] or 
                num in box_sets[box_index]):
                return False
            
            # No duplicate found - add to all three sets
            row_sets[i].add(num)
            col_sets[j].add(num)
            box_sets[box_index].add(num)
    
    return True

Complexity

Time: O(81) = O(1) - The board is always 9x9, so we iterate exactly 81 cells. Each cell does O(1) set operations (3 lookups + 3 insertions at most). Since the input size is bounded by a constant, this is technically O(1) time.
Space: O(81) = O(1) - We maintain 27 sets (9 rows + 9 cols + 9 boxes), each holding at most 9 elements. Total memory is bounded by a constant regardless of input.

We can't do better than O(81) because we must examine every non-empty cell to verify it's valid - a single invalid cell could be anywhere. The space is also bounded because we're only tracking uniqueness in 27 fixed domains (9 rows, 9 columns, 9 boxes), each capped at 9 unique digits.

Common Mistakes

Edge Cases

Connections

Two Pointers (5)

3Sum #15
Two Pointers with Sorting (specifically the 'sort + two pointers' pattern for 2-sum)

Intuition

Think of this like finding three weights that balance perfectly on a see-saw - they need to sum to zero. The key insight: once you sort the array, it becomes ordered like numbers on a number line. Pick one number as an 'anchor' (like fixing one weight), and now you're looking for two other numbers that sum to the negative of your anchor. This reduces to the classic 2Sum problem on a sorted array. The two pointers work like two people walking toward each other from opposite ends of a hallway - they'll either meet at the right spot (sum = target) or cross paths (and you know to move one direction). Sorting is essential because it gives you a monotonic sequence where you can reliably predict which direction to move when the sum is too big or too small.

Why This Pattern?

Sorting transforms an unordered search into a directed walk. After fixing one element, the remaining two-pointer search works because: (1) the array is monotonic, so if sum > target you MUST decrease it by moving right pointer left, (2) if sum < target you MUST increase it by moving left pointer right. This greedy movement guarantees you never miss a valid pair - it's like gradient descent on a sorted landscape. The duplicate handling during iteration prevents revisiting equivalent states.

Solution

def threeSum(nums):
    res = []
    nums.sort()  # Sort first - enables the two-pointer trick
    
    for i in range(len(nums)):
        # Skip duplicate first elements to avoid repeated triplets
        if i > 0 and nums[i] == nums[i - 1]:
            continue
        
        # Two-pointer search for the remaining two numbers
        left, right = i + 1, len(nums) - 1
        target = -nums[i]  # We need two numbers that sum to this
        
        while left < right:
            current_sum = nums[left] + nums[right]
            
            if current_sum == target:
                # Found a valid triplet!
                res.append([nums[i], nums[left], nums[right]])
                
                # Skip duplicates for left and right to avoid repeats
                while left < right and nums[left] == nums[left + 1]:
                    left += 1
                while left < right and nums[right] == nums[right - 1]:
                    right -= 1
                
                # Move both pointers after finding a match
                left += 1
                right -= 1
                
            elif current_sum < target:
                # Sum too small, need larger value - move left pointer right
                left += 1
            else:
                # Sum too large, need smaller value - move right pointer left
                right -= 1
    
    return res

Complexity

Time: O(n²)
Space: O(1) auxiliary space (not counting the output); the sort itself may use O(log n) to O(n) extra space depending on the implementation

Sorting takes O(n log n). The outer loop runs n times, and for each iteration, the two-pointer search potentially traverses the remaining n-i elements. This is like checking every possible triplet but doing it efficiently by using the sorted property to skip invalid branches - you don't need to check all n³ combinations because the sorted order lets you systematically eliminate possibilities.
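
Sanity check on the classic input - the repeated -1 values exercise the duplicate-skipping logic:

print(threeSum([-1, 0, 1, 2, -1, -4]))  # [[-1, -1, 2], [-1, 0, 1]]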

Common Mistakes

Edge Cases

Connections

Container With Most Water #11
Two Pointers (opposite ends, moving toward center)

Intuition

Imagine two vertical cliffs forming a valley. The water it can hold is limited by BOTH the distance between them (width) AND the shorter cliff (height) - water spills over the shorter one. This is like finding two people standing apart who are both tall. The key insight: if you're at two positions and the shorter one limits you, moving the taller one can ONLY make things worse (you reduce width and the shorter one is still the bottleneck). But moving the shorter one MIGHT find a taller partner - that's your only hope for improvement. It's like if you're playing 'height partner' with someone: when you're paired with someone shorter than you, you should go find a new partner - but if you're the shorter one, you stay put and let them find someone taller.

Why This Pattern?

The array represents positions in a line where both location and value matter. Starting from the widest possible container (ends), we can only improve by moving the pointer at the shorter height - because moving the taller pointer can never increase area (the shorter height remains the bottleneck while width decreases). This creates a deterministic search path that explores all potentially optimal solutions.

Solution

def maxArea(height):
    """
    Two pointers starting at opposite ends. At each step:
    1. Calculate area with current pointers
    2. Move the pointer at the shorter height (potential to find taller walls)
    3. Stop when pointers meet
    """
    left = 0
    right = len(height) - 1
    max_water = 0
    
    while left < right:
        # Width is distance between indices
        width = right - left
        # Height is limited by shorter wall (water spills over)
        current_height = min(height[left], height[right])
        
        # Update maximum area found
        max_water = max(max_water, width * current_height)
        
        # Move pointer at shorter height - this is THE key insight:
        # Moving the taller one CANNOT help (shorter still limits)
        # Moving the shorter one MIGHT help (might find taller wall)
        if height[left] < height[right]:
            left += 1
        else:
            right -= 1
    
    return max_water

Complexity

Time: O(n) - single pass through array
Space: O(1) - only using two pointers and a few variables

We visit each index at most once (left moves right, right moves left, they meet in middle). Even though we might skip some pairs, we don't need to check all n² pairs explicitly because the greedy pointer movement guarantees we explore only the candidates that could potentially be optimal - we're not enumerating, we're intelligently searching.

Common Mistakes

Edge Cases

Connections

Trapping Rain Water #42
Two Pointers (greedy)

Intuition

Imagine this as a valley system where water accumulates. The key insight: at any position, the water level is determined by the SHORTER of the two 'walls' that could contain it. Think of it like pouring water into a container - the water spills over the lower side. So water at position i = min(max_height_to_left, max_height_to_right) - height[i]. The two-pointer approach exploits a clever observation: if we know the left wall is shorter than the right wall, we only need to worry about the left side because the right side (being taller) can't limit the water on the left. We process the smaller side first, knowing the answer is constrained by that smaller boundary.

Why This Pattern?

The problem has a monotonic property: the limiting factor for water at any position is always the minimum of the two maximum heights on either side. By processing from both ends and always moving the pointer with the smaller height boundary, we maintain an invariant that lets us compute water locally without needing to scan the entire remaining array. We're greedily resolving the side we can be certain about.

Solution

def trap(height):
    if not height:
        return 0
    
    left, right = 0, len(height) - 1
    left_max, right_max = 0, 0
    water = 0
    
    while left < right:
        # Process the shorter side - we can resolve it with certainty
        if height[left] < height[right]:
            # Left side is the limiting factor
            if height[left] >= left_max:
                left_max = height[left]  # This becomes new boundary
            else:
                water += left_max - height[left]  # Water trapped here
            left += 1
        else:
            # Right side is the limiting factor (or equal)
            if height[right] >= right_max:
                right_max = height[right]
            else:
                water += right_max - height[right]
            right -= 1
    
    return water

Complexity

Time: O(n)
Space: O(1)

We traverse the array exactly once with two pointers moving toward each other. Each element is visited at most once. We only store a few integer variables regardless of input size. Can't do better than O(n) because we must examine each bar to know the answer - every position potentially holds water.
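
The per-position formula water[i] = min(max_left, max_right) - height[i] can also be implemented directly with two precomputed arrays - same O(n) time but O(n) extra space, which is exactly what the two-pointer version removes. A sketch for comparison (the function name is illustrative):

def trapWithArrays(height):
    if not height:
        return 0
    n = len(height)
    left_max, right_max = [0] * n, [0] * n
    left_max[0] = height[0]
    for i in range(1, n):                    # tallest bar at or to the left of i
        left_max[i] = max(left_max[i - 1], height[i])
    right_max[n - 1] = height[n - 1]
    for i in range(n - 2, -1, -1):           # tallest bar at or to the right of i
        right_max[i] = max(right_max[i + 1], height[i])
    return sum(min(left_max[i], right_max[i]) - height[i] for i in range(n))

print(trapWithArrays([0,1,0,2,1,0,1,3,2,1,2,1]))  # 6
print(trap([0,1,0,2,1,0,1,3,2,1,2,1]))            # 6 - the two-pointer version agrees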

Common Mistakes

Edge Cases

Connections

Two Sum II #167
Two Pointers (opposite direction)

Intuition

Think of this like balancing a scale. You have a sorted list of weights from lightest to heaviest, and you want two weights that exactly equal a target weight. Start with the lightest (leftmost) and heaviest (rightmost). If they sum too heavy, you KNOW the heaviest is too heavy — move it down. If they sum too light, you KNOW the lightest is too light — move it up. The sorted property guarantees this always works because adjusting one element in a direction has a predictable effect on the sum.

Why This Pattern?

The sorted input creates a monotonic relationship: if current sum > target, decreasing the larger element can ONLY help (never hurt). If current sum < target, increasing the smaller element can ONLY help. This lets us explore all possible pairs in exactly O(n) by moving each pointer at most n times — no backtracking needed.

Solution

def twoSum(numbers, target):
    left = 0
    right = len(numbers) - 1
    
    while left < right:
        current_sum = numbers[left] + numbers[right]
        
        if current_sum == target:
            # 1-indexed return as specified in problem
            return [left + 1, right + 1]
        
        if current_sum > target:
            # Sum too large: decrease the larger element
            # Since sorted, numbers[right] is the larger one
            right -= 1
        else:
            # Sum too small: increase the smaller element
            left += 1
    
    return []  # No solution found (won't happen for valid input)

Complexity

Time: O(n)
Space: O(1)

Each iteration eliminates at least one possibility — either moving left or right pointer. Since pointers only move toward each other and never backtrack, we visit at most n pairs total. Every comparison and arithmetic operation is O(1).

Common Mistakes

Edge Cases

Connections

Valid Palindrome #125
Two Pointers (opposite ends moving inward)

Intuition

Think of a palindrome as a system in equilibrium — like a perfectly balanced scale. The first character must equal the last, the second must equal the second-to-last, and so on. If any pair doesn't match, the system is out of balance. Using two pointers is like having observers at both ends of a seesaw, walking toward the center. They meet when they've checked all necessary pairs (or discover a mismatch along the way).

Why This Pattern?

Palindromes have symmetric structure — position i from the left must match position n-1-i from the right. We only need to compare ceil(n/2) pairs to determine if it's a palindrome. Two pointers let us check both sides in a single pass without extra space, like mirrors reflecting each other.

Solution

def isPalindrome(s: str) -> bool:
    left, right = 0, len(s) - 1
    
    while left < right:
        # Skip non-alphanumeric characters from left
        while left < right and not s[left].isalnum():
            left += 1
        # Skip non-alphanumeric characters from right  
        while left < right and not s[right].isalnum():
            right -= 1
        
        # Compare characters (case-insensitive)
        if s[left].lower() != s[right].lower():
            return False
        
        # Move pointers toward center
        left += 1
        right -= 1
    
    return True

Complexity

Time: O(n)
Space: O(1)

Each character in the string is visited at most once as the left and right pointers move toward each other. Even in the worst case (all valid alphanumerics), we touch each character exactly once. We only store two integer pointers regardless of input size.

Common Mistakes

Edge Cases

Connections

Sliding Window (6)

Best Time to Buy and Sell Stock #121
Single Pass Scan / Implicit Sliding Window

Intuition

Think of this like a hiker walking through a mountain range who can only look forward in time. They want to find the lowest valley BEFORE a peak — they can't time travel to buy at the lowest point overall if it comes after their selling point. As you walk through each day, you track the lowest price seen so far (that's your best buying opportunity up to this point). At every peak, you ask: 'How much would I gain if I sold here, having bought at my lowest point so far?' The answer that maximizes this gain is your answer. The key insight: you don't need to try all pairs — you only need to remember the minimum price encountered before the current day.

Why This Pattern?

This is a degenerate sliding window where we're tracking a single value (the minimum) as we 'slide' forward through time. We don't need an explicit window data structure because we're just maintaining one running minimum. The problem has optimal substructure: the best profit up to day i depends only on the minimum price seen up to day i-1.

Solution

def maxProfit(prices):
    # Initialize to track:
    # 1. The minimum price seen so far (best day to buy)
    # 2. The maximum profit achievable (best day to sell)
    min_price = float('inf')
    max_profit = 0
    
    for price in prices:
        # Update our best buying opportunity if current price is lower
        # This is like remembering the lowest valley we've passed through
        if price < min_price:
            min_price = price
        
        # Calculate profit if we sell today having bought at min_price
        # We use max() to ensure we only track positive gains
        profit = price - min_price
        max_profit = max(max_profit, profit)
    
    return max_profit

Complexity

Time: O(n) - We make exactly one pass through the prices array. Each iteration does O(1) work. We can't do better because we must examine each price at least once to know the minimum so far.
Space: O(1) - Only two variables are used regardless of input size. No arrays or data structures that grow with input.

Think of it as scanning a conveyor belt once. You don't need to go back (that's O(n²) with nested loops), you just need to remember the lowest price encountered so far. Memory is constant because you're only storing two numbers, not the entire price history.

Common Mistakes

Edge Cases

Connections

Longest Repeating Character Replacement #424
Sliding Window (Longest Substring with Condition)

Intuition

Think of this like a 'squeeze' or purification problem. You have a window of characters and a budget of k 'impurities' (characters that don't match the majority). Your goal is to find the largest window you can 'clean' to make all one character by replacing at most k impurities. The key insight: in any valid window, the number of replacements needed = window_size - count_of_most_frequent_char. If this ≤ k, the window is valid. It's like trying to maintain equilibrium where your 'energy' (k) is used to push the system toward uniformity.

Why This Pattern?

We need the longest contiguous substring satisfying a condition. The condition involves character frequencies, which we can track incrementally as we expand/shrink the window. The key structural property: we can always grow right, and only need to shrink left when invalid, giving O(n) solution. This is the 'longest valid window' variant of sliding window.

Solution

def characterReplacement(s: str, k: int) -> int:
    # Track frequency of each character in current window
    char_count = [0] * 26
    max_count = 0  # Count of most frequent char in current window
    left = 0
    result = 0
    
    for right in range(len(s)):
        # Add current char to window and update its count
        idx = ord(s[right]) - ord('A')
        char_count[idx] += 1
        
        # Update the most frequent char count in window
        max_count = max(max_count, char_count[idx])
        
        # Window is invalid if (size - max_count) > k
        # This means we need more replacements than we have budget for
        while (right - left + 1) - max_count > k:
            # Shrink window from left
            char_count[ord(s[left]) - ord('A')] -= 1
            left += 1
        
        # Window is now valid - update result
        result = max(result, right - left + 1)
    
    return result

Complexity

Time: O(n) where n = len(s)
Space: O(1) - fixed 26 character array

Time is O(n) because each character is visited at most twice - once when right expands past it, once when left shrinks past it. We never revisit. Space is O(1) because we only store counts for 26 letters regardless of input size - doesn't grow with n.
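
Quick check (the solution assumes uppercase letters, matching the problem's constraints): with one replacement, a run of four identical characters is achievable in "AABABBA":

print(characterReplacement("AABABBA", 1))  # 4  (e.g. "AABA" -> "AAAA")
print(characterReplacement("ABAB", 2))     # 4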

Common Mistakes

Edge Cases

Connections

Longest Substring Without Repeating Characters #3
Sliding Window with HashMap - Two Pointer Technique

Intuition

Imagine you're maintaining a 'clean zone' on a conveyor belt - you expand your window to the right, adding new characters. But the moment you spot a repeat (a 'contaminant'), you have to throw away everything from the left up to and including that character's last occurrence. It's like a sliding observation window that auto-adjusts: expand when things are fresh, contract when you hit a duplicate. The beautiful part is you never need to go backwards - both pointers only march forward, making this O(n) total.

Why This Pattern?

This fits sliding window because: (1) we're seeking a contiguous substring, (2) the optimal answer is some window we can represent with left/right boundaries, (3) the 'validity' constraint (no repeats) can be checked incrementally as we move, and (4) both pointers only advance forward - we never need to revisit positions. The HashMap lets us jump the left pointer directly to the optimal position (just after the last occurrence of a duplicate) rather than sliding one step at a time.

Solution

def lengthOfLongestSubstring(s):
    char_last_seen = {}  # Map character -> last index where it appeared
    max_len = 0
    left = 0  # Left boundary of our 'clean' window
    
    for right in range(len(s)):
        char = s[right]
        
        # If char exists in our current window, we need to shrink from left
        # The new left becomes one position AFTER its last occurrence
        if char in char_last_seen and char_last_seen[char] >= left:
            left = char_last_seen[char] + 1
        
        # Record/update this character's latest position
        char_last_seen[char] = right
        
        # Calculate window size and update max if larger
        max_len = max(max_len, right - left + 1)
    
    return max_len

Complexity

Time: O(n)
Space: O(min(n, 26)) for lowercase, or O(min(n, 128)) for ASCII - bounded by character set size, not input size

We visit each character exactly once with the right pointer (n steps). The left pointer can also move at most n times total (each character 'falls off' the left edge at most once). So total pointer movements = 2n = O(n). HashMap operations are O(1) average. Space is bounded by how many distinct characters can fit in the window - in the worst case (all unique chars), we store n entries, but typically it's bounded by the character set (26 for lowercase, 128 for ASCII, ~1M for Unicode).
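
Sanity checks with the standard examples:

print(lengthOfLongestSubstring("abcabcbb"))  # 3  ("abc")
print(lengthOfLongestSubstring("bbbbb"))     # 1  ("b")
print(lengthOfLongestSubstring("pwwkew"))    # 3  ("wke")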

Common Mistakes

Edge Cases

Connections

Minimum Window Substring #76
Sliding Window (Expand-Contract with Two Pointers)

Intuition

Think of this like a filtration or scanning problem. You have a 'target signature' (string t) — like a bouncer checking if you have all required documents, or a chef verifying all ingredients are present. You're scanning through s looking for the smallest window that contains this complete signature. The sliding window works because: expand right to gather more characters until you have a valid window (all required chars present), then contract left to find the minimum size while keeping it valid. It's like finding the tightest grip that still holds all the pieces.

Why This Pattern?

We need to examine contiguous regions in s. The validity of a window (whether it contains all chars of t) can be checked incrementally — as we add chars on the right, we can update our counts; as we remove from the left, we can update counts. Both pointers only move forward, giving O(n) time. This is the natural pattern when you're looking for optimal contiguous subsequences defined by a constraint.

Solution

def minWindow(s: str, t: str) -> str:
    if not s or not t:
        return ""
    
    # Character frequency we NEED to satisfy
    need = {}
    for c in t:
        need[c] = need.get(c, 0) + 1
    
    # Character frequency in current WINDOW
    window = {}
    
    # Two pointers define our window
    left = 0
    right = 0
    
    # 'valid' counts how many UNIQUE chars have met their required frequency
    valid = 0
    required = len(need)
    
    # Track the answer
    min_len = float('inf')
    min_start = 0
    
    while right < len(s):
        # EXPAND: Add s[right] to window
        char_in = s[right]
        if char_in in need:
            window[char_in] = window.get(char_in, 0) + 1
            # If this char now meets its required count, increment valid
            if window[char_in] == need[char_in]:
                valid += 1
        
        # CONTRACT: Shrink from left while window is still valid
        while valid == required:
            # Update answer - this is a valid window!
            if right - left + 1 < min_len:
                min_len = right - left + 1
                min_start = left
            
            # Try to shrink by moving left pointer in
            char_out = s[left]
            if char_out in need:
                # If we're about to remove a char that was meeting requirement
                if window[char_out] == need[char_out]:
                    valid -= 1  # Window will become invalid after removal
                window[char_out] -= 1
            
            left += 1
        
        right += 1
    
    return "" if min_len == float('inf') else s[min_start:min_start + min_len]

Complexity

Time: O(n + m) where n = len(s), m = len(t)
Space: O(m + k) where m = unique chars in t, k = unique chars in s that overlap with t
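
Quick check with the classic example - the smallest window of s containing all of "ABC" is "BANC":

print(minWindow("ADOBECODEBANC", "ABC"))  # "BANC"
print(minWindow("a", "aa"))               # ""  (not enough 'a' characters)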

Common Mistakes

Edge Cases

Connections

Permutation in String #567
Sliding Window with Frequency Counter and Match Tracking

Intuition

Imagine you're looking for a chord (any permutation of s1's notes) inside a longer song (s2). The order of notes within the chord doesn't matter - you just need the exact same multiset of characters. This is like comparing histograms: does any contiguous window in s2 have the exact same character frequency distribution as s1? Think of it as a "fingerprint" matching problem - we're checking if s1's character-count fingerprint appears anywhere in s2's sliding window.

Why This Pattern?

The window size is FIXED to len(s1). Instead of rebuilding the frequency map for each window position (which would be O(n*m)), we maintain it incrementally: as the window slides, we decrement one character count and increment another. We also track how many character frequencies currently match between the window and s1 - this 'match count' lets us check the entire window in O(1) rather than comparing all 26 letters each time.

Solution

def checkInclusion(s1: str, s2: str) -> bool:
    if len(s1) > len(s2):
        return False
    
    # Build the target frequency map (the fingerprint we're looking for)
    need = {}
    for c in s1:
        need[c] = need.get(c, 0) + 1
    
    window = {}
    window_size = len(s1)
    
    # Initialize first window in s2
    for i in range(window_size):
        window[s2[i]] = window.get(s2[i], 0) + 1
    
    # Count how many characters currently have matching frequencies
    matches = 0
    for c in need:
        if window.get(c, 0) == need[c]:
            matches += 1
    
    # If all chars match at start, we found a permutation
    if matches == len(need):
        return True
    
    # Slide the window: remove leftmost char, add new rightmost char
    for i in range(window_size, len(s2)):
        # Add the new character entering the window
        new_char = s2[i]
        if new_char in need:
            # Before incrementing: if this was a matching count, we're about to break that match
            if window.get(new_char, 0) == need[new_char]:
                matches -= 1
            window[new_char] = window.get(new_char, 0) + 1
            # After incrementing: check if we now match
            if window.get(new_char, 0) == need[new_char]:
                matches += 1
        
        # Remove the old character leaving the window
        old_char = s2[i - window_size]
        if old_char in need:
            # Before decrementing: if this was a matching count, we're about to break that match
            if window.get(old_char, 0) == need[old_char]:
                matches -= 1
            window[old_char] = window.get(old_char, 0) - 1
            # After decrementing: check if we now match
            if window.get(old_char, 0) == need[old_char]:
                matches += 1
        
        # Check if all character frequencies now match
        if matches == len(need):
            return True
    
    return False

Complexity

Time: O(n) where n = len(s2). Each character in s2 is visited twice at most (once when entering, once when leaving the window), and all other operations are O(1).
Space: O(1) or O(k) where k = 26 (lowercase letters). Since the alphabet size is fixed at 26, we consider this constant space.

We process each character in s2 exactly twice (enter and exit the window), giving O(n). The frequency maps only store at most 26 entries (one per lowercase letter), so space is bounded by the alphabet size - constant. We can't do better than O(n) because we must potentially check every starting position in s2.
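
Quick check with the standard examples:

print(checkInclusion("ab", "eidbaooo"))  # True  ("ba" appears as a window of s2)
print(checkInclusion("ab", "eidboaoo"))  # False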

Common Mistakes

Edge Cases

Connections

Sliding Window Maximum #239
Monotonic Deque (Monotonic Decreasing Queue)

Intuition

Think of this like a VIP section at a club. As people line up (the sliding window), you need to know who the tallest person is in the current VIP group. Here's the key insight: if a new person taller than everyone in line walks in, the shorter people can NEVER be the maximum while the new person is in the window — they're 'dominated.' It's like if LeBron James walks into a room, everyone else in that room loses their chance to be the tallest. We keep a decreasing queue of potential maximums: any smaller element to the left becomes irrelevant the moment a bigger one appears to its right.

Why This Pattern?

The structural property that makes this pattern work is that we process elements left-to-right and maintain a queue where values decrease. Any element to the left that's smaller than a new element can NEVER become the maximum — the new element dominates it for the remainder of its window lifespan. This allows O(1) max retrieval and O(1) insertions/removals. It's the only way to achieve O(n) for this problem because each element is pushed and popped at most once.

Solution

from collections import deque

def maxSlidingWindow(nums, k):
    result = []
    dq = deque()  # stores INDICES, not values - crucial for knowing when to remove
    
    for i in range(len(nums)):
        # 1. REMOVE: indices outside the current window
        # If the leftmost element is before window start, it's expired
        while dq and dq[0] < i - k + 1:
            dq.popleft()
        
        # 2. REMOVE: indices whose values are smaller than current
        # These elements are "dominated" - current nums[i] will be the max
        # for the rest of their window lifespan, so they're useless
        while dq and nums[dq[-1]] < nums[i]:
            dq.pop()
        
        # 3. ADD: current index to the deque
        dq.append(i)
        
        # 4. RECORD: once we have a full window, the front is our max
        if i >= k - 1:
            result.append(nums[dq[0]])
    
    return result

Complexity

Time: O(n)
Space: O(k)
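
Each index is appended to and popped from the deque at most once, which is where the O(n) bound comes from. A quick check with the classic example:

print(maxSlidingWindow([1, 3, -1, -3, 5, 3, 6, 7], 3))  # [3, 3, 5, 5, 6, 7]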

Common Mistakes

Edge Cases

Connections

Stack (7)

Car Fleet #853
Sort + Stack (or single counter)

Intuition

Imagine cars as particles flowing toward an energy minimum (the target). A faster car behind a slower car is like a particle that can get 'captured' by the slower one's gravitational well - once it catches up, they move as one unit. The key insight: sort cars by their starting distance from the target (closest first), then calculate each car's 'arrival time' (how long to reach target). If a car takes LONGER to arrive than the fleet ahead of it, it forms its own fleet - it can't catch up. If it takes less time, it joins the fleet in front. Think of it like runners on a track who can't pass each other - someone running faster but starting farther back may catch up to a slower runner ahead who had a head start, and from then on they move together.

Why This Pattern?

Sorting by position (descending from target) creates a natural ordering where we only need to compare each car to the one immediately ahead of it - like a linked list. We don't need a full stack because we're only tracking the 'slowest arrival time so far' (the lead fleet). Each car either joins that fleet or starts a new one, which we can count with a simple variable.

Solution

def carFleet(target: int, position: list[int], speed: list[int]) -> int:
    # Pair each car's position with its speed, then sort by position descending
    # (closest to the target first), so the lead car is processed before the cars behind it
    cars = sorted(zip(position, speed), key=lambda x: x[0], reverse=True)
    
    # Calculate time for each car to reach target: distance / speed
    # times[i] = (target - position[i]) / speed[i]
    times = [(target - pos) / spd for pos, spd in cars]
    
    # Count of fleets (at least one car = at least one fleet)
    fleets = 0
    
    # Track the slowest arrival time seen so far (the lead fleet)
    # Working from closest to target outward
    slowest_time = 0
    
    for t in times:
        # If this car takes LONGER than the lead fleet, it forms a NEW fleet
        # (it can't catch up to the car ahead)
        if t > slowest_time:
            fleets += 1
            slowest_time = t
        # If t <= slowest_time, this car joins the fleet ahead (catches up)
    
    return fleets

Complexity

Time: O(n log n) - dominated by the sorting step. The single pass through times is O(n).
Space: O(n) for storing the cars and times arrays (could be O(1) if we computed times on the fly, but the cleaner version uses extra space).

Sorting is required because without ordering by position, we can't determine which car is 'ahead' of which. We're essentially imposing a total order on the cars to simulate their spatial arrangement. O(n log n) is the best we can do - any algorithm must at least look at each car once, and comparison-based sorting is optimal for the unordered input.
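
Sanity check with the classic example (target 12): the cars at positions 10 and 8 merge, the cars at 5 and 3 merge, and the car at 0 travels alone, giving 3 fleets:

print(carFleet(12, [10, 8, 0, 5, 3], [2, 4, 1, 1, 3]))  # 3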

Common Mistakes

Edge Cases

Connections

Daily Temperatures #739
Monotonic Decreasing Stack (Next Greater Element pattern)

Intuition

Think of this like a thermodynamic system where each day is 'seeking equilibrium' with a warmer future day. The key insight: when a warmer day arrives, it immediately 'resolves' all the previous days that were waiting for warmth. Picture a stack of people in a cold line - each person wants to know when a warmer person will show up behind them. When that warmer person arrives, they can tell all the waiting 'colder' people exactly how many days they waited. The stack maintains days that haven't found warmth yet in decreasing temperature order - this way, when warmth comes, we can resolve ALL the waiting days at once, like dominos falling.

Why This Pattern?

This is the canonical 'Next Greater Element' problem. The structural property: we need to pair each element with the FIRST future element that's larger. A monotonic stack maintains elements in decreasing order, so when we encounter a larger element, we can immediately resolve all waiting elements - each element gets pushed and popped at most once, giving O(n) time.

Solution

def dailyTemperatures(temperatures):
    n = len(temperatures)
    answer = [0] * n  # default: 0 if no warmer day exists
    stack = []  # stores indices with decreasing temperatures (waiting for warmth)
    
    for i in range(n):
        # Current day is warmer than days waiting on stack
        # 'Resolve' all those days - they found their warmer tomorrow!
        while stack and temperatures[i] > temperatures[stack[-1]]:
            prev_day = stack.pop()
            answer[prev_day] = i - prev_day  # days waited = current day - that day
        
        # This day is now waiting for a warmer future day
        stack.append(i)
    
    # Days remaining in stack have no warmer day ahead (stay 0)
    return answer

Complexity

Time: O(n)
Space: O(n) in worst case (strictly decreasing temperatures)

Each day is pushed onto the stack exactly once and popped exactly once. That's 2n operations total, making this linear. We can't do better because in the worst case (strictly decreasing), every day must wait until the end, so we must examine each element.
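
Quick check with the standard example:

print(dailyTemperatures([73, 74, 75, 71, 69, 72, 76, 73]))
# [1, 1, 4, 2, 1, 1, 0, 0]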

Common Mistakes

Edge Cases

Connections

Evaluate Reverse Polish Notation #150
Stack - Expression Evaluation

Intuition

Think of RPN like cooking from a recipe card. You read instructions in order: when you see an ingredient (number), you put it on the counter. When you see an action (operator), you grab the two most recent things on the counter, combine them, and put the result back. The stack acts as your counter - it holds intermediate results until they're consumed by the next operator. This is exactly like a factory assembly line: operators are machines that take 2 inputs and produce 1 output, which becomes available for the next machine.

Why This Pattern?

Postfix notation has an inherently 'last in, first out' structure - operators always consume the most recently seen operands. The stack perfectly models this: push numbers as they arrive, pop two when you see an operator, compute, push the result back. This transforms what would be a tree traversal problem into a simple linear scan.

Solution

def evalRPN(tokens):
    stack = []
    
    for token in tokens:
        if token in '+-*/':
            # Pop in reverse order: b is the second operand, a is the first
            b = stack.pop()
            a = stack.pop()
            
            if token == '+':
                result = a + b
            elif token == '-':
                result = a - b
            elif token == '*':
                result = a * b
            else:  # division - must truncate toward zero
                result = int(a / b)
            
            stack.append(result)
        else:
            # It's a number - convert and push onto stack
            stack.append(int(token))
    
    return stack[0]

Complexity

Time: O(n) - We process each token exactly once. Each push and pop is O(1).
Space: O(n) - In the worst case, all of the operands appear before any of the operators, so the stack grows to roughly (n+1)/2 values before they start being consumed.

You can't do better than O(n) because you must read every token to understand the expression. Each number contributes to the final result, and each operator must combine existing values - there's no shortcut that skips processing any token.
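
Quick checks, including one where the division must truncate toward zero:

print(evalRPN(["2", "1", "+", "3", "*"]))   # 9  ((2 + 1) * 3)
print(evalRPN(["4", "13", "5", "/", "+"]))  # 6  (13 / 5 truncates to 2, then 4 + 2)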

Common Mistakes

Edge Cases

Connections

Generate Parentheses #22
Backtracking / Depth-First Search on a state tree

Intuition

Think of building parentheses like maintaining a 'height' or 'balance' in a system. Each '(' is like stepping up, each ')' is like stepping down. You start at ground level (balance = 0), can only go up n times, and must return to ground level at the end. The key constraint: you can never step below ground (more closes than opens) — that would be physically impossible, like going negative on a bank account. At any intermediate step, your 'balance' (opens minus closes) tells you whether you can add a close. This is exactly like a depth-first exploration where you try both moves but respect the conservation law that balance >= 0 everywhere.

Why This Pattern?

The problem has a natural tree structure: each position in the string represents a decision point with limited valid choices. The 'balance' constraint naturally prunes invalid branches, making DFS the natural fit. We explore all valid paths from root to depth 2n, collecting leaves that represent complete well-formed strings.

Solution

def generateParenthesis(n):
    result = []
    
    def backtrack(open_count, close_count, current):
        # Base case: we've used all n pairs of each parenthesis
        if open_count == n and close_count == n:
            result.append(current)
            return
        
        # Choice 1: Add an opening parenthesis if we haven't used all n
        # This increases our "balance" - we're adding potential to close later
        if open_count < n:
            backtrack(open_count + 1, close_count, current + "(")
        
        # Choice 2: Add a closing parenthesis only if it won't exceed opens
        # This is the "debt" constraint - we can only close what we've opened
        if close_count < open_count:
            backtrack(open_count, close_count + 1, current + ")")
    
    backtrack(0, 0, "")
    return result
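
A quick check, assuming the generateParenthesis function above. Because the backtracking always tries '(' before ')', the sequences come out in this order:

print(generateParenthesis(1))  # ['()']
print(generateParenthesis(3))
# ['((()))', '(()())', '(())()', '()(())', '()()()']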

Complexity

Time: O(C_n * n), where C_n is the nth Catalan number (~4^n / n^(3/2)) and each sequence has length 2n
Space: O(n) for recursion stack + O(C_n * n) for storing all results

We must generate ALL valid parentheses combinations - there's no way around producing each one. The Catalan number C_n counts the well-formed sequences with n pairs, and each one takes 2n characters to build, so the total work is at least proportional to C_n * n. The recursion depth is at most 2n (one level per character added), which is the minimal extra space beyond the output itself.

Common Mistakes

Edge Cases

Connections

Largest Rectangle in Histogram #84
MONOTONIC STACK (increasing)

Intuition

Picture the histogram as a city skyline. You're looking for the largest rectangle that fits under any part of this skyline. Here's the key insight: for any particular building (bar), the largest rectangle that includes it can only stretch left until it hits a shorter building, and right until it hits a shorter building. The height of that rectangle is fixed at the building's height - the width is determined by these 'shorter building boundaries.' Think of it like water settling between buildings of different heights - the water level is constrained by the shorter building on each side. This is exactly what we're computing: for each bar, find where the water would 'spill' (the first shorter bar to left and right).

Why This Pattern?

We maintain a stack of bar indices in increasing order of height. When we encounter a bar shorter than the stack's top, we've found the RIGHT boundary for that taller bar (the current bar is the first shorter one to the right). The LEFT boundary is the bar now at the top of the stack after popping (the first shorter one to the left). This is the natural structure because we need to find 'nearest smaller element' boundaries - a classic monotonic stack use case.

Solution

def largestRectangleArea(heights):
    # Add sentinel 0 at end to flush remaining bars in stack
    # This handles rectangles that extend to the last column
    stack = []  # stores indices of bars in increasing height order
    max_area = 0
    
    for i, h in enumerate(heights + [0]):  # append sentinel
        # While current bar is shorter than top of stack,
        # we've found the right boundary for the stack's top bar
        while stack and heights[stack[-1]] > h:
            height = heights[stack.pop()]  # the bar we're computing area for
            # Width: current index is right boundary,
            # new stack top (after pop) is left boundary
            # If stack is empty, left boundary is -1 (start of array)
            width = i if not stack else i - stack[-1] - 1
            max_area = max(max_area, height * width)
        stack.append(i)
    
    return max_area
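
A sanity check, assuming the largestRectangleArea function above (the first input is the standard example, where the best rectangle spans the bars of height 5 and 6):

print(largestRectangleArea([2, 1, 5, 6, 2, 3]))  # 10 (height 5 across 2 bars)
print(largestRectangleArea([2, 4]))              # 4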

Complexity

Time: O(n)
Space: O(n) in worst case (strictly increasing heights)

Each bar is pushed onto the stack exactly once and popped at most once. Even though there's a while loop inside the for loop, the total number of iterations across all pops is bounded by n. We visit each bar twice at most (once when pushing, once when popping).

Common Mistakes

Edge Cases

Connections

Min Stack #155
Stack with augmented state - storing auxiliary information at each stack frame

Intuition

Imagine you're managing a stack of weighted crates and need to answer 'what's the lightest crate in the entire stack?' instantly at any moment. When you add a new crate lighter than everything below it, you create a 'checkpoint' — you remember both the new crate AND what the minimum was before. When you later remove that light crate, you automatically restore the previous minimum because it was saved at that stack level. It's like each layer of the stack carries a 'memory' of the minimum for all layers beneath it.

Why This Pattern?

Each stack element needs to know not just its own value but the minimum of all elements below it. When you push a new minimum, you create a checkpoint. When you pop, you automatically restore the previous minimum because it was stored at the level being removed. This creates a chain where every stack level encodes the minimum for its entire 'subtree'.

Solution

class MinStack:
    def __init__(self):
        # Each element stores: (actual_value, min_value_at_this_level)
        self.stack = []
    
    def push(self, val: int) -> None:
        if not self.stack:
            # First element: it's both the value and the current minimum
            self.stack.append((val, val))
        else:
            # The minimum at this level = min(new_value, previous_minimum)
            current_min = self.stack[-1][1]
            new_min = min(val, current_min)
            self.stack.append((val, new_min))
    
    def pop(self) -> None:
        self.stack.pop()
    
    def top(self) -> int:
        # Return the actual value (first element of tuple)
        return self.stack[-1][0]
    
    def getMin(self) -> int:
        # Return the stored minimum at this level (second element of tuple)
        return self.stack[-1][1]
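
A short usage walk-through, assuming the MinStack class above (this mirrors the problem's sample sequence):

ms = MinStack()
ms.push(-2)
ms.push(0)
ms.push(-3)
print(ms.getMin())  # -3  (the top tuple carries the current minimum)
ms.pop()            # removing -3 automatically restores the previous minimum
print(ms.top())     # 0
print(ms.getMin())  # -2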

Complexity

Time: O(1) for all operations
Space: O(n) where n is the number of elements pushed

Every operation touches only the top of the stack (constant work). We must store O(n) data because each stack frame needs to remember its own minimum — there's no way around this since we need to restore previous minimums when elements are popped.

Common Mistakes

Edge Cases

Connections

Valid Parentheses #20
Stack (Last-In-First-Out)

Intuition

Think of this like a seesaw or balance scale. When you see an opening bracket '(', you're placing weight on one side. The matching closing bracket ')' is the counterweight that balances it. But here's the crucial insight: the balance must be maintained at EVERY step. You can't close an outer expression before closing its inner one first. This is exactly how a stack works—the most recently opened bracket must be the next one closed (LIFO).

Why This Pattern?

The nesting structure of parentheses is inherently LIFO—the innermost opening bracket must be closed before its outer counterpart can be. A stack naturally models this: push openings, pop when you see a closing, and check if they match.

Solution

def isValid(s: str) -> bool:
    # Stack holds unmatched opening brackets
    stack = []
    
    # Map each closing bracket to its corresponding opening bracket
    # This lets us check: does top of stack match what I need to close?
    mapping = {')': '(', ']': '[', '}': '{'}
    
    for char in s:
        if char in mapping:
            # It's a closing bracket - need to match against stack top
            if not stack:
                # Stack empty = nothing to close = invalid
                return False
            if stack[-1] != mapping[char]:
                # Top of stack doesn't match the required opener = invalid
                return False
            stack.pop()  # Successfully matched, remove from stack
        else:
            # It's an opening bracket - save it for later
            stack.append(char)
    
    # Valid only if ALL opening brackets were matched (stack empty)
    return len(stack) == 0
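
A few hand checks, assuming the isValid function above:

print(isValid("()[]{}"))  # True  - each opener is closed before the next pair starts
print(isValid("(]"))      # False - top of stack '(' doesn't match what ']' needs
print(isValid("([)]"))    # False - interleaved brackets violate nesting
print(isValid("(("))      # False - unmatched openers remain on the stack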

Complexity

Time: O(n)
Space: O(n) in worst case (when the string is all opening brackets, like '((((')

We traverse the string once (O(n)), and each stack operation is O(1). Worst-case space is O(n) because every character could be an unmatched opener waiting on the stack, like '(((((('. We can't do better than O(n) space in the worst case because we might need to remember every unmatched opener.

Common Mistakes

Edge Cases

Connections

Binary Search (7)

Binary Search #704
Binary Search on a sorted array

Intuition

Imagine you're looking for a specific book in a perfectly alphabetized library with millions of books. You wouldn't check every book from A to Z — that's painfully slow. Instead, you'd open to the middle, see if your book comes before or after, and instantly eliminate half the library. Repeat. This is binary search. The key insight is that sorted data has a 'gradient' property: everything to the left of any point is smaller, everything to the right is larger. This lets you make a binary decision (go left OR go right) that eliminates half the remaining possibilities at each step.

Why This Pattern?

The array's sorted property creates a strict monotonic sequence — each element has a known relationship to its neighbors. This monotonicity is the structural property that makes binary search valid: if mid < target, we KNOW all elements from mid to left are too small, so we can safely discard them. No such guarantee exists with unsorted data, which is why binary search requires sorting.

Solution

def search(nums, target):
    left, right = 0, len(nums) - 1  # Initialize search bounds
    
    while left <= right:  # <= because right could be a valid index
        # Calculate mid safely (avoids potential overflow in other languages)
        mid = left + (right - left) // 2
        
        if nums[mid] == target:
            return mid  # Found it!
        elif nums[mid] < target:
            left = mid + 1  # Target must be in right half
        else:
            right = mid - 1  # Target must be in left half
    
    return -1  # Target not in array
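
A quick check, assuming the search function above:

nums = [-1, 0, 3, 5, 9, 12]
print(search(nums, 9))  # 4  (index of 9)
print(search(nums, 2))  # -1 (not present)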

Complexity

Time: O(log n)
Space: O(1)

At each iteration, we halve the search space. After k steps, we're searching n/2^k elements; we stop when n/2^k < 1, which takes about log2(n) steps - repeated halving is compound growth run in reverse. We can't do better with comparisons alone: each comparison yields at most one bit of information, and distinguishing among n possible positions requires log2(n) bits, so binary search makes the minimum number of comparisons needed to exploit the sorted order.

Common Mistakes

Edge Cases

Connections

Find Minimum in Rotated Sorted Array #153
Modified Binary Search for Boundary Detection

Intuition

Think of this like finding the seam where two sorted stacks of papers were taped together. Originally you had one sorted stack, then someone rotated it by picking a point and moving everything before that point to the end. The minimum is where the sequence 'wraps around' - the only place where a higher number is followed by a lower number. Using binary search: one half of the array is ALWAYS sorted (that's the invariant). If the middle element is greater than the rightmost element, the minimum MUST be in the right half (because the right half contains the wrap-around point). If middle is less than or equal to right, the minimum is at middle or to the left. It's like sliding your finger down a valley - you're trying to find the lowest point.

Why This Pattern?

This isn't searching for a target value - we're searching for a structural boundary (the rotation point). The key insight is that in a rotated sorted array, at least one half (left or right of mid) is always sorted. We use this property to eliminate half of the search space at each step, converging on the minimum.

Solution

def findMin(nums):
    left, right = 0, len(nums) - 1
    
    # Binary search for the minimum
    while left < right:
        mid = (left + right) // 2
        
        # If nums[mid] > nums[right], the wrap-around point (and the minimum)
        # must be in the right half - everything in [left..mid] is larger than nums[right]
        if nums[mid] > nums[right]:
            left = mid + 1
        # Otherwise nums[mid] <= nums[right]: [mid..right] is sorted,
        # so the minimum is at mid or somewhere to its left
        else:
            right = mid
    
    # When left == right, we've found the minimum
    return nums[left]
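
A few checks, assuming the findMin function above:

print(findMin([3, 4, 5, 1, 2]))        # 1
print(findMin([4, 5, 6, 7, 0, 1, 2]))  # 0
print(findMin([11, 13, 15, 17]))       # 11 (no rotation: minimum is the first element)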

Complexity

Time: O(log n)
Space: O(1)

Each iteration halves the search space, so we converge on the rotation point in at most log2(n) steps. We only use three indices (left, right, mid) and a few variables regardless of input size - no recursion, no extra data structures.

Common Mistakes

Edge Cases

Connections

Koko Eating Bananas #875
Binary Search on Answer (Monotonic Predicate)

Intuition

Think of Koko's eating speed like water flow through a pipe. You need a minimum flow rate to push all the bananas through within the time limit. If the flow is too slow, bananas pile up and overflow (time runs out). If it's fast enough, they all get processed. The monotonic property is key: if speed k works, any faster speed definitely works too — just like higher water pressure can't make things worse. We're essentially finding the minimum 'pressure' needed.

Why This Pattern?

The hours needed to finish is monotonically non-increasing in k - if speed k finishes within h hours, every speed greater than k does too. This creates a clean true/false boundary we can binary search. The search space is bounded: minimum speed is 1, maximum is max(piles) (eat one whole pile per hour).

Solution

def minEatingSpeed(piles, h):
    # Helper: calculate hours needed at speed k
    def hours_needed(k):
        total = 0
        for pile in piles:
            # Ceiling division: pile/k, rounded up
            # Python's math.ceil would work but (pile + k - 1) // k is faster
            total += (pile + k - 1) // k
        return total
    
    # Binary search bounds
    left, right = 1, max(piles)
    
    while left < right:
        mid = (left + right) // 2
        
        if hours_needed(mid) <= h:
            # This speed works! Try to go slower (left part)
            right = mid
        else:
            # Too slow, need faster speed (right part)
            left = mid + 1
    
    return left
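
Two hand checks, assuming the minEatingSpeed function above:

print(minEatingSpeed([3, 6, 7, 11], 8))        # 4:  hours = 1 + 2 + 2 + 3 = 8
print(minEatingSpeed([30, 11, 23, 4, 20], 6))  # 23: hours = 2 + 1 + 1 + 1 + 1 = 6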

Complexity

Time: O(n log m) where n = len(piles), m = max(piles)
Space: O(1)

We binary search over speeds (log m iterations), and each iteration scans all piles (n). Can't do better than O(n) per check since we must examine each pile to calculate total hours — each pile affects the answer. The log m factor is the minimum needed to find the exact boundary in a sorted search space.

Common Mistakes

Edge Cases

Connections

Median of Two Sorted Arrays #4
Binary Search on Partition (searching for a cut point in a sorted structure)

Intuition

Think of two sorted decks of cards. You want to find the median of all cards combined. Instead of merging (which is slow), imagine making a single cut through BOTH decks such that all cards to the LEFT of the cut are smaller than all cards to the RIGHT. That's the 'invisible' cut in the merged sorted array. The median is just around that cut. Binary search is our tool to FIND that cut efficiently — we guess where to cut the first array, then calculate where the second array's cut must be to make the partition valid. We know we found it when the largest element on the left of BOTH arrays ≤ smallest element on the right of BOTH arrays.

Why This Pattern?

The problem asks 'where would the median cut be if we merged these arrays?' — not 'what value is the median?' The answer lives in the index space (0 to len(nums1)), which is a sorted search space. We can determine if we're too far left or right by checking if the partition satisfies the inequality max(left) ≤ min(right).

Solution

def findMedianSortedArrays(nums1, nums2):
    # Ensure nums1 is the smaller array for O(log(min(m,n)))
    if len(nums1) > len(nums2):
        nums1, nums2 = nums2, nums1
    
    m, n = len(nums1), len(nums2)
    left, right = 0, m  # binary search on nums1's indices
    
    while left <= right:
        # Partition positions — partition1 + partition2 divides total elements
        partition1 = (left + right) // 2
        partition2 = (m + n + 1) // 2 - partition1
        
        # Get boundary values; use -inf/inf for empty partitions
        maxLeft1 = float('-inf') if partition1 == 0 else nums1[partition1 - 1]
        minRight1 = float('inf') if partition1 == m else nums1[partition1]
        maxLeft2 = float('-inf') if partition2 == 0 else nums2[partition2 - 1]
        minRight2 = float('inf') if partition2 == n else nums2[partition2]
        
        # Check if partition is valid: max of lefts ≤ min of rights
        if maxLeft1 <= minRight2 and maxLeft2 <= minRight1:
            # Found the correct partition!
            if (m + n) % 2 == 1:
                # Odd total: median is the max of left sides
                return max(maxLeft1, maxLeft2)
            else:
                # Even total: median is average of max left and min right
                return (max(maxLeft1, maxLeft2) + min(minRight1, minRight2)) / 2
        elif maxLeft1 > minRight2:
            # Too many from nums1 (partition1 too far right), move left
            right = partition1 - 1
        else:
            # Too few from nums1 (partition1 too far left), move right
            left = partition1 + 1
    
    return 0.0  # should never reach here for valid inputs
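
Quick checks, assuming the findMedianSortedArrays function above (odd totals return the single middle value, even totals the average of the two middle values):

print(findMedianSortedArrays([1, 3], [2]))     # 2   (merged: [1, 2, 3])
print(findMedianSortedArrays([1, 2], [3, 4]))  # 2.5 (merged: [1, 2, 3, 4])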

Complexity

Time: O(log(min(m, n)))
Space: O(1) — only a fixed number of variables regardless of input size

We binary search on the smaller array's indices. Each iteration cuts the search space in half. The number of iterations is logarithmic in the smaller array's length. We only do constant-time lookups at partition boundaries — no array merging or extra storage.

Common Mistakes

Edge Cases

Connections

Search a 2D Matrix #74
Binary Search on Virtual Sorted Array

Intuition

Imagine you have a perfectly sorted list of numbers, but someone drew grid lines over it — splitting it into rows where each row continues from where the previous one ended (like a phone book folded into a grid). To find a number, you don't need to think in 2D — you just need to convert a 1D position into 2D coordinates. The key insight: if you flatten this matrix into one long sorted array, the element at index i would be at row = i // n (integer division) and col = i % n (remainder). This is a coordinate transformation — we're doing binary search in a 'virtual' 1D space and translating those indices to actual matrix positions.

Why This Pattern?

The matrix satisfies the conditions of a fully sorted sequence: each row is sorted, AND the last element of row k is less than the first element of row k+1. This means if we concatenated all rows into one array, it would be perfectly sorted. Binary search requires a sorted input — by mapping our search space to this virtual sorted array, we get O(log(m*n)) performance instead of O(m+n) from brute force.

Solution

def searchMatrix(matrix, target):
    if not matrix or not matrix[0]:
        return False
    
    m = len(matrix)
    n = len(matrix[0])
    
    # Binary search on virtual 1D array of size m*n
    left, right = 0, m * n - 1
    
    while left <= right:
        mid = (left + right) // 2
        # Convert 1D index to 2D coordinates
        row = mid // n
        col = mid % n
        
        if matrix[row][col] == target:
            return True
        elif matrix[row][col] < target:
            left = mid + 1
        else:
            right = mid - 1
    
    return False
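
A sanity check, assuming the searchMatrix function above:

matrix = [[1, 3, 5, 7],
          [10, 11, 16, 20],
          [23, 30, 34, 60]]
print(searchMatrix(matrix, 3))   # True  (virtual index 1 maps to row 0, col 1)
print(searchMatrix(matrix, 13))  # False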

Complexity

Time: O(log(m*n))
Space: O(1)

Binary search on n elements always takes O(log n) steps — you halve the search space each iteration. Here our 'n' is m*n (total elements). We can't do better because any comparison-based search must examine enough elements to distinguish between all possible positions — that's log₂(m*n) decisions in the worst case.

Common Mistakes

Edge Cases

Connections

Search in Rotated Sorted Array #33
Modified Binary Search with Sorted-Half Identification

Intuition

Think of a sorted bookshelf where someone picked up a stack of books and reinserted them at a different position - that's the rotation. The key insight: at any midpoint, at least ONE half of the array is ALWAYS sorted. This is because rotation only creates ONE break point in the sorted order. You can visualize it like finding your way through a mountain range where one side of any valley is always flat (sorted) - you just need to figure out which side contains your target.

Why This Pattern?

The rotation property guarantees that for any mid point, at least one of [left, mid] or [mid, right] is sorted. This gives us a binary decision: either the target's value falls inside the sorted half's range, or it must be in the other half. Either way we discard half of the search space at each step.

Solution

def search(nums, target):
    left, right = 0, len(nums) - 1
    
    while left <= right:
        mid = (left + right) // 2
        
        # Found the target
        if nums[mid] == target:
            return mid
        
        # Identify which half is sorted
        if nums[left] <= nums[mid]:
            # Left half is sorted [left, ..., mid]
            # Check if target falls within this sorted range
            if nums[left] <= target < nums[mid]:
                # Target is in left sorted half
                right = mid - 1
            else:
                # Target must be in right half
                left = mid + 1
        else:
            # Right half is sorted [mid, ..., right]
            # Check if target falls within this sorted range
            if nums[mid] < target <= nums[right]:
                # Target is in right sorted half
                left = mid + 1
            else:
                # Target must be in left half
                right = mid - 1
    
    # Target not found
    return -1
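
Hand checks, assuming the search function above:

nums = [4, 5, 6, 7, 0, 1, 2]
print(search(nums, 0))  # 4
print(search(nums, 3))  # -1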

Complexity

Time: O(log n)
Space: O(1)

Each iteration eliminates half of the remaining search space. Even though we might check both halves conceptually, we only traverse ONE branch per iteration - the sorted half that might contain the target. This is exactly like standard binary search: we make a constant-time decision at each step and reduce the problem size by half.

Common Mistakes

Edge Cases

Connections

Time Based Key-Value Store #981
Binary Search on Sorted Arrays

Intuition

Think of this like a version control system or document editing history. When you 'get' a value at timestamp 7, you're asking 'what was the value of this key at moment 7?' If you set values at timestamps 1, 5, and 10, and query at timestamp 7, you'd get the value from timestamp 5 - the most recent change that hadn't passed your query time. It's like looking backward through a timeline and grabbing the last snapshot that exists before or at your query point.

Why This Pattern?

The timestamps for each key are inserted in strictly increasing order, creating a sorted sequence. To find 'the largest timestamp <= target', binary search is the optimal algorithm - it's O(log n) compared to O(n) for linear scan. This is the classic 'floor' or 'lower bound' search pattern.

Solution

class TimeMap:
    def __init__(self):
        self.store = {}  # key -> list of (timestamp, value) pairs
    
    def set(self, key: str, value: str, timestamp: int) -> None:
        """Store value with timestamp. Insertions are always in increasing timestamp order."""
        if key not in self.store:
            self.store[key] = []
        self.store[key].append((timestamp, value))
    
    def get(self, key: str, timestamp: int) -> str:
        """Get the value at the largest timestamp <= given timestamp."""
        if key not in self.store:
            return ""
        
        values = self.store[key]
        left, right = 0, len(values) - 1
        result = ""
        
        while left <= right:
            mid = (left + right) // 2
            curr_timestamp = values[mid][0]
            
            if curr_timestamp <= timestamp:
                # This timestamp is valid (not past our query time)
                result = values[mid][1]  # Store as potential answer
                left = mid + 1  # Try to find a larger (more recent) valid timestamp
            else:
                # This timestamp is too new, go left
                right = mid - 1
        
        return result
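
A short usage walk-through, assuming the TimeMap class above (this follows the problem's sample sequence):

tm = TimeMap()
tm.set("foo", "bar", 1)
print(tm.get("foo", 1))  # "bar"
print(tm.get("foo", 3))  # "bar"  - largest timestamp <= 3 is 1
tm.set("foo", "bar2", 4)
print(tm.get("foo", 4))  # "bar2"
print(tm.get("foo", 5))  # "bar2"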

Complexity

Time: O(1) amortized for set (always appends to end), O(log n) for get (binary search on sorted timestamps)
Space: O(n) total - storing all key-value-timestamp pairs

Set is O(1) because we always insert at the end of the list (timestamps are guaranteed increasing). Get is O(log n) because we binary search through at most n timestamps for that key. We can't do better than log n - we must examine enough timestamps to distinguish the boundary between valid and invalid times, which requires log n comparisons in the worst case.

Common Mistakes

Edge Cases

Connections

Linked List (11)

Add Two Numbers #2
Simultaneous traversal with persistent state (carry). This is essentially a 'two-pointer merge' where both pointers advance together while maintaining a running state.

Intuition

Think of adding two numbers on paper, column by column from right to left. The linked lists are already in the perfect order for this - the first node is the ones place, second is tens, etc. This is like a ripple carry adder in hardware: at each position you sum the two digits plus any incoming carry, output the result digit, and pass the overflow to the next position. The carry is a feedback loop - it persists from one iteration to the next, just like energy flowing through a system until equilibrium is reached. You process until both lists are empty AND there's no more carry to propagate.

Why This Pattern?

We need to process two sequences in lockstep. The carry is a state variable that gets updated each iteration and feeds back into the next calculation - this is the hallmark of a system with memory/persistence. The dual termination condition (both lists done AND no carry) mirrors physical systems that only stabilize when all energy dissipates.

Solution

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def addTwoNumbers(l1, l2):
    # Dummy head simplifies edge case handling - like a buffer
    dummy = ListNode(0)
    current = dummy
    carry = 0  # Persistent state - the 'overflow' from each column
    
    # Process until both lists exhausted AND no carry remains
    while l1 or l2 or carry:
        # Get values (default to 0 if list exhausted) - like a switch that defaults to 0
        val1 = l1.val if l1 else 0
        val2 = l2.val if l2 else 0
        
        # Sum this column plus any incoming carry
        total = val1 + val2 + carry
        
        # Extract new digit (mod 10) and new carry (div 10)
        # This is like a beam splitter - energy divides into two paths
        carry = total // 10
        digit = total % 10
        
        # Create new node with result digit
        current.next = ListNode(digit)
        current = current.next
        
        # Advance pointers if lists not exhausted
        l1 = l1.next if l1 else None
        l2 = l2.next if l2 else None
    
    return dummy.next  # Skip dummy, return actual result
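
A quick hand check, assuming the ListNode class and addTwoNumbers function above; build and to_list are small helpers introduced here just for the demo:

def build(digits):
    # Build a linked list from a Python list of digits (least significant digit first)
    dummy = ListNode(0)
    tail = dummy
    for d in digits:
        tail.next = ListNode(d)
        tail = tail.next
    return dummy.next

def to_list(node):
    out = []
    while node:
        out.append(node.val)
        node = node.next
    return out

# 342 + 465 = 807, stored in reverse digit order
print(to_list(addTwoNumbers(build([2, 4, 3]), build([5, 6, 4]))))  # [7, 0, 8]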

Complexity

Time: O(max(m, n)) where m and n are the lengths of the two lists. We visit each node at most once, and the carry propagation is bounded by the longer list plus one extra iteration for final carry.
Space: O(max(m, n)) for the result list. We create a new node for each digit in the output (plus one for final carry).

We must touch every digit in both input lists at minimum - there's no way to add numbers without looking at each digit. The output size is bounded by max(m, n) + 1 (the +1 is because a final carry can create an extra digit), so we can't do better than linear in the output size. This is the tightest bound.

Common Mistakes

Edge Cases

Connections

Copy List with Random Pointer #138
Hash Table + Two-Pass Traversal (or Interweaving)

Intuition

Think of this like cloning a city map where: - `next` pointers are sequential streets (linear, predictable) - `random` pointers are secret tunnels that can jump anywhere The challenge: When you copy a node that has a 'tunnel' to some other node, you need to know WHERE THE COPY of that destination node lives. You can't just copy the pointer directly or you'd end up pointing to the ORIGINAL city instead of the clone. The solution is a two-step dance: 1. First, go through and create all the new buildings (nodes) without connecting anything 2. Then go back and draw all the roads and tunnels using your knowledge of where each copied building sits

Why This Pattern?

The random pointers create an arbitrary graph structure, not just a linear chain. To copy edges that point to arbitrary nodes, you need a lookup mechanism. A hash map provides O(1) lookup from original node → copied node, solving the 'where is the copy?' problem. The two-pass approach separates node creation from edge connection, avoiding circular dependency issues.

Solution

class Node:
    def __init__(self, x: int, next: 'Node' = None, random: 'Node' = None):
        self.val = int(x)
        self.next = next
        self.random = random

def copyRandomList(head: 'Node') -> 'Node':
    if not head:
        return None
    
    # PASS 1: Create all new nodes, store mapping from old→new
    old_to_new = {}
    curr = head
    while curr:
        # Create copy with same value
        old_to_new[curr] = Node(curr.val)
        curr = curr.next
    
    # PASS 2: Wire up next and random pointers
    curr = head
    while curr:
        # Get the copied node for current position
        copy = old_to_new[curr]
        
        # Connect next: look up what original's next points to, get THAT copy
        copy.next = old_to_new.get(curr.next)
        
        # Connect random: same trick
        copy.random = old_to_new.get(curr.random)
        
        curr = curr.next
    
    return old_to_new[head]

Complexity

Time: O(n)
Space: O(n)

We traverse the list twice (2n operations = O(n)). With a plain dict we can't wire pointers in the same pass that creates nodes, because a random pointer's destination may not have been copied yet. The O(n) space is for the hash map - we need a mapping for every node so we can find its copy later. That's unavoidable for this approach when random pointers can jump arbitrarily forward or backward (the interweaving variant trades the map for O(1) extra space by splicing each copy next to its original).

Common Mistakes

Edge Cases

Connections

Find the Duplicate Number #287
Floyd's Tortoise and Hare (Cycle Detection in a Linked List)

Intuition

Imagine the array as a linked list where each value points to the next index to visit. Since we have n+1 numbers all pointing to indices in a 1..n range, we're guaranteed to have a 'collision' - two different starting points eventually lead to the same node. This creates a cycle, just like water finding its way to the lowest point in a landscape. The duplicate number is the entrance to that cycle - it's where two different 'paths' in the array converge. Using two pointers at different speeds (Floyd's algorithm), we're essentially running a process until we find where the loop closes, then backtracking to find its starting point.

Why This Pattern?

The array forms a functional graph - each value 'points' to another index. Since there's one more element than the range of values, the pigeonhole principle guarantees at least one collision, creating a cycle. The duplicate is exactly where the cycle begins because two different indices must point to the same location. This is mathematically equivalent to finding the entry point of a cycle in a linked list.

Solution

def findDuplicate(nums):
    # Phase 1: Find intersection point inside the cycle
    # Both pointers will eventually meet somewhere on the cycle
    slow = nums[0]
    fast = nums[0]
    
    while True:
        slow = nums[slow]           # moves 1 step (tortoise)
        fast = nums[nums[fast]]    # moves 2 steps (hare)
        if slow == fast:
            break
    
    # Phase 2: Find the entrance to the cycle (the duplicate)
    # Reset slow to start, keep fast at meeting point
    # They meet exactly at the cycle entrance
    slow = nums[0]
    while slow != fast:
        slow = nums[slow]
        fast = nums[fast]
    
    return slow
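
Two hand checks, assuming the findDuplicate function above:

print(findDuplicate([1, 3, 4, 2, 2]))  # 2
print(findDuplicate([3, 1, 3, 4, 2]))  # 3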

Complexity

Time: O(n) - Each phase traverses at most n elements. The first phase visits at most n nodes until meeting inside the cycle, the second phase visits at most n nodes to find the entrance.
Space: O(1) - Only two pointer variables used regardless of input size.

We can't do better than O(n) because we must examine all elements to guarantee finding the duplicate. We use O(1) space by exploiting the array structure itself as our 'linked list' - no extra data structures needed.

Common Mistakes

Edge Cases

Connections

Linked List Cycle #141
Floyd's Cycle Detection Algorithm (Two Pointer / Tortoise and Hare)

Intuition

Imagine two runners on a track. If the track is a straight line (no cycle), the faster runner will eventually finish and leave the slower runner behind. But if the track loops (has a cycle), the faster runner will eventually lap the slower one - they'll meet. This is the classic 'tortoise and hare' insight: a faster pointer moving at 2x speed will ALWAYS catch up to a slower pointer if there's a cycle, because it gains 1 position on each iteration. If there's no cycle, the faster pointer simply reaches the end of the list.

Why This Pattern?

This pattern is the natural choice because: (1) We can't modify the list to mark visited nodes, (2) We need O(1) space, not O(n) for a hash set, and (3) The mathematical guarantee - in any cycle, a faster pointer moving at 2x speed will eventually 'lap' the slower one. The relative speed is 1 node per iteration, guaranteeing convergence.

Solution

def hasCycle(head: ListNode) -> bool:
    if not head or not head.next:
        return False
    
    slow = head      # Tortoise: moves 1 step at a time
    fast = head      # Hare: moves 2 steps at a time
    
    while fast and fast.next:
        slow = slow.next        # Move slow by 1
        fast = fast.next.next   # Move fast by 2
        
        if slow == fast:        # They met = cycle exists
            return True
    
    return False  # Fast reached end = no cycle

Complexity

Time: O(n) - In the worst case (cycle exists), both pointers traverse the list until they meet. The maximum distance is bounded by the cycle length plus the non-cyclic portion. If no cycle, we visit each node at most once.
Space: O(1) - Only two pointer variables regardless of input size.

Time can't be less than O(n) because in the worst case (no cycle), we must check every node to confirm there's no cycle. Space is O(1) because we only track two pointers - the 'state' of the problem is entirely in the current positions of the runners, not in any data structure that grows with input.

Common Mistakes

Edge Cases

Connections

LRU Cache #146
HashMap + Doubly Linked List

Intuition

Imagine a library desk where you keep your most-used reference books within arm's reach. When you need a book, you grab it from the desk (fast access). When you use a book, you put it back on top of the pile (most recently used). When the desk is full and you need space, you put away the book at the bottom of the pile—the one you haven't touched in the longest time. That's exactly what an LRU cache does: it keeps frequently-accessed items readily available while automatically discarding the least recently used ones when capacity runs out. The 'desk' is your cache with limited space, and the 'book at the bottom' is your LRU item.

Why This Pattern?

We need O(1) operations for both get and put. A hash map gives us O(1) lookup by key. But we also need to track which item was used least recently, and we need to reorder in O(1) when something is accessed. A doubly linked list naturally maintains this order—head represents most recently used, tail represents least recently used—with O(1) insertion, deletion, and repositioning. The hash map maps each key to its corresponding node in the linked list, giving us the best of both worlds.

Solution

class DListNode:
    def __init__(self, key=0, value=0):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.cache = {}  # Maps key -> DListNode (O(1) lookup)
        
        # Dummy head/tail simplify edge cases - no null checks needed
        self.head = DListNode()
        self.tail = DListNode()
        self.head.next = self.tail
        self.tail.prev = self.head
    
    def _remove(self, node):
        """Detach node from list - O(1) operation"""
        node.prev.next = node.next
        node.next.prev = node.prev
    
    def _add_to_head(self, node):
        """Insert node right after head (most recently used position) - O(1)"""
        node.prev = self.head
        node.next = self.head.next
        self.head.next.prev = node
        self.head.next = node
    
    def _move_to_head(self, node):
        """When item is accessed, mark it as recently used by moving to front"""
        self._remove(node)
        self._add_to_head(node)
    
    def get(self, key: int) -> int:
        if key in self.cache:
            node = self.cache[key]
            self._move_to_head(node)  # Update usage order
            return node.value
        return -1  # Cache miss
    
    def put(self, key: int, value: int) -> None:
        if key in self.cache:
            # Key exists: update value and mark as recently used
            node = self.cache[key]
            node.value = value
            self._move_to_head(node)
        else:
            # New key: create node and add to front
            node = DListNode(key, value)
            self.cache[key] = node
            self._add_to_head(node)
            
            # Evict LRU if over capacity
            if len(self.cache) > self.capacity:
                lru_node = self.tail.prev  # Node right before tail = LRU
                self._remove(lru_node)
                del self.cache[lru_node.key]
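
A usage walk-through, assuming the LRUCache class above (this mirrors the problem's sample sequence):

cache = LRUCache(2)
cache.put(1, 1)
cache.put(2, 2)
print(cache.get(1))  # 1  (also marks key 1 as most recently used)
cache.put(3, 3)      # over capacity: evicts key 2 (least recently used)
print(cache.get(2))  # -1
cache.put(4, 4)      # evicts key 1
print(cache.get(1))  # -1
print(cache.get(3))  # 3
print(cache.get(4))  # 4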

Complexity

Time: O(1) for both get and put operations
Space: O(capacity) - we store at most 'capacity' key-value pairs in the hashmap and linked list

Every operation touches only constant-time data structures: hashmap lookup is O(1), and linked list node manipulation (remove, add, move) is O(1) because we have direct pointers to the nodes we need. We never traverse the list—we just rearrange pointers. This is the minimum possible since we must be able to access any cached item instantly.

Common Mistakes

Edge Cases

Connections

Merge K Sorted Lists #23
Heap-based merging (Priority Queue optimization)

Intuition

Imagine k sorted streams of water flowing into one river. At each moment, you only care about finding the smallest drop at the very front of ALL streams. Once you pick that drop, you move forward in just that one stream. A min-heap is the perfect data structure for this - it's like a bottleneck that always gives you the smallest element among k sources instantly, without having to check all k heads every time. Without a heap, you'd scan k heads for every element (expensive). With a heap, you pay a small log(k) price to maintain that 'smallest front' property.

Why This Pattern?

We have k sorted sequences and need to repeatedly find the global minimum across all of them. A min-heap of size k gives us O(log k) access to the smallest element among k sources. This transforms what would be O(n*k) naive scanning into O(n log k) - a massive win when k is large.

Solution

import heapq

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def mergeKLists(lists: list[ListNode]) -> ListNode:
    # Min-heap stores (value, list_index, node) - list_index breaks ties
    heap = []
    
    # Initialize: add the first node from each non-empty list
    for i, node in enumerate(lists):
        if node:
            heapq.heappush(heap, (node.val, i, node))
    
    # Dummy head simplifies edge case handling at the start
    dummy = ListNode(0)
    current = dummy
    
    # Keep extracting smallest until heap is empty
    while heap:
        val, i, node = heapq.heappop(heap)
        current.next = node
        current = current.next
        
        # If this list has more nodes, push the next one
        if node.next:
            heapq.heappush(heap, (node.next.val, i, node.next))
    
    return dummy.next

Complexity

Time: O(N log k) where N = total nodes across all lists, k = number of lists
Space: O(k) for the heap - it never holds more than one node per list, and the output relinks the existing nodes rather than allocating new ones

We push each of the N nodes onto the heap exactly once and pop each exactly once. Each heap operation costs O(log k) where k is heap size (number of lists). The heap never exceeds size k because we only push the next node after popping the current one. The O(N) factor is unavoidable - we must visit every node to include it in the result.

Common Mistakes

Edge Cases

Connections

Merge Two Sorted Lists #21
Two-pointer merge (merge sorted sequences)

Intuition

Imagine two conveyor belts delivering sorted packages, and you need to unload them onto a single belt in order. Both belts are already sorted, so at any moment the next package to unload must be at the front of one of the two belts — never buried in the middle. You just compare the front packages, take the smaller one, and repeat. It's like merging two sorted streams into one, always picking the lowest-energy element available. The 'equilibrium' is when both streams are exhausted and your new belt is complete.

Why This Pattern?

Both lists are already sorted in ascending order, so the smallest remaining element MUST be at the head of one of the two lists. This guarantees our greedy choice (always pick the smaller head) is always optimal — no backtracking needed. The problem has a greedy optimal substructure property.

Solution

```python
# Definition for singly-linked list.
# class ListNode:
#     def __init__(self, val=0, next=None):
#         self.val = val
#         self.next = next

def mergeTwoLists(list1, list2):
    # Dummy head: a "placeholder" node that simplifies edge cases
    # We don't care about its value; we just use it to anchor our result
    # This avoids special-casing the first node we add
    dummy = ListNode(0)
    current = dummy  # This tracks the last node in our merged list
    
    # Compare heads and attach the smaller one
    # Both lists are sorted, so the smallest unprocessed element 
    # must be at the head of one of the lists
    while list1 and list2:
        if list1.val <= list2.val:
            current.next = list1  # Attach list1's node
            list1 = list1.next    # Move list1 forward
        else:
            current.next = list2  # Attach list2's node
            list2 = list2.next    # Move list2 forward
        current = current.next    # Move our pointer forward
    
    # One list may have leftover nodes — they're already sorted
    # Just attach the remaining chain
    if list1:
        current.next = list1
    if list2:
        current.next = list2
    
    # Return the merged list, skipping the dummy head
    return dummy.next
```

Complexity

Time: O(n + m) where n and m are the lengths of list1 and list2
Space: O(1) — we only create one dummy node and use a constant number of pointers, regardless of input size

We visit each node exactly once. Each iteration processes one node from either list1 or list2, and we make exactly (n + m) iterations total. We can't do better because we must at least look at every node to include it in the output. Space is constant because we reuse existing nodes rather than creating new ones — we just rearrange pointers.

Common Mistakes

Edge Cases

Connections

Remove Nth Node From End of List #19
Fast-Slow Pointer (Two Pointer) with Dummy Head

Intuition

Imagine two runners on a track. The second runner starts n positions behind the first. When the first runner reaches the finish line (end of list), the second runner is exactly at position n from the end — the node we want to remove. This 'gap' technique lets us find a position relative to the end without knowing the list length upfront. The key insight: if we maintain exactly n nodes between fast and slow pointers, when fast hits None, slow will be right before our target node.

Why This Pattern?

We need to maintain a fixed spatial gap (n nodes) between two pointers while traversing. This gap naturally encodes 'nth from end'. The dummy head simplifies removing the first node — without it, we'd need special-case logic when n equals the list length.

Solution

class Solution:
    def removeNthFromEnd(self, head: Optional[ListNode], n: int) -> Optional[ListNode]:
        # Dummy head simplifies removing the first node (when n = list length)
        dummy = ListNode(0, head)
        
        # Both pointers start at dummy
        fast = dummy
        slow = dummy
        
        # Move fast n+1 steps ahead to create a gap of n nodes
        # This positions slow exactly one node before the target
        for i in range(n + 1):
            fast = fast.next
        
        # Advance both until fast hits the end
        # When fast = None, slow will be at node BEFORE the one to remove
        while fast:
            fast = fast.next
            slow = slow.next
        
        # Skip over the target node
        slow.next = slow.next.next
        
        return dummy.next

Complexity

Time: O(L) where L is list length
Space: O(1) - only using a fixed number of pointers

We traverse the list at most once. The fast pointer walks the entire list, and slow follows behind - total operations proportional to list length. We can't do better because we must reach the end of the list to know where 'nth from the end' is. Space is constant because we only rearrange pointers on existing nodes; nothing new is allocated.

Common Mistakes

Edge Cases

Connections

Reorder List #143
Three-step pointer manipulation: (1) Find middle using slow/fast pointers, (2) Reverse the second half, (3) Interleave nodes from first half with reversed second half.

Intuition

Think of this like shuffling a deck of cards. You split the deck in half, reverse the second half, then interleave them like shuffling. The challenge with a linked list is you can only move forward, so you need to reverse the second half to 'reach back' and grab elements from the end. The slow/fast pointer is like finding the center of a rope by walking: one person walks slowly (1 step), another walks fast (2 steps) - when the fast walker reaches the end, the slow walker is at the middle.

Why This Pattern?

Linked lists only give forward traversal, but this problem requires working from both ends simultaneously. The structural property that makes this pattern natural is that we need access to both the beginning (first half) and end (reversed second half) at the same time. Reversing creates a 'mirror' that lets us pull from the 'end' while traversing from the start.

Solution

class Solution:
    def reorderList(self, head: Optional[ListNode]) -> None:
        if not head or not head.next:
            return
        
        # Step 1: Find middle using slow/fast pointers
        # When fast reaches end, slow is at middle
        slow, fast = head, head
        while fast and fast.next:
            slow = slow.next
            fast = fast.next.next
        
        # Step 2: Reverse second half starting from slow
        # prev starts as None (will become new tail)
        prev, curr = None, slow
        while curr:
            next_temp = curr.next  # Save next before overwriting
            curr.next = prev       # Reverse the pointer
            prev = curr            # Move prev forward
            curr = next_temp       # Move curr forward
        # After loop, prev points to new head of reversed list
        
        # Step 3: Merge first half and reversed second half
        first, second = head, prev
        # Alternate: take one from first, one from second
        while second.next:
            # Save next nodes before overwriting
            first_next = first.next
            second_next = second.next
            
            # Connect first to second
            first.next = second
            # Move first forward
            first = first_next
            
            # Connect second to next first node
            second.next = first
            # Move second forward
            second = second_next

Complexity

Time: O(n)
Space: O(1)

We make three linear passes. Finding the middle: O(n) - the fast pointer walks the whole list while slow stops at ~n/2. Reversing: O(n/2) - only the second half is processed. Merging: O(n/2) - the two halves are interleaved. Total is O(n). Space is O(1) because we only use pointer variables regardless of list size.

Common Mistakes

Edge Cases

Connections

Reverse Linked List #206
Three-pointer in-place reversal

Intuition

Imagine a train with cars connected in a line. Each car has a coupler pointing forward to the next car. To reverse the train, you don't detach the cars—you flip each coupler so it points backward instead. The key is: before you can flip a coupler's direction, you need to remember which car comes after it, otherwise you'd lose the rest of the train. This is why we need three hands: one to hold the current car, one to remember what's behind it, and one to peek ahead before we rewire the connection.

Why This Pattern?

A singly linked list gives only forward references. To reverse direction, we must manually flip each pointer while preserving access to the remaining list. The three pointers (prev, curr, next_temp) are the minimal state needed: prev records what's now behind us, curr is what we're currently reorienting, and next_temp prevents losing the rest of the chain before we overwrite the pointer.

Solution

def reverseList(head):
    prev = None      # Starts as None - becomes new tail
    curr = head     # Current node we're reversing
    
    while curr:
        next_temp = curr.next  # Save next node - don't lose the rest!
        curr.next = prev       # Flip pointer: now points backward
        prev = curr            # Move prev forward (this node is now "behind")
        curr = next_temp       # Move curr to saved next node
    
    return prev  # prev is the new head after full reversal
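
A quick check, assuming the reverseList function above plus the standard LeetCode ListNode class (defined inline here since this snippet doesn't include it):

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

# Build 1 -> 2 -> 3 -> 4 -> 5 by prepending
head = None
for v in reversed([1, 2, 3, 4, 5]):
    head = ListNode(v, head)

node = reverseList(head)
vals = []
while node:
    vals.append(node.val)
    node = node.next
print(vals)  # [5, 4, 3, 2, 1]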

Complexity

Time: O(n)
Space: O(1)

We visit each of the n nodes exactly once and perform constant work per node. We only store three pointers regardless of list size—no recursion stack, no new data structures proportional to input.

Common Mistakes

Edge Cases

Connections

Reverse Nodes in K-Group #25
Reversal with boundary checking - this is the 'localized reversal' pattern where you reverse a bounded segment while maintaining the list's overall structure. It combines: (1) boundary detection - checking if k nodes exist, (2) classic 3-pointer linked list reversal, and (3) reconnection - stitching the reversed segment back into the list.

Intuition

Think of this like reversing paragraphs in an essay while keeping sentences intact. You have a linked list (like a train of k-car segments), and you reverse each chunk of k cars. If there aren't k cars left at the end, you leave them as-is. The key insight: you're doing LOCAL reversal (the k nodes) while PRESERVING global structure (the connections between groups). It's like untangling a necklace - you work on small sections while keeping the whole structure coherent. The 'kth node' acts as your boundary marker - it tells you whether you can reverse or must stop.

Why This Pattern?

This pattern fits because the problem has natural BOUNDARIES - exactly k nodes per reversal. We're not reversing the entire list (that would just be standard reversal); we're doing multiple LOCAL reversals with a STOP condition. The structure is recursive: after reversing one group, the 'tail' of that reversed group becomes the starting point for the next group. The boundary check makes this fundamentally different from simple reversal.

Solution

class Solution:
    def reverseKGroup(self, head: ListNode, k: int) -> ListNode:
        # Dummy node simplifies edge cases - acts as a "virtual" previous node
        dummy = ListNode(0, head)
        group_prev = dummy  # Marks the node BEFORE the current group
        
        while True:
            # STEP 1: Find the kth node from group_prev
            # If fewer than k nodes remain, we're done
            kth = self.get_kth_node(group_prev, k)
            if not kth:
                break
            
            # STEP 2: Store the node AFTER this group (will become new "next")
            group_next = kth.next
            
            # STEP 3: Reverse exactly k nodes
            # prev starts at "group_next" (the node after our group)
            # curr starts at the first node of the group to reverse
            prev, curr = group_next, group_prev.next
            while curr != group_next:
                next_temp = curr.next  # Save next before overwriting
                curr.next = prev       # Reverse the link
                prev = curr            # Move prev forward
                curr = next_temp       # Move curr forward
            
            # STEP 4: Reconnect the reversed group to the list
            # group_prev.next was pointing to the old head, now points to new head (kth)
            # kth was the tail during reversal, now becomes the head
            next_head = group_prev.next  # This is now the TAIL after reversal
            group_prev.next = kth        # Connect previous group to new head
            group_prev = next_head       # Move to the tail for next iteration
        
        return dummy.next
    
    def get_kth_node(self, start, k):
        # Traverse k nodes to find the boundary
        current = start
        for _ in range(k):
            if not current:
                return None
            current = current.next
        return current

Complexity

Time: O(n) - We visit each node a constant number of times. Each node is: (1) counted once when finding the kth node, (2) touched once during reversal, and (3) potentially visited during reconnection. The 'while True' loop with the kth check ensures we don't process nodes multiple times.
Space: O(1) - Only using a fixed number of pointers regardless of input size. No recursion, no extra data structures. The dummy node is just for convenience, not extra space proportional to n.

We can't do better than O(n) because every node must be visited at least once to determine grouping and potentially reversed. The O(1) space is achievable because we manipulate links in-place using the 3-pointer technique - we're essentially 'rotating' pointers rather than building new structures.

Common Mistakes

Edge Cases

Connections

Trees (15)

Balanced Binary Tree #110
Bottom-up recursion with early termination (post-order traversal)

Intuition

Think of a balanced tree like a well-designed building - no single column should be dramatically taller than its neighbor, or the structure becomes unstable. The 'balance' here is about equilibrium: at EVERY node in the tree, the left and right subtrees must have heights that differ by at most 1. It's like checking that every floor of a building has roughly equal ceiling heights on both sides. The key insight is that we need to check from the bottom up - if the foundation (leaves) is unstable, nothing above can be stable.

Why This Pattern?

The problem demands checking every subtree's balance AND computing its height simultaneously. Post-order (left, right, node) is perfect because we need information from children before we can make decisions about the parent. The '-1 sentinel pattern' is elegant here: instead of returning both (is_balanced, height), we return height normally, or -1 to signal 'unbalanced' - this single value carries both pieces of information and lets us short-circuit the moment we find an imbalance.

Solution

class Solution:
    def isBalanced(self, root: Optional[TreeNode]) -> bool:
        # Helper returns height if balanced, -1 if unbalanced
        def check(node):
            if not node:
                return 0  # Empty tree has height 0, is balanced
            
            # Recursively check left subtree
            left_height = check(node.left)
            if left_height == -1:
                return -1  # Left subtree already unbalanced, propagate failure up
            
            # Recursively check right subtree  
            right_height = check(node.right)
            if right_height == -1:
                return -1  # Right subtree already unbalanced
            
            # At this point both subtrees are balanced - check current node
            if abs(left_height - right_height) > 1:
                return -1  # Current node violates balance condition
            
            # Return height of current node (max of children + 1 for current level)
            return max(left_height, right_height) + 1
        
        # If check returns -1, tree is unbalanced; otherwise balanced
        return check(root) != -1

Complexity

Time: O(n) - We visit each node exactly once. The key insight is that we DON'T recompute heights at every level (which would be O(n²)). By computing heights bottom-up and returning early on imbalance, we maintain linear time.
Space: O(h) where h is the height of the tree, due to the recursion stack. In the worst case (skewed tree), this is O(n); in a balanced tree, it's O(log n).

We can't do better than O(n) because we must at least examine every node to guarantee balance - a single deep leaf could be the problem. The recursion stack corresponds to the 'call chain' down the tree; we only need to keep track of the path from root to current node, not the entire tree.

Common Mistakes

Edge Cases

Connections

Binary Tree Level Order Traversal #102
BFS (Breadth-First Search) on a tree using a queue, with level-by-level processing

Intuition

Think of this like ripples spreading outward from a stone dropped in water. When you process a tree breadth-first, you're essentially flooding it level by level - all nodes at depth 0 get visited first (the root), then all nodes at depth 1, then depth 2, and so on. This is fundamentally different from depth-first search which goes deep first (like exploring one path fully before backtracking). The key insight: a queue naturally implements this 'wave' behavior because nodes at the current depth are processed before their children get added to the queue.

Why This Pattern?

A queue is FIFO - first in, first out. When we add children to the queue, they wait their turn. By processing exactly the number of nodes that were in the queue at the start of each level, we ensure all nodes at depth d are processed before any node at depth d+1. This 'batch processing' per level is what gives us the level-by-level result.

Solution

from collections import deque

def levelOrder(root):
    if not root:
        return []
    
    result = []
    queue = deque([root])
    
    while queue:
        # Snapshot how many nodes are at THIS level
        level_size = len(queue)
        current_level = []
        
        # Process exactly 'level_size' nodes - all nodes at current depth
        for _ in range(level_size):
            node = queue.popleft()
            current_level.append(node.val)
            
            # Add children for NEXT level (they'll be processed later)
            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)
        
        # Finished this entire level
        result.append(current_level)
    
    return result
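A quick usage check (a minimal TreeNode is defined here only so the snippet runs on its own; LeetCode normally provides it):

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val, self.left, self.right = val, left, right

#      3
#     / \
#    9  20
#       / \
#      15  7
root = TreeNode(3, TreeNode(9), TreeNode(20, TreeNode(15), TreeNode(7)))
print(levelOrder(root))  # [[3], [9, 20], [15, 7]]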

Complexity

Time: O(n)
Space: O(w) for the queue, where w is the maximum width of the tree (up to roughly n/2 at the bottom level of a complete binary tree), plus O(n) for the output list of all node values

Every node is visited exactly once and added to/popped from the queue exactly once - that's O(n). For space, the queue holds at most one full level of nodes. In a complete binary tree, the bottom level has roughly n/2 nodes, so that's our worst-case space. The result also stores n values across h levels.

Common Mistakes

Edge Cases

Connections

Binary Tree Maximum Path Sum #124
Post-order DFS with state propagation. Each recursive call returns the maximum path sum starting from that node and going DOWN (single branch), while updating a global answer for paths that go THROUGH the node (both branches).

Intuition

Think of each node as a junction where energy/power can flow through. A path through a node can either: (1) flow from one child through the node to the other child (an inverted V shape) — this is a complete path but can't extend upward to parents, or (2) flow from the node down through exactly one child (a straight line) — this can be extended upward to contribute to a larger path. It's like designing electrical circuits: at each junction, you either form a closed loop (both branches used) or you pass current upward through exactly one branch. We need BOTH the best single-branch contribution we can make to our parent AND the best two-branch path we can form right here. We process bottom-up so children report their best single-branch contribution first, then we combine them.

Why This Pattern?

The problem has a natural bottom-up dependency: to know what a node can contribute to its parent, we must first know what each child can contribute. The key insight is that a path through a node either uses one child (extensible upward) or both children (forms a complete path at this node, not extensible). This two-part return value (one for extending up, one for the global answer) naturally maps to post-order traversal.

Solution

class Solution:
    def maxPathSum(self, root: Optional[TreeNode]) -> int:
        # Track the global maximum; must start very low since node values can be negative
        self.max_sum = float('-inf')
        
        def dfs(node):
            if not node:
                return 0  # Base case: empty tree contributes nothing
            
            # Recursively get best single-branch contribution from each subtree
            # max(0, ...) means we can OPTIONALLY include a child — if it adds 
            # negativity, we'd rather not include it (path must have at least one node)
            left_gain = max(0, dfs(node.left))
            right_gain = max(0, dfs(node.right))
            
            # Path that goes THROUGH this node: one child -> this node -> other child
            # This forms a complete path but cannot extend upward to parents
            through_node = left_gain + node.val + right_gain
            self.max_sum = max(self.max_sum, through_node)
            
            # Return what we can contribute to our parent: best path going DOWN from here
            # We can only pick ONE child to extend upward (otherwise we'd have a cycle)
            return max(left_gain, right_gain) + node.val
        
        dfs(root)
        return self.max_sum
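A quick check on the classic example (the best path, 15 -> 20 -> 7 = 42, does not pass through the root, illustrating that the answer isn't necessarily at the root). A minimal TreeNode is included only so the snippet stands alone; the usual LeetCode typing imports are assumed:

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val, self.left, self.right = val, left, right

#      -10
#      /  \
#     9    20
#          / \
#        15   7
root = TreeNode(-10, TreeNode(9), TreeNode(20, TreeNode(15), TreeNode(7)))
print(Solution().maxPathSum(root))  # 42  (path 15 -> 20 -> 7)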

Complexity

Time: O(n) where n is the number of nodes. We visit every node exactly once, doing O(1) work per node.
Space: O(h) where h is the height of the tree. This is the recursion stack depth. In the worst case (skewed tree), h = n → O(n); in balanced tree, h = log(n).

We must examine every node to guarantee finding the max path — the path could be arbitrarily located. The space is just the call stack because we only maintain O(1) extra state per recursive call. We can't reduce space without fundamentally changing the algorithm since we need the full recursive descent to compute bottom-up values.

Common Mistakes

Edge Cases

Connections

Binary Tree Right Side View #199
Level-order traversal (BFS) or Depth-first search with right-first ordering

Intuition

Imagine standing on the right side of a binary tree and taking a photograph. What do you see? At each horizontal 'depth level', you see the rightmost node. Here's the key insight: if a right subtree exists at some level, it blocks the left subtree from view at that level. Think of it like a shadow cast from the right - only the rightmost nodes at each depth catch the 'light'. This is equivalent to asking: for each depth value, what's the last node you'd encounter if you scanned that level left-to-right?

Why This Pattern?

The problem has a natural 'layered' structure - we need exactly one node per depth level. BFS naturally processes level-by-level, so the last node at each level is always the rightmost. Alternatively, if using DFS with a right-then-left order, the FIRST time we encounter a new depth, that node is guaranteed to be the rightmost (because we visited the right side first). This is like the 'first appearance' pattern - the first node we see at each depth in right-first traversal must be visible from the right.

Solution

from collections import deque
from typing import Optional, List

# Definition for a binary tree node.
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

class Solution:
    def rightSideView(self, root: Optional[TreeNode]) -> List[int]:
        """
        BFS approach: Process level by level, capture the last node at each level.
        """
        if not root:
            return []
        
        result = []
        queue = deque([root])
        
        while queue:
            level_size = len(queue)  # Number of nodes at current depth
            
            for i in range(level_size):
                node = queue.popleft()
                
                # Last node in this level = rightmost node = visible from right
                if i == level_size - 1:
                    result.append(node.val)
                
                # Add children for next level (left first, then right)
                if node.left:
                    queue.append(node.left)
                if node.right:
                    queue.append(node.right)
        
        return result

# Alternative DFS solution (commented out):
# class Solution:
#     def rightSideView(self, root: Optional[TreeNode]) -> List[int]:
#         result = []
#         
#         def dfs(node, depth):
#             if not node:
#                 return
#             
#             # First time we reach this depth = rightmost node (we go right first!)
#             if depth == len(result):
#                 result.append(node.val)
#             
#             # Visit RIGHT first, then LEFT - this ensures rightmost nodes are seen first
#             dfs(node.right, depth + 1)
#             dfs(node.left, depth + 1)
#         
#         dfs(root, 0)
#         return result
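A quick usage check with a small tree (TreeNode is defined above):

#     1
#    / \
#   2   3
#    \    \
#     5    4
root = TreeNode(1, TreeNode(2, None, TreeNode(5)), TreeNode(3, None, TreeNode(4)))
print(Solution().rightSideView(root))  # [1, 3, 4]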

Complexity

Time: O(n)
Space: O(w) for BFS where w is max width of tree, O(h) for DFS where h is height (recursion stack)

We must visit every node in the tree at least once to determine the rightmost node at each depth. There's no way around this because a node buried deep on the left could theoretically be the rightmost at its depth if no right subtree exists at that level. In the worst case (left-skewed tree), we need to visit all n nodes to find the single visible node per depth.

Common Mistakes

Edge Cases

Connections

Construct Binary Tree from Preorder and Inorder Traversal #105
Recursive divide-and-conquer with index tracking and hashmap lookup

Intuition

Think of this like archaeology at two dig sites. Preorder (root-first) tells you 'here's the family head,' and inorder (left-root-right) tells you 'here's where families split.' The first element in preorder is ALWAYS the root. When you find that root in inorder, everything to its LEFT is the left subtree (came before it), everything to its RIGHT is the right subtree (came after it). Now you know the SIZE of the left subtree from inorder, so in preorder you can slice: after the root, the next N elements belong to left subtree (where N = size of left portion in inorder), and the rest belong to right subtree. It's a recursive family reconstruction: find the patriarch, see where they stand in the family line, then apply to each branch.

Why This Pattern?

The traversals encode structural information positionally: preorder gives root-first ordering, inorder gives the left-root-right split point. Together they uniquely determine the tree structure. We use a hashmap to achieve O(1) root lookup, converting a recursive structure problem into a simple index-manipulation problem.

Solution

class Solution:
    def buildTree(self, preorder: List[int], inorder: List[int]) -> Optional[TreeNode]:
        # Hashmap: value -> its index in inorder (for O(1) lookup)
        inorder_index = {val: i for i, val in enumerate(inorder)}
        
        def build(pre_left, pre_right, in_left, in_right):
            # Base case: empty segment
            if pre_left > pre_right:
                return None
            
            # Root is first element in this preorder segment
            root_val = preorder[pre_left]
            root = TreeNode(root_val)
            
            # Find where root sits in the inorder segment
            in_idx = inorder_index[root_val]
            
            # Number of nodes in left subtree = elements to left of root in inorder
            left_size = in_idx - in_left
            
            # Build left subtree: preorder[pre_left+1 to pre_left+left_size]
            # corresponds to inorder[in_left to in_idx-1]
            root.left = build(pre_left + 1, pre_left + left_size, in_left, in_idx - 1)
            
            # Build right subtree: remaining preorder elements
            root.right = build(pre_left + left_size + 1, pre_right, in_idx + 1, in_right)
            
            return root
        
        return build(0, len(preorder) - 1, 0, len(inorder) - 1)

Complexity

Time: O(n)
Space: O(n) for hashmap + O(h) for recursion stack, where h is tree height

We visit each of the n nodes exactly once. The hashmap gives O(1) lookup to find where each root sits in inorder. The recursion divides the problem in half each time, but the total work across all levels sums to n. We can't do better than O(n) because we must create n TreeNode objects regardless.

Common Mistakes

Edge Cases

Connections

Count Good Nodes in Binary Tree #1448
DFS with path state tracking. We maintain the maximum value encountered along the current root-to-node path as we traverse.

Intuition

Think of this like a 'peak detector' on a mountain range. As you walk from the root down any path, you're tracking the highest elevation seen so far. A node is 'good' if it's a new peak — its value is the highest point reached so far on that path. Like how a mountain climber might say 'I've reached a new summit' when they climb higher than anything before, we're counting nodes that are higher than all their ancestors. We carry the running maximum down each branch, updating it when we find a higher value.

Why This Pattern?

We need to evaluate every root-to-node path and determine if the current node's value is the maximum on that path. DFS naturally explores all paths, and by passing the current maximum as a parameter, we maintain the necessary state without storing entire paths. This is optimal because we must visit every node anyway to check if it's a 'good' node.

Solution

class Solution:
    def goodNodes(self, root: TreeNode) -> int:
        def dfs(node, max_so_far):
            if not node:
                return 0
            
            # A node is good if its value is >= all values on the path from root
            # If current node's value is greater than or equal to max_so_far,
            # then no ancestor has a higher value, making this a "good node"
            count = 1 if node.val >= max_so_far else 0
            
            # Propagate the maximum value seen so far down the tree
            new_max = max(max_so_far, node.val)
            
            # Recurse on both children, carrying forward the updated maximum
            count += dfs(node.left, new_max)
            count += dfs(node.right, new_max)
            
            return count
        
        # Start with root's value as initial maximum
        return dfs(root, root.val)

Complexity

Time: O(n) — We must visit every single node in the tree to determine if it's a good node. There's no way to skip any node because each node's 'goodness' depends only on the path to it, which requires examining that path.
Space: O(h) — The recursion stack depth equals the tree height. In a balanced tree this is O(log n); in a worst-case skewed tree (like a linked list), it's O(n). We only store one integer (the running maximum) per level of recursion.

Think of it like hiking all trails on a mountain. You can't skip any trail segment because you need to traverse it to know what peaks you'll encounter. The space is like your memory of the highest point on your current trail — you only need to remember one number as you go deeper, not a list of everything you've seen.

Common Mistakes

Edge Cases

Connections

Diameter of Binary Tree #543
Post-order DFS with global state tracking. This pattern applies when: (1) you need information from children before computing parent results, (2) the answer isn't necessarily at the root, and (3) you need to track a global maximum while doing local computations at each node.

Intuition

Think of the tree as a network of branches. The diameter is the longest distance between any two leaf nodes in this network. Here's the key insight: the longest path through ANY particular node is simply the height of its left subtree PLUS the height of its right subtree. But here's the subtlety - the diameter might NOT go through the root. It could be lurking in any subtree. So we need to: (1) compute height of each subtree (longest path from that node down to a leaf), (2) at each node, check if the path through this node is the longest we've seen, and (3) pass the height upward to parent nodes. This is like calculating stress at each joint in a structure - the maximum stress (diameter) might occur anywhere, but we can compute it locally at each joint while traversing the structure.

Why This Pattern?

A binary tree diameter is inherently a bottom-up property. The height of a node depends on its children's heights, and the diameter through a node depends on both children's heights. We must process children first (post-order) to have the data we need. The global diameter accumulates as we traverse because we don't know which subtree contains the maximum until we've checked all of them.

Solution

class Solution:
    def diameterOfBinaryTree(self, root: Optional[TreeNode]) -> int:
        # Global tracker for maximum diameter found so far
        self.diameter = 0
        
        def get_height(node):
            # Base case: empty tree has height 0
            if not node:
                return 0
            
            # Post-order: process children BEFORE computing current node
            # Recursively get heights of left and right subtrees
            left_height = get_height(node.left)
            right_height = get_height(node.right)
            
            # The longest path THROUGH this node = left height + right height
            # This represents the distance between deepest leaf in left 
            # and deepest leaf in right, passing through current node
            self.diameter = max(self.diameter, left_height + right_height)
            
            # Return height of current subtree: 1 (current node) + max child height
            return 1 + max(left_height, right_height)
        
        get_height(root)
        return self.diameter

Complexity

Time: O(n) - We visit every node exactly once. Each node requires O(1) work: two recursive calls and some max comparisons. There's no way to do better because the diameter could involve any node - we'd have to examine all nodes to be sure.
Space: O(h) where h is the height of the tree. This is the recursion stack depth. In the worst case (skewed tree), h = n giving O(n); in balanced tree, h = log(n) giving O(log n). We don't use extra space proportional to n because we only store one value per stack frame.

Time can't be less than O(n) because you must examine every node - the diameter could be hiding in any subtree and you need information from all nodes to find it. Space is O(h) because at any moment, you're only holding the path from root to current node in the call stack - you don't need to remember anything about branches you've already finished processing.

Common Mistakes

Edge Cases

Connections

Invert Binary Tree #226
Depth-First Search (DFS) with recursion - specifically post-order traversal where we process children before the parent. This is a divide-and-conquer approach.

Intuition

Think of this like reflecting a binary tree in a vertical mirror. Every left child becomes a right child and vice versa - it's a horizontal flip. At each node, you're simply swapping the 'direction' of the signal going left vs right. The entire tree is just a collection of these local swaps applied recursively to every subtree.

Why This Pattern?

Binary trees are inherently recursive structures - each subtree IS itself a binary tree. To invert the whole tree, you can: 1) invert the left subtree, 2) invert the right subtree, 3) swap the two subtrees at the current node. This decomposes one big problem into identical smaller problems until you hit the base case (null node).

Solution

class Solution:
    def invertTree(self, root):
        # Base case: empty tree or leaf node's child (null)
        if not root:
            return None
        
        # Swap left and right children at current node
        # This is the key operation - we're 'flipping' this node's children
        root.left, root.right = root.right, root.left
        
        # Recursively invert both subtrees
        # The recursion handles all descendants - we just swap at each level
        self.invertTree(root.left)
        self.invertTree(root.right)
        
        return root

Complexity

Time: O(n) where n is the number of nodes
Space: O(h) where h is the height of the tree (worst case O(n) for skewed trees, O(log n) for balanced)

We must visit every single node to swap its children - there's no way around this because every node's position changes relative to its parent. For space: the recursion stack depth equals tree height. A balanced tree with log n levels needs only log n stack frames, but a completely skewed tree (like a linked list) needs n stack frames.

Common Mistakes

Edge Cases

Connections

Kth Smallest Element in a BST #230
In-order traversal with early termination

Intuition

Think of a BST as a sorted array that's been 'folded' into a tree shape. The BST property (left subtree < root < right subtree) means if you read it in the right order, you get sorted numbers. That's exactly what in-order traversal does: left -> root -> right. It's like unwrapping a folded sort - you're just reading the tree in its natural sorted order and stopping at position k. The kth node you encounter IS the kth smallest.

Why This Pattern?

The BST property guarantees that in-order traversal visits nodes in ascending sorted order. We don't need to sort anything or collect all nodes - we just need to 'read' the tree correctly and stop early. The iterative approach simulates the recursive call stack but lets us exit as soon as we find the answer, avoiding unnecessary exploration of the right subtree.

Solution

def kthSmallest(root, k):
    # Iterative in-order traversal - stops as soon as we find kth element
    stack = []
    current = root
    count = 0
    
    while current or stack:
        # Go all the way left - like descending a ladder
        while current:
            stack.append(current)
            current = current.left
        
        # Process current node (the next smallest)
        current = stack.pop()
        count += 1
        
        if count == k:
            return current.val
        
        # Move to right subtree to continue
        current = current.right
    
    return None  # Edge case: invalid k
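A quick usage check (a minimal TreeNode is defined here only so the snippet runs on its own; LeetCode normally provides it):

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val, self.left, self.right = val, left, right

#       3
#      / \
#     1   4
#      \
#       2
root = TreeNode(3, TreeNode(1, None, TreeNode(2)), TreeNode(4))
print(kthSmallest(root, 1))  # 1
print(kthSmallest(root, 3))  # 3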

Complexity

Time: O(H + k) where H is tree height
Space: O(H) for the stack, where H is tree height

We descend H levels to reach the leftmost (smallest) element, then process exactly k nodes. In a balanced BST, H = log(n), so O(log n + k). In the worst case (completely skewed tree like a linked list), H = n, so O(n). The key insight: we never visit more nodes than necessary - we stop at k, not at the entire tree.

Common Mistakes

Edge Cases

Connections

Lowest Common Ancestor of a BST #235
Binary Search on Tree (using BST property to prune half the search space at each step)

Intuition

Think of a BST as a sorted hierarchy - like an org chart sorted by employee ID. The two nodes p and q each have a 'path' from the root. The LCA is where these two paths first meet going upward. Here's the key insight: as you traverse from the root, you're essentially asking 'are both nodes to the left of me, or both to the right?' If they're both on one side, you know the LCA must be in that subtree. The moment one node is on the left and one is on the right, you've found the divergence point - that's your LCA because p and q's lowest common ancestor must be an ancestor of both, and this node is the deepest one that satisfies that.

Why This Pattern?

The BST property (left < node < right) gives us directional information. Unlike a generic tree where we'd need to find paths first and compare them, here we can use the sorted values to directly navigate to the LCA. Each step eliminates half the tree - this is the 'search' aspect. We're not searching for a single target, but rather searching for the point where two targets 'diverge' in their direction from the root.

Solution

def lowestCommonAncestor(self, root: 'TreeNode', p: 'TreeNode', q: 'TreeNode') -> 'TreeNode':
    # Start at root, navigate down using BST property
    current = root
    
    while current:
        # Both nodes are in the left subtree - LCA must be in left subtree
        if p.val < current.val and q.val < current.val:
            current = current.left
        # Both nodes are in the right subtree - LCA must be in right subtree  
        elif p.val > current.val and q.val > current.val:
            current = current.right
        else:
            # One on left, one on right (or one IS current) - this is the divergence point
            return current
    
    return current  # Should never reach here with valid BST and nodes

Complexity

Time: O(h) where h is height of tree
Space: O(1) - only using a pointer, no extra space

We traverse only one path from root to LCA - at most h nodes. In the worst case (skewed tree), h = n giving O(n), but in a balanced BST it's O(log n). We never visit a node twice because the BST property tells us exactly which direction to go.

Common Mistakes

Edge Cases

Connections

Maximum Depth of Binary Tree #104
DFS (Depth-First Search) with recursion / divide and conquer

Intuition

Think of this like measuring how tall a tree grows — from the trunk (root) down to the farthest leaf. You're essentially asking: 'How many levels of branches are there, counting from the top?' Imagine a signal propagating downward from the root — the deepest leaf is where the signal takes the longest path. Each node you traverse adds 1 to your count, and you want the maximum path length.

Why This Pattern?

Trees are recursive structures — each node's children are themselves (smaller) trees. This means we can solve the problem by: (1) finding the max depth of the left subtree, (2) finding the max depth of the right subtree, (3) taking the maximum and adding 1 for the current node. This is the natural fit because the answer for a tree depends exactly on the answers for its subtrees.

Solution

def maxDepth(root):
    # Base case: empty tree has depth 0
    if not root:
        return 0
    
    # Recursively get depth of left and right subtrees
    left_depth = maxDepth(root.left)
    right_depth = maxDepth(root.right)
    
    # The depth of current node = deeper child + 1 (for current node)
    return max(left_depth, right_depth) + 1

Complexity

Time: O(n)
Space: O(h) where h is the height of the tree (recursion stack depth)

We must visit every node at least once to know the maximum depth — you can't determine depth without checking all paths. The space is the recursion stack, which goes as deep as the tree height. In a balanced tree this is O(log n), in a completely skewed tree it's O(n).

Common Mistakes

Edge Cases

Connections

Same Tree #100
Depth-First Search (DFS) / Structural Recursion

Intuition

Imagine you're comparing two trees in a forest - you need to check if every branch, twig, and leaf is in exactly the same position. Two trees are identical if: (1) they're both empty, OR (2) they both have roots with the same value AND their left branches are identical AND their right branches are identical. It's like the mathematical definition of tree equality - structural isomorphism plus value matching. Think of it as a recursive mirror: you're checking 'are these two subtrees the same?' at every level.

Why This Pattern?

Trees are inherently recursive data structures - a tree is made of a root plus a left subtree and a right subtree. The most natural way to compare them is to break the problem down recursively: 'Are tree A and tree B identical?' becomes 'Are root values equal AND (are left subtrees identical AND are right subtrees identical)?' This mirrors how trees are defined, making recursion the most elegant solution.

Solution

def isSameTree(p, q):
    # Base case: both nodes are None - we've reached identical leaves
    if not p and not q:
        return True
    
    # Structural mismatch: one tree has a node where the other doesn't
    if not p or not q:
        return False
    
    # Check current node values AND recursively check both subtrees
    # Both must be true for trees to be identical
    return (p.val == q.val) and isSameTree(p.left, q.left) and isSameTree(p.right, q.right)

Complexity

Time: O(min(n, m)) where n and m are node counts in the two trees. In the worst case (identical trees), we must visit every node to confirm equality - we can't know they're the same without checking all positions.
Space: O(min(h1, h2)) where h1 and h2 are the heights of the two trees. This is the recursive call stack depth, which equals the depth of the shorter/more shallow tree.

For time: imagine checking two fingerprints - you must examine every ridge at every position to confirm a match. For space: recursion is like walking through the trees branch by branch simultaneously - you only need to 'remember' the path from root to your current position, not the entire tree.

Common Mistakes

Edge Cases

Connections

Serialize and Deserialize Binary Tree #297
Pre-order depth-first traversal with null sentinel markers

Intuition

Think of serialization like creating a shipping manifest for a fractal structure. A binary tree has 'gaps' where branches don't exist — you need to record both what's there AND what's missing. The key insight: if you do a PRE-ORDER traversal (root, then left subtree, then right subtree), you get all the information in the right order. When you hit a null marker (#), you know exactly how far to 'rewind' to find where the next branch connects. It's like a recipe: 'Take the main ingredient, then here's how to make the left side dish, then the right side dish.' The root anchors everything — once you know the root, you know the left subtree comes next, then the right.

Why This Pattern?

Pre-order gives us the root first, which is the anchor for the entire structure. Combined with null markers (#), it creates an unambiguous encoding: when we encounter a null, we know we've finished processing that subtree and can 'bubble up' to attach the next subtree. It's like a pushdown automaton — the nulls tell us when to pop back up the recursion stack. BFS/level-order also works but requires a more complex queue structure; pre-order with recursion is the most natural fit.

Solution

class Codec:
    def serialize(self, root):
        """Encodes a tree to a single string using pre-order traversal."""
        def preorder(node):
            if not node:
                return ['#']  # Sentinel for null — marks "no child here"
            # Convert to string for joining, pre-order: root -> left -> right
            return [str(node.val)] + preorder(node.left) + preorder(node.right)
        
        return ','.join(preorder(root))
    
    def deserialize(self, data):
        """Decodes the string back to a binary tree."""
        def build():
            val = next(values)
            if val == '#':
                return None  # This spot is empty, bubble up
            # Create node, recursively build its left and right subtrees
            node = TreeNode(int(val))
            node.left = build()   # Everything after root's val until # is left subtree
            node.right = build()  # Everything after left subtree is right subtree
            return node
        
        values = iter(data.split(','))
        return build()
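The level-order (BFS) alternative mentioned in 'Why This Pattern?' above, as a rough sketch rather than the primary solution. It assumes the usual LeetCode TreeNode; the null placeholders play the same role as '#' in the pre-order version:

from collections import deque

class CodecBFS:
    def serialize(self, root):
        if not root:
            return ''
        out, queue = [], deque([root])
        while queue:
            node = queue.popleft()
            if node:
                out.append(str(node.val))
                queue.append(node.left)   # children (possibly None) keep their slots
                queue.append(node.right)
            else:
                out.append('#')           # placeholder for a missing child
        return ','.join(out)

    def deserialize(self, data):
        if not data:
            return None
        values = data.split(',')
        root = TreeNode(int(values[0]))
        queue, i = deque([root]), 1
        while queue:
            node = queue.popleft()
            if values[i] != '#':          # attach left child if present
                node.left = TreeNode(int(values[i]))
                queue.append(node.left)
            i += 1
            if values[i] != '#':          # attach right child if present
                node.right = TreeNode(int(values[i]))
                queue.append(node.right)
            i += 1
        return root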

Complexity

Time: O(n) where n is the number of nodes
Space: O(n) for the serialized string and recursion stack

We must visit every node once to serialize it (can't skip nodes — we need their values) and once to deserialize it (can't reconstruct without reading all values). The serialization string contains n values plus n+1 null markers, so O(n) space is unavoidable. The recursion depth equals tree height, which is O(n) worst-case for skewed trees but O(log n) for balanced ones.

Common Mistakes

Edge Cases

Connections

Subtree of Another Tree #572
Recursive tree traversal with equality checking (the 'check everywhere' pattern).

Intuition

Think of this like finding a pattern in a larger structure. Imagine you're looking for a specific subtree shape within a bigger tree - it's like pattern matching in a hierarchy. You can't know ahead of time where t might 'start' within s, so you have to check EVERY node in s as a potential root. At each node, you ask: 'Does the tree rooted here match tree t exactly?' If yes, you found it. If not, keep looking in the left and right branches. The key insight: a subtree must have identical STRUCTURE (shape) AND values - not just matching values in isolation.

Why This Pattern?

Since we don't know where t might be positioned in s, we must attempt a match at every node. This is a classic exhaustive search over potential starting positions, combined with a structural equality check at each position.

Solution

class Solution:
    def isSubtree(self, s: TreeNode, t: TreeNode) -> bool:
        # Base case: empty t is subtree of anything (vacuously true)
        if not t:
            return True
        # If t exists but s is exhausted, no match possible
        if not s:
            return False
        
        # Check if current node in s could be root of t
        if self.isSameTree(s, t):
            return True
        
        # Otherwise, recurse on both subtrees - t could be anywhere
        return self.isSubtree(s.left, t) or self.isSubtree(s.right, t)
    
    def isSameTree(self, s: TreeNode, t: TreeNode) -> bool:
        # Both trees exhausted - identical so far
        if not s and not t:
            return True
        # One tree exhausted, other not - structure differs
        if not s or not t:
            return False
        
        # Check root value, then recurse on both children
        # Structure must match exactly (both left children, both right children)
        return (s.val == t.val and 
                self.isSameTree(s.left, t.left) and 
                self.isSameTree(s.right, t.right))

Complexity

Time: O(n * m) in worst case, where n = nodes in s, m = nodes in t. For each of n potential starting positions, we may compare up to m nodes.
Space: O(h) in the worst case for the recursion stack, where h is the height of s (plus up to the height of t while a comparison is in progress). In a skewed tree (essentially a linked list), depth can reach n; in a balanced tree, depth is O(log n).

We must potentially try every node in s as a root candidate. At each try, we might need to traverse the entire subtree t to verify equality. There's no way to skip comparisons because tree structure doesn't have the predictable ordering that would let us rule out regions quickly (unlike BSTs where left/right tells you which branch to take).

Common Mistakes

Edge Cases

Connections

Validate Binary Search Tree #98
Tree traversal with constraint propagation (bounded recursion)

Intuition

Think of a BST like a distribution center with strict ordering rules. Every piece of mail going left must be 'less than' the current location, every piece going right must be 'greater than'. The key insight that trips people up: it's not just about immediate children — ALL left descendants must be less than the parent, and ALL right descendants must be greater. A common mistake is only checking node.left < node < node.right, which misses cases where a grandchild violates the rule. The solution is to carry valid bounds DOWN the tree like a pass-down rule: 'anything in this subtree must be between X and Y.'

Why This Pattern?

Each node needs to know what valid range it lives in. The left subtree inherits 'everything must be less than current value' as an upper bound, the right subtree inherits 'everything must be greater than current value' as a lower bound. This creates a natural recursive structure where constraints tighten as we go deeper.

Solution

def isValidBST(self, root):
    def validate(node, low, high):
        # Empty trees are valid BSTs (base case)
        if not node:
            return True
        
        # Current node must be strictly within bounds
        # Using <= and >= catches duplicates that would break BST property
        if node.val <= low or node.val >= high:
            return False
        
        # Validate left subtree with tighter upper bound (current value)
        # Validate right subtree with tighter lower bound (current value)
        return validate(node.left, low, node.val) and validate(node.right, node.val, high)
    
    return validate(root, float('-inf'), float('inf'))

Complexity

Time: O(n)
Space: O(h) where h is tree height (worst case O(n) for skewed tree, best case O(log n) for balanced)

We must visit every node to confirm the BST property holds everywhere — you can't determine validity without checking all values. Space is the recursion depth, which equals tree height: a balanced tree has log n levels, a completely skewed tree has n levels (like a linked list).

Common Mistakes

Edge Cases

Connections

Tries (3)

Design Add and Search Words Data Structure #211
Trie (Prefix Tree) with DFS backtracking for wildcard search

Intuition

Think of this like organizing words in a physical dictionary where each page represents a letter. If you're looking for 'cat', you go to the 'c' section, then 'a', then 't'. Now imagine some search queries have wildcards - it's like someone handing you a mask that covers one letter, and you have to check ALL possible letters that could be under that mask. This is exactly what a Trie does: it builds a tree where each path from root to leaf spells a word, and the wildcard search just means 'try every possible branch at this point'.

Why This Pattern?

Words have a natural hierarchical structure based on their prefixes. A trie exploits this by sharing common prefixes. The wildcard '.' character requires exploring ALL possible children at that position - this is a classic tree traversal problem where DFS naturally explores all branches. The tree structure makes backtracking straightforward: when one path fails, we automatically return to try other branches.

Solution

class TrieNode:
    def __init__(self):
        self.children = {}  # char -> TrieNode mapping
        self.is_end = False  # marks if this node completes a word

class WordDictionary:
    def __init__(self):
        self.root = TrieNode()

    def addWord(self, word: str) -> None:
        """Insert word into trie - create nodes as needed, mark end."""
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_end = True  # mark this as end of a valid word

    def search(self, word: str) -> bool:
        """Search with DFS/backtracking. '.' means try ALL children."""
        def dfs(index, node):
            # Base case: processed all characters
            if index == len(word):
                return node.is_end
            
            char = word[index]
            
            if char == '.':
                # Wildcard: try EVERY possible child branch
                for child in node.children.values():
                    if dfs(index + 1, child):
                        return True
                return False  # no children led to a match
            else:
                # Specific character: must exist in children
                if char not in node.children:
                    return False
                return dfs(index + 1, node.children[char])
        
        return dfs(0, self.root)
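A quick usage check (the standard example from the problem statement):

wd = WordDictionary()
wd.addWord("bad")
wd.addWord("dad")
wd.addWord("mad")
print(wd.search("pad"))  # False
print(wd.search("bad"))  # True
print(wd.search(".ad"))  # True  - '.' matches 'b', 'd', or 'm'
print(wd.search("b.."))  # True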

Complexity

Time: O(L) for addWord where L is word length. For search: O(L) for exact match, but O(26^L) worst case when search string is all wildcards '.' because we may need to explore every branch of the tree.
Space: O(N * L) where N is number of words and L is average word length - each character needs a node. For search: O(L) recursion stack depth.

Adding a word is like walking down a path - you visit each character once, so that's O(L). Searching is like exploring a maze: if you know the exact letters, you take one path (O(L)). But with wildcards, at each '.' you might have to try up to 26 different directions (alphabet size), creating exponential exploration in the worst case. The space is the physical 'filing cabinet' you build to store all words - each character needs its own folder/node.

Common Mistakes

Edge Cases

Connections

Implement Trie (Prefix Tree) #208
Trie (Prefix Tree) with hashmap children

Intuition

Imagine a filing cabinet where you organize words by their first letter, then within each drawer you organize by second letter, and so on. That's essentially a Trie. The root is the cabinet, each branch is a letter, and when you reach the end of a word, you put a flag there saying 'this is a complete word, not just a prefix.' It's like a tree that branches more and more as letters diverge - 'apple' and 'apply' share 'app' on the same branch, then split at 'l' vs 'y'. This structure naturally groups all words starting with 'app' together, which is why prefix searches are so fast.

Why This Pattern?

A Trie exploits the fact that words share prefixes. Each node represents a prefix, and paths from root to any node represent valid prefixes. The tree structure naturally clusters related words together, making prefix operations O(m) where m is prefix length - you just follow the branches. This is fundamentally different from hash-based approaches which need to examine the entire word.

Solution

class TrieNode:
    def __init__(self):
        self.children = {}  # char -> TrieNode
        self.is_end = False  # marks complete word

class Trie:
    def __init__(self):
        self.root = TrieNode()
    
    def insert(self, word: str) -> None:
        # Walk down the tree, creating nodes as needed
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_end = True  # mark the end of this word
    
    def search(self, word: str) -> bool:
        # Find the node for this word, then check if it's an endpoint
        node = self._find_node(word)
        return node is not None and node.is_end
    
    def startsWith(self, prefix: str) -> bool:
        # Just need to find the node - existence means prefix exists
        return self._find_node(prefix) is not None
    
    def _find_node(self, prefix: str) -> TrieNode:
        # Helper to traverse the trie and return the final node
        node = self.root
        for char in prefix:
            if char not in node.children:
                return None
            node = node.children[char]
        return node
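A quick usage check, mirroring the 'apple' / 'app' distinction above:

trie = Trie()
trie.insert("apple")
print(trie.search("apple"))    # True  - complete word
print(trie.search("app"))      # False - only a prefix so far
print(trie.startsWith("app"))  # True  - some word starts with "app"
trie.insert("app")
print(trie.search("app"))      # True  - now also a complete word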

Complexity

Time: O(m) where m is the length of the word/prefix
Space: O(1) extra per operation; total storage is O(n * m) nodes in the worst case (no shared prefixes), where n is the number of words and m is the average word length. (An array-based node would additionally reserve ALPHABET_SIZE child slots per node.)

For any operation, you touch exactly one node per character - you can't skip letters because each branch IS a letter. Insertion must create nodes for new prefixes, but search/startsWith just follows existing branches. The space is proportional to total unique prefixes stored, which is bounded by the total characters across all inserted words.

Common Mistakes

Edge Cases

Connections

Word Search II #212
Trie + Backtracking (DFS)

Intuition

Imagine you're searching for words in a crossword puzzle. Instead of taking each word from your list and individually hunting for it on the grid (slow!), you first memorize ALL the words into a prefix tree (Trie). Then, as you explore the grid letter-by-letter, you can quickly check 'does this path match any word I'm looking for?' The Trie acts like a routing table — at each cell, you ask 'can I continue down a valid word path?' If the current letters don't match any prefix in your dictionary, stop exploring that branch immediately (pruning). This turns a potentially expensive O(words × board) search into a single traversal where we check all words simultaneously.

Why This Pattern?

The problem requires finding multiple words in a single grid. A Trie enables O(1) character lookup per step (checking if current prefix exists in any word), while DFS explores all possible paths from each starting cell. The Trie structure naturally supports pruning: once a path diverges from all word prefixes, we backtrack. This combination is the classic solution because we're essentially doing one unified search for ALL words at once, rather than N separate searches.

Solution

class TrieNode:
    def __init__(self):
        self.children = {}
        self.word = None  # stores the complete word if this is end of a word

class Solution:
    def findWords(self, board: List[List[str]], words: List[str]) -> List[str]:
        # Build Trie from all words
        root = TrieNode()
        for word in words:
            node = root
            for char in word:
                if char not in node.children:
                    node.children[char] = TrieNode()
                node = node.children[char]
            node.word = word  # mark end of word
        
        result = []
        rows, cols = len(board), len(board[0])
        
        def dfs(r, c, node):
            char = board[r][c]
            # If current cell not in Trie, stop (prune dead branch)
            if char not in node.children:
                return
            
            next_node = node.children[char]
            # If we've reached a complete word, add to result
            if next_node.word:
                result.append(next_node.word)
                next_node.word = None  # prevent duplicates
            
            # Mark as visited by temporarily replacing with '#'
            board[r][c] = '#'
            
            # Explore all 4 directions
            for dr, dc in [(0,1), (0,-1), (1,0), (-1,0)]:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dfs(nr, nc, next_node)
            
            # Restore cell (backtrack)
            board[r][c] = char
        
        # Start DFS from every cell
        for r in range(rows):
            for c in range(cols):
                dfs(r, c, root)
        
        return result
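A quick usage check on the classic board (assuming the usual LeetCode scaffolding for the List import); result order may vary with exploration order:

board = [["o","a","a","n"],
         ["e","t","a","e"],
         ["i","h","k","r"],
         ["i","f","l","v"]]
words = ["oath", "pea", "eat", "rain"]
print(Solution().findWords(board, words))  # ['oath', 'eat'] (order may vary)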

Complexity

Time: O(M × N × 4^L) where M×N is board size and L is max word length
Space: O(total characters in all words) for Trie + O(L) for DFS recursion stack

In the worst case, we explore every cell and from each cell try all 4 directions to depth L (longest word). However, the Trie dramatically prunes this: for each step, if the character isn't in any word's prefix, we stop immediately. The actual runtime is closer to O(M×N) in practice because dead branches are cut short. We also visit each cell at most once per word found, and each cell's character is processed once per DFS call.

Common Mistakes

Edge Cases

Connections

Heap / Priority Queue (7)

Design Twitter #355
Merge K Sorted Streams with Heap (Top-K Selection)

Intuition

Think of this like a news aggregator merging multiple feeds. Each user has their own feed (a stream of tweets sorted by time). When you want your news feed, you need to merge K+1 sorted streams (your own feed + everyone you follow) and pick the top 10. This is exactly like having K+1 sorted lists and finding the top-k elements — the classic 'merge k sorted arrays' problem. A max-heap is perfect here: we keep one pointer into each feed, the heap tells us which pointer has the most recent tweet, we take it, then advance that pointer. We repeat until we have 10 tweets or run out.

Why This Pattern?

Each user's tweet history is naturally sorted by timestamp (every new tweet gets a strictly larger timestamp, so reading a history newest-first gives a sorted stream). To get the global top-10 most recent, we need to merge K+1 sorted sequences. A max-heap gives us O(log K) access to the 'current maximum' among all streams, making this the optimal pattern. We don't need to sort everything — just find top-10, which the heap handles elegantly.

Solution

import heapq

class Tweet:
    def __init__(self, id: int, time: int):
        self.id = id
        self.time = time
    
    def __lt__(self, other):
        # For max-heap: we want larger time to be "smaller" in heap ordering
        return self.time > other.time

class Twitter:
    def __init__(self):
        self.tweets = {}          # userId -> list of Tweets in posting order (oldest first)
        self.follows = {}         # userId -> set of followeeIds
        self.time = 0             # global timestamp (monotonically increasing)

    def postTweet(self, userId: int, tweetId: int) -> None:
        self.time += 1
        if userId not in self.tweets:
            self.tweets[userId] = []
        # Append in posting order (oldest first); getNewsFeed reads each history in reverse
        self.tweets[userId].append(Tweet(tweetId, self.time))

    def follow(self, followerId: int, followeeId: int) -> None:
        if followerId == followeeId:
            return  # Can't follow yourself (per problem constraints)
        if followerId not in self.follows:
            self.follows[followerId] = set()
        self.follows[followerId].add(followeeId)

    def unfollow(self, followerId: int, followeeId: int) -> None:
        if followerId in self.follows:
            self.follows[followerId].discard(followeeId)

    def getNewsFeed(self, userId: int) -> List[int]:
        # Build list of tweet sources: user's own tweets + everyone they follow
        sources = []
        
        # Include self
        if userId in self.tweets:
            sources.append(reversed(self.tweets[userId]))  # iterate newest tweet first
        
        # Include follows
        if userId in self.follows:
            for followee in self.follows[userId]:
                if followee in self.tweets:
                    sources.append(reversed(self.tweets[followee]))  # iterate newest tweet first
        
        # Max-heap to get most recent tweet across all sources
        heap = []
        result = []
        
        # Initialize heap with first tweet from each source
        for source in sources:
            tweet = next(source, None)
            if tweet:
                heapq.heappush(heap, (tweet, source))  # (tweet, iterator)
        
        # Extract top 10
        while heap and len(result) < 10:
            tweet, source = heapq.heappop(heap)
            result.append(tweet.id)
            # Get next tweet from this source
            next_tweet = next(source, None)
            if next_tweet:
                heapq.heappush(heap, (next_tweet, source))
        
        return result
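A quick usage check (the standard example from the problem statement); outputs assume the implementation above:

twitter = Twitter()
twitter.postTweet(1, 5)
print(twitter.getNewsFeed(1))  # [5]
twitter.follow(1, 2)
twitter.postTweet(2, 6)
print(twitter.getNewsFeed(1))  # [6, 5] - most recent first
twitter.unfollow(1, 2)
print(twitter.getNewsFeed(1))  # [5]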

Complexity

Time: O((F + 10) log F) per getNewsFeed, where F = number of feeds merged (the user plus everyone they follow). We push at most F+1 initial tweets onto the heap and then pop at most 10, pushing one replacement after each pop. In the worst case F = O(N), where N is the total number of users.
Space: O(N + T) = O(total users + total tweets). We store all follows as sets and all tweets. getNewsFeed uses O(F) extra for the heap.

We can't do better than O(F) per getNewsFeed in the worst case because you might follow everyone, and each followed feed has to be considered at least once. But the heap saves us from sorting all the tweets — we do at most 10 pops, and since 10 is a constant (the problem caps the feed at 10 tweets), the cost is dominated by seeding the heap with one tweet per source. Storing all tweets is necessary because we need them for future feed requests.

Common Mistakes

Edge Cases

Connections

Find Median from Data Stream #295
Two-Heap / Dual Priority Queue Pattern

Intuition

Think of median as finding the 'balance point' on a number line where half your data sits on each side. The most elegant way to maintain this split is with two heaps acting like opposing forces reaching equilibrium. A max-heap holds all the smaller numbers (the left side of the median), and a min-heap holds all the larger numbers (the right side). The magic: the top of the max-heap is the largest of the small numbers, and the top of the min-heap is the smallest of the large numbers — exactly the two values you need to compute the median! It's like a scale trying to stay balanced: whenever one side gets too heavy, you rebalance.

Why This Pattern?

This problem has an inherent 'split in the middle' structure — we need quick access to the boundary elements on either side of the median. Heaps give us O(1) access to the extremes (max of left half, min of right half) while maintaining sorted order in O(log n) for insertions. This is the classic 'complementary heaps' pattern where one heap stores the lower half and the other stores the upper half, with size balancing ensuring we always know which heap contains the median.

Solution

import heapq

class MedianFinder:
    def __init__(self):
        # max-heap for the smaller half (store negatives since Python only has min-heap)
        self.small = []
        # min-heap for the larger half
        self.large = []
    
    def addNum(self, num: int) -> None:
        # Step 1: Add to max-heap (small half)
        heapq.heappush(self.small, -num)
        
        # Step 2: Balance - ensure every element in small <= every element in large
        # The largest of small (-self.small[0]) should be <= smallest of large
        if self.small and self.large and -self.small[0] > self.large[0]:
            # Move the problematic element to large
            val = -heapq.heappop(self.small)
            heapq.heappush(self.large, val)
        
        # Step 3: Size balancing - keep small either equal to large or one element larger
        # This ensures we know which heap contains the median for odd counts
        if len(self.small) > len(self.large) + 1:
            val = -heapq.heappop(self.small)
            heapq.heappush(self.large, val)
        elif len(self.large) > len(self.small):
            val = heapq.heappop(self.large)
            heapq.heappush(self.small, -val)
    
    def findMedian(self) -> float:
        if len(self.small) > len(self.large):
            # Odd total: median is the top of the larger heap (small)
            return float(-self.small[0])
        else:
            # Even total: average of both tops
            return (-self.small[0] + self.large[0]) / 2.0
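A quick usage check:

mf = MedianFinder()
mf.addNum(1)
mf.addNum(2)
print(mf.findMedian())  # 1.5
mf.addNum(3)
print(mf.findMedian())  # 2.0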

Complexity

Time: O(log n) for addNum (heap push/pop), O(1) for findMedian
Space: O(n) — we store all elements in the two heaps

Adding a number requires heap operations which take O(log n) because we might need to bubble up/down through the tree. The heap maintains its heap property in logarithmic time. Finding the median is O(1) because we just peek at the tops of both heaps — no computation needed, just accessing two values. We can't do better than O(log n) for insertion because any algorithm that maintains sorted order (which we need to find the median) must touch at least log n elements in the worst case — that's the information-theoretic lower bound for comparison-based sorting.

Common Mistakes

Edge Cases

Connections

K Closest Points to Origin #973
Max Heap / Priority Queue (maintain k-smallest elements)

Intuition

Imagine you're at a party and you want to find the k people closest to you. You could measure everyone's distance, sort everyone by how close they are, then pick the first k. But that's overkill — you only care about the k closest, not the order beyond that. Instead, think of it like a competition: you have k 'slots' for the closest people. As you meet new people, if someone's closer than your current farthest person in the group, they push that person out. This is exactly what a max-heap does: it gives you fast access to the 'worst' element in your current set, letting you swap it out when you find something better.

Why This Pattern?

The key insight is that we only need to track k elements, not all n. A max-heap of size k gives us O(1) access to the current 'worst' (farthest) point among our k closest. When we see a new point closer than that worst one, we swap them in O(log k) time. This beats sorting all n points because we do only O(log k) work per point instead of the O(log n) per element a full sort costs, giving O(n log k) vs O(n log n).

Solution

import heapq

def kClosest(points, k):
    # We want to keep the k CLOSEST points, which means we need
    # a max-heap to quickly find/replace the FARTHEST among our k.
    # Python's heapq is a min-heap, so we negate distances.
    
    max_heap = []  # stores (-distance, x, y)
    
    for x, y in points:
        dist = x*x + y*y  # squared distance — sqrt not needed for comparison
        
        # Push this point (negate dist for max-heap behavior)
        heapq.heappush(max_heap, (-dist, x, y))
        
        # If we have more than k points, remove the farthest
        if len(max_heap) > k:
            heapq.heappop(max_heap)
    
    # Extract the points from heap (ignore the negated distance)
    return [[x, y] for (_, x, y) in max_heap]
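
A tiny example (values chosen here, not from the original write-up): with k = 1, only the single closest point survives in the heap.

print(kClosest([[1, 3], [-2, 2]], 1))  # [[-2, 2]] since squared distance 8 beats 10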

Complexity

Time: O(n log k)
Space: O(k)

We process each of the n points once. For each point, we do a heap push (log k) and possibly a heap pop (log k), so O(log k) per point = O(n log k). We only store k points in the heap at any time, so O(k) space. We must examine every point to know which k are closest, so Ω(n) time is unavoidable; a quickselect-style partition can reach O(n) on average, but the heap approach keeps memory at O(k) and works even when the points arrive one at a time.

Common Mistakes

Edge Cases

Connections

Kth Largest Element in a Stream #703
Heap as a sliding window / maintain k largest elements

Intuition

Think of this like maintaining a 'water level' in a lake. The kth largest element is like the k-th highest point. If you're at a party and someone asks 'who's the 3rd tallest person?' you don't need to know everyone's height - you just track the top 3. When someone new arrives, you compare them to your top 3 and update if needed. A min-heap of size k does exactly this! It keeps the k largest elements, and the top of the heap (minimum of the top k) is our kth largest answer.

Why This Pattern?

We only need the kth largest at any moment, so we don't need to store the entire stream. A min-heap of size k naturally gives us O(1) access to the smallest of our top k elements (which is the kth largest overall). Every new element either gets ignored if smaller than our kth largest, or kicks it out and becomes a new candidate.

Solution

import heapq
from typing import List

class KthLargest:
    def __init__(self, k: int, nums: List[int]):
        self.k = k
        # Use min-heap of size k - stores the k largest elements
        # heap[0] will be the SMALLEST among these k largest = kth largest overall
        self.heap = []
        
        # Feed each initial number through add() so the heap never
        # grows beyond k elements (add pops the smallest when it does)
        for num in nums:
            self.add(num)
    
    def add(self, val: int) -> int:
        # Add new value to heap
        heapq.heappush(self.heap, val)
        
        # If we have more than k elements, remove smallest
        # (that's the (k+1)th largest, not needed)
        if len(self.heap) > self.k:
            heapq.heappop(self.heap)
        
        # heap[0] is always the kth largest
        return self.heap[0]
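
A short usage sketch (my own example values, traced against the code above):

kth = KthLargest(3, [4, 5, 8, 2])
print(kth.add(3))   # 4  (top 3 so far: 4, 5, 8)
print(kth.add(5))   # 5  (top 3: 5, 5, 8)
print(kth.add(10))  # 5  (top 3: 5, 8, 10)
print(kth.add(9))   # 8  (top 3: 8, 9, 10)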

Complexity

Time: O((n + m) * log k) where n = initial array size, m = number of add() calls
Space: O(k)

We only store k elements in the heap. Each heap operation (push/pop) costs O(log k) because the heap is a complete binary tree with height log k. Initialization processes n elements but each pop is O(log k), giving O(n log k). Each add() does at most one push and one pop = O(log k).

Common Mistakes

Edge Cases

Connections

Kth Largest Element in an Array #215
K-Largest Elements Heap Pattern (Min-heap of size k)

Intuition

Think of this like finding the kth tallest person in a crowd. You could sort everyone by height (expensive), or you could maintain a 'top k' list that automatically keeps track. A min-heap of size k acts like a water level - it holds exactly the k largest elements we've seen, and the smallest among those (the root) is exactly the kth largest element overall. It's like keeping a 'ceiling' at the kth position: any element above the ceiling gets in and pushes someone out, any element below gets ignored.

Why This Pattern?

The min-heap root gives O(1) access to the SMALLEST among our top k candidates. Since we want the kth LARGEST, we want the smallest of the largest k. This is the perfect structure: we maintain exactly k elements, and the heap invariant automatically keeps the smallest of those at the top. When we see a new element larger than our current minimum, it belongs in the top k, so we swap it in.

Solution

import heapq

def findKthLargest(nums: list[int], k: int) -> int:
    # Use a min-heap to track the k largest elements
    # The root will be the SMALLEST among the k largest = kth largest
    min_heap = []
    
    for num in nums:
        # Add current element to heap
        heapq.heappush(min_heap, num)
        
        # If heap exceeds k, remove smallest element
        # This maintains exactly k largest elements seen so far
        if len(min_heap) > k:
            heapq.heappop(min_heap)
    
    # Root is the kth largest (smallest among top k)
    return min_heap[0]
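
Quick check with hand-picked inputs (not from the original text):

print(findKthLargest([3, 2, 1, 5, 6, 4], 2))           # 5 -> the 2nd largest
print(findKthLargest([3, 2, 3, 1, 2, 4, 5, 5, 6], 4))  # 4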

Complexity

Time: O(n log k)
Space: O(k)

We process n elements, and each heap operation (push or pop) costs O(log k) since the heap never exceeds size k. This beats sorting O(n log n) when k << n, and we only care about one rank position, not full order.

Common Mistakes

Edge Cases

Connections

Last Stone Weight #1046
Max Heap / Priority Queue - Extract Max pattern

Intuition

Think of this like a collision/energy dissipation system. When two stones smash, their 'energy' (weight) partially dissipates - if equal, all energy is lost; if unequal, only the difference remains. The heaviest stones dominate the outcome because we always process the two largest first. It's like a pressure system where the highest pressures interact first, and each collision potentially creates a new pressure point. The key insight: we never need to consider smaller stones until the larger ones are resolved - like how in a game of billiards, the heaviest balls determine the trajectory before smaller ones matter.

Why This Pattern?

The problem is defined entirely in terms of 'largest' elements - we repeatedly need the two heaviest stones. A max-heap gives O(1) access to the maximum element and O(log n) insertion/deletion, making it the natural data structure. Each operation (smash) transforms the two maxes into a potential new max, which the heap efficiently maintains. This is fundamentally a 'repeatedly get largest' pattern.

Solution

import heapq

def lastStoneWeight(stones):
    # Python's heapq is a min-heap, so negate to simulate max-heap
    max_heap = [-stone for stone in stones]
    heapq.heapify(max_heap)  # O(n) - more efficient than n * O(log n)
    
    # Keep smashing until 0 or 1 stone remains
    while len(max_heap) > 1:
        # Extract two heaviest stones (negate to get actual values)
        y = -heapq.heappop(max_heap)  # heaviest
        x = -heapq.heappop(max_heap)  # second heaviest
        
        # If they differ, push the difference back as a new stone
        if y > x:
            heapq.heappush(max_heap, -(y - x))
    
    # Return last stone weight, or 0 if empty
    return -max_heap[0] if max_heap else 0
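
A smash-by-hand check (example values are mine): 8 vs 7 -> 1, then 4 vs 2 -> 2, 2 vs 1 -> 1, 1 vs 1 -> 0, leaving a single stone of weight 1.

print(lastStoneWeight([2, 7, 4, 1, 8, 1]))  # 1
print(lastStoneWeight([1]))                 # 1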

Complexity

Time: O(n log n)
Space: O(n)

Heapify takes O(n) using the Floyd algorithm. Each smash operation does two pops and possibly one push, each O(log n). In worst case (stones never fully cancel), we do O(n) smash operations. So total is O(n log n). We can't do better because we must examine each stone at least once, and each comparison/insertion in a heap is O(log n) by definition of the data structure.

Common Mistakes

Edge Cases

Connections

Task Scheduler #621
Greedy formula from max frequency - calculate the minimum intervals needed based on the most frequent task's spacing requirements

Intuition

Think of this like scheduling workers at a factory. You have different types of jobs (tasks), but if you do the same job twice too quickly, the machine overheats and you must wait (cooling interval). The key insight: the most frequently occurring task acts like a 'bottleneck' - it creates the longest chain in your schedule, and all other tasks must fit into the gaps between occurrences of this task. If you have plenty of other tasks, you keep the machine busy. If not, you end up with idle waiting time. The formula emerges from asking: 'How many gaps do I need to create between the most frequent task, and do I have enough other tasks to fill them?'

Why This Pattern?

The cooling constraint fundamentally creates 'slots' that must exist between the same task. The task with maximum frequency determines how many slots we MUST have (f_max - 1 groups, each needing n slots). The number of tasks sharing that max frequency tells us how many 'final' slots get filled. This structural property makes the greedy formula the natural solution - we can't do better than this lower bound, and we can always achieve it by arranging tasks this way.

Solution

import collections
from typing import List

class Solution:
    def leastInterval(self, tasks: List[str], n: int) -> int:
        # Count frequency of each task
        task_counts = collections.Counter(tasks)
        
        # Find maximum frequency (the bottleneck task)
        f_max = max(task_counts.values())
        
        # Count how many tasks have that maximum frequency
        # (they all create the same length chain)
        count_max = sum(1 for count in task_counts.values() if count == f_max)
        
        # The formula: (f_max - 1) groups * (n + 1) slots per group + count_max at end
        # Think of it as: A _ _ ... A _ _ ... A, where we have n gaps between each A
        part1 = (f_max - 1) * (n + 1)
        part2 = count_max
        
        # Maximum of: formula result vs actual task count
        # If we have plenty of tasks to fill gaps, we use all tasks
        # If gaps exceed available tasks, we have idle time
        return max(part1 + part2, len(tasks))
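
Plugging in a small hand-worked case (example is mine, not from the original): tasks = A,A,A,B,B,B with n = 2 gives f_max = 3 and count_max = 2, so (3 - 1) * (2 + 1) + 2 = 8, matching the schedule A B _ A B _ A B.

print(Solution().leastInterval(["A", "A", "A", "B", "B", "B"], 2))  # 8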

Complexity

Time: O(T), where T = len(tasks). Counting frequencies scans all T tasks once; finding the maximum frequency and counting how many tasks share it only scan the (at most 26) unique task types.
Space: O(1) or O(26) - we store at most 26 task frequencies (uppercase English letters), which is constant space relative to input.

We only care about the COUNT of each unique task, not the order, but we still have to read every task once to get those counts. Finding the max and counting the max-frequency tasks each scan at most 26 entries, and the formula itself is O(1). We can't do better than O(T) because every task has to be counted.

Common Mistakes

Edge Cases

Connections

Backtracking (9)

Combination Sum II #40
Backtracking with sorting-based duplicate elimination

Intuition

Think of this as exploring a decision tree where each number can either be included or excluded from our current combination. The key insight is that when the candidates array has duplicates, we need to avoid creating duplicate combinations - like taking two different paths that lead to the same destination. By sorting first, we group equal numbers together, and we can then make a strategic choice: when we're at a level of the recursion and see the same number as the previous one we already explored, we skip it. This is similar to 'if you already tried taking the first copy of a duplicate and it didn't work, there's no point trying the second copy at the same recursion depth' - they lead to identical sub-problems.

Why This Pattern?

Sorting enables two critical optimizations: (1) we can prune branches where the current sum exceeds target, and (2) we can detect and skip duplicate combinations by checking if candidates[i] == candidates[i-1] at the same recursion depth. The 'start' parameter in backtracking enforces the 'each number used once' constraint - we only consider elements from index 'start' onward.

Solution

from typing import List

class Solution:
    def combinationSum2(self, candidates: List[int], target: int) -> List[List[int]]:
        result = []
        candidates.sort()  # Sort to enable pruning and duplicate detection
        
        def backtrack(start: int, remaining: int, current: List[int]):
            # Base case: we've found a valid combination
            if remaining == 0:
                result.append(current[:])  # Append a copy
                return
            
            # Prune: we've overshot the target - no valid combination down this branch
            if remaining < 0:
                return
            
            # Explore each candidate from 'start' onwards
            for i in range(start, len(candidates)):
                # Skip duplicates: if this candidate is same as previous AND
                # we're not at the first choice at this recursion depth
                if i > start and candidates[i] == candidates[i-1]:
                    continue
                
                # Include current candidate and recurse
                # Use i+1 because each number can only be used once
                current.append(candidates[i])
                backtrack(i + 1, remaining - candidates[i], current)
                current.pop()  # Backtrack: remove and try next option
        
        backtrack(0, target, [])
        return result
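
A quick run with a hand-checked input (my example, shown in the order this backtracking produces it):

print(Solution().combinationSum2([10, 1, 2, 7, 6, 1, 5], 8))
# [[1, 1, 6], [1, 2, 5], [1, 7], [2, 6]]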

Complexity

Time: O(2^n) in the worst case where n is the number of candidates, but pruning significantly reduces this. In practice, it's bounded by the number of valid combinations times n for copying. The sorting is O(n log n).
Space: O(n) for the recursion stack in the worst case (when we explore all paths), plus O(k) for storing each valid combination where k is the average combination length.

We can't do better than exponential in the worst case because in theory every subset could be a valid combination. However, sorting adds O(n log n) which is dominated by the exponential exploration. The space is dominated by the depth of recursion - at most n levels deep (one for each unique position), plus storage for results.

Common Mistakes

Edge Cases

Connections

Combination Sum #39
Backtracking with sorting and pruning

Intuition

Think of this as a budget allocation problem. You have a target 'spending limit' and a list of 'items' you can buy (the candidates). Each item costs its face value, and you can buy each item unlimited times. The question asks: what are all the ways to spend exactly your budget? Imagine exploring a decision tree where at each node you choose how many copies of the current item to buy. The key insight: since [2,3] and [3,2] represent the same combination, we process items in sorted order and never go back - this eliminates duplicates naturally. It's like a conversation where you say "I'm going to use 2 of item A, now let's discuss item B..." - you never reconsider A after moving to B. The pruning (cutting off dead branches) works because once your remaining budget goes negative, no further choices can fix that - you've overspent.

Why This Pattern?

We need to explore ALL possible combinations (not optimize), we can reuse elements unlimited times (changes recursion to include current index), and order doesn't matter so we process in sorted order to avoid duplicates. The structural property: if candidates[i] > remaining, then ALL subsequent candidates (which are >= candidates[i]) will also exceed remaining - this is why sorting enables efficient pruning.

Solution

def combinationSum(candidates, target):
    result = []
    candidates.sort()  # Sort to enable pruning and avoid duplicates
    
    def backtrack(start, remaining, current):
        # Base case: exactly matched target - found valid combination
        if remaining == 0:
            result.append(current[:])  # Append COPY since we mutate current
            return
        
        # Pruning: overspent - no valid combination possible in this branch
        if remaining < 0:
            return
        
        # Try each candidate starting from 'start' (allows reuse of same candidate)
        for i in range(start, len(candidates)):
            # Pruning: since sorted, if current exceeds remaining, all subsequent will too
            if candidates[i] > remaining:
                break
            
            # CHOOSE: add this candidate to our combination
            current.append(candidates[i])
            
            # EXPLORE: recurse with same index i (unlimited reuse) and updated remaining
            backtrack(i, remaining - candidates[i], current)
            
            # UNCHOOSE: remove and try next option
            current.pop()
    
    backtrack(0, target, [])
    return result
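
A small hand-verified example (values are mine):

print(combinationSum([2, 3, 6, 7], 7))  # [[2, 2, 3], [7]]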

Complexity

Time: O(N^target) in worst case where N is number of candidates and target is the sum - exponential because we explore all combinations. More precisely, it's bounded by the number of valid combinations in the output times the average length of each combination.
Space: O(target) for recursion stack depth (max depth equals max number of elements that can fit in target), plus O(number of combinations) for storing results.

The recursion depth is bounded by target/min(candidate) - you literally can't have more elements than this. The time is exponential because in worst case (like target=7 and candidates=[1,2,3]) we explore many branches - but the pruning significantly cuts this in practice. We can't do better than exponential in the worst case because we genuinely need to generate all valid combinations.

Common Mistakes

Edge Cases

Connections

Letter Combinations of a Phone Number #17
Cartesian Product via Backtracking - you're computing the Cartesian product of multiple sets (the letters corresponding to each digit), generating all possible tuples by taking one element from each set.

Intuition

Think of this like a tree growing horizontally. You start with an empty branch, and for each digit, you SPLIT that branch into multiple smaller branches - one for each possible letter. For "23", you'd take your empty start, split it into 'a','b','c' for the first digit, then each of those splits again into 'd','e','f'. It's like opening a combination lock where each dial has a different number of letters - you systematically try every possible combination by advancing one dial at a time, then backtracking to try the next option.

Why This Pattern?

The problem has a natural tree structure where each level corresponds to one digit, and each node at that level branches into all possible letters for that digit. Backtracking is perfect here because you're building partial solutions incrementally, and when you reach the end (processed all digits), you backtrack to explore alternative letter choices at previous positions. This is the classic 'explore all paths' pattern.

Solution

def letterCombinations(digits: str) -> list[str]:
    if not digits:
        return []
    
    # Phone keypad mapping - each digit maps to its possible letters
    phone = {
        '2': 'abc', '3': 'def', '4': 'ghi', '5': 'jkl',
        '6': 'mno', '7': 'pqrs', '8': 'tuv', '9': 'wxyz'
    }
    
    res = []
    
    def backtrack(index, current):
        # Base case: we've processed all digits, complete combination found
        if index == len(digits):
            res.append(current)
            return
        
        # Get all possible letters for current digit
        letters = phone[digits[index]]
        
        # For each letter option, recurse to next digit
        for letter in letters:
            backtrack(index + 1, current + letter)
    
    backtrack(0, "")
    return res
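
Quick check (example input chosen here):

print(letterCombinations("23"))
# ['ad', 'ae', 'af', 'bd', 'be', 'bf', 'cd', 'ce', 'cf']
print(letterCombinations(""))  # []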

Complexity

Time: O(4^n * n) where n is the number of digits
Space: O(4^n * n) for storing all combinations + O(n) for recursion stack

You generate 4^n combinations in the worst case (digits 7 and 9 have 4 letters each). Each combination takes O(n) time to build since string concatenation creates a new string of length n. Can't do better because you must actually produce all combinations - that's n * 4^n characters in the output.

Common Mistakes

Edge Cases

Connections

N-Queens #51
Backtracking with state sets

Intuition

Imagine each queen as a radio tower broadcasting interference along its row, column, and two diagonals. Your job is to place N towers so their 'signals' never overlap. You place queens row by row - each row is like choosing which frequency band to claim. When you place a queen at (row, col), you're 'reserving' that column and those two diagonal frequencies. The key insight: you can compute diagonal IDs with simple math - the '/' diagonal has ID (row + col), and the '\\' diagonal has ID (row - col + n-1). Think of backtracking as exploring a tree where each branch is 'try this column, see if it leads to a solution.' When a branch fails (you hit a conflict), you 'unclaim' your resources and try the next column - this is the 'back' in backtracking. It's like filling a mold one piece at a time; if a piece doesn't fit, you remove it and try a different piece.

Why This Pattern?

The problem naturally forms a decision tree: at each row, you choose one of N columns. Each choice reduces the problem size (move to next row with fewer available positions). When a path fails (conflict detected), you undo the last choice - exactly the backtracking pattern. Using sets to track occupied columns and diagonals gives O(1) conflict detection, making the backtracking efficient.

Solution

def solveNQueens(n):
    """
    Place n queens on an n×n chessboard so no two queens attack each other.
    Returns all valid solutions as list of board representations.
    """
    result = []
    
    # Board[i][j] = 'Q' if queen placed, '.' otherwise
    board = [['.' for _ in range(n)] for _ in range(n)]
    
    # Track occupied columns and diagonals - O(1) lookup
    cols = set()      # occupied columns
    diag1 = set()     # '/' diagonals identified by (row + col)
    diag2 = set()     # '\\' diagonals identified by (row - col + n-1)
    
    def backtrack(row):
        # Base case: successfully placed n queens
        if row == n:
            # Convert board rows to strings for output
            solution = [''.join(board_row) for board_row in board]
            result.append(solution)
            return
        
        # Try each column in current row
        for col in range(n):
            # Calculate diagonal IDs for this position
            d1 = row + col           # '/' diagonal (bottom-left to top-right)
            d2 = row - col + n - 1   # '\\' diagonal (top-left to bottom-right)
            
            # Skip if this position is under attack
            if col in cols or d1 in diag1 or d2 in diag2:
                continue
            
            # Place queen - claim this column and diagonals
            board[row][col] = 'Q'
            cols.add(col)
            diag1.add(d1)
            diag2.add(d2)
            
            # Recurse to next row
            backtrack(row + 1)
            
            # Backtrack: remove queen and unclaim resources
            board[row][col] = '.'
            cols.remove(col)
            diag1.remove(d1)
            diag2.remove(d2)
    
    backtrack(0)
    return result
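
Sanity check for n = 4 (the two classic solutions, in the order this backtracking finds them):

for sol in solveNQueens(4):
    print(sol)
# ['.Q..', '...Q', 'Q...', '..Q.']
# ['..Q.', 'Q...', '...Q', '.Q..']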

Complexity

Time: O(N!)
Space: O(N)

The recursion depth is at most N (one queen per row), and the tracking sets hold at most N columns plus O(N) diagonals each, so the search itself uses O(N) auxiliary space. The reusable board adds O(N^2), and the returned solutions are output space on top of that.

Common Mistakes

Edge Cases

Connections

Palindrome Partitioning #131
Backtracking with palindrome validation

Intuition

Think of this like a factory line where you're cutting a rope into segments. At each position, you decide whether to make a cut. Each segment you produce must be 'balanced' (a palindrome). You're exploring all possible ways to make these cuts, like a tree of decisions where each branch represents a cut. When a branch leads to a segment that isn't a palindrome, that's a dead end—you 'backtrack' (undo the cut) and try the next option. It's like water finding all possible paths through a maze: flow down each path, and when you hit a wall, back up and try another direction.

Why This Pattern?

The problem requires exploring ALL possible partitions—a classic combinatorial search. At each index, we choose to either cut or not cut, building solutions incrementally. When a chosen substring isn't a palindrome, we backtrack (undo the last decision) to explore alternative paths. This is the exact structure of backtracking: explore, validate, and retreat when needed.

Solution

def partition(s):
    result = []
    path = []  # Current partition being built
    
    def is_palindrome(start, end):
        # Two-pointer check: compare chars from both ends moving inward
        while start < end:
            if s[start] != s[end]:
                return False
            start += 1
            end -= 1
        return True
    
    def backtrack(index):
        # Base case: we've processed entire string
        # Valid partition found - add copy to results
        if index == len(s):
            result.append(path.copy())
            return
        
        # Try every possible end position for the next palindrome
        for end in range(index, len(s)):
            # Only proceed if the substring s[index:end+1] is a palindrome
            # This is our pruning step - don't explore dead ends
            if is_palindrome(index, end):
                path.append(s[index:end+1])  # Make choice: include this palindrome
                backtrack(end + 1)          # Recurse on remaining string
                path.pop()                  # Undo choice: backtrack
    
    backtrack(0)
    return result
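
Quick check (example string chosen here):

print(partition("aab"))  # [['a', 'a', 'b'], ['aa', 'b']]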

Complexity

Time: O(n * 2^n)
Space: O(n)

Common Mistakes

Edge Cases

Connections

Permutations #46
Backtracking / Depth-First Search on a choice tree

Intuition

Think of arranging books on a shelf. You have n books and n slots. For the first slot, you can pick any of the n books. For the second slot, any of the remaining n-1 books. And so on. Each path from 'top of the tree' to a 'leaf' is one valid permutation. This is like exploring a tree where at each level you pick one of the remaining unchosen elements. The key insight: once you've picked an element, it's 'locked in' for that branch - you can't reuse it until you backtrack (undo that choice).

Why This Pattern?

We need to generate ALL possible orderings. At each step we have a set of available choices (unused elements). We try each choice, recurse to build the rest of the permutation, then undo that choice to try the next option. This is the canonical backtracking structure: make choice → explore → undo choice. The 'tree' structure emerges naturally because each choice branches into multiple subtrees.

Solution

def permute(nums):
    result = []
    path = []
    used = [False] * len(nums)  # tracks which elements are already in our current path
    
    def backtrack():
        # Base case: we've picked all elements → we have a complete permutation
        if len(path) == len(nums):
            result.append(path[:])  # MUST copy! otherwise all results point to same list
            return
        
        # Try each element that hasn't been used yet
        for i in range(len(nums)):
            if not used[i]:
                # MAKE CHOICE: pick nums[i]
                used[i] = True
                path.append(nums[i])
                
                # EXPLORE: recurse to fill remaining positions
                backtrack()
                
                # UNDO CHOICE: backtrack to try other options
                used[i] = False
                path.pop()
    
    backtrack()
    return result
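
Quick check (example input chosen here):

print(permute([1, 2, 3]))
# [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]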

Complexity

Time: O(n! * n)
Space: O(n)

Common Mistakes

Edge Cases

Connections

Subsets II #90
Backtracking with duplicate skipping (sort-and-skip pattern)

Intuition

Think of this like organizing a photo album where you have multiple identical photos of the same person. You want to create pages representing all possible selections of photos, but you don't want duplicate pages (e.g., two pages both with just 'photo A'). The trick: sort all photos first so identical ones are adjacent. Then when building your album, once you decide NOT to include a particular person on the current page, skip over ALL their identical photos before moving to the next person. This guarantees no duplicates because any subset containing 'photo A' would be identical to a subset containing 'photo A' from a different position - so we only generate one.

Why This Pattern?

When the array is sorted, all duplicate values become adjacent. The structural property: at any recursion level, if we skip nums[i], then including nums[i+1] (which equals nums[i]) would create a subset identical to what we'd get by including nums[i]. So we skip all consecutive duplicates at each recursion depth. This is the same core insight as Subsets I, but with an additional pruning step.

Solution

from typing import List

def subsetsWithDup(nums: List[int]) -> List[List[int]]:
    res = []
    nums.sort()  # Critical: sort to group duplicates together
    
    def backtrack(start, path):
        # Every path is a valid subset - add copy (not reference)
        res.append(path[:])
        
        for i in range(start, len(nums)):
            # Skip duplicates: if same as previous element at this level, skip
            # The 'i > start' check ensures we only skip within the SAME recursion level
            if i > start and nums[i] == nums[i-1]:
                continue
            
            # Choose: include current element
            path.append(nums[i])
            
            # Explore: recurse with next index
            backtrack(i + 1, path)
            
            # Un-choose: backtrack
            path.pop()
    
    backtrack(0, [])
    return res
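
Quick check showing the duplicate-skipping at work (example values are mine):

print(subsetsWithDup([1, 2, 2]))
# [[], [1], [1, 2], [1, 2, 2], [2], [2, 2]] - no repeated subsets despite the repeated 2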

Complexity

Time: O(n * 2^n)
Space: O(n) for recursion stack (excluding output)

There are at most 2^n subsets, and we spend O(n) time copying each subset into the result. The duplicate-skipping optimization doesn't reduce worst-case complexity (which occurs when all elements are unique), but it dramatically reduces the constant factor in practice. The recursion depth is at most n.

Common Mistakes

Edge Cases

Connections

Subsets #78
Backtracking / Decision Tree Traversal

Intuition

Think of this like exploring all possible paths in a decision tree. For each element in the array, you face a binary choice: include it in your current subset, or don't include it. Imagine you have coins [1, 2, 3] — for each coin, you flip to decide include/exclude. The power set is simply all possible combinations of these decisions. It's like a game where at every step you can either take the current element or leave it, and you explore every possible combination of these yes/no choices.

Why This Pattern?

The problem naturally forms a binary tree structure where each level represents a decision (include or exclude element i). Starting from an empty set, you branch two ways at each element: add it to the current subset, or skip it. This creates exactly 2^n leaf nodes (subsets). Backtracking is ideal because you build solutions incrementally, explore all branches, then 'undo' the last decision to try other paths — classic depth-first exploration of a decision space.

Solution

def subsets(nums):
    result = []
    
    def backtrack(start, path):
        # Every path in the decision tree IS a valid subset
        # Make a copy! Otherwise we append a reference that changes
        result.append(path[:])
        
        # Try adding each remaining element, one at a time
        for i in range(start, len(nums)):
            path.append(nums[i])           # Choose: include nums[i]
            backtrack(i + 1, path)         # Recurse with remaining elements
            path.pop()                     # Un-choose: backtrack
    
    backtrack(0, [])
    return result
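
Quick check (example input chosen here):

print(subsets([1, 2, 3]))
# [[], [1], [1, 2], [1, 2, 3], [1, 3], [2], [2, 3], [3]]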

Complexity

Time: O(n * 2^n)
Space: O(n)

There are exactly 2^n possible subsets (each of n elements can be either in or out). Building each subset takes O(n) time since we copy the path each time we add to result. So total is O(n * 2^n). This is optimal because you MUST generate 2^n subsets — you can't do better than examining every possible combination.

Common Mistakes

Edge Cases

Connections

Word Search #79
Backtracking (Depth-First Search on a grid)

Intuition

Imagine you're exploring a cave system looking for a hidden message written on rocks. You can only move up, down, left, or right, and you can't step on the same rock twice (because that would reuse a letter). At each junction, you try one direction; if it doesn't lead anywhere, you backtrack and try another. The grid is like a graph where each cell connects to its neighbors, and you're searching for any path that spells out the word. The key insight: you can START from any cell that matches the first letter — you don't know which entrance leads to the solution.

Why This Pattern?

The problem has exponential branching — at each cell you have up to 4 choices, and you need to find ANY valid path. Backtracking naturally explores one path deeply, then 'undoes' moves to try alternatives. The 'visited' mechanism (marking cells temporarily) ensures you don't reuse cells within a single path, which is essential since each cell can only be used once per word construction.

Solution

def exist(board, word):
    if not board or not board[0]:
        return False
    
    rows, cols = len(board), len(board[0])
    
    def backtrack(r, c, index):
        # Base case: we've matched all characters
        if index == len(word):
            return True
        
        # Check bounds and if current cell matches the current letter
        if (r < 0 or r >= rows or c < 0 or c >= cols or 
            board[r][c] != word[index]):
            return False
        
        # Mark as visited by temporarily replacing with a placeholder
        # ('#' can never match any letter of the word)
        original = board[r][c]
        board[r][c] = '#'
        
        # Explore all 4 directions: right, left, down, up
        for dr, dc in [(0, 1), (0, -1), (1, 0), (-1, 0)]:
            if backtrack(r + dr, c + dc, index + 1):
                return True
        
        # Backtrack: restore the original character
        # This 'undo' is what makes it backtracking
        board[r][c] = original
        return False
    
    # Try starting from every cell (any could be the entrance)
    for i in range(rows):
        for j in range(cols):
            if backtrack(i, j, 0):
                return True
    
    return False
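
Quick check on the classic board (the solution restores each cell after exploring, so the grid is unchanged between calls):

board = [["A", "B", "C", "E"],
         ["S", "F", "C", "S"],
         ["A", "D", "E", "E"]]
print(exist(board, "ABCCED"))  # True
print(exist(board, "ABCB"))    # False - would need to reuse the 'B'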

Complexity

Time: O(M * N * 4^L) where M*N is the board size and L is word length
Space: O(L) for recursion stack (maximum depth equals word length)

In the worst case, we visit every cell and from each cell explore up to 4 directions recursively. The branching factor is 4, but we can't reuse cells, so it's effectively a search through a subset of all possible paths. We can't do better in the worst case because we might need to explore almost all paths to determine if a match exists — it's like checking every possible route in a maze.

Common Mistakes

Edge Cases

Connections

Graphs (13)

Clone Graph #133
Graph Traversal with Node Mapping (DFS/BFS with Hash Map)

Intuition

Think of this like copying a social network. You know one person, and you need to map out ALL their connections, then build an exact duplicate network. The key insight: as you traverse, you must remember which people you've already copied (using a hash map). Otherwise, if there's a cycle (A knows B, B knows A), you'd either loop forever or create duplicate copies of the same person. The hash map solves both problems simultaneously - it acts as your 'visited' set to prevent infinite recursion AND as a reference table so all edges pointing to the same original node point to the same copy.

Why This Pattern?

Graphs can contain cycles - nodes can reference nodes we've already visited. The structural property that makes this pattern natural is: we need a data structure that serves double duty - tracking 'visited' status to prevent infinite loops while also maintaining the mapping from original to copy so that all edges to the same node reference the same cloned node. A hash map elegantly solves both in one data structure.

Solution

"""
# Definition for undirected graph node
class Node:
    def __init__(self, val=0, neighbors=None):
        self.val = val
        self.neighbors = neighbors if neighbors is not None else []

def cloneGraph(node):
    if not node:
        return None
    
    # Hash map: {original_node: cloned_node}
    # This is our 'memory' - tracks which originals we've already copied
    clone_map = {}
    
    def dfs(original):
        # Base case: if we've already copied this node, return the copy
        # This check is what PREVENTS infinite loops on cycles
        if original in clone_map:
            return clone_map[original]
        
        # Create the clone for this node
        clone = Node(original.val)
        clone_map[original] = clone  # Store BEFORE recursing to handle self-loops
        
        # Recursively clone all neighbors
        for neighbor in original.neighbors:
            cloned_neighbor = dfs(neighbor)
            clone.neighbors.append(cloned_neighbor)
        
        return clone
    
    return dfs(node)

# BFS alternative (conceptually similar, just iterative):
from collections import deque

def cloneGraphBFS(node):
    if not node:
        return None
    
    clone_map = {node: Node(node.val)}
    queue = deque([node])
    
    while queue:
        original = queue.popleft()
        
        for neighbor in original.neighbors:
            # If neighbor hasn't been cloned yet, create clone and add to queue
            if neighbor not in clone_map:
                clone_map[neighbor] = Node(neighbor.val)
                queue.append(neighbor)
            # Link the cloned neighbor to the cloned current node
            clone_map[original].neighbors.append(clone_map[neighbor])
    
    return clone_map[node]
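
A tiny smoke test (my own example): clone a two-node cycle and confirm the copy has the same shape but new node objects.

a, b = Node(1), Node(2)
a.neighbors.append(b)
b.neighbors.append(a)

copy = cloneGraph(a)
print(copy.val, copy.neighbors[0].val)    # 1 2
print(copy is a, copy.neighbors[0] is b)  # False False - same structure, new objects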

Complexity

Time: O(V + E)
Space: O(V) for the hash map plus O(V) for the recursion stack (DFS) or the queue (BFS). The cloned graph itself takes O(V + E), but that is the required output rather than auxiliary space.

We must visit every node at least once (that's V). For each node, we examine all its edges to connect neighbors (that's E total across all nodes). We can't do better because we literally need to create a copy of every node and edge - the work is inherent to the problem size. The hash map operations are O(1) average, so they don't add to our complexity.

Common Mistakes

Edge Cases

Connections

Course Schedule II #210
Topological Sort using Kahn's Algorithm (BFS with in-degree counting)

Intuition

Think of this like a dependency resolution system — like a package manager installing software where each package might depend on others already installed. You're looking for a valid installation order. Each course is a node, each prerequisite relationship is a directed edge from prerequisite → dependent. We need a linear ordering where all dependencies come before what depends on them. The trick: always pick courses with NO prerequisites first (in-degree 0), take them, then update the graph. This is like peeling layers off an onion — start from the outside (no dependencies) and work inward.

Why This Pattern?

The problem has a natural DAG structure — courses and prerequisites form a directed graph where edges point from prerequisite to dependent. Topological sort finds a linear ordering that respects all directed edges. Kahn's algorithm exploits the key property: nodes with in-degree 0 (no prerequisites) can always be taken first. When we 'take' such a course, we effectively remove its outgoing edges, decreasing the in-degree of its dependents. This cascading 'unlocking' is the natural consequence of the dependency structure.

Solution

from collections import deque, defaultdict
from typing import List

def findOrder(numCourses: int, prerequisites: List[List[int]]) -> List[int]:
    # Step 1: Build the graph (adjacency list) and track in-degrees
    # graph[prereq] = list of courses that depend on prereq
    graph = defaultdict(list)
    in_degree = [0] * numCourses
    
    for course, prereq in prerequisites:
        graph[prereq].append(course)  # prereq -> course dependency
        in_degree[course] += 1  # course has one more prerequisite
    
    # Step 2: Initialize queue with courses that have NO prerequisites
    # These are our "starting points" - like packages with no dependencies
    queue = deque([i for i in range(numCourses) if in_degree[i] == 0])
    result = []
    
    # Step 3: Process courses using BFS (Kahn's algorithm)
    while queue:
        course = queue.popleft()  # Take a course we CAN take
        result.append(course)
        
        # "Taking" this course unlocks its dependents
        # Reduce their in-degree (fewer prerequisites remaining)
        for dependent in graph[course]:
            in_degree[dependent] -= 1
            if in_degree[dependent] == 0:
                # All prerequisites satisfied - now THIS course is unlockable
                queue.append(dependent)
    
    # If we couldn't take all courses, there's a cycle (impossible)
    return result if len(result) == numCourses else []
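
Quick check (example values are mine; other valid orders exist, this is the one this queue order produces):

print(findOrder(4, [[1, 0], [2, 0], [3, 1], [3, 2]]))  # [0, 1, 2, 3]
print(findOrder(2, [[0, 1], [1, 0]]))                  # [] - cycle, no valid order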

Complexity

Time: O(V + E) where V = numCourses and E = len(prerequisites)
Space: O(V + E) for the graph adjacency list, in-degree array, and queue. The result array is O(V).

We visit each course exactly once (V operations) and traverse each prerequisite relationship exactly once (E edges). We can't do better because every course and every dependency must be processed to produce a valid ordering — you can't know the position of a course without checking its dependencies.

Common Mistakes

Edge Cases

Connections

Course Schedule #207
Cycle Detection in Directed Graph using Topological Sort (Kahn's Algorithm)

Intuition

Think of this like planning a construction project. Each course is a 'task' and each prerequisite is a 'dependency' - you must complete prerequisites before the dependent course. If you have a circular dependency (A needs B, B needs C, C needs A), it's like having a circular blueprint - you'd never be able to start! This is exactly what a CYCLE in a directed graph represents. The question reduces to: 'Is there a cycle in this dependency graph?' If NO cycle exists (a DAG), you can complete all courses. If a cycle exists, you're stuck.

Why This Pattern?

A valid course schedule corresponds exactly to a DAG (Directed Acyclic Graph). Topological sort works because we process courses that have no prerequisites first (in-degree = 0), 'consuming' them and potentially freeing up their dependents. If we can process ALL courses, no cycles exist. If we get stuck with courses that still have unmet prerequisites, a cycle exists - that's the key insight.

Solution

from collections import defaultdict, deque

def canFinish(numCourses, prerequisites):
    """
    Determine if all courses can be finished given prerequisite relationships.
    Uses Kahn's Algorithm (BFS topological sort).
    """
    # Step 1: Build the graph and compute in-degrees
    # graph[prereq] = list of courses that require this prereq
    # in_degree[course] = how many prerequisites this course needs
    graph = defaultdict(list)
    in_degree = [0] * numCourses
    
    for course, prereq in prerequisites:
        # Important: prereq -> course (you need prereq BEFORE course)
        graph[prereq].append(course)
        in_degree[course] += 1
    
    # Step 2: Initialize queue with courses that have NO prerequisites
    # These are 'free' to take - they're our starting points
    queue = deque([i for i in range(numCourses) if in_degree[i] == 0])
    completed = 0
    
    # Step 3: Process courses in topological order
    # Take a course with no remaining prerequisites, 'complete' it,
    # then reduce the in-degree of all courses that depended on it
    while queue:
        course = queue.popleft()
        completed += 1
        
        # 'Complete' this course by reducing dependents' in-degree
        for dependent in graph[course]:
            in_degree[dependent] -= 1
            # If dependent now has all prerequisites met, it becomes available
            if in_degree[dependent] == 0:
                queue.append(dependent)
    
    # If we completed all courses, no cycle existed
    return completed == numCourses
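
Quick check (example values chosen here):

print(canFinish(2, [[1, 0]]))          # True - take 0, then 1
print(canFinish(2, [[1, 0], [0, 1]]))  # False - circular dependency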

Complexity

Time: O(V + E) where V = numCourses and E = len(prerequisites)
Space: O(V + E) for the graph adjacency list and in-degree array

We must visit every course (V) and process every prerequisite relationship (E) at least once. Each edge is traversed exactly once when its source course is processed. This is optimal because we need to examine the entire dependency structure to determine if a cycle exists - you can't shortcut by skipping courses or relationships.

Common Mistakes

Edge Cases

Connections

Graph Valid Tree #261
Union-Find (Disjoint Set Union) / Cycle Detection

Intuition

Think of a tree as a connected water pipeline system with n houses. To connect ALL houses, you need exactly n-1 pipes. Any more and you create a loop (cycle), any fewer and some houses are cut off (disconnected). The key insight: a valid tree has exactly one 'path' between any two points - no detours, no isolated sections. When we process edges, if we ever try to connect two nodes that are ALREADY connected, we've found a cycle. If after processing all edges, all nodes belong to the same 'family' (connected component), we have a tree.

Why This Pattern?

Union-Find naturally models the question 'are these two nodes already connected?' When processing each edge, if find(u) == find(v), then u and v are already connected through some path - adding this edge creates a cycle. If they're not connected, we union them. After all edges, if the graph is valid, all n nodes should belong to exactly one set. Compared with BFS/DFS, this needs no adjacency list or recursion, and it flags the cycle at the exact moment the offending edge is processed.

Solution

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
    
    def find(self, x):
        # Path compression: flatten the tree
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
    
    def union(self, x, y):
        # Union by rank: attach smaller tree under larger
        root_x, root_y = self.find(x), self.find(y)
        if root_x == root_y:
            return False  # Already connected - cycle detected!
        if self.rank[root_x] < self.rank[root_y]:
            root_x, root_y = root_y, root_x
        self.parent[root_y] = root_x
        if self.rank[root_x] == self.rank[root_y]:
            self.rank[root_x] += 1
        return True

def validTree(n, edges):
    # A tree must have exactly n-1 edges
    if len(edges) != n - 1:
        return False
    
    uf = UnionFind(n)
    
    for u, v in edges:
        # If u and v already connected, adding this edge creates a cycle
        if not uf.union(u, v):
            return False
    
    # If we got here, no cycles and we have n-1 edges
    # With n nodes and n-1 edges, connectivity is guaranteed
    return True
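
Quick check (example values are mine):

print(validTree(5, [[0, 1], [0, 2], [0, 3], [1, 4]]))          # True - 4 edges, no cycle
print(validTree(5, [[0, 1], [1, 2], [2, 3], [1, 3], [1, 4]]))  # False - 5 edges, must contain a cycle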

Complexity

Time: O(n α(n)) where α is the inverse Ackermann function, which is practically constant (≤4). We process each edge once, and each find/union operation is nearly O(1).
Space: O(n) for the parent and rank arrays.

We need O(n) space to track which node belongs to which set. For time, we process each of the n-1 edges once, and each union-find operation costs α(n) ≈ constant. We can't do better than O(n) because we must at least look at all edges once.

Common Mistakes

Edge Cases

Connections

Max Area of Island #695
Graph traversal - finding connected components using flood fill (DFS or BFS).

Intuition

Think of the grid as a city where 1s are buildings and 0s are empty lots. You want to find the largest contiguous block of buildings. The trick: when you 'discover' an island, you 'flood' it (turn all its 1s to 0s) so you don't count those cells again. It's like pouring water into each island to mark it as 'measured' - once you've counted an island, you've claimed it, so move on to find the next unclaimed one.

Why This Pattern?

The grid is a graph where each cell connects to its 4 neighbors. We need to find all connected components of 1s and measure their sizes. The flood fill naturally handles this: when we visit a cell, we recursively visit all connected cells, counting as we go, and marking visited cells to avoid double-counting.

Solution

from typing import List

class Solution:
    def maxAreaOfIsland(self, grid: List[List[int]]) -> int:
        def dfs(row, col):
            # Base cases: out of bounds or already water
            if (row < 0 or row >= len(grid) or 
                col < 0 or col >= len(grid[0]) or 
                grid[row][col] == 0):
                return 0
            
            # "Flood" this cell - mark as visited so we don't count it again
            grid[row][col] = 0
            
            # Visit all 4 neighbors and count their areas
            # The +1 counts the current cell itself
            return (1 + 
                    dfs(row + 1, col) + 
                    dfs(row - 1, col) + 
                    dfs(row, col + 1) + 
                    dfs(row, col - 1))
        
        max_area = 0
        for row in range(len(grid)):
            for col in range(len(grid[0])):
                if grid[row][col] == 1:
                    # Found a new island - explore it and update max
                    max_area = max(max_area, dfs(row, col))
        
        return max_area
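
Quick check (example grid is mine; note the solution floods the grid in place, so pass a copy if you still need it):

grid = [[0, 1, 1, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 1]]
print(Solution().maxAreaOfIsland(grid))  # 4 - the connected block of four 1s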

Complexity

Time: O(m * n) where m is rows and n is columns.
Space: O(m * n) in the worst case - if the whole grid is land, the DFS recursion stack can grow to roughly the size of the grid. More typically it's O(k), where k is the size of the largest island.

Each cell is visited at most once. When we start a DFS from a land cell, we explore the entire island and mark all its cells as visited (0). Future iterations skip these flooded cells. So across the entire algorithm, we do constant work per cell.

Common Mistakes

Edge Cases

Connections

Number of Connected Components in an Undirected Graph #323
Union-Find (Disjoint Set Union / DSU)

Intuition

Think of this like counting isolated islands on a map. Each connected group of nodes is like an island - you can travel between any two nodes within an island via the edges, but you can't reach nodes on different islands. The question asks: how many disconnected islands exist in this graph? It's like pouring water at every node and watching it spread along edges - each 'pool' of water is one connected component.

Why This Pattern?

Edges define equivalence relations - if u connects to v, they're in the same set. Union-Find naturally models this: each node starts in its own set, and we union sets when we find connections. The key structural property is that connectivity is transitive (if A connects to B and B to C, then A connects to C), which is exactly what equivalence classes capture. Union-Find with path compression + union by rank gives near-O(1) amortized operations, making it ideal for merging sets dynamically.

Solution

from typing import List

class UnionFind:
    def __init__(self, n):
        # Each node starts as its own parent (self-contained set)
        self.parent = list(range(n))
        # Track tree depth for smart union by rank
        self.rank = [0] * n
    
    def find(self, x):
        # Path compression: flatten the tree by pointing directly to root
        # This makes future lookups O(1) instead of O(tree height)
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
    
    def union(self, x, y):
        root_x, root_y = self.find(x), self.find(y)
        
        # Already in same set - don't double count
        if root_x == root_y:
            return False
        
        # Union by rank: attach shallower tree under deeper tree
        # This keeps the trees balanced and lookups fast
        if self.rank[root_x] < self.rank[root_y]:
            root_x, root_y = root_y, root_x
        
        self.parent[root_y] = root_x
        if self.rank[root_x] == self.rank[root_y]:
            self.rank[root_x] += 1
        
        return True


class Solution:
    def countComponents(self, n: int, edges: List[List[int]]) -> int:
        # Start with n separate components (each node is its own island)
        uf = UnionFind(n)
        components = n
        
        # Process each edge - it connects two nodes, reducing component count
        for u, v in edges:
            if uf.union(u, v):
                components -= 1
        
        return components
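
Quick check (example values are mine):

print(Solution().countComponents(5, [[0, 1], [1, 2], [3, 4]]))          # 2
print(Solution().countComponents(5, [[0, 1], [1, 2], [2, 3], [3, 4]]))  # 1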

Complexity

Time: O(n + E * α(n)) ≈ O(n + E), where α(n) is the inverse Ackermann function (effectively constant < 5 for all practical n)
Space: O(n) for the parent and rank arrays

We visit each node once to initialize and process all edges once. The α(n) comes from Union-Find's near-constant time operations. We can't do better than O(n + E) because we must at least examine every edge to know about all connections - you can't know if nodes are connected without looking at the edges that might connect them.

Common Mistakes

Edge Cases

Connections

Number of Islands #200
Depth-First Search (DFS) flood fill / Connected Components

Intuition

Think of the grid as a map where '1' is land and '0' is water. Each island is a connected blob of land - like a territory on a map. The key insight: once you find any piece of land, you need to 'explore' all connected land to mark it as visited. This is like sending scouts from a landing point - they keep spreading to adjacent land until they've mapped the entire island. You count one island per exploration. It's essentially finding connected components in an implicit graph where cells are nodes and edges connect horizontally/vertically adjacent '1's.

Why This Pattern?

The grid forms an implicit graph where each '1' cell is a node and edges exist between adjacent '1's (up, down, left, right). Islands are connected components in this graph. Finding connected components is exactly what DFS does naturally - start from an unvisited node, explore everything reachable, and that's one component. The grid structure makes DFS cleaner than BFS here since we can modify the input in-place.

Solution

def numIslands(grid):
    if not grid:
        return 0
    
    count = 0
    rows, cols = len(grid), len(grid[0])
    
    def dfs(r, c):
        # Base cases: out of bounds or already water/visited
        if r < 0 or r >= rows or c < 0 or c >= cols or grid[r][c] == '0':
            return
        
        # Mark as visited by converting to '0' (consumes the land)
        grid[r][c] = '0'
        
        # Explore all 4 directions (up, down, left, right)
        dfs(r + 1, c)  # down
        dfs(r - 1, c)  # up
        dfs(r, c + 1)  # right
        dfs(r, c - 1)  # left
    
    # Scan entire grid
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == '1':  # Found unvisited land = new island
                count += 1
                dfs(r, c)  # Flood fill to mark entire island as visited
    
    return count
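
Quick check (example grid is mine; remember the cells are the strings '1'/'0', and the grid is consumed in place):

grid = [["1", "1", "0", "0"],
        ["1", "0", "0", "1"],
        ["0", "0", "1", "1"]]
print(numIslands(grid))  # 2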

Complexity

Time: O(rows * cols)
Space: O(rows * cols) in worst case (recursion stack)

Worst case: the entire grid is one island, so DFS goes as deep as rows*cols (a snake-like path). Best case: no land at all, so O(1) stack space. On average, it's proportional to the size of the largest island.

Common Mistakes

Edge Cases

Connections

Pacific Atlantic Water Flow #417
Reverse flood-fill from boundaries (also called 'multi-source BFS/DFS')

Intuition

Imagine you're a drop of water on a mountain range. You can only flow downhill (or stay level) to neighboring cells. The Pacific Ocean touches the left and top edges; the Atlantic touches the right and bottom. Instead of checking every possible path from each cell to both oceans (expensive!), think backwards: if a cell can reach the ocean going forward, then from that ocean we can reach the cell going backward. It's like flood-filling from the coastlines inland — if water can physically flow from the mountains to the sea going forward, we can trace that same path in reverse from the sea back to the mountains. Cells reachable from BOTH oceans are the answer.

Why This Pattern?

The key insight is that water flow is reversible. If water can flow A→B→ocean, then in reverse we can go ocean→B→A. By starting from all boundary cells simultaneously and 'flowing' backward through cells that are uphill or level (at least as high as the current cell), we find every cell that can drain to each ocean. We then intersect the two reachable sets. This transforms an expensive 'from each cell to both boundaries' problem into two 'from boundaries to all cells' problems.

Solution

from typing import List

class Solution:
    def pacificAtlantic(self, heights: List[List[int]]) -> List[List[int]]:
        if not heights or not heights[0]:
            return []
        
        rows, cols = len(heights), len(heights[0])
        
        # Track which cells can reach each ocean
        pacific_reachable = [[False] * cols for _ in range(rows)]
        atlantic_reachable = [[False] * cols for _ in range(rows)]
        
        def dfs(r: int, c: int, visited: List[List[bool]]) -> None:
            """Flood fill from ocean inward - cells can reach the ocean if
            they can flow from this cell to a neighbor that's already reachable."""
            visited[r][c] = True
            # Check all 4 directions - can flow to neighbor if neighbor <= current
            # (water flows downhill or stays level)
            for dr, dc in [(1,0), (-1,0), (0,1), (0,-1)]:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and not visited[nr][nc]:
                    # Can flow backward if neighbor's height <= current cell's height
                    # (we're going in reverse direction)
                    if heights[nr][nc] >= heights[r][c]:
                        dfs(nr, nc, visited)
        
        # Start DFS from Pacific boundary (top row and left column)
        for c in range(cols):
            if not pacific_reachable[0][c]:
                dfs(0, c, pacific_reachable)
        for r in range(rows):
            if not pacific_reachable[r][0]:
                dfs(r, 0, pacific_reachable)
        
        # Start DFS from Atlantic boundary (bottom row and right column)
        for c in range(cols):
            if not atlantic_reachable[rows-1][c]:
                dfs(rows-1, c, atlantic_reachable)
        for r in range(rows):
            if not atlantic_reachable[r][cols-1]:
                dfs(r, cols-1, atlantic_reachable)
        
        # Find cells that can reach both oceans
        result = []
        for r in range(rows):
            for c in range(cols):
                if pacific_reachable[r][c] and atlantic_reachable[r][c]:
                    result.append([r, c])
        
        return result

Complexity

Time: O(m * n)
Space: O(m * n)

We visit each cell at most twice (once from Pacific DFS, once from Atlantic DFS), so that's 2*mn operations. Each cell is marked visited to prevent redundant work. We can't do better because we genuinely need to examine each cell's connectivity to both boundaries — the answer could include any cell in the grid.

Common Mistakes

Edge Cases

Connections

Redundant Connection #684
Union-Find / Disjoint Set Union (DSU)

Intuition

Think of this like building a river delta. A tree is like a river system with no loops - there's exactly one path from any point to any other. When you add ONE extra edge to a tree, you create exactly one loop (cycle), like water finding a shortcut back to itself. The problem asks: which edge, when added, created that loop? Here's the key insight: If you build the graph edge-by-edge, the moment you try to connect two nodes that are ALREADY connected, you've found your redundant edge. Why? Because a proper tree with n nodes has exactly n-1 edges. The moment you add the nth edge, you're guaranteed to create a cycle - it's mathematically impossible not to. It's like finding where the traffic jam formed: when two previously separate traffic paths merge and you discover they're actually already connected, that's the bottleneck - the redundant connection.

Why This Pattern?

DSU is the natural choice because we need to efficiently answer: "Are these two nodes already connected?" as we process each edge. DSU provides nearly O(1) amortized time for this connectivity query using path compression and union by rank. This is exactly the incremental cycle detection we need - we build components as we go and the instant we try to union two nodes already in the same set, we've found our cycle.

Solution

def findRedundantConnection(self, edges: List[List[int]]) -> List[int]:
    # DSU with path compression
    # n nodes, n-1 edges = tree. One extra edge creates exactly one cycle.
    parent = list(range(len(edges) + 1))  # 1-indexed: parent[i] = parent of node i
    
    def find(x):
        """Find root with path compression - makes future lookups O(1)"""
        if parent[x] != x:
            parent[x] = find(parent[x])  # recursively compress path
        return parent[x]
    
    def union(x, y) -> bool:
        """Union two sets. Returns True if successfully merged, False if already connected (cycle!)."""
        px, py = find(x), find(y)
        if px == py:
            # Already in same set - adding this edge creates a cycle!
            return False
        # Simple union: attach one root under the other (rank/size bookkeeping omitted here)
        parent[px] = py
        return True
    
    # Process each edge in order
    for u, v in edges:
        if not union(u, v):
            # Found the edge that creates a cycle - this is our redundant connection
            return [u, v]
    
    # Should never reach here if input guarantees exactly one cycle
    return []

Complexity

Time: O(N × α(N)) where α is the inverse Ackermann function
Space: O(N) for the parent array

For N edges, we do up to N union/find operations. With path compression plus union by rank, each operation takes amortized O(α(N)) - effectively constant (α(N) ≤ 4 for any realistic N). The solution above uses path compression only, which is still near-linear in practice; a union-by-rank variant is sketched below. So practically O(N). We can't do better than O(N) because we must examine each edge at least once to know which one is redundant.
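
Since the bound quoted above assumes union by rank, here is a sketch of find/union with the rank bookkeeping added (the rank array is an addition for illustration, not part of the original code):

def find(parent, x):
    # Iterative find with path halving: every visited node is pointed at its grandparent
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def union(parent, rank, x, y):
    px, py = find(parent, x), find(parent, y)
    if px == py:
        return False  # already connected - this edge would close the cycle
    if rank[px] < rank[py]:
        px, py = py, px
    parent[py] = px       # attach the shorter tree under the taller one
    if rank[px] == rank[py]:
        rank[px] += 1
    return True

# Setup for n nodes (1-indexed): parent = list(range(n + 1)); rank = [0] * (n + 1)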

Common Mistakes

Edge Cases

Connections

Rotting Oranges #994
Multi-Source Breadth-First Search (BFS)

Intuition

Think of this like a contagion spreading through a population. Rotten oranges are 'infected' nodes that transmit the infection to adjacent healthy nodes each minute. This is exactly like pouring dye into water - it spreads outward in waves. The key insight: each 'wave' of BFS represents exactly one minute of time. We're essentially asking: how many waves of infection until all reachable fresh oranges are contaminated? The maximum distance from any fresh orange to its nearest initially-rotten orange tells us the total time needed.

Why This Pattern?

The grid is an unweighted graph where each cell connects to its 4 neighbors. BFS naturally explores by increasing distance from sources - meaning it finds the shortest path in terms of 'hops'. Since each hop represents exactly one minute, the level (depth) at which we reach a fresh orange tells us exactly when it rots. We need multi-source because multiple oranges can start rotten simultaneously, and we want the minimum time from ANY source.

Solution

from collections import deque

def orangesRotting(grid):
    rows, cols = len(grid), len(grid[0])
    queue = deque()
    fresh_count = 0
    
    # First pass: find all initially rotten oranges and count fresh ones
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 2:
                queue.append((r, c, 0))  # (row, col, time)
            elif grid[r][c] == 1:
                fresh_count += 1
    
    # No fresh oranges at all - already done!
    if fresh_count == 0:
        return 0
    
    # No rotten oranges to start the chain reaction
    if not queue:
        return -1
    
    directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    minutes = 0
    
    # BFS: process level by level (each level = one minute)
    while queue:
        r, c, time = queue.popleft()
        minutes = time  # Track the latest time we've processed
        
        # Spread to adjacent cells
        for dr, dc in directions:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 1:
                grid[nr][nc] = 2  # Rot!
                fresh_count -= 1
                queue.append((nr, nc, time + 1))
    
    # If any fresh oranges remain, they were unreachable
    return -1 if fresh_count > 0 else minutes

Complexity

Time: O(rows * cols)
Space: O(rows * cols) in worst case

The queue could hold up to all cells in the worst case (if all oranges start rotten and we add them all). Additionally, we modify the grid in-place, so the extra space is primarily the queue. In practice, it's bounded by the number of rotten oranges at any given BFS level.
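
A quick sanity check using the well-known example for this problem (grid values assumed for illustration, not taken from the text above):

grid = [[2, 1, 1],
        [1, 1, 0],
        [0, 1, 1]]
print(orangesRotting(grid))  # 4 - the bottom-right orange is the last to rot, at minute 4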

Common Mistakes

Edge Cases

Connections

Surrounded Regions #130
Boundary-Connected Flood Fill with Complement

Intuition

Think of this like water flow or an escape room. The 'O's on the boundary are like exits - any 'O' connected to the boundary can 'escape' to the outside world. Only the 'O's that have NO path to the boundary are truly trapped (surrounded). Instead of trying to find all trapped regions directly (hard), we flip the problem: flood-fill from the boundary to mark everything that CAN escape, then flip everything else. It's like asking 'what's NOT captured' rather than 'what is captured'.

Why This Pattern?

The key insight is that any 'O' on the boundary (or connected to one via other 'O's) is NOT surrounded - it's 'touching the ocean' and can escape. Rather than detecting surrounded regions directly (which requires awkward region-by-region bookkeeping), we find the complement: mark all escapeable 'O's, then flip everything remaining. This keeps the work linear in the grid size because we visit each cell at most a constant number of times.

Solution

def solve(board):
    if not board:
        return
    
    rows, cols = len(board), len(board[0])
    
    def dfs(r, c):
        # Base: out of bounds or not an unvisited 'O'
        if r < 0 or r >= rows or c < 0 or c >= cols or board[r][c] != 'O':
            return
        
        # Mark this 'O' as escapeable (temporary marker)
        board[r][c] = 'E'
        
        # Visit all 4 neighbors - water flows out
        dfs(r + 1, c)
        dfs(r - 1, c)
        dfs(r, c + 1)
        dfs(r, c - 1)
    
    # Step 1: Start DFS from ALL boundary cells that are 'O'
    # These are the 'water sources' that can reach the outside
    for r in range(rows):
        for c in range(cols):
            is_boundary = (r == 0 or r == rows - 1 or c == 0 or c == cols - 1)
            if is_boundary and board[r][c] == 'O':
                dfs(r, c)
    
    # Step 2: Flip remaining 'O's (trapped) to 'X', restore 'E' to 'O'
    for r in range(rows):
        for c in range(cols):
            if board[r][c] == 'O':
                board[r][c] = 'X'  # Trapped - capture it
            elif board[r][c] == 'E':
                board[r][c] = 'O'  # Was escapeable - restore

Complexity

Time: O(rows × cols)
Space: O(rows × cols) for recursion stack in worst case (all cells are connected 'O's)

Each cell is visited at most twice: once during boundary flood fill (if it's escapeable), and once in the final pass. We can't do better because we must check every cell to determine if it's trapped or not. The recursion stack is O(n) in worst case because the DFS could theoretically traverse every cell in a snake-like pattern.
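
An illustrative run on the classic example (the board is mutated in place; values assumed for illustration):

board = [["X", "X", "X", "X"],
         ["X", "O", "O", "X"],
         ["X", "X", "O", "X"],
         ["X", "O", "X", "X"]]
solve(board)
# The interior region is captured; the 'O' at (3, 1) touches the bottom edge and survives:
# [["X", "X", "X", "X"],
#  ["X", "X", "X", "X"],
#  ["X", "X", "X", "X"],
#  ["X", "O", "X", "X"]]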

Common Mistakes

Edge Cases

Connections

Walls and Gates #286
Multi-Source Breadth-First Search (BFS)

Intuition

Think of this as dropping pebbles (gates) into a pond simultaneously. Each ripple expands outward one unit at a time. When a ripple first touches an empty room, that's the shortest distance to the nearest gate. BFS naturally models this 'wavefront' expansion - we process all cells at distance d before any at distance d+1, guaranteeing we find the shortest path. Multi-source BFS is the key: starting from ALL gates at once means we don't have to try each gate separately - the waves collide at the optimal boundary.

Why This Pattern?

BFS guarantees shortest path in unweighted graphs (each move costs 1). Multi-source BFS is optimal here because: (1) we want distance to the NEAREST gate, not any gate (2) starting from all gates simultaneously avoids redundant searches (3) the wavefronts naturally meet at the optimal boundary between gate territories. This is fundamentally a shortest-path problem on an unweighted grid.

Solution

from collections import deque
from typing import List

def wallsAndGates(rooms: List[List[int]]) -> None:
    if not rooms or not rooms[0]:
        return
    
    rows, cols = len(rooms), len(rooms[0])
    INF = 2**31 - 1
    queue = deque()
    
    # Step 1: Find all gates and add to queue (these are our "sources")
    for r in range(rows):
        for c in range(cols):
            if rooms[r][c] == 0:
                queue.append((r, c))
    
    # Step 2: BFS expands wavefront from all gates simultaneously
    # directions: up, down, left, right
    directions = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    
    while queue:
        r, c = queue.popleft()
        
        for dr, dc in directions:
            nr, nc = r + dr, c + dc
            # Check bounds and if room is empty (unvisited)
            if 0 <= nr < rows and 0 <= nc < cols and rooms[nr][nc] == INF:
                # Distance is current cell's distance + 1
                rooms[nr][nc] = rooms[r][c] + 1
                # Add to queue to continue wavefront expansion
                queue.append((nr, nc))

Complexity

Time: O(m * n) where m = rows, n = cols
Space: O(m * n) in worst case

Every cell is visited at most once. We start with all gates in queue, then each empty room gets visited exactly when first reached by a wavefront. No cell is processed twice because we mark rooms as visited by setting them to a finite distance (they start as INF).
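
An illustrative run on the canonical example (0 = gate, -1 = wall, INF = empty room; values assumed for illustration):

INF = 2**31 - 1
rooms = [[INF,  -1,   0, INF],
         [INF, INF, INF,  -1],
         [INF,  -1, INF,  -1],
         [  0,  -1, INF, INF]]
wallsAndGates(rooms)
# rooms now holds the distance from each room to its nearest gate:
# [[3, -1, 0,  1],
#  [2,  2, 1, -1],
#  [1, -1, 2, -1],
#  [0, -1, 3,  4]]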

Common Mistakes

Edge Cases

Connections

Word Ladder #127
BFS on an implicit graph with wildcard-pattern indexing. This is fundamentally a shortest-path problem on an unweighted graph where nodes are words and edges exist between words differing by exactly one letter.

Intuition

Imagine each word as a node in a vast network, and you can jump between nodes if they're exactly one letter apart (like 'hit' → 'hot'). You're trying to find the shortest route from your starting word to your target word. This is exactly the 'six degrees of Kevin Bacon' for words. BFS is the natural choice because it explores all paths of length 1, then length 2, etc. — guaranteeing the first time you reach the target, you've found the shortest possible path. The key insight is that we don't need to pre-build the entire graph; instead, we generate neighbors on-the-fly by treating each letter position as a 'door' (wildcard), and every word that shares the same door pattern is a neighbor.

Why This Pattern?

BFS guarantees shortest path in unweighted graphs because it explores level-by-level, finding all nodes at distance d before any at distance d+1. The wildcard pattern optimization avoids O(n²) neighbor-finding by using a hash map: for each word position, replace that character with '*' to create a pattern key. All words with the same pattern are by definition one letter apart. This transforms neighbor discovery from comparing every word against every other word to a simple O(1) hash lookup.

Solution

from collections import defaultdict, deque

def ladderLength(beginWord: str, endWord: str, wordList: list) -> int:
    wordSet = set(wordList)
    if endWord not in wordSet:
        return 0
    
    # Build pattern map: each pattern maps to all words sharing that pattern
    # e.g., 'hot' produces '*ot', 'h*t', 'ho*'
    # All words under the same pattern are exactly one letter apart
    pattern_map = defaultdict(list)
    for word in wordSet:
        for i in range(len(word)):
            pattern = word[:i] + '*' + word[i+1:]
            pattern_map[pattern].append(word)
    
    # BFS: (current_word, transformation_count)
    # We count the word itself in the length, so beginWord = 1
    queue = deque([(beginWord, 1)])
    visited = {beginWord}
    
    while queue:
        word, length = queue.popleft()
        
        # Generate all possible patterns from current word
        for i in range(len(word)):
            pattern = word[:i] + '*' + word[i+1:]
            
            # All words matching this pattern are valid next moves
            for next_word in pattern_map[pattern]:
                if next_word == endWord:
                    return length + 1
                
                if next_word not in visited:
                    visited.add(next_word)
                    queue.append((next_word, length + 1))
            
            # Clear this bucket so we never rescan the same group of neighbors again
            pattern_map[pattern] = []
    
    return 0

Complexity

Time: O(M² * N) where M = word length and N = number of words in wordList. Each word generates M wildcard patterns, and building each pattern (a string slice) costs O(M); the lookup itself is an O(1) hash probe. In the worst case we visit each word once and examine all M positions.
Space: O(M² * N) for the pattern_map (each of the N words appears under M patterns, and each pattern key has length M), plus O(N) for the visited set and queue.

We need to store all pattern-to-word mappings because we don't know ahead of time which pattern will connect our current word to its neighbors. The visited set is essential to prevent infinite loops in what could be a cyclic graph. The queue holds at most one copy of each word at any time, bounded by N.
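
A quick check with the standard example (the count includes beginWord itself):

word_list = ["hot", "dot", "dog", "lot", "log", "cog"]
print(ladderLength("hit", "cog", word_list))  # 5: hit -> hot -> dot -> dog -> cog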

Common Mistakes

Edge Cases

Connections

Advanced Graphs (6)

Alien Dictionary #269
Topological Sort (Kahn's Algorithm)

Intuition

Imagine you're a linguist trying to reconstruct an alien alphabet from a dictionary. You have words sorted in unknown order, and you need to deduce character ordering. The key insight: compare adjacent words. The FIRST character where they differ reveals the ordering. If 'cat' comes before 'car', then 't' must come before 'r' in their alphabet—you've discovered a dependency. Think of this like reconstructing a family tree where each character has constraints on who must come before/after them. This is a classic dependency-resolution problem, solved by finding an ordering where all constraints are satisfied.

Why This Pattern?

The problem gives us pairwise ordering constraints between characters (edges in a directed graph). We need a linear ordering of all vertices (characters) such that every edge points 'forward'—this is exactly what topological sort computes. The first differing character between adjacent words creates a directed edge representing 'this character must come before that one'.

Solution

from collections import defaultdict, deque

def alienOrder(words):
    # Step 1: Build the graph and track all unique characters
    graph = defaultdict(set)
    all_chars = set()
    for word in words:
        all_chars.update(word)
    
    # Step 2: Add edges based on first differing character between adjacent words
    for i in range(len(words) - 1):
        w1, w2 = words[i], words[i + 1]
        # Find first difference
        min_len = min(len(w1), len(w2))
        for j in range(min_len):
            if w1[j] != w2[j]:
                # w1[j] comes before w2[j] in alien language
                if w2[j] not in graph[w1[j]]:
                    graph[w1[j]].add(w2[j])
                break
        else:
            # No difference found: check valid ordering (shorter first)
            if len(w1) > len(w2):
                return ""  # Invalid: prefix comes after longer word
    
    # Step 3: Calculate in-degrees
    in_degree = {char: 0 for char in all_chars}
    for char in graph:
        for neighbor in graph[char]:
            in_degree[neighbor] += 1
    
    # Step 4: Kahn's algorithm - start with characters having no incoming edges
    queue = deque([char for char in all_chars if in_degree[char] == 0])
    result = []
    
    while queue:
        char = queue.popleft()
        result.append(char)
        
        for neighbor in graph[char]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)
    
    # If we didn't process all characters, there's a cycle (invalid dict)
    return "".join(result) if len(result) == len(all_chars) else ""

Complexity

Time: O(C + N) where C is the number of unique characters and N is total characters across all words
Space: O(C + E) for the adjacency sets, in-degree map, queue, and result, where E is the number of ordering constraints (at most one per adjacent word pair, and at most C² distinct edges)

We traverse each character at most twice: once when building the graph (comparing adjacent word pairs) and once during BFS processing. Each edge is also processed once when decrementing in-degrees. We can't do better because we must examine every character and every constraint to establish their relationships.
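
A quick check with the standard example (input assumed for illustration):

words = ["wrt", "wrf", "er", "ett", "rftt"]
# Constraints discovered: t<f (wrt vs wrf), w<e (wrf vs er), r<t (er vs ett), e<r (ett vs rftt)
print(alienOrder(words))  # "wertf"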

Common Mistakes

Edge Cases

Connections

Cheapest Flights Within K Stops #787
Constrained Shortest Path with Bellman-Ford

Intuition

Think of this like finding the cheapest subway route where you're limited in how many transfers (stops) you can make. Each flight is a 'step' in the journey, and you can take at most K+1 steps total (K stops means K+1 flights). The key insight: Bellman-Ford naturally builds up shortest paths iteration by iteration. After the first iteration, you know the cheapest way to reach each city using exactly 1 flight. After the second iteration, you know the cheapest way using at most 2 flights. So after K+1 iterations, you know the cheapest way using at most K+1 flights - exactly what we need!

Why This Pattern?

Bellman-Ford is the natural choice because it iteratively improves path costs by considering one more edge at a time. Each iteration represents adding one more flight to our journey. By stopping after K+1 iterations, we exactly enforce the 'at most K stops' constraint. Dijkstra doesn't naturally handle this edge-count constraint because it greedily picks the cheapest path without tracking how many edges were used.

Solution

def findCheapestPrice(n, flights, src, dst, K):
    # prices[i] = cheapest price to reach city i using at most as many flights as iterations completed so far
    prices = [float('inf')] * n
    prices[src] = 0
    
    # Relax all edges K+1 times:
    # K stops = K intermediate cities, so at most K+1 flights
    for i in range(K + 1):
        # Copy prices to avoid using updated values within same iteration
        # (this ensures each iteration only adds exactly one more flight)
        tmp = prices[:]
        for s, d, p in flights:
            # Can't reach source city yet, skip
            if prices[s] == float('inf'):
                continue
            # If going through s to d is cheaper, update d's price
            if prices[s] + p < tmp[d]:
                tmp[d] = prices[s] + p
        prices = tmp
    
    return prices[dst] if prices[dst] != float('inf') else -1

Complexity

Time: O((K+1) * E) where E is number of flights
Space: O(N) for the price array

We iterate through all E edges K+1 times. This is necessary because we must consider paths of length 1, 2, ..., K+1. Each path length requires a full pass through all edges to compute (no early termination because a cheaper path with more stops might exist). We can't do better than this worst-case because we need to evaluate all possible paths up to length K+1.
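
A quick check with the standard example (4 cities, at most 1 stop):

flights = [[0, 1, 100], [1, 2, 100], [2, 0, 100], [1, 3, 600], [2, 3, 200]]
print(findCheapestPrice(4, flights, 0, 3, 1))  # 700: 0 -> 1 -> 3 (0 -> 1 -> 2 -> 3 costs 400 but needs 2 stops)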

Common Mistakes

Edge Cases

Connections

Min Cost to Connect All Points #1584
Minimum Spanning Tree (MST) using Prim's Algorithm

Intuition

Think of this like building a railway network across cities. You need to connect all cities but want to minimize total track length. The key insight: at each step, the optimal move is to add the shortest edge that connects an unvisited city to your growing network. This greedy choice works because of the cut property - in any partition of nodes, the minimum-weight edge crossing that cut MUST be in the optimal spanning tree. Imagine splitting your points into 'already connected' and 'not yet connected' - the cheapest bridge between these two groups is always part of the optimal solution.

Why This Pattern?

We have a complete graph where every point can connect to every other point, and we need the minimum-weight subgraph that connects ALL nodes without cycles. This is exactly the definition of MST. Prim's suits a dense graph: an array-based Prim's runs in O(n²) with no sorting at all, and even the heap-based version used here avoids materializing and sorting all ~n² edges up front the way Kruskal's O(n² log n) sort does.

Solution

import heapq
from typing import List

class Solution:
    def minCostConnectPoints(self, points: List[List[int]]) -> int:
        """
        Prim's Algorithm with min-heap.
        Start from any point (0), then greedily add the closest unvisited point.
        """
        n = len(points)
        if n == 1:
            return 0
        
        # Track which points are already in our growing tree
        visited = [False] * n
        # Min-heap of (cost, point_index) - always grab cheapest edge
        min_heap = [(0, 0)]  # Start from point 0 with cost 0
        total_cost = 0
        edges_used = 0
        
        while min_heap and edges_used < n:
            # Get the minimum cost edge to an unvisited point
            cost, point = heapq.heappop(min_heap)
            
            # Skip if this point is already connected
            if visited[point]:
                continue
            
            # Add this edge to our tree
            visited[point] = True
            total_cost += cost
            edges_used += 1
            
            # Try connecting to all unvisited points
            for next_point in range(n):
                if not visited[next_point]:
                    # Manhattan distance: |x1-x2| + |y1-y2|
                    dist = abs(points[point][0] - points[next_point][0]) + \
                           abs(points[point][1] - points[next_point][1])
                    heapq.heappush(min_heap, (dist, next_point))
        
        return total_cost

Complexity

Time: O(n² log n)
Space: O(n²) for the heap in the worst case, plus O(n) for the visited array

We process each of the n points once (that's one n factor). Each time we add a point, we may push up to n candidate edges onto the heap (the other n factor), so the heap can hold O(n²) entries and each heap operation costs O(log n²) = O(log n), giving O(n² log n) time. We can't avoid considering all pairwise edges in the worst case because the graph is complete - every point could connect to every other point.

Common Mistakes

Edge Cases

Connections

Network Delay Time #743
Single-Source Shortest Path (SSSP) with Dijkstra's Algorithm

Intuition

Think of this like dropping a stone in a pond and watching the ripples spread outward. The signal propagates from source k like a wavefront, with each node getting 'infected' at the earliest possible time. The key insight: the network is fully informed when the LAST node receives the signal. We're looking for the longest shortest-path from k to any reachable node — like asking 'how long until the ripple reaches the farthest point?'

Why This Pattern?

The problem has non-negative edge weights and asks for shortest paths from ONE source to ALL nodes. Dijkstra's algorithm is the natural choice because it greedily expands from the currently closest unvisited node, guaranteeing we find the earliest arrival time at each node. This is fundamentally a 'earliest arrival' problem, which is exactly what Dijkstra solves.

Solution

from heapq import heappush, heappop

def networkDelayTime(times, n, k):
    # Build adjacency list: graph[u] = [(v, w), ...]
    graph = [[] for _ in range(n + 1)]
    for u, v, w in times:
        graph[u].append((v, w))
    
    # Distance array: earliest time to reach each node
    dist = [float('inf')] * (n + 1)
    dist[k] = 0
    
    # Min-heap stores (time_to_reach, node) - always process earliest time first
    pq = [(0, k)]
    
    # Track maximum distance to any reached node
    max_time = 0
    visited = 0
    
    while pq:
        d, node = heappop(pq)
        
        # Skip if we've already found a faster way to this node
        if d > dist[node]:
            continue
        
        visited += 1
        max_time = max(max_time, d)
        
        # Explore neighbors: can we reach them faster through current node?
        for neighbor, weight in graph[node]:
            new_dist = d + weight
            if new_dist < dist[neighbor]:
                dist[neighbor] = new_dist
                heappush(pq, (new_dist, neighbor))
    
    # If we couldn't reach all nodes, return -1
    return max_time if visited == n else -1

Complexity

Time: O((V + E) log V)
Space: O(V + E)
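
A quick check with the standard example (signal sent from node 2 into a 4-node chain):

times = [[2, 1, 1], [2, 3, 1], [3, 4, 1]]
print(networkDelayTime(times, 4, 2))  # 2 - node 4 is the last to hear the signal, at time 2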

Common Mistakes

Edge Cases

Connections

Reconstruct Itinerary #332
Hierholzer's Algorithm variant with greedy edge selection (Eulerian Path in directed graph)

Intuition

Think of this like planning a road trip where you MUST use every single road exactly once - you're finding a Eulerian path. The key insight: when you have multiple flight choices from an airport, always pick the lexicographically SMALLEST destination first. Why? Imagine you have two paths available - a short one to 'B' and a longer one to 'Z'. If you take 'Z' first, you might get stuck later because 'B' was your only way to reach the remaining tickets. By taking the smallest option early, you 'use up' your constraints while you still have maximum flexibility - you're essentially making sure you don't paint yourself into a corner. This is like a stack of cards - play your smallest cards early so they don't clutter your hand later.

Why This Pattern?

The problem guarantees a valid Eulerian path exists: every ticket can be used exactly once starting from 'JFK', so in/out degrees balance at every airport except possibly the start and end. Greedily taking the lexicographically smallest destination, combined with the post-order construction below (Hierholzer's algorithm), yields the lexicographically smallest itinerary: if the greedy choice runs into a dead end, the post-order insertion splices that dead-end segment into the right place instead of failing.

Solution

from collections import defaultdict
import heapq

def findItinerary(tickets):
    # Build graph: airport -> list of destinations (sorted, smallest first)
    graph = defaultdict(list)
    for src, dst in tickets:
        heapq.heappush(graph[src], dst)  # min-heap = automatic sorting
    
    # DFS from JFK, building itinerary in POST-ORDER
    # (we add airport AFTER exploring all its outgoing flights)
    itinerary = []
    
    def dfs(airport):
        # Keep exploring while there are destinations available
        while graph[airport]:
            # Always take the smallest lexical destination (greedy choice)
            next_dest = heapq.heappop(graph[airport])
            dfs(next_dest)
        # After exhausting all flights FROM this airport, add to itinerary
        # This is like the "return" in a function call stack
        itinerary.append(airport)
    
    dfs("JFK")
    
    # Reverse because we built it post-order (like reverse of DFS finish times)
    return itinerary[::-1]

Complexity

Time: O(E log E) where E = number of tickets (edges). Each ticket is pushed onto and popped from a heap exactly once, and each heap operation costs O(log E) (an airport's heap holds at most that airport's outgoing tickets).
Space: O(V + E) - we store the graph (E edges), and the recursion stack can go as deep as the itinerary itself (E + 1 airports in the worst case, since airports can repeat along the path).

We can't do better than O(E) because we MUST process every ticket exactly once to use all edges. At each airport we also need the smallest available destination, which the heap retrieves in logarithmic time. The space is also tight - we literally need to remember all flights (E) and potentially the entire path.
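
A quick check with the standard example (every ticket must be used, starting from 'JFK'):

tickets = [["MUC", "LHR"], ["JFK", "MUC"], ["SFO", "SJC"], ["LHR", "SFO"]]
print(findItinerary(tickets))  # ['JFK', 'MUC', 'LHR', 'SFO', 'SJC']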

Common Mistakes

Edge Cases

Connections

Swim in Rising Water #778
Minimax path with node costs using modified Dijkstra's algorithm. Instead of summing edge weights, we take the maximum of the path cost and current node's cost.

Intuition

Think of this like escaping a flooding terrain. You start at the highest point of your path and wait as water rises. The question is: how high must the water rise before you can swim from start to end? You want to find a path that stays as low as possible - you're minimizing the maximum elevation you need to traverse. This is like a hiker wanting to cross mountains while staying in valleys as much as possible. The water level acts like a threshold - any cell with height ≤ threshold is flooded and swimmable. You want the minimum threshold that connects start to end.

Why This Pattern?

Dijkstra's algorithm works because when we pop a node from the priority queue, we've found the optimal minimax cost to reach it. For each neighbor, the cost to reach it through current node is max(current_path_cost, neighbor_height). This correctly computes 'the minimum possible maximum height along any path to this cell'. The priority queue orders by this cost, guaranteeing we process cells in order of increasing minimum-required-water-level.

Solution

import heapq

def swimInWater(grid):
    n = len(grid)
    # Minimum time to reach each cell - initialized to infinity
    min_time = [[float('inf')] * n for _ in range(n)]
    min_time[0][0] = grid[0][0]  # Starting cell needs water to reach its height
    
    # Priority queue: (time_needed, row, col)
    # Heap orders by time_needed (minimum max-height path found so far)
    pq = [(grid[0][0], 0, 0)]
    
    # 4 directions: right, down, left, up
    directions = [(0, 1), (1, 0), (0, -1), (-1, 0)]
    
    while pq:
        current_time, row, col = heapq.heappop(pq)
        
        # Early exit - we found the shortest path to destination
        if row == n - 1 and col == n - 1:
            return current_time
        
        # Skip if we've already found a better path to this cell
        if current_time > min_time[row][col]:
            continue
            
        # Explore neighbors
        for dr, dc in directions:
            nr, nc = row + dr, col + dc
            
            # Check bounds
            if 0 <= nr < n and 0 <= nc < n:
                # Time to reach neighbor = max of current path's max height and neighbor's height
                # This represents: "to swim here, water must rise to at least this level"
                new_time = max(current_time, grid[nr][nc])
                
                # If this path is better, update and add to queue
                if new_time < min_time[nr][nc]:
                    min_time[nr][nc] = new_time
                    heapq.heappush(pq, (new_time, nr, nc))
    
    return min_time[n-1][n-1]

Complexity

Time: O(n² log(n²)) = O(n² log n)
Space: O(n²) for the visited/distance array and priority queue

We potentially visit every cell once (n² total). Each heap operation costs O(log(n²)) = O(log n). The work is proportional to the grid size - we can't possibly do better than looking at all cells because any cell could be part of the optimal path.
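
A quick check on a 2x2 grid (values assumed for illustration):

grid = [[0, 2],
        [1, 3]]
print(swimInWater(grid))  # 3 - every path into the bottom-right corner must cross height 3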

Common Mistakes

Edge Cases

Connections

1-D Dynamic Programming (12)

Climbing Stairs #70
Fibonacci Sequence / Dynamic Programming with O(1) Space

Intuition

Imagine water flowing down a staircase. At each step, the 'flow' (number of ways to arrive) comes from two sources: the step immediately above (arrived by taking 1 step) and the step two above (arrived by taking 2 steps). The total flow at any step is the sum of these two incoming flows. It's like a conservation law - the number of distinct paths reaching step n equals the sum of paths reaching step n-1 and step n-2. This is why the answer follows a Fibonacci-like pattern: 1, 2, 3, 5, 8... Each number 'remembers' the history of all paths that could lead to it.

Why This Pattern?

The problem has optimal substructure - the answer for n depends directly on answers for n-1 and n-2. There's also overlapping subproblems (we'd recompute the same values multiple times with naive recursion). The recurrence relation dp[n] = dp[n-1] + dp[n-2] naturally emerges from asking: 'What was the last move I made to reach step n?' You either came from n-1 (1 step) or n-2 (2 steps), so total ways = ways(n-1) + ways(n-2).

Solution

class Solution:
    def climbStairs(self, n: int) -> int:
        # Base cases: 1 way to climb 1 stair, 2 ways to climb 2 stairs
        if n <= 2:
            return n
        
        # Use two variables (like Fibonacci) - we only need previous two values
        prev2 = 1  # ways to reach step 1
        prev1 = 2  # ways to reach step 2
        
        # Iterate from step 3 to n, building up the answer
        for i in range(3, n + 1):
            current = prev1 + prev2  # ways(i) = ways(i-1) + ways(i-2)
            prev2 = prev1  # shift window: move prev2 forward
            prev1 = current  # update prev1 to current value
        
        return prev1

Complexity

Time: O(n)
Space: O(1)

We iterate exactly n-2 times (for n > 2), performing O(1) work each time - this is optimal because we must compute values for each step from 2 to n. The O(1) space comes from only tracking two variables instead of the entire sequence - we don't need to remember values from 5 steps ago because each step only depends on its immediate two predecessors.
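
The same recurrence also works top-down; a memoized sketch added here to illustrate the 'overlapping subproblems' point (not part of the original solution):

from functools import lru_cache

def climb_stairs_memo(n: int) -> int:
    # Top-down version of ways(i) = ways(i-1) + ways(i-2); the cache collapses
    # the exponential recursion tree into O(n) distinct subproblems.
    @lru_cache(maxsize=None)
    def ways(i: int) -> int:
        if i <= 2:
            return i
        return ways(i - 1) + ways(i - 2)
    return ways(n)

print(climb_stairs_memo(5))  # 8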

Common Mistakes

Edge Cases

Connections

Coin Change #322
Dynamic Programming - Bottom-up Tabulation (Unbounded Knapsack variant for minimization)

Intuition

Think of this like climbing a ladder where each coin is a step size. You're at amount 0 and want to reach your target amount. Each coin denomination lets you take a 'jump' forward by that amount. You want the path with the fewest jumps. The key insight: to know the minimum jumps to reach amount 'i', you need to know the minimum jumps to reach 'i - coin' for every coin that fits. This is like a shortest-path problem in a graph where each amount is a node and each coin creates an edge from amount 'i' to amount 'i + coin'.

Why This Pattern?

The problem exhibits optimal substructure: the minimum coins for amount 'n' depends on the minimum coins for smaller amounts (n - coin). There are also overlapping subproblems - we compute the same smaller amounts repeatedly. The 'unbounded' part means we can use each coin unlimited times, just like filling a knapsack with unlimited items.

Solution

def coinChange(coins, amount):
    # dp[i] represents the minimum coins needed to make amount i
    # Initialize with infinity (impossible state), except dp[0] = 0
    dp = [float('inf')] * (amount + 1)
    dp[0] = 0
    
    # Fill the dp table bottom-up
    for current_amount in range(1, amount + 1):
        # Try using each coin to reach this amount
        for coin in coins:
            # Can we use this coin? (coin must not exceed current_amount)
            if coin <= current_amount:
                # If we use this coin, we need dp[current_amount - coin] coins
                # to reach the remainder, plus 1 for this coin
                # Take the minimum over all valid coins
                dp[current_amount] = min(dp[current_amount], dp[current_amount - coin] + 1)
    
    # If dp[amount] is still infinity, we couldn't make that amount
    return dp[amount] if dp[amount] != float('inf') else -1

Complexity

Time: O(amount * n) where n = number of coin denominations
Space: O(amount)

We iterate through each amount from 1 to target (amount times), and for each amount, we check all n coins. We can't do better because we must consider every coin at every amount to guarantee finding the true minimum - there's no way to 'skip' combinations without checking them.
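
A quick check with the standard examples:

print(coinChange([1, 2, 5], 11))  # 3 - best is 5 + 5 + 1
print(coinChange([2], 3))         # -1 - amount 3 can't be made from 2s alone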

Common Mistakes

Edge Cases

Connections

Decode Ways #91
1-D Dynamic Programming (Fibonacci-like)

Intuition

Think of this like a signal propagating through a chain. Each digit is a signal that can either stand alone (decode as a single letter) or combine with its neighbor (decode as a pair, if the pair forms 10-26). The number of valid decodings at position i depends on what happened at positions i-1 and i-2 — like a cascade or domino effect where each state absorbs valid contributions from previous states. It's analogous to counting paths in a graph where valid single-digit and double-digit decodings are edges pointing forward.

Why This Pattern?

The problem has optimal substructure: the number of ways to decode the first i characters depends on the number of ways to decode the first i-1 characters (if current digit is valid alone) and i-2 characters (if the current digit combines with previous to form a valid 10-26). The subproblems overlap naturally, making DP the natural choice. The recurrence f(i) = f(i-1) + f(i-2) mirrors climbing stairs, but with validity constraints.

Solution

def numDecodings(s: str) -> int:
    # Edge case: empty string or starts with '0' - no valid decoding
    if not s or s[0] == '0':
        return 0
    
    n = len(s)
    # dp[i] = number of ways to decode s[0:i+1]
    # Only need previous two states, so optimize to O(1) space
    prev2 = 1  # dp[0], base case: empty string has 1 way
    prev1 = 1  # dp[1], ways for first character (always 1 if not '0')
    
    for i in range(1, n):
        curr = 0
        
        # Case 1: Decode s[i] alone (if it's not '0')
        # '1'-'9' can stand alone
        if s[i] != '0':
            curr = prev1
        
        # Case 2: Decode s[i-1:i+1] as a pair (if valid 10-26)
        two_digit = int(s[i-1:i+1])
        if 10 <= two_digit <= 26:
            curr += prev2
        
        # If curr is 0, no valid decoding exists (e.g., '0', '00', '30')
        if curr == 0:
            return 0
        
        # Shift window forward
        prev2, prev1 = prev1, curr
    
    return prev1

Complexity

Time: O(n)
Space: O(1)

We iterate through the string once, doing O(1) work per character (checking single digit and forming the two-digit number). We can't do better than O(n) because we must inspect every digit to know the total number of decodings. Space is O(1) because we only track the two most recent DP states; the full DP table isn't needed.
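
A quick check:

print(numDecodings("226"))  # 3 - "BZ" (2,26), "VF" (22,6), "BBF" (2,2,6)
print(numDecodings("06"))   # 0 - a leading '0' can't be decoded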

Common Mistakes

Edge Cases

Connections

House Robber II #213
1-D Dynamic Programming with circular boundary handling - breaking a circular constraint into two linear subproblems

Intuition

Imagine the houses as a circular necklace. The key insight is that in a circle, if you rob house 0, you CANNOT rob house n-1 (they're adjacent). But if you DON'T rob house 0, you CAN rob house n-1. This creates two mutually exclusive scenarios that cover all possibilities. Think of it like a decision at the boundary: either we 'commit' to robbing the first house (and are thus prohibited from robbing the last), or we 'skip' the first house (and are free to consider the last). The optimal solution must be one of these two paths through the circle. The problem reduces to solving the classic linear House Robber twice and taking the maximum.

Why This Pattern?

The circular adjacency creates a 'first-and-last mutually exclusive' constraint. When two options are mutually exclusive (can't pick both), a powerful technique is to consider each option separately and take the maximum. We take the linear DP solution and run it on two modified arrays: nums[0:n-1] (exclude last house, allowing us to take first) and nums[1:n] (exclude first house, allowing us to take last). The circle is 'broken' at a different point in each case, converting it to the standard linear problem.

Solution

class Solution:
    def rob(self, nums: List[int]) -> int:
        n = len(nums)
        if n == 0:
            return 0
        if n == 1:
            return nums[0]
        
        # Break the circle into two linear cases:
        # Case 1: Rob house 0 -> cannot rob house n-1, so consider nums[:-1]
        # Case 2: Don't rob house 0 -> can rob house n-1, so consider nums[1:]
        
        def rob_linear(houses):
            """Standard House Robber I solution for a linear street."""
            m = len(houses)
            if m == 0:
                return 0
            if m == 1:
                return houses[0]
            
            # dp[i] = max money robbing up to house i
            # Two variables suffice: prev1 = dp[i-1], prev2 = dp[i-2]
            prev2, prev1 = 0, 0
            for money in houses:
                # Either skip current (prev1) or rob it (prev2 + money)
                curr = max(prev1, prev2 + money)
                prev2 = prev1
                prev1 = curr
            return prev1
        
        # Take max of both scenarios
        return max(rob_linear(nums[:-1]), rob_linear(nums[1:]))

Complexity

Time: O(n)
Space: O(1) extra for the DP itself (two variables regardless of input size); note that the slices nums[:-1] and nums[1:] each copy O(n) elements, which could be avoided by passing index bounds into rob_linear instead

Common Mistakes

Edge Cases

Connections

House Robber #198
1-D Dynamic Programming with Linear Scrolling State

Intuition

Imagine you're a mountain climber choosing which peaks to summit. You can only move to non-adjacent peaks (can't rob neighboring houses). At each peak, you face a choice: take it and add its height to your score, but then you must skip the next one; or skip it and move on with your current best. The key insight is this: your decision at house i depends ONLY on what was optimal at houses i-1 and i-2. The problem has 'memory' - past decisions constrain future options in a specific, predictable way - but only two steps' worth of memory, which is why a simple recurrence captures the global optimum.

Why This Pattern?

The problem exhibits optimal substructure: the best answer for house i depends ONLY on the best answers for houses i-1 and i-2. This is the signature of DP problems. There's no need to consider earlier houses because any optimal path to house i must either include i-1 (in which case it can't include i) or exclude i-1 (in which case it's already the optimal path to i-1). The decision at each step only looks back 1 or 2 steps, making this a linear scrolling DP where we maintain only the last two states.

Solution

def rob(nums):
    if not nums:
        return 0
    if len(nums) == 1:
        return nums[0]
    
    # Edge case: two houses - just take the max
    if len(nums) == 2:
        return max(nums[0], nums[1])
    
    # We only need to track the previous two states
    # prev2 = max money robbing up to house i-2
    # prev1 = max money robbing up to house i-1
    prev2 = nums[0]
    prev1 = max(nums[0], nums[1])
    
    for i in range(2, len(nums)):
        # Two choices:
        # 1. Don't rob current house: take whatever was optimal at i-1 (prev1)
        # 2. Rob current house: add current house value to best we could do at i-2 (prev2)
        current = max(prev1, prev2 + nums[i])
        
        # Shift window forward
        prev2 = prev1
        prev1 = current
    
    return prev1

Complexity

Time: O(n)
Space: O(1)

We traverse each house exactly once, doing O(1) work per house. We can't do better than O(n) because we must examine every house to know if we should rob it - there's no shortcut since each decision depends on the specific values of neighboring houses. For space, we only store two variables (the last two optimal values), independent of input size. This is the minimum because we need to remember at least the last two decisions to compute the next one.

Common Mistakes

Edge Cases

Connections

Longest Increasing Subsequence #300
Patience Sorting with Binary Search (O(n log n)) - a greedy + binary search hybrid

Intuition

Imagine you're stacking coins in a row, but you can only place each new coin on top of a smaller coin - you want the tallest possible tower. You don't have to use every coin, but the ones you pick must each be larger than the one below it. The challenge: knowing only the heights so far, what's the tallest tower you can build? There's a beautiful insight here: instead of tracking exactly which coins we picked (which would be O(n²)), we can track something much simpler - for each possible tower height, what's the smallest 'top coin' we could possibly have? This is like keeping the 'lightest weight that could hold up' a tower of each size. If we see a coin heavier than all our tops, we build a taller tower. If it's lighter, we swap out one of our tops to be smaller - which actually HELPS future coins fit under it.

Why This Pattern?

The problem has optimal substructure: the longest increasing subsequence ending at position i depends on all previous positions. The key structural insight is that we only care about the SMALLEST possible tail value for each subsequence length. If we can achieve length L with a smaller tail, we leave more room for future elements. Binary search lets us efficiently find where to place each element in this 'tails' array.

Solution

def lengthOfLIS(nums):
    if not nums:
        return 0
    
    # tails[i] = smallest tail element for LIS of length i+1
    # This array is ALWAYS sorted - that's the magic property we exploit
    tails = []
    
    for num in nums:
        # Binary search: find leftmost position where tails[pos] >= num
        # This is like asking: "where does this coin fit in our sorted tops?"
        left, right = 0, len(tails)
        
        while left < right:
            mid = (left + right) // 2
            if tails[mid] < num:
                left = mid + 1
            else:
                right = mid
        
        # If we reached the end, this num extends the longest subsequence
        # Otherwise, we replace tails[left] with a smaller value (better!)
        if left == len(tails):
            tails.append(num)
        else:
            tails[left] = num
    
    return len(tails)

Complexity

Time: O(n log n)
Space: O(n) - the tails array can grow to size n

We do O(n) iterations, and each iteration performs a binary search on the tails array, which at worst has size O(n). So n × log n operations. We can't do better than O(n log n) because we need to examine each element at least once (the output depends on all n inputs), and binary search is optimal for sorted searches.
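
The hand-rolled binary search is exactly what bisect_left does; a sketch using the standard library (same behavior, shown for comparison):

import bisect

def lengthOfLIS_bisect(nums):
    tails = []
    for num in nums:
        pos = bisect.bisect_left(tails, num)  # leftmost index with tails[pos] >= num
        if pos == len(tails):
            tails.append(num)   # num extends the longest subsequence seen so far
        else:
            tails[pos] = num    # keep the smallest possible tail for this length
    return len(tails)

print(lengthOfLIS_bisect([10, 9, 2, 5, 3, 7, 101, 18]))  # 4 - e.g. [2, 3, 7, 101]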

Common Mistakes

Edge Cases

Connections

Longest Palindromic Substring #5
Two-pointer expansion from centers (also called 'center expansion').

Intuition

Think of a palindrome as a mirror image - the left side reflects the right side. For any position in a string, imagine you're standing at a center point and peeking outward in both directions. As long as the characters match, you're looking at a palindrome. This is like a standing wave that has a natural center and symmetric patterns extending from it. The beautiful thing is: every palindrome has a center (either a single character for odd-length, or the gap between two characters for even-length), and from any center, you can expand outward to find the longest palindrome that has that center. The problem becomes: try all possible centers, expand as far as possible from each, and keep track of the longest one found.

Why This Pattern?

Palindromes have a recursive symmetry: if you remove the outer characters of any palindrome, what remains is still a palindrome. This means the palindrome property is preserved when you shrink from the edges toward the center. By starting from a center and expanding outward, we naturally discover palindromes without redundant checking. There are exactly 2n-1 possible centers in a string of length n (n odd-length centers at each character, n-1 even-length centers at each gap), and expanding from each center is O(n) in the worst case, giving O(n²) total.

Solution

def longestPalindrome(s: str) -> str:
    if len(s) <= 1:
        return s
    
    def expand_from_center(left: int, right: int) -> str:
        # Expand outward while characters match (like ripples in a pond)
        while left >= 0 and right < len(s) and s[left] == s[right]:
            left -= 1
            right += 1
        # Return the palindrome we found (left+1 and right-1 are the bounds)
        return s[left + 1:right]
    
    longest = ""
    for i in range(len(s)):
        # Odd-length palindrome: center is a single character at position i
        odd_palindrome = expand_from_center(i, i)
        # Even-length palindrome: center is the gap between i and i+1
        even_palindrome = expand_from_center(i, i + 1)
        
        # Keep the longest palindrome found so far
        if len(odd_palindrome) > len(longest):
            longest = odd_palindrome
        if len(even_palindrome) > len(longest):
            longest = even_palindrome
    
    return longest

Complexity

Time: O(n²) in the worst case.
Space: O(1) auxiliary beyond the stored substrings - we only track a few indices; the current longest palindrome itself (and each slice returned while expanding) can be up to O(n) characters.

There are 2n-1 centers to check (each character and each gap between characters). For each center, in the worst case (like a string of all 'a's), we expand all the way to the ends, which takes O(n). So total is O(n × n) = O(n²). That worst case is real for center expansion: a run of identical characters forces nearly every center to expand to the string's edge. (A linear-time algorithm exists - Manacher's - but center expansion is the standard answer for its simplicity.)

Common Mistakes

Edge Cases

Connections

Maximum Product Subarray #152
Extended Kadane's Algorithm with dual state tracking

Intuition

Think of this like tracking a signal that can flip polarity. In Maximum Sum Subarray, we could just track the best positive sum because negatives only hurt us. But here, two negatives make a positive — so a number that seems terrible now (negative) might pair with another negative later to become huge. We need to track BOTH the best and worst possible products at each position, like a seesaw: when you multiply by a negative, the max becomes the min and vice versa. It's like maintaining both the highest peak and deepest valley in a landscape, because two valleys (negatives) can combine into a mountain.

Why This Pattern?

The problem structure demands tracking two extremes because multiplication can flip signs. At any position, the best product ending there depends on either: (1) starting fresh at the current element, (2) extending the previous max product, or (3) extending the previous min product (which becomes max when multiplied by a negative). Just tracking max_prod like in sum problems fails because we lose the min_prod that might become valuable when paired with a future negative.

Solution

class Solution:
    def maxProduct(self, nums: List[int]) -> int:
        # Global answer starts with first element
        result = nums[0]
        
        # Track max and min products ending at current position
        max_prod = nums[0]
        min_prod = nums[0]
        
        for i in range(1, len(nums)):
            # If current is negative, max and min swap roles conceptually
            # So we compute both possibilities: current*max_prod and current*min_prod
            # The new max is the best of: starting fresh, extending previous max, or extending previous min
            new_max = max(nums[i], nums[i] * max_prod, nums[i] * min_prod)
            new_min = min(nums[i], nums[i] * max_prod, nums[i] * min_prod)
            
            # Update both and record global best
            max_prod, min_prod = new_max, new_min
            result = max(result, max_prod)
        
        return result

Complexity

Time: O(n)
Space: O(1)

We make exactly one pass through the array. At each position, we do a constant amount of work (a few multiplications and comparisons). We can't do better than O(n) because we must examine each element to know if it contributes to the optimal subarray. The O(1) space comes from only tracking three variables regardless of input size.
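
A quick check showing why tracking the minimum matters (two negatives combining):

print(Solution().maxProduct([-2, 3, -4]))    # 24 - the whole array: (-2) * 3 * (-4)
print(Solution().maxProduct([2, 3, -2, 4]))  # 6  - the best subarray is [2, 3]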

Common Mistakes

Edge Cases

Connections

Min Cost Climbing Stairs #746
1-D Dynamic Programming with optimal substructure

Intuition

Think of this like a ball rolling up a hill with energy costs at each position. At every stair, the ball has two choices: take one step or skip one. The minimum cost to reach any stair is the minimum of the costs to reach the two stairs below it, plus the cost of the current stair. It's like finding the path of least resistance up the hill - at each fork, you choose whichever route accumulated less total cost. The key insight: to stand on stair i, you must have come from either stair i-1 or stair i-2, so you take whichever was cheaper and add the cost of standing on i. You don't pay for your starting position - you can begin at stair 0 or 1 freely.

Why This Pattern?

The problem exhibits optimal substructure: the minimum cost to reach stair i depends only on the minimum costs to reach stairs i-1 and i-2. This is a 'choose your best previous state' pattern where each decision (take 1 step or 2 steps) leads to a new state.

Solution

def minCostClimbingStairs(cost):
    n = len(cost)
    # Base cases: can start at index 0 or 1 without paying yet
    # dp[i] = minimum cost to reach and stand on stair i
    
    # Option 1: use array (more readable)
    # dp = [0] * n
    # dp[0] = cost[0]
    # dp[1] = cost[1]
    # for i in range(2, n):
    #     dp[i] = min(dp[i-1], dp[i-2]) + cost[i]
    # return min(dp[n-1], dp[n-2])  # can end at last or second-to-last
    
    # Option 2: space-optimized (only need previous 2 values)
    prev2 = cost[0]  # cost to reach i-2
    prev1 = cost[1]  # cost to reach i-1
    
    for i in range(2, n):
        current = min(prev1, prev2) + cost[i]
        prev2 = prev1
        prev1 = current
    
    # Can finish at second-to-last or last stair (no cost beyond array)
    return min(prev1, prev2)
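
A quick check against the standard examples (assumes the function above is in scope):

print(minCostClimbingStairs([10, 15, 20]))                          # 15 (start at index 1, jump to the top)
print(minCostClimbingStairs([1, 100, 1, 1, 1, 100, 1, 1, 100, 1]))  # 6  (start at index 0, hop over every 100)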

Complexity

Time: O(n)
Space: O(1)

Common Mistakes

Edge Cases

Connections

Palindromic Substrings #647
Expand Around Center - a fundamental palindrome enumeration technique

Intuition

Think of a palindrome like a balanced system - it has a center of symmetry, and characters mirror outward from that center like a ripple in a pond. When you drop a pebble (pick a center), the ripple expands outward as long as the symmetry holds. At each step outward, you check if the left and right 'forces' (characters) match - if they do, you've found another palindrome. If they don't match, the symmetry breaks and the ripple stops. This is why we expand around centers: every palindrome has exactly one center (for odd-length) or two adjacent centers (for even-length), and we can systematically find all of them by expanding outward from each possible center.

Why This Pattern?

Every palindrome has a well-defined center of symmetry. For odd-length palindromes like 'aba', the center is position 1 (the 'b'). For even-length palindromes like 'aa', the center is between positions 0 and 1. By treating each position as a potential odd center and each gap between positions as a potential even center, we can exhaustively enumerate all palindromes. The expansion naturally stops when symmetry breaks, making this O(n) per center in the worst case.

Solution

class Solution:
    def countSubstrings(self, s: str) -> int:
        """
        Count palindromic substrings by expanding around each possible center.
        
        For each position i, we expand twice:
        1. Odd-length: treat s[i] as the center (e.g., 'aba')
        2. Even-length: treat the gap between s[i] and s[i+1] as center (e.g., 'aa')
        
        Each successful expansion = one palindromic substring found.
        """
        n = len(s)
        count = 0
        
        def expand(left: int, right: int) -> None:
            """Expand outward from center while characters match."""
            nonlocal count
            while left >= 0 and right < n and s[left] == s[right]:
                count += 1      # Found a palindrome!
                left -= 1       # Expand left
                right += 1      # Expand right
        
        for i in range(n):
            # Odd-length: center is a single character at i
            expand(i, i)
            # Even-length: center is between i and i+1
            expand(i, i + 1)
        
        return count
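
A quick sanity check (assumes the Solution class above is in scope):

print(Solution().countSubstrings("abc"))  # 3 ("a", "b", "c")
print(Solution().countSubstrings("aaa"))  # 6 ("a" three times, "aa" twice, "aaa" once)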

Complexity

Time: O(n²) - In the worst case (like 'aaaaa...'), each of the 2n-1 centers expands O(n) steps, and each expansion step is O(1), so the total is O(n²).
Space: O(1) - Only using a constant amount of extra space (the count variable and loop indices).

The space is constant because we don't store any substrings or use recursion - we just count as we go. The time is quadratic because this approach spends one expansion step per palindrome it counts, and in 'aaaaa' there are n*(n+1)/2 = O(n²) palindromic substrings, so expand-around-center can't beat O(n²) on inputs like that.

Common Mistakes

Edge Cases

Connections

Partition Equal Subset Sum #416
Subset Sum / 0-1 Knapsack - Each element can be either IN a subset or NOT in it (two choices), and we want to hit an exact target sum.

Intuition

Think of this like a balance scale. If the total weight is odd, the scale can never balance—immediate fail. If it's even, we just need to find ONE subset that weighs exactly half. Here's the beautiful part: if we find a subset equaling half, the remaining elements automatically equal half (because total - half = half). This transforms the problem from "can I split this perfectly?" to the simpler question "can I find a subset that sums to X?" It's like asking: given coins, can I reach exactly half the total? Each number is a coin we can use once.

Why This Pattern?

The problem asks whether some subset equals exactly half the total. This is the canonical subset sum formulation: given a set of numbers and a target, can some combination reach that target? The '0-1' refers to using each element at most once (we're partitioning, not repeating).

Solution

class Solution:
    def canPartition(self, nums: List[int]) -> bool:
        total = sum(nums)
        # Odd total can never be split into equal integer sums
        if total % 2 != 0:
            return False
        
        target = total // 2
        # dp[i] = True if we can form sum 'i' using some subset
        dp = [False] * (target + 1)
        # Base case: we can always form sum 0 (empty subset)
        dp[0] = True
        
        for num in nums:
            # Iterate BACKWARDS! This is critical.
            # Going high-to-low ensures we don't reuse the same element
            # in one iteration (that would make it unbounded knapsack)
            for j in range(target, num - 1, -1):
                # Either we keep the old sum, OR we add current num to reach j
                dp[j] = dp[j] or dp[j - num]
        
        return dp[target]
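
A minimal sketch of why the backward loop matters, using one hypothetical element (num = 3) against an illustrative target of 6 - iterating forward lets the element feed its own update in the same pass:

def reachable_sums(num: int, target: int, backward: bool) -> list[int]:
    dp = [False] * (target + 1)
    dp[0] = True
    order = range(target, num - 1, -1) if backward else range(num, target + 1)
    for j in order:
        dp[j] = dp[j] or dp[j - num]
    return [j for j in range(target + 1) if dp[j]]

print(reachable_sums(3, 6, backward=True))   # [0, 3]    -> the single 3 is used at most once (0-1 knapsack)
print(reachable_sums(3, 6, backward=False))  # [0, 3, 6] -> the 3 got reused, which is unbounded knapsack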

Complexity

Time: O(n * sum/2) = O(n * sum) where n is array length and sum is total of all elements. We iterate through each element and for each, potentially update all sums from target down to that element.
Space: O(target) = O(sum/2). We need a boolean array representing every possible sum from 0 to half the total.

We can't do better than O(n*sum) with this DP because there are O(sum) possible subset sums and we potentially need to track each one. The sum could be as large as 200*10000 = 2,000,000 in the worst case. This is pseudo-polynomial - the running time is polynomial in the numeric VALUE of the sum (which is exponential in the number of bits needed to write that value down), not polynomial in just the COUNT of elements.

Common Mistakes

Edge Cases

Connections

Word Break #139
Linear Scan with Reachability DP

Intuition

Think of this as a pathfinding problem through the string. You're standing at position 0 and want to reach position n (the end). Each valid word in the dictionary is like a 'jump' - if you're at position j and the substring s[j:i] is a word, you can jump to position i. The question becomes: can you traverse from start to end using only valid word-jumps? Alternatively, think of it like signal propagation: the signal starts at position 0. Each valid word acts as a wire that propagates the signal forward. If the signal can reach the end, the word break is valid. This is like asking whether a ripple can travel through a medium using only certain-sized waves.

Why This Pattern?

The string has a natural left-to-right ordering. For any position i, to determine if it's reachable from the start, we only need to check positions j < i. This creates a perfect ordering for DP: dp[i] = 'can we reach position i?' and we propagate reachability forward. It's essentially a graph reachability problem with a linear structure - we sweep left to right once.

Solution

def wordBreak(s: str, wordDict: List[str]) -> bool:
    word_set = set(wordDict)  # O(1) lookups instead of O(n) list scans
    n = len(s)
    dp = [False] * (n + 1)
    dp[0] = True  # Base case: empty string is always 'breakable'
    
    # Sweep through each position i (1 to n)
    for i in range(1, n + 1):
        # Check all possible previous positions j
        for j in range(i):
            # If we can reach j AND s[j:i] is a word, we can reach i
            if dp[j] and s[j:i] in word_set:
                dp[i] = True
                break  # Found a valid path to i, no need to check more j's
    
    return dp[n]

# Optimized version with max_word_len constraint:
def wordBreak_optimized(s: str, wordDict: List[str]) -> bool:
    word_set = set(wordDict)
    max_len = max(len(word) for word in wordDict)  # Prune search space
    n = len(s)
    dp = [False] * (n + 1)
    dp[0] = True
    
    for i in range(1, n + 1):
        # Only check j positions within max word length of i
        for j in range(max(0, i - max_len), i):
            if dp[j] and s[j:i] in word_set:
                dp[i] = True
                break
    
    return dp[n]
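
A quick check against the standard examples (assumes the functions above are in scope):

print(wordBreak("leetcode", ["leet", "code"]))                        # True  ("leet" + "code")
print(wordBreak("catsandog", ["cats", "dog", "sand", "and", "cat"]))  # False (the "og" tail is unreachable)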

Complexity

Time: O(n³) worst case for the basic version - there are O(n²) (j, i) pairs, and each check slices and hashes a substring that can be up to n characters long. With the max-word-length optimization this drops to roughly O(n × L²), where L is the longest dictionary word: at most n × L substring checks, each on a slice of length at most L.
Space: O(n + k) where n is the DP array size and k is the dictionary size for the set lookup table. The set dominates for large dictionaries.

We must check each position i against potentially all previous positions j - that's the n². Each check requires verifying if a substring exists in our dictionary. We can't do better than checking each position because any character could be the start of the final valid word - there's no way to 'skip' positions without checking them. The set lookup is O(1), so the main cost is the nested loop structure itself.

Common Mistakes

Edge Cases

Connections

2-D Dynamic Programming (11)

Best Time to Buy and Sell Stock with Cooldown #309
State Machine DP with three states

Intuition

Think of this as a state machine with three positions: (1) HOLDING a stock, (2) NOT HOLDING and CAN BUY, (3) COOLDOWN (just sold yesterday, can't buy today). The cooldown creates a 'rest period' in the cycle—like a pendulum that must swing back through a resting point before going forward again. The key insight: each day's optimal state depends only on yesterday's states. If you sell today, you enter cooldown tomorrow (can't buy). If you're in cooldown, you must wait, then you can buy. This creates a natural three-state flow that captures all constraints. It's like tracking a particle moving through positions with specific allowed transitions.

Why This Pattern?

The cooldown constraint explicitly creates three distinct states the system can be in at any point. This isn't optional—we NEED three states because knowing only 'holding vs not holding' isn't enough; we need to know whether we're in cooldown (can't act) or can buy. The problem's constraint directly maps to state transitions, making this the natural pattern.

Solution

def maxProfit(prices):
    if not prices:
        return 0
    
    n = len(prices)
    # Three states: hold (holding a stock), profit (not holding, can buy), cooldown (just sold)
    # Initialize: on day 0, either we don't buy (profit=0) or we buy (hold=-prices[0])
    hold = -prices[0]      # max profit if holding after day 0
    profit = 0             # max profit if not holding (can buy) after day 0
    cooldown = 0           # max profit if in cooldown after day 0 (not possible initially)
    
    for i in range(1, n):
        # Calculate new states based on previous states
        # NEW hold: max of (kept holding, bought today from profit state)
        new_hold = max(hold, profit - prices[i])
        # NEW profit: max of (was already not holding, recovering from cooldown)
        new_profit = max(profit, cooldown)
        # NEW cooldown: must have sold today (was holding, now sell at today's price)
        new_cooldown = hold + prices[i]
        
        # Update states
        hold, profit, cooldown = new_hold, new_profit, new_cooldown
    
    # At the end, still holding a stock can never be optimal (an unsold buy only costs money),
    # so the answer is the better of 'not holding' (profit) or 'just sold today' (cooldown)
    return max(profit, cooldown)
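
A quick check against the standard example (assumes maxProfit above is in scope):

# [1, 2, 3, 0, 2]: buy(1), sell(2), cooldown, buy(0), sell(2) -> total profit 3
print(maxProfit([1, 2, 3, 0, 2]))  # 3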

Complexity

Time: O(n)
Space: O(1)

We iterate through prices once, doing O(1) work each day. We can't do better because we must examine each day's price to know whether to buy/sell—we can't skip any day since each affects our available actions (cooldown forces a day off). The space is O(1) because we only track 3 variables regardless of input size—previous states are overwritten each iteration.

Common Mistakes

Edge Cases

Connections

Burst Balloons #312
Interval Dynamic Programming (also called 'Divide and Conquer DP')

Intuition

Think of this backwards: instead of asking which balloon to burst FIRST, ask which one was burst LAST. When the final balloon bursts, its only neighbors are the boundaries (value 1 on both sides). So if balloon i is the LAST to burst in range (l, r), then when it finally bursts, the left and right subproblems are already solved - balloon i sees nums[l] and nums[r] as its neighbors. This is like finding an 'energy minimum' - we're looking for the balloon whose burst creates the most 'energy' (coins) given its local environment after all other balloons are gone. The problem has no greedy choice property going forward (bursting a high-value balloon early hurts your score), but going backwards, the last burst is unambiguous - it always sees boundaries.

Why This Pattern?

The problem has optimal substructure when we divide at the point of the LAST burst. If we fix which balloon bursts LAST in an interval, the left and right sides become independent subproblems - balloons on the left don't affect coins earned from balloons on the right. This is like matrix chain multiplication or optimal BST - we try all partition points. The 'interval' dimension tracks the left and right boundaries we're considering, and for each interval we try each position as the last burst.

Solution

class Solution:
    def maxCoins(self, nums: List[int]) -> int:
        # Add boundary balloons with value 1
        # These represent the walls that never burst
        val = [1] + [num for num in nums if num > 0] + [1]
        n = len(val)
        
        # dp[i][j] = max coins from bursting ALL balloons in interval (i, j)
        # exclusive bounds - we never burst the boundary balloons
        dp = [[0] * n for _ in range(n)]
        
        # Fill dp table - consider intervals of increasing length
        # length = distance between boundaries
        for length in range(2, n):
            for left in range(0, n - length):
                right = left + length
                # Try each balloon in (left, right) as the LAST to burst
                # When it bursts, its neighbors are val[left] and val[right]
                # because all balloons between them are already burst
                for k in range(left + 1, right):
                    # Coins from bursting k last: neighbors * k's value
                    # + coins from left subproblem (left, k)
                    # + coins from right subproblem (k, right)
                    coins = val[left] * val[k] * val[right]
                    coins += dp[left][k] + dp[k][right]
                    dp[left][right] = max(dp[left][right], coins)
        
        # Full interval (0, n-1) with boundaries excluded
        return dp[0][n - 1]
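
A quick check against the standard example (assumes the Solution class above is in scope):

# Bursting in the order 1, 5, 3, 8 earns 3*1*5 + 3*5*8 + 1*3*8 + 1*8*1 = 15 + 120 + 24 + 8 = 167
print(Solution().maxCoins([3, 1, 5, 8]))  # 167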

Complexity

Time: O(n^3)
Space: O(n^2)

We have O(n^2) possible intervals (left, right pairs), and for each we try O(n) possible 'last burst' positions. We can't do better because the problem requires considering all possible partition points - trying different balloons as the dividing point is fundamental to finding the optimal structure. The 3D nature (two boundaries + partition point) is inherent to the problem's optimal substructure.

Common Mistakes

Edge Cases

Connections

Coin Change II #518
2D Dynamic Programming with outer loop over coins, inner loop over amounts

Intuition

Think of this as counting the number of ways to distribute a 'flow' of amount units across different coin types. Imagine coins as different colored balls and you want to know how many valid color distributions sum to the target amount. The key insight: if you process coins in order (coin 1, then coin 2, then coin 3...), you naturally avoid double-counting because once you've moved past coin 1, you never revisit it - you're only adding more coin types to existing combinations. It's like building a staircase where each step down represents adding a new coin type, and each step right represents using more of the current coin type. Every path from top-left to bottom-right is a valid combination.

Why This Pattern?

The order of iteration is crucial. By putting coins on the outer loop and amounts on the inner loop (processing amounts in ascending order), we ensure that when computing dp[i][j], the value dp[i][j-coin] already includes combinations using the current coin - allowing us to use the same coin multiple times. If we reversed the order, we'd be counting permutations (different orderings) instead of combinations.

Solution

def change(amount: int, coins: list[int]) -> int:
    # dp[i][j] = ways to make amount j using first i coins
    dp = [[0] * (amount + 1) for _ in range(len(coins) + 1)]
    
    # Base case: one way to make amount 0 (use no coins)
    dp[0][0] = 1
    
    for i in range(1, len(coins) + 1):
        coin = coins[i - 1]  # Current coin denomination
        for j in range(amount + 1):
            # Option 1: Don't use current coin - ways = dp[i-1][j]
            dp[i][j] = dp[i - 1][j]
            
            # Option 2: Use current coin (at least once)
            # dp[i][j-coin] gives ways to make remaining amount using
            # current coin (and previous coins) - allows unlimited use
            if j >= coin:
                dp[i][j] += dp[i][j - coin]
    
    return dp[len(coins)][amount]

# Space-optimized version
def change_optimized(amount: int, coins: list[int]) -> int:
    dp = [0] * (amount + 1)
    dp[0] = 1  # One way to make amount 0: use nothing
    
    for coin in coins:
        for j in range(coin, amount + 1):
            # Add ways to make (j - coin) to current dp[j]
            # This counts all combinations using current coin
            dp[j] += dp[j - coin]
    
    return dp[amount]
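
A small sketch of the ordering claim above: putting amounts on the outer loop and coins on the inner loop counts ordered sequences (permutations) rather than combinations. The helper name count_permutations is hypothetical, and change_optimized is assumed to be the function defined above.

def count_permutations(amount: int, coins: list[int]) -> int:
    dp = [0] * (amount + 1)
    dp[0] = 1
    for j in range(1, amount + 1):  # amounts outside
        for coin in coins:          # coins inside
            if j >= coin:
                dp[j] += dp[j - coin]
    return dp[amount]

print(change_optimized(5, [1, 2, 5]))    # 4 combinations: 5, 2+2+1, 2+1+1+1, 1+1+1+1+1
print(count_permutations(5, [1, 2, 5]))  # 9 ordered sequences: 1+2+2 and 2+1+2 now count separately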

Complexity

Time: O(len(coins) * amount)
Space: O(amount) for optimized version, O(len(coins) * amount) for 2D version

We must consider each coin (n coins) for each possible amount from 0 to target (amount+1 values). We can't do better because the answer could theoretically be different for every amount - there's no formula that skips computation. Each state requires constant time to compute from its predecessors.

Common Mistakes

Edge Cases

Connections

Distinct Subsequences #115
2D Dynamic Programming (DP)

Intuition

Think of this as counting the number of distinct paths through s that can spell out t. At each character in s, you have a choice: either match it with the current character in t (if they match), or skip it. We're essentially counting how many different ways we can 'use up' characters from s to build t. If s[i-1] == t[j-1], we have two choices - use this character as part of our subsequence or skip it. If they don't match, we must skip it.

Why This Pattern?

The problem exhibits optimal substructure: the number of ways to form t[0:j] from s[0:i] depends on smaller prefixes. When s[i-1] == t[j-1], we can either match them (contributing dp[i-1][j-1] ways) or skip s[i-1] (contributing dp[i-1][j] ways). This creates a natural recurrence relation that fills a 2D table. The decision to match or skip at each position creates the branching that DP naturally captures.

Solution

def numDistinct(s: str, t: str) -> int:
    m, n = len(s), len(t)
    
    # dp[i][j] = number of ways to form t[0:j] from s[0:i]
    # Using m+1 x n+1 to include empty string base cases
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    # Base case: empty t can be formed one way (delete everything from s)
    # dp[0][0] = 1 represents: empty s forms empty t in one way
    for i in range(m + 1):
        dp[i][0] = 1
    
    # Fill the DP table
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i-1] == t[j-1]:
                # Two choices: use s[i-1] as match (dp[i-1][j-1])
                # OR skip s[i-1] (dp[i-1][j])
                dp[i][j] = dp[i-1][j-1] + dp[i-1][j]
            else:
                # Characters don't match, must skip s[i-1]
                dp[i][j] = dp[i-1][j]
    
    return dp[m][n]

Complexity

Time: O(m * n)
Space: O(m * n)

We iterate through all m*n combinations of prefixes of s and t. At each cell we do O(1) work. This is necessary because we need to remember the count for every prefix combination - the answer for longer strings depends on all shorter combinations. Space can be reduced to O(n) by noticing we only need the previous row.
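
A rolling-row sketch of the O(n) space note above (the function name is hypothetical, not part of the original solution): iterating j from right to left means dp[j-1] still holds the previous row's value when we read it.

def numDistinct_one_row(s: str, t: str) -> int:
    n = len(t)
    dp = [0] * (n + 1)
    dp[0] = 1  # empty t can always be formed exactly one way
    for ch in s:
        # Go right-to-left so dp[j - 1] is still the previous row's count
        for j in range(n, 0, -1):
            if ch == t[j - 1]:
                dp[j] += dp[j - 1]
    return dp[n]

print(numDistinct_one_row("rabbbit", "rabbit"))  # 3
print(numDistinct_one_row("babgbag", "bag"))     # 5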

Common Mistakes

Edge Cases

Connections

Edit Distance #72
2D Dynamic Programming on Two Sequences

Intuition

Think of this as transforming one system (word1) into another (word2). Each character is a component, and you're allowed three 'moves': insert a new component, delete an existing one, or replace one with another. The key insight: at each position, if characters match, you're in equilibrium - just carry the previous state forward. If they don't match, you're at an 'energy barrier' and must pay a cost of 1 to either replace (overcome the difference), delete from word1 (skip the mismatch), or insert into word1 (add what's needed). This is like finding the minimum-cost path through a grid where each step represents an edit operation.

Why This Pattern?

This problem has optimal substructure - the minimum operations to convert prefixes word1[0:i] and word2[0:j] depends only on smaller prefixes. It also has overlapping subproblems - without memoization, we'd recompute the same subproblems repeatedly. The 2D grid naturally represents the 'state space' of all possible prefix transformations.

Solution

def minDistance(word1, word2):
    m, n = len(word1), len(word2)
    
    # dp[i][j] = min operations to convert word1[0:i] to word2[0:j]
    # i and j are lengths, so dp[0][*] and dp[*][0] are base cases
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    # Base case: convert empty string to word2[0:j] = j insertions
    for j in range(1, n + 1):
        dp[0][j] = j
    
    # Base case: convert word1[0:i] to empty string = i deletions
    for i in range(1, m + 1):
        dp[i][0] = i
    
    # Fill the table: for each prefix pair
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if word1[i-1] == word2[j-1]:
                # Characters match - no operation needed, carry forward
                dp[i][j] = dp[i-1][j-1]
            else:
                # Three choices, pick minimum cost:
                # 1. Replace current character
                # 2. Delete from word1 (move i-1, stay at j)
                # 3. Insert into word1 (stay at i, move j-1)
                dp[i][j] = 1 + min(
                    dp[i-1][j-1],  # replace
                    dp[i-1][j],    # delete from word1
                    dp[i][j-1]     # insert into word1
                )
    
    return dp[m][n]
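
A quick check against the standard examples (assumes minDistance above is in scope):

print(minDistance("horse", "ros"))           # 3 (horse -> rorse -> rose -> ros)
print(minDistance("intention", "execution")) # 5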

Complexity

Time: O(m * n)
Space: O(m * n)

We compute one DP cell for each pair of prefixes (m+1)*(n+1) total. Each cell takes O(1) to compute. We can't do better because every prefix of word1 potentially relates to every prefix of word2 - the edit distance fundamentally requires comparing all character combinations.

Common Mistakes

Edge Cases

Connections

Interleaving String #97
2-D Dynamic Programming (grid path-finding)

Intuition

Think of this like two rivers merging. s1 and s2 are tributaries that must merge to form s3, maintaining their internal order but mixing characters. Imagine a 2D grid where moving right means taking the next character from s1, and moving down means taking from s2. We start at the source (0,0) and need to reach the destination (len(s1), len(s2)) by following a path that exactly spells out s3. At each cell, we can only come from the left or from above - this is like a water flow finding its way to a drain, with s3 dictating which paths are valid.

Why This Pattern?

The problem has optimal substructure - whether we can reach cell (i,j) depends only on whether we could reach (i-1,j) or (i,j-1) and whether the current character matches. It's a classic 'unique paths' style problem where we're not counting paths but checking if ANY valid path exists. The 2D structure naturally emerges from the two input strings forming the axes of our decision space.

Solution

class Solution:
    def isInterleave(self, s1: str, s2: str, s3: str) -> bool:
        m, n = len(s1), len(s2)
        
        # Quick check: lengths must add up
        if m + n != len(s3):
            return False
        
        # dp[i][j] = True if s3[0:i+j] can be formed by interleaving s1[0:i] and s2[0:j]
        dp = [[False] * (n + 1) for _ in range(m + 1)]
        
        # Base case: empty strings
        dp[0][0] = True
        
        # Fill first column (using only s1)
        for i in range(1, m + 1):
            dp[i][0] = dp[i-1][0] and s1[i-1] == s3[i-1]
        
        # Fill first row (using only s2)
        for j in range(1, n + 1):
            dp[0][j] = dp[0][j-1] and s2[j-1] == s3[j-1]
        
        # Fill the rest of the grid
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                # Current position in s3 we're trying to match
                k = i + j - 1
                
                # Either came from left (using s1) OR from above (using s2)
                from_s1 = dp[i-1][j] and s1[i-1] == s3[k]
                from_s2 = dp[i][j-1] and s2[j-1] == s3[k]
                
                dp[i][j] = from_s1 or from_s2
        
        return dp[m][n]

Complexity

Time: O(m * n) - we visit each cell in the (m+1) x (n+1) grid exactly once
Space: O(m * n) for the full DP table, though this can be reduced to O(n) using only one row at a time

We must consider every possible prefix combination - there are m+1 possible positions in s1 and n+1 in s2, so (m+1)(n+1) states. Each state requires constant time to compute. We can't do better because the answer could theoretically depend on any split point - imagine checking if s1[:i] + s2[:j] forms s3[:i+j] for all possible i,j.
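
A one-row sketch of the space reduction mentioned above (the function name is hypothetical): before the update, dp[j] still holds the row above; dp[j-1] already holds the fresh value to the left.

def isInterleave_one_row(s1: str, s2: str, s3: str) -> bool:
    m, n = len(s1), len(s2)
    if m + n != len(s3):
        return False
    dp = [False] * (n + 1)
    dp[0] = True
    for j in range(1, n + 1):  # row 0: only s2 is consumed
        dp[j] = dp[j - 1] and s2[j - 1] == s3[j - 1]
    for i in range(1, m + 1):
        dp[0] = dp[0] and s1[i - 1] == s3[i - 1]  # column 0: only s1 is consumed
        for j in range(1, n + 1):
            k = i + j - 1
            from_s1 = dp[j] and s1[i - 1] == s3[k]      # dp[j] is still row i-1
            from_s2 = dp[j - 1] and s2[j - 1] == s3[k]  # dp[j-1] is already row i
            dp[j] = from_s1 or from_s2
    return dp[n]

print(isInterleave_one_row("aabcc", "dbbca", "aadbbcbcac"))  # True
print(isInterleave_one_row("aabcc", "dbbca", "aadbbbaccc"))  # False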

Common Mistakes

Edge Cases

Connections

Longest Common Subsequence #1143
2D Dynamic Programming (Edit Distance family)

Intuition

Imagine two rivers (the two strings) flowing side by side. A subsequence is like a path that follows the river but can skip around. The LCS is the longest path that exists in BOTH rivers at the same relative positions. At each decision point (matching characters vs. not), it's like choosing which river to 'sacrifice' a character from when they don't align. The key insight: if characters match, we gain 1 and move diagonally inward. If they don't match, we must abandon one character from either string and take the best path forward. Think of it as a choose-your-own-adventure where at every fork, we pick the branch that leads to the longest shared future.

Why This Pattern?

This problem has optimal substructure: the best answer at position (i,j) depends on the best answers at smaller positions. If characters match, LCS(i,j) = 1 + LCS(i-1,j-1). If they don't match, LCS(i,j) = max(LCS(i-1,j), LCS(i,j-1)). This creates a natural recursion that fills a 2D table where each cell represents the LCS for prefixes of both strings.

Solution

def longestCommonSubsequence(text1: str, text2: str) -> int:
    m, n = len(text1), len(text2)
    # dp[i][j] = LCS length for text1[0:i] and text2[0:j]
    # Using (m+1) x (n+1) to handle empty prefix cases
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if text1[i - 1] == text2[j - 1]:
                # Characters match - extend the common subsequence
                # by 1 from the diagonal (previous prefixes)
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                # Characters don't match - take the best path:
                # either skip current char from text1 OR from text2
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    
    return dp[m][n]
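
A quick check against the standard examples (assumes the function above is in scope):

print(longestCommonSubsequence("abcde", "ace"))  # 3 ("ace")
print(longestCommonSubsequence("abc", "def"))    # 0 (no common characters)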

Complexity

Time: O(m * n)
Space: O(m * n)

We must visit every cell in the m×n table once because each cell depends on the cell above and to the left. We can't skip any position - to know the LCS for prefixes of length i and j, we need to have computed all smaller prefix combinations. There's no shortcut because the problem doesn't have monotonic properties we could exploit.

Common Mistakes

Edge Cases

Connections

Longest Increasing Path in a Matrix #329
DFS with Memoization (Topological DP on DAG)

Intuition

Imagine each cell's value as an elevation. You're a hiker who can only walk uphill (to strictly higher values). You want to find the longest possible hike you could take from any starting point. The key insight: since values strictly increase along any path, you can never cycle back - the graph of valid moves is a Directed Acyclic Graph (DAG). This means if you compute the longest path from a cell once, that answer is final forever - no future decisions can change it. It's like calculating potential energy: once you know the maximum height reachable from each point, you just add 1 for the current step.

Why This Pattern?

The matrix with 'only move to higher values' constraint naturally forms a DAG because strict increase prevents cycles. For any cell, its longest path equals 1 (itself) plus the maximum of its neighbors' longest paths. Since neighbors always have higher values, there's no circular dependency - we can compute in any order using memoization. This is essentially dynamic programming on a DAG, computed via depth-first search.

Solution

class Solution:
    def longestIncreasingPath(self, matrix: List[List[int]]) -> int:
        if not matrix or not matrix[0]:
            return 0
        
        m, n = len(matrix), len(matrix[0])
        # Cache stores longest path starting from each cell
        cache = [[0] * n for _ in range(m)]
        
        def dfs(i, j):
            # If already computed, return cached result
            if cache[i][j] != 0:
                return cache[i][j]
            
            # Directions: up, down, left, right
            directions = [(-1, 0), (1, 0), (0, -1), (0, 1)]
            max_path = 1  # At minimum, we can stay at current cell
            
            for di, dj in directions:
                ni, nj = i + di, j + dj
                # Check bounds and only move to strictly higher values
                if 0 <= ni < m and 0 <= nj < n and matrix[ni][nj] > matrix[i][j]:
                    # Recursively find longest path from neighbor, add current cell
                    max_path = max(max_path, 1 + dfs(ni, nj))
            
            cache[i][j] = max_path
            return max_path
        
        # Try starting from every cell, return the maximum
        result = 0
        for i in range(m):
            for j in range(n):
                result = max(result, dfs(i, j))
        
        return result
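
A quick check against the standard example (assumes the Solution class above is in scope):

# The longest strictly increasing path is 1 -> 2 -> 6 -> 9.
print(Solution().longestIncreasingPath([[9, 9, 4], [6, 6, 8], [2, 1, 1]]))  # 4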

Complexity

Time: O(m * n)
Space: O(m * n) for the cache + O(m * n) for recursion stack in worst case

Each cell is visited exactly once and we check 4 neighbors per cell. Even though we recursively explore, the memoization ensures we never recompute a subproblem - each cell's result is computed exactly one time and cached. The worst-case space equals the number of cells because in a strictly increasing path, we could recurse through every cell before returning.

Common Mistakes

Edge Cases

Connections

Regular Expression Matching #10
2-D Dynamic Programming on two strings (like Longest Common Subsequence, Edit Distance, Wildcard Matching)

Intuition

Think of this as a signal propagation or state machine problem. You're trying to find a 'path' through both strings where each step either consumes a character or uses the special '*' operator to either suppress the previous element (zero occurrences) or allow it to repeat. The '.' is a wildcard - like a universal adapter that fits anything. The '*' is a feedback loop: it can either 'dampen' the signal (zero of preceding) or 'amplify' it (one or more of preceding). The 2D grid naturally emerges because you have two independent positions to track - where you are in the string and where you are in the pattern. Each cell asks: 'Can the remainder of string s[i:] match remainder of pattern p[j:]?'

Why This Pattern?

The problem has optimal substructure: whether s[i:] matches p[j:] depends on smaller suffixes. There are overlapping subproblems - we might reach the same (i,j) state via different paths. The state space is naturally 2D because we track two independent indices. This is structurally identical to wildcard matching (LeetCode 44) but with '*' having different semantics (preceding element must exist, not optional).

Solution

def isMatch(s: str, p: str) -> bool:
    # dp[i][j] = True if s[i:] matches p[j:]
    # Working backwards from ends (like Edit Distance)
    m, n = len(s), len(p)
    dp = [[False] * (n + 1) for _ in range(m + 1)]
    
    # Base case: empty string matches empty pattern
    dp[m][n] = True
    
    # Fill last row: matching empty string against pattern
    # Pattern 'x*' can match empty (x occurs 0 times)
    for j in range(n - 1, -1, -1):
        if j + 1 < n and p[j + 1] == '*':
            dp[m][j] = dp[m][j + 2]  # skip the whole 'x*' pair (x occurs 0 times)
    
    # Fill DP table: for each position in s and p
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            # Do current characters match? (accounting for '.' wildcard)
            first_match = s[i] == p[j] or p[j] == '.'
            
            # Check if next pattern char is '*'
            if (j + 1) < n and p[j + 1] == '*':
                # Two paths with '*': 
                # 1) Skip 'x*' entirely (x occurs 0 times): dp[i][j+2]
                # 2) Use '*' to match current char and stay on pattern j
                #    (x occurs 1+ times): first_match AND dp[i+1][j]
                dp[i][j] = dp[i][j + 2] or (first_match and dp[i + 1][j])
            else:
                # No '*', must match current chars and advance both
                dp[i][j] = first_match and dp[i + 1][j + 1]
    
    return dp[0][0]
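
A few quick checks against the standard examples (assumes isMatch above is in scope):

print(isMatch("aa", "a"))    # False - the pattern runs out of characters
print(isMatch("aa", "a*"))   # True  - '*' lets the preceding 'a' repeat
print(isMatch("ab", ".*"))   # True  - '.*' matches any sequence
print(isMatch("", "a*b*"))   # True  - every 'x*' pair can match zero characters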

Complexity

Time: O(m * n) where m = len(s), n = len(p)
Space: O(m * n) for the DP table. Can be optimized to O(n) using only two rows since we only look at dp[i+1][...] and dp[i][j+2].

We must examine every cell in the m×n grid because any (i,j) combination could be relevant. We can't skip cells since pattern matching decisions depend on both string and pattern positions simultaneously. The table is a product of the two input sizes - each character in s potentially interacts with each character in p.

Common Mistakes

Edge Cases

Connections

Target Sum #494
Subset Sum / 0-1 Knapsack (counting variation)

Intuition

Think of this as a conservation law problem - imagine you have weights (the array elements) and you want to balance a scale. Placing a number on the left plate is +, on the right is -. The difference between the two sides must equal the target. The key insight: if P is the sum of all + signs and N is the sum of all - signs, then P - N = target AND P + N = total_sum. Solving these gives us P = (target + total_sum) / 2. So instead of counting +/- arrangements, we're just counting subsets that sum to (target + total)/2. This is like asking: "How many ways can we reach a specific energy level?" Each element either contributes its energy (included) or doesn't (excluded).

Why This Pattern?

The problem appears to be about +/- signs, but the mathematical transformation P = (target + total)/2 reveals it as a subset counting problem. Each element is either included (goes to + group) or excluded (goes to - group), and we count ways to reach a specific sum. This is the classic 0-1 knapsack structure where we either take or don't take each item.

Solution

def findTargetSumWays(nums, target):
    total = sum(nums)
    
    # If target is beyond what we can achieve, impossible
    if abs(target) > total:
        return 0
    
    # P = (target + total) / 2 must be an integer
    # This comes from: P - N = target and P + N = total → 2P = target + total
    if (target + total) % 2 != 0:
        return 0
    
    goal = (target + total) // 2
    
    # dp[i] = number of ways to achieve sum i
    # Using 1D DP - iterate backwards to simulate 0-1 choice (don't reuse items)
    dp = [0] * (goal + 1)
    dp[0] = 1  # One way to get sum 0: select no elements
    
    for num in nums:
        # Traverse backwards: for each num, add it to existing subsets
        # This ensures each num is used at most once (0-1 knapsack property)
        for i in range(goal, num - 1, -1):
            dp[i] += dp[i - num]
    
    return dp[goal]
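
A quick check against the standard example (assumes findTargetSumWays above is in scope):

# nums = [1,1,1,1,1], target = 3: total = 5, goal = (3 + 5) // 2 = 4,
# and there are C(5, 4) = 5 ways to choose four 1s for the '+' group.
print(findTargetSumWays([1, 1, 1, 1, 1], 3))  # 5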

Complexity

Time: O(n * goal) where n is the array length and goal = (target + total)/2
Space: O(goal) - the 1D DP array

We iterate through each of n elements, and for each element, we potentially update all sums from goal down to that element's value. This is like filling a knapsack - each item has 'weight' equal to its value, and we count distinct ways to fill it to exactly goal. We can't do better because we must consider every subset possibility.

Common Mistakes

Edge Cases

Connections

Unique Paths #62
2D Dynamic Programming (Grid DP)

Intuition

Imagine this like water flowing through a grid of pipes. At each intersection, the water can split — it can go right or down. The number of ways to reach any cell is like combining two streams: the stream coming from above and the stream coming from the left. When streams merge, their flows add up. So to reach cell (i,j), you must have arrived either from above (i-1,j) or from the left (i,j-1). Each path is unique because the robot makes different choices at each step — it's like a decision tree where right/down are the two branches.

Why This Pattern?

This problem has optimal substructure — the answer for each cell depends only on the answers for cells above and to its left. There's no need to consider the full history because the robot's position at any cell completely determines future possibilities. The grid structure naturally maps to a 2D DP table where we build up solutions from top-left to bottom-right.

Solution

def uniquePaths(m, n):
    # dp[j] represents number of ways to reach column j in current row
    # Initialize with 1s: first row has only 1 way (all right moves)
    dp = [1] * n
    
    # Iterate through each row starting from second
    for i in range(1, m):
        # For each column starting from second
        for j in range(1, n):
            # dp[j] currently holds ways from cell above (same column, previous row)
            # dp[j-1] holds ways from cell to the left (previous column, same row)
            # Add them together — this is the core DP transition
            dp[j] += dp[j-1]
    
    return dp[-1]  # Return ways to reach bottom-right
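
A quick check against the standard examples (assumes uniquePaths above is in scope). The count also equals the binomial coefficient C(m + n - 2, m - 1), since every path is some arrangement of (m - 1) down-moves among (m + n - 2) total moves.

print(uniquePaths(3, 7))  # 28 = C(8, 2)
print(uniquePaths(3, 2))  # 3  = C(3, 2)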

Complexity

Time: O(m × n) — we visit each cell exactly once, computing its value from two neighbors
Space: O(n) — we only need one row array instead of a full m×n table because each row only depends on itself and the previous row

We can't do better than O(m×n) because there are m×n cells and each one must contribute to the final count somehow — we need to consider all possible paths. For space, we only keep track of one row because when computing cell (i,j), we only need dp[j] (from above) and dp[j-1] (from left in current row). Older rows become irrelevant once we've moved past them, like forgetting where water came from once it's passed through a pipe junction.

Common Mistakes

Edge Cases

Connections

Greedy (8)

Gas Station #134
Greedy with proof / Single-pass proof. We make a locally optimal choice (move start forward when we fail) that guarantees global optimality because of the conservation law (total gas vs total cost).

Intuition

Think of this as an energy conservation problem. At each station, you gain gas[i] and spend cost[i] to move forward. If total_gas < total_cost over the entire circuit, it's fundamentally impossible - you're losing energy overall. But if total_gas >= total_cost, there MUST be a valid starting point. Here's why: imagine your fuel tank level as you travel around the circle. If the overall level ends up higher than where it started (or equal), the path must have a minimum point. Your starting station is the one just after that minimum point - from there, the tank level never drops below zero, guaranteeing you can complete the circuit. The greedy insight: if starting from station A fails at station B (your tank goes negative), then NO station between A and B can work either, because you'd be starting with even less fuel than A had when it failed.

Why This Pattern?

The problem has a mathematical guarantee: if total_gas >= total_cost, there exists a solution, and we can find it greedily. If we fail at station B starting from A, all stations between A and B are impossible because you'd arrive there with less cumulative fuel than A did. This lets us skip them all in one move.

Solution

def canCompleteCircuit(gas, cost):
    total_tank = 0  # Track overall gas vs cost for the whole circuit
    curr_tank = 0   # Track gas vs cost from current start
    start = 0       # Our candidate starting station
    
    for i in range(len(gas)):
        diff = gas[i] - cost[i]
        total_tank += diff   # Add to overall account
        curr_tank += diff    # Add to current journey account
        
        # If we can't reach station i from our current start,
        # then no station between start and i can work.
        # Skip all of them by moving start to i+1.
        if curr_tank < 0:
            start = i + 1
            curr_tank = 0  # Reset tank for new starting point
    
    # If overall tank is non-negative, we found a valid start.
    # Otherwise, even the total gas wasn't enough - impossible.
    return start if total_tank >= 0 else -1
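
A quick check against the standard example (assumes canCompleteCircuit above is in scope):

# gas - cost = [-2, -2, -2, 3, 3]; the running tank only stays non-negative starting at index 3.
print(canCompleteCircuit([1, 2, 3, 4, 5], [3, 4, 5, 1, 2]))  # 3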

Complexity

Time: O(n) - single pass through all stations
Space: O(1) - only tracking 3 integer variables regardless of input size

We can't do better than O(n) because we might need to examine every station to find the answer. The greedy proof guarantees we never need to revisit a station we've skipped, so one pass is sufficient. Space is O(1) because we only need to track the current tank level and total, not the entire journey history.

Common Mistakes

Edge Cases

Connections

Hand of Straights #846
Greedy with Counter/Frequency Map

Intuition

Think of this like organizing numbered books into consecutive groups. You have a pile of numbered books and need to form stacks where each stack has consecutive numbers (like 2,3,4 or 7,8,9). The key insight: always pick the SMALLEST available book and try to build a stack going up. Why? The smallest number has the least flexibility—it can only start stacks going upward, while larger numbers could either start new stacks OR extend existing ones. If you can't form a valid stack starting from the smallest available book, you're stuck. This is a "most constrained first" greedy principle—tackle the choice with fewest options first.

Why This Pattern?

The problem has a greedy structure because: (1) We make locally optimal choices (pick smallest, build upward), (2) these choices are irreversible (once we use a card, it's gone), (3) making the most constrained choice first (smallest card) is always safe—if we can't start a valid group from the smallest available card, we never will. The counter tracks what's available to use, which is essential for knowing when we can form groups.

Solution

from collections import Counter

class Solution:
    def isNStraightHand(self, hand: List[int], groupSize: int) -> bool:
        # If total cards can't be evenly divided, impossible
        if len(hand) % groupSize != 0:
            return False
        
        # Count frequency of each card value
        count = Counter(hand)
        
        # Iterate through card values in sorted order
        for card in sorted(count.keys()):
            # While we still have copies of this card, try to form a group
            while count[card] > 0:
                # Try to form consecutive sequence: card, card+1, ..., card+groupSize-1
                for i in range(groupSize):
                    target = card + i
                    # If we don't have the needed consecutive card, impossible
                    if count[target] > 0:
                        count[target] -= 1  # Use this card
                    else:
                        return False
        
        return True
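
A quick check against the standard examples (assumes the Solution class above is in scope):

# Groups: [1,2,3], [2,3,4], [6,7,8]
print(Solution().isNStraightHand([1, 2, 3, 6, 2, 3, 4, 7, 8], 3))  # True
print(Solution().isNStraightHand([1, 2, 3, 4, 5], 4))              # False (5 cards can't split into groups of 4)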

Complexity

Time: O(n log n) where n is the number of cards
Space: O(n) for the counter storing unique card frequencies

Time: Sorting the unique card values takes O(k log k) where k is the number of unique cards (k ≤ n). The nested while/for loops process each card exactly once when forming groups, so O(n). Combined: O(n log n). We can't do better than sorting because we need to know the smallest available card at each step—this is inherent to the problem structure. Space: Counter stores at most one entry per unique card value, so O(k) ≤ O(n).

Common Mistakes

Edge Cases

Connections

Jump Game II #45
Greedy - Always choose the option that maximizes immediate reach (lookahead one step)

Intuition

Imagine you're hopping across lily pads. Each lily pad tells you how far you can jump from there. You want minimum hops to reach the last pad. The greedy insight: at every jump, pick the lily pad that gets you as FAR as possible for the NEXT jump. It's like a gradient descent - you're always moving to the position with maximum 'potential energy' (reach). Think of it as expanding a "frontier" - from your current jump, you can reach positions up to some boundary. When you hit that boundary, you MUST make another jump, so you might as well jump to wherever gives you the farthest new boundary. This is like a wave propagating outward - each jump expands your reachable region, and you're looking for the minimum "time" (jumps) to hit the target.

Why This Pattern?

This problem has optimal substructure: the minimum jumps to reach the end from position i depends on which position you jump to next. The greedy choice works because, from all reachable positions in your current 'jump window', picking the one that extends your reach farthest MUST be optimal - anything a shorter-reaching choice could cover on the next jump is also covered by the farthest reach, so maximizing reach never costs an extra jump. It's also a matroid-like structure where the greedy choice preserves optimality.

Solution

def jump(nums):
    # Edge case: already at end or single element
    if len(nums) <= 1:
        return 0
    
    jumps = 0          # Number of jumps made so far
    curr_end = 0       # End of current jump's reachable range
    farthest = 0       # Furthest we can reach with jumps+1 jumps
    
    # Iterate through all positions except the last
    # (we don't need to jump FROM the last position)
    for i in range(len(nums) - 1):
        # Update furthest reach from current position
        farthest = max(farthest, i + nums[i])
        
        # When we've exhausted current jump's range, we MUST make another jump
        # At this point, farthest represents our new boundary after this jump
        if i == curr_end:
            jumps += 1
            curr_end = farthest
    
    return jumps
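
A quick check against the standard example (assumes jump above is in scope):

# [2, 3, 1, 1, 4]: jump from index 0 to index 1, then from index 1 straight to the end.
print(jump([2, 3, 1, 1, 4]))  # 2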

Complexity

Time: O(n)
Space: O(1) - only using a few integer variables

We iterate through the array exactly once. At each index, we do O(1) work (constant-time max update and comparison). We can't do better than O(n) because we must examine each position to know the maximum reach - it's like needing to sample every point on a curve to find its maximum. The space is O(1) because we don't need to store any per-position data - we just track the running maximum reach as we sweep through.

Common Mistakes

Edge Cases

Connections

Jump Game #55
Greedy - keep track of maximum reach

Intuition

Think of this like a signal propagating outward from the start position. Each element tells you the maximum range of that signal - from position i, the signal can spread to i+nums[i]. The question becomes: can this reachability 'wave' spread all the way to the last index? The greedy insight is simple: we don't need to plan the exact path, we just need to track the farthest position we CAN reach at any moment. If at any point our current position is beyond what we can reach, we're stuck (dead end). If we ever reach or exceed the last index, we're done.

Why This Pattern?

The problem has a matroid-like structure: reachability is monotonic and cumulative. If you can reach position i, you can reach all positions before i. This means the 'best' strategy (furthest reach) is always optimal - we never need to backtrack or reconsider a previous decision because earlier positions always remain reachable.

Solution

def canJump(nums):
    max_reach = 0  # Farthest position we can currently reach
    n = len(nums)
    
    for i in range(n):
        # If current position is beyond max reach, we're stuck
        if i > max_reach:
            return False
        
        # Extend our reach based on jumping from position i
        max_reach = max(max_reach, i + nums[i])
        
        # Early exit: if we can reach or exceed the last index
        if max_reach >= n - 1:
            return True
    
    return True

Complexity

Time: O(n)
Space: O(1)

We make a single pass through the array. For each element, we do O(1) work (update max_reach and check a condition). We can't do better than O(n) because we must inspect at least n-1 elements to verify reachability - in the worst case (all 1s), we need to check every position to know we can make it.

Common Mistakes

Edge Cases

Connections

Maximum Subarray #53
Kadane's Algorithm (Greedy)

Intuition

Imagine you're tracking your bank balance over time. Each day has a positive or negative balance change. You want to find the contiguous period (subarray) where your balance was highest. The key insight: if your accumulated balance becomes negative, it's better to 'reset' and start fresh from the next day — a negative balance only drags you down. This is like a system seeking energy minimization: the 'energy' (sum) of your current subarray, if it drops below zero, you abandon it and start a new equilibrium state from the current position. Kadane's algorithm is essentially a greedy decision at each step: extend the previous subarray OR start fresh from the current element — whichever gives us a better starting point.

Why This Pattern?

This is a greedy problem because at each position, we make a local optimal choice: extend the current subarray OR start a new one. This works because we're tracking the best subarray ENDING at each position. The local optimal (restart if previous sum < 0) leads to the global optimal (maximum subarray sum) because any subarray that includes a negative-prefixed segment can be improved by dropping that prefix.

Solution

def maxSubArray(nums):
    # Kadane's Algorithm: O(n) time, O(1) space
    # Key insight: at each position, decide to extend or restart
    
    max_sum = nums[0]  # best subarray seen so far
    current_sum = nums[0]  # best subarray ENDING at current position
    
    for i in range(1, len(nums)):
        # Either extend previous subarray OR start fresh from current element
        # If previous sum is negative, it's better to restart
        current_sum = max(nums[i], current_sum + nums[i])
        
        # Track the best sum we've seen overall
        max_sum = max(max_sum, current_sum)
    
    return max_sum

Complexity

Time: O(n)
Space: O(1)

We make exactly one pass through the array of n elements. At each element, we perform only constant-time operations (a comparison and addition). This is optimal because we must examine each element at least once to know if it's part of the maximum subarray. We cannot do better than O(n).

Common Mistakes

Edge Cases

Connections

Merge Triplets to Form Target Triplet #1899
Greedy - selection works because the problem has a matroid-like structure. We don't need to optimize which triplets to pick; we only need to verify existence. Once a triplet is valid (all values <= target), adding more valid triplets can only help (never hurt) because we're taking max values.

Intuition

Think of each triplet as a 'supply' of three resources. You can only use triplets where ALL three values are at or below the target (otherwise you'd overshoot and 'break' the target). Once you filter to valid triplets, you just need to check if they collectively contain every value of the target - like collecting ingredients. If you have all three ingredients (target_a, target_b, target_c) available from your valid supply, you can form the target. The 'merge' operation (taking max) means we just need each target value to appear somewhere in our valid set.

Why This Pattern?

The key insight is that valid triplets (those not exceeding target in any dimension) form an independence set - any subset of them is also valid. We only need to check coverage, not optimal selection. This is like checking if three specific items exist in a filtered list - a greedy 'take what works' approach is optimal because there's no trade-off between valid triplets.

Solution

class Solution:
    def mergeTriplets(self, triplets: list[list[int]], target: list[int]) -> bool:
        # Track whether we can achieve each component of the target
        can_a = can_b = can_c = False
        
        target_a, target_b, target_c = target
        
        for a, b, c in triplets:
            # Skip if any value exceeds target - this triplet would overshoot
            if a > target_a or b > target_b or c > target_c:
                continue
            
            # Check if this valid triplet gives us each target component
            if a == target_a:
                can_a = True
            if b == target_b:
                can_b = True
            if c == target_c:
                can_c = True
        
        # Need all three components to form the target
        return can_a and can_b and can_c

Complexity

Time: O(n) where n is the number of triplets. We make a single pass through all triplets, doing O(1) work each time.
Space: O(1) - only using three boolean flags regardless of input size. No additional data structures needed.

We can't do better than O(n) because we must examine each triplet at least once to determine if it's valid (has any value > target). The space is O(1) because we're not storing triplets - we're just checking existence of values, which we can do with simple boolean flags.

Common Mistakes

Edge Cases

Connections

Partition Labels #763
Two-pass greedy with last occurrence tracking. First, find each character's final position. Then, traverse while maintaining the furthest last-seen position of any character in the current partition. When current index hits that furthest point, we have a complete partition.

Intuition

Think of each letter as a 'signal' that starts at its first appearance and ends at its last appearance. We're trying to find natural boundaries where all signals that started before a point have already ended — it's like finding equilibrium points where no signal is still 'active.' We cut the string at these safe points because once we've seen the last occurrence of every letter in the current partition, we know no letter from this partition appears anywhere else. The greedy choice of cutting as soon as it's safe maximizes the number of partitions.

Why This Pattern?

The problem has a 'local optimal' property: once we've seen the last occurrence of every character encountered so far, we can safely cut — extending further would necessarily include a character that appears elsewhere, violating the constraint. This makes greedy optimal because we're always cutting at the earliest safe point, which leaves maximum room for subsequent partitions.

Solution

def partitionLabels(s):
    # First pass: find last occurrence of each character
    last = {c: i for i, c in enumerate(s)}
    
    result = []
    size = 0  # current partition size
    end = 0   # furthest last occurrence seen in current partition
    
    for i, c in enumerate(s):
        # extend the current partition to include this character's territory
        end = max(end, last[c])
        size += 1
        
        # if we've reached the end of all characters in this partition, cut here
        if i == end:
            result.append(size)
            size = 0
    
    return result
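
A minimal check of the cut points, using the function above:

print(partitionLabels("ababcbacadefegdehijhklij"))  # [9, 7, 8]
print(partitionLabels("a"))                         # [1]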

Complexity

Time: O(n) — two passes over the string, each doing O(1) work per character.
Space: O(1) — we store last positions for at most 26 letters (English alphabet), which is constant space.

We can't do better than O(n) because we must at least examine each character to know where partitions end. The space is bounded by the alphabet size (26 for lowercase English letters), not by n, so it's O(1).

Common Mistakes

Edge Cases

Connections

Valid Parenthesis String #678
Greedy with range tracking (min/max balance)

Intuition

Think of parentheses like a see-saw or a balance scale. At any point, the number of '(' minus ')' represents how much 'weight' is on the left side. Normally with only '(' and ')', we just track one number. But with '*' acting as a wildcard, we have UNCERTAINTY - we don't know exactly what the balance is, but we know it's somewhere in a RANGE. The key insight: instead of trying every possibility (which is exponential), we track the MINIMUM and MAXIMUM possible balance at each step. If the maximum balance ever goes negative, we've broken the see-saw too far - even treating all '*' as '(' can't save us. If the minimum goes negative, we can 'reset' it to zero because we can always use some '*' as '(' to compensate. At the end, if we can achieve a balance of exactly 0 (min = 0), the string is valid.

Why This Pattern?

The wildcard '*' creates uncertainty in the balance state. Rather than exploring all 3^n possibilities of how '*' acts, we maintain the BOUNDS of all possible balances. This works because: (1) if max_balance < 0 at any point, no assignment can save us - it's impossible, (2) if min_balance < 0, we can always use some '*' as '(' to bring it back to 0, (3) at the end, checking if min == 0 tells us if there's an assignment achieving exactly zero balance. The structure of this problem is fundamentally about managing uncertainty in a range, and greedy tracking of bounds captures all possibilities efficiently.

Solution

def checkValidString(s):
    # min_balance: treat all '*' as ')' -> lowest possible balance
    # max_balance: treat all '*' as '(' -> highest possible balance
    min_balance = 0
    max_balance = 0
    
    for c in s:
        if c == '(':
            min_balance += 1
            max_balance += 1
        elif c == ')':
            min_balance -= 1
            max_balance -= 1
        else:  # c == '*'
            # '*' as '(' increases max, as ')' decreases min
            min_balance -= 1
            max_balance += 1
        
        # If max_balance < 0, even with all '*' as '(', we have too many ')'
        # This breaks the invariant - no valid assignment possible
        if max_balance < 0:
            return False
        
        # If min_balance < 0, we can use some '*' as '(' to bring it back to 0
        # We don't let it go negative because that's recoverable
        if min_balance < 0:
            min_balance = 0
    
    # At the end, can we achieve exactly balance 0?
    # min_balance == 0 means there's some assignment that reaches exactly 0
    return min_balance == 0
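
A short hand trace of the bounds, using the function above, makes the clamping concrete:

# "(*))": '(' -> [1,1], '*' -> [0,2], ')' -> [-1,1] clamped to [0,1], ')' -> [-1,0] clamped to [0,0]
print(checkValidString("(*))"))  # True - treat '*' as '('
print(checkValidString("))"))    # False - max_balance goes negative on the first ')'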

Complexity

Time: O(n) - single pass through the string
Space: O(1) - only tracking two variables regardless of input size

We process each character exactly once with O(1) work per character, giving O(n) time. Space is O(1) because we only maintain min_balance and max_balance - no arrays, stacks, or recursion. The bounds tracking compresses all 3^n possible wildcard assignments into just two numbers, which is the key efficiency insight.

Common Mistakes

Edge Cases

Connections

Intervals (6)

Insert Interval #57
Linear scan with three-way case analysis (intervals before, intervals after, intervals that overlap with new). This is essentially a 'sweep line' where we process intervals in sorted order and decide what to do with each one relative to the new interval.

Intuition

Think of this like adding a new meeting to a calendar. You have a list of non-overlapping meetings sorted by start time. When you add a new meeting, you need to find where it fits and merge it with any meetings that now overlap. It's like dropping a new train onto a schedule - if it overlaps with existing trains, you combine them into one longer block.

Why This Pattern?

The intervals are already sorted by start time, which means we can make a single pass through them. We don't need to backtrack or use complex data structures - we just need to handle three cases: (1) intervals that end completely before the new one go unchanged, (2) intervals that start completely after the new one go unchanged at the end, (3) intervals that overlap need to be merged by taking the min of starts and max of ends.

Solution

def insert(intervals, newInterval):
    # If no intervals exist, just return the new one
    if not intervals:
        return [newInterval]
    
    result = []
    i = 0
    n = len(intervals)
    
    # Case 1: Add all intervals that come BEFORE the new interval
    # (intervals that end before newInterval starts)
    while i < n and intervals[i][1] < newInterval[0]:
        result.append(intervals[i])
        i += 1
    
    # Case 2: Merge all intervals that OVERLAP with newInterval
    # Keep expanding newInterval to include all overlaps
    while i < n and intervals[i][0] <= newInterval[1]:
        # Merge: take the min start and max end
        newInterval[0] = min(newInterval[0], intervals[i][0])
        newInterval[1] = max(newInterval[1], intervals[i][1])
        i += 1
    
    # Add the merged newInterval to result
    result.append(newInterval)
    
    # Case 3: Add all remaining intervals that come AFTER the new interval
    while i < n:
        result.append(intervals[i])
        i += 1
    
    return result
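
Two quick checks of the three-phase scan, assuming the function above:

print(insert([[1,3],[6,9]], [2,5]))                       # [[1, 5], [6, 9]]
print(insert([[1,2],[3,5],[6,7],[8,10],[12,16]], [4,8]))  # [[1, 2], [3, 10], [12, 16]]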

Complexity

Time: O(n) - We make exactly one pass through all intervals. Each interval is visited at most once, so this is optimal. We can't do better because in the worst case (like inserting at the beginning), we still need to look at every interval to know where to place the new one.
Space: O(n) - We need space for the result list. The algorithm itself only uses O(1) extra space for pointers and temporary variables.

Think of it like reading a sorted file and inserting an item - you have to scan through to find the right position. With sorted data, O(n) is the best we can do for general insertion. We could do O(log n) with a tree if we just needed to search, but we also need to output all intervals, so O(n) output size dominates.

Common Mistakes

Edge Cases

Connections

Meeting Rooms II #253
Min-Heap (Priority Queue) / Sweep Line

Intuition

Think of this like a busy hotel. When guests check in, they need rooms. When they check out, rooms become available. The question is: at the busiest moment of the day, how many rooms are simultaneously occupied? This is exactly the 'maximum concurrency' problem. Each meeting is a 'guest' that occupies a room for a specific duration. We need to find the peak overlap - the moment when the most meetings are happening at once. Another way: imagine stacking transparent time intervals on top of each other. How tall is the stack at its tallest point? That's your answer.

Why This Pattern?

The problem asks for maximum concurrency at any point in time. A min-heap naturally models this because it always gives us 'the room that becomes available next soonest' - we push end times and the smallest (earliest) end time sits at the top. When a new meeting starts, if its start time is >= the earliest end time, that room is free and we can reuse it. Otherwise we need a new room. The heap size at any moment equals the number of rooms in use. This is the classic 'resource allocation with release times' pattern.

Solution

import heapq

def minMeetingRooms(intervals):
    if not intervals:
        return 0
    
    # Sort meetings by start time - we process them in chronological order
    intervals.sort(key=lambda x: x[0])
    
    # Min-heap stores end times of currently occupied rooms
    # heap[0] = earliest ending meeting (room that frees up soonest)
    heap = []
    
    for start, end in intervals:
        # If earliest ending room is free by this meeting's start time,
        # reuse that room (pop it and push the new meeting's end time)
        if heap and start >= heap[0]:
            heapq.heappop(heap)
        # Always push the current meeting's end time
        # Either into a freed room (we popped first) or as a new room
        heapq.heappush(heap, end)
    
    # Maximum heap size = minimum rooms needed
    return len(heap)
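
A small demonstration of room reuse, using the function above:

print(minMeetingRooms([[0,30],[5,10],[15,20]]))  # 2 - [5,10] and [15,20] share one room
print(minMeetingRooms([[7,10],[2,4]]))           # 1 - no overlap, one room suffices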

Complexity

Time: O(n log n)
Space: O(n) in worst case (all meetings overlap)

Sorting takes O(n log n). Each meeting causes at most one heap push and one heap pop, each O(log n). So total is O(n log n). We can't do better than O(n log n) because sorting is required - we need to know the chronological order of meetings. The heap space is O(n) because in the worst case (all meetings overlapping), we hold all end times simultaneously.

Common Mistakes

Edge Cases

Connections

Meeting Rooms #252
Sorting + Sequential Greedy Check (Sweep Line variant)

Intuition

Imagine you're scheduling trains on a single track. Each meeting is like a train that occupies the track for a certain duration. Can all trains run on schedule without any collisions? The key insight: if you line up all meetings by their start time (like trains waiting at a station), you just need to check if any meeting 'rear-ends' the one before it. Think of it like cars following each other on a road - if the second car starts before the first car has cleared the road, there's a crash. By sorting, we create a timeline where we only need to compare neighbors - no need to check every possible pair.

Why This Pattern?

When intervals are sorted by start time, any overlap MUST occur between consecutive intervals. Why? If interval A overlaps interval B, and both are sorted, either A comes before B (so A's end > B's start, which we catch when checking A→B) or B comes before A (caught when checking B→A). We don't need to check non-consecutive pairs because if A doesn't overlap B, and B doesn't overlap C, then A definitely doesn't overlap C (transitive property of non-overlapping sorted intervals).

Solution

def canAttendMeetings(intervals):
    # Edge case: 0 or 1 meeting can always be attended
    if len(intervals) <= 1:
        return True
    
    # Sort by start time - creates the timeline
    # This is the critical first step that makes everything else work
    intervals.sort(key=lambda x: x[0])
    
    # Check each meeting against the previous one
    for i in range(1, len(intervals)):
        prev_start, prev_end = intervals[i-1]
        curr_start, curr_end = intervals[i]
        
        # If current meeting starts before previous one ends → overlap!
        if curr_start < prev_end:
            return False
    
    # No overlaps found - all meetings can be attended
    return True
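
Quick checks of the neighbor comparison, assuming the function above:

print(canAttendMeetings([[0,30],[5,10],[15,20]]))  # False - [5,10] starts before [0,30] ends
print(canAttendMeetings([[7,10],[2,4]]))           # True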

Complexity

Time: O(n log n)
Space: O(1) auxiliary for the scan; the sort is in place, though Python's Timsort may use up to O(n) temporary space internally

Sorting is the dominant cost - we must examine all n meetings to place them in order. We can't know the correct relative positions without comparing each meeting to others, which requires at least n log n comparisons for comparison-based sorting. The subsequent scan is O(n) - just one pass through sorted meetings. We can't do better than O(n log n) because in the worst case (all meetings at different times), we need to establish their precise order.

Common Mistakes

Edge Cases

Connections

Merge Intervals #56
Sort and Scan (Sweep Line variant)

Intuition

Think of intervals like train tracks laid out on a table. Some tracks overlap (share rails), some are separate. Your job is to find all the continuous track segments after merging overlapping ones. The key insight: if you line up all tracks by their starting position (like sorting books by where they start on a shelf), you only need to look at your current track and the next one to decide if they merge. You don't need to compare every track with every other track - the sorting makes the problem one-dimensional and local.

Why This Pattern?

This problem has a natural ordering property - intervals are ranges that can be sorted by their start points. Once sorted, the merge decision becomes purely local: each interval either (1) extends the current merged interval if it overlaps, or (2) starts a new interval if it doesn't. This transforms an O(n²) all-pairs problem into O(n) scanning after O(n log n) sorting. The 'sweep' happens in sorted order, and we maintain only one active interval at a time.

Solution

def merge(intervals):
    if not intervals:
        return []
    
    # Step 1: Sort by start time - this is the critical first move
    # Like organizing books by where they begin on a shelf
    intervals.sort(key=lambda x: x[0])
    
    # Step 2: Start with first interval as our baseline
    merged = [intervals[0]]
    
    # Step 3: Sweep through remaining intervals
    for start, end in intervals[1:]:
        # Get the end of the last merged interval
        last_end = merged[-1][1]
        
        # If current interval starts before or at the last one's end, they overlap
        # Like two train tracks that share rails - merge them!
        if start <= last_end:
            # Extend the end to be the max of both (covers nested and partial overlap)
            merged[-1][1] = max(last_end, end)
        else:
            # No overlap - start a new merged interval
            merged.append([start, end])
    
    return merged
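
Two illustrative calls to the function above:

print(merge([[1,3],[2,6],[8,10],[15,18]]))  # [[1, 6], [8, 10], [15, 18]]
print(merge([[1,4],[4,5]]))                 # [[1, 5]] - touching intervals merge too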

Complexity

Time: O(n log n)
Space: O(n) for the output array, O(1) extra if we don't count output (just the merged list we build)

Sorting is the bottleneck - we must examine each interval's start position to establish the correct order, which takes O(n log n). Once sorted, we make exactly one pass through all intervals, doing constant-time work per interval. We can't do better than O(n log n) because any algorithm that determines the correct merge order must at minimum examine the relative positions of all intervals - that's a sorting lower bound for comparison-based approaches.

Common Mistakes

Edge Cases

Connections

Minimum Interval to Include Each Query #1851
Sweep Line with Priority Queue (Active Intervals)

Intuition

Think of this like matching 'service providers' (intervals) to 'customers' (query points). Each provider covers a range, and each customer needs service at a specific point. The goal is to find the SMALLEST provider that can serve each customer — the tightest fit, not just any fit. This is like finding the most 'efficient' resource that can handle each request. As we sweep through the number line (like a scanner moving left to right), we keep track of all intervals that have 'opened' but haven't 'closed' yet. Among all active intervals at any point, we want the smallest one — this is a classic job for a min-heap where the smallest interval bubbles to the top.

Why This Pattern?

The problem has two dimensions: we need to process queries in sorted order (to efficiently track which intervals are active) AND we need quick access to the smallest active interval. The sweep line processes queries in sorted order, and the priority queue gives us O(1) access to the minimum. This is the natural decomposition: 'which intervals cover this point?' (sweep line) + 'which is smallest?' (heap).

Solution

import heapq
from typing import List

def minInterval(intervals: List[List[int]], queries: List[int]) -> List[int]:
    # Sort intervals by start time - we add them to heap in this order
    intervals.sort(key=lambda x: x[0])
    
    # Sort queries but keep original indices to restore order later
    # This lets us process queries from left to right on the number line
    queries_with_idx = sorted([(q, i) for i, q in enumerate(queries)])
    
    result = [-1] * len(queries)
    min_heap = []  # (length, end, start) - min-heap by length
    i = 0  # pointer into intervals array
    
    for query, idx in queries_with_idx:
        # PHASE 1: Open all intervals that start at or before this query
        # These intervals are now 'active' - they could contain the query
        while i < len(intervals) and intervals[i][0] <= query:
            start, end = intervals[i]
            length = end - start + 1
            heapq.heappush(min_heap, (length, end, start))
            i += 1
        
        # PHASE 2: Close intervals that don't cover this query
        # An interval is useless if its end comes before the query
        while min_heap and min_heap[0][1] < query:
            heapq.heappop(min_heap)
        
        # PHASE 3: The smallest active interval is our answer
        # Top of min-heap always has smallest length
        if min_heap:
            result[idx] = min_heap[0][0]
    
    return result
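
A sanity check of the sweep, using the function above (values chosen for illustration):

print(minInterval([[1,4],[2,4],[3,6],[4,4]], [2,3,4,5]))  # [3, 3, 1, 4]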

Complexity

Time: O((n + m) log n)
Space: O(n + m)

Common Mistakes

Edge Cases

Connections

Non-overlapping Intervals #435
Greedy - Activity Selection (Earliest Finish Time)

Intuition

Imagine you're packing boxes into a fixed-length shelf. To fit the most boxes, you'd pick the narrowest boxes first - they leave maximum room for the rest. That's exactly what's happening here with time intervals. The greedy insight: if you always pick the interval that ends earliest, you leave maximum room for the remaining intervals to fit. This is like a game of Tetris where you want to maximize pieces - you always drop the piece that lands lowest to keep the stack as low as possible for future pieces.

Why This Pattern?

This pattern fits because the problem has optimal substructure: an optimal solution can be built by repeatedly making the locally optimal choice (pick earliest-ending non-overlapping interval). If you have an optimal solution that doesn't include the earliest-ending interval, you can swap that later-ending interval for the earlier one and never reduce the number of intervals you can keep - you actually gain space.

Solution

def eraseOverlapIntervals(intervals):
    if not intervals:
        return 0
    
    # Sort by end time - this is the key to the greedy approach
    intervals.sort(key=lambda x: x[1])
    
    removals = 0
    # Track the end time of the last non-overlapping interval we kept
    last_end = intervals[0][1]
    
    # Start from second interval since we kept the first one
    for i in range(1, len(intervals)):
        start, end = intervals[i]
        
        if start >= last_end:
            # No overlap - we can keep this interval
            last_end = end
        else:
            # Overlap detected - must remove one interval
            # We keep the one ending earlier (which is already sorted)
            # so we simply increment removal count and skip current
            removals += 1
    
    return removals
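
Two quick checks of the earliest-finish greedy, assuming the function above:

print(eraseOverlapIntervals([[1,2],[2,3],[3,4],[1,3]]))  # 1 - remove [1,3]
print(eraseOverlapIntervals([[1,2],[1,2],[1,2]]))        # 2 - keep only one copy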

Complexity

Time: O(n log n)
Space: O(1) extra - the sort is in place, though Timsort may use up to O(n) temporary space internally

Common Mistakes

Edge Cases

Connections

Math & Geometry (8)

Detect Squares #2013
Diagonal-pair enumeration with hash-based point lookup

Intuition

Think of this like a resonance detector. When you add a point, it creates potential 'vibrations' that can resonate with other points to form squares. Here's the key insight: any square has two diagonals that share the same midpoint and are perpendicular with equal length. If we pick two points as a diagonal of an axis-aligned square (they must differ in x and y by the same amount), the other two corners are uniquely determined - there's exactly one way to complete the square. So instead of checking all 4-point combinations (which would be slow), we pick two points as a diagonal and check if the other two corners exist in our collection.

Why This Pattern?

A square's diagonals have two crucial properties: (1) they bisect each other at the same midpoint, and (2) they have equal length and are perpendicular. This means if we fix two points as a potential diagonal, the other two corners are mathematically determined - there's no ambiguity or choice to make. This turns the problem into: for each point A, treat it and some other point B as a diagonal, compute where C and D must be, and check if they exist. The hash map gives O(1) lookup, making the enumeration efficient.

Solution

class DetectSquares:
    def __init__(self):
        # Hash map: point -> count of times added (handles duplicates)
        self.point_count = {}
        
    def add(self, point: List[int]) -> None:
        self.point_count[tuple(point)] = self.point_count.get(tuple(point), 0) + 1
        
    def count(self) -> int:
        result = 0
        # For each point P in our collection
        for p in self.point_count:
            px, py = p
            # Try every other point A that could form a diagonal with P
            for a in self.point_count:
                if a == p:
                    continue
                ax, ay = a
                
                # Skip if same x or y - these would form a line, not a diagonal
                if px == ax or py == ay:
                    continue
                
                # A diagonal of an axis-aligned square spans equal x and y distances;
                # without this check we would be counting rectangles, not squares
                if abs(px - ax) != abs(py - ay):
                    continue
                
                # Compute the other two corners of the square
                # The diagonals of a square bisect each other at the same midpoint
                # and are perpendicular with equal length
                # Given P=(px,py) and A=(ax,ay), the other corners are:
                # B = (ax, py)  -- shares x with A, y with P
                # C = (px, ay)  -- shares x with P, y with A
                # This forms an axis-aligned square
                b = (ax, py)
                c = (px, ay)
                
                # Check if both B and C exist in our collection
                if b in self.point_count and c in self.point_count:
                    # Multiply counts because each point could appear multiple times
                    result += (self.point_count[p] *
                               self.point_count[a] *
                               self.point_count[b] *
                               self.point_count[c])
        
        # Each square is found once per ordered diagonal pair: 2 diagonals x 2 orderings = 4
        return result // 4
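
A small check of the diagonal enumeration, assuming the class above (this count() variant scans all stored points):

ds = DetectSquares()
for pt in [[0,0],[0,2],[2,0],[2,2]]:
    ds.add(pt)
print(ds.count())  # 1 - the four corners form exactly one axis-aligned square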

Complexity

Time: O(n²) per call to count(), where n is the number of unique points added
Space: O(n) for storing the hash map of points

We iterate through all ordered pairs of stored points (O(n²)), and for each pair we do O(1) hash lookups and multiplications. The pair enumeration dominates: every pair is a candidate diagonal, so this approach inherently does Θ(n²) work per count. The hash map gives us constant-time lookup for checking if a point exists, which is crucial - without it, we'd need O(n) lookup per check, making it O(n³).

Common Mistakes

Edge Cases

Connections

Happy Number #202
Floyd's Tortoise and Hare (Cycle Detection) - same technique used for detecting loops in linked lists.

Intuition

Think of this like a feedback loop in a system. You take a number, crunch its digits into a new number, and feed that back in. The question is: does this system settle into equilibrium (reaches 1, the 'happy' state) or does it get stuck in a repeating pattern? The beautiful mathematical fact here is that the sum-of-squares operation is a 'contraction' - it shrinks numbers fast enough that you can't diverge to infinity. You MUST either hit 1 or enter a cycle. The known unhappy cycle is 4→16→37→58→89→145→42→20→4 (and it turns out ALL unhappy numbers eventually hit this exact cycle). So we just need to detect if we enter a cycle that doesn't include 1.

Why This Pattern?

This is a finite state machine where each number maps to exactly one successor. Finite state machines with a deterministic transition function either terminate (hit 1) or enter a cycle. Floyd's algorithm detects cycles by having two pointers traverse the sequence at different speeds - they'll eventually meet if a cycle exists. It's O(1) space because we don't need to store visited numbers, unlike a set-based approach.

Solution

def isHappy(n: int) -> bool:
    # Helper: compute sum of squares of digits
    def get_next(num):
        total = 0
        while num > 0:
            digit = num % 10
            total += digit * digit
            num //= 10
        return total
    
    # Edge case: 1 is immediately happy
    if n == 1:
        return True
    
    # Two pointers: slow moves 1 step, fast moves 2 steps
    slow = n
    fast = get_next(n)
    
    # Loop until fast hits 1 (happy) or they meet (cycle = unhappy)
    while fast != 1 and slow != fast:
        slow = get_next(slow)
        fast = get_next(get_next(fast))
    
    return fast == 1
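
Two illustrative runs of the function above:

print(isHappy(19))  # True: 19 -> 82 -> 68 -> 100 -> 1
print(isHappy(2))   # False: 2 eventually falls into the 4 -> 16 -> ... -> 20 -> 4 cycle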

Complexity

Time: O(log n) - Each digit-squaring step reduces the number significantly. For a number with d digits, the maximum sum of squares is 81d. For n ≥ 100, this sum is always less than n, creating rapid contraction. The sequence reaches either 1 or enters the cycle within a bounded number of steps (known max ~20 for 32-bit integers).
Space: O(1) - Only two integer variables (slow, fast) regardless of input size. No set of visited numbers needed.

We don't need to track every number we've seen because Floyd's algorithm exploits the cycle structure: like two runners on a circular track, a fast runner will eventually lap a slow runner if there's a loop. The 'track' here is the sequence of numbers generated by the happy process. Since the sequence either ends at 1 or loops forever, we only need to detect if a loop exists that doesn't include 1.

Common Mistakes

Edge Cases

Connections

Multiply Strings #43
Digit-by-digit multiplication with positional accumulation

Intuition

Think of multiplication like waves colliding. When you multiply digit[i] from the first number with digit[j] from the second, their 'energy' arrives at position i+j in the result. All the collisions at position k come from pairs where i+j=k. This is exactly like adding up all the products that land at each position - we accumulate contributions rather than doing the traditional step-by-step shifted multiplication. The carry is just 'energy overflow' that spills to the next position.

Why This Pattern?

In base-10 positional notation, digit at position i has value 10^i and digit at position j has value 10^j. Their product contributes to position i+j. All pairs (i,j) that sum to k contribute to result[k]. This mathematical property makes accumulation the natural approach - we collect all contributions at each position first, then normalize with carries.

Solution

class Solution:
    def multiply(self, num1: str, num2: str) -> str:
        # Handle zeros upfront - essential optimization
        if num1 == "0" or num2 == "0":
            return "0"
        
        # Result can be at most len(num1) + len(num2) digits
        # (e.g., 99 × 99 = 9801, which is 4 digits = 2 + 2)
        result = [0] * (len(num1) + len(num2))
        
        # Process from right to left (least significant digits)
        for i in range(len(num1) - 1, -1, -1):
            for j in range(len(num2) - 1, -1, -1):
                # Convert chars to integers
                n1 = ord(num1[i]) - ord('0')
                n2 = ord(num2[j]) - ord('0')
                
                # Position in result where this product contributes
                # i + j gives current position, carry goes to i + j + 1
                position = i + j + 1
                
                # Multiply and add to existing value at position
                product = n1 * n2 + result[position]
                
                # Store ones digit at current position
                result[position] = product % 10
                # Carry the tens digit to next position
                result[position - 1] += product // 10
        
        # Skip leading zeros (if any) and convert to string
        start_idx = 0
        while start_idx < len(result) and result[start_idx] == 0:
            start_idx += 1
        
        # Build final string from remaining digits
        return ''.join(str(d) for d in result[start_idx:])
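
Quick checks of the positional accumulation, assuming the class above:

s = Solution()
print(s.multiply("123", "456"))  # "56088"
print(s.multiply("99", "99"))    # "9801"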

Complexity

Time: O(m × n) where m = len(num1), n = len(num2)
Space: O(m + n) for the result array

We must multiply every digit in num1 by every digit in num2 - there are m × n such pairs, so we can't do better than O(m×n). The result array needs at most m+n positions because the largest product (all 9s) produces at most m+n digits.

Common Mistakes

Edge Cases

Connections

Plus One #66
Carry propagation / digit-by-digit processing

Intuition

Think of adding 1 like pouring water into a graduated cylinder. You fill up the rightmost 'bucket' (ones place). If it overflows (hits 10), it empties to 0 and carries 1 to the next bucket to the left. Keep propagating left until a bucket doesn't overflow. If ALL buckets overflow (like 999), you need a new bucket at the front (becoming 1000). It's exactly like doing addition on paper - you just don't know you need the carry until you hit a 9.

Why This Pattern?

The problem has a natural right-to-left sequential dependency. Each digit's final value depends on whether there was a carry from processing the less significant digit. This single-pass-from-right approach is the only way because you can't know if you need to carry until you've processed all digits to the right.

Solution

def plusOne(digits):
    # Process from rightmost (least significant) digit to left
    n = len(digits)
    
    for i in range(n - 1, -1, -1):
        if digits[i] < 9:
            # No carry needed - we can just increment and we're done
            # This digit absorbs the +1, nothing propagates further
            digits[i] += 1
            return digits
        else:
            # digits[i] == 9: becomes 0, carry propagates to next digit
            # Like 9 + 1 = 10, write 0, carry 1
            digits[i] = 0
            # Loop continues, carrying 1 to the next position
    
    # If we exit the loop, ALL digits were 9 (e.g., 999 -> 1000)
    # Need to prepend a 1 (the implicit carry creates a new most significant digit)
    return [1] + digits
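
Two quick calls to the function above:

print(plusOne([1,2,9]))  # [1, 3, 0] - the trailing 9 rolls over
print(plusOne([9,9,9]))  # [1, 0, 0, 0] - all nines grow a new leading digit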

Complexity

Time: O(n)
Space: O(1) excluding output array, O(n) if counting the output array

Worst case (like 999...) requires touching every digit once. Best case (like 1234) only touches the last digit. On average, we might process half the digits, but O(n) captures the upper bound. Space is O(1) extra because we modify in-place - we only allocate a new array in the all-9s case.

Common Mistakes

Edge Cases

Connections

Pow(x, n) #50
Binary Exponentiation (Exponentiation by Squaring)

Intuition

Think of exponentiation like a nuclear chain reaction or signal propagation. If you want x^16, you don't multiply x by itself 16 times - you double: x→x²→x⁴→x⁸→x¹⁶. Each step squares the previous result. This is 'exponentiation by squaring' - exploiting the fact that (x²)² = x⁴, and so on. The exponent acts like a 'signal' that propagates through this doubling process. For odd exponents, we first extract one factor of x, then handle the remaining even part. Negative exponents just mean 'divide instead of multiply': x⁻ⁿ = 1/xⁿ, so we invert the base and work with a positive exponent.

Why This Pattern?

The exponent n can be represented in binary. Each bit represents whether we include that power of 2 in our final product. Mathematically: x^n = (x^(2^0))^b₀ × (x^(2^1))^b₁ × ... where bᵢ are the bits of n. This transforms O(n) multiplications into O(log n) by squaring the base and halving the exponent at each step.

Solution

def myPow(x: float, n: int) -> float:
    # Handle negative exponent: x^(-n) = 1 / x^n
    if n < 0:
        x = 1 / x
        n = -n
    
    result = 1.0
    
    # Binary exponentiation: process each bit of n
    while n > 0:
        # If current bit is 1, multiply result by current power of x
        if n & 1:
            result *= x
        
        # Square x for next bit position (move to next power of 2)
        x *= x
        
        # Move to next bit (halve n)
        n >>= 1
    
    return result
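
A couple of illustrative calls to the function above:

print(myPow(2.0, 10))  # 1024.0
print(myPow(2.0, -2))  # 0.25 - negative exponent inverts the base first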

Complexity

Time: O(log n)
Space: O(1)

We process one bit of the exponent per iteration, and the number of bits in n is log₂(n). Each iteration does constant work (a couple multiplications), so total is O(log n). This scheme can't do better because it builds the chain of repeated squares x, x², x⁴, ..., one per bit of n - that's Θ(log n) distinct values.

Common Mistakes

Edge Cases

Connections

Rotate Image #48
Layer-by-layer 4-way swap with in-place rotation

Intuition

Imagine you're rotating a physical photo frame 90° clockwise. The top-left corner moves to top-right, top-right to bottom-right, and so on. Here's the key insight: instead of thinking about individual elements, think about concentric RINGS (or layers). For each ring, we can rotate 4 elements at a time in a cycle. Picture a square with corners labeled A, B, C, D going clockwise. After rotation: A→B→C→D→A. The element at position (row, col) moves to (col, n-1-row). This is like a 4-way dance where each element hands off its value to the next position.

Why This Pattern?

The matrix has symmetry along its center. By processing from the outermost layer inward, each element participates in exactly one 4-element cycle. This guarantees O(1) space because we only need one temp variable. The coordinate transformation (i,j) → (j, n-1-i) is deterministic, so we can compute exact swap destinations mathematically.

Solution

class Solution:
    def rotate(self, matrix: List[List[int]]) -> None:
        """
        Rotates matrix 90 degrees clockwise in-place.
        Uses layer-by-layer approach: for each layer, rotate 4 elements
        in a cycle (top-left → top-right → bottom-right → bottom-left → top-left)
        """
        n = len(matrix)
        
        # Process each layer from outer to inner
        # For n=4: layers 0 and 1 (0-indexed)
        for layer in range(n // 2):
            first = layer
            last = n - 1 - layer
            
            # For each element in current layer (excluding the last one of each side)
            for i in range(first, last):
                offset = i - first
                
                # Save top-left (will be overwritten)
                top = matrix[first][i]
                
                # Left → Top
                matrix[first][i] = matrix[last - offset][first]
                
                # Bottom → Left
                matrix[last - offset][first] = matrix[last][last - offset]
                
                # Right → Bottom
                matrix[last][last - offset] = matrix[i][last]
                
                # Top → Right (using saved value)
                matrix[i][last] = top
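
A small in-place demonstration, assuming the class above:

s = Solution()
grid = [[1,2,3],[4,5,6],[7,8,9]]
s.rotate(grid)
print(grid)  # [[7, 4, 1], [8, 5, 2], [9, 6, 3]]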

Complexity

Time: O(n²)
Space: O(1)

Common Mistakes

Edge Cases

Connections

Set Matrix Zeroes #73
In-place State Preservation using matrix boundaries as markers

Intuition

Think of this like contamination detection. Each 0 is a 'contaminant' that needs to spread along its entire row and column. The challenge: you need to remember which rows/columns are contaminated WITHOUT losing information as you go. It's like taking notes while reading - you can't erase what you've read. The trick is to use the matrix's own edges (first row and first column) as a 'todo list' - marking which rows/columns need zeroing without actually zeroing them yet. Once you've found ALL the zeros, THEN you systematically clear the marked rows and columns.

Why This Pattern?

The matrix already has a natural 'edge' structure - the first row and first column. Instead of using extra space to remember which rows/columns have zeros, we exploit the matrix structure itself. Any zero at position (i,j) marks its row's first cell and column's first cell. This creates a distributed 'signature' of contamination. After scanning the entire matrix, we use these markers to zero everything in one pass.

Solution

def setZeroes(matrix):
    if not matrix or not matrix[0]:
        return
    
    m, n = len(matrix), len(matrix[0])
    
    # Step 1: Check if first row/col need to be zeroed (we'll overwrite them later)
    first_row_zero = any(matrix[0][j] == 0 for j in range(n))
    first_col_zero = any(matrix[i][0] == 0 for i in range(m))
    
    # Step 2: Use first row and first column as markers
    # If cell (i,j) is 0, mark its row-start and col-start to 0
    for i in range(1, m):
        for j in range(1, n):
            if matrix[i][j] == 0:
                matrix[i][0] = 0  # Mark this row for zeroing
                matrix[0][j] = 0  # Mark this column for zeroing
    
    # Step 3: Zero out cells based on markers (skip first row/col)
    for i in range(1, m):
        for j in range(1, n):
            if matrix[i][0] == 0 or matrix[0][j] == 0:
                matrix[i][j] = 0
    
    # Step 4: Zero out first column if needed
    if first_col_zero:
        for i in range(m):
            matrix[i][0] = 0
    
    # Step 5: Zero out first row if needed
    if first_row_zero:
        for j in range(n):
            matrix[0][j] = 0
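
A quick in-place check of the marker technique, using the function above:

grid = [[1,1,1],[1,0,1],[1,1,1]]
setZeroes(grid)
print(grid)  # [[1, 0, 1], [0, 0, 0], [1, 0, 1]]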

Complexity

Time: O(m * n) - We traverse the matrix a constant number of times (3 passes: marking, zeroing, edge handling). Each cell is visited O(1) times.
Space: O(1) - Only using a few boolean variables, no extra data structures proportional to matrix size.

We can't do better than O(m*n) because potentially every cell needs to be examined (to find zeros) AND potentially every cell needs to be set to zero. That's Θ(mn) work minimum. For space, we exploit the matrix's own boundary as storage - we 'pay' with the first row/column being temporarily unusable as data, which is the price of O(1) extra space.

Common Mistakes

Edge Cases

Connections

Spiral Matrix #54
Boundary Traversal / Layer-by-layer peeling

Intuition

Think of a snail crawling through the matrix starting from the top-left corner. It crawls right until it hits a wall, then turns down, then left, then up, and keeps spiraling inward. Each direction change happens when you either hit the matrix boundary OR hit a cell you've already visited. It's like walking along the edges of an onion, peeling off one layer at a time, then moving to the next inner layer.

Why This Pattern?

The spiral order naturally decomposes the matrix into concentric 'shells' or layers. Each complete cycle around the perimeter visits all boundary elements exactly once before moving to the next inner layer. The structure of spiral order IS this boundary-following behavior - there's no shorter path because you must visit every outer cell before accessing inner cells.

Solution

def spiralOrder(matrix):
    if not matrix or not matrix[0]:
        return []
    
    result = []
    m, n = len(matrix), len(matrix[0])
    # Define the current rectangle's boundaries
    top, bottom = 0, m - 1
    left, right = 0, n - 1
    
    while top <= bottom and left <= right:
        # 1. Traverse RIGHT along the top row
        for col in range(left, right + 1):
            result.append(matrix[top][col])
        top += 1  # Top boundary done, shrink inward
        
        # 2. Traverse DOWN along the rightmost column
        for row in range(top, bottom + 1):
            result.append(matrix[row][right])
        right -= 1  # Right boundary done
        
        # 3. Traverse LEFT along the bottom row (if rows remain)
        if top <= bottom:
            for col in range(right, left - 1, -1):
                result.append(matrix[bottom][col])
            bottom -= 1
        
        # 4. Traverse UP along the leftmost column (if columns remain)
        if left <= right:
            for row in range(bottom, top - 1, -1):
                result.append(matrix[row][left])
            left += 1
    
    return result
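
One illustrative call to the function above:

print(spiralOrder([[1,2,3],[4,5,6],[7,8,9]]))  # [1, 2, 3, 6, 9, 8, 7, 4, 5]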

Complexity

Time: O(m * n)
Space: O(1) excluding output array

Common Mistakes

Edge Cases

Connections

Bit Manipulation (7)

Counting Bits #338
Dynamic Programming on binary representation. The recurrence relation is: dp[n] = dp[n >> 1] + (n & 1).

Intuition

Think of binary numbers as a tree where each number's 'parent' is itself right-shifted by 1 (dividing by 2). A number's bit count = its parent's bit count + the bit that was removed. If n is even (ends in 0), we just added a 0 bit, so count stays the same. If n is odd (ends in 1), we added a 1 bit, so count increases by 1. This is like signal propagation through a binary tree - each child inherits the 'energy' (bit count) of its parent plus what it picked up on the way down.

Why This Pattern?

The fundamental property of binary: right-shifting divides by 2 (floor), dropping the least significant bit. The number of 1-bits in n equals the number in n//2 plus whatever the LSB contributes (0 if even, 1 if odd). This creates an obvious subproblem structure - smaller numbers help build larger ones.

Solution

def countBits(n):
    # dp[i] = number of 1-bits in i
    dp = [0] * (n + 1)
    
    # Base case: dp[0] = 0 (already initialized)
    # For each number from 1 to n
    for i in range(1, n + 1):
        # i >> 1 = i // 2 (right shift drops LSB)
        # i & 1 = i % 2 (isolates LSB: 0 if even, 1 if odd)
        dp[i] = dp[i >> 1] + (i & 1)
    
    return dp
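
A quick check of the recurrence, using the function above:

print(countBits(5))  # [0, 1, 1, 2, 1, 2] - e.g. dp[5] = dp[2] + 1 = 1 + 1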

Complexity

Time: O(n) - We compute and store the result for each integer from 0 to n exactly once. There's no way to do better because we need to return n+1 results.
Space: O(n) - We store the result for each number because dp[i >> 1] must be available when computing dp[i]; the array also doubles as the required output, so extra space beyond the output is O(1).

We must produce n+1 outputs (one for each number from 0 to n), so O(n) time is optimal. The DP array of size n+1 is necessary because each number's answer depends on potentially any smaller number (specifically n//2).

Common Mistakes

Edge Cases

Connections

Missing Number #268
XOR Identity / Sum Conservation

Intuition

Think of this like a conservation law. We know the complete set should be 0 through n, but one number is 'leaking' out. If we add up what we SHOULD have (the sum 0+1+2+...+n) and subtract what we ACTUALLY have, the difference is exactly the missing number - like measuring a water leak by comparing expected and actual volume. Alternatively, think of XOR as a 'cancellation' operation: every number that appears twice cancels to 0, leaving only the missing number (which appears once) as the surviving signal.

Why This Pattern?

This problem has a complete set [0,n] with exactly one element removed. Both XOR and sum have inverses: x ^ x = 0 and x - x = 0. This means we can 'cancel out' all the numbers that are present in the array, leaving only the missing one. The structural property is that we know the exact universe of possible values but one is absent - making identity operations the natural tool.

Solution

def missingNumber(nums):
    # Approach 1: XOR (preferred - no overflow risk in Python)
    # XOR all numbers from 0 to n, then XOR with all array elements
    # Each present number appears twice and cancels to 0
    # Only the missing number survives
    result = 0
    for i in range(len(nums) + 1):
        result ^= i  # XOR with expected range
    for num in nums:
        result ^= num  # XOR out what's actually there
    return result

# Alternative (sum method) - simpler conceptually:
# def missingNumber(nums):
#     n = len(nums)
#     expected_sum = n * (n + 1) // 2  # Sum of 0 to n
#     return expected_sum - sum(nums)
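
Both approaches give the same answer; quick checks using the XOR version above:

print(missingNumber([3,0,1]))              # 2
print(missingNumber([9,6,4,2,3,5,7,0,1]))  # 8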

Complexity

Time: O(n)
Space: O(1)

We must touch each element in the array exactly once to either XOR it or sum it. There's no way to find the missing number without checking all inputs - it's essentially counting n items. The O(1) extra space comes from only needing a single accumulator variable regardless of input size.

Common Mistakes

Edge Cases

Connections

Number of 1 Bits #191
Bit Manipulation - Remove Rightmost Set Bit

Intuition

Think of each set bit as an 'energy source' in a system. The trick `n & (n-1)` acts like a drain that removes exactly one source per operation — specifically the rightmost one. When you subtract 1 from a number, all bits to the right of the rightmost 1 flip (0↔1), and that rightmost 1 becomes 0. When you AND the result with the original number, those flipped bits become 0, effectively 'draining' that one source. You keep draining until the system has no energy left (n becomes 0), and the number of drains equals your answer. This is like counting items in a system by systematically removing them one at a time rather than inspecting every possible location.

Why This Pattern?

The property that `n & (n-1)` always removes exactly one set bit is structural — it exploits how binary subtraction works. This pattern is natural because we only iterate as many times as there are 1-bits, not 32 times. Each iteration deterministically removes one known set bit.

Solution

class Solution:
    def hammingWeight(self, n: int) -> int:
        """
        Count the number of '1' bits in the binary representation of n.
        Uses the n & (n-1) trick to remove rightmost set bit each iteration.
        """
        count = 0
        while n:
            # Remove the rightmost set bit: flips bits after rightmost 1 to 1s,
            # turns rightmost 1 to 0, then AND with original clears all those bits
            n = n & (n - 1)
            count += 1
        return count
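
Quick checks of the bit-clearing loop, assuming the class above:

s = Solution()
print(s.hammingWeight(0b1011))      # 3
print(s.hammingWeight(0b10000000))  # 1 - a single set bit means a single loop iteration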

Complexity

Time: O(k) where k is the number of set bits (at most 32 for 32-bit integers)
Space: O(1)

We only loop once per set bit rather than checking all 32 bit positions. In the worst case (n = 0xFFFFFFFF), we iterate 32 times; in the best case (n = 0), we iterate 0 times. This is optimal because we must at minimum examine each set bit to count it.

Common Mistakes

Edge Cases

Connections

Reverse Bits #190
Bit-by-bit extraction and reconstruction

Intuition

Think of the 32 bits as a line of 32 dominoes. Each domino is either standing (1) or fallen (0). 'Reversing bits' is like reflecting this line in a mirror — the leftmost domino becomes the rightmost, the second-from-left becomes second-from-right, and so on. The position transformation is simple: bit at position i moves to position 31-i. Another way: we're 'reading the binary number backwards' — the least significant bit becomes the most significant, and vice versa. This is fundamentally a positional remapping problem.

Why This Pattern?

Reversing is inherently an order-reversal operation. By extracting each bit one at a time (from LSB toward MSB) and building the result in the opposite order (placing each extracted bit from MSB toward LSB), we naturally achieve the reversal. This is the 'dual-pointer' technique applied to bit positions instead of array indices.

Solution

def reverseBits(n):
    result = 0
    for i in range(32):
        # Extract the i-th bit from the original number (working right to left)
        bit = (n >> i) & 1
        # Place it in the mirrored position (31-i) in result (working left to right)
        # We OR because result already has bits from previous iterations
        result = result | (bit << (31 - i))
    return result
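
Two small checks of the positional remapping, using the function above:

print(reverseBits(0b1))   # 2147483648 - the lone bit moves from position 0 to position 31
print(reverseBits(0b11))  # 3221225472 - bits 0 and 1 land at positions 31 and 30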

Complexity

Time: O(1)
Space: O(1)

We iterate exactly 32 times (once per bit), regardless of input value. This is constant because the input is always a fixed 32-bit integer. Each iteration does constant-time bit operations. No data structures grow with input size.

Common Mistakes

Edge Cases

Connections

Reverse Integer #7
Digit-by-digit accumulation with overflow guards

Intuition

Think of reversing an integer like stacking rings on a pole. Each new digit taken from the original number gets placed on top, pushing everything else down one position. The overflow problem is like checking whether the pole can support another ring before adding it. If you're building up a number and the current result is already greater than INT_MAX/10, adding another digit (even a 0) would overflow. If the result equals INT_MAX/10, you can only add digits 0-7 (the last digit of INT_MAX is 7). This is exactly how you'd check if a water glass will overflow: if there's already 9 ounces in a 10-ounce glass, adding any more spills. But if there's 8 ounces, you can add up to 2 more safely.

Why This Pattern?

The problem has a hard boundary (32-bit signed integer range), and we're building the result incrementally. This means we can detect overflow at each step before it happens, rather than computing first and checking afterward (which would already be too late).

Solution

class Solution:
    def reverse(self, x: int) -> int:
        # Handle negative numbers by working with positive, restore sign at end
        sign = -1 if x < 0 else 1
        x = abs(x)
        
        result = 0
        while x > 0:
            # Take the rightmost digit
            digit = x % 10
            
            # Check for overflow BEFORE adding:
            # If result > INT_MAX / 10, multiplying by 10 would overflow
            # If result == INT_MAX / 10 and digit > 7, adding would overflow
            # (INT_MAX = 2147483647, so last digit could be 0-7)
            if result > 2147483647 // 10 or (result == 2147483647 // 10 and digit > 7):
                return 0
            
            # Add digit to result (shifting existing digits left)
            result = result * 10 + digit
            
            # Remove the processed digit
            x //= 10
        
        return sign * result
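
Illustrative calls to the method above, including the overflow guard:

s = Solution()
print(s.reverse(123))         # 321
print(s.reverse(-120))        # -21
print(s.reverse(1534236469))  # 0 - the reversal would exceed the 32-bit range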

Complexity

Time: O(d) where d is the number of digits in the input (at most 10 for 32-bit integers)
Space: O(1) - only a fixed number of integer variables used regardless of input size

We process each digit exactly once, so time is proportional to digit count. The digit count is bounded by 10 (for 32-bit), making this effectively O(1) in the worst case. Space is constant because we're just storing a few integers, not building any data structure that grows with input.

Common Mistakes

Edge Cases

Connections

Single Number #136
XOR Cancellation / Bit Manipulation

Intuition

Imagine each number as a particle, and pairs of identical numbers as matter and antimatter - when they meet, they annihilate completely (become 0). XOR is the mathematical operation that does exactly this: any number XORed with itself gives 0, and any number XORed with 0 gives itself. So if we line up all the numbers and XOR them together, the pairs annihilate each other, leaving only the single number standing. This is like a 'conservation law' for bits - pairs cancel out at each bit position.

Why This Pattern?

The problem has a specific structural property: every element appears exactly twice except one. This paired repetition structure is what XOR is perfectly designed to handle. XOR has three key properties that match this problem: (1) a ^ a = 0 - paired elements cancel to zero, (2) a ^ 0 = a - the single element remains, (3) XOR is commutative and associative so order doesn't matter. This is the most elegant solution because it exploits a fundamental mathematical property rather than brute force.

Solution

def singleNumber(nums):
    # Start with 0 because x ^ 0 = x (identity property)
    result = 0
    # XOR all numbers together - pairs cancel out, single remains
    for num in nums:
        result ^= num
    return result
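
A one-line check of the cancellation, using the function above:

print(singleNumber([4,1,2,1,2]))  # 4 - both pairs cancel, 4 survives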

Complexity

Time: O(n)
Space: O(1)

We must examine each of the n elements at least once to find the unique one - there's no way around this. For space, we only store a single integer (result) regardless of input size, which is the minimum possible since we need to output something.

Common Mistakes

Edge Cases

Connections

Sum of Two Integers #371
Iterative Bitwise Carry Propagation

Intuition

Think of binary addition like water flowing in connected containers. When you add 1+1 at any bit position, you get 0 there but create a 'overflow' (carry) that flows to the next position. XOR tells you what each bit sums to WITHOUT considering carries (1+1=0, 1+0=1, 0+1=1, 0+0=0). AND tells you where carries are created (only 1+1 produces a carry). You then shift the carries left and repeat until no carries remain. This is like a ripple propagating through the system until it reaches equilibrium.

Why This Pattern?

The problem forces us to simulate CPU-level addition. At each bit position, two independent operations happen in parallel: XOR computes the sum, AND identifies carries. The carry must propagate to higher bits, creating a feedback loop that continues until the system stabilizes (no carries left). This is the natural hardware algorithm.

Solution

def getSum(a: int, b: int) -> int:
    # Simulate 32-bit signed integer arithmetic without using + or -
    # 
    # Core insight: XOR = sum without carries, AND << 1 = carries
    # Iterate until carries propagate to nothing (b becomes 0)
    
    MASK = 0xFFFFFFFF        # 32-bit mask
    MAX_INT = 0x7FFFFFFF     # Max positive 32-bit signed int
    
    # Work with 32-bit unsigned integers
    while b != 0:
        # XOR: sum WITHOUT considering carries
        # Example: 1^1=0, 1^0=1, 0^1=1, 0^0=0 (exact binary addition truth table!)
        sum_without_carry = (a ^ b)
        
        # AND then shift: positions where BOTH are 1 generate a carry
        # The carry flows to the NEXT bit (left shift by 1)
        carry = ((a & b) << 1) & MASK
        
        # Update for next iteration
        a = sum_without_carry & MASK  # Keep within 32 bits
        b = carry
    
    # Convert from unsigned back to signed if needed
    # If the 32nd bit is set, we have a negative number
    return a if a <= MAX_INT else ~(a ^ MASK)
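
Quick checks of the carry loop, using the function above:

print(getSum(1, 2))   # 3
print(getSum(-1, 1))  # 0 - the carry ripples across all 32 bits before the mask clears it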

Complexity

Time: O(1) - at most 32 iterations for 32-bit integers (one per possible carry-propagation step)
Space: O(1)

Each iteration eliminates at least one carry bit (the rightmost one). Since carries can propagate through all 32 bits, we need at most 32 iterations for 32-bit integers. The number of iterations equals the number of carry propagation cycles needed, which is bounded by the bit-width.

Common Mistakes

Edge Cases

Connections