Neetcode 150

Contains Duplicate #217

Hash Set / Bloom Filter detection

Intuition

Think of this like a bouncer at a club checking the guest list. As each person arrives, you check if their name is already in your log. If it is - duplicate! If not, you add it to the log and let them in. The 'log' is our set, and checking it is essentially instant (O(1)) because hash tables work like a perfect filing system - no need to flip through pages, you jump directly to where the name would be.

Why This Pattern?

We only need to detect IF a duplicate exists, not count them or find which ones. A hash set gives O(1) average-case lookup - we can check each number as we go and know instantly if we've seen it before. This is the classic 'membership testing' pattern.

Solution

class Solution:
    def containsDuplicate(self, nums: List[int]) -> bool:
        seen = set()  # Our "guest list" - tracks what we've encountered
        
        for num in nums:
            if num in seen:  # O(1) lookup - is this name already on the list?
                return True   # Found a duplicate!
            seen.add(num)     # Add to our set for future checks
        
        return False  # Made it through - all unique

Complexity

Time: O(n)
Space: O(n)

We must look at each of the n elements at least once (worst case, the duplicate is at the very end). For each element, we do O(1) set operations. Can't get faster than O(n) because we need to 'touch' each input element. The space is O(n) in the worst case - if all elements are unique, we store all n of them in our set.

Common Mistakes

Using a list instead of a set for membership checking (makes it O(n²) instead of O(n))
Not considering the edge case of empty array returning False
Modifying the original array when that wasn't required - we don't need to sort or change the input
Forgetting that integers can be negative - doesn't matter for this problem but good to remember

Edge Cases

Empty array [] returns False - no duplicates possible
Single element [1] returns False
All identical elements [1,1,1] returns True immediately on second element
Two elements [1,2] returns False
Two identical [1,1] returns True on second element
Negative numbers work fine: [-1, -2, -1] returns True

Connections

Valid Anagram (#242) - uses same hash set pattern for frequency checking
Set Matrix Zeroes (#73) - uses set for tracking positions
First Missing Positive (#41) - uses index as hash key pattern
Longest Consecutive Sequence (#128) - set-based membership testing

Encode and Decode Strings #271

Length-delimited serialization. Self-describing format where each item encodes its own size, making the data immune to whatever characters it contains.

Intuition

Think of this like packaging boxes for shipping. If you just wrote each box's contents and then stacked them, you couldn't tell where one box ends and another begins if the contents happened to match your separator. The smart solution: label each box with its exact length FIRST, then pack the contents. When unpacking, you read the label, know exactly how many items are in that box, and move to the next. The '#' delimiter is just the label separator — the actual string content can be anything because you're reading a fixed number of characters based on the length prefix, not scanning for a special character.

Why This Pattern?

This pattern exploits a key structural property: by embedding the length in the encoded format, we eliminate dependence on 'special characters' that can't appear in the payload. The format is compositional — each string is independently decodable, which is both efficient and robust.

Solution

class Codec:
    def encode(self, strs: List[str]) -> str:
        """Encodes a list of strings to a single string."""
        # For each string: prefix with length + delimiter, then append string
        # The delimiter '#' marks where the length ends
        encoded = []
        for s in strs:
            # Length tells decoder exactly how many chars to read
            encoded.append(str(len(s)) + '#' + s)
        return ''.join(encoded)
    
    def decode(self, s: str) -> List[str]:
        """Decodes a single string to a list of strings."""
        result = []
        i = 0
        while i < len(s):
            # Find delimiter: where length ends
            j = i
            while s[j] != '#':
                j += 1
            
            # Extract length (everything between i and j)
            length = int(s[i:j])
            
            # Skip delimiter, read exactly 'length' characters
            result.append(s[j + 1 : j + 1 + length])
            
            # Move pointer to start of next length-prefixed string
            i = j + 1 + length
        
        return result

Complexity

Time: O(N) where N is total characters in output string. Each character is visited exactly once during encode and once during decode — no backtracking or re-scanning.
Space: O(N) for the encoded string and output list. We must store all characters somewhere; you can't compress below the information-theoretic minimum.

The string length serves as a 'forward pointer' — we never revisit characters. It's like a linked list in memory: once you process a node, you move on. The bottleneck is physically reading/writing each character, which is unavoidable since every character in the input must appear in the output.

Common Mistakes

Using a simple delimiter (like comma) without length prefix — fails when string contains the delimiter
Not handling empty list case (empty input)
Off-by-one errors when skipping the delimiter and reading the string
Assuming digits won't appear in strings — they absolutely can, which is why '#' delimiter is essential

Edge Cases

Empty list input [] — encode should return '' (empty string)
Empty string '' in the list — encodes as '0#' (length 0, no chars)
Strings containing '#' — handled correctly because we read fixed length, not scanning for delimiter
Very long strings — length can be multiple digits, hence the while loop to find delimiter, not fixed-position parsing

Connections

Similar to: Integer encoding in bit streams (write length then data)
Related to: JSON serialization (self-describing format)
Underlies: TCP/IP packet framing where you specify header length before payload
Think of this as the string version of 'serialize' in any serialization framework — same length-prefix concept appears in Protobuf, Avro, etc.

Group Anagrams #49

Keyed grouping / Canonical form hashing

Intuition

Think of each string as a recipe - if two recipes have the exact same ingredients in the exact same quantities, they're the same 'type' of recipe regardless of the order ingredients were added. The challenge is: how do we efficiently identify which recipes share ingredients? We need a 'fingerprint' for each string that all its anagrams share. Sorting the characters creates this fingerprint: 'eat', 'tea', and 'ate' all become 'aet' when sorted - they now have the same identity and can be grouped together. This is like organizing a deck of cards by suit and rank rather than by the order they were dealt.

Why This Pattern?

We're partitioning strings into equivalence classes where the equivalence relation is 'are anagrams of each other'. The structural property that makes hashing the natural choice is that anagrams are indistinguishable when you ignore order - they form a natural group. By transforming each string into a canonical representation (sorted characters), we create a perfect hash key where strings with the same key MUST be anagrams (the converse also holds).

Solution

def groupAnagrams(strs):
    # The key insight: anagrams become identical when sorted
    # "eat" -> "aet", "tea" -> "aet", "ate" -> "aet" all map to same bucket
    
    groups = {}
    
    for s in strs:
        # Canonical form: sorted characters
        # This is the "fingerprint" that all anagrams share
        key = tuple(sorted(s))
        
        if key not in groups:
            groups[key] = []
        groups[key].append(s)
    
    return list(groups.values())

Complexity

Time: O(n * k log k)
Space: O(n * k)

We process n strings, and for each one we sort k characters (where k is average string length). Sorting dominates at O(k log k) per string. We can't do better than sorting because we must examine all k characters to determine anagram equivalence - you can't know if two strings are anagrams without checking every character. The space comes from storing all n strings in the hash map, plus the sorted representation of each string as a key.

Common Mistakes

Using a list as a dict key instead of tuple - lists are unhashable in Python
Not realizing that sorted() returns a string in Python 3, but we need a hashable type for dict keys
Forgetting that the answer can be in any order - don't assume sorted output
Using list() on dict_values directly instead of list() - this works in Python 3.7+ but explicitly converting is clearer

Edge Cases

Empty strings - '' sorted is '', so all empty strings group together
Single character strings - each unique character forms its own group
Strings with all identical characters like 'aaa' - these all group together
Input with only one string - returns list containing that string's group
Strings with all 26 letters - still works, sorted key will be 'abcdefghijklmnopqrstuvwxyz'

Connections

Valid Anagram (#242) - This is the binary version: are two strings anagrams? Group Anagrams extends this to many strings
Top K Frequent Elements (#347) - Uses similar hash-map grouping pattern but with frequency as key
Find All Anagrams in a String (#438) - Sliding window version of anagram detection
Longest Substring with K Distinct Characters (#340) - Another keyed-grouping problem

Longest Consecutive Sequence #128

Set-based sequence detection with the 'start point' optimization. This leverages a hash set to achieve O(1) membership checking.

Intuition

Imagine you're looking at a bunch of numbered blocks scattered on a table. Some of them happen to form continuous chains - like 4,5,6,7 sitting next to each other. Your job is to find the longest chain. The key insight: you only need to start counting from the BEGINNING of each chain - the block that doesn't have its left neighbor. If you try to count from the middle of a chain (like 5), you'll double-count. So the algorithm is: put all numbers in a hash set for O(1) lookups, then iterate through and only start counting when a number has NO left neighbor in the set. This is like finding all the 'headwaters' of streams and tracing each river to its source.

Why This Pattern?

The structural property that makes this pattern work: consecutive sequences have a unique 'start' element (the one with no predecessor). By filtering to only process these starts, we avoid redundant work. If we processed every element, we'd re-count the same sequences multiple times. The set gives us constant-time lookup to check if a predecessor exists.

Solution

def longestConsecutive(nums):
    if not nums:
        return 0
    
    # Put all numbers in a set for O(1) lookups
    num_set = set(nums)
    longest = 0
    
    for num in num_set:
        # Only start counting if this is the beginning of a sequence
        # (i.e., num-1 doesn't exist in the set)
        if num - 1 not in num_set:
            current = num
            streak = 1
            
            # Keep counting forward while consecutive numbers exist
            while current + 1 in num_set:
                current += 1
                streak += 1
            
            longest = max(longest, streak)
    
    return longest

Complexity

Time: O(n) - Here's why: We visit each number at most twice in the worst case. Each number enters the set once, and at most one number from each sequence triggers the forward-counting loop. Within that loop, we visit each element of that sequence exactly once total across all starts. Since sequences don't overlap, the total is bounded by O(2n) = O(n).
Space: O(n) - The hash set stores all n numbers. Without this, we couldn't do O(1) predecessor lookups.

Think of it like tracking rivers: we're only starting new 'expeditions' from headwaters (numbers with no predecessors). Each river's length gets counted exactly once because we don't start from midstream. The set lookup is O(1) like checking if a key exists in a dictionary - instant regardless of size.

Common Mistakes

Not using a set and using O(n) list lookups, making it O(n²)
Starting from every number instead of only 'starts' - leads to double counting and TLE
Forgetting to handle the empty array case - returns 0
Using sorted() which makes it O(n log n) - violates the O(n) requirement

Edge Cases

Empty array [] should return 0
Single element [1] should return 1
All same elements [5,5,5] should return 1
Already consecutive [1,2,3,4,5] should return 5
Unsorted with gaps [100, 4, 200, 1, 3, 2] should return 4 (1,2,3,4)
Negative numbers and zero [-1, 0, 1] should return 3
Duplicate numbers [1,2,2,3,3,4] should still work - set removes dups

Connections

Contains Duplicate (#217) - uses same hash set pattern for O(1) lookups
Longest Repeating Character Replacement (#424) - shares the 'only process starts' optimization concept
Number of Islands (#200) - uses set for tracking visited nodes in grid exploration

Product of Array Except Self #238

Prefix/Suffix Product (Two-Pass Accumulation)

Intuition

Think of this like a river flowing through checkpoints. At each position i, you need to know the total 'flow' (product) from both upstream (all elements left of i) AND downstream (all elements right of i), but NOT the flow through position i itself. It's like calculating what the product would be if you magically 'removed' each element one at a time. The key insight: multiplication is distributive, so we can break the problem into two independent sweeps - left-to-right for prefix products, then right-to-left for suffix products - and multiply them together at each position.

Why This Pattern?

The answer at each position depends on ALL other positions, not just neighbors. This screams prefix/suffix pattern because: (1) we can precompute a running product as we sweep, (2) the left contributions and right contributions are independent and can be combined via multiplication, (3) we need O(n) with O(1) extra space - exactly what two linear passes achieve.

Solution

def productExceptSelf(nums):
    n = len(nums)
    answer = [1] * n
    
    # First pass: compute prefix products (all elements to the LEFT of i)
    # At each step, 'prefix' holds product of nums[0] through nums[i-1]
    prefix = 1
    for i in range(n):
        answer[i] = prefix  # Store product of everything LEFT of i
        prefix *= nums[i]   # Update prefix to include current element
    
    # Second pass: compute suffix products (all elements to the RIGHT of i)
    # At each step, 'suffix' holds product of nums[i+1] through nums[n-1]
    suffix = 1
    for i in range(n - 1, -1, -1):
        answer[i] *= suffix  # Multiply left product by right product
        suffix *= nums[i]   # Update suffix to include current element
    
    return answer

Complexity

Time: O(n) - We make exactly two passes through the array, each doing constant work per element.
Space: O(1) extra space - We only use a few variables (prefix, suffix) regardless of input size. The output array doesn't count toward space complexity since it's required by the problem.

We can't do better than O(n) because every element (except itself) must contribute to every output position - that's n×n total multiplications conceptually. The two-pass approach is optimal because we must touch each element at least twice (once from left, once from right). The O(1) extra space comes from reusing the answer array as our accumulator - we never need to store intermediate products for all indices simultaneously.

Common Mistakes

Trying to use division (product of all / nums[i]) - fails with zeros and causes overflow
Forgetting to reinitialize suffix to 1 in the second pass
Off-by-one errors in the loop ranges - especially the backward loop
Not handling the case where nums[i] = 0 (division approach would break)

Edge Cases

> Array with one element - returns [that element] since there's nothing to multiply
Arrays containing zeros - works naturally, zeros 'kill' products as expected
Arrays with all zeros except one element - correctly produces all zeros except at the non-zero position
Negative numbers - handled correctly since we're doing multiplication, not comparison

Connections

This is the multiplication equivalent of 'Prefix Sum' problems (like subarray sum equals k) - same two-pass technique but with multiplication instead of addition
Similar to 'Maximum Product Subarray' (both use prefix/suffix thinking)
Relates to 'Trapping Rain Water' (another two-pointer approach, but there it's about heights)
The technique of 'computing left and right contributions separately' appears in many DP and array problems

Top K Frequent Elements #347

Bucket Sort by Frequency / Counting Sort Adaptation

Intuition

Think of this like finding the most popular items in a store inventory. You have a list of what sold, and you want to know which k products were purchased most often. The intuitive approach: (1) First, count how many times each product appeared — this is your frequency map. (2) Then, find the top k by frequency. Here's the key insight: the frequency can't exceed the array length (if every element is the same, frequency = n). So we can use the frequency as a 'bucket index' — bucket[i] holds all elements that appeared exactly i times. This is like organizing books on a shelf by how many times you've read them, then grabbing from the most-read shelf first.

Why This Pattern?

The problem has a special property: frequency is bounded between 1 and n (array length). This is perfect for bucket sort because we can directly index into an array using frequency values. Instead of comparing elements, we use their frequency as a direct address — O(1) insertion into buckets rather than O(log n) heap operations. It's the natural choice when: (1) we know the range of the 'key' we're sorting by, and (2) that range is small relative to the data size.

Solution

def topKFrequent(nums: list[int], k: int) -> list[int]:
    # Step 1: Build frequency map - count how often each element appears
    # This is like tallying votes in an election
    freq = {}
    for num in nums:
        freq[num] = freq.get(num, 0) + 1
    
    # Step 2: Create buckets where bucket[i] = all elements with frequency i
    # We need n+1 buckets because frequency can range from 0 to n
    n = len(nums)
    buckets = [[] for _ in range(n + 1)]
    
    # Place each element in its frequency bucket
    for num, count in freq.items():
        buckets[count].append(num)
    
    # Step 3: Collect top k elements from highest frequency bucket down
    # This is like grabbing books from the 'most-read' shelf first
    result = []
    for i in range(n, 0, -1):  # go from high frequency to low
        result.extend(buckets[i])  # add all elements with this frequency
        if len(result) >= k:  # once we have k elements, we're done
            break
    
    return result[:k]

Complexity

Time: O(n)
Space: O(n)

We make three linear passes: (1) building the frequency map visits each of n elements once, O(n); (2) populating buckets visits each unique element once, O(n); (3) collecting results visits at most n buckets, O(n). Total is O(n). Space is O(n) for the frequency map plus O(n) for the buckets — we need to store all elements somewhere. This is optimal because we must at least examine every element to determine frequency.

Common Mistakes

Confusing bucket index with array index — bucket[i] stores elements with frequency i, not elements at position i
Forgetting to allocate n+1 buckets (frequency can be 0 to n)
Using a regular sort O(n log n) when bucket sort O(n) is possible
Not handling the case where multiple elements share the same frequency — extend() adds all of them, which is correct

Edge Cases

Empty array — should return empty list (k=0 case)
All elements are the same — all go into bucket[n], one element returned
All elements are unique — each goes to bucket[1], need to take k from there
k equals number of unique elements — return all unique elements
k = 1 — just need the single most frequent element

Connections

Sort Characters by Frequency (LC 451) — same pattern of bucket sorting by frequency, but with characters
Kth Largest Element (LC 215) — also finds 'top k' but sorts by value not frequency, uses heap/quickselect
Least K Frequent (same idea, just flip the direction)
Frequency Stack (LC 895) — uses frequency as a structural key, similar bucket-based thinking

Two Sum #1

Hash table complement lookup - for each element, compute what you need (target - current) and check if it's already been seen.

Intuition

Imagine you're balancing a scale. You have a target weight (the sum you need), and as you place each number on the scale, you're instantly checking if its 'counterpart' is already there. If you see a 5 and need sum 9, you immediately ask 'do I already have a 4?' If yes, done. If not, save the 5 for later. It's like a matchmaking service - you're constantly asking 'is my complement already here?' as you scan through.

Why This Pattern?

The problem requires finding ANY pair that sums to target. Checking all pairs is O(n²). By using a hash table, we can check 'has this complement been seen?' in O(1) time, reducing overall complexity to O(n). This works because we only need ONE valid pair, not all pairs, so we can stop as soon as we find a match.

Solution

def twoSum(nums, target):
    # Dictionary stores: number value -> its index
    # We need the index, not just the value, to return the answer
    seen = {}
    
    for i, num in enumerate(nums):
        complement = target - num  # What number would pair with num?
        
        # If we've seen the complement, we found our pair!
        if complement in seen:
            return [seen[complement], i]
        
        # Otherwise, remember this number and its index for future lookups
        seen[num] = i
    
    # Problem guarantees a solution exists
    return []

Complexity

Time: O(n) - We traverse the array once. For each element, computing the complement is O(1) and hash table lookup is O(1) average case.
Space: O(n) - In the worst case (no solution found until the very end, which doesn't happen here since solution is guaranteed), we store all n elements in the hash table.

You can't do better than O(n) because in the worst case, you might need to look at every element before finding the pair. The hash table trades space for speed - by storing what we've seen, we avoid the nested loops that would check every possible pair (which would be n² comparisons).

Common Mistakes

Returning the values instead of indices - the problem asks for indices
Using list.index() inside a loop (creates O(n²) behavior)
Forgetting to update the hash table with the current element after checking
Modifying the input array when you don't need to

Edge Cases

Multiple valid pairs exist - we return the first found, which is fine
Duplicate values like [3,3] with target 6 - need to ensure we don't return the same index twice
Empty array or single element - problem guarantees valid input with at least 2 elements
Negative numbers work fine with the same logic

Connections

3Sum uses this as its core subroutine - fix one element, solve two-sum for the rest
Valid Anagram uses the same hash-table-counting pattern
Two Sum II uses two pointers on a sorted array - this is the unsorted version
Climbing Stairs uses dynamic programming which also optimizes from O(2^n) to O(n) by storing intermediate results

Valid Anagram #242

Frequency Count / Hash Map

Intuition

Think of each string as a 'chemical composition' - you're checking if two substances have the exact same atoms in the exact same quantities, just arranged differently. The ORDER doesn't matter, only the COUNT. This is like a fingerprint: if two strings have identical letter fingerprints, they're anagrams. You could physically sort letters (like sorting cards), but there's a faster way: just count what you have.

Why This Pattern?

An anagram problem is fundamentally about comparing multisets of characters. The ONLY structural property that matters is how many of each character exists. A hash map naturally captures 'how many of X' for any X, making it the perfect tool. We're checking if two sequences are equivalent under permutation, which is exactly what frequency counting detects.

Solution

from collections import Counter

class Solution:
    def isAnagram(self, s: str, t: str) -> bool:
        # Quick win: different lengths can't be anagrams
        if len(s) != len(t):
            return False
        
        # Count characters in first string, subtract for second
        # If they're anagrams, every count will return to zero
        count = Counter(s)
        
        for char in t:
            count[char] -= 1
            if count[char] < 0:  # More of this char in t than s
                return False
        
        return True

# Alternative one-liner (same logic, less explicit):
# return Counter(s) == Counter(t)

Complexity

Time: O(n)
Space: O(k) where k = 26 (alphabet size)

Common Mistakes

Checking length AFTER counting - wastes time on obviously different strings
Using == to compare strings directly - order doesn't matter!
Forgetting that 'A' and 'a' are different in this problem (case-sensitive)
Creating two separate Counters and comparing - works but uses 2x space unnecessarily

Edge Cases

Empty strings (both '' - returns True)
Strings of length 1 ('a' vs 'b' returns False)
Strings with repeated characters ('aaa' vs 'aaaa' - correctly returns False)
Non-alphabetic characters are treated as valid characters

Connections

Group Anagrams (#49) - uses the SAME frequency map insight, just groups by it
Intersection of Two Arrays II (#350) - identical frequency counting pattern
Longest Substring Without Repeating Characters (#3) - uses sliding window + counter but different strategy
Ransom Note (#383) - essentially checking if one string has sufficient 'counts' from another

Valid Sudoku #36

Multi-dimensional Uniqueness Validation with Hash Sets

Intuition

Think of Sudoku validation as checking three independent 'conservation laws' simultaneously. Each digit 1-9 is like a unique 'energy packet' that can only exist once per row, once per column, and once per 3x3 subgrid. You're essentially verifying that no digit is 'overlapping' with itself in any of these three dimensions. Imagine three transparent overlays on the board - one showing rows, one showing columns, one showing boxes - a digit appearing at position (i,j) must not already appear in that row's overlay, that column's overlay, or that box's overlay. The moment you spot a duplicate, the board is invalid.

Why This Pattern?

Sudoku has three separate constraint domains (rows, columns, subgrids) that must all be satisfied independently. Each domain requires uniqueness checking, which is exactly what hash sets excel at - O(1) lookup to check if an element was already seen. The problem's structure (fixed 9x9 grid with exactly 3 constraints per cell) makes sets the natural choice over more complex data structures.

Solution

def isValidSudoku(board: List[List[str]]) -> bool:
    # Three sets of sets: one for each row, column, and 3x3 subgrid
    row_sets = [set() for _ in range(9)]
    col_sets = [set() for _ in range(9)]
    box_sets = [set() for _ in range(9)]
    
    for i in range(9):
        for j in range(9):
            num = board[i][j]
            
            # Skip empty cells - they impose no constraint
            if num == '.':
                continue
            
            # Calculate which 3x3 box we're in
            # Box index formula: row_group * 3 + col_group
            box_index = (i // 3) * 3 + (j // 3)
            
            # Check ALL three constraints simultaneously
            # If num exists in ANY of the three sets, we have a duplicate
            if (num in row_sets[i] or 
                num in col_sets[j] or 
                num in box_sets[box_index]):
                return False
            
            # No duplicate found - add to all three sets
            row_sets[i].add(num)
            col_sets[j].add(num)
            box_sets[box_index].add(num)
    
    return True

Complexity

Time: O(81) = O(1) - The board is always 9x9, so we iterate exactly 81 cells. Each cell does O(1) set operations (3 lookups + 3 insertions at most). Since the input size is bounded by a constant, this is technically O(1) space and time.
Space: O(81) = O(1) - We maintain 27 sets (9 rows + 9 cols + 9 boxes), each holding at most 9 elements. Total memory is bounded by a constant regardless of input.

We can't do better than O(81) because we must examine every non-empty cell to verify it's valid - a single invalid cell could be anywhere. The space is also bounded because we're only tracking uniqueness in 27 fixed domains (9 rows, 9 columns, 9 boxes), each capped at 9 unique digits.

Common Mistakes

Forgetting to skip '.' cells - they should never be added to sets
Using (i*3 + j) instead of (i//3)*3 + (j//3) for box index - integer division is crucial
Only checking rows and columns but forgetting subgrids, or vice versa
Confusing 0-indexed vs 1-indexed thinking - board uses 0-8, digits are 1-9
Not realizing the board can contain invalid characters beyond digits 1-9 and '.' (the solution implicitly handles this by never matching them)

Edge Cases

Empty board - all '.' - should return True
Single digit anywhere - should return True
All same digit in one row - should return False
Valid complete Sudoku - should return True
Digit appearing in same row AND column (but different boxes) - should return False
Digit appearing in same column AND box (but different rows) - should return False

Connections

Set Matrix Zeroes (Neetcode 150) - uses similar row/column tracking with sets
Detect Duplicate in String Array - same uniqueness-in-sets principle
Longest Consecutive Sequence - uses set for O(1) lookup, similar pattern of hash-based validation
Contains Duplicate - fundamental uniqueness check that underlies this problem

Two Pointers (5)

3Sum #15

Two Pointers with Sorting (specifically the 'sort + two pointers' pattern for 2-sum)

Intuition

Think of this like finding three weights that balance perfectly on a see-saw - they need to sum to zero. The key insight: once you sort the array, it becomes ordered like numbers on a number line. Pick one number as an 'anchor' (like fixing one weight), and now you're looking for two other numbers that sum to the negative of your anchor. This reduces to the classic 2Sum problem on a sorted array. The two pointers work like two people walking toward each other from opposite ends of a hallway - they'll either meet at the right spot (sum = target) or cross paths (and you know to move one direction). Sorting is essential because it gives you a monotonic sequence where you can reliably predict which direction to move when the sum is too big or too small.

Why This Pattern?

Sorting transforms an unordered search into a directed walk. After fixing one element, the remaining two-pointer search works because: (1) the array is monotonic, so if sum > target you MUST decrease it by moving right pointer left, (2) if sum < target you MUST increase it by moving left pointer right. This greedy movement guarantees you never miss a valid pair - it's like gradient descent on a sorted landscape. The duplicate handling during iteration prevents revisiting equivalent states.

Solution

def threeSum(nums):
    res = []
    nums.sort()  # Sort first - enables the two-pointer trick
    
    for i in range(len(nums)):
        # Skip duplicate first elements to avoid repeated triplets
        if i > 0 and nums[i] == nums[i - 1]:
            continue
        
        # Two-pointer search for the remaining two numbers
        left, right = i + 1, len(nums) - 1
        target = -nums[i]  # We need two numbers that sum to this
        
        while left < right:
            current_sum = nums[left] + nums[right]
            
            if current_sum == target:
                # Found a valid triplet!
                res.append([nums[i], nums[left], nums[right]])
                
                # Skip duplicates for left and right to avoid repeats
                while left < right and nums[left] == nums[left + 1]:
                    left += 1
                while left < right and nums[right] == nums[right - 1]:
                    right -= 1
                
                # Move both pointers after finding a match
                left += 1
                right -= 1
                
            elif current_sum < target:
                # Sum too small, need larger value - move left pointer right
                left += 1
            else:
                # Sum too large, need smaller value - move right pointer left
                right -= 1
    
    return res

Complexity

Time: O(n²)
Space: O(1) auxiliary space (not counting output), or O(n) if counting the sort's extra space depending on implementation

Sorting takes O(n log n). The outer loop runs n times, and for each iteration, the two-pointer search potentially traverses the remaining n-i elements. This is like checking every possible triplet but doing it efficiently by using the sorted property to skip invalid branches - you don't need to check all n³ combinations because the sorted order lets you systematically eliminate possibilities.

Common Mistakes

Not sorting before applying two pointers - the algorithm assumes sorted order
Forgetting to skip duplicates, which causes repeated triplets in the output
Comparing nums[i] to nums[i+1] instead of nums[i-1] when checking for first-element duplicates
Not adjusting both pointers after finding a valid triplet - you must move past all duplicates
Using the wrong target value - remember it's -nums[i], not nums[i]

Edge Cases

Empty array or array with fewer than 3 elements returns empty list
All zeros: [-1,0,1,2,-1,-4] contains duplicate -1s that must be skipped correctly
All same positive or negative numbers: no valid triplet exists
Array with exactly one valid triplet: must not produce duplicates
Mixed positives and negatives with zeros: [0,0,0,0] should return [[0,0,0]]

Connections

2Sum (Problem 1): Uses the exact same two-pointer insight but simpler - just one pair instead of triplets
4Sum (Problem 18): Same pattern extended with another nested loop - fix two elements, use two pointers for the other two
3Sum Closest (Problem 16): Similar structure but asks for sum closest to target instead of exact match
Valid Triangle Number (Problem 611): Uses sorted array + two pointers but counts triplets satisfying a different condition (sum of two sides > third)

Container With Most Water #11

Two Pointers (opposite ends, moving toward center)

Intuition

Imagine two vertical cliffs forming a valley. The water it can hold is limited by BOTH the distance between them (width) AND the shorter cliff (height) - water spills over the shorter one. This is like finding two people standing apart who are both tall. The key insight: if you're at two positions and the shorter one limits you, moving the taller one can ONLY make things worse (you reduce width and the shorter one is still the bottleneck). But moving the shorter one MIGHT find a taller partner - that's your only hope for improvement. It's like if you're playing 'height partner' with someone: when you're paired with someone shorter than you, you should go find a new partner - but if you're the shorter one, you stay put and let them find someone taller.

Why This Pattern?

The array represents positions in a line where both location and value matter. Starting from the widest possible container (ends), we can only improve by moving the pointer at the shorter height - because moving the taller pointer can never increase area (the shorter height remains the bottleneck while width decreases). This creates a deterministic search path that explores all potentially optimal solutions.

Solution

def maxArea(height):
    """
    Two pointers starting at opposite ends. At each step:
    1. Calculate area with current pointers
    2. Move the pointer at the shorter height (potential to find taller walls)
    3. Stop when pointers meet
    """
    left = 0
    right = len(height) - 1
    max_water = 0
    
    while left < right:
        # Width is distance between indices
        width = right - left
        # Height is limited by shorter wall (water spills over)
        current_height = min(height[left], height[right])
        
        # Update maximum area found
        max_water = max(max_water, width * current_height)
        
        # Move pointer at shorter height - this is THE key insight:
        # Moving the taller one CANNOT help (shorter still limits)
        # Moving the shorter one MIGHT help (might find taller wall)
        if height[left] < height[right]:
            left += 1
        else:
            right -= 1
    
    return max_water

Complexity

Time: O(n) - single pass through array
Space: O(1) - only using two pointers and a few variables

We visit each index at most once (left moves right, right moves left, they meet in middle). Even though we might skip some pairs, we don't need to check all n² pairs explicitly because the greedy pointer movement guarantees we explore only the candidates that could potentially be optimal - we're not enumerating, we're intelligently searching.

Common Mistakes

Moving the WRONG pointer - some try to move both or move based on which gives bigger area (that doesn't work)
Forgetting to update max_water before moving pointers
Using max(height[left], height[right]) instead of min() - the container height is limited by the shorter wall

Edge Cases

Array with 2 elements - just calculate that one area
All same heights - any pair works, but algorithm still finds it
Strictly increasing/decreasing heights - algorithm handles correctly
Empty or single element array - would return 0 (not valid input per problem)

Connections

Trapping Rain Water (#42) - uses similar two-pointer approach but tracks water level
3Sum (#15) - another two-pointer problem but pairs sum to target, not optimizes area
Valid Palindrome (#125) - two pointers from opposite ends, different logic but same structure

Trapping Rain Water #42

Two Pointers (greedy)

Intuition

Imagine this as a valley system where water accumulates. The key insight: at any position, the water level is determined by the SHORTER of the two 'walls' that could contain it. Think of it like pouring water into a container - the water spills over the lower side. So water at position i = min(max_height_to_left, max_height_to_right) - height[i]. The two-pointer approach exploits a clever observation: if we know the left wall is shorter than the right wall, we only need to worry about the left side because the right side (being taller) can't limit the water on the left. We process the smaller side first, knowing the answer is constrained by that smaller boundary.

Why This Pattern?

The problem has a monotonic property: the limiting factor for water at any position is always the minimum of the two maximum heights on either side. By processing from both ends and always moving the pointer with the smaller height boundary, we maintain a invariant that lets us compute water locally without needing to scan the entire remaining array. We're greedily resolving the side we can be certain about.

Solution

def trap(height):
    if not height:
        return 0
    
    left, right = 0, len(height) - 1
    left_max, right_max = 0, 0
    water = 0
    
    while left < right:
        # Process the shorter side - we can resolve it with certainty
        if height[left] < height[right]:
            # Left side is the limiting factor
            if height[left] >= left_max:
                left_max = height[left]  # This becomes new boundary
            else:
                water += left_max - height[left]  # Water trapped here
            left += 1
        else:
            # Right side is the limiting factor (or equal)
            if height[right] >= right_max:
                right_max = height[right]
            else:
                water += right_max - height[right]
            right -= 1
    
    return water

Complexity

Time: O(n)
Space: O(1)

We traverse the array exactly once with two pointers moving toward each other. Each element is visited at most once. We only store a few integer variables regardless of input size. Can't do better than O(n) because we must examine each bar to know the answer - every position potentially holds water.

Common Mistakes

Using height[left] <= height[right] instead of < for the condition - this causes issues with equal heights
Forgetting to update left_max/right_max before calculating water - you need the boundary height, not current height
Confusing the two-pointer solution with the DP approach - they look similar but DP uses extra arrays
Not handling empty or single-element arrays - no water can be trapped

Edge Cases

Empty array returns 0
Single element returns 0
Strictly increasing array returns 0 (no valleys)
Strictly decreasing array returns 0
All equal elements returns 0 (flat surface, water flows away)
Array with one tall peak in middle (like [5,4,1,2]) - water only on sides of peak

Connections

Container With Most Water (#11) - also uses two pointers, but optimizes area instead of trapping water
Trapping Rain Water II (#407) - the 2D version, uses a priority queue (MinHeap) as the natural extension
The Two Pointer pattern here is similar to Sorted Squares (#977) where we process from both ends

Two Sum II #167

Two Pointers (opposite direction)

Intuition

Think of this like balancing a scale. You have a sorted list of weights from lightest to heaviest, and you want two weights that exactly equal a target weight. Start with the lightest (leftmost) and heaviest (rightmost). If they sum too heavy, you KNOW the heaviest is too heavy — move it down. If they sum too light, you KNOW the lightest is too light — move it up. The sorted property guarantees this always works because adjusting one element in a direction has a predictable effect on the sum.

Why This Pattern?

The sorted input creates a monotonic relationship: if current sum > target, decreasing the larger element can ONLY help (never hurt). If current sum < target, increasing the smaller element can ONLY help. This lets us explore all possible pairs in exactly O(n) by moving each pointer at most n times — no backtracking needed.

Solution

def twoSum(numbers, target):
    left = 0
    right = len(numbers) - 1
    
    while left < right:
        current_sum = numbers[left] + numbers[right]
        
        if current_sum == target:
            # 1-indexed return as specified in problem
            return [left + 1, right + 1]
        
        if current_sum > target:
            # Sum too large: decrease the larger element
            # Since sorted, numbers[right] is the larger one
            right -= 1
        else:
            # Sum too small: increase the smaller element
            left += 1
    
    return []  # No solution found (won't happen for valid input)

Complexity

Time: O(n)
Space: O(1)

Each iteration eliminates at least one possibility — either moving left or right pointer. Since pointers only move toward each other and never backtrack, we visit at most n pairs total. Every comparison and arithmetic operation is O(1).

Common Mistakes

Forgetting 1-indexed requirement and returning [left, right] instead of [left+1, right+1]
Using binary search instead of two pointers (would work but is O(n log n) and overcomplicates)
Confusing which pointer to move — remember: if sum is too big, we need to DECREASE it, so move right pointer left (to a smaller large number)
Not checking if left < right in while condition (could cause infinite loop)

Edge Cases

Array with exactly 2 elements — returns [1, 2] immediately if they sum to target
Target is unreachable — returns empty list
Duplicate values in array — algorithm handles correctly since we check both pointers
Target equals first element + last element — correctly found at extremes

Connections

3Sum (LeetCode 15) — uses this same two-pointer technique nested inside a loop over the first element
3Sum Closest (LeetCode 16) — same pattern with a 'closest' comparison instead of exact match
Two Sum Less Than K (LeetCode 1099) — same structure but checking for sum < K instead of exact match
Valid Palindrome II (LeetCode 680) — uses similar 'opposite pointer' technique but for character comparison

Valid Palindrome #125

Two Pointers (opposite ends moving inward)

Intuition

Think of a palindrome as a system in equilibrium — like a perfectly balanced scale. The first character must equal the last, the second must equal the second-to-last, and so on. If any pair doesn't match, the system is out of balance. Using two pointers is like having observers at both ends of a seesaw, walking toward the center. They meet when they've checked all necessary pairs (or discover a mismatch along the way).

Why This Pattern?

Palindromes have symmetric structure — position i from the left must match position n-1-i from the right. We only need to compare ceil(n/2) pairs to determine if it's a palindrome. Two pointers let us check both sides in a single pass without extra space, like mirrors reflecting each other.

Solution

def isPalindrome(s: str) -> bool:
    left, right = 0, len(s) - 1
    
    while left < right:
        # Skip non-alphanumeric characters from left
        while left < right and not s[left].isalnum():
            left += 1
        # Skip non-alphanumeric characters from right  
        while left < right and not s[right].isalnum():
            right -= 1
        
        # Compare characters (case-insensitive)
        if s[left].lower() != s[right].lower():
            return False
        
        # Move pointers toward center
        left += 1
        right -= 1
    
    return True

Complexity

Time: O(n)
Space: O(1)

Each character in the string is visited at most once as the left and right pointers move toward each other. Even in the worst case (all valid alphanumerics), we touch each character exactly once. We only store two integer pointers regardless of input size.

Common Mistakes

Forgetting to convert to same case (comparing 'A' vs 'a')
Not handling non-alphanumeric characters — they should be skipped entirely
Using string reversal which costs O(n) extra space
Not handling the while loop bounds correctly (can cause infinite loops or index errors)

Edge Cases

Empty string — should return True
Single character — always a palindrome
Only non-alphanumeric characters — skip all, return True
String like 'race a car' — r!=r at first glance, but spaces are ignored, so actually r!=r (fails at i=0, j=8)
Mixed case 'Aba' — must normalize to lowercase

Connections

Valid Palindrome II (#680) — same problem but allows one deletion
Palindrome Linked List (#234) — uses similar two-pointer logic on a linked structure
Longest Palindromic Substring (#5) — uses expanding centers (a variation of palindrome checking)
Reverse String (#344) — two pointers from ends, different goal but same motion

Sliding Window (6)

Best Time to Buy and Sell Stock #121

Single Pass Scan / Implicit Sliding Window

Intuition

Think of this like a hiker walking through a mountain range who can only look forward in time. They want to find the lowest valley BEFORE a peak — they can't time travel to buy at the lowest point overall if it comes after their selling point. As you walk through each day, you track the lowest price seen so far (that's your best buying opportunity up to this point). At every peak, you ask: 'How much would I gain if I sold here, having bought at my lowest point so far?' The answer that maximizes this gain is your answer. The key insight: you don't need to try all pairs — you only need to remember the minimum price encountered before the current day.

Why This Pattern?

This is a degenerate sliding window where we're tracking a single value (the minimum) as we 'slide' forward through time. We don't need an explicit window data structure because we're just maintaining one running minimum. The problem has optimal substructure: the best profit up to day i depends only on the minimum price seen up to day i-1.

Solution

def maxProfit(prices):
    # Initialize to track:
    # 1. The minimum price seen so far (best day to buy)
    # 2. The maximum profit achievable (best day to sell)
    min_price = float('inf')
    max_profit = 0
    
    for price in prices:
        # Update our best buying opportunity if current price is lower
        # This is like remembering the lowest valley we've passed through
        if price < min_price:
            min_price = price
        
        # Calculate profit if we sell today having bought at min_price
        # We use max() to ensure we only track positive gains
        profit = price - min_price
        max_profit = max(max_profit, profit)
    
    return max_profit

Complexity

Time: O(n) - We make exactly one pass through the prices array. Each iteration does O(1) work. We can't do better because we must examine each price at least once to know the minimum so far.
Space: O(1) - Only two variables are used regardless of input size. No arrays or data structures that grow with input.

Think of it as scanning a conveyor belt once. You don't need to go back (that's O(n²) with nested loops), you just need to remember the lowest price encountered so far. Memory is constant because you're only storing two numbers, not the entire price history.

Common Mistakes

Forgetting to initialize min_price to infinity and getting wrong results when prices[0] is large
Not handling the case where no profit is possible (prices always decreasing) — the algorithm correctly returns 0
Trying to find both minimum and maximum separately then subtracting — this fails because max might come BEFORE min
Using a two-pointer window approach when there's no valid window to maintain

Edge Cases

Single element array: returns 0 (can't make any transaction)
Strictly decreasing prices: returns 0 (never profitable to buy)
Prices like [1,1,1,1]: returns 0 (no variation)
Two elements [5,3]: returns 0 (decreasing, can't profit)
Two elements [3,5]: returns 2 (buy day 0, sell day 1)

Connections

LeetCode 122 (Best Time II): Same problem but allows multiple transactions — uses greedy approach
LeetCode 123 (Best Time III): Hard version allowing at most 2 transactions — uses DP
LeetCode 309 (Best Time with Cooldown): Adds cooldown between transactions
Same prefix minimum insight as 'Maximum Subarray' (Kadane's algorithm) and many 'running minimum/maximum' problems

Longest Repeating Character Replacement #424

Sliding Window (Longest Substring with Condition)

Intuition

Think of this like a 'squeeze' or purification problem. You have a window of characters and a budget of k 'impurities' (characters that don't match the majority). Your goal is to find the largest window you can 'clean' to make all one character by replacing at most k impurities. The key insight: in any valid window, the number of replacements needed = window_size - count_of_most_frequent_char. If this ≤ k, the window is valid. It's like trying to maintain equilibrium where your 'energy' (k) is used to push the system toward uniformity.

Why This Pattern?

We need the longest contiguous substring satisfying a condition. The condition involves character frequencies, which we can track incrementally as we expand/shrink the window. The key structural property: we can always grow right, and only need to shrink left when invalid, giving O(n) solution. This is the 'longest valid window' variant of sliding window.

Solution

def characterReplacement(s: str, k: int) -> int:
    # Track frequency of each character in current window
    char_count = [0] * 26
    max_count = 0  # Count of most frequent char in current window
    left = 0
    result = 0
    
    for right in range(len(s)):
        # Add current char to window and update its count
        idx = ord(s[right]) - ord('A')
        char_count[idx] += 1
        
        # Update the most frequent char count in window
        max_count = max(max_count, char_count[idx])
        
        # Window is invalid if (size - max_count) > k
        # This means we need more replacements than we have budget for
        while (right - left + 1) - max_count > k:
            # Shrink window from left
            char_count[ord(s[left]) - ord('A')] -= 1
            left += 1
        
        # Window is now valid - update result
        result = max(result, right - left + 1)
    
    return result

Complexity

Time: O(n) where n = len(s)
Space: O(1) - fixed 26 character array

Time is O(n) because each character is visited at most twice - once when right expands past it, once when left shrinks past it. We never revisit. Space is O(1) because we only store counts for 26 letters regardless of input size - doesn't grow with n.

Common Mistakes

Not updating max_count when window shrinks - max_count can actually stay the same or even become invalid temporarily, but we handle this by NOT recalculating it on each iteration, just tracking the running maximum
Using while loop inside for loop incorrectly - the while condition must match the invalidity condition exactly
Confusing the condition - some think it's max_count + k, but it's actually (right-left+1) - max_count <= k

Edge Cases

k=0 means no replacements allowed - find longest run of same character
k >= len(s) means you can make entire string uniform
Single character string returns 1
All same characters returns len(s) regardless of k
k=1 with string like 'AABABBA' - the window can expand past the most frequent char count temporarily

Connections

Similar to 'Longest Substring with At Most K Distinct Characters' (NEETCODE 145) - same sliding window template but with frequency constraint instead of distinct count
Related to 'Longest Substring Without Repeating Characters' (NEETCODE 110) - different constraint but same window expansion/contraction pattern
The core insight (window_size - max_frequent <= k) is reused in 'Minimum Window Substring' (NEETCODE 76) but inverted (finding smallest window containing all characters)

Longest Substring Without Repeating Characters #3

Sliding Window with HashMap - Two Pointer Technique

Intuition

Imagine you're maintaining a 'clean zone' on a conveyor belt - you expand your window to the right, adding new characters. But the moment you spot a repeat (a 'contaminant'), you have to throw away everything from the left up to and including that character's last occurrence. It's like a sliding observation window that auto-adjusts: expand when things are新鲜 (fresh), contract when you hit a duplicate. The beautiful part is you never need to go backwards - both pointers only march forward, making this O(n) total.

Why This Pattern?

This fits sliding window because: (1) we're seeking a contiguous substring, (2) the optimal answer is some window we can represent with left/right boundaries, (3) the 'validity' constraint (no repeats) can be checked incrementally as we move, and (4) both pointers only advance forward - we never need to revisit positions. The HashMap lets us jump the left pointer directly to the optimal position (just after the last occurrence of a duplicate) rather than sliding one step at a time.

Solution

def lengthOfLongestSubstring(s):
    char_last_seen = {}  # Map character -> last index where it appeared
    max_len = 0
    left = 0  # Left boundary of our 'clean' window
    
    for right in range(len(s)):
        char = s[right]
        
        # If char exists in our current window, we need to shrink from left
        # The new left becomes one position AFTER its last occurrence
        if char in char_last_seen and char_last_seen[char] >= left:
            left = char_last_seen[char] + 1
        
        # Record/update this character's latest position
        char_last_seen[char] = right
        
        # Calculate window size and update max if larger
        max_len = max(max_len, right - left + 1)
    
    return max_len

Complexity

Time: O(n)
Space: O(min(n, 26)) for lowercase, or O(min(n, 128)) for ASCII - bounded by character set size, not input size

We visit each character exactly once with the right pointer (n steps). The left pointer can also move at most n times total (each character 'falls off' the left edge at most once). So total pointer movements = 2n = O(n). HashMap operations are O(1) average. Space is bounded by how many distinct characters can fit in the window - in the worst case (all unique chars), we store n entries, but typically it's bounded by the character set (26 for lowercase, 128 for ASCII, ~1M for Unicode).

Common Mistakes

Forgetting to update char_last_seen after moving left - the character's position should always reflect its latest index
Using 'if char in char_last_seen' without checking if that index is >= left (old occurrences outside the window shouldn't affect us)
Not handling empty string as initial case - max_len starts at 0, which works but edge cases like s='' need to return 0
Confusing substring vs subsequence - this problem requires contiguous characters, so pattern matters

Edge Cases

Empty string s='' returns 0
Single character 'a' returns 1
All unique characters 'abcde' returns 5
All same characters 'aaaaa' returns 1 (window always size 1)
Repeated pattern 'ababab' returns 2 (alternating 'ab')
Unicode characters - using set() instead of dict keys can handle this but is slower

Connections

This is the foundational sliding window problem - most others build on this pattern. It's like '最小覆盖子串' (Minimum Window Substring) but inverted: instead of finding a window containing all needed characters, we find the longest window with NO bad characters. Also connects to '无重复字符的最长子串' logic used in '找到字符串中所有字母异位词' (Find All Anagrams in a String) where you maintain a validity window.

Minimum Window Substring #76

Sliding Window (Expand-Contract with Two Pointers)

Intuition

Think of this like a filtration or scanning problem. You have a 'target signature' (string t) — like a bouncer checking if you have all required documents, or a chef verifying all ingredients are present. You're scanning through s looking for the smallest window that contains this complete signature. The sliding window works because: expand right to gather more characters until you have a valid window (all required chars present), then contract left to find the minimum size while keeping it valid. It's like finding the tightest grip that still holds all the pieces.

Why This Pattern?

We need to examine contiguous regions in s. The validity of a window (whether it contains all chars of t) can be checked incrementally — as we add chars on the right, we can update our counts; as we remove from the left, we can update counts. Both pointers only move forward, giving O(n) time. This is the natural pattern when you're looking for optimal contiguous subsequences defined by a constraint.

Solution

def minWindow(s: str, t: str) -> str:
    if not s or not t:
        return ""
    
    # Character frequency we NEED to satisfy
    need = {}
    for c in t:
        need[c] = need.get(c, 0) + 1
    
    # Character frequency in current WINDOW
    window = {}
    
    # Two pointers define our window
    left = 0
    right = 0
    
    # 'valid' counts how many UNIQUE chars have met their required frequency
    valid = 0
    required = len(need)
    
    # Track the answer
    min_len = float('inf')
    min_start = 0
    
    while right < len(s):
        # EXPAND: Add s[right] to window
        char_in = s[right]
        if char_in in need:
            window[char_in] = window.get(char_in, 0) + 1
            # If this char now meets its required count, increment valid
            if window[char_in] == need[char_in]:
                valid += 1
        
        # CONTRACT: Shrink from left while window is still valid
        while valid == required:
            # Update answer - this is a valid window!
            if right - left + 1 < min_len:
                min_len = right - left + 1
                min_start = left
            
            # Try to shrink by moving left pointer in
            char_out = s[left]
            if char_out in need:
                # If we're about to remove a char that was meeting requirement
                if window[char_out] == need[char_out]:
                    valid -= 1  # Window will become invalid after removal
                window[char_out] -= 1
            
            left += 1
        
        right += 1
    
    return "" if min_len == float('inf') else s[min_start:min_start + min_len]

Complexity

Time: O(n + m) where n = len(s), m = len(t)
Space: O(m + k) where m = unique chars in t, k = unique chars in s that overlap with t

Common Mistakes

Not understanding when window is 'valid' — it's when EACH required char appears at least its required frequency, not just when total chars >= len(t)
Forgetting to decrement 'valid' when shrinking window removes a char that was meeting its requirement
Using len(window) == len(need) instead of tracking valid count — this fails because a window might have all char types but insufficient frequency of some
Not handling edge case when t is longer than s

Edge Cases

s = 'a', t = 'aa' → no valid window, return empty string
t has duplicate characters like 'aa' — need to track frequency, not just presence
s and t contain unicode or special characters — dictionary handles this correctly
s = t → the entire string is the answer

Connections

Similar to 'Longest Substring Without Repeating Characters' (LeetCode 3) — same sliding window framework but different validity condition
Related to 'Permutation in String' (LeetCode 567) — same 'need/window' pattern but checks for exact permutation rather than minimum window
Compare to 'Substring with Concatenation of All Words' (LeetCode 30) — more complex version with fixed-size windows
The two-pointer technique here is foundational — appears in 'Find All Anagrams in a String' (LeetCode 438)

Permutation in String #567

Sliding Window with Frequency Counter and Match Tracking

Intuition

Imagine you're looking for a chord (any permutation of s1's notes) inside a longer song (s2). The order of notes within the chord doesn't matter - you just need the exact same multiset of characters. This is like comparing histograms: does any contiguous window in s2 have the exact same character frequency distribution as s1? Think of it as a "fingerprint" matching problem - we're checking if s1's character-count fingerprint appears anywhere in s2's sliding window.

Why This Pattern?

The window size is FIXED to len(s1). Instead of rebuilding the frequency map for each window position (which would be O(n*m)), we maintain it incrementally: as the window slides, we decrement one character count and increment another. We also track how many character frequencies currently match between the window and s1 - this 'match count' lets us check the entire window in O(1) rather than comparing all 26 letters each time.

Solution

def checkInclusion(s1: str, s2: str) -> bool:
    if len(s1) > len(s2):
        return False
    
    # Build the target frequency map (the fingerprint we're looking for)
    need = {}
    for c in s1:
        need[c] = need.get(c, 0) + 1
    
    window = {}
    window_size = len(s1)
    
    # Initialize first window in s2
    for i in range(window_size):
        window[s2[i]] = window.get(s2[i], 0) + 1
    
    # Count how many characters currently have matching frequencies
    matches = 0
    for c in need:
        if window.get(c, 0) == need[c]:
            matches += 1
    
    # If all chars match at start, we found a permutation
    if matches == len(need):
        return True
    
    # Slide the window: remove leftmost char, add new rightmost char
    for i in range(window_size, len(s2)):
        # Add the new character entering the window
        new_char = s2[i]
        if new_char in need:
            # Before incrementing: if this was a matching count, we're about to break that match
            if window.get(new_char, 0) == need[new_char]:
                matches -= 1
            window[new_char] = window.get(new_char, 0) + 1
            # After incrementing: check if we now match
            if window.get(new_char, 0) == need[new_char]:
                matches += 1
        
        # Remove the old character leaving the window
        old_char = s2[i - window_size]
        if old_char in need:
            # Before decrementing: if this was a matching count, we're about to break that match
            if window.get(old_char, 0) == need[old_char]:
                matches -= 1
            window[old_char] = window.get(old_char, 0) - 1
            # After decrementing: check if we now match
            if window.get(old_char, 0) == need[old_char]:
                matches += 1
        
        # Check if all character frequencies now match
        if matches == len(need):
            return True
    
    return False

Complexity

Time: O(n) where n = len(s2). Each character in s2 is visited twice at most (once when entering, once when leaving the window), and all other operations are O(1).
Space: O(1) or O(k) where k = 26 (lowercase letters). Since the alphabet size is fixed at 26, we consider this constant space.

We process each character in s2 exactly twice (enter and exit the window), giving O(n). The frequency maps only store at most 26 entries (one per lowercase letter), so space is bounded by the alphabet size - constant. We can't do better than O(n) because we must potentially check every starting position in s2.

Common Mistakes

Forgetting to handle the case where s1 > s2 (early return False)
Not tracking matches correctly when incrementing/decrementing counts - the order matters (check BEFORE and AFTER the change)
Using == len(window) instead of == len(need) when checking for permutation match
Confusing this with substring search - the ORDER within the window doesn't matter, only the character counts

Edge Cases

s1 longer than s2 (return False immediately)
s1 or s2 empty (handled by length check)
All same characters: s1='a', s2='aaaa' - window will match when all 'a's accumulate
Only one character in s1: s1='a', s2='bca' - just need to find that single char anywhere
s2 length equals s1 length: just one window to check

Connections

Same core pattern as 'Find All Anagrams in a String' (#438) - that problem asks for ALL starting indices, this asks if ANY exists
Uses the same frequency-counting + sliding window technique as 'Minimum Window Substring' (#76) but with fixed window size instead of expanding/shrinking
The 'match tracking' optimization here is also used in 'Longest Repeating Character Replacement' (#424)

Sliding Window Maximum #239

Monotonic Deque (Monotonic Decreasing Queue)

Intuition

Think of this like a VIP section at a club. As people line up (the sliding window), you need to know who the tallest person is in the current VIP group. Here's the key insight: if a new person taller than everyone in line walks in, the shorter people can NEVER be the maximum while the new person is in the window — they're 'dominated.' It's like if LeBron James walks into a room, everyone else in that room loses their chance to be the tallest. We keep a decreasing queue of potential maximums: any smaller element to the left becomes irrelevant the moment a bigger one appears to its right.

Why This Pattern?

The structural property that makes this pattern work is that we process elements left-to-right and maintain a queue where values decrease. Any element to the left that's smaller than a new element can NEVER become the maximum — the new element dominates it for the remainder of its window lifespan. This allows O(1) max retrieval and O(1) insertions/removals. It's the only way to achieve O(n) for this problem because each element is pushed and popped at most once.

Solution

from collections import deque

def maxSlidingWindow(nums, k):
    result = []
    dq = deque()  # stores INDICES, not values - crucial for knowing when to remove
    
    for i in range(len(nums)):
        # 1. REMOVE: indices outside the current window
        # If the leftmost element is before window start, it's expired
        while dq and dq[0] < i - k + 1:
            dq.popleft()
        
        # 2. REMOVE: indices whose values are smaller than current
        # These elements are "dominated" - current nums[i] will be the max
        # for the rest of their window lifespan, so they're useless
        while dq and nums[dq[-1]] < nums[i]:
            dq.pop()
        
        # 3. ADD: current index to the deque
        dq.append(i)
        
        # 4. RECORD: once we have a full window, the front is our max
        if i >= k - 1:
            result.append(nums[dq[0]])
    
    return result

Complexity

Time: O(n)
Space: O(k)

Common Mistakes

Storing VALUES instead of INDICES in the deque — you need indices to know when elements fall out of the window
Forgetting to remove elements that are now outside the window (stale indices)
Using a regular deque with max() — that's O(n) per window instead of O(1)
Not removing dominated elements — results in wrong answer because smaller elements block bigger ones

Edge Cases

k = 1: every element is its own window, just return the array
k = len(nums): only one window, return single max
All decreasing elements: each new element dominates all previous, deque has size 1
All equal elements: no domination, deque stores all k elements

Connections

This is the foundational problem for Monotonic Deque pattern — similar logic appears in 'Minimum in Sliding Window' (just flip comparison), 'Stock Span Problem', and 'Shortest Subarray with Sum at Least K' (for prefix sum monotonicity).
Unlike 'Maximum Subarray' (Kadane's) which finds any subarray max, this constrains to fixed-size sliding windows.

Stack (7)

Car Fleet #853

Sort + Stack (or single counter)

Intuition

Imagine cars as particles flowing toward an energy minimum (the target). A faster car behind a slower car is like a particle that can get 'captured' by the slower one's gravitational well - once it catches up, they move as one unit. The key insight: sort cars by their starting distance from the target (closest first), then calculate each car's 'arrival time' (how long to reach target). If a car takes LONGER to arrive than the fleet ahead of it, it forms its own fleet - it can't catch up. If it takes less time, it joins the fleet in front. Think of it like runners on a track - someone running faster but starting farther back might still get caught by someone ahead running slower but with a head start.

Why This Pattern?

Sorting by position (descending from target) creates a natural ordering where we only need to compare each car to the one immediately ahead of it - like a linked list. We don't need a full stack because we're only tracking the 'slowest arrival time so far' (the lead fleet). Each car either joins that fleet or starts a new one, which we can count with a simple variable.

Solution

def carFleet(target: int, position: list[int], speed: list[int]) -> int:
    # Pair each car's position with its speed, then sort by position (furthest from target first)
    cars = sorted(zip(position, speed), key=lambda x: x[0])
    
    # Calculate time for each car to reach target: distance / speed
    # times[i] = (target - position[i]) / speed[i]
    times = [(target - pos) / spd for pos, spd in cars]
    
    # Count of fleets (at least one car = at least one fleet)
    fleets = 0
    
    # Track the slowest arrival time seen so far (the lead fleet)
    # Working from closest to target outward
    slowest_time = 0
    
    for t in times:
        # If this car takes LONGER than the lead fleet, it forms a NEW fleet
        # (it can't catch up to the car ahead)
        if t > slowest_time:
            fleets += 1
            slowest_time = t
        # If t <= slowest_time, this car joins the fleet ahead (catches up)
    
    return fleets

Complexity

Time: O(n log n) - dominated by the sorting step. The single pass through times is O(n).
Space: O(n) for storing the cars and times arrays (could be O(1) if we computed times on the fly, but the cleaner version uses extra space).

Sorting is required because without ordering by position, we can't determine which car is 'ahead' of which. We're essentially imposing a total order on the cars to simulate their spatial arrangement. O(n log n) is the best we can do - any algorithm must at least look at each car once, and comparison-based sorting is optimal for the unordered input.

Common Mistakes

Sorting in ascending order instead of descending (by position) - this breaks the 'closest to target first' logic
Using integer division instead of float division for time calculation
Forgetting that cars at the same position are separate fleets (they each have their own time)
Comparing position instead of time - a car with higher speed but farther back might still not catch up
Not handling the edge case of 0 or 1 cars correctly

Edge Cases

Empty car list returns 0 fleets
Single car returns 1 fleet
Cars at the same position are separate fleets (can't catch each other)
Cars that start at or beyond target (position >= target) reach instantly - time = 0

Connections

This is similar to 'Merge Intervals' in that we're grouping items based on overlap, but here the 'overlap' is in time dimension - does car A's arrival time extend beyond car B's?
The core insight (sort by position, compare times) is analogous to 'Non-overlapping Intervals' where we count groups after sorting by start time
Like 'Task Scheduler' or 'Dota2 Senate', this uses a greedy single-pass approach after sorting - we process in order and make local decisions that yield the global optimum

Daily Temperatures #739

Monotonic Decreasing Stack (Next Greater Element pattern)

Intuition

Think of this like a thermodynamic system where each day is 'seeking equilibrium' with a warmer future day. The key insight: when a warmer day arrives, it immediately 'resolves' all the previous days that were waiting for warmth. Picture a stack of people in a cold line - each person wants to know when a warmer person will show up behind them. When that warmer person arrives, they can tell all the waiting 'colder' people exactly how many days they waited. The stack maintains days that haven't found warmth yet in decreasing temperature order - this way, when warmth comes, we can resolve ALL the waiting days at once, like dominos falling.

Why This Pattern?

This is the canonical 'Next Greater Element' problem. The structural property: we need to pair each element with the FIRST future element that's larger. A monotonic stack maintains elements in decreasing order, so when we encounter a larger element, we can immediately resolve all waiting elements - each element gets pushed and popped at most once, giving O(n) time.

Solution

def dailyTemperatures(temperatures):
    n = len(temperatures)
    answer = [0] * n  # default: 0 if no warmer day exists
    stack = []  # stores indices with decreasing temperatures (waiting for warmth)
    
    for i in range(n):
        # Current day is warmer than days waiting on stack
        # 'Resolve' all those days - they found their warmer tomorrow!
        while stack and temperatures[i] > temperatures[stack[-1]]:
            prev_day = stack.pop()
            answer[prev_day] = i - prev_day  # days waited = current day - that day
        
        # This day is now waiting for a warmer future day
        stack.append(i)
    
    # Days remaining in stack have no warmer day ahead (stay 0)
    return answer

Complexity

Time: O(n)
Space: O(n) in worst case (strictly decreasing temperatures)

Each day is pushed onto the stack exactly once and popped exactly once. That's 2n operations total, making this linear. We can't do better because in the worst case (strictly decreasing), every day must wait until the end, so we must examine each element.

Common Mistakes

Using a stack of values instead of indices - you need indices to calculate days waited
Not handling the default 0 for days with no warmer temperature
Processing left-to-right instead of leveraging the monotonic property
Forgetting that the answer array initializes to 0 (no warmer day)

Edge Cases

Empty array [] - returns []
Single element [70] - returns [0]
Strictly decreasing [70, 60, 50] - all zeros, stack never pops
Already sorted increasing [50, 60, 70] - each day resolved immediately
Duplicate temperatures [70, 70, 70] - all zeros, we need STRICTLY warmer

Connections

Next Greater Element I (Neetcode 150) - exact same pattern with different input
Next Greater Element II - circular version of this problem
Daily Temperatures II (LeetCode 2288) - harder version with multi-step lookahead
Candy distribution problem - similar monotonic stack approach for finding next greater

Evaluate Reverse Polish Notation #150

Stack - Expression Evaluation

Intuition

Think of RPN like cooking from a recipe card. You read instructions in order: when you see an ingredient (number), you put it on the counter. When you see an action (operator), you grab the two most recent things on the counter, combine them, and put the result back. The stack acts as your counter - it holds intermediate results until they're consumed by the next operator. This is exactly like a factory assembly line: operators are machines that take 2 inputs and produce 1 output, which becomes available for the next machine.

Why This Pattern?

Postfix notation has an inherently 'last in, first out' structure - operators always consume the most recently seen operands. The stack perfectly models this: push numbers as they arrive, pop two when you see an operator, compute, push the result back. This transforms what would be a tree traversal problem into a simple linear scan.

Solution

def evalRPN(tokens):
    stack = []
    
    for token in tokens:
        if token in '+-*/':
            # Pop in reverse order: b is the second operand, a is the first
            b = stack.pop()
            a = stack.pop()
            
            if token == '+':
                result = a + b
            elif token == '-':
                result = a - b
            elif token == '*':
                result = a * b
            else:  # division - must truncate toward zero
                result = int(a / b)
            
            stack.append(result)
        else:
            # It's a number - convert and push onto stack
            stack.append(int(token))
    
    return stack[0]

Complexity

Time: O(n) - We process each token exactly once. Each push and pop is O(1).
Space: O(n) - In the worst case (all numbers, no operators), the stack holds n/2 operands. In the worst expression structure, we could have all numbers before any operators.

You can't do better than O(n) because you must read every token to understand the expression. Each number contributes to the final result, and each operator must combine existing values - there's no shortcut that skips processing any token.

Common Mistakes

Getting operand order wrong - remember the first operand popped becomes the SECOND operand in the operation (a - b, not b - a)
Using Python's default floor division // instead of int() - // floors toward negative infinity, but the problem requires truncation toward zero
Forgetting to convert string tokens to int before pushing to stack

Edge Cases

Division by zero won't occur per problem constraints, but be aware of integer overflow in other languages
Negative numbers in input - tokens like '-2' are single tokens, not operator + number
All operations are left-to-right regardless of operator precedence - that's the whole point of RPN!

Connections

Decode String (LeetCode 394) - also uses stack for expression evaluation, pushing 'markers' to track nesting levels
Basic Calculator II - similar idea of processing expression with operators, but infix notation requires handling operator precedence differently
Flatten Nested List Iterator - uses stack to handle hierarchical structures where items can be both containers and values

Generate Parentheses #22

Backtracking / Depth-First Search on a state tree

Intuition

Think of building parentheses like maintaining a 'height' or 'balance' in a system. Each '(' is like stepping up, each ')' is like stepping down. You start at ground level (balance = 0), can only go up n times, and must return to ground level at the end. The key constraint: you can never step below ground (more closes than opens) — that would be physically impossible, like going negative on a bank account. At any intermediate step, your 'balance' (opens minus closes) tells you whether you can add a close. This is exactly like a depth-first exploration where you try both moves but respect the conservation law that balance >= 0 everywhere.

Why This Pattern?

The problem has a natural tree structure: each position in the string represents a decision point with limited valid choices. The 'balance' constraint naturally prunes invalid branches, making DFS the natural fit. We explore all valid paths from root to depth 2n, collecting leaves that represent complete well-formed strings.

Solution

def generateParenthesis(n):
    result = []
    
    def backtrack(open_count, close_count, current):
        # Base case: we've used all n pairs of each parenthesis
        if open_count == n and close_count == n:
            result.append(current)
            return
        
        # Choice 1: Add an opening parenthesis if we haven't used all n
        # This increases our "balance" - we're adding potential to close later
        if open_count < n:
            backtrack(open_count + 1, close_count, current + "(")
        
        # Choice 2: Add a closing parenthesis only if it won't exceed opens
        # This is the "debt" constraint - we can only close what we've opened
        if close_count < open_count:
            backtrack(open_count, close_count + 1, current + ")")
    
    backtrack(0, 0, "")
    return result

Complexity

Time: O(C_n) where C_n is the nth Catalan number (~4^n / n^(3/2))
Space: O(n) for recursion stack + O(C_n * n) for storing all results

We must generate ALL valid parentheses combinations - there's no way around producing each one. The Catalan number represents the count of all well-formed sequences with n pairs. Each sequence takes n steps to build, so we can't do better than O(C_n). The recursion depth is at most n (max nesting depth), which is the minimal extra space needed.

Common Mistakes

Placing a close when close_count >= open_count - this creates invalid sequences like '())'
Forgetting to check the base case properly and generating incomplete strings
Using BFS with a queue when DFS/backtracking is more natural for this tree structure

Edge Cases

n = 0 returns empty list (edge case, though problem typically n >= 1)
n = 1 returns ['()'] - single pair
Very deep nesting like n=20 - recursion depth is 20, which is fine in Python
All combinations are valid by construction due to the balance check

Connections

This is the foundational backtracking problem that establishes the 'balance/debt' pattern used in Valid Parentheses (#20) and Minimum Remove to Make Valid Parentheses (#1249)
Similar tree-construction with constraints appears in Letter Combinations of a Phone Number (#17) and Combination Sum (#39)
The Catalan number connection links to other counting problems like Unique Binary Search Trees (#96)

Largest Rectangle in Histogram #84

MONOTONIC STACK (increasing)

Intuition

Picture the histogram as a city skyline. You're looking for the largest rectangle that fits under any part of this skyline. Here's the key insight: for any particular building (bar), the largest rectangle that includes it can only stretch left until it hits a shorter building, and right until it hits a shorter building. The height of that rectangle is fixed at the building's height - the width is determined by these 'shorter building boundaries.' Think of it like water settling between buildings of different heights - the water level is constrained by the shorter building on each side. This is exactly what we're computing: for each bar, find where the water would 'spill' (the first shorter bar to left and right).

Why This Pattern?

We maintain a stack of bar indices in increasing order of height. When we encounter a bar shorter than the stack's top, we've found the RIGHT boundary for that taller bar (the current bar is the first shorter one to the right). The LEFT boundary is the bar now at the top of the stack after popping (the first shorter one to the left). This is the natural structure because we need to find 'nearest smaller element' boundaries - a classic monotonic stack use case.

Solution

def largestRectangleArea(heights):
    # Add sentinel 0 at end to flush remaining bars in stack
    # This handles rectangles that extend to the last column
    stack = []  # stores indices of bars in increasing height order
    max_area = 0
    
    for i, h in enumerate(heights + [0]):  # append sentinel
        # While current bar is shorter than top of stack,
        # we've found the right boundary for the stack's top bar
        while stack and heights[stack[-1]] > h:
            height = heights[stack.pop()]  # the bar we're computing area for
            # Width: current index is right boundary,
            # new stack top (after pop) is left boundary
            # If stack is empty, left boundary is -1 (start of array)
            width = i if not stack else i - stack[-1] - 1
            max_area = max(max_area, height * width)
        stack.append(i)
    
    return max_area

Complexity

Time: O(n)
Space: O(n) in worst case (strictly increasing heights)

Each bar is pushed onto the stack exactly once and popped at most once. Even though there's a while loop inside the for loop, the total number of iterations across all pops is bounded by n. We visit each bar twice at most (once when pushing, once when popping).

Common Mistakes

Not handling the edge case where the stack becomes empty - need to use width = i when stack is empty (left boundary is -1)
Forgetting the sentinel at the end - without it, the last bars never get flushed and their areas aren't calculated
Confusing whether to use > or >= in the while condition - use > (strictly greater) to handle equal heights correctly with the 'first shorter' boundary logic
Using heights[i] directly instead of popping first then accessing heights[popped_index]

Edge Cases

All increasing heights: stack grows to n, we only get areas at the sentinel flush
All decreasing heights: every bar triggers a pop immediately, but we still get correct results
All equal heights: sentinel triggers all pops, max area = n * height
Single bar: returns that bar's height
Empty histogram: returns 0 (empty list + sentinel = [0])

Connections

Trapping Rain Water (#42) - uses similar monotonic stack to find water levels between boundaries
Maximal Rectangle (#85) - applies this histogram algorithm row-by-row on a 2D matrix
Sum of Subarray Minimum (#907) - same nearest smaller element pattern applied to subarrays
Online Stock Span (#901) - monotonic stack for nearest greater element

Min Stack #155

Stack with augmented state - storing auxiliary information at each stack frame

Intuition

Imagine you're managing a stack of weighted crates and need to answer 'what's the lightest crate in the entire stack?' instantly at any moment. When you add a new crate lighter than everything below it, you create a 'checkpoint' — you remember both the new crate AND what the minimum was before. When you later remove that light crate, you automatically restore the previous minimum because it was saved at that stack level. It's like each layer of the stack carries a 'memory' of the minimum for all layers beneath it.

Why This Pattern?

Each stack element needs to know not just its own value but the minimum of all elements below it. When you push a new minimum, you create a checkpoint. When you pop, you automatically restore the previous minimum because it was stored at the level being removed. This creates a chain where every stack level encodes the minimum for its entire 'subtree'.

Solution

class MinStack:
    def __init__(self):
        # Each element stores: (actual_value, min_value_at_this_level)
        self.stack = []
    
    def push(self, val: int) -> None:
        if not self.stack:
            # First element: it's both the value and the current minimum
            self.stack.append((val, val))
        else:
            # The minimum at this level = min(new_value, previous_minimum)
            current_min = self.stack[-1][1]
            new_min = min(val, current_min)
            self.stack.append((val, new_min))
    
    def pop(self) -> None:
        self.stack.pop()
    
    def top(self) -> int:
        # Return the actual value (first element of tuple)
        return self.stack[-1][0]
    
    def getMin(self) -> int:
        # Return the stored minimum at this level (second element of tuple)
        return self.stack[-1][1]

Complexity

Time: O(1) for all operations
Space: O(n) where n is the number of elements pushed

Every operation touches only the top of the stack (constant work). We must store O(n) data because each stack frame needs to remember its own minimum — there's no way around this since we need to restore previous minimums when elements are popped.

Common Mistakes

Forgetting to check if stack is empty before getMin() or top() - causes IndexError
Confusing which tuple index is value vs. minimum (index 0 is value, index 1 is minimum)
Using a separate min-stack approach but failing to pop from both stacks together

Edge Cases

Pushing duplicate minimum values - works because min(5,5)=5
Popping the only element - next operation must handle empty stack gracefully
Pushing values in descending order (5,4,3) - each becomes new minimum
Pushing values in ascending order (1,2,3) - minimum stays at bottom

Connections

Max Stack (same pattern, track maximum instead)
Implement Stack using Queues (different pattern, same O(1) retrieval goal)
Sliding Window Maximum (uses deque, not stack, but similar 'track running extremum' insight)
Stock Span Problem (related concept: each element 'inherits' information from previous elements)

Valid Parentheses #20

Stack (Last-In-First-Out)

Intuition

Think of this like a seesaw or balance scale. When you see an opening bracket '(', you're placing weight on one side. The matching closing bracket ')' is the counterweight that balances it. But here's the crucial insight: the balance must be maintained at EVERY step. You can't close an outer expression before closing its inner one first. This is exactly how a stack works—the most recently opened bracket must be the next one closed (LIFO).

Why This Pattern?

The nesting structure of parentheses is inherently LIFO—the innermost opening bracket must be closed before its outer counterpart can be. A stack naturally models this: push openings, pop when you see a closing, and check if they match.

Solution

def isValid(s: str) -> bool:
    # Stack holds unmatched opening brackets
    stack = []
    
    # Map each closing bracket to its corresponding opening bracket
    # This lets us check: does top of stack match what I need to close?
    mapping = {')': '(', ']': '[', '}': '{'}
    
    for char in s:
        if char in mapping:
            # It's a closing bracket - need to match against stack top
            if not stack:
                # Stack empty = nothing to close = invalid
                return False
            if stack[-1] != mapping[char]:
                # Top of stack doesn't match the required opener = invalid
                return False
            stack.pop()  # Successfully matched, remove from stack
        else:
            # It's an opening bracket - save it for later
            stack.append(char)
    
    # Valid only if ALL opening brackets were matched (stack empty)
    return len(stack) == 0

Complexity

Time: O(n)
Space: O(n) in worst case (when all opening brackets like '((((')

We traverse the string once (O(n)). Each operation on the stack is O(1). Worst case space is n because we could have n/2 opening brackets all waiting to be matched—like '((((((...'. We can't do better than O(n) space because we might need to remember every unmatched opener.

Common Mistakes

Only checking counts instead of order—']()' has equal opens/closes but wrong order
Not handling empty stack when encountering closing bracket first (like ')' alone)
Forgetting to check if stack is empty at end—an unclosed bracket like '(' fails this
Using a list as stack but only checking 'if char in stack' instead of checking stack[-1]

Edge Cases

Empty string returns True (nothing to violate rules)
Single closing bracket ')' returns False
Alternating brackets '()[]{}' works fine
Nested valid '({[]})' works
Mismatched types '(]' returns False
Reversed order ')((' returns False

Connections

S
a
m
e
p
a
t
t
e
r
n
a
p
p
e
a
r
s
i
n
:
(
1
)
D
e
c
o
d
e
S
t
r
i
n
g
-
s
t
a
c
k
h
a
n
d
l
e
s
n
e
s
t
e
d
b
r
a
c
k
e
t
s
w
i
t
h
c
o
u
n
t
s
,
(
2
)
D
a
i
l
y
T
e
m
p
e
r
a
t
u
r
e
-
s
t
a
c
k
f
i
n
d
s
n
e
x
t
g
r
e
a
t
e
r
e
l
e
m
e
n
t
,
(
3
)
R
e
m
o
v
e
A
l
l
A
d
j
a
c
e
n
t
D
u
p
l
i
c
a
t
e
s
-
s
t
a
c
k
c
a
n
c
e
l
s
o
u
t
p
a
i
r
s
,
(
4
)
H
T
M
L
T
a
g
V
a
l
i
d
a
t
o
r
-
s
i
m
i
l
a
r
n
e
s
t
i
n
g
l
o
g
i
c
.
T
h
e
c
o
r
e
i
n
s
i
g
h
t
o
f
'
m
a
t
c
h
m
o
s
t
r
e
c
e
n
t
o
p
e
n
e
r
f
i
r
s
t
'
i
s
u
n
i
v
e
r
s
a
l
i
n
p
a
r
s
i
n
g
p
r
o
b
l
e
m
s
.

Binary Search (7)

Binary Search #704

Binary Search on a sorted array

Intuition

Imagine you're looking for a specific book in a perfectly alphabetized library with millions of books. You wouldn't check every book from A to Z — that's painfully slow. Instead, you'd open to the middle, see if your book comes before or after, and instantly eliminate half the library. Repeat. This is binary search. The key insight is that sorted data has a 'gradient' property: everything to the left of any point is smaller, everything to the right is larger. This lets you make a binary decision (go left OR go right) that eliminates half the remaining possibilities at each step.

Why This Pattern?

The array's sorted property creates a strict monotonic sequence — each element has a known relationship to its neighbors. This monotonicity is the structural property that makes binary search valid: if mid < target, we KNOW all elements from mid to left are too small, so we can safely discard them. No such guarantee exists with unsorted data, which is why binary search requires sorting.

Solution

def search(nums, target):
    left, right = 0, len(nums) - 1  # Initialize search bounds
    
    while left <= right:  # <= because right could be a valid index
        # Calculate mid safely (avoids potential overflow in other languages)
        mid = left + (right - left) // 2
        
        if nums[mid] == target:
            return mid  # Found it!
        elif nums[mid] < target:
            left = mid + 1  # Target must be in right half
        else:
            right = mid - 1  # Target must be in left half
    
    return -1  # Target not in array

Complexity

Time: O(log n)
Space: O(1)

At each iteration, we halve the search space. After k steps, we're searching n/2^k elements. We stop when n/2^k < 1, which takes log2(n) steps. This is the same reason compound interest grows exponentially — each 'division' compounds. We can't do better because any algorithm must examine each element at least once in the worst case, and binary search does the minimum necessary comparisons to leverage sorted information.

Common Mistakes

Using < instead of <= in the while condition — misses the case when target is at the rightmost index
Forgetting to use mid+1 or mid-1 when narrowing bounds — causes infinite loops
Using (left + right) // 2 instead of left + (right-left)//2 — can overflow in languages with fixed-width integers

Edge Cases

Empty array (should return -1)
Single element array where target equals element (returns 0)
Single element array where target differs (returns -1)
Target smaller than all elements
Target larger than all elements
Target is exactly at the middle after narrowing

Connections

This is the foundational pattern — variants include finding insertion position, searching rotated sorted arrays (33), finding first/last position (34), and searching 2D matrices (74)
Same binary search core insight used in peak finding (162), sqrt implementation (69), and capacity to ship packages (1011)

Find Minimum in Rotated Sorted Array #153

Modified Binary Search for Boundary Detection

Intuition

Think of this like finding the seam where two sorted stacks of papers were taped together. Originally you had one sorted stack, then someone rotated it by picking a point and moving everything before that point to the end. The minimum is where the sequence 'wraps around' - the only place where a higher number is followed by a lower number. Using binary search: one half of the array is ALWAYS sorted (that's the invariant). If the middle element is greater than the rightmost element, the minimum MUST be in the right half (because the right half contains the wrap-around point). If middle is less than or equal to right, the minimum is at middle or to the left. It's like sliding your finger down a valley - you're trying to find the lowest point.

Why This Pattern?

This isn't searching for a target value - we're searching for a structural boundary (the rotation point). The key insight is that in a rotated sorted array, at least one half (left or right of mid) is always sorted. We use this property to eliminate half of the search space at each step, converging on the minimum.

Solution

def findMin(nums):
    left, right = 0, len(nums) - 1
    
    # Binary search for the minimum
    while left < right:
        mid = (left + right) // 2
        
        # If mid > right, minimum is in right half (wraps around)
        # We can't go left because that section is monotonically increasing
        if nums[mid] > nums[right]:
            left = mid + 1
        # If mid <= right, minimum is at mid or in left half
        # The right half is sorted and starts with something >= mid
        else:
            right = mid
    
    # When left == right, we've found the minimum
    return nums[left]

Complexity

Time: O(log n)
Space: O(1)

We only use three pointers (left, right, mid) and a few variables regardless of input size. No recursion, no extra data structures.

Common Mistakes

Using nums[mid] > nums[left] as condition instead of nums[mid] > nums[right] - the right pointer is the anchor because the array is rotated
Forgetting to handle the case where array isn't rotated (already sorted) - but the algorithm naturally handles this
Using mid+1 when nums[mid] <= nums[right], which can skip the minimum - should use right = mid
Confusing this with standard binary search and trying to find a specific target

Edge Cases

Single element array - returns that element correctly
Array not rotated (already sorted) - correctly finds first element as minimum
Two elements - correctly identifies the smaller one
Array rotated at index 0 (looks like normal sorted array) - still works

Connections

Similar to Find Peak Element (#162) - both find a structural boundary using binary search
Same core insight as Search in Rotated Sorted Array (#33) - exploiting that one half is always sorted
Comparison-based binary search pattern used across many problems like Find Minimum in Rotated Sorted Array II (#154) - which adds duplicates complication

Koko Eating Bananas #875

Binary Search on Answer (Monotonic Predicate)

Intuition

Think of Koko's eating speed like water flow through a pipe. You need a minimum flow rate to push all the bananas through within the time limit. If the flow is too slow, bananas pile up and overflow (time runs out). If it's fast enough, they all get processed. The monotonic property is key: if speed k works, any faster speed definitely works too — just like higher water pressure can't make things worse. We're essentially finding the minimum 'pressure' needed.

Why This Pattern?

The predicate 'can finish in h hours' is monotonically decreasing in k — if speed k works, all speeds > k also work. This creates a clean boundary we can binary search. The search space is bounded: minimum speed is 1, maximum is max(piles) (eat one whole pile per hour).

Solution

def minEatingSpeed(piles, h):
    # Helper: calculate hours needed at speed k
    def hours_needed(k):
        total = 0
        for pile in piles:
            # Ceiling division: pile/k, rounded up
            # Python's math.ceil would work but (pile + k - 1) // k is faster
            total += (pile + k - 1) // k
        return total
    
    # Binary search bounds
    left, right = 1, max(piles)
    
    while left < right:
        mid = (left + right) // 2
        
        if hours_needed(mid) <= h:
            # This speed works! Try to go slower (left part)
            right = mid
        else:
            # Too slow, need faster speed (right part)
            left = mid + 1
    
    return left

Complexity

Time: O(n log m) where n = len(piles), m = max(piles)
Space: O(1)

We binary search over speeds (log m iterations), and each iteration scans all piles (n). Can't do better than O(n) per check since we must examine each pile to calculate total hours — each pile affects the answer. The log m factor is the minimum needed to find the exact boundary in a sorted search space.

Common Mistakes

Using float division instead of ceiling integer division — leads to off-by-one errors
Not handling edge case when h >= len(piles) — minimum speed could be 1, but technically could be less if each pile takes <= 1 hour
Using math.ceil instead of integer arithmetic — works but slower
Forgetting to handle empty piles list (shouldn't happen in valid input)

Edge Cases

piles = [1], h = 1 → answer is 1
piles = [1000000000], h = 1 → answer must equal the pile size
h is very large (>= sum of piles) → answer is 1 since can eat one banana per hour
All piles are size 1 → answer is 1 if h >= len(piles)

Connections

Similar to 'Capacity to Ship Packages Within D Days' (same binary search on answer pattern)
Same core insight as 'Minimum Number of Days to Make m Bouquets' — both use binary search on answer with a monotonic predicate
Different from 'Search in Rotated Array' — that searches for a value, this searches for a condition

Median of Two Sorted Arrays #4

Binary Search on Partition (searching for a cut point in a sorted structure)

Intuition

Think of two sorted decks of cards. You want to find the median of all cards combined. Instead of merging (which is slow), imagine making a single cut through BOTH decks such that all cards to the LEFT of the cut are smaller than all cards to the RIGHT. That's the 'invisible' cut in the merged sorted array. The median is just around that cut. Binary search is our tool to FIND that cut efficiently — we guess where to cut the first array, then calculate where the second array's cut must be to make the partition valid. We know we found it when the largest element on the left of BOTH arrays ≤ smallest element on the right of BOTH arrays.

Why This Pattern?

The problem asks 'where would the median cut be if we merged these arrays?' — not 'what value is the median?' The answer lives in the index space (0 to len(nums1)), which is a sorted search space. We can determine if we're too far left or right by checking if the partition satisfies the inequality max(left) ≤ min(right).

Solution

def findMedianSortedArrays(nums1, nums2):
    # Ensure nums1 is the smaller array for O(log(min(m,n)))
    if len(nums1) > len(nums2):
        nums1, nums2 = nums2, nums1
    
    m, n = len(nums1), len(nums2)
    left, right = 0, m  # binary search on nums1's indices
    
    while left <= right:
        # Partition positions — partition1 + partition2 divides total elements
        partition1 = (left + right) // 2
        partition2 = (m + n + 1) // 2 - partition1
        
        # Get boundary values; use -inf/inf for empty partitions
        maxLeft1 = float('-inf') if partition1 == 0 else nums1[partition1 - 1]
        minRight1 = float('inf') if partition1 == m else nums1[partition1]
        maxLeft2 = float('-inf') if partition2 == 0 else nums2[partition2 - 1]
        minRight2 = float('inf') if partition2 == n else nums2[partition2]
        
        # Check if partition is valid: max of lefts ≤ min of rights
        if maxLeft1 <= minRight2 and maxLeft2 <= minRight1:
            # Found the correct partition!
            if (m + n) % 2 == 1:
                # Odd total: median is the max of left sides
                return max(maxLeft1, maxLeft2)
            else:
                # Even total: median is average of max left and min right
                return (max(maxLeft1, maxLeft2) + min(minRight1, minRight2)) / 2
        elif maxLeft1 > minRight2:
            # Too many from nums1 (partition1 too far right), move left
            right = partition1 - 1
        else:
            # Too few from nums1 (partition1 too far left), move right
            left = partition1 + 1
    
    return 0.0  # should never reach here for valid inputs

Complexity

Time: O(log(min(m, n)))
Space: O(1) — only a fixed number of variables regardless of input size

We binary search on the smaller array's indices. Each iteration cuts the search space in half. The number of iterations is logarithmic in the smaller array's length. We only do constant-time lookups at partition boundaries — no array merging or extra storage.

Common Mistakes

Using the wrong array to binary search on (should always be smaller one for efficiency)
Forgetting the +1 in partition2 calculation, causing off-by-one errors with odd-length totals
Not handling empty partitions correctly (need -inf/inf boundaries)
Confusing which index is the partition vs. which is the element before partition

Edge Cases

One or both arrays are empty (just return median of non-empty array)
Arrays of length 1 each (tiny inputs)
Very unequal array lengths (e.g., [1] and [2,3,4,5,6,7,8,9,10])
Both arrays identical (simple case)

Connections

Kth Element in Two Sorted Arrays (uses same partition logic to find kth element)
Search a 2D Matrix (binary search on index space)
Find Minimum in Rotated Sorted Array (binary search on value space)
This is the 'hard' binary search problem — applies partition-based BS to two arrays instead of one

Search a 2D Matrix #74

Binary Search on Virtual Sorted Array

Intuition

Imagine you have a perfectly sorted list of numbers, but someone drew grid lines over it — splitting it into rows where each row continues from where the previous one ended (like a phone book folded into a grid). To find a number, you don't need to think in 2D — you just need to convert a 1D position into 2D coordinates. The key insight: if you flatten this matrix into one long sorted array, the element at index i would be at row = i // n (integer division) and col = i % n (remainder). This is a coordinate transformation — we're doing binary search in a 'virtual' 1D space and translating those indices to actual matrix positions.

Why This Pattern?

The matrix satisfies the conditions of a fully sorted sequence: each row is sorted, AND the last element of row k is less than the first element of row k+1. This means if we concatenated all rows into one array, it would be perfectly sorted. Binary search requires a sorted input — by mapping our search space to this virtual sorted array, we get O(log(m*n)) performance instead of O(m+n) from brute force.

Solution

def searchMatrix(matrix, target):
    if not matrix or not matrix[0]:
        return False
    
    m = len(matrix)
    n = len(matrix[0])
    
    # Binary search on virtual 1D array of size m*n
    left, right = 0, m * n - 1
    
    while left <= right:
        mid = (left + right) // 2
        # Convert 1D index to 2D coordinates
        row = mid // n
        col = mid % n
        
        if matrix[row][col] == target:
            return True
        elif matrix[row][col] < target:
            left = mid + 1
        else:
            right = mid - 1
    
    return False

Complexity

Time: O(log(m*n))
Space: O(1)

Binary search on n elements always takes O(log n) steps — you halve the search space each iteration. Here our 'n' is m*n (total elements). We can't do better because any comparison-based search must examine enough elements to distinguish between all possible positions — that's log₂(m*n) decisions in the worst case.

Common Mistakes

Using row = mid % n and col = mid // n (reversed) — remember: rows change slower, so row gets the larger divisor
Not checking for empty matrix first — will cause index error
Forgetting the matrix is 0-indexed when reasoning about edge cases

Edge Cases

Empty matrix or empty row — return False immediately
Target smaller than matrix[0][0] — binary search handles this naturally
Target larger than matrix[m-1][n-1] — binary search handles this naturally
Single row matrix — still works because row calculation gives 0 always
Single column matrix — still works because col calculation gives 0 always

Connections

Search a 2D Matrix II — similar problem but WITHOUT the 'sorted rows' property, requires different O(m+n) algorithm starting from corner
Find Peak Element — uses binary search but on 1D array with different comparison logic
Peak Index in a Mountain Array — same binary search pattern, different termination condition

Search in Rotated Sorted Array #33

Modified Binary Search with Sorted-Half Identification

Intuition

Think of a sorted bookshelf where someone picked up a stack of books and reinserted them at a different position - that's the rotation. The key insight: at any midpoint, at least ONE half of the array is ALWAYS sorted. This is because rotation only creates ONE break point in the sorted order. You can visualize it like finding your way through a mountain range where one side of any valley is always flat (sorted) - you just need to figure out which side contains your target.

Why This Pattern?

The rotation property guarantees that for any mid point, exactly one of [left, mid] or [mid, right] is sorted (unless mid equals left, which falls through to the other case). This gives us a binary decision: either the target lies in the sorted half, or it must be in the unsorted half. This cuts search space in half at each step.

Solution

def search(nums, target):
    left, right = 0, len(nums) - 1
    
    while left <= right:
        mid = (left + right) // 2
        
        # Found the target
        if nums[mid] == target:
            return mid
        
        # Identify which half is sorted
        if nums[left] <= nums[mid]:
            # Left half is sorted [left, ..., mid]
            # Check if target falls within this sorted range
            if nums[left] <= target < nums[mid]:
                # Target is in left sorted half
                right = mid - 1
            else:
                # Target must be in right half
                left = mid + 1
        else:
            # Right half is sorted [mid, ..., right]
            # Check if target falls within this sorted range
            if nums[mid] < target <= nums[right]:
                # Target is in right sorted half
                left = mid + 1
            else:
                # Target must be in left half
                right = mid - 1
    
    # Target not found
    return -1

Complexity

Time: O(log n)
Space: O(1)

Each iteration eliminates half of the remaining search space. Even though we might check both halves conceptually, we only traverse ONE branch per iteration - the sorted half that might contain the target. This is exactly like standard binary search: we make a constant-time decision at each step and reduce the problem size by half.

Common Mistakes

Using < instead of <= when comparing with left boundary, causing boundary elements to be missed
Not handling the edge case when left == mid (single element comparison)
Confusing which boundary to adjust - when target is LESS than the start of sorted range, go to opposite side
Using wrong comparison operators - must use <= for inclusive ranges

Edge Cases

Single element array - handled naturally by the loop condition
Array not rotated (sorted) - left half will always be sorted, works correctly
Target at rotation point - compared against sorted half boundaries correctly
Target is smallest or largest element - correctly routed to appropriate half
Duplicates in array - this solution assumes distinct elements; with duplicates, worst case becomes O(n)

Connections

F
i
n
d
M
i
n
i
m
u
m
i
n
R
o
t
a
t
e
d
S
o
r
t
e
d
A
r
r
a
y
(
#
1
5
3
)
-
s
a
m
e
r
o
t
a
t
e
d
a
r
r
a
y
c
o
n
c
e
p
t
b
u
t
f
i
n
d
s
t
h
e
p
i
v
o
t
p
o
i
n
t
i
n
s
t
e
a
d
o
f
s
e
a
r
c
h
i
n
g
;
S
e
a
r
c
h
i
n
R
o
t
a
t
e
d
S
o
r
t
e
d
A
r
r
a
y
I
I
(
#
8
1
)
-
a
d
d
s
d
u
p
l
i
c
a
t
e
h
a
n
d
l
i
n
g
w
h
e
r
e
w
o
r
s
t
c
a
s
e
d
e
g
r
a
d
e
s
t
o
O
(
n
)
;
T
h
e
p
i
v
o
t
-
f
i
n
d
i
n
g
i
n
s
i
g
h
t
h
e
r
e
i
s
f
o
u
n
d
a
t
i
o
n
a
l
f
o
r
m
a
n
y
'
r
o
t
a
t
e
d
a
r
r
a
y
'
p
r
o
b
l
e
m
s
.

Time Based Key-Value Store #981

Binary Search on Sorted Arrays

Intuition

Think of this like a version control system or document editing history. When you 'get' a value at timestamp 7, you're asking 'what was the value of this key at moment 7?' If you set values at timestamps 1, 5, and 10, and query at timestamp 7, you'd get the value from timestamp 5 - the most recent change that hadn't passed your query time. It's like looking backward through a timeline and grabbing the last snapshot that exists before or at your query point.

Why This Pattern?

The timestamps for each key are inserted in strictly increasing order, creating a sorted sequence. To find 'the largest timestamp <= target', binary search is the optimal algorithm - it's O(log n) compared to O(n) for linear scan. This is the classic 'floor' or 'lower bound' search pattern.

Solution

class TimeMap:
    def __init__(self):
        self.store = {}  # key -> list of (timestamp, value) pairs
    
    def set(self, key: str, value: str, timestamp: int) -> None:
        """Store value with timestamp. Insertions are always in increasing timestamp order."""
        if key not in self.store:
            self.store[key] = []
        self.store[key].append((timestamp, value))
    
    def get(self, key: str, timestamp: int) -> str:
        """Get the value at the largest timestamp <= given timestamp."""
        if key not in self.store:
            return ""
        
        values = self.store[key]
        left, right = 0, len(values) - 1
        result = ""
        
        while left <= right:
            mid = (left + right) // 2
            curr_timestamp = values[mid][0]
            
            if curr_timestamp <= timestamp:
                # This timestamp is valid (not past our query time)
                result = values[mid][1]  # Store as potential answer
                left = mid + 1  # Try to find a larger (more recent) valid timestamp
            else:
                # This timestamp is too new, go left
                right = mid - 1
        
        return result

Complexity

Time: O(1) amortized for set (always appends to end), O(log n) for get (binary search on sorted timestamps)
Space: O(n) total - storing all key-value-timestamp pairs

Set is O(1) because we always insert at the end of the list (timestamps are guaranteed increasing). Get is O(log n) because we binary search through at most n timestamps for that key. We can't do better than log n - we must examine enough timestamps to distinguish the boundary between valid and invalid times, which requires log n comparisons in the worst case.

Common Mistakes

Returning the wrong value when timestamps match exactly - must handle <= not just <
Forgetting to update result when finding a valid timestamp - only updating at the end
Not handling the case where key doesn't exist in the store
Using linear search instead of binary search - would be O(n) and too slow

Edge Cases

Query timestamp is smaller than ALL timestamps for that key - should return empty string
Query timestamp matches EXACTLY a timestamp in the list - should return that value
Key has only ONE timestamp entry
Multiple keys with different timestamp histories

Connections

Similar to 'Search Insert Position' - finding where a value fits in a sorted array
Same core insight as 'Find First and Last Position of Element in Sorted Array' - binary search with condition tracking
Used as component in 'Longest Increasing Subsequence' variant problems involving time windows

Linked List (11)

Add Two Numbers #2

Simultaneous traversal with persistent state (carry). This is essentially a 'two-pointer merge' where both pointers advance together while maintaining a running state.

Intuition

Think of adding two numbers on paper, column by column from right to left. The linked lists are already in the perfect order for this - the first node is the ones place, second is tens, etc. This is like a ripple carry adder in hardware: at each position you sum the two digits plus any incoming carry, output the result digit, and pass the overflow to the next position. The carry is a feedback loop - it persists from one iteration to the next, just like energy flowing through a system until equilibrium is reached. You process until both lists are empty AND there's no more carry to propagate.

Why This Pattern?

We need to process two sequences in lockstep. The carry is a state variable that gets updated each iteration and feeds back into the next calculation - this is the hallmark of a system with memory/persistence. The dual termination condition (both lists done AND no carry) mirrors physical systems that only stabilize when all energy dissipates.

Solution

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def addTwoNumbers(l1, l2):
    # Dummy head simplifies edge case handling - like a buffer
    dummy = ListNode(0)
    current = dummy
    carry = 0  # Persistent state - the 'overflow' from each column
    
    # Process until both lists exhausted AND no carry remains
    while l1 or l2 or carry:
        # Get values (default to 0 if list exhausted) - like a switch that defaults to 0
        val1 = l1.val if l1 else 0
        val2 = l2.val if l2 else 0
        
        # Sum this column plus any incoming carry
        total = val1 + val2 + carry
        
        # Extract new digit (mod 10) and new carry (div 10)
        # This is like a beam splitter - energy divides into two paths
        carry = total // 10
        digit = total % 10
        
        # Create new node with result digit
        current.next = ListNode(digit)
        current = current.next
        
        # Advance pointers if lists not exhausted
        l1 = l1.next if l1 else None
        l2 = l2.next if l2 else None
    
    return dummy.next  # Skip dummy, return actual result

Complexity

Time: O(max(m, n)) where m and n are the lengths of the two lists. We visit each node at most once, and the carry propagation is bounded by the longer list plus one extra iteration for final carry.
Space: O(max(m, n)) for the result list. We create a new node for each digit in the output (plus one for final carry).

We must touch every digit in both input lists at minimum - there's no way to add numbers without looking at each digit. The output size is bounded by max(m,n)+1 (the +1 is that final carry can create an extra digit), so we can't do better than linear in the output size. This is the tightest bound.

Common Mistakes

Forgetting the final carry - a sum like 5 + 5 = 10 needs two result digits (0 followed by 1)
Not handling different length lists - shorter list should act like it's padded with zeros
Modifying input lists when you shouldn't - create new nodes instead
Using carry = total // 10 before calculating digit - order matters, use the old carry value

Edge Cases

Lists of different lengths (e.g., [5] + [5,5] = [0,6])
Both lists empty? Not possible per problem, but logic handles it if carry exists
Final carry overflow (e.g., 999 + 1 = 1000 - three zeros, one one)
Zero values: [0] + [0] = [0]

Connections

This is the same core logic as 'Add Binary' (#67) - just with base 10 instead of base 2
The carry pattern appears in 'Multiply Strings' (#43) - though that's more complex
Similar to 'Plus One' (#66) which is just adding 1 to a single number represented as array

Copy List with Random Pointer #138

Hash Table + Two-Pass Traversal (or Interweaving)

Intuition

Think of this like cloning a city map where: - `next` pointers are sequential streets (linear, predictable) - `random` pointers are secret tunnels that can jump anywhere The challenge: When you copy a node that has a 'tunnel' to some other node, you need to know WHERE THE COPY of that destination node lives. You can't just copy the pointer directly or you'd end up pointing to the ORIGINAL city instead of the clone. The solution is a two-step dance: 1. First, go through and create all the new buildings (nodes) without connecting anything 2. Then go back and draw all the roads and tunnels using your knowledge of where each copied building sits

Why This Pattern?

The random pointers create an arbitrary graph structure, not just a linear chain. To copy edges that point to arbitrary nodes, you need a lookup mechanism. A hash map provides O(1) lookup from original node → copied node, solving the 'where is the copy?' problem. The two-pass approach separates node creation from edge connection, avoiding circular dependency issues.

Solution

class Node:
    def __init__(self, x: int, next: 'Node' = None, random: 'Node' = None):
        self.val = int(x)
        self.next = next
        self.random = random

def copyRandomList(head: 'Node') -> 'Node':
    if not head:
        return None
    
    # PASS 1: Create all new nodes, store mapping from old→new
    old_to_new = {}
    curr = head
    while curr:
        # Create copy with same value
        old_to_new[curr] = Node(curr.val)
        curr = curr.next
    
    # PASS 2: Wire up next and random pointers
    curr = head
    while curr:
        # Get the copied node for current position
        copy = old_to_new[curr]
        
        # Connect next: look up what original's next points to, get THAT copy
        copy.next = old_to_new.get(curr.next)
        
        # Connect random: same trick
        copy.random = old_to_new.get(curr.random)
        
        curr = curr.next
    
    return old_to_new[head]

Complexity

Time: O(n)
Space: O(n)

We traverse the list twice (2n operations = O(n)). We can't do it in one pass because when we encounter a random pointer, the destination node might not be copied yet. The O(n) space is for the hash map - we need to store a mapping for every node so we can find its copy later. This is unavoidable if random pointers can go forward or backward arbitrarily.

Common Mistakes

Trying to copy in one pass - fails because when you see random→nodeX, nodeX might not exist yet
Forgetting to handle null pointers (random or next can be None)
Accidentally modifying the original list while copying
Using the same Node object instead of creating new ones - defeats the purpose of deep copy

Edge Cases

Empty list (head is None) - return None
Single node with null random
Single node with random pointing to itself (self-reference)
All random pointers are None
Deeply nested structure where random jumps backward repeatedly

Connections

Clone Graph (133) - same concept of copying nodes with arbitrary connections, but that one is true graph (nodes can have multiple edges)
Deep Copy List with Coupled Indices variant - similar but uses array indexing instead of pointers
The interweaving space-optimized solution shares DNA with 'reversing linked list in-place' tricks - both modify the original structure temporarily as scaffolding

Find the Duplicate Number #287

Floyd's Tortoise and Hare (Cycle Detection in a Linked List)

Intuition

Imagine the array as a linked list where each value points to the next index to visit. Since we have n+1 numbers all pointing to indices in a 1..n range, we're guaranteed to have a 'collision' - two different starting points eventually lead to the same node. This creates a cycle, just like water finding its way to the lowest point in a landscape. The duplicate number is the entrance to that cycle - it's where two different 'paths' in the array converge. Using two pointers at different speeds (Floyd's algorithm), we're essentially running a process until we find where the loop closes, then backtracking to find its starting point.

Why This Pattern?

The array forms a functional graph - each value 'points' to another index. Since there's one more element than the range of values, the pigeonhole principle guarantees at least one collision, creating a cycle. The duplicate is exactly where the cycle begins because two different indices must point to the same location. This is mathematically equivalent to finding the entry point of a cycle in a linked list.

Solution

def findDuplicate(nums):
    # Phase 1: Find intersection point inside the cycle
    # Both pointers will eventually meet somewhere on the cycle
    slow = nums[0]
    fast = nums[0]
    
    while True:
        slow = nums[slow]           # moves 1 step (tortoise)
        fast = nums[nums[fast]]    # moves 2 steps (hare)
        if slow == fast:
            break
    
    # Phase 2: Find the entrance to the cycle (the duplicate)
    # Reset slow to start, keep fast at meeting point
    # They meet exactly at the cycle entrance
    slow = nums[0]
    while slow != fast:
        slow = nums[slow]
        fast = nums[fast]
    
    return slow

Complexity

Time: O(n) - Each phase traverses at most n elements. The first phase visits at most n nodes until meeting inside the cycle, the second phase visits at most n nodes to find the entrance.
Space: O(1) - Only two pointer variables used regardless of input size.

We can't do better than O(n) because we must examine all elements to guarantee finding the duplicate. We use O(1) space by exploiting the array structure itself as our 'linked list' - no extra data structures needed.

Common Mistakes

Using slow = nums[slow] instead of slow = nums[slow] for BOTH pointers in phase 1 - some mistakenly use slow += 1
Forgetting phase 2 and just returning at the intersection - the intersection is inside the cycle, not at the duplicate
Not initializing both pointers to nums[0] before the loop - they must start together

Edge Cases

: Single duplicate: nums = [1,1] returns 1
Large duplicate at end: nums = [1,2,3,3,4,5] returns 3
Multiple possible duplicates but algorithm finds the one that creates the cycle entrance
Array with exactly one element repeated multiple times

Connections

Linked List Cycle II (LeetCode 142) - same algorithm, different context
Happy Number (LeetCode 202) - uses same cycle detection principle on a different number transformation
Find the Celebrity (LeetCode 277) - another 'find who points to whom' graph problem

Linked List Cycle #141

Floyd's Cycle Detection Algorithm (Two Pointer / Tortoise and Hare)

Intuition

Imagine two runners on a track. If the track is a straight line (no cycle), the faster runner will eventually finish and leave the slower runner behind. But if the track loops (has a cycle), the faster runner will eventually lap the slower one - they'll meet. This is the classic 'tortoise and hare' insight: a faster pointer moving at 2x speed will ALWAYS catch up to a slower pointer if there's a cycle, because it gains 1 position on each iteration. If there's no cycle, the faster pointer simply reaches the end of the list.

Why This Pattern?

This pattern is the natural choice because: (1) We can't modify the list to mark visited nodes, (2) We need O(1) space, not O(n) for a hash set, and (3) The mathematical guarantee - in any cycle, a faster pointer moving at 2x speed will eventually 'lap' the slower one. The relative speed is 1 node per iteration, guaranteeing convergence.

Solution

def hasCycle(head: ListNode) -> bool:
    if not head or not head.next:
        return False
    
    slow = head      # Tortoise: moves 1 step at a time
    fast = head      # Hare: moves 2 steps at a time
    
    while fast and fast.next:
        slow = slow.next        # Move slow by 1
        fast = fast.next.next   # Move fast by 2
        
        if slow == fast:        # They met = cycle exists
            return True
    
    return False  # Fast reached end = no cycle

Complexity

Time: O(n) - In the worst case (cycle exists), both pointers traverse the list until they meet. The maximum distance is bounded by the cycle length plus the non-cyclic portion. If no cycle, we visit each node at most once.
Space: O(1) - Only two pointer variables regardless of input size.

Time can't be less than O(n) because in the worst case (no cycle), we must check every node to confirm there's no cycle. Space is O(1) because we only track two pointers - the 'state' of the problem is entirely in the current positions of the runners, not in any data structure that grows with input.

Common Mistakes

Forgetting to check 'fast' and 'fast.next' before accessing - causes NoneType errors
Not initializing both pointers at 'head' - they must start together
Moving fast before checking if it can move (accessing None.next)
Using a different starting position for slow vs fast

Edge Cases

Empty list (head is None) - returns False
Single node with self-loop - returns True
Single node without loop - returns False
Two nodes forming a cycle - returns True
Long list with cycle near the end - still works, O(n)

Connections

Linked List Cycle II (#142) - same algorithm but now we need to find where the cycle starts, requires additional math
Happy Number (#202) - uses the same Floyd cycle detection idea in a different context
Find the Duplicate Number (#287) - another application of cycle detection on an array treated as a linked list

LRU Cache #146

HashMap + Doubly Linked List

Intuition

Imagine a library desk where you keep your most-used reference books within arm's reach. When you need a book, you grab it from the desk (fast access). When you use a book, you put it back on top of the pile (most recently used). When the desk is full and you need space, you put away the book at the bottom of the pile—the one you haven't touched in the longest time. That's exactly what an LRU cache does: it keeps frequently-accessed items readily available while automatically discarding the least recently used ones when capacity runs out. The 'desk' is your cache with limited space, and the 'book at the bottom' is your LRU item.

Why This Pattern?

We need O(1) operations for both get and put. A hash map gives us O(1) lookup by key. But we also need to track which item was used least recently, and we need to reorder in O(1) when something is accessed. A doubly linked list naturally maintains this order—head represents most recently used, tail represents least recently used—with O(1) insertion, deletion, and repositioning. The hash map maps each key to its corresponding node in the linked list, giving us the best of both worlds.

Solution

class DListNode:
    def __init__(self, key=0, value=0):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.cache = {}  # Maps key -> DListNode (O(1) lookup)
        
        # Dummy head/tail simplify edge cases - no null checks needed
        self.head = DListNode()
        self.tail = DListNode()
        self.head.next = self.tail
        self.tail.prev = self.head
    
    def _remove(self, node):
        """Detach node from list - O(1) operation"""
        node.prev.next = node.next
        node.next.prev = node.prev
    
    def _add_to_head(self, node):
        """Insert node right after head (most recently used position) - O(1)"""
        node.prev = self.head
        node.next = self.head.next
        self.head.next.prev = node
        self.head.next = node
    
    def _move_to_head(self, node):
        """When item is accessed, mark it as recently used by moving to front"""
        self._remove(node)
        self._add_to_head(node)
    
    def get(self, key: int) -> int:
        if key in self.cache:
            node = self.cache[key]
            self._move_to_head(node)  # Update usage order
            return node.value
        return -1  # Cache miss
    
    def put(self, key: int, value: int) -> None:
        if key in self.cache:
            # Key exists: update value and mark as recently used
            node = self.cache[key]
            node.value = value
            self._move_to_head(node)
        else:
            # New key: create node and add to front
            node = DListNode(key, value)
            self.cache[key] = node
            self._add_to_head(node)
            
            # Evict LRU if over capacity
            if len(self.cache) > self.capacity:
                lru_node = self.tail.prev  # Node right before tail = LRU
                self._remove(lru_node)
                del self.cache[lru_node.key]

Complexity

Time: O(1) for both get and put operations
Space: O(capacity) - we store at most 'capacity' key-value pairs in the hashmap and linked list

Every operation touches only constant-time data structures: hashmap lookup is O(1), and linked list node manipulation (remove, add, move) is O(1) because we have direct pointers to the nodes we need. We never traverse the list—we just rearrange pointers. This is the minimum possible since we must be able to access any cached item instantly.

Common Mistakes

Forgetting to update the hashmap when evicting the LRU item - causes stale entries
Not handling the case where put() updates an existing key - must update value AND move to head
Forgetting dummy nodes and writing messy null-check code for head/tail edges
Moving node to head before removing it first causes broken list links
Confusing head/tail: head = most recently used (right after head), tail = least recently used (right before tail)

Edge Cases

Capacity of 1 - must handle single element correctly
Repeated get() on same key - should keep it at MRU position
put() with existing key then immediately put() with new key - eviction should happen correctly
get() on non-existent key returns -1, not None
put() with value 0 or None should still work

Connections

This is the foundational pattern for any cache implementation - similar logic appears in system design with Redis/Memcached
Problem 460 (LFU Cache) extends this by tracking frequency instead of recency - same hashmap+DLL structure but with additional frequency tracking
The 'move to front' strategy is related to caching algorithms in operating systems (page replacement)
Similar to problem 432 (All O(1) Data Structure) which requires ordered tracking but with different operations

Merge K Sorted Lists #23

Heap-based merging (Priority Queue optimization)

Intuition

Imagine k sorted streams of water flowing into one river. At each moment, you only care about finding the smallest drop at the very front of ALL streams. Once you pick that drop, you move forward in just that one stream. A min-heap is the perfect data structure for this - it's like a bottleneck that always gives you the smallest element among k sources instantly, without having to check all k heads every time. Without a heap, you'd scan k heads for every element (expensive). With a heap, you pay a small log(k) price to maintain that 'smallest front' property.

Why This Pattern?

We have k sorted sequences and need to repeatedly find the global minimum across all of them. A min-heap of size k gives us O(log k) access to the smallest element among k sources. This transforms what would be O(n*k) naive scanning into O(n log k) - a massive win when k is large.

Solution

import heapq

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def mergeKLists(lists: list[ListNode]) -> ListNode:
    # Min-heap stores (value, list_index, node) - list_index breaks ties
    heap = []
    
    # Initialize: add the first node from each non-empty list
    for i, node in enumerate(lists):
        if node:
            heapq.heappush(heap, (node.val, i, node))
    
    # Dummy head simplifies edge case handling at the start
    dummy = ListNode(0)
    current = dummy
    
    # Keep extracting smallest until heap is empty
    while heap:
        val, i, node = heapq.heappop(heap)
        current.next = node
        current = current.next
        
        # If this list has more nodes, push the next one
        if node.next:
            heapq.heappush(heap, (node.next.val, i, node.next))
    
    return dummy.next

Complexity

Time: O(N log k) where N = total nodes across all lists, k = number of lists
Space: O(k) for the heap + O(N) for the result list (which we must build anyway)

We push each of the N nodes onto the heap exactly once and pop each exactly once. Each heap operation costs O(log k) where k is heap size (number of lists). The heap never exceeds size k because we only push the next node after popping the current one. The O(N) factor is unavoidable - we must visit every node to include it in the result.

Common Mistakes

Forgetting the tiebreaker index in heap tuple (node objects aren't comparable, causing TypeError)
Not handling empty lists - pushing None onto heap or not checking before initializing
Reusing nodes instead of building new list (aliasing bugs)
Using O(k) linear scan instead of heap - works but slower

Edge Cases

All lists are empty -> return None
Only one list exists -> just return that list
Lists have vastly different lengths - heap still handles this efficiently
All nodes have same value - tiebreaker index ensures deterministic behavior

Connections

LeetCode 21 (Merge Two Sorted Lists) - this is the binary version, same merge logic
LeetCode 378 (Kth Smallest Element in a Sorted Matrix) - uses same heap-across-rows insight
Similar to external merge sort - merging sorted file chunks
Compare to divide-and-conquer approach which also gives O(N log k) but with better constants

Merge Two Sorted Lists #21

Two-pointer merge (merge sorted sequences)

Intuition

Imagine two conveyor belts delivering sorted packages, and you need to unload them onto a single belt in order. Both belts are already sorted, so at any moment the next package to unload must be at the front of one of the two belts — never buried in the middle. You just compare the front packages, take the smaller one, and repeat. It's like merging two sorted streams into one, always picking the lowest-energy element available. The 'equilibrium' is when both streams are exhausted and your new belt is complete.

Why This Pattern?

Both lists are already sorted in ascending order, so the smallest remaining element MUST be at the head of one of the two lists. This guarantees our greedy choice (always pick the smaller head) is always optimal — no backtracking needed. The problem has a greedy optimal substructure property.

Solution

```python
# Definition for singly-linked list.
# class ListNode:
#     def __init__(self, val=0, next=None):
#         self.val = val
#         self.next = next

def mergeTwoLists(list1, list2):
    # Dummy head: a "placeholder" node that simplifies edge cases
    # We don't care about its value; we just use it to anchor our result
    # This avoids special-casing the first node we add
    dummy = ListNode(0)
    current = dummy  # This tracks the last node in our merged list
    
    # Compare heads and attach the smaller one
    # Both lists are sorted, so the smallest unprocessed element 
    # must be at the head of one of the lists
    while list1 and list2:
        if list1.val <= list2.val:
            current.next = list1  # Attach list1's node
            list1 = list1.next    # Move list1 forward
        else:
            current.next = list2  # Attach list2's node
            list2 = list2.next    # Move list2 forward
        current = current.next    # Move our pointer forward
    
    # One list may have leftover nodes — they're already sorted
    # Just attach the remaining chain
    if list1:
        current.next = list1
    if list2:
        current.next = list2
    
    # Return the merged list, skipping the dummy head
    return dummy.next
```

Complexity

Time: O(n + m) where n and m are the lengths of list1 and list2
Space: O(1) — we only create one dummy node and use a constant number of pointers, regardless of input size

We visit each node exactly once. Each iteration processes one node from either list1 or list2, and we make exactly (n + m) iterations total. We can't do better because we must at least look at every node to include it in the output. Space is constant because we reuse existing nodes rather than creating new ones — we just rearrange pointers.

Common Mistakes

Forgetting to update 'current' after attaching a node, causing an infinite loop
Using < instead of <= in comparison, which works but doesn't preserve relative order for equal elements
Losing the reference to list1 or list2.next before reassigning, causing you to lose the rest of the list
Not handling the case where one or both input lists are None/empty

Edge Cases

One or both lists are empty (return the non-empty list or None)
All elements in list1 are smaller than all in list2 (one list exhausted immediately)
Both lists have the same values (preserve order with <=)

Connections

Merge K Sorted Lists — this problem is the binary merge step in a divide-and-conquer solution
Merge Sorted Array — same pattern but with arrays instead of linked lists
In-place Merge (LeetCode #88) — similar merge logic but with O(1) space constraint in arrays

Remove Nth Node From End of List #19

Fast-Slow Pointer (Two Pointer) with Dummy Head

Intuition

Imagine two runners on a track. The second runner starts n positions behind the first. When the first runner reaches the finish line (end of list), the second runner is exactly at position n from the end — the node we want to remove. This 'gap' technique lets us find a position relative to the end without knowing the list length upfront. The key insight: if we maintain exactly n nodes between fast and slow pointers, when fast hits None, slow will be right before our target node.

Why This Pattern?

We need to maintain a fixed spatial gap (n nodes) between two pointers while traversing. This gap naturally encodes 'nth from end'. The dummy head simplifies removing the first node — without it, we'd need special-case logic when n equals the list length.

Solution

class Solution:
    def removeNthFromEnd(self, head: Optional[ListNode], n: int) -> Optional[ListNode]:
        # Dummy head simplifies removing the first node (when n = list length)
        dummy = ListNode(0, head)
        
        # Both pointers start at dummy
        fast = dummy
        slow = dummy
        
        # Move fast n+1 steps ahead to create a gap of n nodes
        # This positions slow exactly one node before the target
        for i in range(n + 1):
            fast = fast.next
        
        # Advance both until fast hits the end
        # When fast = None, slow will be at node BEFORE the one to remove
        while fast:
            fast = fast.next
            slow = slow.next
        
        # Skip over the target node
        slow.next = slow.next.next
        
        return dummy.next

Complexity

Time: O(L) where L is list length
Space: O(1) - only using a fixed number of pointers

We traverse the list at most once. The fast pointer walks the entire list, and slow catches up — total operations proportional to list length. We can't do better because we must examine every node to reach the end. Space is constant because pointers replace nodes, not create new ones.

Common Mistakes

Forgetting the dummy head and not handling removal of the head node (when n equals list length)
Off-by-one: moving fast n+1 steps instead of n creates a gap of n-1, missing the target
Not saving reference to the node before the target — you need to modify its next pointer
Forgetting that slow ends up ONE NODE BEFORE the target, not AT the target

Edge Cases

Removing the head node (n equals list length)
Removing the last node (n = 1)
Single-node list
n is always valid per problem constraints

Connections

Similar to 'Find the middle of linked list' (fast-slow technique), but with a fixed gap instead of 'halfway'
Related to 'Swap nodes in pairs' and 'Linked list cycle II' — all use dummy head for edge case handling
This is the building block for 'Remove nth node' variants and forms the mental model for sliding window problems

Reorder List #143

Three-step pointer manipulation: (1) Find middle using slow/fast pointers, (2) Reverse the second half, (3) Interleave nodes from first half with reversed second half.

Intuition

Think of this like shuffling a deck of cards. You split the deck in half, reverse the second half, then interleave them like shuffling. The challenge with a linked list is you can only move forward, so you need to reverse the second half to 'reach back' and grab elements from the end. The slow/fast pointer is like finding the center of a rope by walking: one person walks slowly (1 step), another walks fast (2 steps) - when the fast walker reaches the end, the slow walker is at the middle.

Why This Pattern?

Linked lists only give forward traversal, but this problem requires working from both ends simultaneously. The structural property that makes this pattern natural is that we need access to both the beginning (first half) and end (reversed second half) at the same time. Reversing creates a 'mirror' that lets us pull from the 'end' while traversing from the start.

Solution

class Solution:
    def reorderList(self, head: Optional[ListNode]) -> None:
        if not head or not head.next:
            return
        
        # Step 1: Find middle using slow/fast pointers
        # When fast reaches end, slow is at middle
        slow, fast = head, head
        while fast and fast.next:
            slow = slow.next
            fast = fast.next.next
        
        # Step 2: Reverse second half starting from slow
        # prev starts as None (will become new tail)
        prev, curr = None, slow
        while curr:
            next_temp = curr.next  # Save next before overwriting
            curr.next = prev       # Reverse the pointer
            prev = curr            # Move prev forward
            curr = next_temp       # Move curr forward
        # After loop, prev points to new head of reversed list
        
        # Step 3: Merge first half and reversed second half
        first, second = head, prev
        # Alternate: take one from first, one from second
        while second.next:
            # Save next nodes before overwriting
            first_next = first.next
            second_next = second.next
            
            # Connect first to second
            first.next = second
            # Move first forward
            first = first_next
            
            # Connect second to next first node
            second.next = first
            # Move second forward
            second = second_next

Complexity

Time: O(n)
Space: O(1)

We traverse the list 3 times, but each traversal covers different parts or is proportional to n. Finding middle: O(n) - fast pointer visits ~n/2 nodes. Reversing: O(n/2) - only processes second half. Merging: O(n/2) - interleaves both halves. Total is O(n). Space is O(1) because we only use pointer variables regardless of list size.

Common Mistakes

Creating a cycle by not properly saving next nodes before rewiring pointers
Not handling the middle node correctly for odd-length lists (the middle element stays in place naturally)
Confusing the end condition - slow/fast loop should check 'fast and fast.next' not just 'fast'
Forgetting edge cases: empty list or single node returns early

Edge Cases

Empty list (head = None) - return immediately
Single node (head.next = None) - already 'reordered'
Two nodes - becomes head.next, head after reorder
Odd number of nodes - middle node stays as the pivot, first half has one more element

Connections

Reverse Linked List (#206) - uses identical reversal logic
Middle of Linked List (#876) - same slow/fast technique
Merge Two Sorted Lists (#21) - similar interleaving pattern but with sorted inputs
Rotate List (#61) - another three-step list manipulation but with different operations

Reverse Linked List #206

Three-pointer in-place reversal

Intuition

Imagine a train with cars connected in a line. Each car has a coupler pointing forward to the next car. To reverse the train, you don't detach the cars—you flip each coupler so it points backward instead. The key is: before you can flip a coupler's direction, you need to remember which car comes after it, otherwise you'd lose the rest of the train. This is why we need three hands: one to hold the current car, one to remember what's behind it, and one to peek ahead before we rewire the connection.

Why This Pattern?

A singly linked list gives only forward references. To reverse direction, we must manually flip each pointer while preserving access to the remaining list. The three pointers (prev, curr, next_temp) are the minimal state needed: prev records what's now behind us, curr is what we're currently reorienting, and next_temp prevents losing the rest of the chain before we overwrite the pointer.

Solution

def reverseList(head):
    prev = None      # Starts as None - becomes new tail
    curr = head     # Current node we're reversing
    
    while curr:
        next_temp = curr.next  # Save next node - don't lose the rest!
        curr.next = prev       # Flip pointer: now points backward
        prev = curr            # Move prev forward (this node is now "behind")
        curr = next_temp       # Move curr to saved next node
    
    return prev  # prev is the new head after full reversal

Complexity

Time: O(n)
Space: O(1)

We visit each of the n nodes exactly once and perform constant work per node. We only store three pointers regardless of list size—no recursion stack, no new data structures proportional to input.

Common Mistakes

Forgetting to save next_temp before overwriting curr.next, which loses the rest of the list
Not returning the new head (prev) after the loop
Trying to reverse in place without tracking all three pointers—inevitably losing access to unprocessed nodes

Edge Cases

Empty list (head = None): returns None naturally
Single node list: returns that node (prev becomes the node, curr becomes None)
Two-node list: correctly reverses both pointers

Connections

This is the foundation for reversing any portion of a linked list—used in 'Reverse Linked List II' (adds position tracking) and 'Palindrome Linked List' (uses this as a building block)
The three-pointer reversal concept appears in array reversal and string reversal algorithms too
Similar to the 'swap pairs' pattern where you must save state before rewiring pointers

Reverse Nodes in K-Group #25

Reversal with boundary checking - this is the 'localized reversal' pattern where you reverse a bounded segment while maintaining the list's overall structure. It combines: (1) boundary detection - checking if k nodes exist, (2) classic 3-pointer linked list reversal, and (3) reconnection - stitching the reversed segment back into the list.

Intuition

Think of this like reversing paragraphs in an essay while keeping sentences intact. You have a linked list (like a train of k-car segments), and you reverse each chunk of k cars. If there aren't k cars left at the end, you leave them as-is. The key insight: you're doing LOCAL reversal (the k nodes) while PRESERVING global structure (the connections between groups). It's like untangling a necklace - you work on small sections while keeping the whole structure coherent. The 'kth node' acts as your boundary marker - it tells you whether you can reverse or must stop.

Why This Pattern?

This pattern fits because the problem has natural BOUNDARIES - exactly k nodes per reversal. We're not reversing the entire list (that would just be standard reversal); we're doing multiple LOCAL reversals with a STOP condition. The structure is recursive: after reversing one group, the 'tail' of that reversed group becomes the starting point for the next group. The boundary check makes this fundamentally different from simple reversal.

Solution

class Solution:
    def reverseKGroup(self, head: ListNode, k: int) -> ListNode:
        # Dummy node simplifies edge cases - acts as a "virtual" previous node
        dummy = ListNode(0, head)
        group_prev = dummy  # Marks the node BEFORE the current group
        
        while True:
            # STEP 1: Find the kth node from group_prev
            # If fewer than k nodes remain, we're done
            kth = self.get_kth_node(group_prev, k)
            if not kth:
                break
            
            # STEP 2: Store the node AFTER this group (will become new "next")
            group_next = kth.next
            
            # STEP 3: Reverse exactly k nodes
            # prev starts at "group_next" (the node after our group)
            # curr starts at the first node of the group to reverse
            prev, curr = group_next, group_prev.next
            while curr != group_next:
                next_temp = curr.next  # Save next before overwriting
                curr.next = prev       # Reverse the link
                prev = curr            # Move prev forward
                curr = next_temp       # Move curr forward
            
            # STEP 4: Reconnect the reversed group to the list
            # group_prev.next was pointing to the old head, now points to new head (kth)
            # kth was the tail during reversal, now becomes the head
            next_head = group_prev.next  # This is now the TAIL after reversal
            group_prev.next = kth        # Connect previous group to new head
            group_prev = next_head       # Move to the tail for next iteration
        
        return dummy.next
    
    def get_kth_node(self, start, k):
        # Traverse k nodes to find the boundary
        current = start
        for _ in range(k):
            if not current:
                return None
            current = current.next
        return current

Complexity

Time: O(n) - We visit each node a constant number of times. Each node is: (1) counted once when finding the kth node, (2) touched once during reversal, and (3) potentially visited during reconnection. The 'while True' loop with the kth check ensures we don't process nodes multiple times.
Space: O(1) - Only using a fixed number of pointers regardless of input size. No recursion, no extra data structures. The dummy node is just for convenience, not extra space proportional to n.

We can't do better than O(n) because every node must be visited at least once to determine grouping and potentially reversed. The O(1) space is achievable because we manipulate links in-place using the 3-pointer technique - we're essentially 'rotating' pointers rather than building new structures.

Common Mistakes

Forgetting to save group_next before reversal - after reversal, kth.next changes and you lose the connection to the rest of the list
Not handling the last incomplete group - the get_kth_node check prevents reversing when fewer than k nodes remain
Confusing the reversal direction - prev should start at 'group_next' (the node AFTER the group), not at None
Forgetting to update group_prev after each reversal - you need to move to what was the first node (now the tail)
Not using a dummy node - makes the first group reversal messy since there's no node before the head

Edge Cases

k = 1 - no reversal needed, should return original list
k >= length of list - reverses entire list
k equals exactly the list length - single complete reversal
Empty list (head = None) - should return None
Single node list - returns same node if k > 1

Connections

Reverse Linked List (#206) - This is the fundamental reversal operation used on each k-group
Swap Nodes in Pairs (#24) - Similar pattern but k=2 fixed, simpler case of this problem
Merge K Sorted Lists (#23) - Uses dummy node pattern for similar structural reasons
Rotate List (#61) - Another 'bounded operation with reconnection' problem

Trees (15)

Balanced Binary Tree #110

Bottom-up recursion with early termination (post-order traversal)

Intuition

Think of a balanced tree like a well-designed building - no single column should be dramatically taller than its neighbor, or the structure becomes unstable. The 'balance' here is about equilibrium: at EVERY node in the tree, the left and right subtrees must have heights that differ by at most 1. It's like checking that every floor of a building has roughly equal ceiling heights on both sides. The key insight is that we need to check from the bottom up - if the foundation (leaves) is unstable, nothing above can be stable.

Why This Pattern?

The problem demands checking every subtree's balance AND computing its height simultaneously. Post-order (left, right, node) is perfect because we need information from children before we can make decisions about the parent. The '-1 sentinel pattern' is elegant here: instead of returning both (is_balanced, height), we return height normally, or -1 to signal 'unbalanced' - this single value carries both pieces of information and lets us short-circuit the moment we find an imbalance.

Solution

class Solution:
    def isBalanced(self, root: Optional[TreeNode]) -> bool:
        # Helper returns height if balanced, -1 if unbalanced
        def check(node):
            if not node:
                return 0  # Empty tree has height 0, is balanced
            
            # Recursively check left subtree
            left_height = check(node.left)
            if left_height == -1:
                return -1  # Left subtree already unbalanced, propagate failure up
            
            # Recursively check right subtree  
            right_height = check(node.right)
            if right_height == -1:
                return -1  # Right subtree already unbalanced
            
            # At this point both subtrees are balanced - check current node
            if abs(left_height - right_height) > 1:
                return -1  # Current node violates balance condition
            
            # Return height of current node (max of children + 1 for current level)
            return max(left_height, right_height) + 1
        
        # If check returns -1, tree is unbalanced; otherwise balanced
        return check(root) != -1

Complexity

Time: O(n) - We visit each node exactly once. The key insight is that we DON'T recompute heights at every level (which would be O(n²)). By computing heights bottom-up and returning early on imbalance, we maintain linear time.
Space: O(h) where h is the height of the tree, due to the recursion stack. In the worst case (skewed tree), this is O(n); in a balanced tree, it's O(log n).

We can't do better than O(n) because we must at least examine every node to guarantee balance - a single deep leaf could be the problem. The recursion stack corresponds to the 'call chain' down the tree; we only need to keep track of the path from root to current node, not the entire tree.

Common Mistakes

Using top-down approach: computing height at each node from scratch leads to O(n²) - don't recalculate heights repeatedly
Forgetting to check BOTH conditions: subtrees must be individually balanced AND have height difference ≤ 1
Not returning proper height for balanced nodes - need max(left, right) + 1, not just left + 1 or right + 1
Confusing depth (distance from root) with height (distance to deepest leaf) - height is what matters here

Edge Cases

Empty tree (root is None) - should return True, it's balanced by definition
Single node tree - balanced, height = 1
Completely skewed tree (linked list) - unbalanced at root if depth > 1
Tree with only left or only right children throughout

Connections

Same concept as 'Maximum Depth of Binary Tree' (LeetCode 104) but with an extra balance check - uses similar bottom-up height computation
Relates to 'Diameter of Binary Tree' (LeetCode 543) - both compute heights bottom-up but answer different questions
The sentinel return value (-1) pattern appears in checking subtree properties - similar logic to validating BST where we return (is_valid, min, max) from each recursive call

Binary Tree Level Order Traversal #102

BFS (Breadth-First Search) on a tree using a queue, with level-by-level processing

Intuition

Think of this like ripples spreading outward from a stone dropped in water. When you process a tree breadth-first, you're essentially flooding it level by level - all nodes at depth 0 get visited first (the root), then all nodes at depth 1, then depth 2, and so on. This is fundamentally different from depth-first search which goes deep first (like exploring one path fully before backtracking). The key insight: a queue naturally implements this 'wave' behavior because nodes at the current depth are processed before their children get added to the queue.

Why This Pattern?

A queue is FIFO - first in, first out. When we add children to the queue, they wait their turn. By processing exactly the number of nodes that were in the queue at the start of each level, we ensure all nodes at depth d are processed before any node at depth d+1. This 'batch processing' per level is what gives us the分层 result.

Solution

from collections import deque

def levelOrder(root):
    if not root:
        return []
    
    result = []
    queue = deque([root])
    
    while queue:
        # Snapshot how many nodes are at THIS level
        level_size = len(queue)
        current_level = []
        
        # Process exactly 'level_size' nodes - all nodes at current depth
        for _ in range(level_size):
            node = queue.popleft()
            current_level.append(node.val)
            
            # Add children for NEXT level (they'll be processed later)
            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)
        
        # Finished this entire level
        result.append(current_level)
    
    return result

Complexity

Time: O(n)
Space: O(w) where w is max width of tree (can be up to n/2 in worst case for complete binary tree) + O(h) for the result where h is height

Every node is visited exactly once and added to/popped from the queue exactly once - that's O(n). For space, the queue holds at most one full level of nodes. In a complete binary tree, the bottom level has roughly n/2 nodes, so that's our worst-case space. The result also stores n values across h levels.

Common Mistakes

Forgetting to capture len(queue) BEFORE the for loop - if you check len(queue) inside, it changes as you popleft
Not checking for None root at the start - leads to empty result anyway but cleaner to handle explicitly
Confusing with DFS/preorder - that gives depth-first, not level-order
Using a list as queue instead of deque - O(n) pop(0) vs O(1) popleft

Edge Cases

Empty tree (root is None) - should return []
Single node tree - returns [[val]]
Skewed tree (linked list) - still works, processes one node per level
Perfect binary tree - maximum width, queue holds n/2 nodes at bottom level

Connections

Same BFS queue pattern as 'Binary Tree Right Side View' (#199) but with full level tracking
Similar to 'Average of Levels' (#637) - same level-iteration structure
Uses same deque structure as 'Maximum Depth of Binary Tree' (#104) BFS approach
Contrast with DFS solutions like 'Maximum Depth' or 'Path Sum' - different exploration order

Binary Tree Maximum Path Sum #124

Post-order DFS with state propagation. Each recursive call returns the maximum path sum starting from that node and going DOWN (single branch), while updating a global answer for paths that go THROUGH the node (both branches).

Intuition

Think of each node as a junction where energy/power can flow through. A path through a node can either: (1) flow from one child through the node to the other child (an inverted V shape) — this is a complete path but can't extend upward to parents, or (2) flow from the node down through exactly one child (a straight line) — this can be extended upward to contribute to a larger path. It's like designing electrical circuits: at each junction, you either form a closed loop (both branches used) or you pass current upward through exactly one branch. We need BOTH the best single-branch contribution we can make to our parent AND the best two-branch path we can form right here. We process bottom-up so children report their best single-branch contribution first, then we combine them.

Why This Pattern?

The problem has a natural bottom-up dependency: to know what a node can contribute to its parent, we must first know what each child can contribute. The key insight is that a path through a node either uses one child (extensible upward) or both children (forms a complete path at this node, not extensible). This two-part return value (one for extending up, one for the global answer) naturally maps to post-order traversal.

Solution

class Solution:
    def maxPathSum(self, root: Optional[TreeNode]) -> int:
        # Track the global maximum; must start very low since node values can be negative
        self.max_sum = float('-inf')
        
        def dfs(node):
            if not node:
                return 0  # Base case: empty tree contributes nothing
            
            # Recursively get best single-branch contribution from each subtree
            # max(0, ...) means we can OPTIONALLY include a child — if it adds 
            # negativity, we'd rather not include it (path must have at least one node)
            left_gain = max(0, dfs(node.left))
            right_gain = max(0, dfs(node.right))
            
            # Path that goes THROUGH this node: one child -> this node -> other child
            # This forms a complete path but cannot extend upward to parents
            through_node = left_gain + node.val + right_gain
            self.max_sum = max(self.max_sum, through_node)
            
            # Return what we can contribute to our parent: best path going DOWN from here
            # We can only pick ONE child to extend upward (otherwise we'd have a cycle)
            return max(left_gain, right_gain) + node.val
        
        dfs(root)
        return self.max_sum

Complexity

Time: O(n) where n is the number of nodes. We visit every node exactly once, doing O(1) work per node.
Space: O(h) where h is the height of the tree. This is the recursion stack depth. In the worst case (skewed tree), h = n → O(n); in balanced tree, h = log(n).

We must examine every node to guarantee finding the max path — the path could be arbitrarily located. The space is just the call stack because we only maintain O(1) extra state per recursive call. We can't reduce space without fundamentally changing the algorithm since we need the full recursive descent to compute bottom-up values.

Common Mistakes

Not handling negative node values correctly — using 0 in max(0, child_gain) is crucial because we can skip entire subtrees that would reduce our sum
Forgetting the global maximum needs initialization to -infinity (or a very small number), not 0 — a tree with all negative values should return the largest (least negative) single node
Returning the wrong value from the recursive function — must return single-branch gain (one child), not the two-branch path, otherwise you'd create cycles when connecting to parent
Not including the current node's value in the return — the path going up must include the current node itself

Edge Cases

Tree with single node — should return that node's value
Tree with all negative values — should return the maximum (least negative) node
Tree with one negative child and one positive — must correctly choose to exclude the negative branch
Root with very large positive values deep in one subtree — the path might not even include the root

Connections

House Robber III uses the same 'take one branch or both' logic when deciding whether to rob adjacent houses in a binary tree Diameter of Binary Tree (#543) is structurally identical but sums lengths instead of node values — same pattern of computing single-branch vs two-branch returns Binary Tree Camera (#968) also uses post-order with state return to coordinate parent-child decisions based on children's states All three problems share the core insight: at each node, you either extend a path upward through exactly one child, or you close the path locally using both children.

Binary Tree Right Side View #199

Level-order traversal (BFS) or Depth-first search with right-first ordering

Intuition

Imagine standing on the right side of a binary tree and taking a photograph. What do you see? At each horizontal 'depth level', you see the rightmost node. Here's the key insight: if a right subtree exists at some level, it blocks the left subtree from view at that level. Think of it like a shadow cast from the right - only the rightmost nodes at each depth catch the 'light'. This is equivalent to asking: for each depth value, what's the last node you'd encounter if you scanned that level left-to-right?

Why This Pattern?

The problem has a natural 'layered' structure - we need exactly one node per depth level. BFS naturally processes level-by-level, so the last node at each level is always the rightmost. Alternatively, if using DFS with a right-then-left order, the FIRST time we encounter a new depth, that node is guaranteed to be the rightmost (because we visited the right side first). This is like the 'first appearance' pattern - the first node we see at each depth in right-first traversal must be visible from the right.

Solution

from collections import deque
from typing import Optional, List

# Definition for a binary tree node.
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

class Solution:
    def rightSideView(self, root: Optional[TreeNode]) -> List[int]:
        """
        BFS approach: Process level by level, capture the last node at each level.
        """
        if not root:
            return []
        
        result = []
        queue = deque([root])
        
        while queue:
            level_size = len(queue)  # Number of nodes at current depth
            
            for i in range(level_size):
                node = queue.popleft()
                
                # Last node in this level = rightmost node = visible from right
                if i == level_size - 1:
                    result.append(node.val)
                
                # Add children for next level (left first, then right)
                if node.left:
                    queue.append(node.left)
                if node.right:
                    queue.append(node.right)
        
        return result

# Alternative DFS solution (commented out):
# class Solution:
#     def rightSideView(self, root: Optional[TreeNode]) -> List[int]:
#         result = []
#         
#         def dfs(node, depth):
#             if not node:
#                 return
#             
#             # First time we reach this depth = rightmost node (we go right first!)
#             if depth == len(result):
#                 result.append(node.val)
#             
#             # Visit RIGHT first, then LEFT - this ensures rightmost nodes are seen first
#             dfs(node.right, depth + 1)
#             dfs(node.left, depth + 1)
#         
#         dfs(root, 0)
#         return result

Complexity

Time: O(n)
Space: O(w) for BFS where w is max width of tree, O(h) for DFS where h is height (recursion stack)

We must visit every node in the tree at least once to determine the rightmost node at each depth. There's no way around this because a node buried deep on the left could theoretically be the rightmost at its depth if no right subtree exists at that level. In the worst case (left-skewed tree), we need to visit all n nodes to find the single visible node per depth.

Common Mistakes

Using standard left-to-right DFS traversal (root-left-right) instead of right-first - this gives the LEFT side view, not right
Not tracking depth properly in DFS - you need to know when you've seen a new depth for the first time
Confusing the problem with 'find all rightmost nodes' - you only want ONE per depth level
Forgetting to handle empty tree edge case

Edge Cases

Empty tree (return empty list)
Single node tree (return [root.val])
Left-skewed tree (all left children) - still returns nodes from left side because nothing blocks them
Right-skewed tree (all right children) - returns rightmost path
Complete/perfect binary tree - returns rightmost node at each level
Tree with only one branch (linked-list shaped) - returns all nodes

Connections

LeetCode 104 'Maximum Depth of Binary Tree' - uses similar level-order traversal
LeetCode 515 'Find Largest Value in Each Tree Row' - same BFS level-order pattern, but keeps max instead of last
LeetCode 116/117 'Populating Next Right Pointers' - level-order with tracking of next node
LeetCode 102 'Binary Tree Level Order Traversal' - the foundational level-order pattern this builds on
LeetCode 637 'Average of Levels' - same level-order traversal, different aggregation

Construct Binary Tree from Preorder and Inorder Traversal #105

Recursive divide-and-conquer with index tracking and hashmap lookup

Intuition

Think of this like archaeology at two dig sites. Preorder (root-first) tells you 'here's the family head,' and inorder (left-root-right) tells you 'here's where families split.' The first element in preorder is ALWAYS the root. When you find that root in inorder, everything to its LEFT is the left subtree (came before it), everything to its RIGHT is the right subtree (came after it). Now you know the SIZE of the left subtree from inorder, so in preorder you can slice: after the root, the next N elements belong to left subtree (where N = size of left portion in inorder), and the rest belong to right subtree. It's a recursive family reconstruction: find the patriarch, see where they stand in the family line, then apply to each branch.

Why This Pattern?

The traversals encode structural information positionally: preorder gives root-first ordering, inorder gives the left-root-right split point. Together they uniquely determine the tree structure. We use a hashmap to achieve O(1) root lookup, converting a recursive structure problem into a simple index-manipulation problem.

Solution

class Solution:
    def buildTree(self, preorder: List[int], inorder: List[int]) -> Optional[TreeNode]:
        # Hashmap: value -> its index in inorder (for O(1) lookup)
        inorder_index = {val: i for i, val in enumerate(inorder)}
        
        def build(pre_left, pre_right, in_left, in_right):
            # Base case: empty segment
            if pre_left > pre_right:
                return None
            
            # Root is first element in this preorder segment
            root_val = preorder[pre_left]
            root = TreeNode(root_val)
            
            # Find where root sits in the inorder segment
            in_idx = inorder_index[root_val]
            
            # Number of nodes in left subtree = elements to left of root in inorder
            left_size = in_idx - in_left
            
            # Build left subtree: preorder[pre_left+1 to pre_left+left_size]
            # corresponds to inorder[in_left to in_idx-1]
            root.left = build(pre_left + 1, pre_left + left_size, in_left, in_idx - 1)
            
            # Build right subtree: remaining preorder elements
            root.right = build(pre_left + left_size + 1, pre_right, in_idx + 1, in_right)
            
            return root
        
        return build(0, len(preorder) - 1, 0, len(inorder) - 1)

Complexity

Time: O(n)
Space: O(n) for hashmap + O(h) for recursion stack, where h is tree height

We visit each of the n nodes exactly once. The hashmap gives O(1) lookup to find where each root sits in inorder. The recursion divides the problem in half each time, but the total work across all levels sums to n. We can't do better than O(n) because we must create n TreeNode objects regardless.

Common Mistakes

Trying to slice arrays repeatedly instead of using indices (creates O(n²) memory overhead)
Forgetting the base case when segment is empty (pre_left > pre_right)
Off-by-one errors when calculating left_size or the boundaries for recursive calls
Not accounting for when left or right subtree is empty

Edge Cases

Single node tree (root only)
Completely skewed tree (linked list shape)
Tree where root has only left child or only right child
Empty input arrays (though problem guarantees non-empty)

Connections

S
a
m
e
r
e
c
o
n
s
t
r
u
c
t
i
o
n
p
a
t
t
e
r
n
a
s
'
C
o
n
s
t
r
u
c
t
B
i
n
a
r
y
T
r
e
e
f
r
o
m
I
n
o
r
d
e
r
a
n
d
P
o
s
t
o
r
d
e
r
T
r
a
v
e
r
s
a
l
'
(
#
1
0
6
)
-
j
u
s
t
s
w
a
p
p
r
e
o
r
d
e
r
f
o
r
p
o
s
t
o
r
d
e
r
.
U
s
e
s
s
a
m
e
d
i
v
i
d
e
-
b
y
-
r
o
o
t
-
p
o
s
i
t
i
o
n
i
n
s
i
g
h
t
a
s
'
M
a
x
i
m
u
m
B
i
n
a
r
y
T
r
e
e
'
(
#
6
5
4
)
.
T
h
e
h
a
s
h
m
a
p
o
p
t
i
m
i
z
a
t
i
o
n
i
s
a
c
o
m
m
o
n
t
e
c
h
n
i
q
u
e
s
e
e
n
i
n
'
V
a
l
i
d
a
t
e
B
i
n
a
r
y
S
e
a
r
c
h
T
r
e
e
'
(
#
9
8
)
f
o
r
O
(
1
)
s
u
b
t
r
e
e
b
o
u
n
d
s
c
h
e
c
k
i
n
g
.

Count Good Nodes in Binary Tree #1448

DFS with path state tracking. We maintain the maximum value encountered along the current root-to-node path as we traverse.

Intuition

Think of this like a 'peak detector' on a mountain range. As you walk from the root down any path, you're tracking the highest elevation seen so far. A node is 'good' if it's a new peak — its value is the highest point reached so far on that path. Like how a mountain climber might say 'I've reached a new summit' when they climb higher than anything before, we're counting nodes that are higher than all their ancestors. We carry the running maximum down each branch, updating it when we find a higher value.

Why This Pattern?

We need to evaluate every root-to-node path and determine if the current node's value is the maximum on that path. DFS naturally explores all paths, and by passing the current maximum as a parameter, we maintain the necessary state without storing entire paths. This is optimal because we must visit every node anyway to check if it's a 'good' node.

Solution

class Solution:
    def goodNodes(self, root: TreeNode) -> int:
        def dfs(node, max_so_far):
            if not node:
                return 0
            
            # A node is good if its value is >= all values on the path from root
            # If current node's value is greater than or equal to max_so_far,
            # then no ancestor has a higher value, making this a "good node"
            count = 1 if node.val >= max_so_far else 0
            
            # Propagate the maximum value seen so far down the tree
            new_max = max(max_so_far, node.val)
            
            # Recurse on both children, carrying forward the updated maximum
            count += dfs(node.left, new_max)
            count += dfs(node.right, new_max)
            
            return count
        
        # Start with root's value as initial maximum
        return dfs(root, root.val)

Complexity

Time: O(n) — We must visit every single node in the tree to determine if it's a good node. There's no way to skip any node because each node's 'goodness' depends only on the path to it, which requires examining that path.
Space: O(h) — The recursion stack depth equals the tree height. In a balanced tree this is O(log n); in a worst-case skewed tree (like a linked list), it's O(n). We only store one integer (the running maximum) per level of recursion.

Think of it like hiking all trails on a mountain. You can't skip any trail segment because you need to traverse it to know what peaks you'll encounter. The space is like your memory of the highest point on your current trail — you only need to remember one number as you go deeper, not a list of everything you've seen.

Common Mistakes

Using > instead of >= — a node is good if it's equal to the max, not just greater than
Not updating max_so_far before recursing — the current node's value must be included in the path for its children
Forgetting the root case — handle when root is None (though problem guarantees non-empty)
Confusing this with a tree-wide maximum — it's path-specific, not global

Edge Cases

Single node tree: root is always good (count = 1)
All decreasing values: every node is good because each is >= all ancestors
All same values: every node is good due to >= comparison
Negative values: algorithm works correctly since we're tracking maximum
Deeply skewed tree: recursion may hit stack limit, consider iterative approach

Connections

Similar to 'Maximum Path Sum' (binary tree, pass running state) — both track a value as they traverse
Like 'Path Sum' problems where you carry accumulated sum down the tree
Compare with 'Longest ZigZag Path' — both use DFS with path-specific state
Related to 'Diameter of Binary Tree' in that both fundamentally require visiting all nodes but only O(h) extra space

Diameter of Binary Tree #543

Post-order DFS with global state tracking. This pattern applies when: (1) you need information from children before computing parent results, (2) the answer isn't necessarily at the root, and (3) you need to track a global maximum while doing local computations at each node.

Intuition

Think of the tree as a network of branches. The diameter is the longest distance between any two leaf nodes in this network. Here's the key insight: the longest path through ANY particular node is simply the height of its left subtree PLUS the height of its right subtree. But here's the subtlety - the diameter might NOT go through the root. It could be lurking in any subtree. So we need to: (1) compute height of each subtree (longest path from that node down to a leaf), (2) at each node, check if the path through this node is the longest we've seen, and (3) pass the height upward to parent nodes. This is like calculating stress at each joint in a structure - the maximum stress (diameter) might occur anywhere, but we can compute it locally at each joint while traversing the structure.

Why This Pattern?

A binary tree diameter is inherently a bottom-up property. The height of a node depends on its children's heights, and the diameter through a node depends on both children's heights. We must process children first (post-order) to have the data we need. The global diameter accumulates as we traverse because we don't know which subtree contains the maximum until we've checked all of them.

Solution

class Solution:
    def diameterOfBinaryTree(self, root: Optional[TreeNode]) -> int:
        # Global tracker for maximum diameter found so far
        self.diameter = 0
        
        def get_height(node):
            # Base case: empty tree has height 0
            if not node:
                return 0
            
            # Post-order: process children BEFORE computing current node
            # Recursively get heights of left and right subtrees
            left_height = get_height(node.left)
            right_height = get_height(node.right)
            
            # The longest path THROUGH this node = left height + right height
            # This represents the distance between deepest leaf in left 
            # and deepest leaf in right, passing through current node
            self.diameter = max(self.diameter, left_height + right_height)
            
            # Return height of current subtree: 1 (current node) + max child height
            return 1 + max(left_height, right_height)
        
        get_height(root)
        return self.diameter

Complexity

Time: O(n) - We visit every node exactly once. Each node requires O(1) work: two recursive calls and some max comparisons. There's no way to do better because the diameter could involve any node - we'd have to examine all nodes to be sure.
Space: O(h) where h is the height of the tree. This is the recursion stack depth. In the worst case (skewed tree), h = n giving O(n); in balanced tree, h = log(n) giving O(log n). We don't use extra space proportional to n because we only store one value per stack frame.

Time can't be less than O(n) because you must examine every node - the diameter could be hiding in any subtree and you need information from all nodes to find it. Space is O(h) because at any moment, you're only holding the path from root to current node in the call stack - you don't need to remember anything about branches you've already finished processing.

Common Mistakes

Confusing diameter with height - diameter is the WIDEST distance (through a node), height is the LONGEST distance (from node to leaf)
Forgetting to add edges correctly - diameter through a node is left_height + right_height (edges between left subtree and right subtree through current node), not heights + 1
Trying to return diameter from the recursive function - each node should return its HEIGHT to parent, while tracking diameter globally
Not handling empty trees - should return 0 for null root

Edge Cases

Empty tree (root = None) returns 0
Single node tree returns 0 (no edges between nodes)
Skewed tree (linked list shape) - diameter is actually the height minus 1, all in one branch
Tree with only two nodes - diameter is 1 (one edge connecting them)

Connections

S
a
m
e
c
o
r
e
i
n
s
i
g
h
t
a
s
'
M
a
x
i
m
u
m
D
e
p
t
h
o
f
B
i
n
a
r
y
T
r
e
e
'
(
h
e
i
g
h
t
c
a
l
c
u
l
a
t
i
o
n
)
b
u
t
w
i
t
h
a
n
a
d
d
i
t
i
o
n
a
l
d
i
m
e
n
s
i
o
n
-
w
e
'
r
e
t
r
a
c
k
i
n
g
b
o
t
h
h
e
i
g
h
t
f
o
r
p
a
r
e
n
t
c
o
m
p
u
t
a
t
i
o
n
A
N
D
d
i
a
m
e
t
e
r
a
s
a
s
i
d
e
e
f
f
e
c
t
.
A
l
s
o
c
o
n
n
e
c
t
s
t
o
'
L
o
n
g
e
s
t
U
n
i
v
a
l
u
e
P
a
t
h
'
(
s
i
m
i
l
a
r
t
r
a
v
e
r
s
a
l
p
a
t
t
e
r
n
t
r
a
c
k
i
n
g
g
l
o
b
a
l
m
a
x
)
.
T
h
e
p
o
s
t
-
o
r
d
e
r
+
g
l
o
b
a
l
t
r
a
c
k
e
r
p
a
t
t
e
r
n
a
p
p
e
a
r
s
i
n
m
a
n
y
t
r
e
e
p
r
o
b
l
e
m
s
w
h
e
r
e
a
n
s
w
e
r
i
s
n
'
t
n
e
c
e
s
s
a
r
i
l
y
a
t
r
o
o
t
.

Invert Binary Tree #226

Depth-First Search (DFS) with recursion - specifically post-order traversal where we process children before the parent. This is a divide-and-conquer approach.

Intuition

Think of this like reflecting a binary tree in a vertical mirror. Every left child becomes a right child and vice versa - it's a horizontal flip. At each node, you're simply swapping the 'direction' of the signal going left vs right. The entire tree is just a collection of these local swaps applied recursively to every subtree.

Why This Pattern?

Binary trees are inherently recursive structures - each subtree IS itself a binary tree. To invert the whole tree, you can: 1) invert the left subtree, 2) invert the right subtree, 3) swap the two subtrees at the current node. This decomposes one big problem into identical smaller problems until you hit the base case (null node).

Solution

class Solution:
    def invertTree(self, root):
        # Base case: empty tree or leaf node's child (null)
        if not root:
            return None
        
        # Swap left and right children at current node
        # This is the key operation - we're 'flipping' this node's children
        root.left, root.right = root.right, root.left
        
        # Recursively invert both subtrees
        # The recursion handles all descendants - we just swap at each level
        self.invertTree(root.left)
        self.invertTree(root.right)
        
        return root

Complexity

Time: O(n) where n is the number of nodes
Space: O(h) where h is the height of the tree (worst case O(n) for skewed trees, O(log n) for balanced)

We must visit every single node to swap its children - there's no way around this because every node's position changes relative to its parent. For space: the recursion stack depth equals tree height. A balanced tree with log n levels needs only log n stack frames, but a completely skewed tree (like a linked list) needs n stack frames.

Common Mistakes

Forgetting the base case (if not root: return None) and getting NoneType errors
Attempting to 'move' nodes rather than simply swapping references
Not understanding that swapping can happen before OR after recursing - both work
Confusing tree inversion with tree reversal (different operation)

Edge Cases

Empty tree (root is None) - should return None
Single node tree - trivial, returns the same node
Completely skewed tree (all left children or all right children) - still works but hits recursion limit on very deep trees
Very wide/balanced tree - BFS version might use more memory than DFS

Connections

Same recursive pattern as 'Maximum Depth of Binary Tree' and 'Symmetric Tree'
This IS the solution to 'Mirror of Binary Tree' (same problem, different name)
Related to 'Same Tree' - checking if two trees are identical uses similar traversal
Level-order (BFS) version uses same pattern as 'Binary Tree Level Order Traversal'

Kth Smallest Element in a BST #230

In-order traversal with early termination

Intuition

Think of a BST as a sorted array that's been 'folded' into a tree shape. The BST property (left subtree < root < right subtree) means if you read it in the right order, you get sorted numbers. That's exactly what in-order traversal does: left -> root -> right. It's like unwrapping a folded sort - you're just reading the tree in its natural sorted order and stopping at position k. The kth node you encounter IS the kth smallest.

Why This Pattern?

The BST property guarantees that in-order traversal visits nodes in ascending sorted order. We don't need to sort anything or collect all nodes - we just need to 'read' the tree correctly and stop early. The iterative approach simulates the recursive call stack but lets us exit as soon as we find the answer, avoiding unnecessary exploration of the right subtree.

Solution

def kthSmallest(root, k):
    # Iterative in-order traversal - stops as soon as we find kth element
    stack = []
    current = root
    count = 0
    
    while current or stack:
        # Go all the way left - like descending a ladder
        while current:
            stack.append(current)
            current = current.left
        
        # Process current node (the next smallest)
        current = stack.pop()
        count += 1
        
        if count == k:
            return current.val
        
        # Move to right subtree to continue
        current = current.right
    
    return None  # Edge case: invalid k

Complexity

Time: O(H + k) where H is tree height
Space: O(H) for the stack, where H is tree height

We descend H levels to reach the leftmost (smallest) element, then process exactly k nodes. In a balanced BST, H = log(n), so O(log n + k). In the worst case (completely skewed tree like a linked list), H = n, so O(n). The key insight: we never visit more nodes than necessary - we stop at k, not at the entire tree.

Common Mistakes

Treating k as 0-indexed when it's 1-indexed (k=1 means first smallest)
Using recursive solution which uses O(H) call stack space unnecessarily
Not handling invalid k (k > n or k <= 0)
Confusing the problem with kth largest (would need reverse in-order or n-k+1)

Edge Cases

k equals the total number of nodes (return largest element)
k = 1 (return smallest element, the leftmost node)
Empty tree or None root
Single node tree with k=1
k is larger than tree size

Connections

Same core insight as 'Binary Tree Inorder Traversal' - BST in-order gives sorted order
Similar to 'Validate BST' - both rely on BST property
Related to 'Kth Largest Element in an Array' but tree version exploits structural property instead of sorting
Compare with 'Find Mode in Binary Search Tree' - both traverse in sorted order

Lowest Common Ancestor of a BST #235

Binary Search on Tree (using BST property to prune half the search space at each step)

Intuition

Think of a BST as a sorted hierarchy - like an org chart sorted by employee ID. The two nodes p and q each have a 'path' from the root. The LCA is where these two paths first meet going upward. Here's the key insight: as you traverse from the root, you're essentially asking 'are both nodes to the left of me, or both to the right?' If they're both on one side, you know the LCA must be in that subtree. The moment one node is on the left and one is on the right, you've found the divergence point - that's your LCA because p and q's lowest common ancestor must be an ancestor of both, and this node is the deepest one that satisfies that.

Why This Pattern?

The BST property (left < node < right) gives us directional information. Unlike a generic tree where we'd need to find paths first and compare them, here we can use the sorted values to directly navigate to the LCA. Each step eliminates half the tree - this is the 'search' aspect. We're not searching for a single target, but rather searching for the point where two targets 'diverge' in their direction from the root.

Solution

def lowestCommonAncestor(self, root: 'TreeNode', p: 'TreeNode', q: 'TreeNode') -> 'TreeNode':
    # Start at root, navigate down using BST property
    current = root
    
    while current:
        # Both nodes are in the left subtree - LCA must be in left subtree
        if p.val < current.val and q.val < current.val:
            current = current.left
        # Both nodes are in the right subtree - LCA must be in right subtree  
        elif p.val > current.val and q.val > current.val:
            current = current.right
        else:
            # One on left, one on right (or one IS current) - this is the divergence point
            return current
    
    return current  # Should never reach here with valid BST and nodes

Complexity

Time: O(h) where h is height of tree
Space: O(1) - only using a pointer, no extra space

We traverse only one path from root to LCA - at most h nodes. In the worst case (skewed tree), h = n giving O(n), but in a balanced BST it's O(log n). We never visit a node twice because the BST property tells us exactly which direction to go.

Common Mistakes

Forgetting that p or q could BE the LCA (the case where one node is an ancestor of the other) - this is handled by the else branch since p.val == current.val or q.val == current.val will fail both comparisons
Confusing this with the generic binary tree LCA problem which requires parent pointers or post-order traversal
Using p.val == current.val alone instead of the range check - need to catch both 'split' cases

Edge Cases

p and q are the same node - that node is its own LCA
p is ancestor of q (or vice versa) - the ancestor is the LCA
p and q are at opposite ends of the tree - root is LCA
Root is one of the nodes - root is the LCA

Connections

LeetCode 236 (Lowest Common Ancestor of Binary Tree) - this is the general version without BST property, requires different approach
LeetCode 701 (Insert into BST) - uses same BST navigation pattern
LeetCode 700 (Search in BST) - same traversal logic
The BST property here enables O(h) vs O(n) for generic tree

Maximum Depth of Binary Tree #104

DFS (Depth-First Search) with recursion / divide and conquer

Intuition

Think of this like measuring how tall a tree grows — from the trunk (root) down to the farthest leaf. You're essentially asking: 'How many levels of branches are there, counting from the top?' Imagine a signal propagating downward from the root — the deepest leaf is where the signal takes the longest path. Each node you traverse adds 1 to your count, and you want the maximum path length.

Why This Pattern?

Trees are recursive structures — each node's children are themselves (smaller) trees. This means we can solve the problem by: (1) finding the max depth of the left subtree, (2) finding the max depth of the right subtree, (3) taking the maximum and adding 1 for the current node. This is the natural fit because the answer for a tree depends exactly on the answers for its subtrees.

Solution

def maxDepth(root):
    # Base case: empty tree has depth 0
    if not root:
        return 0
    
    # Recursively get depth of left and right subtrees
    left_depth = maxDepth(root.left)
    right_depth = maxDepth(root.right)
    
    # The depth of current node = deeper child + 1 (for current node)
    return max(left_depth, right_depth) + 1

Complexity

Time: O(n)
Space: O(h) where h is the height of the tree (recursion stack depth)

We must visit every node at least once to know the maximum depth — you can't determine depth without checking all paths. The space is the recursion stack, which goes as deep as the tree height. In a balanced tree this is O(log n), in a completely skewed tree it's O(n).

Common Mistakes

Forgetting the base case: empty tree (root is None) should return 0, not crash
Not adding 1 for the current node — you must count the root itself
Returning just one subtree's depth instead of taking the max of both
Confusing with minimum depth (which requires BOTH children to be checked)

Edge Cases

Empty tree (root = None) → return 0
Single node tree → return 1
Completely skewed tree (linked list shape) → still works, just traverses all nodes

Connections

Minimum Depth of Binary Tree (same structure, trickier — requires both children)
Balanced Binary Tree (uses similar recursive depth calculation)
Diameter of Binary Tree (also recursively computes depths)
Invert Binary Tree (same DFS traversal pattern)

Same Tree #100

Depth-First Search (DFS) / Structural Recursion

Intuition

Imagine you're comparing two trees in a forest - you need to check if every branch, twig, and leaf is in exactly the same position. Two trees are identical if: (1) they're both empty, OR (2) they both have roots with the same value AND their left branches are identical AND their right branches are identical. It's like the mathematical definition of tree equality - structural isomorphism plus value matching. Think of it as a recursive mirror: you're checking 'are these two subtrees the same?' at every level.

Why This Pattern?

Trees are inherently recursive data structures - a tree is made of a root plus a left subtree and a right subtree. The most natural way to compare them is to break the problem down recursively: 'Are tree A and tree B identical?' becomes 'Are root values equal AND (are left subtrees identical AND are right subtrees identical)?' This mirrors how trees are defined, making recursion the most elegant solution.

Solution

def isSameTree(p, q):
    # Base case: both nodes are None - we've reached identical leaves
    if not p and not q:
        return True
    
    # Structural mismatch: one tree has a node where the other doesn't
    if not p or not q:
        return False
    
    # Check current node values AND recursively check both subtrees
    # Both must be true for trees to be identical
    return (p.val == q.val) and isSameTree(p.left, q.left) and isSameTree(p.right, q.right)

Complexity

Time: O(min(n, m)) where n and m are node counts in the two trees. In the worst case (identical trees), we must visit every node to confirm equality - we can't know they're the same without checking all positions.
Space: O(min(h1, h2)) where h1 and h2 are the heights of the two trees. This is the recursive call stack depth, which equals the depth of the shorter/more shallow tree.

For time: imagine checking two fingerprints - you must examine every ridge at every position to confirm a match. For space: recursion is like walking through the trees branch by branch simultaneously - you only need to 'remember' the path from root to your current position, not the entire tree.

Common Mistakes

Checking node values without first verifying both nodes exist (causes AttributeError on None.val)
Using OR instead of AND in the final return - this would return True if EITHER subtree matches, not BOTH
Thinking O(n) means 'linear' without recognizing it means visiting every node once
Forgetting that trees can be structurally different even with identical values at matching positions

Edge Cases

Both trees are empty (should return True)
One tree is empty, the other has a single node (should return False)
Both trees are single nodes with same value (should return True)
Trees with identical values but different structures (e.g., [1,2] vs [1,null,2])
Deeply nested single-branch trees (linked-list shaped)

Connections

Subtree problem (LeetCode 572) - uses same recursive comparison but only checks if one tree is contained in another
Symmetric Tree (LeetCode 101) - similar recursion but compares left subtree to mirrored right subtree
Merge Two Binary Trees (LeetCode 617) - combines trees element-wise, using similar recursive traversal pattern
Count Complete Tree Nodes - uses similar tree traversal logic to explore all nodes

Serialize and Deserialize Binary Tree #297

Pre-order depth-first traversal with null sentinel markers

Intuition

Think of serialization like creating a shipping manifest for a fractal structure. A binary tree has 'gaps' where branches don't exist — you need to record both what's there AND what's missing. The key insight: if you do a PRE-ORDER traversal (root, then left subtree, then right subtree), you get all the information in the right order. When you hit a null marker (#), you know exactly how far to 'rewind' to find where the next branch connects. It's like a recipe: 'Take the main ingredient, then here's how to make the left side dish, then the right side dish.' The root anchors everything — once you know the root, you know the left subtree comes next, then the right.

Why This Pattern?

Pre-order gives us the root first, which is the anchor for the entire structure. Combined with null markers (#), it creates an unambiguous encoding: when we encounter a null, we know we've finished processing that subtree and can 'bubble up' to attach the next subtree. It's like a pushdown automaton — the nulls tell us when to pop back up the recursion stack. BFS/level-order also works but requires a more complex queue structure; pre-order with recursion is the most natural fit.

Solution

class Codec:
    def serialize(self, root):
        """Encodes a tree to a single string using pre-order traversal."""
        def preorder(node):
            if not node:
                return ['#']  # Sentinel for null — marks "no child here"
            # Convert to string for joining, pre-order: root -> left -> right
            return [str(node.val)] + preorder(node.left) + preorder(node.right)
        
        return ','.join(preorder(root))
    
    def deserialize(self, data):
        """Decodes the string back to a binary tree."""
        def build():
            val = next(values)
            if val == '#':
                return None  # This spot is empty, bubble up
            # Create node, recursively build its left and right subtrees
            node = TreeNode(int(val))
            node.left = build()   # Everything after root's val until # is left subtree
            node.right = build()  # Everything after left subtree is right subtree
            return node
        
        values = iter(data.split(','))
        return build()

Complexity

Time: O(n) where n is the number of nodes
Space: O(n) for the serialized string and recursion stack

We must visit every node once to serialize it (can't skip nodes — we need their values) and once to deserialize it (can't reconstruct without reading all values). The serialization string contains n values plus n+1 null markers, so O(n) space is unavoidable. The recursion depth equals tree height, which is O(n) worst-case for skewed trees but O(log n) for balanced ones.

Common Mistakes

Forgetting null markers — without them, you can't distinguish between 'no left child' and 'left subtree has one node'
Using incorrect delimiter or not escaping delimiters in node values (this problem assumes simple integer values)
Confusing pre-order with in-order or post-order — only pre-order works because root must come first to anchor the structure
Not resetting the iterator correctly during deserialization — each recursive call consumes one value

Edge Cases

Empty tree (root is None) — serialize returns '', deserialize returns None
Tree with single node
Completely skewed tree (linked list) — recursion depth becomes O(n), may hit stack limit
Tree with duplicate values — string representation must handle this correctly

Connections

Same core idea as 'Construct Binary Tree from Preorder and Inorder Traversal' (#105) — pre-order gives the root anchor
Related to 'Binary Tree Level Order Traversal' (#102) — that's the BFS/level-order alternative approach
Uses same recursive tree-building pattern as 'Construct String from Binary Tree' (#606)
The null sentinel concept appears in many serialization problems and regex matching

Subtree of Another Tree #572

Recursive tree traversal with equality checking (the 'check everywhere' pattern).

Intuition

Think of this like finding a pattern in a larger structure. Imagine you're looking for a specific subtree shape within a bigger tree - it's like pattern matching in a hierarchy. You can't know ahead of time where t might 'start' within s, so you have to check EVERY node in s as a potential root. At each node, you ask: 'Does the tree rooted here match tree t exactly?' If yes, you found it. If not, keep looking in the left and right branches. The key insight: a subtree must have identical STRUCTURE (shape) AND values - not just matching values in isolation.

Why This Pattern?

Since we don't know where t might be positioned in s, we must attempt a match at every node. This is a classic exhaustive search over potential starting positions, combined with a structural equality check at each position.

Solution

class Solution:
    def isSubtree(self, s: TreeNode, t: TreeNode) -> bool:
        # Base case: empty t is subtree of anything (vacuously true)
        if not t:
            return True
        # If t exists but s is exhausted, no match possible
        if not s:
            return False
        
        # Check if current node in s could be root of t
        if self.isSameTree(s, t):
            return True
        
        # Otherwise, recurse on both subtrees - t could be anywhere
        return self.isSubtree(s.left, t) or self.isSubtree(s.right, t)
    
    def isSameTree(self, s: TreeNode, t: TreeNode) -> bool:
        # Both trees exhausted - identical so far
        if not s and not t:
            return True
        # One tree exhausted, other not - structure differs
        if not s or not t:
            return False
        
        # Check root value, then recurse on both children
        # Structure must match exactly (both left children, both right children)
        return (s.val == t.val and 
                self.isSameTree(s.left, t.left) and 
                self.isSameTree(s.right, t.right))

Complexity

Time: O(n * m) in worst case, where n = nodes in s, m = nodes in t. For each of n potential starting positions, we may compare up to m nodes.
Space: O(n) in worst case for recursion stack, where n is height of s. In a skewed tree (linked list), depth could be n; in balanced tree, depth is O(log n).

We must potentially try every node in s as a root candidate. At each try, we might need to traverse the entire subtree t to verify equality. There's no way to skip comparisons because tree structure doesn't have the predictable ordering that would let us rule out regions quickly (unlike BSTs where left/right tells you which branch to take).

Common Mistakes

Forgetting that empty tree t is always a subtree (returns True vacuously)
Confusing 'subtree' with 'subsequence' - the structure must be identical, not just contain same values
Not checking both left AND right recursively - must explore all possibilities
Using wrong base cases in isSameTree - both None is True, one None is False

Edge Cases

t is None/empty - always returns True per problem definition
s and t are identical trees - should return True
t is a single node that exists somewhere in s
s is None/empty but t is not - must return False
Trees with same values but different structures (not subtrees)

Connections

Same Tree (#100) - uses identical isSameTree helper; this problem adds the 'search everywhere' layer on top
Invert Binary Tree (#226) - another tree traversal pattern, but transforms rather than searches
Maximum Depth of Binary Tree (#104) - simple recursion over left/right, foundational for tree problems

Validate Binary Search Tree #98

Tree traversal with constraint propagation (bounded recursion)

Intuition

Think of a BST like a distribution center with strict ordering rules. Every piece of mail going left must be 'less than' the current location, every piece going right must be 'greater than'. The key insight that trips people up: it's not just about immediate children — ALL left descendants must be less than the parent, and ALL right descendants must be greater. A common mistake is only checking node.left < node < node.right, which misses cases where a grandchild violates the rule. The solution is to carry valid bounds DOWN the tree like a pass-down rule: 'anything in this subtree must be between X and Y.'

Why This Pattern?

Each node needs to know what valid range it lives in. The left subtree inherits 'everything must be less than current value' as an upper bound, the right subtree inherits 'everything must be greater than current value' as a lower bound. This creates a natural recursive structure where constraints tighten as we go deeper.

Solution

def isValidBST(self, root):
    def validate(node, low, high):
        # Empty trees are valid BSTs (base case)
        if not node:
            return True
        
        # Current node must be strictly within bounds
        # Using <= and >= catches duplicates that would break BST property
        if node.val <= low or node.val >= high:
            return False
        
        # Validate left subtree with tighter upper bound (current value)
        # Validate right subtree with tighter lower bound (current value)
        return validate(node.left, low, node.val) and validate(node.right, node.val, high)
    
    return validate(root, float('-inf'), float('inf'))

Complexity

Time: O(n)
Space: O(h) where h is tree height (worst case O(n) for skewed tree, best case O(log n) for balanced)

We must visit every node to confirm the BST property holds everywhere — you can't determine validity without checking all values. Space is the recursion depth, which equals tree height: a balanced tree has log n levels, a completely skewed tree has n levels (like a linked list).

Common Mistakes

Only checking immediate children (node.left < node < node.right) — misses violations deeper in subtrees
Using <= for both bounds — BST requires STRICT inequality, so duplicates at left/right children break it
Not handling integer overflow when using max/min integers — use float('inf') instead
Forgetting that negative infinity and positive infinity are the initial bounds

Edge Cases

Single node tree (always valid)
Tree with duplicate values (invalid — violates strict inequality)
Tree with extreme values like float('inf') or -2^31 (bounds must use float inf)
Left-skewed or right-skewed tree (degenerate into linked list)

Connections

In-order traversal of a valid BST produces sorted sequence — this is an alternative solution where you track the previous value
Same constraint-propagation pattern as 'Validate Binary Search Tree' but applied to subtrees
Related to 'Kth Smallest Element in BST' — uses in-order traversal which only works on valid BSTs
Similar logic to 'Binary Tree Max Path Sum' where you propagate information up/down the tree

Tries (3)

Design Add and Search Words Data Structure #211

Trie (Prefix Tree) with DFS backtracking for wildcard search

Intuition

Think of this like organizing words in a physical dictionary where each page represents a letter. If you're looking for 'cat', you go to the 'c' section, then 'a', then 't'. Now imagine some search queries have wildcards - it's like someone handing you a mask that covers one letter, and you have to check ALL possible letters that could be under that mask. This is exactly what a Trie does: it builds a tree where each path from root to leaf spells a word, and the wildcard search just means 'try every possible branch at this point'.

Why This Pattern?

Words have a natural hierarchical structure based on their prefixes. A trie exploits this by sharing common prefixes. The wildcard '.' character requires exploring ALL possible children at that position - this is a classic tree traversal problem where DFS naturally explores all branches. The tree structure makes backtracking straightforward: when one path fails, we automatically return to try other branches.

Solution

class TrieNode:
    def __init__(self):
        self.children = {}  # char -> TrieNode mapping
        self.is_end = False  # marks if this node completes a word

class WordDictionary:
    def __init__(self):
        self.root = TrieNode()

    def addWord(self, word: str) -> None:
        """Insert word into trie - create nodes as needed, mark end."""
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_end = True  # mark this as end of a valid word

    def search(self, word: str) -> bool:
        """Search with DFS/backtracking. '.' means try ALL children."""
        def dfs(index, node):
            # Base case: processed all characters
            if index == len(word):
                return node.is_end
            
            char = word[index]
            
            if char == '.':
                # Wildcard: try EVERY possible child branch
                for child in node.children.values():
                    if dfs(index + 1, child):
                        return True
                return False  # no children led to a match
            else:
                # Specific character: must exist in children
                if char not in node.children:
                    return False
                return dfs(index + 1, node.children[char])
        
        return dfs(0, self.root)

Complexity

Time: O(L) for addWord where L is word length. For search: O(L) for exact match, but O(26^L) worst case when search string is all wildcards '.' because we may need to explore every branch of the tree.
Space: O(N * L) where N is number of words and L is average word length - each character needs a node. For search: O(L) recursion stack depth.

Adding a word is like walking down a path - you visit each character once, so that's O(L). Searching is like exploring a maze: if you know the exact letters, you take one path (O(L)). But with wildcards, at each '.' you might have to try up to 26 different directions (alphabet size), creating exponential exploration in the worst case. The space is the physical 'filing cabinet' you build to store all words - each character needs its own folder/node.

Common Mistakes

Forgetting to check node.is_end at the end of search - searching 'a' should return False if only 'ab' exists in the dictionary
Not returning False when a specific character path doesn't exist
Forgetting to mark is_end=True when adding words - the word 'bad' could be missed if you don't mark the 'd' node as an end point
Not handling empty string '' - it should return True only if '' was added

Edge Cases

Searching for an empty string '' returns True if empty word was added (usually not added, so False)
Searching for a single character 'a' - must check if any word ends at that node
Words with same prefix but different lengths - 'app' vs 'apple' must both be findable
Search pattern longer than any added word - should return False

Connections

This IS the basic Trie problem (LeetCode 208 - Implement Trie) with added wildcard complexity
Similar to Word Search II (LeetCode 212) which also uses Trie + DFS for 2D grid search
The DFS/backtracking pattern here is identical to finding paths in matrices (LeetCode 79 Word Search)
If you replace '.' with ability to match multiple characters, you'd have a regex-matching problem

Implement Trie (Prefix Tree) #208

Trie (Prefix Tree) with hashmap children

Intuition

Imagine a filing cabinet where you organize words by their first letter, then within each drawer you organize by second letter, and so on. That's essentially a Trie. The root is the cabinet, each branch is a letter, and when you reach the end of a word, you put a flag there saying 'this is a complete word, not just a prefix.' It's like a tree that branches more and more as letters diverge - 'apple' and 'apply' share 'app' on the same branch, then split at 'l' vs 'y'. This structure naturally groups all words starting with 'app' together, which is why prefix searches are so fast.

Why This Pattern?

A Trie exploits the fact that words share prefixes. Each node represents a prefix, and paths from root to any node represent valid prefixes. The tree structure naturally clusters related words together, making prefix operations O(m) where m is prefix length - you just follow the branches. This is fundamentally different from hash-based approaches which need to examine the entire word.

Solution

class TrieNode:
    def __init__(self):
        self.children = {}  # char -> TrieNode
        self.is_end = False  # marks complete word

class Trie:
    def __init__(self):
        self.root = TrieNode()
    
    def insert(self, word: str) -> None:
        # Walk down the tree, creating nodes as needed
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_end = True  # mark the end of this word
    
    def search(self, word: str) -> bool:
        # Find the node for this word, then check if it's an endpoint
        node = self._find_node(word)
        return node is not None and node.is_end
    
    def startsWith(self, prefix: str) -> bool:
        # Just need to find the node - existence means prefix exists
        return self._find_node(prefix) is not None
    
    def _find_node(self, prefix: str) -> TrieNode:
        # Helper to traverse the trie and return the final node
        node = self.root
        for char in prefix:
            if char not in node.children:
                return None
            node = node.children[char]
        return node

Complexity

Time: O(m) where m is the length of the word/prefix
Space: O(1) for operations, O(ALPHABET_SIZE * m * n) for storage where m is avg word length and n is number of words

For any operation, you touch exactly one node per character - you can't skip letters because each branch IS a letter. Insertion must create nodes for new prefixes, but search/startsWith just follows existing branches. The space is proportional to total unique prefixes stored, which is bounded by the total characters across all inserted words.

Common Mistakes

Not marking is_end = True after insertion - words that are prefixes of other words (like 'app' in 'apple') need this flag
Confusing search with startsWith - search requires is_end=True, startsWith just needs the path to exist
Using array of size 26 instead of dict - dict is simpler and handles any character, array is slightly faster but more verbose
Forgetting the root node doesn't represent a character - it's an empty prefix

Edge Cases

Empty string '' - insert does nothing but sets is_end=True on root, search returns True if previously inserted, startsWith always returns True
Word is prefix of another - 'dog' vs 'doggy', must properly handle is_end flag
Duplicate insertions - should be idempotent, just re-mark is_end=True
Very long words - tree depth equals word length, but Python recursion not used so no stack issues

Connections

Same core structure as Word Dictionary (LeetCode 211) - just adds wildcard matching
Used inside LeetCode 212 (Word Search II) to efficiently find all words in a grid
The prefix concept connects to Shortest Prefix (similar to Huffman coding)
Think of how this enables autocomplete - find all words under a prefix node, traverse the subtree

Word Search II #212

Trie + Backtracking (DFS)

Intuition

Imagine you're searching for words in a crossword puzzle. Instead of taking each word from your list and individually hunting for it on the grid (slow!), you first memorize ALL the words into a prefix tree (Trie). Then, as you explore the grid letter-by-letter, you can quickly check 'does this path match any word I'm looking for?' The Trie acts like a routing table — at each cell, you ask 'can I continue down a valid word path?' If the current letters don't match any prefix in your dictionary, stop exploring that branch immediately (pruning). This turns a potentially expensive O(words × board) search into a single traversal where we check all words simultaneously.

Why This Pattern?

The problem requires finding multiple words in a single grid. A Trie enables O(1) character lookup per step (checking if current prefix exists in any word), while DFS explores all possible paths from each starting cell. The Trie structure naturally supports pruning: once a path diverges from all word prefixes, we backtrack. This combination is the classic solution because we're essentially doing one unified search for ALL words at once, rather than N separate searches.

Solution

class TrieNode:
    def __init__(self):
        self.children = {}
        self.word = None  # stores the complete word if this is end of a word

class Solution:
    def findWords(self, board: List[List[str]], words: List[str]) -> List[str]:
        # Build Trie from all words
        root = TrieNode()
        for word in words:
            node = root
            for char in word:
                if char not in node.children:
                    node.children[char] = TrieNode()
                node = node.children[char]
            node.word = word  # mark end of word
        
        result = []
        rows, cols = len(board), len(board[0])
        
        def dfs(r, c, node):
            char = board[r][c]
            # If current cell not in Trie, stop (prune dead branch)
            if char not in node.children:
                return
            
            next_node = node.children[char]
            # If we've reached a complete word, add to result
            if next_node.word:
                result.append(next_node.word)
                next_node.word = None  # prevent duplicates
            
            # Mark as visited by temporarily replacing with '#'
            board[r][c] = '#'
            
            # Explore all 4 directions
            for dr, dc in [(0,1), (0,-1), (1,0), (-1,0)]:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dfs(nr, nc, next_node)
            
            # Restore cell (backtrack)
            board[r][c] = char
        
        # Start DFS from every cell
        for r in range(rows):
            for c in range(cols):
                dfs(r, c, root)
        
        return result

Complexity

Time: O(M × N × 4^L) where M×N is board size and L is max word length
Space: O(total characters in all words) for Trie + O(L) for DFS recursion stack

In the worst case, we explore every cell and from each cell try all 4 directions to depth L (longest word). However, the Trie dramatically prunes this: for each step, if the character isn't in any word's prefix, we stop immediately. The actual runtime is closer to O(M×N) in practice because dead branches are cut short. We also visit each cell at most once per word found, and each cell's character is processed once per DFS call.

Common Mistakes

Forgetting to mark cells as visited, causing infinite loops or using same cell twice
Not clearing word reference after finding it (causing duplicates)
Forgetting to restore board[r][c] after backtracking
Not checking boundary conditions in DFS
Assuming words can reuse cells (they cannot - must mark visited)

Edge Cases

Empty board or empty word list (return [])
Words that are prefixes of other words (handle by checking node.word not None)
All cells are same character
Words longer than board cells (handled naturally by no-match)
Duplicates in input words (Trie handles, result should be unique)

Connections

Word Search I (79) - same grid traversal, but finds ONE word; this problem finds MULTIPLE using Trie
Implement Trie (208) - builds the prefix tree structure used here
Prefix Tree II (472) - finds words that are concatenations of other words, uses same Trie concept
This is essentially Word Search I + Implement Trie combined into one problem

Heap / Priority Queue (7)

Design Twitter #355

Merge K Sorted Streams with Heap (Top-K Selection)

Intuition

Think of this like a news aggregator merging multiple feeds. Each user has their own feed (a stream of tweets sorted by time). When you want your news feed, you need to merge K+1 sorted streams (your own feed + everyone you follow) and pick the top 10. This is exactly like having K+1 sorted lists and finding the top-k elements — the classic 'merge k sorted arrays' problem. A max-heap is perfect here: we keep one pointer into each feed, the heap tells us which pointer has the most recent tweet, we take it, then advance that pointer. We repeat until we have 10 tweets or run out.

Why This Pattern?

Each user's tweet history is naturally sorted by timestamp (newer tweets are added to the front). To get the global top-10 most recent, we need to merge K+1 sorted sequences. A max-heap gives us O(log K) access to the 'current maximum' among all streams, making this the optimal pattern. We don't need to sort everything — just find top-10, which the heap handles elegantly.

Solution

class Tweet:
    def __init__(self, id: int, time: int):
        self.id = id
        self.time = time
    
    def __lt__(self, other):
        # For max-heap: we want larger time to be "smaller" in heap ordering
        return self.time > other.time

class Twitter:
    def __init__(self):
        self.tweets = {}          # userId -> list of Tweets (newest first)
        self.follows = {}         # userId -> set of followeeIds
        self.time = 0             # global timestamp (monotonically increasing)

    def postTweet(self, userId: int, tweetId: int) -> None:
        self.time += 1
        if userId not in self.tweets:
            self.tweets[userId] = []
        # Add to front (newest first)
        self.tweets[userId].append(Tweet(tweetId, self.time))

    def follow(self, followerId: int, followeeId: int) -> None:
        if followerId == followeeId:
            return  # Can't follow yourself (per problem constraints)
        if followerId not in self.follows:
            self.follows[followerId] = set()
        self.follows[followerId].add(followeeId)

    def unfollow(self, followerId: int, followeeId: int) -> None:
        if followerId in self.follows:
            self.follows[followerId].discard(followeeId)

    def getNewsFeed(self, userId: int) -> List[int]:
        # Build list of tweet sources: user's own tweets + everyone they follow
        sources = []
        
        # Include self
        if userId in self.tweets:
            sources.append(iter(self.tweets[userId]))
        
        # Include follows
        if userId in self.follows:
            for followee in self.follows[userId]:
                if followee in self.tweets:
                    sources.append(iter(self.tweets[followee]))
        
        # Max-heap to get most recent tweet across all sources
        heap = []
        result = []
        
        # Initialize heap with first tweet from each source
        for source in sources:
            tweet = next(source, None)
            if tweet:
                heapq.heappush(heap, (tweet, source))  # (tweet, iterator)
        
        # Extract top 10
        while heap and len(result) < 10:
            tweet, source = heapq.heappop(heap)
            result.append(tweet.id)
            # Get next tweet from this source
            next_tweet = next(source, None)
            if next_tweet:
                heapq.heappush(heap, (next_tweet, source))
        
        return result

Complexity

Time: O(F + 10*log(F)) where F = number of people user follows (plus self). In worst case F = O(N) where N is total users, so O(N + 10*log N). For each getNewsFeed, we push at most F+1 items and pop at most 10 items.
Space: O(N + T) = O(total users + total tweets). We store all follows as sets and all tweets. getNewsFeed uses O(F) extra for the heap.

We can't do better than O(N) for getNewsFeed in the worst case because in theory you follow everyone. But the heap saves us from sorting — we only do 10 pops max. The 10 is constant (the problem limits feed to 10), so it's effectively O(F + log F). Storing all tweets is necessary because we need them for future feed requests.

Common Mistakes

Not handling users who haven't posted tweets yet when building the feed source list
Using a min-heap instead of max-heap (or inverting the comparison) — we want most recent first, which is maximum timestamp
Forgetting to initialize empty data structures for new users
Modifying the original tweet list while iterating (don't consume iterators directly on the list)

Edge Cases

User follows themselves (Twitter allows this but problem says don't) — add guard
User unfollows someone they're not following — use set.discard not set.remove
User has no tweets — skip in source building
User follows someone with no tweets — skip in source building
User unfollows everyone — feed should only contain their own tweets

Connections

T
h
i
s
i
s
t
h
e
s
a
m
e
c
o
r
e
p
a
t
t
e
r
n
a
s
'
M
e
r
g
e
K
S
o
r
t
e
d
L
i
s
t
s
'
(
L
e
e
t
C
o
d
e
2
3
)
.
T
h
e
d
i
f
f
e
r
e
n
c
e
i
s
:
i
n
s
t
e
a
d
o
f
m
e
r
g
i
n
g
c
o
m
p
l
e
t
e
l
i
s
t
s
,
w
e
m
e
r
g
e
'
l
i
v
e
'
s
t
r
e
a
m
s
a
n
d
s
t
o
p
a
f
t
e
r
k
i
t
e
m
s
.
A
l
s
o
s
h
a
r
e
s
D
N
A
w
i
t
h
'
K
t
h
L
a
r
g
e
s
t
E
l
e
m
e
n
t
i
n
a
n
A
r
r
a
y
'
(
L
e
e
t
C
o
d
e
2
1
5
)
—
b
o
t
h
u
s
e
h
e
a
p
s
f
o
r
t
o
p
-
k
s
e
l
e
c
t
i
o
n
.
T
h
e
'
d
e
s
i
g
n
'
a
s
p
e
c
t
m
i
r
r
o
r
s
o
t
h
e
r
s
y
s
t
e
m
d
e
s
i
g
n
p
r
o
b
l
e
m
s
l
i
k
e
'
L
F
U
C
a
c
h
e
'
(
L
e
e
t
C
o
d
e
4
6
0
)
—
b
o
t
h
r
e
q
u
i
r
e
m
a
i
n
t
a
i
n
i
n
g
a
u
x
i
l
i
a
r
y
d
a
t
a
s
t
r
u
c
t
u
r
e
s
f
o
r
O
(
1
)
o
p
e
r
a
t
i
o
n
s
.

Find Median from Data Stream #295

Two-Heap / Dual Priority Queue Pattern

Intuition

Think of median as finding the 'balance point' on a number line where half your data sits on each side. The most elegant way to maintain this split is with two heaps acting like opposing forces reaching equilibrium. A max-heap holds all the smaller numbers (the left side of the median), and a min-heap holds all the larger numbers (the right side). The magic: the top of the max-heap is the largest of the small numbers, and the top of the min-heap is the smallest of the large numbers — exactly the two values you need to compute the median! It's like a scale trying to stay balanced: whenever one side gets too heavy, you rebalance.

Why This Pattern?

This problem has an inherent 'split in the middle' structure — we need quick access to the boundary elements on either side of the median. Heaps give us O(1) access to the extremes (max of left half, min of right half) while maintaining sorted order in O(log n) for insertions. This is the classic 'complementary heaps' pattern where one heap stores the lower half and the other stores the upper half, with size balancing ensuring we always know which heap contains the median.

Solution

import heapq

class MedianFinder:
    def __init__(self):
        # max-heap for the smaller half (store negatives since Python only has min-heap)
        self.small = []
        # min-heap for the larger half
        self.large = []
    
    def addNum(self, num: int) -> None:
        # Step 1: Add to max-heap (small half)
        heapq.heappush(self.small, -num)
        
        # Step 2: Balance - ensure every element in small <= every element in large
        # The largest of small (-self.small[0]) should be <= smallest of large
        if self.small and self.large and -self.small[0] > self.large[0]:
            # Move the problematic element to large
            val = -heapq.heappop(self.small)
            heapq.heappush(self.large, val)
        
        # Step 3: Size balancing - keep small either equal to large or one element larger
        # This ensures we know which heap contains the median for odd counts
        if len(self.small) > len(self.large) + 1:
            val = -heapq.heappop(self.small)
            heapq.heappush(self.large, val)
        elif len(self.large) > len(self.small):
            val = heapq.heappop(self.large)
            heapq.heappush(self.small, -val)
    
    def findMedian(self) -> float:
        if len(self.small) > len(self.large):
            # Odd total: median is the top of the larger heap (small)
            return float(-self.small[0])
        else:
            # Even total: average of both tops
            return (-self.small[0] + self.large[0]) / 2.0

Complexity

Time: O(log n) for addNum (heap push/pop), O(1) for findMedian
Space: O(n) — we store all elements in the two heaps

Adding a number requires heap operations which take O(log n) because we might need to bubble up/down through the tree. The heap maintains its heap property in logarithmic time. Finding the median is O(1) because we just peek at the tops of both heaps — no computation needed, just accessing two values. We can't do better than O(log n) for insertion because any algorithm that maintains sorted order (which we need to find the median) must touch at least log n elements in the worst case — that's the information-theoretic lower bound for comparison-based sorting.

Common Mistakes

Forgetting to negate values when using max-heap in Python (heapq only provides min-heap)
Not balancing heap sizes after each insertion — the median calculation depends on knowing which heap is larger
Checking self.small before accessing self.small[0] — causes IndexError on empty heap
Using integer division /2 instead of /2.0 when averaging — truncates in Python 2 or when using integer inputs

Edge Cases

Empty heap when calling findMedian — should handle before any additions
Single element — median is that element
Two elements — median is average of both
Duplicate numbers — they can go in either heap but the balancing logic handles it
Numbers arriving in sorted order (worst case for some approaches) — heaps handle this fine

Connections

S
l
i
d
i
n
g
W
i
n
d
o
w
M
e
d
i
a
n
(
#
4
8
0
)
—
s
a
m
e
t
w
o
-
h
e
a
p
c
o
n
c
e
p
t
b
u
t
w
i
t
h
a
s
l
i
d
i
n
g
w
i
n
d
o
w
r
e
q
u
i
r
i
n
g
e
l
e
m
e
n
t
r
e
m
o
v
a
l
.
I
P
O
(
#
5
0
2
)
—
u
s
e
s
h
e
a
p
f
o
r
s
e
l
e
c
t
i
o
n
.
F
i
n
d
K
t
h
L
a
r
g
e
s
t
E
l
e
m
e
n
t
(
#
2
1
5
)
—
u
s
e
s
h
e
a
p
c
o
n
c
e
p
t
b
u
t
s
i
n
g
l
e
h
e
a
p
.
T
h
i
s
i
s
t
h
e
f
o
u
n
d
a
t
i
o
n
a
l
p
a
t
t
e
r
n
f
o
r
a
n
y
p
r
o
b
l
e
m
r
e
q
u
i
r
i
n
g
'
m
i
d
d
l
e
e
l
e
m
e
n
t
'
a
c
c
e
s
s
i
n
a
d
y
n
a
m
i
c
s
t
r
e
a
m
.

K Closest Points to Origin #973

Max Heap / Priority Queue (maintain k-smallest elements)

Intuition

Imagine you're at a party and you want to find the k people closest to you. You could measure everyone's distance, sort everyone by how close they are, then pick the first k. But that's overkill — you only care about the k closest, not the order beyond that. Instead, think of it like a competition: you have k 'slots' for the closest people. As you meet new people, if someone's closer than your current farthest person in the group, they push that person out. This is exactly what a max-heap does: it gives you fast access to the 'worst' element in your current set, letting you swap it out when you find something better.

Why This Pattern?

The key insight is that we only need to track k elements, not all n. A max-heap of size k gives us O(1) access to the current 'worst' (farthest) point among our k closest. When we see a new point closer than that worst one, we swap them in O(log k) time. This beats sorting all n points because we only do log(k) work per point instead of log(n) to insert into our heap, making it O(n log k) vs O(n log n).

Solution

import heapq

def kClosest(points, k):
    # We want to keep the k CLOSEST points, which means we need
    # a max-heap to quickly find/replace the FARTHEST among our k.
    # Python's heapq is a min-heap, so we negate distances.
    
    max_heap = []  # stores (-distance, x, y)
    
    for x, y in points:
        dist = x*x + y*y  # squared distance — sqrt not needed for comparison
        
        # Push this point (negate dist for max-heap behavior)
        heapq.heappush(max_heap, (-dist, x, y))
        
        # If we have more than k points, remove the farthest
        if len(max_heap) > k:
            heapq.heappop(max_heap)
    
    # Extract the points from heap (ignore the negated distance)
    return [[x, y] for (_, x, y) in max_heap]

Complexity

Time: O(n log k)
Space: O(k)

We process each of the n points once. For each point, we do a heap push (log k) and possibly a heap pop (log k), so O(log k) per point = O(n log k). We only store k points in the heap at any time, so O(k) space. We can't do better than O(n log k) in the worst case because we must examine every point to know which k are closest — there's no way to know if a point is a contender without comparing it against our current set.

Common Mistakes

Using a min-heap instead of max-heap — this would keep the k farthest points, not the closest
Computing actual sqrt distance — unnecessary since sqrt is monotonic, x²+y² is sufficient
Forgetting to handle the case when k equals the length of points (returns all points)
Not handling negative coordinates correctly — squared distance handles all quadrants fine

Edge Cases

k = 0 returns empty list
k = n returns all points sorted by (not exactly sorted, just all points in heap order)
All points at same distance — any k of them is valid
Single point with k=1 returns that point
Points with very large coordinates — squared distance might overflow in some languages (Python handles big ints fine)

Connections

Similar to 'Top K Frequent Elements' — same heap pattern but with frequency counting instead of distance
Similar to 'Kth Largest Element in an Array' — uses a min-heap to find kth largest (our problem finds k smallest)
Different from 'Kth Smallest Pair Distance' — that one needs a different approach with binary search since distances aren't directly given

Kth Largest Element in a Stream #703

Heap as a sliding window / maintain k largest elements

Intuition

Think of this like maintaining a 'water level' in a lake. The kth largest element is like the k-th highest point. If you're at a party and someone asks 'who's the 3rd tallest person?' you don't need to know everyone's height - you just track the top 3. When someone new arrives, you compare them to your top 3 and update if needed. A min-heap of size k does exactly this! It keeps the k largest elements, and the top of the heap (minimum of the top k) is our kth largest answer.

Why This Pattern?

We only need the kth largest at any moment, so we don't need to store the entire stream. A min-heap of size k naturally gives us O(1) access to the smallest of our top k elements (which is the kth largest overall). Every new element either gets ignored if smaller than our kth largest, or kicks it out and becomes a new candidate.

Solution

import heapq

class KthLargest:
    def __init__(self, k: int, nums: List[int]):
        self.k = k
        # Use min-heap of size k - stores the k largest elements
        # heap[0] will be the SMALLEST among these k largest = kth largest overall
        self.heap = nums[:k]
        heapq.heapify(self.heap)
        
        # If nums has more than k elements, only keep the k largest
        # (pop smallest until heap size = k)
        if len(nums) > k:
            for _ in range(len(nums) - k):
                heapq.heappop(self.heap)
    
    def add(self, val: int) -> int:
        # Add new value to heap
        heapq.heappush(self.heap, val)
        
        # If we have more than k elements, remove smallest
        # (that's the (k+1)th largest, not needed)
        if len(self.heap) > self.k:
            heapq.heappop(self.heap)
        
        # heap[0] is always the kth largest
        return self.heap[0]

Complexity

Time: O((n + m) * log k) where n = initial array size, m = number of add() calls
Space: O(k)

We only store k elements in the heap. Each heap operation (push/pop) costs O(log k) because the heap is a complete binary tree with height log k. Initialization processes n elements but each pop is O(log k), giving O(n log k). Each add() does at most one push and one pop = O(log k).

Common Mistakes

Using a max-heap instead of min-heap - we need min-heap so heap[0] gives us the SMALLEST of our top k elements, which IS the kth largest
Not handling the initial nums array properly - if it has fewer than k elements, we still need to return the smallest element in the heap after all additions
Forgetting to heapify after slicing - heapq.heapify() is required to maintain heap property

Edge Cases

Initial nums is empty or has fewer than k elements - the heap will have less than k elements initially
All added values are smaller than existing kth largest - heap size stays same, just ignore the new value
k = 1 (largest element) - need to track just the maximum
Adding duplicate values - handled correctly, duplicates count as separate elements

Connections

Same core insight as LeetCode 215 (Kth Largest Element in an Array) - use min-heap of size k
Related to 1046 (Last Stone Weight) - also uses heap as a 'top-k' selector
Similar pattern to 295 (Find Median from Data Stream) - but there we use two heaps splitting at the median
This is the 'online' version of 215 - we process a stream instead of having all data upfront

Kth Largest Element in an Array #215

K-Largest Elements Heap Pattern (Min-heap of size k)

Intuition

Think of this like finding the kth tallest person in a crowd. You could sort everyone by height (expensive), or you could maintain a 'top k' list that automatically keeps track. A min-heap of size k acts like a water level - it holds exactly the k largest elements we've seen, and the smallest among those (the root) is exactly the kth largest element overall. It's like keeping a 'ceiling' at the kth position: any element above the ceiling gets in and pushes someone out, any element below gets ignored.

Why This Pattern?

The min-heap root gives O(1) access to the SMALLEST among our top k candidates. Since we want the kth LARGEST, we want the smallest of the largest k. This is the perfect structure: we maintain exactly k elements, and the heap invariant automatically keeps the smallest of those at the top. When we see a new element larger than our current minimum, it belongs in the top k, so we swap it in.

Solution

import heapq

def findKthLargest(nums: list[int], k: int) -> int:
    # Use a min-heap to track the k largest elements
    # The root will be the SMALLEST among the k largest = kth largest
    min_heap = []
    
    for num in nums:
        # Add current element to heap
        heapq.heappush(min_heap, num)
        
        # If heap exceeds k, remove smallest element
        # This maintains exactly k largest elements seen so far
        if len(min_heap) > k:
            heapq.heappop(min_heap)
    
    # Root is the kth largest (smallest among top k)
    return min_heap[0]

Complexity

Time: O(n log k)
Space: O(k)

We process n elements, and each heap operation (push or pop) costs O(log k) since the heap never exceeds size k. This beats sorting O(n log n) when k << n, and we only care about one rank position, not full order.

Common Mistakes

Using max-heap instead of min-heap (confusing logic)
Forgetting to pop when heap size exceeds k, letting it grow to n
Off-by-one: kth largest means position k from end, not index k
Confusing kth largest with kth smallest

Edge Cases

k = 1 returns maximum element
k = len(nums) returns minimum element
Array has duplicates - they count separately
k > len(nums) - should handle gracefully
Empty array - undefined behavior

Connections

Top K Frequent Elements (#347) - identical pattern with frequency counts
K Closest Points to Origin (#973) - same min-heap-of-size-k approach
IPO (#502) - uses max-heap for complementary problem
Reorganize String (#358) - uses heap for greedy selection

Last Stone Weight #1046

Max Heap / Priority Queue - Extract Max pattern

Intuition

Think of this like a collision/energy dissipation system. When two stones smash, their 'energy' (weight) partially dissipates - if equal, all energy is lost; if unequal, only the difference remains. The heaviest stones dominate the outcome because we always process the two largest first. It's like a pressure system where the highest pressures interact first, and each collision potentially creates a new pressure point. The key insight: we never need to consider smaller stones until the larger ones are resolved - like how in a game of billiards, the heaviest balls determine the trajectory before smaller ones matter.

Why This Pattern?

The problem is defined entirely in terms of 'largest' elements - we repeatedly need the two heaviest stones. A max-heap gives O(1) access to the maximum element and O(log n) insertion/deletion, making it the natural data structure. Each operation (smash) transforms the two maxes into a potential new max, which the heap efficiently maintains. This is fundamentally a 'repeatedly get largest' pattern.

Solution

import heapq

def lastStoneWeight(stones):
    # Python's heapq is a min-heap, so negate to simulate max-heap
    max_heap = [-stone for stone in stones]
    heapq.heapify(max_heap)  # O(n) - more efficient than n * O(log n)
    
    # Keep smashing until 0 or 1 stone remains
    while len(max_heap) > 1:
        # Extract two heaviest stones (negate to get actual values)
        y = -heapq.heappop(max_heap)  # heaviest
        x = -heapq.heappop(max_heap)  # second heaviest
        
        # If they differ, push the difference back as a new stone
        if y > x:
            heapq.heappush(max_heap, -(y - x))
    
    # Return last stone weight, or 0 if empty
    return -max_heap[0] if max_heap else 0

Complexity

Time: O(n log n)
Space: O(n)

Heapify takes O(n) using the Floyd algorithm. Each smash operation does two pops and possibly one push, each O(log n). In worst case (stones never fully cancel), we do O(n) smash operations. So total is O(n log n). We can't do better because we must examine each stone at least once, and each comparison/insertion in a heap is O(log n) by definition of the data structure.

Common Mistakes

Forgetting to negate values when using Python's heapq (it only has min-heap)
Not handling the case when heap becomes empty - return 0, not error
Using sorted list and always taking from end - works but O(n log n) for each removal vs O(log n) for heap pop
Forgetting that after smashing, the 'remainder' goes back into the heap as a new stone

Edge Cases

All stones equal weight - they all cancel out, return 0
Only one stone to start - return that stone
Two stones only - just compare them directly
Very large difference between stones - one smash might eliminate most stones at once

Connections

K Closest Points to Origin - uses heap to extract k smallest, this extracts largest repeatedly
Top K Frequent Elements - similar heap pattern for extracting top K
Reorganize String - uses max-heap to always use most frequent character next

Task Scheduler #621

Greedy formula from max frequency - calculate the minimum intervals needed based on the most frequent task's spacing requirements

Intuition

Think of this like scheduling workers at a factory. You have different types of jobs (tasks), but if you do the same job twice too quickly, the machine overheats and you must wait (cooling interval). The key insight: the most frequently occurring task acts like a 'bottleneck' - it creates the longest chain in your schedule, and all other tasks must fit into the gaps between occurrences of this task. If you have plenty of other tasks, you keep the machine busy. If not, you end up with idle waiting time. The formula emerges from asking: 'How many gaps do I need to create between the most frequent task, and do I have enough other tasks to fill them?'

Why This Pattern?

The cooling constraint fundamentally creates 'slots' that must exist between the same task. The task with maximum frequency determines how many slots we MUST have (f_max - 1 groups, each needing n slots). The number of tasks sharing that max frequency tells us how many 'final' slots get filled. This structural property makes the greedy formula the natural solution - we can't do better than this lower bound, and we can always achieve it by arranging tasks this way.

Solution

import collections
import math

class Solution:
    def leastInterval(self, tasks: List[str], n: int) -> int:
        # Count frequency of each task
        task_counts = collections.Counter(tasks)
        
        # Find maximum frequency (the bottleneck task)
        f_max = max(task_counts.values())
        
        # Count how many tasks have that maximum frequency
        # (they all create the same length chain)
        count_max = sum(1 for count in task_counts.values() if count == f_max)
        
        # The formula: (f_max - 1) groups * (n + 1) slots per group + count_max at end
        # Think of it as: A _ _ ... A _ _ ... A, where we have n gaps between each A
        part1 = (f_max - 1) * (n + 1)
        part2 = count_max
        
        # Maximum of: formula result vs actual task count
        # If we have plenty of tasks to fill gaps, we use all tasks
        # If gaps exceed available tasks, we have idle time
        return max(part1 + part2, len(tasks))

Complexity

Time: O(m) where m is the number of unique tasks. We count all tasks once to get frequencies, find max, then iterate to count how many have max frequency. The number of unique tasks is bounded by 26 (uppercase letters), making this very fast.
Space: O(1) or O(26) - we store at most 26 task frequencies (uppercase English letters), which is constant space relative to input.

We only care about the COUNT of each unique task, not the order. Finding max and counting max-frequency tasks each require scanning 26 items max. We can't do better because we must at least examine each distinct task once to know which is most frequent. The formula itself is O(1).

Common Mistakes

Using len(tasks) directly instead of considering the cooling constraint - you'll underestimate when cooling is tight
Forgetting to account for multiple tasks having the same max frequency (count_max) - this affects the final batch size
Not handling the case where n=0 (no cooling needed) - formula still works but students often overthink it
Confusing n as 'number of tasks between identical tasks' vs 'minimum intervals' - it's intervals, so if n=2, pattern is A _ _ A (two idle slots)

Edge Cases

n=0: no cooling needed, answer = len(tasks)
All tasks identical: must wait n between each, answer = f_max * (n+1) - 1
One task type: same as above
Cooling interval larger than remaining tasks: you get idle time
Tasks exactly fill gaps: no idle time, schedule is tight

Connections

This uses the same frequency-counting + max-finding pattern as 'Top K Frequent Elements' (LeetCode 347), though the application differs
The gap-filling intuition is similar to 'Rearrange String k Distance Apart' (LeetCode 358) - both deal with spacing constraints
The formula approach (calculate theoretical minimum then take max with actual) is similar to 'Jump Game' problems where you calculate a bound then verify against reality

Backtracking (9)

Combination Sum II #40

Backtracking with sorting-based duplicate elimination

Intuition

Think of this as exploring a decision tree where each number can either be included or excluded from our current combination. The key insight is that when the candidates array has duplicates, we need to avoid creating duplicate combinations - like taking two different paths that lead to the same destination. By sorting first, we group equal numbers together, and we can then make a strategic choice: when we're at a level of the recursion and see the same number as the previous one we already explored, we skip it. This is similar to 'if you already tried taking the first copy of a duplicate and it didn't work, there's no point trying the second copy at the same recursion depth' - they lead to identical sub-problems.

Why This Pattern?

Sorting enables two critical optimizations: (1) we can prune branches where the current sum exceeds target, and (2) we can detect and skip duplicate combinations by checking if candidates[i] == candidates[i-1] at the same recursion depth. The 'start' parameter in backtracking enforces the 'each number used once' constraint - we only consider elements from index 'start' onward.

Solution

class Solution:
    def combinationSum2(self, candidates: List[int], target: int) -> List[List[int]]:
        result = []
        candidates.sort()  # Sort to enable pruning and duplicate detection
        
        def backtrack(start: int, remaining: int, current: List[int]):
            # Base case: we've found a valid combination
            if remaining == 0:
                result.append(current[:])  # Append a copy
                return
            
            # Prune: if remaining is negative or we've exhausted candidates
            if remaining < 0:
                return
            
            # Explore each candidate from 'start' onwards
            for i in range(start, len(candidates)):
                # Skip duplicates: if this candidate is same as previous AND
                # we're not at the first choice at this recursion depth
                if i > start and candidates[i] == candidates[i-1]:
                    continue
                
                # Include current candidate and recurse
                # Use i+1 because each number can only be used once
                current.append(candidates[i])
                backtrack(i + 1, remaining - candidates[i], current)
                current.pop()  # Backtrack: remove and try next option
        
        backtrack(0, target, [])
        return result

Complexity

Time: O(2^n) in the worst case where n is the number of candidates, but pruning significantly reduces this. In practice, it's bounded by the number of valid combinations times n for copying. The sorting is O(n log n).
Space: O(n) for the recursion stack in the worst case (when we explore all paths), plus O(k) for storing each valid combination where k is the average combination length.

We can't do better than exponential in the worst case because in theory every subset could be a valid combination. However, sorting adds O(n log n) which is dominated by the exponential exploration. The space is dominated by the depth of recursion - at most n levels deep (one for each unique position), plus storage for results.

Common Mistakes

Forgetting to sort - leads to missing valid combinations or having duplicates
Not checking 'i > start' when skipping duplicates - would skip valid combinations that include the first occurrence of a duplicate
Using 'start' incorrectly - should be i+1 not start+1 to enforce 'each number used once'
Forgetting to copy the current list before adding to results - leads to all results being empty or containing only the last combination
Not pruning when remaining < 0 - allows unnecessary exploration

Edge Cases

Empty candidates list - returns empty result
Target is 0 - returns empty list (no combination sums to 0 using positive integers typically)
All candidates > target - returns empty result
Single element equal to target - returns that single combination
Multiple duplicates in candidates - correctly handled by skip logic
Candidates with 0 (if allowed) - need special handling but LeetCode typically uses positive integers

Connections

S
i
m
i
l
a
r
t
o
C
o
m
b
i
n
a
t
i
o
n
S
u
m
I
(
#
3
9
)
b
u
t
w
i
t
h
'
u
s
e
d
o
n
c
e
'
c
o
n
s
t
r
a
i
n
t
i
n
s
t
e
a
d
o
f
'
u
n
l
i
m
i
t
e
d
u
s
e
s
'
.
A
l
s
o
r
e
l
a
t
e
d
t
o
S
u
b
s
e
t
s
I
I
(
#
9
0
)
-
b
o
t
h
u
s
e
t
h
e
s
a
m
e
'
s
o
r
t
a
n
d
s
k
i
p
d
u
p
l
i
c
a
t
e
s
a
t
s
a
m
e
r
e
c
u
r
s
i
o
n
d
e
p
t
h
'
p
a
t
t
e
r
n
.
T
h
e
c
o
r
e
b
a
c
k
t
r
a
c
k
i
n
g
t
e
m
p
l
a
t
e
w
i
t
h
p
r
u
n
i
n
g
i
s
s
h
a
r
e
d
w
i
t
h
m
a
n
y
o
t
h
e
r
b
a
c
k
t
r
a
c
k
i
n
g
p
r
o
b
l
e
m
s
l
i
k
e
P
a
l
i
n
d
r
o
m
e
P
a
r
t
i
t
i
o
n
i
n
g
,
L
e
t
t
e
r
C
o
m
b
i
n
a
t
i
o
n
s
,
a
n
d
N
-
Q
u
e
e
n
s
.

Combination Sum #39

Backtracking with sorting and pruning

Intuition

Think of this as a budget allocation problem. You have a target 'spending limit' and a list of 'items' you can buy (the candidates). Each item costs its face value, and you can buy each item unlimited times. The question asks: what are all the ways to spend exactly your budget? Imagine exploring a decision tree where at each node you choose how many copies of the current item to buy. The key insight: since [2,3] and [3,2] represent the same combination, we process items in sorted order and never go back - this eliminates duplicates naturally. It's like a conversation where you say "I'm going to use 2 of item A, now let's discuss item B..." - you never reconsider A after moving to B. The pruning (cutting off dead branches) works because once your remaining budget goes negative, no further choices can fix that - you've overspent.

Why This Pattern?

We need to explore ALL possible combinations (not optimize), we can reuse elements unlimited times (changes recursion to include current index), and order doesn't matter so we process in sorted order to avoid duplicates. The structural property: if candidates[i] > remaining, then ALL subsequent candidates (which are >= candidates[i]) will also exceed remaining - this is why sorting enables efficient pruning.

Solution

def combinationSum(candidates, target):
    result = []
    candidates.sort()  # Sort to enable pruning and avoid duplicates
    
    def backtrack(start, remaining, current):
        # Base case: exactly matched target - found valid combination
        if remaining == 0:
            result.append(current[:])  # Append COPY since we mutate current
            return
        
        # Pruning: overspent - no valid combination possible in this branch
        if remaining < 0:
            return
        
        # Try each candidate starting from 'start' (allows reuse of same candidate)
        for i in range(start, len(candidates)):
            # Pruning: since sorted, if current exceeds remaining, all subsequent will too
            if candidates[i] > remaining:
                break
            
            # CHOOSE: add this candidate to our combination
            current.append(candidates[i])
            
            # EXPLORE: recurse with same index i (unlimited reuse) and updated remaining
            backtrack(i, remaining - candidates[i], current)
            
            # UNCHOOSE: remove and try next option
            current.pop()
    
    backtrack(0, target, [])
    return result

Complexity

Time: O(N^target) in worst case where N is number of candidates and target is the sum - exponential because we explore all combinations. More precisely, it's bounded by the number of valid combinations in the output times the average length of each combination.
Space: O(target) for recursion stack depth (max depth equals max number of elements that can fit in target), plus O(number of combinations) for storing results.

The recursion depth is bounded by target/min(candidate) - you literally can't have more elements than this. The time is exponential because in worst case (like target=7 and candidates=[1,2,3]) we explore many branches - but the pruning significantly cuts this in practice. We can't do better than exponential in the worst case because we genuinely need to generate all valid combinations.

Common Mistakes

Forgetting to sort candidates - without sorting, pruning by 'candidates[i] > remaining' doesn't work
Not passing index i (instead of i+1) to recursion - this breaks unlimited reuse requirement
Appending 'current' directly instead of 'current[:]' - causes all results to reference the final empty list
Modifying the original candidates list instead of working with a sorted copy

Edge Cases

Empty candidates list - returns empty list
Target of 0 - returns [[]] (one valid combination: use nothing)
All candidates larger than target - returns empty list
Single candidate that exactly equals target - returns that single-element combination
Large target with small candidates - deep recursion, may hit stack limits

Connections

S
i
m
i
l
a
r
t
o
C
o
m
b
i
n
a
t
i
o
n
S
u
m
I
I
(
4
0
)
a
n
d
C
o
m
b
i
n
a
t
i
o
n
S
u
m
I
I
I
(
2
1
6
)
-
s
a
m
e
b
a
c
k
t
r
a
c
k
i
n
g
f
r
a
m
e
w
o
r
k
b
u
t
w
i
t
h
d
i
f
f
e
r
e
n
t
c
o
n
s
t
r
a
i
n
t
s
.
C
o
m
b
i
n
a
t
i
o
n
S
u
m
I
I
r
e
s
t
r
i
c
t
s
e
a
c
h
c
a
n
d
i
d
a
t
e
t
o
o
n
c
e
u
s
e
a
n
d
h
a
n
d
l
e
s
d
u
p
l
i
c
a
t
e
s
.
T
h
i
s
p
r
o
b
l
e
m
'
s
u
n
l
i
m
i
t
e
d
r
e
u
s
e
i
s
t
h
e
s
i
m
p
l
e
s
t
v
a
r
i
a
n
t
.
U
s
e
s
s
a
m
e
c
o
r
e
r
e
c
u
r
s
i
o
n
p
a
t
t
e
r
n
a
s
s
u
b
s
e
t
s
p
r
o
b
l
e
m
b
u
t
w
i
t
h
t
a
r
g
e
t
-
s
u
m
c
o
n
s
t
r
a
i
n
t
i
n
s
t
e
a
d
o
f
s
u
b
s
e
t
e
q
u
a
l
i
t
y
.

Letter Combinations of a Phone Number #17

Cartesian Product via Backtracking - you're computing the Cartesian product of multiple sets (the letters corresponding to each digit), generating all possible tuples by taking one element from each set.

Intuition

Think of this like a tree growing horizontally. You start with an empty branch, and for each digit, you SPLIT that branch into multiple smaller branches - one for each possible letter. For "23", you'd take your empty start, split it into 'a','b','c' for the first digit, then each of those splits again into 'd','e','f'. It's like opening a combination lock where each dial has a different number of letters - you systematically try every possible combination by advancing one dial at a time, then backtracking to try the next option.

Why This Pattern?

The problem has a natural tree structure where each level corresponds to one digit, and each node at that level branches into all possible letters for that digit. Backtracking is perfect here because you're building partial solutions incrementally, and when you reach the end (processed all digits), you backtrack to explore alternative letter choices at previous positions. This is the classic 'explore all paths' pattern.

Solution

def letterCombinations(digits: str) -> list[str]:
    if not digits:
        return []
    
    # Phone keypad mapping - each digit maps to its possible letters
    phone = {
        '2': 'abc', '3': 'def', '4': 'ghi', '5': 'jkl',
        '6': 'mno', '7': 'pqrs', '8': 'tuv', '9': 'wxyz'
    }
    
    res = []
    
    def backtrack(index, current):
        # Base case: we've processed all digits, complete combination found
        if index == len(digits):
            res.append(current)
            return
        
        # Get all possible letters for current digit
        letters = phone[digits[index]]
        
        # For each letter option, recurse to next digit
        for letter in letters:
            backtrack(index + 1, current + letter)
    
    backtrack(0, "")
    return res

Complexity

Time: O(4^n * n) where n is the number of digits
Space: O(4^n * n) for storing all combinations + O(n) for recursion stack

You generate 4^n combinations in the worst case (digits 7 and 9 have 4 letters each). Each combination takes O(n) time to build since string concatenation creates a new string of length n. Can't do better because you must actually produce all combinations - that's n * 4^n characters in the output.

Common Mistakes

Forgetting to handle empty input string - should return empty list
Not accounting for digits 7 and 9 having 4 letters (pqrs, wxyz) vs 3 letters for most others
Creating new strings in recursion (current + letter) - technically O(n) per call but Python optimizes this reasonably well
Using iterative approach and messing up the nested loop structure

Edge Cases

Empty input string - return []
Input with only 1 digit - return that digit's letters as list
All digits are 7 or 9 (4-letter digits) - maximum combinations
Mixed digits of varying letter counts

Connections

Similar to 'Combinations' (LeetCode 77) - both generate all possible selections from sets
Like 'Permutations' (LeetCode 46) but here order is FIXED by digit position, you just choose which letter at each position
Related to 'Generate Parentheses' (LeetCode 22) - both build solutions incrementally with fixed positions
Uses same Cartesian product concept as 'Cartesian Product' patterns in system design

N-Queens #51

Backtracking with state sets

Intuition

Imagine each queen as a radio tower broadcasting interference along its row, column, and two diagonals. Your job is to place N towers so their 'signals' never overlap. You place queens row by row - each row is like choosing which frequency band to claim. When you place a queen at (row, col), you're 'reserving' that column and those two diagonal frequencies. The key insight: you can compute diagonal IDs with simple math - the '/' diagonal has ID (row + col), and the '\\' diagonal has ID (row - col + n-1). Think of backtracking as exploring a tree where each branch is 'try this column, see if it leads to a solution.' When a branch fails (you hit a conflict), you 'unclaim' your resources and try the next column - this is the 'back' in backtracking. It's like filling a mold one piece at a time; if a piece doesn't fit, you remove it and try a different piece.

Why This Pattern?

The problem naturally forms a decision tree: at each row, you choose one of N columns. Each choice reduces the problem size (move to next row with fewer available positions). When a path fails (conflict detected), you undo the last choice - exactly the backtracking pattern. Using sets to track occupied columns and diagonals gives O(1) conflict detection, making the backtracking efficient.

Solution

import json

def solveNQueens(n):
    """
    Place n queens on an n×n chessboard so no two queens attack each other.
    Returns all valid solutions as list of board representations.
    """
    result = []
    
    # Board[i][j] = 'Q' if queen placed, '.' otherwise
    board = [['.' for _ in range(n)] for _ in range(n)]
    
    # Track occupied columns and diagonals - O(1) lookup
    cols = set()      # occupied columns
    diag1 = set()     # '/' diagonals identified by (row + col)
    diag2 = set()     # '\\' diagonals identified by (row - col + n-1)
    
    def backtrack(row):
        # Base case: successfully placed n queens
        if row == n:
            # Convert board to list of strings for output
            solution = [''.join(row) for row in board]
            result.append(solution)
            return
        
        # Try each column in current row
        for col in range(n):
            # Calculate diagonal IDs for this position
            d1 = row + col           # '/' diagonal (top-left to bottom-right)
            d2 = row - col + n - 1   # '\\' diagonal (top-right to bottom-left)
            
            # Skip if this position is under attack
            if col in cols or d1 in diag1 or d2 in diag2:
                continue
            
            # Place queen - claim this column and diagonals
            board[row][col] = 'Q'
            cols.add(col)
            diag1.add(d1)
            diag2.add(d2)
            
            # Recurse to next row
            backtrack(row + 1)
            
            # Backtrack: remove queen and unclaim resources
            board[row][col] = '.'
            cols.remove(col)
            diag1.remove(d1)
            diag2.remove(d2)
    
    backtrack(0)
    return result

Complexity

Time: O(N!)
Space: O(N)

The recursion depth is at most N (one queen per row). The sets store at most N columns and 2N-1 diagonals total. The board itself is N×N but we reuse it, so the auxiliary space is O(N) for the tracking sets plus O(N) for recursion stack.

Common Mistakes

Using list instead of set for conflict checking - O(n) vs O(1) per position leads to much slower code
Forgetting to add offset (n-1) to diagonal 2 ID, causing negative indices
Not converting board to strings before appending, leading to reference issues (mutating board later affects stored solution)
Confusing the two diagonal formulas - d1 = row + col goes one way, d2 = row - col goes the other

Edge Cases

n = 1: Single queen on 1x1 board - 1 solution
n = 2 or 3: No valid placements (queens always attack) - empty result
n = 4: 2 solutions - classic example
Large n: Solutions exist for n = 1 and n >= 4, but not for n = 2, 3

Connections

Subsets (LeetCode 78) - same backtracking framework of building up solutions row by row
Letter Case Permutation (LeetCode 784) - similar decision tree exploration
Combinations (LeetCode 77) - backtracking with state, though this uses sets instead of start index
The diagonal tracking insight (row+col and row-col) is also used in 'Solve the Queens Problem' variations and similar grid conflict detection

Palindrome Partitioning #131

Backtracking with palindrome validation

Intuition

Think of this like a factory line where you're cutting a rope into segments. At each position, you decide whether to make a cut. Each segment you produce must be 'balanced' (a palindrome). You're exploring all possible ways to make these cuts, like a tree of decisions where each branch represents a cut. When a branch leads to a segment that isn't a palindrome, that's a dead end—you 'backtrack' (undo the cut) and try the next option. It's like water finding all possible paths through a maze: flow down each path, and when you hit a wall, back up and try another direction.

Why This Pattern?

The problem requires exploring ALL possible partitions—a classic combinatorial search. At each index, we choose to either cut or not cut, building solutions incrementally. When a chosen substring isn't a palindrome, we backtrack (undo the last decision) to explore alternative paths. This is the exact structure of backtracking: explore, validate, and retreat when needed.

Solution

def partition(s):
    result = []
    path = []  # Current partition being built
    
    def is_palindrome(start, end):
        # Two-pointer check: compare chars from both ends moving inward
        while start < end:
            if s[start] != s[end]:
                return False
            start += 1
            end -= 1
        return True
    
    def backtrack(index):
        # Base case: we've processed entire string
        # Valid partition found - add copy to results
        if index == len(s):
            result.append(path.copy())
            return
        
        # Try every possible end position for the next palindrome
        for end in range(index, len(s)):
            # Only proceed if the substring s[index:end+1] is a palindrome
            # This is our pruning step - don't explore dead ends
            if is_palindrome(index, end):
                path.append(s[index:end+1])  # Make choice: include this palindrome
                backtrack(end + 1)          # Recurse on remaining string
                path.pop()                  # Undo choice: backtrack
    
    backtrack(0)
    return result

Complexity

Time: O(n * 2^n)
Space: O(n)

Common Mistakes

Forgetting to copy path when adding to result (using path directly leads to bugs as it mutates)
Not backtracking (path.pop()) after recursion returns - this causes all partitions to be mixed together
Not pruning at the palindrome check - checking validity only at the end wastes time on impossible branches
Using string slicing incorrectly - must use indices for the palindrome check to avoid O(n) per slice

Edge Cases

Empty string: returns [[]] (one partition, empty)
Single character: returns [[s]] (only one palindrome possible)
String with no repeated chars like 'abc': each char is its own palindrome, generates many partitions
All same characters like 'aaa': maximum partitions (every prefix is a palindrome)

Connections

Similar to Word Break problem - both check if string can be split into valid pieces, but Word Break uses DP while this uses backtracking for ALL solutions
Subsets problem - same backtracking template of 'include/exclude' decisions, just with a different validity check
Palindrome validation uses same two-pointer technique as Longest Palindromic Substring

Permutations #46

Backtracking / Depth-First Search on a choice tree

Intuition

Think of arranging books on a shelf. You have n books and n slots. For the first slot, you can pick any of the n books. For the second slot, any of the remaining n-1 books. And so on. Each path from 'top of the tree' to a 'leaf' is one valid permutation. This is like exploring a tree where at each level you pick one of the remaining unchosen elements. The key insight: once you've picked an element, it's 'locked in' for that branch - you can't reuse it until you backtrack (undo that choice).

Why This Pattern?

We need to generate ALL possible orderings. At each step we have a set of available choices (unused elements). We try each choice, recurse to build the rest of the permutation, then undo that choice to try the next option. This is the canonical backtracking structure: make choice → explore → undo choice. The 'tree' structure emerges naturally because each choice branches into multiple subtrees.

Solution

def permute(nums):
    result = []
    path = []
    used = [False] * len(nums)  # tracks which elements are already in our current path
    
    def backtrack():
        # Base case: we've picked all elements → we have a complete permutation
        if len(path) == len(nums):
            result.append(path[:])  # MUST copy! otherwise all results point to same list
            return
        
        # Try each element that hasn't been used yet
        for i in range(len(nums)):
            if not used[i]:
                # MAKE CHOICE: pick nums[i]
                used[i] = True
                path.append(nums[i])
                
                # EXPLORE: recurse to fill remaining positions
                backtrack()
                
                # UNDO CHOICE: backtrack to try other options
                used[i] = False
                path.pop()
    
    backtrack()
    return result

Complexity

Time: O(n! * n)
Space: O(n)

Common Mistakes

Forgetting to copy the path when appending to result - results in list of references to the same mutable object
Not properly backtracking (undoing both used flag and path removal) - causes bugs or missing permutations
Using O(n) extra space for a 'seen' set when a simple boolean array is more efficient

Edge Cases

Empty array → returns [[]] (one permutation of nothing)
Single element → returns [[x]]
Duplicate elements in input - this problem specifies distinct integers, but if they weren't, you'd need to handle duplicates (see Permutations II)

Connections

Permutations II (#47) - same pattern but with duplicates, requires skip logic
Subset (#78) - similar backtracking but builds up rather than fills all positions
Combination Sum (#39) - backtracking with different constraint (sum target)
N-Queens (#51) - classic backtracking with placement validation

Subsets II #90

Backtracking with duplicate skipping (sort-and-skip pattern)

Intuition

Think of this like organizing a photo album where you have multiple identical photos of the same person. You want to create pages representing all possible selections of photos, but you don't want duplicate pages (e.g., two pages both with just 'photo A'). The trick: sort all photos first so identical ones are adjacent. Then when building your album, once you decide NOT to include a particular person on the current page, skip over ALL their identical photos before moving to the next person. This guarantees no duplicates because any subset containing 'photo A' would be identical to a subset containing 'photo A' from a different position - so we only generate one.

Why This Pattern?

When the array is sorted, all duplicate values become adjacent. The structural property: at any recursion level, if we skip nums[i], then including nums[i+1] (which equals nums[i]) would create a subset identical to what we'd get by including nums[i]. So we skip all consecutive duplicates at each recursion depth. This is the same core insight as Subsets I, but with an additional pruning step.

Solution

def subsetsWithDup(self, nums: List[int]) -> List[List[int]]:
    res = []
    nums.sort()  # Critical: sort to group duplicates together
    
    def backtrack(start, path):
        # Every path is a valid subset - add copy (not reference)
        res.append(path[:])
        
        for i in range(start, len(nums)):
            # Skip duplicates: if same as previous element at this level, skip
            # The 'i > start' check ensures we only skip within the SAME recursion level
            if i > start and nums[i] == nums[i-1]:
                continue
            
            # Choose: include current element
            path.append(nums[i])
            
            # Explore: recurse with next index
            backtrack(i + 1, path)
            
            # Un-choose: backtrack
            path.pop()
    
    backtrack(0, [])
    return res

Complexity

Time: O(n * 2^n)
Space: O(n) for recursion stack (excluding output)

There are at most 2^n subsets, and we spend O(n) time copying each subset into the result. The duplicate-skipping optimization doesn't reduce worst-case complexity (which occurs when all elements are unique), but it dramatically reduces the constant factor in practice. The recursion depth is at most n.

Common Mistakes

Forgetting to sort - this breaks the duplicate-skipping logic since duplicates won't be adjacent
Putting the duplicate check as 'if i > 0' instead of 'if i > start' - this incorrectly skips valid subsets because it compares across different recursion levels
Modifying path directly in res.append(path) - must append a copy with path[:] or list(path)
Not handling the empty subset - but the algorithm naturally includes it since we add before looping

Edge Cases

Empty array: returns [[]]
All same elements like [1,1,1]: returns [[],[1],[1,1],[1,1,1]] - exactly n+1 subsets
Already sorted input: still works but sorting is redundant (harmless)
Single element: returns [[], [x]]

Connections

Subsets I (#78): Same problem without duplicates - this is the 'with duplicates' version, same pattern but with skip logic added
Combination Sum II: Uses identical duplicate-skipping pattern when iterating through candidates
Permutations II: Similar duplicate-skipping but at a different level (used seen array instead of sorted skip)

Subsets #78

Backtracking / Decision Tree Traversal

Intuition

Think of this like exploring all possible paths in a decision tree. For each element in the array, you face a binary choice: include it in your current subset, or don't include it. Imagine you have coins [1, 2, 3] — for each coin, you flip to decide include/exclude. The power set is simply all possible combinations of these decisions. It's like a game where at every step you can either take the current element or leave it, and you explore every possible combination of these yes/no choices.

Why This Pattern?

The problem naturally forms a binary tree structure where each level represents a decision (include or exclude element i). Starting from an empty set, you branch two ways at each element: add it to the current subset, or skip it. This creates exactly 2^n leaf nodes (subsets). Backtracking is ideal because you build solutions incrementally, explore all branches, then 'undo' the last decision to try other paths — classic depth-first exploration of a decision space.

Solution

def subsets(nums):
    result = []
    
    def backtrack(start, path):
        # Every path in the decision tree IS a valid subset
        # Make a copy! Otherwise we append a reference that changes
        result.append(path[:])
        
        # Try adding each remaining element, one at a time
        for i in range(start, len(nums)):
            path.append(nums[i])           # Choose: include nums[i]
            backtrack(i + 1, path)         # Recurse with remaining elements
            path.pop()                     # Un-choose: backtrack
    
    backtrack(0, [])
    return result

Complexity

Time: O(n * 2^n)
Space: O(n)

There are exactly 2^n possible subsets (each of n elements can be either in or out). Building each subset takes O(n) time since we copy the path each time we add to result. So total is O(n * 2^n). This is optimal because you MUST generate 2^n subsets — you can't do better than examining every possible combination.

Common Mistakes

Forgetting to copy the path when appending to result — results in all subsets being empty or referencing the same mutable list
Using result.append(path) instead of result.append(path[:])
Not incrementing the start index in recursive call, causing duplicate elements in a single subset (combinations instead of subsets)
Modifying result list while iterating over it in iterative approaches

Edge Cases

Empty input [] returns [[]] — the power set always contains the empty set
Single element [1] returns [[], [1]] — binary choice
All same elements like [1,1,1] — problem states distinct integers so no duplicates to worry about
Maximum input size — recursion depth equals n, which is fine for Python's default recursion limit (~1000)

Connections

This is the foundation for all subset-related problems — similar structure to Subsets II (LeetCode 90) which adds deduplication, and Combinations (LeetCode 77) which is just subsets with a fixed size constraint
The iterative bit-manipulation solution uses the same insight as Gray Code (LeetCode 89) — treating subsets as binary numbers
In some ways, this is the 'hello world' of backtracking — the exact same pattern (decision tree, choose/unchoose/backtrack) applies to Permutations, Combination Sum, and many other problems

Word Search #79

Backtracking (Depth-First Search on a grid)

Intuition

Imagine you're exploring a cave system looking for a hidden message written on rocks. You can only move up, down, left, or right, and you can't step on the same rock twice (because that would reuse a letter). At each junction, you try one direction; if it doesn't lead anywhere, you backtrack and try another. The grid is like a graph where each cell connects to its neighbors, and you're searching for any path that spells out the word. The key insight: you can START from any cell that matches the first letter — you don't know which entrance leads to the solution.

Why This Pattern?

The problem has exponential branching — at each cell you have up to 4 choices, and you need to find ANY valid path. Backtracking naturally explores one path deeply, then 'undoes' moves to try alternatives. The 'visited' mechanism (marking cells temporarily) ensures you don't reuse cells within a single path, which is essential since each cell can only be used once per word construction.

Solution

def exist(board, word):
    if not board or not board[0]:
        return False
    
    rows, cols = len(board), len(board[0])
    
    def backtrack(r, c, index):
        # Base case: we've matched all characters
        if index == len(word):
            return True
        
        # Check bounds and if current cell matches the current letter
        if (r < 0 or r >= rows or c < 0 or c >= cols or 
            board[r][c] != word[index]):
            return False
        
        # Mark as visited by temporarily replacing with a placeholder
        # (can't use None because board contains chars)
        original = board[r][c]
        board[r][c] = '#'
        
        # Explore all 4 directions: right, left, down, up
        for dr, dc in [(0, 1), (0, -1), (1, 0), (-1, 0)]:
            if backtrack(r + dr, c + dc, index + 1):
                return True
        
        # Backtrack: restore the original character
        # This 'undo' is what makes it backtracking
        board[r][c] = original
        return False
    
    # Try starting from every cell (any could be the entrance)
    for i in range(rows):
        for j in range(cols):
            if backtrack(i, j, 0):
                return True
    
    return False

Complexity

Time: O(M * N * 4^L) where M*N is the board size and L is word length
Space: O(L) for recursion stack (maximum depth equals word length)

In the worst case, we visit every cell and from each cell explore up to 4 directions recursively. The branching factor is 4, but we can't reuse cells, so it's effectively a search through a subset of all possible paths. We can't do better in the worst case because we might need to explore almost all paths to determine if a match exists — it's like checking every possible route in a maze.

Common Mistakes

Forgetting to unmark/backtrack cells after exploring, causing false negatives
Not checking bounds before accessing board positions, causing index errors
Trying to use a global 'visited' set instead of marking/unmarking in place (doesn't work because different paths need different visit histories)
Starting the search only from [0][0] instead of all cells

Edge Cases

Empty board or single character board
Word is single character — need to find that one letter
Word length exceeds total cells on board (impossible case)
All cells have same letter and word is all that letter — must ensure path doesn't revisit cells
Word not found at all

Connections

Word Search II (LeetCode #212): Same traversal, but searches for multiple words using a Trie prefix tree to optimize
Path Finding problems (like #63 Unique Paths II): Use similar DFS with backtracking logic
The 'mark visited' pattern appears in flood fill problems (#733) and island problems (#200)

Graphs (13)

Clone Graph #133

Graph Traversal with Node Mapping (DFS/BFS with Hash Map)

Intuition

Think of this like copying a social network. You know one person, and you need to map out ALL their connections, then build an exact duplicate network. The key insight: as you traverse, you must remember which people you've already copied (using a hash map). Otherwise, if there's a cycle (A knows B, B knows A), you'd either loop forever or create duplicate copies of the same person. The hash map solves both problems simultaneously - it acts as your 'visited' set to prevent infinite recursion AND as a reference table so all edges pointing to the same original node point to the same copy.

Why This Pattern?

Graphs can contain cycles - nodes can reference nodes we've already visited. The structural property that makes this pattern natural is: we need a data structure that serves double duty - tracking 'visited' status to prevent infinite loops while also maintaining the mapping from original to copy so that all edges to the same node reference the same cloned node. A hash map elegantly solves both in one data structure.

Solution

"""
# Definition for undirected graph node
class Node:
    def __init__(self, val=0, neighbors=None):
        self.val = val
        self.neighbors = neighbors if neighbors is not None else []

def cloneGraph(node):
    if not node:
        return None
    
    # Hash map: {original_node: cloned_node}
    # This is our 'memory' - tracks which originals we've already copied
    clone_map = {}
    
    def dfs(original):
        # Base case: if we've already copied this node, return the copy
        # This check is what PREVENTS infinite loops on cycles
        if original in clone_map:
            return clone_map[original]
        
        # Create the clone for this node
        clone = Node(original.val)
        clone_map[original] = clone  # Store BEFORE recursing to handle self-loops
        
        # Recursively clone all neighbors
        for neighbor in original.neighbors:
            cloned_neighbor = dfs(neighbor)
            clone.neighbors.append(cloned_neighbor)
        
        return clone
    
    return dfs(node)

# BFS alternative (conceptually similar, just iterative):
from collections import deque

def cloneGraphBFS(node):
    if not node:
        return None
    
    clone_map = {node: Node(node.val)}
    queue = deque([node])
    
    while queue:
        original = queue.popleft()
        
        for neighbor in original.neighbors:
            # If neighbor hasn't been cloned yet, create clone and add to queue
            if neighbor not in clone_map:
                clone_map[neighbor] = Node(neighbor.val)
                queue.append(neighbor)
            # Link the cloned neighbor to the cloned current node
            clone_map[original].neighbors.append(clone_map[neighbor])
    
    return clone_map[node]"""

Complexity

Time: O(V + E)
Space: O(V) for the hash map + O(V) for recursion stack (DFS) or O(V) for queue (BFS) = O(V + E) auxiliary space for the clones themselves, plus O(V) for the map

We must visit every node at least once (that's V). For each node, we examine all its edges to connect neighbors (that's E total across all nodes). We can't do better because we literally need to create a copy of every node and edge - the work is inherent to the problem size. The hash map operations are O(1) average, so they don't add to our complexity.

Common Mistakes

Forgetting to check if node already exists in clone_map, causing infinite recursion on cycles
Not storing the mapping BEFORE recursing (causes issues with self-loops)
Not linking neighbors correctly - each neighbor needs to be appended to the clone's list
Returning wrong node at end - must return the cloned version, not original

Edge Cases

Empty graph / null input node - return None
Single node with no neighbors
Graph with cycles (A->B, B->A)
Graph with self-loops (node points to itself)
Multiple nodes with same value (value is NOT unique identifier, node object is)

Connections

This is the foundational pattern for 'graph copying' - same logic appears in Deep Copy Linked List with Random Pointer (#138) where you need to copy nodes while handling arbitrary references
Similar to Serialize/Deserialize Binary Tree - both require traversing a structure while maintaining a mapping
The hash map-as-visited-set technique is fundamental to all graph traversal problems with cycles

Course Schedule II #210

Topological Sort using Kahn's Algorithm (BFS with in-degree counting)

Intuition

Think of this like a dependency resolution system — like a package manager installing software where each package might depend on others already installed. You're looking for a valid installation order. Each course is a node, each prerequisite relationship is a directed edge from prerequisite → dependent. We need a linear ordering where all dependencies come before what depends on them. The trick: always pick courses with NO prerequisites first (in-degree 0), take them, then update the graph. This is like peeling layers off an onion — start from the outside (no dependencies) and work inward.

Why This Pattern?

The problem has a natural DAG structure — courses and prerequisites form a directed graph where edges point from prerequisite to dependent. Topological sort finds a linear ordering that respects all directed edges. Kahn's algorithm exploits the key property: nodes with in-degree 0 (no prerequisites) can always be taken first. When we 'take' such a course, we effectively remove its outgoing edges, decreasing the in-degree of its dependents. This cascading 'unlocking' is the natural consequence of the dependency structure.

Solution

from collections import deque, defaultdict

def findOrder(numCourses: int, prerequisites: List[List[int]]) -> List[int]:
    # Step 1: Build the graph (adjacency list) and track in-degrees
    # graph[prereq] = list of courses that depend on prereq
    graph = defaultdict(list)
    in_degree = [0] * numCourses
    
    for course, prereq in prerequisites:
        graph[prereq].append(course)  # prereq -> course dependency
        in_degree[course] += 1  # course has one more prerequisite
    
    # Step 2: Initialize queue with courses that have NO prerequisites
    # These are our "starting points" - like packages with no dependencies
    queue = deque([i for i in range(numCourses) if in_degree[i] == 0])
    result = []
    
    # Step 3: Process courses using BFS (Kahn's algorithm)
    while queue:
        course = queue.popleft()  # Take a course we CAN take
        result.append(course)
        
        # "Taking" this course unlocks its dependents
        # Reduce their in-degree (fewer prerequisites remaining)
        for dependent in graph[course]:
            in_degree[dependent] -= 1
            if in_degree[dependent] == 0:
                # All prerequisites satisfied - now THIS course is unlockable
                queue.append(dependent)
    
    # If we couldn't take all courses, there's a cycle (impossible)
    return result if len(result) == numCourses else []

Complexity

Time: O(V + E) where V = numCourses and E = len(prerequisites)
Space: O(V + E) for the graph adjacency list, in-degree array, and queue. The result array is O(V).

We visit each course exactly once (V operations) and traverse each prerequisite relationship exactly once (E edges). We can't do better because every course and every dependency must be processed to produce a valid ordering — you can't know the position of a course without checking its dependencies.

Common Mistakes

Reversing the direction of edges — remember edge goes FROM prerequisite TO dependent
Forgetting to check for cycles — if result length != numCourses, return empty array
Not initializing in-degree correctly — must count ALL prerequisites
Using DFS incorrectly — forgetting to detect cycles or getting post-order reversal wrong

Edge Cases

No prerequisites at all (empty prerequisites list) — return any valid order [0, 1, 2, ...]
All courses have prerequisites forming a single chain — order is deterministic
No valid ordering due to cycle — e.g., course A requires B, B requires A
Single course with no prerequisites
Two separate dependency chains that can be interleaved arbitrarily

Connections

Course Schedule I (#207) — same problem but only asks if possible, not the order
Topological Sort is used in: task scheduling, build systems, package dependency resolution, course planning
Similar pattern as: Clone Graph, Alien Dictionary, Minimum Height Trees — all leverage BFS on graphs with in-degree/out-degree properties
Unlike DFS approach, this BFS approach naturally finds one valid ordering and handles cycles by checking final count

Course Schedule #207

Cycle Detection in Directed Graph using Topological Sort (Kahn's Algorithm)

Intuition

Think of this like planning a construction project. Each course is a 'task' and each prerequisite is a 'dependency' - you must complete prerequisites before the dependent course. If you have a circular dependency (A needs B, B needs C, C needs A), it's like having a circular blueprint - you'd never be able to start! This is exactly what a CYCLE in a directed graph represents. The question reduces to: 'Is there a cycle in this dependency graph?' If NO cycle exists (a DAG), you can complete all courses. If a cycle exists, you're stuck.

Why This Pattern?

A valid course schedule corresponds exactly to a DAG (Directed Acyclic Graph). Topological sort works because we process courses that have no prerequisites first (in-degree = 0), 'consuming' them and potentially freeing up their dependents. If we can process ALL courses, no cycles exist. If we get stuck with courses that still have unmet prerequisites, a cycle exists - that's the key insight.

Solution

from collections import defaultdict, deque

def canFinish(numCourses, prerequisites):
    """
    Determine if all courses can be finished given prerequisite relationships.
    Uses Kahn's Algorithm (BFS topological sort).
    """
    # Step 1: Build the graph and compute in-degrees
    # graph[prereq] = list of courses that require this prereq
    # in_degree[course] = how many prerequisites this course needs
    graph = defaultdict(list)
    in_degree = [0] * numCourses
    
    for course, prereq in prerequisites:
        # Important: prereq -> course (you need prereq BEFORE course)
        graph[prereq].append(course)
        in_degree[course] += 1
    
    # Step 2: Initialize queue with courses that have NO prerequisites
    # These are 'free' to take - they're our starting points
    queue = deque([i for i in range(numCourses) if in_degree[i] == 0])
    completed = 0
    
    # Step 3: Process courses in topological order
    # Take a course with no remaining prerequisites, 'complete' it,
    # then reduce the in-degree of all courses that depended on it
    while queue:
        course = queue.popleft()
        completed += 1
        
        # 'Complete' this course by reducing dependents' in-degree
        for dependent in graph[course]:
            in_degree[dependent] -= 1
            # If dependent now has all prerequisites met, it becomes available
            if in_degree[dependent] == 0:
                queue.append(dependent)
    
    # If we completed all courses, no cycle existed
    return completed == numCourses

Complexity

Time: O(V + E) where V = numCourses and E = len(prerequisites)
Space: O(V + E) for the graph adjacency list and in-degree array

We must visit every course (V) and process every prerequisite relationship (E) at least once. Each edge is traversed exactly once when its source course is processed. This is optimal because we need to examine the entire dependency structure to determine if a cycle exists - you can't shortcut by skipping courses or relationships.

Common Mistakes

Misunderstanding prerequisite direction: 'prereq = [a,b]' means 'to take a, you need b first', so edge is b -> a, NOT a -> b
Forgetting courses with no prerequisites exist - they should start in the queue with in-degree 0
Not handling isolated nodes (courses that appear only as prerequisites but have no dependents)
Confusing undirected cycle detection with directed cycle detection - direction matters here

Edge Cases

Empty prerequisite list - should return True (no cycles possible)
Single course with no prerequisites - returns True
Already completed all courses (numCourses=0) - returns True
Prerequisites where a course is its own prerequisite - correctly detects cycle
Two courses depending on each other - correctly detects cycle

Connections

Course Schedule II (#210): Same problem but also returns the topological order. Uses identical cycle detection logic. 'Minimum Height Trees' (#310): Uses similar indegree-based pruning to find nodes in the 'center' of a graph. 'Clone Graph' (#133): Different graph traversal but uses similar BFS/DFS patterns. 'Reorder Routes to Make All Paths Lead to City Zero' (#1466): Another indegree-based problem - noticing which nodes have zero 'net' dependencies.

Graph Valid Tree #261

Union-Find (Disjoint Set Union) / Cycle Detection

Intuition

Think of a tree as a connected water pipeline system with n houses. To connect ALL houses, you need exactly n-1 pipes. Any more and you create a loop (cycle), any fewer and some houses are cut off (disconnected). The key insight: a valid tree has exactly one 'path' between any two points - no detours, no isolated sections. When we process edges, if we ever try to connect two nodes that are ALREADY connected, we've found a cycle. If after processing all edges, all nodes belong to the same 'family' (connected component), we have a tree.

Why This Pattern?

Union-Find naturally models the question 'are these two nodes already connected?' When processing each edge, if find(u) == find(v), then u and v are already connected through some path - adding this edge creates a cycle. If they're not connected, we union them. After all edges, if the graph is valid, all n nodes should belong to exactly one set. This is more efficient than BFS/DFS because we avoid recursion and can detect cycles during edge processing.

Solution

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
    
    def find(self, x):
        # Path compression: flatten the tree
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
    
    def union(self, x, y):
        # Union by rank: attach smaller tree under larger
        root_x, root_y = self.find(x), self.find(y)
        if root_x == root_y:
            return False  # Already connected - cycle detected!
        if self.rank[root_x] < self.rank[root_y]:
            root_x, root_y = root_y, root_x
        self.parent[root_y] = root_x
        if self.rank[root_x] == self.rank[root_y]:
            self.rank[root_x] += 1
        return True

def validTree(n, edges):
    # A tree must have exactly n-1 edges
    if len(edges) != n - 1:
        return False
    
    uf = UnionFind(n)
    
    for u, v in edges:
        # If u and v already connected, adding this edge creates a cycle
        if not uf.union(u, v):
            return False
    
    # If we got here, no cycles and we have n-1 edges
    # With n nodes and n-1 edges, connectivity is guaranteed
    return True

Complexity

Time: O(n α(n)) where α is the inverse Ackermann function, which is practically constant (≤4). We process each edge once, and each find/union operation is nearly O(1).
Space: O(n) for the parent and rank arrays.

We need O(n) space to track which node belongs to which set. For time, we process each of the n-1 edges once, and each union-find operation costs α(n) ≈ constant. We can't do better than O(n) because we must at least look at all edges once.

Common Mistakes

Forgetting the n-1 edge check - a graph with n nodes and n edges MUST have a cycle
Not detecting self-loops if they appear in input
Assuming n-1 edges guarantees a tree (disconnected graph with n-1 edges is a forest, not a tree)
Confusing union-find with visited set in DFS - union-find detects cycles during processing, DFS detects them by revisiting nodes

Edge Cases

n = 0: empty graph - not a valid tree
n = 1: single node with no edges - valid tree
Empty edges with n > 1: disconnected, not a tree
Already connected nodes in edge list: cycle
Duplicate edges: cycle

Connections

Number of Islands (#200) - uses same Union-Find pattern
Redundant Connection (#684) - finds the exact edge that creates a cycle using Union-Find
Graph Clone (#133) - BFS/DFS on graphs
Course Schedule (#207) - cycle detection in directed graphs (different but related concept)

Max Area of Island #695

Graph traversal - finding connected components using flood fill (DFS or BFS).

Intuition

Think of the grid as a city where 1s are buildings and 0s are empty lots. You want to find the largest contiguous block of buildings. The trick: when you 'discover' an island, you 'flood' it (turn all its 1s to 0s) so you don't count those cells again. It's like pouring water into each island to mark it as 'measured' - once you've counted an island, you've claimed it, so move on to find the next unclaimed one.

Why This Pattern?

The grid is a graph where each cell connects to its 4 neighbors. We need to find all connected components of 1s and measure their sizes. The flood fill naturally handles this: when we visit a cell, we recursively visit all connected cells, counting as we go, and marking visited cells to avoid double-counting.

Solution

class Solution:
    def maxAreaOfIsland(self, grid: List[List[int]]) -> int:
        def dfs(row, col):
            # Base cases: out of bounds or already water
            if (row < 0 or row >= len(grid) or 
                col < 0 or col >= len(grid[0]) or 
                grid[row][col] == 0):
                return 0
            
            # "Flood" this cell - mark as visited so we don't count it again
            grid[row][col] = 0
            
            # Visit all 4 neighbors and count their areas
            # The +1 counts the current cell itself
            return (1 + 
                    dfs(row + 1, col) + 
                    dfs(row - 1, col) + 
                    dfs(row, col + 1) + 
                    dfs(row, col - 1))
        
        max_area = 0
        for row in range(len(grid)):
            for col in range(len(grid[0])):
                if grid[row][col] == 1:
                    # Found a new island - explore it and update max
                    max_area = max(max_area, dfs(row, col))
        
        return max_area

Complexity

Time: O(m * n) where m is rows and n is columns.
Space: O(m * n) in the worst case - all 1s in a single row or column means the recursion stack could go m or n deep. In practice, it's O(k) where k is the size of the largest island.

Each cell is visited at most once. When we start a DFS from a land cell, we explore the entire island and mark all its cells as visited (0). Future iterations skip these flooded cells. So across the entire algorithm, we do constant work per cell.

Common Mistakes

Forgetting to mark cells as visited - leads to infinite loops or double-counting
Not handling the base case for out-of-bounds properly
Using BFS but not properly tracking visited cells separately (grid modification is a cleaner trick)
Confusing 4-directional with 8-directional connectivity - the problem specifies 4-directional

Edge Cases

Empty grid (0 rows or 0 columns) - should return 0
No islands (all 0s) - should return 0
Single cell island - should return 1
All land (one giant island) - should return m * n
Islands at corners or edges - boundary checks must handle these

Connections

Number of Islands (#200) - same flood-fill pattern, but counts islands instead of finding max area
Island Perimeter (#463) - uses similar grid traversal but computes perimeter not area
Surrounded Regions (#130) - flood fill from boundaries - inverse approach
Clone Graph (#133) - same connected component finding pattern in a different graph structure

Number of Connected Components in an Undirected Graph #323

Union-Find (Disjoint Set Union / DSU)

Intuition

Think of this like counting isolated islands on a map. Each connected group of nodes is like an island - you can travel between any two nodes within an island via the edges, but you can't reach nodes on different islands. The question asks: how many disconnected islands exist in this graph? It's like pouring water at every node and watching it spread along edges - each 'pool' of water is one connected component.

Why This Pattern?

Edges define equivalence relations - if u connects to v, they're in the same set. Union-Find naturally models this: each node starts in its own set, and we union sets when we find connections. The key structural property is that connectivity is transitive (if A connects to B and B to C, then A connects to C), which is exactly what equivalence classes capture. Union-Find with path compression + union by rank gives near-O(1) amortized operations, making it ideal for merging sets dynamically.

Solution

class UnionFind:
    def __init__(self, n):
        # Each node starts as its own parent (self-contained set)
        self.parent = list(range(n))
        # Track tree depth for smart union by rank
        self.rank = [0] * n
    
    def find(self, x):
        # Path compression: flatten the tree by pointing directly to root
        # This makes future lookups O(1) instead of O(tree height)
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
    
    def union(self, x, y):
        root_x, root_y = self.find(x), self.find(y)
        
        # Already in same set - don't double count
        if root_x == root_y:
            return False
        
        # Union by rank: attach shallower tree under deeper tree
        # This keeps the tree balanced and查找 fast
        if self.rank[root_x] < self.rank[root_y]:
            root_x, root_y = root_y, root_x
        
        self.parent[root_y] = root_x
        if self.rank[root_x] == self.rank[root_y]:
            self.rank[root_x] += 1
        
        return True


class Solution:
    def countComponents(self, n: int, edges: List[List[int]]) -> int:
        # Start with n separate components (each node is its own island)
        uf = UnionFind(n)
        components = n
        
        # Process each edge - it connects two nodes, reducing component count
        for u, v in edges:
            if uf.union(u, v):
                components -= 1
        
        return components

Complexity

Time: O(n + E * α(n)) ≈ O(n + E), where α(n) is the inverse Ackermann function (effectively constant < 5 for all practical n)
Space: O(n) for the parent and rank arrays

We visit each node once to initialize and process all edges once. The α(n) comes from Union-Find's near-constant time operations. We can't do better than O(n + E) because we must at least examine every edge to know about all connections - you can't know if nodes are connected without looking at the edges that might connect them.

Common Mistakes

Forgetting to decrement component count only when union actually merges two sets (not when nodes are already connected)
Confusing n (number of nodes) with number of edges - n can be larger than len(edges)
Not using path compression leads to O(n) find operations and O(n²) total time

Edge Cases

No edges provided: returns n (every node is isolated)
Fully connected graph: returns 1 (all nodes in one component)
Graph with cycles: union returns False for redundant edges, so cycles don't affect component count
Single node with no edges: returns 1

Connections

Redundant Connection (LC 684) - uses same Union-Find pattern to detect cycle
Graph Valid Tree (LC 261) - extends this: valid tree if components == 1 AND edges == n-1
Number of Islands (LC 200) - conceptually similar but uses DFS on a grid instead
Surrounded Regions (LC 130) - another Union-Find or DFS connected components problem

Number of Islands #200

Depth-First Search (DFS) flood fill / Connected Components

Intuition

Think of the grid as a map where '1' is land and '0' is water. Each island is a connected blob of land - like a territory on a map. The key insight: once you find any piece of land, you need to 'explore' all connected land to mark it as visited. This is like sending scouts from a landing point - they keep spreading to adjacent land until they've mapped the entire island. You count one island per exploration. It's essentially finding connected components in an implicit graph where cells are nodes and edges connect horizontally/vertically adjacent '1's.

Why This Pattern?

The grid forms an implicit graph where each '1' cell is a node and edges exist between adjacent '1's (up, down, left, right). Islands are connected components in this graph. Finding connected components is exactly what DFS does naturally - start from an unvisited node, explore everything reachable, and that's one component. The grid structure makes DFS cleaner than BFS here since we can modify the input in-place.

Solution

def numIslands(grid):
    if not grid:
        return 0
    
    count = 0
    rows, cols = len(grid), len(grid[0])
    
    def dfs(r, c):
        # Base cases: out of bounds or already water/visited
        if r < 0 or r >= rows or c < 0 or c >= cols or grid[r][c] == '0':
            return
        
        # Mark as visited by converting to '0' (consumes the land)
        grid[r][c] = '0'
        
        # Explore all 4 directions (up, down, left, right)
        dfs(r + 1, c)  # down
        dfs(r - 1, c)  # up
        dfs(r, c + 1)  # right
        dfs(r, c - 1)  # left
    
    # Scan entire grid
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == '1':  # Found unvisited land = new island
                count += 1
                dfs(r, c)  # Flood fill to mark entire island as visited
    
    return count

Complexity

Time: O(rows * cols)
Space: O(rows * cols) in worst case (recursion stack)

Worst case: the entire grid is one island, so DFS goes as deep as rows*cols (a snake-like path). Best case: no land at all, so O(1) stack space. On average, it's proportional to the size of the largest island.

Common Mistakes

Forgetting to mark cells as visited, causing infinite loops or double counting islands
Using diagonal adjacency instead of only horizontal/vertical
Not checking boundary conditions in DFS, causing index out of bounds
Modifying the original grid when you shouldn't (though here it's allowed)

Edge Cases

Empty grid [] - return 0
Grid with only water [['0','0']] - return 0
Grid with only land [['1','1']] - return 1
Single cell '1' or '0'
Islands touching at corners (diagonal) - should be separate islands

Connections

Flood Fill (#733) - same DFS flood fill pattern
Number of Connected Components in Graph - general case of this problem
Surrounded Regions (#130) - uses DFS to find reachable regions
Pacific Atlantic Water Flow (#417) - multi-source DFS on similar grid
Clone Graph (#133) - DFS for connected components in graph format

Pacific Atlantic Water Flow #417

Reverse flood-fill from boundaries (also called 'multi-source BFS/DFS')

Intuition

Imagine you're a drop of water on a mountain range. You can only flow downhill (or stay level) to neighboring cells. The Pacific Ocean touches the left and top edges; the Atlantic touches the right and bottom. Instead of checking every possible path from each cell to both oceans (expensive!), think backwards: if a cell can reach the ocean going forward, then from that ocean we can reach the cell going backward. It's like flood-filling from the coastlines inland — if water can physically flow from the mountains to the sea going forward, we can trace that same path in reverse from the sea back to the mountains. Cells reachable from BOTH oceans are the answer.

Why This Pattern?

The key insight is that water flow is reversible. If water can flow A→B→→ocean, then in reverse we can go ocean→B→A. By starting from all boundary cells simultaneously and 'flowing' backward through cells that are downhill or level, we find every cell that can drain to each ocean. We then intersect the two reachable sets. This transforms an expensive 'from each cell to both boundaries' problem into two 'from boundaries to all cells' problems.

Solution

from typing import List

class Solution:
    def pacificAtlantic(self, heights: List[List[int]]) -> List[List[int]]:
        if not heights or not heights[0]:
            return []
        
        rows, cols = len(heights), len(heights[0])
        
        # Track which cells can reach each ocean
        pacific_reachable = [[False] * cols for _ in range(rows)]
        atlantic_reachable = [[False] * cols for _ in range(rows)]
        
        def dfs(r: int, c: int, visited: List[List[bool]]) -> None:
            """Flood fill from ocean inward - cells can reach the ocean if
            they can flow from this cell to a neighbor that's already reachable."""
            visited[r][c] = True
            # Check all 4 directions - can flow to neighbor if neighbor <= current
            # (water flows downhill or stays level)
            for dr, dc in [(1,0), (-1,0), (0,1), (0,-1)]:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and not visited[nr][nc]:
                    # Can flow backward if neighbor's height <= current cell's height
                    # (we're going in reverse direction)
                    if heights[nr][nc] >= heights[r][c]:
                        dfs(nr, nc, visited)
        
        # Start DFS from Pacific boundary (top row and left column)
        for c in range(cols):
            if not pacific_reachable[0][c]:
                dfs(0, c, pacific_reachable)
        for r in range(rows):
            if not pacific_reachable[r][0]:
                dfs(r, 0, pacific_reachable)
        
        # Start DFS from Atlantic boundary (bottom row and right column)
        for c in range(cols):
            if not atlantic_reachable[rows-1][c]:
                dfs(rows-1, c, atlantic_reachable)
        for r in range(rows):
            if not atlantic_reachable[r][cols-1]:
                dfs(r, cols-1, atlantic_reachable)
        
        # Find cells that can reach both oceans
        result = []
        for r in range(rows):
            for c in range(cols):
                if pacific_reachable[r][c] and atlantic_reachable[r][c]:
                    result.append([r, c])
        
        return result

Complexity

Time: O(m * n)
Space: O(m * n)

We visit each cell at most twice (once from Pacific DFS, once from Atlantic DFS), so that's 2*mn operations. Each cell is marked visited to prevent redundant work. We can't do better because we genuinely need to examine each cell's connectivity to both boundaries — the answer could include any cell in the grid.

Common Mistakes

Using strict less-than (<) instead of less-than-or-equal (<=) when checking if water can flow - water can stay level
Not marking visited during DFS, causing infinite loops or exponential time
Starting DFS from ALL cells instead of just boundary cells - this misses the optimization
Forgetting edge cases like empty grid or single cell

Edge Cases

Empty grid (return [])
Single cell grid - can it reach both? Depends if it's on boundary (yes) or not (no, it can't reach either ocean from middle)
All same heights - every cell can reach both oceans
Strictly increasing heights from ocean inward - only boundary cells can reach each ocean
Strictly decreasing heights from ocean inward - all cells can reach both oceans

Connections

Number of Islands (200) - similar flood fill mechanics, different connectivity criteria
Clone Graph (133) - graph traversal pattern
Surrounded Regions (130) - reverse flood fill from boundaries, different rule for what counts as valid

Redundant Connection #684

Union-Find / Disjoint Set Union (DSU)

Intuition

Think of this like building a river delta. A tree is like a river system with no loops - there's exactly one path from any point to any other. When you add ONE extra edge to a tree, you create exactly one loop (cycle), like water finding a shortcut back to itself. The problem asks: which edge, when added, created that loop? Here's the key insight: If you build the graph edge-by-edge, the moment you try to connect two nodes that are ALREADY connected, you've found your redundant edge. Why? Because a proper tree with n nodes has exactly n-1 edges. The moment you add the nth edge, you're guaranteed to create a cycle - it's mathematically impossible not to. It's like finding where the traffic jam formed: when two previously separate traffic paths merge and you discover they're actually already connected, that's the bottleneck - the redundant connection.

Why This Pattern?

DSU is the natural choice because we need to efficiently answer: "Are these two nodes already connected?" as we process each edge. DSU provides nearly O(1) amortized time for this connectivity query using path compression and union by rank. This is exactly the incremental cycle detection we need - we build components as we go and the instant we try to union two nodes already in the same set, we've found our cycle.

Solution

def findRedundantConnection(self, edges: List[List[int]]) -> List[int]:
    # DSU with path compression
    # n nodes, n-1 edges = tree. One extra edge creates exactly one cycle.
    parent = list(range(len(edges) + 1))  # 1-indexed: parent[i] = parent of node i
    
    def find(x):
        """Find root with path compression - makes future lookups O(1)"""
        if parent[x] != x:
            parent[x] = find(parent[x])  # recursively compress path
        return parent[x]
    
    def union(x, y) -> bool:
        """Union two sets. Returns True if successfully merged, False if already connected (cycle!)."""
        px, py = find(x), find(y)
        if px == py:
            # Already in same set - adding this edge creates a cycle!
            return False
        # Union by rank: attach smaller tree under larger tree
        parent[px] = py
        return True
    
    # Process each edge in order
    for u, v in edges:
        if not union(u, v):
            # Found the edge that creates a cycle - this is our redundant connection
            return [u, v]
    
    # Should never reach here if input guarantees exactly one cycle
    return []

Complexity

Time: O(N × α(N)) where α is the inverse Ackermann function
Space: O(N) for the parent array

For N edges, we do up to N union/find operations. With path compression and union by rank, each operation takes amortized O(α(N)) - which is effectively constant (α(N) ≤ 4 for any realistic N). So practically O(N). We can't do better than O(N) because we must examine each edge at least once to know which one is redundant.

Common Mistakes

Forgetting 1-indexing - the nodes are 1-indexed, not 0-indexed
Confusing which redundant edge to return - it's the LAST edge in the input that creates a cycle, not the first
Not understanding DSU fundamentals - parent[x] stores the 'representative' of the set containing x

Edge Cases

Single node with one extra edge (2 nodes, 2 edges = one cycle)
All edges form a simple chain with one cycle at the end
The cycle could be anywhere in the input - might not be at the end

Connections

T
h
i
s
i
s
t
h
e
f
o
u
n
d
a
t
i
o
n
a
l
U
n
i
o
n
-
F
i
n
d
p
r
o
b
l
e
m
-
s
i
m
i
l
a
r
t
o
G
r
a
p
h
V
a
l
i
d
T
r
e
e
(
#
2
6
1
)
w
h
i
c
h
a
s
k
s
i
f
a
g
r
a
p
h
i
s
a
t
r
e
e
(
n
o
c
y
c
l
e
s
,
c
o
n
n
e
c
t
e
d
)
.
A
l
s
o
c
o
n
n
e
c
t
s
t
o
N
u
m
b
e
r
o
f
I
s
l
a
n
d
s
(
#
2
0
0
)
w
h
e
r
e
D
S
U
t
r
a
c
k
s
c
o
n
n
e
c
t
e
d
c
o
m
p
o
n
e
n
t
s
,
a
n
d
t
o
P
r
o
b
l
e
m
s
l
i
k
e
N
u
m
b
e
r
o
f
O
p
e
r
a
t
i
o
n
s
t
o
M
a
k
e
N
e
t
w
o
r
k
C
o
n
n
e
c
t
e
d
(
#
1
3
1
9
)
w
h
i
c
h
a
s
k
s
h
o
w
m
a
n
y
e
d
g
e
s
t
o
r
e
m
o
v
e
t
o
m
a
k
e
a
f
o
r
e
s
t
.

Rotting Oranges #994

Multi-Source Breadth-First Search (BFS)

Intuition

Think of this like a contagion spreading through a population. Rotten oranges are 'infected' nodes that transmit the infection to adjacent healthy nodes each minute. This is exactly like pouring dye into water - it spreads outward in waves. The key insight: each 'wave' of BFS represents exactly one minute of time. We're essentially asking: how many waves of infection until all reachable fresh oranges are contaminated? The maximum distance from any fresh orange to its nearest initially-rotten orange tells us the total time needed.

Why This Pattern?

The grid is an unweighted graph where each cell connects to its 4 neighbors. BFS naturally explores by increasing distance from sources - meaning it finds the shortest path in terms of 'hops'. Since each hop represents exactly one minute, the level (depth) at which we reach a fresh orange tells us exactly when it rots. We need multi-source because multiple oranges can start rotten simultaneously, and we want the minimum time from ANY source.

Solution

from collections import deque

def orangesRotting(grid):
    rows, cols = len(grid), len(grid[0])
    queue = deque()
    fresh_count = 0
    
    # First pass: find all initially rotten oranges and count fresh ones
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 2:
                queue.append((r, c, 0))  # (row, col, time)
            elif grid[r][c] == 1:
                fresh_count += 1
    
    # No fresh oranges at all - already done!
    if fresh_count == 0:
        return 0
    
    # No rotten oranges to start the chain reaction
    if not queue:
        return -1
    
    directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    minutes = 0
    
    # BFS: process level by level (each level = one minute)
    while queue:
        r, c, time = queue.popleft()
        minutes = time  # Track the latest time we've processed
        
        # Spread to adjacent cells
        for dr, dc in directions:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 1:
                grid[nr][nc] = 2  # Rot!
                fresh_count -= 1
                queue.append((nr, nc, time + 1))
    
    # If any fresh oranges remain, they were unreachable
    return -1 if fresh_count > 0 else minutes

Complexity

Time: O(rows * cols)
Space: O(rows * cols) in worst case

The queue could hold up to all cells in the worst case (if all oranges start rotten and we add them all). Additionally, we modify the grid in-place, so the extra space is primarily the queue. In practice, it's bounded by the number of rotten oranges at any given BFS level.

Common Mistakes

Forgetting to handle the case where there are NO initially rotten oranges (but fresh oranges exist) - this should return -1
Not tracking fresh_count separately and checking it at the end to detect unreachable oranges
Confusing the BFS level with the time - the time is stored IN the queue node, not computed from queue size
Modifying the grid while iterating over it in the first pass (should use separate passes)

Edge Cases

Empty grid (0 rows or 0 cols) - return 0
All fresh oranges, no rotten - return -1
All oranges already rotten - return 0
Fresh oranges completely isolated from any rotten orange - return -1
Single cell grid with fresh orange - return -1
Single cell grid with rotten orange - return 0

Connections

LeetCode 542 'Distance of Nearest Cell' - IDENTICAL pattern, just asks for the distance values instead of just the maximum
LeetCode 286 'Walls and Gates' - same multi-source BFS propagation logic, but spreads -1 (infinity) from gates
LeetCode 130 'Surrounded Regions' - uses BFS/DFS to find connected components, though different goal
LeetCode 1020 'Number of Enclaves' - similar grid BFS/DFS to count reachable cells

Surrounded Regions #130

Boundary-Connected Flood Fill with Complement

Intuition

Think of this like water flow or an escape room. The 'O's on the boundary are like exits - any 'O' connected to the boundary can 'escape' to the outside world. Only the 'O's that have NO path to the boundary are truly trapped (surrounded). Instead of trying to find all trapped regions directly (hard), we flip the problem: flood-fill from the boundary to mark everything that CAN escape, then flip everything else. It's like asking 'what's NOT captured' rather than 'what is captured'.

Why This Pattern?

The key insight is that any 'O' on the boundary (or connected to one via other 'O's) is NOT surrounded - it's 'touching the ocean' and can escape. Rather than finding all surrounded regions (which requires complex region detection), we find the complement: mark all escapeable 'O's, then flip everything remaining. This is O(n) instead of exponential because we visit each cell at most once.

Solution

def solve(board):
    if not board:
        return
    
    rows, cols = len(board), len(board[0])
    
    def dfs(r, c):
        # Base: out of bounds or not an unvisited 'O'
        if r < 0 or r >= rows or c < 0 or c >= cols or board[r][c] != 'O':
            return
        
        # Mark this 'O' as escapeable (temporary marker)
        board[r][c] = 'E'
        
        # Visit all 4 neighbors - water flows out
        dfs(r + 1, c)
        dfs(r - 1, c)
        dfs(r, c + 1)
        dfs(r, c - 1)
    
    # Step 1: Start DFS from ALL boundary cells that are 'O'
    # These are the 'water sources' that can reach the outside
    for r in range(rows):
        for c in range(cols):
            is_boundary = (r == 0 or r == rows - 1 or c == 0 or c == cols - 1)
            if is_boundary and board[r][c] == 'O':
                dfs(r, c)
    
    # Step 2: Flip remaining 'O's (trapped) to 'X', restore 'E' to 'O'
    for r in range(rows):
        for c in range(cols):
            if board[r][c] == 'O':
                board[r][c] = 'X'  # Trapped - capture it
            elif board[r][c] == 'E':
                board[r][c] = 'O'  # Was escapeable - restore

Complexity

Time: O(rows × cols)
Space: O(rows × cols) for recursion stack in worst case (all cells are connected 'O's)

Each cell is visited at most twice: once during boundary flood fill (if it's escapeable), and once in the final pass. We can't do better because we must check every cell to determine if it's trapped or not. The recursion stack is O(n) in worst case because the DFS could theoretically traverse every cell in a snake-like pattern.

Common Mistakes

Starting DFS from ALL 'O's instead of just boundary 'O's (overmarks and wastes time)
Not using a marker to distinguish escaped 'O's from trapped 'O's (can't tell them apart after flood fill)
Forgetting to check both row=0 and row=rows-1, or both col=0 and col=cols-1
Using BFS but not tracking visited set (would re-process same cells infinitely)

Edge Cases

Empty board - handle gracefully
Single row or column - no interior cells can exist
All X's - nothing to flip
All O's on boundary with X in middle - only middle X's stay
Already captured regions - should flip correctly
Corner cases where trapped region touches multiple boundary 'O's - still works

Connections

Number of Islands (#200): Same DFS/BFS flood fill pattern, just different goal
Number of Closed Islands (#1254): This problem INVERSE - find islands NOT touching boundary, uses exact same boundary-trick logic
Walls and Gates (#286): Similar flood fill from multiple sources (gates), different problem context
Pacific Atlantic Water Flow (#417): Same boundary-source multi-origin BFS, different output

Walls and Gates #286

Multi-Source Breadth-First Search (BFS)

Intuition

Think of this as dropping pebbles (gates) into a pond simultaneously. Each ripple expands outward one unit at a time. When a ripple first touches an empty room, that's the shortest distance to the nearest gate. BFS naturally models this 'wavefront' expansion - we process all cells at distance d before any at distance d+1, guaranteeing we find the shortest path. Multi-source BFS is the key: starting from ALL gates at once means we don't have to try each gate separately - the waves collide at the optimal boundary.

Why This Pattern?

BFS guarantees shortest path in unweighted graphs (each move costs 1). Multi-source BFS is optimal here because: (1) we want distance to the NEAREST gate, not any gate (2) starting from all gates simultaneously avoids redundant searches (3) the wavefronts naturally meet at the optimal boundary between gate territories. This is fundamentally a shortest-path problem on an unweighted grid.

Solution

from collections import deque
from typing import List

def wallsAndGates(rooms: List[List[int]]) -> None:
    if not rooms or not rooms[0]:
        return
    
    rows, cols = len(rooms), len(rooms[0])
    INF = 2**31 - 1
    queue = deque()
    
    # Step 1: Find all gates and add to queue (these are our "sources")
    for r in range(rows):
        for c in range(cols):
            if rooms[r][c] == 0:
                queue.append((r, c))
    
    # Step 2: BFS expands wavefront from all gates simultaneously
    # directions: up, down, left, right
    directions = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    
    while queue:
        r, c = queue.popleft()
        
        for dr, dc in directions:
            nr, nc = r + dr, c + dc
            # Check bounds and if room is empty (unvisited)
            if 0 <= nr < rows and 0 <= nc < cols and rooms[nr][nc] == INF:
                # Distance is current cell's distance + 1
                rooms[nr][nc] = rooms[r][c] + 1
                # Add to queue to continue wavefront expansion
                queue.append((nr, nc))

Complexity

Time: O(m * n) where m = rows, n = cols
Space: O(m * n) in worst case

Every cell is visited at most once. We start with all gates in queue, then each empty room gets visited exactly when first reached by a wavefront. No cell is processed twice because we mark rooms as visited by setting them to a finite distance (they start as INF).

Common Mistakes

Starting BFS from only one gate instead of all gates - misses shortest paths to closer gates
Using DFS instead of BFS - DFS doesn't guarantee shortest path in unweighted graphs
Not marking visited cells properly - causes infinite loops or wrong distances
Confusing the distance calculation - rooms[r][c] already holds distance for visited cells, not just for gates

Edge Cases

Empty grid or single cell grid
Grid with no gates (all walls or all rooms) - queue starts empty, nothing changes
Grid with only walls and one gate
Multiple gates - BFS naturally handles finding nearest
Isolated rooms surrounded by walls - stay as INF (unreachable)
Large INF values that could overflow when adding 1

Connections

01 Matrix (LeetCode 542) - essentially the same problem but starting from 0s (ones) instead of gates. The multi-source BFS pattern is identical.
Surrounded Regions (LeetCode 130) - uses BFS/DFS but from boundary cells as sources
Number of Islands (LeetCode 200) - uses BFS/DFS to flood from source points, but counts islands rather than distances
Rotting Oranges (LeetCode 994) - multi-source BFS with time steps as distance, very similar structure

Word Ladder #127

BFS on an implicit graph with bidirectional wildcard indexing. This is fundamentally a shortest-path problem on an unweighted graph where nodes are words and edges exist between words differing by exactly one letter.

Intuition

Imagine each word as a node in a vast network, and you can jump between nodes if they're exactly one letter apart (like 'hit' → 'hot'). You're trying to find the shortest route from your starting word to your target word. This is exactly the 'six degrees of Kevin Bacon' for words. BFS is the natural choice because it explores all paths of length 1, then length 2, etc. — guaranteeing the first time you reach the target, you've found the shortest possible path. The key insight is that we don't need to pre-build the entire graph; instead, we generate neighbors on-the-fly by treating each letter position as a 'door' (wildcard), and every word that shares the same door pattern is a neighbor.

Why This Pattern?

BFS guarantees shortest path in unweighted graphs because it explores level-by-level, finding all nodes at distance d before any at distance d+1. The wildcard pattern optimization avoids O(n²) neighbor-finding by using a hash map: for each word position, replace that character with '*' to create a pattern key. All words with the same pattern are by definition one letter apart. This transforms neighbor discovery from comparing every word against every other word to a simple O(1) hash lookup.

Solution

from collections import defaultdict, deque

def ladderLength(beginWord: str, endWord: str, wordList: list) -> int:
    wordSet = set(wordList)
    if endWord not in wordSet:
        return 0
    
    # Build pattern map: each pattern maps to all words sharing that pattern
    # e.g., 'hot' produces '*ot', 'h*t', 'ho*'
    # All words under the same pattern are exactly one letter apart
    pattern_map = defaultdict(list)
    for word in wordSet:
        for i in range(len(word)):
            pattern = word[:i] + '*' + word[i+1:]
            pattern_map[pattern].append(word)
    
    # BFS: (current_word, transformation_count)
    # We count the word itself in the length, so beginWord = 1
    queue = deque([(beginWord, 1)])
    visited = {beginWord}
    
    while queue:
        word, length = queue.popleft()
        
        # Generate all possible patterns from current word
        for i in range(len(word)):
            pattern = word[:i] + '*' + word[i+1:]
            
            # All words matching this pattern are valid next moves
            for next_word in pattern_map[pattern]:
                if next_word == endWord:
                    return length + 1
                
                if next_word not in visited:
                    visited.add(next_word)
                    queue.append((next_word, length + 1))
            
            # Clear to prevent reprocessing same pattern (memory optimization)
            pattern_map[pattern] = []
    
    return 0

Complexity

Time: O(M * N) where M = word length, N = number of words in wordList. Each word generates M patterns, and each pattern lookup is O(1). In the worst case, we visit each word once and examine all M positions. Cannot be less because we potentially need to check every transformation possibility.
Space: O(M * N) for the pattern_map storing all word-pattern combinations, plus O(N) for visited set and queue. We need this space because theoretically any word could connect to any other word through the transformation rules.

We need to store all pattern-to-word mappings because we don't know ahead of time which pattern will connect our current word to its neighbors. The visited set is essential to prevent infinite loops in what could be a cyclic graph. The queue holds at most one copy of each word at any time, bounded by N.

Common Mistakes

Not checking if endWord exists in wordList early — returns 0 immediately if missing
Forgetting to mark beginWord as visited, causing infinite loops
Counting incorrectly — ladder length includes both beginWord and endWord
Using DFS instead of BFS — DFS finds A path, not THE shortest path
Not clearing pattern_map entries after use — causes redundant processing of already-visited patterns
Modifying the original wordList when it's a set vs list issue

Edge Cases

beginWord equals endWord (tricky: technically requires 1 step if endWord in list, but problem expects 0)
wordList is empty or doesn't contain endWord
All words are same length (required by problem, but good to validate)
Dictionary contains only beginWord and endWord with no intermediate words
Word appears multiple times in list (handled by set conversion)

Connections

Word Ladder II (#126) — same problem but asks for ALL shortest paths, not just the length. Uses the same BFS foundation but builds a full path tree. Similar to finding all shortest paths in unweighted graphs.
Min Genetic Mutation (#433) — identical pattern: find shortest path in an implicit graph where nodes differ by one character. Uses the same BFS + wildcard technique.
Open the Lock (#752) — essentially the same problem with digits instead of letters. BFS on states that change one 'digit' at a time.
Clone Graph (#133) — core BFS pattern for graph traversal, but here the graph is explicit rather than implicit.

Advanced Graphs (6)

Alien Dictionary #269

Topological Sort (Kahn's Algorithm)

Intuition

Imagine you're a linguist trying to reconstruct an alien alphabet from a dictionary. You have words sorted in unknown order, and you need to deduce character ordering. The key insight: compare adjacent words. The FIRST character where they differ reveals the ordering. If 'cat' comes before 'car', then 't' must come after 'r' in their alphabet—you've discovered a dependency. Think of this like reconstructing a family tree where each character has constraints on who must come before/after them. This is a classic dependency-resolution problem, solved by finding an ordering where all constraints are satisfied.

Why This Pattern?

The problem gives us pairwise ordering constraints between characters (edges in a directed graph). We need a linear ordering of all vertices (characters) such that every edge points 'forward'—this is exactly what topological sort computes. The first differing character between adjacent words creates a directed edge representing 'this character must come before that one'.

Solution

from collections import defaultdict, deque

def alienOrder(words):
    # Step 1: Build the graph and track all unique characters
    graph = defaultdict(set)
    all_chars = set()
    for word in words:
        all_chars.update(word)
    
    # Step 2: Add edges based on first differing character between adjacent words
    for i in range(len(words) - 1):
        w1, w2 = words[i], words[i + 1]
        # Find first difference
        min_len = min(len(w1), len(w2))
        for j in range(min_len):
            if w1[j] != w2[j]:
                # w1[j] comes before w2[j] in alien language
                if w2[j] not in graph[w1[j]]:
                    graph[w1[j]].add(w2[j])
                break
        else:
            # No difference found: check valid ordering (shorter first)
            if len(w1) > len(w2):
                return ""  # Invalid: prefix comes after longer word
    
    # Step 3: Calculate in-degrees
    in_degree = {char: 0 for char in all_chars}
    for char in graph:
        for neighbor in graph[char]:
            in_degree[neighbor] += 1
    
    # Step 4: Kahn's algorithm - start with characters having no incoming edges
    queue = deque([char for char in all_chars if in_degree[char] == 0])
    result = []
    
    while queue:
        char = queue.popleft()
        result.append(char)
        
        for neighbor in graph[char]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)
    
    # If we didn't process all characters, there's a cycle (invalid dict)
    return "".join(result) if len(result) == len(all_chars) else ""

Complexity

Time: O(C + N) where C is the number of unique characters and N is total characters across all words
Space:

We traverse each character at most twice: once when building the graph (comparing adjacent word pairs) and once during BFS processing. Each edge is also processed once when decrementing in-degrees. We can't do better because we must examine every character and every constraint to establish their relationships.

Common Mistakes

Forgetting to check for invalid prefix cases like ['apple', 'app'] where longer word comes first
Not initializing in_degree for characters that never appear as a 'destination' in any edge (they still need to be in the result)
Assuming all 26 letters appear—only use characters that actually exist in the input words
Adding duplicate edges (use a set for adjacency to avoid counting same edge twice)

Edge Cases

Empty input list (return empty string)
Single word (return its unique characters in any order)
Words with duplicate letters within themselves (like 'aa' - still valid)
All characters appear but no valid ordering exists (cycle)
One word is exact prefix of another (valid if shorter first, invalid otherwise)

Connections

Course Schedule (LeetCode 207) - same topological sort pattern, different domain
Course Schedule II (LeetCode 210) - returns actual order, same algorithm
Minimum Height Trees (LeetCode 310) - uses similar graph pruning approach

Cheapest Flights Within K Stops #787

Constrained Shortest Path with Bellman-Ford

Intuition

Think of this like finding the cheapest subway route where you're limited in how many transfers (stops) you can make. Each flight is a 'step' in the journey, and you can take at most K+1 steps total (K stops means K+1 flights). The key insight: Bellman-Ford naturally builds up shortest paths iteration by iteration. After the first iteration, you know the cheapest way to reach each city using exactly 1 flight. After the second iteration, you know the cheapest way using at most 2 flights. So after K+1 iterations, you know the cheapest way using at most K+1 flights - exactly what we need!

Why This Pattern?

Bellman-Ford is the natural choice because it iteratively improves path costs by considering one more edge at a time. Each iteration represents adding one more flight to our journey. By stopping after K+1 iterations, we exactly enforce the 'at most K stops' constraint. Dijkstra doesn't naturally handle this edge-count constraint because it greedily picks the cheapest path without tracking how many edges were used.

Solution

def findCheapestPrice(n, flights, src, dst, K):
    # prices[i] = cheapest price to reach city i using at most current iterations flights
    prices = [float('inf')] * n
    prices[src] = 0
    
    # Relax all edges K+1 times:
    # K stops = at most K+1 flights (the destination counts as a stop)
    for i in range(K + 1):
        # Copy prices to avoid using updated values within same iteration
        # (this ensures each iteration only adds exactly one more flight)
        tmp = prices[:]
        for s, d, p in flights:
            # Can't reach source city yet, skip
            if prices[s] == float('inf'):
                continue
            # If going through s to d is cheaper, update d's price
            if prices[s] + p < tmp[d]:
                tmp[d] = prices[s] + p
        prices = tmp
    
    return prices[dst] if prices[dst] != float('inf') else -1

Complexity

Time: O((K+1) * E) where E is number of flights
Space: O(N) for the price array

We iterate through all E edges K+1 times. This is necessary because we must consider paths of length 1, 2, ..., K+1. Each path length requires a full pass through all edges to compute (no early termination because a cheaper path with more stops might exist). We can't do better than this worst-case because we need to evaluate all possible paths up to length K+1.

Common Mistakes

Confusing K stops with K flights - K stops means K+1 flights (the destination counts as a stop)
Not copying the prices array - causes using updated values within the same iteration, leading to paths using more than K+1 flights
Using Dijkstra incorrectly - it doesn't track number of stops and will give wrong answers
Forgetting to check if source is reachable before updating

Edge Cases

No path exists between src and dst - return -1
Direct flight is cheapest - K can be 0 (0 stops = 1 flight)
Negative edge weights - Bellman-Ford handles these correctly unlike Dijkstra
Multiple flights to same city with different prices - algorithm handles this naturally

Connections

Network Delay Time (LeetCode 743) - uses same Bellman-Ford pattern but without K constraint
Minimum Cost to Reach Destination in Time (LeetCode 1364) - adds time constraint to shortest path
This is essentially Bellman-Ford with a fixed number of edge relaxations - same core algorithm as detecting negative cycles but we stop after K+1 iterations

Min Cost to Connect All Points #1584

Minimum Spanning Tree (MST) using Prim's Algorithm

Intuition

Think of this like building a railway network across cities. You need to connect all cities but want to minimize total track length. The key insight: at each step, the optimal move is to add the shortest edge that connects an unvisited city to your growing network. This greedy choice works because of the cut property - in any partition of nodes, the minimum-weight edge crossing that cut MUST be in the optimal spanning tree. Imagine splitting your points into 'already connected' and 'not yet connected' - the cheapest bridge between these two groups is always part of the optimal solution.

Why This Pattern?

We have a complete graph where every point can connect to every other point, and we need the minimum-weight subgraph that connects ALL nodes without cycles. This is exactly the definition of MST. Prim's is preferred over Kruskal here because the graph is dense (n² edges), so O(n²) or O(n² log n) beats Kruskal's O(n² log n) sorting step.

Solution

import heapq
from typing import List

class Solution:
    def minCostConnectPoints(self, points: List[List[int]]) -> int:
        """
        Prim's Algorithm with min-heap.
        Start from any point (0), then greedily add the closest unvisited point.
        """
        n = len(points)
        if n == 1:
            return 0
        
        # Track which points are already in our growing tree
        visited = [False] * n
        # Min-heap of (cost, point_index) - always grab cheapest edge
        min_heap = [(0, 0)]  # Start from point 0 with cost 0
        total_cost = 0
        edges_used = 0
        
        while min_heap and edges_used < n:
            # Get the minimum cost edge to an unvisited point
            cost, point = heapq.heappop(min_heap)
            
            # Skip if this point is already connected
            if visited[point]:
                continue
            
            # Add this edge to our tree
            visited[point] = True
            total_cost += cost
            edges_used += 1
            
            # Try connecting to all unvisited points
            for next_point in range(n):
                if not visited[next_point]:
                    # Manhattan distance: |x1-x2| + |y1-y2|
                    dist = abs(points[point][0] - points[next_point][0]) + \
                           abs(points[point][1] - points[next_point][1])
                    heapq.heappush(min_heap, (dist, next_point))
        
        return total_cost

Complexity

Time: O(n² log n)
Space: O(n)

We process each of the n points once (that's the n factor). For each point, we potentially push up to n edges onto the heap (that's the n factor for edges per point). Heap operations are O(log n), giving O(n² log n). We can't do better than considering all possible edges in the worst case because the graph is complete - every point could connect to every other point. The space is O(n) for the visited array and heap.

Common Mistakes

Not checking if a point is already visited before processing - can add same point multiple times via different paths
Starting with cost 0 for the first point but forgetting to handle single-point edge case
Using Euclidean distance instead of Manhattan distance (LeetCode 1584 uses Manhattan)
Forgetting to pop from heap after visiting a point, causing infinite loop with stale entries

Edge Cases

Single point: return 0 (no edges needed)
Two points: return the single distance between them
All points on a line: the MST is just sorting by position and summing adjacent distances
Points with same coordinates: distance is 0, algorithm handles this correctly

Connections

Similar to 'Number of Operations to Make Network Connected' - both find MST-like structures
Same core concept as Kruskal's in 'Min Cost to Connect All Points' variant using Kruskal
Relates to 'Find City With Smallest Number of Neighbors at Threshold Distance' - uses graph traversal on weighted graphs
The heap-based approach mirrors Dijkstra's (which you'll see in 'Network Delay Time') - both use greedy frontier expansion

Network Delay Time #743

Single-Source Shortest Path (SSSP) with Dijkstra's Algorithm

Intuition

Think of this like dropping a stone in a pond and watching the ripples spread outward. The signal propagates from source k like a wavefront, with each node getting 'infected' at the earliest possible time. The key insight: the network is fully informed when the LAST node receives the signal. We're looking for the longest shortest-path from k to any reachable node — like asking 'how long until the ripple reaches the farthest point?'

Why This Pattern?

The problem has non-negative edge weights and asks for shortest paths from ONE source to ALL nodes. Dijkstra's algorithm is the natural choice because it greedily expands from the currently closest unvisited node, guaranteeing we find the earliest arrival time at each node. This is fundamentally a 'earliest arrival' problem, which is exactly what Dijkstra solves.

Solution

from heapq import heappush, heappop

def networkDelayTime(times, n, k):
    # Build adjacency list: graph[u] = [(v, w), ...]
    graph = [[] for _ in range(n + 1)]
    for u, v, w in times:
        graph[u].append((v, w))
    
    # Distance array: earliest time to reach each node
    dist = [float('inf')] * (n + 1)
    dist[k] = 0
    
    # Min-heap stores (time_to_reach, node) - always process earliest time first
    pq = [(0, k)]
    
    # Track maximum distance to any reached node
    max_time = 0
    visited = 0
    
    while pq:
        d, node = heappop(pq)
        
        # Skip if we've already found a faster way to this node
        if d > dist[node]:
            continue
        
        visited += 1
        max_time = max(max_time, d)
        
        # Explore neighbors: can we reach them faster through current node?
        for neighbor, weight in graph[node]:
            new_dist = d + weight
            if new_dist < dist[neighbor]:
                dist[neighbor] = new_dist
                heappush(pq, (new_dist, neighbor))
    
    # If we couldn't reach all nodes, return -1
    return max_time if visited == n else -1

Complexity

Time: O((V + E) log V)
Space: O(V + E)

Common Mistakes

Not checking if a node has already been processed with a shorter distance before adding to heap (the 'if d > dist[node]: continue' check is critical)
Returning max(dist) instead of checking if all nodes were visited (disconnected nodes have 'inf' distance)
Forgetting that nodes are 1-indexed in the input but 0-indexed in array access

Edge Cases

Single node (n=1): answer is 0 since signal starts at k
All nodes reachable directly from k: max is just the maximum edge weight from k
Disconnected graph: some nodes unreachable, return -1
Multiple edges between same nodes: keep the minimum weight (our adjacency list handles this naturally)

Connections

Same core algorithm as 'Cheapest Flights Within K Stops' (LeetCode 787) but without the K-stop constraint — just pure Dijkstra
Related to 'Minimum Time to Visit All Points' (LeetCode #1267) — both ask for maximum of minimum distances
Conceptually similar to 'Find the City With the Smallest Number of Neighbors at a Threshold Distance' (LeetCode #1334) — both involve computing distances from one source

Reconstruct Itinerary #332

Hierholzer's Algorithm variant with greedy edge selection (Eulerian Path in directed graph)

Intuition

Think of this like planning a road trip where you MUST use every single road exactly once - you're finding a Eulerian path. The key insight: when you have multiple flight choices from an airport, always pick the lexicographically SMALLEST destination first. Why? Imagine you have two paths available - a short one to 'B' and a longer one to 'Z'. If you take 'Z' first, you might get stuck later because 'B' was your only way to reach the remaining tickets. By taking the smallest option early, you 'use up' your constraints while you still have maximum flexibility - you're essentially making sure you don't paint yourself into a corner. This is like a stack of cards - play your smallest cards early so they don't clutter your hand later.

Why This Pattern?

The problem guarantees a valid Eulerian circuit exists (all vertices have balanced in/out degrees except start/end, and the graph is connected). This means there exists a path that uses every edge exactly once. The greedy choice of smallest lexical destination works because: when multiple edges leave a node in a valid Eulerian path, taking the smallest one first never blocks the solution - if there was a valid path using a larger edge first, there must also be a valid path using the smaller edge first (by the structure of Eulerian graphs).

Solution

from collections import defaultdict
import heapq

def findItinerary(tickets):
    # Build graph: airport -> list of destinations (sorted, smallest first)
    graph = defaultdict(list)
    for src, dst in tickets:
        heapq.heappush(graph[src], dst)  # min-heap = automatic sorting
    
    # DFS from JFK, building itinerary in POST-ORDER
    # (we add airport AFTER exploring all its outgoing flights)
    itinerary = []
    
    def dfs(airport):
        # Keep exploring while there are destinations available
        while graph[airport]:
            # Always take the smallest lexical destination (greedy choice)
            next_dest = heapq.heappop(graph[airport])
            dfs(next_dest)
        # After exhausting all flights FROM this airport, add to itinerary
        # This is like the "return" in a function call stack
        itinerary.append(airport)
    
    dfs("JFK")
    
    # Reverse because we built it post-order (like reverse of DFS finish times)
    return itinerary[::-1]

Complexity

Time: O(E log V) where E = number of tickets (edges) and V = number of airports (vertices). Each ticket is processed once, and we use a heap which costs log V to pop the smallest destination.
Space: O(V + E) - we store the graph (E edges) and the recursion stack could go as deep as V (worst case: linear path through all airports).

We can't do better than O(E log V) because: (1) we MUST process every ticket exactly once to use all edges - that's E operations minimum, (2) at each airport we need to pick the smallest available destination, and with multiple flights the heap gives us O(log n) retrieval. The space is also tight - we literally need to remember all flights (E) and potentially the entire path (V).

Common Mistakes

Building itinerary in pre-order instead of post-order - results in reversed path
Not sorting destinations, leading to wrong lexical order
Forgetting that JFK is always the starting point
Not handling the case where there's only one ticket correctly

Edge Cases

Single ticket: returns ["JFK", "destination"]
All tickets form a simple chain A->B->C->D
Multiple tickets between same airports (parallel edges) - need graph that tracks each separately
Tickets that create a cycle that must be fully traversed before finishing

Connections

Word Search (79) - uses same backtracking/DFS concept with post-order result building
Pacific Atlantic Water Flow (417) - similar multi-source DFS traversal
Clone Graph (133) - classic DFS on graph structure
Course Schedule (207) - topological sort uses similar graph traversal

Swim in Rising Water #778

Minimax path with node costs using modified Dijkstra's algorithm. Instead of summing edge weights, we take the maximum of the path cost and current node's cost.

Intuition

Think of this like escaping a flooding terrain. You start at the highest point of your path and wait as water rises. The question is: how high must the water rise before you can swim from start to end? You want to find a path that stays as low as possible - you're minimizing the maximum elevation you need to traverse. This is like a hiker wanting to cross mountains while staying in valleys as much as possible. The water level acts like a threshold - any cell with height ≤ threshold is flooded and swimmable. You want the minimum threshold that connects start to end.

Why This Pattern?

Dijkstra's algorithm works because when we pop a node from the priority queue, we've found the optimal minimax cost to reach it. For each neighbor, the cost to reach it through current node is max(current_path_cost, neighbor_height). This correctly computes 'the minimum possible maximum height along any path to this cell'. The priority queue orders by this cost, guaranteeing we process cells in order of increasing minimum-required-water-level.

Solution

import heapq

def swimInWater(grid):
    n = len(grid)
    # Minimum time to reach each cell - initialized to infinity
    min_time = [[float('inf')] * n for _ in range(n)]
    min_time[0][0] = grid[0][0]  # Starting cell needs water to reach its height
    
    # Priority queue: (time_needed, row, col)
    # Heap orders by time_needed (minimum max-height path found so far)
    pq = [(grid[0][0], 0, 0)]
    
    # 4 directions: right, down, left, up
    directions = [(0, 1), (1, 0), (0, -1), (-1, 0)]
    
    while pq:
        current_time, row, col = heapq.heappop(pq)
        
        # Early exit - we found the shortest path to destination
        if row == n - 1 and col == n - 1:
            return current_time
        
        # Skip if we've already found a better path to this cell
        if current_time > min_time[row][col]:
            continue
            
        # Explore neighbors
        for dr, dc in directions:
            nr, nc = row + dr, col + dc
            
            # Check bounds
            if 0 <= nr < n and 0 <= nc < n:
                # Time to reach neighbor = max of current path's max height and neighbor's height
                # This represents: "to swim here, water must rise to at least this level"
                new_time = max(current_time, grid[nr][nc])
                
                # If this path is better, update and add to queue
                if new_time < min_time[nr][nc]:
                    min_time[nr][nc] = new_time
                    heapq.heappush(pq, (new_time, nr, nc))
    
    return min_time[n-1][n-1]

Complexity

Time: O(n² log(n²)) = O(n² log n)
Space: O(n²) for the visited/distance array and priority queue

We potentially visit every cell once (n² total). Each heap operation costs O(log(n²)) = O(log n). The work is proportional to the grid size - we can't possibly do better than looking at all cells because any cell could be part of the optimal path.

Common Mistakes

Initializing min_time[0][0] to 0 instead of grid[0][0] - you need water to rise to at least the starting cell's height
Forgetting to check if current_time > min_time[row][col] before processing - can cause revisiting cells unnecessarily
Using sum instead of max when computing new_time - it's a minimax problem, not additive
Not handling the early return correctly - must check destination after popping, not when pushing

Edge Cases

Single cell grid (n=1) - answer is grid[0][0]
All cells same height - answer is that height
Descending heights from start to end - answer is max of start and end heights
Start and end are same cell for n=1

Connections

Same core insight as 'Path With Minimum Effort' (LeetCode 1631) - both use modified Dijkstra for minimax path
Similar to 'Minimum Cost to Reach Destination in Time' - time-constrained path finding
Union-Friend approach also works - sort cells by height, progressively flood, stop when start and end connect
Can also solve with binary search + BFS/DFS - check if path exists for a given threshold

1-D Dynamic Programming (12)

Climbing Stairs #70

Fibonacci Sequence / Dynamic Programming with O(1) Space

Intuition

Imagine water flowing down a staircase. At each step, the 'flow' (number of ways to arrive) comes from two sources: the step immediately above (arrived by taking 1 step) and the step two above (arrived by taking 2 steps). The total flow at any step is the sum of these two incoming flows. It's like a conservation law - the number of distinct paths reaching step n equals the sum of paths reaching step n-1 and step n-2. This is why the answer follows a Fibonacci-like pattern: 1, 2, 3, 5, 8... Each number 'remembers' the history of all paths that could lead to it.

Why This Pattern?

The problem has optimal substructure - the answer for n depends directly on answers for n-1 and n-2. There's also overlapping subproblems (we'd recompute the same values multiple times with naive recursion). The recurrence relation dp[n] = dp[n-1] + dp[n-2] naturally emerges from asking: 'What was the last move I made to reach step n?' You either came from n-1 (1 step) or n-2 (2 steps), so total ways = ways(n-1) + ways(n-2).

Solution

class Solution:
    def climbStairs(self, n: int) -> int:
        # Base cases: 1 way to climb 1 stair, 2 ways to climb 2 stairs
        if n <= 2:
            return n
        
        # Use two variables (like Fibonacci) - we only need previous two values
        prev2 = 1  # ways to reach step 1
        prev1 = 2  # ways to reach step 2
        
        # Iterate from step 3 to n, building up the answer
        for i in range(3, n + 1):
            current = prev1 + prev2  # ways(i) = ways(i-1) + ways(i-2)
            prev2 = prev1  # shift window: move prev2 forward
            prev1 = current  # update prev1 to current value
        
        return prev1

Complexity

Time: O(n)
Space: O(1)

We iterate exactly n-2 times (for n > 2), performing O(1) work each time - this is optimal because we must compute values for each step from 2 to n. The O(1) space comes from only tracking two variables instead of the entire sequence - we don't need to remember values from 5 steps ago because each step only depends on its immediate two predecessors.

Common Mistakes

Using recursion without memoization - leads to O(2^n) time due to exponential branching
Using an array when only two variables are needed - wastes O(n) space
Off-by-one errors in the loop bounds
Not handling edge case n=0 (should return 1, one way: do nothing)

Edge Cases

n=1: Only 1 way (single step)
n=2: 2 ways (1+1 or 2)
n=0: 1 way (do nothing, edge case for recursion base)
Very large n: Works fine, but result grows exponentially

Connections

House Robber (LeetCode 198) - same dp[i] = max(dp[i-1], dp[i-2] + val) recurrence
Fibonacci Number (LeetCode 509) - mathematically identical problem
Min Cost Climbing Stairs (LeetCode 746) - similar DP but with costs to minimize

Coin Change #322

Dynamic Programming - Bottom-up Tabulation (Unbounded Knapsack variant for minimization)

Intuition

Think of this like climbing a ladder where each coin is a step size. You're at amount 0 and want to reach your target amount. Each coin denomination lets you take a 'jump' forward by that amount. You want the path with the fewest jumps. The key insight: to know the minimum jumps to reach amount 'i', you need to know the minimum jumps to reach 'i - coin' for every coin that fits. This is like a shortest-path problem in a graph where each amount is a node and each coin creates an edge from amount 'i' to amount 'i + coin'.

Why This Pattern?

The problem exhibits optimal substructure: the minimum coins for amount 'n' depends on the minimum coins for smaller amounts (n - coin). There are also overlapping subproblems - we compute the same smaller amounts repeatedly. The 'unbounded' part means we can use each coin unlimited times, just like filling a knapsack with unlimited items.

Solution

def coinChange(coins, amount):
    # dp[i] represents the minimum coins needed to make amount i
    # Initialize with infinity (impossible state), except dp[0] = 0
    dp = [float('inf')] * (amount + 1)
    dp[0] = 0
    
    # Fill the dp table bottom-up
    for current_amount in range(1, amount + 1):
        # Try using each coin to reach this amount
        for coin in coins:
            # Can we use this coin? (coin must not exceed current_amount)
            if coin <= current_amount:
                # If we use this coin, we need dp[current_amount - coin] coins
                # to reach the remainder, plus 1 for this coin
                # Take the minimum over all valid coins
                dp[current_amount] = min(dp[current_amount], dp[current_amount - coin] + 1)
    
    # If dp[amount] is still infinity, we couldn't make that amount
    return dp[amount] if dp[amount] != float('inf') else -1

Complexity

Time: O(amount * n) where n = number of coin denominations
Space: O(amount)

We iterate through each amount from 1 to target (amount times), and for each amount, we check all n coins. We can't do better because we must consider every coin at every amount to guarantee finding the true minimum - there's no way to 'skip' combinations without checking them.

Common Mistakes

Not handling amount = 0 as a special case (though our initialization handles this naturally)
Using a list instead of float('inf') for initialization - you need a value larger than any possible answer
Iterating in the wrong direction - must go forward since dp[current] depends on smaller values dp[current - coin]
Confusing this with 0/1 knapsack and trying to iterate backwards

Edge Cases

amount = 0 should return 0 (no coins needed)
coins with denominations greater than amount are simply ignored
Multiple coins of the same denomination - the algorithm handles this naturally
Large amount with only large coins may be impossible - return -1
coins = [1] always works, amount = any finite value

Connections

T
h
i
s
i
s
t
h
e
f
o
u
n
d
a
t
i
o
n
a
l
'
u
n
b
o
u
n
d
e
d
k
n
a
p
s
a
c
k
'
p
r
o
b
l
e
m
-
s
a
m
e
p
a
t
t
e
r
n
a
s
'
M
i
n
i
m
u
m
C
o
s
t
C
l
i
m
b
i
n
g
S
t
a
i
r
s
'
b
u
t
w
i
t
h
a
r
b
i
t
r
a
r
y
c
o
i
n
j
u
m
p
s
i
n
s
t
e
a
d
o
f
f
i
x
e
d
s
t
e
p
s
.
S
h
a
r
e
s
t
h
e
D
P
b
o
t
t
o
m
-
u
p
a
p
p
r
o
a
c
h
w
i
t
h
'
C
l
i
m
b
i
n
g
S
t
a
i
r
s
'
b
u
t
a
d
d
s
t
h
e
d
i
m
e
n
s
i
o
n
o
f
t
r
y
i
n
g
m
u
l
t
i
p
l
e
'
s
t
e
p
s
i
z
e
s
'
.
T
h
e
'
c
h
a
n
g
e
-
m
a
k
i
n
g
p
r
o
b
l
e
m
'
v
a
r
i
a
n
t
a
p
p
e
a
r
s
i
n
'
P
e
r
f
e
c
t
S
q
u
a
r
e
s
'
(
f
i
n
d
i
n
g
m
i
n
i
m
u
m
s
q
u
a
r
e
s
t
h
a
t
s
u
m
t
o
n
)
a
n
d
'
I
n
t
e
g
e
r
B
r
e
a
k
'
.

Decode Ways #91

1-D Dynamic Programming (Fibonacci-like)

Intuition

Think of this like a signal propagating through a chain. Each digit is a signal that can either stand alone (decode as a single letter) or combine with its neighbor (decode as a pair, if the pair forms 10-26). The number of valid decodings at position i depends on what happened at positions i-1 and i-2 — like a cascade or domino effect where each state absorbs valid contributions from previous states. It's analogous to counting paths in a graph where valid single-digit and double-digit decodings are edges pointing forward.

Why This Pattern?

The problem has optimal substructure: the number of ways to decode the first i characters depends on the number of ways to decode the first i-1 characters (if current digit is valid alone) and i-2 characters (if the current digit combines with previous to form a valid 10-26). The subproblems overlap naturally, making DP the natural choice. The recurrence f(i) = f(i-1) + f(i-2) mirrors climbing stairs, but with validity constraints.

Solution

def numDecodings(s: str) -> int:
    # Edge case: empty string or starts with '0' - no valid decoding
    if not s or s[0] == '0':
        return 0
    
    n = len(s)
    # dp[i] = number of ways to decode s[0:i+1]
    # Only need previous two states, so optimize to O(1) space
    prev2 = 1  # dp[0], base case: empty string has 1 way
    prev1 = 1  # dp[1], ways for first character (always 1 if not '0')
    
    for i in range(1, n):
        curr = 0
        
        # Case 1: Decode s[i] alone (if it's not '0')
        # '1'-'9' can stand alone
        if s[i] != '0':
            curr = prev1
        
        # Case 2: Decode s[i-1:i+1] as a pair (if valid 10-26)
        two_digit = int(s[i-1:i+1])
        if 10 <= two_digit <= 26:
            curr += prev2
        
        # If curr is 0, no valid decoding exists (e.g., '0', '00', '30')
        if curr == 0:
            return 0
        
        # Shift window forward
        prev2, prev1 = prev1, curr
    
    return prev1

Complexity

Time: O(n)
Space: O(1)

We iterate through the string once, doing O(1) work per character (checking single digit and forming the two-digit number). We can't do better than O(n) because we must inspect every digit to know the total number of decodings. Space is O(1) because we only track the two most recent DP states; the full DP table isn't needed.

Common Mistakes

Forgetting to reject strings starting with '0' — '0' alone has no letter mapping
Not checking if two-digit number is in range 10-26 before adding dp[i-2]
Allowing '0' to be decoded alone — '0' can only appear as part of a valid pair like '10' or '20'
Confusing the base case: dp[0]=1 means one way to decode empty string, not one way to decode 'a'

Edge Cases

Empty string returns 0 (or handle as 1 depending on interpretation)
Single character '0' returns 0, single char '1'-'9' returns 1
Strings with '00' are invalid — no way to decode
String ending in '0' like '1010' — the '10' at end is valid but preceding digits must work
All single digits: each contributes independently (n ways)

Connections

Climbing Stairs (#70): Same f(n)=f(n-1)+f(n-2) structure, but without validity constraints
House Robber (#198): Uses same two-variable optimization pattern for DP
Word Break (#139): Also validates substrings, but checks dictionary membership instead of numeric range 10-26
Decode Ways II (#639): Extension that adds '*' as wildcards — increases state complexity but same core pattern

House Robber II #213

1-D Dynamic Programming with circular boundary handling - breaking a circular constraint into two linear subproblems

Intuition

Imagine the houses as a circular necklace. The key insight is that in a circle, if you rob house 0, you CANNOT rob house n-1 (they're adjacent). But if you DON'T rob house 0, you CAN rob house n-1. This creates two mutually exclusive scenarios that cover all possibilities. Think of it like a decision at the boundary: either we 'commit' to robbing the first house (and are thus prohibited from robbing the last), or we 'skip' the first house (and are free to consider the last). The optimal solution must be one of these two paths through the circle. The problem reduces to solving the classic linear House Robber twice and taking the maximum.

Why This Pattern?

The circular adjacency creates a 'first-and-last mutually exclusive' constraint. When two options are mutually exclusive (can't pick both), a powerful technique is to consider each option separately and take the maximum. We take the linear DP solution and run it on two modified arrays: nums[0:n-1] (exclude last house, allowing us to take first) and nums[1:n] (exclude first house, allowing us to take last). The circle is 'broken' at a different point in each case, converting it to the standard linear problem.

Solution

class Solution:
    def rob(self, nums: List[int]) -> int:
        n = len(nums)
        if n == 0:
            return 0
        if n == 1:
            return nums[0]
        
        # Break the circle into two linear cases:
        # Case 1: Rob house 0 -> cannot rob house n-1, so consider nums[:-1]
        # Case 2: Don't rob house 0 -> can rob house n-1, so consider nums[1:]
        
        def rob_linear(houses):
            """Standard House Robber I solution for a linear street."""
            m = len(houses)
            if m == 0:
                return 0
            if m == 1:
                return houses[0]
            
            # dp[i] = max money robbing up to house i
            # Two variables suffice: prev1 = dp[i-1], prev2 = dp[i-2]
            prev2, prev1 = 0, 0
            for money in houses:
                # Either skip current (prev1) or rob it (prev2 + money)
                curr = max(prev1, prev2 + money)
                prev2 = prev1
                prev1 = curr
            return prev1
        
        # Take max of both scenarios
        return max(rob_linear(nums[:-1]), rob_linear(nums[1:]))

Complexity

Time: O(n)
Space: O(1) - we use only two variables regardless of input size, not O(n) for an array

Common Mistakes

Forgetting to handle n=1 case - with only one house in a circle, there's no adjacency conflict
Using O(n) space with a DP array when O(1) two-variable approach works
Confusing which houses to include/exclude in each scenario - the key is they're mutually exclusive
Not realizing that for n=2, you can only rob one house (they're adjacent in the circle)

Edge Cases

Empty array returns 0
Single element array returns that element
Two elements: they're adjacent in the circle, so answer is max of the two
All zeros: answer is 0
All same values: answer is sum of alternating houses

Connections

T
h
i
s
i
s
H
o
u
s
e
R
o
b
b
e
r
I
(
N
e
e
t
c
o
d
e
#
1
9
8
)
w
i
t
h
a
c
i
r
c
u
l
a
r
b
o
u
n
d
a
r
y
-
t
h
e
s
o
l
u
t
i
o
n
l
i
t
e
r
a
l
l
y
r
u
n
s
t
h
e
l
i
n
e
a
r
v
e
r
s
i
o
n
t
w
i
c
e
.
T
h
e
'
b
r
e
a
k
t
h
e
c
i
r
c
l
e
i
n
t
o
t
w
o
l
i
n
e
a
r
c
a
s
e
s
'
t
e
c
h
n
i
q
u
e
a
l
s
o
a
p
p
e
a
r
s
i
n
o
t
h
e
r
c
i
r
c
u
l
a
r
p
r
o
b
l
e
m
s
l
i
k
e
'
M
a
x
i
m
u
m
s
u
m
c
i
r
c
u
l
a
r
s
u
b
a
r
r
a
y
'
(
K
a
d
a
n
e
'
s
a
l
g
o
r
i
t
h
m
v
a
r
i
a
n
t
)
.
T
h
e
u
n
d
e
r
l
y
i
n
g
D
P
p
a
t
t
e
r
n
(
m
a
x
o
f
'
s
k
i
p
'
v
s
'
t
a
k
e
c
u
r
r
e
n
t
+
d
p
[
i
-
2
]
'
)
i
s
t
h
e
s
a
m
e
c
o
r
e
r
e
c
u
r
r
e
n
c
e
u
s
e
d
i
n
H
o
u
s
e
R
o
b
b
e
r
I
,
C
l
i
m
b
i
n
g
S
t
a
i
r
s
,
a
n
d
D
e
l
e
t
e
a
n
d
E
a
r
n
.

House Robber #198

1-D Dynamic Programming with Linear Scrolling State

Intuition

Imagine you're a mountain climber choosing which peaks to summit. You can only move to non-adjacent peaks (can't rob neighboring houses). At each peak, you face a choice: take it and add its height to your score, but then you must skip the next one; or skip it and move to the next with your current best. The key insight is this: your decision at house i depends ONLY on what was optimal at houses i-1 and i-2. It's like a gradient descent where you're always choosing the local maximum that leads to the global maximum. The problem has 'memory' - past decisions constrain future options in a specific, predictable way.

Why This Pattern?

The problem exhibits optimal substructure: the best answer for house i depends ONLY on the best answers for houses i-1 and i-2. This is the signature of DP problems. There's no need to consider earlier houses because any optimal path to house i must either include i-1 (in which case it can't include i) or exclude i-1 (in which case it's already the optimal path to i-1). The decision at each step only looks back 1 or 2 steps, making this a linear scrolling DP where we maintain only the last two states.

Solution

def rob(nums):
    if not nums:
        return 0
    if len(nums) == 1:
        return nums[0]
    
    # Edge case: two houses - just take the max
    if len(nums) == 2:
        return max(nums[0], nums[1])
    
    # We only need to track the previous two states
    # prev2 = max money robbing up to house i-2
    # prev1 = max money robbing up to house i-1
    prev2 = nums[0]
    prev1 = max(nums[0], nums[1])
    
    for i in range(2, len(nums)):
        # Two choices:
        # 1. Don't rob current house: take whatever was optimal at i-1 (prev1)
        # 2. Rob current house: add current house value to best we could do at i-2 (prev2)
        current = max(prev1, prev2 + nums[i])
        
        # Shift window forward
        prev2 = prev1
        prev1 = current
    
    return prev1

Complexity

Time: O(n)
Space: O(1)

We traverse each house exactly once, doing O(1) work per house. We can't do better than O(n) because we must examine every house to know if we should rob it - there's no shortcut since each decision depends on the specific values of neighboring houses. For space, we only store two variables (the last two optimal values), independent of input size. This is the minimum because we need to remember at least the last two decisions to compute the next one.

Common Mistakes

Using a list for DP when only two variables are needed - adds unnecessary O(n) space
Confusing the indices: dp[i] represents the OPTIMAL choice up to AND INCLUDING house i, not starting from i
Forgetting the base cases - houses[0] and houses[1] need special handling
Trying to use greedy (always take if current > previous) - doesn't work because sometimes taking a smaller current house enables taking a much larger next house
Not handling empty input or single house edge cases

Edge Cases

Empty array [] returns 0 (no houses to rob)
Single house [5] returns 5
Two houses [2, 7] returns 7 (pick the better one)
All zeros [0, 0, 0] returns 0
Alternating pattern [1, 3, 1, 3, 1, 3] - forces skipping every other house
Decreasing values [5, 4, 3, 2, 1] - once you pick early, you're stuck with suboptimal choices

Connections

Climbing Stairs (#70) - same recurrence structure but with addition instead of max, and counts paths instead of maximizing value
House Robber II (#213) - same pattern but houses are circular, requires solving two subproblems and taking the max
House Robber III (#337) - tree version where you can't rob parent and child, combines DP with DFS
Maximum Subarray (#53) - also uses linear DP with max() but over a different state space
Coin Change (#322) - different DP pattern: unbounded knapsack with minimization instead of this path-optimization pattern

Longest Increasing Subsequence #300

Patience Sorting with Binary Search (O(n log n)) - a greedy + binary search hybrid

Intuition

Imagine you're stacking coins in a row, but you can only place each new coin on top of a smaller coin - you want the tallest possible tower. You don't have to use every coin, but the ones you pick must each be larger than the one below it. The challenge: knowing only the heights so far, what's the tallest tower you can build? There's a beautiful insight here: instead of tracking exactly which coins we picked (which would be O(n²)), we can track something much simpler - for each possible tower height, what's the smallest 'top coin' we could possibly have? This is like keeping the 'lightest weight that could hold up' a tower of each size. If we see a coin heavier than all our tops, we build a taller tower. If it's lighter, we swap out one of our tops to be smaller - which actually HELPS future coins fit under it.

Why This Pattern?

The problem has optimal substructure: the longest increasing subsequence ending at position i depends on all previous positions. The key structural insight is that we only care about the SMALLEST possible tail value for each subsequence length. If we can achieve length L with a smaller tail, we leave more room for future elements. Binary search lets us efficiently find where to place each element in this 'tails' array.

Solution

def lengthOfLIS(nums):
    if not nums:
        return 0
    
    # tails[i] = smallest tail element for LIS of length i+1
    # This array is ALWAYS sorted - that's the magic property we exploit
    tails = []
    
    for num in nums:
        # Binary search: find leftmost position where tails[pos] >= num
        # This is like asking: "where does this coin fit in our sorted tops?"
        left, right = 0, len(tails)
        
        while left < right:
            mid = (left + right) // 2
            if tails[mid] < num:
                left = mid + 1
            else:
                right = mid
        
        # If we reached the end, this num extends the longest subsequence
        # Otherwise, we replace tails[left] with a smaller value (better!)
        if left == len(tails):
            tails.append(num)
        else:
            tails[left] = num
    
    return len(tails)

Complexity

Time: O(n log n)
Space: O(n) - the tails array can grow to size n

We do O(n) iterations, and each iteration performs a binary search on the tails array, which at worst has size O(n). So n × log n operations. We can't do better than O(n log n) because we need to examine each element at least once (the output depends on all n inputs), and binary search is optimal for sorted searches.

Common Mistakes

Using bisect_left incorrectly - forgetting that we need the LEFTMOST position where tails[pos] >= num (not just any position)
Confusing the binary search condition - using tails[mid] <= num instead of tails[mid] < num changes behavior
Thinking len(tails) is the answer while iterating - you must finish the loop first!
Trying to reconstruct the actual subsequence - this solution only gives length, not the sequence itself

Edge Cases

Empty array returns 0
Single element returns 1
Strictly decreasing sequence returns 1 (each element replaces tails[0])
All equal elements returns 1 (never extends, only replaces)
Already sorted array returns n (never replaces, only extends)

Connections

Russian Doll Envelopes - same patience sorting pattern but with 2 dimensions
Maximum Length of Pair Chain - similar greedy + binary search
Longest Bitonic Subsequence - combines LIS and LDS
Longest Divisible Subsequence - same pattern but with divisibility instead of numerical comparison
Box Stacking - 3D version of LIS variant

Longest Palindromic Substring #5

Two-pointer expansion from centers (also called 'center expansion').

Intuition

Think of a palindrome as a mirror image - the left side reflects the right side. For any position in a string, imagine you're standing at a center point and peeking outward in both directions. As long as the characters match, you're looking at a palindrome. This is like a standing wave that has a natural center and symmetric patterns extending from it. The beautiful thing is: every palindrome has a center (either a single character for odd-length, or the gap between two characters for even-length), and from any center, you can expand outward to find the longest palindrome that has that center. The problem becomes: try all possible centers, expand as far as possible from each, and keep track of the longest one found.

Why This Pattern?

Palindromes have a recursive symmetry: if you remove the outer characters of any palindrome, what remains is still a palindrome. This means the palindrome property is preserved when you shrink from the edges toward the center. By starting from a center and expanding outward, we naturally discover palindromes without redundant checking. There are exactly 2n-1 possible centers in a string of length n (n odd-length centers at each character, n-1 even-length centers at each gap), and expanding from each center is O(n) in the worst case, giving O(n²) total.

Solution

def longestPalindrome(s: str) -> str:
    if len(s) <= 1:
        return s
    
    def expand_from_center(left: int, right: int) -> str:
        # Expand outward while characters match (like ripples in a pond)
        while left >= 0 and right < len(s) and s[left] == s[right]:
            left -= 1
            right += 1
        # Return the palindrome we found (left+1 and right-1 are the bounds)
        return s[left + 1:right]
    
    longest = ""
    for i in range(len(s)):
        # Odd-length palindrome: center is a single character at position i
        odd_palindrome = expand_from_center(i, i)
        # Even-length palindrome: center is the gap between i and i+1
        even_palindrome = expand_from_center(i, i + 1)
        
        # Keep the longest palindrome found so far
        if len(odd_palindrome) > len(longest):
            longest = odd_palindrome
        if len(even_palindrome) > len(longest):
            longest = even_palindrome
    
    return longest

Complexity

Time: O(n²) in the worst case.
Space: O(1) - we only use a few pointers and store the current longest palindrome, no additional data structures proportional to input size.

There are 2n-1 centers to check (each character and each gap between characters). For each center, in the worst case (like a string of all 'a's), we expand all the way to the ends, which takes O(n). So total is O(n × n) = O(n²). We can't do better in the worst case because there can be Ω(n²) palindromic substrings in a string (consider 'aaaaa' - it has O(n²) different palindromes), so we need to examine enough of them to find the longest.

Common Mistakes

Forgetting to check even-length palindromes (centers between characters) - many only check odd-length
Not adjusting indices after the while loop expansion (the bounds overshoot by 1)
Using a DP table when it's unnecessary - the center expansion is simpler and uses O(1) space
Confusing the return statement: should be s[left+1:right] not s[left:right+1]

Edge Cases

Empty string or single character - should return the string itself
All same characters (e.g., 'aaaaa') - should return the whole string
No palindrome longer than 1 (e.g., 'abc') - return first character
Two-character palindrome (e.g., 'aa') - must handle even-length center correctly

Connections

This is the 1D DP foundation for palindrome problems - similar logic appears in 'Palindromic Substrings' (counting all palindromes) and 'Longest Palindromic Subsequence' (but with removal allowed, different DP formulation)
The center expansion technique also appears in 'Manacher's Algorithm' which achieves O(n) for this exact problem - this problem is the brute-force baseline
Shares the 'expand around center' insight with problems like 'Longest Mountain in Array' and problems involving symmetric properties

Maximum Product Subarray #152

Extended Kadane's Algorithm with dual state tracking

Intuition

Think of this like tracking a signal that can flip polarity. In Maximum Sum Subarray, we could just track the best positive sum because negatives only hurt us. But here, two negatives make a positive — so a number that seems terrible now (negative) might pair with another negative later to become huge. We need to track BOTH the best and worst possible products at each position, like a seesaw: when you multiply by a negative, the max becomes the min and vice versa. It's like maintaining both the highest peak and deepest valley in a landscape, because two valleys (negatives) can combine into a mountain.

Why This Pattern?

The problem structure demands tracking two extremes because multiplication can flip signs. At any position, the best product ending there depends on either: (1) starting fresh at the current element, (2) extending the previous max product, or (3) extending the previous min product (which becomes max when multiplied by a negative). Just tracking max_prod like in sum problems fails because we lose the min_prod that might become valuable when paired with a future negative.

Solution

class Solution:
    def maxProduct(self, nums: List[int]) -> int:
        # Global answer starts with first element
        result = nums[0]
        
        # Track max and min products ending at current position
        max_prod = nums[0]
        min_prod = nums[0]
        
        for i in range(1, len(nums)):
            # If current is negative, max and min swap roles conceptually
            # So we compute both possibilities: current*max_prod and current*min_prod
            # The new max is the best of: starting fresh, extending previous max, or extending previous min
            new_max = max(nums[i], nums[i] * max_prod, nums[i] * min_prod)
            new_min = min(nums[i], nums[i] * max_prod, nums[i] * min_prod)
            
            # Update both and record global best
            max_prod, min_prod = new_max, new_min
            result = max(result, max_prod)
        
        return result

Complexity

Time: O(n)
Space: O(1)

We make exactly one pass through the array. At each position, we do a constant amount of work (a few multiplications and comparisons). We can't do better than O(n) because we must examine each element to know if it contributes to the optimal subarray. The O(1) space comes from only tracking three variables regardless of input size.

Common Mistakes

Only tracking max_prod and forgetting min_prod — this misses the case where two negatives make a positive
Not considering that starting fresh at the current element (just nums[i]) might be better than extending
Forgetting to update the global result inside the loop, only updating at the end

Edge Cases

Single element array: returns that element
All negative numbers: need at least one element (e.g., [-2] returns -2, not 0)
Zeros in the array: act as reset points because product becomes 0
Array with one negative and one positive: takes the positive
Mixed zeros, positives, negatives: algorithm handles it correctly

Connections

Maximum Subarray (53): Same Kadane-like structure, but simpler since we only track max, not min
Maximum Product Subarray II: Adds the constraint of no zeros in similar fashion
Longest Arithmetic Subsequence: Also uses DP with multiple states, though different nature

Min Cost Climbing Stairs #746

1-D Dynamic Programming with optimal substructure

Intuition

Think of this like a ball rolling up a hill with energy costs at each position. At every stair, the ball has two choices: take one step or skip one. The minimum cost to reach any stair is the minimum of the costs to reach the two stairs below it, plus the cost of the current stair. It's like finding the path of least resistance up the hill - at each fork, you choose whichever route accumulated less total cost. The key insight: to stand on stair i, you must have come from either stair i-1 or stair i-2, so you take whichever was cheaper and add the cost of standing on i. You don't pay for your starting position - you can begin at stair 0 or 1 freely.

Why This Pattern?

The problem exhibits optimal substructure: the minimum cost to reach stair i depends only on the minimum costs to reach stairs i-1 and i-2. This is a 'choose your best previous state' pattern where each decision (take 1 step or 2 steps) leads to a new state.

Solution

def minCostClimbingStairs(cost):
    n = len(cost)
    # Base cases: can start at index 0 or 1 without paying yet
    # dp[i] = minimum cost to reach and stand on stair i
    
    # Option 1: use array (more readable)
    # dp = [0] * n
    # dp[0] = cost[0]
    # dp[1] = cost[1]
    # for i in range(2, n):
    #     dp[i] = min(dp[i-1], dp[i-2]) + cost[i]
    # return min(dp[n-1], dp[n-2])  # can end at last or second-to-last
    
    # Option 2: space-optimized (only need previous 2 values)
    prev2 = cost[0]  # cost to reach i-2
    prev1 = cost[1]  # cost to reach i-1
    
    for i in range(2, n):
        current = min(prev1, prev2) + cost[i]
        prev2 = prev1
        prev1 = current
    
    # Can finish at second-to-last or last stair (no cost beyond array)
    return min(prev1, prev2)

Complexity

Time: O(n)
Space: O(1)

Common Mistakes

Including cost of the final step beyond the array (there's no cost after the last stair)
Using wrong initial values - you DON'T pay cost[0] or cost[1] to start, but you DO pay them if you step ONTO them from below
Confusing the recurrence - it's min(cost to reach i-1, cost to reach i-2) + cost[i], not min(cost[i-1], cost[i-2])

Edge Cases

Two stairs: just take min(cost[0], cost[1])
Three stairs: check all paths
All increasing costs: still need to take the minimum path
Zero or one element array - handle as edge cases

Connections

Climbing Stairs (#70) - same movement pattern but counts ways not costs
House Robber (#198) - similar DP with 'take or skip' decisions, but you can't take adjacent houses
Coin Change - also uses min of subproblems but different structure (unbounded knapsack)
Nth Tribonacci Number - same recurrence pattern with three states instead of two

Palindromic Substrings #647

Expand Around Center - a fundamental palindrome enumeration technique

Intuition

Think of a palindrome like a balanced system - it has a center of symmetry, and characters mirror outward from that center like a ripple in a pond. When you drop a pebble (pick a center), the ripple expands outward as long as the symmetry holds. At each step outward, you check if the left and right 'forces' (characters) match - if they do, you've found another palindrome. If they don't match, the symmetry breaks and the ripple stops. This is why we expand around centers: every palindrome has exactly one center (for odd-length) or two adjacent centers (for even-length), and we can systematically find all of them by expanding outward from each possible center.

Why This Pattern?

Every palindrome has a well-defined center of symmetry. For odd-length palindromes like 'aba', the center is position 1 (the 'b'). For even-length palindromes like 'aa', the center is between positions 0 and 1. By treating each position as a potential odd center and each gap between positions as a potential even center, we can exhaustively enumerate all palindromes. The expansion naturally stops when symmetry breaks, making this O(n) per center in the worst case.

Solution

class Solution:
    def countSubstrings(self, s: str) -> int:
        """
        Count palindromic substrings by expanding around each possible center.
        
        For each position i, we expand twice:
        1. Odd-length: treat s[i] as the center (e.g., 'aba')
        2. Even-length: treat the gap between s[i] and s[i+1] as center (e.g., 'aa')
        
        Each successful expansion = one palindromic substring found.
        """
        n = len(s)
        count = 0
        
        def expand(left: int, right: int) -> None:
            """Expand outward from center while characters match."""
            nonlocal count
            while left >= 0 and right < n and s[left] == s[right]:
                count += 1      # Found a palindrome!
                left -= 1       # Expand left
                right += 1      # Expand right
        
        for i in range(n):
            # Odd-length: center is a single character at i
            expand(i, i)
            # Even-length: center is between i and i+1
            expand(i, i + 1)
        
        return count

Complexity

Time: O(n²) - In the worst case (like 'aaaaa...'), we expand O(n) times for each of the n centers. Each expansion is O(1), so total is O(n²). We can't do better because there can be O(n²) palindromic substrings in the worst case (every substring of a repeated character string is a palindrome), so we must at least enumerate all of them.
Space: O(1) - Only using a constant amount of extra space (the count variable and loop indices).

The space is constant because we don't store any substrings or use recursion - we just count as we go. The time is quadratic because we might check every possible expansion from every possible center. In 'aaaaa', there are n*(n+1)/2 = O(n²) palindromes to find, so we can't do better than O(n²) in the worst case.

Common Mistakes

Forgetting to count even-length palindromes (only checking odd centers)
Not handling the boundary conditions properly in the while loop (checking left >= 0 and right < n)
Off-by-one errors in the even-center expansion (should start at i, i+1 not i-1, i+1)
Confusing substring vs subsequence - substrings must be contiguous

Edge Cases

Empty string returns 0 (loop doesn't execute)
Single character returns 1
All same characters like 'aaa' has maximum palindromes: n*(n+1)/2
All different characters like 'abc' has exactly n palindromes (each single character)
Two character strings like 'aa' or 'ab' - must handle both even and odd correctly

Connections

Longest Palindromic Substring (#5) - uses the exact same expand-around-center technique, just tracks the longest instead of counting
Palindromic Substrings (#647) - this problem
Shortest Palindrome (#214) - uses expand-around-center to find where to insert characters
Manacher's Algorithm - O(n) solution for finding all palindromes, but overkill for counting

Partition Equal Subset Sum #416

Subset Sum / 0-1 Knapsack - Each element can be either IN a subset or NOT in it (two choices), and we want to hit an exact target sum.

Intuition

Think of this like a balance scale. If the total weight is odd, the scale can never balance—immediate fail. If it's even, we just need to find ONE subset that weighs exactly half. Here's the beautiful part: if we find a subset equaling half, the remaining elements automatically equal half (because total - half = half). This transforms the problem from "can I split this perfectly?" to the simpler question "can I find a subset that sums to X?" It's like asking: given coins, can I reach exactly half the total? Each number is a coin we can use once.

Why This Pattern?

The problem asks whether some subset equals exactly half the total. This is the canonical subset sum formulation: given a set of numbers and a target, can some combination reach that target? The '0-1' refers to using each element at most once (we're partitioning, not repeating).

Solution

class Solution:
    def canPartition(self, nums: List[int]) -> bool:
        total = sum(nums)
        # Odd total can never be split into equal integer sums
        if total % 2 != 0:
            return False
        
        target = total // 2
        # dp[i] = True if we can form sum 'i' using some subset
        dp = [False] * (target + 1)
        # Base case: we can always form sum 0 (empty subset)
        dp[0] = True
        
        for num in nums:
            # Iterate BACKWARDS! This is critical.
            # Going high-to-low ensures we don't reuse the same element
            # in one iteration (that would make it unbounded knapsack)
            for j in range(target, num - 1, -1):
                # Either we keep the old sum, OR we add current num to reach j
                dp[j] = dp[j] or dp[j - num]
        
        return dp[target]

Complexity

Time: O(n * sum/2) = O(n * sum) where n is array length and sum is total of all elements. We iterate through each element and for each, potentially update all sums from target down to that element.
Space: O(target) = O(sum/2). We need a boolean array representing every possible sum from 0 to half the total.

We can't do better than O(n*sum) because there are O(sum) possible subset sums and we potentially need to check each one. The sum could be as large as 200*10000 = 2,000,000 in worst case. This is pseudo-polynomial—we're exponential in the VALUE of the sum, not the COUNT of elements.

Common Mistakes

Iterating from low to high instead of high to low (reuses elements, gives wrong answer)
Not checking odd sum early (wastes computation)
Forgetting base case dp[0] = True (empty subset always works)
Trying to use greedy—it doesn't work because you might need a 'large' number plus many small ones

Edge Cases

Empty array: returns False (can't partition into two non-empty subsets)
Single element: returns False (can't split into two subsets with equal sum)
All zeros: returns True (can partition into two empty-like subsets)
Very large single element equals target: works correctly
Negative numbers: NOT in LeetCode version, but would break the logic

Connections

Coin Change (LeetCode 322) - Same structure but unbounded (can use elements repeatedly)
Number of Subset Sums - Same DP but counting instead of boolean
Target Sum (LeetCode 494) - Same structure but with +/- signs creating two groups
Minimum Subset Sum Difference - Related problem asking for minimum difference between two partitions

Word Break #139

Linear Scan with Reachability DP

Intuition

Think of this as a pathfinding problem through the string. You're standing at position 0 and want to reach position n (the end). Each valid word in the dictionary is like a 'jump' - if you're at position j and the substring s[j:i] is a word, you can jump to position i. The question becomes: can you traverse from start to end using only valid word-jumps? Alternatively, think of it like signal propagation: the signal starts at position 0. Each valid word acts as a wire that propagates the signal forward. If the signal can reach the end, the word break is valid. This is like asking whether a ripple can travel through a medium using only certain-sized waves.

Why This Pattern?

The string has a natural left-to-right ordering. For any position i, to determine if it's reachable from the start, we only need to check positions j < i. This creates a perfect ordering for DP: dp[i] = 'can we reach position i?' and we propagate reachability forward. It's essentially a graph reachability problem with a linear structure - we sweep left to right once.

Solution

def wordBreak(s: str, wordDict: List[str]) -> bool:
    word_set = set(wordDict)  # O(1) lookups instead of O(n) list scans
    n = len(s)
    dp = [False] * (n + 1)
    dp[0] = True  # Base case: empty string is always 'breakable'
    
    # Sweep through each position i (1 to n)
    for i in range(1, n + 1):
        # Check all possible previous positions j
        for j in range(i):
            # If we can reach j AND s[j:i] is a word, we can reach i
            if dp[j] and s[j:i] in word_set:
                dp[i] = True
                break  # Found a valid path to i, no need to check more j's
    
    return dp[n]

# Optimized version with max_word_len constraint:
def wordBreak_optimized(s: str, wordDict: List[str]) -> bool:
    word_set = set(wordDict)
    max_len = max(len(word) for word in wordDict)  # Prune search space
    n = len(s)
    dp = [False] * (n + 1)
    dp[0] = True
    
    for i in range(1, n + 1):
        # Only check j positions within max word length of i
        for j in range(max(0, i - max_len), i):
            if dp[j] and s[j:i] in word_set:
                dp[i] = True
                break
    
    return dp[n]

Complexity

Time: O(n² × m) where n is string length and m is average word length for substring slicing. With the optimization, becomes O(n × L) where L is the max word length in dictionary. The slicing itself is O(m) but Python optimizes this reasonably well. Essentially we're doing at most n × max_len substring checks.
Space: O(n + k) where n is the DP array size and k is the dictionary size for the set lookup table. The set dominates for large dictionaries.

We must check each position i against potentially all previous positions j - that's the n². Each check requires verifying if a substring exists in our dictionary. We can't do better than checking each position because any character could be the start of the final valid word - there's no way to 'skip' positions without checking them. The set lookup is O(1), so the main cost is the nested loop structure itself.

Common Mistakes

Forgetting dp[0] = True - the empty string is the base case that anchors the DP
Using list for wordDict instead of set - O(n) lookups destroy performance
Not breaking out of inner loop once dp[i] is True - wastes cycles checking unnecessary j values
Confusing dp[i] meaning - it represents 'can we reach position i from start', not 'can we break s[0:i]'

Edge Cases

Empty string s = '' returns True (empty is breakable)
All single character words but target is longer - must find valid chain
Dictionary contains the entire string as one word - should return True
Dictionary with no valid starting words - returns False immediately
Overlapping words in dictionary - DP handles this naturally, we just need any valid path

Connections

Similar to 'Climbing Stairs' (#70) - both use reachability DP with linear propagation, but word break has variable step sizes based on word lengths
Related to 'Longest Palindromic Substring' (#5) in how we slice substrings
Uses same DP foundation as 'Decode Ways' (#91) - both determine valid segmentation based on dictionary/coding rules
The optimized version with max_len pruning is like using BFS with a queue size limit - same idea as 'Number of Islands' (#200) flood fill optimization

2-D Dynamic Programming (11)

Best Time to Buy and Sell Stock with Cooldown #309

State Machine DP with three states

Intuition

Think of this as a state machine with three positions: (1) HOLDING a stock, (2) NOT HOLDING and CAN BUY, (3) COOLDOWN (just sold yesterday, can't buy today). The cooldown creates a 'rest period' in the cycle—like a pendulum that must swing back through a resting point before going forward again. The key insight: each day's optimal state depends only on yesterday's states. If you sell today, you enter cooldown tomorrow (can't buy). If you're in cooldown, you must wait, then you can buy. This creates a natural three-state flow that captures all constraints. It's like tracking a particle moving through positions with specific allowed transitions.

Why This Pattern?

The cooldown constraint explicitly creates three distinct states the system can be in at any point. This isn't optional—we NEED three states because knowing only 'holding vs not holding' isn't enough; we need to know whether we're in cooldown (can't act) or can buy. The problem's constraint directly maps to state transitions, making this the natural pattern.

Solution

def maxProfit(prices):
    if not prices:
        return 0
    
    n = len(prices)
    # Three states: hold (holding a stock), profit (not holding, can buy), cooldown (just sold)
    # Initialize: on day 0, either we don't buy (profit=0) or we buy (hold=-prices[0])
    hold = -prices[0]      # max profit if holding after day 0
    profit = 0             # max profit if not holding (can buy) after day 0
    cooldown = 0           # max profit if in cooldown after day 0 (not possible initially)
    
    for i in range(1, n):
        # Calculate new states based on previous states
        # NEW hold: max of (kept holding, bought today from profit state)
        new_hold = max(hold, profit - prices[i])
        # NEW profit: max of (was already not holding, recovering from cooldown)
        new_profit = max(profit, cooldown)
        # NEW cooldown: must have sold today (was holding, now sell at today's price)
        new_cooldown = hold + prices[i]
        
        # Update states
        hold, profit, cooldown = new_hold, new_profit, new_cooldown
    
    # At the end, we can't end in cooldown (no future benefit), so return max of hold and profit
    return max(profit, cooldown)

Complexity

Time: O(n)
Space: O(1)

We iterate through prices once, doing O(1) work each day. We can't do better because we must examine each day's price to know whether to buy/sell—we can't skip any day since each affects our available actions (cooldown forces a day off). The space is O(1) because we only track 3 variables regardless of input size—previous states are overwritten each iteration.

Common Mistakes

Forgetting to initialize hold to -prices[0] — you can buy on day 0
Ending in cooldown state — you can't end in cooldown because you can't act on it further (need to take max with profit)
Confusing profit and cooldown states — profit means 'can buy today', cooldown means 'cannot buy today'
Using wrong transition: cooldown should come from hold + prices[i], not from profit

Edge Cases

Single day: return 0 (can't sell without buying first)
Prices decreasing: return 0 (never buy)
Two days: if prices[1] > prices[0], buy day 0 sell day 1, return profit
All same price: return 0 (no profitable transactions)

Connections

This is like 'Best Time to Buy and Sell Stock II' (#122) but with a cooldown constraint — you add a third state to track the restriction
Similar to 'Best Time to Buy and Sell Stock with Transaction Fee' (#714) where the constraint is a fee instead of a cooldown
The three-state pattern appears in other DP problems like 'House Robber II' (circular houses) where you need to track additional state information

Burst Balloons #312

Interval Dynamic Programming (also called 'Divide and Conquer DP')

Intuition

Think of this backwards: instead of asking which balloon to burst FIRST, ask which one was burst LAST. When the final balloon bursts, its only neighbors are the boundaries (value 1 on both sides). So if balloon i is the LAST to burst in range (l, r), then when it finally bursts, the left and right subproblems are already solved - balloon i sees nums[l] and nums[r] as its neighbors. This is like finding an 'energy minimum' - we're looking for the balloon whose burst creates the most 'energy' (coins) given its local environment after all other balloons are gone. The problem has no greedy choice property going forward (bursting a high-value balloon early hurts your score), but going backwards, the last burst is unambiguous - it always sees boundaries.

Why This Pattern?

The problem has optimal substructure when we divide at the point of the LAST burst. If we fix which balloon bursts LAST in an interval, the left and right sides become independent subproblems - balloons on the left don't affect coins earned from balloons on the right. This is like matrix chain multiplication or optimal BST - we try all partition points. The 'interval' dimension tracks the left and right boundaries we're considering, and for each interval we try each position as the last burst.

Solution

class Solution:
    def maxCoins(self, nums: List[int]) -> int:
        # Add boundary balloons with value 1
        # These represent the walls that never burst
        val = [1] + [num for num in nums if num > 0] + [1]
        n = len(val)
        
        # dp[i][j] = max coins from bursting ALL balloons in interval (i, j)
        # exclusive bounds - we never burst the boundary balloons
        dp = [[0] * n for _ in range(n)]
        
        # Fill dp table - consider intervals of increasing length
        # length = distance between boundaries
        for length in range(2, n):
            for left in range(0, n - length):
                right = left + length
                # Try each balloon in (left, right) as the LAST to burst
                # When it bursts, its neighbors are val[left] and val[right]
                # because all balloons between them are already burst
                for k in range(left + 1, right):
                    # Coins from bursting k last: neighbors * k's value
                    # + coins from left subproblem (left, k)
                    # + coins from right subproblem (k, right)
                    coins = val[left] * val[k] * val[right]
                    coins += dp[left][k] + dp[k][right]
                    dp[left][right] = max(dp[left][right], coins)
        
        # Full interval (0, n-1) with boundaries excluded
        return dp[0][n - 1]

Complexity

Time: O(n^3)
Space: O(n^2)

We have O(n^2) possible intervals (left, right pairs), and for each we try O(n) possible 'last burst' positions. We can't do better because the problem requires considering all possible partition points - trying different balloons as the dividing point is fundamental to finding the optimal structure. The 3D nature (two boundaries + partition point) is inherent to the problem's optimal substructure.

Common Mistakes

Forgetting to add boundary balloons of value 1 - the problem explicitly says nums[-1] = nums[n] = 1, but without adding them explicitly, the indexing gets messy
Using inclusive bounds instead of exclusive - balloon at position k is the last to burst in interval (left, right), meaning all balloons strictly between left and right get burst first
Not handling zeros - balloons with value 0 don't contribute but can be adjacent to high-value balloons, so they shouldn't be filtered out carelessly (though filtering them works since 0 * anything = 0)

Edge Cases

Empty input - return 0
Single balloon - you can only burst it once, and it sees boundaries, so coins = 1 * nums[0] * 1 = nums[0]
All zeros - max coins is 0 since any product involves 0
Two balloons - each bursts with boundaries, order doesn't matter since they're independent

Connections

Optimal BST (different cost function but same interval DP structure)
Minimum Score Triangulation of Polygon - same interval DP pattern
Matrix Chain Multiplication - partition-based DP with O(n^3) complexity
Stone Game IV / Burst Balloons variant problems - understanding that 'last element' perspective often simplifies competitive problems

Coin Change II #518

2D Dynamic Programming with outer loop over coins, inner loop over amounts

Intuition

Think of this as counting the number of ways to distribute a 'flow' of amount units across different coin types. Imagine coins as different colored balls and you want to know how many valid color distributions sum to the target amount. The key insight: if you process coins in order (coin 1, then coin 2, then coin 3...), you naturally avoid double-counting because once you've moved past coin 1, you never revisit it - you're only adding more coin types to existing combinations. It's like building a staircase where each step down represents adding a new coin type, and each step right represents using more of the current coin type. Every path from top-left to bottom-right is a valid combination.

Why This Pattern?

The order of iteration is crucial. By putting coins on the outer loop and amounts on the inner loop (processing amounts in ascending order), we ensure that when computing dp[i][j], the value dp[i][j-coin] already includes combinations using the current coin - allowing us to use the same coin multiple times. If we reversed the order, we'd be counting permutations (different orderings) instead of combinations.

Solution

def change(amount: int, coins: list[int]) -> int:
    # dp[i][j] = ways to make amount j using first i coins
    dp = [[0] * (amount + 1) for _ in range(len(coins) + 1)]
    
    # Base case: one way to make amount 0 (use no coins)
    dp[0][0] = 1
    
    for i in range(1, len(coins) + 1):
        coin = coins[i - 1]  # Current coin denomination
        for j in range(amount + 1):
            # Option 1: Don't use current coin - ways = dp[i-1][j]
            dp[i][j] = dp[i - 1][j]
            
            # Option 2: Use current coin (at least once)
            # dp[i][j-coin] gives ways to make remaining amount using
            # current coin (and previous coins) - allows unlimited use
            if j >= coin:
                dp[i][j] += dp[i][j - coin]
    
    return dp[len(coins)][amount]

# Space-optimized version
def change_optimized(amount: int, coins: list[int]) -> int:
    dp = [0] * (amount + 1)
    dp[0] = 1  # One way to make amount 0: use nothing
    
    for coin in coins:
        for j in range(coin, amount + 1):
            # Add ways to make (j - coin) to current dp[j]
            # This counts all combinations using current coin
            dp[j] += dp[j - coin]
    
    return dp[amount]

Complexity

Time: O(len(coins) * amount)
Space: O(amount) for optimized version, O(len(coins) * amount) for 2D version

We must consider each coin (n coins) for each possible amount from 0 to target (amount+1 values). We can't do better because the answer could theoretically be different for every amount - there's no formula that skips computation. Each state requires constant time to compute from its predecessors.

Common Mistakes

Putting amounts in outer loop - this counts permutations (different orders) instead of combinations
Forgetting the base case dp[0][0] = 1
Not checking if j >= coin before accessing dp[i][j-coin]
Using integer overflow (not an issue in Python, but careful in other languages)

Edge Cases

amount = 0: should return 1 (one way: use no coins)
coins = []: should return 0 for amount > 0, 1 for amount = 0
Large amount with small coins: many combinations exist
Coin value larger than amount: that coin can never be used

Connections

Knapsack problem - same unbounded knapsack structure where items have weight=coin value, value=1, and we count ways instead of max value
Unique Paths - similar 2D grid DP structure with two movement options
Coin Change (minimum coins) - same coin/amount DP but with min() instead of sum for finding fewest coins

Distinct Subsequences #115

2D Dynamic Programming (DP)

Intuition

Think of this as counting the number of distinct paths through s that can spell out t. At each character in s, you have a choice: either match it with the current character in t (if they match), or skip it. We're essentially counting how many different ways we can 'use up' characters from s to build t. If s[i-1] == t[j-1], we have two choices - use this character as part of our subsequence or skip it. If they don't match, we must skip it.

Why This Pattern?

The problem exhibits optimal substructure: the number of ways to form t[0:j] from s[0:i] depends on smaller prefixes. When s[i-1] == t[j-1], we can either match them (contributing dp[i-1][j-1] ways) or skip s[i-1] (contributing dp[i-1][j] ways). This creates a natural recurrence relation that fills a 2D table. The decision to match or skip at each position creates the branching that DP naturally captures.

Solution

def numDistinct(s: str, t: str) -> int:
    m, n = len(s), len(t)
    
    # dp[i][j] = number of ways to form t[0:j] from s[0:i]
    # Using m+1 x n+1 to include empty string base cases
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    # Base case: empty t can be formed one way (delete everything from s)
    # dp[0][0] = 1 represents: empty s forms empty t in one way
    for i in range(m + 1):
        dp[i][0] = 1
    
    # Fill the DP table
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i-1] == t[j-1]:
                # Two choices: use s[i-1] as match (dp[i-1][j-1])
                # OR skip s[i-1] (dp[i-1][j])
                dp[i][j] = dp[i-1][j-1] + dp[i-1][j]
            else:
                # Characters don't match, must skip s[i-1]
                dp[i][j] = dp[i-1][j]
    
    return dp[m][n]

Complexity

Time: O(m * n)
Space: O(m * n)

We iterate through all m*n combinations of prefixes of s and t. At each cell we do O(1) work. This is necessary because we need to remember the count for every prefix combination - the answer for longer strings depends on all shorter combinations. Space can be reduced to O(n) by noticing we only need the previous row.

Common Mistakes

Confusing s and t - s is the source, t is the target we're building
Off-by-one errors with indices (using i instead of i-1 for character access)
Forgetting the base case dp[i][0] = 1 (empty t can always be formed by deleting everything)
Not handling the case where t is longer than s (answer is 0 automatically)

Edge Cases

Empty s or t strings - s='' with any t should return 0, t='' should return 1
When s and t are equal - there's exactly 1 way (use all characters)
Repeated characters in s - e.g., s='rabbbit', t='rabbit' has 3 ways
Very long strings requiring large integer counts (Python handles big ints natively)

Connections

Edit Distance (LeetCode 72) - similar 2D DP structure but with different transition rules
Longest Common Subsequence (LeetCode 1143) - shares the diagonal matching pattern
Wildcard Matching (LeetCode 44) and Regular Expression Matching (LeetCode 10) - similar string matching DP patterns
Count Viable Combinations in Coin Change problems - same 'count ways' DP approach

Edit Distance #72

2D Dynamic Programming on Two Sequences

Intuition

Think of this as transforming one system (word1) into another (word2). Each character is a component, and you're allowed three 'moves': insert a new component, delete an existing one, or replace one with another. The key insight: at each position, if characters match, you're in equilibrium - just carry the previous state forward. If they don't match, you're at a 'energy barrier' and must pay a cost of 1 to either replace (overcome the difference), delete from word1 (skip the mismatch), or insert into word1 (add what's needed). This is like finding the minimum-cost path through a grid where each step represents an edit operation.

Why This Pattern?

This problem has optimal substructure - the minimum operations to convert prefixes word1[0:i] and word2[0:j] depends only on smaller prefixes. It also has overlapping subproblems - without memoization, we'd recompute the same subproblems repeatedly. The 2D grid naturally represents the 'state space' of all possible prefix transformations.

Solution

def minDistance(word1, word2):
    m, n = len(word1), len(word2)
    
    # dp[i][j] = min operations to convert word1[0:i] to word2[0:j]
    # i and j are lengths, so dp[0][*] and dp[*][0] are base cases
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    # Base case: convert empty string to word2[0:j] = j insertions
    for j in range(1, n + 1):
        dp[0][j] = j
    
    # Base case: convert word1[0:i] to empty string = i deletions
    for i in range(1, m + 1):
        dp[i][0] = i
    
    # Fill the table: for each prefix pair
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if word1[i-1] == word2[j-1]:
                # Characters match - no operation needed, carry forward
                dp[i][j] = dp[i-1][j-1]
            else:
                # Three choices, pick minimum cost:
                # 1. Replace current character
                # 2. Delete from word1 (move i-1, stay at j)
                # 3. Insert into word1 (stay at i, move j-1)
                dp[i][j] = 1 + min(
                    dp[i-1][j-1],  # replace
                    dp[i-1][j],    # delete from word1
                    dp[i][j-1]     # insert into word1
                )
    
    return dp[m][n]

Complexity

Time: O(m * n)
Space: O(m * n)

We compute one DP cell for each pair of prefixes (m+1)*(n+1) total. Each cell takes O(1) to compute. We can't do better because every prefix of word1 potentially relates to every prefix of word2 - the edit distance fundamentally requires comparing all character combinations.

Common Mistakes

Confusing indices - word1[i-1] corresponds to dp[i][*] since dp uses length indices
Forgetting base cases - empty string transformations
Not handling the case when characters match (should just copy diagonal value, not add 1)
Using wrong recurrence - delete/insert directions matter

Edge Cases

Empty strings: minDistance('', 'abc') = 3 (3 insertions), minDistance('abc', '') = 3 (3 deletions)
One string empty: works via base cases
Completely different strings: all characters need replacement or insert+delete
Identical strings: answer is 0, no operations needed

Connections

Longest Common Subsequence (LCS) - this problem IS the LCS complement; if you know LCS length, max operations = m + n - 2*LCS for some variants
Wildcard Matching - uses similar DP on sequences
Delete Operation for Two Strings - same problem but only allows delete (no insert/replace), which simplifies to just |len1-len2|

Interleaving String #97

2-D Dynamic Programming (grid path-finding)

Intuition

Think of this like two rivers merging. s1 and s2 are tributaries that must merge to form s3, maintaining their internal order but mixing characters. Imagine a 2D grid where moving right means taking the next character from s1, and moving down means taking from s2. We start at the source (0,0) and need to reach the destination (len(s1), len(s2)) by following a path that exactly spells out s3. At each cell, we can only come from the left or from above - this is like a water flow finding its way to a drain, with s3 dictating which paths are valid.

Why This Pattern?

The problem has optimal substructure - whether we can reach cell (i,j) depends only on whether we could reach (i-1,j) or (i,j-1) and whether the current character matches. It's a classic 'unique paths' style problem where we're not counting paths but checking if ANY valid path exists. The 2D structure naturally emerges from the two input strings forming the axes of our decision space.

Solution

class Solution:
    def isInterleave(self, s1: str, s2: str, s3: str) -> bool:
        m, n = len(s1), len(s2)
        
        # Quick check: lengths must add up
        if m + n != len(s3):
            return False
        
        # dp[i][j] = True if s3[0:i+j] can be formed by interleaving s1[0:i] and s2[0:j]
        dp = [[False] * (n + 1) for _ in range(m + 1)]
        
        # Base case: empty strings
        dp[0][0] = True
        
        # Fill first column (using only s1)
        for i in range(1, m + 1):
            dp[i][0] = dp[i-1][0] and s1[i-1] == s3[i-1]
        
        # Fill first row (using only s2)
        for j in range(1, n + 1):
            dp[0][j] = dp[0][j-1] and s2[j-1] == s3[j-1]
        
        # Fill the rest of the grid
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                # Current position in s3 we're trying to match
                k = i + j - 1
                
                # Either came from left (using s1) OR from above (using s2)
                from_s1 = dp[i-1][j] and s1[i-1] == s3[k]
                from_s2 = dp[i][j-1] and s2[j-1] == s3[k]
                
                dp[i][j] = from_s1 or from_s2
        
        return dp[m][n]

Complexity

Time: O(m * n) - we visit each cell in the (m+1) x (n+1) grid exactly once
Space: O(m * n) for the full DP table, though this can be reduced to O(n) using only one row at a time

We must consider every possible prefix combination - there are m+1 possible positions in s1 and n+1 in s2, so (m+1)(n+1) states. Each state requires constant time to compute. We can't do better because the answer could theoretically depend on any split point - imagine checking if s1[:i] + s2[:j] forms s3[:i+j] for all possible i,j.

Common Mistakes

Forgetting the length check upfront - if len(s1) + len(s2) != len(s3), it's impossible
Off-by-one errors in indexing - remember s1[i-1] corresponds to position i in our 1-indexed DP view
Not handling the base cases correctly - the first row and column need special handling
Confusing the direction: moving right in grid = using s1, moving down = using s2 (it's easy to swap these)

Edge Cases

Empty strings: s1='' or s2='' should work correctly
One character strings
s3 equals s1 exactly (s2 is empty)
All characters are the same (e.g., 'aaa' + 'aaa' = 'aaaaaa') - this tests that we're checking order, not just character matching

Connections

S
a
m
e
c
o
r
e
p
a
t
t
e
r
n
a
s
'
U
n
i
q
u
e
P
a
t
h
s
'
(
L
e
e
t
C
o
d
e
6
2
)
a
n
d
'
U
n
i
q
u
e
P
a
t
h
s
I
I
'
(
6
3
)
-
f
i
n
d
i
n
g
a
p
a
t
h
t
h
r
o
u
g
h
a
g
r
i
d
.
A
l
s
o
r
e
l
a
t
e
d
t
o
'
E
d
i
t
D
i
s
t
a
n
c
e
'
(
7
2
)
i
n
h
o
w
w
e
b
u
i
l
d
t
h
e
D
P
t
a
b
l
e
.
T
h
i
s
i
s
f
u
n
d
a
m
e
n
t
a
l
l
y
a
2
-
s
t
r
i
n
g
v
e
r
s
i
o
n
o
f
t
h
e
'
w
o
r
d
b
r
e
a
k
'
p
r
o
b
l
e
m
w
h
e
r
e
w
e
h
a
v
e
t
w
o
s
o
u
r
c
e
s
i
n
s
t
e
a
d
o
f
a
d
i
c
t
i
o
n
a
r
y
.

Longest Common Subsequence #1143

2D Dynamic Programming (Edit Distance family)

Intuition

Imagine two rivers (the two strings) flowing side by side. A subsequence is like a path that follows the river but can skip around. The LCS is the longest path that exists in BOTH rivers at the same relative positions. At each decision point (matching characters vs. not), it's like choosing which river to 'sacrifice' a character from when they don't align. The key insight: if characters match, we gain 1 and move diagonally inward. If they don't match, we must abandon one character from either string and take the best path forward. Think of it as a choose-your-own-adventure where at every fork, we pick the branch that leads to the longest shared future.

Why This Pattern?

This problem has optimal substructure: the best answer at position (i,j) depends on the best answers at smaller positions. If characters match, LCS(i,j) = 1 + LCS(i-1,j-1). If they don't match, LCS(i,j) = max(LCS(i-1,j), LCS(i,j-1)). This creates a natural recursion that fills a 2D table where each cell represents the LCS for prefixes of both strings.

Solution

def longestCommonSubsequence(text1: str, text2: str) -> int:
    m, n = len(text1), len(text2)
    # dp[i][j] = LCS length for text1[0:i] and text2[0:j]
    # Using (m+1) x (n+1) to handle empty prefix cases
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if text1[i - 1] == text2[j - 1]:
                # Characters match - extend the common subsequence
                # by 1 from the diagonal (previous prefixes)
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                # Characters don't match - take the best path:
                # either skip current char from text1 OR from text2
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    
    return dp[m][n]

Complexity

Time: O(m * n)
Space: O(m * n)

We must visit every cell in the m×n table once because each cell depends on the cell above and to the left. We can't skip any position - to know the LCS for prefixes of length i and j, we need to have computed all smaller prefix combinations. There's no shortcut because the problem doesn't have monotonic properties we could exploit.

Common Mistakes

Confusing LCS with Longest Common Substring (contiguous) - LCS allows gaps, substring doesn't
Off-by-one errors with 1-indexed DP table (remember text1[i-1] is the character at index i-1)
Forgetting to initialize the first row and column with zeros (LCS with empty string is 0)
Trying to use greedy - the local optimal choice doesn't always lead to global optimal

Edge Cases

Empty strings - returns 0
Single character strings - returns 1 if match, 0 if not
One string is a subsequence of the other - returns length of shorter string
Completely different strings - returns 0
Identical strings - returns length of string

Connections

LeetCode 72 (Edit Distance) - same DP structure, more complex transitions
LeetCode 516 (Longest Palindromic Subsequence) - apply LCS to string and its reverse
LeetCode 1035 (Uncrossed Lines) - identical problem, different story framing
LeetCode 1458 (Max Product of Two disjoint subsequences) - uses LCS-like DP reasoning
Can reconstruct actual subsequence by backtracking from dp[m][n]

Longest Increasing Path in a Matrix #329

DFS with Memoization (Topological DP on DAG)

Intuition

Imagine each cell's value as an elevation. You're a hiker who can only walk uphill (to strictly higher values). You want to find the longest possible hike you could take from any starting point. The key insight: since values strictly increase along any path, you can never cycle back - the graph of valid moves is a Directed Acyclic Graph (DAG). This means if you compute the longest path from a cell once, that answer is final forever - no future decisions can change it. It's like calculating potential energy: once you know the maximum height reachable from each point, you just add 1 for the current step.

Why This Pattern?

The matrix with 'only move to higher values' constraint naturally forms a DAG because strict increase prevents cycles. For any cell, its longest path equals 1 (itself) plus the maximum of its neighbors' longest paths. Since neighbors always have higher values, there's no circular dependency - we can compute in any order using memoization. This is essentially dynamic programming on a DAG, computed via depth-first search.

Solution

class Solution:
    def longestIncreasingPath(self, matrix: List[List[int]]) -> int:
        if not matrix or not matrix[0]:
            return 0
        
        m, n = len(matrix), len(matrix[0])
        # Cache stores longest path starting from each cell
        cache = [[0] * n for _ in range(m)]
        
        def dfs(i, j):
            # If already computed, return cached result
            if cache[i][j] != 0:
                return cache[i][j]
            
            # Directions: up, down, left, right
            directions = [(-1, 0), (1, 0), (0, -1), (0, 1)]
            max_path = 1  # At minimum, we can stay at current cell
            
            for di, dj in directions:
                ni, nj = i + di, j + dj
                # Check bounds and only move to strictly higher values
                if 0 <= ni < m and 0 <= nj < n and matrix[ni][nj] > matrix[i][j]:
                    # Recursively find longest path from neighbor, add current cell
                    max_path = max(max_path, 1 + dfs(ni, nj))
            
            cache[i][j] = max_path
            return max_path
        
        # Try starting from every cell, return the maximum
        result = 0
        for i in range(m):
            for j in range(n):
                result = max(result, dfs(i, j))
        
        return result

Complexity

Time: O(m * n)
Space: O(m * n) for the cache + O(m * n) for recursion stack in worst case

Each cell is visited exactly once and we check 4 neighbors per cell. Even though we recursively explore, the memoization ensures we never recompute a subproblem - each cell's result is computed exactly one time and cached. The worst-case space equals the number of cells because in a strictly increasing path, we could recurse through every cell before returning.

Common Mistakes

Forgetting to check bounds before accessing matrix - causes index errors
Not handling the case where matrix is empty - crashes on matrix[0]
Using >= instead of > for comparison - allows equal values which breaks the strictly increasing property
Not initializing cache with a sentinel value (like 0) that could be confused with valid results - better to use None
Forgetting that longest path must include the current cell itself (the +1)

Edge Cases

Single cell matrix - should return 1
All equal values - no moves possible, each cell returns 1
Completely strictly increasing from one corner - path length equals total cells
Matrix with negative values - algorithm still works since only relative comparison matters
Non-rectangular input is not possible per problem spec but empty rows should be handled

Connections

Similar to 'Number of Islands' - both use DFS with visited tracking, but this uses memoization for optimization
Same core technique as 'Climbing Stairs' - DP where state depends on 'smaller' previous states
Related to 'Course Schedule' - both exploit DAG structure, but this computes via DFS instead of topological sort
Compare with 'Cherry Pickup' - both use 2D DP but this is simpler since no need to track two states simultaneously
The memoization pattern here is identical to 'House Thief' - compute optimal substructure and cache results

Regular Expression Matching #10

2-D Dynamic Programming on two strings (like Longest Common Subsequence, Edit Distance, Wildcard Matching)

Intuition

Think of this as a signal propagation or state machine problem. You're trying to find a 'path' through both strings where each step either consumes a character or uses the special '*' operator to either suppress the previous element (zero occurrences) or allow it to repeat. The '.' is a wildcard - like a universal adapter that fits anything. The '*' is a feedback loop: it can either 'dampen' the signal (zero of preceding) or 'amplify' it (one or more of preceding). The 2D grid naturally emerges because you have two independent positions to track - where you are in the string and where you are in the pattern. Each cell asks: 'Can the remainder of string s[i:] match remainder of pattern p[j:]?'

Why This Pattern?

The problem has optimal substructure: whether s[i:] matches p[j:] depends on smaller suffixes. There are overlapping subproblems - we might reach the same (i,j) state via different paths. The state space is naturally 2D because we track two independent indices. This is structurally identical to wildcard matching (LeetCode 44) but with '*' having different semantics (preceding element must exist, not optional).

Solution

def isMatch(s: str, p: str) -> bool:
    # dp[i][j] = True if s[i:] matches p[j:]
    # Working backwards from ends (like Edit Distance)
    m, n = len(s), len(p)
    dp = [[False] * (n + 1) for _ in range(m + 1)]
    
    # Base case: empty string matches empty pattern
    dp[m][n] = True
    
    # Fill last row: matching empty string against pattern
    # Pattern 'x*' can match empty (x occurs 0 times)
    for j in range(n - 1, -1, -1):
        if p[j] == '*':
            dp[m][j] = dp[m][j + 2]  # skip 'x*' entirely
    
    # Fill DP table: for each position in s and p
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            # Do current characters match? (accounting for '.' wildcard)
            first_match = s[i] == p[j] or p[j] == '.'
            
            # Check if next pattern char is '*'
            if (j + 1) < n and p[j + 1] == '*':
                # Two paths with '*': 
                # 1) Skip 'x*' entirely (x occurs 0 times): dp[i][j+2]
                # 2) Use '*' to match current char and stay on pattern j
                #    (x occurs 1+ times): first_match AND dp[i+1][j]
                dp[i][j] = dp[i][j + 2] or (first_match and dp[i + 1][j])
            else:
                # No '*', must match current chars and advance both
                dp[i][j] = first_match and dp[i + 1][j + 1]
    
    return dp[0][0]

Complexity

Time: O(m * n) where m = len(s), n = len(p)
Space: O(m * n) for the DP table. Can be optimized to O(n) using only two rows since we only look at dp[i+1][...] and dp[i][j+2].

We must examine every cell in the m×n grid because any (i,j) combination could be relevant. We can't skip cells since pattern matching decisions depend on both string and pattern positions simultaneously. The table is a product of the two input sizes - each character in s potentially interacts with each character in p.

Common Mistakes

Confusing '*' semantics with glob patterns - '*' in regex means 'zero or more of PRECEDING element', not 'any sequence'
Forgetting base case: empty pattern can match empty string
Not handling the case where '*' matches ZERO occurrences (the dp[i][j+2] path)
Off-by-one errors when checking p[j+1] for '*' at pattern boundaries
Using s[j] instead of s[i] - mixing up indices between string and pattern

Edge Cases

s='', p='' → True (empty matches empty)
s='', p='a*' → True (* allows zero a's)
s='a', p='' → False
s='aaa', p='a*' → True (* matches all three)
s='ab', p='.*' → True (. matches 'a', * matches 'b')
s='ab', p='a*b*' → True (a* matches 'a', b* matches 'b')
s='aab', p='c*a*b*' → True (c* = 0, a* matches 'aa', b* matches 'b')
s='mississippi', p='mis*is*p*' → False (tricky case with multiple stars)

Connections

Wildcard Matching (LeetCode 44) - same 2D DP structure, but '*' behavior differs (here '*' requires preceding element exists)
Edit Distance (LeetCode 72) - same backwards-filling 2D DP pattern
Longest Common Subsequence - similar grid DP approach
Distinct Subsequences (LeetCode 115) - counting matching paths in 2D grid

Target Sum #494

Subset Sum / 0-1 Knapsack (counting variation)

Intuition

Think of this as a conservation law problem - imagine you have weights (the array elements) and you want to balance a scale. Placing a number on the left plate is +, on the right is -. The difference between the two sides must equal the target. The key insight: if P is the sum of all + signs and N is the sum of all - signs, then P - N = target AND P + N = total_sum. Solving these gives us P = (target + total_sum) / 2. So instead of counting +/- arrangements, we're just counting subsets that sum to (target + total)/2. This is like asking: "How many ways can we reach a specific energy level?" Each element either contributes its energy (included) or doesn't (excluded).

Why This Pattern?

The problem appears to be about +/- signs, but the mathematical transformation P = (target + total)/2 reveals it as a subset counting problem. Each element is either included (goes to + group) or excluded (goes to - group), and we count ways to reach a specific sum. This is the classic 0-1 knapsack structure where we either take or don't take each item.

Solution

def findTargetSumWays(nums, target):
    total = sum(nums)
    
    # If target is beyond what we can achieve, impossible
    if abs(target) > total:
        return 0
    
    # P = (target + total) / 2 must be an integer
    # This comes from: P - N = target and P + N = total → 2P = target + total
    if (target + total) % 2 != 0:
        return 0
    
    goal = (target + total) // 2
    
    # dp[i] = number of ways to achieve sum i
    # Using 1D DP - iterate backwards to simulate 0-1 choice (don't reuse items)
    dp = [0] * (goal + 1)
    dp[0] = 1  # One way to get sum 0: select no elements
    
    for num in nums:
        # Traverse backwards: for each num, add it to existing subsets
        # This ensures each num is used at most once (0-1 knapsack property)
        for i in range(goal, num - 1, -1):
            dp[i] += dp[i - num]
    
    return dp[goal]

Complexity

Time: O(n * goal) where n is the array length and goal = (target + total)/2
Space: O(goal) - the 1D DP array

We iterate through each of n elements, and for each element, we potentially update all sums from goal down to that element's value. This is like filling a knapsack - each item has 'weight' equal to its value, and we count distinct ways to fill it to exactly goal. We can't do better because we must consider every subset possibility.

Common Mistakes

Not checking if (target + total) is odd - we need integer P
Forgetting that dp[0] = 1 (empty subset counts as one way)
Using forward iteration in DP loop - this creates combinations where elements can be reused, breaking the 0-1 property
Not handling the case where abs(target) > total - the difference can never be achieved
Not initializing dp[0] = 1 - this is the base case representing using no elements
Using 2D DP when 1D suffices - the backwards iteration achieves the same result more efficiently

Edge Cases

Empty array: return 1 if target is 0, else 0
All zeros in array: each zero creates 2 ways (add or subtract), doubling possibilities
Target is 0 and total is 0: one way (all signs give 0)
Target equals total: one way (all plus signs)
Target equals -total: one way (all minus signs)
Large numbers causing goal to exceed array bounds

Connections

This IS the subset sum problem - same as 'Partition Equal Subset Sum' (#416) but counting ways instead of just checking possibility
Similar to 'Number of Subsequences That Satisfy Condition' - also counts subset combinations
The transformation P = (target + total)/2 is the key insight used in 'Last Stone Weight II' (#1049) where we minimize the difference

Unique Paths #62

2D Dynamic Programming (Grid DP)

Intuition

Imagine this like water flowing through a grid of pipes. At each intersection, the water can split — it can go right or down. The number of ways to reach any cell is like combining two streams: the stream coming from above and the stream coming from the left. When streams merge, their flows add up. So to reach cell (i,j), you must have arrived either from above (i-1,j) or from the left (i,j-1). Each path is unique because the robot makes different choices at each step — it's like a decision tree where right/down are the two branches.

Why This Pattern?

This problem has optimal substructure — the answer for each cell depends only on the answers for cells above and to its left. There's no need to consider the full history because the robot's position at any cell completely determines future possibilities. The grid structure naturally maps to a 2D DP table where we build up solutions from top-left to bottom-right.

Solution

def uniquePaths(m, n):
    # dp[j] represents number of ways to reach column j in current row
    # Initialize with 1s: first row has only 1 way (all right moves)
    dp = [1] * n
    
    # Iterate through each row starting from second
    for i in range(1, m):
        # For each column starting from second
        for j in range(1, n):
            # dp[j] currently holds ways from cell above (same column, previous row)
            # dp[j-1] holds ways from cell to the left (previous column, same row)
            # Add them together — this is the core DP transition
            dp[j] += dp[j-1]
    
    return dp[-1]  # Return ways to reach bottom-right

Complexity

Time: O(m × n) — we visit each cell exactly once, computing its value from two neighbors
Space: O(n) — we only need one row array instead of a full m×n table because each row only depends on itself and the previous row

We can't do better than O(m×n) because there are m×n cells and each one must contribute to the final count somehow — we need to consider all possible paths. For space, we only keep track of one row because when computing cell (i,j), we only need dp[j] (from above) and dp[j-1] (from left in current row). Older rows become irrelevant once we've moved past them, like forgetting where water came from once it's passed through a pipe junction.

Common Mistakes

Forgetting the first row/column initialization — they all have value 1 because you can only approach from one direction
Using the wrong indices in the transition — dp[j] is 'from above', dp[j-1] is 'from left'
Confusing m and n — m is rows (height), n is columns (width)

Edge Cases

m=1 or n=1: Only one row or one column means single path (all right or all down)
Smallest case m=1, n=1: Starting and ending at same cell, one way (doing nothing)
Large grids: Numbers can get huge, might need modulo in some variants

Connections

S
a
m
e
c
o
r
e
i
n
s
i
g
h
t
a
s
'
U
n
i
q
u
e
P
a
t
h
s
I
I
'
(
#
6
3
)
—
t
h
a
t
p
r
o
b
l
e
m
a
d
d
s
o
b
s
t
a
c
l
e
h
a
n
d
l
i
n
g
,
w
h
i
c
h
i
s
j
u
s
t
c
h
e
c
k
i
n
g
i
f
a
c
e
l
l
i
s
b
l
o
c
k
e
d
b
e
f
o
r
e
c
o
m
p
u
t
i
n
g
d
p
.
A
l
s
o
c
o
n
n
e
c
t
s
t
o
'
M
i
n
i
m
u
m
P
a
t
h
S
u
m
'
(
#
6
4
)
w
h
e
r
e
i
n
s
t
e
a
d
o
f
c
o
u
n
t
i
n
g
p
a
t
h
s
,
w
e
f
i
n
d
t
h
e
p
a
t
h
w
i
t
h
m
i
n
i
m
u
m
s
u
m
.
T
h
e
2
D
D
P
p
a
t
t
e
r
n
h
e
r
e
i
s
f
o
u
n
d
a
t
i
o
n
a
l
—
y
o
u
'
l
l
s
e
e
i
t
i
n
m
a
n
y
g
r
i
d
p
r
o
b
l
e
m
s
l
i
k
e
'
D
u
n
g
e
o
n
G
a
m
e
'
(
#
1
7
4
)
a
n
d
'
C
h
e
r
r
y
P
i
c
k
u
p
'
(
#
7
4
1
'
)
.
T
h
e
s
p
a
c
e
o
p
t
i
m
i
z
a
t
i
o
n
t
r
i
c
k
(
1
D
a
r
r
a
y
)
i
s
t
h
e
s
a
m
e
t
e
c
h
n
i
q
u
e
u
s
e
d
i
n
'
H
o
u
s
e
R
o
b
b
e
r
'
(
#
1
9
8
)
.

Greedy (8)

Gas Station #134

Greedy with proof / Single-pass proof. We make a locally optimal choice (move start forward when we fail) that guarantees global optimality because of the conservation law (total gas vs total cost).

Intuition

Think of this as an energy conservation problem. At each station, you gain gas[i] and spend cost[i] to move forward. If total_gas < total_cost over the entire circuit, it's fundamentally impossible - you're losing energy overall. But if total_gas >= total_cost, there MUST be a valid starting point. Here's why: imagine your fuel tank level as you travel around the circle. If the overall level ends up higher than where it started (or equal), the path must have a minimum point. That minimum point is your starting station - from there, the tank level never drops below zero, guaranteeing you can complete the circuit. The greedy insight: if starting from station A fails at station B (your tank goes negative), then NO station between A and B can work either, because you'd be starting with even less fuel than A had when it failed.

Why This Pattern?

The problem has a mathematical guarantee: if total_gas >= total_cost, there exists a solution, and we can find it greedily. If we fail at station B starting from A, all stations between A and B are impossible because you'd arrive there with less cumulative fuel than A did. This lets us skip them all in one move.

Solution

def canCompleteCircuit(gas, cost):
    total_tank = 0  # Track overall gas vs cost for the whole circuit
    curr_tank = 0   # Track gas vs cost from current start
    start = 0       # Our candidate starting station
    
    for i in range(len(gas)):
        diff = gas[i] - cost[i]
        total_tank += diff   # Add to overall account
        curr_tank += diff    # Add to current journey account
        
        # If we can't reach station i from our current start,
        # then no station between start and i can work.
        # Skip all of them by moving start to i+1.
        if curr_tank < 0:
            start = i + 1
            curr_tank = 0  # Reset tank for new starting point
    
    # If overall tank is non-negative, we found a valid start.
    # Otherwise, even the total gas wasn't enough - impossible.
    return start if total_tank >= 0 else -1

Complexity

Time: O(n) - single pass through all stations
Space: O(1) - only tracking 3 integer variables regardless of input size

We can't do better than O(n) because we might need to examine every station to find the answer. The greedy proof guarantees we never need to revisit a station we've skipped, so one pass is sufficient. Space is O(1) because we only need to track the current tank level and total, not the entire journey history.

Common Mistakes

Not checking if total_gas >= total_cost first - leads to incorrect answers when impossible
Trying to brute force all starting positions (O(n²)) when greedy O(n) works
Forgetting that start might exceed array bounds when solution is impossible (handled by returning -1)
Not resetting curr_tank to 0 when moving start forward

Edge Cases

Empty arrays - handled by loop not executing, returns 0 which is valid if possible
Single station where gas == cost - works, returns 0
Single station where gas < cost - total_tank goes negative, returns -1
All negative diffs - total_tank negative, returns -1
All positive diffs - works, returns 0

Connections

This is the greedy cousin of 'trapping rain water' (container problem) - both find minimum points in a height/gas profile
Similar logic to finding the pivot in 'find pivot index' (LeetCode 724) - both involve cumulative sums and finding where something 'balances'
The proof structure (if A fails at B, nothing between A and B works) mirrors the sliding window pattern in 'minimum window substring'

Hand of Straights #846

Greedy with Counter/Frequency Map

Intuition

Think of this like organizing numbered books into consecutive groups. You have a pile of numbered books and need to form stacks where each stack has consecutive numbers (like 2,3,4 or 7,8,9). The key insight: always pick the SMALLEST available book and try to build a stack going up. Why? The smallest number has the least flexibility—it can only start stacks going upward, while larger numbers could either start new stacks OR extend existing ones. If you can't form a valid stack starting from the smallest available book, you're stuck. This is a "most constrained first" greedy principle—tackle the choice with fewest options first.

Why This Pattern?

The problem has a greedy structure because: (1) We make locally optimal choices (pick smallest, build upward), (2) these choices are irreversible (once we use a card, it's gone), (3) making the most constrained choice first (smallest card) is always safe—if we can't start a valid group from the smallest available card, we never will. The counter tracks what's available to use, which is essential for knowing when we can form groups.

Solution

from collections import Counter

class Solution:
    def isNStraightHand(self, hand: List[int], groupSize: int) -> bool:
        # If total cards can't be evenly divided, impossible
        if len(hand) % groupSize != 0:
            return False
        
        # Count frequency of each card value
        count = Counter(hand)
        
        # Iterate through card values in sorted order
        for card in sorted(count.keys()):
            # While we still have copies of this card, try to form a group
            while count[card] > 0:
                # Try to form consecutive sequence: card, card+1, ..., card+groupSize-1
                for i in range(groupSize):
                    target = card + i
                    # If we don't have the needed consecutive card, impossible
                    if count[target] > 0:
                        count[target] -= 1  # Use this card
                    else:
                        return False
        
        return True

Complexity

Time: O(n log n) where n is the number of cards
Space: O(n) for the counter storing unique card frequencies

Time: Sorting the unique card values takes O(k log k) where k is the number of unique cards (k ≤ n). The nested while/for loops process each card exactly once when forming groups, so O(n). Combined: O(n log n). We can't do better than sorting because we need to know the smallest available card at each step—this is inherent to the problem structure. Space: Counter stores at most one entry per unique card value, so O(k) ≤ O(n).

Common Mistakes

Forgetting to check if len(hand) % groupSize == 0 upfront—early return prevents wasted work
Not iterating through sorted keys—trying to form groups from unsorted values leads to incorrect results
Not handling the while loop properly—need to consume ALL copies of a card value before moving on (or skip when count becomes 0)
Forgetting that consecutive means card, card+1, card+2... (difference of 1, not index position in array)

Edge Cases

Single card with groupSize=1: should return True
Already sorted consecutive hand: always works
All same values (e.g., [2,2,2]): only works if groupSize equals count of that value
Empty hand: handled by len(hand) == 0, returns True (trivially satisfiable)
Hand with gaps (e.g., [1,3,4,5] with groupSize=2): fails because 2 is missing
Large duplicates: [1,1,1,2,2,2,3,3,3] with groupSize=3 works, but [1,1,2,2,3,3] with groupSize=3 fails

Connections

Similar to 'Divide Array in Sets of K Consecutive Numbers' (LeetCode 1296)—same greedy with counter pattern, just checks possibility rather than grouping
Uses same counter-based approach as 'Top K Frequent Elements' for frequency tracking
The greedy 'pick smallest available' principle also appears in 'Task Scheduler' (LeetCode 621) where we always process most frequent first

Jump Game II #45

Greedy - Always choose the option that maximizes immediate reach (lookahead one step)

Intuition

Imagine you're hopping across lily pads. Each lily pad tells you how far you can jump from there. You want minimum hops to reach the last pad. The greedy insight: at every jump, pick the lily pad that gets you as FAR as possible for the NEXT jump. It's like a gradient descent - you're always moving to the position with maximum 'potential energy' (reach). Think of it as expanding a "frontier" - from your current jump, you can reach positions up to some boundary. When you hit that boundary, you MUST make another jump, so you might as well jump to wherever gives you the farthest new boundary. This is like a wave propagating outward - each jump expands your reachable region, and you're looking for the minimum "time" (jumps) to hit the target.

Why This Pattern?

This problem has optimal substructure: the minimum jumps to reach the end from position i depends on which position you jump to next. The greedy choice works because, from all reachable positions in your current 'jump window', picking the one that extends your reach farthest MUST be optimal - if it weren't, you'd have a shorter path that reaches less far in one fewer jump, which contradicts the idea of minimum jumps. It's also a matroid-like structure where the greedy choice preserves optimality.

Solution

def jump(nums):
    # Edge case: already at end or single element
    if len(nums) <= 1:
        return 0
    
    jumps = 0          # Number of jumps made so far
    curr_end = 0       # End of current jump's reachable range
    farthest = 0       # Furthest we can reach with jumps+1 jumps
    
    # Iterate through all positions except the last
    # (we don't need to jump FROM the last position)
    for i in range(len(nums) - 1):
        # Update furthest reach from current position
        farthest = max(farthest, i + nums[i])
        
        # When we've exhausted current jump's range, we MUST make another jump
        # At this point, farthest represents our new boundary after this jump
        if i == curr_end:
            jumps += 1
            curr_end = farthest
    
    return jumps

Complexity

Time: O(n)
Space: O(1) - only using a few integer variables

We iterate through the array exactly once. At each index, we do O(1) work (constant-time max update and comparison). We can't do better than O(n) because we must examine each position to know the maximum reach - it's like needing to sample every point on a curve to find its maximum. The space is O(1) because we don't need to store any per-position data - we just track the running maximum reach as we sweep through.

Common Mistakes

Confusing this with Jump Game I - this requires tracking minimum jumps, not just reachability
Not updating farthest BEFORE checking if we need to make a jump - order matters
Forgetting that we iterate up to len(nums)-1, not len(nums) - we don't need to make a jump from the last position
Trying to track which specific index to jump to - we only need the boundary
Not handling the edge case of a single element array (should return 0)

Edge Cases

Array of length 1 - already at destination
Array where nums[i] >= len(nums) - i from start - can reach in 1 jump
Strictly increasing values - linear scan
Values of 0 in middle - must work around 'dead zones'

Connections

Jump Game I (#55) - same structure but just checking reachability (boolean), uses same interval merging idea
Gas Station (#134) - also greedy interval expansion to find minimum coverage
Minimum Number of Arrows to Burst Balloons (#452) - greedy interval merging pattern
Task Scheduler (#621) - greedy scheduling with intervals

Jump Game #55

Greedy - keep track of maximum reach

Intuition

Think of this like a signal propagating outward from the start position. Each element tells you the maximum range of that signal - from position i, the signal can spread to i+nums[i]. The question becomes: can this reachability 'wave' spread all the way to the last index? The greedy insight is simple: we don't need to plan the exact path, we just need to track the farthest position we CAN reach at any moment. If at any point our current position is beyond what we can reach, we're stuck (dead end). If we ever reach or exceed the last index, we're done.

Why This Pattern?

The problem has a matroid-like structure: reachability is monotonic and cumulative. If you can reach position i, you can reach all positions before i. This means the 'best' strategy (furthest reach) is always optimal - we never need to backtrack or reconsider a previous decision because earlier positions always remain reachable.

Solution

def canJump(nums):
    max_reach = 0  # Farthest position we can currently reach
    n = len(nums)
    
    for i in range(n):
        # If current position is beyond max reach, we're stuck
        if i > max_reach:
            return False
        
        # Extend our reach based on jumping from position i
        max_reach = max(max_reach, i + nums[i])
        
        # Early exit: if we can reach or exceed the last index
        if max_reach >= n - 1:
            return True
    
    return True

Complexity

Time: O(n)
Space: O(1)

We make a single pass through the array. For each element, we do O(1) work (update max_reach and check a condition). We can't do better than O(n) because we must inspect at least n-1 elements to verify reachability - in the worst case (all 1s), we need to check every position to know we can make it.

Common Mistakes

Confusing 'maximum jump length' with 'must jump that far' - you can always choose to jump less
Trying to find the minimum number of jumps instead of just whether it's possible
Not handling the edge case where the first element is 0 (can only stay at start)

Edge Cases

Single element array [0] - trivially reachable, should return True
First element is 0 - cannot move, only reachable if array length is 1
All zeros after some point - can only reach as far as max_reach allows

Connections

Jump Game II (LeetCode 45) - same problem but asks for minimum jumps needed
Gas Station (LeetCode 134) - similar greedy reachability analysis
Candy (LeetCode 135) - another greedy problem with monotonic considerations

Maximum Subarray #53

Kadane's Algorithm (Greedy)

Intuition

Imagine you're tracking your bank balance over time. Each day has a positive or negative balance change. You want to find the contiguous period (subarray) where your balance was highest. The key insight: if your accumulated balance becomes negative, it's better to 'reset' and start fresh from the next day — a negative balance only drags you down. This is like a system seeking energy minimization: the 'energy' (sum) of your current subarray, if it drops below zero, you abandon it and start a new equilibrium state from the current position. Kadane's algorithm is essentially a greedy decision at each step: extend the previous subarray OR start fresh from the current element — whichever gives us a better starting point.

Why This Pattern?

This is a greedy problem because at each position, we make a local optimal choice: extend the current subarray OR start a new one. This works because we're tracking the best subarray ENDING at each position. The local optimal (restart if previous sum < 0) leads to the global optimal (maximum subarray sum) because any subarray that includes a negative-prefixed segment can be improved by dropping that prefix.

Solution

def maxSubArray(nums):
    # Kadane's Algorithm: O(n) time, O(1) space
    # Key insight: at each position, decide to extend or restart
    
    max_sum = nums[0]  # best subarray seen so far
    current_sum = nums[0]  # best subarray ENDING at current position
    
    for i in range(1, len(nums)):
        # Either extend previous subarray OR start fresh from current element
        # If previous sum is negative, it's better to restart
        current_sum = max(nums[i], current_sum + nums[i])
        
        # Track the best sum we've seen overall
        max_sum = max(max_sum, current_sum)
    
    return max_sum

Complexity

Time: O(n)
Space: O(1)

We make exactly one pass through the array of n elements. At each element, we perform only constant-time operations (a comparison and addition). This is optimal because we must examine each element at least once to know if it's part of the maximum subarray. We cannot do better than O(n).

Common Mistakes

Forgetting to initialize max_sum to nums[0] and starting at 0 (fails for all-negative arrays)
Not updating max_sum in each iteration, only returning current_sum at the end
Confusing the comparison: should be max(nums[i], current_sum + nums[i]) not max(current_sum + nums[i], previous_max_sum)
Trying to use divide-and-conquer when greedy is simpler and sufficient

Edge Cases

All negative numbers: returns the largest (least negative) number, e.g., [-5,-3,-1] returns -1
Single element array: returns that element, e.g., [5] returns 5
Array with alternating positives and negatives: correctly handles restart at each negative
Large positive numbers that overflow in other languages but not Python

Connections

Best Time to Buy and Sell Stock (#121) — uses same 'running best' tracking pattern, but finds max difference instead of max subarray sum
Maximum Sum Circular Subarray (#918) — extends this by considering wrap-around cases
House Robber (#198) — another greedy/dp problem where you decide to take or skip each element
Longest Increasing Subsequence (#300) — different pattern but uses similar 'extend vs restart' logic

Merge Triplets to Form Target Triplet #1899

Greedy - greedy selection works because the problem has a matroid-like structure. We don't need to optimize which triplets to pick; we only need to verify existence. Once a triplet is valid (all values <= target), adding more valid triplets can only help (never hurt) because we're taking max values.

Intuition

Think of each triplet as a 'supply' of three resources. You can only use triplets where ALL three values are at or below the target (otherwise you'd overshoot and 'break' the target). Once you filter to valid triplets, you just need to check if they collectively contain every value of the target - like collecting ingredients. If you have all three ingredients (target_a, target_b, target_c) available from your valid supply, you can form the target. The 'merge' operation (taking max) means we just need each target value to appear somewhere in our valid set.

Why This Pattern?

The key insight is that valid triplets (those not exceeding target in any dimension) form an independence set - any subset of them is also valid. We only need to check coverage, not optimal selection. This is like checking if three specific items exist in a filtered list - a greedy 'take what works' approach is optimal because there's no trade-off between valid triplets.

Solution

class Solution:
    def mergeTriplets(self, triplets: list[list[int]], target: list[int]) -> bool:
        # Track whether we can achieve each component of the target
        can_a = can_b = can_c = False
        
        target_a, target_b, target_c = target
        
        for a, b, c in triplets:
            # Skip if any value exceeds target - this triplet would overshoot
            if a > target_a or b > target_b or c > target_c:
                continue
            
            # Check if this valid triplet gives us each target component
            if a == target_a:
                can_a = True
            if b == target_b:
                can_b = True
            if c == target_c:
                can_c = True
        
        # Need all three components to form the target
        return can_a and can_b and can_c

Complexity

Time: O(n) where n is the number of triplets. We make a single pass through all triplets, doing O(1) work each time.
Space: O(1) - only using three boolean flags regardless of input size. No additional data structures needed.

We can't do better than O(n) because we must examine each triplet at least once to determine if it's valid (has any value > target). The space is O(1) because we're not storing triplets - we're just checking existence of values, which we can do with simple boolean flags.

Common Mistakes

Forgetting to check all three dimensions - a triplet might be valid in 2 dimensions but invalid in the third
Using triplets that exceed target in ANY dimension (these would overshoot and make the target impossible)
Thinking you need to actually perform the merge operation - you only need to verify the values exist
Not handling duplicate triplets correctly - but duplicates are fine since we just need existence

Edge Cases

Empty triplets array - returns False unless target is also [0,0,0]
Target with zeros - triplets can only match if they have 0 in that position
All triplets invalid (all exceed target) - returns False
Target values that are all 0 - any valid triplet with 0s would work
Single triplet that exactly matches target - returns True

Connections

This is similar to 'Maximum Number of Achievable Transfer Requests' in greedy selection logic - filtering valid options then checking coverage
Shares the 'filter then verify' pattern with problems like 'Maximum Number of Events That Can Be Attended'
The dimension-wise checking is similar to 'Maximum Units on a Truck' where you greedily pick valid items, though here we just verify existence not maximize

Partition Labels #763

Two-pass greedy with last occurrence tracking. First, find each character's final position. Then, traverse while maintaining the furthest last-seen position of any character in the current partition. When current index hits that furthest point, we have a complete partition.

Intuition

Think of each letter as a 'signal' that starts at its first appearance and ends at its last appearance. We're trying to find natural boundaries where all signals that started before a point have already ended — it's like finding equilibrium points where no signal is still 'active.' We cut the string at these safe points because once we've seen the last occurrence of every letter in the current partition, we know no letter from this partition appears anywhere else. The greedy choice of cutting as soon as it's safe maximizes the number of partitions.

Why This Pattern?

The problem has a 'local optimal' property: once we've seen the last occurrence of every character encountered so far, we can safely cut — extending further would necessarily include a character that appears elsewhere, violating the constraint. This makes greedy optimal because we're always cutting at the earliest safe point, which leaves maximum room for subsequent partitions.

Solution

def partitionLabels(s):
    # First pass: find last occurrence of each character
    last = {c: i for i, c in enumerate(s)}
    
    result = []
    size = 0  # current partition size
    end = 0   # furthest last occurrence seen in current partition
    
    for i, c in enumerate(s):
        # extend the current partition to include this character's territory
        end = max(end, last[c])
        size += 1
        
        # if we've reached the end of all characters in this partition, cut here
        if i == end:
            result.append(size)
            size = 0
    
    return result

Complexity

Time: O(n) — two passes over the string, each doing O(1) work per character.
Space: O(1) — we store last positions for at most 26 letters (English alphabet), which is constant space.

We can't do better than O(n) because we must at least examine each character to know where partitions end. The space is bounded by the alphabet size (26 for lowercase English letters), not by n, so it's O(1).

Common Mistakes

Trying to solve in one pass without precomputing last positions — you'll get stuck trying to know future characters
Forgetting to update 'end' when you encounter a character with a later last occurrence
Not resetting 'size' after appending a partition

Edge Cases

Single character string — returns [1]
All same characters like 'aaaaaa' — returns [6]
Empty string — returns []
String with no repeats like 'abcde' — each character is its own partition, returns [1,1,1,1,1]

Connections

Merge Intervals (56) — similar concept of finding non-overlapping ranges, but here we're partitioning rather than merging
Interval List Intersections (986) — works with ranges, but this is the single-string special case
Task Scheduler (621) — also involves tracking last occurrences of distinct elements

Valid Parenthesis String #678

Greedy with range tracking (min/max balance)

Intuition

Think of parentheses like a see-saw or a balance scale. At any point, the number of '(' minus ')' represents how much 'weight' is on the left side. Normally with only '(' and ')', we just track one number. But with '*' acting as a wildcard, we have UNCERTAINTY - we don't know exactly what the balance is, but we know it's somewhere in a RANGE. The key insight: instead of trying every possibility (which is exponential), we track the MINIMUM and MAXIMUM possible balance at each step. If the maximum balance ever goes negative, we've broken the see-saw too far - even treating all '*' as '(' can't save us. If the minimum goes negative, we can 'reset' it to zero because we can always use some '*' as '(' to compensate. At the end, if we can achieve a balance of exactly 0 (min = 0), the string is valid.

Why This Pattern?

The wildcard '*' creates uncertainty in the balance state. Rather than exploring all 3^n possibilities of how '*' acts, we maintain the BOUNDS of all possible balances. This works because: (1) if max_balance < 0 at any point, no assignment can save us - it's impossible, (2) if min_balance < 0, we can always use some '*' as '(' to bring it back to 0, (3) at the end, checking if min == 0 tells us if there's an assignment achieving exactly zero balance. The structure of this problem is fundamentally about managing uncertainty in a range, and greedy tracking of bounds captures all possibilities efficiently.

Solution

def checkValidString(s):
    # min_balance: treat all '*' as ')' -> lowest possible balance
    # max_balance: treat all '*' as '(' -> highest possible balance
    min_balance = 0
    max_balance = 0
    
    for c in s:
        if c == '(':
            min_balance += 1
            max_balance += 1
        elif c == ')':
            min_balance -= 1
            max_balance -= 1
        else:  # c == '*'
            # '*' as '(' increases max, as ')' decreases min
            min_balance -= 1
            max_balance += 1
        
        # If max_balance < 0, even with all '*' as '(', we have too many ')'
        # This breaks the invariant - no valid assignment possible
        if max_balance < 0:
            return False
        
        # If min_balance < 0, we can use some '*' as '(' to bring it back to 0
        # We don't let it go negative because that's recoverable
        if min_balance < 0:
            min_balance = 0
    
    # At the end, can we achieve exactly balance 0?
    # min_balance == 0 means there's some assignment that reaches exactly 0
    return min_balance == 0

Complexity

Time: O(n) - single pass through the string
Space: O(1) - only tracking two variables regardless of input size

We process each character exactly once with O(1) work per character, giving O(n) time. Space is O(1) because we only maintain min_balance and max_balance - no arrays, stacks, or recursion. The bounds tracking compresses all 3^n possible wildcard assignments into just two numbers, which is the key efficiency insight.

Common Mistakes

Forgetting to clamp min_balance to 0 after it goes negative - this loses the insight that we can recover using '*' as '('
Returning True when max_balance == 0 at the end instead of checking min_balance == 0 - max_balance being 0 doesn't guarantee we can achieve exactly 0, only that 0 is in the possible range
Not checking max_balance < 0 during iteration - this early exit is crucial for O(n) performance
Confusing the meaning of min/max balance: min treats '*' as ')' (worst case for '('), max treats '*' as '(' (worst case for ')')

Edge Cases

Empty string - returns True (nothing to validate)
Only '*' characters - valid, can all become empty string
String like ')' or ')*(' - invalid, too many closing parentheses early
String ending with '*' - more lenient, has flexibility to become '(' or ''
All '(' with no '*' - invalid unless also all ')'

Connections

Same core concept as 'Valid Parentheses' (#20) but with uncertainty - that problem tracks a single balance, this tracks a range
Related to 'Remove Invalid Parentheses' (#301) - also deals with finding valid parenthesis configurations
Similar idea to 'Minimum Number of Swaps to Balance the String' - both use range tracking for balanced sequences
The range-tracking approach is analogous to how we handle 'Climbing Stairs' with varying step sizes or 'Jump Game' with variable jumps - maintain bounds of possible states

Intervals (6)

Insert Interval #57

Linear scan with three-way case analysis (intervals before, intervals after, intervals that overlap with new). This is essentially a 'sweep line' where we process intervals in sorted order and decide what to do with each one relative to the new interval.

Intuition

Think of this like adding a new meeting to a calendar. You have a list of non-overlapping meetings sorted by start time. When you add a new meeting, you need to find where it fits and merge it with any meetings that now overlap. It's like dropping a new train onto a schedule - if it overlaps with existing trains, you combine them into one longer block.

Why This Pattern?

The intervals are already sorted by start time, which means we can make a single pass through them. We don't need to backtrack or use complex data structures - we just need to handle three cases: (1) intervals that end completely before the new one go unchanged, (2) intervals that start completely after the new one go unchanged at the end, (3) intervals that overlap need to be merged by taking the min of starts and max of ends.

Solution

def insert(intervals, newInterval):
    # If no intervals exist, just return the new one
    if not intervals:
        return [newInterval]
    
    result = []
    i = 0
    n = len(intervals)
    
    # Case 1: Add all intervals that come BEFORE the new interval
    # (intervals that end before newInterval starts)
    while i < n and intervals[i][1] < newInterval[0]:
        result.append(intervals[i])
        i += 1
    
    # Case 2: Merge all intervals that OVERLAP with newInterval
    # Keep expanding newInterval to include all overlaps
    while i < n and intervals[i][0] <= newInterval[1]:
        # Merge: take the min start and max end
        newInterval[0] = min(newInterval[0], intervals[i][0])
        newInterval[1] = max(newInterval[1], intervals[i][1])
        i += 1
    
    # Add the merged newInterval to result
    result.append(newInterval)
    
    # Case 3: Add all remaining intervals that come AFTER the new interval
    while i < n:
        result.append(intervals[i])
        i += 1
    
    return result

Complexity

Time: O(n) - We make exactly one pass through all intervals. Each interval is visited at most once, so this is optimal. We can't do better because in the worst case (like inserting at the beginning), we still need to look at every interval to know where to place the new one.
Space: O(n) - We need space for the result list. The algorithm itself only uses O(1) extra space for pointers and temporary variables.

Think of it like reading a sorted file and inserting an item - you have to scan through to find the right position. With sorted data, O(n) is the best we can do for general insertion. We could do O(log n) with a tree if we just needed to search, but we also need to output all intervals, so O(n) output size dominates.

Common Mistakes

Forgetting to handle the case where newInterval completely engulfs multiple existing intervals
Not updating newInterval correctly when merging - need to keep expanding its bounds with min/max
Using intervals[i][0] < newInterval[1] instead of <= (off-by-one on boundary conditions)
Modifying the input intervals list directly instead of building a new one

Edge Cases

Empty intervals list - just return [newInterval]
newInterval goes at the very beginning (smallest start)
newInterval goes at the very end (largest end)
newInterval completely overlaps all existing intervals
newInterval is completely contained within an existing interval
newInterval is identical to an existing interval
All existing intervals overlap with newInterval

Connections

Merge Intervals (56) - same merging logic, but here we know exactly where to start
Interval Intersection - uses similar overlap checking logic
Non-overlapping Intervals - inverse problem of checking if new interval causes overlap
Meeting Rooms II - uses similar min-heap approach for interval scheduling

Meeting Rooms II #253

Min-Heap (Priority Queue) / Sweep Line

Intuition

Think of this like a busy hotel. When guests check in, they need rooms. When they check out, rooms become available. The question is: at the busiest moment of the day, how many rooms are simultaneously occupied? This is exactly the 'maximum concurrency' problem. Each meeting is a 'guest' that occupies a room for a specific duration. We need to find the peak overlap - the moment when the most meetings are happening at once. Another way: imagine stacking transparent time intervals on top of each other. How tall is the stack at its tallest point? That's your answer.

Why This Pattern?

The problem asks for maximum concurrency at any point in time. A min-heap naturally models this because it always gives us 'the room that becomes available next soonest' - we push end times and the smallest (earliest) end time sits at the top. When a new meeting starts, if its start time is >= the earliest end time, that room is free and we can reuse it. Otherwise we need a new room. The heap size at any moment equals the number of rooms in use. This is the classic 'resource allocation with release times' pattern.

Solution

import heapq

def minMeetingRooms(intervals):
    if not intervals:
        return 0
    
    # Sort meetings by start time - we process them in chronological order
    intervals.sort(key=lambda x: x[0])
    
    # Min-heap stores end times of currently occupied rooms
    # heap[0] = earliest ending meeting (room that frees up soonest)
    heap = []
    
    for start, end in intervals:
        # If earliest ending room is free by this meeting's start time,
        # reuse that room (pop it and push the new meeting's end time)
        if heap and start >= heap[0]:
            heapq.heappop(heap)
        # Always push the current meeting's end time
        # Either into a freed room (we popped first) or as a new room
        heapq.heappush(heap, end)
    
    # Maximum heap size = minimum rooms needed
    return len(heap)

Complexity

Time: O(n log n)
Space: O(n) in worst case (all meetings overlap)

Sorting takes O(n log n). Each meeting causes at most one heap push and one heap pop, each O(log n). So total is O(n log n). We can't do better than O(n log n) because sorting is required - we need to know the chronological order of meetings. The heap space is O(n) because in the worst case (all meetings overlapping), we hold all end times simultaneously.

Common Mistakes

Forgetting to sort intervals by start time first - processing out of order breaks the algorithm
Not handling the case where start time equals end time - a meeting ending at time 5 and another starting at time 5 can share a room, hence the >= comparison
Using a max-heap by mistake - we need earliest end time at the top, so min-heap is correct
Not checking if intervals is empty before processing

Edge Cases

Empty intervals list returns 0 rooms
Single meeting needs exactly 1 room
Meetings with identical start and end times (zero-length) - can share a room
Meetings that just touch (end == start) can reuse the same room
All meetings non-overlapping - only 1 room needed ever
All meetings overlapping - need n rooms

Connections

Meeting Rooms I (#252) - asks if meetings can fit in one room, uses same sorting approach but just checks for overlap
Merge Intervals (#56) - uses sorting by start time, related 'collapse overlapping intervals' concept
Minimum Number of Arrows to Burst Balloons (#452) - similar greedy + sorting approach for finding maximum overlaps
Car Fleet II (#1776) - uses backwards sweep with heap, same pattern of 'earliest available resource'

Meeting Rooms #252

Sorting + Sequential Greedy Check (Sweep Line variant)

Intuition

Imagine you're scheduling trains on a single track. Each meeting is like a train that occupies the track for a certain duration. Can all trains run on schedule without any collisions? The key insight: if you line up all meetings by their start time (like trains waiting at a station), you just need to check if any meeting 'rears into' the one before it. Think of it like cars following each other on a road - if the second car starts before the first car has cleared the road, there's a crash. By sorting, we create a timeline where we only need to compare neighbors - no need to check every possible pair.

Why This Pattern?

When intervals are sorted by start time, any overlap MUST occur between consecutive intervals. Why? If interval A overlaps interval B, and both are sorted, either A comes before B (so A's end > B's start, which we catch when checking A→B) or B comes before A (caught when checking B→A). We don't need to check non-consecutive pairs because if A doesn't overlap B, and B doesn't overlap C, then A definitely doesn't overlap C (transitive property of non-overlapping sorted intervals).

Solution

def canAttendMeetings(intervals):
    # Edge case: 0 or 1 meeting can always be attended
    if len(intervals) <= 1:
        return True
    
    # Sort by start time - creates the timeline
    # This is the critical first step that makes everything else work
    intervals.sort(key=lambda x: x[0])
    
    # Check each meeting against the previous one
    for i in range(1, len(intervals)):
        prev_start, prev_end = intervals[i-1]
        curr_start, curr_end = intervals[i]
        
        # If current meeting starts before previous one ends → overlap!
        if curr_start < prev_end:
            return False
    
    # No overlaps found - all meetings can be attended
    return True

Complexity

Time: O(n log n)
Space: O(1) if sorting in-place, O(n) if Python's Timsort creates a copy

Sorting is the dominant cost - we must examine all n meetings to place them in order. We can't know the correct relative positions without comparing each meeting to others, which requires at least n log n comparisons for comparison-based sorting. The subsequent scan is O(n) - just one pass through sorted meetings. We can't do better than O(n log n) because in the worst case (all meetings at different times), we need to establish their precise order.

Common Mistakes

Forgetting to sort - checking in original order will miss overlaps that would be caught after sorting
Using <= instead of < - back-to-back meetings (one ends at 5, next starts at 5) are FINE, not overlaps
Not handling empty input - should return True for 0 or 1 meetings
Confusing start/end indices - easy to mix up which position is start vs end

Edge Cases

Empty intervals array - should return True
Single meeting - always return True
Two meetings at exact same time - should return False (total overlap)
Back-to-back meetings [0,5] and [5,10] - should return True (no overlap)
Nested meeting [0,10] containing [2,3] - should return False
All meetings non-overlapping - should return True

Connections

Merge Intervals (#56) - same sorting strategy, but needs to actually merge overlapping groups
Interval Intersection (#986) - uses two-pointer technique on sorted intervals
Non-overlapping Intervals (#435) - greedy choice after sorting: remove minimum intervals to prevent overlap
Meeting Rooms II (#253) - uses min-heap to track meeting rooms needed, builds on this sorting foundation

Merge Intervals #56

Sort and Scan (Sweep Line variant)

Intuition

Think of intervals like train tracks laid out on a table. Some tracks overlap (share rails), some are separate. Your job is to find all the continuous track segments after merging overlapping ones. The key insight: if you line up all tracks by their starting position (like sorting books by where they start on a shelf), you only need to look at your current track and the next one to decide if they merge. You don't need to compare every track with every other track - the sorting makes the problem one-dimensional and local.

Why This Pattern?

This problem has a natural ordering property - intervals are ranges that can be sorted by their start points. Once sorted, the merge decision becomes purely local: each interval either (1) extends the current merged interval if it overlaps, or (2) starts a new interval if it doesn't. This transforms an O(n²) all-pairs problem into O(n) scanning after O(n log n) sorting. The 'sweep' happens in sorted order, and we maintain only one active interval at a time.

Solution

def merge(intervals):
    if not intervals:
        return []
    
    # Step 1: Sort by start time - this is the critical first move
    # Like organizing books by where they begin on a shelf
    intervals.sort(key=lambda x: x[0])
    
    # Step 2: Start with first interval as our baseline
    merged = [intervals[0]]
    
    # Step 3: Sweep through remaining intervals
    for start, end in intervals[1:]:
        # Get the end of the last merged interval
        last_end = merged[-1][1]
        
        # If current interval starts before or at the last one's end, they overlap
        # Like two train tracks that share rails - merge them!
        if start <= last_end:
            # Extend the end to be the max of both (covers nested and partial overlap)
            merged[-1][1] = max(last_end, end)
        else:
            # No overlap - start a new merged interval
            merged.append([start, end])
    
    return merged

Complexity

Time: O(n log n)
Space: O(n) for the output array, O(1) extra if we don't count output (just the merged list we build)

Sorting is the bottleneck - we must examine each interval's start position to establish the correct order, which takes O(n log n). Once sorted, we make exactly one pass through all intervals, doing constant-time work per interval. We can't do better than O(n log n) because any algorithm that determines the correct merge order must at minimum examine the relative positions of all intervals - that's a sorting lower bound for comparison-based approaches.

Common Mistakes

Forgetting to sort - comparing intervals without sorting leads to missing merges
Using intervals[i] instead of intervals[i+1] in the loop, causing index errors
Forgetting edge case of empty input
Not using max() when merging - need to handle intervals that start inside the current but end after it
Comparing wrong values - should check if new_start <= last_end, not new_end <= last_end

Edge Cases

Empty input array - return empty
Single interval - return that interval unchanged
All overlapping intervals - should merge into one
No overlapping intervals - return sorted copy of input
Nested intervals - smaller interval inside larger, should merge to larger's bounds
Touching intervals [1,2] and [2,3] - these should merge to [1,3] per LeetCode definition

Connections

Insert Interval (#57) - uses same sorted-scan pattern but inserts one interval into an already-merged set
Non-overlapping Intervals (#435) - inverse problem: find minimum removals to make non-overlapping, uses greedy selection after sorting
Meeting Rooms (#252) / Meeting Rooms II (#253) - interval scheduling problems with similar sorting approaches
Task Scheduler (#621) - different pattern but also uses sorting and greedy scanning

Minimum Interval to Include Each Query #1851

Sweep Line with Priority Queue (Active Intervals)

Intuition

Think of this like matching 'service providers' (intervals) to 'customers' (query points). Each provider covers a range, and each customer needs service at a specific point. The goal is to find the SMALLEST provider that can serve each customer — the tightest fit, not just any fit. This is like finding the most 'efficient' resource that can handle each request. As we sweep through the number line (like a scanner moving left to right), we keep track of all intervals that have 'opened' but haven't 'closed' yet. Among all active intervals at any point, we want the smallest one — this is a classic job for a min-heap where the smallest interval bubbles to the top.

Why This Pattern?

The problem has two dimensions: we need to process queries in sorted order (to efficiently track which intervals are active) AND we need quick access to the smallest active interval. The sweep line processes queries in sorted order, and the priority queue gives us O(1) access to the minimum. This is the natural decomposition: 'which intervals cover this point?' (sweep line) + 'which is smallest?' (heap).

Solution

import heapq

def minInterval(self, intervals: List[List[int]], queries: List[int]) -> List[int]:
    # Sort intervals by start time - we add them to heap in this order
    intervals.sort(key=lambda x: x[0])
    
    # Sort queries but keep original indices to restore order later
    # This lets us process queries from left to right on the number line
    queries_with_idx = sorted([(q, i) for i, q in enumerate(queries)])
    
    result = [-1] * len(queries)
    min_heap = []  # (length, end, start) - min-heap by length
    i = 0  # pointer into intervals array
    
    for query, idx in queries_with_idx:
        # PHASE 1: Open all intervals that start at or before this query
        # These intervals are now 'active' - they could contain the query
        while i < len(intervals) and intervals[i][0] <= query:
            start, end = intervals[i]
            length = end - start + 1
            heapq.heappush(min_heap, (length, end, start))
            i += 1
        
        # PHASE 2: Close intervals that don't cover this query
        # An interval is useless if its end comes before the query
        while min_heap and min_heap[0][1] < query:
            heapq.heappop(min_heap)
        
        # PHASE 3: The smallest active interval is our answer
        # Top of min-heap always has smallest length
        if min_heap:
            result[idx] = min_heap[0][0]
    
    return result

Complexity

Time: O((n + m) log n)
Space: O(n + m)

Common Mistakes

Forgetting to restore queries to original order - you must use the stored index
Not removing expired intervals from heap - intervals with end < query should be popped
Using interval length as just end - start (should be +1 for inclusive intervals)
Forgetting that queries might not be covered by any interval (return -1)

Edge Cases

Query exactly at interval start: intervals[i][0] <= query handles this with <=
Query exactly at interval end: heap[0][1] < query uses strict <, so intervals ending at query are kept
No intervals cover a query: heap becomes empty, result stays -1
Intervals that completely contain each other: all get added, but smallest length bubbles to top
Single interval covering all queries: heap maintains that one interval throughout

Connections

Similar to 'Find Right Interval' - both sort intervals and use sweep line approach
Related to 'Employee Free Time' - uses same 'merge overlapping' heap pattern
Think of this as the inverse of 'Non-overlapping Intervals' - there we're finding intervals to remove, here we're finding intervals that contain points

Non-overlapping Intervals #435

Greedy - Activity Selection (Earliest Finish Time)

Intuition

Imagine you're packing boxes into a fixed-length shelf. To fit the most boxes, you'd pick the narrowest boxes first - they leave maximum room for the rest. That's exactly what's happening here with time intervals. The greedy insight: if you always pick the interval that ends earliest, you leave maximum room for the remaining intervals to fit. This is like a game of Tetris where you want to maximize pieces - you always drop the piece that lands lowest to keep the stack as low as possible for future pieces.

Why This Pattern?

This pattern fits because the problem has optimal substructure: an optimal solution can be built by repeatedly making the locally optimal choice (pick earliest-ending non-overlapping interval). If you have an optimal solution that doesn't include the earliest-ending interval, you can swap that later-ending interval for the earlier one and never reduce the number of intervals you can keep - you actually gain space.

Solution

def eraseOverlapIntervals(intervals):
    if not intervals:
        return 0
    
    # Sort by end time - this is the key to the greedy approach
    intervals.sort(key=lambda x: x[1])
    
    removals = 0
    # Track the end time of the last non-overlapping interval we kept
    last_end = intervals[0][1]
    
    # Start from second interval since we kept the first one
    for i in range(1, len(intervals)):
        start, end = intervals[i]
        
        if start >= last_end:
            # No overlap - we can keep this interval
            last_end = end
        else:
            # Overlap detected - must remove one interval
            # We keep the one ending earlier (which is already sorted)
            # so we simply increment removal count and skip current
            removals += 1
    
    return removals

Complexity

Time: O(n log n)
Space: O(1) excluding the sort, or O(n) if counting the sorted array

Common Mistakes

Forgetting to handle empty input - returns 0 but code might crash
Not sorting by end time - sorting by start time or leaving unsorted gives wrong answer
Using start < last_end instead of start >= last_end for non-overlap check - edge case matters
Not initializing last_end correctly - should be the end of first interval after sorting

Edge Cases

Empty intervals list - return 0
All intervals overlap - we remove all but one, so return n-1
No intervals overlap - return 0
Two intervals that share an endpoint [1,2] and [2,3] - these DON'T overlap, should keep both

Connections

Minimum Number of Arrows to Burst Balloons (LeetCode 452) - same greedy logic, different application
Meeting Rooms II (LeetCode 253) - uses similar interval sorting but counts overlaps differently
Insert Interval (LeetCode 57) - interval manipulation in sorted list

Math & Geometry (8)

Detect Squares #2013

Diagonal-pair enumeration with hash-based point lookup

Intuition

Think of this like a resonance detector. When you add a point, it creates potential 'vibrations' that can resonate with other points to form squares. Here's the key insight: any square has two diagonals that share the same midpoint and are perpendicular with equal length. If we pick any two points as a diagonal, the other two corners of the square are UNiquely determined - there's exactly one way to complete the square. So instead of checking all 4-point combinations (which would be slow), we pick two points as a diagonal and check if the other two corners exist in our collection.

Why This Pattern?

A square's diagonals have two crucial properties: (1) they bisect each other at the same midpoint, and (2) they have equal length and are perpendicular. This means if we fix two points as a potential diagonal, the other two corners are mathematically determined - there's no ambiguity or choice to make. This turns the problem into: for each point A, treat it and some other point B as a diagonal, compute where C and D must be, and check if they exist. The hash map gives O(1) lookup, making the enumeration efficient.

Solution

class DetectSquares:
    def __init__(self):
        # Hash map: point -> count of times added (handles duplicates)
        self.point_count = {}
        
    def add(self, point: List[int]) -> None:
        self.point_count[tuple(point)] = self.point_count.get(tuple(point), 0) + 1
        
    def count(self) -> int:
        result = 0
        # For each point P in our collection
        for p in self.point_count:
            px, py = p
            # Try every other point A that could form a diagonal with P
            for a in self.point_count:
                if a == p:
                    continue
                ax, ay = a
                
                # Skip if same x or y - these would form a line, not a diagonal
                # (diagonals must be perpendicular, so x's and y's must differ)
                if px == ax or py == ay:
                    continue
                
                # Compute the other two corners of the square
                # The diagonals of a square bisect each other at the same midpoint
                # and are perpendicular with equal length
                # Given P=(px,py) and A=(ax,ay), the other corners are:
                # B = (ax, py)  -- shares x with A, y with P
                # C = (px, ay)  -- shares x with P, y with A
                # This forms an axis-aligned square
                b = (ax, py)
                c = (px, ay)
                
                # Check if both B and C exist in our collection
                if b in self.point_count and c in self.point_count:
                    # Multiply counts because each point could appear multiple times
                    result += (self.point_count[a] * 
                              self.point_count[b] * 
                              self.point_count[c])
        
        return result // 2  # Divide by 2 to avoid double-counting each square

Complexity

Time: O(n²) per query, where n is the number of unique points added
Space: O(n) for storing the hash map of points

We iterate through all point pairs (O(n²)), and for each pair, we do O(1) hash lookups and multiplications. We can't do better than O(n²) in the worst case because in the worst scenario (all points on a circle, say), there could be O(n²) different squares, and we need to count them all. The hash map gives us constant-time lookup for checking if a point exists, which is crucial - without it, we'd need O(n) lookup per check, making it O(n³).

Common Mistakes

Forgetting to handle duplicate points - the same point can be added multiple times, and each occurrence creates different square combinations
Not dividing by 2 at the end - each square gets counted twice (once for each diagonal)
Confusing axis-aligned with rotated squares - the solution above uses axis-aligned squares. For rotated squares, you'd need the diagonal midpoint method which is more complex

Edge Cases

No points added yet - returns 0 (handled naturally by empty iteration)
All points collinear - no squares formed (px==ax or py==ay check filters these)
Points at same location added multiple times - handled by counting multiplicities
Single point - returns 0
Two points - returns 0 (can't form a square)

Connections

T
h
i
s
i
s
s
i
m
i
l
a
r
t
o
'
N
u
m
b
e
r
o
f
G
o
o
d
P
a
i
r
s
'
(
L
e
e
t
C
o
d
e
1
5
1
2
)
i
n
t
h
a
t
b
o
t
h
u
s
e
h
a
s
h
m
a
p
s
t
o
c
o
u
n
t
c
o
m
b
i
n
a
t
i
o
n
s
e
f
f
i
c
i
e
n
t
l
y
.
T
h
e
d
i
a
g
o
n
a
l
p
r
o
p
e
r
t
y
i
s
a
n
a
l
o
g
o
u
s
t
o
'
V
a
l
i
d
S
q
u
a
r
e
'
(
L
e
e
t
C
o
d
e
5
9
3
)
b
u
t
i
n
v
e
r
t
e
d
-
t
h
e
r
e
w
e
c
h
e
c
k
i
f
4
g
i
v
e
n
p
o
i
n
t
s
f
o
r
m
a
s
q
u
a
r
e
,
h
e
r
e
w
e
s
e
a
r
c
h
f
o
r
a
l
l
p
o
s
s
i
b
l
e
s
q
u
a
r
e
s
.
T
h
e
c
o
u
n
t
i
n
g
w
i
t
h
m
u
l
t
i
p
l
i
c
i
t
i
e
s
p
a
t
t
e
r
n
a
p
p
e
a
r
s
i
n
m
a
n
y
c
o
m
b
i
n
a
t
o
r
i
a
l
p
r
o
b
l
e
m
s
.
T
h
e
d
i
v
i
d
e
-
b
y
-
2
c
o
r
r
e
c
t
i
o
n
i
s
c
o
m
m
o
n
i
n
p
a
i
r
-
c
o
u
n
t
i
n
g
p
r
o
b
l
e
m
s
(
l
i
k
e
c
o
u
n
t
i
n
g
u
n
o
r
d
e
r
e
d
p
a
i
r
s
f
r
o
m
o
r
d
e
r
e
d
c
o
m
b
i
n
a
t
i
o
n
s
)
.

Happy Number #202

Floyd's Tortoise and Hare (Cycle Detection) - same technique used for detecting loops in linked lists.

Intuition

Think of this like a feedback loop in a system. You take a number, crunch its digits into a new number, and feed that back in. The question is: does this system settle into equilibrium (reaches 1, the 'happy' state) or does it get stuck in a repeating pattern? The beautiful mathematical fact here is that the sum-of-squares operation is a 'contraction' - it shrinks numbers fast enough that you can't diverge to infinity. You MUST either hit 1 or enter a cycle. The known unhappy cycle is 4→16→37→58→145→42→20→4 (and it turns out ALL unhappy numbers eventually hit this exact cycle). So we just need to detect if we enter a cycle that doesn't include 1.

Why This Pattern?

This is a finite state machine where each number maps to exactly one successor. Finite state machines with a deterministic transition function either terminate (hit 1) or enter a cycle. Floyd's algorithm detects cycles by having two pointers traverse the sequence at different speeds - they'll eventually meet if a cycle exists. It's O(1) space because we don't need to store visited numbers, unlike a set-based approach.

Solution

def isHappy(n: int) -> bool:
    # Helper: compute sum of squares of digits
    def get_next(num):
        total = 0
        while num > 0:
            digit = num % 10
            total += digit * digit
            num //= 10
        return total
    
    # Edge case: 1 is immediately happy
    if n == 1:
        return True
    
    # Two pointers: slow moves 1 step, fast moves 2 steps
    slow = n
    fast = get_next(n)
    
    # Loop until fast hits 1 (happy) or they meet (cycle = unhappy)
    while fast != 1 and slow != fast:
        slow = get_next(slow)
        fast = get_next(get_next(fast))
    
    return fast == 1

Complexity

Time: O(log n) - Each digit-squaring step reduces the number significantly. For a number with d digits, the maximum sum of squares is 81d. For n ≥ 100, this sum is always less than n, creating rapid contraction. The sequence reaches either 1 or enters the cycle within a bounded number of steps (known max ~20 for 32-bit integers).
Space: O(1) - Only two integer variables (slow, fast) regardless of input size. No set of visited numbers needed.

We don't need to track every number we've seen because Floyd's algorithm exploits the cycle structure: like two runners on a circular track, a fast runner will eventually lap a slow runner if there's a loop. The 'track' here is the sequence of numbers generated by the happy process. Since the sequence either ends at 1 or loops forever, we only need to detect if a loop exists that doesn't include 1.

Common Mistakes

Using set-based approach which uses O(n) space unnecessarily
Not handling n=1 as an edge case (though it works anyway since fast would be 1)
Confusing // with / for integer division in get_next
Forgetting that 0 is not a happy number (though problem states positive integers)

Edge Cases

n = 1 returns True immediately
n = 2 enters the unhappy cycle
Very large numbers still converge quickly due to digit contraction property

Connections

Linked List Cycle (141) - uses identical Floyd's cycle detection
Find the Duplicate Number (287) - another Floyd's application on numerical sequences
Rectangle Overlap (836) - another math/geometry problem about detecting 'bad' states in a deterministic system

Multiply Strings #43

Digit-by-digit multiplication with positional accumulation

Intuition

Think of multiplication like waves colliding. When you multiply digit[i] from the first number with digit[j] from the second, their 'energy' arrives at position i+j in the result. All the collisions at position k come from pairs where i+j=k. This is exactly like adding up all the products that land at each position - we accumulate contributions rather than doing the traditional step-by-step shifted multiplication. The carry is just 'energy overflow' that spills to the next position.

Why This Pattern?

In base-10 positional notation, digit at position i has value 10^i and digit at position j has value 10^j. Their product contributes to position i+j. All pairs (i,j) that sum to k contribute to result[k]. This mathematical property makes accumulation the natural approach - we collect all contributions at each position first, then normalize with carries.

Solution

class Solution:
    def multiply(self, num1: str, num2: str) -> str:
        # Handle zeros upfront - essential optimization
        if num1 == "0" or num2 == "0":
            return "0"
        
        # Result can be at most len(num1) + len(num2) digits
        # (e.g., 99 × 99 = 9801, which is 4 digits = 2 + 2)
        result = [0] * (len(num1) + len(num2))
        
        # Process from right to left (least significant digits)
        for i in range(len(num1) - 1, -1, -1):
            for j in range(len(num2) - 1, -1, -1):
                # Convert chars to integers
                n1 = ord(num1[i]) - ord('0')
                n2 = ord(num2[j]) - ord('0')
                
                # Position in result where this product contributes
                # i + j gives current position, carry goes to i + j + 1
                position = i + j + 1
                
                # Multiply and add to existing value at position
                product = n1 * n2 + result[position]
                
                # Store ones digit at current position
                result[position] = product % 10
                # Carry the tens digit to next position
                result[position - 1] += product // 10
        
        # Skip leading zeros (if any) and convert to string
        start_idx = 0
        while start_idx < len(result) and result[start_idx] == 0:
            start_idx += 1
        
        # Build final string from remaining digits
        return ''.join(str(d) for d in result[start_idx:])

Complexity

Time: O(m × n) where m = len(num1), n = len(num2)
Space: O(m + n) for the result array

We must multiply every digit in num1 by every digit in num2 - there are m × n such pairs, so we can't do better than O(m×n). The result array needs at most m+n positions because the largest product (all 9s) produces at most m+n digits.

Common Mistakes

Not handling '0' as input - can return '000' instead of '0'
Forgetting that carry goes to position i+j, not i+j+2
Using integer conversion instead of string math - overflows for large numbers
Not skipping leading zeros in final result

Edge Cases

Input '0' - must return '0' immediately
Numbers with leading zeros like '001' × '003'
Single digit inputs like '9' × '9'
Maximum length inputs that produce full result array

Connections

Similar to Add Binary - same digit-by-digit accumulation with carry logic
Uses same ASCII digit conversion technique as Add Strings
Inverse operation of String to Integer (atoi) - builds number from digits instead of parsing it

Plus One #66

Carry propagation / digit-by-digit processing

Intuition

Think of adding 1 like pouring water into a graduated cylinder. You fill up the rightmost 'bucket' (ones place). If it overflows (hits 10), it empties to 0 and carries 1 to the next bucket to the left. Keep propagating left until a bucket doesn't overflow. If ALL buckets overflow (like 999), you need a new bucket at the front (becoming 1000). It's exactly like doing addition on paper - you just don't know you need the carry until you hit a 9.

Why This Pattern?

The problem has a natural right-to-left sequential dependency. Each digit's final value depends on whether there was a carry from processing the less significant digit. This single-pass-from-right approach is the only way because you can't know if you need to carry until you've processed all digits to the right.

Solution

def plusOne(digits):
    # Process from rightmost (least significant) digit to left
    n = len(digits)
    
    for i in range(n - 1, -1, -1):
        if digits[i] < 9:
            # No carry needed - we can just increment and we're done
            # This digit absorbs the +1, nothing propagates further
            digits[i] += 1
            return digits
        else:
            # digits[i] == 9: becomes 0, carry propagates to next digit
            # Like 9 + 1 = 10, write 0, carry 1
            digits[i] = 0
            # Loop continues, carrying 1 to the next position
    
    # If we exit the loop, ALL digits were 9 (e.g., 999 -> 1000)
    # Need to prepend a 1 (the implicit carry creates a new most significant digit)
    return [1] + digits

Complexity

Time: O(n)
Space: O(1) excluding output array, O(n) if counting the output array

Worst case (like 999...) requires touching every digit once. Best case (like 1234) only touches the last digit. On average, we might process half the digits, but O(n) captures the upper bound. Space is O(1) extra because we modify in-place - we only allocate a new array in the all-9s case.

Common Mistakes

Adding 1 to the entire number first then converting back (causes overflow for large numbers)
Not handling the all-9s case (999 -> 1000 needs a new digit)
Processing left-to-right instead of right-to-left (you need the carry from less significant digits first)
Using integer addition on the array treating it as a number (overflow risk for large inputs)

Edge Cases

Single digit [0] -> [1]
Single digit [9] -> [1, 0]
All 9s [9, 9, 9] -> [1, 0, 0, 0]
No carry needed [1, 2, 3] -> [1, 2, 4]
Carry stops in middle [1, 2, 9] -> [1, 3, 0]

Connections

Add Binary (#67) - same carry propagation, but base 2 instead of base 10
Multiply Strings (#43) - digit-by-digit processing but with two numbers
Integer to English Words (#273) - inverse problem, number to digit array
Excel Sheet Column Number (#171) - related base conversion problem

Pow(x, n) #50

Binary Exponentiation (Exponentiation by Squaring)

Intuition

Think of exponentiation like a nuclear chain reaction or signal propagation. If you want x^16, you don't multiply x by itself 16 times - you double: x→x²→x⁴→x⁸→x¹⁶. Each step squares the previous result. This is 'exponentiation by squaring' - exploiting the fact that (x²)² = x⁴, and so on. The exponent acts like a 'signal' that propagates through this doubling process. For odd exponents, we first extract one factor of x, then handle the remaining even part. Negative exponents just mean 'divide instead of multiply' - you're asking 'how much of xⁿ equals 1?'

Why This Pattern?

The exponent n can be represented in binary. Each bit represents whether we include that power of 2 in our final product. Mathematically: x^n = (x^(2^0))^b₀ × (x^(2^1))^b₁ × ... where bᵢ are the bits of n. This transforms O(n) multiplications into O(log n) by squaring the base and halving the exponent at each step.

Solution

def myPow(x: float, n: int) -> float:
    # Handle negative exponent: x^(-n) = 1 / x^n
    if n < 0:
        x = 1 / x
        n = -n
    
    result = 1.0
    
    # Binary exponentiation: process each bit of n
    while n > 0:
        # If current bit is 1, multiply result by current power of x
        if n & 1:
            result *= x
        
        # Square x for next bit position (move to next power of 2)
        x *= x
        
        # Move to next bit (halve n)
        n >>= 1
    
    return result

Complexity

Time: O(log n)
Space: O(1)

We process one bit of the exponent per iteration, and the number of bits in n is log₂(n). Each iteration does constant work (a couple multiplications), so total is O(log n). We can't do better because we need to compute all 2^k powers where k is the number of bits - that's Θ(log n) distinct values.

Common Mistakes

Forgetting to handle negative exponents - n can be negative, requiring reciprocal
Not handling n=0 case (correctly returns 1.0)
Using floating point equality for edge cases like x=1 or x=-1
Overflow issues with very large exponents - the squaring can quickly exceed float range
Not converting n to positive before processing when n is negative

Edge Cases

n = 0 returns 1 (anything to power 0 is 1)
x = 0 with positive n returns 0
x = 0 with negative n returns division by zero (infinity)
x = 1 returns 1 for any n
x = -1 returns 1 if n is even, -1 if n is odd
Very large n (like 2^31) - need to handle int overflow in some languages
n = -2^31 - after negating, might overflow in Python but fine since Python has arbitrary precision

Connections

This is the foundation for many 'power' problems in the Neetcode 150
Similar to 'Fast Power' in CTCI - same technique, different framing
Used as subroutine in problems like 'Super Pow' (LeetCode 372) which extends this to modular exponentiation
The divide-and-conquer structure mirrors other log-n problems like 'Merge Sort' and 'Binary Search'
Understanding this pattern helps with cryptographic algorithms (RSA uses modular exponentiation)

Rotate Image #48

Layer-by-layer 4-way swap with in-place rotation

Intuition

Imagine you're rotating a physical photo frame 90° clockwise. The top-left corner moves to top-right, top-right to bottom-right, and so on. Here's the key insight: instead of thinking about individual elements, think about concentric RINGS (or layers). For each ring, we can rotate 4 elements at a time in a cycle. Picture a square with corners labeled A, B, C, D going clockwise. After rotation: A→B→C→D→A. The element at position (row, col) moves to (col, n-1-row). This is like a 4-way dance where each element hands off its value to the next position.

Why This Pattern?

The matrix has symmetry along its center. By processing from the outermost layer inward, each element participates in exactly one 4-element cycle. This guarantees O(1) space because we only need one temp variable. The coordinate transformation (i,j) → (j, n-1-i) is deterministic, so we can compute exact swap destinations mathematically.

Solution

class Solution:
    def rotate(self, matrix: List[List[int]]) -> None:
        """
        Rotates matrix 90 degrees clockwise in-place.
        Uses layer-by-layer approach: for each layer, rotate 4 elements
        in a cycle (top-left → top-right → bottom-right → bottom-left → top-left)
        """
        n = len(matrix)
        
        # Process each layer from outer to inner
        # For n=4: layers 0 and 1 (0-indexed)
        for layer in range(n // 2):
            first = layer
            last = n - 1 - layer
            
            # For each element in current layer (excluding the last one of each side)
            for i in range(first, last):
                offset = i - first
                
                # Save top-left (will be overwritten)
                top = matrix[first][i]
                
                # Left → Top
                matrix[first][i] = matrix[last - offset][first]
                
                # Bottom → Left
                matrix[last - offset][first] = matrix[last][last - offset]
                
                # Right → Bottom
                matrix[last][last - offset] = matrix[i][last]
                
                # Top → Right (using saved value)
                matrix[i][last] = top

Complexity

Time: O(n²)
Space: O(1)

Common Mistakes

Confusing clockwise vs counterclockwise rotation — check your coordinate transformation
Using range(first, last+1) instead of range(first, last) — causes double rotation of corners
Forgetting to recalculate offset correctly when computing positions
Trying to use 2D array methods that don't exist in Python

Edge Cases

1x1 matrix: no rotation needed, loop doesn't execute (n//2 = 0)
2x2 matrix: exactly one layer, rotates correctly
Odd dimensions: center element stays in place (not in any layer cycle)
Already rotated or identity matrix: algorithm handles all correctly

Connections

Transpose + Reverse approach: Same problem solved by transposing (matrix[i][j] ↔ matrix[j][i]) then reversing each row — simpler but same O(n²)
Spiral Matrix (LeetCode 54): Uses similar layer iteration to extract matrix elements
Reverse Diagonal / Diagonal Flip problems: Share the transpose concept
In-place array rotation: Similar in-place swap pattern used in rotating arrays

Set Matrix Zeroes #73

In-place State Preservation using matrix boundaries as markers

Intuition

Think of this like contamination detection. Each 0 is a 'contaminant' that needs to spread along its entire row and column. The challenge: you need to remember which rows/columns are contaminated WITHOUT losing information as you go. It's like taking notes while reading - you can't erase what you've read. The trick is to use the matrix's own edges (first row and first column) as a 'todo list' - marking which rows/columns need zeroing without actually zeroing them yet. Once you've found ALL the zeros, THEN you systematically clear the marked rows and columns.

Why This Pattern?

The matrix already has a natural 'edge' structure - the first row and first column. Instead of using extra space to remember which rows/columns have zeros, we exploit the matrix structure itself. Any zero at position (i,j) marks its row's first cell and column's first cell. This creates a distributed 'signature' of contamination. After scanning the entire matrix, we use these markers to zero everything in one pass.

Solution

def setZeroes(matrix):
    if not matrix or not matrix[0]:
        return
    
    m, n = len(matrix), len(matrix[0])
    
    # Step 1: Check if first row/col need to be zeroed (we'll overwrite them later)
    first_row_zero = any(matrix[0][j] == 0 for j in range(n))
    first_col_zero = any(matrix[i][0] == 0 for i in range(m))
    
    # Step 2: Use first row and first column as markers
    # If cell (i,j) is 0, mark its row-start and col-start to 0
    for i in range(1, m):
        for j in range(1, n):
            if matrix[i][j] == 0:
                matrix[i][0] = 0  # Mark this row for zeroing
                matrix[0][j] = 0  # Mark this column for zeroing
    
    # Step 3: Zero out cells based on markers (skip first row/col)
    for i in range(1, m):
        for j in range(1, n):
            if matrix[i][0] == 0 or matrix[0][j] == 0:
                matrix[i][j] = 0
    
    # Step 4: Zero out first column if needed
    if first_col_zero:
        for i in range(m):
            matrix[i][0] = 0
    
    # Step 5: Zero out first row if needed
    if first_row_zero:
        for j in range(n):
            matrix[0][j] = 0

Complexity

Time: O(m * n) - We traverse the matrix a constant number of times (3 passes: marking, zeroing, edge handling). Each cell is visited O(1) times.
Space: O(1) - Only using a few boolean variables, no extra data structures proportional to matrix size.

We can't do better than O(m*n) because potentially every cell needs to be examined (to find zeros) AND potentially every cell needs to be set to zero. That's Θ(mn) work minimum. For space, we exploit the matrix's own boundary as storage - we 'pay' with the first row/column being temporarily unusable as data, which is the price of O(1) extra space.

Common Mistakes

Overwriting first row/col too early - you lose the ability to detect zeros in those rows/cols
Forgetting to handle the case where first row or first column itself contains a zero
Using matrix[i][0] and matrix[0][j] before checking they're not the first row/col (causing cascading zeros)

Edge Cases

Empty matrix or single row/column
Matrix with no zeros - should remain unchanged
Matrix where entire first row or first column is zeros
Single cell matrix (0 or 1)

Connections

Rotate Image - uses matrix boundary manipulation similarly
Search a 2D Matrix II - exploits first row/col properties
This is the in-place variant of the simpler 'brute force' approach that uses O(m+n) extra space with sets for rows/cols - trade-off between time and space

Spiral Matrix #54

Boundary Traversal / Layer-by-layer peeling

Intuition

Think of a snail crawling through the matrix starting from the top-left corner. It crawls right until it hits a wall, then turns down, then left, then up, and keeps spiraling inward. Each direction change happens when you either hit the matrix boundary OR hit a cell you've already visited. It's like walking along the edges of an onion, peeling off one layer at a time, then moving to the next inner layer.

Why This Pattern?

The spiral order naturally decomposes the matrix into concentric 'shells' or layers. Each complete cycle around the perimeter visits all boundary elements exactly once before moving to the next inner layer. The structure of spiral order IS this boundary-following behavior - there's no shorter path because you must visit every outer cell before accessing inner cells.

Solution

def spiralOrder(matrix):
    if not matrix or not matrix[0]:
        return []
    
    result = []
    m, n = len(matrix), len(matrix[0])
    # Define the current rectangle's boundaries
    top, bottom = 0, m - 1
    left, right = 0, n - 1
    
    while top <= bottom and left <= right:
        # 1. Traverse RIGHT along the top row
        for col in range(left, right + 1):
            result.append(matrix[top][col])
        top += 1  # Top boundary done, shrink inward
        
        # 2. Traverse DOWN along the rightmost column
        for row in range(top, bottom + 1):
            result.append(matrix[row][right])
        right -= 1  # Right boundary done
        
        # 3. Traverse LEFT along the bottom row (if rows remain)
        if top <= bottom:
            for col in range(right, left - 1, -1):
                result.append(matrix[bottom][col])
            bottom -= 1
        
        # 4. Traverse UP along the leftmost column (if columns remain)
        if left <= right:
            for row in range(bottom, top - 1, -1):
                result.append(matrix[row][left])
            left += 1
    
    return result

Complexity

Time: O(m * n)
Space: O(1) excluding output array

Common Mistakes

Forgetting to check if rows/cols remain before traversing bottom/left - causes duplicate traversal in single-row or single-column scenarios
Not shrinking boundaries after each pass - leads to infinite loop
Using wrong range endpoints - off-by-one errors on boundary inclusive/exclusive

Edge Cases

Empty matrix [] - return []
Single element [[5]] - returns [5]
Single row [1,2,3] - works correctly, bottom traversal skipped
Single column [[1],[2],[3]] - right and bottom traversal skipped correctly
1x2 or 2x1 matrices - handles asymmetric cases

Connections

Rotate Image (LeetCode 48) - uses similar boundary shrinking technique but in-place
Diagonal Traverse - another matrix traversal pattern but with different rule
Set Matrix Zeroes - also uses boundary/region detection

Bit Manipulation (7)

Counting Bits #338

Dynamic Programming on binary representation. The recurrence relation is: dp[n] = dp[n >> 1] + (n & 1).

Intuition

Think of binary numbers as a tree where each number's 'parent' is itself right-shifted by 1 (dividing by 2). A number's bit count = its parent's bit count + the bit that was removed. If n is even (ends in 0), we just added a 0 bit, so count stays the same. If n is odd (ends in 1), we added a 1 bit, so count increases by 1. This is like signal propagation through a binary tree - each child inherits the 'energy' (bit count) of its parent plus what it picked up on the way down.

Why This Pattern?

The fundamental property of binary: right-shifting divides by 2 (floor), dropping the least significant bit. The number of 1-bits in n equals the number in n//2 plus whatever the LSB contributes (0 if even, 1 if odd). This creates an obvious subproblem structure - smaller numbers help build larger ones.

Solution

def countBits(n):
    # dp[i] = number of 1-bits in i
    dp = [0] * (n + 1)
    
    # Base case: dp[0] = 0 (already initialized)
    # For each number from 1 to n
    for i in range(1, n + 1):
        # i >> 1 = i // 2 (right shift drops LSB)
        # i & 1 = i % 2 (isolates LSB: 0 if even, 1 if odd)
        dp[i] = dp[i >> 1] + (i & 1)
    
    return dp

Complexity

Time: O(n) - We compute and store the result for each integer from 0 to n exactly once. There's no way to do better because we need to return n+1 results.
Space: O(n) - We need to store the result for each number to reference when building larger numbers. Could be O(1) if we printed results on-the-fly instead of storing all.

We must produce n+1 outputs (one for each number from 0 to n), so O(n) time is optimal. The DP array of size n+1 is necessary because each number's answer depends on potentially any smaller number (specifically n//2).

Common Mistakes

Forgetting dp[0] = 0 as the base case - the recurrence won't work without it
Using i // 2 and i % 2 instead of bit operations - works but defeats the purpose of practicing bit manipulation
Not initializing the dp array with correct size (n+1) - causes index errors

Edge Cases

n = 0 returns [0] - must handle empty/single case
n = 1 returns [0, 1] - validates even/odd logic works
Large n = 100000 - ensure solution scales
n being a power of 2 - e.g., n=8 returns [0,1,1,2,1,2,2,3,1] where index 8 has only 1 bit

Connections

Same recurrence relation used in 'Number of 1 Bits' (LeetCode 191) - but that finds for single number while this finds for all 0 to n
Related to 'Power of Two' (LeetCode 231) - understanding bit properties helps with bit manipulation problems
Uses same DP-on-previous-state pattern as 'Climbing Stairs' but with binary bit relationships instead of arithmetic
Can be solved alternatively using Brian Kernighan's algorithm in a nested loop, but DP is more elegant

Missing Number #268

XOR Identity / Sum Conservation

Intuition

Think of this like a conservation law. We know the complete set should be 0 through n, but one number is 'leaking' out. If we add up what we SHOULD have (the sum 0+1+2+...+n) and subtract what we ACTUALLY have, the difference is exactly the missing number - like measuring a water leak by comparing expected and actual volume. Alternatively, think of XOR as a 'cancellation' operation: every number that appears twice cancels to 0, leaving only the missing number (which appears once) as the surviving signal.

Why This Pattern?

This problem has a complete set [0,n] with exactly one element removed. Both XOR and sum have inverses: x ^ x = 0 and x - x = 0. This means we can 'cancel out' all the numbers that are present in the array, leaving only the missing one. The structural property is that we know the exact universe of possible values but one is absent - making identity operations the natural tool.

Solution

def missingNumber(nums):
    # Approach 1: XOR (preferred - no overflow risk in Python)
    # XOR all numbers from 0 to n, then XOR with all array elements
    # Each present number appears twice and cancels to 0
    # Only the missing number survives
    result = 0
    for i in range(len(nums) + 1):
        result ^= i  # XOR with expected range
    for num in nums:
        result ^= num  # XOR out what's actually there
    return result

# Alternative (sum method) - simpler conceptually:
# def missingNumber(nums):
#     n = len(nums)
#     expected_sum = n * (n + 1) // 2  # Sum of 0 to n
#     return expected_sum - sum(nums)

Complexity

Time: O(n)
Space: O(1)

We must touch each element in the array exactly once to either XOR it or sum it. There's no way to find the missing number without checking all inputs - it's essentially counting n items. The O(1) extra space comes from only needing a single accumulator variable regardless of input size.

Common Mistakes

Confusing n (array length) with the max value (which is n, not n-1)
Using sum method in languages with integer overflow (not an issue in Python)
Forgetting the XOR approach requires iterating through both the range and the array

Edge Cases

Empty array [] returns 0 (the missing number from [0])
Array with single element [0] returns 1
Array [1] returns 0
When the missing number is n itself (array is 0 to n-1)

Connections

Same core insight as 'Single Number' (LeetCode 136) - using XOR to find unpaired element
Similar 'find the odd one out' logic as 'Find the Duplicate Number' but without modifying array
Generalizes to any arithmetic series where one element is missing

Number of 1 Bits #191

Bit Manipulation - Remove Rightmost Set Bit

Intuition

Think of each set bit as an 'energy source' in a system. The trick `n & (n-1)` acts like a drain that removes exactly one source per operation — specifically the rightmost one. When you subtract 1 from a number, all bits to the right of the rightmost 1 flip (0↔1), and that rightmost 1 becomes 0. When you AND the result with the original number, those flipped bits become 0, effectively 'draining' that one source. You keep draining until the system has no energy left (n becomes 0), and the number of drains equals your answer. This is like counting items in a system by systematically removing them one at a time rather than inspecting every possible location.

Why This Pattern?

The property that `n & (n-1)` always removes exactly one set bit is structural — it exploits how binary subtraction works. This pattern is natural because we only iterate as many times as there are 1-bits, not 32 times. Each iteration deterministically removes one known set bit.

Solution

class Solution:
    def hammingWeight(self, n: int) -> int:
        """
        Count the number of '1' bits in the binary representation of n.
        Uses the n & (n-1) trick to remove rightmost set bit each iteration.
        """
        count = 0
        while n:
            # Remove the rightmost set bit: flips bits after rightmost 1 to 1s,
            # turns rightmost 1 to 0, then AND with original clears all those bits
            n = n & (n - 1)
            count += 1
        return count

Complexity

Time: O(k) where k is the number of set bits (at most 32 for 32-bit integers)
Space: O(1)

We only loop once per set bit rather than checking all 32 bit positions. In the worst case (n = 0xFFFFFFFF), we iterate 32 times; in the best case (n = 0), we iterate 0 times. This is optimal because we must at minimum examine each set bit to count it.

Common Mistakes

Using n & -n instead of n & (n-1) — the former gives the rightmost SET bit, not removes it
Forgetting Python's arbitrary precision integers: need to mask with 0xFFFFFFFF for unsigned 32-bit behavior
Using n -= 1 instead of n = n & (n-1) — this would be O(k * max_bit_value) instead of O(k)

Edge Cases

n = 0 returns 0 (no set bits)
n = 1 returns 1 (only LSB set)
Python negative numbers have infinite leading 1s in two's complement — must mask if given signed input
n = 0xFFFFFFFF (all 32 bits set) returns 32

Connections

191. Number of 1 Bits uses same core insight as 338. Counting Bits (dynamic programming version uses this trick)
191 relates to 231. Power of Two — checking if n & (n-1) == 0 tells you if exactly one bit is set
191 is the fundamental operation used in 191's harder variant 2013. Count Univalue Subtrees counts via bit operations on node values

Reverse Bits #190

Bit-by-bit extraction and reconstruction

Intuition

Think of the 32 bits as a line of 32 dominoes. Each domino is either standing (1) or fallen (0). 'Reversing bits' is like reflecting this line in a mirror — the leftmost domino becomes the rightmost, the second-from-left becomes second-from-right, and so on. The position transformation is simple: bit at position i moves to position 31-i. Another way: we're 'reading the binary number backwards' — the least significant bit becomes the most significant, and vice versa. This is fundamentally a positional remapping problem.

Why This Pattern?

Reversing is inherently an order-reversal operation. By extracting each bit one at a time (from LSB toward MSB) and building the result in the opposite order (placing each extracted bit from MSB toward LSB), we naturally achieve the reversal. This is the 'dual-pointer' technique applied to bit positions instead of array indices.

Solution

def reverseBits(n):
    result = 0
    for i in range(32):
        # Extract the i-th bit from the original number (working right to left)
        bit = (n >> i) & 1
        # Place it in the mirrored position (31-i) in result (working left to right)
        # We OR because result already has bits from previous iterations
        result = result | (bit << (31 - i))
    return result

Complexity

Time: O(1)
Space: O(1)

We iterate exactly 32 times (once per bit), regardless of input value. This is constant because the input is always a fixed 32-bit integer. Each iteration does constant-time bit operations. No data structures grow with input size.

Common Mistakes

Using signed right shift (>>) instead of unsigned (>>>) in Java — this propagates the sign bit and breaks the algorithm
Forgetting the loop runs 32 times (not 31) — position indices go 0-31
Not handling the case where input is 0 — the algorithm works but some forget to test it
Confusing bit positions: i is the position in original, (31-i) is where it goes in result

Edge Cases

n = 0 (all zeros, should return 0)
n = 0xFFFFFFFF (all ones, should return 0xFFFFFFFF)
n = 1 (only LSB set, should return 0x80000000 = 2^31)
n = 0x40000000 (only bit 30 set, should return 2^1 = 2)

Connections

This is the hardware-level operation underlying byte swapping (used in network protocols like IP)
Related to 'Number of 1 Bits' (LeetCode 191) — both iterate through bits systematically
Uses same bit-extraction technique as 'Power of Two' (LeetCode 231) to check individual bits
Conceptually similar to 'Rotate Array' (LeetCode 189) — both involve reversing order in a sequence

Reverse Integer #7

Digit-by-digit accumulation with overflow guards

Intuition

Think of reversing an integer like stacking rings on a pole. Each new digit taken from the original number gets placed on top, pushing everything else down one position. The overflow problem is like checking whether the pole can support another ring before adding it. If you're building up a number and the current result is already greater than INT_MAX/10, adding another digit (even a 0) would overflow. If the result equals INT_MAX/10, you can only add digits 0-7 (the last digit of INT_MAX is 7). This is exactly how you'd check if a water glass will overflow: if there's already 9 ounces in a 10-ounce glass, adding any more spills. But if there's 8 ounces, you can add up to 2 more safely.

Why This Pattern?

The problem has a hard boundary (32-bit signed integer range), and we're building the result incrementally. This means we can detect overflow at each step before it happens, rather than computing first and checking afterward (which would already be too late).

Solution

class Solution:
    def reverse(self, x: int) -> int:
        # Handle negative numbers by working with positive, restore sign at end
        sign = -1 if x < 0 else 1
        x = abs(x)
        
        result = 0
        while x > 0:
            # Take the rightmost digit
            digit = x % 10
            
            # Check for overflow BEFORE adding:
            # If result > INT_MAX / 10, multiplying by 10 would overflow
            # If result == INT_MAX / 10 and digit > 7, adding would overflow
            # (INT_MAX = 2147483647, so last digit could be 0-7)
            if result > 2147483647 // 10 or (result == 2147483647 // 10 and digit > 7):
                return 0
            
            # Add digit to result (shifting existing digits left)
            result = result * 10 + digit
            
            # Remove the processed digit
            x //= 10
        
        return sign * result

Complexity

Time: O(d) where d is the number of digits in the input (at most 10 for 32-bit integers)
Space: O(1) - only a fixed number of integer variables used regardless of input size

We process each digit exactly once, so time is proportional to digit count. The digit count is bounded by 10 (for 32-bit), making this effectively O(1) in the worst case. Space is constant because we're just storing a few integers, not building any data structure that grows with input.

Common Mistakes

Using 2^31 instead of 2147483647 (2^31 - 1) for the boundary check
Not checking overflow before the multiplication/addition operation - by then it's too late
Forgetting to handle the sign (negative numbers)
Treating this as a string problem and converting, which misses the overflow detection point
Using Python's arbitrary precision integers to compute first and check afterward - that defeats the purpose

Edge Cases

x = 0 returns 0 (trivial case)
x = 1534236469 reverses to 9646324351 which overflows, should return 0
x = -123 should return -321 (not 321-)
x = 120 returns 21 (not 021)
x = 1000000009 reverses to 9000000001 which overflows

Connections

Palindrome Number (#9) - uses similar digit extraction logic
String to Integer (#8) - overflow checking is critical in both
This is the digit-manipulation foundation for many mathematical LeetCode problems

Single Number #136

XOR Cancellation / Bit Manipulation

Intuition

Imagine each number as a particle, and pairs of identical numbers as matter and antimatter - when they meet, they annihilate completely (become 0). XOR is the mathematical operation that does exactly this: any number XORed with itself gives 0, and any number XORed with 0 gives itself. So if we line up all the numbers and XOR them together, the pairs annihilate each other, leaving only the single number standing. This is like a 'conservation law' for bits - pairs cancel out at each bit position.

Why This Pattern?

The problem has a specific structural property: every element appears exactly twice except one. This paired repetition structure is what XOR is perfectly designed to handle. XOR has three key properties that match this problem: (1) a ^ a = 0 - paired elements cancel to zero, (2) a ^ 0 = a - the single element remains, (3) XOR is commutative and associative so order doesn't matter. This is the most elegant solution because it exploits a fundamental mathematical property rather than brute force.

Solution

def singleNumber(nums):
    # Start with 0 because x ^ 0 = x (identity property)
    result = 0
    # XOR all numbers together - pairs cancel out, single remains
    for num in nums:
        result ^= num
    return result

Complexity

Time: O(n)
Space: O(1)

We must examine each of the n elements at least once to find the unique one - there's no way around this. For space, we only store a single integer (result) regardless of input size, which is the minimum possible since we need to output something.

Common Mistakes

Using addition instead of XOR - fails with negative numbers since addition doesn't cancel properly
Not initializing result to 0 - but actually this works because 0 ^ num = num
Using a hash set which works but wastes O(n) space - missing the elegant bit solution

Edge Cases

Single element array - returns that element immediately
All pairs at the beginning, single at end - works regardless of position
Negative numbers - XOR handles two's complement representation correctly
Empty array not possible per problem constraints (non-empty guaranteed)

Connections

Single Number II (137) - same idea but each element appears 3 times, requires more complex bit manipulation
Single Number III (260) - uses same XOR foundation but adds bit-shifting to separate two singletons
Finding the duplicate number - can use XOR trick with index-value relationship

Sum of Two Integers #371

Iterative Bitwise Carry Propagation

Intuition

Think of binary addition like water flowing in connected containers. When you add 1+1 at any bit position, you get 0 there but create a 'overflow' (carry) that flows to the next position. XOR tells you what each bit sums to WITHOUT considering carries (1+1=0, 1+0=1, 0+1=1, 0+0=0). AND tells you where carries are created (only 1+1 produces a carry). You then shift the carries left and repeat until no carries remain. This is like a ripple propagating through the system until it reaches equilibrium.

Why This Pattern?

The problem forces us to simulate CPU-level addition. At each bit position, two independent operations happen in parallel: XOR computes the sum, AND identifies carries. The carry must propagate to higher bits, creating a feedback loop that continues until the system stabilizes (no carries left). This is the natural hardware algorithm.

Solution

def getSum(a: int, b: int) -> int:
    # Simulate 32-bit signed integer arithmetic without using + or -
    # 
    # Core insight: XOR = sum without carries, AND << 1 = carries
    # Iterate until carries propagate to nothing (b becomes 0)
    
    MASK = 0xFFFFFFFF        # 32-bit mask
    MAX_INT = 0x7FFFFFFF     # Max positive 32-bit signed int
    
    # Work with 32-bit unsigned integers
    while b != 0:
        # XOR: sum WITHOUT considering carries
        # Example: 1^1=0, 1^0=1, 0^1=1, 0^0=0 (exact binary addition truth table!)
        sum_without_carry = (a ^ b)
        
        # AND then shift: positions where BOTH are 1 generate a carry
        # The carry flows to the NEXT bit (left shift by 1)
        carry = ((a & b) << 1) & MASK
        
        # Update for next iteration
        a = sum_without_carry & MASK  # Keep within 32 bits
        b = carry
    
    # Convert from unsigned back to signed if needed
    # If the 32nd bit is set, we have a negative number
    return a if a <= MAX_INT else ~(a ^ MASK)

Complexity

Time: O(log(min(|a|, |b|)))
Space: O(1)

Each iteration eliminates at least one carry bit (the rightmost one). Since carries can propagate through all 32 bits, we need at most 32 iterations for 32-bit integers. The number of iterations equals the number of carry propagation cycles needed, which is bounded by the bit-width.

Common Mistakes

Forgetting the 32-bit mask in Python - unlike C/Java, Python ints have unlimited precision, so overflow bugs won't manifest naturally
Not handling negative numbers correctly when converting back from unsigned representation
Using a instead of b for the carry check in the while loop condition
Forgetting the & MASK on the carry calculation itself

Edge Cases

a=0 or b=0 (should return the other number directly)
a=-1, b=1 (maximum carries that propagate through all bits)
Both negative numbers (need correct sign extension)
Maximum negative: a=-2147483648 (INT_MIN)

Connections

LeetCode 191 'Number of 1 Bits' - uses AND to check bits, same fundamental operation
LeetCode 137 'Single Number II' - XOR used to cancel paired bits, similar bit manipulation mindset
LeetCode 201 'Bitwise AND of Numbers Range' - carry propagation in reverse (AND instead of XOR/carry)
This is essentially what hardware adders do - understanding this helps with any bit manipulation problem

Arrays & Hashing (9)

Intuition

Why This Pattern?

Solution

Complexity

Common Mistakes

Edge Cases

Connections

Intuition

Why This Pattern?

Solution

Complexity

Common Mistakes

Edge Cases

Connections

Intuition

Why This Pattern?

Solution

Complexity

Common Mistakes

Edge Cases

Connections

Intuition

Why This Pattern?

Solution

Complexity

Common Mistakes

Edge Cases

Connections

Intuition

Why This Pattern?

Solution

Complexity

Common Mistakes

Edge Cases

Connections

Intuition

Why This Pattern?

Solution

Complexity

Common Mistakes

Edge Cases

Connections

Intuition

Why This Pattern?

Solution

Complexity

Common Mistakes

Edge Cases

Connections

Intuition

Why This Pattern?

Solution

Complexity

Common Mistakes

Edge Cases

Connections

Intuition

Why This Pattern?

Solution

Complexity

Common Mistakes

Edge Cases

Connections

Two Pointers (5)

Intuition

Why This Pattern?

Solution

Complexity

Common Mistakes

Edge Cases

Connections

Intuition

Why This Pattern?

Solution

Complexity

Common Mistakes

Edge Cases

Connections