Fading Coder

One Final Commit for the Last Sprint

Home > Tech > Content

Understanding Backtracking Algorithms

Tech 2

13.1 Backtracking Algorithm

The "backtracking algorithm" is a problem-solving method that relies on exhaustive search. Its core concept is to start from an initial state and use brute force to search for all possible solutions. When a correct solution is found, it is recorded. The process continues until a solution is found or until all possible choices have been exhausted without finding a solution.

Backtracking algorithms typically employ "Depth-First Search" (DFS) to traverse the solution space. In the chapter on binary trees, we mentioned that preorder, inorder, and postorder traversals are all forms of DFS. Next, we will use preorder traversal to construct a backtracking problem to gradually understend how the algorithm works.

Example Problem 1

Given a binary tree, search for and record all nodes with a value of 7. Return a list of these nodes.

For this problem, we perform a preorder traversal of the tree. At each step, we check if the current node's value is 7. If it is, we add that node to our result list results. This process is illustrated in Figure 13-1 and the code below.

// === File: tree_search.c ===
/* Preorder Traversal: Example 1 */
void traverseTree(TreeNode* node) {
    if (node == NULL) {
        return;
    }
    if (node->data == 7) {
        // Record the solution
        results[(*result_count)++] = node;
    }
    traverseTree(node->left_child);
    traverseTree(node->right_child);
}

13.1.1 Attempt and Retraction

The reason it is called a backtracking algorithm is that it adopts a strategy of "attempt" and "retraction" (backtrack) when searching the solution space. When the algorithm encounters a state where it cannot proceed further or find a solution that satisfies the conditions, it undoes the previous choice, returns to the previous state, and tries other possible choices.

For Example 1, visiting each node represents an "attempt," and the return after passing a leaf node or returning to the parent node represents "retraction." It is worth noting that retraction is not limited to just function returns. To explain this, let's expand slightly on Example 1.

Example Problem 2

Search for all nodes with a value of 7 in a binary tree and return the paths from the root node to these nodes.

Building on the code from Example 1, we need a list current_path to record the path of visited nodes. When a node with a value of 7 is accessed, we copy current_path and add it to the result list results. After traversal is complete, results will contain all solutions.

// === File: path_search.c ===
/* Preorder Traversal: Example 2 */
void findPaths(TreeNode* node) {
    if (node == NULL) {
        return;
    }
    // Attempt: Add node to path
    current_path[path_len++] = node;
    
    if (node->data == 7) {
        // Record the solution
        for (int i = 0; i < path_len; ++i) {
            results[*result_count][i] = current_path[i];
        }
        (*result_count)++;
    }
    
    findPaths(node->left_child);
    findPaths(node->right_child);
    
    // Retract: Remove node from path
    path_len--;
}

In each "attempt," we record the path by adding the current node to current_path. Before "retracting," we need to pop that node from current_path to restore the state to what it was before this attempt.

Observing the process shown in Figure 13-2, we can understand attempt and retraction as "advance" and "undo," two operations that are inverse to each other.

13.1.2 Pruning

Complex backtracking problems usually involve one or more constraints. These constraints are typically used for "pruning."

Example Problem 3

Search for all nodes with a value of 7 in a binary tree and return the paths from the root node to these nodes, with the requirement that the path must not contain any nodes with a value of 3.

To satisfy this constraint, we need to add a pruning operation: during the search, if we encounter a node with a value of 3, we return immediately and stop searching further down that branch.

// === File: constrained_path_search.c ===
/* Preorder Traversal: Example 3 */
void findValidPaths(TreeNode* node) {
    // Pruning: Skip invalid nodes or nodes with value 3
    if (node == NULL || node->data == 3) {
        return;
    }
    // Attempt
    current_path[path_len++] = node;
    
    if (node->data == 7) {
        // Record the solution
        for (int i = 0; i < path_len; i++) {
            results[*result_count][i] = current_path[i];
        }
        (*result_count)++;
    }
    
    findValidPaths(node->left_child);
    findValidPaths(node->right_child);
    
    // Retract
    path_len--;
}

Pruning is a very apt metaphor. As shown in Figure 13-3, during the search process, we "cut off" branches that do not meet the constraints, avoiding many meaningless attempts and thereby improving search efficiency.

13.1.3 Framework Code

Next, we will attempt to extract the main framework of "attempt, retraction, and pruning" to improve code versatility.

In the following framework code, current_state represents the current state of the problem, and available_options represents the choices that can be made in the current state.

/* Backtracking Algorithm Framework */
void solve(State* current_state, Option* available_options, int option_count, State* solutions, int* solution_count) {
    // Check if it is a valid solution
    if (isSolution(current_state)) {
        // Record the solution
        recordSolution(current_state, solutions, solution_count);
        // Stop searching further (optional, depends on problem)
        return;
    }
    
    // Iterate through all options
    for (int i = 0; i < option_count; i++) {
        // Pruning: Check if the choice is valid
        if (isValid(current_state, &available_options[i])) {
            // Attempt: Make a choice, update state
            makeChoice(current_state, &available_options[i]);
            solve(current_state, available_options, option_count, solutions, solution_count);
            // Retract: Undo choice, revert to previous state
            undoChoice(current_state, &available_options[i]);
        }
    }
}

Next, let's solve Example Problem 3 based on this framework. The state current_state is the traversal path, the choices available_options are the left and right child nodes, and the result solutions is the list of paths.

// === File: template_path_search.c ===
/* Determine if current state is a solution */
bool isTargetFound(void) {
    return path_len > 0 && current_path[path_len - 1]->data == 7;
}

/* Record solution */
void saveCurrentSolution(void) {
    for (int i = 0; i < path_len; i++) {
        results[*result_count][i] = current_path[i];
    }
    (*result_count)++;
}

/* Check if the choice is valid under current state */
bool isChoiceValid(TreeNode* choice) {
    return choice != NULL && choice->data != 3;
}

/* Update state */
void applyChoice(TreeNode* choice) {
    current_path[path_len++] = choice;
}

/* Revert state */
void revertChoice(void) {
    path_len--;
}

/* Backtracking Algorithm: Example 3 */
void backtrack(TreeNode* options[2]) {
    // Check if solution found
    if (isTargetFound()) {
        saveCurrentSolution();
    }
    // Iterate through choices
    for (int i = 0; i < 2; i++) {
        TreeNode* choice = options[i];
        // Pruning: Check validity
        if (isChoiceValid(choice)) {
            // Attempt: Make choice
            applyChoice(choice);
            // Prepare next choices
            TreeNode* next_options[2] = {choice->left_child, choice->right_child};
            backtrack(next_options);
            // Retract: Undo choice
            revertChoice();
        }
    }
}

According to the problem statement, we should continue searching even after finding a node with value 7, so the return statement after recording the solution should be removed. Figure 13-4 compares the search process with and without the return statement.

Compared to the implementation based directly on preorder traversal, the implementation based on the backtracking framework is more verbose but significently more versatile. In fact, many backtracking problems can be solved within this framework. We only need to define current_state and available_options according to the specific problem and implement the various methods in the framework.

13.1.4 Common Terminology

To analyze algorithmic problems more clearly, let's summarize the meanings of common terms in backtracking algorithms, using Example Problem 3 for illustration.

Table 13-1 Common Backtracking Terminology

Term Definition Example Problem 3
Solution An answer that meets specific conditions of the problem; there may be one or multiple. All paths from the root to node 7 that satisfy constraints.
Constraint Conditions that limit the feasibility of solutions; often used for pruning. The path must not contain node 3.
State The situation of the problem at a specific moment, including choices made so far. The currently visited node path, i.e., the current_path list.
Attempt The process of exploring the solution space based on available choices, including making a choice, updating state, and checking if it's a solution. Recursively visiting the left (or right) child, adding the node to current_path, and checking if the node's value is 7.
Backtracking Withdrawing a previous choice and returning to the previous state when encountering a state that does not meet constraints. Terminating the search when passing leaf nodes, finishing node access, or encountering a node with value 3; function returns.
Pruning Avoiding meaningless search paths based on problem characteristics and constraints to improve efficiency. Terminating the search immediately when encountering a node with value 3.

Concepts like problem, solution, and state are general and appear in Divide and Conquer, Backtracking, Dynamic Programming, and Greedy algorithms.

13.1.5 Pros and Cons

The backtracking algorithm is essentially a depth-first search algorithm that tries all possible solutions until it finds one that meets the conditions. Its advantage is that it can find all possible solutions, and with reasonable pruning operations, it can be very efficient.

However, when dealing with large-scale or complex problems, the efficiency of the backtracking algorithm may be unacceptable.

  • Time: Backtracking algorithms usually need to traverse all possibilities in the state space, leading to exponential or factorial time complexity.
  • Space: Recursive calls require saving the current state (e.g., paths, auxiliary variables for pruning). When the recursion depth is large, space demand can become significant.

Despite this, backtracking remains the best solution for certain search problems and constraint satisfaction problems. For these problems, since we cannot predict which choices will generate valid solutions, we must traverse all possibilities. In such cases, the key is efficiency optimization. Common optimization methods include:

  • Pruning: Avoid searching paths that definitely won't yield a solution, saving time and space.
  • Heuristic Search: Introduce strategies or estimates during the search to prioritize paths most likely to produce valid solutions.

13.1.6 Typical Backtracking Problems

Backtracking algorithms are used to solve many search problems, constraint satisfaction problems, and combinatorial optimization problems.

  • Search Problems: The goal is to find a solution meeting specific conditions.
    • Permutation Problem: Given a set, find all possible permutations.
    • Subset Sum Problem: Given a set and a target sum, find all subsets summing to the target.
    • Tower of Hanoi: Move disks between pegs following specific rules.
  • Constraint Satisfaction Problems: The goal is to find a solution satisfying all constraints.
    • N-Queens: Place N queens on an NxN board so they do not attack each other.
    • Sudoku: Fill a 9x9 grid with digits 1-9 with no repeats in rows, columns, or subgrids.
    • Graph Coloring: Color vertices of a graph with the minimum colors so adjacent vertices differ.
  • Combinatorial Optimization Problems: The goal is to find an optimal solution within a combinatorial space.
    • 0-1 Knapsack Problem: Select items to maximize value without exceeding capacity.
    • Traveling Salesman Problem: Find the shortest route visiting all cities and returning to start.
    • Maximum Clique Problem: Find the largest complete subgraph in a graph.

Note that for many combinatorial optimization problems, backtracking is not the optimal solution (e.g., Dynamic Programming for Knapsack, Genetic Algorithms for TSP).

13.2 Permutation Problem

The permutation problem is a typical application of the backtracking algorithm. It involves finding all possible arrangements of elements in a given collection (like an array or string).

Table 13-2 Permutation Examples

Input Array All Permutations
[1] [1]
[1, 2] [1, 2], [2, 1]
[1, 2, 3] [1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]

13.2.1 Case Without Duplicate Elements

Input an integer array with no duplicate elements and return all possible permutations.

From a backtracking perspective, we can view generating permutations as a series of choices. Suppose the input is [1, 2, 3]. If we choose 1, then 3, and finally 2, we get [1, 3, 2]. Backtracking means undoing a choice to try other options.

In code terms, the candidate set available_options is all elements in the input array, and the state current_state is the list of elements chosen so far. Since each element can be chosen only once, elements in current_state must be unique.

Figure 13-5 shows the recursive tree. Each node represents current_state. After three rounds of choices, we reach a leaf node, which corresponds to a permutation.

1. Pruning Duplicate Choices

To ensure each element is chosen only once, we introduce a boolean array used, where used[i] indicates whether options[i] has been selected. We use this for pruning:

  • After choosing options[i], set used[i] to true.
  • When iterating through options, skip any element where used[i] is true.

Figure 13-6 illustrates this. If we choose 1, then 3, then 2, we must prune the branches for 1 in the second round, and branches for 1 and 3 in the third round. This reduces the search space from $O(n^n)$ to $O(n!)$.

2. Code Implementation

We can now fill in the blanks of our framework. To keep the code concise, we will expand the functions directly within the backtrack function.

// === File: permutations_unique.c ===
/* Backtracking Algorithm: Permutations I */
void permute(int* state, int state_len, int* nums, int nums_len, bool* used, int** output, int* output_size) {
    // When state length equals element count, record solution
    if (state_len == nums_len) {
        output[*output_size] = (int*)malloc(nums_len * sizeof(int));
        for (int i = 0; i < nums_len; i++) {
            output[*output_size][i] = state[i];
        }
        (*output_size)++;
        return;
    }
    
    // Iterate all choices
    for (int i = 0; i < nums_len; i++) {
        int num = nums[i];
        // Pruning: Skip used elements
        if (!used[i]) {
            // Attempt: Make choice, update state
            used[i] = true;
            state[state_len] = num;
            // Recursive call
            permute(state, state_len + 1, nums, nums_len, used, output, output_size);
            // Retract: Undo choice, restore state
            used[i] = false;
        }
    }
}

/* Wrapper function for Permutations I */
int** getPermutationsUnique(int* nums, int nums_size, int* return_size) {
    int* state = (int*)malloc(nums_size * sizeof(int));
    bool* used = (bool*)malloc(nums_size * sizeof(bool));
    for (int i = 0; i < nums_size; i++) used[i] = false;
    
    int** results = (int**)malloc(MAX_RES * sizeof(int*));
    *return_size = 0;
    
    permute(state, 0, nums, nums_size, used, results, return_size);
    
    free(state);
    free(used);
    return results;
}

13.2.2 Case With Duplicate Elements

Input an integer array that may contain duplicate elements. Return all unique permutations.

Suppose the input is [1, 1, 2]. To distinguish the two 1s, let's call the second one $1'$.

Figure 13-7 shows that the previous method generates many duplicate permutations.

The most direct way to remove duplicates is using a hash set. However, this is inefficient because generating duplicate permutations is unnecessary; we should identify and prune these branches early.

1. Pruning Equal Elements

Observing Figure 13-8, in the first round, choosing 1 or $1'$ is equivalent. All resulting permutations are duplicates. Therefore, we should prune $1'$.

Essentially, our goal is to ensure that in any given round of selection, multiple equal elements are chosen only once.

2. Code Implementation

We add a hash table/array visited_in_round to record elements tried in the current round to prune duplicates.

// === File: permutations_with_dupes.c ===
/* Backtracking Algorithm: Permutations II */
void permuteDup(int* state, int state_len, int* nums, int nums_len, bool* used, int** output, int* output_size) {
    if (state_len == nums_len) {
        output[*output_size] = (int*)malloc(nums_len * sizeof(int));
        for (int i = 0; i < nums_len; i++) {
            output[*output_size][i] = state[i];
        }
        (*output_size)++;
        return;
    }
    
    // Track duplicates in current round
    bool visited_in_round[MAX_SIZE] = {false};
    
    for (int i = 0; i < nums_len; i++) {
        int num = nums[i];
        // Pruning: Skip used elements OR equal elements already processed in this round
        if (!used[i] && !visited_in_round[num]) {
            visited_in_round[num] = true; // Mark value as processed
            used[i] = true;
            state[state_len] = num;
            
            permuteDup(state, state_len + 1, nums, nums_len, used, output, output_size);
            
            used[i] = false;
        }
    }
}

int** getPermutationsWithDupes(int* nums, int nums_size, int* return_size) {
    int* state = (int*)malloc(nums_size * sizeof(int));
    bool* used = (bool*)malloc(nums_size * sizeof(bool));
    for (int i = 0; i < nums_size; i++) used[i] = false;
    
    int** results = (int**)malloc(MAX_RES * sizeof(int*));
    *return_size = 0;
    
    permuteDup(state, 0, nums, nums_size, used, results, return_size);
    
    free(state);
    free(used);
    return results;
}

Complexity:

  • Time: $O(n! \cdot n)$
  • Space: $O(n^2)$ (stack $O(n)$, used array $O(n)$, visited_in_round at max $O(n)$ per level, potentially $O(n^2)$ total if not optimized strictly for hash set size, though typically considered $O(n)$ for distinct values logic. Let's stick to the text's analysis or standard analysis).

13.3 Subset Sum Problem

13.3.1 Case Without Duplicate Elements

Given an array of positive integers nums and a target positive integer target, find all combinations where the element sum equals target. The array has no duplicate elements, and each element can be selected multiple times. Return the list of combinations without duplicates.

Example: Input {3, 4, 5}, target 9. Solutions: {3, 3, 3}, {4, 5}. Note: 1. Elements can be reused. 2. Order doesn't matter ({4, 5} == {5, 4}).

1. Reference Permutation Solution

Similar to permutasions, we can view subset generation as a series of choices, updating the running sum. When the sum equals target, record the subset.

Unlike permutations, elements can be reused, so we don't need the used array. A naive modification yields:

// === File: subset_sum_naive.c ===
/* Backtracking: Subset Sum I (Naive) */
void findSubsets(int target, int current_sum, int* nums, int nums_size) {
    // Record solution if sum matches
    if (current_sum == target) {
        for (int i = 0; i < current_len; i++) {
            solutions[*sol_count][i] = current_subset[i];
        }
        solution_sizes[(*sol_count)++] = current_len;
        return;
    }
    
    for (int i = 0; i < nums_size; i++) {
        // Pruning: skip if sum exceeds target
        if (current_sum + nums[i] > target) continue;
        
        // Attempt
        current_subset[current_len++] = nums[i];
        findSubsets(target, current_sum + nums[i], nums, nums_size);
        // Retract
        current_len--;
    }
}

Input [3, 4, 5], target 9 gives {[3,3,3], [4,5], [5,4]}. The last two are duplicates because search distinguishes order, but subsets do not.

2. Pruning Duplicate Subsets

To remove duplicates during search, observe Figure 13-11. Duplicates arise from choosing elements in different orders (e.g., 4 then 5 vs 5 then 4). To prevent this, we enforce that the choice of indices must be non-decreasing: if we choose index i, the next choice must be at index j >= i.

3. Code Implementation

We introduce a start_index variable. When we choose element at index i, the next round starts at index i. Additionally:

  1. Sort nums first.
  2. Subtract from target instead of adding to current_sum. If target reaches 0, we found a solution.
// === File: subset_sum_optimized.c ===
/* Comparison function for qsort */
int compareInts(const void* a, const void* b) {
    return (*(int*)a - *(int*)b);
}

/* Backtracking: Subset Sum I */
void solveSubsetSum(int remainder, int* nums, int nums_size, int start_index) {
    // Solution found
    if (remainder == 0) {
        for (int i = 0; i < current_len; ++i) {
            solutions[*sol_count][i] = current_subset[i];
        }
        solution_sizes[(*sol_count)++] = current_len;
        return;
    }
    
    // Traverse choices starting from start_index to avoid duplicate subsets
    for (int i = start_index; i < nums_size; i++) {
        // Pruning: Since nums is sorted, if current number > remainder, break
        if (remainder - nums[i] < 0) {
            break;
        }
        // Attempt
        current_subset[current_len++] = nums[i];
        // Recursive call: pass 'i' to allow reuse of current element
        solveSubsetSum(remainder - nums[i], nums, nums_size, i);
        // Retract
        current_len--;
    }
}

/* Wrapper for Subset Sum I */
void getSubsetsSum(int* nums, int nums_size, int target) {
    qsort(nums, nums_size, sizeof(int), compareInts);
    solveSubsetSum(target, nums, nums_size, 0);
}

13.3.2 Case With Duplicate Elements

Given an array of positive integers nums (may contain duplicates) and a target target, find all unique combinations summing to target. Each element may be used only once.

Example: Input [4, 4, 5], target 9. Output {[4, 5]}.

The previous code might output two {[4, 5]}s because the two 4s are distinct indices. This creates duplicate search branches.

1. Pruning Equal Elements

To fix this, we ensure that in each round of selection, equal elements are chosen only once. Since the array is sorted, equal elements are adjacent. If the current element equals the previous one (and it's not the first element of this round), we skip it.

Additionally, since elements can only be used once, we pass i + 1 as the start index for the next recursion.

2. Code Implementation

// === File: subset_sum_ii.c ===
/* Backtracking: Subset Sum II */
void solveSubsetSumII(int remainder, int* nums, int nums_size, int start_index) {
    if (remainder == 0) {
        for (int i = 0; i < current_len; i++) {
            solutions[*sol_count][i] = current_subset[i];
        }
        solution_sizes[(*sol_count)++] = current_len;
        return;
    }
    
    for (int i = start_index; i < nums_size; i++) {
        // Pruning 1: Exceeds target
        if (remainder - nums[i] < 0) {
            break; // Assuming sorted
        }
        // Pruning 2: Skip duplicates in the same level
        if (i > start_index && nums[i] == nums[i - 1]) {
            continue;
        }
        
        // Attempt
        current_subset[current_len++] = nums[i];
        // Recursive call: pass 'i + 1' because element cannot be reused
        solveSubsetSumII(remainder - nums[i], nums, nums_size, i + 1);
        // Retract
        current_len--;
    }
}

void getSubsetsSumUnique(int* nums, int nums_size, int target) {
    qsort(nums, nums_size, sizeof(int), compareInts);
    solveSubsetSumII(target, nums, nums_size, 0);
}

13.5 Summary

1. Key Takeaways

  • Essence: Backtracking is essentially an exhaustive search method using DFS on the solution space. It records valid solutions until all are found or the search ends.
  • Process: It involves "attempt" and "retraction" (backtracking). It tries choices via DFS; if constraints are violated, it undoes the choice (retracts) and tries others. These are inverse operations.
  • Pruning: Constraints are used for pruning to cut off unnecessary branches, significantly improving efficiency.
  • Applications: Primarily used for search and constraint satisfaction problems. While applicable to combinatorial optimization, other methods (like DP) are often better.
  • Permutations: Use a used array to ensure each element is picked once. With duplicates, use a hash set/array per level to ensure equal elements are picked once per level.
  • Subset Sum: Order doesn't matter. Sort the array and use a start_index to ensure indices are chosen in increasing order ($i_1 \le i_2 \le \dots$), avoiding duplicate subsets. With duplicate inputs, skip adjacent duplicates in the same level.
  • N-Queens: (Mentioned in text) Involves row, column, and diagonal constraints. Usually solved by placing queens row by row.

Relationship between Backtracking and Recursion?

Overall, backtracking is an "algorithm strategy," while recursion is a "tool."

  • Backtracking is often implemented using recursion. It is a specific application of recursion to search problems.
  • Recursion embodies the paradigm of "sub-problem decomposition" and is used in Divide and Conquer, Backtracking, and Dynamic Programming (memoization).
Tags: algorithms

Related Articles

Understanding Strong and Weak References in Java

Strong References Strong reference are the most prevalent type of object referencing in Java. When an object has a strong reference pointing to it, the garbage collector will not reclaim its memory. F...

Comprehensive Guide to SSTI Explained with Payload Bypass Techniques

Introduction Server-Side Template Injection (SSTI) is a vulnerability in web applications where user input is improper handled within the template engine and executed on the server. This exploit can r...

Implement Image Upload Functionality for Django Integrated TinyMCE Editor

Django’s Admin panel is highly user-friendly, and pairing it with TinyMCE, an effective rich text editor, simplifies content management significantly. Combining the two is particular useful for bloggi...

Leave a Comment

Anonymous

◎Feel free to join the discussion and share your thoughts.