Fading Coder

One Final Commit for the Last Sprint

Home > Tech > Content

Sudoku Solver: Data Structures and Search Optimizations

Tech 1

Core Data Structures for Sudoku Solving

When implementing a Sudoku solver, the key decision is how to represent placement constraints. A naive approach uses a 3D boolean array can[i][j][num] to indicate whether a number can be placed at position (i, j). While intuitive, this representation has a critical flaw: each cell constraint is tied to its row, column, and box. During backtracking, when we need to restore state, we cannot easily determine which entries were modified by the current move versus previous ones.

A superior approach uses three separate 2D arrays:

  • row[i][num] — whether num can appear in row i
  • col[j][num] — whether num can appear in column j
  • box[b][num] — whether num can appear in box b

This decomposition makes state restoration trivial: toggling a single entry affects exactly one constraint.

The box index calculation uses the formula:

int getBox(int x, int y) {
    return (x - 1) / 3 * 3 + (y - 1) / 3 + 1;
}

Problem 1: Basic Sudoku (P1784)

This straightforward problem requires finding any valid completion. The naive DFS approach passes comfortably given the problem constraints.

#include <bits/stdc++.h>
using namespace std;

const int SZ = 9;
bool row[11][11], col[11][11], box[11][11];
int grid[11][11];

inline int boxIndex(int r, int c) {
    return (r - 1) / 3 * 3 + (c - 1) / 3 + 1;
}

void output() {
    for (int i = 1; i <= SZ; ++i) {
        for (int j = 1; j <= SZ; ++j)
            cout << grid[i][j] << ' ';
        cout << '\n';
    }
    exit(0);
}

void search(int r, int c) {
    if (r > SZ) {
        output();
        return;
    }
    if (grid[r][c]) {
        search(r + 1, c);
        return;
    }
    for (int val = 1; val <= 9; ++val) {
        if (row[r][val] && col[c][val] && box[boxIndex(r, c)][val]) {
            grid[r][c] = val;
            row[r][val] = col[c][val] = box[boxIndex(r, c)][val] = false;
            search(r + 1, c);
            row[r][val] = col[c][val] = box[boxIndex(r, c)][val] = true;
            grid[r][c] = 0;
        }
    }
}

int main() {
    memset(row, 1, sizeof(row));
    memset(col, 1, sizeof(col));
    memset(box, 1, sizeof(box));
    
    for (int i = 1; i <= SZ; ++i)
        for (int j = 1; j <= SZ; ++j) {
            cin >> grid[i][j];
            if (grid[i][j]) {
                row[i][grid[i][j]] = false;
                col[j][grid[i][j]] = false;
                box[boxIndex(i, j)][grid[i][j]] = false;
            }
        }
    
    search(1, 1);
    return 0;
}

This implemantation runs in approximately 79ms. An additional micro-optimization involves reversing the search order (trying 9 to 1 instead of 1 to 9), which reduced runtime to about 21ms on this particular problem.

Problem 2: Weighted Sudoku

This variant requires exploring all possible completions and selecting the one with maximum score. The weighted matrix assigns higher values to central cells.

int weight[11][11] = {
    {0,0,0,0,0,0,0,0,0,0},
    {0,6,6,6,6,6,6,6,6,6},
    {0,6,7,7,7,7,7,7,7,6},
    {0,6,7,8,8,8,8,8,7,6},
    {0,6,7,8,9,9,9,8,7,6},
    {0,6,7,8,9,10,9,8,7,6},
    {0,6,7,8,9,9,9,8,7,6},
    {0,6,7,8,8,8,8,8,7,6},
    {0,6,7,7,7,7,7,7,7,6},
    {0,6,6,6,6,6,6,6,6,6}
};

Baseline Approach (Not Recommended)

A direct application of the first problem's algorithm, modified to accumulate scores, barely passes but runs near the time limit.

Row-Based Prioritization

The critical insight is to prioritize rows with the most constrained cells. Consider filling a 3×3 subgrid: if we fill a cell with few placement options early, subsequent cells often become immediately determinable. This is fundamentally different from prioritizing by score potential—it focuses on reducing branching factor.

struct RowInfo {
    int row;
    int constraintCount;
    bool operator<(const RowInfo& other) const {
        return constraintCount < other.constraintCount;
    }
};

vector<RowInfo> rowOrder;

void solve(int colIdx, int rowIdx, int currentScore, int depth) {
    if (colIdx > SZ) {
        solve(1, rowOrder[depth + 1].row, currentScore, depth + 1);
        return;
    }
    if (depth == SZ) {
        bestScore = max(bestScore, currentScore);
        return;
    }
    if (grid[rowIdx][colIdx]) {
        solve(colIdx + 1, rowIdx, currentScore + grid[rowIdx][colIdx] * weight[rowIdx][colIdx], depth);
        return;
    }
    for (int val = 1; val <= 9; ++val) {
        int b = boxIndex(rowIdx, colIdx);
        if (row[rowIdx][val] && col[colIdx][val] && box[b][val]) {
            row[rowIdx][val] = col[colIdx][val] = box[b][val] = false;
            solve(colIdx + 1, rowIdx, currentScore + val * weight[rowIdx][colIdx], depth);
            row[rowIdx][val] = col[colIdx][val] = box[b][val] = true;
        }
    }
}

int main() {
    // ... initialization ...
    
    for (int i = 1; i <= SZ; ++i) {
        int cnt = 0;
        for (int j = 1; j <= SZ; ++j)
            for (int v = 1; v <= SZ; ++v)
                if (row[i][v] && col[j][v] && box[boxIndex(i, j)][v])
                    ++cnt;
        rowOrder.push_back({i, cnt});
    }
    sort(rowOrder.begin(), rowOrder.end());
    solve(1, rowOrder[0].row, 0, 0);
    cout << (bestScore == 0 ? -1 : bestScore) << '\n';
}

This approach runs in approximately 966ms—significantly faster than the baseline.

Cell-Based Prioritization (Ineffective)

An alternative strategy precomputes the number of valid placements for each cell and sorts all 81 positions. This sounds promising but actulaly performs poorly because constraints are dynamic: a cell with few options initially may have many options later after neighboring cells are filled, and vice versa. This approach essentially performs random-like exploration with unnecessary preprocessing overhead, resulting in TLE (90 points).

Key Takeaways

The most effective Sudoku solvers leverage constraint propagatino to reduce the search space. Static prioritization based on initial state rarely outperforms dynamic, row-oriented approaches. The fundamental principle: always prioritize filling cells that immediately constrain other cells, not cells that seem most valuable to fill.

Related Articles

Understanding Strong and Weak References in Java

Strong References Strong reference are the most prevalent type of object referencing in Java. When an object has a strong reference pointing to it, the garbage collector will not reclaim its memory. F...

Comprehensive Guide to SSTI Explained with Payload Bypass Techniques

Introduction Server-Side Template Injection (SSTI) is a vulnerability in web applications where user input is improper handled within the template engine and executed on the server. This exploit can r...

Implement Image Upload Functionality for Django Integrated TinyMCE Editor

Django’s Admin panel is highly user-friendly, and pairing it with TinyMCE, an effective rich text editor, simplifies content management significantly. Combining the two is particular useful for bloggi...

Leave a Comment

Anonymous

◎Feel free to join the discussion and share your thoughts.