Home > Tech > Content

Tarjan Algorithm for Finding Strongly Connected Components

Tech May 14 1

Introduction to Strongly Connected Components

A Strongly Connected Component (SCC) of a directed graph is a maximal subgraph where every vertex is reachable from every other vertex. In simpler terms, for any two nodes \(u\) and \(v\) with in the same SCC, there exists a path from \(u\) to \(v\) and a path from \(v\) to \(u\). Tarjan's algorithm is an efficient linear-time approach based on Depth First Search (DFS) to identify all SCCs in a graph.

The Tarjan Algorithm Logic

The algorithm performs a DFS traversal. Since a single DFS might not visit all nodes if the graph is not connected, we iterate through all nodes and initiate a DFS for any node that hasn't been visited yet. Each node \(i\) is assigned two key values:

discovery_time[i]: The timestamp when node \(i\) was first visited.
low_link[i]: The smallest discovery_time reachable from \(i\) (including itself) through a back-edge or a cross-edge to a node still in the current recursion stack.

Nodes are pushed onto a stack as they are visited. While backtracking from a node \(u\), we update its low_link based on its neighbors. If low_link[u] == discovery_time[u], then \(u\) is the "root" of an SCC. All nodes from the top of the stack down to \(u\) are popped and assigned to a new SCC.


int disc[MAXN], low[MAXN], timer;
int stack_nodes[MAXN], top_idx;
bool in_stack[MAXN];
int scc_id[MAXN], scc_total;

void find_scc(int u) {
    disc[u] = low[u] = ++timer;
    stack_nodes[++top_idx] = u;
    in_stack[u] = true;

    for (int v : adjacency_list[u]) {
        if (!disc[v]) {
            find_scc(v);
            low[u] = std::min(low[u], low[v]);
        } else if (in_stack[v]) {
            low[u] = std::min(low[u], disc[v]);
        }
    }

    if (low[u] == disc[u]) {
        scc_total++;
        while (true) {
            int node = stack_nodes[top_idx--];
            in_stack[node] = false;
            scc_id[node] = scc_total;
            if (node == u) break;
        }
    }
}

// Global execution
for (int i = 1; i <= node_count; ++i) {
    if (!disc[i]) find_scc(i);
}

Correctness and Edge Classification

The algorithm correctly identifies SCCs by analyzing four types of edges in the DFS tree:

Tree Edges: Edges to unvisited nodes.
Back Edges: Edges to ancestors in the DFS tree.
Forward Edges: Edges to descendants already visited.
Cross Edges: Edges to nodes in different branches.

An SCC is formed when a back edge or certain cross edges connect a node to one of its ancestors. The low_link value effectively propagates the earliest ancestor reachable. When low_link[u] == discovery_time[u], it signifies that no node in the subtree rooted at \(u\) can reach a node visited earlier than \(u\) that is still in the current search path. Thus, \(u\) and its remaining descendants on the stack form a complete SCC.

Graph Condensation

One of the most powerful applications of Tarjan's algorithm is "condensation." By treating each SCC as a single super-node, we transform the original directed graph into a Directed Acyclic Graph (DAG). The SCCs are identified in the reverse of their topological order in this DAG.

Application Examples

1. Maximum Weight Path (Coin Collector)

Given a graph where nodes have weights, find the path that collects the maximum total weight. Since we can traverse an SCC entirely, we condense the graph into a DAG where each super-node's weight is the sum of weights of the nodes in that SCC. We then use dynamic programming on the DAG:

DP[u] = SCC_Weight[u] + max(DP[v]) for all successors \(v\) of SCC \(u\).


#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

const int MAXN = 100005;
vector<int> adj[MAXN], dag_adj[MAXN];
long long node_val[MAXN], scc_sum[MAXN], dp[MAXN];
int dfn[MAXN], low[MAXN], scc[MAXN], st[MAXN];
bool active[MAXN];
int timer, top, scc_cnt;

void tarjan(int u) {
    dfn[u] = low[u] = ++timer;
    st[++top] = u;
    active[u] = true;
    for (int v : adj[u]) {
        if (!dfn[v]) {
            tarjan(v);
            low[u] = min(low[u], low[v]);
        } else if (active[v]) {
            low[u] = min(low[u], dfn[v]);
        }
    }
    if (low[u] == dfn[u]) {
        scc_cnt++;
        while (true) {
            int curr = st[top--];
            active[curr] = false;
            scc[curr] = scc_cnt;
            scc_sum[scc_cnt] += node_val[curr];
            if (curr == u) break;
        }
    }
}

int main() {
    int n, m;
    scanf("%d %d", &n, &m);
    for (int i = 1; i <= n; i++) scanf("%lld", &node_val[i]);
    for (int i = 0; i < m; i++) {
        int u, v;
        scanf("%d %d", &u, &v);
        adj[u].push_back(v);
    }
    for (int i = 1; i <= n; i++) if (!dfn[i]) tarjan(i);
    
    for (int u = 1; u <= n; u++) {
        for (int v : adj[u]) {
            if (scc[u] != scc[v]) dag_adj[scc[u]].push_back(scc[v]);
        }
    }

    long long max_score = 0;
    // DP on DAG (processed in reverse order of discovery)
    for (int i = 1; i <= scc_cnt; i++) {
        long long best_next = 0;
        for (int v : dag_adj[i]) best_next = max(best_next, dp[v]);
        dp[i] = scc_sum[i] + best_next;
        max_score = max(max_score, dp[i]);
    }
    printf("%lld\n", max_score);
    return 0;
}

2. Semi-Connectivity in Graphs

A graph is semi-connected if for every pair of vertices \((u, v)\), there is a path from \(u\) to \(v\) or from \(v\) to \(u\). After condensing the graph into a DAG, the condition for semi-connectivity is that the DAG must contain a Hamiltonian path. This essentially means the DAG must look like a single chain of nodes.

3. Minimum Information Source

In scenarios like identifying a "killer" among people where relationships represent clues, we use SCCs to simplify the problem. Generally, you must investigate at least one person from every SCC that has an in-degree of 0 in the condensed DAG. However, using deduction (exclusion principal), if there is an SCC of size 1 with in-degree 0, and all its neighbors can be reached from other in-degree 0 SCCs, we might avoid investigating that specific person.

Tags: tarjan-algorithm

Back to List

Prev: Java Thread Waiting and Notification: wait, notify, and LockSupport

Next: Approaches to Implementing Proximity-Based Queries in LBS Applications

Fading Coder

Tarjan Algorithm for Finding Strongly Connected Components

Introduction to Strongly Connected Components

The Tarjan Algorithm Logic

Correctness and Edge Classification

Graph Condensation

Application Examples

1. Maximum Weight Path (Coin Collector)

2. Semi-Connectivity in Graphs

3. Minimum Information Source

Related Articles

Understanding Strong and Weak References in Java

Comprehensive Guide to SSTI Explained with Payload Bypass Techniques

Implement Image Upload Functionality for Django Integrated TinyMCE Editor