📖 Chapter 5.3 ⏱️ ~80 min read 🎯 Advanced

Chapter 5.3: Trees & Special Graphs

Trees are graphs with a special structure that enables elegant and efficient algorithms. This chapter covers tree traversals, and one of the most important data structures in competitive programming: Union-Find (also called Disjoint Set Union or DSU).

5.3.1 Tree Traversals

Given a rooted tree, there are three classic ways to visit every node with DFS:

Pre-order: Visit node, then children (node before subtree)
In-order: Visit left child, node, right child (only for binary trees)
Post-order: Visit children, then node (subtree before node)

#include <bits/stdc++.h>
using namespace std;

const int MAXN = 100001;
vector<int> children[MAXN];

// Pre-order: parent before children (useful for computing subtree info top-down)
void preorder(int u) {
    cout << u << " ";  // process u first
    for (int v : children[u]) preorder(v);
}

// Post-order: children before parent (useful for subtree aggregation bottom-up)
void postorder(int u) {
    for (int v : children[u]) postorder(v);
    cout << u << " ";  // process u after all children
}

// Calculate subtree size (post-order style)
int subtreeSize[MAXN];
void calcSize(int u) {
    subtreeSize[u] = 1;  // start with just this node
    for (int v : children[u]) {
        calcSize(v);
        subtreeSize[u] += subtreeSize[v];  // add child subtree sizes
    }
}

// Calculate depth of each node (pre-order style)
int depth[MAXN];
void calcDepth(int u, int d) {
    depth[u] = d;
    for (int v : children[u]) {
        calcDepth(v, d + 1);
    }
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n;
    cin >> n;

    for (int i = 2; i <= n; i++) {
        int p; cin >> p;
        children[p].push_back(i);
    }

    cout << "Pre-order: ";
    preorder(1);
    cout << "\n";

    cout << "Post-order: ";
    postorder(1);
    cout << "\n";

    calcSize(1);
    cout << "Subtree sizes: ";
    for (int i = 1; i <= n; i++) cout << subtreeSize[i] << " ";
    cout << "\n";

    calcDepth(1, 0);
    cout << "Depths: ";
    for (int i = 1; i <= n; i++) cout << depth[i] << " ";
    cout << "\n";

    return 0;
}

5.3.2 Lowest Common Ancestor (LCA) — Naive

The LCA of two nodes u and v is the deepest node that is an ancestor of both.

For small trees, a naive approach: march both nodes up to the root, find where they first meet.

Naive LCA 向上爬升过程示意：

flowchart TD
    subgraph tree["树结构（根节点=1）"]
        N1(["1\ndepth=0"])
        N2(["2\ndepth=1"])
        N3(["3\ndepth=1"])
        N4(["4\ndepth=2"])
        N5(["5\ndepth=2"])
        N6(["6\ndepth=3"])
        N1 --> N2
        N1 --> N3
        N2 --> N4
        N2 --> N5
        N4 --> N6
    end
    subgraph lca["LCA(6, 5) 求解过程"]
        direction LR
        S1["u=6, v=5\ndepth[6]=3, depth[5]=2"] -->|"u上移至同深度"| S2
        S2["u=4, v=5\ndepth相同=2"] -->|"u≠v，同步上移"| S3
        S3["u=2, v=2\nu==v → LCA=2 ✓"]
    end
    style S3 fill:#dcfce7,stroke:#16a34a

💡 关键步骤： ① 先将深度较大的节点上移至与另一节点同深度；② 再同步上移直到两节点相遇，相遇点即为 LCA。

#include <bits/stdc++.h>
using namespace std;

int parent[100001];
int depth_arr[100001];

// Naive LCA: walk both nodes up until they meet — O(depth) per query
int lca(int u, int v) {
    while (depth_arr[u] > depth_arr[v]) u = parent[u];  // bring u up to same depth as v
    while (depth_arr[v] > depth_arr[u]) v = parent[v];  // bring v up to same depth as u
    while (u != v) {  // now both at same depth; walk up together
        u = parent[u];
        v = parent[v];
    }
    return u;
}

For Silver problems, naive LCA (O(N) per query) is often sufficient. Gold uses binary lifting (O(log N) per query).

Binary Lifting 祖先表构建示意：

flowchart LR
    subgraph anc0["anc[v][0]（直接父节点）"]
        direction TB
        v6a(["6"]) -->|"父"| v4a(["4"])
        v4a -->|"父"| v2a(["2"])
        v2a -->|"父"| v1a(["1"])
    end
    subgraph anc1["anc[v][1]（2¹=2级祖先）"]
        direction TB
        v6b(["6"]) -->|"2级祖先"| v2b(["2"])
        v4b(["4"]) -->|"2级祖先"| v1b(["1"])
    end
    subgraph anc2["anc[v][2]（2²=4级祖先）"]
        direction TB
        v6c(["6"]) -->|"4级祖先"| v1c(["1"])
    end
    anc0 -->|"anc[v][k] = anc[anc[v][k-1]][k-1]"| anc1
    anc1 --> anc2
    style anc2 fill:#f0f4ff,stroke:#4A6CF7

💡 倍增思想： anc[v][k] = v 的 2^k 级祖先 = v 的 2^(k-1) 级祖先的 2^(k-1) 级祖先。查询时将深度差分解为二进制，每次跳 2^k 步，共 O(log N) 步。

5.3.3 Union-Find (Disjoint Set Union)

Union-Find is a data structure that efficiently answers two questions:

Find: Which group does element X belong to?
Union: Merge the groups containing X and Y.

Why is this useful? It efficiently tracks connected components as edges are added one by one, which is used in Kruskal's MST algorithm, detecting cycles, and many USACO problems.

Visual: Union-Find Operations

The diagram shows Union-Find evolving: initially all nodes are separate (each is its own root), then after union(0,1) and union(1,2) a tree forms. Path compression (shown at bottom) flattens the tree so future find() calls are nearly O(1).

Union-Find Structure

This static reference diagram shows the Union-Find tree structure with path compression and union by rank, illustrating how the data structure maintains near-constant time operations.

Basic Implementation

#include <bits/stdc++.h>
using namespace std;

const int MAXN = 100001;
int parent[MAXN];  // parent[x] = parent of x in the tree
int rankArr[MAXN]; // used for union by rank

// Initialize: each element is its own group
void init(int n) {
    for (int i = 1; i <= n; i++) {
        parent[i] = i;      // parent of i is itself
        rankArr[i] = 0;     // initial rank is 0
    }
}

// Find: returns the "representative" (root) of x's group
// Uses PATH COMPRESSION: flattens the tree for future queries
int find(int x) {
    if (parent[x] != x) {
        parent[x] = find(parent[x]);  // path compression!
    }
    return parent[x];
}

// Union: merge groups containing x and y
// Uses UNION BY RANK: attach smaller tree under larger tree
void unite(int x, int y) {
    int px = find(x), py = find(y);
    if (px == py) return;  // already in same group

    // Attach tree with lower rank under tree with higher rank
    if (rankArr[px] < rankArr[py]) swap(px, py);
    parent[py] = px;
    if (rankArr[px] == rankArr[py]) rankArr[px]++;
}

// Check if x and y are in the same group
bool connected(int x, int y) {
    return find(x) == find(y);
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n, m;
    cin >> n >> m;
    init(n);

    for (int i = 0; i < m; i++) {
        int u, v;
        cin >> u >> v;
        unite(u, v);
    }

    // Count connected components
    set<int> roots;
    for (int i = 1; i <= n; i++) roots.insert(find(i));
    cout << "Connected components: " << roots.size() << "\n";

    // Answer connectivity queries
    int q;
    cin >> q;
    while (q--) {
        int u, v;
        cin >> u >> v;
        cout << (connected(u, v) ? "YES" : "NO") << "\n";
    }

    return 0;
}

Time complexity: With path compression and union by rank, both find and unite run in nearly O(1) — specifically O(α(n)) where α is the inverse Ackermann function, which is effectively constant for all practical inputs.

Union by Rank vs Union by Size 对比

flowchart LR
    subgraph bad["❌ 不按秩合并（退化为链）"]
        direction TB
        R1(["1"]) --> R2(["2"]) --> R3(["3"]) --> R4(["4"]) --> R5(["5"])
        note_bad["find(5) 需要 5 步\nO(N) 每次查询"]
    end
    subgraph good["✅ 按秩合并（保持树矮"]
        direction TB
        Root(["1"])
        C2(["2"]) 
        C3(["3"])
        C4(["4"])
        C5(["5"])
        Root --> C2
        Root --> C3
        Root --> C4
        Root --> C5
        note_good["find(任意) 只需 2 步\nO(1) 每次查询"]
    end
    bad -->|"按秩合并优化"| good
    style good fill:#dcfce7,stroke:#16a34a
    style bad fill:#fef2f2,stroke:#dc2626

💡 按秩合并规则： 将 rank 较小的树挂到 rank 较大的树下，保证树高不超过 O(log N)。配合路径压缩后，均摊复杂度降至 O(α(N)) ≈ O(1)。

Why Union-Find is Powerful

Compare with BFS/DFS for connectivity queries:

BFS/DFS: O(N+M) per query (rebuilds from scratch)
Union-Find: O(α(N)) per query after O((N+M)α(N)) preprocessing

For Q queries after reading all edges: BFS = O(Q(N+M)) vs DSU = O((N+M+Q)α(N)).

5.3.4 Cycle Detection with DSU

Problem: Given a graph, determine if it has a cycle. If so, report which edge creates a cycle.

DSU 环检测过程示意：

flowchart LR
    subgraph e1["加入边 1-2"]
        direction TB
        A1(["1"]) --- B1(["2"])
        note1["find(1)≠find(2)\n→ 合并，无环"]
    end
    subgraph e2["加入边 2-3"]
        direction TB
        A2(["1"]) --- B2(["2"]) --- C2(["3"])
        note2["find(2)≠find(3)\n→ 合并，无环"]
    end
    subgraph e3["加入边 1-3"]
        direction TB
        A3(["1"]) --- B3(["2"]) --- C3(["3"])
        A3 -.-|"虚线=新边"| C3
        note3["find(1)==find(3)\n→ 已连通，成环！ ⚠️"]
    end
    e1 --> e2 --> e3
    style note3 fill:#fef2f2,stroke:#dc2626
    style e3 fill:#fff1f2,stroke:#fca5a5

💡 核心判断： 加入边 (u, v) 前，若 find(u) == find(v)，说明 u 和 v 已在同一连通分量，加入此边必然成环。

#include <bits/stdc++.h>
using namespace std;

int parent[100001];

int find(int x) {
    if (parent[x] != x) parent[x] = find(parent[x]);
    return parent[x];
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n, m;
    cin >> n >> m;
    for (int i = 1; i <= n; i++) parent[i] = i;

    bool hasCycle = false;
    for (int i = 0; i < m; i++) {
        int u, v;
        cin >> u >> v;
        if (find(u) == find(v)) {
            cout << "Cycle created by edge " << u << "-" << v << "\n";
            hasCycle = true;
        } else {
            parent[find(u)] = find(v);  // simple union (no rank for brevity)
        }
    }

    if (!hasCycle) cout << "No cycle\n";
    return 0;
}

5.3.5 Minimum Spanning Tree (Kruskal's Algorithm)

A minimum spanning tree (MST) of a weighted graph connects all vertices with total edge weight minimized, using exactly N-1 edges.

Kruskal's algorithm:

Sort all edges by weight
Process edges in order; add an edge if it connects two different components (using DSU)
Stop when N-1 edges are added

#include <bits/stdc++.h>
using namespace std;

int parent[100001];

int find(int x) {
    if (parent[x] != x) parent[x] = find(parent[x]);
    return parent[x];
}

bool unite(int x, int y) {
    x = find(x); y = find(y);
    if (x == y) return false;  // already connected
    parent[x] = y;
    return true;
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n, m;
    cin >> n >> m;
    for (int i = 1; i <= n; i++) parent[i] = i;

    // Read edges as (weight, u, v)
    vector<tuple<int,int,int>> edges(m);
    for (auto &[w, u, v] : edges) cin >> u >> v >> w;

    // Sort by weight
    sort(edges.begin(), edges.end());

    long long totalCost = 0;
    int edgesAdded = 0;

    for (auto [w, u, v] : edges) {
        if (unite(u, v)) {        // connects two different components
            totalCost += w;
            edgesAdded++;
            if (edgesAdded == n - 1) break;  // MST complete
        }
    }

    if (edgesAdded == n - 1) {
        cout << "MST cost: " << totalCost << "\n";
    } else {
        cout << "Graph is disconnected; no MST\n";
    }

    return 0;
}

5.3.6 USACO Example: The Fence

Problem (USACO-style): A farm has N fields and M fences. Each fence connects two fields. Fields in the same connected component form a "pasture." After adding each fence, output the size of the largest pasture.

#include <bits/stdc++.h>
using namespace std;

int parent[100001];
int sz[100001];   // sz[root] = size of component rooted at 'root'

int find(int x) {
    if (parent[x] != x) parent[x] = find(parent[x]);
    return parent[x];
}

void unite(int x, int y) {
    x = find(x); y = find(y);
    if (x == y) return;
    if (sz[x] < sz[y]) swap(x, y);
    parent[y] = x;
    sz[x] += sz[y];
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n, m;
    cin >> n >> m;
    for (int i = 1; i <= n; i++) { parent[i] = i; sz[i] = 1; }

    for (int i = 0; i < m; i++) {
        int u, v;
        cin >> u >> v;
        unite(u, v);

        int maxSz = 0;
        for (int j = 1; j <= n; j++) {
            if (find(j) == j) maxSz = max(maxSz, sz[j]);  // only roots have correct sz
        }
        cout << maxSz << "\n";
    }

    return 0;
}

Chapter Summary

📌 Key Takeaways

Technique	Time per Operation	Use Case	Why It Matters
Tree DFS (pre/post order)	`O(N)` total	Subtree sum, depth calc	Foundation for tree DP
Naive LCA	`O(depth)` per query	Small trees	Understanding LCA concept
Binary Lifting LCA	`O(log N)` per query	Large tree path queries	Gold-level core technique
Union-Find find/union	`O(α(N))` ≈ `O(1)`	Dynamic connectivity	Kruskal MST, online connectivity
Kruskal's MST	`O(E log E)`	Minimum spanning tree	Common in USACO Silver/Gold
Euler Tour	`O(N)` preprocessing	Subtree→range query	Combined with Segment Tree for tree problems
Tree Diameter	`O(N)` (two BFS)	Longest path in tree	Common interview/contest problem

❓ FAQ

Q1: Can Union-Find use only one of "path compression" or "union by rank"?

A: Yes. Path compression alone gives amortized O(log N). Union by rank alone gives O(log N). Both together achieve O(α(N)). In contests, at least use path compression (more impactful); union by rank can be simplified to union by size.

Q2: What is the difference between Kruskal and Prim? When to use which?

A: Kruskal sorts edges + DSU, suited for sparse graphs (E ≪ V²), concise code. Prim is like Dijkstra with a priority queue, suited for dense graphs. In contests, use Kruskal 90% of the time.

Q3: What is the difference between Euler Tour and DFS order?

A: Essentially the same. "DFS order" usually refers to in_time and out_time; "Euler Tour" sometimes means the full entry/exit sequence (length 2N). In this book they are the same thing—the key is [in[u], out[u]] corresponds to u's subtree.

Q4: Why can tree diameter be found with "two BFS passes"?

A: Proof: Let the diameter be u→v. Starting BFS from any node s, the farthest node must be u or v (provable by contradiction). Then BFS from that endpoint finds the other endpoint and the diameter length.

Q5: What is the difference between multiset::erase(ms.find(val)) and ms.erase(val)?

A: Not this chapter's content (Chapter 3.8), but related to DSU sz tracking. ms.erase(val) removes all elements equal to val; ms.erase(ms.find(val)) removes only one. When tracking group sizes in DSU, watch for similar "delete one vs delete all" issues.

🔗 Connections to Later Chapters

Chapter 3.9 (Segment Tree) + Euler Tour = efficient subtree queries (update + query both O(log N))
Chapter 6.1 (DP Introduction): Tree DP builds directly on this chapter's tree traversal—postorder DFS aggregates bottom-up
Chapter 4.1 (Greedy): MST is a classic greedy application—Kruskal greedily selects minimum edges
Union-Find is powerful for offline processing—sort all queries/edges first, then add with DSU incrementally
Binary Lifting for LCA is one of the core techniques at USACO Gold level

Practice Problems

Problem 5.3.1 — Subtree Sum Read a rooted tree with values at each node. For each node, output the sum of values in its subtree.

Problem 5.3.2 — Network Components Read a graph. Add edges one by one. After each edge, print the number of connected components.

Problem 5.3.3 — Redundant Edge Read a tree (N nodes, N-1 edges) plus one extra edge that creates a cycle. Find the extra edge. (Hint: use DSU — the edge that unites two already-connected nodes is the answer)

Problem 5.3.4 — Friend Groups N students. Read M pairs of friendships. Friends of friends are also friends (transitivity). Print the number of friend groups and the size of the largest one.

Problem 5.3.5 — USACO 2016 February Silver: Fencing the Cows (Inspired) Read a weighted graph. Find the minimum cost to connect all nodes (MST using Kruskal's). Print the total MST weight, or "IMPOSSIBLE" if the graph is not connected.

5.3.7 Kruskal's MST — Complete Worked Example

Let's trace Kruskal's algorithm on a 6-node graph with 9 edges.

Graph:

Nodes: 0,1,2,3,4,5
Edges (sorted by weight):
  0-1: w=1
  2-3: w=2
  0-2: w=3
  1-3: w=4
  3-4: w=5
  2-4: w=6
  4-5: w=7
  1-4: w=8
  3-5: w=9

Kruskal's Algorithm Trace:

Initial: 6 components {0},{1},{2},{3},{4},{5}

Process edge 0-1 (w=1): find(0)=0, find(1)=1 → DIFFERENT → ACCEPT ✓
  Tree edges: {0-1}. Components: {0,1},{2},{3},{4},{5}

Process edge 2-3 (w=2): find(2)=2, find(3)=3 → DIFFERENT → ACCEPT ✓
  Tree edges: {0-1, 2-3}. Components: {0,1},{2,3},{4},{5}

Process edge 0-2 (w=3): find(0)=root_of_01, find(2)=root_of_23 → DIFFERENT → ACCEPT ✓
  Tree edges: {0-1, 2-3, 0-2}. Components: {0,1,2,3},{4},{5}

Process edge 1-3 (w=4): find(1)=root_of_0123, find(3)=root_of_0123 → SAME → SKIP ✗
  (Adding this would create a cycle: 0-1-3-2-0)

Process edge 3-4 (w=5): find(3)=root_of_0123, find(4)=4 → DIFFERENT → ACCEPT ✓
  Tree edges: {0-1, 2-3, 0-2, 3-4}. Components: {0,1,2,3,4},{5}

Process edge 2-4 (w=6): find(2)=root_of_01234, find(4)=root_of_01234 → SAME → SKIP ✗

Process edge 4-5 (w=7): find(4)=root_of_01234, find(5)=5 → DIFFERENT → ACCEPT ✓
  Tree edges: {0-1, 2-3, 0-2, 3-4, 4-5}.
  edgesAdded = 5 = n-1 = 5. DONE!

MST total weight: 1 + 2 + 3 + 5 + 7 = 18

Kruskal 边选择过程示意：

flowchart LR
    subgraph s0["初始：6个独立分量"]
        direction LR
        n0a([0]) 
        n1a([1])
        n2a([2])
        n3a([3])
        n4a([4])
        n5a([5])
    end
    subgraph s1["加入 0-1(w=1)、2-3(w=2)"]
        direction LR
        n01b(["0——1"]) 
        n23b(["2——3"])
        n4b([4])
        n5b([5])
    end
    subgraph s2["加入 0-2(w=3)，跳过 1-3(w=4)成环"]
        direction LR
        n0123c(["0—1—2—3"])
        n4c([4])
        n5c([5])
    end
    subgraph s3["加入 3-4(w=5)、4-5(w=7)，MST完成"]
        direction LR
        n012345d(["0—1—2—3—4—5"])
    end
    s0 --> s1 --> s2 --> s3
    style s3 fill:#dcfce7,stroke:#16a34a

Complete C++ implementation with worked example:

// Solution: Kruskal's MST — O(E log E)
#include <bits/stdc++.h>
using namespace std;

struct DSU {
    vector<int> parent, rank_;
    DSU(int n) : parent(n), rank_(n, 0) {
        iota(parent.begin(), parent.end(), 0);
    }
    int find(int x) {
        if (parent[x] != x) parent[x] = find(parent[x]);
        return parent[x];
    }
    bool unite(int x, int y) {
        x = find(x); y = find(y);
        if (x == y) return false;
        if (rank_[x] < rank_[y]) swap(x, y);
        parent[y] = x;
        if (rank_[x] == rank_[y]) rank_[x]++;
        return true;
    }
};

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n, m;
    cin >> n >> m;

    // Read edges as {weight, u, v}
    vector<tuple<int,int,int>> edges(m);
    for (auto& [w, u, v] : edges) cin >> u >> v >> w;

    // Sort by weight (ascending)
    sort(edges.begin(), edges.end());

    DSU dsu(n);
    long long mstWeight = 0;
    int edgesAdded = 0;
    vector<pair<int,int>> mstEdges;

    for (auto [w, u, v] : edges) {
        if (dsu.unite(u, v)) {           // different components → safe to add
            mstWeight += w;
            mstEdges.push_back({u, v});
            if (++edgesAdded == n - 1) break;  // MST complete (N-1 edges)
        }
    }

    if (edgesAdded < n - 1) {
        cout << "IMPOSSIBLE: graph is disconnected\n";
    } else {
        cout << "MST weight: " << mstWeight << "\n";
        cout << "MST edges:\n";
        for (auto [u, v] : mstEdges) cout << u << " - " << v << "\n";
    }

    return 0;
}

5.3.8 Tree Diameter

The diameter of a tree is the longest path between any two nodes (measured in number of edges, or total weight for weighted trees).

Algorithm (Two BFS/DFS approach):

BFS/DFS from any node u. Find the farthest node v.
BFS/DFS from v. The farthest node from v is one endpoint of the diameter.
The distance found in step 2 is the diameter.

Why does this work? The farthest node from any node is always one endpoint of a diameter.

两次 BFS 求树直径过程示意：

flowchart LR
    subgraph bfs1["第1次 BFS：从任意节点 s=1 出发"]
        direction TB
        S1(["s=1\ndist=0"])
        N2a(["2\ndist=1"])
        N3a(["3\ndist=2"])
        N4a(["4\ndist=3 ← 最远"])
        S1 --> N2a --> N3a --> N4a
        note1["最远节点 = 4\n（直径端点之一）"]
    end
    subgraph bfs2["第2次 BFS：从端点 u=4 出发"]
        direction TB
        U(["u=4\ndist=0"])
        N3b(["3\ndist=1"])
        N2b(["2\ndist=2"])
        N1b(["1\ndist=3"])
        N5b(["5\ndist=4 ← 最远"])
        U --> N3b --> N2b --> N1b --> N5b
        note2["最远节点 = 5\n直径长度 = 4"]
    end
    bfs1 -->|"以最远节点为新起点"| bfs2
    style N4a fill:#dbeafe,stroke:#3b82f6
    style N5b fill:#dcfce7,stroke:#16a34a
    style note2 fill:#dcfce7,stroke:#16a34a

💡 正确性证明（反证法）： 设真正的直径端点为 p, q。从任意节点 s 出发，最远节点 u 必为 p 或 q（若不是，则 u 到 p 或 q 的距离更长，矛盾）。从 u 出发的最远节点即为另一端点，距离即为直径。

// Solution: Tree Diameter (Two BFS) — O(N)
#include <bits/stdc++.h>
using namespace std;

const int MAXN = 100001;
vector<pair<int,int>> adj[MAXN];  // {neighbor, edge_weight}

// BFS from src, returns {farthest_node, farthest_distance}
pair<int,int> bfsFarthest(int src, int n) {
    vector<int> dist(n + 1, -1);
    queue<int> q;
    dist[src] = 0;
    q.push(src);
    int farthest = src;

    while (!q.empty()) {
        int u = q.front(); q.pop();
        for (auto [v, w] : adj[u]) {
            if (dist[v] == -1) {
                dist[v] = dist[u] + w;
                q.push(v);
                if (dist[v] > dist[farthest]) farthest = v;
            }
        }
    }
    return {farthest, dist[farthest]};
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n;
    cin >> n;
    for (int i = 0; i < n - 1; i++) {
        int u, v, w;
        cin >> u >> v >> w;
        adj[u].push_back({v, w});
        adj[v].push_back({u, w});
    }

    // Step 1: BFS from node 1, find farthest node u
    auto [u, _] = bfsFarthest(1, n);

    // Step 2: BFS from u, find farthest node v and its distance
    auto [v, diameter] = bfsFarthest(u, n);

    cout << "Diameter: " << diameter << "\n";
    cout << "Endpoints: " << u << " and " << v << "\n";

    return 0;
}

Unweighted version: Set all edge weights to 1, or use a simpler BFS that counts hops.

5.3.9 Lowest Common Ancestor (LCA) — Concept

The Lowest Common Ancestor (LCA) of two nodes u and v in a rooted tree is the deepest node that is an ancestor of both u and v.

Naive LCA (O(depth) per query): Walk both nodes up to the same depth, then walk together until they meet.

// Naive LCA — O(depth) per query, depth can be O(N) worst case
int lca_naive(int u, int v, int* depth, int* parent) {
    // Equalize depths
    while (depth[u] > depth[v]) u = parent[u];
    while (depth[v] > depth[u]) v = parent[v];
    // Now same depth — walk up together
    while (u != v) {
        u = parent[u];
        v = parent[v];
    }
    return u;
}

Binary Lifting LCA (O(log N) per query, O(N log N) preprocessing):

Store anc[v][k] = 2^k-th ancestor of v.

const int LOG = 17;  // log2(10^5) ≈ 17
int anc[MAXN][LOG];  // anc[v][k] = 2^k-th ancestor of v
int depth_arr[MAXN];

void preprocess(int root, int n) {
    // DFS to compute depths and anc[v][0] = direct parent
    // Then: anc[v][k] = anc[anc[v][k-1]][k-1]
    // (2^k-th ancestor = 2^(k-1)-th ancestor of 2^(k-1)-th ancestor)
    for (int v = 1; v <= n; v++)
        for (int k = 1; k < LOG; k++)
            anc[v][k] = anc[anc[v][k-1]][k-1];
}

int lca(int u, int v) {
    if (depth_arr[u] < depth_arr[v]) swap(u, v);
    int diff = depth_arr[u] - depth_arr[v];
    // Lift u up by diff levels using binary lifting
    for (int k = 0; k < LOG; k++)
        if ((diff >> k) & 1) u = anc[u][k];
    if (u == v) return u;
    // Now same depth — binary lift both
    for (int k = LOG - 1; k >= 0; k--)
        if (anc[u][k] != anc[v][k]) {
            u = anc[u][k];
            v = anc[v][k];
        }
    return anc[u][0];
}

💡 When to use LCA: Path queries on trees (e.g., "sum of values on path from u to v"), distance queries between nodes, finding "meeting points" on tree paths.

5.3.10 Euler Tour of Tree

An Euler tour flattens a tree into a linear array, enabling range queries on subtrees using regular array data structures (e.g., segment tree).

Idea: Record entry and exit times for each node during DFS. The subtree of node u corresponds to the contiguous range [in[u], out[u]] in the Euler tour array.

// Euler Tour / Heavy-Light Decomposition preprocessing
#include <bits/stdc++.h>
using namespace std;

const int MAXN = 100001;
vector<int> children[MAXN];
int in_time[MAXN], out_time[MAXN], timer_val = 0;
int euler_arr[MAXN];  // euler_arr[in_time[v]] = val[v]
int val[MAXN];        // value at each node

void dfs_euler(int u) {
    in_time[u] = ++timer_val;      // entry time
    euler_arr[timer_val] = val[u]; // record value in euler array

    for (int v : children[u]) {
        dfs_euler(v);
    }

    out_time[u] = timer_val;       // exit time (same as in_time for leaf)
}

int main() {
    int n;
    cin >> n;
    for (int i = 2; i <= n; i++) {
        int p; cin >> p;
        children[p].push_back(i);
    }
    for (int i = 1; i <= n; i++) cin >> val[i];

    dfs_euler(1);

    // Now: subtree of node u = euler_arr[in_time[u]..out_time[u]]
    // Use a segment tree or prefix sums on euler_arr for subtree queries!

    // Example: sum of values in subtree of node 3
    // answer = sum(euler_arr[in_time[3]..out_time[3]])
    cout << "Subtree of 3 covers indices: "
         << in_time[3] << " to " << out_time[3] << "\n";

    return 0;
}

Euler Tour in/out 时间戳示意：

flowchart TD
    subgraph tree["树结构"]
        R(["1\nin=1, out=7"])
        A(["2\nin=2, out=4"])
        B(["5\nin=5, out=7"])
        C(["3\nin=3, out=3"])
        D(["4\nin=4, out=4"])
        E(["6\nin=6, out=6"])
        F(["7\nin=7, out=7"])
        R --> A
        R --> B
        A --> C
        A --> D
        B --> E
        B --> F
    end
    subgraph arr["欧拉序列数组"]
        direction LR
        P1["[1]\n节点1"]
        P2["[2]\n节点2"]
        P3["[3]\n节点3"]
        P4["[4]\n节点4"]
        P5["[5]\n节点5"]
        P6["[6]\n节点6"]
        P7["[7]\n节点7"]
        P1 --- P2 --- P3 --- P4 --- P5 --- P6 --- P7
    end
    tree -->|"子树 2 = 区间 [2,4]"| arr
    style P2 fill:#dbeafe,stroke:#3b82f6
    style P3 fill:#dbeafe,stroke:#3b82f6
    style P4 fill:#dbeafe,stroke:#3b82f6

💡 关键性质： 节点 u 的子树对应欧拉序列中的连续区间 [in[u], out[u]]。子树查询变为区间查询，配合线段树可实现 O(log N) 的子树更新和查询。

Updated DSU Diagram

DSU Union-Find

The diagram shows all three key DSU operations: isolated nodes, trees after unions, and path compression side-by-side.

C++ for Competitive Programming: A USACO Guide