Chapter 5.3: Trees & Special Graphs
Trees are graphs with a special structure that enables elegant and efficient algorithms. This chapter covers tree traversals, and one of the most important data structures in competitive programming: Union-Find (also called Disjoint Set Union or DSU).
5.3.1 Tree Traversals
Given a rooted tree, there are three classic ways to visit every node with DFS:
- Pre-order: Visit node, then children (node before subtree)
- In-order: Visit left child, node, right child (only for binary trees)
- Post-order: Visit children, then node (subtree before node)
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100001;
vector<int> children[MAXN];
// Pre-order: parent before children (useful for computing subtree info top-down)
void preorder(int u) {
cout << u << " "; // process u first
for (int v : children[u]) preorder(v);
}
// Post-order: children before parent (useful for subtree aggregation bottom-up)
void postorder(int u) {
for (int v : children[u]) postorder(v);
cout << u << " "; // process u after all children
}
// Calculate subtree size (post-order style)
int subtreeSize[MAXN];
void calcSize(int u) {
subtreeSize[u] = 1; // start with just this node
for (int v : children[u]) {
calcSize(v);
subtreeSize[u] += subtreeSize[v]; // add child subtree sizes
}
}
// Calculate depth of each node (pre-order style)
int depth[MAXN];
void calcDepth(int u, int d) {
depth[u] = d;
for (int v : children[u]) {
calcDepth(v, d + 1);
}
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
for (int i = 2; i <= n; i++) {
int p; cin >> p;
children[p].push_back(i);
}
cout << "Pre-order: ";
preorder(1);
cout << "\n";
cout << "Post-order: ";
postorder(1);
cout << "\n";
calcSize(1);
cout << "Subtree sizes: ";
for (int i = 1; i <= n; i++) cout << subtreeSize[i] << " ";
cout << "\n";
calcDepth(1, 0);
cout << "Depths: ";
for (int i = 1; i <= n; i++) cout << depth[i] << " ";
cout << "\n";
return 0;
}
5.3.2 Lowest Common Ancestor (LCA) — Naive
The LCA of two nodes u and v is the deepest node that is an ancestor of both.
For small trees, a naive approach: march both nodes up to the root, find where they first meet.
Naive LCA 向上爬升过程示意:
flowchart TD
subgraph tree["树结构(根节点=1)"]
N1(["1\ndepth=0"])
N2(["2\ndepth=1"])
N3(["3\ndepth=1"])
N4(["4\ndepth=2"])
N5(["5\ndepth=2"])
N6(["6\ndepth=3"])
N1 --> N2
N1 --> N3
N2 --> N4
N2 --> N5
N4 --> N6
end
subgraph lca["LCA(6, 5) 求解过程"]
direction LR
S1["u=6, v=5\ndepth[6]=3, depth[5]=2"] -->|"u上移至同深度"| S2
S2["u=4, v=5\ndepth相同=2"] -->|"u≠v,同步上移"| S3
S3["u=2, v=2\nu==v → LCA=2 ✓"]
end
style S3 fill:#dcfce7,stroke:#16a34a
💡 关键步骤: ① 先将深度较大的节点上移至与另一节点同深度;② 再同步上移直到两节点相遇,相遇点即为 LCA。
#include <bits/stdc++.h>
using namespace std;
int parent[100001];
int depth_arr[100001];
// Naive LCA: walk both nodes up until they meet — O(depth) per query
int lca(int u, int v) {
while (depth_arr[u] > depth_arr[v]) u = parent[u]; // bring u up to same depth as v
while (depth_arr[v] > depth_arr[u]) v = parent[v]; // bring v up to same depth as u
while (u != v) { // now both at same depth; walk up together
u = parent[u];
v = parent[v];
}
return u;
}
For Silver problems, naive LCA (O(N) per query) is often sufficient. Gold uses binary lifting (O(log N) per query).
Binary Lifting 祖先表构建示意:
flowchart LR
subgraph anc0["anc[v][0](直接父节点)"]
direction TB
v6a(["6"]) -->|"父"| v4a(["4"])
v4a -->|"父"| v2a(["2"])
v2a -->|"父"| v1a(["1"])
end
subgraph anc1["anc[v][1](2¹=2级祖先)"]
direction TB
v6b(["6"]) -->|"2级祖先"| v2b(["2"])
v4b(["4"]) -->|"2级祖先"| v1b(["1"])
end
subgraph anc2["anc[v][2](2²=4级祖先)"]
direction TB
v6c(["6"]) -->|"4级祖先"| v1c(["1"])
end
anc0 -->|"anc[v][k] = anc[anc[v][k-1]][k-1]"| anc1
anc1 --> anc2
style anc2 fill:#f0f4ff,stroke:#4A6CF7
💡 倍增思想:
anc[v][k]= v 的 2^k 级祖先 = v 的 2^(k-1) 级祖先的 2^(k-1) 级祖先。查询时将深度差分解为二进制,每次跳 2^k 步,共 O(log N) 步。
5.3.3 Union-Find (Disjoint Set Union)
Union-Find is a data structure that efficiently answers two questions:
- Find: Which group does element X belong to?
- Union: Merge the groups containing X and Y.
Why is this useful? It efficiently tracks connected components as edges are added one by one, which is used in Kruskal's MST algorithm, detecting cycles, and many USACO problems.
Visual: Union-Find Operations
The diagram shows Union-Find evolving: initially all nodes are separate (each is its own root), then after union(0,1) and union(1,2) a tree forms. Path compression (shown at bottom) flattens the tree so future find() calls are nearly O(1).
This static reference diagram shows the Union-Find tree structure with path compression and union by rank, illustrating how the data structure maintains near-constant time operations.
Basic Implementation
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100001;
int parent[MAXN]; // parent[x] = parent of x in the tree
int rankArr[MAXN]; // used for union by rank
// Initialize: each element is its own group
void init(int n) {
for (int i = 1; i <= n; i++) {
parent[i] = i; // parent of i is itself
rankArr[i] = 0; // initial rank is 0
}
}
// Find: returns the "representative" (root) of x's group
// Uses PATH COMPRESSION: flattens the tree for future queries
int find(int x) {
if (parent[x] != x) {
parent[x] = find(parent[x]); // path compression!
}
return parent[x];
}
// Union: merge groups containing x and y
// Uses UNION BY RANK: attach smaller tree under larger tree
void unite(int x, int y) {
int px = find(x), py = find(y);
if (px == py) return; // already in same group
// Attach tree with lower rank under tree with higher rank
if (rankArr[px] < rankArr[py]) swap(px, py);
parent[py] = px;
if (rankArr[px] == rankArr[py]) rankArr[px]++;
}
// Check if x and y are in the same group
bool connected(int x, int y) {
return find(x) == find(y);
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
init(n);
for (int i = 0; i < m; i++) {
int u, v;
cin >> u >> v;
unite(u, v);
}
// Count connected components
set<int> roots;
for (int i = 1; i <= n; i++) roots.insert(find(i));
cout << "Connected components: " << roots.size() << "\n";
// Answer connectivity queries
int q;
cin >> q;
while (q--) {
int u, v;
cin >> u >> v;
cout << (connected(u, v) ? "YES" : "NO") << "\n";
}
return 0;
}
Time complexity: With path compression and union by rank, both find and unite run in nearly O(1) — specifically O(α(n)) where α is the inverse Ackermann function, which is effectively constant for all practical inputs.
Union by Rank vs Union by Size 对比
flowchart LR
subgraph bad["❌ 不按秩合并(退化为链)"]
direction TB
R1(["1"]) --> R2(["2"]) --> R3(["3"]) --> R4(["4"]) --> R5(["5"])
note_bad["find(5) 需要 5 步\nO(N) 每次查询"]
end
subgraph good["✅ 按秩合并(保持树矮"]
direction TB
Root(["1"])
C2(["2"])
C3(["3"])
C4(["4"])
C5(["5"])
Root --> C2
Root --> C3
Root --> C4
Root --> C5
note_good["find(任意) 只需 2 步\nO(1) 每次查询"]
end
bad -->|"按秩合并优化"| good
style good fill:#dcfce7,stroke:#16a34a
style bad fill:#fef2f2,stroke:#dc2626
💡 按秩合并规则: 将 rank 较小的树挂到 rank 较大的树下,保证树高不超过 O(log N)。配合路径压缩后,均摊复杂度降至 O(α(N)) ≈ O(1)。
Why Union-Find is Powerful
Compare with BFS/DFS for connectivity queries:
- BFS/DFS:
O(N+M)per query (rebuilds from scratch) - Union-Find:
O(α(N))per query afterO((N+M)α(N))preprocessing
For Q queries after reading all edges: BFS = O(Q(N+M)) vs DSU = O((N+M+Q)α(N)).
5.3.4 Cycle Detection with DSU
Problem: Given a graph, determine if it has a cycle. If so, report which edge creates a cycle.
DSU 环检测过程示意:
flowchart LR
subgraph e1["加入边 1-2"]
direction TB
A1(["1"]) --- B1(["2"])
note1["find(1)≠find(2)\n→ 合并,无环"]
end
subgraph e2["加入边 2-3"]
direction TB
A2(["1"]) --- B2(["2"]) --- C2(["3"])
note2["find(2)≠find(3)\n→ 合并,无环"]
end
subgraph e3["加入边 1-3"]
direction TB
A3(["1"]) --- B3(["2"]) --- C3(["3"])
A3 -.-|"虚线=新边"| C3
note3["find(1)==find(3)\n→ 已连通,成环! ⚠️"]
end
e1 --> e2 --> e3
style note3 fill:#fef2f2,stroke:#dc2626
style e3 fill:#fff1f2,stroke:#fca5a5
💡 核心判断: 加入边 (u, v) 前,若
find(u) == find(v),说明 u 和 v 已在同一连通分量,加入此边必然成环。
#include <bits/stdc++.h>
using namespace std;
int parent[100001];
int find(int x) {
if (parent[x] != x) parent[x] = find(parent[x]);
return parent[x];
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
for (int i = 1; i <= n; i++) parent[i] = i;
bool hasCycle = false;
for (int i = 0; i < m; i++) {
int u, v;
cin >> u >> v;
if (find(u) == find(v)) {
cout << "Cycle created by edge " << u << "-" << v << "\n";
hasCycle = true;
} else {
parent[find(u)] = find(v); // simple union (no rank for brevity)
}
}
if (!hasCycle) cout << "No cycle\n";
return 0;
}
5.3.5 Minimum Spanning Tree (Kruskal's Algorithm)
A minimum spanning tree (MST) of a weighted graph connects all vertices with total edge weight minimized, using exactly N-1 edges.
Kruskal's algorithm:
- Sort all edges by weight
- Process edges in order; add an edge if it connects two different components (using DSU)
- Stop when N-1 edges are added
#include <bits/stdc++.h>
using namespace std;
int parent[100001];
int find(int x) {
if (parent[x] != x) parent[x] = find(parent[x]);
return parent[x];
}
bool unite(int x, int y) {
x = find(x); y = find(y);
if (x == y) return false; // already connected
parent[x] = y;
return true;
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
for (int i = 1; i <= n; i++) parent[i] = i;
// Read edges as (weight, u, v)
vector<tuple<int,int,int>> edges(m);
for (auto &[w, u, v] : edges) cin >> u >> v >> w;
// Sort by weight
sort(edges.begin(), edges.end());
long long totalCost = 0;
int edgesAdded = 0;
for (auto [w, u, v] : edges) {
if (unite(u, v)) { // connects two different components
totalCost += w;
edgesAdded++;
if (edgesAdded == n - 1) break; // MST complete
}
}
if (edgesAdded == n - 1) {
cout << "MST cost: " << totalCost << "\n";
} else {
cout << "Graph is disconnected; no MST\n";
}
return 0;
}
5.3.6 USACO Example: The Fence
Problem (USACO-style): A farm has N fields and M fences. Each fence connects two fields. Fields in the same connected component form a "pasture." After adding each fence, output the size of the largest pasture.
#include <bits/stdc++.h>
using namespace std;
int parent[100001];
int sz[100001]; // sz[root] = size of component rooted at 'root'
int find(int x) {
if (parent[x] != x) parent[x] = find(parent[x]);
return parent[x];
}
void unite(int x, int y) {
x = find(x); y = find(y);
if (x == y) return;
if (sz[x] < sz[y]) swap(x, y);
parent[y] = x;
sz[x] += sz[y];
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
for (int i = 1; i <= n; i++) { parent[i] = i; sz[i] = 1; }
for (int i = 0; i < m; i++) {
int u, v;
cin >> u >> v;
unite(u, v);
int maxSz = 0;
for (int j = 1; j <= n; j++) {
if (find(j) == j) maxSz = max(maxSz, sz[j]); // only roots have correct sz
}
cout << maxSz << "\n";
}
return 0;
}
Chapter Summary
📌 Key Takeaways
| Technique | Time per Operation | Use Case | Why It Matters |
|---|---|---|---|
| Tree DFS (pre/post order) | O(N) total | Subtree sum, depth calc | Foundation for tree DP |
| Naive LCA | O(depth) per query | Small trees | Understanding LCA concept |
| Binary Lifting LCA | O(log N) per query | Large tree path queries | Gold-level core technique |
| Union-Find find/union | O(α(N)) ≈ O(1) | Dynamic connectivity | Kruskal MST, online connectivity |
| Kruskal's MST | O(E log E) | Minimum spanning tree | Common in USACO Silver/Gold |
| Euler Tour | O(N) preprocessing | Subtree→range query | Combined with Segment Tree for tree problems |
| Tree Diameter | O(N) (two BFS) | Longest path in tree | Common interview/contest problem |
❓ FAQ
Q1: Can Union-Find use only one of "path compression" or "union by rank"?
A: Yes. Path compression alone gives amortized
O(log N). Union by rank alone givesO(log N). Both together achieveO(α(N)). In contests, at least use path compression (more impactful); union by rank can be simplified to union by size.
Q2: What is the difference between Kruskal and Prim? When to use which?
A: Kruskal sorts edges + DSU, suited for sparse graphs (E ≪ V²), concise code. Prim is like Dijkstra with a priority queue, suited for dense graphs. In contests, use Kruskal 90% of the time.
Q3: What is the difference between Euler Tour and DFS order?
A: Essentially the same. "DFS order" usually refers to
in_timeandout_time; "Euler Tour" sometimes means the full entry/exit sequence (length 2N). In this book they are the same thing—the key is[in[u], out[u]]corresponds to u's subtree.
Q4: Why can tree diameter be found with "two BFS passes"?
A: Proof: Let the diameter be u→v. Starting BFS from any node s, the farthest node must be u or v (provable by contradiction). Then BFS from that endpoint finds the other endpoint and the diameter length.
Q5: What is the difference between multiset::erase(ms.find(val)) and ms.erase(val)?
A: Not this chapter's content (Chapter 3.8), but related to DSU
sztracking.ms.erase(val)removes all elements equal to val;ms.erase(ms.find(val))removes only one. When tracking group sizes in DSU, watch for similar "delete one vs delete all" issues.
🔗 Connections to Later Chapters
- Chapter 3.9 (Segment Tree) + Euler Tour = efficient subtree queries (update + query both
O(log N)) - Chapter 6.1 (DP Introduction): Tree DP builds directly on this chapter's tree traversal—postorder DFS aggregates bottom-up
- Chapter 4.1 (Greedy): MST is a classic greedy application—Kruskal greedily selects minimum edges
- Union-Find is powerful for offline processing—sort all queries/edges first, then add with DSU incrementally
- Binary Lifting for LCA is one of the core techniques at USACO Gold level
Practice Problems
Problem 5.3.1 — Subtree Sum Read a rooted tree with values at each node. For each node, output the sum of values in its subtree.
Problem 5.3.2 — Network Components Read a graph. Add edges one by one. After each edge, print the number of connected components.
Problem 5.3.3 — Redundant Edge Read a tree (N nodes, N-1 edges) plus one extra edge that creates a cycle. Find the extra edge. (Hint: use DSU — the edge that unites two already-connected nodes is the answer)
Problem 5.3.4 — Friend Groups N students. Read M pairs of friendships. Friends of friends are also friends (transitivity). Print the number of friend groups and the size of the largest one.
Problem 5.3.5 — USACO 2016 February Silver: Fencing the Cows (Inspired) Read a weighted graph. Find the minimum cost to connect all nodes (MST using Kruskal's). Print the total MST weight, or "IMPOSSIBLE" if the graph is not connected.
5.3.7 Kruskal's MST — Complete Worked Example
Let's trace Kruskal's algorithm on a 6-node graph with 9 edges.
Graph:
Nodes: 0,1,2,3,4,5
Edges (sorted by weight):
0-1: w=1
2-3: w=2
0-2: w=3
1-3: w=4
3-4: w=5
2-4: w=6
4-5: w=7
1-4: w=8
3-5: w=9
Kruskal's Algorithm Trace:
Initial: 6 components {0},{1},{2},{3},{4},{5}
Process edge 0-1 (w=1): find(0)=0, find(1)=1 → DIFFERENT → ACCEPT ✓
Tree edges: {0-1}. Components: {0,1},{2},{3},{4},{5}
Process edge 2-3 (w=2): find(2)=2, find(3)=3 → DIFFERENT → ACCEPT ✓
Tree edges: {0-1, 2-3}. Components: {0,1},{2,3},{4},{5}
Process edge 0-2 (w=3): find(0)=root_of_01, find(2)=root_of_23 → DIFFERENT → ACCEPT ✓
Tree edges: {0-1, 2-3, 0-2}. Components: {0,1,2,3},{4},{5}
Process edge 1-3 (w=4): find(1)=root_of_0123, find(3)=root_of_0123 → SAME → SKIP ✗
(Adding this would create a cycle: 0-1-3-2-0)
Process edge 3-4 (w=5): find(3)=root_of_0123, find(4)=4 → DIFFERENT → ACCEPT ✓
Tree edges: {0-1, 2-3, 0-2, 3-4}. Components: {0,1,2,3,4},{5}
Process edge 2-4 (w=6): find(2)=root_of_01234, find(4)=root_of_01234 → SAME → SKIP ✗
Process edge 4-5 (w=7): find(4)=root_of_01234, find(5)=5 → DIFFERENT → ACCEPT ✓
Tree edges: {0-1, 2-3, 0-2, 3-4, 4-5}.
edgesAdded = 5 = n-1 = 5. DONE!
MST total weight: 1 + 2 + 3 + 5 + 7 = 18
Kruskal 边选择过程示意:
flowchart LR
subgraph s0["初始:6个独立分量"]
direction LR
n0a([0])
n1a([1])
n2a([2])
n3a([3])
n4a([4])
n5a([5])
end
subgraph s1["加入 0-1(w=1)、2-3(w=2)"]
direction LR
n01b(["0——1"])
n23b(["2——3"])
n4b([4])
n5b([5])
end
subgraph s2["加入 0-2(w=3),跳过 1-3(w=4)成环"]
direction LR
n0123c(["0—1—2—3"])
n4c([4])
n5c([5])
end
subgraph s3["加入 3-4(w=5)、4-5(w=7),MST完成"]
direction LR
n012345d(["0—1—2—3—4—5"])
end
s0 --> s1 --> s2 --> s3
style s3 fill:#dcfce7,stroke:#16a34a
Complete C++ implementation with worked example:
// Solution: Kruskal's MST — O(E log E)
#include <bits/stdc++.h>
using namespace std;
struct DSU {
vector<int> parent, rank_;
DSU(int n) : parent(n), rank_(n, 0) {
iota(parent.begin(), parent.end(), 0);
}
int find(int x) {
if (parent[x] != x) parent[x] = find(parent[x]);
return parent[x];
}
bool unite(int x, int y) {
x = find(x); y = find(y);
if (x == y) return false;
if (rank_[x] < rank_[y]) swap(x, y);
parent[y] = x;
if (rank_[x] == rank_[y]) rank_[x]++;
return true;
}
};
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n, m;
cin >> n >> m;
// Read edges as {weight, u, v}
vector<tuple<int,int,int>> edges(m);
for (auto& [w, u, v] : edges) cin >> u >> v >> w;
// Sort by weight (ascending)
sort(edges.begin(), edges.end());
DSU dsu(n);
long long mstWeight = 0;
int edgesAdded = 0;
vector<pair<int,int>> mstEdges;
for (auto [w, u, v] : edges) {
if (dsu.unite(u, v)) { // different components → safe to add
mstWeight += w;
mstEdges.push_back({u, v});
if (++edgesAdded == n - 1) break; // MST complete (N-1 edges)
}
}
if (edgesAdded < n - 1) {
cout << "IMPOSSIBLE: graph is disconnected\n";
} else {
cout << "MST weight: " << mstWeight << "\n";
cout << "MST edges:\n";
for (auto [u, v] : mstEdges) cout << u << " - " << v << "\n";
}
return 0;
}
5.3.8 Tree Diameter
The diameter of a tree is the longest path between any two nodes (measured in number of edges, or total weight for weighted trees).
Algorithm (Two BFS/DFS approach):
- BFS/DFS from any node
u. Find the farthest nodev. - BFS/DFS from
v. The farthest node fromvis one endpoint of the diameter. - The distance found in step 2 is the diameter.
Why does this work? The farthest node from any node is always one endpoint of a diameter.
两次 BFS 求树直径过程示意:
flowchart LR
subgraph bfs1["第1次 BFS:从任意节点 s=1 出发"]
direction TB
S1(["s=1\ndist=0"])
N2a(["2\ndist=1"])
N3a(["3\ndist=2"])
N4a(["4\ndist=3 ← 最远"])
S1 --> N2a --> N3a --> N4a
note1["最远节点 = 4\n(直径端点之一)"]
end
subgraph bfs2["第2次 BFS:从端点 u=4 出发"]
direction TB
U(["u=4\ndist=0"])
N3b(["3\ndist=1"])
N2b(["2\ndist=2"])
N1b(["1\ndist=3"])
N5b(["5\ndist=4 ← 最远"])
U --> N3b --> N2b --> N1b --> N5b
note2["最远节点 = 5\n直径长度 = 4"]
end
bfs1 -->|"以最远节点为新起点"| bfs2
style N4a fill:#dbeafe,stroke:#3b82f6
style N5b fill:#dcfce7,stroke:#16a34a
style note2 fill:#dcfce7,stroke:#16a34a
💡 正确性证明(反证法): 设真正的直径端点为 p, q。从任意节点 s 出发,最远节点 u 必为 p 或 q(若不是,则 u 到 p 或 q 的距离更长,矛盾)。从 u 出发的最远节点即为另一端点,距离即为直径。
// Solution: Tree Diameter (Two BFS) — O(N)
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100001;
vector<pair<int,int>> adj[MAXN]; // {neighbor, edge_weight}
// BFS from src, returns {farthest_node, farthest_distance}
pair<int,int> bfsFarthest(int src, int n) {
vector<int> dist(n + 1, -1);
queue<int> q;
dist[src] = 0;
q.push(src);
int farthest = src;
while (!q.empty()) {
int u = q.front(); q.pop();
for (auto [v, w] : adj[u]) {
if (dist[v] == -1) {
dist[v] = dist[u] + w;
q.push(v);
if (dist[v] > dist[farthest]) farthest = v;
}
}
}
return {farthest, dist[farthest]};
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin >> n;
for (int i = 0; i < n - 1; i++) {
int u, v, w;
cin >> u >> v >> w;
adj[u].push_back({v, w});
adj[v].push_back({u, w});
}
// Step 1: BFS from node 1, find farthest node u
auto [u, _] = bfsFarthest(1, n);
// Step 2: BFS from u, find farthest node v and its distance
auto [v, diameter] = bfsFarthest(u, n);
cout << "Diameter: " << diameter << "\n";
cout << "Endpoints: " << u << " and " << v << "\n";
return 0;
}
Unweighted version: Set all edge weights to 1, or use a simpler BFS that counts hops.
5.3.9 Lowest Common Ancestor (LCA) — Concept
The Lowest Common Ancestor (LCA) of two nodes u and v in a rooted tree is the deepest node that is an ancestor of both u and v.
Naive LCA (O(depth) per query): Walk both nodes up to the same depth, then walk together until they meet.
// Naive LCA — O(depth) per query, depth can be O(N) worst case
int lca_naive(int u, int v, int* depth, int* parent) {
// Equalize depths
while (depth[u] > depth[v]) u = parent[u];
while (depth[v] > depth[u]) v = parent[v];
// Now same depth — walk up together
while (u != v) {
u = parent[u];
v = parent[v];
}
return u;
}
Binary Lifting LCA (O(log N) per query, O(N log N) preprocessing):
Store anc[v][k] = 2^k-th ancestor of v.
const int LOG = 17; // log2(10^5) ≈ 17
int anc[MAXN][LOG]; // anc[v][k] = 2^k-th ancestor of v
int depth_arr[MAXN];
void preprocess(int root, int n) {
// DFS to compute depths and anc[v][0] = direct parent
// Then: anc[v][k] = anc[anc[v][k-1]][k-1]
// (2^k-th ancestor = 2^(k-1)-th ancestor of 2^(k-1)-th ancestor)
for (int v = 1; v <= n; v++)
for (int k = 1; k < LOG; k++)
anc[v][k] = anc[anc[v][k-1]][k-1];
}
int lca(int u, int v) {
if (depth_arr[u] < depth_arr[v]) swap(u, v);
int diff = depth_arr[u] - depth_arr[v];
// Lift u up by diff levels using binary lifting
for (int k = 0; k < LOG; k++)
if ((diff >> k) & 1) u = anc[u][k];
if (u == v) return u;
// Now same depth — binary lift both
for (int k = LOG - 1; k >= 0; k--)
if (anc[u][k] != anc[v][k]) {
u = anc[u][k];
v = anc[v][k];
}
return anc[u][0];
}
💡 When to use LCA: Path queries on trees (e.g., "sum of values on path from u to v"), distance queries between nodes, finding "meeting points" on tree paths.
5.3.10 Euler Tour of Tree
An Euler tour flattens a tree into a linear array, enabling range queries on subtrees using regular array data structures (e.g., segment tree).
Idea: Record entry and exit times for each node during DFS. The subtree of node u corresponds to the contiguous range [in[u], out[u]] in the Euler tour array.
// Euler Tour / Heavy-Light Decomposition preprocessing
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100001;
vector<int> children[MAXN];
int in_time[MAXN], out_time[MAXN], timer_val = 0;
int euler_arr[MAXN]; // euler_arr[in_time[v]] = val[v]
int val[MAXN]; // value at each node
void dfs_euler(int u) {
in_time[u] = ++timer_val; // entry time
euler_arr[timer_val] = val[u]; // record value in euler array
for (int v : children[u]) {
dfs_euler(v);
}
out_time[u] = timer_val; // exit time (same as in_time for leaf)
}
int main() {
int n;
cin >> n;
for (int i = 2; i <= n; i++) {
int p; cin >> p;
children[p].push_back(i);
}
for (int i = 1; i <= n; i++) cin >> val[i];
dfs_euler(1);
// Now: subtree of node u = euler_arr[in_time[u]..out_time[u]]
// Use a segment tree or prefix sums on euler_arr for subtree queries!
// Example: sum of values in subtree of node 3
// answer = sum(euler_arr[in_time[3]..out_time[3]])
cout << "Subtree of 3 covers indices: "
<< in_time[3] << " to " << out_time[3] << "\n";
return 0;
}
Euler Tour in/out 时间戳示意:
flowchart TD
subgraph tree["树结构"]
R(["1\nin=1, out=7"])
A(["2\nin=2, out=4"])
B(["5\nin=5, out=7"])
C(["3\nin=3, out=3"])
D(["4\nin=4, out=4"])
E(["6\nin=6, out=6"])
F(["7\nin=7, out=7"])
R --> A
R --> B
A --> C
A --> D
B --> E
B --> F
end
subgraph arr["欧拉序列数组"]
direction LR
P1["[1]\n节点1"]
P2["[2]\n节点2"]
P3["[3]\n节点3"]
P4["[4]\n节点4"]
P5["[5]\n节点5"]
P6["[6]\n节点6"]
P7["[7]\n节点7"]
P1 --- P2 --- P3 --- P4 --- P5 --- P6 --- P7
end
tree -->|"子树 2 = 区间 [2,4]"| arr
style P2 fill:#dbeafe,stroke:#3b82f6
style P3 fill:#dbeafe,stroke:#3b82f6
style P4 fill:#dbeafe,stroke:#3b82f6
💡 关键性质: 节点 u 的子树对应欧拉序列中的连续区间
[in[u], out[u]]。子树查询变为区间查询,配合线段树可实现 O(log N) 的子树更新和查询。
Updated DSU Diagram
The diagram shows all three key DSU operations: isolated nodes, trees after unions, and path compression side-by-side.