Statistics on Tree: Root as LCA

Problem

This tutorial explains two beginner-friendly subtask solutions for Codeforces 2222G, Statistics on Tree.

Part 1 solves it in $O(n^3)$ by fixing each possible root and evaluating each path separately.
Part 2 optimizes the same idea to $O(n^2)$ by using a DFS after fixing the root.

We are given a tree with $n$ vertices. For every pair $(u,v)$ with $u \le v$, we remove all edges on the path from $u$ to $v$. The value of the pair is the size of the largest connected component left after these edges are removed.

For every $i$, we need to count how many pairs have value exactly $i$.

First observation

The pair $(u,v)$ is unordered, but the path between them is unique.

Instead of thinking about all unordered pairs at once, we can do this:

Fix one endpoint, call it root.
Root the whole tree at root.
Compute the answer for every pair (root, v).

When the tree is rooted at root, the LCA of (root, v) is always root, because root is an ancestor of every vertex.

If we repeat this for every possible root, then every unordered pair is seen twice: once as (u, v) and once as (v, u). To avoid double-counting, we only count the pair when root <= v.

So the plan is:

for root = 0 to n - 1:
    root the tree at root
    for v = root to n - 1:
        compute the value of pair (root, v)

This is not fast enough for the full constraints, but it is a very useful stepping stone because it makes the tree components easy to understand.

Part 1: Evaluate each rooted path separately

Rooting the tree

After fixing root, run a DFS and compute:

parent[x]: the parent of vertex x in the rooted tree;
subtree_size[x]: the size of the subtree of x.

Now consider a vertex v. The path from root to v is just:

root -> ... -> v

We can recover it by starting from v and repeatedly moving to parent[v] until we reach root, then reversing the list.

What happens after removing the path edges

Suppose the path is:

$$ p_0, p_1, p_2, \ldots, p_k $$

where $p_0$ is root and $p_k$ is v.

The removed edges are:

$$ (p_0,p_1), (p_1,p_2), \ldots, (p_{k-1},p_k). $$

Every remaining component is attached to one of the path vertices.

For this rooted-path case, we can compute the component sizes directly.

Component at the root

If the path has at least one edge, then the first removed edge is:

$$ (p_0,p_1). $$

The subtree of $p_1$ gets separated from the root side. Therefore the component containing the root has size:

$$ n-\mathrm{subtree}_{\mathrm{size}}[p_1]. $$

Component at an internal path vertex

Now take an internal path vertex $p_i$, where $0 < i < k$.

Two path edges incident to it are removed:

$$ (p_{i-1},p_i) $$

and

$$ (p_i,p_{i+1}). $$

In the rooted tree, $p_{i+1}$ is a child of $p_i$. The part below $p_{i+1}$ is cut away. The part above $p_i$ is also cut away.

So the component containing $p_i$ has exactly the vertices in the subtree of $p_i$, except the subtree of $p_{i+1}$:

$$ \mathrm{subtree}_{\mathrm{size}}[p_i]-\mathrm{subtree}_{\mathrm{size}}[p_{i+1}]. $$

Component at the endpoint

At the endpoint $p_k=v$, only the edge to its parent on the path is removed.

The whole subtree of $v$ remains connected, so this component has size:

$$ \mathrm{subtree}_{\mathrm{size}}[v]. $$

Special case when root equals v

If root == v, the path has no edges.

So no edge is removed, and the whole tree remains one component. The value is:

$$ n. $$

This accounts for all pairs $(v,v)$.

Computing one pair

For a fixed root and endpoint v:

Build the path from root to v.
If the path has one vertex, the value is $n$.
Otherwise:
- consider the root-side component $n-\mathrm{subtree}_{\mathrm{size}}[p_1]$;
- consider every internal component $\mathrm{subtree}_{\mathrm{size}}[p_i]-\mathrm{subtree}_{\mathrm{size}}[p_{i+1}]$;
- consider the endpoint component $\mathrm{subtree}_{\mathrm{size}}[v]$.
The value is the maximum of all these component sizes.

Then increment:

$$ \mathrm{answer}[\mathrm{value}] $$

by $1$.

Why this counts every pair once

Every valid pair in the statement has the form $(u,v)$ with $u \le v$.

When our loop fixes root = u, the inner loop eventually chooses endpoint v, so the pair is counted.

When the loop later fixes root = v, the same unordered pair would appear as (v, u), but it is skipped because we only process endpoints v with root <= v.

Therefore every pair is counted exactly once.

Complexity

For each fixed root:

DFS takes $O(n)$.
There are $O(n)$ endpoints.
For each endpoint, walking along the path can take $O(n)$.

So one root costs $O(n^2)$, and all roots cost:

$$ O(n^3). $$

The memory usage is $O(n)$, apart from the answer array.

Part 2: Rerooting the endpoint computation

The $O(n^3)$ solution does too much repeated work.

For one fixed root, it computes the value of (root, v) by building the whole path from root to v. If the tree is a chain, this path can have length $O(n)$, and we do that for $O(n)$ endpoints. So one fixed root costs $O(n^2)$.

But when root is fixed, all paths from root to other vertices form one rooted tree. Neighboring paths share almost all of their information.

For example, suppose we already understand the path from root to v, and now we move to a child to of v.

The new path is:

root -> ... -> v -> to

Compared to the old path, only one new edge is added: (v, to).

So we should not rebuild the path from scratch.

What value do we need to carry

For pair (root, v), the answer is the maximum component size after removing all path edges.

When we walk downward from root, we can maintain:

best_before_v

This means:

Among all components already created by cutting edges on the path from root down to v, excluding the final subtree of v, what is the largest component size?

Then, for the pair (root, v), there is one more candidate: the component containing v, whose size is:

$$ \mathrm{subtree}_{\mathrm{size}}[v]. $$

Therefore:

$$ \mathrm{value}(\mathrm{root},v)=\max(\mathrm{best}_{\mathrm{before}},\mathrm{subtree}_{\mathrm{size}}[v]). $$

Moving from v to a child

Suppose the DFS is currently at vertex v, and it moves to a child to.

The path for the new pair (root, to) cuts the edge (v, to).

This creates a new component at the v side of that edge. Its size depends on whether v is the fixed root.

If v == root, then cutting (root, to) separates the child subtree of to from the rest of the tree. The component containing root has size:

$$ n-\mathrm{subtree}_{\mathrm{size}}[\mathrm{to}]. $$

If v != root, then v already has its parent edge on the path cut, and now the child edge (v, to) is also cut. The component around v contains the subtree of v, except the subtree of to:

$$ \mathrm{subtree}_{\mathrm{size}}[v]-\mathrm{subtree}_{\mathrm{size}}[\mathrm{to}]. $$

Call this new component size component_at_v.

Then the best value to pass to the child is:

$$ \mathrm{best}_{\mathrm{before\_to}}= \max(\mathrm{best}_{\mathrm{before\_v}}, \mathrm{component}_{\mathrm{at\_v}}). $$

And the value of pair (root, to) is:

$$ \max(\mathrm{best}_{\mathrm{before\_to}}, \mathrm{subtree}_{\mathrm{size}}[\mathrm{to}]). $$

DFS for one fixed root

For one fixed root, we do two DFS traversals:

dfs_size(root, -1) computes parents and subtree sizes when the tree is rooted at root.
dfs_values(root, 0, root) walks downward and computes all pairs (root, v).

The second DFS does constant work per edge, because it no longer rebuilds every path.

So for one root, the time becomes $O(n)$ instead of $O(n^2)$.

Counting pairs once

As before, pair (root, root) has value $n$.

For every other endpoint v, we count pair (root, v) only if:

root < v

This guarantees that each unordered pair is counted once.

Part 2 complexity

For each fixed root:

dfs_size takes $O(n)$.
dfs_values takes $O(n)$.

So one fixed root costs $O(n)$, and all $n$ roots cost:

$$ O(n^2). $$

This is still not the full intended solution for the original constraints, but it is a big improvement over Part 1 and teaches the central rerooting idea: after choosing a root, compute all endpoint answers in one traversal.

Code

Both implementations use a Tree class and avoid lambda functions.

Code: Statistics on Tree Root as LCA