Codeforces
CF Step
Youtube Linkedin Discord Toggle Dark/Light/Auto mode Toggle Dark/Light/Auto mode Toggle Dark/Light/Auto mode Back to homepage

Editorial: Building Tree

Problem recap

We are given an original weighted undirected graph on $n$ vertices. A path does not have the usual sum of edge weights. Instead, if the set of weights appearing on the path is $S$, the path length is $\operatorname{mex}(S)$.

Then we are given $q$ vertices in a new graph. Vertex $i$ has color $c_i$, which is a vertex of the original graph. We may add an edge between new vertices $i$ and $j$ with cost $\mathrm{dis}(c_i,c_j)$, where $\mathrm{dis}$ is the minimum MEX-path length in the original graph. We need the minimum cost to make the new graph connected.

This editorial focuses on a slow but conceptually direct solution. The final official solution optimizes exactly the same idea.

Subtask 1: Understand one distance

Fix two original vertices $u$ and $v$.

Suppose there is a path from $u$ to $v$ whose MEX is $k$. By the definition of MEX:

  • all weights $0,1,\ldots,k-1$ appear on the path;
  • weight $k$ does not appear on the path.

For minimizing the MEX, the important part is the second bullet. If a path avoids weight $k$, then its MEX is at most $k$, because $k$ is one of the missing weights.

So, if $u$ and $v$ are connected after deleting all weight-$k$ edges, then:

$$ \mathrm{dis}(u,v)\le k $$

The reverse direction is also enough for an exact formula. If the best path has MEX $d$, then that path avoids weight $d$, so $u$ and $v$ are connected after deleting weight $d$.

Therefore:

$$ \mathrm{dis}(u,v)=\min \{k : u \text{ and } v \text{ are connected after deleting all weight-}k\text{ edges}\}. $$

The values of $k$ we need to test are only $0,1,\ldots,m$, because all edge weights lie in that range.

Subtask 2: Reduce the new graph to an MST

The new graph can add an edge between any two new vertices whose colors are connected in the original graph. The edge cost depends only on their colors:

$$ \text{cost}(i,j)=\mathrm{dis}(c_i,c_j). $$

So this is just a minimum spanning tree problem on the new vertices with these pairwise costs.

There is one small simplification. If the same original color appears multiple times among the $c_i$, those new vertices can be connected together with cost $0$, using the empty path from a vertex to itself. So we only need to connect the distinct colors that appear in the list.

Let these distinct colors be $a_0,a_1,\ldots,a_{s-1}$. We need the MST of the complete graph on these $s$ colors, where the edge between $a_i$ and $a_j$ has weight $\mathrm{dis}(a_i,a_j)$.

Subtask 3: A very direct slow solution

The most literal implementation is:

  1. For every pair of distinct selected colors $(a_i,a_j)$, compute $\mathrm{dis}(a_i,a_j)$.
  2. Run Kruskal on all those pairwise edges.

To compute one distance, try every $k$ from $0$ to $m$:

  1. Build a DSU using all original edges whose weight is not $k$.
  2. If $a_i$ and $a_j$ are in the same DSU component, then $\mathrm{dis}(a_i,a_j)=k$ and we can stop.

This is the easiest possible way to see the idea, but it is extremely slow. It rebuilds the original graph connectivity for every pair and every possible deleted weight.

Subtask 4: Slow Kruskal without all pair distances

We can keep the same logic but organize it like Kruskal.

Kruskal processes edges by increasing weight. Here, if two selected colors are connected after deleting weight $k$, then Kruskal has an available edge between them with cost at most $k$. The first value of $k$ where this happens is exactly their distance.

So for each $k=0,1,\ldots,m$:

  1. Rebuild a DSU on the original graph using all edges with weight not equal to $k$.
  2. Look at each distinct selected color and find its original DSU root.
  3. Colors with the same root form a clique of available edges of cost at most $k$.
  4. In the MST DSU over selected colors, merge all currently different components inside each such clique, adding $k$ for every successful merge.

This is exactly Kruskal, but we never need to create all $s(s-1)/2$ edges explicitly.

Why merging a clique by a star is enough

Suppose for the current $k$, selected colors

$$ x_1,x_2,\ldots,x_r $$

are in the same component of the original graph after deleting weight $k$.

Then every pair among them has an available edge with cost at most $k$. At this Kruskal stage, we only care about merging different MST components. Connecting all of them to $x_1$ is enough:

  • if $x_i$ is already in the same MST component as $x_1$, the union does nothing;
  • otherwise, we add one edge of cost $k$ and merge one MST component.

This produces the same effect as considering all clique edges of this cost.

Correctness

Lemma 1

For original vertices $u$ and $v$,

$$ \mathrm{dis}(u,v)=\min \{k : u \text{ and } v \text{ are connected after deleting all weight-}k\text{ edges}\}. $$

Proof.

If $u$ and $v$ are connected after deleting all weight-$k$ edges, there is a path from $u$ to $v$ that does not contain weight $k$. Therefore the MEX of this path is at most $k$, so $\mathrm{dis}(u,v)\le k$.

Conversely, if there is a path from $u$ to $v$ with MEX $k$, then weight $k$ is absent from that path. Thus the same path still exists after deleting all weight-$k$ edges, so $u$ and $v$ are connected in that deleted graph.

Taking the minimum possible $k$ proves the lemma.

Lemma 2

For a fixed $k$, every group of selected colors lying in the same component after deleting weight $k$ can be treated as a clique of edges of cost at most $k$.

Proof.

Any two colors in the group are connected after deleting weight $k$. By Lemma 1, their distance is at most $k$. Therefore the complete graph on selected colors has an edge of weight at most $k$ between every pair in the group.

Lemma 3

Processing $k$ from $0$ to $m$ and merging selected-color components inside every group is equivalent to Kruskal’s algorithm.

Proof.

By Lemma 1, an edge between two selected colors has MST weight $d$ exactly when $d$ is the first deleted-weight value that connects them.

When our algorithm is at value $k$, all edges whose first successful deleted-weight value is at most $k$ have become available. Inside each group from the deleted-weight graph, Lemma 2 says the relevant selected colors form a clique of edges with cost at most $k$. Merging all current MST components inside that clique is exactly what Kruskal would do after seeing enough edges from the clique.

The DSU prevents cycles. If a merge is still successful at value $k$, then these two MST components were not connected by any earlier processed edge, so charging this merge at the current Kruskal level is valid. Thus the algorithm performs Kruskal’s choices in nondecreasing edge weight order.

Theorem

If the algorithm connects all distinct selected colors, the accumulated value is the minimum time needed to connect the new graph. If it does not connect them, the answer is $-1$.

Proof.

By Subtask 2, the problem is the MST problem on distinct selected colors with edge weights $\mathrm{dis}$. Lemma 3 shows that the algorithm is Kruskal’s algorithm on exactly that graph. Kruskal returns a minimum spanning tree when one exists.

If the selected colors remain disconnected after all $k$, then no available edges can connect all of them, so no connected new graph can be built.

Complexity

Let $s$ be the number of distinct colors in the query list.

For every $k=0,1,\ldots,m$, the slow implementation:

  • rebuilds a DSU from $m$ edges;
  • groups the $s$ selected colors by their DSU root;
  • performs some DSU unions on the selected-color MST DSU.

The complexity is roughly

$$ O((m+1)(n+m+s)\alpha(n)). $$

This is far too slow for the full constraints, but it is much easier to understand than the official optimized solution.

Final optimization in passing

The provided model_sol.cpp keeps the same Kruskal interpretation. Instead of rebuilding connectivity after deleting each weight, it uses divide-and-conquer over weight ranges and a rollback DSU to maintain components of the graph formed by weights outside the current range.

When two components containing selected colors become connected, it performs the same kind of Kruskal merge on the selected-color DSU. That optimization reduces the repeated rebuilding work, but the conceptual core remains the deleted-weight view above.