# DP on induced subgraphs of a tree

An **induced subgraph** of a graph is another graph, formed from a subset of the vertices of the original graph together with all of the edges connecting pairs of vertices in that subset. You are given an unrooted tree; find the number of induced subgraphs that are also trees.

For an induced subgraph to satisfy the properties of a tree, it is necessary and sufficient for it to be **connected** (assuming that the original graph was a tree).

Hence, from now onwards, whenever we talk about induced subgraphs, we will implicitly mean connected induced subgraphs.

The given tree is unrooted. Let’s root the tree at vertex $1$ (Note that it is not always correct to root the tree at an arbitrary vertex). Then, we can try to compute the answer for each subtree using **dynamic programming**.

The first DP definition that comes to mind is

$dp[node]$ is the number of connected induced subgraphs that form a tree rooted at node $node$.

However, this DP definition doesn’t make a lot of sense, because even if you assume the tree to be rooted at vertex $1$, the resulting subgraph has no concept of a root unless you explicitly define one.

Let’s assign each node a height when viewed from the ground. For example, the root is the highest vertex, all children of the root are the second highest vertices, and so on. The deepest leaves are the lowest vertices in this ordering.

Then, we can define

$dp[u]$ is the number of connected induced subgraphs when the highest vertex in the subgraph is node $u$.

Then, the answer to the original problem would be $\sum_u dp[u]$, but this is not trivial to prove. For example, what if there is a subgraph in which two different nodes can act as the highest vertex? Did we double count?

Luckily, there cannot exist two highest vertices $u$ and $v$ in the subgraph: by connectivity, the whole path between $u$ and $v$ lies in the subgraph, so $LCA(u, v)$ must also be present, contradicting the assumption that $u$ and $v$ are the highest vertices in the subgraph.
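This uniqueness claim is easy to sanity-check by brute force. The sketch below (on a hypothetical 5-vertex tree chosen only for illustration) enumerates every connected vertex subset and verifies it has exactly one vertex of minimum depth:

```python
from itertools import combinations

# Hypothetical example tree, rooted at vertex 1.
n = 5
edges = [(1, 2), (1, 3), (2, 4), (2, 5)]
adj = {v: [] for v in range(1, n + 1)}
for a, b in edges:
    adj[a].append(b)
    adj[b].append(a)

# Depth of every vertex when the tree is rooted at 1.
depth = {1: 0}
stack = [1]
while stack:
    u = stack.pop()
    for v in adj[u]:
        if v not in depth:
            depth[v] = depth[u] + 1
            stack.append(v)

def is_connected(subset):
    """Check connectivity of the induced subgraph on `subset`."""
    subset = set(subset)
    start = next(iter(subset))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in subset and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == subset

# Every connected subset must have a unique highest (minimum-depth) vertex.
for k in range(1, n + 1):
    for subset in combinations(range(1, n + 1), k):
        if is_connected(subset):
            top = min(depth[v] for v in subset)
            assert sum(depth[v] == top for v in subset) == 1
```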

With this DP definition in mind, let us analyze the transitions. We will assume that the DP values of all children have been computed (since the function is called recursively for each child). We need to figure out how these answers can be combined to compute $dp[u]$.

We will start adding children one by one. Suppose we’ve added $x$ children so far, and the total number of ways is $dp[u]$. What happens when you introduce a new child $c$?

You can either delete the edge between $u$ and $c$, thereby ignoring the entire subtree of $c$, or you can keep the edge. Recall that there are $dp[u]$ ways to achieve the current configuration of the parent and $dp[c]$ ways to achieve the configuration of the child. Since you can pick a configuration from each side independently, keeping the edge contributes $dp[u] \cdot dp[c]$ new ways.

If you decide to delete the edge, the number of ways stays the same, i.e., $dp[u]$.

```
dfs(u, par):
    dp[u] = 1
    for c in adj[u]:
        if c == par:
            continue
        dfs(c, u)
        dp[u] += dp[u] * dp[c]
```
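As a sanity check, here is a minimal runnable Python version of the pseudocode above, on a hypothetical 3-vertex path (the adjacency list and labels are assumptions for illustration):

```python
# Hypothetical example: the path 1 - 2 - 3.
n = 3
adj = {1: [2], 2: [1, 3], 3: [2]}
dp = [0] * (n + 1)

def dfs(u, par):
    dp[u] = 1  # the subgraph consisting of u alone
    for c in adj[u]:
        if c == par:
            continue
        dfs(c, u)
        # Either drop the edge u-c (dp[u] ways) or keep it (dp[u] * dp[c] ways).
        dp[u] += dp[u] * dp[c]

dfs(1, -1)
# dp[u] counts connected induced subgraphs whose highest vertex is u.
print(sum(dp[1:]))  # 3 + 2 + 1 = 6 for the 3-vertex path
```

For the path, the six subgraphs are $\{1\}$, $\{2\}$, $\{3\}$, $\{1,2\}$, $\{2,3\}$, and $\{1,2,3\}$, matching the DP total.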

The time complexity is $\mathcal{O}(n)$.

You can find the full code here

Given a tree, find the number of connected induced subgraphs where vertex $1$ has degree $d$ $(0 \leq d < n)$.

Root the tree at vertex $1$. Define

$dp[u][d]$ represents the number of connected induced subgraphs where node $u$ is the highest vertex and has degree $d$.

Let us recursively compute the DP values for the children and then add the children incrementally. Not only that, we will add each child degree by degree. This concept of adding children degree by degree also appeared in the previous blog.

Suppose you have a configuration where node $u$ has degree $d$. Now you introduce a child $c$ with degree $now$. What happens to the resulting subgraph?

- You can delete the edge between $u$ and $c$. You get a configuration where vertex $u$ has degree $d$.
- You can keep the edge between $u$ and $c$. Then, in the resulting subgraph, the degree of both $u$ and $c$ increases by 1. Therefore, we get a configuration where the degree of node $u$ is $d + 1$.

Of course, to avoid mixing old configurations of degree $d + 1$ with the newly created ones, we need to use an auxiliary DP array, as discussed in the previous blog.

```
dfs(u, par):
    dp[u][0] = 1
    for c in adj[u]:
        if c == par:
            continue
        dfs(c, u)
        ndp = [0] * (adj[u].size + 1)
        for d in range(adj[u].size):
            # Ignore this child
            ndp[d] += dp[u][d]
            for now in range(adj[c].size):
                nxt = d + 1
                # Append this child
                ndp[nxt] += dp[u][d] * dp[c][now]
        swap(dp[u], ndp)
```
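The pseudocode translates directly to Python. Here is a runnable sketch on a hypothetical 4-vertex star centered at vertex 1 (the tree is an assumption for illustration); for a star, $dp[1][d]$ should simply count the ways to pick $d$ of the leaves:

```python
# Hypothetical example: a star with center 1 and leaves 2, 3, 4.
adj = {1: [2, 3, 4], 2: [1], 3: [1], 4: [1]}
# dp[u][d]: connected induced subgraphs with highest vertex u and degree(u) = d.
dp = {}

def dfs(u, par):
    dp[u] = [0] * (len(adj[u]) + 1)
    dp[u][0] = 1  # u alone
    for c in adj[u]:
        if c == par:
            continue
        dfs(c, u)
        ndp = [0] * (len(adj[u]) + 1)
        for d in range(len(adj[u])):
            # Ignore this child: the degree of u stays d.
            ndp[d] += dp[u][d]
            for now in range(len(adj[c])):
                # Keep the edge u-c: the degree of u becomes d + 1.
                ndp[d + 1] += dp[u][d] * dp[c][now]
        dp[u] = ndp

dfs(1, -1)
print(dp[1])  # [1, 3, 3, 1]: binomial coefficients, as expected for a star
```

Iterating `d` and `now` only up to the respective adjacency-list sizes is exactly the trick that keeps the overall work quadratic.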

The time complexity is $\mathcal{O}(n^2)$, but this is not trivial to prove. For example, if $adj[u].size$ and $adj[c].size$ had been replaced by $n$, the time complexity would clearly be $\mathcal{O}(n^3)$, because we would do $\mathcal{O}(n^2)$ work at each node. However, we only iterate up to the maximum possible degree of the parent and the child, because we know the remaining DP values are zero, and this makes all the difference.

Resources for the proof can be found here. I will add my own interpretation later.

You can find the full code here

Given a tree, find the number of connected induced subgraphs where vertex $1$ has degree $0$ or $1$ or $\geq 2$.

In this version, we don’t need to print the answer for every degree. We can still compute the $O(n^2)$ DP and then print $\sum_{d \geq 2} dp[1][d]$. Can we improve the time complexity?

Note that all the degree $\geq 2$ states can be treated as identical. Hence, we can compress all these states into a single state.

$dp[u][d]$ is the number of connected induced subgraphs where vertex $u$ is the highest vertex and has degree $d \in \{0, \ 1, \ \geq 2\}$.

The transitions are the same as before. When you combine vertex $u$ with degree $d$ and a child $c$ with degree $now$, keeping the edge to the child makes the resulting degree of $u$ equal to $d + 1$. If $d + 1 > 1$, we simply clamp it to $2$.

```
dfs(u, par):
    dp[u][0] = 1
    for c in adj[u]:
        if c == par:
            continue
        dfs(c, u)
        ndp = [0, 0, 0]
        for d in {0, 1, 2}:
            # Ignore this child
            ndp[d] += dp[u][d]
            for now in {0, 1, 2}:
                nxt = d + 1
                if nxt > 1:
                    nxt = 2
                # Append this child
                ndp[nxt] += dp[u][d] * dp[c][now]
        swap(dp[u], ndp)
```
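A runnable Python sketch of the compressed DP, on a hypothetical star with center 1 and leaves 2 and 3 (the tree is an assumption for illustration):

```python
# Hypothetical example: star with center 1 and leaves 2, 3.
adj = {1: [2, 3], 2: [1], 3: [1]}
# dp[u][d]: highest vertex u with degree d, where index 2 stands for "degree >= 2".
dp = {}

def dfs(u, par):
    dp[u] = [1, 0, 0]  # u alone has degree 0
    for c in adj[u]:
        if c == par:
            continue
        dfs(c, u)
        ndp = [0, 0, 0]
        for d in range(3):
            # Ignore this child.
            ndp[d] += dp[u][d]
            for now in range(3):
                # Keep the edge: the degree of u grows by 1, clamped at 2.
                nxt = min(d + 1, 2)
                ndp[nxt] += dp[u][d] * dp[c][now]
        dp[u] = ndp

dfs(1, -1)
print(dp[1])  # [1, 2, 1]: {1}; {1,2} and {1,3}; {1,2,3}
```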

The time complexity is $O(n)$.

You can find the full code here

Given a tree whose vertices are colored, find the number of connected induced subgraphs where all the degree-1 vertices are monochromatic (i.e., share the same color).

We will solve this problem for each color independently using the ideas discussed above.

Fix a color. $dp[u][d]$ is the number of connected induced subgraphs where the highest vertex is node $u$, it has degree $d \in \{0, \ 1, \ \geq 2\}$, and all the degree-1 vertices except possibly node $u$ have the given color.

The DP definition might seem odd at first. For example, why isn’t $dp[u][1] = 0$ if the color of $u$ does not match the color being investigated? The original problem clearly states that degree-1 vertices cannot be of a different color. So why do we allow this relaxation for vertex $u$?

The answer lies in the fact that in tree DP problems, there is often a **correction** step when you push the DP values up the tree. For example, even if vertex $u$ with degree $1$ is invalid now, it can become valid when a parent is attached to it, since its degree then becomes $2$.

However, if a child has degree $0$ and you attach it to a parent, its degree becomes $1$. Moreover, this is the last correction that will ever be applied to the child, because future corrections only act on its parent. Hence, if the color of the child still does not match the desired color, we must reject the child.

You can read more about the **extension** and **correction** phase from the previous blog.

This is not a new idea. You might have implicitly used it in other DP problems. The intermediate DP values might look contrary to what’s being asked in the problem, but we still retain those in the hopes of it being corrected in later stages.

```
dfs(u, par, color):
    dp[u][0] = 1
    for c in adj[u]:
        if c == par:
            continue
        dfs(c, u, color)
        ndp = [0, 0, 0]
        for d in {0, 1, 2}:
            # Ignore this child.
            ndp[d] += dp[u][d]
            for now in {0, 1, 2}:
                # Last correction for the child.
                if now == 0 and a[c] != color:
                    continue
                nxt = d + 1
                if nxt > 1:
                    nxt = 2
                # Append this child.
                ndp[nxt] += dp[u][d] * dp[c][now]
        swap(dp[u], ndp)
```

To compute the answer for a single color, we can iterate over all vertices and sum up the contributions of the degree-$\geq 2$ states. However, for the degree-0 and degree-1 states, we should only add the contribution if the color of the node equals the current color being investigated: no more corrections are left for the highest vertex, and if the colors do not match at this point, we must discard the configuration.

```
for color in 1 to n:
    ans = 0
    dfs(1, -1, color)
    for u in 1 to n:
        if a[u] == color:
            ans += dp[u][0] + dp[u][1]
        ans += dp[u][2]
    print(ans)
```
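Putting the pieces together, here is a runnable Python sketch of the per-color DP on a hypothetical 3-vertex path with colors $a[1]=1$, $a[2]=2$, $a[3]=1$ (tree and colors are assumptions for illustration):

```python
# Hypothetical example: path 1 - 2 - 3 with colors a[1]=1, a[2]=2, a[3]=1.
n = 3
adj = {1: [2], 2: [1, 3], 3: [2]}
a = {1: 1, 2: 2, 3: 1}
dp = {}

def dfs(u, par, color):
    dp[u] = [1, 0, 0]  # u alone has degree 0
    for c in adj[u]:
        if c == par:
            continue
        dfs(c, u, color)
        ndp = [0, 0, 0]
        for d in range(3):
            # Ignore this child.
            ndp[d] += dp[u][d]
            for now in range(3):
                # Last correction for the child: attaching the parent turns a
                # degree-0 child into a degree-1 vertex, so its color must match.
                if now == 0 and a[c] != color:
                    continue
                nxt = min(d + 1, 2)
                ndp[nxt] += dp[u][d] * dp[c][now]
        dp[u] = ndp

answers = {}
for color in {1, 2}:
    dfs(1, -1, color)
    ans = 0
    for u in range(1, n + 1):
        if a[u] == color:
            ans += dp[u][0] + dp[u][1]
        ans += dp[u][2]
    answers[color] = ans

print(answers)  # {1: 3, 2: 1}
```

For color 1 the valid subgraphs are $\{1\}$, $\{3\}$, and $\{1,2,3\}$ (its degree-1 vertices 1 and 3 both have color 1); for color 2, only $\{2\}$.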

The time complexity for a single color is $\mathcal{O}(n)$.

You can find the full code here

This tutorial introduces several new ideas:

- How to extend a tree by incrementally adding children.
- How to compress multiple states into a single state.
- Why and how to maintain DP values that are invalid according to the problem description.
- How to perform DP on highest vertices for an unrooted tree.
- How to avoid polluting DP values of the current level by maintaining an auxiliary DP array for the next level.

If you are looking for a problem to test your understanding of the first 3 concepts, you can attempt ABC248F: Keep Connect. Please do not look at its editorial; I can assure you that you’ll be able to come up with the solution on your own with a little bit of help. Ask me for hints on any of these platforms.

If you’re interested, you can check out my YouTube channel.

I’m sure there are alternate (possibly easier) versions of the $\mathcal{O}(n)$ DP for each color. I’d love to hear your approach on the Codeforces blog.