Video editorial script: Sum of Fractions
A storytelling walkthrough from $O(N^4)$ down to $O(N \log N)$.
Show the diagram: a fraction $\frac{x}{y}$ at the top, two curved arrows branching to $\frac{x+1}{y}$ on the left (green) and $\frac{x}{y-1}$ on the right (red), with operation labels and the “no simplification” example $\frac{3}{7} \to \frac{3}{6} \neq \frac{1}{2}$.
Say:
“Let’s start with something simple. You have a fraction $\frac{x}{y}$. The problem gives you exactly two moves — think of them as two doors. Door one: add 1 to the numerator. Door two: subtract 1 from the denominator — but only if the denominator is bigger than 1. And here’s the catch: we never simplify. $\frac{3}{6}$ stays $\frac{3}{6}$, not $\frac{1}{2}$. These are the only two tools in our toolbox.”
Show the diagram: an Axes plot with two curves — the blue curve $\frac{1}{d}$ (gain from numerator increase) and the green curve $\frac{1}{d(d-1)}$ (gain from denominator decrease) — with a legend box. The green curve is higher for small $d$.
Say:
“So which door should we pick? Look at these two curves. The blue line is the gain from adding 1 to the numerator. The green line is the gain from decreasing the denominator by 1. For small denominators the green line towers over the blue one. That tells us: always prefer decreasing the denominator. It’s not even close when the denominator is small. This is the first key insight — denominator decrease is the strictly better move.”
Show the diagram: array $[2, 3, 5, 2, 3]$ with index labels above each cell. Subarray $a[1..3]$ is highlighted in teal with a brace labeled $a[1\ldots 3]$. Below: the fraction sum $\frac{1}{3} + \frac{1}{5} + \frac{1}{2}$ with the $\frac{1}{2}$ term colored yellow. A badge reads $\min = 2$.
Say:
“Now let’s connect this to the actual problem. Take any subarray — say indices 1 through 3 here. We turn each element into a fraction: $\frac{1}{3}$, $\frac{1}{5}$, $\frac{1}{2}$. Notice which fraction has the smallest denominator — the $\frac{1}{2}$. That’s the one highlighted in yellow. And 2 is the minimum of this subarray. Remember that: the minimum element is the star of the show.”
Show the diagram: three fractions in circles ($\frac{1}{3}$, $\frac{1}{2}$, $\frac{1}{5}$), the middle one glowing yellow. Five green $+$ tokens above with arrows all flowing into that one fraction. Below: the chain $\frac{1}{2} \to \frac{1}{1} \to \frac{2}{1} \to \frac{3}{1} \to \cdots$. A label reads $k = 5$ moves.
Say:
“Here’s the greedy in action. We have $k = 5$ moves — those are the five green tokens. Where do we spend them? All on the fraction with the smallest denominator. We decrease its denominator from 2 to 1. Now it’s $\frac{1}{1}$. We’ve used one move. We can’t decrease the denominator any further — it’s already 1. So the remaining four moves all go to the numerator: $\frac{2}{1}$, $\frac{3}{1}$, $\frac{4}{1}$, $\frac{5}{1}$. All $k$ moves, one fraction. That’s optimal.”
Show the diagram: a diamond labeled $k < mn?$ at the top. Left branch (checkmark, teal) leads to the box $\frac{k+1}{mn}$. Right branch (cross, maroon) leads to $k - mn + 2$. Below: the full formula $\mathrm{MSF} = \sum \frac{1}{a_i} - \frac{1}{mn} + \text{extra}(k)$ with dashed lines connecting the two branches into the “extra” term.
Say:
“That greedy gives us a clean closed form. The diamond at the top is the question: is $k$ smaller than the minimum? If yes, we take the left branch — the extra is $\frac{k+1}{mn}$. If no — we had enough moves to eliminate the denominator entirely — the extra is $k - mn + 2$. And the full formula is: the base sum of all fractions, minus $\frac{1}{mn}$ to avoid double-counting, plus whichever branch applies. That’s $\mathrm{MSF}$ — maximum sum of fractions — for one subarray.”
Show the diagram: two nested rounded rectangles labeled $\ell$ and $r$. Inside: two inner boxes — a red one labeled $\min$ with $O(n)$, a green one labeled $\sum \frac{1}{a}$ with $O(n)$. A badge in the corner reads $O(N^4)$.
Say:
“Now the hard part. We need this formula for every subarray, summed up. The naive way: loop over all $\ell$, loop over all $r$, and inside each pair, scan the subarray to find the minimum and the sum of inverses. Two $O(n)$ scans nested inside two $O(n)$ loops — that’s $O(n^4)$ per query. With large $n$ and many queries, this is hopeless. But it’s the starting point. Let’s optimize.”
Show the diagram: array at top. Below it, a row of rounded boxes showing $\frac{1}{a_i}$ values, connected by curved arrows (cumulative flow). A brace under a sub-range, and below that: $\mathrm{pref}[r] - \mathrm{pref}[\ell - 1] = O(1)$ in a yellow badge.
Say:
“First optimization: precompute prefix sums of the inverses. Each arrow here adds the next $\frac{1}{a_i}$ to the running total. Now the sum over any range is just one subtraction: $\mathrm{pref}[r] - \mathrm{pref}[\ell - 1]$. Constant time. We just eliminated one entire $O(n)$ pass from the inner loop.”
Show the diagram: a full binary segment tree with 8 leaves, each node a labeled dot showing the actual minimum value. Queried leaves are yellow; internal ancestors are teal. A brace under the array reads $[l, r]$. Badge: $O(\log n)$.
Say:
“The other $O(n)$ scan was finding the minimum. A segment tree handles this: we build it once in $O(n)$, and each range-min query costs $O(\log n)$. Look at the highlighted path — the yellow leaves are our query range, and the teal nodes are the ancestors that contribute to the answer. So now each subarray costs $O(\log n)$ instead of $O(n)$. Combined with prefix sums, we’re at $O(n^2 \log n)$ per query — but across all loops, that’s roughly $O(n^3 \log n)$.”
Show the diagram: 8-element array with index labels. Two labeled overlapping blocks ($\mathrm{st}[\ell][j]$ in blue, $\mathrm{st}[r - 2^j + 1][j]$ in green) hovering above. A yellow overlap region. The formula $\min[\ell..r] = \min(\mathrm{st}[\ell][j],, \mathrm{st}[r - 2^j + 1][j])$ at top. Badge: $O(1)$.
Say:
“Can we do even better? Yes — with a sparse table. We precompute minima over blocks of power-of-two lengths. Any range $[\ell, r]$ can be covered by two overlapping blocks. We just take the min of those two precomputed values — and that’s $O(1)$. The build costs $O(n \log n)$ but each query is constant. Now the inner loop is truly $O(1)$ per subarray, so we’re at $O(n^2)$ per query — $O(n^3)$ total. We shaved the log off.”
Show the diagram: array $[2, 3, 5, 2, 3]$ with indices. Seven colored arcs above (different $[l,r]$ ranges). All converge toward a star at index 3 (value 2). The formula $\sum_{l \le r} \mathrm{MSF} = \sum_p \mathrm{cnt}_p \cdot f(a_p, k)$.
Say:
“Here’s the big idea. So far we’ve been looping over all $n^2$ subarrays. But look at this picture. All these different subarrays — all those colored arcs — they all share the same minimum position: index 3, value 2. So instead of iterating over subarrays, we flip the perspective. We iterate over positions. For each position $p$, we ask: how many subarrays have their minimum at $p$? Call that $\mathrm{cnt}_p$. Then we multiply by the per-subarray contribution. One sum over $p$ instead of a double sum over $(\ell, r)$. That’s the paradigm shift.”
Show the diagram: array $[4, 2, 5, 2, 3]$. The first “2” (index 1) has a green checkmark and a green border. The second “2” (index 3) has a red cross and a red border. Brace: $a[1] = a[3] = 2$. Rule: “tie $\Rightarrow$ smaller index wins.”
Say:
“One subtlety: what if two positions have the same value? Look — indices 1 and 3 both have value 2. Which one ‘owns’ a subarray that contains both? We need a tie-breaking rule so every subarray gets assigned to exactly one minimum position. The rule: smaller index wins. Index 1 gets the checkmark; index 3 gets the cross. Simple, but critical for correctness.”
Show the diagram: array $[5, 3, 4, 2, 6, 7]$ with index labels. A dot at position $p = 3$. Dashed red boundary lines at $L$ and $R$. Two double arrows below: teal for $p - L$ and maroon for $R - p$. Formula badge: $\mathrm{cnt}_p = (p - L) \times (R - p)$.
Say:
“For each position $p$, we need to find how far its ‘reign’ extends. Where does a strictly smaller element appear? The leftmost such boundary is $L$; the rightmost is $R$. These are shown by the red dashed lines. Any subarray starting between $L$ and $p$ and ending between $p$ and $R$ will have its minimum at $p$. The count is the product of these two ranges — $(p - L) \times (R - p)$. That’s $\mathrm{cnt}_p$.”
Show the diagram: a 5x5 grid with $\ell$ on the vertical axis and $r$ on the horizontal. Bullet dots mark valid $(l, r)$ pairs (where $r \ge l$). A yellow rectangle highlights the region where min is at $p$. Formula badge: $\mathrm{cnt}_p = (p - L)(R - p)$.
Say:
“Another way to see it: imagine a grid where rows are $\ell$ and columns are $r$. Each cell is one subarray. The yellow rectangle is exactly the subarrays whose minimum is at $p$. The width of that rectangle is $R - p$, the height is $p - L$. So the number of cells — the number of subarrays with min at $p$ — is width times height. Rectangle area. Beautiful.”
Show the diagram: three rounded boxes stacked vertically. Blue box: $\sum_{x=l}^{r} \frac{1}{a_x}$ (labeled “base”). Green box: $\frac{k+1}{mn}$ or $k - mn + 2$ (labeled “extra(k)”). Red box: $-\frac{1}{mn}$ (labeled “correction”). A $+$ between blue and green; a $-$ between green and red.
Say:
“Each subarray’s contribution has three pieces. The blue box is the base — just the sum of all the fractions. The green box is the extra from using our $k$ moves — which formula depends on whether $k$ is less than the minimum. And the red box is the correction: we subtract $\frac{1}{mn}$ once so we don’t double-count the minimum’s fraction. Base, plus extra, minus correction. That’s the full contribution, and we multiply it by $\mathrm{cnt}_p$.”
Show the diagram: a number line with labeled dots $k_1, k_2, \ldots, k_6$. A yellow dashed vertical line at $mn$. To the left: a blue shaded zone with $\frac{k+1}{mn}$. To the right: a green shaded zone with $k - mn + 2$. An arrow labeled “lower_bound” points down from the threshold.
Say:
“We have many queries $k_1 \le k_2 \le \cdots \le k_m$ — they’re sorted. For each position $p$ with minimum value $mn$, the queries split into two zones. To the left of $mn$: we use one formula. To the right: the other. The dividing line is found by binary search — lower_bound — in $O(\log m)$. Now the question is: how do we efficiently add a value to a whole range of query answers without looping through them one by one?”
Show the diagram: two rows of rounded boxes. The top row is the “diff array” — all zeros except
+v(green) at position $\ell$ and-v(red) at position $r + 1$. A yellow arrow labeled “prefix sum” points down to the bottom row, the “actual” values — all $v$ in the range $[\ell, r]$ and zero outside. A formula badge at the bottom.
Say:
“This is the key technique: the difference array. We want to add $v$ to every position in a range. Instead of touching each cell, we just mark $+v$ at the start and $-v$ after the end. That’s two operations, constant time. When we need the actual values, one prefix sum recovers them. So for each position $p$, we do one binary search and a constant number of range adds. $n$ positions, each $O(\log n)$ — the whole thing is $O(n \log n)$.”
Show the diagram: three labeled rows — $\mathrm{fixed}[i]$ (blue), $\mathrm{var}[i]$ (teal), $\mathrm{ans}[i]$ (yellow). Arrows from the top two rows converge into the bottom row. Formula: $\mathrm{ans}[i] = \mathrm{fixed}[i] + \mathrm{var}[i] \times k_i$.
Say:
“We maintain two difference arrays: fixed and variable. After processing all positions and running one prefix sum on each, the answer for query $i$ is simply $\mathrm{fixed}[i] + \mathrm{variable}[i] \times k_i$. One addition, one multiplication, per query. The two streams — the constant part and the $k$-dependent part — merge into the final answer. That’s the power of decomposing the contribution into a linear function of $k$.”
Show the diagram: top — ascending columns of different heights (bar chart) representing elements sorted by value. Bottom — a number line showing the “alive” set. A yellow dot at the current index $p$, red dots at its neighbors $L$ and $R$, with curved arrows connecting them. Label: $\text{alive} = {0, 3}$.
Say:
“The last piece: how do we find $L$ and $R$ for each position? We process indices in increasing order of $a_i$ — that’s the ascending bars at the top. We maintain an ordered set of indices we’ve already processed — the ‘alive’ set. When we insert a new index $p$ (the yellow dot), its left and right neighbors in the alive set are exactly $L$ and $R$. An ordered set gives us predecessor and successor in $O(\log n)$. So the entire boundary computation for all $n$ positions costs $O(n \log n)$.”
Show the diagram: five rounded boxes in a vertical flowchart, each with a formula inside. Box 1: “Sort by $a_i$”. Box 2: “alive set $\to L, R, \mathrm{cnt}$”. Box 3: “range-add fixed, var”. Box 4: “base: $\frac{1}{a_i}(i+1)(n-i)$”. Box 5: “prefix sum $\to \mathrm{ans}[i]$”. Arrows between each box. Badge: $O(n \log n + m)$.
Say:
“And here’s the full pipeline, top to bottom. Step 1: sort indices by value. Step 2: for each index, use the alive set to find boundaries and count. Step 3: range-add into our two difference arrays, splitting queries at the threshold. Step 4: add the base contribution — each $\frac{1}{a_i}$ appears in $(i+1)(n-i)$ subarrays, so we add that once. Step 5: prefix sum both arrays, then compute $\mathrm{ans}[i] = \mathrm{fixed}[i] + \mathrm{variable}[i] \times k_i$. Total: $O(n \log n + m)$. Done.”
Show the diagram: six horizontal bars of decreasing width (red to blue), each labeled with a complexity ($O(N^4)$ down to $O(N \log N)$). Arrows between them with optimization labels: “prefix sums”, “sparse table”, “regroup by min”, “diff arrays”, “diff arrays $O(1)$”. The final bar glows yellow.
Say:
“Let’s zoom out and appreciate the journey. We started with $O(N^4)$ — the biggest, reddest bar. Prefix sums shrank it. A sparse table shrank it again. Then the paradigm shift — regrouping by minimum — gave us a massive reduction. And finally, offline queries with difference arrays brought us all the way to $O(N \log N)$, the glowing bar at the bottom. Five optimizations, each removing a factor. That’s the art of competitive programming: not one magic trick, but a sequence of insights, each building on the last.”
| Scene | Section | What’s on screen |
|---|---|---|
| S01 | Act I, Scene 1 | Fraction $\to$ two branches with math labels and example |
| S02 | Act I, Scene 2 | Axes with two gain curves (blue vs green) |
| S03 | Act I, Scene 3 | Array strip, subarray brace, fraction sum, min badge |
| S04 | Act I, Scene 4 | Fractions in circles, green tokens, greedy chain |
| S05 | Act I, Scene 5 | Diamond flowchart, two cases, full MSF formula |
| S06 | Act II, Scene 6 | Nested rounded rectangles, min/sum boxes, $O(N^4)$ badge |
| S07 | Act III, Scene 7 | Prefix row with curved arrows, brace, $O(1)$ badge |
| S08 | Act III, Scene 8 | Full labeled segment tree, query path, $O(\log n)$ badge |
| S09 | Act III, Scene 9 | Overlapping sparse table blocks, formula, $O(1)$ badge |
| S10 | Act IV, Scene 10 | Colored arcs converging to min star, regrouping formula |
| S11 | Act IV, Scene 11 | Equal values, checkmark/cross, tie-break rule |
| S12 | Act IV, Scene 12 | $L$, $p$, $R$ with double arrows, cnt formula badge |
| S13 | Act IV, Scene 13 | $(l,r)$ grid, yellow rectangle, cnt formula |
| S14 | Act IV, Scene 14 | Three stacked formula boxes: base + extra - correction |
| S15 | Act V, Scene 15 | Number line, threshold, two shaded formula zones |
| S16 | Act V, Scene 16 | Diff array row and actual row, prefix sum arrow |
| S17 | Act V, Scene 17 | fixed + variable rows merging into answer row |
| S18 | Act VI, Scene 18 | Ascending bars, alive set, L/R curved arrows |
| S19 | Act VI, Scene 19 | 5-step pipeline flowchart, $O(n \log n + m)$ badge |
| S20 | Act VI, Scene 20 | Shrinking complexity bars with optimization labels |