Hints: A Simple Problem
Answer to Hint 2: Suppose the optimal array has a position $i$ where $c_i = m_i$ (min) but $c_{i+1} = M_{i+1}$ (max). Consider flipping these to $c_i = M_i$ and $c_{i+1} = m_{i+1}$. Since $M_i \ge m_i$, position $i$ now has a larger (or equal) value, creating at least as many inversions with later positions. Since $m_{i+1} \le M_{i+1}$, position $i{+}1$ now has a smaller (or equal) value, creating at least as many inversions with earlier positions. So any “min-then-max” transition can be flipped without losing inversions. Repeating this shows the optimum has the form $[M_1, \ldots, M_k, m_{k+1}, \ldots, m_{\ell}]$ for some $k$.
Now decompose the inversion count for a given switch point $k$.
Answer to Hint 3: For a subarray of length $\ell$ with switch at $k$:
- Within the max part (positions $1 \ldots k$): $M$ is non-decreasing, so 0 inversions.
- Within the min part (positions $k{+}1 \ldots \ell$): $m$ is non-increasing, so inversions come from positions where $m$ strictly decreases (new prefix minima).
- Cross inversions (pair $(i,j)$ with $i \le k < j$): we need $M_i > m_j$.
Since $b$ is a subarray of a permutation (all distinct), $M_i \ge b_1$ and $m_j \le b_1$. When are they strictly unequal?
Answer to Hint 4: For $i \ge 2$: the prefix of length $\ge 2$ from a permutation contains at least two distinct values, so either $M_i > b_1$ or some element less than $b_1$ appeared, making $m_j < b_1 \le M_i$. In either case $M_i > m_j$. So all $(k{-}1)(\ell{-}k)$ cross pairs with $i \ge 2$ are inversions.
For $i = 1$: $M_1 = b_1$ and $m_j = b_1$ exactly when no element smaller than $b_1$ has appeared in $b_1, \ldots, b_j$. Let $p$ be the first position in the subarray where a new minimum below $b_1$ appears. Then $M_1 > m_j$ only for $j \ge p$, giving $\max(0, \ell - \max(k, p{-}1))$ inversions from $i=1$.
What determines this position $p$?
Answer to Hint 5: The position $p$ is the next smaller element after the start of the subarray — the first position where $b_p < b_1$. This is a classic monotone-stack quantity.
Before tackling the general case, consider the base case: $b_1$ is the minimum of the entire subarray (so $p > \ell$, no element is smaller). Then:
- Within-min inversions: $m_j = b_1$ for all $j$, so 0.
- Cross inversions: from $i = 1$, none (since $M_1 = b_1 = m_j$ for all $j$). From $i \ge 2$: $(k{-}1)(\ell{-}k)$.
The optimum is $\max_k (k{-}1)(\ell{-}k) = \left\lfloor (\ell{-}1)^2 / 4 \right\rfloor$. Verify this on the examples!
Answer to Hint 6: For $[1,2,3,4]$: $b_1 = 1$ is the minimum, $\ell = 4$, answer $= \lfloor 9/4 \rfloor = 2$. ✓
Now generalize. When $b_1$ is not the minimum of the whole subarray, the first new minimum at position $p$ (next smaller element after $b_1$) splits the problem: positions $1 \ldots p{-}1$ see $m = b_1$ (constant), while from position $p$ onward the min drops. How does this create a “chain” of segments?
Answer to Hint 7: From position $p$, the element $b_p < b_1$ becomes the new running minimum. Then the next smaller element after $b_p$ determines when the min drops again, and so on. This creates a chain: $b_1 \to b_p \to b_{p’} \to \cdots$, where each step goes to the next position holding a new prefix minimum.
This chain is exactly a path in the “next smaller element” tree: for each position $i$, define $\text{nxt}[i]$ as the first position $j > i$ with $b_j < b_i$. Connect $i \to \text{nxt}[i]$. This forms a forest (a Cartesian-tree-like structure rooted at a sentinel). How does the answer decompose along this chain?
Answer to Hint 8: For a query $(l, r)$, trace the chain from $l$ upward through the next-smaller-element tree: $l \to q_1 \to q_2 \to \cdots \to q_t$ where $q_t$ is the last node in the chain with $q_t \le r$ (found via binary lifting). Each segment $[q_i, q_{i+1}{-}1]$ contributes to the answer:
- The segment before $q_1$ (positions $l$ to $q_1{-}1$) behaves like the base case, where $b_l$ is the local minimum.
- Each subsequent segment contributes additional inversions as the min value drops again.
- The final partial segment from $q_t$ to $r$ needs special handling (it may not reach the next smaller element).
The contribution of each segment is a quadratic function of the query’s left endpoint $l$. Why quadratic?
Answer to Hint 9: Within each segment, the min value is constant (say $m$) and the max values grow. The cross inversions contributed by a segment of length $d$ involve terms like $(k{-}1) \cdot d$ and the within-min inversions depend on how many positions have $m$ values equal vs. strictly less. Both are polynomial (degree $\le 2$) in $l$ when the chain structure is fixed.
Because the contributions are quadratic polynomials in $l$, you can represent them compactly and sum them using a Fenwick tree that stores polynomial coefficients (e.g., a struct with fields for the constant, linear, and quadratic terms, plus a parity correction). How do you organize the computation?
Answer to Hint 10: Process queries offline, sorted by left endpoint $l$, sweeping from right to left. For each position $i$:
- Precompute $\text{nxt}[i]$ (monotone stack) and build the next-smaller-element tree.
- Use binary lifting (sparse table on the tree) to jump from $l$ up the chain to the last ancestor $\le r$.
- Accumulate segment contributions in a Fenwick tree indexed by Euler-tour positions of the next-smaller tree. Each node $i$ stores the polynomial coefficients for its segment’s contribution.
- A range query on the Fenwick tree (from $l$’s position to the ancestor’s position) gives the sum of all segment contributions.
The final segment (from the last ancestor to $r$) is handled separately with the base-case formula $\lfloor x^2 / 4 \rfloor$.
Answer to Hint 11: Putting it all together:
- Build $\text{nxt}[i]$ via a monotone stack. Construct the tree and compute Euler tour + binary lifting in $O(n \log n)$.
- Sort queries by $l$. Sweep $l$ from $n$ down to $1$. As each node activates, insert its polynomial contribution into the Fenwick tree.
- For each query $(l, r)$: use binary lifting to find the deepest ancestor $q_t \le r$. Query the Fenwick tree for the sum of polynomial contributions from $l$ to $q_t$, evaluate at $l$. Add the partial-segment contribution for $[q_t, r]$.
- Each query is answered in $O(\log^2 n)$ (binary lifting + Fenwick query). Total complexity: $O((n + q) \log n)$ after $O(n \log n)$ preprocessing.