📖 Chapter 3.2 ⏱️ About 70 minutes 🎯 Intermediate

Chapter 3.2: Arrays and Prefix Sums

📝 Prerequisites: Make sure you are comfortable with arrays, vectors, and basic loops (Chapters 2.2–2.3). You should also understand long long overflow (Chapter 2.1).

Imagine you have an array of N numbers, and someone asks you 100,000 times: "What is the sum of the elements from index L to index R?" The naive approach recomputes the sum every time — O(N) per query, for a total of O(N × Q). When N = Q = 10^5, that means 10^10 operations, which is far too slow.

Prefix sums solve this problem with O(N) preprocessing and O(1) per query. This is one of the most elegant and practical techniques in competitive programming.

💡 Key idea: Prefix sums turn a "range query" problem into a single subtraction. Instead of summing from L to R every time, you precompute cumulative sums and subtract two values. This replaces repeated O(Q) work with one-time O(N) preprocessing.

3.2.1 The Prefix Sum Idea

The prefix sum of an array is a new array where each element stores the cumulative sum up to the current index.

Visual: Prefix Sum Array

Prefix Sum Visualization

The figure above shows how the prefix sum array is built from the original array, and how sum(L, R) = P[R] - P[L-1] computes a range sum in O(1) time. The blue cells mark the query range, while the red and green cells show the two prefix values being subtracted.

Given array: A = [3, 1, 4, 1, 5, 9, 2, 6] (using 1-indexing for explanation)

Index:  1  2  3  4  5  6  7  8
A:      3  1  4  1  5  9  2  6
P:      3  4  8  9  14 23 25 31

Here P[i] = A[1] + A[2] + ... + A[i].

Why use 1-indexing?

Using 1-indexed arrays lets us define P[0] = 0 (the "empty prefix" has sum zero). This makes the query formula P[R] - P[L-1] work even when L = 1: we compute P[R] - P[0] = P[R], which is correct.

Building the prefix sum array

📄 View code: Building the prefix sum array

// Build prefix sum array — O(N)
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n;
    cin >> n;

    // Step 1: read input (1-indexed)
    vector<int> A(n + 1);
    for (int i = 1; i <= n; i++) cin >> A[i];

    // Step 2: build prefix sums
    vector<long long> P(n + 1, 0);  // P[0] = 0 (base case)
    for (int i = 1; i <= n; i++) {
        P[i] = P[i - 1] + A[i];   // ← key line: each P[i] = sum of all elements up to i
    }

    return 0;
}

Complexity analysis:

Time: O(N) — one pass over the array
Space: O(N) — stores the prefix array

Step-by-step trace for A = [3, 1, 4, 1, 5]:

i=1: P[1] = P[0] + A[1] = 0 + 3 = 3
i=2: P[2] = P[1] + A[2] = 3 + 1 = 4
i=3: P[3] = P[2] + A[3] = 4 + 4 = 8
i=4: P[4] = P[3] + A[4] = 8 + 1 = 9
i=5: P[5] = P[4] + A[5] = 9 + 5 = 14

3.2.2 `O(1)` Range Sum Queries

Once we have the prefix sum array, the sum from index L to R is:

sum(L, R) = P[R] - P[L-1]

Why? P[R] is the sum of elements 1..R. P[L-1] is the sum of elements 1..(L-1). Their difference is exactly the sum of elements L..R.

💡 Key idea: Think of P[i] as "the total sum of the first i elements." To get the sum of the window [L, R], subtract "the prefix before L" from "the prefix up to R." It is like: big triangle minus small triangle = trapezoid.

📄 Complete C++ code

// Range sum queries — preprocessing O(N), each query O(1)
#include <bits/stdc++.h>
using namespace std;

const int MAXN = 100001;
long long A[MAXN];
long long P[MAXN];

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n, q;
    cin >> n >> q;

    // Step 1: read the array
    for (int i = 1; i <= n; i++) cin >> A[i];

    // Step 2: build prefix sums — O(n)
    P[0] = 0;
    for (int i = 1; i <= n; i++) {
        P[i] = P[i - 1] + A[i];
    }

    // Step 3: answer q range sum queries — each O(1)
    for (int i = 0; i < q; i++) {
        int l, r;
        cin >> l >> r;
        cout << P[r] - P[l - 1] << "\n";  // ← key line: range sum formula
    }

    return 0;
}

Sample input:

8 3
3 1 4 1 5 9 2 6
1 4
3 7
2 6

Sample output:

9
21
20

Verification:

sum(1,4) = P[4] - P[0] = 9 - 0 = 9 → A[1]+A[2]+A[3]+A[4] = 3+1+4+1 = 9 ✓
sum(3,7) = P[7] - P[2] = 25 - 4 = 21 → A[3]+...+A[7] = 4+1+5+9+2 = 21 ✓
sum(2,6) = P[6] - P[1] = 23 - 3 = 20 → A[2]+...+A[6] = 1+4+1+5+9 = 20 ✓

⚠️ Common mistake: Writing P[R] - P[L] instead of P[R] - P[L-1]. The formula includes both endpoints L and R — you need to subtract the sum before L, not the sum at L.

Total complexity: O(N + Q) — easily fast enough for N and Q up to 10^5.

3.2.3 USACO Example: Breed Counting

This is a classic USACO Bronze problem (December 2015).

Problem: N cows stand in a line. Each cow has breed 1, 2, or 3. Answer Q queries: how many cows of breed B are in positions L through R?

Solution: Maintain one prefix sum array for each breed.

📄 Complete C++ code

// Multi-breed prefix sums — O(N + Q)
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n, q;
    cin >> n >> q;

    vector<int> breed(n + 1);
    vector<vector<long long>> P(4, vector<long long>(n + 1, 0));
    // P[b][i] = number of cows of breed b in positions 1..i

    // Step 1: build prefix sums for each breed
    for (int i = 1; i <= n; i++) {
        cin >> breed[i];
        for (int b = 1; b <= 3; b++) {
            P[b][i] = P[b][i - 1] + (breed[i] == b ? 1 : 0);  // ← key line
        }
    }

    // Step 2: answer each query in O(1)
    for (int i = 0; i < q; i++) {
        int l, r, b;
        cin >> l >> r >> b;
        cout << P[b][r] - P[b][l - 1] << "\n";
    }

    return 0;
}

🏆 USACO tip: Many USACO Bronze problems ask you to "count elements satisfying property X inside a range." If Q is large, always consider prefix sums.

3.2.4 USACO-Style Walkthrough: FJ's Grass Fields

🔗 Related problems: This fictional USACO-style problem is inspired by "Breed Counting" and "Tallest Cow" — both classic Bronze problems.

Problem: FJ has N consecutive fields. The i-th field has grass[i] units of grass. He needs to answer Q queries: "What is the total amount of grass from field L to field R, inclusive?" N and Q can be as large as 10^5, and each query must be answered in O(1).

Sample input:

6 4
4 2 7 1 8 3
1 3
2 5
4 6
1 6

Sample output:

Step-by-step solution:

Step 1: Understand the problem. We have the array [4, 2, 7, 1, 8, 3], and we need range sums.

Step 2: Build the prefix sum array.

Index:  0  1  2  3  4  5  6
Grass:  -  4  2  7  1  8  3
P:      0  4  6  13 14 22 25

Step 3: Answer queries with P[R] - P[L-1]:

Query (1,3): P[3] - P[0] = 13 - 0 = 13 ✓
Query (2,5): P[5] - P[1] = 22 - 4 = 18 ✓
Query (4,6): P[6] - P[3] = 25 - 13 = 12 ✓
Query (1,6): P[6] - P[0] = 25 - 0 = 25 ✓

Complete C++ solution:

📄 Complete C++ code

// FJ's Grass Fields — prefix sum solution O(N + Q)
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n, q;
    cin >> n >> q;

    // Step 1: read grass amounts and build prefix sums at the same time
    vector<long long> P(n + 1, 0);
    for (int i = 1; i <= n; i++) {
        long long g;
        cin >> g;
        P[i] = P[i - 1] + g;   // ← key line: incremental prefix sum
    }

    // Step 2: answer each query in O(1)
    while (q--) {
        int l, r;
        cin >> l >> r;
        cout << P[r] - P[l - 1] << "\n";
    }

    return 0;
}

Why is this O(N + Q)?

Build prefix sums: one loop with N iterations → O(N)
Each query: one subtraction → O(1) per query, O(Q) total
Total: O(N + Q) — far better than brute force O(NQ)

⚠️ Common mistake: Using int instead of long long for prefix sums. If each grass amount can be up to 10^9 and N = 10^5, the total can reach 10^14 — far beyond the range of int (about 2×10^9).

3.2.5 2D Prefix Sums

For a 2D grid, prefix sums can be extended to answer rectangular range queries in O(1) time.

Given an R×C grid, define P[r][c] as the sum of all elements in the rectangle from (1,1) to (r,c).

Building a 2D prefix sum

P[r][c] = A[r][c] + P[r-1][c] + P[r][c-1] - P[r-1][c-1]

The subtraction removes overlap; otherwise the top-left rectangle would be counted twice.

💡 Key idea (inclusion-exclusion): Imagine four rectangles:

P[r-1][c] = the rectangle above

P[r][c-1] = the rectangle on the left

P[r-1][c-1] = the top-left corner (counted in both of the previous two, so subtract it once)

A[r][c] = the single new cell

Step-by-step 2D prefix sum example

Trace a 4×4 grid:

Original grid A:

     c=1  c=2  c=3  c=4
r=1:  1    2    3    4
r=2:  5    6    7    8
r=3:  9   10   11   12
r=4: 13   14   15   16

Build P step by step (left to right, top to bottom):

📄 Code trace

P[1][1] = A[1][1] = 1
P[1][2] = 2 + 0 + 1 - 0 = 3
P[1][3] = 3 + 0 + 3 - 0 = 6
P[1][4] = 4 + 0 + 6 - 0 = 10
P[2][1] = 5 + 1 + 0 - 0 = 6
P[2][2] = 6 + 3 + 6 - 1 = 14
P[2][3] = 7 + 6 + 14 - 3 = 24
P[2][4] = 8 + 10 + 24 - 6 = 36
P[3][1] = 9 + 6 + 0 - 0 = 15
P[3][2] = 10 + 14 + 15 - 6 = 33
P[3][3] = 11 + 24 + 33 - 14 = 54
P[3][4] = 12 + 36 + 54 - 24 = 78
P[4][1] = 13 + 15 + 0 - 0 = 28
P[4][2] = 14 + 33 + 28 - 15 = 60
P[4][3] = 15 + 54 + 60 - 33 = 96
P[4][4] = 16 + 78 + 96 - 54 = 136

Prefix sum grid P:

     c=1  c=2  c=3  c=4
r=1:  1    3    6   10
r=2:  6   14   24   36
r=3: 15   33   54   78
r=4: 28   60   96  136

Query: sum of subgrid (r1=2, c1=2) to (r2=3, c2=3):

ans = P[3][3] - P[1][3] - P[3][1] + P[1][1]
    = 54     -  6     -  15     +  1
    = 34

Verification: A[2][2]+A[2][3]+A[3][2]+A[3][3] = 6+7+10+11 = 34 ✓

Inclusion-exclusion visualization:

2D Prefix Sum Inclusion-Exclusion

📄 Complete C++ code

// 2D prefix sums — build O(R×C), query O(1)
#include <bits/stdc++.h>
using namespace std;

const int MAXR = 1001, MAXC = 1001;
int A[MAXR][MAXC];
long long P[MAXR][MAXC];

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int R, C;
    cin >> R >> C;

    for (int r = 1; r <= R; r++)
        for (int c = 1; c <= C; c++)
            cin >> A[r][c];

    // Step 1: build 2D prefix sums — O(R × C)
    for (int r = 1; r <= R; r++) {
        for (int c = 1; c <= C; c++) {
            P[r][c] = A[r][c]
                    + P[r-1][c]    // rectangle above
                    + P[r][c-1]    // rectangle on the left
                    - P[r-1][c-1]; // ← key line: remove overlap (counted twice)
        }
    }

    // Step 2: answer each query in O(1)
    int q;
    cin >> q;
    while (q--) {
        int r1, c1, r2, c2;
        cin >> r1 >> c1 >> r2 >> c2;
        long long ans = P[r2][c2]
                      - P[r1-1][c2]    // subtract top strip
                      - P[r2][c1-1]    // subtract left strip
                      + P[r1-1][c1-1]; // add back top-left corner
        cout << ans << "\n";
    }

    return 0;
}

Complexity analysis:

Build time: O(R × C)
Query time: O(1) each
Space: O(R × C)

⚠️ Common mistake: Forgetting to add back P[r1-1][c1-1] in the query formula. The top strip and left strip both contain the top-left corner, so it gets subtracted twice and must be added back once.

3.2.6 Difference Arrays

Earlier, we learned prefix sums: they are good at solving "many range-sum queries."

Now consider a problem that looks like the opposite:

You have an array of length N, initially all zeros. Then there are many operations, each adding a value V to the entire interval [L, R]. Finally, output the whole array.

If we simulate directly, each operation loops from L to R:

for (int i = L; i <= R; i++) {
    A[i] += V;
}

This is correct, but if both N and the number of operations M are 100000, the worst case is 10^10 additions — too slow.

The core problem solved by difference arrays is:

How can we express "add V to the whole interval" without actually modifying every element in the interval?

Start with a real-life intuition

Imagine a straight track with 8 positions:

Position: 1  2  3  4  5  6  7  8
Value:    0  0  0  0  0  0  0  0

Now we want to add 5 to all positions in [3, 6].

The naive method writes +5 on positions 3, 4, 5, and 6.

A difference array does not do that. It records only two facts:

Starting from position 3, +5 becomes active
Starting from position 7, +5 stops being active

That is:

diff[3] += 5;
diff[7] -= 5;

Why is that enough? Because at the end we scan from left to right and accumulate these "change markers."

diff:     0  0 +5  0  0  0 -5  0
running:  0  0  5  5  5  5  0  0
position: 1  2  3  4  5  6  7  8

Positions 3 through 6 become exactly 5, and position 7 returns to 0.

This is the entire intuition of difference arrays:

A range update does not directly modify the range. Instead, put a "start changing" marker at the range start, and put a "stop changing" marker after the range end.

Difference Array Range Update

What exactly does a difference array store?

If the original array is:

A:      2   5   5   9   7

Then the difference array diff stores "how much the current position changed compared with the previous position":

diff[1] = A[1] - 0    = 2
diff[2] = A[2] - A[1] = 3
diff[3] = A[3] - A[2] = 0
diff[4] = A[4] - A[3] = 4
diff[5] = A[5] - A[4] = -2

So:

A:      2   5   5   9   7
diff:   2   3   0   4  -2

Taking the prefix sum of diff restores A:

2
2 + 3 = 5
2 + 3 + 0 = 5
2 + 3 + 0 + 4 = 9
2 + 3 + 0 + 4 - 2 = 7

Therefore, you can understand difference arrays as:

diff[i] records "how much the value changes starting from position i."

Why does range addition only need two positions?

Suppose we want to add V to every element in [L, R].

Starting from position L, all later values should be larger by V, so:

diff[L] += V;

But this influence should only last through R. Starting from R+1, the extra V should disappear, so we cancel it:

diff[R + 1] -= V;

Together, this is the most important formula for difference arrays:

diff[L] += V;
diff[R + 1] -= V;

Note: this is why diff is usually allocated with size n + 2. When R = n, we still access diff[n + 1].

Complete example: three range updates

Start with a zero array of length 5:

A:      0  0  0  0  0

Perform three operations:

1) Add 2 to [1, 3]
2) Add 3 to [2, 5]
3) Add -1 to [3, 4]

We do not directly modify A; we only modify diff.

Operation 1: `[1, 3] +2`

diff[1] += 2;
diff[4] -= 2;

diff[1..6] = [2, 0, 0, -2, 0, 0]

Meaning: add 2 starting from 1, and cancel this +2 starting from 4.

Operation 2: `[2, 5] +3`

diff[2] += 3;
diff[6] -= 3;

diff[1..6] = [2, 3, 0, -2, 0, -3]

Operation 3: `[3, 4] -1`

diff[3] += -1;
diff[5] -= -1;   // equivalent to diff[5] += 1

diff[1..6] = [2, 3, -1, -2, 1, -3]

Now take the prefix sum of diff to restore the final array:

i=1: 2            -> A[1] = 2
i=2: 2 + 3 = 5    -> A[2] = 5
i=3: 5 - 1 = 4    -> A[3] = 4
i=4: 4 - 2 = 2    -> A[4] = 2
i=5: 2 + 1 = 3    -> A[5] = 3

Final result:

A: 2 5 4 2 3

Difference Array Reconstruction

Complete C++ implementation

📄 Complete C++ code

// Difference array for range updates — O(N + M)
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n, m;
    cin >> n >> m;

    // diff[i] means: starting from position i, how much does the current value change?
    // Allocate one extra position: when the update right endpoint r = n, we need diff[n + 1]
    vector<long long> diff(n + 2, 0);

    // Step 1: process all range updates
    for (int i = 0; i < m; i++) {
        int l, r;
        long long v;
        cin >> l >> r >> v;

        diff[l] += v;      // from l onward, adding v becomes active
        diff[r + 1] -= v;  // from r+1 onward, cancel this addition
    }

    // Step 2: take prefix sums of diff to get the final array
    long long cur = 0;
    for (int i = 1; i <= n; i++) {
        cur += diff[i];
        cout << cur;
        if (i < n) cout << " ";
    }
    cout << "\n";

    return 0;
}

Sample input:

Sample output:

2 5 4 2 3

When should you use a difference array?

Difference arrays are suitable for this scenario:

Many range updates, then final results are queried together

For example:

Add V to every element in [L, R]
Paint many wall segments, each operation painting one interval
Increase traffic volume on many road intervals
Add 1 to online user count during a time interval

But note:

A normal difference array is not suitable for dynamic problems where updates and range queries are interleaved.

If a problem requires updates and queries to appear alternately, you usually need a Fenwick tree or a segment tree.

Common mistakes with difference arrays

Array too small: diff should have size n + 2, because we may access diff[r + 1].
Forgetting the final prefix sum: diff itself is not the answer; only after taking prefix sums do we get the final array.
Writing diff[r + 1] -= v as diff[r] -= v: this prevents position r from receiving the update.
Potential overflow: after many range additions, values can become large, so use long long.

One-sentence memory trick:

Start adding at the left endpoint, stop adding after the right endpoint; scan once at the end, and changes become answers.

3.2.7 2D Difference Arrays

A one-dimensional difference array solves "add a value to a continuous interval."

A two-dimensional difference array solves the analogous problem on a grid:

You have an R × C grid, initially all zeros. Each operation adds V to an entire rectangular region. Finally, output the whole grid.

For example:

Add V to every cell in the rectangle from top-left (r1, c1) to bottom-right (r2, c2)

If we simulate directly, we must enumerate every cell inside the rectangle. When rectangles are large and operations are numerous, this becomes very slow.

The idea of 2D difference arrays is exactly the same as in 1D:

Do not really modify the whole rectangle. Only place markers on the rectangle boundary. At the end, use a 2D prefix sum to spread those markers into the final result.

From the 1D formula to the 2D formula

For a 1D interval [L, R] + V, the markers are:

diff[L] += V;
diff[R + 1] -= V;

This means:

The effect starts at L and stops at R+1

A 2D rectangle has two directions: rows and columns.

If we only write:

diff[r1][c1] += V;

then after taking a 2D prefix sum, V affects the entire lower-right rectangle starting from (r1,c1), not just the rectangle [r1,c1]..[r2,c2] that we want.

So we need to cancel the influence to the right and below, and then add back the lower-right area that was canceled twice.

The four-corner formula is:

diff[r1][c1]         += V;  // effect starts from the top-left corner
diff[r1][c2 + 1]     -= V;  // past the right boundary, cancel horizontal influence
diff[r2 + 1][c1]     -= V;  // past the bottom boundary, cancel vertical influence
diff[r2 + 1][c2 + 1] += V;  // lower-right area was canceled twice, add it back once

These four markers are the core of 2D difference arrays.

2D Difference Array Four Corners

Understanding four-corner markers visually

Suppose we want to add V to this rectangle:

          c1        c2
          ↓         ↓
      .   .   .   .   .
r1 -> .  +V  +V  +V   .
      .  +V  +V  +V   .
r2 -> .  +V  +V  +V   .
      .   .   .   .   .

We place four markers:

Top-left corner:       +V   starts the influence
Right of top edge:     -V   prevents the influence from continuing right
Below left edge:       -V   prevents the influence from continuing downward
Lower-right outside:   +V   fixes the area that was subtracted twice

That is:

(r1, c1)       place +V
(r1, c2 + 1)   place -V
(r2 + 1, c1)   place -V
(r2 + 1, c2+1) place +V

Complete example: two rectangle updates

A 3 × 3 grid, initially all zeros.

Perform two updates:

1) update(1, 1, 2, 2, +5)
2) update(2, 2, 3, 3, +3)

Operation 1: add 5 to the top-left `2 × 2`

diff[1][1] += 5;
diff[1][3] -= 5;
diff[3][1] -= 5;
diff[3][3] += 5;

Operation 2: add 3 to the bottom-right `2 × 2`

diff[2][2] += 3;
diff[2][4] -= 3;
diff[4][2] -= 3;
diff[4][4] += 3;

After all markers, diff is roughly:

       c=1  c=2  c=3  c=4
r=1:    5    0   -5    0
r=2:    0    3    0   -3
r=3:   -5    0    5    0
r=4:    0   -3    0    3

Now take a 2D prefix sum:

diff[r][c] += diff[r-1][c] + diff[r][c-1] - diff[r-1][c-1];

The final grid is:

       c=1  c=2  c=3
r=1:    5    5    0
r=2:    5    8    3
r=3:    0    3    3

The cell (2,2) equals 8 because it is covered by both the +5 and +3 rectangles.

Complete C++ implementation

📄 Complete C++ code

// 2D difference array — each rectangle update O(1), final rebuild O(R × C)
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int R, C, M;
    cin >> R >> C >> M;

    // Allocate an extra sentinel border: when r2=R or c2=C, we access r2+1 or c2+1
    vector<vector<long long>> diff(R + 2, vector<long long>(C + 2, 0));

    // Step 1: mark each rectangle update in O(1)
    while (M--) {
        int r1, c1, r2, c2;
        long long v;
        cin >> r1 >> c1 >> r2 >> c2 >> v;

        diff[r1][c1]         += v;
        diff[r1][c2 + 1]     -= v;
        diff[r2 + 1][c1]     -= v;
        diff[r2 + 1][c2 + 1] += v;
    }

    // Step 2: rebuild the final grid with a 2D prefix sum
    for (int r = 1; r <= R; r++) {
        for (int c = 1; c <= C; c++) {
            diff[r][c] += diff[r - 1][c]
                        + diff[r][c - 1]
                        - diff[r - 1][c - 1];
        }
    }

    // Step 3: output the final grid
    for (int r = 1; r <= R; r++) {
        for (int c = 1; c <= C; c++) {
            cout << diff[r][c];
            if (c < C) cout << " ";
        }
        cout << "\n";
    }

    return 0;
}

When should you use a 2D difference array?

2D difference arrays are suitable for:

Many rectangular region additions, then finally obtain the whole grid

Common problem types include:

Paint many rectangular regions, then ask for each cell's color or coverage count
Stack multiple rectangular billboards and compute the final brightness
Increase height, temperature, or weight over rectangular regions on a map
Process many rectangle updates offline

Common mistakes with 2D difference arrays

Wrong signs at the four corners: remember the pattern is + - - +.
Array too small: you must be able to access r2 + 1 and c2 + 1, so allocate at least R + 2 by C + 2.
Forgetting the final 2D prefix sum: the four-corner markers are only changes, not the final grid.
Wrong rebuild formula: it should be top + left - top-left, namely:

diff[r][c] += diff[r-1][c] + diff[r][c-1] - diff[r-1][c-1];

One-sentence memory trick:

For rectangle updates, look at the four corners: top-left add, top-right subtract, bottom-left subtract, bottom-right add; then take a 2D prefix sum.

3.2.8 USACO Example: Maximum Subarray Sum

Problem (a variation of Kadane's algorithm): Find the maximum sum of a contiguous subarray.

📄 Complete C++ code

// Kadane's algorithm — O(N) time, O(1) space
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n;
    cin >> n;
    vector<int> A(n);
    for (int &x : A) cin >> x;

    // Kadane's algorithm: O(n)
    long long maxSum = LLONG_MIN;  // smallest long long
    long long current = 0;

    for (int i = 0; i < n; i++) {
        current += A[i];
        maxSum = max(maxSum, current);
        if (current < 0) current = 0;  // ← key line: restart when the sum becomes negative
    }

    cout << maxSum << "\n";

    return 0;
}

💡 Key idea: Why reset current to 0 when it becomes negative? Because a negative prefix sum only hurts any future subarray. If the current running sum is -5, any future subarray starting fresh (sum 0) is better than continuing from -5.

Alternative using prefix sums: The maximum subarray sum is the maximum value of P[j] - P[i-1] over all pairs (i,j). For each j, this is maximized when P[i-1] is as small as possible. Track the running minimum prefix sum!

// Alternative: minimum-prefix trick — also O(N)
long long maxSum = LLONG_MIN, minPrefix = 0, prefix = 0;
for (int x : A) {
    prefix += x;
    maxSum = max(maxSum, prefix - minPrefix);  // best sum ending here
    minPrefix = min(minPrefix, prefix);         // track the smallest prefix seen so far
    // ⚠️ Note: minPrefix must be updated after maxSum.
    // If minPrefix is updated first, it effectively allows an empty subarray (length 0),
    // which makes all-negative arrays incorrectly return 0 instead of the largest negative value.
}

3.2.8 USACO Real Problem Practice: Turning Range Problems into Prefix Differences

In USACO, prefix sums and difference arrays are usually not presented as "please write a prefix sum." Instead, they are hidden behind signals like these:

Problem signal	First-choice technique	Why
Many queries for the count/sum of some kind of element in `[L,R]`	Prefix sums	The original array does not change, and there are many queries
Many operations that add to `[L,R]`, then a final query/output	Difference array	Many updates, but no interleaved queries
Many rectangle queries on a grid	2D prefix sums	Rectangle sums can be answered with inclusion-exclusion
Many rectangle additions on a grid	2D difference array	Four-corner markers, then final rebuild

Real Problem 1: Breed Counting (USACO 2015 December Silver) — Multi-category Prefix Sums

Problem link: USACO 2015 December Silver P3: Breed Counting
Pattern: multiple prefix sum arrays
Difficulty level: Silver beginner

Problem interpretation

N cows stand in a row. Each cow's breed is one of 1, 2, or 3. Then there are Q queries. Each query gives an interval [L,R], and you must output how many cows of each of the three breeds appear in that interval.

Key conditions:

The order of cows is fixed and never changes.
There are many queries, so scanning [L,R] each time is too slow.
There are only 3 breeds, so we can build one prefix sum for each breed.

Idea

For each breed b, build a prefix array:

prefix[b][i] = number of cows of breed b among the first i cows

Then the number of breed b cows in interval [L,R] is:

prefix[b][R] - prefix[b][L-1]

This is exactly the same as ordinary prefix sums, except that "summing values" becomes "counting whether a category appears."

Complete CPP code

✅ Complete code: Breed Counting

#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // USACO official file I/O can be enabled when needed:
    // freopen("bcount.in", "r", stdin);
    // freopen("bcount.out", "w", stdout);

    int n, q;
    cin >> n >> q;

    vector<array<int, 4>> prefix(n + 1);  // only indices 1..3 are used

    for (int i = 1; i <= n; i++) {
        int breed;
        cin >> breed;
        prefix[i] = prefix[i - 1];        // inherit counts from the first i-1 cows
        prefix[i][breed]++;               // add 1 to the current breed
    }

    while (q--) {
        int l, r;
        cin >> l >> r;
        for (int b = 1; b <= 3; b++) {
            int count = prefix[r][b] - prefix[l - 1][b];
            cout << count << " \n"[b == 3];
        }
    }

    return 0;
}

Complexity: preprocessing O(N × 3), each query O(3), total O(N+Q), space O(3N).

Common pitfalls

Forgetting prefix[i] = prefix[i-1]. If you only increment the current breed, all previous counts are lost.
Out-of-bounds at L-1. With 1-indexed prefix sums, prefix[0] is the sentinel, so queries with L=1 are safe.
Wrong output format. Each query outputs 3 numbers followed by a newline.
Confusing file names. The official file I/O names are bcount.in/out, not breed.in/out.

Extension

If the number of breeds increases from 3 to 10^5, but each query asks about only one breed, you can store the positions of each breed in a map<breed, vector<int>>, then use binary search to count occurrences in the interval. When choosing a data structure, always look at "number of categories × query pattern."

Real Problem 2: Haybale Stacking (USACO 2012 January Bronze) — Difference Array for Range Addition

Problem link: USACO 2012 January Bronze P2: Haybale Stacking
Pattern: difference array + final rebuild
Difficulty level: Bronze/Silver foundation

Problem interpretation

There are N hay stacks, all initially height 0. Then K instructions follow. Each instruction adds 1 to every hay stack in interval [A,B]. After all operations, output the median height.

Key conditions:

N can be up to 10^6, and K can be up to 25000.
Directly looping over [A,B] for each instruction can reach O(NK), which is too slow.
The problem only asks for the result after all operations; there are no intermediate queries.

Idea

A difference array turns one range addition into two endpoint markers:

diff[A] += 1
diff[B+1] -= 1

Finally, scan from left to right and take prefix sums to recover the final height of every position. Then sort the heights and take the median.

Complete CPP code

✅ Complete code: Haybale Stacking

#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // USACO official file I/O can be enabled when needed:
    // freopen("stacking.in", "r", stdin);
    // freopen("stacking.out", "w", stdout);

    int n, k;
    cin >> n >> k;

    vector<int> diff(n + 2, 0);  // must write to b+1, which may be n+1

    for (int i = 0; i < k; i++) {
        int a, b;
        cin >> a >> b;
        diff[a] += 1;
        diff[b + 1] -= 1;
    }

    vector<int> height(n + 1);
    int current = 0;
    for (int i = 1; i <= n; i++) {
        current += diff[i];
        height[i] = current;
    }

    sort(height.begin() + 1, height.end());
    cout << height[(n + 1) / 2] << "\n";  // N is odd, so the median index is fixed

    return 0;
}

Complexity: range updates O(K), rebuild O(N), sorting O(N log N), space O(N).

Common pitfalls

diff is too small. It must have size n+2, because when B=n, we write diff[n+1].
Forgetting to rebuild. The difference array is not the final height array; you must take prefix sums first.
Wrong median index. In this problem N is odd. With a 1-indexed height array, the median is at (N+1)/2.
Overusing segment trees. There are no intermediate queries, so a difference array is simpler and faster.

Extension

If the problem becomes "after each range addition, immediately query some position or interval," a difference array is no longer enough; you need a Fenwick tree or segment tree. This is exactly what Chapters 5.8 and 5.7 will solve.

⚠️ Common Mistakes in Chapter 3.2

Off-by-one in range queries: writing P[R] - P[L] instead of P[R] - P[L-1]. Always test with a small example.
Overflow: prefix sums of large values can exceed the range of int (2×10^9). Even if the elements are int, the prefix array should use long long.
2D query formula: forgetting the +P[r1-1][c1-1] term in a 2D query is very easy.
Difference array size: declaring diff[n+1] when you need diff[n+2] (because you write to index r+1, which can be n+1).
1-indexed vs 0-indexed: with 0-indexed prefix sums, the query formula becomes P[R+1] - P[L]. Pick one convention inside a problem and stick to it.
2D difference array size: declaring diff[R+1][C+1] when you need diff[R+2][C+2] — the four-corner update writes to (r2+1, c2+1), so it must be in bounds.
2D difference rebuild order: the 2D prefix-sum rebuild must process cells from left to right and top to bottom (the same order as building a 2D prefix sum). A mixed order produces wrong results.

Chapter Summary

📌 Key takeaways

Technique	Build time	Query time	Space	Use case
1D prefix sum	`O(N)`	`O(1)`	`O(N)`	Range sums on a 1D array
2D prefix sum	`O(RC)`	`O(1)`	`O(RC)`	Rectangle sums on a 2D grid
Difference array	`O(N+M)`	`O(1)`*	`O(N)`	Range addition updates
2D difference array	`O(RC+M)`	`O(1)`*	`O(RC)`	Rectangle additions on a 2D grid
Kadane's algorithm	`O(N)`	—	`O(1)`	Maximum subarray sum

*Values can be read only after an O(N) rebuild pass.

🧩 Core formula quick reference

Operation	Formula	Notes
1D range sum	`P[R] - P[L-1]`	P[0] = 0 is the sentinel value
2D rectangle sum	`P[r2][c2] - P[r1-1][c2] - P[r2][c1-1] + P[r1-1][c1-1]`	Inclusion-exclusion: subtract twice, add once
Difference array update	`diff[L] += V; diff[R+1] -= V;`	Array size should be N+2
2D difference update	`diff[r1][c1]+=V; diff[r1][c2+1]-=V; diff[r2+1][c1]-=V; diff[r2+1][c2+1]+=V`	Four-corner markers
Restore from difference	Take prefix sums of diff (1D or 2D)	The result is the final array

❓ FAQ

Q1: What is the relationship between prefix sums and difference arrays?

A: They are inverse operations. Taking prefix sums accumulates values; taking differences records adjacent changes. After range updates are stored as +V and -V markers, taking prefix sums spreads those markers into the final values, and +V/-V cancel outside [L,R].

Q2: When should I use prefix sums vs. difference arrays?

A: Rule of thumb — look at the operation type:

Many range-sum queries → prefix sums (preprocess O(N), query O(1))

Many range addition/subtraction operations → difference arrays (update O(1), restore O(N) at the end)

If updates and queries are interleaved, use a more advanced data structure (such as the segment tree in Chapter 5.7)

Q3: Can prefix sums handle dynamic modifications? (array elements change)

A: No. Prefix sums are one-time preprocessing, and the array cannot change afterward. If elements are modified, use a Fenwick tree (BIT) or segment tree, which support point updates and range queries in O(log N).

Q4: Why are there two versions of Kadane's algorithm (current=0 vs minPrefix)?

A: They are essentially the same and both run in O(N). The first version (classic Kadane) is more intuitive: restart when the current subarray sum becomes negative. The second version (minimum prefix) uses prefix-sum thinking: max subarray = max(P[j] - P[i]) = max(P[j]) - min(P[i]). Choose whichever you prefer.

Q5: What are the space limits for 2D prefix sums?

A: If R and C are both 10^4, the P array needs 10^8 long long values (about 800MB), which exceeds many memory limits. In general, R×C ≤ 10^6~10^7 is safe. For larger grids, consider compression or offline processing.

🔗 Connections to later chapters

Chapter 3.4 (Two Pointers): sliding windows can also handle range-like queries, but only for fixed-size or monotonically moving windows; prefix sums are more general
Chapter 3.3 (Sorting and Searching): binary search can be combined with prefix sums — for example, binary search the prefix sum array for the first position ≥ a target
Chapter 5.7 (Segment Trees): solves the "dynamic update + range query" problems that prefix sums cannot handle
Chapters 6.1–6.3 (Dynamic Programming): many state transitions involve range sums; prefix sums are an important tool for optimizing DP
The difference array idea ("+V at the start, -V after the end") appears repeatedly in sweep line algorithms, event sorting, and other advanced techniques

Practice Problems

Problem 3.2.1 — Range Sum 🟢 Easy

Read N integers and Q queries. Each query gives L and R. Print the sum of elements from index L to R (1-indexed).

Hint

Build a prefix sum array P where P[i] = A[1]+...+A[i], then answer each query with P[R] - P[L-1].

✅ Complete solution

Core idea: Precompute prefix sums in O(N), then answer each query in O(1) using P[R] - P[L-1].

#include <bits/stdc++.h>
using namespace std;
int main() {
    ios_base::sync_with_stdio(false); cin.tie(NULL);
    int n, q; cin >> n >> q;
    vector<long long> P(n + 1, 0);
    for (int i = 1; i <= n; i++) {
        int x; cin >> x;
        P[i] = P[i-1] + x;
    }
    while (q--) {
        int l, r; cin >> l >> r;
        cout << P[r] - P[l-1] << "\n";
    }
}

Complexity: O(N + Q) — far better than naive O(N × Q).

Problem 3.2.2 — Range Addition, Point Values 🟢 Easy

Start with N zeros. Process M operations: each operation adds V to every position from L to R. After all operations, print the value at every position.

Hint

For each update, do `diff[L] += V` and `diff[R+1] -= V`, then take prefix sums of diff.

✅ Complete solution

Core idea: Difference array. Each range addition affects only 2 positions in diff; final values are obtained by prefix sums.

#include <bits/stdc++.h>
using namespace std;
int main() {
    int n, m; cin >> n >> m;
    vector<long long> diff(n + 2, 0);
    while (m--) {
        int l, r; long long v; cin >> l >> r >> v;
        diff[l] += v;
        diff[r+1] -= v;
    }
    long long cur = 0;
    for (int i = 1; i <= n; i++) {
        cur += diff[i];
        cout << cur << " \n"[i == n];
    }
}

Complexity: O(N + M) — each update is O(1), and the final scan is O(N).

Problem 3.2.3 — Rectangular Sum 🟡 Medium

Read an N×M grid and Q queries. Each query gives (r1,c1,r2,c2). Print the sum of the subgrid.

Hint

Use 2D prefix sums. Query = P[r2][c2] - P[r1-1][c2] - P[r2][c1-1] + P[r1-1][c1-1].

✅ Complete solution

Core idea: 2D prefix sums. P[i][j] is the sum from (1,1) to (i,j), and any rectangle query is answered by inclusion-exclusion.

#include <bits/stdc++.h>
using namespace std;
int main() {
    ios_base::sync_with_stdio(false); cin.tie(NULL);
    int n, m, q; cin >> n >> m >> q;
    vector<vector<long long>> P(n+1, vector<long long>(m+1, 0));
    for (int i = 1; i <= n; i++)
        for (int j = 1; j <= m; j++) {
            int x; cin >> x;
            P[i][j] = x + P[i-1][j] + P[i][j-1] - P[i-1][j-1];
        }
    while (q--) {
        int r1, c1, r2, c2; cin >> r1 >> c1 >> r2 >> c2;
        cout << P[r2][c2] - P[r1-1][c2] - P[r2][c1-1] + P[r1-1][c1-1] << "\n";
    }
}

Complexity: O(N × M + Q).

Problem 3.2.4 — USACO 2016 January Bronze: Mowing the Field 🔴 Hard

FJ mows grass along a path. Cells visited more than once form a "double-mowed" area. Count how many cells are visited at least twice.

Hint

Simulate the path, mark visit counts in a 2D structure, then count cells with value ≥ 2.

✅ Complete solution

Core idea: Direct simulation — no advanced data structure is needed. Walk along the path and increment the visit counter for every cell reached.

#include <bits/stdc++.h>
using namespace std;
int main() {
    int n; cin >> n;
    map<pair<int,int>, int> cnt;
    int x = 0, y = 0; cnt[{x,y}]++;
    while (n--) {
        char dir; int steps; cin >> dir >> steps;
        int dx = (dir=='E') - (dir=='W');
        int dy = (dir=='N') - (dir=='S');
        while (steps--) {
            x += dx; y += dy;
            cnt[{x,y}]++;
        }
    }
    int doubleMowed = 0;
    for (auto& [pos, c] : cnt) if (c >= 2) doubleMowed++;
    cout << doubleMowed << "\n";
}

Complexity: O(total steps × log), dominated by map operations.

Problem 3.2.5 — 2D Range Addition 🟡 Medium

Given an N×M grid initially filled with zeros, process Q operations. Each operation adds V to the rectangle from [r1,c1] to [r2,c2]. Output the final grid.

Hint

Use a 2D difference array: mark 4 corners for each update, then rebuild with a 2D prefix sum.

✅ Complete solution

Core idea: 2D difference array. Each rectangle update touches only 4 corners. The final grid is the 2D prefix sum of the difference array.

#include <bits/stdc++.h>
using namespace std;
int main() {
    int n, m, q; cin >> n >> m >> q;
    vector<vector<long long>> D(n+2, vector<long long>(m+2, 0));
    while (q--) {
        int r1, c1, r2, c2; long long v;
        cin >> r1 >> c1 >> r2 >> c2 >> v;
        D[r1][c1] += v;
        D[r1][c2+1] -= v;
        D[r2+1][c1] -= v;
        D[r2+1][c2+1] += v;
    }
    for (int i = 1; i <= n; i++)
        for (int j = 1; j <= m; j++)
            D[i][j] += D[i-1][j] + D[i][j-1] - D[i-1][j-1];
    for (int i = 1; i <= n; i++)
        for (int j = 1; j <= m; j++)
            cout << D[i][j] << " \n"[j == m];
}

Complexity: O(Q + N × M).

Problem 3.2.6 — Maximum Subarray (Kadane's Algorithm) 🟡 Medium

Read N integers (possibly negative). Find the maximum sum of a contiguous subarray.

Hint

Use Kadane's algorithm. If all numbers are negative, the answer is the largest single element.

✅ Complete solution

Core idea: At each position, either start a new subarray or extend the current one. cur = max(A[i], cur + A[i]), and track the best value.

#include <bits/stdc++.h>
using namespace std;
int main() {
    int n; cin >> n;
    long long best = LLONG_MIN, cur = 0;
    for (int i = 0; i < n; i++) {
        long long x; cin >> x;
        cur = max(x, cur + x);
        best = max(best, cur);
    }
    cout << best << "\n";
}

Why start over when cur+x < x? Because a negative running sum only hurts future terms — discard it and restart from the current element.

Trace for [-2, 1, -3, 4, -1, 2, 1, -5, 4]:

x=-2: cur=-2, best=-2
x=1:  cur=max(1,-2+1)=1, best=1
x=-3: cur=max(-3,1-3)=-2, best=1
x=4:  cur=max(4,-2+4)=4, best=4
x=-1: cur=max(-1,4-1)=3, best=4
x=2:  cur=5, best=5
x=1:  cur=6, best=6 ✓
x=-5: cur=1, best=6
x=4:  cur=5, best=6

Complexity: O(N) time, O(1) space.

Difference Array Practice

The following problems specifically train difference arrays. First decide which category the problem belongs to:

1D range addition, final array output: use a 1D difference array.
1D range coverage counts, final answer count: use a 1D difference array + scan.
2D rectangle addition or coverage counts: use a 2D difference array.
Intermediate queries exist: ordinary difference arrays are usually not enough; use a Fenwick tree or segment tree.

Problem 3.2.7 — Maximum Value After Range Additions 🟢 Easy

You have an array of length N, initially all 0. Then there are M operations. Each operation adds V to all positions in interval [L,R]. After all operations, output the maximum value in the array.

Hint

You do not need to store every final value. Use a difference array to record all range additions, then scan left to right with a running prefix sum while maintaining the maximum.

✅ Complete solution

Core idea: Each range addition changes only two positions: diff[L] += V, diff[R+1] -= V. During the final scan, the current prefix sum is the final value at the current position.

#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m;
    cin >> n >> m;

    vector<long long> diff(n + 2, 0);

    while (m--) {
        int l, r;
        long long v;
        cin >> l >> r >> v;
        diff[l] += v;
        diff[r + 1] -= v;
    }

    long long cur = 0;
    long long best = LLONG_MIN;

    for (int i = 1; i <= n; i++) {
        cur += diff[i];
        best = max(best, cur);
    }

    cout << best << "\n";
    return 0;
}

Complexity: O(N + M) time, O(N) space.

Problem 3.2.8 — Count Positions Covered at Least K Times 🟡 Medium

There are N positions on a number line, numbered 1..N. You are given M intervals [L,R], and each interval covers those positions once. Count how many positions are covered at least K times.

Hint

Each interval cover is equivalent to adding `1` to a range. Use a difference array to compute the coverage count of each position, then count positions with value `>= K`.

✅ Complete solution

Core idea: This problem does not ask you to output the final array; it asks you to count positions satisfying a condition. Rebuild the difference array and count on the fly.

#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m, k;
    cin >> n >> m >> k;

    vector<int> diff(n + 2, 0);

    while (m--) {
        int l, r;
        cin >> l >> r;
        diff[l] += 1;
        diff[r + 1] -= 1;
    }

    int cur = 0;
    int answer = 0;

    for (int i = 1; i <= n; i++) {
        cur += diff[i];
        if (cur >= k) answer++;
    }

    cout << answer << "\n";
    return 0;
}

Complexity: O(N + M) time, O(N) space.

Problem 3.2.9 — Range Additions on an Existing Array 🟡 Medium

Given an original array A of length N. Then there are M operations, each adding V to all elements in interval [L,R]. After all operations, output the final array.

Hint

The initial array is not all zeros. You can use a difference array only to record the "extra additions," then add those increments back to the original array; or you can first convert the original array into a difference array and modify it directly.

✅ Complete solution

Core idea: The simplest implementation is to let diff record only the increments caused by operations. During the final scan, diff tells us how much has been added to the current position, and we add it to A[i].

#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m;
    cin >> n >> m;

    vector<long long> a(n + 1);
    for (int i = 1; i <= n; i++) {
        cin >> a[i];
    }

    vector<long long> diff(n + 2, 0);

    while (m--) {
        int l, r;
        long long v;
        cin >> l >> r >> v;
        diff[l] += v;
        diff[r + 1] -= v;
    }

    long long add = 0;
    for (int i = 1; i <= n; i++) {
        add += diff[i];
        cout << a[i] + add << " \n"[i == n];
    }

    return 0;
}

Complexity: O(N + M) time, O(N) space.

Problem 3.2.10 — Rectangle Coverage Counts 🟡 Medium

You have an R × C grid, initially with coverage count 0 in every cell. You are given M rectangles, each represented by its top-left corner (r1,c1) and bottom-right corner (r2,c2). Each rectangle increases the coverage count of every cell inside it by 1. Count how many cells are covered at least K times.

Hint

This is a 2D difference array problem. For each rectangle update, mark four corners: top-left `+1`, right outside top edge `-1`, below left edge `-1`, and lower-right outside `+1`. Then take a 2D prefix sum and count the answer.

✅ Complete solution

Core idea: The four-corner marker pattern for 2D difference arrays is + - - +. During reconstruction, use the 2D prefix formula: D[r][c] += D[r-1][c] + D[r][c-1] - D[r-1][c-1].

#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int R, C, M, K;
    cin >> R >> C >> M >> K;

    vector<vector<int>> diff(R + 2, vector<int>(C + 2, 0));

    while (M--) {
        int r1, c1, r2, c2;
        cin >> r1 >> c1 >> r2 >> c2;

        diff[r1][c1] += 1;
        diff[r1][c2 + 1] -= 1;
        diff[r2 + 1][c1] -= 1;
        diff[r2 + 1][c2 + 1] += 1;
    }

    int answer = 0;

    for (int r = 1; r <= R; r++) {
        for (int c = 1; c <= C; c++) {
            diff[r][c] += diff[r - 1][c]
                        + diff[r][c - 1]
                        - diff[r - 1][c - 1];
            if (diff[r][c] >= K) answer++;
        }
    }

    cout << answer << "\n";
    return 0;
}

Complexity: O(R × C + M) time, O(R × C) space.

Problem 3.2.11 — Decide Whether You Need a Difference Array 🟡 Medium

For each of the following four problem types, what data structure or algorithm should you consider first?

The array does not change, and there are many queries asking for the sum of [L,R].
There are many operations that add a value to [L,R], and after all operations we output the final array.
There are many operations that add a value to [L,R], and after each operation we immediately query the value at some position.
There are many operations that add a value to a rectangular region, and after all operations we report the final value of every grid cell.

Hint

Distinguish the key properties: query or modification? One-dimensional or two-dimensional? Are there intermediate queries?

✅ Complete solution

Prefix sums. The array is fixed and there are many range-sum queries; after preprocessing, each query is O(1).
1D difference array. There are many range additions, and the result is restored once at the end.
Fenwick tree or segment tree. Updates and queries are interleaved, so an ordinary difference array cannot directly handle intermediate queries.
2D difference array. Batch rectangle additions, then rebuild the grid once at the end.

The point of this problem is not writing code, but learning to identify the correct tool from the problem statement.

🏆 Challenge Problem: Cows and Paint Buckets

An N×M grid contains paint buckets, each with a positive value. Choose any rectangular subgrid. Score = (maximum value inside the subgrid) - (sum of boundary cells). Find the best rectangle. (N, M ≤ 500)

✅ Solution idea

Naively enumerating all rectangles costs O(N²M²), which is too slow if each rectangle also requires scanning. A better plan:

Use 2D prefix sums for O(1) sum queries.
For maximum value inside a subgrid, preprocess a 2D sparse table (or per-row RMQ) to support O(1) maximum queries.
Boundary sum = total sum - inner sum (both computed with prefix sums).

Total: O(N²M²) rectangle enumeration × O(1) per query, which can fit for N,M ≤ 500.

C++ for Competitive Programming: A USACO Guide