2D Array Traversal: Row-Major vs Column-Major
Row Major

    float sumMatrix(const float mat[][N]) {
        float sum = 0.0f;
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j)
                sum += mat[i][j];
        return sum;
    }
    float result = sumMatrix(matrix);

Column Major

    float sumMatrix(const float mat[][N]) {
        float sum = 0.0f;
        for (int j = 0; j < N; ++j)
            for (int i = 0; i < N; ++i)
                sum += mat[i][j];
        return sum;
    }
    float result = sumMatrix(matrix);

Shared test data

    constexpr int N = 1024;
    static float matrix[N][N];
    struct MatrixInit {
        MatrixInit() {
            for (int i = 0; i < N; ++i)
                for (int j = 0; j < N; ++j)
                    matrix[i][j] = static_cast<float>(i * N + j + 1);
        }
    } _matrix_init;

Which snippet is faster?
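Snippet A (Row Major) is faster. It iterates in row-major order, matching how C++ stores 2D arrays in memory: row by row. Each inner-loop step reads the next float in memory, so every cache line that is fetched (typically 64 bytes, or 16 floats) is fully used and the hardware prefetcher can stream data efficiently. Snippet B (Column Major) iterates column by column: each inner-loop step jumps an entire row (1024 floats = 4 KB), so consecutive accesses land on different cache lines. A single pass down one column touches 1024 distinct lines, and most of them are typically evicted before the next column needs them again. As a result, nearly every access misses in the closest cache and has to be served from a slower cache level or from main memory.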
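The stride argument can be checked with a little arithmetic. The sketch below computes the byte offset of `mat[i][j]` under C++'s row-major layout and derives the distance between consecutive inner-loop accesses for each traversal. The helper names (`byteOffset`, `rowMajorStride`, `columnMajorStride`, `checkOffset`) are illustrative and not part of the quiz code; `N` mirrors the shared test data.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t N = 1024;  // same dimension as the shared test data

// Byte offset of mat[i][j] from &mat[0][0] for a float[N][N] array.
// C++ lays the array out row by row, so element (i, j) is at index i * N + j.
constexpr std::size_t byteOffset(std::size_t i, std::size_t j) {
    return (i * N + j) * sizeof(float);
}

// Distance in bytes between two consecutive inner-loop accesses.
// Row-major traversal advances j; column-major traversal advances i.
constexpr std::size_t rowMajorStride()    { return byteOffset(0, 1) - byteOffset(0, 0); }
constexpr std::size_t columnMajorStride() { return byteOffset(1, 0) - byteOffset(0, 0); }

static_assert(rowMajorStride() == sizeof(float),
              "row-major traversal moves one float at a time");
static_assert(columnMajorStride() == N * sizeof(float),
              "column-major traversal jumps a whole row at a time");

// Cross-check the formula against real addresses in an actual array (hypothetical helper).
inline void checkOffset(const float (&mat)[N][N], std::size_t i, std::size_t j) {
    const char* base = reinterpret_cast<const char*>(&mat[0][0]);
    const char* elem = reinterpret_cast<const char*>(&mat[i][j]);
    assert(static_cast<std::size_t>(elem - base) == byteOffset(i, j));
}
```

With a 4-byte `float`, the row-major stride is 4 bytes and the column-major stride is 4096 bytes, which is the 4 KB jump described above.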
Benchmark results
| Snippet | CPU time / iteration | Speedup |
|---|---|---|
| Column Major | 3.41 ms | 1.0× |
| Row Major | 585 µs | 5.8× |
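A rough way to reproduce this kind of comparison locally is sketched below. It is not the harness that produced the table above: the names `sumFlat` and `millisecondsPerCall` are hypothetical, the matrix is held in a flat `std::vector` rather than a static 2D array, and the timing loop is deliberately simple. Absolute numbers will vary with compiler, optimization flags, and hardware, and a dedicated benchmarking library does a better job of guarding against the optimizer removing or hoisting the work.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Sum a dim x dim matrix stored contiguously in row-major order.
// RowMajor == true walks along each row (unit stride);
// RowMajor == false walks down each column (stride of dim floats).
template <bool RowMajor>
float sumFlat(const std::vector<float>& m, std::size_t dim) {
    assert(m.size() == dim * dim);
    float sum = 0.0f;
    for (std::size_t outer = 0; outer < dim; ++outer)
        for (std::size_t inner = 0; inner < dim; ++inner)
            sum += RowMajor ? m[outer * dim + inner]   // element (outer, inner)
                            : m[inner * dim + outer];  // element (inner, outer)
    return sum;
}

// Average wall-clock milliseconds per call of f over `repeats` calls.
// Each result is stored to a volatile so the calls are not discarded outright.
template <typename F>
double millisecondsPerCall(F&& f, int repeats) {
    assert(repeats > 0);
    volatile float sink = 0.0f;
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r)
        sink = f();
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration<double, std::milli>(stop - start).count() / repeats;
}
```

For example, with `std::vector<float> m(1024 * 1024, 1.0f);`, calling `millisecondsPerCall([&] { return sumFlat<true>(m, 1024); }, 100)` and the same with `sumFlat<false>` gives two figures to compare. Filling the matrix with `1.0f` keeps both sums exactly equal; with the values from the shared test data, the two traversals add the floats in a different order and may round to slightly different totals.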