Branch Prediction: Sorted vs. Shuffled Data
Question 14 / 17 • Correct so far: 0 (0 answered)
Sorted Array
std::size_t filterAboveThreshold(const std::vector<int>& data,
std::vector<int>& out) {
std::size_t n = 0;
for (int x : data)
if (x >= kThreshold) out[n++] = x;
return n;
}
std::size_t n = filterAboveThreshold(SORTED_DATA, OUTPUT); Shuffled Array
std::size_t filterAboveThreshold(const std::vector<int>& data,
std::vector<int>& out) {
std::size_t n = 0;
for (int x : data)
if (x >= kThreshold) out[n++] = x;
return n;
}
std::size_t n = filterAboveThreshold(SHUFFLED_DATA, OUTPUT); Shared test data (shared-setup)
constexpr int kArraySize = 32768;
constexpr int kThreshold = 128;
static std::vector<int> generateData() {
std::mt19937 rng(42);
std::uniform_int_distribution<int> dist(0, 255);
std::vector<int> v(kArraySize);
for (auto& x : v) x = dist(rng);
return v;
}
static const std::vector<int> SHUFFLED_DATA = generateData();
static std::vector<int> makeSorted() {
auto v = generateData();
std::sort(v.begin(), v.end());
return v;
}
static const std::vector<int> SORTED_DATA = makeSorted();
// Pre-allocated output buffer reused across iterations to avoid allocation cost.
static std::vector<int> OUTPUT(kArraySize); Which snippet is faster?
Snippet A is faster. Modern CPUs use a branch predictor to speculatively execute code before a conditional resolves. When the data is sorted, the predicate (x >= kThreshold) is false for the first half and true for the second half, so the predictor learns the pattern and mispredicts only at the single transition point. With shuffled data the branch outcome is essentially random, causing roughly a 50% misprediction rate throughout the entire loop. Each misprediction flushes the pipeline and wastes approximately 15–20 cycles. The benchmark uses a conditional store into an output buffer rather than a simple integer accumulation — a conditional store cannot be turned into a branchless CMOV, so the branch predictor is genuinely exercised.
Benchmark results
| Snippet | CPU time / iteration | Speedup |
|---|---|---|
| Shuffled Array | 22.1 us | 1.0× |
| Sorted Array | 7.22 us | 3.1× |
Explore the source
Open in Compiler ExplorerQuiz complete. You can return to the question list to restart and compare.