Returning a Named Local: Plain Return vs std::move Return


Snippet A

Plain Return

NOINLINE Buffer buildBuffer(std::uint64_t seed) {
    Buffer buf;
    for (std::size_t i = 0; i < buf.data.size(); ++i)
        buf.data[i] = static_cast<double>(seed ^ i);
    return buf;
}

for (std::uint64_t input : inputs) {
    Buffer b = buildBuffer(input);
    checksum += b.data[input % 1024];
}

Snippet B

Move Return

NOINLINE Buffer buildBuffer(std::uint64_t seed) {
    Buffer buf;
    for (std::size_t i = 0; i < buf.data.size(); ++i)
        buf.data[i] = static_cast<double>(seed ^ i);
    return std::move(buf);
}

for (std::uint64_t input : inputs) {
    Buffer b = buildBuffer(input);
    checksum += b.data[input % 1024];
}

Shared test data

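#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>  // std::move, used by Snippet B
#include <vector>

struct Buffer {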
    std::array<double, 1024> data;
};

constexpr std::size_t kInputCount = 512;

static std::vector<std::uint64_t> inputs;

struct DataInit {
    DataInit() {
        inputs.reserve(kInputCount);
        for (std::size_t i = 0; i < kInputCount; ++i)
            inputs.push_back(static_cast<std::uint64_t>(i) * 6364136223846793005ULL + 1);
    }
} data_init;
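Both snippets mark buildBuffer with a NOINLINE macro that the shared setup does not define; presumably the benchmark harness supplies it so that the call is not inlined and optimized away. A minimal sketch of how such a macro could be written, assuming GCC, Clang, or MSVC (the definition and the mixSeed helper are illustrative assumptions, not the harness's actual code):

```cpp
#include <cstdint>

// Assumed definition; the real harness may define NOINLINE differently.
#if defined(_MSC_VER)
#  define NOINLINE __declspec(noinline)
#else
#  define NOINLINE __attribute__((noinline))
#endif

// Hypothetical helper, present only to show where the macro is placed.
NOINLINE std::uint64_t mixSeed(std::uint64_t seed) {
    return seed ^ (seed >> 7);
}
```

Keeping the function out of line matters for this comparison: once buildBuffer is inlined into the loop, the optimizer may remove the extra copy in Snippet B and hide the difference.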

Which snippet is faster?

Snippet A is faster. Returning a named local by name makes it eligible for Named Return Value Optimization (NRVO): the compiler may construct the Buffer directly in the caller's return slot, so no copy or move takes place. In Snippet B the return expression is std::move(buf), which is no longer simply the name of a local variable, so copy elision is not permitted and the return value has to be move-constructed from buf. Moving a std::array of trivially copyable elements is an element-wise copy, because the elements are stored inline rather than behind a pointer that could be handed over. Snippet B therefore pays an extra 8 KB copy (1024 doubles × 8 bytes) on every call.
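The effect can be observed without a timer by counting constructor calls. The following is a minimal sketch, not part of the quiz's benchmark: Tracker, Counts, g_counts, makePlain, makeMoved, and countFor are all hypothetical names introduced for illustration.

```cpp
#include <utility>

// Hypothetical counters for how often a Tracker is copied or moved.
struct Counts {
    int copies = 0;
    int moves = 0;
};
Counts g_counts;

// Hypothetical instrumented type standing in for Buffer.
struct Tracker {
    Tracker() = default;
    Tracker(const Tracker&) { ++g_counts.copies; }
    Tracker(Tracker&&) noexcept { ++g_counts.moves; }
};

// Same shape as Snippet A: returning the local by name is an NRVO candidate.
Tracker makePlain() {
    Tracker t;
    return t;
}

// Same shape as Snippet B: the operand is no longer a plain name, so elision
// is not allowed and the move constructor must run.
// GCC and Clang can flag this pattern with -Wpessimizing-move.
Tracker makeMoved() {
    Tracker t;
    return std::move(t);
}

// Resets the counters, calls the factory once, and reports what was counted.
Counts countFor(Tracker (*factory)()) {
    g_counts = Counts{};
    Tracker result = factory();
    (void)result;  // only the constructor counts matter here
    return g_counts;
}
```

Calling countFor(makeMoved) should always report at least one move. For countFor(makePlain), compilers that apply NRVO report none, but the standard only permits that elision and does not require it, so the count can differ between compilers and flags.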

Benchmark results

clang · C++17 · -O3 -march=native

Snippet	CPU time / iteration	Speedup
Move Return	61.2 µs	1.0× (baseline)
Plain Return	39.2 µs	1.6×
