This Sunday, during the monthly meeting of the Israeli WG21 National Body discussion forum1, we discussed a paper by Yehezkel Bernat. The paper discusses a specific issue with C++ ranges and iterators. It demonstrates it with the following code sample:

#include <ranges>
#include <iostream>
namespace rv = std::views;
int main() {
    for (auto i  : rv::iota(0)
            | rv::filter([](auto i) { return i < 11; })
            | rv::take(11))
        std::cout << i << '\n';

The code in the example does the following:

  1. Create an infinite2 range counting from 0 upwards (rv::iota(0))
  2. Filters that to only return numbers smaller than 11 (rv::filter(...))
  3. Takes the first 11 numbers - so 0 through 10 (rv::take(11))
  4. Loops over those and prints them.

But there is a problem.

The Problem

Instead of printing all 11 numbers and terminating, it prints them, and then keeps running indefinitely3. This happens because after taking the 11th element, it also tries to increment towards the 12th. Since the range is not exhausted, iota keeps running through numbers. Those numbers keep get filtered out by filter as they are all 11 or larger, and so we never terminate.

If you come from other languages (like Python or Java), this should surprise you. In those other languages, similar code will work perfectly well.

This happens due to the design of iterators in C++.

C++ Iterators

C++ has many types of iterators. The simplest one being an input-iterator. For us to iterate over it, we need 2 operators4:

  1. *it, to get a value from the iterator;
  2. ++it, to increment the iterator.

When iterating, we start with it pointing to the first element of our sequence. This means that we first use *it to get the current value, and only then use ++it to advance.

That’s why we had an issue in our above example - we read the 11th value, then tried to advance the iterator to an element that will never exist.

Additionally, we can’t skip the increment. If we do, a subsequent read from the iterator will repeat the current value.

Other Languages

The iterator design in C++ is significantly different from other languages. While C++ does read-then-increment, the common design in other languages is increment-then-read.

Python uses __next__() to get the next value of an iterator, raising StopIteration if none exist.

Java uses hasNext() to check if a value exists (and advance to it), then next() to get the value.

In both cases, there’s no reason to advance after we got the value we want. As a result - the issue won’t reproduce.

from itertools import islice, count
for i in islice(  # This is the equivalent of rv::take
                       lambda x: x < 11,

Further Thoughts

I don’t know the reason behind C++’s design. When looking at it, it feels like a derivative of the C-style loop,

for (int i = 0; i < 10; ++i) { ... }

Where the increment happens after each iteration. To me, this also seems consistent with the design of the range-based for loop in C++. And while it makes sense for iterating over pre-existing data, it feels a bit off to me when iterating calculated (or fetched) data. In those cases, Python and Java’s design feels more appropriate to me.

Another thing that comes to mind is Arno Schödl’s talk Why Iterators Got It All Wrong from CppNorth 2022. It discusses the design of iterators and how they mix up pointing to elements (begin() points to the first element) and borders (end() point after the last element). It seems to me that if begin() were to point before the first element (and therefore need to be incremented before being dereferenced) the issue would be resolved. Then again, I’m probably missing something.
I recommend that you watch the talk and see what you think.

  1. Quite a mouthful, I know. And WG21 is the ISO C++ standard. You can see the agenda for said meeting here↩︎

  2. In theory. Eventually the number will overflow. ↩︎

  3. Again, in theory. ↩︎

  4. We need more, but we’ll ignore them for simplicity. ↩︎