So with C++20 (or something like that) it looks like we will finally get ranges. I'm not a huge fan of all the new stuff in C++, but this is one thing I'll be happy to see standardized. The current implementation is here:
https://github.com/ericniebler/range-v3 and it has links to all the relevant papers/docs/etc...
Back in the day when the D programming language first came out, Andrei Alexandrescu made his video/talk about ranges vs iterators and I was inspired. I tried implementing some of the STL around ranges, just to see how it felt. There were advantages, disadvantages, and after a few months I sort of left it at that. An idea for another time... But after implementing custom containers and algorithms for my own uses over the years, I've come to realize both iterators and ranges are missing an important 'piece of the pie'.
Both iterators and ranges represent, from a theoretical/virtual perspective, a linear view over a 'range' or 'list' of items. But this isn't necessarily how they are stored in memory. If the range/iterators comes from a std::vector then the underlying items are continuous, if its linked list they are not, a tree may have small sections of continuous items, a circular buffer may be 1 or 2 continuous sections, etc...
So... why does this notion of 'continuous items' even matter? Simple, I think it forms a true 'basis' for abstract algorithms (ie. algorithms that can work on any container). Ranges and iterators are nice from a programmer/theoretical perspective, but they are not nice from a performance perspective. For example copying a string character by character is going to be slower than simply using strcpy or memcpy, because the latter can work on continuous chunks of data (keeps cache happy) and can work with chunks of data that better work with the underlying hardware (copying 64 bits at a time rather than 8 for example). Iterators and ranges throw this notion away, of continuous chunks of data, which in turn throws away any chances for having both abstract and high performance algorithms.
I think we need another notion of, or an extension to, ranges/iterators, I'm going to call them intervals (but the name isn't important, we simply need a name for a physically continuous sequence of items). An interval would simply be a pointer [begin, end) pair or something similar (pointer/count). The idea is each interval represents a *physically continuous sequence of one or more items, and a range then would consist of zero or more intervals. This abstraction would give us near maximum performance for any container, simple because all memory 'under the hood' on current systems is linear. Every single container we write, arrays 1d/2d/3d/4d, maps, graphs, dequeues, lists, sets, trees, tables, you name it, under the hood has to be mapped to zero or more intervals. What I'm trying to get at is that this idea of intervals isn't just some abstract idea, there's actually a physical basis to it, and why I feel it is a more optimal, if not the optimal, way to represent abstract container traversal.
Now adding intervals to iterators doesn't work, they are very different, but with the new addition of ranges, its easy to see how a range could 'wrap' or contain intervals. So a range would iterate over consecutive intervals, and then intervals iterate over items. For the most part it would require minimal or no syntactical changes. 'Tacking on' intervals after the fact would be near impossible though.
So why am I posting this here... This is important enough, I feel, that this idea should be considered prior to this becoming part of the standard. There's a significant amount of performance that can be gained here, with minimal or no additional work on the programmers part if this notion of intervals is added in from the beginning (in fact I found it easier to implement container traversal over intervals than ranges or iterators, simply because under the hood that's the way the memory is already ordered). That said I don't know how to tell them, or even if I should bother.
Am I completely out to lunch here? Is the idea of intervals simply stupid? Should I try? Is it too late? Is this simply a cathartic post for an overactive mind?
* I know I said 'physically continuous' a few times above. I am well aware that due to virtual memory mapping and such, that it might be a bit of a misnomer and that consecutive memory addresses may not be physically adjacent. That said, for all intents and purposes, both from a theoretical and performance standpoint, they might as well be; so I don't feel that's an important distinction, at least in the context of iterators/ranges/intervals.