One of my favorite features of Rust is iterators. They are a fast, safe, 'lazy' way of working with data structures, streams, and other more creative applications.
You can play along on http://play.rust-lang.org/ and by browsing here. This article is not a subtitute for the documentation or experience.
Our First Iterator
Having everything twice as much is great, right? Let's take a set of values and double it!
Okay first some important things to note, even from this simple example:
.iter()
can be used on many different types.- Declare a value, then create an iterator over it. Otherwise, the value will not live long enough to be iterated over. (Iterators are lazy, and do not necessarily own their data.)
.collect::<Vec<usize>>()
iterates over the entire iterator and places their values into the type of data collection we specify. (If you feel like writing your own data structure, have a look at this.)- It can be used and chained! You don't necessarily need to
.collect()
values at the end of a.map()
call.
n
at a Time
You can access the next element in an iterator with .next()
. If you'd like a batch you can use .take(n)
to create a new iterator that goes over the next n
elements. Have a few you don't care about? Use .skip(n)
to discard n
elements.
// Ouputs:
// Some(1)
// [4, 5]
Observing Laziness
We talked about how Iterators are lazy, but our first example didn't really demonstrate that. Let's use .inspect()
calls to observe evaluation.
The output is:
Pre map: 1
First map: 10
Second map: 15
Pre map: 2
First map: 20
Second map: 25
Pre map: 3
First map: 30
Second map: 35
As you can see, the map functions are only evaluated as the iterator is moved through. (Otherwise we would see 1
, 2
, 3
, 10
, ...)
Note how
.inspect()
only provides its function a&x
instead of a&mut
or the value itself. This prevents mutation and ensures that your inspection won't disturb the data pipeline.
This has some really cool implications, for example, we can have infinite or cycling iterators.
// Outputs [1, 2, 3, 1, 2, 3, 1, 2, 3]
Something In Common
Don't get hung up thinking that [1, 2, 3]
and it's ilk are the only things you can use iterators on.
Many data structures support this style, we can use things like Vectors and VecDeques as well! Look for things that implement iter()
.
use VecDeque;
// Outputs [2, 4, 6]
Notice how here we collect into a VecDeque? That's because it implements FromIterator
.
Now you're probably thinking, "Ah hah! I bet you can't use a HashMap or tree or something, Hoverbear!" Well, you're wrong! You can!
use HashMap;
// [(1, 100), (3, 300), (2, 200)]
When we're iterating over a HashMap the .map()
function changes to accept a tuple, and .collect()
recieves tuples. Of course, you can collect back into a HashMap (or whatever), too.
Did you notice how the ordering changed? HashMap's aren't necessarily in order. Be aware of this!
Try changing the code to build a
Vec<(u64, u64)>
into aHashMap
.
Writing an Iterator
Okay so we've seen a taste of the kind of things that might offer iterators, but we can also make our own. What about an iterator that counts up indefinitely?
// Outputs [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
We didn't have to call .iter()
here, and that makes sense, since we're actually implementing an iterator. (Not transforming something into an iterator like before.)
Try changing the
current
and.take()
values.
See how we could use .take()
and other functions without having to implement them separately for our new iterator? If you look at the docs of for iter
you'll see that there are various traits like Iterator
, RandomAccessIterator
.
Out on the Range with Ranges
Throughout the following examples you'll see use of the x..y
syntax. This creates a Range
. They implement Iterator
so we don't need to call .iter()
on them. You can also use (0..100).step_by(2)
if you want to use specific step increments if you're using them as Iterator
s.
Note that they are open ended, and not inclusive.
0..5 ==
2..6 ==
We can also index into our collections.
// Outputs:
// [0, 1, 2, 3, 4]
// [2, 3, 4]
// [7, 8, 9]
Gotcha: Using
.step_by()
doesn't work this way sinceStepBy
doesn't implementIdx
andRange
does.
Chaining and Zipping
Putting together interators in various ways allows for some very nice, expressive code.
// Chained: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
// Zipped: [(0, 5), (1, 6), (2, 7), (3, 8), (4, 9)]
.zip()
allows you to merge iterators, while .chain()
effectively creates an "extended" iterator. Yes, there is an .unzip()
.
Try using
.zip()
on twousize
slices then.collect()
the resulting tuples to build up a HashMap.
Getting Inquisistive
.count()
, .max_by()
, .min_by()
, .all()
, and .any()
, are common ways to inquire into an iterator.
// Must be run on nightly at time of publishing.
// Outputs:
// Oldest: Some(Bear { species: Grizzly, age: 16 })
// Youngest: Some(Bear { species: Brown, age: 5 })
// At least one polarbear?: true
// Are they all minors? (<18): true
Try using the same iterator all of the above calls.
.any()
is the only one that borrows mutably, and won't work the same as the others. This is because it might not necessarily consume the entire iterator.
Filter, Map, Red... Wait... Fold
If you're used to Javascript like me, you probably expect the holy trinity of .filter()
, .map()
, .reduce()
. Well, they're all there in Rust too, but .reduce()
is called .fold()
which I kind of prefer.
A basic example:
Of course, don't start trying to be too clever, the above could simply be:
The ability to approach problems like this in multiple ways allows Rust to be quite flexible and expressive.
Split & Scan
There is also scan
if you need a variation of folding which yeilds the result each time. This is useful if you're waiting for a certain accumulated amount and wish to check on each iteration.
Splitting an iterator up into two parts is possible. You can just use a simple grouping function that returns a boolean with partition
.
Let's use the two concepts to split up a big slice, group it be evens and odds, then progressively sum them up and make sure that the some of the evens is always less than the sum of the odds. (This is because even
starts at 0.)
// Outputs:
// Even was always less: true
Scanning can be used to provide things like a moving average. This is useful when reading through files, data, and sensors. Partitioning is a common task when shuffling through data.
Another common goal is to group elements together based on a specific value. Now, if you're expecting something like _.groupBy()
from Lodash it's not quite that simple. Consider: Rust has BTreeMap
, HashMap
, VecMap
, and other data types, our grouping method should not be opinionated.
To use a simple example, let's make an infinite iterator that cycles from 0 to 5 inclusively. In your own code these could be complex structs or tuples, but for now, simple integers are fine. We'll group them into into three categories, 0s, 5s, and the rest.
use HashMap;
// Outputs: {Zero: [0, 0, 0, 0], Five: [5, 5, 5], Other: [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1]}
It's kind of pointless to store all those replicated values. Try having this example return the number of occurances instead of the values. As a further exploration try using this method on a more complex value like a
struct
, you can also change what keys are used.
Flanked
The DoubleEndedIterator
trait is useful for certain purposes. For example, when you need the behavior of both of queue and a stack.
// Some(0)
// Some(9)
// Some(1)
// Some(8)
Play on your own!
This is a good time for you to take a break, get some tea, and crack open the Playpen. Try changing one of the above examples, fiddle with some new things in the API docs below, and generally have a good time.
If you get stuck, don't panic! Try googling the error if it befuddles you, Rust has activity on the *.rust-lang.org
domain as well as Github and Stack Exchange. You are also welcome to email me or visit us on IRC.
We're just scratching the surface... A vast world awaits us.