Status: In Progress
This note is based on this talk
Golang 1.23 added a “first class” iterator concept. The addition itself, as well as the implementation, was a bit controversial. The talk and this note attempt to demystify some use cases for iterators and to provide a bit of a framework around how to write a custom iterator in Go, without making any value judgements around having iterators in the language or the specific implementation chosen.
An iterator is just a mechanism for “going through” (iterating over) a sequence of values. That’s really it!
Imagine I have some array, slice, set, sequence or vector of elements. A classic example might be a bunch of files on disk to be processed.
// pseudocode - no Go yet
["Dockerfile", "README.md", "src", "test"]
Now imagine we want to go through these elements to operate on them for some reason. We might transform them to some other value. We might choose to keep some and discard others We might run some code on them which results in side effects. All very common operations in programming.
// still pseudocode
for element in iterate_over(["Dockerfile", "README.md", "src", "test"]) {
// transform to another value, e.g. absolutise the path
// filter out items, e.g. remove directories keep only files
// cause some side effect, e.g. delete the file
}
We’ll call the bit between the curly braces here, which is being repeatedly run, the body of the for loop.
In the for loop we are iterating over a sequence of elements. The body is run on each one. The mechanism we use to get the actual elements to be iterated over in a certain order is frequently called an iterator in various programming languages.
To function, this for loop needs to know two things.
These two things correspond to the basic functions of iterators, which usually provide
Before we dive into how iterators work in Go, it’s worth remembering that iterators are very common concept present in most modern programming languages. Since most Gophers have worked with/in programming languages other than Go, it can be interesting/helpful to have these familiar patterns in mind.
Here is how iterators are represented in a few languages that may be familiar to Gophers
| Language | Iterator Concept | Next Element | Finished? |
| ---------- | ----------------------- | --------------- | ---------------------- |
| Java | `java.util.Iterator` | `.next()` | `hasNext() == false` |
| Javascript | "iterator protocol" | `.next().value` | `.next().done` |
| Python | `std::iter::Iterator` | `.next()` | `.next() == None` |
| Rust | `__iter__` / `__next__` | `__next__()` | Raises `StopIteration` |
This is just a selection of languages that I’m familiar with - of course many other languages have these concepts, and I may add to the above as I get acquianted with them.
The approaches to iterators are pretty uniform across these languages, which are all more-or-less ‘C style’ languages which are imperative at heart. We’ll politely ignore Python making the time-honoured faux pas of using exceptions as flow control.
Naturally, functional style languages also include iterator concepts,
such as Elixir’s Enumerable
protocol and Clojure’s
ISeq
. The functional nature of these languages however mean
that the actual implementations of these look quite different (and in
some cases a lot more complex). In any case, since Go is a ‘C style’
imperative-at-heart language too, it will not benefit us to dig too
deeply into these details.
Before we dive into the details of how to write iterators and when it might make sense to use iterators, it’s always worth remembering that pretty much all uses of iterators can be replaced with for loops. I think this may be true for every use of an iterator, but I can’t be sure there are absolutely no exceptions.
Where we have something like (pseudocode)
goFiles = listFiles()
.iterator()
.map(f -> f.absolutePath())
.filter(p -> p.endsWith(".go"))
.count()
There are always a bunch of for-loop based strategies that we can take in its place. Trying to keep to a single for loop, e.g. (still pseudocode)
allFiles = listFiles()
goFiles = 0
for int i = 0; i < allFiles.length; i++ {
f = allFiles[i]
// plays the part of the map call
p = f.absolutePath()
// plays the part of the filter call
if !p.endsWith(".go") {
continue
}
// plays the part of the count call
goFiles++
}
In light of this, while I and many others on the internet can tell you for what iterators can be used and possible use cases for them, only you and/or your team can decide which is the right approach for your specific projects and situations.
Of course in some cases there will be hard constraints, such as performance, which may strongly push you or even force you in one direction over the other.
But as we explore some examples in this note, I urge you to think about how the comparable for loop and iterator based approaches would look, and which you think produce more easily understandable and maintanable systems.
In the blog post I use as my reference for iterators in Go, the iterators feature is called range over function types.
There are three interfaces here, in the general sense of the word
rather than the specific Go interface
concept, which can be
considered iterators.
I call this a zero value iterator
func(yield func() bool)
this the one value iterator
func(yield func(V) bool)
and this the two value iterator
func(yield func(K, V) bool)
Before we go any further, yes, those single capital letter arguments do mean that we’re in the world of generics. We can’t avoid them when using iterators in Go.
To spell it our explicitly since it’s a little bit to parse on first
sight, the latter two iterators types, (one value and two value
iterators) are functions returning void which take a single argument
called yield
func(yield func(V) bool)
| ^single argument called yield
^function
func(yield func(K, V) bool)
| ^single argument called yield
^function
where the yield argument is itself a function, which takes a value (one value iterator) or a key and a value (two value iterator) and returns a boolean
func(yield func(V) bool)
| | ^ returning bool
| ^ single value argument, generic type
^ function
func(yield func(K, V) bool)
| | | ^ returning bool
| | ^ value argument, generic type
| ^key argument, generic type
^function
We’ll get into the detail of what this yield
function
means, what it does, how it is called, etc… in the later sections.
The one value and two value iterators are represented by actual Go
interfaces
in the new iter
package in Go
1.23.
package iter
type Seq[V any] func(yield func(V) bool)
type Seq2[K, V any] func(yield func(K, V) bool)
This can be useful to know, if you create a function which will take an iterator as an argument, because the interface can be used as the type for that parameter.
Suppose for some reason we didn’t have one value for range over slices built into Go, but we did have iterators.
By one value for range, I mean this
func main() {
for i := range []string{"Amos", "Alex", "Holden", "Naomi"} {
.Println(i)
fmt}
}
When you run this e.g. in Go playground you’ll get output
0
1
2
3
One value for range over a slice or array type gives you as the value the index into the slice or array.
Given the iterator functionality in Go 1.23, we could easily build this behaviour ourselves. Let’s write a function to produce an iterator that will give us this behaviour.
func OneValueSliceIterator[T any](s []T) func(yield func(int) bool) {
// TODO
return nil
}
The function is generic with type parameter T
to allow
iterating over slices containing any type. However, the values it
returns to be used in the for loop, which you’ll find in the innermost
argument of the return type, is an int
- since one value
for range over a slice iterates over the slice indexes.
Now we have written out the return type in full to demonstrate, we
can use the shorter form, i.e. the interface from the iter
package.
func OneValueSliceIterator[T any](s []T) iter.Seq[int] {
// TODO
return nil
}
We need to return a function with the just shown signature
func OneValueSliceIterator[T any](s []T) iter.Seq[int] {
return func(yield func(int) bool) {
// TODO
}
}
Finally, we can talk about what the yield function represents. We should pass sequentially into the yield function the values which are to be iterated over in the for loop. Explicitly, suppose I have a for loop
for i := range SomeIterator() {
.Println(i)
fmt}
and in the function returned by SomeIterator, for a certain iteration
of the for loop, we call yield(10)
- on this iteration of
the for loop, i
will have the value of 10
.
We know that the one value for range over a slice moves sequentially through the indices of the slice. Consequently, let’s set up our iterator function to do just that (pretending for-range is not available to us).
func OneValueSliceIterator[T any](s []T) iter.Seq[int] {
return func(yield func(int) bool) {
:= 0
i for {
// TODO call yield
++
iif (i == len(s)) {
return
}
}
}
}
Now we can call yield on the index i
and we already have
a workable iterator!
func OneValueSliceIterator[T any](s []T) iter.Seq[int] {
return func(yield func(int) bool) {
:= 0
i for {
(i)
yield++
iif (i == len(s)) {
return
}
}
}
}
We can swap this into our for-range
func main() {
:= []string{"Amos", "Alex", "Holden", "Naomi"}
s for i := range OneValueSliceIterator(s) {
.Println(i)
fmt}
}
func OneValueSliceIterator[T any](s []T) iter.Seq[int] {
return func(yield func(int) bool) {
:= 0
i for {
(i)
yield++
iif i == len(s) {
return
}
}
}
}
and we’ll get the same output
0
1
2
3
Now, how about that bool
return value from
yield
, which has been totally ignored above?
To give us a clue about it’s intended purposes, let’s take a look at what happens if we now call break in the for loop.
func main() {
:= []string{"Amos", "Alex", "Holden", "Naomi"}
s for i := range OneValueSliceIterator(s) {
.Println(i)
fmtbreak
}
}
Running again, we have a problem
0
panic: runtime error: range function continued iteration after function for loop body returned false
goroutine 1 [running]:
main.main-range1(...)
/tmp/sandbox168180821/prog.go:12
main.main.OneValueSliceIterator[...].func1(...)
/tmp/sandbox168180821/prog.go:22
main.main()
/tmp/sandbox168180821/prog.go:12 +0x85
The “for loop body” we can identify with the yield function here. It
tells us whether we should stop iteration. Well, more accurately, a
false
value tells us that we should stop iteration, so
really that bool
return tells us whether we should keeop
going. If yield
returns false
, we are expected
to immediately return
.
func OneValueSliceIterator[T any](s []T) iter.Seq[int] {
return func(yield func(int) bool) {
:= 0
i for {
:= yield(i)
keepGoing if !keepGoing {
return
}
++
iif i == len(s) {
return
}
}
}
}
I’ve chosen keepGoing as quite a big variable name here, in
contravention to Go’s idiom of using very small names of variables with
small scope, because I find it easy to forget that false
means stop rather than true
! This variable name helps
cement in my mind what the return values from yield mean.
Running the for loop with the break
with this version of
our iterator gives the expected output of just 0
.
From this basic example, we can already extract a set of elements that most (almost all) Go iterators will have.
func OneValueSliceIterator[T any](s []T) iter.Seq[int] {
return func(yield func(int) bool) {
// 1a. A way to track what we have iterated over
:= 0
i for {
// 2. A call to yield with a value
:= yield(i)
keepGoing // 3. Bail if we're told to stop
if !keepGoing {
return
}
// 1b. Keeping track of what we have iterated over
++
i// 4. Natural stopping point of iteration
if i == len(s) {
return
}
}
}
}
Listing those out
for
2 and 3 are really the absolute minimum required to implement an
iterator. Without the call to yield
, no values will ever
“be iterated over” in the for loop; without return
ing when
yield
returns false
, we have seen that the
program will panic.
1 (occasionally) and 4 (very rarely) can be omitted, depending upon the intent of the iterator. We’ll look at some examples later in this note where they are naturally omitted.
We could take the 1 value implementation and modify it in some small ways to make a 2 value implementation. To get some practice implementing the patterns from the previous section, though, we’ll instead follow that blueprint to build a new implementation from scratch.
I’m sure this is obvious to all Gophers, but this is what’s I mean by for range over slice with 2 values
func main() {
for i, v := range []string{"Amos", "Alex", "Holden", "Naomi"} {
.Println(i, v)
fmt}
}
which when run outputs
0 Amos
1 Alex
2 Holden
3 Naomi
The first value is the index of the item in the slice, the second value is the value of the item in the slice.
Okay, step by step, we start by changing the return type of the
function that creates the iterator - we’ll be returning index (type
int
) as before, and additionally value (type parameter
T
).
func TwoValueSliceIterator[T any](s []T) iter.Seq2[int, T] {
return func(yield func(int, T) bool) {
}
}
- (a & b) tracking what we have iterated over (i.e. already yielded)
This part is exactly the same as before
func TwoValueSliceIterator[T any](s []T) iter.Seq2[int, T] {
return func(yield func(int, T) bool) {
:= 0
i for {
++
i}
}
}
- Calling yield with the value to use for the current iteration
Small change here because we’re returning both the index and the value from the slice (yield takes two values)
func TwoValueSliceIterator[T any](s []T) iter.Seq2[int, T] {
return func(yield func(int, T) bool) {
:= 0
i for {
(i, s[i])
yield++
i}
}
}
- Returning immediately if we are told not to keep going by
for
Exactly the same as for the one value iterator
func TwoValueSliceIterator[T any](s []T) iter.Seq2[int, T] {
return func(yield func(int, T) bool) {
:= 0
i for {
:= yield(i, s[i])
keepGoing if !keepGoing {
return
}
++
i}
}
}
- Returning when we have nothing left to iterate over
Once again, exactly the same as for the one value version.
func TwoValueSliceIterator[T any](s []T) iter.Seq2[int, T] {
return func(yield func(int, T) bool) {
:= 0
i for {
:= yield(i, s[i])
keepGoing if !keepGoing {
return
}
++
iif i == len(s) {
return
}
}
}
}
Then, using this in place of the inbuilt for range
func main() {
for i, v := range TwoValueSliceIterator([]string{"Amos", "Alex", "Holden", "Naomi"}) {
.Println(i, v)
fmt}
}
we get exactly the same output
0 Amos
1 Alex
2 Holden
3 Naomi
We did it! If Go had not built in for range over slice in both 1 and 2 value versions, we can now be able to develop that functionality ourselves with iterators!
TODO: discuss the following
- eagerness vs laziness - application to infinite/unbounded sequences