From Thunks to Streams

By Dario Iacampo
A stream is a thunk that when you call it returns a pair of the first element of the stream and then a stream for the rest of the elements.

This allows you to define and use infinitely long sequences like the positive integer numbers or the fibonacci numbers or the powers of two.

Let's start with some little piece. First of all the reason why you need a thunk and what it means.

You use a thunk when you want to delay and store computation: in many languages, arguments of functions are evaluated before the function body evaluation: each of them is evaluated once and then the function body refers to those computed values with variable names.

Conditionals works differently: you don't evaluate arguments ahead of time, condition is evaluated first and then only one of then/else branch gets evaluated.

Let's mess a bit with this concept, say you define a wrapper for the IF function that works like this:

(define (bad-if condition true-branch else-branch)
    (if condition true-branch else-branch))

now happens that if you use this bad-if function, both true-branch and else-branch are evaluated when you call the function. This is obviously bad because you are doing extra computation, can be really bad if the computation you are doing is heavy and can be worse if you are using this function in a recursive procedure, leading in an infinite loop.

(define (factorial-bad x)
    (bad-if  (= x 0) 1 (* x (factorial-bad (- x 1)))))

Now, if you try to evaluate factorial-bad with any argument it ends up in an infinite loop because arguments of the bad-if function have to be evaluated ahead of time. Even if you call it with 0, the bad-if tries to evaluate the true-branch -> 1 and the else-branch -> (* x (factorial-bad (- x 1))...

To make this work (and this is no good style, you should use the normal if function) you have to change the kind of arguments that bad-if takes. bad-if should take for then and else branches some procedure that when you call them, you get back their values:

(define (bad-if condition then-branch else-branch)
    (if condition (then-branch) (else-branch)))

remembering that in Racket, (then-branch) means call then-branch with no arguments.

Now I'll change my factorial-bad as follows:

(define (factorial-bad x)
    (bad-if  (= x 0)
                (lambda () (1))
                (lambda () (.... recursive computation...))

This practice of defining a zero arguments function to delay evaluation is called THUNK.
Just to recap, you substitute
that is eagerly evaluated, with
(lambda () (EXPRESSION))
that is only evaluated when you call it.

Now, let's cheat a bit and pretend we already have a stream defined, say our stream contains powers of 2 (2 4 6 8 ...)
if you call it (powers-of-two) you get a pair with the first element ( 2 ...) and the stream that contains the rest of the elements (2 . #<procedure...>).

So to get the second element, you need the car or calling the cdr of the stream
(car ((cdr (powers-of-two))))
and so on.

From this it gets easier to infer how a stream is made: you have to create a function with no arguments that returns a pair with the first element of the stream and a function that when you call it returns a pair with the second element of the stream and a function, ....

Let's try with the simplest stream ever, a stream that gives you 1 each time you call it forever:

As you can see ones is a procedure that returns a paire as I described before.

Now let's try something slightly more complicated: natural numbers. If you think about it, the main point here is to construct a function that takes as argument the first element of the stream and conses it in a function that takes no arguments and when you call it invokes the same function you are constructing with the next element of the stream as argument.

(define (f x) (cons x (lambda () (f (+ x 1)))))

So we have a function f that takes x and conses it into a function with no arguments that calls f with the next element, done, we only need to call this f with the first element of the stream:

The code is self-explainatory.

One of the things to notice is the the fact that the representation of x that I cons into the stream is independent from the x with whom I call f again, this sounds complicated but think about it: if I want that some elements of my sequence are negated according to some rule this is easy to accomplish, I just have to intervene on the (cons x (...) part: I have the same x here and in the subsequent function invocation and I can do what I want with those variables independently.

Say I want to negate even elements:

I only touched the (cons x (...) part and everything worked out correctly.
Now, I have to say that there is a bit of a problem here: since the f I defined I only need for constructing the nats-special stream, there is no reason for having this function publically defined, it should be defined inside nats-special by using define (current best practice in Racket community) or letrec, to highlight the fact that we have to call this function recursively.


Jan-Mar 2013 Programming Languages course by Dan Grossman (University of Washington)