Compiling with Continuations

A review of the book Compiling with Continuations, Appel (1992).


Overview

Compiling with Continuations is an excellent book for learning Standard ML beyond what’s normally taught or discussed. It’s an excellent book for learning about practical applications of continuations beyond what’s normally taught or discussed. But who would want to do that? Was it influential in the field of computer science? I really can’t answer those questions.

The book has no exercises. This positions it awkwardly in the broader landscape of compiler study. It’s as if it’s a book that isn’t meant to be studied or understood, or perhaps it just doesn’t care. Maybe these things are just somebody else’s problem. It’s not about any syllabus’s learning outcomes. Rather, it’s about the book shotgun-firing an onslaught of facts about each chapter’s title at you: enough that it becomes confusing what’s going on while, ironically, giving you just barely enough to implement the topics of the first half of the book.

Doing a Google search for the book reveals plenty of parody; tantalizing wordplay, even: “Compiling with Continuations, Continued” (2007, cited by 154); “The Essence of Compiling with Continuations” (1993, cited by 806); “Compiling with continuations, or without? whatever.” (2019, cited by 34). Doing a search for source code on GitHub reveals basically nothing. No blog posts. No source code repositories. Nothing. Perhaps it’s some sort of meme book used by computer science researchers to look smart, bolstering their papers with the immense knowledge they’re supposedly building on.

This is what the book has to say about itself:

All the nitty-gritty details of compiling to really good machine code are covered, including the interface to a runtime system and garbage collector.

This book will be essential reading for compiler writers in both industry and academe, as well as for students and researchers in programming-language theory.

Compiling with Continuations (back cover)

That was written in 1992, and even at the time it was only wishful thinking. But there is a real audience for this book, and a real value it has to offer. That value is Standard ML, and the book makes this powerful fact very clear.

MiniML

Chapter 4 describes a MiniML language used by the SML/NJ compiler. According to the book, it has no syntactic form (although the book tries to give it one), and it’s used to represent elaborated Standard ML programs. The MiniML I’m using isn’t like that: it has a concrete syntax and consequently requires elaboration. The AST can be elaborated directly into Lambda, though, so the effect is about the same.

Lexing

The lexing into tokens was done using lex, just like in the canonical implementation of Standard ML. However, unlike that version, all reserved words are lexed into their own tokens instead of more general identifier tokens.
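The reserved-word handling can be sketched with a small hand-rolled tokenizer. This is a Python sketch rather than a lex specification, and the token names and the subset of reserved words are my own, not the canonical lexer’s:

```python
import re

# A hypothetical handful of Standard ML reserved words; the point is that
# each becomes its own token kind rather than a general IDENT.
RESERVED = {"fn", "fun", "val", "let", "in", "end", "if", "then", "else"}

TOKEN_RE = re.compile(r"\s*(?:(?P<id>[A-Za-z][A-Za-z0-9_']*)"
                      r"|(?P<num>\d+)"
                      r"|(?P<sym>[=()+\-*]))")

def tokenize(src):
    """Return (kind, text) pairs; reserved words get a dedicated kind."""
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if not m:
            break  # trailing whitespace or an unrecognized character
        if m.group("id"):
            word = m.group("id")
            tokens.append((word.upper() if word in RESERVED else "IDENT", word))
        elif m.group("num"):
            tokens.append(("NUM", m.group("num")))
        else:
            tokens.append(("SYM", m.group("sym")))
        pos = m.end()
    return tokens
```

On `let val x = 1 in x end` this yields LET, VAL, IDENT, SYM, NUM, IN, IDENT, END, with only the two `x` occurrences tokenized as identifiers.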

Parsing

The parsing into an AST was done using yacc, just like in the canonical implementation of Standard ML. Many programs parse well, although quite a few don’t. Despite the use of yacc, Standard ML isn’t an LALR(1) language, and even this subset of the language is difficult to parse, but for the purposes of studying the book it’s sufficient.

In fact, it’s better than sufficient. It’s much better than keeping large blocks of AST values in the source files. Better to reason about. Better to work with even if it’s not perfect.

Evaluation

Evaluation is done by compiling the AST into the CPS language and using the CPS semantics evaluator from Chapter 3 to run the program. This is important for checking the correctness of optimizations, most of which happen in the CPS language.

Lambda

The data representation and associativity grouping of the MiniML language are left somewhat ambiguous in the syntax. Those details aren’t ambiguous in the Lambda language, and the MiniML language can be elaborated directly into the Lambda language. After this, the Lambda language can be compiled directly into the CPS language using the compiler described in Chapter 5.
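The Lambda-to-CPS step can be illustrated with a naive CPS transform over a bare lambda calculus. This is a minimal Python sketch of the general idea only, not Appel’s Chapter 5 algorithm; every constructor and helper name here is hypothetical:

```python
from itertools import count

_fresh = count()

def gensym(base):
    """Generate a fresh variable name (hypothetical helper)."""
    return f"{base}{next(_fresh)}"

def cps(term, k):
    """Convert `term` so its value is handed to the continuation term `k`.

    Source terms: ("var", x), ("lam", x, body), ("app", f, a).
    CPS forms (hypothetical constructors, not Appel's datatypes):
      ("throw", k, v)       - pass value v to continuation k
      ("cont", x, body)     - a continuation accepting x
      ("clam", x, j, body)  - a source lambda given an extra continuation arg j
      ("capp", f, a, k)     - a full application: it never returns
    """
    tag = term[0]
    if tag == "var":
        return ("throw", k, term)
    if tag == "lam":
        _, x, body = term
        j = gensym("j")
        return ("throw", k, ("clam", x, j, cps(body, ("kvar", j))))
    if tag == "app":
        _, f, a = term
        fv, av = gensym("f"), gensym("a")
        # Evaluate f, then a, then make the full application.
        return cps(f, ("cont", fv,
               cps(a, ("cont", av,
                   ("capp", ("var", fv), ("var", av), k)))))
    raise ValueError(tag)
```

After conversion, every intermediate result is named and every call is a tail call, which is exactly what makes the later phases tractable.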

CPS

Chapter 2 gives a description of the CPS language, and Chapter 3 has an evaluator for its semantics. Most of the optimizations and program analysis are done in this language. Chapter 1 explains why it’s beneficial to optimize programs in the CPS language specifically.
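The flavor of the Chapter 3 evaluator can be captured with a tail-call interpreter over a drastically reduced CPS expression set. This is a Python sketch under my own hypothetical constructors, not the book’s datatypes:

```python
def evaluate(exp, env):
    """Tail-call interpreter for a tiny CPS language (hypothetical forms):
      ("fix", [(name, params, body), ...], cont) - mutually recursive functions
      ("app", f, args)                           - application; never returns
      ("primop", op, args, var, cont)            - primitive, binds result to var
      ("if", atom, then, else)                   - branch on nonzero
      ("halt", atom)                             - end of computation
    Atoms are ("var", x) or ("int", n).
    """
    ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
           "*": lambda a, b: a * b}

    def atom(a):
        return env[a[1]] if a[0] == "var" else a[1]

    while True:  # a plain loop: CPS application is always a tail call
        tag = exp[0]
        if tag == "halt":
            return atom(exp[1])
        if tag == "fix":
            env = dict(env)
            for name, params, body in exp[1]:
                env[name] = (params, body, env)  # shared dict: siblings visible
            exp = exp[2]
        elif tag == "primop":
            _, op, args, var, cont = exp
            env = dict(env, **{var: ops[op](*[atom(a) for a in args])})
            exp = cont
        elif tag == "if":
            exp = exp[2] if atom(exp[1]) else exp[3]
        elif tag == "app":
            params, body, fenv = atom(exp[1])
            env = dict(fenv, **dict(zip(params, [atom(a) for a in exp[2]])))
            exp = body
```

Having such an evaluator around is what makes it possible to check that an optimized CPS program still computes the same answer as the original.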

The CPS language may contain inner FIX expressions, just like the G-code described in The Implementation of Functional Programming Languages, Peyton Jones (1987). It may also contain more free variables than can be stored in registers. Both of these are essential problems to solve for compilation, unlike the optimizations. These problems are introduced in Sections 2.4 and 2.5 of the book, and Chapters 10 and 11 describe how they’ve been solved.

Closure Conversion

The book describes two ways of doing closure conversion: a naive solution in Chapter 2, and a closure-sharing solution in Chapter 10. Closure sharing ends up being a very elegant solution to this problem.
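The naive approach can be sketched as: rewrite each lambda into a record pairing closed code with its free variables, where the code begins by fetching those variables back out of the record. A hypothetical Python sketch of that idea, not the book’s algorithm:

```python
def free_vars(term, bound=frozenset()):
    """Free variables of a term: ("var", x), ("lam", x, body), ("app", f, a)."""
    tag = term[0]
    if tag == "var":
        return set() if term[1] in bound else {term[1]}
    if tag == "lam":
        return free_vars(term[2], bound | {term[1]})
    if tag == "app":
        return free_vars(term[1], bound) | free_vars(term[2], bound)
    return set()

def closure_convert(term):
    """Naive conversion: ("lam", x, body) becomes
    ("closure", ("code", x, "env", body'), [free vars]), where body'
    selects each free variable out of the env record by position."""
    tag = term[0]
    if tag == "lam":
        _, x, body = term
        fvs = sorted(free_vars(term))
        new_body = closure_convert(body)
        for i, v in enumerate(fvs):
            new_body = ("let", v, ("select", i, ("var", "env")), new_body)
        return ("closure", ("code", x, "env", new_body), fvs)
    if tag == "app":
        return ("app", closure_convert(term[1]), closure_convert(term[2]))
    return term
```

The closure-sharing refinement lets several functions point at one shared record instead of each carrying its own copy of the free variables.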

Callee-save registers can be used to store the closures of some of the continuations. Specifically, continuations that aren’t first-class can carry their values in registers between applications. Unfortunately, the book describes being able to identify most, but not all, non-first-class continuations, and the implementation is an involved process.

Register Spilling

Chapter 11 describes the algorithm and data flow equations used to implement register spilling. Strangely enough, there’s no mention of what to do about register assignment. The free variables of the CPS language closely correspond to the free variables of the abstract machine language, and register assignment can happen during spilling, but this involves mixing the CPS language and the abstract machine language together. That would result in the CPS datatype holding abstract machine language registers instead of CPS variables.

Deferring register assignment until after compilation to the abstract machine language requires redoing the work of register spilling, except the book only describes how to do this in the CPS language. The problem is very solvable, but it has more than one solution, and a little more insight here would have been helpful.
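The core of the spilling decision — comparing the number of simultaneously live variables against the register count — can be sketched with a backward liveness pass over straight-line code. This is a Python sketch with a crude furthest-next-use policy, not the book’s data-flow equations:

```python
def live_ranges(instrs):
    """instrs: list of (defined_var, [used_vars]) in execution order.
    Returns, per instruction index, the set of variables live just
    before it (standard backward pass)."""
    live, out = set(), []
    for d, uses in reversed(instrs):
        live = (live - {d}) | set(uses)
        out.append(set(live))
    return list(reversed(out))

def spill_candidates(instrs, nregs):
    """Variables live at any point where more than `nregs` values are
    simultaneously live - these must go to a spill record in memory."""
    spills = set()
    for i, live in enumerate(live_ranges(instrs)):
        if len(live) > nregs:
            # Crude policy: spill the variables used furthest in the future.
            by_next_use = sorted(live, key=lambda v: next(
                (j for j in range(i, len(instrs)) if v in instrs[j][1]),
                len(instrs)))
            spills |= set(by_next_use[nregs:])
    return spills
```

In the book’s setting the “instructions” are CPS expressions and the spill record is itself a CPS-level record, which is exactly why it’s tempting (and messy) to do register assignment in the same pass.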

Optimizations

The book focuses on the big ideas of optimization: inline expressions to remove abstraction, and reduce unnecessary expressions to simplify the program. The hope is that doing this produces a canonical representation of the program. Chapter 7 describes how to inline expressions in the CPS language, and Chapters 6 and 9 describe how to simplify them.
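The always-shrinking case of inlining — beta-contracting an application whose argument is an atom, so no work is duplicated — can be sketched on bare lambda terms. A hypothetical Python sketch that assumes all binder names are unique, unlike the book’s CPS version:

```python
def substitute(term, name, value):
    """Replace free occurrences of `name` by `value` in a term:
    ("var", x), ("lam", x, body), ("app", f, a), ("int", n)."""
    tag = term[0]
    if tag == "var":
        return value if term[1] == name else term
    if tag == "lam":
        # Unique binders are assumed, so no capture check is needed.
        if term[1] == name:
            return term
        return ("lam", term[1], substitute(term[2], name, value))
    if tag == "app":
        return ("app", substitute(term[1], name, value),
                       substitute(term[2], name, value))
    return term

def beta_contract(term):
    """Reduce ((lam x. body) arg) to body[arg/x] whenever arg is a
    variable or constant - the case that always shrinks the program."""
    tag = term[0]
    if tag == "app":
        f = beta_contract(term[1])
        a = beta_contract(term[2])
        if f[0] == "lam" and a[0] in ("var", "int"):
            return beta_contract(substitute(f[2], f[1], a))
        return ("app", f, a)
    if tag == "lam":
        return ("lam", term[1], beta_contract(term[2]))
    return term
```

Restricting the argument to atoms is what makes this safe to apply exhaustively; inlining larger arguments needs the size heuristics the book spends Chapter 7 on.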

Chapter 8 describes hoisting by using term rewriting rules of the CPS language. The book gives hints that hoisting can be used to remove the effects of dominators and could be useful for instruction scheduling. Unfortunately, hoisting is used very conservatively in their compiler, and that’s really all the book has to say about this topic.

Virtual Machine

A virtual machine is described in Chapter 13. The virtual machine is quite simple, but implementing the runtime to get programs interpreted on it isn’t. Chapter 16 describes the garbage collector for the runtime, but that is all the insight the book has on the runtime.

Actually, the problem is worse than that. Chapter 13 says that linking isn’t required. This is wrong. In order to compile into the abstract machine language, a link loader is required, and so is a link-loader format. A HALT or RESET instruction is required as well, even though compiled CPS programs never correspond to those semantics directly. As if that weren’t bad enough, the end-of-computation continuation needs to encode machine instructions that preserve the register state of the virtual machine at the top of the heap image. At least it does in my solution.

After this, the book is basically complete. Chapter 14 describes quite well what to do about the differences between the abstract machine and conventional hardware, but the rest of the book isn’t concerned with implementation details.

Conclusion

There were no functors. There was no discussion of type checking, neither in the book nor in this article, but more about Standard ML has been revealed than most people will ever discover. The book emphasizes very fundamental ideas and uses them to good effect. Standard ML is a powerful language built on powerful ideas about how to do computation. The book shows quite well that continuations can be useful for more than just implementing interpreters. They can be used for everything.

The CPS language never became a popular intermediate representation. Time has passed this book by. Even the SML/NJ compiler eventually switched to MLRISC and then to LLVM. Section 1.2 describes how CPS compares to other intermediate representations. Many of the optimizations in the book don’t need continuations at all and could have been done in the Lambda language, although the fact that they can be deferred to an IR is insightful. Admittedly, I feel like my time wasn’t well spent working through this book.
