Closure Conversion (closure.ml)

Another gap still remaining between MinCaml and assembly is "nested function definitions," which are flattened by closure conversion. It is one of the most important processes when compiling functional languages.

This flattening of nested function definitions includes easy cases and hard cases. For example,

let rec quad x =
  let rec dbl x = x + x in
  dbl (dbl x) in
quad 123

can be flattened like

let rec dbl x = x + x in
let rec quad x = dbl (dbl x) in
quad 123

just by moving the function definition. However, a similar manipulation would convert

let rec make_adder x =
  let rec adder y = x + y in
  adder in
(make_adder 3) 7

into a nonsensical program

let rec adder y = x + y in
let rec make_adder x = adder in
(make_adder 3) 7
.

This is because function dbl has no free variable while adder has a free variable x.

Thus, in order to flatten function definitions with free variables, we have to treat not only the bodies of functions like adder, but also the values of their free variables like x together. It is like

let rec adder x y = x + y in
let rec make_adder x = (adder, x) in
let (f, fv) = make_adder 3 in
f fv 7

in ML-style pseudo code. First, function adder takes the value of free variable x as an argument. Then, when the function adder is returned as a value, its body is paired with the value of its free variable, like (adder, x). This pair is called a function closure. When the function adder is called, its body f and the value fv of its free variable are extracted and supplied as arguments.

Closure conversion gets complicated when it comes to "which functions need closures and which functions can be called in ordinary ways." The simplest solution is to generate closures for all functions, but it is too inefficient.

Thus, the closure conversion Closure.g of MinCaml takes the set known of functions that are "known to have no free variables and can be called in ordinary ways," and closure-converts an expression. (It also takes type environment env in addition to the expression and known.)

The results of closure conversion are represented in data type Closure.t. It is similar to KNormal.t of K-normal forms, but has closure creation MakeCls and the top-level function set toplevel instead of nested function definitions. In addition, instead of general function calls, it has closure-based function calls AppCls and top-level function calls AppDir that do not use closures. Furthermore, in the processes that follow, the data type Id.l of top-level function names (labels) are distinguished from the data type Id.t of variable names. Note that AppCls uses variables and AppDir uses labels. This is because closures are bound to variables (by MakeCls) while top-level functions are called through labels.

In the case of general function call x y1 ... yn, Closure.g checks if the function x belongs to the set known. If so, it returns AppDir. If not, it returns AppCls.

Function definitions like let rec x y1 ... yn = e1 in e2 are processed as follows. First, we assume that the function x has no free variable, add it to known, and closure-convert its body e1. Then, if x indeed has no free variable, continue the process and closure-convert e2. Otherwise, we rewind the values of set known and reference toplevel, and redo the closure conversion of e1. Last, if x never appears in e2 as a variable (not as a label), we eliminate the closure generation MakeCls.

The last part may need elaboration. Even if x has no free variable, as long as it is returned as a value like let rec x y1 ... yn = ... in x, its closure needs to be generated. This is because a user who receives x as a value does not know in general if it has a free variable or not, and therefore must anyway use AppCls (not AppDir) to call the function through its closure. In this case, we do not eliminate MakeCls since x appears in e2 as a variable. However, if x is just called as a function like let rec x y = ... in x 123, it appears only as a label in AppDir (not as a variable), so we eliminate MakeCls.

Next