A Crash Course for the MinCaml Compiler

Type Inference (`typing.ml`)

As in ordinary ML, MinCaml infers the types of variables and functions even though they are not written in the program. This functionality is called type inference. Type inference in general is very useful for programs with polymorphic and higher-order functions in particular (though MinCaml itself is monomorphic as specified by the syntax of types).

The core of type inference in MinCaml is function Typing.g. This function takes type environment env (mapping from the names of variables to their types) and expression e, and returns the inferred type of e. It also checks the types of sub-expressions. If an undefined type variable is found, it is substituted with an appropriate type. This checking and substitution are implemented by function Typing.unify.

For example, in the case of integer addition e1 + e2 (or Add(e1, e2) in terms of Syntax.t), we first infer the type of sub-expression e1 by g env e1, which is checked against int by unify. Ditto for e2. Then, we return int as the whole expression's type.

A little more complex case is function application e e1 ... en, where e is a function and e1 through en are its arguments. In this case, the function's type can be inferred by g env e and the arguments' types by g env e1 through g env en, but the result's type cannot. Thus, we create a fresh undefined type variable t, and call unify so that g env e equals the function type from g env e1, ..., g env en to t. Then, we return t as the whole expression's type.

When a new variable is introduced as in let and let rec, the type environment env is extended. Conversely, when variable x appears, its type is obtained by looking up env. However, when a variable is not found in the type environment, it is regarded as an external variable, assigned a fresh undefined type variable, and added to the special type environment extenv for external variables. This functionality is specific to MinCaml and not available in ordinary ML. Thanks to it, external variables can be used without being declared.

Function Typing.unify recursively inspects whether two given types are equal and, if one of the types is an unfefined type variable Type.var(ref None), substitutes it with the other type so that they become equal. However, before this substitution, it is checked whether the type variable appears inside the other type. This process is called occur check. It is necessary so that the resulting type has no cycle. For example, if we unified (with no occur check) type variable α with function type int->α, the result would be an infinite type like int->int->int->... because α = int->α! In order to prevent such cases, occur check is necessary even though it may appear mysterious at first.

After type inference is finished (either by type error or by normal termination), Typing.deref_typ replaces all type variables with their contents for the sake of simplicity. Type variables that are still undetermined are substituted at will with int. This feature is also specific to MinCaml.

Type Inference (typing.ml)

Type Inference (`typing.ml`)