Now that lexical analysis is done, instead of strings like

`'1' '2' '3' '-' '4' '5' '6' '+' '7' '8' '9'`

we have sequences of tokens like

`"123" "-" "456" "+" "789"`.

However, we still cannot do any high-level processing on such a flat sequence as above. For example, we have to recognize "`123-456+789`" to be "`(123-456)+789`", not "`123-(456+789)`". In terms of the data type `Syntax.t` defined in `syntax.ml`, we need a *parse tree* like:

`Add(Sub(Int 123, Int 456), Int 789)`

This process of translating a sequence of tokens into a parse tree is called *parsing*. The MinCaml compiler uses a tool called `ocamlyacc` to implement parsing in file `parser.mly`.

The contents of `parser.mly` are more or less similar to those of `lexer.mll`. It lists pattern matching from sequences of tokens to parse trees, for example like:

`| exp PLUS exp
{ Add($1, $3) }`

`$1` and `$3` mean the first and third syntax elements (which are both `exp` in this case).

This definition of syntax is almost the same as the previously described expressions `e`, but one point needs caution. In ML, function applcation is denoted by just a sequence of expressions. Therefore, if you write `x - y`, it is ambiguous whether you are subtracting integer `y` from integer `x` or applying function `x` to argument `-y`! Thus, we distinguish simple expressions `simple_exp`, which can be function arguments without extra parentheses, from general expressions `exp`. For example, since `-y` is not a `simple_exp`, the previous expression is known to be integer subtraction, not function application.

`parser.mly` also defines the priority of various syntax and binary operators as well as the data type of tokens mentioned in `lexer.mll`.

By the way, whenever the type of a variable is required (as in `let`), it is given by a fresh, undefined type variable `Var(ref None)` for the moment. This point will be explained next in type inference.