The intermediate form used by the front end is a rigidly stylized subset of Scheme. Stylistic variations that would ordinarily have no semantic significance are used to express the results of static analysis. Other information needed for optimization and transformation is conveyed by quoted data in command positions, where the value would be ignored by a standard implementation of Scheme.
The output of the front end is genuine Scheme code that can be compiled by any Scheme compiler. This makes the front end easier to understand, and has been of great benefit for testing and debugging.
Although many Scheme compilers will in fact generate better code when Twobit's front end is used as a preprocessor, the full benefits of its optimizations are realized only by a code generator that exploits the additional information that is encoded by stylistic variation and quoted constants.
The subset of Scheme used by the front end is generated by the following grammar.
L --> (lambda (I_1 ...) (begin D ...) (quote (R F <decls> <doc>) E) | (lambda (I_1 ... . I_rest) (begin D ...) (quote (R F <decls> <doc>) E) D --> (define I L) E --> (quote K) ; constants | (begin I) ; variable references | L ; lambda expressions | (E0 E1 ...) ; calls | (set! I E) ; assignments | (if E0 E1 E2) ; conditionals | (begin E0 E1 E2 ...) ; sequential expressions I -->R --> ((I <references> <assignments> <calls>) ...) F --> (I ...)
Note that the variable x
is represented by
(begin x)
, which gives to variable references
a list structure that can be shared and side effected.
For every identifier that is bound by a lambda expression,
the table R contains an entry listing all references,
assignments, and calls to that identifier.
These references, assignments, and calls share their structure
with the actual references, assignments, and calls within the
body of the lambda expression.
By using side effects, the compiler can substitute expressions
for identifiers without having to traverse the identifer's scope.
The compiler can also use
Except for these R tables and constants, the Scheme expression
used as the intermediate form does not otherwise contain any
shared structure, not does it share structure with the original
definition or expression being compiled.
The intermediate form is therefore acyclic, and its printed form
is genuine Scheme code, but its printed form is often very large
because of the shared structures, and is hard to read because of
all the clutter.
A make-readable
procedure is therefore used to
format the intermediate code for easier comprehension by humans.
Unless otherwise noted, the intermediate Scheme code shown in
these notes is the output of make-readable
.
This intermediate form is created by pass 1.