Twobit Pass 2: incremental lambda lifting

Incremental lambda lifting

Incremental lambda lifting lifts the internal definitions of known local procedures to the next outer lambda expression. For example, the internal definition of .loop_3 in

  ((lambda ()
     (begin
      (set! reverse-map
            (lambda (.f_2 .l_2)
              (define .loop_3
                (lambda (.l_5 .x_5)
                  (if (pair? .l_5)
                      (.loop_3 (cdr .l_5)
                               (cons (.f_2 (car .l_5)) .x_5))
                      .x_5)))
              (.loop_3 .l_2 '())))
      'reverse-map)))

is lifted to the outer lambda expression in

  ((lambda ()
     (define .loop_3
       (lambda (.f_2 .l_5 .x_5)
         (if (pair? .l_5)
             (.loop_3 .f_2
                      (cdr .l_5)
                      (cons (.f_2 (car .l_5)) .x_5))
             .x_5)))
     (begin
      (set! reverse-map
            (lambda (.f_2 .l_2)
              (.loop_3 .f_2 .l_2 '())))
      'reverse-map)))

Because .loop_3 is lifted out of the scope of .f_2, which occurs free within the original code for .loop_3, .f_2 must become an additional formal parameter to .loop_3, and all calls to .loop_3 must be changed to pass this additional parameter.

Lambda lifting is one of the most complex optimizations performed by Twobit, mainly because of complications that arise with groups of mutually recursive known local procedures. Consider

((lambda ()
   (begin (set! foo
                (lambda (n x y)
                  (define (f n)
                    (if (zero? n) x (g (- n 1))))
                  (define (g n)
                    (if (zero? n) y (f (- n 1))))
                  (f n)))
          'foo)))

If f were lifted to the outer lambda expression, then it would have to take g and x as extra parameters. That means g would have to supply g and x as extra arguments when it calls f:

((lambda ()
   (define (f g x n)
     (if (zero? n) x (g (- n 1))))
   (begin (set! foo
                (lambda (n x y)
                  (define (g n)
                    (if (zero? n) y (f g x (- n 1))))
                  (f g x n)))
          'foo)))

Notice that g now escapes its scope, negating the earlier closure analysis. Worse yet, if g were lifted to the outer lambda expression, it would have to take an extra parameter y. That means f would have to supply y as an extra argument when it calls g.

But f is now outside the scope of y! Lifting g can therefore add extra arguments not only to g but to procedures that call it. Adding extra arguments to those procedures may in turn involve adding extra arguments to procedures that call the procedures that call g, and so on.

Eventually this process of adding arguments will stabilize. For this example the result is

((lambda ()
   (define (f g x y n)
     (if (zero? n) x (g x y (- n 1))))
   (define (g x y n)
     (if (zero? n) y (f g x y (- n 1))))
   (begin (set! foo
                (lambda (n x y)
                  (f g x y n)))
          'foo)))

This is suboptimal: f is back within the scope of g, so it is unnecessary to pass g to f.

The flow equation for incremental lambda lifting

The solution is to lift each group of known local procedures as a unit, after performing a flow analysis to determine precisely which parameters must be added to each procedure. Let V be the set of parameters for the lambda expression at whose head a group of known local procedures is defined, let f1, ... be the names of those procedures, and let FV(fi) be the variables that occur free in the code for fi. The set of variables Ai that need to be added as parameters to fi is defined by the recursive flow equation

Ai = (FV(fi) intersect V) union (union {Aj | fi calls fj})

Compare this with the flow equation for conventional lambda lifting:

Ai = FV(fi) union (union {Aj | fi calls fj})

The only difference between these flow equations is that, by intersecting with V, incremental lambda lifting limits the flow problem to the formal parameters of the lambda expression from which the known local procedures are being lifted. Another way of looking at it is that conventional lambda lifting takes V to be the set of all variables that occur within the program.

Solving these flow equations takes O(m^3 * n^2) time, where m is the number of procedures being lifted and n is the number of variables in the base set V. For incremental lambda lifting, m and n are usually quite small. On the other hand, incremental lambda lifting requires O(m) different flow problems to be solved.

Twobit currently represents sets of variables as lists, so it actually uses an O(m^3 * n^3) algorithm to solve these flow equations. When compiling Twobit itself, Pass 2 spends about 6% of its time solving flow equations. This is less than 1% of the total compilation time.

Experiments using an O(m^3 * n^2) algorithm on synthetic benchmarks have shown that incremental lambda lifting can be either faster or slower than conventional lambda lifting, depending on the structure of the program being compiled.

In short, compilation time is not a major issue for incremental lambda lifting.

When and how to lift

Incremental lambda lifting is flexible because the lifting process can be halted at an intermediate scope. Heuristics are applied at each stage of lifting to determine whether further lifting is desirable.

Interprocedural register targeting is performed by reordering the formal parameters of known local procedures when they are lifted.

Incremental lambda lifting is performed at the very end of Pass 2.