Description of Benchmarks
Note: Our educated guesses concerning the reasons for
a particular system's performance appear within [square brackets].
This pseudo-benchmark is an aggregate statistic that
shows the geometric mean for all benchmarks.
Where other benchmarks display timings in seconds,
the numerical scores for the geometric mean show the
(geometric) average ratio, over all benchmarks that
a system was able to run, of the system's time to
the fastest system's time.
An average ratio of 1.0 is the lowest possible, and
can be achieved only by a system that is fastest on
every benchmark.
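For concreteness, the score can be computed like this (a Python sketch of the
definition above; the benchmarked systems themselves are Scheme implementations):

```python
from math import prod

def geomean_score(system_times, best_times):
    """Geometric mean of the ratios (system time / fastest time),
    taken over the benchmarks the system was able to run."""
    ratios = [t / best for t, best in zip(system_times, best_times)]
    return prod(ratios) ** (1.0 / len(ratios))

# A system that is fastest on every benchmark scores exactly 1.0;
# a system twice as slow as the fastest everywhere scores 2.0.
```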
Gabriel Benchmarks
Bob Boyer's theorem proving benchmark, a Gabriel benchmark, 20 iterations.
This benchmark uses association lists in place of the original benchmark's
property lists. The nboyer benchmark is an updated and altogether
better version of this benchmark.
Browsing a data base, a Gabriel benchmark, 600 iterations.
[May be a test of string->symbol and/or symbol->string.]
The tak benchmark in continuation-passing style, 1000 iterations.
A test of closure creation.
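A minimal sketch of the idea in Python (an illustration, not the benchmark's
actual Scheme code): every recursive step builds fresh continuation closures,
which is why this style stresses closure creation.

```python
def cpstak(x, y, z):
    """The tak function in continuation-passing style: every call
    receives a continuation k, and each step allocates new closures."""
    def tak(x, y, z, k):
        if not (y < x):
            return k(z)
        return tak(x - 1, y, z,
                   lambda v1: tak(y - 1, z, x,
                   lambda v2: tak(z - 1, x, y,
                   lambda v3: tak(v1, v2, v3, k))))
    return tak(x, y, z, lambda a: a)
```

Python does not eliminate tail calls, so only small arguments are safe here;
in Scheme the benchmark computes (tak 18 12 6).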
The tak benchmark in continuation-capturing style, 100 iterations.
A test of call-with-current-continuation.
[Larceny's code for call-with-current-continuation is now written in
C, and most of its time on this benchmark is spent crossing the
Scheme/C barrier.]
Table-driven symbolic differentiation, a Gabriel benchmark,
2000000 iterations.
Uses association lists instead of the original benchmark's property lists.
Symbolic differentiation, a Gabriel benchmark,
2000000 iterations.
Destructive list operations, a Gabriel benchmark, 500 iterations.
Divides 200 by 2 using lists as a unary notation for integers,
a Gabriel benchmark, 1000000 iterations.
This benchmark tests null?, cons, car, cdr, and little else.
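The technique can be sketched in Python using pairs as cons cells (an
illustration, not the benchmark's code):

```python
def cons(a, d):
    return (a, d)

def create_n(n):
    """Build the unary representation of n: a chain of n cons cells."""
    lst = None
    for _ in range(n):
        lst = cons(None, lst)
    return lst

def iterative_div2(lst):
    """Halve a unary number by keeping every other cell of its list."""
    half = None
    while lst is not None:         # null?
        half = cons(lst[0], half)  # cons, car
        lst = lst[1][1]            # cdr of cdr
    return half

def unary_length(lst):
    n = 0
    while lst is not None:
        n += 1
        lst = lst[1]
    return n
```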
This benchmark is the same as idiv2 except that it uses deep
recursion instead of iteration.
Combinatorial search of a state space, a Gabriel benchmark, 100 iterations.
A test of arrays and classical compiler optimizations.
This benchmark was originally written in Pascal by Forrest Baskett.
The tak benchmark using lists to represent integers, a Gabriel
benchmark, 300 iterations.
Another combinatorial search similar to puzzle, a Gabriel benchmark,
10 iterations.
Numerical Benchmarks
The common characteristic of these numerical benchmarks is that
they are easy to translate into C.
Fast Fourier Transform, 2000 iterations.
A test of floating point arithmetic.
Doubly recursive computation of the 35th Fibonacci number (9227465),
using (< n 2) to terminate the recursion; 5 iterations.
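In Python the same doubly recursive structure looks like this (a sketch;
the benchmark itself is written in Scheme):

```python
def fib(n):
    """Doubly recursive Fibonacci; (< n 2) terminates the recursion."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)
```

fib(35) is 9227465, but the naive recursion makes nearly 30 million calls
to get there, which is the point of the benchmark.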
Calculation of the 35th Fibonacci number using inexact numbers,
with only 2 iterations.
A test of floating point arithmetic.
Uses essentially the same code as the fib benchmark.
Generation of a Mandelbrot set, 100 iterations.
A test of floating point arithmetic.
Determination of a nucleic acid's spatial structure, 5 iterations.
A test of floating point arithmetic, and a real program.
Testing to see whether a point is contained within a 2-dimensional
polygon, 100000 iterations (with 12 tests per iteration).
A test of floating point arithmetic.
Sums the integers from 0 to 10000, 10000 iterations.
Sums the integers from 0 to 10000, 10000 iterations.
A test of floating point arithmetic.
Uses essentially the same code as the sum benchmark.
A triply recursive integer function related to the Takeuchi function,
a Gabriel benchmark, 2000 iterations.
A test of non-tail calls and arithmetic.
Historical note:
The Symbolics 3600 performed 1 iteration of this benchmark in 0.43 seconds
using generic arithmetic; for 2000 iterations, that would be 860 seconds.
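A direct Python transcription of the function (illustrative only):

```python
def tak(x, y, z):
    """Triply recursive; almost all of the work is non-tail calls."""
    if not (y < x):
        return z
    return tak(tak(x - 1, y, z),
               tak(y - 1, z, x),
               tak(z - 1, x, y))
```

The Gabriel benchmark computes (tak 18 12 6), which is 7.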
Kernighan and Van Wyk Benchmarks
Brian W Kernighan and Christopher J Van Wyk wrote a set of small
benchmarks to compare the performance of several scripting languages,
including C and Scheme. Marc Feeley and I have modified some of these
benchmarks to correct bugs and to increase the number of iterations.
A version of the Ackermann function, with arguments 3,9.
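Presumably this is the usual two-argument Ackermann recurrence; a Python
sketch (not the benchmark's code):

```python
def ack(m, n):
    """Ackermann's function; ack(3, n) equals 2**(n + 3) - 3."""
    if m == 0:
        return n + 1
    if n == 0:
        return ack(m - 1, 1)
    return ack(m - 1, ack(m, n - 1))
```

With arguments 3 and 9 the result is 4093, though the recursion is deep
enough that running that case in Python requires raising the recursion limit.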
This benchmark allocates, initializes, and copies some fairly
large one-dimensional arrays.
This file-copying benchmark is a simple test of character i/o.
This tests string-append and substring, and very little else.
This benchmark reads and sums 100,000 floating point numbers.
It is primarily a test of floating point input.
Increments a global variable 100000000 times.
This benchmark was intended to test looping and arithmetic,
but in Scheme it becomes a test of the garbage collector's
write barrier.
This benchmark performs considerable character i/o.
Another character i/o benchmark.
Other Benchmarks
A type checker written by Jim Miller, 40 iterations.
Dynamic type inference, self-applied, 20 iterations.
Written by Fritz Henglein.
A real program.
Earley's parsing algorithm, parsing a 9-symbol input according to one
of the simplest ambiguous grammars, 200 iterations.
A real program, applied to toy data.
A version of fib that uses first-class continuations; written by
Kent Dybvig. Calculates the 18th Fibonacci number 500 times.
This program was provided by Andrew Wright, but we don't know much
about it, and would appreciate more information.
This higher order program creates closures almost as often as it
performs non-tail procedure calls.
Another program that was provided by Andrew Wright,
though it may have been written by Jim Miller.
It enumerates the order-preserving maps between finite lattices.
Another program that was provided by Andrew Wright.
Computes maximal matrices; similar to some puzzle programs.
Constructs a maze on a hexagonal grid, 4000 iterations.
Written by Olin Shivers.
[The maze benchmark uses bitwise-not and bitwise-and procedures that
are not standard in R5RS. Implementations that provide these
procedures as extensions probably have an advantage over
implementations that use the portable versions we wrote for this
benchmark.]
Constructs a maze on a rectangular grid using purely functional style,
1000 iterations.
Written by Marc Feeley.
Computes the number of solutions to the 8-queens problem,
2000 times.
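As a sketch of the computation (in Python, not the benchmark's Scheme):
count placements row by row, tracking attacked columns and diagonals.

```python
def count_queens(n):
    """Count the solutions to the n-queens problem."""
    def place(row, cols, diag1, diag2):
        if row == n:
            return 1
        total = 0
        for col in range(n):
            if col in cols or (row - col) in diag1 or (row + col) in diag2:
                continue
            total += place(row + 1,
                           cols | {col},
                           diag1 | {row - col},
                           diag2 | {row + col})
        return total
    return place(0, set(), set(), set())
```

For the 8-queens problem the count is 92.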
Computes the number of paraffins that have 17 carbon atoms,
1000 times.
Partial evaluation of Scheme code, 200 iterations.
Written by Marc Feeley.
Computes the primes less than 100, 100000 times, using
a list-based Sieve of Eratosthenes.
Written by Eric Mohr.
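A list-based sieve in that spirit (a Python sketch, not Eric Mohr's code):

```python
def list_sieve(n):
    """Primes below n: repeatedly take the head of the candidate list
    and filter out its multiples, keeping everything in lists."""
    candidates = list(range(2, n))
    primes = []
    while candidates:
        p = candidates[0]
        primes.append(p)
        candidates = [c for c in candidates if c % p != 0]
    return primes
```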
Ray tracing a simple scene, 5 iterations.
A test of floating point arithmetic.
This program is translated from the Common Lisp code in
Example 9.8 of Paul Graham's book on ANSI Common Lisp.
A Scheme interpreter evaluating a merge sort, 20000 iterations.
Written by Marc Feeley.
Simplex algorithm, 100000 iterations.
A test of floating point arithmetic, and a real program.
Scheme to LaTeX processor, 20 iterations.
A test of file i/o and probably much else.
Part of a real program written by Dorai Sitaram.
Creates a list containing all 362880 permutations of a
list of 9 integers, in Gray code order, using Zaks's
algorithm, 10 iterations.
Please note that this benchmark is not the same as the 10perm9
benchmark written by Will Clinger, because it does not retain the
result of the previous iteration while computing the result of the
current iteration.
As written by Will Clinger, the 10perm9 benchmark is a test of
storage allocation and garbage collection.
At the end of each iteration, the oldest half of the
live storage becomes garbage.
This benchmark is particularly difficult for generational garbage
collectors, since it violates their assumption that young objects
have a shorter future life expectancy than older objects.
The perm9 benchmark distributed by Marc Feeley does not have that
property.
An updated and exponentially scalable version of the boyer benchmark.
The nboyer benchmark's data structures are considerably more
appropriate than the data structures used in the boyer benchmark.
These timings are for 100 iterations of a problem of size 0, which is
the same size as for boyer, and is way too small for modern machines.
A test of lists, vectors, and garbage collection.
A version of nboyer that has been tuned (by Henry
Baker) to reduce storage allocation, making it less of a garbage collection
benchmark and more of a compiler benchmark. Only 4 lines of code were
changed, and another 7 lines of code were added.
This program was written to mimic the phase structure that has been
conjectured for a class of application programs for which garbage
collection may represent a significant fraction of the execution time.
This benchmark warms up by allocating and then dropping a binary tree
of height 18.
Then it allocates a permanent tree of height 16 and a permanent array
of 500000 floating point numbers.
Then it allocates about 350 megabytes of tree storage in seven phases,
allocating about 50 megabytes per phase.
The first phase allocates 67648 trees of height 4, dropping each tree
before allocating the next.
The second phase allocates and drops 16512 trees of height 6,
and so on to the last phase, which allocates 8 trees of height 16.
Each phase is divided into two subphases. The first subphase allocates
trees top-down using side effects, while the second subphase allocates
trees bottom-up without using side effects.
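The two subphases can be sketched like this (a Python illustration; the
names are ours, not the benchmark's):

```python
class Node:
    """A binary tree node; the benchmark allocates great numbers of these."""
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

def make_tree(depth):
    """Bottom-up, side-effect-free subphase: children exist before the parent."""
    if depth <= 0:
        return Node()
    return Node(make_tree(depth - 1), make_tree(depth - 1))

def populate(node, depth):
    """Top-down, imperative subphase: mutate the parent to attach new children."""
    if depth <= 0:
        return
    node.left = Node()
    node.right = Node()
    populate(node.left, depth - 1)
    populate(node.right, depth - 1)
```

The top-down subphase writes pointers into older objects, which exercises a
generational collector's write barrier; the bottom-up subphase does not.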
This benchmark was written in Java by John Ellis and Pete Kovac,
modified by Hans Boehm, and translated into Scheme, Standard ML,
C++, and C by William Clinger.
Parses the nboyer benchmark 1000 times, using a scanner and parser
generated by Will Clinger's LexGen and ParseGen.
[The MIT Scheme compiler cannot compile single procedures
that are as large as this benchmark, so the timing shown
is for the MIT Scheme interpreter.]
A synthetic garbage collection benchmark written by David Detlefs
and translated to Scheme by Will Clinger and Lars Hansen.
Last updated 16 February 2007.