Scheme Specification
Last Updated: Fall 2021
About This Specification
The version of Scheme used in this course is not perfectly true to any official specification of the language, though it is perhaps closest to R5RS, with some features and terminology from other versions, both older and newer. We deviate from the official specifications for several reasons, including ease of implementation (both for the staff in reference implementations and for students in completing the Scheme project) and ease of instruction.
This document and the linked built-in procedure reference are very long and are an attempt to formalize the variant of Scheme used in UCB CS61A (and thus in CS 111). Lectures, labs, and discussion are probably better resources for first learning the language. However, the Overview and Terminology section may be useful if you wish to read a more formal description of the language.
You should not find it necessary to read the full references of special forms and built-in procedures, but specific sections may be helpful to reference when working on Scheme assignments and the project (the relevant sections will typically be linked in the assignment itself).
A completed and correct implementation of the Scheme project, including all extra credit problems, should match the functionality of the staff interpreter, excluding error tracing.
This document primarily focuses on the Python-based interpreter. The web interpreter should largely match its behavior, but due to the different host language and restraints of the web platform, some inconstencies may exist, for which you can file issues. Additionally, the web interpreter contains several extra features, such as a diagrammer and JS interopability that are not documented here.
Overview and Terminology
Expressions and Environments
Scheme works by evaluating expressions in environments. Every expression evaluates to a value. Some expressions are self-evaluating, which means they are both an expression and a value, and that it evaluates to itself.
A frame is a mapping from symbols (names) to values, as well as an optional parent frame. The current environment refers to the current frame, as well as a chain of parent frames up to the global frame (which has no parent). When looking up a symbol in an environment, Scheme first checks the current frame and returns the corresponding value if it exists. If it doesn't, it repeats this process on each subsequent parent frame, until either the symbol is found, or there are no more parent frames to check.
Atomic Expressions
There are several atomic or
primitive expressions. Numbers, booleans, strings, and the
empty list (nil) are all both atomic and self-evaluating. Symbols
are atomic, but are not self-evaluating (they instead evaluate to a value that
was previously bound to it in the environment).
Call Expressions
The Scheme expressions that are not atomic are called combinations, and consist of one or more subexpressions between parentheses. Most forms are evaluated as call expressions, which has three evaluation steps:
- Evaluate the first subexpression (the operator), which must evaluate to a procedure (see below).
- Evaluate the remaining subexpressions (the operands) in order.
- Apply the procedure from step 1 to the evaluated operands (arguments) from step 2.
These steps mirror those in Python and other languages.
Special Forms
However, not all combinations are call expressions. Some are special forms. The interpreter maintains a set of particular symbols (sometimes called keywords) that signal that a combination is a special form when they are the first subexpression. Each special form has it's own procedure for which operands to evaluate and how (described below). The interpreter always checks the first subexpression of a combination first. If it matches one of the keywords, the corresponding special form is used. Otherwise, the combination is evaluated as a call expression.
Symbolic Programming
Scheme's core data type is the list, built out of pairs as
described below. Scheme code is actually built out of these lists. This means
that the code (+ 1 2) is constructed as a list of the
+ symbol, the number 1, and the number 2, which is then evaluated
as a call expression.
Since lists are normally evaluated as combinations, we need a special form to
get the actual, unevaluated list. quote is a special form that
takes a single operand expression and returns it, unevaluated. Therefore,
(quote (+ 1 2)) returns the actual list of the symbol
+, the number 1, and the number 2, rather than evaluating the
expression to get the number 3. This also works for symbols. a is
looked up in the current environment to get the corresponding value, while
(quote a) evaluates to the literal symbol a.
Because quote is so commonly used in Scheme, the language has a
shorthand way of writing it: just put a single quote in front of the
expression you want to leave unevaluated. '(+ 1 2) and
'a are equivalent to (quote (+ 1 2)) and
(quote a), respectively.
Miscellaneous
Like R5RS, 61A Scheme is entirely case-insensitive (aside from strings). This specification will use lowercase characters in symbols, but the corresponding uppercase characters may be used interchangeably.
Types of Values
Numbers
Numbers are built on top of Python's number types and can thus support a combination of arbitrarily-large integers and double-precision floating points.
The web interpreter attempts to replicate this when possible, though may deviate from Python-based versions due to the different host language and the need to work-around the quirks of JavaScript when running in a browser.
Any valid real number literal in the interpreter's host language should be properly read. You should not count on consistent results when floating point numbers are involved in any calculation or on any numbers with true division.
Booleans
There are two boolean values: #t and #f. Scheme
booleans may be input either as their canonical #t or
#f or as the words true or false.
Any expression may be evaluated in a boolean context, but #f is
the only value that is false. All other values are treated as true in a
boolean context.
Some interpreters prior to Spring 2018 displayed the words
true and false when booleans were output, but this
should not longer be the case in any interpreter released/updated since then.
Symbols
Symbols are used as identifiers in Scheme. Valid symbols consist of some combination of alphanumeric characters and/or the following special characters:
!$%&*/:<=>?@^_~+-.
All symbols should be internally stored with lowercase letters. Symbols must not form a valid integer or floating-point number.
Strings
Unlike other implementations, 61A Scheme has no concept of individual characters. Strings are considered atomic data types in their own right. Strings can be entered into the interpreter as a sequence of characters inside double quotes, with certain characters, such as line breaks and double quotes escaped. As a general rule, if a piece of text would be valid as a JSON key, it should work as a string in 61A Scheme. Strings in 61A Scheme are immutable, in contrast to most other Scheme implementations.
These differences in how strings behave are due to the status of strings in the host languages: Python and Dart both have immutable strings with no concept of individual characters.
Because the Python-based interpreter has little use for strings, it lacks
proper support for their manipulation. The web interpreter, which requires
strings for JS interop (among other things), it supports a
string-append built-in, which takes in an arbitrary number
of values or any type and combines them into a string. Additional string
manipulation can be done through JS interop.
Pairs and Lists
Pairs are a built-in data structure consisting of two fields, a
car and a cdr (also sometimes called first and
second, or first and rest). The first value can contain any scheme datatype.
However, the second value must contain nil, a pair, or a stream promise.
nil is a special value in Scheme that represents the empty list.
It can be inputted by typing nil or () into the
interpreter.
A list is defined as either nil or a pair whose
cdr is another list. Pairs are displayed as a parenthesized,
space separated, sequence of the elements in the sequence they represent. For
example, (cons (cons 1 nil) (cons 2 nil)) is displayed as
((1) 2). This means that cons is asymmetric.
There is one exception to the above rule in the case of streams. Streams are represented as the
carof the stream, followed by a dot, followed by the promise that makes up its cdr. For examplescm> (cons-stream 1 nil) (1 . #[promise (not forced)])
List literals can be constructed through the quote special form, so
(cons 1 (cons 'a nil)) and '(1 a) are
equivalent.
Procedures
Procedures represent some subroutine within a Scheme program. Procedures are first-class in Scheme, meaning that they can be bound to names and passed around just like any other Scheme value. Procedures are equivalent to functions in most other languages, and the two terms are sometimes used interchangeably.
Procedures can be called on some number of arguments, performing some number of actions and then returning some Scheme value.
A procedure call can be performed with the syntax
(<operator> <operand> ...), where
<operator> is some expression that evaluates to a procedure
and each <operand> (of which there can be any number,
including 0) evaluates to one of the procedure's arguments. The term
"procedure call" is used interchangeably with the term "call expression."
There are several types of procedures. Built-in procedures (or just built-ins) are built-in to the interpreter and already bound to names when it is started (though it is still possible for you to rebind these names). A list of all the built-in procedures in the Python-based interpreter is available in the Scheme built-ins document.
Lambda procedures are defined using the lambda or
define special forms (see below) and create a new frame whose
parent is the frame in which the lambda was defined in when called. The
expressions in the lambda's body are than evaluated in this new environment.
Mu procedures are similar, but the new frame's parent is the frame in which
the mu is called, not the frame in which it was created.
61A Scheme also has macro procedures, which must be defined with the
define-macro special form. Macros work similarly to lambdas,
except that they pass the argument expressions in the call expression into the
macro instead of the evaluated arguments and they then evaluate the expression
the macro returns in the calling environment afterwards. The modified process
for evaluating macro call expressions is:
- Evaluate the operator. If it is not a macro procedure, follow the normal call expression steps.
- Apply the macro procedure from step 1 to the unevaluated operands.
- Once the macro returns, evaluate that value in the calling environment.
Macros effectively let the user define new special forms. Macro procedures take in unevaluated operand expressions and should generally return a piece of Scheme code that the macro is equivalent to.
Promises and Streams
Promises represent the delayed evaluation of an expression in an environment.
They can be constructed by passing an expression into the
delay special form. The evaluation of a promise can be forced by
passing it into the force built-in. The expression of a promise
will only ever be evaluated once. The first call of force will
store the result, which will be immediately returned on subsequent calls of
force on the same promise.
A promise must contain a pair or nil since it is used as the
cdr of a stream. If it is found to contain something else when
forced, force will error. If force errors for any
reason, the promise remains unforced.
For example
scm> (define p (delay (begin (print "hi") (/ 1 0))))
p
scm> p
#[promise (unforced)]
scm> (force p)
hi
Error
scm> p
#[promise (unforced)]
scm> (force p)
hi
Error
Or, for an example with type errors:
scm> (define p (delay (begin (print "hi") 2)))
p
scm> p
#[promise (unforced)]
scm> (force p)
hi
Error
scm> p
#[promise (unforced)]
scm> (force p)
hi
Error
Promises are used to define streams, which are to lists what
promises are to regular values. A stream is defined as a pair where the cdr is
a promise that evaluates to another stream or nil. The
cons-stream special form and the
cdr-stream built-in are provided make the construction and
manipulation of streams easier. (cons-stream a b) is
equivalent to (cons a (delay b)) while
(cdr-stream x) is equivalent to
(force (cdr x)).
A note for those familiar with promises in languages like JavaScript: although Scheme promises and JS-style promises originate from the same general concept, JS promises are best described as a placeholder for a value that is computed asynchronously. The Python-based 61A Scheme interpreter has no concept of asynchrony, so its promises only represent delayed evaluation.
Undefined
Finally, there is also an undefined value that can be returned by
some builtin procedures. This value behaves similarly to None in
Python. If an expression entered into the REPL evaluates to
undefined, it is not printed, just like Python.
Special Forms
In all of the syntax definitions below, <x> refers to a
required element x that can vary, while [x] refers
to an optional element x. Ellipses indicate that there can be
more than one of the preceding element.
The following special forms are included in all versions of 61A Scheme.
define
(define <name> <expression>)
Evaluates <expression> and binds the value to
<name> in the current environment.
<name> must be a valid Scheme symbol.
(define (<name> [param] ...) <body> ...)
Constructs a new lambda procedure with params as its parameters
and the body expressions as its body and binds it to
name in the current environment. name must be a
valid Scheme symbol. Each param must be a unique valid Scheme
symbol. This shortcut is equivalent to:
(define <name> (lambda ([param] ...) <body> ...))
However, some interpreters may give lambdas created using the shortcut an
intrinsic name of name for the purpose of visualization or
debugging.
In either case, the return value is the symbol <name>.
Example:
scm> (define x 2)
x
scm> (define (f x) x)
f
Variadic functions
In staff implementations of the scheme language, you can define a function
that takes a variable number of arguments by using the
variadic special form. The construct
variadic constructs a "variadic symbol" that is bound to multiple
rather than a single variable. This is only allowed at the end of an arguments
list
scm> (define (f x (variadic y)) (append y (list x)))
f
scm> (f 1 2 3)
(2 3 1)
scm> (define (f (variadic y) x) (append y (list x)))
Error
This is also possible in lambdas:
scm> (define f (lambda (x (variadic y)) (append y (list x))))
f
scm> (f 1 2 3)
(2 3 1)
scm> (define my-list (lambda ((variadic x)) x))
my-list
scm> (my-list 2 3 4)
(2 3 4)
You can use the special symbol . to construct the
variadic special form:
scm> (define (f x . y) (append y (list x)))
f
scm> (f 1 2 3)
(2 3 1)
scm> '. x
(variadic x)
This is analogous to , for unquote.
This is pretty much the same as
*argsin python, except that you can't call a function usingvariadic, you instead have to use the#[apply]built-in function.
if
(if <predicate> <consequent> [alternative])
Evaluates predicate. If true, the consequent is
evaluated and returned. Otherwise, the alternative, if it exists,
is evaluated and returned (if no alternative is present in this
case, the return value is undefined).
Example: This code returns the length of non-empty lists and 0 for empty lists:
(define nums '(1 2 3))
(if (null? nums) 0 (length nums))
cond
(cond <clause> ...)
The clause corresponds to expressions of the form
(<test> [expression] ...)
Alternatively, clause can be written as
(else [expression] ...)
Starts with the first clause. Evaluates test. If
true, evaluate the expressions in order, returning the last one.
If there are none, return what test evaluated to instead. If
test is false, proceed to the next clause. If there
are no more clauses, the return value is undefined. Clauses of
the form (else [expression] ...) are equivalent to
(#t [expression] ...).
Example: This code returns -1 for negative numbers, 1 for positive numbers, and 0 for zeros.
(define n -3)
(cond
((< n 0) -1)
((> n 0) 1)
(else 0)
)
and
(and [test] ...)
Evaluate the tests in order, returning the first false value. If
no test is false, return the last test. If no
arguments are provided, return #t.
Example: This and form evaluates to true
whenever x is both greater than 10 and less than 20.
(define x 15)
(and (> x 10) (< x 20))
or
(or [test] ...)
Evaluate the tests in order, returning the first true value. If
no test is true and there are no more tests left,
return #f.
Example: This or form evaluates to true when
either x is less than -10 or greater than 10.
(define x -15)
(or (< x -10) (> x 10))
let
(let ([binding] ...) <body> ...)
The binding corresponds to expressions of the form
(<name> <expression>)
First, the expression of each binding is evaluated
in the current frame. Next, a new frame that extends the current environment
is created and each name is bound to its corresponding evaluated
expression in it.
Finally the body expressions are evaluated in order, returning
the evaluated last one.
Example: This let form has two bindings and
three body expressions:
(let (
(x 5)
(y 10)
)
(print x)
(print y)
(+ x y)
)
begin
(begin <expression> ...)
Evaluates each expression in order in the current environment,
returning the evaluated last one.
Example: Here's begin used inside an
if form in a procedure. The returned value is the last expression
of the three.
(define (sortedpair a b)
(if (> a b)
(begin
(print a)
(print b)
(cons a (cons b nil))
)
(begin
(print b)
(print a)
(cons b (cons a nil))
)
)
)
(define spair (sortedpair 10 20))
(car spair)
lambda
(lambda ([param] ...) <body> ...)
Creates a new lambda with params as its parameters and the
body expressions as its body. When the procedure created by this
form is called, the call frame will extend the environment this lambda was
defined in.
Example: This code immediately calls the lambda procedure with two arguments:
( (lambda(x y) (+ (* x x) (* y y))) 3 4)
quote
(quote <expression>)
Alternatively,
'<expression>
Returns the literal expression without evaluating it.
quasiquote
(quasiquote <expression>)
Alternatively,
`<expression>
Returns the literal expression without evaluating it, unless a
subexpression of expression is of the form:
(unquote <expr2>)
in which case that expr2 is evaluated and replaces the above form
in the otherwise unevaluated expression.
unquote
(unquote <expr2>)
Alternatively,
,<expr2>
mu
(mu ([param] ...) <body> ...)
Creates a new mu procedure with params as its parameters and the
body expressions as its body. When the procedure this form
creates is called, the call frame will extend the environment the mu is called
in.
define-macro
This special form is included in the staff interpreter and the web interpreter, but it is not in scope for the course and is not included in the project.
(define-macro (<name> [param] ...) <body> ...)
Constructs a new macro procedure with params as its parameters
and the body expressions as its body and binds it to
name in the current environment. name must be a
valid Scheme symbol. Each param must be a unique valid Scheme
symbol. (<name> [param] ...) can be
variadic.
Macro procedures should be lexically scoped, like lambda procedures.
expect
This special form is included in the staff interpreter and the web interpreter for the purposes of running doctests, but is not part of the official Scheme specification.
(expect <expr> <output>)
Evaluates expr in the current environment and compares it to the
(unevaluated) output, printing out the result. If an exception
occurs while evaluating expr, it is caught and a traceback is
printed.
unquote-splicing
This special form is included in the staff interpreter and the web interpreter, but it is not in scope for the course and is not included in the project.
(unquote-splicing <expr2>)
Alternatively,
,@<expr2>
Similar to unquote, except that expr2 must evaluate
to a list, which is then spliced into the structure containing it in
expression.
delay
This special form is included in the staff interpreter and the web interpreter, but it is not in scope for the course and is not included in the project.
(delay <expression>)
Returns a promise of expression to be evaluated in the current
environment.
cons-stream
This special form is included in the staff interpreter and the web interpreter, but it is not in scope for the course and is not included in the project.
(cons-stream <first> <rest>)
Shorthand for (cons <first> (delay <rest>)).
set!
This special form is included in the staff interpreter and the web interpreter, but it is not in scope for the course and is not included in the project.
(set! <name> <expression>)
Evaluates expression and binds the result to name in
the first frame it can be found in from the current environment. If
name is not bound in the current environment, this causes an
error.
The return value is undefined.
Additional Reading
- Scheme built-ins - covers the built-in procedures of the Python-based interpreter.
- R5RS Specification - the full Scheme specificaton that 61A Scheme most closely resembles.