(define Scheme (a great language to implement))

November 3, 2021

Recently I've started to get into Lisp dialects, specifically Scheme. This has bounced me off a few times in the past because of lack of understanding the general idea of it. But I think I'm ready now...

Scheme and its derivatives are one of the simplest languages you can implement as a beginner. Instead of solving the syntax, semantics and implementation of a whole new language at the same time, implementing something that already exists will save a lot of time. Scheme provides only a few basic primitives wrapped in a consistent, context-free syntax that's simple to parse and extend later.

As a demonstration, I've built my own interpreter in C11 just in a few evenings:

=> https://sr.ht/~yerinalexey/sushi

It's still missing a bunch of features like quote, strings and some sort of graceful error handling. If you're interested, feel free to send a few patches, I'd be happy to review and merge them :D

Another nice project is Make a Lisp, which walks through creating a simple Lisp-like language from scratch. It's a bit more complex than Scheme, but still not hard to do.

=> Make a Lisp repository


Scheme and other Lisps use S-expressions as base of their syntax. An expression is either an atom - symbol, integer or string literal, etc; or a list - a few other expressions grouped in ().

;; Some examples of S-expressions

;; Atoms - intger literal and symbol

;; List - a standard function call
(+ 2 2)
;; Lists within lists
(+ (* 2 3) (/ 10 2))

Lists also have a very important property - they work like a function call with first expression being the function to call and next ones as arguments, unless (quote expression) is used. As you might have noticed, arithmetic operations are just regular functions, and form the "prefix notation".

Empty list is also known as 'nil', and is used to indicate abscense of a value.

Special forms

In Scheme, lists also have a few special forms:

define - defines a global to the given value, for example:

(define x (+ 10 2))
x ;; = 12

lambda - creates a function that accepts some arguments and returns a result:

;; Example function that multiplies its arguments
(lambda (x y) (* x y))

;; More typically used in define
(define square
    (lambda (n) (* n n)))
(square 15) ;; = 225

;; ... or directly in a call
((lambda (n) (* n n)) 15) ;; = 225

+, -, /, *, =, <=, <, >, >= - standard arithmetic and comparison operations, work just like in any other language.

and, not, or - boolean AND, NOT, OR respectively.

quote - returns the argument unchanged, but prevents it from evaluation. This is useful in self-referencing objects or forward declarations. A shorthand for (quote expression) is 'expression.

;; This will return nil (after lookup)
;; These will return the symbol x itself
(quote x)

let - defines local variables and executes the expression within their context.

(let ((x 42))
    x) ;; = 42
x ;; Error!

(let ((x 42) (y 2)))
    (- x y)) ;; = 40

if - returns one value if the condition is true and another if it's not.

(define value 20)

(if (> value 5)
    (+ value 1)) ;; = 0

(if (< value 5)
    (+ value 1)) ;; = 21

There are a few others, but those are the most common.

Compiler considerations

Scheme can be both compiled and interpreted. Compiler is a bit harder to implement because of dynamic nature of Scheme, so I'd recommend doing an interpreter first. It will also provide a REPL (Read-Evaluate-Print loop) easily to play around in real time.

General idea of compiler is that it should keep its own temporary type system while generating. Each new type combination to a function call should re-generate that function since it can depend a lot on types. This also has a side effect of checking type errors at compile-time instead of crashing at run time when a bad function gets called.

It's also possible to make type checking at runtime by making a value a huge struct supporting anything, but most cases will be happy using first approach and less memory.

I'll be coming out with a compiler soon, using the same features as Sushi so both are interchangable!