Borrow checker notes

View source (Markdown).

Basics of linear types

The borrow checker is mostly based on the concept of linear types. In short: values of linear types must be used once and cannot be copied. They represent finite resources (for example, files).

There are also free types, which are building blocks for linears and, unlike the other kind, can be copied. These are scalar types like i32 or bool.

In the standard view of linear types, any function that does operations with an object must use the object and return it back, leading to the following API:

linear type file = ...;
fn open(path: str) file = ...;
fn write(f: file, data: []u8) file = ...;
fn close(f: file) void = ...;

This is not always great, since write can return a completely different file or modify its internal representation. This can be solved by using a reference instead:

linear type file = ...;
fn open(path: str) file = ...;
fn write(f: &file, data: []u8) void = ...;
fn close(f: file) void = ...;

This is better understood with the corrected terminology: write uses a file object to do something, but close consumes and destroys it.

fn correct() void = {
    let f = open("thing");
    write(&f, "123");
    close(f);
};
fn leak() void = {
    let f = open("thing");
    // Error: f is never consumed
};
fn use_after_close() void = {
    let f = open("thing");
    close(f);
    write(&f, "123"); // Error: invalid reference: f was consumed
};

These rules are also applied to conditionals and loops:

fn correct() void = {
    let f = open("thing");
    if (q) {
        write(&f, "q is true");
        close(f);
    } else {
        close(f);
    };
};
fn possible_leak() void = {
    let f = open("thing");
    if (q) {
        write(&f, "q is true");
        close(f);
    };
    // Error: f is possibly not consumed
};
fn possible_use_after_close() void = {
    let f = open("thing");
    for (q) {
        // Error: f was consumed in the previous iteration
        close(f);
    };
};

Creating linear types

Linear types can be created from free types using the linear keyword. If a type contains another linear, it must be marked as such as well.

type range = (int, int);          // free
linear type file = int;           // explicitly linear
linear type buffer = (str, file); // must be linear since it contains a file

type buffer = (str, file); // Error

Destructing linears

Destruction of complex resources can be a pretty annoying problem. Rust solves it by supporting destructors and Austral has struct unpacking syntax that consumes the structure and returns its fields.

The first option is obviously not possible in Hare; the second one works pretty good for tuples or small structs, but can get annoying otherwise.

Current solution:

  1. Cast to a free type (if the linear is an alias of that type) would destruct a value
  2. Cast to void destructs a linear value, but only if all of its fields were destructed too
linear type file = int;
fn close(f: file) void = {
    rt::close(f: int);
};

linear type buffer = struct {
    name: str,
    file: file,
};
fn buffer_finish(buf: buffer) void = {
    // Note: for simplicity, assume that buf.name is free

    close(buf.file); // close consumes buf.file
    buf: void;
};

Arrays/slices

A simple rule holds: arrays must always keep all of their members valid.

This introduces two new operators: atomic swap and atomic free. Note: those operations are atomic to the type system, not the CPU.

Atomic swap allows to replace a linear with another one, returning the original:

let old = value <- new;

(can be seen as new pushing value into old).

With arrays it works like this:

let files: []file = ...;
files[0] = open("a");         // Error: files[0] leaks
files[0] <- open("a");        // Error: swap result leaks
close(files[0] <- open("a")); // OK, old files[0] was closed

But it can be used with structs (and other compound types), too:

linear type document = struct { header: file, content: file };

let doc: document = ...;
close(doc.header <- open("a.hdr"));
close(doc.content <- open("a.txt"));

Atomic free allows to clean up a pointer or all elements of a slice, consuming them:

free([]T, *fn(T) void);
free(*T, *fn(T) void);
let files: []file = ...;
free(files, &close);

and a similar pattern applies to delete:

delete(files[2..5], &close);

TODO: something like this for fixed arrays

Slice allocation

If a slice contains free types, it can be default-initialized using alloc([default...], n). This does not work with linear types since they can’t be cloned from default.

The solution is pretty simple - specify capacity when allocating and use append (it will not reallocate the slice, hopefully):

let sl: []file = alloc([], 10);
for (let i = 0z; i < 10; i += 1) {
    append(sl, open(getpath(i)));
};
free(sl, &close);

Pointers vs. references

The distinction between stack and heap pointers might be nice to have. Heap pointers are linear values, in a sense, and stack pointers are free values.

Stack pointers will be called references and heap pointers remain as just pointers.

Pointers can be also trivially converted to references, but not the other way around.

let a: some_linear = ...;
let ref_a = &a;
some_linear_finish(a);
// <ref_a is invalid here>

let b: some_linear = ...;
let ptr_b: *some_linear = alloc(b); // consumes b
let ref_b: &some_linear = @ref(ptr_b);
free(ptr_b, &some_linear_finish); // using atomic free syntax; consumes ptr_b
// <ref_b is invalid here>

References depend on storage of the value they were taken from. In the example, &a is referencing a and can’t outlive it. When a is consumed, all references automatically become invalid.

For references converted from pointers, a similar rule applies: ref_b in the example is not allowed to outlive ptr_b.

Error handling and defer

Bad error handling can be a source of a lot of memory leaks that are common in existing Hare code.

Consider this parsing code:

linear type person = struct {
    name: str,
    age: int,
};
fn person_finish(p: person) void = free(p.name);

fn defer_test() (error | void) = {
    let list: []person = [];
    defer free(list, &person_finish);
    for (!eof) {
        let name = read_str()?;
        let age = read_int()?;
        // If an error happens here, name is going to leak
        append(list, person {
            name = name,
            age = age,
        });
        can_fail()?;
        // Or here
    };
};

With linear types, the code above will not compile because of possible leaks.

A new operator appears - optional defer (defer?). It generates a deferred expression only when everything used in it is valid.

linear type person = struct {
    name: str,
    age: int,
};
fn person_finish(p: person) void = free(p.name);

fn defer_test() (error | void) = {
    let list: []person = [];
    defer free(list, &person_finish);
    for (!eof) {
        let name = read_str()?;
        defer? free(name);
        let age = read_int()?;
        // If an error happens here, name is going to be freed
        append(list, person {
            name = name,
            age = age,
        });
        can_fail()?;
        // If an error happens here, name will not be freed because it
        // has been moved
    };
};

Borrowed strings and slices

Heap slices have type []T, borrowed or static slices have type [&]T. Borrowed version also depends on the lifetime of its source.

let sl: [&]int = [1, 2, 3, 4];
let part: [&]int = sl[1..3];

let sl: []int = alloc([0...], 4);
defer free(sl);
let part: [&]int = sl[1..3];

Strings are split into str (owned) and str_view (borrowed/static).

let s: str_view = "123";
let part: str_view = strings::sub(s, 0, 1);

let s: str = alloc("123");
defer free(s);
let part: str_view = strings::sub(s, 1, 2);

As seen above, an implicit conversions []T -> [&]T and str -> str_view are allowed.

Lifetime annotations

When using references as function arguments or return values, their relations must be explicitly specified in the prototype in form of lifetime annotations.

They are written as ^name either after the &, or as type(^name) for str_view and complex types.

Annotations in prototypes are read from left to right: arguments define lifetimes, return values use them.

Inside the function body, lifetimes can be inferred by the compiler and annotations are not necessary.

// This tells the compiler that two slices returned from this function depend
// on its argument 's'.
export fn cut(s: [&^s]u8, sep: u8) ([&^s]u8, [&^s]u8) = {
    for (let i = 0z; i < len(s); i += 1) {
        if (s[i] == sep) {
            if (i + 1 < len(s)) {
                return (s[..i], s[i + 1..]);
            } else {
                return (s[..i], []);
            };
        };
    };
    return (s, []);
};

export fn strings::cut(s: str_view(^s), sep: rune) (str_view(^s), str_view(^s));

let bytes: []u8 = alloc([1, 2, 3, 4]);
let (left, right) = cut(bytes, 2);
free(bytes);
// left and right are invalid here

A type requires lifetime annotations if it references other objects. Similar to plain references, explicit annotations are only needed in prototypes.

type iterator(^s) = struct {
    data: str_view(^s),
    index: size,
};
fn iter(s: str_view(^s)) iterator(^s) = iterator {
    data = s,
    index = 0,
};

{
    let it = iter("123");
    // OK: static lifetime
};

{
    let s = alloc("123");
    {
        let it = iter(s);
        it;
    };
    free(s);
    // OK: iterator was destroyed before s
};

{
    let s = alloc("123");
    let it = iter(s); // Implicit ^s
    free(s);
    it; // Error: iterator outlives ^s
};