andersch.dev

<2022-05-08 Sun>
[ web ]

WebAssembly (wasm)

WebAssembly (wasm) is a bytecode format: programs written in any language with a WebAssembly compiler target can be compiled to bytecode that runs in a virtual machine inside a web browser.

Without it, JavaScript is the only way to run arbitrary code in a browser. However, WASM itself has no external interfaces (no printing/logging, no manipulation of the DOM), so that functionality has to be imported as JS functions.

It comes with a text format (.wat) and a binary format (.wasm). The .wat format uses S-expressions and a very lisp-like syntax (lots of parentheses, ;; denotes comments).
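To make the binary format a little more concrete, here is a minimal sketch of the 8-byte preamble that every .wasm file starts with. The variable names are only illustrative; the byte values are the magic number and version from the binary format. An empty module, written (module) in .wat, consists of nothing but this preamble.

```javascript
// The 8-byte preamble every .wasm file starts with.
const emptyModule = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic number: "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1 (little-endian)
]);

// A preamble with no sections is already a valid (empty) module.
const isValid = WebAssembly.validate(emptyModule);
console.log(isValid); // true

// Corrupting the magic number makes validation fail.
const corrupted = Uint8Array.from(emptyModule);
corrupted[1] = 0x62;
const isCorruptedValid = WebAssembly.validate(corrupted);
console.log(isCorruptedValid); // false
```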

Components/Toolchain

WASM module: Compiled and linked image (like ELF or PE)

  • contains sections for code, types, globals, import table, export table, etc.
  • export table lists module's entry points
  • can only affect the outside world through imported JS functions
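The import and export tables can be inspected from JavaScript without instantiating the module. The sketch below uses a small hand-assembled module (an assumption made for illustration, not output of a real toolchain) that imports math.callback and exports add; the variable names are hypothetical.

```javascript
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // preamble
  // type section: type 0 = () -> (), type 1 = (i32, i32) -> i32
  0x01, 0x0a, 0x02, 0x60, 0x00, 0x00, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,
  // import section: "math" "callback", a function of type 0
  0x02, 0x11, 0x01, 0x04, 0x6d, 0x61, 0x74, 0x68,
  0x08, 0x63, 0x61, 0x6c, 0x6c, 0x62, 0x61, 0x63, 0x6b, 0x00, 0x00,
  // function section: one function defined in the module, of type 1
  0x03, 0x02, 0x01, 0x01,
  // export section: "add" -> function index 1 (index 0 is the import)
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x01,
  // code section: local.get 0, local.get 1, i32.add, end
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,
]);

// Compiling does not link or run anything yet.
const wasmModule = new WebAssembly.Module(bytes);

// Each entry describes one import or export (module/name/kind).
const importTable = WebAssembly.Module.imports(wasmModule);
const exportTable = WebAssembly.Module.exports(wasmModule);

console.log(importTable[0].module, importTable[0].name, importTable[0].kind);
console.log(exportTable[0].name, exportTable[0].kind);
```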

WASM runtime: Loads WASM modules, linking import table entries into the module

  • Type checks module's linkage at load time
  • Executes start function afterwards (if it exists)
  • Then executes zero or more of its entry points
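The load-time behaviour described above can be observed with a tiny hand-assembled module (again an illustrative assumption, with hypothetical variable names). It imports math.callback and declares it as its start function, so a missing import is rejected while linking, and a valid import is called once during instantiation.

```javascript
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // preamble
  // type section: type 0 = () -> ()
  0x01, 0x04, 0x01, 0x60, 0x00, 0x00,
  // import section: "math" "callback", a function of type 0
  0x02, 0x11, 0x01, 0x04, 0x6d, 0x61, 0x74, 0x68,
  0x08, 0x63, 0x61, 0x6c, 0x6c, 0x62, 0x61, 0x63, 0x6b, 0x00, 0x00,
  // start section: function index 0 (the import) runs at instantiation
  0x08, 0x01, 0x00,
]);
const wasmModule = new WebAssembly.Module(bytes);

// Linking fails when an import table entry cannot be satisfied.
let linkError = null;
try {
  new WebAssembly.Instance(wasmModule, { math: {} });
} catch (e) {
  linkError = e;
}
console.log(linkError instanceof WebAssembly.LinkError); // true

// With a matching import, the start function is executed exactly once.
let startCalls = 0;
new WebAssembly.Instance(wasmModule, {
  math: { callback: () => { startCalls++; } },
});
console.log(startCalls); // 1
```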

WASM compiler: Convert high-level language to low-level WASM (e.g. Clang)

  • Requires ABI to map high-level language concepts onto the machine
  • Function indices are not yet known at compile time, so references to them are patched in by the linker

WASM linker: Combines compiled object files into a WASM module and resolves function references

  • LLVM/Clang uses wasm-ld

Language runtime: E.g., C standard library, POSIX interfaces, etc.

  • Maps onto standardized imports: the WebAssembly System Interface (WASI)
  • WASI defines a set of POSIX-like functions
  • Alternatively, one can code directly against raw WASI

Using WASM in JavaScript

let imports = {
    math : { callback : x => console.log("result is", x) }
}

// instantiate wasm module while passing it JS functions to import
let wasm_module = await WebAssembly.instantiateStreaming(fetch('main.wasm'), imports)

let x = wasm_module.instance.exports.add(5,10)
let y = wasm_module.instance.exports.mult(2,5)

The corresponding wasm code (in the .wat text format):

;; wasm comments look like this
(module
    ;; the signature matches the JS callback, which takes one argument
    (import "math" "callback" (func $callback (param i32)))

    (export "add" (func $add))
    (export "mult" (func $mult))

    (func $add (param $a i32) (param $b i32) (result i32)
        ;; WASM is stack-based:
        local.get $a ;; push a on to the stack
        local.get $b ;; push b
        i32.add      ;; pop the last two values off as arguments to add,
                     ;; then push on the result
    )

    (func $mult (param $a i32) (param $b i32) (result i32)
        local.get $a
        local.get $b
        i32.mul
    )
)
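To try this without a server or a .wat assembler, the $add function alone can be written out in the binary format by hand. The bytes below are a hand-assembled sketch that leaves out $mult and the import; the variable names are hypothetical.

```javascript
const addBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // preamble
  // type section: (i32, i32) -> i32
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,
  // function section: one function using type 0
  0x03, 0x02, 0x01, 0x00,
  // export section: "add" -> function index 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,
  // code section: local.get 0, local.get 1, i32.add, end
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,
]);

// Synchronous compile and instantiate; fine for a module this small.
const addInstance = new WebAssembly.Instance(new WebAssembly.Module(addBytes));

const sum = addInstance.exports.add(5, 10);
console.log(sum); // 15

// i32 arithmetic wraps around at 32 bits.
const wrapped = addInstance.exports.add(2147483647, 1);
console.log(wrapped); // -2147483648
```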

Compiling C to wasm using Clang

/* add.c */
int add(int a, int b) { return a * a + b; }

To output add.wasm using Clang:

clang \
  --target=wasm32 \
  -nostdlib `# Do not try to link against a standard library` \
  -Wl,--no-entry `# Flags passed to the linker` \
  -Wl,--export-all \
  -o add.wasm \
  add.c

No libc is available, meaning

  • fundamental C APIs like malloc() and free() are not available
  • anything that requires syscalls would need to call out to JavaScript

WebAssembly memory model

WebAssembly memory can grow at runtime, which means there is no fixed end at which to place the stack or the heap.

  • Stack occupies the region between __data_end and __heap_base and grows downwards (towards lower addresses)
  • Heap starts at __heap_base and grows towards higher addresses
  • The stack size thus is limited to __heap_base - __data_end

    +---------+------------------+-----------------+
    | Data    |        <-- Stack | Heap -->        |
    +---------+------------------+-----------------+
    ^         ^                  ^                 ^
    0    __data_end         __heap_base           max
    

The stack size can be configured with -Wl,-z,stack-size=$((8 * 1024 * 1024)), which reserves 8 MiB in this case.
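That memory can grow at runtime is visible from JavaScript as well. The sketch below creates a standalone WebAssembly.Memory (not tied to any module; the sizes are arbitrary assumptions) and grows it by one 64 KiB page. Note that growing detaches the old ArrayBuffer, so typed-array views have to be recreated afterwards.

```javascript
const PAGE_SIZE = 64 * 1024; // WebAssembly memory is sized in 64 KiB pages

// Start with 1 page and allow growth up to 4 pages.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });
const oldBuffer = memory.buffer;
console.log(oldBuffer.byteLength === PAGE_SIZE); // true

// grow() takes a number of pages and returns the previous size in pages.
const previousPages = memory.grow(1);
console.log(previousPages); // 1

// The old buffer is detached; memory.buffer now refers to a new, larger one.
console.log(oldBuffer.byteLength); // 0
console.log(memory.buffer.byteLength === 2 * PAGE_SIZE); // true
```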

Custom allocator using an arena

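/* Symbol provided by the linker; only its address is meaningful. */
extern unsigned char __heap_base;

/* Next free address; the heap starts at &__heap_base. */
unsigned char* bump_pointer = &__heap_base;

/* Bump allocator: hands out consecutive chunks, never frees, does not align. */
void* malloc(unsigned long n) {
    unsigned char* r = bump_pointer;
    bump_pointer += n;
    return (void*) r;
}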
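The behaviour of the C allocator can be modelled in JavaScript to see which addresses it hands out. This is only a sketch: heapBase is an assumed stand-in for the address of __heap_base (the real value is chosen by the linker), and the memory here is a standalone WebAssembly.Memory rather than one exported by a module.

```javascript
const memory = new WebAssembly.Memory({ initial: 1 }); // 1 page = 64 KiB
const heapBase = 1024; // assumed value standing in for &__heap_base

// Same logic as the C version: return the current pointer, then bump it.
let bumpPointer = heapBase;
function malloc(n) {
  const r = bumpPointer;
  bumpPointer += n;
  return r;
}

// Two allocations end up directly behind each other.
const first = malloc(5 * 4); // room for five 32-bit integers
const second = malloc(8);
console.log(first, second); // 1024 1044

// The returned address is just a byte offset into linear memory.
const view = new Uint32Array(memory.buffer, first, 5);
view.set([1, 2, 3, 4, 5]);
console.log(view[4]); // 5
```

Nothing is ever freed and nothing is aligned, so an allocation of an odd size would leave the next address unsuitable for a Uint32Array view.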

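To pass data to the module, JavaScript calls this allocator to obtain a chunk of memory, writes into it, and passes the address on (here to a sum function that the module is assumed to export as well):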

async function init() {
  const { instance } = await WebAssembly.instantiateStreaming(fetch("./add.wasm"));

  const jsArray = [1, 2, 3, 4, 5];

  // Allocate memory for 5 32-bit integers and get the starting address.
  const cArrayPointer = instance.exports.malloc(jsArray.length * 4);

  // Convert into a Uint32Array, starting at that address.
  const cArray = new Uint32Array(instance.exports.memory.buffer, cArrayPointer, jsArray.length);

  // Copy the values from JS to C.
  cArray.set(jsArray);

  // Run the function, passing the starting address and length.
  console.log(instance.exports.sum(cArrayPointer, cArray.length));
}
init();

Resources