# Cranelift IR Reference

The Cranelift intermediate representation (IR) has two primary forms: an in-memory data structure that the code generator library uses, and a text format that is used for test cases and debug output. Files containing Cranelift textual IR have the .clif filename extension.

This reference uses the text format to describe IR semantics but glosses over the finer details of the lexical and syntactic structure of the format.

## Overall structure

Cranelift compiles functions independently. A .clif IR file may contain multiple functions, and the programmatic API can create multiple function handles at the same time, but the functions don’t share any data or reference each other directly.

This is a simple C function that computes the average of an array of floats:

float
average(const float *array, size_t count)
{
    double sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += array[i];
    return sum / count;
}


Here is the same function compiled into Cranelift IR:

function %average(i32, i32) -> f32 system_v {
    ss0 = explicit_slot 8         ; Stack slot for sum.

ebb1(v0: i32, v1: i32):
    v2 = f64const 0x0.0
    stack_store v2, ss0
    brz v1, ebb3                  ; Handle count == 0.
    v3 = iconst.i32 0
    jump ebb2(v3)

ebb2(v4: i32):
    v5 = imul_imm v4, 4
    v6 = iadd v0, v5
    v7 = load.f32 v6              ; array[i]
    v8 = fpromote.f64 v7
    v9 = stack_load.f64 ss0
    v10 = fadd v8, v9
    stack_store v10, ss0
    v11 = iadd_imm v4, 1
    v12 = icmp ult v11, v1
    brnz v12, ebb2(v11)           ; Loop backedge.
    v13 = stack_load.f64 ss0
    v14 = fcvt_from_uint.f64 v1
    v15 = fdiv v13, v14
    v16 = fdemote.f32 v15
    return v16

ebb3:
    v100 = f32const +NaN
    return v100
}


The first line of a function definition provides the function name and the function signature which declares the parameter and return types. Then follows the function preamble which declares a number of entities that can be referenced inside the function. In the example above, the preamble declares a single explicit stack slot, ss0.

After the preamble follows the function body which consists of extended basic blocks (EBBs), the first of which is the entry block. Every EBB ends with a terminator instruction, so execution can never fall through to the next EBB without an explicit branch.

A .clif file consists of a sequence of independent function definitions:

function_list ::=  { function }
function      ::=  "function" function_name signature "{" preamble function_body "}"
preamble      ::=  { preamble_decl }
function_body ::=  { extended_basic_block }


### Static single assignment form

The instructions in the function body use and produce values in SSA form. This means that every value is defined exactly once, and every use of a value must be dominated by the definition.

Cranelift does not have phi instructions but uses EBB parameters instead. An EBB can be defined with a list of typed parameters. Whenever control is transferred to the EBB, argument values for the parameters must be provided. When entering a function, the incoming function parameters are passed as arguments to the entry EBB’s parameters.

Instructions define zero, one, or more result values. All SSA values are either EBB parameters or instruction results.

In the example above, the loop induction variable i is represented as three SSA values. In the entry block, v3 is the initial value. In the loop block ebb2, the EBB parameter v4 represents the value of the induction variable during each iteration. Finally, v11 is computed as the induction variable value for the next iteration.
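
One way to picture EBB parameters is to treat every EBB as a function whose parameters are the EBB parameters, and every branch as a call that supplies the arguments. The Python sketch below mirrors the %average example in that style. It is an illustration only, not how Cranelift represents or executes IR: the one-element list standing in for the stack slot and the use of recursion for the back edge are assumptions made for the example, and float precision differences between f32 and f64 are ignored.

```python
import math

def average(array, count):
    """Mirror of %average: EBBs become functions, branches become calls."""
    ss0 = [0.0]  # stands in for the explicit stack slot that holds `sum`

    def ebb3():
        return math.nan  # count == 0: the IR returns NaN

    def ebb2(v4):  # v4 is the EBB parameter: the induction variable i
        ss0[0] = ss0[0] + array[v4]  # sum += array[i]
        v11 = v4 + 1                 # value of i for the next iteration
        if v11 < count:              # icmp ult, then brnz back to ebb2
            return ebb2(v11)         # the back edge supplies v11 for v4
        return ss0[0] / count        # terminator: return

    def ebb1(v0, v1):  # the entry EBB receives the function parameters
        if v1 == 0:                  # brz v1, ebb3
            return ebb3()
        v3 = 0                       # initial value of i
        return ebb2(v3)              # jump ebb2(v3)

    return ebb1(array, count)
```

Each value is assigned exactly once per call, and the only way a value changes between iterations is by being passed as an argument, which is the role a phi instruction would play in other IRs.
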

The cranelift_frontend crate contains utilities for translating from programs containing multiple assignments to the same variables into SSA form for Cranelift IR.

Such variables can also be presented to Cranelift as stack slots. Stack slots are accessed with the stack_store and stack_load instructions, and can have their address taken with stack_addr, which supports C-like programming languages where local variables can have their address taken.

## Value types

All SSA values have a type which determines the size and shape (for SIMD vectors) of the value. Many instructions are polymorphic – they can operate on different types.

### Boolean types

Boolean values are either true or false.

The b1 type represents an abstract boolean value. It can only exist as an SSA value, and can’t be directly stored in memory. It can, however, be converted into an integer with value 0 or 1 by the bint instruction (and converted back with icmp_imm with 0).

Several larger boolean types are also defined, primarily to be used as SIMD element types. They can be stored in memory, and are represented as either all zero bits or all one bits.

b1

A boolean type with 1 bit.

Bytes: Can’t be stored in memory
b8

A boolean type with 8 bits.

 Bytes: 1
b16

A boolean type with 16 bits.

 Bytes: 2
b32

A boolean type with 32 bits.

 Bytes: 4
b64

A boolean type with 64 bits.

 Bytes: 8
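
The in-memory representation of the wider boolean types can be sketched in a few lines of Python. The helper names below are hypothetical; only the all-zeros or all-ones rule and the 0 or 1 result of bint come from the text above.

```python
def bool_to_bits(value: bool, bits: int) -> int:
    """Bit pattern of a b8/b16/b32/b64 value: all zeros or all ones."""
    if bits not in (8, 16, 32, 64):
        # b1 is deliberately excluded: it has no memory representation.
        raise ValueError("only b8, b16, b32 and b64 can be stored in memory")
    return (1 << bits) - 1 if value else 0

def bint(value: bool) -> int:
    """Model of the bint conversion: true becomes 1, false becomes 0."""
    return 1 if value else 0
```

For example, a true b8 is stored as 0xff, while bint of the same value yields the integer 1.
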

### Integer types

Integer values have a fixed size and can be interpreted as either signed or unsigned. Some instructions interpret an operand as a signed or unsigned number; others are indifferent to signedness.

Support for i8 and i16 arithmetic is incomplete, and using it could lead to bugs.

i8

An integer type with 8 bits. WARNING: arithmetic on 8-bit integers is incomplete.

 Bytes: 1
i16

An integer type with 16 bits. WARNING: arithmetic on 16-bit integers is incomplete.

 Bytes: 2
i32

An integer type with 32 bits.

 Bytes: 4
i64

An integer type with 64 bits.

 Bytes: 8

### Floating point types

The floating point types have the IEEE 754 semantics that are supported by most hardware, except that non-default rounding modes, unmasked exceptions, and exception flags are not currently supported.

There is currently no support for higher-precision types like quad-precision, double-double, or extended-precision, nor for narrower-precision types like half-precision.

NaNs are encoded following the IEEE 754-2008 recommendation, with quiet NaN being encoded with the MSB of the trailing significand set to 1, and signaling NaNs being indicated by the MSB of the trailing significand set to 0.

Except for bitwise and memory instructions, NaNs returned from arithmetic instructions are encoded as follows:

• If all NaN inputs to an instruction are quiet NaNs with all bits of the trailing significand other than the MSB set to 0, the result is a quiet NaN with a nondeterministic sign bit and all bits of the trailing significand other than the MSB set to 0.
• Otherwise the result is a quiet NaN with a nondeterministic sign bit and all bits of the trailing significand other than the MSB set to nondeterministic values.
f32

A 32-bit floating point type represented in the IEEE 754-2008 binary32 interchange format. This corresponds to the float type in most C implementations.

 Bytes: 4
f64

A 64-bit floating point type represented in the IEEE 754-2008 binary64 interchange format. This corresponds to the double type in most C implementations.

 Bytes: 8
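
The NaN encoding rules above can be checked on raw f32 bit patterns. The following Python sketch classifies a 32-bit pattern; the function name and the label "canonical quiet NaN" (a quiet NaN whose remaining trailing significand bits are all zero) are terms chosen for this example rather than names taken from Cranelift.

```python
F32_EXPONENT_MASK = 0xFF << 23          # the 8 exponent bits
F32_SIGNIFICAND_MASK = (1 << 23) - 1    # the 23 trailing significand bits
F32_QUIET_BIT = 1 << 22                 # MSB of the trailing significand

def classify_f32(bits: int) -> str:
    """Classify a 32-bit pattern according to the NaN rules above."""
    exponent_all_ones = (bits & F32_EXPONENT_MASK) == F32_EXPONENT_MASK
    significand = bits & F32_SIGNIFICAND_MASK
    if not exponent_all_ones or significand == 0:
        return "not a NaN"
    if not significand & F32_QUIET_BIT:
        return "signaling NaN"
    if significand == F32_QUIET_BIT:
        return "canonical quiet NaN"    # sign bit is ignored on purpose
    return "quiet NaN with payload"
```

Under the first rule in the list, an arithmetic instruction whose NaN inputs are all "canonical quiet NaN" in this sense produces a result of the same class, with either sign.
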

### CPU flags types

Some target ISAs use CPU flags to represent the result of a comparison. These CPU flags are represented as two value types depending on the type of values compared.

Since some ISAs don’t have CPU flags, these value types should not be used until the legalization phase of compilation where the code is adapted to fit the target ISA. Use instructions like icmp instead.

The CPU flags types are also restricted such that two flags values cannot be live at the same time. After legalization, some instruction encodings will clobber the flags, and flags values are not allowed to be live across such instructions either. The verifier enforces these rules.

iflags

CPU flags representing the result of an integer comparison. These flags can be tested with an intcc condition code.

Bytes: Can’t be stored in memory
fflags

CPU flags representing the result of a floating point comparison. These flags can be tested with a floatcc condition code.

Bytes: Can’t be stored in memory

### SIMD vector types

A SIMD vector type represents a vector of values from one of the scalar types (boolean, integer, and floating point). Each scalar value in a SIMD type is called a lane. The number of lanes must be a power of two in the range 2-256.

iBxN

A SIMD vector of integers. The lane type iB is one of the integer types i8 through i64.

Some concrete integer vector types are i32x4, i64x8, and i16x4.

The size of a SIMD integer vector in memory is $$N B\over 8$$ bytes.

f32xN

A SIMD vector of single precision floating point numbers.

Some concrete f32 vector types are: f32x2, f32x4, and f32x8.

The size of a f32 vector in memory is $$4N$$ bytes.

f64xN

A SIMD vector of double precision floating point numbers.

Some concrete f64 vector types are: f64x2, f64x4, and f64x8.

The size of a f64 vector in memory is $$8N$$ bytes.

b1xN

A boolean SIMD vector.

Boolean vectors are used when comparing SIMD vectors. For example, comparing two i32x4 values would produce a b1x4 result.

Like the b1 type, a boolean vector cannot be stored in memory.
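
The size formulas in this section all reduce to lanes times lane width. As a small sketch (the function name is hypothetical, and the lane count check simply restates the power-of-two rule given at the start of the section):

```python
def vector_bytes(lane_bits: int, lanes: int) -> int:
    """Size in bytes of a SIMD vector with `lanes` lanes of `lane_bits` bits."""
    if lane_bits not in (8, 16, 32, 64):
        # Covers i8..i64, f32 and f64 lanes; b1 lanes cannot be stored.
        raise ValueError("lane type has no memory representation")
    if not 2 <= lanes <= 256 or lanes & (lanes - 1) != 0:
        raise ValueError("lane count must be a power of two in 2..256")
    return lanes * lane_bits // 8
```

For instance, i32x4 and f32x4 both occupy 16 bytes, and i64x8 occupies 64 bytes.
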

### Pseudo-types and type classes

These are not concrete types, but convenient names used to refer to real types in this reference.

iAddr

A pointer-sized integer representing an address.

This is either i32 or i64, depending on whether the target platform has 32-bit or 64-bit pointers.

iB

Any of the scalar integer types i8 through i64.

Int

Any scalar or vector integer type: iB or iBxN.

fB

Either of the floating point scalar types: f32 or f64.

Float

Any scalar or vector floating point type: fB or fBxN.

TxN

Any SIMD vector type.

Mem

Any type that can be stored in memory: Int or Float.

Testable

Either b1 or a scalar integer type iB.

### Immediate operand types

These types are not part of the normal SSA type system. They are used to indicate the different kinds of immediate operands on an instruction.

imm64

A 64-bit immediate integer. The value of this operand is interpreted as a signed two’s complement integer. Instruction encodings may limit the valid range.

In the textual format, imm64 immediates appear as decimal or hexadecimal literals using the same syntax as C.

offset32

A signed 32-bit immediate address offset.

In the textual format, offset32 immediates always have an explicit sign, and a 0 offset may be omitted.

ieee32

A 32-bit immediate floating point number in the IEEE 754-2008 binary32 interchange format. All bit patterns are allowed.

ieee64

A 64-bit immediate floating point number in the IEEE 754-2008 binary64 interchange format. All bit patterns are allowed.

bool

A boolean immediate value, either false or true.

In the textual format, bool immediates appear as ‘false’ and ‘true’.

intcc

An integer condition code. See the icmp instruction for details.

floatcc

A floating point condition code. See the fcmp instruction for details.

The two IEEE floating point immediate types ieee32 and ieee64 are displayed as hexadecimal floating point literals in the textual IR format. Decimal floating point literals are not allowed because some computer systems can round differently when converting to binary. The hexadecimal floating point format is mostly the same as the one used by C99, but extended to represent all NaN bit patterns:

Normal numbers
Compatible with C99: -0x1.Tpe where T are the trailing significand bits encoded as hexadecimal, and e is the unbiased exponent as a decimal number. ieee32 has 23 trailing significand bits. They are padded with an extra LSB to produce 6 hexadecimal digits. This is not necessary for ieee64 which has 52 trailing significand bits forming 13 hexadecimal digits with no padding.
Zeros
Positive and negative zero are displayed as 0.0 and -0.0 respectively.
Subnormal numbers
Compatible with C99: -0x0.Tpemin where T are the trailing significand bits encoded as hexadecimal, and emin is the minimum exponent as a decimal number.
Infinities
Either -Inf or Inf.
Quiet NaNs
Quiet NaNs have the MSB of the trailing significand set. If the remaining bits of the trailing significand are all zero, the value is displayed as -NaN or NaN. Otherwise, -NaN:0xT where T are the trailing significand bits encoded as hexadecimal.
Signaling NaNs
Displayed as -sNaN:0xT.
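
To make the rules above concrete, here is a Python sketch that renders an ieee32 bit pattern in the textual format as described in this section. It is not Cranelift's own formatter. One detail is an assumption: for NaNs with a payload, the printed T is taken to be the trailing significand bits without the quiet bit.

```python
def format_ieee32(bits: int) -> str:
    """Render a 32-bit pattern following the textual format rules above."""
    sign = "-" if (bits >> 31) & 1 else ""
    exponent = (bits >> 23) & 0xFF
    trailing = bits & 0x7FFFFF        # 23 trailing significand bits
    padded = trailing << 1            # one extra LSB gives 24 bits
    digits = f"{padded:06x}"          # exactly 6 hexadecimal digits

    if exponent == 0:
        if trailing == 0:
            return sign + "0.0"                    # zeros
        return f"{sign}0x0.{digits}p-126"          # subnormal, emin = -126
    if exponent == 0xFF:
        if trailing == 0:
            return sign + "Inf"                    # infinities
        payload = trailing & 0x3FFFFF              # assumption: quiet bit excluded
        if trailing & 0x400000:                    # quiet bit set
            if payload == 0:
                return sign + "NaN"
            return f"{sign}NaN:0x{payload:x}"
        return f"{sign}sNaN:0x{payload:x}"         # signaling NaN
    return f"{sign}0x1.{digits}p{exponent - 127}"  # normal number
```

The value 1.5 has the bit pattern 0x3fc00000. Its trailing significand is 0x400000, which becomes 0x800000 after padding, so the sketch prints 0x1.800000p0.
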

## Control flow

Branches transfer control to a new EBB and provide values for the target EBB’s arguments, if it has any. Conditional branches only take the branch if their condition is satisfied, otherwise execution continues at the following instruction in the EBB.

jump EBB(args…)

Jump.

Unconditionally jump to an extended basic block, passing the specified EBB arguments. The number and types of arguments must match the destination EBB.

Arguments:
• EBB (ebb) – Destination extended basic block
• args (variable_args) – EBB arguments

brz c, EBB(args…)

Branch when zero.

If c is a b1 value, take the branch when c is false. If c is an integer value, take the branch when c = 0.

Arguments:
• c (Testable) – Controlling value to test
• EBB (ebb) – Destination extended basic block
• args (variable_args) – EBB arguments
Type variables:
• Testable – inferred from c

brnz c, EBB(args…)

Branch when non-zero.

If c is a b1 value, take the branch when c is true. If c is an integer value, take the branch when c != 0.

Arguments:
• c (Testable) – Controlling value to test
• EBB (ebb) – Destination extended basic block
• args (variable_args) – EBB arguments
Type variables:
• Testable – inferred from c

br_icmp Cond, x, y, EBB(args…)

Compare scalar integers and branch.

Compare x and y in the same way as the icmp instruction and take the branch if the condition is true:

br_icmp ugt v1, v2, ebb4(v5, v6)


is semantically equivalent to:

v10 = icmp ugt v1, v2
brnz v10, ebb4(v5, v6)


Some RISC architectures like MIPS and RISC-V provide instructions that implement all or some of the condition codes. The instruction can also be used to represent macro-op fusion on architectures like Intel’s.

Arguments:
• Cond (intcc) – An integer comparison condition code.
• x (iB) – A scalar integer type
• y (iB) – A scalar integer type
• EBB (ebb) – Destination extended basic block
• args (variable_args) – EBB arguments
Type variables:
• iB – inferred from x

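
The difference between signed and unsigned condition codes is easiest to see on a concrete bit pattern. The Python sketch below models an integer comparison on fixed-width values. Only ult and ugt appear in this document; the other condition code names follow the same signed/unsigned naming scheme and should be checked against the icmp documentation. The function itself is a hypothetical helper, not a Cranelift API.

```python
def icmp(cond: str, x: int, y: int, bits: int = 32) -> bool:
    """Compare two `bits`-wide integers under the condition code `cond`."""
    mask = (1 << bits) - 1
    ux, uy = x & mask, y & mask          # unsigned interpretation

    def signed(v: int) -> int:           # two's complement interpretation
        return v - (1 << bits) if v >> (bits - 1) else v

    sx, sy = signed(ux), signed(uy)
    results = {
        "eq": ux == uy, "ne": ux != uy,
        "ult": ux < uy, "uge": ux >= uy, "ugt": ux > uy, "ule": ux <= uy,
        "slt": sx < sy, "sge": sx >= sy, "sgt": sx > sy, "sle": sx <= sy,
    }
    return results[cond]

def br_icmp_taken(cond: str, x: int, y: int, bits: int = 32) -> bool:
    """br_icmp modeled as icmp followed by brnz on the result."""
    v10 = icmp(cond, x, y, bits)
    return v10                           # brnz branches when v10 is true
```

With the 32-bit pattern 0xffffffff as x and 1 as y, the ugt comparison holds (4294967295 > 1) while sgt does not (-1 > 1 is false), so only the unsigned form of the branch would be taken.
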
br_table x, EBB, JT

Indirect branch via jump table.

Use x as an unsigned index into the jump table JT. If a jump table entry is found, branch to the corresponding EBB. If no entry was found or the index is out-of-bounds, branch to the given default EBB.

Note that this branch instruction can’t pass arguments to the targeted blocks. Split critical edges as needed to work around this.

Do not confuse this with “tables” in WebAssembly. br_table is for jump tables with destinations within the current function only – think of a match in Rust or a switch in C. If you want to call a function in a dynamic library, that will typically use call_indirect.

Arguments:
• x (iB) – index into jump table
• EBB (ebb) – Destination extended basic block
• JT (jump_table) – A jump table.
Type variables:
• iB – inferred from x

JT = jump_table [EBB0, EBB1, …, EBBn]

Declare a jump table in the function preamble.

This declares a jump table for use by the br_table indirect branch instruction. Entries in the table are EBB names.

The EBBs listed must belong to the current function, and they can’t have any arguments.

Arguments:
• EBB0 – Target EBB when x = 0.
• EBB1 – Target EBB when x = 1.
• EBBn – Target EBB when x = n.
Results:
• JT – A jump table identifier. (Not an SSA value).
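
The dispatch rule of br_table can be summarized in a short Python sketch. Representing a jump table as a list of EBB names, and the function name itself, are assumptions for illustration.

```python
def br_table(x: int, default_ebb: str, jump_table: list, bits: int = 32) -> str:
    """Return the EBB that br_table would branch to."""
    index = x & ((1 << bits) - 1)    # x is used as an unsigned index
    if index < len(jump_table):
        return jump_table[index]     # entry found: branch to it
    return default_ebb               # out of bounds: take the default
```

Because the index is unsigned, a value that would be -1 as a signed i32 becomes 0xffffffff and falls through to the default EBB instead of wrapping around to the end of the table.
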

Traps stop the program because something went wrong. The exact behavior depends on the target instruction set architecture and operating system. There are explicit trap instructions defined below, but some instructions may also cause traps for certain input values. For example, udiv traps when the divisor is zero.

trap code

Terminate execution unconditionally.

Arguments: code (trapcode) – A trap reason code.
trapz c, code

Trap when zero.

If c is non-zero, execution continues at the following instruction.

Arguments:
• c (Testable) – Controlling value to test
• code (trapcode) – A trap reason code.
Type variables:
• Testable – inferred from c

trapnz c, code

Trap when non-zero.

If c is zero, execution continues at the following instruction.

Arguments:
• c (Testable) – Controlling value to test
• code (trapcode) – A trap reason code.
Type variables:
• Testable – inferred from c

## Function calls

A function call needs a target function and a function signature. The target function may be determined dynamically at runtime, but the signature must be known when the function call is compiled. The function signature describes how to call the function, including parameters, return values, and the calling convention:

signature    ::=  "(" [paramlist] ")" ["->" retlist] [callconv]
paramlist    ::=  param { "," param }
retlist      ::=  paramlist
param        ::=  type [paramext] [paramspecial]
paramext     ::=  "uext" | "sext"
paramspecial ::=  "sret" | "link" | "fp" | "csr" | "vmctx" | "sigid" | "stack_limit"
callconv     ::=  "fast" | "cold" | "system_v" | "fastcall" | "baldrdash"


A function’s calling convention determines exactly how arguments and return values are passed, and how stack frames are managed. Since all of these details depend on both the instruction set architecture and possibly the operating system, a function’s calling convention is only fully determined by a (TargetIsa, CallConv) tuple.

| Name | Description |
|------|-------------|
| sret | pointer to a return value in memory |
| fp | the initial value of the frame pointer |
| csr | callee-saved register |
| vmctx | VM context pointer, which may contain pointers to heaps etc. |
| sigid | signature id, for checking caller/callee signature compatibility |
| stack_limit | limit value for the size of the stack |

| Name | Description |
|------|-------------|
| fast | not-ABI-stable convention for best performance |
| cold | not-ABI-stable convention for infrequently executed code |
| system_v | System V-style convention used on many platforms |
| fastcall | Windows “fastcall” convention, also used for x64 and ARM |
| baldrdash | SpiderMonkey WebAssembly convention |

The “not-ABI-stable” conventions do not follow an external specification and may change between versions of Cranelift.

The “fastcall” convention is not yet implemented.

Parameters and return values have flags whose meaning is mostly target dependent. These flags support interfacing with code produced by other compilers.

Functions that are called directly must be declared in the function preamble:

FN = [colocated] NAME signature

Declare a function so it can be called directly.

If the colocated keyword is present, the symbol will be defined along with the current function, which allows more efficient addressing to be used.

Arguments:
• NAME – Name of the function, passed to the linker for resolution.
• signature – Function signature, as described above.
Results:
• FN – A function identifier that can be used with call.

rvals = call FN(args…)

Direct function call.

Call a function which has been declared in the preamble. The argument types must match the function’s signature.

Arguments:
• FN (func_ref) – function to call, declared in the function preamble
• args (variable_args) – call arguments
Results:
• rvals (variable_args) – return values

return rvals…

Return from the function.

Unconditionally transfer control to the calling function, passing the provided return values. The list of return values must match the function signature’s return types.

Arguments: rvals (variable_args) – return values
fallthrough_return rvals…

Return from the function by fallthrough.

This is a specialized instruction for use where one wants to append a custom epilogue, which will then perform the real return. This instruction has no encoding.

Arguments: rvals (variable_args) – return values

This simple example illustrates direct function calls and signatures:

function %gcd(i32 uext, i32 uext) -> i32 uext system_v {
    fn0 = %divmod(i32 uext, i32 uext) -> i32 uext, i32 uext

ebb1(v0: i32, v1: i32):
    brz v1, ebb2
    v2, v3 = call fn0(v0, v1)
    return v2

ebb2:
    return v0
}


Indirect function calls use a signature declared in the preamble.

rvals = call_indirect SIG, callee(args…)

Indirect function call.

Call the function pointed to by callee with the given arguments. The called function must match the specified signature.

Note that this is different from WebAssembly’s call_indirect; the callee is a native address, rather than a table index. For WebAssembly, table_addr and load are used to obtain a native address from a table.

Arguments:
• SIG (sig_ref) – function signature
• callee (iAddr) – address of function to call
• args (variable_args) – call arguments
Results:
• rvals (variable_args) – return values
Type variables:
• iAddr – inferred from callee

addr = func_addr FN

Get the address of a function.

Compute the absolute address of a function declared in the preamble. The returned address can be used as a callee argument to call_indirect. This is also a method for calling functions that are too far away to be addressable by a direct call instruction.

Arguments:
• FN (func_ref) – function whose address is taken, declared in the function preamble
Results:
• addr (iAddr) – An integer address type
Type variables:
• iAddr – explicitly provided

## Memory

Cranelift provides fully general load and store instructions for accessing memory, as well as extending loads and truncating stores.

If the memory at the given address is not addressable, the behavior of these instructions is undefined. If it is addressable but not accessible, they trap.

a = load MemFlags, p, Offset

Load from memory at p + Offset.

This is a polymorphic instruction that can load any value type which has a memory representation.

Arguments:
• MemFlags (memflags) – Memory operation flags
• p (iAddr) – An integer address type
• Offset (offset32) – Byte offset from base address
Results:
• a (Mem) – Value loaded
Type variables:
• Mem – explicitly provided
• iAddr – from input operand

store MemFlags, x, p, Offset

Store x to memory at p + Offset.

This is a polymorphic instruction that can store any value type with a memory representation.

Arguments:
• MemFlags (memflags) – Memory operation flags
• x (Mem) – Value to be stored
• p (iAddr) – An integer address type
• Offset (offset32) – Byte offset from base address
Type variables:
• Mem – inferred from x
• iAddr – from input operand

There are also more restricted operations for accessing specific types of memory objects.

a = load_complex MemFlags(args…), Offset

Load from memory at sum(args) + Offset.

This is a polymorphic instruction that can load any value type which has a memory representation.

Arguments:
• MemFlags (memflags) – Memory operation flags
• args (variable_args) – Address arguments
• Offset (offset32) – Byte offset from base address
Results:
• a (Mem) – Value loaded
Type variables:
• Mem – explicitly provided

store_complex MemFlags, x(args…), Offset

Store x to memory at sum(args) + Offset.

This is a polymorphic instruction that can store any value type with a memory representation.

Arguments:
• MemFlags (memflags) – Memory operation flags
• x (Mem) – Value to be stored
• args (variable_args) – Address arguments
• Offset (offset32) – Byte offset from base address
Type variables:
• Mem – inferred from x

### Memory operation flags

Loads and stores can have flags that loosen their semantics in order to enable optimizations.

When the accessible flag is set, the behavior is undefined if the memory is not accessible.

Loads and stores are misaligned if the resultant address is not a multiple of the expected alignment. By default, misaligned loads and stores are allowed, but when the aligned flag is set, a misaligned memory access is allowed to trap.

### Explicit Stack Slots

One set of restricted memory operations accesses the current function’s stack frame. The stack frame is divided into fixed-size stack slots that are allocated in the function preamble. Stack slots are not typed; they simply represent a contiguous sequence of accessible bytes in the stack frame.

SS = explicit_slot Bytes, Flags…

Allocate a stack slot in the preamble.

If no alignment is specified, Cranelift will pick an appropriate alignment for the stack slot based on its size and access patterns.

Arguments:
• Bytes – Stack slot size in bytes.
Flags:
• align(N) – Request at least N bytes alignment.
Results:
• SS – Stack slot index.

a = stack_load SS, Offset

Load a value from a stack slot at a constant offset.

This is a polymorphic instruction that can load any value type which has a memory representation.

The offset is an immediate constant, not an SSA value. The memory access cannot go out of bounds, i.e. $$sizeof(a) + Offset <= sizeof(SS)$$.

Arguments:
• SS (stack_slot) – A stack slot.
• Offset (offset32) – In-bounds offset into stack slot
Results:
• a (Mem) – Value loaded
Type variables:
• Mem – explicitly provided

stack_store x, SS, Offset

Store a value to a stack slot at a constant offset.

This is a polymorphic instruction that can store any value type with a memory representation.

The offset is an immediate constant, not an SSA value. The memory access cannot go out of bounds, i.e. $$sizeof(x) + Offset <= sizeof(SS)$$.

Arguments:
• x (Mem) – Value to be stored
• SS (stack_slot) – A stack slot.
• Offset (offset32) – In-bounds offset into stack slot
Type variables:
• Mem – inferred from x

The dedicated stack access instructions are easy for the compiler to reason about because stack slots and offsets are fixed at compile time. For example, the alignment of these stack memory accesses can be inferred from the offsets and stack slot alignments.

It’s also possible to obtain the address of a stack slot, which can be used in unrestricted loads and stores.

addr = stack_addr SS, Offset

Get the address of a stack slot.

Compute the absolute address of a byte in a stack slot. The offset must refer to a byte inside the stack slot: $$0 <= Offset < sizeof(SS)$$.

Arguments:
• SS (stack_slot) – A stack slot.
• Offset (offset32) – In-bounds offset into stack slot
Results:
• addr (iAddr) – An integer address type
Type variables:
• iAddr – explicitly provided
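
The two in-bounds conditions for stack slots differ slightly: a load or store must fit entirely inside the slot, while stack_addr only needs the addressed byte to lie inside it. A Python sketch of both checks follows. The function names are hypothetical, and requiring a non-negative offset for loads and stores is an assumption that the text does not state explicitly.

```python
def stack_access_in_bounds(value_size: int, offset: int, slot_size: int) -> bool:
    """stack_load / stack_store rule: sizeof(value) + Offset <= sizeof(SS)."""
    return offset >= 0 and value_size + offset <= slot_size

def stack_addr_in_bounds(offset: int, slot_size: int) -> bool:
    """stack_addr rule: 0 <= Offset < sizeof(SS)."""
    return 0 <= offset < slot_size
```

An 8-byte f64 at offset 0 of an 8-byte slot, as in the %average example, is in bounds. The same value at offset 17 of a 24-byte slot is not, since it would end at byte 25.
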

The stack_addr instruction can be used to macro-expand the stack access instructions before instruction selection:

v0 = stack_load.f64 ss3, 16
; Expands to:
v1 = stack_addr ss3, 16
v0 = load.f64 v1


When Cranelift code is running in a sandbox, it can also be necessary to include stack overflow checks in the prologue.

### Global values

A global value is an object whose value is not known at compile time. The value is computed at runtime by global_value, possibly using information provided by the linker via relocations. There are multiple kinds of global values using different methods for determining their value. Cranelift does not track the type of a global value, since global values are just values stored in non-stack memory.

When Cranelift is generating code for a virtual machine environment, globals can be used to access data structures in the VM’s runtime. This requires functions to have access to a VM context pointer which is used as the base address. Typically, the VM context pointer is passed as a hidden function argument to Cranelift functions.

Chains of global value expressions are possible, but cycles are not allowed; the IR verifier rejects them.
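
Since each derived global value names at most one base, checking a set of declarations for cycles amounts to following base links. The sketch below is a hypothetical illustration of such a check, not the verifier's actual implementation. It assumes the declarations are given as a dictionary mapping each global value name to its base, with None for kinds that have no base.

```python
def has_cycle(bases: dict) -> bool:
    """Return True if following base links from any global value loops.

    `bases` maps every declared global value to the name of its base,
    or to None when it has no base. Every base must itself be declared.
    """
    for start in bases:
        seen = set()
        gv = start
        while gv is not None:
            if gv in seen:
                return True      # we came back to a value already visited
            seen.add(gv)
            gv = bases[gv]       # step to the base of the current value
    return False
```

A chain such as gv2 based on gv1 based on gv0 passes the check, while two declarations that name each other as base do not.
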

GV = vmctx

Declare a global value holding the address of the VM context struct.

This declares a global value that is the VM context pointer, which may be passed as a hidden argument to functions JIT-compiled for a VM.

Typically, the VM context is a #[repr(C, packed)] struct.

Results: GV – Global value.

A global value can also be derived by treating another global value as a struct pointer and loading from one of its fields. This makes it possible to chase pointers into VM runtime data structures.

GV = load.Type BaseGV [Offset]

Declare a global value pointed to by BaseGV plus Offset, with type Type.

It is assumed that BaseGV plus Offset resides in accessible memory with the appropriate alignment for storing a value with type Type.

Arguments:
• BaseGV – Global value providing the base pointer.
• Offset – Offset added to the base before loading.
Results:
• GV – Global value.

GV = iadd_imm BaseGV, Offset

Declare a global value which has the value of BaseGV offset by Offset.

Arguments:
• BaseGV – Global value providing the base value.
• Offset – Offset added to the base value.

GV = [colocated] symbol Name

Declare a symbolic address global value.

The value of GV is symbolic and will be assigned a relocation, so that it can be resolved by a later linking phase.

If the colocated keyword is present, the symbol will be defined along with the current function, which allows more efficient addressing to be used.

Arguments:
• Name – External name.
Results:
• GV – Global value.

a = global_value GV

Compute the value of global GV.

Arguments:
• GV (global_value) – A global value.
Results:
• a (Mem) – Value loaded
Type variables:
• Mem – explicitly provided

a = symbol_value GV

Compute the value of global GV, which is a symbolic value.

Arguments:
• GV (global_value) – A global value.
Results:
• a (Mem) – Value loaded
Type variables:
• Mem – explicitly provided

### Heaps

Code compiled from WebAssembly or asm.js runs in a sandbox where it can’t access all process memory. Instead, it is given a small set of memory areas to work in, and all accesses are bounds checked. Cranelift models this through the concept of heaps.

A heap is declared in the function preamble and can be accessed with the heap_addr instruction, which either traps on out-of-bounds accesses or returns a pointer that is guaranteed to trap when accessed. Heap addresses can be smaller than the native pointer size, for example unsigned i32 offsets on a 64-bit architecture.

A heap appears as three consecutive ranges of address space:

1. The mapped pages are the accessible memory range in the heap. A heap may have a minimum guaranteed size which means that some mapped pages are always present.
2. The unmapped pages are a possibly empty range of address space that may be mapped in the future when the heap is grown. They are addressable but not accessible.
3. The offset-guard pages are a range of address space that is guaranteed to always cause a trap when accessed. They are used to optimize bounds checking for heap accesses with a shared base pointer. They are addressable but not accessible.

The heap bound is the total size of the mapped and unmapped pages. This is the bound that heap_addr checks against. Memory accesses inside the heap bounds can trap if they hit an unmapped page (which is not accessible).

addr = heap_addr H, p, Size

Bounds check and compute absolute address of heap memory.

Verify that the offset range p .. p + Size - 1 is in bounds for the heap H, and generate an absolute address that is safe to dereference.

1. If p + Size is not greater than the heap bound, return an absolute address corresponding to a byte offset of p from the heap’s base address.
2. If p + Size is greater than the heap bound, generate a trap.

Arguments:
• H (heap) – A heap.
• p (HeapOffset) – An unsigned heap offset
• Size (uimm32) – Size in bytes
Results:
• addr (iAddr) – An integer address type
Type variables:
• iAddr – explicitly provided
• HeapOffset – from input operand
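
The two cases above translate directly into a small Python model. This is a sketch of the described semantics only: it models the trapping variant with an exception, and the HeapTrap class and the explicit base and bound parameters are assumptions made for the example.

```python
class HeapTrap(Exception):
    """Stands in for the trap generated on an out-of-bounds heap access."""

def heap_addr(base: int, bound: int, p: int, size: int) -> int:
    """Bounds check the range p .. p + size - 1 and return base + p."""
    if p + size > bound:
        # Case 2: the access extends past the heap bound.
        raise HeapTrap(f"offset {p:#x} with size {size} exceeds bound {bound:#x}")
    # Case 1: in bounds, so return the absolute address of byte p.
    return base + p
```

With a bound of 0x1000, a 4-byte access at offset 0xffc is the last one that succeeds; the same access at offset 0xffd traps.
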

Two styles of heaps are supported, static and dynamic. They behave differently when resized.

#### Static heaps

A static heap starts out with all the address space it will ever need, so it never moves to a different address. At the base address is a number of mapped pages corresponding to the heap’s current size. Then follows a number of unmapped pages where the heap can grow up to its maximum size. After the unmapped pages follow the offset-guard pages which are also guaranteed to generate a trap when accessed.

H = static Base, min MinBytes, bound BoundBytes, offset_guard OffsetGuardBytes

Declare a static heap in the preamble.

Arguments:
• Base – Global value holding the heap’s base address.
• MinBytes – Guaranteed minimum heap size in bytes. Accesses below this size will never trap.
• BoundBytes – Fixed heap bound in bytes. This defines the amount of address space reserved for the heap, not including the offset-guard pages.
• OffsetGuardBytes – Size of the offset-guard pages in bytes.

#### Dynamic heaps

A dynamic heap can be relocated to a different base address when it is resized, and its bound can move dynamically. The offset-guard pages move when the heap is resized. The bound of a dynamic heap is stored in a global value.

H = dynamic Base, min MinBytes, bound BoundGV, offset_guard OffsetGuardBytes

Declare a dynamic heap in the preamble.

Arguments:
• Base – Global value holding the heap’s base address.
• MinBytes – Guaranteed minimum heap size in bytes. Accesses below this size will never trap.
• BoundGV – Global value containing the current heap bound in bytes.
• OffsetGuardBytes – Size of the offset-guard pages in bytes.

#### Heap examples¶

The SpiderMonkey VM prefers to use fixed heaps with a 4 GB bound and 2 GB of offset-guard pages when running WebAssembly code on 64-bit CPUs. The combination of a 4 GB fixed bound and 1-byte bounds checks means that no code needs to be generated for bounds checks at all:

function %add_members(i32, i64 vmctx) -> f32 baldrdash {
gv0 = vmctx
gv1 = load.i64 notrap aligned gv0+64
heap0 = static gv1, min 0x1000, bound 0x1_0000_0000, offset_guard 0x8000_0000

ebb0(v0: i32, v5: i64):
v1 = heap_addr.i64 heap0, v0, 1
v2 = load.f32 v1+16
v3 = load.f32 v1+20
v4 = fadd v2, v3
return v4
}
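
The claim that no bounds check is needed follows from simple arithmetic: the largest 32-bit index plus a 1-byte access size never exceeds a 4 GB bound. The Python sketch below checks that reasoning; `check_can_be_elided` is a hypothetical helper written for this example, not part of Cranelift.

```python
def check_can_be_elided(index_bits, size, bound):
    """Return True if p + size <= bound holds for every index_bits-wide p."""
    max_p = (1 << index_bits) - 1
    return max_p + size <= bound


# 32-bit index, 1-byte check, 4 GB static bound: the check can never fail.
print(check_can_be_elided(32, 1, 0x1_0000_0000))
# The same index against a 1 MB static bound can fail, so code is needed.
print(check_can_be_elided(32, 1, 0x10_0000))
```

Accesses at small offsets past the checked byte are then covered by the offset-guard pages, which trap when touched.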


A static heap can also be used for 32-bit code when the WebAssembly module declares a small upper bound on its memory. A 1 MB static bound with a single 4 KB offset-guard page still has opportunities for sharing bounds checking code:

function %add_members(i32, i32 vmctx) -> f32 baldrdash {
gv0 = vmctx
gv1 = load.i32 notrap aligned gv0+64
heap0 = static gv1, min 0x1000, bound 0x10_0000, offset_guard 0x1000

ebb0(v0: i32, v5: i32):
v1 = heap_addr.i32 heap0, v0, 1
v2 = load.f32 v1+16
v3 = load.f32 v1+20
v4 = fadd v2, v3
return v4
}


If the upper bound on the heap size is too large, a dynamic heap is required instead.

Finally, a runtime environment that simply allocates a heap with malloc() may not have any offset-guard pages at all. In that case, full bounds checking is required for each access:

function %add_members(i32, i64 vmctx) -> f32 baldrdash {
gv0 = vmctx
gv1 = load.i64 notrap aligned gv0+64
gv2 = load.i32 notrap aligned gv0+72
heap0 = dynamic gv1, min 0x1000, bound gv2, offset_guard 0

ebb0(v0: i32, v6: i64):
v1 = heap_addr.i64 heap0, v0, 20
v2 = load.f32 v1+16
v3 = heap_addr.i64 heap0, v0, 24
v4 = load.f32 v3+20
v5 = fadd v2, v4
return v5
}


### Tables¶

Code compiled from WebAssembly often needs access to objects outside of its linear memory. WebAssembly uses tables to allow programs to refer to opaque values through integer indices.

A table is declared in the function preamble and can be accessed with the table_addr instruction that traps on out-of-bounds accesses. Table addresses can be smaller than the native pointer size, for example unsigned i32 offsets on a 64-bit architecture.

A table appears as a consecutive range of address space, conceptually divided into elements of fixed sizes, which are identified by their index. The memory is accessible.

The table bound is the number of elements currently in the table. This is the bound that table_addr checks against.

addr = table_addr T, p, Offset

Bounds check and compute absolute address of a table entry.

Verify that the offset p is in bounds for the table T, and generate an absolute address that is safe to dereference.

Offset must be less than the size of a table element.

1. If p is less than the table bound, return an absolute address corresponding to element p of the table, plus Offset bytes.
2. If p is greater than or equal to the table bound, generate a trap.
Arguments: T (table) – A table. p (TableOffset) – An unsigned table offset Offset (offset32) – Byte offset from element address addr (iAddr) – An integer address type iAddr – explicitly provided TableOffset – from input operand
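
A minimal Python sketch of this check follows. It assumes that valid indices run from 0 to bound - 1 (the bound being the element count) and that the address of element p is the base plus p times the element size; `table_addr_model` and its parameters are hypothetical names, and `RuntimeError` stands in for a trap.

```python
def table_addr_model(base, bound, element_size, p, offset):
    """Sketch of table_addr: bounds-check index p, then address its element.

    base         -- absolute address of the table's base (hypothetical)
    bound        -- number of elements currently in the table
    element_size -- size of each element in bytes (assumed unit)
    p            -- unsigned element index
    offset       -- byte offset within the element
    """
    if offset >= element_size:
        raise ValueError("Offset must be less than the size of an element")
    if p >= bound:
        raise RuntimeError("trap: table index out of bounds")
    # Element p starts p * element_size bytes past the base address.
    return base + p * element_size + offset
```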

A table can be relocated to a different base address when it is resized, and its bound can move dynamically. The bound of a table is stored in a global value.

T = dynamic Base, min MinElements, bound BoundGV, element_size ElementSize

Declare a table in the preamble.

Arguments: Base – Global value holding the table’s base address. MinElements – Guaranteed minimum table size in elements. BoundGV – Global value containing the current table bound in elements. ElementSize – Size of each element.

## Operations¶

a = select c, x, y

Conditional select.

This instruction selects whole values. Use vselect for lane-wise selection.

Arguments: c (Testable) – Controlling value to test x (Any) – Value to use when c is true y (Any) – Value to use when c is false a (Any) – Any integer, float, or boolean scalar or vector type Any – inferred from x Testable – from input operand
a = selectif cc, flags, x, y

Conditional select, dependent on integer condition codes.

Arguments: cc (intcc) – Controlling condition code flags (iflags) – The machine’s flag register x (Any) – Value to use when the condition cc holds y (Any) – Value to use when the condition cc does not hold a (Any) – Any integer, float, or boolean scalar or vector type Any – explicitly provided

### Constant materialization¶

A few instructions have variants that take immediate operands (e.g., band / band_imm), but in general an instruction is required to load a constant into an SSA value.

a = iconst N

Integer constant.

Create a scalar integer SSA value with an immediate constant value, or an integer vector where all the lanes have the same value.

Arguments: N (imm64) – A 64-bit immediate integer. a (Int) – A constant integer scalar or vector value Int – explicitly provided
a = f32const N

Floating point constant.

Create an f32 SSA value with an immediate constant value.

Arguments: N (ieee32) – A 32-bit immediate floating point number. a (f32) – A constant f32 scalar value
a = f64const N

Floating point constant.

Create an f64 SSA value with an immediate constant value.

Arguments: N (ieee64) – A 64-bit immediate floating point number. a (f64) – A constant f64 scalar value
a = bconst N

Boolean constant.

Create a scalar boolean SSA value with an immediate constant value, or a boolean vector where all the lanes have the same value.

Arguments: N (bool) – An immediate boolean. a (Bool) – A constant boolean scalar or vector value Bool – explicitly provided

### Vector operations¶

lo, hi = vsplit x

Split a vector into two halves.

Split the vector x into two separate values, each containing half of the lanes from x. The result may be two scalars if x only had two lanes.

Arguments: x (TxN) – Vector to split lo (half_vector(TxN)) – Low-numbered lanes of x hi (half_vector(TxN)) – High-numbered lanes of x TxN – inferred from x
a = vconcat x, y

Vector concatenation.

Return a vector formed by concatenating x and y. The resulting vector type has twice as many lanes as each of the inputs. The lanes of x appear as the low-numbered lanes, and the lanes of y become the high-numbered lanes of a.

It is possible to form a vector by concatenating two scalars.

Arguments: x (Any128) – Low-numbered lanes y (Any128) – High-numbered lanes a (double_vector(Any128)) – Concatenation of x and y Any128 – inferred from x
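
The lane ordering of vsplit and vconcat can be sketched in Python by treating a vector as a list with the lowest-numbered lane first. That representation and the function names are assumptions of this example only.

```python
def vsplit_model(x):
    """Split a vector (list of lanes) into its low and high halves."""
    half = len(x) // 2
    lo, hi = x[:half], x[half:]
    if half == 1:
        # A two-lane input yields two scalars rather than one-lane vectors.
        return lo[0], hi[0]
    return lo, hi


def vconcat_model(x, y):
    """Concatenate x (low-numbered lanes) and y (high-numbered lanes)."""
    def as_lanes(v):
        # Scalars are treated as single-lane inputs.
        return list(v) if isinstance(v, list) else [v]
    return as_lanes(x) + as_lanes(y)
```

In this model `vconcat_model(*vsplit_model(v))` gives back `v`, which reflects that the two instructions are inverses of each other with respect to lane order.
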
a = vselect c, x, y

Vector lane select.

Select lanes from x or y controlled by the lanes of the boolean vector c.

Arguments: c (as_bool(TxN)) – Controlling vector x (TxN) – Value to use where c is true y (TxN) – Value to use where c is false a (TxN) – A SIMD vector type TxN – inferred from x
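
The difference between select and vselect is easiest to see side by side. The sketch below models vectors as Python lists; the function names are hypothetical.

```python
def select_model(c, x, y):
    """select: one controlling value picks the whole of x or the whole of y."""
    return x if c else y


def vselect_model(c, x, y):
    """vselect: each boolean lane of c picks the matching lane of x or y."""
    if not (len(c) == len(x) == len(y)):
        raise ValueError("all operands must have the same number of lanes")
    return [xi if ci else yi for ci, xi, yi in zip(c, x, y)]
```
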
a = splat x

Vector splat.

Return a vector whose lanes are all x.

Arguments: x (lane_of(TxN)) – Value to copy into every lane a (TxN) – A SIMD vector type TxN – explicitly provided
a = insertlane x, Idx, y

Insert y as lane Idx in x.

The lane index, Idx, is an immediate value, not an SSA value. It must indicate a valid lane index for the type of x.

Arguments: x (TxN) – SIMD vector to modify Idx (uimm8) – Lane index y (lane_of(TxN)) – New lane value a (TxN) – A SIMD vector type TxN – inferred from x
a = extractlane x, Idx

Extract lane Idx from x.

The lane index, Idx, is an immediate value, not an SSA value. It must indicate a valid lane index for the type of x.

Arguments: x (TxN) – A SIMD vector type Idx (uimm8) – Lane index a (lane_of(TxN)) – Value of the selected lane TxN – inferred from x

### Integer operations¶

a = icmp Cond, x, y

Integer comparison.

The condition code determines if the operands are interpreted as signed or unsigned integers.

Signed Unsigned Condition
eq eq Equal
ne ne Not equal
slt ult Less than
sge uge Greater than or equal
sgt ugt Greater than
sle ule Less than or equal

When this instruction compares integer vectors, it returns a boolean vector of lane-wise comparisons.

Arguments: Cond (intcc) – An integer comparison condition code. x (Int) – A scalar or vector integer type y (Int) – A scalar or vector integer type a (as_bool(Int)) – Boolean result of the comparison (lane-wise for vectors) Int – inferred from x
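
Because the same bit pattern can compare differently as a signed or an unsigned number, a small model helps. The Python sketch below interprets B-bit patterns according to the condition code; `icmp_model`, `to_signed`, and the explicit `bits` parameter are assumptions of this example.

```python
import operator

_COMPARE = {"eq": operator.eq, "ne": operator.ne, "lt": operator.lt,
            "ge": operator.ge, "gt": operator.gt, "le": operator.le}


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def icmp_model(cond, x, y, bits):
    """Scalar icmp on B-bit patterns; cond is one of the codes in the table."""
    mask = (1 << bits) - 1
    x, y = x & mask, y & mask
    if cond in ("eq", "ne"):
        return _COMPARE[cond](x, y)
    if cond[0] == "s":
        # Signed codes reinterpret both operands before comparing.
        x, y = to_signed(x, bits), to_signed(y, bits)
    return _COMPARE[cond[1:]](x, y)
```

For example, the 8-bit pattern `0xFF` is 255 when unsigned and -1 when signed, so `ult` and `slt` against `0x01` give opposite answers.
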
a = icmp_imm Cond, x, Y

Compare scalar integer to a constant.

This is the same as the icmp instruction, except one operand is an immediate constant.

This instruction can only compare scalars. Use icmp for lane-wise vector comparisons.

Arguments: Cond (intcc) – An integer comparison condition code. x (iB) – A scalar integer type Y (imm64) – A 64-bit immediate integer. a (b1) – A boolean type with 1 bit. iB – inferred from x
a = iadd x, y

Wrapping integer addition: $$a := x + y \pmod{2^B}$$.

This instruction does not depend on the signed/unsigned interpretation of the operands.

Arguments: x (Int) – A scalar or vector integer type y (Int) – A scalar or vector integer type a (Int) – A scalar or vector integer type Int – inferred from x
a = iadd_imm x, Y

Same as iadd, but one operand is an immediate constant.

Polymorphic over all scalar integer types, but does not support vector types.

Arguments: x (iB) – A scalar integer type Y (imm64) – A 64-bit immediate integer. a (iB) – A scalar integer type iB – inferred from x
a = iadd_cin x, y, c_in

Same as iadd with an additional carry input. Computes:

$a = x + y + c_{in} \pmod{2^B}$

Polymorphic over all scalar integer types, but does not support vector types.

Arguments: x (iB) – A scalar integer type y (iB) – A scalar integer type c_in (b1) – Input carry flag a (iB) – A scalar integer type iB – inferred from y
a, c_out = iadd_cout x, y

Same as iadd with an additional carry output.

$\begin{split}a &= x + y \pmod{2^B} \\ c_{out} &= x + y \geq 2^B\end{split}$

Polymorphic over all scalar integer types, but does not support vector types.

Arguments: x (iB) – A scalar integer type y (iB) – A scalar integer type a (iB) – A scalar integer type c_out (b1) – Output carry flag iB – inferred from x
a, c_out = iadd_carry x, y, c_in

Add integers with carry in and out.

Same as iadd with an additional carry input and output.

$\begin{split}a &= x + y + c_{in} \pmod{2^B} \\ c_{out} &= x + y + c_{in} \geq 2^B\end{split}$

Polymorphic over all scalar integer types, but does not support vector types.

Arguments: x (iB) – A scalar integer type y (iB) – A scalar integer type c_in (b1) – Input carry flag a (iB) – A scalar integer type c_out (b1) – Output carry flag iB – inferred from y
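
The carry instructions exist mainly to chain additions across several machine words. The Python sketch below models iadd_carry and uses it to add two 64-bit numbers as pairs of 32-bit limbs; the function names and the `bits` parameter are hypothetical.

```python
def iadd_carry_model(x, y, c_in, bits):
    """Return (a, c_out) for a B-bit add with carry in and carry out."""
    mask = (1 << bits) - 1
    total = (x & mask) + (y & mask) + int(c_in)
    return total & mask, total > mask


def add64_with_32bit_limbs(a, b):
    """Add two 64-bit values using only 32-bit additions."""
    # Low limbs: no carry in, carry out is kept (the iadd_cout case).
    lo, carry = iadd_carry_model(a & 0xFFFFFFFF, b & 0xFFFFFFFF, False, 32)
    # High limbs: consume the carry, drop the carry out (the iadd_cin case).
    hi, _ = iadd_carry_model(a >> 32, b >> 32, carry, 32)
    return (hi << 32) | lo
```
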
a = isub x, y

Wrapping integer subtraction: $$a := x - y \pmod{2^B}$$.

This instruction does not depend on the signed/unsigned interpretation of the operands.

Arguments: x (Int) – A scalar or vector integer type y (Int) – A scalar or vector integer type a (Int) – A scalar or vector integer type Int – inferred from x
a = irsub_imm x, Y

Immediate reverse wrapping subtraction: $$a := Y - x \pmod{2^B}$$.

Also works as integer negation when $$Y = 0$$. Use iadd_imm with a negative immediate operand for the reverse immediate subtraction.

Polymorphic over all scalar integer types, but does not support vector types.

Arguments: x (iB) – A scalar integer type Y (imm64) – A 64-bit immediate integer. a (iB) – A scalar integer type iB – inferred from x
a = isub_bin x, y, b_in

Subtract integers with borrow in.

Same as isub with an additional borrow flag input. Computes:

$a = x - (y + b_{in}) \pmod{2^B}$

Polymorphic over all scalar integer types, but does not support vector types.

Arguments: x (iB) – A scalar integer type y (iB) – A scalar integer type b_in (b1) – Input borrow flag a (iB) – A scalar integer type iB – inferred from y
a, b_out = isub_bout x, y

Subtract integers with borrow out.

Same as isub with an additional borrow flag output.

$\begin{split}a &= x - y \pmod{2^B} \\ b_{out} &= x < y\end{split}$

Polymorphic over all scalar integer types, but does not support vector types.

Arguments: x (iB) – A scalar integer type y (iB) – A scalar integer type a (iB) – A scalar integer type b_out (b1) – Output borrow flag iB – inferred from x
a, b_out = isub_borrow x, y, b_in

Subtract integers with borrow in and out.

Same as isub with an additional borrow flag input and output.

$\begin{split}a &= x - (y + b_{in}) \pmod{2^B} \\ b_{out} &= x < y + b_{in}\end{split}$

Polymorphic over all scalar integer types, but does not support vector types.

Arguments: x (iB) – A scalar integer type y (iB) – A scalar integer type b_in (b1) – Input borrow flag a (iB) – A scalar integer type b_out (b1) – Output borrow flag iB – inferred from y
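
The borrow formulas above translate directly into a short Python model, with operands treated as unsigned B-bit values. `isub_borrow_model` and its `bits` parameter are names invented for this sketch.

```python
def isub_borrow_model(x, y, b_in, bits):
    """Return (a, b_out) for a B-bit subtract with borrow in and borrow out."""
    mask = (1 << bits) - 1
    x, y = x & mask, y & mask
    subtrahend = y + int(b_in)
    # b_out is set exactly when the true difference would be negative.
    return (x - subtrahend) & mask, x < subtrahend
```

Passing `b_in=False` gives the isub_bout behavior, and ignoring the second result gives isub_bin.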

Todo

Add and subtract with signed overflow.

For example, see llvm.sadd.with.overflow.* and llvm.ssub.with.overflow.* in LLVM.

a = imul x, y

Wrapping integer multiplication: $$a := x y \pmod{2^B}$$.

This instruction does not depend on the signed/unsigned interpretation of the operands.

Polymorphic over all integer types (vector and scalar).

Arguments: x (Int) – A scalar or vector integer type y (Int) – A scalar or vector integer type a (Int) – A scalar or vector integer type Int – inferred from x
a = imul_imm x, Y

Integer multiplication by immediate constant.

Polymorphic over all scalar integer types, but does not support vector types.

Arguments: x (iB) – A scalar integer type Y (imm64) – A 64-bit immediate integer. a (iB) – A scalar integer type iB – inferred from x

Todo

Larger multiplication results.

For example, smulx which multiplies i32 operands to produce an i64 result. Alternatively, smulhi and smullo pairs.

a = udiv x, y

Unsigned integer division: $$a := \lfloor {x \over y} \rfloor$$.

This operation traps if the divisor is zero.

Arguments: x (Int) – A scalar or vector integer type y (Int) – A scalar or vector integer type a (Int) – A scalar or vector integer type Int – inferred from x
a = udiv_imm x, Y

Unsigned integer division by an immediate constant.

This operation traps if the divisor is zero.

Arguments: x (iB) – A scalar integer type Y (imm64) – A 64-bit immediate integer. a (iB) – A scalar integer type iB – inferred from x
a = sdiv x, y

Signed integer division rounded toward zero: $$a := \operatorname{sign}(xy) \lfloor {|x| \over |y|}\rfloor$$.

This operation traps if the divisor is zero, or if the result is not representable in $$B$$ bits two’s complement. This only happens when $$x = -2^{B-1}, y = -1$$.

Arguments: x (Int) – A scalar or vector integer type y (Int) – A scalar or vector integer type a (Int) – A scalar or vector integer type Int – inferred from x
a = sdiv_imm x, Y

Signed integer division by an immediate constant.

This operation traps if the divisor is zero, or if the result is not representable in $$B$$ bits two’s complement. This only happens when $$x = -2^{B-1}, Y = -1$$.

Arguments: x (iB) – A scalar integer type Y (imm64) – A 64-bit immediate integer. a (iB) – A scalar integer type iB – inferred from x
a = urem x, y

Unsigned integer remainder.

This operation traps if the divisor is zero.

Arguments: x (Int) – A scalar or vector integer type y (Int) – A scalar or vector integer type a (Int) – A scalar or vector integer type Int – inferred from x
a = urem_imm x, Y

Unsigned integer remainder with immediate divisor.

This operation traps if the divisor is zero.

Arguments: x (iB) – A scalar integer type Y (imm64) – A 64-bit immediate integer. a (iB) – A scalar integer type iB – inferred from x
a = srem x, y

Signed integer remainder. The result has the sign of the dividend.

This operation traps if the divisor is zero.

Arguments: x (Int) – A scalar or vector integer type y (Int) – A scalar or vector integer type a (Int) – A scalar or vector integer type Int – inferred from x
a = srem_imm x, Y

Signed integer remainder with immediate divisor.

This operation traps if the divisor is zero.

Arguments: x (iB) – A scalar integer type Y (imm64) – A 64-bit immediate integer. a (iB) – A scalar integer type iB – inferred from x
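
Rounding toward zero differs from the floor division that some languages provide, so the signed operations are worth modelling explicitly. In the Python sketch below, operands are ordinary signed integers assumed to fit in B bits, exceptions stand in for traps, and the function names are hypothetical.

```python
def sdiv_model(x, y, bits):
    """Signed division rounded toward zero, with the two trap conditions."""
    if y == 0:
        raise ZeroDivisionError("trap: integer division by zero")
    quotient = abs(x) // abs(y)
    if (x < 0) != (y < 0):
        quotient = -quotient
    if not -(1 << (bits - 1)) <= quotient < (1 << (bits - 1)):
        # Only reachable for x = -2**(bits - 1), y = -1.
        raise OverflowError("trap: integer overflow")
    return quotient


def srem_model(x, y):
    """Signed remainder; the result takes the sign of the dividend x."""
    if y == 0:
        raise ZeroDivisionError("trap: integer division by zero")
    remainder = abs(x) % abs(y)
    return -remainder if x < 0 else remainder
```

For -7 and 2 this gives a quotient of -3 and a remainder of -1, whereas Python's own `//` and `%` give -4 and 1.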

Todo

Integer minimum / maximum.

NEON has smin, smax, umin, and umax instructions. We should replicate those for both scalar and vector integer types. Even if the target ISA doesn’t have scalar operations, these are good pattern matching targets.

Todo

Saturating arithmetic.

Mostly for SIMD use, but again these are good patterns for contraction. Something like usatadd, usatsub, ssatadd, and ssatsub is a good start.

### Bitwise operations¶

The bitwise operations operate on any value type: integers, floating point numbers, and booleans. When operating on integer or floating point types, the bitwise operations work on the binary representation of the values. When operating on boolean values, the bitwise operations work as logical operators.

a = band x, y

Bitwise and.

Arguments: x (bits) – Any integer, float, or boolean scalar or vector type y (bits) – Any integer, float, or boolean scalar or vector type a (bits) – Any integer, float, or boolean scalar or vector type bits – inferred from x
a = band_imm x, Y

Bitwise and with immediate.

Same as band, but one operand is an immediate constant.

Polymorphic over all scalar integer types, but does not support vector types.

Arguments: x (iB) – A scalar integer type Y (imm64) – A 64-bit immediate integer. a (iB) – A scalar integer type iB – inferred from x
a = bor x, y

Bitwise or.

Arguments: x (bits) – Any integer, float, or boolean scalar or vector type y (bits) – Any integer, float, or boolean scalar or vector type a (bits) – Any integer, float, or boolean scalar or vector type bits – inferred from x
a = bor_imm x, Y

Bitwise or with immediate.

Same as bor, but one operand is an immediate constant.

Polymorphic over all scalar integer types, but does not support vector types.

Arguments: x (iB) – A scalar integer type Y (imm64) – A 64-bit immediate integer. a (iB) – A scalar integer type iB – inferred from x
a = bxor x, y

Bitwise xor.

Arguments: x (bits) – Any integer, float, or boolean scalar or vector type y (bits) – Any integer, float, or boolean scalar or vector type a (bits) – Any integer, float, or boolean scalar or vector type bits – inferred from x
a = bxor_imm x, Y

Bitwise xor with immediate.

Same as bxor, but one operand is an immediate constant.

Polymorphic over all scalar integer types, but does not support vector types.

Arguments: x (iB) – A scalar integer type Y (imm64) – A 64-bit immediate integer. a (iB) – A scalar integer type iB – inferred from x
a = bnot x

Bitwise not.

Arguments: x (bits) – Any integer, float, or boolean scalar or vector type a (bits) – Any integer, float, or boolean scalar or vector type bits – inferred from x
a = band_not x, y

Bitwise and not.

Computes x & ~y.

Arguments: x (bits) – Any integer, float, or boolean scalar or vector type y (bits) – Any integer, float, or boolean scalar or vector type a (bits) – Any integer, float, or boolean scalar or vector type bits – inferred from x
a = bor_not x, y

Bitwise or not.

Computes x | ~y.

Arguments: x (bits) – Any integer, float, or boolean scalar or vector type y (bits) – Any integer, float, or boolean scalar or vector type a (bits) – Any integer, float, or boolean scalar or vector type bits – inferred from x
a = bxor_not x, y

Bitwise xor not.

Computes x ^ ~y.

Arguments: x (bits) – Any integer, float, or boolean scalar or vector type y (bits) – Any integer, float, or boolean scalar or vector type a (bits) – Any integer, float, or boolean scalar or vector type bits – inferred from x
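
As an illustration of the bitwise operations, the Python sketch below models the three `_not` variants on B-bit integers and shows bxor acting on the binary representation of two f32 values. All names are hypothetical, and the f32 helpers assume the IEEE 754 binary32 encoding that Python's `struct` module uses for the `f` format.

```python
import struct


def f32_to_bits(value):
    """Binary representation of an f32 as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bits_to_f32(bits):
    """Inverse of f32_to_bits."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def band_not_model(x, y, bits):
    return x & ~y & ((1 << bits) - 1)


def bor_not_model(x, y, bits):
    return (x | ~y) & ((1 << bits) - 1)


def bxor_not_model(x, y, bits):
    return (x ^ ~y) & ((1 << bits) - 1)


def bxor_f32_model(x, y):
    """bxor on floats works on the bit patterns, not the numeric values."""
    return bits_to_f32(f32_to_bits(x) ^ f32_to_bits(y))
```

In this model `bxor_f32_model(1.5, -1.5)` leaves only the sign bit set, which is the encoding of -0.0.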

The shift and rotate operations only work on integer types (scalar and vector). The shift amount does not have to be the same type as the value being shifted. When shifting a B-bit value, the shift amount is taken modulo B, so only its low $$\log_2 B$$ bits are significant.

When operating on an integer vector type, the shift amount is still a scalar type, and all the lanes are shifted the same amount. The shift amount is masked to the number of bits in a lane, not the full size of the vector type.

a = rotl x, y

Rotate left.

Rotate the bits in x by y places.

Arguments: x (Int) – Scalar or vector value to shift y (iB) – Number of bits to shift a (Int) – A scalar or vector integer type Int – inferred from x iB – from input operand
a = rotl_imm x, Y

Rotate left by immediate.

Arguments: x (Int) – Scalar or vector value to shift Y (imm64) – A 64-bit immediate integer. a (Int) – A scalar or vector integer type Int – inferred from x
a = rotr x, y

Rotate right.

Rotate the bits in x by y places.

Arguments: x (Int) – Scalar or vector value to shift y (iB) – Number of bits to shift a (Int) – A scalar or vector integer type Int – inferred from x iB – from input operand
a = rotr_imm x, Y

Rotate right by immediate.

Arguments: x (Int) – Scalar or vector value to shift Y (imm64) – A 64-bit immediate integer. a (Int) – A scalar or vector integer type Int – inferred from x
a = ishl x, y

Integer shift left. Shift the bits in x towards the MSB by y places. Shift in zero bits to the LSB.

The shift amount is masked to the size of x.

When shifting a B-bits integer type, this instruction computes:

$\begin{split}s &:= y \pmod B, \\ a &:= x \cdot 2^s \pmod{2^B}.\end{split}$
Arguments: x (Int) – Scalar or vector value to shift y (iB) – Number of bits to shift a (Int) – A scalar or vector integer type Int – inferred from x iB – from input operand
a = ishl_imm x, Y

Integer shift left by immediate.

The shift amount is masked to the size of x.

Arguments: x (Int) – Scalar or vector value to shift Y (imm64) – A 64-bit immediate integer. a (Int) – A scalar or vector integer type Int – inferred from x
a = ushr x, y

Unsigned shift right. Shift bits in x towards the LSB by y places, shifting in zero bits to the MSB. Also called a logical shift.

The shift amount is masked to the size of x.

When shifting a B-bits integer type, this instruction computes:

$\begin{split}s &:= y \pmod B, \\ a &:= \lfloor x \cdot 2^{-s} \rfloor.\end{split}$
Arguments: x (Int) – Scalar or vector value to shift y (iB) – Number of bits to shift a (Int) – A scalar or vector integer type Int – inferred from x iB – from input operand
a = ushr_imm x, Y

Unsigned shift right by immediate.

The shift amount is masked to the size of x.

Arguments: x (Int) – Scalar or vector value to shift Y (imm64) – A 64-bit immediate integer. a (Int) – A scalar or vector integer type Int – inferred from x
a = sshr x, y

Signed shift right. Shift bits in x towards the LSB by y places, shifting in sign bits to the MSB. Also called an arithmetic shift.

The shift amount is masked to the size of x.

Arguments: x (Int) – Scalar or vector value to shift y (iB) – Number of bits to shift a (Int) – A scalar or vector integer type Int – inferred from x iB – from input operand
a = sshr_imm x, Y

Signed shift right by immediate.

The shift amount is masked to the size of x.

Arguments: x (Int) – Scalar or vector value to shift Y (imm64) – A 64-bit immediate integer. a (Int) – A scalar or vector integer type Int – inferred from x
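
The masking of the shift amount and the difference between the two right shifts can be sketched in Python on B-bit patterns. The function names and the explicit `bits` parameter are assumptions of this example; rotates are modelled with the amount reduced modulo B, which does not change the result.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def ishl_model(x, y, bits):
    return (x << (y % bits)) & ((1 << bits) - 1)


def ushr_model(x, y, bits):
    return (x & ((1 << bits) - 1)) >> (y % bits)


def sshr_model(x, y, bits):
    # Python's >> on a negative int shifts in copies of the sign bit.
    return (to_signed(x, bits) >> (y % bits)) & ((1 << bits) - 1)


def rotl_model(x, y, bits):
    mask = (1 << bits) - 1
    s = y % bits
    x &= mask
    return ((x << s) | (x >> (bits - s))) & mask


def rotr_model(x, y, bits):
    # Rotating right by s is rotating left by B - s.
    return rotl_model(x, bits - (y % bits), bits)
```

Shifting a 32-bit value by 33 therefore behaves like shifting it by 1.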

The bit-counting instructions below are scalar only.

a = clz x

Count leading zero bits.

Starting from the MSB in x, count the number of zero bits before reaching the first one bit. When x is zero, returns the size of x in bits.

Arguments: x (iB) – A scalar integer type a (iB) – A scalar integer type iB – inferred from x
a = cls x

Count leading sign bits.

Starting from the MSB after the sign bit in x, count the number of consecutive bits identical to the sign bit. When x is 0 or -1, returns one less than the size of x in bits.

Arguments: x (iB) – A scalar integer type a (iB) – A scalar integer type iB – inferred from x
a = ctz x

Count trailing zeros.

Starting from the LSB in x, count the number of zero bits before reaching the first one bit. When x is zero, returns the size of x in bits.

Arguments: x (iB) – A scalar integer type a (iB) – A scalar integer type iB – inferred from x
a = popcnt x

Population count.

Count the number of one bits in x.

Arguments: x (iB) – A scalar integer type a (iB) – A scalar integer type iB – inferred from x
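
The edge cases of the bit-counting instructions, in particular their results for zero and for -1, can be checked against a small Python model. The function names and the `bits` parameter are hypothetical.

```python
def clz_model(x, bits):
    x &= (1 << bits) - 1
    return bits - x.bit_length()


def ctz_model(x, bits):
    x &= (1 << bits) - 1
    if x == 0:
        return bits
    # x & -x isolates the lowest set bit.
    return (x & -x).bit_length() - 1


def cls_model(x, bits):
    mask = (1 << bits) - 1
    x &= mask
    if x >> (bits - 1):
        # For a set sign bit, count leading ones by inverting first.
        x = ~x & mask
    # Leading zeros now include the sign bit itself, which is not counted.
    return clz_model(x, bits) - 1


def popcnt_model(x, bits):
    return bin(x & ((1 << bits) - 1)).count("1")
```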

### Floating point operations¶

These operations generally follow IEEE 754-2008 semantics.

a = fcmp Cond, x, y

Floating point comparison.

Two IEEE 754-2008 floating point numbers, x and y, relate to each other in exactly one of four ways:

UN Unordered when one or both numbers are NaN.
EQ When $$x = y$$. (And $$0.0 = -0.0$$).
LT When $$x < y$$.
GT When $$x > y$$.

The 14 floatcc condition codes each correspond to a subset of the four relations, except for the empty set which would always be false, and the full set which would always be true.

The condition codes are divided into 7 ‘ordered’ conditions which don’t include UN, and 7 unordered conditions which all include UN.

Ordered  Subset        Unordered  Subset          Condition
ord      EQ | LT | GT  uno        UN              NaNs absent / present.
eq       EQ            ueq        UN | EQ         Equal
one      LT | GT       ne         UN | LT | GT    Not equal
lt       LT            ult        UN | LT         Less than
le       LT | EQ       ule        UN | LT | EQ    Less than or equal
gt       GT            ugt        UN | GT         Greater than
ge       GT | EQ       uge        UN | GT | EQ    Greater than or equal

The standard C comparison operators, <, <=, >, >=, are all ordered, so they are false if either operand is NaN. The C equality operator, ==, is ordered, and since inequality is defined as the logical inverse it is unordered. They map to the floatcc condition codes as follows:

C Cond Subset
== eq EQ
!= ne UN | LT | GT
< lt LT
<= le LT | EQ
> gt GT
>= ge GT | EQ

This subset of condition codes also corresponds to the WebAssembly floating point comparisons of the same name.

When this instruction compares floating point vectors, it returns a boolean vector with the results of lane-wise comparisons.

Arguments: Cond (floatcc) – A floating point comparison condition code. x (Float) – A scalar or vector floating point number y (Float) – A scalar or vector floating point number a (as_bool(Float)) – Boolean result of the comparison (lane-wise for vectors) Float – inferred from x
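
The description of condition codes as subsets of the four relations maps directly onto a lookup table. The Python sketch below encodes the 14 codes from the table above; `relation`, `FLOATCC`, and `fcmp_model` are names made up for this example.

```python
import math


def relation(x, y):
    """Classify x and y into exactly one of UN, EQ, LT, GT."""
    if math.isnan(x) or math.isnan(y):
        return "UN"
    if x == y:
        return "EQ"
    return "LT" if x < y else "GT"


# Each condition code is the set of relations for which it is true.
FLOATCC = {
    "ord": {"EQ", "LT", "GT"}, "uno": {"UN"},
    "eq": {"EQ"},              "ueq": {"UN", "EQ"},
    "one": {"LT", "GT"},       "ne": {"UN", "LT", "GT"},
    "lt": {"LT"},              "ult": {"UN", "LT"},
    "le": {"LT", "EQ"},        "ule": {"UN", "LT", "EQ"},
    "gt": {"GT"},              "ugt": {"UN", "GT"},
    "ge": {"GT", "EQ"},        "uge": {"UN", "GT", "EQ"},
}


def fcmp_model(cond, x, y):
    """Scalar fcmp: true when the relation of x and y is in cond's subset."""
    return relation(x, y) in FLOATCC[cond]
```

With a NaN operand, every ordered code is false and every unordered code is true, which is why `ne` holds even when both operands are the same NaN.
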
a = fadd x, y

Floating point addition.

Arguments: x (Float) – A scalar or vector floating point number y (Float) – A scalar or vector floating point number a (Float) – Result of applying operator to each lane Float – inferred from x
a = fsub x, y

Floating point subtraction.

Arguments: x (Float) – A scalar or vector floating point number y (Float) – A scalar or vector floating point number a (Float) – Result of applying operator to each lane Float – inferred from x
a = fmul x, y

Floating point multiplication.

Arguments: x (Float) – A scalar or vector floating point number y (Float) – A scalar or vector floating point number a (Float) – Result of applying operator to each lane Float – inferred from x
a = fdiv x, y

Floating point division.

Unlike the integer division instructions sdiv and udiv, this can’t trap. Division by zero is infinity or NaN, depending on the dividend.

Arguments: x (Float) – A scalar or vector floating point number y (Float) – A scalar or vector floating point number a (Float) – Result of applying operator to each lane Float – inferred from x
a = sqrt x

Floating point square root.

Arguments: x (Float) – A scalar or vector floating point number a (Float) – Result of applying operator to each lane Float – inferred from x
a = fma x, y, z

Floating point fused multiply-and-add.

Computes $$a := xy+z$$ without any intermediate rounding of the product.

Arguments: x (Float) – A scalar or vector floating point number y (Float) – A scalar or vector floating point number z (Float) – A scalar or vector floating point number a (Float) – Result of applying operator to each lane Float – inferred from y
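
The effect of skipping the intermediate rounding can be demonstrated with exact rational arithmetic. The Python sketch below models fma for finite f64 inputs only and ignores the sign of an exactly zero result; `fma_f64_model` is a hypothetical name, and the approach relies on `float()` of a `Fraction` rounding once to the nearest f64.

```python
from fractions import Fraction


def fma_f64_model(x, y, z):
    """x * y + z computed exactly, then rounded once to f64."""
    exact = Fraction(x) * Fraction(y) + Fraction(z)
    return float(exact)


x, y, z = 1.0 + 2.0 ** -30, 1.0 - 2.0 ** -30, -1.0
# The exact product is 1 - 2**-60, which rounds to 1.0 as an f64.
print(x * y + z)               # separate multiply and add: 0.0
print(fma_f64_model(x, y, z))  # fused: -(2.0 ** -60)
```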

#### Sign bit manipulations¶

The sign manipulating instructions work as bitwise operations, so they don’t have special behavior for signaling NaN operands. The exponent and trailing significand bits are always preserved.

a = fneg x

Floating point negation.

Note that this is a pure bitwise operation.

Arguments: x (Float) – A scalar or vector floating point number a (Float) – x with its sign bit inverted Float – inferred from x
a = fabs x

Floating point absolute value.

Note that this is a pure bitwise operation.

Arguments: x (Float) – A scalar or vector floating point number a (Float) – x with its sign bit cleared Float – inferred from x
a = fcopysign x, y

Floating point copy sign.

Note that this is a pure bitwise operation. The sign bit from y is copied to the sign bit of x.

Arguments: x (Float) – A scalar or vector floating point number y (Float) – A scalar or vector floating point number a (Float) – x with its sign bit changed to that of y Float – inferred from x
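
Since these instructions only touch the sign bit, they can be modelled on raw bit patterns. The Python sketch below does so for f32, taking and returning 32-bit integers in the IEEE 754 binary32 layout; the names are hypothetical.

```python
SIGN_BIT = 0x8000_0000
MASK32 = 0xFFFF_FFFF


def fneg_bits(x):
    """Invert the sign bit; all other bits pass through unchanged."""
    return (x ^ SIGN_BIT) & MASK32


def fabs_bits(x):
    """Clear the sign bit."""
    return x & ~SIGN_BIT & MASK32


def fcopysign_bits(x, y):
    """Take everything but the sign from x, and the sign from y."""
    return (x & ~SIGN_BIT & MASK32) | (y & SIGN_BIT)
```

For example, `fneg_bits(0x7FC00000)` turns a NaN pattern into `0xFFC00000`: the payload is preserved and only the sign changes.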

#### Minimum and maximum¶

These instructions return the larger or smaller of their operands. Note that unlike the IEEE 754-2008 minNum and maxNum operations, these instructions return NaN when either input is NaN.

When comparing zeroes, these instructions behave as if $$-0.0 < 0.0$$.

a = fmin x, y

Floating point minimum, propagating NaNs.

If either operand is NaN, this returns a NaN.

Arguments: x (Float) – A scalar or vector floating point number y (Float) – A scalar or vector floating point number a (Float) – The smaller of x and y Float – inferred from x
a = fmax x, y

Floating point maximum, propagating NaNs.

If either operand is NaN, this returns a NaN.

Arguments: x (Float) – A scalar or vector floating point number y (Float) – A scalar or vector floating point number a (Float) – The larger of x and y Float – inferred from x
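
The NaN and signed-zero rules can be captured in a short scalar model. This Python sketch uses f64 values and hypothetical function names; which NaN is returned is not modelled.

```python
import math


def fmin_model(x, y):
    if math.isnan(x) or math.isnan(y):
        return math.nan
    if x == 0.0 and y == 0.0:
        # Behave as if -0.0 < 0.0.
        return x if math.copysign(1.0, x) < 0 else y
    return x if x < y else y


def fmax_model(x, y):
    if math.isnan(x) or math.isnan(y):
        return math.nan
    if x == 0.0 and y == 0.0:
        return x if math.copysign(1.0, x) > 0 else y
    return x if x > y else y
```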

#### Rounding¶

These instructions round their argument to a nearby integral value, still represented as a floating point number.

a = ceil x

Round floating point to integral, towards positive infinity.

Arguments: x (Float) – A scalar or vector floating point number a (Float) – x rounded to integral value Float – inferred from x
a = floor x

Round floating point to integral, towards negative infinity.

Arguments: x (Float) – A scalar or vector floating point number a (Float) – x rounded to integral value Float – inferred from x
a = trunc x

Round floating point to integral, towards zero.

Arguments: x (Float) – A scalar or vector floating point number a (Float) – x rounded to integral value Float – inferred from x
a = nearest x

Round floating point to integral, towards nearest with ties to even.

Arguments: x (Float) – A scalar or vector floating point number a (Float) – x rounded to integral value Float – inferred from x
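
The four rounding directions differ only on non-integral inputs, as the Python sketch below shows for f64. It assumes that NaN and infinities pass through unchanged and that the sign of the input is kept on a zero result, as in IEEE 754 round-to-integral operations; the function names are hypothetical.

```python
import math


def _round_to_integral(x, rounding):
    if math.isnan(x) or math.isinf(x):
        return x
    # The result stays a float and keeps the sign of x (so -0.5 -> -0.0).
    return math.copysign(float(rounding(x)), x)


def ceil_model(x):
    return _round_to_integral(x, math.ceil)


def floor_model(x):
    return _round_to_integral(x, math.floor)


def trunc_model(x):
    return _round_to_integral(x, math.trunc)


def nearest_model(x):
    # Python's round() with one argument rounds ties to even.
    return _round_to_integral(x, round)
```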

### Conversion operations¶

a = bitcast x

Reinterpret the bits in x as a different type.

The input and output types must be storable to memory and of the same size. A bitcast is equivalent to storing one type and loading the other type from the same address.

Arguments: x (Mem) – Any type that can be stored in memory a (MemTo) – Bits of x reinterpreted MemTo – explicitly provided Mem – from input operand
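
The store-then-load description of bitcast can be mimicked with Python's `struct` module, which writes a value to a byte buffer in one format and reads it back in another. The sketch covers only the f32 and i32 pair, and the function names are hypothetical.

```python
import struct


def bitcast_f32_to_i32(x):
    """Store x as an f32, then load the same four bytes as a signed i32."""
    return struct.unpack("<i", struct.pack("<f", x))[0]


def bitcast_i32_to_f32(bits):
    """Store bits as a signed i32, then load the same bytes as an f32."""
    return struct.unpack("<f", struct.pack("<i", bits))[0]
```
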
a = breduce x

Convert x to a smaller boolean type in the platform-defined way.

The result type must have the same number of vector lanes as the input, and each lane must not have more bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments: x (Bool) – A scalar or vector boolean type a (BoolTo) – A smaller boolean type with the same number of lanes BoolTo – explicitly provided Bool – from input operand
a = bextend x

Convert x to a larger boolean type in the platform-defined way.

The result type must have the same number of vector lanes as the input, and each lane must not have fewer bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments: x (Bool) – A scalar or vector boolean type a (BoolTo) – A larger boolean type with the same number of lanes BoolTo – explicitly provided Bool – from input operand
a = bint x

Convert x to an integer.

True maps to 1 and false maps to 0. The result type must have the same number of vector lanes as the input.

Arguments: x (Bool) – A scalar or vector boolean type a (IntTo) – An integer type with the same number of lanes IntTo – explicitly provided Bool – from input operand
a = bmask x

Convert x to an integer mask.

True maps to all 1s and false maps to all 0s. The result type must have the same number of vector lanes as the input.

Arguments: x (Bool) – A scalar or vector boolean type a (IntTo) – An integer type with the same number of lanes IntTo – explicitly provided Bool – from input operand
a = ireduce x

Convert x to a smaller integer type by dropping high bits.

Each lane in x is converted to a smaller integer type by discarding the most significant bits. This is the same as reducing modulo $$2^n$$.

The result type must have the same number of vector lanes as the input, and each lane must not have more bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments: x (Int) – A scalar or vector integer type a (IntTo) – A smaller integer type with the same number of lanes IntTo – explicitly provided Int – from input operand
a = uextend x

Convert x to a larger integer type by zero-extending.

Each lane in x is converted to a larger integer type by adding zeroes. The result has the same numerical value as x when both are interpreted as unsigned integers.

The result type must have the same number of vector lanes as the input, and each lane must not have fewer bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments: x (Int) – A scalar or vector integer type a (IntTo) – A larger integer type with the same number of lanes IntTo – explicitly provided Int – from input operand
a = sextend x

Convert x to a larger integer type by sign-extending.

Each lane in x is converted to a larger integer type by replicating the sign bit. The result has the same numerical value as x when both are interpreted as signed integers.

The result type must have the same number of vector lanes as the input, and each lane must not have fewer bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments: x (Int) – A scalar or vector integer type a (IntTo) – A larger integer type with the same number of lanes IntTo – explicitly provided Int – from input operand
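The three integer width conversions map directly onto Rust's integer casts. The sketch below models scalar i32 to i8 reduction and i8 to i32 extension; values are passed as unsigned bit patterns so that the zero and sign extension results can be compared. The function names are hypothetical.

```rust
/// Model of ireduce.i8 on an i32 value: keep the low 8 bits,
/// which is the same as reducing modulo 2^8.
fn ireduce_i32_to_i8(x: u32) -> u8 {
    x as u8
}

/// Model of uextend.i32 on an i8 value: fill the new high bits with zeroes.
fn uextend_i8_to_i32(x: u8) -> u32 {
    x as u32
}

/// Model of sextend.i32 on an i8 value: replicate the sign bit
/// into the new high bits.
fn sextend_i8_to_i32(x: u8) -> u32 {
    (x as i8) as i32 as u32
}
```

For the bit pattern `0xF0`, zero extension gives `0x0000_00F0` (240 unsigned), while sign extension gives `0xFFFF_FFF0` (-16 signed).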
a = fpromote x

Convert x to a larger floating point format.

Each lane in x is converted to the destination floating point format. This is an exact operation.

Cranelift currently only supports two floating point formats - f32 and f64. This may change in the future.

The result type must have the same number of vector lanes as the input, and the result lanes must not have fewer bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments: x (Float) – A scalar or vector floating point number a (FloatTo) – A scalar or vector floating point number FloatTo – explicitly provided Float – from input operand
a = fdemote x

Convert x to a smaller floating point format.

Each lane in x is converted to the destination floating point format by rounding to nearest, ties to even.

Cranelift currently only supports two floating point formats - f32 and f64. This may change in the future.

The result type must have the same number of vector lanes as the input, and the result lanes must not have more bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments: x (Float) – A scalar or vector floating point number a (FloatTo) – A scalar or vector floating point number FloatTo – explicitly provided Float – from input operand
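The exactness of fpromote and the rounding of fdemote can be modeled with Rust's float casts, which also round to nearest with ties to even when narrowing. This is a sketch of the semantics for scalar values; the function names are hypothetical.

```rust
/// Model of fpromote.f64 on an f32 value: every f32 is exactly
/// representable as an f64, so no rounding takes place.
fn fpromote(x: f32) -> f64 {
    x as f64
}

/// Model of fdemote.f32 on an f64 value: round to the nearest f32,
/// breaking ties towards the value with an even mantissa.
fn fdemote(x: f64) -> f32 {
    x as f32
}
```

Promoting and then demoting returns the original value, but the reverse round trip generally does not: `fdemote(0.1)` is not the same number as the f64 value `0.1`.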
a = fcvt_to_uint x

Convert floating point to unsigned integer.

Each lane in x is converted to an unsigned integer by rounding towards zero. If x is NaN or if the unsigned integral value cannot be represented in the result type, this instruction traps.

The result type must have the same number of vector lanes as the input.

Arguments: x (Float) – A scalar or vector floating point number a (IntTo) – A larger integer type with the same number of lanes IntTo – explicitly provided Float – from input operand
a = fcvt_to_sint x

Convert floating point to signed integer.

Each lane in x is converted to a signed integer by rounding towards zero. If x is NaN or if the signed integral value cannot be represented in the result type, this instruction traps.

The result type must have the same number of vector lanes as the input.

Arguments: x (Float) – A scalar or vector floating point number a (IntTo) – A larger integer type with the same number of lanes IntTo – explicitly provided Float – from input operand
a = fcvt_to_uint_sat x

Convert floating point to unsigned integer as fcvt_to_uint does, but saturates the input instead of trapping. NaN and negative values are converted to 0.

Arguments: x (Float) – A scalar or vector floating point number a (IntTo) – A larger integer type with the same number of lanes IntTo – explicitly provided Float – from input operand
a = fcvt_to_sint_sat x

Convert floating point to signed integer as fcvt_to_sint does, but saturates the input instead of trapping. NaN values are converted to 0.

Arguments: x (Float) – A scalar or vector floating point number a (IntTo) – A larger integer type with the same number of lanes IntTo – explicitly provided Float – from input operand
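The trapping and saturating conversions differ only in how they treat NaN and out-of-range inputs. The Rust sketch below models them for an f64 input and 32-bit results. A trap is represented as `None`, which is a modeling choice rather than anything Cranelift does; the function names are hypothetical. It also assumes that an input such as -0.5, which rounds towards zero to 0, does not trap in the unsigned case.

```rust
/// Model of fcvt_to_uint.i32: round towards zero, "trap" (None) on NaN
/// or when the rounded value does not fit in an unsigned 32-bit integer.
fn fcvt_to_uint_u32(x: f64) -> Option<u32> {
    if x.is_nan() {
        return None;
    }
    let t = x.trunc(); // round towards zero
    if t < 0.0 || t > u32::MAX as f64 {
        return None;
    }
    Some(t as u32)
}

/// Model of fcvt_to_sint.i32, with the signed 32-bit range.
fn fcvt_to_sint_i32(x: f64) -> Option<i32> {
    if x.is_nan() {
        return None;
    }
    let t = x.trunc();
    if t < i32::MIN as f64 || t > i32::MAX as f64 {
        return None;
    }
    Some(t as i32)
}

/// Model of fcvt_to_uint_sat.i32. Rust's float-to-int cast already
/// saturates and maps NaN to 0, matching the description above.
fn fcvt_to_uint_sat_u32(x: f64) -> u32 {
    x as u32
}

/// Model of fcvt_to_sint_sat.i32.
fn fcvt_to_sint_sat_i32(x: f64) -> i32 {
    x as i32
}
```

An f64 input is used so that both 32-bit range limits are exactly representable in the comparison; with an f32 input the bounds would need more care.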
a = fcvt_from_uint x

Convert unsigned integer to floating point.

Each lane in x is interpreted as an unsigned integer and converted to floating point using round to nearest, ties to even.

The result type must have the same number of vector lanes as the input.

Arguments: x (Int) – A scalar or vector integer type a (FloatTo) – A scalar or vector floating point number FloatTo – explicitly provided Int – from input operand
a = fcvt_from_sint x

Convert signed integer to floating point.

Each lane in x is interpreted as a signed integer and converted to floating point using round to nearest, ties to even.

The result type must have the same number of vector lanes as the input.

Arguments: x (Int) – A scalar or vector integer type a (FloatTo) – A scalar or vector floating point number FloatTo – explicitly provided Int – from input operand
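The same bit pattern converts to very different floating point values depending on whether it is interpreted as signed or unsigned. The Rust sketch below models both conversions for a 32-bit input and an f32 result; the function names are hypothetical. Rust's integer-to-float casts also round to nearest with ties to even, so they serve as a reasonable model.

```rust
/// Model of fcvt_from_uint.f32 on a 32-bit integer: interpret the bits
/// as unsigned, then round to the nearest f32 (ties to even).
fn fcvt_from_uint_f32(bits: u32) -> f32 {
    bits as f32
}

/// Model of fcvt_from_sint.f32 on the same bits: interpret them as a
/// two's complement signed integer first.
fn fcvt_from_sint_f32(bits: u32) -> f32 {
    (bits as i32) as f32
}
```

An f32 has a 24-bit significand, so integers above 2^24 may be rounded. For instance, 16777217 lies exactly halfway between two representable values and rounds to 16777216.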

### Extending loads and truncating stores

Most ISAs provide instructions that load an integer value smaller than a register and extend it to the width of the register. Similarly, store instructions that only write the low bits of an integer register are common.

In addition to the normal load and store instructions, Cranelift provides extending loads and truncating stores for 8, 16, and 32-bit memory accesses.

These instructions succeed, trap, or have undefined behavior, under the same conditions as normal loads and stores.

a = uload8 MemFlags, p, Offset

Load 8 bits from memory at p + Offset and zero-extend.

This is equivalent to load.i8 followed by uextend.

Arguments: MemFlags (memflags) – Memory operation flags p (iAddr) – An integer address type Offset (offset32) – Byte offset from base address a (iExt8) – An integer type with more than 8 bits iExt8 – explicitly provided iAddr – from input operand
a = sload8 MemFlags, p, Offset

Load 8 bits from memory at p + Offset and sign-extend.

This is equivalent to load.i8 followed by sextend.

Arguments: MemFlags (memflags) – Memory operation flags p (iAddr) – An integer address type Offset (offset32) – Byte offset from base address a (iExt8) – An integer type with more than 8 bits iExt8 – explicitly provided iAddr – from input operand
istore8 MemFlags, x, p, Offset

Store the low 8 bits of x to memory at p + Offset.

This is equivalent to ireduce.i8 followed by store.i8.

Arguments: MemFlags (memflags) – Memory operation flags x (iExt8) – An integer type with more than 8 bits p (iAddr) – An integer address type Offset (offset32) – Byte offset from base address iExt8 – inferred from x iAddr – from input operand
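The 8-bit family can be modeled in Rust over a byte slice standing in for memory. This is a sketch under simplifying assumptions: the memory flags are ignored, the address and offset are plain unsigned indices rather than an address plus a signed 32-bit offset, and an out-of-bounds access panics instead of trapping. The 16-bit and 32-bit variants follow the same pattern with wider accesses.

```rust
/// Model of uload8.i32: load one byte and zero-extend it.
fn uload8(mem: &[u8], p: usize, offset: usize) -> u32 {
    mem[p + offset] as u32
}

/// Model of sload8.i32: load one byte and sign-extend it.
fn sload8(mem: &[u8], p: usize, offset: usize) -> u32 {
    (mem[p + offset] as i8) as i32 as u32
}

/// Model of istore8 on an i32 value: store only the low 8 bits of x.
fn istore8(mem: &mut [u8], x: u32, p: usize, offset: usize) {
    mem[p + offset] = x as u8;
}
```

Storing `0x1234_5680` with istore8 writes only the byte `0x80`; reading it back gives `0x0000_0080` with uload8 and `0xFFFF_FF80` with sload8.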
a = uload16 MemFlags, p, Offset

Load 16 bits from memory at p + Offset and zero-extend.

This is equivalent to load.i16 followed by uextend.

Arguments: MemFlags (memflags) – Memory operation flags p (iAddr) – An integer address type Offset (offset32) – Byte offset from base address a (iExt16) – An integer type with more than 16 bits iExt16 – explicitly provided iAddr – from input operand
a = sload16 MemFlags, p, Offset

Load 16 bits from memory at p + Offset and sign-extend.

This is equivalent to load.i16 followed by sextend.

Arguments: MemFlags (memflags) – Memory operation flags p (iAddr) – An integer address type Offset (offset32) – Byte offset from base address a (iExt16) – An integer type with more than 16 bits iExt16 – explicitly provided iAddr – from input operand
istore16 MemFlags, x, p, Offset

Store the low 16 bits of x to memory at p + Offset.

This is equivalent to ireduce.i16 followed by store.i16.

Arguments: MemFlags (memflags) – Memory operation flags x (iExt16) – An integer type with more than 16 bits p (iAddr) – An integer address type Offset (offset32) – Byte offset from base address iExt16 – inferred from x iAddr – from input operand
a = uload32 MemFlags, p, Offset

Load 32 bits from memory at p + Offset and zero-extend.

This is equivalent to load.i32 followed by uextend.

Arguments: MemFlags (memflags) – Memory operation flags p (iAddr) – An integer address type Offset (offset32) – Byte offset from base address a (iExt32) – An integer type with more than 32 bits iAddr – inferred from p
a = sload32 MemFlags, p, Offset

Load 32 bits from memory at p + Offset and sign-extend.

This is equivalent to load.i32 followed by sextend.

Arguments: MemFlags (memflags) – Memory operation flags p (iAddr) – An integer address type Offset (offset32) – Byte offset from base address a (iExt32) – An integer type with more than 32 bits iAddr – inferred from p
istore32 MemFlags, x, p, Offset

Store the low 32 bits of x to memory at p + Offset.

This is equivalent to ireduce.i32 followed by store.i32.

Arguments: MemFlags (memflags) – Memory operation flags x (iExt32) – An integer type with more than 32 bits p (iAddr) – An integer address type Offset (offset32) – Byte offset from base address iExt32 – inferred from x iAddr – from input operand

## ISA-specific instructions

Target ISAs can define supplemental instructions that do not make sense to support generally.

### x86

Instructions that can only be used by the x86 target ISA.

q, r = x86_sdivmodx nlo, nhi, d

Extended signed division.

Concatenate the bits in nhi and nlo to form the numerator. Interpret the bits as a signed number and divide by the signed denominator d. Trap when d is zero or if the quotient is outside the range of the output.

Return both quotient and remainder.

Arguments: nlo (iWord) – Low part of numerator nhi (iWord) – High part of numerator d (iWord) – Denominator q (iWord) – Quotient r (iWord) – Remainder iWord – inferred from nhi
q, r = x86_udivmodx nlo, nhi, d

Extended unsigned division.

Concatenate the bits in nhi and nlo to form the numerator. Interpret the bits as an unsigned number and divide by the unsigned denominator d. Trap when d is zero or if the quotient is larger than the range of the output.

Return both quotient and remainder.

Arguments: nlo (iWord) – Low part of numerator nhi (iWord) – High part of numerator d (iWord) – Denominator q (iWord) – Quotient r (iWord) – Remainder iWord – inferred from nhi
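Both extended divisions can be sketched in Rust for a 32-bit machine word, using a 64-bit integer for the concatenated numerator. A trap is modeled as `None`, and the function signatures are assumptions made for illustration.

```rust
/// Model of x86_udivmodx with 32-bit words. Returns (quotient, remainder),
/// or None where the instruction would trap.
fn x86_udivmodx(nlo: u32, nhi: u32, d: u32) -> Option<(u32, u32)> {
    if d == 0 {
        return None; // division by zero
    }
    let n = ((nhi as u64) << 32) | nlo as u64; // concatenate nhi:nlo
    let q = n / d as u64;
    let r = n % d as u64;
    if q > u32::MAX as u64 {
        return None; // quotient does not fit in one word
    }
    Some((q as u32, r as u32))
}

/// Model of x86_sdivmodx with 32-bit words. Division truncates towards
/// zero, so the remainder takes the sign of the numerator.
fn x86_sdivmodx(nlo: u32, nhi: u32, d: i32) -> Option<(i32, i32)> {
    let n = (((nhi as u64) << 32) | nlo as u64) as i64;
    // checked_div and checked_rem return None for d == 0 and on overflow.
    let q = n.checked_div(d as i64)?;
    let r = n.checked_rem(d as i64)?;
    if q < i32::MIN as i64 || q > i32::MAX as i64 {
        return None; // quotient outside the range of the output
    }
    Some((q as i32, r as i32))
}
```

With `nhi = 1` and `nlo = 0` the numerator is 2^32. Dividing by 2 gives a quotient of 2^31, which fits in an unsigned word but not in a signed one, so only the signed variant traps.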
a = x86_cvtt2si x

Convert with truncation floating point to signed integer.

The source floating point operand is converted to a signed integer by rounding towards zero. If the result can’t be represented in the output type, returns the smallest signed value the output type can represent.

This instruction does not trap.

Arguments: x (Float) – A scalar or vector floating point number a (IntTo) – An integer type with the same number of lanes IntTo – explicitly provided Float – from input operand
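The non-trapping behavior can be sketched in Rust for an f64 input and a 32-bit result. The function name is hypothetical. Treating NaN like any other unrepresentable result is an assumption based on how the underlying x86 conversion instructions behave; the text above only states the out-of-range case.

```rust
/// Model of x86_cvtt2si.i32: round towards zero, and return the smallest
/// signed value when the result is not representable. NaN fails both
/// comparisons and therefore also yields i32::MIN (assumption, see above).
fn x86_cvtt2si_i32(x: f64) -> i32 {
    let t = x.trunc();
    if t >= i32::MIN as f64 && t <= i32::MAX as f64 {
        t as i32
    } else {
        i32::MIN
    }
}
```

Unlike fcvt_to_sint_sat, a large positive input does not saturate to the largest value: `x86_cvtt2si_i32(1e10)` returns `i32::MIN`.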
a = x86_fmin x, y

Floating point minimum with x86 semantics.

This is equivalent to the C ternary operator x < y ? x : y which differs from fmin when either operand is NaN or when comparing +0.0 to -0.0.

When the two operands don’t compare as LT, y is returned unchanged, even if it is a signalling NaN.

Arguments: x (Float) – A scalar or vector floating point number y (Float) – A scalar or vector floating point number a (Float) – A scalar or vector floating point number Float – inferred from x
a = x86_fmax x, y

Floating point maximum with x86 semantics.

This is equivalent to the C ternary operator x > y ? x : y which differs from fmax when either operand is NaN or when comparing +0.0 to -0.0.

When the two operands don’t compare as GT, y is returned unchanged, even if it is a signalling NaN.

Arguments: x (Float) – A scalar or vector floating point number y (Float) – A scalar or vector floating point number a (Float) – A scalar or vector floating point number Float – inferred from x
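The ternary-operator semantics translate directly to Rust. The sketch below models the scalar f32 case; the function names mirror the instruction names but are otherwise hypothetical.

```rust
/// Model of x86_fmin: return x only when x < y compares true,
/// otherwise return y unchanged.
fn x86_fmin(x: f32, y: f32) -> f32 {
    if x < y { x } else { y }
}

/// Model of x86_fmax: return x only when x > y compares true,
/// otherwise return y unchanged.
fn x86_fmax(x: f32, y: f32) -> f32 {
    if x > y { x } else { y }
}
```

Because every comparison involving NaN is false, the result is asymmetric: `x86_fmin(NaN, 1.0)` returns 1.0, while `x86_fmin(1.0, NaN)` returns NaN. Likewise, +0.0 and -0.0 compare equal, so the second operand is returned regardless of its sign.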
y, rflags = x86_bsf x

Bit Scan Forwards – returns the bit-index of the least significant 1 in the word. Is otherwise identical to ‘bsr’, described below.

Arguments: x (iWord) – A scalar integer machine word y (iWord) – A scalar integer machine word rflags (iflags) – CPU flags representing the result of an integer comparison. These flags can be tested with an intcc condition code. iWord – inferred from x
y, rflags = x86_bsr x

Bit Scan Reverse – returns the bit-index of the most significant 1 in the word. Result is undefined if the argument is zero. However, it sets the Z flag depending on the argument, so it is at least easy to detect and handle that case.

This is polymorphic in i32 and i64. It is implemented for both i64 and i32 in 64-bit mode, and only for i32 in 32-bit mode.

Arguments: x (iWord) – A scalar integer machine word y (iWord) – A scalar integer machine word rflags (iflags) – CPU flags representing the result of an integer comparison. These flags can be tested with an intcc condition code. iWord – inferred from x
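Both bit scans can be sketched in Rust for a 32-bit word. The undefined result for a zero input is modeled as `None`, and the Z flag as a separate boolean; both are modeling choices, and the function signatures are hypothetical.

```rust
/// Model of x86_bsf on a 32-bit word: (index of the least significant
/// 1 bit, Z flag). The index is None when the input is zero.
fn x86_bsf(x: u32) -> (Option<u32>, bool) {
    if x == 0 {
        (None, true)
    } else {
        (Some(x.trailing_zeros()), false)
    }
}

/// Model of x86_bsr on a 32-bit word: (index of the most significant
/// 1 bit, Z flag).
fn x86_bsr(x: u32) -> (Option<u32>, bool) {
    if x == 0 {
        (None, true)
    } else {
        (Some(31 - x.leading_zeros()), false)
    }
}
```

For the input `0b1010_0000`, the forward scan finds bit 5 and the reverse scan finds bit 7.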
x86_push x

Pushes a value onto the stack.

Decrements the stack pointer and stores the specified value onto the top of the stack.

This is polymorphic in i32 and i64. However, it is only implemented for i64 in 64-bit mode, and only for i32 in 32-bit mode.

Arguments: x (iWord) – A scalar integer machine word iWord – inferred from x
x = x86_pop

Pops a value from the stack.

Loads a value from the top of the stack and then increments the stack pointer.

This is polymorphic in i32 and i64. However, it is only implemented for i64 in 64-bit mode, and only for i32 in 32-bit mode.

Results: x (iWord) – A scalar integer machine word iWord – explicitly provided

## Codegen implementation instructions

Frontends don’t need to emit the instructions in this section themselves; Cranelift will generate them automatically as needed.

### Legalization operations

These instructions are used as helpers when legalizing types and operations for the target ISA.

lo, hi = isplit x

Split an integer into low and high parts.

Vectors of integers are split lane-wise, so the results have the same number of lanes as the input, but the lanes are half the size.

Returns the low half of x and the high half of x as two independent values.

Arguments: x (WideInt) – An integer type with lanes from i16 upwards lo (half_width(WideInt)) – The low bits of x hi (half_width(WideInt)) – The high bits of x WideInt – inferred from x
a = iconcat lo, hi

Concatenate low and high bits to form a larger integer type.

Vectors of integers are concatenated lane-wise such that the result has the same number of lanes as the inputs, but the lanes are twice the size.

Arguments: lo (NarrowInt) – An integer type with lanes up to i32 hi (NarrowInt) – An integer type with lanes up to i32 a (double_width(NarrowInt)) – The concatenation of lo and hi NarrowInt – inferred from lo
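The two legalization helpers are inverses of each other. The Rust sketch below models the scalar case of splitting an i64 into two i32 halves and concatenating them again; the fixed widths are an assumption for illustration.

```rust
/// Model of isplit on an i64 value: returns (lo, hi).
fn isplit(x: u64) -> (u32, u32) {
    (x as u32, (x >> 32) as u32)
}

/// Model of iconcat on two i32 values: hi supplies the upper 32 bits
/// of the result and lo the lower 32 bits.
fn iconcat(lo: u32, hi: u32) -> u64 {
    ((hi as u64) << 32) | lo as u64
}
```

Splitting `0x1122_3344_5566_7788` gives `lo = 0x5566_7788` and `hi = 0x1122_3344`, and concatenating those halves restores the original value.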

### Special register operations

The prologue and epilogue of a function need to manipulate special registers like the stack pointer and the frame pointer. These instructions should not be used in regular code.

adjust_sp_down delta

Subtracts the delta offset value from the stack pointer register.

This instruction is used to adjust the stack pointer by a dynamic amount.

Arguments: delta (Int) – A scalar or vector integer type Int – inferred from delta
adjust_sp_up_imm Offset

Adds the immediate Offset value to the stack pointer register.

This instruction is used to adjust the stack pointer, primarily in function prologues and epilogues. Offset is constrained to the size of a signed 32-bit integer.

Arguments: Offset (imm64) – Offset from current stack pointer
adjust_sp_down_imm Offset

Subtracts the immediate Offset value from the stack pointer register.

This instruction is used to adjust the stack pointer, primarily in function prologues and epilogues. Offset is constrained to the size of a signed 32-bit integer.

Arguments: Offset (imm64) – Offset from current stack pointer
f = ifcmp_sp addr

Compare addr with the stack pointer and set the CPU flags.

This is like ifcmp where addr is the LHS operand and the stack pointer is the RHS.

Arguments: addr (iAddr) – An integer address type f (iflags) – CPU flags representing the result of an integer comparison. These flags can be tested with an intcc condition code. iAddr – inferred from addr
copy_special src, dst

Copies the contents of the src register to the dst register.

This instruction copies the contents of one register to another register without involving any SSA values. This is used for copying special registers, e.g. copying the stack register to the frame register in a function prologue.

Arguments: src (regunit) – A register unit in the target ISA dst (regunit) – A register unit in the target ISA

### Low-level control flow operations

fallthrough EBB(args…)

Fall through to the next EBB.

This is the same as jump, except the destination EBB must be the next one in the layout.

Jumps are turned into fall-through instructions by the branch relaxation pass. There is no reason to use this instruction outside that pass.

Arguments: EBB (ebb) – Destination extended basic block args (variable_args) – EBB arguments

### CPU flag operations

These operations are for working with the “flags” registers of some CPU architectures.

f = ifcmp x, y

Compare scalar integers and return flags.

Compare two scalar integer values and return integer CPU flags representing the result.

Arguments: x (iB) – A scalar integer type y (iB) – A scalar integer type f (iflags) – CPU flags representing the result of an integer comparison. These flags can be tested with an intcc condition code. iB – inferred from x
f = ifcmp_imm x, Y

Compare scalar integer to a constant and return flags.

Like icmp_imm, but returns integer CPU flags instead of testing a specific condition code.

Arguments: x (iB) – A scalar integer type Y (imm64) – A 64-bit immediate integer. f (iflags) – CPU flags representing the result of an integer comparison. These flags can be tested with an intcc condition code. iB – inferred from x
f = ffcmp x, y

Floating point comparison returning flags.

Compares two numbers like fcmp, but returns floating point CPU flags instead of testing a specific condition.

Arguments: x (Float) – A scalar or vector floating point number y (Float) – A scalar or vector floating point number f (fflags) – CPU flags representing the result of a floating point comparison. These flags can be tested with a floatcc condition code. Float – inferred from x
a = trueif Cond, f

Test integer CPU flags for a specific condition.

Check the CPU flags in f against the Cond condition code and return true when the condition code is satisfied.

Arguments: Cond (intcc) – An integer comparison condition code. f (iflags) – CPU flags representing the result of an integer comparison. These flags can be tested with an intcc condition code. a (b1) – A boolean type with 1 bit.
a = trueff Cond, f

Test floating point CPU flags for a specific condition.

Check the CPU flags in f against the Cond condition code and return true when the condition code is satisfied.

Arguments: Cond (floatcc) – A floating point comparison condition code. f (fflags) – CPU flags representing the result of a floating point comparison. These flags can be tested with a floatcc condition code. a (b1) – A boolean type with 1 bit.
trapif Cond, f, code

Trap when condition is true in integer CPU flags.

Arguments: Cond (intcc) – An integer comparison condition code. f (iflags) – CPU flags representing the result of an integer comparison. These flags can be tested with an intcc condition code. code (trapcode) – A trap reason code.
trapff Cond, f, code

Trap when condition is true in floating point CPU flags.

Arguments: Cond (floatcc) – A floating point comparison condition code. f (fflags) – CPU flags representing the result of a floating point comparison. These flags can be tested with a floatcc condition code. code (trapcode) – A trap reason code.
brif Cond, f, EBB(args…)

Branch when condition is true in integer CPU flags.

Arguments: Cond (intcc) – An integer comparison condition code. f (iflags) – CPU flags representing the result of an integer comparison. These flags can be tested with an intcc condition code. EBB (ebb) – Destination extended basic block args (variable_args) – EBB arguments
brff Cond, f, EBB(args…)

Branch when condition is true in floating point CPU flags.

Arguments: Cond (floatcc) – A floating point comparison condition code. f (fflags) – CPU flags representing the result of a floating point comparison. These flags can be tested with a floatcc condition code. EBB (ebb) – Destination extended basic block args (variable_args) – EBB arguments
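The split between producing flags once and testing them against several condition codes can be sketched in Rust. The model below assumes x86-style zero, carry, sign, and overflow flags computed from the subtraction x - y; the actual contents of an iflags value depend on the target ISA. The `IFlags` struct, the `IntCC` enum, and its variants are defined here for illustration only.

```rust
/// Hypothetical model of an iflags value, using x86-style flags.
#[derive(Clone, Copy, Debug, PartialEq)]
struct IFlags {
    zf: bool, // result was zero
    cf: bool, // unsigned borrow
    sf: bool, // result was negative
    of: bool, // signed overflow
}

/// A subset of integer condition codes: eq, ne, slt, sge, ult, uge.
#[derive(Clone, Copy, Debug)]
enum IntCC {
    Equal,
    NotEqual,
    SignedLessThan,
    SignedGreaterThanOrEqual,
    UnsignedLessThan,
    UnsignedGreaterThanOrEqual,
}

/// Model of ifcmp on 32-bit integers: compute the flags of x - y.
fn ifcmp(x: u32, y: u32) -> IFlags {
    let (diff, borrow) = x.overflowing_sub(y);
    let (_, signed_overflow) = (x as i32).overflowing_sub(y as i32);
    IFlags {
        zf: diff == 0,
        cf: borrow,
        sf: (diff as i32) < 0,
        of: signed_overflow,
    }
}

/// Model of trueif: test previously computed flags against a condition.
fn trueif(cond: IntCC, f: IFlags) -> bool {
    match cond {
        IntCC::Equal => f.zf,
        IntCC::NotEqual => !f.zf,
        IntCC::SignedLessThan => f.sf != f.of,
        IntCC::SignedGreaterThanOrEqual => f.sf == f.of,
        IntCC::UnsignedLessThan => f.cf,
        IntCC::UnsignedGreaterThanOrEqual => !f.cf,
    }
}
```

A single `ifcmp(0xFFFF_FFFF, 1)` answers both questions at once: the signed test reports -1 < 1, while the unsigned test reports that 4294967295 is not below 1. In this model, brif and trapif would evaluate the same predicate as trueif and then branch or trap instead of returning a boolean.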

### Live range splitting

Cranelift’s register allocator assigns each SSA value to a register or a spill slot on the stack for its entire live range. Since the live range of an SSA value can be quite large, it is sometimes beneficial to split the live range into smaller parts.

A live range is split by creating new SSA values that are copies of the original value or of each other. The copies are created by inserting copy, spill, or fill instructions, depending on whether the values are assigned to registers or stack slots.

This approach permits SSA form to be preserved throughout the register allocation pass and beyond.

a = copy x

Register-register copy.

This instruction copies its input, preserving the value type.

A pure SSA-form program does not need to copy values, but this instruction is useful for representing intermediate stages during instruction transformations, and the register allocator needs a way of representing register copies.

Arguments: x (Any) – Any integer, float, or boolean scalar or vector type a (Any) – Any integer, float, or boolean scalar or vector type Any – inferred from x
a = spill x

Spill a register value to a stack slot.

This instruction behaves exactly like copy, but the result value is assigned to a spill slot.

Arguments: x (Any) – Any integer, float, or boolean scalar or vector type a (Any) – Any integer, float, or boolean scalar or vector type Any – inferred from x
a = fill x

Load a register value from a stack slot.

This instruction behaves exactly like copy, but creates a new SSA value for the spilled input value.

Arguments: x (Any) – Any integer, float, or boolean scalar or vector type a (Any) – Any integer, float, or boolean scalar or vector type Any – inferred from x

Register values can be temporarily diverted to other registers by the regmove instruction, and to and from stack slots by regspill and regfill.

regmove x, src, dst

Temporarily divert x from src to dst.

This instruction moves the location of a value from one register to another without creating a new SSA value. It is used by the register allocator to temporarily rearrange register assignments in order to satisfy instruction constraints.

The register diversions created by this instruction must be undone before the value leaves the EBB. At the entry to a new EBB, all live values must be in their originally assigned registers.

Arguments: x (Any) – Any integer, float, or boolean scalar or vector type src (regunit) – A register unit in the target ISA dst (regunit) – A register unit in the target ISA Any – inferred from x
regspill x, src, SS

Temporarily divert x from src to SS.

This instruction moves the location of a value from a register to a stack slot without creating a new SSA value. It is used by the register allocator to temporarily rearrange register assignments in order to satisfy instruction constraints.

See also regmove.

Arguments: x (Any) – Any integer, float, or boolean scalar or vector type src (regunit) – A register unit in the target ISA SS (stack_slot) – A stack slot. Any – inferred from x
regfill x, SS, dst

Temporarily divert x from SS to dst.

This instruction moves the location of a value from a stack slot to a register without creating a new SSA value. It is used by the register allocator to temporarily rearrange register assignments in order to satisfy instruction constraints.

See also regmove.

Arguments: x (Any) – Any integer, float, or boolean scalar or vector type SS (stack_slot) – A stack slot. dst (regunit) – A register unit in the target ISA Any – inferred from x

## Instruction groups

All of the shared instructions are part of the base instruction group.

base.instructions.GROUP

Shared base instruction set

Target ISAs may define further instructions in their own instruction groups:

isa.x86.instructions.GROUP

x86-specific instruction set

## Implementation limits

Cranelift’s intermediate representation imposes some limits on the size of functions and the number of entities allowed. If these limits are exceeded, the implementation will panic.

Number of instructions in a function
At most $$2^{31} - 1$$.
Number of EBBs in a function

At most $$2^{31} - 1$$.

Every EBB needs at least a terminator instruction anyway.

Number of secondary values in a function

At most $$2^{31} - 1$$.

Secondary values are any SSA values that are not the first result of an instruction.

Other entities declared in the preamble

At most $$2^{32} - 1$$.

This covers things like stack slots, jump tables, external functions, and function signatures.

Number of arguments to an EBB
At most $$2^{16}$$.
Number of arguments to a function

At most $$2^{16}$$.

This follows from the limit on arguments to the entry EBB. Note that Cranelift may add a handful of ABI register arguments as function signatures are lowered. This is for representing things like the link register, the incoming frame pointer, and callee-saved registers that are saved in the prologue.

Size of function call arguments on the stack

At most $$2^{32} - 1$$ bytes.

This is probably not possible to achieve given the limit on the number of arguments, except by requiring extremely large offsets for stack arguments.

## Glossary

addressable
Memory in which loads and stores have defined behavior. They either succeed or trap, depending on whether the memory is accessible.
accessible
Addressable memory in which loads and stores always succeed without trapping, except where specified otherwise (e.g. with the aligned flag). Heaps, globals, tables, and the stack may contain accessible, merely addressable, and outright unaddressable regions. There may also be additional regions of addressable and/or accessible memory not explicitly declared.
basic block
A maximal sequence of instructions that can only be entered from the top, and that contains no branch or terminator instructions except for the last instruction.
entry block
The EBB that is executed first in a function. Currently, a Cranelift function must have exactly one entry block which must be the first block in the function. The types of the entry block arguments must match the types of arguments in the function signature.
extended basic block
EBB

A maximal sequence of instructions that can only be entered from the top, and that contains no terminator instructions except for the last one. An EBB can contain conditional branches that can fall through to the following instructions in the block, but only the first instruction in the EBB can be a branch target.

The last instruction in an EBB must be a terminator instruction, so execution cannot flow through to the next EBB in the function. (But there may be a branch to the next EBB.)

Note that some textbooks define an EBB as a maximal subtree in the control flow graph where only the root can be a join node. This definition is not equivalent to Cranelift EBBs.

EBB parameter
A formal parameter for an EBB is an SSA value that dominates everything in the EBB. For each parameter declared by an EBB, a corresponding argument value must be passed when branching to the EBB. The function’s entry EBB has parameters that correspond to the function’s parameters.
EBB argument
Similar to function arguments, EBB arguments must be provided when branching to an EBB that declares formal parameters. When execution begins at the top of an EBB, the formal parameters have the values of the arguments passed in the branch.
function signature

A function signature describes how to call a function. It consists of:

• The calling convention.
• The number of arguments and return values. (Functions can return multiple values.)
• Type and flags of each argument.
• Type and flags of each return value.

Not all function attributes are part of the signature. For example, a function that never returns could be marked as noreturn, but that is not necessary to know when calling it, so it is just an attribute, and not part of the signature.

function preamble

A list of declarations of entities that are used by the function body. Some of the entities that can be declared in the preamble are:

• Stack slots.
• Functions that are called directly.
• Function signatures for indirect function calls.
• Function flags and attributes that are not part of the signature.
function body
The extended basic blocks which contain all the executable code in a function. The function body follows the function preamble.
intermediate representation
IR
The language used to describe functions to Cranelift. This reference describes the syntax and semantics of Cranelift IR. The IR has two forms: a textual format and an in-memory data structure.
stack slot
A fixed size memory allocation in the current function’s activation frame. These include explicit stack slots and spill stack slots.
explicit stack slot
A fixed size memory allocation in the current function’s activation frame. These differ from spill stack slots in that they can be created by frontends and they may have their addresses taken.
spill stack slot
A fixed size memory allocation in the current function’s activation frame. These differ from explicit stack slots in that they are only created during register allocation, and they may not have their address taken.
terminator instruction

A control flow instruction that unconditionally directs the flow of execution somewhere else. Execution never continues at the instruction following a terminator instruction.

The basic terminator instructions are jump, return, and trap. Conditional branches and instructions that trap conditionally are not terminator instructions.

trap
traps
trapping
Terminates execution of the current thread. The specific behavior after a trap depends on the underlying OS. For example, a common behavior is delivery of a signal, with the specific signal depending on the event that triggered it.