chikaku


Go Assembly and ABI

The cover image is by Renee French and is licensed under the Creative Commons 4.0 Attributions license.

Registers#

| Plan9 | amd64 | Name | General purpose |
|---|---|---|---|
| AX | rax | Accumulator | Stores arithmetic operands and return values |
| BX | rbx | Base Register | Stores memory base addresses (structures or arrays) or pointers |
| CX | rcx | Count Register | Counting operations such as loop counters |
| DX | rdx | Data Register | Stores data such as multipliers and divisors |
| DI | rdi | Destination Index | Offset of the destination operand |
| SI | rsi | Source Index | Offset of the source operand |
| BP | rbp | Base Pointer | Holds the stack base address |
| SP | rsp | Stack Pointer | Holds the top-of-stack address |
| PC | rip | Program Counter | Program counter |
| R8-R14 | r8-r14 | | General registers |

Pseudo Registers#

| Name | Purpose |
|---|---|
| FP (Frame pointer) | Base address for parameters and local variables |
| PC (Program counter) | Program counter |
| SB (Static base pointer) | Base address for global variables |
| SP (Stack pointer) | Stack pointer (highest address of the current stack frame) |

All user-defined symbols are written as offsets from the pseudo registers FP (parameters and local variables) and SB (global variables).

The pseudo register SB is used to reference global variables such as:

  • foo(SB) indicates the memory address of the global variable foo
  • foo<>(SB) indicates that the global variable foo is only visible in the current file
  • foo+4(SB) indicates the memory base address of foo plus an offset of four bytes

The pseudo register FP is a virtual frame pointer used to reference function parameters. The compiler maintains it and addresses each parameter of the current function as FP plus an offset. When a parameter is referenced this way, a name must be placed in front of the offset: the assembler enforces this and rejects a bare 0(FP) or 8(FP). The name has no semantic effect, but it documents which parameter is meant and makes the code easier to read. On a 64-bit machine, for example:

  • first_arg+0(FP) refers to the first parameter of the current function
  • second_arg+8(FP) refers to the second parameter of the current function (the first parameter occupies 8 bytes)

Note: FP is always a pseudo register, never a hardware register, even on architectures that have a hardware frame pointer.
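The offset rule above can be sketched in Go. This is only an illustration: the field type, the fpOffsets helper, and the add prototype are hypothetical, and the layout assumes the stack-based calling convention on a 64-bit target, where parameters are placed like struct fields and return values start on a pointer-size boundary.

```go
package main

import "fmt"

// field describes one parameter or return value: name, size and alignment in bytes.
type field struct {
	name        string
	size, align int
}

// alignUp rounds off up to the next multiple of align.
func alignUp(off, align int) int { return (off + align - 1) / align * align }

// fpOffsets returns a name+off(FP) reference for every parameter and result.
// Assumption: stack-based convention on a 64-bit target (8-byte pointers).
func fpOffsets(args, results []field) []string {
	var refs []string
	off := 0
	for _, a := range args {
		off = alignUp(off, a.align)
		refs = append(refs, fmt.Sprintf("%s+%d(FP)", a.name, off))
		off += a.size
	}
	off = alignUp(off, 8) // results begin on a pointer-size boundary
	for _, r := range results {
		off = alignUp(off, r.align)
		refs = append(refs, fmt.Sprintf("%s+%d(FP)", r.name, off))
		off += r.size
	}
	return refs
}

func main() {
	// Hypothetical prototype: func add(a int64, b int32, c int64) int64
	refs := fpOffsets(
		[]field{{"a", 8, 8}, {"b", 4, 4}, {"c", 8, 8}},
		[]field{{"ret", 8, 8}},
	)
	for _, r := range refs {
		fmt.Println(r)
	}
}
```

Note how c lands at offset 16 rather than 12, because the 4-byte b is followed by alignment padding.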

The pseudo register SP is a virtual stack pointer used to access local variables and the parameters being prepared for function calls within the current stack frame. It points to the highest address of the current stack frame, so offsets must be negative and fall within the range [−framesize, 0), for example x-8(SP) and y-4(SP). On architectures that also have a hardware register named SP, the presence of a symbol name prefix decides which one is meant, and the two forms refer to different memory locations:

  • x-8(SP) uses the pseudo register SP, because it has a symbol name prefix
  • -8(SP) uses the hardware register SP, because it has no symbol name prefix

Symbol Definition#

In Go's object files and binaries, the complete symbol name consists of the package path followed by a dot and a symbol name, such as math/rand.Int. Because the assembler's parser treats the slash and the dot as punctuation, these names cannot be written directly in assembly source. Instead, the name is written as math∕rand·Int, using the division slash U+2215 and the middle dot U+00B7, and the assembler rewrites them to a plain slash and dot. In handwritten assembly it is usually unnecessary to include the full package path: during linking, the linker inserts the package path of the current object file in front of every symbol that starts with a dot, so it is sufficient to define a symbol name like ·Int.
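As a small illustration of the character substitution, the following Go sketch converts between the two spellings. The helper names asmSpelling and linkerSpelling are made up for this example, and the sketch ignores any further escaping the toolchain applies to unusual import paths.

```go
package main

import (
	"fmt"
	"strings"
)

var (
	// "/" becomes the division slash U+2215 and "." becomes the middle dot U+00B7.
	toAsm    = strings.NewReplacer("/", "\u2215", ".", "\u00b7")
	toLinker = strings.NewReplacer("\u2215", "/", "\u00b7", ".")
)

// asmSpelling returns the form of a symbol name that is typed in a .s file.
func asmSpelling(sym string) string { return toAsm.Replace(sym) }

// linkerSpelling returns the form that appears in object files and binaries.
func linkerSpelling(sym string) string { return toLinker.Replace(sym) }

func main() {
	fmt.Println(asmSpelling("math/rand.Int"))
	fmt.Println(linkerSpelling("\u00b7Int")) // the linker later prepends the package path
}
```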

The assembler uses directives to bind code or data to symbols, as the following two sections describe.

Function Symbol Definition#

Functions (code segments) are defined using the TEXT directive, such as:

TEXT runtime·profileloop(SB),NOSPLIT,$24-8

  • runtime: the package name; it can be omitted, leaving ·profileloop
  • profileloop(SB): the function name; functions are global symbols, so they are referenced through SB
  • NOSPLIT: a flag for the directive, described below
  • $24-8: the function has a 24-byte stack frame and is called with 8 bytes of parameters and return values, which live in the caller's frame; the argument size must be provided if NOSPLIT is not specified
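To make the frame annotation concrete, here is a minimal Go sketch that splits a string such as $24-8 into its two numbers. The parseFrameSpec function is hypothetical and handles only the simple non-negative form shown above; it is not how the assembler itself parses the directive.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseFrameSpec splits a TEXT frame annotation such as "$24-8" into the local
// frame size and the argument size (parameters plus return values), in bytes.
// When the "-argsize" part is absent, as in "$0", args is reported as 0.
func parseFrameSpec(spec string) (frame, args int, err error) {
	if !strings.HasPrefix(spec, "$") {
		return 0, 0, fmt.Errorf("frame spec %q must start with $", spec)
	}
	frameStr, argStr, hasArgs := strings.Cut(spec[1:], "-")
	if frame, err = strconv.Atoi(frameStr); err != nil {
		return 0, 0, err
	}
	if hasArgs {
		if args, err = strconv.Atoi(argStr); err != nil {
			return 0, 0, err
		}
	}
	return frame, args, nil
}

func main() {
	frame, args, err := parseFrameSpec("$24-8")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("frame=%d bytes, arguments=%d bytes\n", frame, args)
}
```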

Global Data Symbol Definition#

Global data is defined through a set of DATA directives followed by a GLOBL directive. The format of the DATA directive is:

DATA symbol+offset(SB)/width, value

This initializes width bytes of memory at the given offset within symbol to value. The DATA directives for one symbol must be written with increasing offsets. The GLOBL directive declares the symbol as global; it takes the symbol name, optional flags, and the size of the data, and it must follow the corresponding DATA directives. Any part of the symbol that no DATA directive initializes is zero, such as:

DATA divtab<>+0x00(SB)/4, $0xf4f8fcff
DATA divtab<>+0x04(SB)/4, $0xe6eaedf0
...
DATA divtab<>+0x3c(SB)/4, $0x81828384
GLOBL divtab<>(SB), RODATA, $64

GLOBL runtime·tlsoffset(SB), NOPTR, $4

The above code declares and initializes divtab<>, a 64-byte read-only table of 4-byte integer values, and declares runtime·tlsoffset, a 4-byte global variable that is implicitly initialized to 0. The NOPTR flag on the latter states that the data contains no pointers.
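The effect of these directives on memory can be modeled in Go. The sketch below is an analogy only: the data type and buildSymbol helper are hypothetical, a little-endian target such as amd64 is assumed, and the entries elided in the listing are simply left at zero.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// data mimics one "DATA sym+offset(SB)/4, $value" directive.
type data struct {
	offset int
	value  uint32
}

// buildSymbol mimics GLOBL (reserve size zeroed bytes) combined with the DATA
// directives (store each 4-byte value at its offset).
// Assumption: little-endian byte order.
func buildSymbol(size int, inits []data) []byte {
	mem := make([]byte, size)
	for _, d := range inits {
		binary.LittleEndian.PutUint32(mem[d.offset:], d.value)
	}
	return mem
}

func main() {
	divtab := buildSymbol(64, []data{
		{0x00, 0xf4f8fcff},
		{0x04, 0xe6eaedf0},
		// entries elided in the listing stay zero in this sketch
		{0x3c, 0x81828384},
	})
	tlsoffset := buildSymbol(4, nil) // no DATA directive: all zero

	fmt.Printf("divtab[0:8]   = % x\n", divtab[:8])
	fmt.Printf("divtab[60:64] = % x\n", divtab[60:])
	fmt.Printf("tlsoffset     = % x\n", tlsoffset)
}
```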

Parameters of Symbol Definition#

A directive can take one or two arguments after the symbol. If there are two, the first is a bit mask of flags, which can be written as a numeric expression or as symbolic names combined together. The symbolic names are defined in the standard header textflag.h and become available through #include "textflag.h". The flags are as follows:

  • DUPOK: Allows multiple identical symbols in the binary; the linker will choose one of them
  • NOSPLIT: Used for TEXT directives, marking that stack overflow checks do not need to be inserted
  • RODATA: Used for DATA and GLOBL directives, placing data in the read-only segment
  • NOPTR: Used for DATA and GLOBL directives, marking that the data contains no pointers and does not need to be scanned by the GC
  • WRAPPER: Used for TEXT directives, marking that the function is a wrapper and should not count as disabling recover; see the source code in src/debug/gosym/pclntab.go
  • NEEDCTXT: Used for TEXT directives, marking that the function is a closure and uses its incoming context register
  • TLSBSS: Used for DATA and GLOBL directives, allocating a word of thread-local storage and storing its offset in this variable
  • NOFRAME: Used for TEXT directives, marking that no instructions to allocate a stack frame are inserted; only valid for functions that declare a frame size of 0
  • REFLECTMETHOD: Marks that the function can call reflect.Type.Method/reflect.Type.MethodByName
  • TOPFRAME: Used for TEXT directives, marking this function as the top of the call stack, stack unwinding should stop here
  • ABIWRAPPER: Used for TEXT directives, marking this function as an ABI wrapper
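Since the flags form a bit mask, they combine with bitwise OR. The Go sketch below mirrors the numeric values defined in textflag.h for the flags listed above; the describe helper is hypothetical and exists only to show how a mask decomposes.

```go
package main

import (
	"fmt"
	"strings"
)

// Flag values as defined in textflag.h.
const (
	DUPOK         = 2
	NOSPLIT       = 4
	RODATA        = 8
	NOPTR         = 16
	WRAPPER       = 32
	NEEDCTXT      = 64
	TLSBSS        = 256
	NOFRAME       = 512
	REFLECTMETHOD = 1024
	TOPFRAME      = 2048
	ABIWRAPPER    = 4096
)

var flagNames = []struct {
	bit  int
	name string
}{
	{DUPOK, "DUPOK"}, {NOSPLIT, "NOSPLIT"}, {RODATA, "RODATA"}, {NOPTR, "NOPTR"},
	{WRAPPER, "WRAPPER"}, {NEEDCTXT, "NEEDCTXT"}, {TLSBSS, "TLSBSS"},
	{NOFRAME, "NOFRAME"}, {REFLECTMETHOD, "REFLECTMETHOD"},
	{TOPFRAME, "TOPFRAME"}, {ABIWRAPPER, "ABIWRAPPER"},
}

// describe lists the names of all flags set in mask, joined with "|".
func describe(mask int) string {
	var set []string
	for _, f := range flagNames {
		if mask&f.bit != 0 {
			set = append(set, f.name)
		}
	}
	return strings.Join(set, "|")
}

func main() {
	mask := NOSPLIT | NOFRAME // what "NOSPLIT|NOFRAME" in a TEXT directive evaluates to
	fmt.Println(mask, describe(mask))
	fmt.Println(describe(RODATA | NOPTR))
}
```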

Using Go Types and Constants in Assembly#

If a package contains .s files, go build directs the compiler to emit a special header file, go_asm.h, which the .s files can then #include. It defines symbolic constants for struct field offsets, struct type sizes, and most constants declared in the current package, so assembly code can refer to Go types without hard-coding their layout. The names in go_asm.h take the following forms:

  • Constants: const_name
  • Struct field offsets: type_field
  • Struct sizes: type__size

const bufSize = 1024

type reader struct {
    buf [bufSize]byte
    r   int
}

Using the above code as an example, in assembly code, you can:

  • Use const_bufSize to access the constant bufSize
  • Use reader__size to get the size of the struct reader
  • Use reader_buf and reader_r to get the offsets of fields buf and r. If R1 contains a pointer to a reader, you can access the two fields using reader_buf(R1) and reader_r(R1).
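The numbers behind these names can be checked from Go with the unsafe package. The sketch below prints lines in the style of go_asm.h; the layout helper is hypothetical, and the exact text the compiler generates may differ.

```go
package main

import (
	"fmt"
	"unsafe"
)

const bufSize = 1024

type reader struct {
	buf [bufSize]byte
	r   int
}

// layout returns the size of reader and the offsets of its two fields,
// which are the values behind reader__size, reader_buf and reader_r.
func layout() (size, bufOff, rOff uintptr) {
	var rd reader
	return unsafe.Sizeof(rd), unsafe.Offsetof(rd.buf), unsafe.Offsetof(rd.r)
}

func main() {
	size, bufOff, rOff := layout()
	fmt.Printf("#define const_bufSize %d\n", bufSize)
	fmt.Printf("#define reader__size %d\n", size)
	fmt.Printf("#define reader_buf %d\n", bufOff)
	fmt.Printf("#define reader_r %d\n", rOff)
}
```

The size depends on the width of int on the target, which is one reason to use the generated constants instead of literal numbers.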

Runtime#

To garbage collect correctly, the runtime must know the location of every pointer in stack frames and global variables. The compiler emits this information automatically when compiling Go code, but assembly code must state it explicitly. A data symbol with the NOPTR flag is treated as containing no pointers to runtime-allocated data. A symbol with the RODATA flag is allocated in read-only memory and is therefore implicitly NOPTR. A data symbol whose total size is smaller than a pointer is also implicitly NOPTR, since it cannot hold one. It is not possible to define a symbol containing pointers in assembly code; such a symbol must be defined in Go code, and assembly code can then refer to it by name. As a rule of thumb, define all non-read-only symbols in Go rather than in assembly.

Each function also needs annotations giving the location of live pointers in its parameters, return values, and local stack frame. For an assembly function that has no pointer return values and either no local stack frame or no function calls, it is sufficient to define the Go function prototype (signature) in a Go source file in the same package. More complex cases need explicit annotations, which use pseudo-instructions defined in the funcdata.h header file. A function with no parameters and no return values (annotated as $n-0 in the TEXT directive) can omit the pointer information. Otherwise, the pointer information must be provided through a function prototype in Go code, even for assembly functions that are never called directly from Go.

At the beginning of a function, the parameters are assumed to be initialized and the return values are assumed to be uninitialized. If the return values will hold live pointers during a CALL instruction, the function should start by zeroing the return values and then execute the GO_RESULTS_INITIALIZED pseudo-instruction, which records that the return values are now initialized and should be scanned during stack movement (growth) and GC. It is usually easier to arrange for assembly functions not to return pointers or not to contain CALL instructions; no assembly function in the standard library uses GO_RESULTS_INITIALIZED.

If a function has no local stack frame (declared as $0-n in the TEXT directive) or contains no CALL instructions, the pointer information for the local frame can be omitted. Otherwise, the local stack frame must not contain pointers, and the assembly code must confirm this by executing the pseudo-instruction NO_LOCAL_POINTERS. Because the stack grows and shrinks by being copied to a new location, the stack pointer may change during any function call, so even pointers to stack data must not be kept in local variables.

Assembly functions should always provide Go prototypes, as this can provide pointer information for parameters and return values and allow go vet to check the correctness of offset usage.

Memory Layout#

The sizes and alignment of built-in basic types in Go, as well as the calculation of field offsets in composite types (structures), can be found in the ABI documentation Memory layout. For other types:

  • The memory layout of map/chan/func types is equivalent to *T
  • The memory layout of array types [N]T consists of contiguous memory made up of N T types
  • The string type in memory consists of two parts, in this order: a pointer *[len]byte to the backing bytes of the string and an int giving the byte length of the string
  • The slice type []T in memory consists of three parts, in this order: a pointer *[cap]T to the backing array, an int giving the length of the slice, and an int giving the capacity of the slice
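The word order of the string and slice headers can be observed from Go by reinterpreting them with unsafe. This is a sketch that depends on the layout used by the standard gc toolchain; the helper names stringWords and sliceWords are made up for this example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringWords reinterprets a string header as two machine words:
// word 0 is the data pointer, word 1 is the length.
func stringWords(s string) [2]uintptr {
	return *(*[2]uintptr)(unsafe.Pointer(&s))
}

// sliceWords reinterprets a slice header as three machine words:
// word 0 is the data pointer, word 1 is the length, word 2 is the capacity.
func sliceWords(b []byte) [3]uintptr {
	return *(*[3]uintptr)(unsafe.Pointer(&b))
}

func main() {
	s := "hello"
	b := make([]byte, 3, 8)

	sw := stringWords(s)
	bw := sliceWords(b)

	fmt.Printf("string: ptr=%#x len=%d\n", sw[0], sw[1])
	fmt.Printf("slice:  ptr=%#x len=%d cap=%d\n", bw[0], bw[1], bw[2])
}
```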

A struct is laid out as its fields in declaration order, with alignment padding where needed. For a struct type struct { f1 t1; ...; fM tM }, the memory sequence is t1, ..., tM, tP, where tP is a single padding byte that is added only when the last field tM has zero size and at least one earlier field ti has non-zero size; otherwise tP is empty. The address of a zero-sized field is the address of whatever follows it, as the experiment below shows: each zero-sized field reports the same address as the next non-zero-sized field. For a zero-sized final field, that address would lie one past the end of the struct, so the padding byte ensures that a pointer to the field still points into the struct's own memory.

type S struct { // 0xc00034c000
    A struct{}  // 0xc00034c000
    B int       // 0xc00034c000
    C struct{}  // 0xc00034c008
    D struct{}  // 0xc00034c008
    E int       // 0xc00034c008
    F struct{}  // 0xc00034c010
}
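The padding byte is visible in the struct size. The following Go sketch compares two hypothetical structs that differ only in where the zero-sized field sits; the type and function names are illustrative.

```go
package main

import (
	"fmt"
	"unsafe"
)

// trailing ends with a zero-sized field, so a padding byte is added after it
// (and the size is then rounded up to the struct's alignment).
type trailing struct {
	B int
	F struct{}
}

// leading has its zero-sized field first, so no padding byte is needed.
type leading struct {
	F struct{}
	B int
}

// sizes returns the two struct sizes and the offset of trailing.F.
func sizes() (trailingSize, leadingSize, fOffset uintptr) {
	var t trailing
	return unsafe.Sizeof(t), unsafe.Sizeof(leading{}), unsafe.Offsetof(t.F)
}

func main() {
	ts, ls, off := sizes()
	fmt.Println("sizeof(trailing) =", ts)
	fmt.Println("sizeof(leading)  =", ls)
	fmt.Println("offsetof(trailing.F) =", off)
}
```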

The empty interface type interface{} is represented by runtime.eface, which consists of the following parts:

  • A pointer to the runtime dynamic data type description
  • A pointer to the runtime dynamic data value of type unsafe.Pointer

Non-empty interface types consist of the following parts:

  • A pointer to runtime.itab, which contains:
    • A pointer to runtime.interfacetype describing the interface, together with the method pointers for this interface
    • A pointer to the runtime dynamic data type description
  • A pointer to the runtime dynamic data value of type unsafe.Pointer

Interface values can be direct or indirect:

  • A direct interface stores the value itself in the data field
  • An indirect interface stores a pointer to the value in the data field
  • An interface can be direct only if the value consists of a single pointer word
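The difference between direct and indirect interfaces can be observed by reading the data word of an empty interface. In the sketch below, the eface struct is a hypothetical mirror of the two-word layout described above (its field names are illustrative), and the behavior shown is that of the standard gc toolchain.

```go
package main

import (
	"fmt"
	"unsafe"
)

// eface mirrors the two-word layout of an empty interface value.
// Assumption: type word first, data word second.
type eface struct {
	typ  unsafe.Pointer
	data unsafe.Pointer
}

// dataWord returns the data field of the interface value v.
func dataWord(v any) unsafe.Pointer {
	return (*eface)(unsafe.Pointer(&v)).data
}

// isDirect reports whether an interface holding the pointer p stores p itself.
func isDirect(p *int64) bool {
	return dataWord(p) == unsafe.Pointer(p)
}

// indirectValue boxes n in an interface and reads it back through the data
// word, which for a non-pointer value points at a copy of n.
func indirectValue(n int64) int64 {
	return *(*int64)(dataWord(n))
}

func main() {
	x := int64(7)
	fmt.Println("pointer stored directly:", isDirect(&x))
	fmt.Println("int64 read through data pointer:", indirectValue(42))
}
```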

The above describes the memory layout structure of all Go types, but when writing assembly functions, one should not rely on these rules but rather reference the constants defined in the go_asm.h header file.

Parameter and Return Value Passing in Function Calls#

Function calls pass parameters and return values using a combination of the stack and hardware registers. Each parameter or return value is passed either entirely in registers (a single value may occupy several registers) or entirely on the stack. Because registers are generally faster to access than memory, registers are preferred; a value is passed on the stack only when the remaining registers cannot hold all of it or when it contains a non-trivial array (one with more than one element).

Each architecture defines a sequence of integer registers and a sequence of floating-point registers. At a high level, every parameter and return value is decomposed into basic types, and each basic type is assigned to the next available register in order. Parameters and return values can share the same registers, but they do not share the same stack space. The caller also reserves spill space on the stack for all register-assigned parameters, but does not fill it. The exact algorithm for assigning parameters and return values to registers or to the stack is fairly involved; refer to Function call argument and result passing.

Before making a call, the caller reserves space in its stack frame for the receiver (for method calls), the stack-assigned parameters, the stack-assigned return values, and the register spill space. It then stores the parameter values in their assigned registers or stack slots and performs the call. At the time of the call, the return value stack slots, the spill space, and the return value registers are uninitialized; before returning, the callee must store the return values in the registers or stack slots that the algorithm assigned to them. Because there are no callee-save registers, a call may overwrite any register that has no fixed meaning, including parameter registers.

In a 64-bit architecture with integer registers R0-R9, the function f signature and its call stack space are as follows:

func f(a1 uint8, a2 [2]uintptr, a3 uint8) (
    r1 struct { x uintptr; y [2]uintptr },
    r2 string,
)

// Stack space layout
// a2      [2]uintptr
// r1.x    uintptr
// r1.y    [2]uintptr
// a1Spill uint8
// a3Spill uint8
// _       [6]uint8  // alignment padding

Because a2 and r1 contain arrays with more than one element, they are assigned to the stack, while the other parameters and return values are assigned to registers. r2 is decomposed into its two components, which are assigned to registers independently. On entry, a1 is in register R0, a3 is in register R1, and a2 is on the stack. On return, r2.base is in register R0, r2.len is in register R1, and r1.x and r1.y are on the stack.
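The assignment for f can be reproduced with a simplified model of the rules above. The Go sketch below is not the real algorithm: it assumes a 64-bit target, uses a hypothetical register file (R0-R9 and F0-F9), ignores zero-sized values and stack layout, and the helper names regsNeeded, assign, and signature are made up for this example.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Hypothetical register file: R0-R9 as in the text, plus F0-F9 for floats.
const numIntRegs, numFloatRegs = 10, 10

// regsNeeded counts the integer and floating-point registers a value of type t
// would occupy. ok is false when t contains an array with more than one
// element, which forces the whole value onto the stack.
func regsNeeded(t reflect.Type) (ints, floats int, ok bool) {
	switch t.Kind() {
	case reflect.Float32, reflect.Float64:
		return 0, 1, true
	case reflect.Complex64, reflect.Complex128:
		return 0, 2, true
	case reflect.String, reflect.Interface:
		return 2, 0, true
	case reflect.Slice:
		return 3, 0, true
	case reflect.Struct:
		for i := 0; i < t.NumField(); i++ {
			fi, ff, fok := regsNeeded(t.Field(i).Type)
			if !fok {
				return 0, 0, false
			}
			ints, floats = ints+fi, floats+ff
		}
		return ints, floats, true
	case reflect.Array:
		switch t.Len() {
		case 0:
			return 0, 0, true
		case 1:
			return regsNeeded(t.Elem())
		}
		return 0, 0, false
	default: // bool, integers, pointers, maps, channels, funcs
		return 1, 0, true
	}
}

// assign places each value in registers if it fits entirely, else on the stack.
func assign(types []reflect.Type) []string {
	nextInt, nextFloat := 0, 0
	out := make([]string, len(types))
	for i, t := range types {
		ints, floats, ok := regsNeeded(t)
		if !ok || nextInt+ints > numIntRegs || nextFloat+floats > numFloatRegs {
			out[i] = "stack"
			continue
		}
		var regs []string
		for j := 0; j < ints; j++ {
			regs = append(regs, fmt.Sprintf("R%d", nextInt+j))
		}
		for j := 0; j < floats; j++ {
			regs = append(regs, fmt.Sprintf("F%d", nextFloat+j))
		}
		nextInt, nextFloat = nextInt+ints, nextFloat+floats
		out[i] = strings.Join(regs, ",")
	}
	return out
}

// signature splits a function's type into parameter and result types.
func signature(fn any) (args, results []reflect.Type) {
	t := reflect.TypeOf(fn)
	for i := 0; i < t.NumIn(); i++ {
		args = append(args, t.In(i))
	}
	for i := 0; i < t.NumOut(); i++ {
		results = append(results, t.Out(i))
	}
	return args, results
}

func f(a1 uint8, a2 [2]uintptr, a3 uint8) (
	r1 struct {
		x uintptr
		y [2]uintptr
	},
	r2 string,
) {
	return
}

func main() {
	args, results := signature(f)
	fmt.Println("parameters:   ", assign(args))
	fmt.Println("return values:", assign(results))
}
```

Register numbering restarts at zero for the return values, which is why r2 can reuse R0 and R1.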

Closures#

A function value such as var f func() is a pointer to a closure object. The closure object begins with the entry address of the closure function, followed by zero or more bytes holding the captured environment. Closure calls follow essentially the same rules as calls to static functions, with one addition: each architecture designates a special closure context register, and the caller stores the pointer to the closure object in it before calling the closure. The called function can then reach its captured variables through this register.
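The shape of a closure object can be inspected from Go. The sketch below follows a function value to its closure object and reads the first word, then compares it with the code pointer reported by reflect. It depends on the representation used by the standard gc toolchain, and the counter and entryPC names are hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// counter returns a closure that captures start.
//
//go:noinline
func counter(start int) func() int {
	return func() int {
		start++
		return start
	}
}

// entryPC treats fn as a pointer to a closure object and returns the object's
// first word, which is the entry address of the closure's code.
func entryPC(fn func() int) uintptr {
	closure := *(*unsafe.Pointer)(unsafe.Pointer(&fn))
	return *(*uintptr)(closure)
}

// matchesReflect reports whether entryPC agrees with reflect's code pointer.
func matchesReflect(fn func() int) bool {
	return entryPC(fn) == reflect.ValueOf(fn).Pointer()
}

func main() {
	a, b := counter(0), counter(10)
	fmt.Println("first word is the code pointer:", matchesReflect(a))
	// Each closure object carries its own captured state.
	fmt.Println(a(), a(), b())
}
```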

Common Instructions#

// TODO

