infix
A JIT-Powered FFI Library for C
This document provides a deep dive into the architecture and internal workings of `infix`. It is intended for maintainers, contributors, and advanced users who wish to understand the library's design philosophy, core mechanics, security features, and ABI implementations.
The architecture of `infix` is the result of a series of deliberate design choices aimed at balancing performance, security, and developer ergonomics.
Three high-level principles guide the library's development:
`infix` is designed to be built as a single translation unit. The top-level `src/infix.c` file simply `#include`s all other core `.c` files.
- Users can add `src/infix.c` and the `include` directory to their project, and it will build without complex makefiles.
- Because internal functions are declared `static`, we avoid polluting the global namespace. The `trampoline.c` file is key, as it includes the ABI-specific `.c` files directly, ensuring their internal functions remain private.

Both `infix_forward_t` and `infix_reverse_t` are designed as self-contained objects. When a trampoline is created, it performs a deep copy of all the `infix_type` metadata it needs into its own private, internal memory arena.
The low-level, "manual" API for creating `infix_type` objects is exclusively arena-based.
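To make the arena and deep-copy ideas above concrete, here is a minimal sketch of a bump arena and a recursive type copy. Everything named `demo_*` is hypothetical and written for illustration only; it is not `infix`'s arena or its `infix_type` layout, and it copies a tree of types rather than handling shared or cyclic nodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bump arena: one block, released all at once. */
typedef struct {
    uint8_t *base;
    size_t capacity;
    size_t used;
} demo_arena;

/* Hypothetical, simplified type node (not the real infix_type). */
typedef struct demo_type {
    size_t size;
    size_t member_count;
    struct demo_type **members;
} demo_type;

int demo_arena_init(demo_arena *a, size_t capacity) {
    a->base = malloc(capacity);
    a->capacity = a->base ? capacity : 0;
    a->used = 0;
    return a->base ? 0 : -1;
}

/* Returns suitably aligned memory from the arena, or NULL when it is full. */
void *demo_arena_alloc(demo_arena *a, size_t size) {
    assert(a != NULL);
    size_t align = _Alignof(max_align_t);
    size_t start = (a->used + align - 1) & ~(align - 1);
    if (start > a->capacity || size > a->capacity - start)
        return NULL;
    a->used = start + size;
    return a->base + start;
}

/* Frees every allocation made from the arena in one call. */
void demo_arena_destroy(demo_arena *a) {
    free(a->base);
    a->base = NULL;
    a->capacity = a->used = 0;
}

/* Recursively copies a type tree so the copy lives entirely in dst. */
demo_type *demo_type_copy(demo_arena *dst, const demo_type *src) {
    demo_type *copy = demo_arena_alloc(dst, sizeof *copy);
    if (!copy)
        return NULL;
    copy->size = src->size;
    copy->member_count = src->member_count;
    copy->members = NULL;
    if (src->member_count > 0) {
        copy->members = demo_arena_alloc(dst, src->member_count * sizeof *copy->members);
        if (!copy->members)
            return NULL;
        for (size_t i = 0; i < src->member_count; i++) {
            copy->members[i] = demo_type_copy(dst, src->members[i]);
            if (!copy->members[i])
                return NULL;
        }
    }
    return copy;
}
```

Because the copy owns no memory outside its arena, the caller can discard the source graph immediately, and a single `demo_arena_destroy` call releases the whole copy.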
The reverse call API is built on a dual model to serve two distinct audiences with different needs. `infix` provides a dedicated API for each.
- **`infix_reverse_create_callback`**: Designed for C/C++ developers. The user provides a normal, type-safe C function (e.g., `int my_handler(infix_context_t*, int, int)`). Internally, `infix` generates a cached forward trampoline to bridge the gap between the generic internal dispatcher and the user's type-safe code. This provides compile-time checking and maximum readability at the cost of slightly higher memory usage and a few nanoseconds of overhead.
- **`infix_reverse_create_closure`**: Designed for language bindings. The user provides a single generic handler (`void handler(infix_context_t*, void* ret, void** args)`). The JIT-compiled stub calls this handler directly after marshalling arguments into a `void**` array. This is more efficient and gives the binding author the raw pointers needed to unbox data into their host language's objects.

The library can be broken down into five main layers:
- **`infix.h`, `trampoline.c`**: The user-facing interface, providing both a high-level Signature API (`infix_forward_create`, etc.) and a low-level Manual API (`infix_forward_create_manual`, etc.). The high-level functions are wrappers that live in `trampoline.c` alongside their manual counterparts.
- **`types.c`, `signature.c`, `registry.c`**: Describes the data types used in function signatures, including the signature parser and the named type registry.
- **`trampoline.c`**: The core, ABI-agnostic orchestrator that uses the other layers to build the final machine code.
- **`infix_internals.h`, `arch/...`**: Defines the v-table interfaces (`infix_..._abi_spec`) and provides the concrete, platform-specific implementations.
- **`executor.c`**: Handles the allocation and protection of memory for JIT-compiled code and contains the universal callback dispatcher.

The process of creating a forward trampoline, from signature to executable code, follows a clear pipeline:
1. `infix_forward_create` receives a signature string, which the parser builds into a temporary, possibly unresolved, graph of `infix_type` objects. If a registry is provided, the resolver walks the graph and replaces `@Name` placeholders.
2. The ABI's `prepare_forward_call_frame` function is called to produce a complete layout blueprint.
3. The code generation functions (`generate_*_prologue`, etc.) are called in sequence, appending machine code to a buffer.
4. The `infix_forward_t` handle is allocated, containing its own private arena into which a deep copy of the type graph is made, making the handle a safe, self-contained object.

The reverse call pipeline is more complex, as it supports two distinct handler models.
1. The user calls either `infix_reverse_create_callback` (for a type-safe C handler) or `infix_reverse_create_closure` (for a generic handler). This choice determines the path.
2. `prepare_reverse_call_frame` and the code generation functions are called to create the JIT-compiled assembly stub. This stub is identical for both paths; its only job is to marshal native arguments into a `void**` array and call the universal C dispatcher.
3. If a `cached_forward_trampoline` exists, the dispatcher uses it to call the type-safe handler; otherwise, it directly invokes the generic closure handler.

A memory region is never simultaneously writable and executable. The implementation strategy varies by platform for maximum security and compatibility:
To mitigate use-after-free bugs, `infix_executable_free` turns freed memory into a non-accessible "guard page," causing an immediate and safe crash on attempted use. Additionally, after a reverse trampoline's context is created, its memory is made read-only to prevent runtime corruption.
The implementation of W^X on macOS, particularly on Apple Silicon, is unique and requires special handling to balance security with developer convenience.
Apple Silicon enforces W^X at the hardware level. For a JIT engine to function correctly within a standard, distributable application (e.g., an official Python interpreter), that application must be built with specific permissions:
- Memory must be allocated with the `MAP_JIT` flag.
- The application must be signed with the `com.apple.security.cs.allow-jit` entitlement.
- Permissions must be toggled with the `pthread_jit_write_protect_np()` function, as the standard `mprotect()` will fail.

This presents a major usability problem. Forcing every developer who uses `infix` to learn about and correctly configure linker flags (`-framework Security`, etc.) and code signing for their local test builds would create an unacceptable barrier to entry.
`infix` solves this problem by making a runtime decision. This provides the best of both worlds: security-by-default for real-world applications, and zero-configuration ease-of-use for developers.
Here is the logic, which is executed once per process:
1. `infix` attempts to `dlopen()` the `Security.framework` and `CoreFoundation.framework` libraries and find the necessary functions using `dlsym()`. This avoids any build-time linker dependencies.
2. `infix` checks if the currently running process has the `com.apple.security.cs.allow-jit` entitlement.
3. If the entitlement is present, `infix` uses the hardened path (`MAP_JIT` and `pthread_jit_write_protect_np`). This is the path that will be taken by official interpreters like Python or Perl.
4. Otherwise, `infix` falls back to the legacy path (`mmap` followed by `mprotect`). This path works for unhardened developer builds (like our CI tests) because macOS runs them in a more permissive mode.

This ensures the library "just works" for developers, while automatically "leveling up" its security when run inside a properly configured application.
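The first step of that runtime decision, resolving optional framework symbols without a link-time dependency, might look roughly like the sketch below. It is an assumption-laden illustration: the `demo_*` names are hypothetical, the choice of `SecTaskCreateFromSelf` and `SecTaskCopyValueForEntitlement` as the symbols to resolve is a guess at what an entitlement check would need rather than something this document states, and the entitlement query itself is deliberately left out.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical bundle of handles and symbol addresses resolved at runtime. */
typedef struct {
    void *security_lib;
    void *core_foundation_lib;
    void *sec_task_create_from_self;           /* assumed symbol of interest */
    void *sec_task_copy_value_for_entitlement; /* assumed symbol of interest */
} demo_security_api;

/* Returns 1 when both frameworks and both symbols were found, 0 otherwise.
 * On a non-Apple system the dlopen() calls simply fail and the function
 * reports 0, which a caller would treat as "use the legacy path". */
int demo_load_security_api(demo_security_api *api) {
    assert(api != NULL);
    *api = (demo_security_api){0};

    api->security_lib = dlopen(
        "/System/Library/Frameworks/Security.framework/Security",
        RTLD_LAZY | RTLD_LOCAL);
    api->core_foundation_lib = dlopen(
        "/System/Library/Frameworks/CoreFoundation.framework/CoreFoundation",
        RTLD_LAZY | RTLD_LOCAL);

    if (api->security_lib && api->core_foundation_lib) {
        api->sec_task_create_from_self =
            dlsym(api->security_lib, "SecTaskCreateFromSelf");
        api->sec_task_copy_value_for_entitlement =
            dlsym(api->security_lib, "SecTaskCopyValueForEntitlement");
    }
    if (api->sec_task_create_from_self && api->sec_task_copy_value_for_entitlement)
        return 1;

    /* Partial success is treated as failure: unload and reset everything. */
    if (api->security_lib)
        dlclose(api->security_lib);
    if (api->core_foundation_lib)
        dlclose(api->core_foundation_lib);
    *api = (demo_security_api){0};
    return 0;
}
```

A caller would run this once per process, cache the result, and only attempt the entitlement check when it returns 1. Older glibc versions may need `-ldl` at link time.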
The entire `infix` API surface, especially the signature parser and ABI classifiers, is continuously tested using `libFuzzer` and `AFL++`. The fuzzing harnesses (`fuzz/`) are designed to find memory safety violations (ASan), integer overflows (UBSan), and infinite loops (timeouts). All findings are converted into permanent regression tests.
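For readers unfamiliar with the shape of such a harness, here is a minimal libFuzzer-style entry point. `LLVMFuzzerTestOneInput` is the standard libFuzzer hook; `demo_parse_signature` is a hypothetical stand-in (a parenthesis-balance check) for the code under test and is not part of `infix` or its `fuzz/` directory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical code under test: accepts text whose parentheses balance.
 * Returns 0 on success and -1 on malformed input. */
int demo_parse_signature(const char *text) {
    size_t depth = 0;
    for (const char *p = text; *p != '\0'; p++) {
        if (*p == '(') {
            depth++;
        } else if (*p == ')') {
            if (depth == 0)
                return -1;
            depth--;
        }
    }
    return depth == 0 ? 0 : -1;
}

/* libFuzzer calls this once per generated input. The input is not
 * NUL-terminated, so it is copied into a terminated buffer first. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    assert(data != NULL || size == 0);
    char *text = malloc(size + 1);
    if (text == NULL)
        return 0;
    if (size > 0)
        memcpy(text, data, size);
    text[size] = '\0';
    (void)demo_parse_signature(text); /* crashes and sanitizer reports are the signal */
    free(text);
    return 0;
}
```

The harness ignores the parser's return value on purpose: rejecting malformed input is correct behaviour, and only crashes, sanitizer reports, or timeouts count as findings.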
This section provides a low-level comparison of the ABIs supported by `infix`.
Feature | System V AMD64 (Linux, macOS) | Windows x64 | AArch64 (ARM64) |
---|---|---|---|
Integer/Pointer Args | 6 GPRs: RDI, RSI, RDX, RCX, R8, R9 | 4 GPRs: RCX, RDX, R8, R9 (Shared slots) | 8 GPRs: X0 - X7 |
Floating-Point Args | 8 XMMs: XMM0 - XMM7 (Separate pool) | 4 XMMs: XMM0 - XMM3 (Shared slots) | 8 VPRs: V0 - V7 (Separate pool) |
Struct/Union Passing | Recursive Classification. Passed in GPRs, XMMs, or both. | By Reference if size is not 1, 2, 4, or 8 bytes. | By Reference if size > 16 bytes. HFAs passed in VPRs. |
Return by Hidden Pointer | If struct > 16 bytes or classified as MEMORY. Pointer in RDI. | If struct size is not 1, 2, 4, or 8. Pointer in RCX. | If struct > 16 bytes. Pointer in X8. |
Return Value Registers | RAX (int), RAX:RDX (int pair), XMM0 (float), st(0) (ld) | RAX (int/struct), XMM0 (float) | X0 (int), X0:X1 (int pair), V0 (float/HFA) |
Variadic printf Rule | AL must contain the number of XMM registers used. | Floating-point variadic args are passed in GPRs and XMMs. | Standard: no special rule. Apple: All variadic args on stack. |
Stack Alignment | 16-byte boundary before call. | 16-byte boundary before call. | 16-byte boundary. |
Shadow Space | No. Has a 128-byte "red zone" below RSP. | Yes, caller allocates 32 bytes on stack for the callee. | No. |
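Two of the struct-passing rules in the table are simple enough to express as predicates. The sketch below encodes only what the Windows x64 and AArch64 columns state; the `demo_*` names are hypothetical, and the System V column is omitted because its recursive classification does not reduce to a size check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Windows x64: an aggregate travels in a register only when its size is
 * exactly 1, 2, 4, or 8 bytes; every other size is passed by reference. */
bool demo_win64_by_reference(size_t size) {
    return !(size == 1 || size == 2 || size == 4 || size == 8);
}

/* AArch64: aggregates larger than 16 bytes are passed by reference,
 * except homogeneous floating-point aggregates (HFAs), which go in
 * vector registers. An HFA has at most four members of up to 16 bytes. */
bool demo_aarch64_by_reference(size_t size, bool is_hfa) {
    assert(!is_hfa || size <= 64);
    return !is_hfa && size > 16;
}
```

For example, a 12-byte struct is passed by reference on Windows x64 but in registers on AArch64, while a struct of four `double`s (32 bytes) is passed by reference on Windows x64 and in V0 - V3 on AArch64.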
The simplest way to see what the JIT is producing is to enable `INFIX_DEBUG_ENABLED=1` in your build. This will trigger a hexdump of the generated machine code after every trampoline creation.
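If you want a comparable dump from your own test code, a small helper along these lines is enough. This is a hypothetical sketch, not the dump routine that `INFIX_DEBUG_ENABLED` activates, and its output format (offset followed by up to 16 hex bytes per line) is an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Writes `size` bytes to `out`, 16 per line, each line prefixed with its
 * offset. Returns the number of lines written. */
size_t demo_hexdump(FILE *out, const void *data, size_t size) {
    assert(out != NULL && (data != NULL || size == 0));
    const uint8_t *bytes = data;
    size_t lines = 0;
    for (size_t offset = 0; offset < size; offset += 16) {
        fprintf(out, "%08zx ", offset);
        for (size_t i = offset; i < size && i < offset + 16; i++)
            fprintf(out, " %02x", (unsigned)bytes[i]);
        fputc('\n', out);
        lines++;
    }
    return lines;
}
```

Calling it on the six bytes `b8 2a 00 00 00 c3` (x86-64 for `mov eax, 42; ret`) produces a single line that can be pasted into a disassembler.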
This is the most powerful method. It allows you to step through the JIT'd code one instruction at a time.
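1. Print the trampoline's address from your test program:
   ```c
   infix_unbound_cif_func cif_func = infix_forward_get_code(trampoline);
   printf("DEBUG: Trampoline generated at address: %p\n", (void*)cif_func);
   ```
2. Start the program under GDB:
   ```
   gdb ./my_test_executable
   ```
3. Set a breakpoint at the printed address (the address shown here is only an example):
   ```
   (gdb) b *0x7ffff7fde000
   ```
4. Use `disassemble` to view the JIT code.
5. Use `stepi` (step instruction) and `info registers` to walk through the code and check register values.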