# Security

Sandbox, Merkle verification, and authentication model.
## Process Sandbox

When `yule serve` or `yule run` starts (without `--no-sandbox`), the process is placed in a platform-specific sandbox before the model is loaded. Even parsing and weight loading run inside the sandbox.
### Windows — Job Objects

- Memory limit — 32GB default via `RLIMIT_AS` simulation; prevents runaway allocations
- No child spawning — `ActiveProcessLimit = 1`; the process can't fork or exec
- Kill-on-close — if the Job Object handle is closed (crash, parent exit), the process is terminated
- UI restrictions — clipboard, desktop switching, display settings, global atoms, user handles, and system parameters are all blocked
- RAII cleanup — `SandboxGuard` calls `CloseHandle` on drop
### Linux — seccomp-BPF + Landlock + rlimit

Three layers applied in order:

- rlimit — `RLIMIT_AS` caps virtual memory at 32GB
- Landlock (kernel 5.13+) — filesystem restriction:
  - Model file: read-only
  - `/dev/dri`, `/dev/nvidia*`: read + ioctl (only if `--backend vulkan`)
  - `/usr/lib`, `/lib`, `/proc/self`: read-only (dynamic linker, shared libraries)
  - Everything else: denied
  - Graceful degradation on older kernels
- seccomp-BPF — syscall allowlist (~60 base syscalls):
  - Memory management, file I/O, threads, signals, time, epoll
  - Networking syscalls (`socket`, `bind`, `listen`, `accept4`, etc.) only if `allow_network`
  - `ioctl` only if `allow_gpu` (Vulkan/DRM driver communication)
  - Default action: `EPERM` for unlisted syscalls (debuggable, not `SIGKILL`)
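Of the three layers, the rlimit cap is the easiest to illustrate, since Landlock and seccomp need kernel-specific bindings. The sketch below applies the same 32GB `RLIMIT_AS` cap using Python's stdlib `resource` module; it is an illustration of the mechanism, not Yule's implementation (which is not Python), and the `apply_memory_cap` helper name is invented here.

```python
import resource

MEMORY_CAP = 32 * 2**30  # 32 GiB, matching the sandbox default

def apply_memory_cap(limit: int = MEMORY_CAP) -> None:
    """Tighten the soft RLIMIT_AS cap without raising the hard limit."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    if hard != resource.RLIM_INFINITY:
        # Never request more than the hard limit allows.
        limit = min(limit, hard)
    resource.setrlimit(resource.RLIMIT_AS, (limit, hard))

apply_memory_cap()
# From here on, allocations that would push the address space past the cap
# fail (malloc returns NULL; Python raises MemoryError) instead of thrashing.
```

Because `RLIMIT_AS` limits the virtual address space rather than resident memory, it also bounds large `mmap`-backed weight loads, which is why it is applied before the model is parsed.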
### macOS — Seatbelt + rlimit

- rlimit — `RLIMIT_AS` caps virtual memory at 32GB
- Seatbelt — dynamically-built SBPL profile via `sandbox_init()`:
  - `(deny default)` — everything denied unless explicitly allowed
  - Model file: read-only
  - System libraries (`/usr/lib`, `/System/Library`, `/Library/Apple`): read-only
  - `/dev/urandom`, `/dev/random`: read-only (CSPRNG)
  - GPU (`iokit-open`, `/Library/GPUBundles`): only if `allow_gpu`
  - Networking (`network-outbound`, `network-inbound`, `network-bind`): only if `allow_network`
  - The profile is permanent once applied — it cannot be undone
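For illustration, a minimal SBPL profile with the shape described above might look like this. The rule names follow standard Seatbelt syntax, but the literal paths and the exact rules are assumptions reconstructed from the bullets, not the profile Yule generates:

```lisp
(version 1)
(deny default)                              ; everything off unless allowed
(allow file-read* (literal "/path/to/model.gguf"))
(allow file-read* (subpath "/usr/lib")
                  (subpath "/System/Library")
                  (subpath "/Library/Apple"))
(allow file-read* (literal "/dev/urandom")  ; CSPRNG sources
                  (literal "/dev/random"))
;; appended only when allow_gpu is set:
;; (allow iokit-open)
;; (allow file-read* (subpath "/Library/GPUBundles"))
;; appended only when allow_network is set:
;; (allow network-outbound network-inbound network-bind)
```

Building the profile as a string at runtime is what lets the same code path emit different rules per command, and it is also why the sandbox is one-way: `sandbox_init()` has no corresponding teardown call.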
## Sandbox Configuration

The sandbox adapts based on the command:

| | `yule run` | `yule serve` |
|---|---|---|
| Network | Denied | Allowed (API server) |
| GPU | Allowed if `--backend != cpu` | Denied |
| Memory | 32GB cap | 32GB cap |
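The table above amounts to a small policy selector. This Python sketch is illustrative only — the names `SandboxPolicy` and `policy_for` are hypothetical, not Yule's API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SandboxPolicy:
    allow_network: bool
    allow_gpu: bool
    memory_cap: int = 32 * 2**30  # both commands keep the 32GB cap

def policy_for(command: str, backend: str = "cpu") -> SandboxPolicy:
    if command == "serve":
        # The API server needs the network; serving denies the GPU.
        return SandboxPolicy(allow_network=True, allow_gpu=False)
    # `yule run` never touches the network; GPU only for non-CPU backends.
    return SandboxPolicy(allow_network=False, allow_gpu=backend != "cpu")
```

Deriving both flags from the command up front keeps the per-platform sandbox code (Job Objects, seccomp/Landlock, Seatbelt) free of command-specific branches.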
## Future Work
The current sandbox is in-process (Phase A). Phase B will implement a broker-target architecture:
- Broker (main process) — parses CLI args, validates model, spawns target
- Target (child process) — receives model file descriptor via IPC, runs inference, returns tokens
- Privilege separation — broker holds no model data, target has no network access
## Merkle Verification

At model load time, Yule builds a blake3 Merkle tree over all tensor data:

1. The tensor payload (everything after the GGUF header) is split into 1MB chunks
2. Each chunk is hashed with blake3
3. Leaf hashes are combined into a binary Merkle tree
4. The 256-bit root hash is stored in memory
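The steps above can be sketched as follows. Two assumptions to note: Python's stdlib has no blake3, so `blake2b` stands in for it here, and the handling of an odd node count at a level (duplicating the last hash) is a guess — Yule's actual tree may pad or promote differently.

```python
import hashlib

CHUNK = 1 << 20  # 1 MiB leaf size

def _h(data: bytes) -> bytes:
    # blake2b stands in for blake3 (not in the Python stdlib).
    return hashlib.blake2b(data, digest_size=32).digest()

def merkle_root(payload: bytes) -> bytes:
    # Steps 1-2: split the tensor payload into 1 MiB chunks; hash each chunk.
    level = [_h(payload[i:i + CHUNK]) for i in range(0, len(payload), CHUNK)]
    level = level or [_h(b"")]  # degenerate case: empty payload
    # Step 3: combine pairwise until one node remains (odd node: duplicate last).
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    # Step 4: the single remaining 256-bit hash is the root.
    return level[0]
```

Flipping a single byte anywhere in the payload changes one leaf hash and therefore the root, which is what makes the API-reported root usable as a tamper check.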
This root hash appears in every `/yule/chat` response under `integrity.model_merkle_root`. You can verify it matches the hash from `yule verify`:

```shell
# on disk
yule verify ./model.gguf
# → merkle root: ffc7e1fd6016a6f9...

# from the API
curl -H "Authorization: Bearer $TOKEN" http://localhost:11434/yule/model
# → "merkle_root": "ffc7e1fd6016a6f9..."
```

If someone swaps a tensor in the model file, the Merkle root changes. If the API returns a different root than the one you verified, the model has been tampered with.
## Authentication

The API uses blake3-derived capability tokens:

- On startup, 32 bytes of OS entropy are collected via `getrandom`
- Token derivation: `blake3(master_secret || counter || timestamp)`, truncated to 24 bytes, hex-encoded with a `yule_` prefix
- Only the blake3 hash of each token is stored — the server never keeps plaintext tokens in memory after generation
- Verification: hash the provided token and compare it against the stored hashes

Tokens look like: `yule_b49913e2c05162951af4f87d62c2c9a6555eb91299c7fdcc`

You can also provide your own token with `--token`, in which case its hash is stored the same way.
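The derivation and verification scheme can be sketched like this, again with `blake2b` standing in for blake3 (not in the Python stdlib) and with illustrative function names, not Yule's internals. The exact serialization of `counter` and `timestamp` into the hash input is an assumption:

```python
import hashlib
import hmac
import os
import time

def derive_token(master_secret: bytes, counter: int) -> str:
    # blake3(master_secret || counter || timestamp), truncated to 24 bytes,
    # hex-encoded with a "yule_" prefix. blake2b stands in for blake3.
    material = (master_secret
                + counter.to_bytes(8, "big")
                + int(time.time()).to_bytes(8, "big"))
    return "yule_" + hashlib.blake2b(material, digest_size=24).hexdigest()

def token_hash(token: str) -> bytes:
    # Only this hash is retained server-side; the plaintext is discarded.
    return hashlib.blake2b(token.encode(), digest_size=32).digest()

def verify(presented: str, stored_hashes: list[bytes]) -> bool:
    candidate = token_hash(presented)
    # Constant-time comparison against each stored hash.
    return any(hmac.compare_digest(candidate, h) for h in stored_hashes)

master_secret = os.urandom(32)  # 32 bytes of OS entropy at startup
token = derive_token(master_secret, counter=0)
```

Storing only hashes means a memory dump of the server after startup reveals nothing that can be replayed as a bearer token, while a presented token can still be checked in constant time.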