Performance Results

Reference results from local benchmark artifacts. Guest measurements come from capsem-bench 0.3.0; lifecycle and fork measurements are host-side benchmark runs. Security Engine artifacts were refreshed on 2026-05-23. Numbers vary with host load, network path, and cache state.

Total time from VM start to shell ready: ~580ms.

| Stage | Duration | Description |
|---|---|---|
| squashfs | 10ms | Mount compressed rootfs from virtio block device |
| virtiofs | <1ms | Mount VirtioFS shared directory |
| overlayfs | 80ms | Create ext4 loopback overlay (format + mount) |
| workspace | <1ms | Bind-mount /root from VirtioFS |
| network | 210ms | Configure dummy0 and iptables DNS/HTTPS redirect rules |
| dns_proxy | tracked separately | Start UDP/TCP DNS bridge to host vsock:5007 |
| net_proxy | 100ms | Start TCP-to-vsock HTTPS proxy |
| deploy | 10ms | Copy tools from initrd to rootfs |
| venv | 170ms | Create Python virtualenv (via uv) |
| agent_start | <1ms | Launch PTY agent, connect vsock |
| Total | ~580ms | |

The diagnostic suite enforces boot time stays under 1 second. The two heaviest stages are network setup (iptables rule installation) and venv creation.
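
As a quick sanity check, the stage durations can be summed and compared with the 1-second limit. The sketch below hard-codes the table values, rounds `<1ms` stages down to 0, and leaves out `dns_proxy` because it is tracked separately. The names `STAGES_MS`, `BOOT_GATE_MS`, and `boot_summary` are illustrative and are not part of capsem-bench or the diagnostic suite.

```python
# Stage durations in milliseconds, copied from the boot table.
# "<1ms" stages are rounded down to 0; dns_proxy is tracked separately.
STAGES_MS = {
    "squashfs": 10, "virtiofs": 0, "overlayfs": 80, "workspace": 0,
    "network": 210, "net_proxy": 100, "deploy": 10, "venv": 170,
    "agent_start": 0,
}
BOOT_GATE_MS = 1000  # assumption: mirrors the "under 1 second" check


def boot_summary(stages, gate_ms=BOOT_GATE_MS):
    """Return (total ms, two slowest stages, whether total is under the gate)."""
    total = sum(stages.values())
    heaviest = sorted(stages, key=stages.get, reverse=True)[:2]
    return total, heaviest, total < gate_ms


total_ms, heaviest, within_gate = boot_summary(STAGES_MS)
print(f"total={total_ms}ms heaviest={heaviest} within_gate={within_gate}")
```

With the table values this reports a 580ms total, with network and venv as the two slowest stages.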

Scratch disk performance on the VirtioFS-backed workspace (/root). Test size: 256MB.

| Test | Throughput | IOPS | Duration |
|---|---|---|---|
| Sequential write (1MB blocks) | 1,854 MB/s | - | 138ms |
| Sequential read (1MB blocks) | 3,754 MB/s | - | 68ms |
| Random 4K write (fdatasync) | 33 MB/s | 8,353 | 1,197ms |
| Random 4K read | 279 MB/s | 71,440 | 140ms |

Sequential I/O benefits from VirtioFS pass-through to APFS. Random write IOPS are limited by per-write fdatasync — this reflects the worst case for database-style workloads.
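
To make the per-write sync cost concrete, here is a minimal sketch of a random 4K write loop that syncs after every block. It is not the capsem-bench implementation: the function name, file size, and operation count are assumptions chosen to keep the example small, and it falls back to `os.fsync` on platforms where Python does not expose `os.fdatasync`.

```python
import os
import random
import tempfile
import time


def random_write_iops(path, file_size=1 << 20, block=4096, ops=64, seed=0):
    """Write `ops` random 4K blocks, syncing after each one; return ops/sec."""
    # fdatasync is not available on every platform, so fall back to fsync.
    sync = getattr(os, "fdatasync", os.fsync)
    rng = random.Random(seed)
    payload = os.urandom(block)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, file_size)
        start = time.perf_counter()
        for _ in range(ops):
            offset = rng.randrange(file_size // block) * block
            os.pwrite(fd, payload, offset)
            sync(fd)  # each write must reach stable storage before the next
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return ops / elapsed


with tempfile.TemporaryDirectory() as tmp:
    iops = random_write_iops(os.path.join(tmp, "scratch.bin"))
    print(f"{iops:,.0f} IOPS, {iops * 4096 / 1e6:.1f} MB/s")
```

Because every iteration waits for the sync to complete, throughput is roughly IOPS times the 4K block size, which is the relationship visible in the random-write row.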

Read-only squashfs rootfs where binaries and libraries live.

| Test | Detail | Throughput | IOPS | Duration |
|---|---|---|---|---|
| Sequential read (1MB) | codex binary (193MB) | 693 MB/s | - | 266ms |
| Random 4K read | 2,588 files sampled | 38 MB/s | 9,783 | 511ms |

Squashfs decompression adds overhead compared to the scratch disk. Random reads across many small files show the cost of decompression + inode lookup on a compressed filesystem.

Wall-clock time to run <cli> --version with page cache dropped (3 runs, best/mean/worst).

| CLI | Min | Mean | Max |
|---|---|---|---|
| python3 | 7ms | 9ms | 11ms |
| node | 126ms | 128ms | 132ms |
| claude | 335ms | 337ms | 340ms |
| gemini | 594ms | 599ms | 605ms |
| codex | 293ms | 293ms | 293ms |

Python starts near-instantly (about 10ms). Node takes about 130ms, and the agent CLIs (claude, codex, gemini) take roughly 300-600ms.
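
The min/mean/max columns can be reproduced with a simple timing loop. The sketch below is an assumption about the general approach, not the capsem-bench code: `time_cli` is a hypothetical helper, and it does not drop the page cache between runs (on Linux that requires root and a write to `/proc/sys/vm/drop_caches`), so it measures warm starts.

```python
import statistics
import subprocess
import sys
import time


def time_cli(argv, runs=3):
    """Return (min, mean, max) wall-clock milliseconds for running argv."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=True,
        )
        samples.append((time.perf_counter() - start) * 1000)
    return min(samples), statistics.mean(samples), max(samples)


# Times the current interpreter; substitute any CLI that accepts --version.
lo, mean, hi = time_cli([sys.executable, "--version"])
print(f"min={lo:.0f}ms mean={mean:.0f}ms max={hi:.0f}ms")
```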

50 GET requests to https://www.google.com/ with concurrency 5, routed through the MITM proxy.

| Metric | Value |
|---|---|
| Requests | 50/50 |
| Requests/sec | 19.6 |
| Transfer | 3.8MB |
| Total duration | 2,557ms |

| Latency percentile | Value |
|---|---|
| min | 107ms |
| p50 | 162ms |
| p95 | 659ms |
| p99 | 713ms |
| max | 732ms |

Latency includes the full path: guest -> net-proxy -> vsock -> host MITM proxy -> TLS termination -> internet -> re-encryption -> response. The tail mostly reflects upstream internet latency and TLS/session setup.
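
The request rate and the latency distribution can be cross-checked with Little's law. The sketch below assumes all five workers stay busy for the whole run, which is an idealization; the function names are illustrative.

```python
def requests_per_second(requests, total_ms):
    """Completed requests divided by wall-clock seconds."""
    return requests / (total_ms / 1000)


def mean_time_in_flight_ms(requests, total_ms, concurrency):
    """Little's law (W = L / lambda), assuming every worker is always busy."""
    return concurrency / requests_per_second(requests, total_ms) * 1000


rps = requests_per_second(50, 2557)
in_flight = mean_time_in_flight_ms(50, 2557, 5)
print(f"{rps:.1f} req/s, ~{in_flight:.0f}ms mean per request")
```

Under that assumption the implied mean time per request is about 256ms, well above the 162ms median, which is what a distribution with a long tail (p95 of 659ms) would produce.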

Reference file download through the MITM proxy.

| Metric | Value |
|---|---|
| Downloaded | 9.98MB |
| Duration | 4.56s |
| Throughput | 2.09 MB/s |

This is the sustained throughput observed through the proxy pipeline (TLS termination + body inspection + re-encryption) on this run. Actual throughput varies with internet connection speed.
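
Throughput is size divided by duration, and the result depends on whether megabytes are decimal or binary. The sketch below computes both. It assumes that "9.98MB" means 9.98 × 10^6 bytes; the artifact does not state which convention it uses, so treat the interpretation as a guess.

```python
def throughput(size_bytes, seconds):
    """Return (decimal MB/s, binary MiB/s) for a transfer."""
    per_second = size_bytes / seconds
    return per_second / 1e6, per_second / 2**20


# Assumption: the reported 9.98MB is a decimal-megabyte figure.
mb_s, mib_s = throughput(9.98e6, 4.56)
print(f"{mb_s:.2f} MB/s, {mib_s:.2f} MiB/s")
```

Under that assumption the decimal figure is 2.19 MB/s and the binary figure is 2.09 MiB/s, so the reported 2.09 is consistent with a binary-unit calculation.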

End-to-end latency for snapshot operations via the guest MCP endpoint at 3 workspace sizes. Each operation is a full round-trip: guest CLI -> framed vsock -> host endpoint -> APFS filesystem -> response.

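10 files

| Operation | Latency |
|---|---|
| create | 1,217ms |
| list | 514ms |
| changes | 463ms |
| revert | 457ms |
| delete | 444ms |

100 files

| Operation | Latency |
|---|---|
| create | 507ms |
| list | 463ms |
| changes | 439ms |
| revert | 417ms |
| delete | 370ms |

500 files

| Operation | Latency |
|---|---|
| create | 377ms |
| list | 372ms |
| changes | 402ms |
| revert | 420ms |
| delete | 430ms |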

The 10-file create is slower than the 100- and 500-file creates because it includes the first MCP handshake (JSON-RPC initialize); subsequent operations reuse the connection. List and changes latencies vary only modestly with file count. The host gateway-side latency is typically 3-20ms; the rest is vsock + MCP protocol overhead.

Host-side latency for individual VM operations. Measured over 3 provision/exec/delete cycles on the same service instance.

| Operation | Min | Mean | Max | Description |
|---|---|---|---|---|
| provision | 895ms | 931ms | 951ms | Create and boot a temporary VM |
| exec_ready | 11.5ms | 12.1ms | 12.9ms | First ready check after provisioning |
| exec | 10.7ms | 10.9ms | 11.3ms | Simple echo ok on running VM |
| delete | 60.1ms | 60.6ms | 61.5ms | VM teardown request |
| total | 980ms | 1,015ms | 1,033ms | |

Provision includes the boot path, so it carries the bulk of lifecycle latency. Exec and ready checks are low-latency once the VM is running.

Run: uv run pytest tests/capsem-serial/test_lifecycle_benchmark.py::test_lifecycle_benchmark -xvs

Host-side latency for fork (image creation) and boot-from-image. Measured over 3 cycles: create VM, install jq, write workspace files, fork, boot from image, verify data survived.

| Metric | Min | Mean | Max | Gate | Description |
|---|---|---|---|---|---|
| fork | 83ms | 88ms | 93ms | 500ms | APFS clonefile of rootfs overlay + workspace |
| image_size | 7.5MB | 7.5MB | 7.5MB | 16MB | Actual disk (blocks), not logical sparse size |
| boot_provision | 744ms | 747ms | 752ms | 1,200ms | Clone image into new session + boot |
| boot_ready | 11ms | 11ms | 12ms | 1,200ms | First ready check after provisioning |

Fork is fast because APFS clonefile() is copy-on-write — no actual data copying. Image size reports actual allocated blocks, not the logical 2GB sparse file size. Both rootfs overlay changes (installed packages) and workspace files (/root/) survive fork.
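
The difference between logical size and allocated blocks can be seen with any sparse file. The sketch below illustrates only that distinction and does not call `clonefile()`; the file name and sizes are arbitrary, and it assumes a filesystem that supports sparse files and reports `st_blocks` in 512-byte units.

```python
import os
import tempfile


def logical_and_allocated(path):
    """Return (logical size in bytes, allocated bytes) for a file."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512  # st_blocks counts 512-byte units


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "image.bin")
    with open(path, "wb") as f:
        f.truncate(16 * 1024 * 1024)  # 16 MiB logical size, nothing written yet
        f.write(b"x" * 4096)          # only the first 4 KiB holds real data
    logical, allocated = logical_and_allocated(path)
    print(f"logical={logical} allocated={allocated}")
```

A size check based on `st_size` would report the full logical size, while one based on allocated blocks reports only what is actually stored, which is the quantity the image_size row measures.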

Regression gates: fork < 500ms, image < 16MB, packages + workspace must survive every run.

Run: uv run pytest tests/capsem-serial/test_lifecycle_benchmark.py::test_fork_benchmark -xvs

Security Engine CEL microbench (host-side)

First S08d host-side microbenchmark artifact: benchmarks/security-engine/data_1.1.1778860037_arm64_cel_microbench.json. Detection IR parse/lowering artifact: benchmarks/security-engine/data_1.1.1778860037_arm64_security_packs_microbench.json.

These are Rust Criterion microbenchmarks for canonical policy-context CEL paths and Detection IR pack parsing/lowering. They are not VM-originated benchmarks and should not be used as end-to-end latency claims.

| Benchmark | Slope |
|---|---|
| Compile http.request.host.contains("google") | 8.7us |
| Compile full HTTP policy | 39.8us |
| Evaluate http.request.host.contains("google") | 14.3us |
| Evaluate http.request.header("authorization").exists() | 16.1us |
| Evaluate full HTTP policy | 22.9us |
| Evaluate full HTTP policy as last match across 100 rules | 1.28ms |
| Detection finding for full HTTP policy | 23.2us |
| Detection finding as last match across 100 rules | 1.27ms |
| Dedupe 100 backtest rows / 100 unique signatures | 19.4us |
| Dedupe 1,000 backtest rows / 100 unique signatures | 160.9us |
| Runtime registry install/update of one rule | 145ns |
| Runtime registry projection of 100 enabled rules | 7.5us |
| Runtime projection and compile of 100 enforcement rules | 307.7us |
| Runtime projection and compile of 100 detection rules | 312.9us |
| Rebuild engine from 100 enforcement and 100 detection rules | 628.5us |
| Update one existing rule and rebuild 100-rule plan | 355.3us |
| Project SecurityEvent to PolicyContext | 538ns |
| Project and serialize PolicyContext | 2.6us |
| Native Rust lookup for equivalent HTTP policy | 12ns |
| Parse and validate Detection IR Google-secret fixture | 122.6us |
| Lower Detection IR Google-secret fixture to CEL rules | 1.1us |
| Lower 100 Detection IR HTTP rules to CEL rules | 96.6us |
| Lower and compile 100 Detection IR HTTP rules | 2.8ms |
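
The Slope column is a per-iteration time estimated by fitting a line to total elapsed time against iteration count. The sketch below shows a least-squares fit through the origin, which is how Criterion's slope estimate is generally understood to work; it is a Python illustration rather than Criterion's code, and the sample numbers are made up for the example.

```python
def slope_through_origin(iterations, total_ns):
    """Least-squares slope of total time vs iteration count, with no intercept."""
    sxy = sum(x * y for x, y in zip(iterations, total_ns))
    sxx = sum(x * x for x in iterations)
    return sxy / sxx


# Made-up samples: each batch of n iterations takes roughly 145ns per iteration.
iters = [1000, 2000, 3000, 4000]
times = [145_500, 289_000, 436_000, 579_500]
print(f"{slope_through_origin(iters, times):.1f} ns/iter")
```

Fitting across several batch sizes, instead of dividing one batch's time by its iteration count, reduces the influence of fixed per-batch overhead and timer noise on the estimate.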

Run:

cargo bench -p capsem-security-engine --bench security_engine_cel
cargo bench -p capsem-core --bench security_packs

Security Engine process enforcement (VM-originated)

First S08d VM-originated benchmark artifact: benchmarks/security-engine/data_1.1.1778860037_arm64_process_enforcement.json.

This host-side serial benchmark runs a live service and VM, installs a runtime CEL rule that blocks shell process exec, sends eight blocked exec requests, and verifies the response, runtime match counters, canonical session.db security events, and logs exposure.

| Metric | Value |
|---|---|
| Runs | 8 |
| Gate | 750ms mean |
| Min blocked exec latency | 8.925ms |
| Mean blocked exec latency | 9.356ms |
| Median blocked exec latency | 9.265ms |
| p95 blocked exec latency | 9.992ms |
| p99 blocked exec latency | 9.992ms |
| Max blocked exec latency | 9.992ms |
| Runtime matches | 8 |
| Session DB security events | 8 |
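
The p95, p99, and max values are identical because there are only eight samples. The sketch below uses the nearest-rank percentile definition to show why; the benchmark's actual percentile method is not documented here, so nearest-rank is an assumption. Only the smallest and largest sample values come from the table, and the middle values are placeholders.

```python
import math


def nearest_rank(samples, pct):
    """Nearest-rank percentile: the value at rank ceil(pct/100 * n)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Illustrative latencies in ms; only 8.925 (min) and 9.992 (max) are from the table.
runs = [8.925, 9.1, 9.2, 9.25, 9.28, 9.4, 9.7, 9.992]
print(nearest_rank(runs, 95), nearest_rank(runs, 99), max(runs))
```

With eight samples, both `ceil(0.95 * 8)` and `ceil(0.99 * 8)` equal 8, so both percentiles select the largest sample. Tail percentiles from runs this small should be read as "the slowest run", not as a stable estimate.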

Run:

uv run pytest tests/capsem-serial/test_security_engine_benchmark.py -xvs

Security Engine HTTP request enforcement (VM-originated)

First S08d network-transport benchmark artifact: benchmarks/security-engine/data_1.1.1778860037_arm64_http_request_enforcement.json.

This host-side serial benchmark runs a live service and VM, installs a runtime CEL rule that blocks a specific HTTPS request before upstream dispatch, warms the path once, then runs a guest curl loop and verifies the block responses, runtime match counters, canonical session.db security events, and logs exposure. It also runs a persistent TLS keep-alive client over the same connection to prove repeated block decisions stay logged and avoid per-request TLS setup in the hot path.

The wall-clock metric includes spawning curl in the guest. The time_starttransfer metric is curl’s first-byte timing for the blocked response and is the better proxy for transport plus Security Engine response latency. The phase deltas show most first-byte time is TLS/MITM appconnect; the post-pretransfer server-first-byte slice, which includes request dispatch, Security Engine evaluation, synthetic 403 generation, and first-byte delivery, is below 1ms on this run.

| Metric | Value |
|---|---|
| Runs | 8 |
| Warmup runs | 1 |
| Gate | 1,000ms mean |
| Mean wall-clock blocked request | 9.091ms |
| Median wall-clock blocked request | 8.149ms |
| p95 wall-clock blocked request | 12.672ms |
| Mean time_starttransfer | 3.997ms |
| Median time_starttransfer | 3.939ms |
| p95 time_starttransfer | 4.525ms |
| Mean DNS | 0.911ms |
| Mean TCP connect after DNS | 0.238ms |
| Mean TLS appconnect | 2.145ms |
| Mean server first byte after pretransfer | 0.683ms |
| Mean response tail after first byte | 0.015ms |
| Mean keep-alive first byte | 0.549ms |
| Median keep-alive first byte | 0.462ms |
| p95 keep-alive first byte | 1.041ms |
| Mean keep-alive total response | 0.556ms |
| Keep-alive TLS handshake | 1.560ms |
| Runtime matches | 17 |
| Session DB security events | 17 |
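
curl reports its `--write-out` timers as cumulative seconds since the start of the request, so per-phase figures like the ones in the table are differences between consecutive timers. The sketch below shows that subtraction. The timer names are curl's own; `phase_deltas` and the sample values are hypothetical and are not taken from the benchmark artifact.

```python
# Order of curl's cumulative --write-out timers, measured from request start.
PHASES = [
    "time_namelookup", "time_connect", "time_appconnect",
    "time_pretransfer", "time_starttransfer", "time_total",
]


def phase_deltas(timings):
    """Convert cumulative curl timers (seconds) into per-phase durations (ms)."""
    deltas, previous = {}, 0.0
    for name in PHASES:
        deltas[name] = round((timings[name] - previous) * 1000, 3)
        previous = timings[name]
    return deltas


# Hypothetical single-request sample, not benchmark data.
sample = {
    "time_namelookup": 0.0009, "time_connect": 0.0011,
    "time_appconnect": 0.0033, "time_pretransfer": 0.0033,
    "time_starttransfer": 0.0040, "time_total": 0.0040,
}
for name, ms in phase_deltas(sample).items():
    print(f"{name:>20}: {ms:.3f}ms")
```

In this breakdown the `time_appconnect` delta is the TLS handshake and the `time_starttransfer` delta is the server-side slice after the request is sent, which corresponds to the "server first byte after pretransfer" row.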

Run:

uv run pytest tests/capsem-serial/test_security_engine_benchmark.py::test_http_request_enforcement_benchmark_records_vm_originated_path -xvs

Security Engine DNS request enforcement (VM-originated)

First S08d DNS-transport benchmark artifact: benchmarks/security-engine/data_1.1.1778860037_arm64_dns_request_enforcement.json.

This host-side serial benchmark runs a live service and VM, installs a runtime CEL rule that blocks one DNS qname, triggers repeated guest resolver lookups, and verifies NXDOMAIN-style failure, runtime match counters, canonical session.db security events, dns_events policy fields, and logs qname attribution.

| Metric | Value |
|---|---|
| Runs | 8 |
| Gate | 1,000ms mean |
| Min blocked DNS lookup | 0.611ms |
| Mean blocked DNS lookup | 1.109ms |
| Median blocked DNS lookup | 0.830ms |
| p95 blocked DNS lookup | 3.508ms |
| p99 blocked DNS lookup | 3.508ms |
| Max blocked DNS lookup | 3.508ms |
| Runtime matches | 16 |
| Session DB security events | 16 |
| Session DB DNS events | 16 |

Run:

uv run pytest tests/capsem-serial/test_security_engine_benchmark.py::test_dns_request_enforcement_benchmark_records_vm_originated_path -xvs

Security Engine MCP request enforcement (VM-originated)

First S08d framed-MCP benchmark artifact: benchmarks/security-engine/data_1.1.1778860037_arm64_mcp_request_enforcement.json.

This host-side serial benchmark runs a live service and VM, installs a runtime CEL rule that blocks the guest local__echo MCP tool, sends repeated tools/call requests through /run/capsem-mcp-server, and verifies JSON-RPC denial, runtime match counters, canonical session.db security events, mcp_calls policy fields, and logs server/tool attribution.

| Metric | Value |
|---|---|
| Runs | 8 |
| Gate | 1,000ms mean |
| Min blocked MCP request | 0.222ms |
| Mean blocked MCP request | 0.312ms |
| Median blocked MCP request | 0.264ms |
| p95 blocked MCP request | 0.543ms |
| p99 blocked MCP request | 0.543ms |
| Max blocked MCP request | 0.543ms |
| Runtime matches | 8 |
| Session DB security events | 8 |
| Session DB MCP calls | 8 |
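
For orientation, a blocked call in this benchmark is a JSON-RPC `tools/call` request that receives a denial in place of a tool result. The sketch below builds such a request and checks a response for an error. It assumes the denial arrives as a JSON-RPC error object; the error code, message, and tool arguments are hypothetical, and the framing used on the wire is not shown.

```python
import json


def tools_call(request_id, tool, arguments):
    """Build a JSON-RPC 2.0 request for the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


def is_denied(response, request_id):
    """True when the response is a JSON-RPC error answering request_id."""
    return response.get("id") == request_id and "error" in response


request = tools_call(1, "local__echo", {"text": "hello"})  # arguments are hypothetical
# Hypothetical denial payload; the real code and message may differ.
response = json.loads(
    '{"jsonrpc": "2.0", "id": 1, '
    '"error": {"code": -32000, "message": "blocked by policy"}}'
)
print(json.dumps(request))
print("denied:", is_denied(response, 1))
```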

Run:

uv run pytest tests/capsem-serial/test_security_engine_benchmark.py::test_mcp_request_enforcement_benchmark_records_vm_originated_path -xvs

| Component | Version |
|---|---|
| Host | Apple Silicon macOS local benchmark host |
| Capsem | 1.0 benchmark artifact |
| Guest kernel | Linux 6.x (custom allnoconfig) |
| Storage | VirtioFS mode (APFS backing) |
| Python | 3.x (rootfs) |
| Node | v22.x (rootfs) |

just bench # Run all benchmarks (~2 min)

Results are displayed as rich tables in the terminal. JSON output is saved to /tmp/capsem-benchmark.json inside the VM.