Performance Tuning
zzz is designed for performance from the ground up. Routes are resolved at compile time, middleware chains are inlined by the compiler, and the I/O layer is pluggable. This guide covers the knobs you can turn to get the most out of your deployment.
Compile-time routing
One of zzz’s most significant performance advantages is that route definitions, pattern matching, and middleware composition are all resolved at comptime.
How it works
When you define a router, every route pattern is parsed into segments at compile time:
```zig
const App = Router.define(.{
    .routes = &.{
        Router.get("/users/:id", getUserHandler),
        Router.get("/posts/*path", catchAllHandler),
    },
});
```

The `compilePattern` function converts `"/users/:id"` into a slice of `Segment` values (`.static`, `.param`, `.wildcard`) that are embedded directly in the binary. At runtime, `matchSegments` walks the request path against these pre-computed segments — there is no parsing, no hash table lookup, and no heap allocation during dispatch.
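Conceptually, the compiled pattern is just data embedded in the binary. A minimal sketch of what `compilePattern` might emit — the field names here are illustrative, and zzz’s actual `Segment` type may differ:

```zig
// Illustrative only; the real zzz Segment type may use different names.
const Segment = union(enum) {
    static: []const u8,   // exact match, e.g. "users"
    param: []const u8,    // captures one component, e.g. "id" for ":id"
    wildcard: []const u8, // captures the rest of the path, e.g. "path" for "*path"
};

// "/users/:id" compiles (at comptime) to roughly:
const user_segments = [_]Segment{
    .{ .static = "users" },
    .{ .param = "id" },
};
```

Matching a request path is then a linear walk over this fixed array, comparing each path component against the segment kind.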
Middleware inlining
Section titled “Middleware inlining”The middleware pipeline is also assembled at compile time. The dispatch function chains global middleware and the route dispatcher into a single call sequence:
```zig
const route_dispatcher = comptime makeRouteDispatcher(config);
const pipeline = comptime config.middleware ++ &[_]HandlerFn{route_dispatcher};
const entry = comptime makePipelineEntry(pipeline);
```

Because every function pointer in the chain is comptime-known, the Zig compiler can inline the entire pipeline in ReleaseFast builds, eliminating indirect call overhead.
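One plausible shape for `makePipelineEntry`, shown purely to illustrate why the chain can inline — the real implementation, which must also support `ctx.next()`-style short-circuiting, is more involved:

```zig
// Sketch only: HandlerFn and Context stand in for zzz's actual types.
fn makePipelineEntry(comptime pipeline: []const HandlerFn) HandlerFn {
    return struct {
        fn entry(ctx: *Context) !void {
            // `pipeline` is comptime-known, so this loop unrolls into a
            // straight-line sequence of calls that ReleaseFast can inline.
            inline for (pipeline) |handler| {
                try handler(ctx);
            }
        }
    }.entry;
}
```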
What this means in practice
- Zero allocation for route matching — no regex compilation, no trie traversal, no string interning.
- No indirect calls in optimized builds — the middleware pipeline compiles down to a linear sequence of inlined function bodies.
- Named routes at zero cost — `pathFor("user_path")` resolves to a string literal at compile time.
I/O backend selection
The choice of backend has a direct impact on throughput and latency characteristics.
The default backend uses a thread pool with a bounded queue. It excels when:
- Handlers perform CPU-intensive work (JSON serialization, template rendering).
- You want straightforward concurrency via OS threads.
- The number of concurrent connections is moderate (hundreds to low thousands).
Tune the thread pool for your workload:
```zig
const config: Server.Config = .{
    .worker_threads = 8,     // match your CPU core count
    .max_connections = 2048, // bounded queue capacity
    .kernel_backlog = 256,   // TCP listen backlog
};
```

The libhv backend uses a single-threaded event loop with platform-native I/O multiplexing (epoll on Linux, kqueue on macOS). It excels when:
- You have a high number of concurrent connections (thousands+).
- Workloads are I/O-bound (proxying, WebSocket relaying).
- You need built-in timer support.
Build with the libhv backend:
```sh
zig build run -Dbackend=libhv
```

See Server Backends for a detailed comparison.
Response compression
The `gzipCompress` middleware compresses response bodies using gzip when the client sends `Accept-Encoding: gzip`. It only compresses when the result is actually smaller than the original.
Configuration
```zig
pub const CompressConfig = struct {
    min_size: usize = 256, // skip compression for bodies smaller than this
};
```

| Option | Default | Description |
|---|---|---|
| `min_size` | 256 bytes | Minimum response body size to attempt compression. Bodies below this threshold are sent uncompressed. |
```zig
const App = Router.define(.{
    .middleware = &.{
        gzipCompress(.{ .min_size = 512 }),
    },
    .routes = &.{ ... },
});
```

How it works

- The middleware calls `ctx.next()` first, letting the downstream handler produce a response.
- It checks that the response body exceeds `min_size` and that the client sent `Accept-Encoding: gzip`.
- If the response already has a `Content-Encoding` header, compression is skipped (avoiding double-encoding).
- The body is compressed using Zig’s `std.compress.flate` with gzip framing and the default compression level.
- If the compressed output is smaller than the original, it replaces the body, and `Content-Encoding: gzip` and `Vary: Accept-Encoding` headers are added. If compression did not reduce the size, the original body is kept.
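Put together, the steps above amount to middleware of roughly this shape. This is a sketch with assumed names — `acceptsGzip`, `getHeader`, `setBody`, `setHeader`, and the `gzipAlloc` helper are illustrative, not zzz’s actual API, and the exact `std.compress` call varies between Zig versions:

```zig
fn gzipCompressStep(ctx: *Context, cfg: CompressConfig) !void {
    try ctx.next(); // let the downstream handler produce the response

    const body = ctx.response.body;
    if (body.len < cfg.min_size) return;                            // too small to bother
    if (!ctx.request.acceptsGzip()) return;                         // client can't decode it
    if (ctx.response.getHeader("Content-Encoding") != null) return; // avoid double-encoding

    // Compress via std.compress.flate with gzip framing (hypothetical helper).
    const compressed = try gzipAlloc(ctx.allocator, body);

    // Only swap in the compressed body if it is actually smaller.
    if (compressed.len < body.len) {
        ctx.response.setBody(compressed);
        try ctx.response.setHeader("Content-Encoding", "gzip");
        try ctx.response.setHeader("Vary", "Accept-Encoding");
    }
}
```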
Tuning tips
- Set `min_size` to at least 150-256 bytes. Compressing tiny responses adds CPU overhead without meaningful bandwidth savings.
- JSON API responses and HTML templates typically compress well (60-80% reduction).
- Binary data (images, already-compressed files) should not be compressed. If you serve static files with the `static` middleware, consider placing compression after the static middleware so it only applies to dynamic responses, or filter by content type in a custom middleware.
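For example, ordering the pipeline so static files bypass compression might look like this. It assumes the `static` middleware serves a matching file without calling `ctx.next()` (so later middleware never runs for it), and the `.root` option, `dataHandler`, and route are hypothetical — verify the option names against your zzz version:

```zig
const App = Router.define(.{
    .middleware = &.{
        static(.{ .root = "public" }),      // runs first; serves files without calling next()
        gzipCompress(.{ .min_size = 512 }), // therefore only sees dynamic responses
    },
    .routes = &.{
        Router.get("/api/data", dataHandler), // hypothetical dynamic route
    },
});
```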
Connection and timeout tuning
The `Server.Config` struct provides several options that affect how connections are managed.
Buffer sizes
```zig
const config: Server.Config = .{
    .max_body_size = 10 * 1024 * 1024, // 10 MB for file uploads
    .max_header_size = 32768,          // 32 KB for large cookies/auth headers
};
```

| Setting | Default | Guidance |
|---|---|---|
| `max_body_size` | 1 MB | Increase for file upload endpoints. Keep low for API-only services to reject oversized payloads early. |
| `max_header_size` | 16 KB | Increase if your application uses large cookies or JWT tokens in headers. |
Timeouts
```zig
const config: Server.Config = .{
    .read_timeout_ms = 15_000,      // 15s - tighter for APIs
    .write_timeout_ms = 60_000,     // 60s - generous for streaming responses
    .keepalive_timeout_ms = 30_000, // 30s - shorter to reclaim idle connections
};
```

| Setting | Default | Guidance |
|---|---|---|
| `read_timeout_ms` | 30 s | How long to wait for request data. Tighten for APIs, loosen for slow clients. |
| `write_timeout_ms` | 30 s | How long to wait for the response send to complete. Increase for large downloads or streaming. |
| `keepalive_timeout_ms` | 65 s | Idle timeout for keep-alive connections. Shorter values free connections faster under high load. |
Connection limits
```zig
const config: Server.Config = .{
    .max_connections = 4096,
    .max_requests_per_connection = 200,
    .kernel_backlog = 512,
};
```

| Setting | Default | Guidance |
|---|---|---|
| `max_connections` | 1024 | Bounded queue capacity (zzz backend). Size this to your expected peak concurrent connections. |
| `max_requests_per_connection` | 100 | Caps how many requests are served over a single keep-alive connection before it is closed. Higher values improve throughput for keep-alive clients. |
| `kernel_backlog` | 128 | TCP listen backlog (the `backlog` argument to `listen()`) for pending connections. Increase for bursty traffic patterns. |
Worker thread sizing
For the native zzz backend, the `worker_threads` setting controls how many OS threads handle connections:
```zig
const config: Server.Config = .{
    .worker_threads = 0, // auto: defaults to 1 thread
};
```

Guidelines for setting `worker_threads`:
- CPU-bound handlers (template rendering, JSON serialization): set to the number of CPU cores.
- I/O-bound handlers (database queries, HTTP calls to other services): set to 2-4x the core count, since threads will spend time waiting.
- Mixed workloads: start at the core count and benchmark up.
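A simple way to apply these guidelines at startup is to derive the thread count from the host. `std.Thread.getCpuCount` is the standard-library call for this; the 3x multiplier for I/O-bound workloads is just a starting point to benchmark from, not a zzz recommendation:

```zig
const std = @import("std");

/// Pick a worker_threads value from the host CPU count.
/// I/O-bound workloads get a 3x multiplier as a benchmarking starting point.
pub fn pickWorkerThreads(io_bound: bool) usize {
    const cores = std.Thread.getCpuCount() catch 1; // fall back to 1 on error
    return if (io_bound) cores * 3 else cores;
}
```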
Production build flags
Zig’s build modes have a significant impact on runtime performance:
```sh
zig build -Doptimize=ReleaseFast
```

Maximum performance. The compiler inlines comptime-known function calls, eliminates safety checks, and applies aggressive optimizations. Use this for production deployments.

```sh
zig build -Doptimize=ReleaseSafe
```

Retains safety checks (bounds checking, integer overflow detection) while still applying optimizations. Good for staging environments where you want performance but also want to catch bugs.

```sh
zig build -Doptimize=ReleaseSmall
```

Optimizes for binary size over speed. Useful for embedded or containerized deployments where image size matters.
Performance checklist
Use this checklist when preparing a zzz application for production:
| Area | Action | Impact |
|---|---|---|
| Build mode | Use -Doptimize=ReleaseFast | Enables inlining of comptime middleware chains |
| Backend | Evaluate zzz vs libhv for your workload | Platform-native I/O can improve throughput |
| Compression | Add gzipCompress middleware | 60-80% bandwidth reduction for text responses |
| Worker threads | Match worker_threads to CPU cores | Prevents over- or under-subscription |
| Timeouts | Tighten read_timeout_ms and keepalive_timeout_ms | Frees connections from slow or idle clients |
| Body limits | Set max_body_size to the minimum your app needs | Rejects oversized payloads early |
| Backlog | Increase kernel_backlog for bursty traffic | Reduces connection refusals during spikes |
| Health checks | Place health middleware first in the pipeline | Prevents health probes from inflating metrics |
| Metrics | Expose /metrics for Prometheus monitoring | Enables data-driven tuning |
Next steps
- Server backends — detailed backend architecture and selection guide
- Observability — structured logging, metrics, and health checks
- Middleware — learn how the middleware pipeline works
- Deployment — production deployment strategies