Performance Tuning

zzz is designed for performance from the ground up. Routes are resolved at compile time, middleware chains are inlined by the compiler, and the I/O layer is pluggable. This guide covers the knobs you can turn to get the most out of your deployment.

One of zzz’s most significant performance advantages is that route definitions, pattern matching, and middleware composition are all resolved at comptime.

When you define a router, every route pattern is parsed into segments at compile time:

const App = Router.define(.{
    .routes = &.{
        Router.get("/users/:id", getUserHandler),
        Router.get("/posts/*path", catchAllHandler),
    },
});

The compilePattern function converts "/users/:id" into a slice of Segment values (.static, .param, .wildcard) that are embedded directly in the binary. At runtime, matchSegments walks the request path against these pre-computed segments — there is no parsing, no hash table lookup, and no heap allocation during dispatch.
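A simplified sketch of this idea — the Segment shape and compilePattern body here are illustrative, not zzz's actual source — shows how a pattern string can be split into segments entirely at comptime:

```zig
const std = @import("std");

// Illustrative only; zzz's real Segment/compilePattern may differ.
const Segment = union(enum) {
    static: []const u8,
    param: []const u8, // ":id"  -> .{ .param = "id" }
    wildcard: []const u8, // "*path" -> .{ .wildcard = "path" }
};

fn compilePattern(comptime pattern: []const u8) []const Segment {
    comptime {
        var segments: []const Segment = &.{};
        var it = std.mem.tokenizeScalar(u8, pattern, '/');
        while (it.next()) |part| {
            const seg: Segment = switch (part[0]) {
                ':' => .{ .param = part[1..] },
                '*' => .{ .wildcard = part[1..] },
                else => .{ .static = part },
            };
            segments = segments ++ &[_]Segment{seg};
        }
        return segments;
    }
}

// "/users/:id" becomes .{ .static = "users" }, .{ .param = "id" },
// computed once at compile time and embedded in the binary.
const user_segments = compilePattern("/users/:id");
```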

The middleware pipeline is also assembled at compile time. The dispatch function chains global middleware and the route dispatcher into a single call sequence:

const route_dispatcher = comptime makeRouteDispatcher(config);
const pipeline = comptime config.middleware ++ &[_]HandlerFn{route_dispatcher};
const entry = comptime makePipelineEntry(pipeline);

Because every function pointer in the chain is comptime-known, the Zig compiler can inline the entire pipeline in ReleaseFast builds, eliminating indirect call overhead.
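One way such an entry point can be built — a sketch only; zzz's real Context, HandlerFn, and ctx.next() semantics may differ — is an inline loop over the comptime-known slice, so every call site has a direct target:

```zig
// Sketch: run the pipeline front to back. Because `pipeline` is
// comptime-known, the inline for unrolls and each handler call is a
// direct (inlinable) call in ReleaseFast builds.
fn makePipelineEntry(comptime pipeline: []const HandlerFn) HandlerFn {
    return struct {
        fn entry(ctx: *Context) !void {
            inline for (pipeline) |handler| {
                try handler(ctx);
            }
        }
    }.entry;
}
```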

  • Zero allocation for route matching — no regex compilation, no trie traversal, no string interning.
  • No indirect calls in optimized builds — the middleware pipeline compiles down to a linear sequence of inlined function bodies.
  • Named routes at zero cost — pathFor("user_path") resolves to a string literal at compile time.
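
For example (assuming routes are registered with a name; the registration API is not shown here):

```zig
// Hypothetical: if "user_path" names the "/users/:id" route, this lookup
// happens entirely at compile time and compiles to a string literal.
const pattern = App.pathFor("user_path"); // "/users/:id"
```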

The choice of backend has a direct impact on throughput and latency characteristics.

The default backend uses a thread pool with a bounded queue. It excels when:

  • Handlers perform CPU-intensive work (JSON serialization, template rendering).
  • You want straightforward concurrency via OS threads.
  • The number of concurrent connections is moderate (hundreds to low thousands).

Tune the thread pool for your workload:

const config: Server.Config = .{
    .worker_threads = 8, // match your CPU core count
    .max_connections = 2048, // bounded queue capacity
    .kernel_backlog = 256, // TCP listen backlog
};

See Server Backends for a detailed comparison.

The gzipCompress middleware compresses response bodies using gzip when the client sends Accept-Encoding: gzip. It only compresses when the result is actually smaller than the original.

pub const CompressConfig = struct {
    min_size: usize = 256, // skip compression for bodies smaller than this
};

  • min_size — default: 256 bytes. Minimum response body size to attempt compression. Bodies below this threshold are sent uncompressed.

const App = Router.define(.{
    .middleware = &.{
        gzipCompress(.{ .min_size = 512 }),
    },
    .routes = &.{ ... },
});

  1. The middleware calls ctx.next() first, letting the downstream handler produce a response.

  2. It checks that the response body exceeds min_size and that the client sent Accept-Encoding: gzip.

  3. If the response already has a Content-Encoding header, compression is skipped (avoiding double-encoding).

  4. The body is compressed using Zig’s std.compress.flate with gzip framing and the default compression level.

  5. If the compressed output is smaller than the original, it replaces the body and Content-Encoding: gzip and Vary: Accept-Encoding headers are added. If compression did not reduce the size, the original body is kept.
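
The five steps above can be sketched as a middleware body. Field names such as ctx.response, ctx.request.acceptsEncoding, and the headers API are assumptions about zzz's internals, and the compression call is illustrative of std.compress usage rather than the exact code:

```zig
// Hedged sketch of the gzip flow; not zzz's actual source.
fn gzipMiddleware(comptime cfg: CompressConfig) HandlerFn {
    return struct {
        fn mw(ctx: *Context) !void {
            try ctx.next(); // (1) let the downstream handler respond first

            const body = ctx.response.body orelse return;
            if (body.len < cfg.min_size) return; // (2) too small to bother
            if (!ctx.request.acceptsEncoding("gzip")) return; // (2)
            if (ctx.response.headers.contains("Content-Encoding")) return; // (3)

            // (4) compress with gzip framing at the default level
            var compressed = std.ArrayList(u8).init(ctx.allocator);
            var in_stream = std.io.fixedBufferStream(body);
            try std.compress.gzip.compress(in_stream.reader(), compressed.writer(), .{});

            // (5) keep whichever body is smaller
            if (compressed.items.len < body.len) {
                ctx.response.body = compressed.items;
                try ctx.response.headers.put("Content-Encoding", "gzip");
                try ctx.response.headers.put("Vary", "Accept-Encoding");
            } else {
                compressed.deinit();
            }
        }
    }.mw;
}
```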

  • Set min_size to at least 150-256 bytes. Compressing tiny responses adds CPU overhead without meaningful bandwidth savings.
  • JSON API responses and HTML templates typically compress well (60-80% reduction).
  • Binary data (images, already-compressed files) should not be compressed. If you serve static files with the static middleware, consider placing compression after the static middleware so it only applies to dynamic responses, or filter by content type in a custom middleware.
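
A custom content-type filter along those lines might gate compression on the response's Content-Type; the helper below is illustrative and std-only:

```zig
const std = @import("std");

// Illustrative gate: only text-like responses are worth compressing.
fn isCompressible(content_type: []const u8) bool {
    return std.mem.startsWith(u8, content_type, "text/") or
        std.mem.startsWith(u8, content_type, "application/json") or
        std.mem.startsWith(u8, content_type, "application/javascript");
}
```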

The Server.Config struct provides several options that affect how connections are managed.

const config: Server.Config = .{
    .max_body_size = 10 * 1024 * 1024, // 10 MB for file uploads
    .max_header_size = 32768, // 32 KB for large cookies/auth headers
};

  • max_body_size — default: 1 MB. Increase for file upload endpoints. Keep low for API-only services to reject oversized payloads early.
  • max_header_size — default: 16 KB. Increase if your application uses large cookies or JWT tokens in headers.

const config: Server.Config = .{
    .read_timeout_ms = 15_000, // 15s - tighter for APIs
    .write_timeout_ms = 60_000, // 60s - generous for streaming responses
    .keepalive_timeout_ms = 30_000, // 30s - shorter to reclaim idle connections
};

  • read_timeout_ms — default: 30 s. How long to wait for request data. Tighten for APIs, loosen for slow clients.
  • write_timeout_ms — default: 30 s. How long to wait for the response send to complete. Increase for large downloads or streaming.
  • keepalive_timeout_ms — default: 65 s. Idle timeout for keep-alive connections. Shorter values free connections faster under high load.

const config: Server.Config = .{
    .max_connections = 4096,
    .max_requests_per_connection = 200,
    .kernel_backlog = 512,
};

  • max_connections — default: 1024. Bounded queue capacity (zzz backend). Size this to your expected peak concurrent connections.
  • max_requests_per_connection — default: 100. Limits HTTP pipelining on a single connection. Higher values improve throughput for keep-alive clients.
  • kernel_backlog — default: 128. The TCP listen() backlog for pending connections. Increase for bursty traffic patterns.

For the native zzz backend, the worker_threads setting controls how many OS threads handle connections:

const config: Server.Config = .{
    .worker_threads = 0, // auto: defaults to 1 thread
};

Guidelines for setting worker_threads:

  • CPU-bound handlers (template rendering, JSON serialization): set to the number of CPU cores.
  • I/O-bound handlers (database queries, HTTP calls to other services): set to 2-4x the core count, since threads will spend time waiting.
  • Mixed workloads: start at the core count and benchmark up.
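
These guidelines can be encoded at startup using the core count reported by the standard library (Server.Config is zzz's type; std.Thread.getCpuCount is standard Zig):

```zig
const cores = try std.Thread.getCpuCount();

const config: Server.Config = .{
    // CPU-bound handlers: one worker per core.
    .worker_threads = cores,
    // I/O-bound handlers: oversubscribe instead, e.g.
    // .worker_threads = cores * 3,
};
```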

Zig’s build modes have a significant impact on runtime performance:

zig build -Doptimize=ReleaseFast

Maximum performance. The compiler inlines comptime-known function calls, eliminates safety checks, and applies aggressive optimizations. Use this for production deployments.

Use this checklist when preparing a zzz application for production:

  • Build mode — use -Doptimize=ReleaseFast. Enables inlining of comptime middleware chains.
  • Backend — evaluate zzz vs libhv for your workload. Platform-native I/O can improve throughput.
  • Compression — add the gzipCompress middleware. 60-80% bandwidth reduction for text responses.
  • Worker threads — match worker_threads to CPU cores. Prevents over- or under-subscription.
  • Timeouts — tighten read_timeout_ms and keepalive_timeout_ms. Frees connections from slow or idle clients.
  • Body limits — set max_body_size to the minimum your app needs. Rejects oversized payloads early.
  • Backlog — increase kernel_backlog for bursty traffic. Reduces connection refusals during spikes.
  • Health checks — place the health middleware first in the pipeline. Prevents health probes from inflating metrics.
  • Metrics — expose /metrics for Prometheus monitoring. Enables data-driven tuning.