Building High-Performance APIs with Rust
We recently helped a client migrate their API from Node.js to Rust, achieving a 20x improvement in throughput while reducing P99 latency from 200ms to 8ms. Here’s what we learned.
The Starting Point
The existing Node.js API was hitting scaling limits:
- P99 latency spiking during traffic peaks
- High memory usage requiring frequent restarts
- Unpredictable garbage collection pauses
- Expensive horizontal scaling
Choosing the Stack
After evaluating options, we settled on:
- Axum for the web framework
- SQLx for database access
- tokio as the async runtime
- tower for middleware
This combination offers excellent performance, type safety, and composability.
Architecture Decisions
Connection Pooling
Database connections are expensive. We configured SQLx with careful pool sizing:
```rust
let pool = PgPoolOptions::new()
    .max_connections(50)
    .min_connections(10)
    .acquire_timeout(Duration::from_secs(3))
    .connect(&database_url)
    .await?;
```
Request Validation
We used the validator crate with Axum’s extractor pattern:
```rust
#[derive(Deserialize, Validate)]
struct CreateUser {
    #[validate(email)]
    email: String,
    #[validate(length(min = 8))]
    password: String,
}

async fn create_user(
    Json(payload): Json<CreateUser>,
) -> Result<Json<User>, ApiError> {
    payload.validate()?;
    // ...
}
```
Error Handling
Consistent error responses with proper HTTP status codes:
```rust
enum ApiError {
    NotFound,
    Validation(ValidationErrors),
    Internal(anyhow::Error),
}

impl IntoResponse for ApiError {
    fn into_response(self) -> Response {
        let (status, message) = match self {
            ApiError::NotFound => (StatusCode::NOT_FOUND, "Not found"),
            ApiError::Validation(_) => (StatusCode::BAD_REQUEST, "Validation failed"),
            ApiError::Internal(_) => (StatusCode::INTERNAL_SERVER_ERROR, "Internal error"),
        };
        // Return a JSON error body alongside the status code
        (status, Json(json!({ "error": message }))).into_response()
    }
}
```
Performance Optimizations
Response Compression
Enable gzip compression for text responses:
```rust
let app = Router::new()
    .route("/api/*path", handler)
    .layer(CompressionLayer::new());
```
Caching Headers
Proper cache headers reduce server load:
```rust
async fn get_resource() -> impl IntoResponse {
    (
        [(header::CACHE_CONTROL, "public, max-age=3600")],
        Json(resource),
    )
}
```
Batch Operations
Instead of N+1 queries, batch database operations:
```rust
// Instead of fetching users one by one
let users = sqlx::query_as!(User,
    "SELECT * FROM users WHERE id = ANY($1)",
    &user_ids
)
.fetch_all(&pool)
.await?;
```
Observability
Structured Logging
We used tracing with JSON output for production:
```rust
tracing_subscriber::fmt()
    .json()
    .with_target(false)
    .init();
```
Metrics
Prometheus metrics via the metrics crate:
```rust
metrics::counter!("api_requests_total", "endpoint" => endpoint).increment(1);
metrics::histogram!("api_request_duration_seconds").record(duration);
```
Results
After migration:
- 20x throughput increase on same hardware
- P99 latency: 8ms (down from 200ms)
- Memory usage: 50MB (down from 500MB)
- Zero GC pauses
Lessons Learned
- Start with profiling: measure before optimizing.
- The database is usually the bottleneck: optimize queries first.
- Type safety pays off: fewer runtime errors in production.
- Async is powerful but subtle: understand the runtime's behavior.
Building a high-performance API in Rust? Let’s talk about how we can help.