Configure health checks

5 min read · Updated April 2026

Health checks are Kubernetes’ mechanism for keeping your application reliable without operator intervention. StackBlaze configures two probe types from your dashboard: a readiness probe that controls whether a pod receives traffic, and a liveness probe that controls whether a pod is restarted.

Getting these right is one of the most impactful things you can do to make your service production-grade. Well-configured health checks enable zero-downtime deploys, automatic recovery from hung processes, and traffic that only ever reaches pods able to serve it.

Readiness vs liveness: what's the difference?

Probe	What happens on failure	Use for
Readiness	Pod removed from load balancer endpoints, no traffic sent	App not yet warm, DB migration running, cache loading
Liveness	Container is killed and restarted by the kubelet	Deadlock, infinite loop, frozen event loop, process left unresponsive by memory pressure
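To make the distinction concrete, here is a minimal sketch in plain Python. The `AppState` class and both function names are hypothetical and not part of StackBlaze or Kubernetes; the sketch only assumes that an app may want its two probes to answer different questions.

```python
from dataclasses import dataclass


@dataclass
class AppState:
    """Hypothetical flags an app might track while warming up."""
    cache_loaded: bool = False
    migrations_done: bool = False


def liveness_status(state: AppState) -> int:
    # If this code runs at all, the process is not deadlocked or frozen,
    # so the liveness check only needs to answer.
    return 200


def readiness_status(state: AppState) -> int:
    # Readiness asks a stricter question: can this pod serve traffic right now?
    ready = state.cache_loaded and state.migrations_done
    return 200 if ready else 503


warming = AppState()
print(liveness_status(warming), readiness_status(warming))  # 200 503

warm = AppState(cache_loaded=True, migrations_done=True)
print(liveness_status(warm), readiness_status(warm))  # 200 200
```

While the app is warming up, it stays alive (no restart) but receives no traffic, which matches the two rows of the table.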

Express.js: /health endpoint

src/health.ts

import express from 'express'
import db from './db'

const app = express()

app.get('/health', async (req, res) => {
  try {
    // Verify the database is reachable; fail fast if not
    await db.raw('SELECT 1')
    res.json({ status: 'ok', uptime: process.uptime() })
  } catch {
    // A 503 tells the probe this pod is not healthy
    res.status(503).json({ status: 'error' })
  }
})

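FastAPI: /health endpoint

main.py

from fastapi import FastAPI, Response, status
from sqlalchemy import text
from sqlalchemy.exc import SQLAlchemyError

from .database import engine

app = FastAPI()

@app.get("/health")
def health(response: Response):
    # Plain def: FastAPI runs it in a threadpool, so the blocking DB call does not stall the event loop
    try:
        with engine.connect() as conn:
            conn.execute(text("SELECT 1"))
    except SQLAlchemyError:
        response.status_code = status.HTTP_503_SERVICE_UNAVAILABLE
        return {"status": "error"}
    return {"status": "ok"}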

Traffic routing with health checks

Under the hood

StackBlaze injects both probes into the pod spec using the values you configure in the Health tab:

readinessProbe: httpGet to your configured path, with periodSeconds: 10 and failureThreshold: 3 by default. A failing pod is removed from the Service endpoints and re-added once the probe passes again.
livenessProbe: the same httpGet mechanism, but with a longer initial delay (initialDelaySeconds: 30) so the pod has time to start. After failureThreshold consecutive failures, the kubelet kills the container and restarts it.
Rolling deploys wait for readiness: during a rolling update, the readiness probe and minReadySeconds together ensure that no old pod is terminated until its replacement has been confirmed ready. This is what makes zero-downtime deploys possible.
startupProbe (advanced): for slow-starting apps (e.g. JVM warm-up), enable the startup probe in advanced settings. Liveness and readiness checks are held off until the startup probe passes, which prevents premature restarts during initialisation.
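The injected probes can be pictured as the following pod spec fragment, written here as a plain Python dict. This is only a sketch: `build_probes` is a hypothetical helper, port 8080 is an assumption, and the liveness period and failure threshold are assumed to share the readiness defaults. The field names themselves are standard Kubernetes probe fields.

```python
import json


def build_probes(path: str = "/health", port: int = 8080) -> dict:
    """Return readiness and liveness probe fragments as plain dicts.

    The port is an assumption for illustration; the timing values are
    the defaults described in this section.
    """
    http_get = {"path": path, "port": port}
    return {
        "readinessProbe": {
            "httpGet": dict(http_get),
            "periodSeconds": 10,
            "failureThreshold": 3,
        },
        "livenessProbe": {
            "httpGet": dict(http_get),
            "initialDelaySeconds": 30,  # longer delay so the app can start
            "periodSeconds": 10,        # assumed same as readiness
            "failureThreshold": 3,      # assumed same as readiness
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_probes(), indent=2))
```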

Step by step

Add a /health endpoint to your app

Create a lightweight HTTP endpoint that returns 200 OK when your service is ready to accept traffic. The endpoint should verify that critical dependencies (database connections, caches) are reachable. Kubernetes treats any status code from 200 to 399 as success; any other response, or a timeout, marks the probe as failed.
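Before deploying, it can help to check the endpoint the way an HTTP probe would. The sketch below uses only the Python standard library; `is_success` and `probe` are hypothetical names, and the URL in the usage note is an assumption. It approximates the kubelet rule (status 200 to 399 passes; anything else, or a timeout, fails) rather than reproducing the kubelet exactly.

```python
import urllib.error
import urllib.request


def is_success(status_code: int) -> bool:
    """Kubernetes HTTP probes pass on any status from 200 to 399."""
    return 200 <= status_code < 400


def probe(url: str, timeout_seconds: float = 5.0) -> bool:
    """Send one GET and report whether a probe would count it as a pass."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as resp:
            return is_success(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx; the status code is still available.
        return is_success(err.code)
    except OSError:
        # Connection refused, DNS failure, or timeout all count as a failure.
        return False
```

For example, `probe("http://localhost:8080/health")` would return `True` only if a local instance of the app answers in time with a passing status.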

Configure the health check in the dashboard

Go to your service → Health tab. Set the path (e.g. /health), the check interval (default: 10s), and the failure threshold (default: 3 consecutive failures before action is taken). You can configure readiness and liveness probes independently with different paths and intervals.

StackBlaze routes traffic only to passing pods

Kubernetes removes pods that fail their readiness probe from the Service's endpoint list, and new requests are redistributed to the remaining healthy pods. No operator action is needed, and users are not routed to a pod that has been marked unready. During a rolling deploy, new pods must pass their readiness probe before old pods are terminated.

Failed liveness probe triggers automatic restart

While the readiness probe gates traffic, the liveness probe gates container survival. If the liveness probe fails as many consecutive times as the configured failure threshold, the kubelet kills the container and restarts it. This self-heals frozen or deadlocked processes without any manual intervention.
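The "consecutive failures" rule is easy to misread, so here is a small simulation of it. `ProbeTracker` is a hypothetical class written for illustration, not kubelet code; it only models the counting, where any success resets the streak.

```python
from dataclasses import dataclass


@dataclass
class ProbeTracker:
    failure_threshold: int = 3
    consecutive_failures: int = 0

    def record(self, success: bool) -> bool:
        """Record one probe result; return True once the threshold is reached."""
        if success:
            self.consecutive_failures = 0  # any success resets the streak
            return False
        self.consecutive_failures += 1
        return self.consecutive_failures >= self.failure_threshold


tracker = ProbeTracker()
results = [False, False, True, False, False, False]
actions = [tracker.record(ok) for ok in results]
print(actions)  # [False, False, False, False, False, True]
```

Two failures followed by a success trigger nothing; only the third failure in a row at the end of the sequence would lead to a restart.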

Dashboard configuration reference

Field	Default	Description
Path	/health	HTTP GET path, used by both probes unless you configure them separately
Period	10s	How often the probe runs
Failure threshold	3	Consecutive failures before action is taken
Timeout	5s	How long to wait for a response
Initial delay	10s	Seconds after container start before probing begins (the liveness probe defaults to 30s)
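These fields combine into a rough estimate of how long a failure can go unnoticed. The function below is a back-of-the-envelope sketch, not an official formula: it assumes the app breaks just after a passing probe, that every following probe runs to its full timeout, and that the timeout is no longer than the period.

```python
def worst_case_detection_seconds(
    period: float = 10,
    failure_threshold: int = 3,
    timeout: float = 5,
) -> float:
    """Rough upper bound on the time from failure to probe action."""
    # Up to `failure_threshold` periods pass before the last failing probe
    # starts, and that probe may itself take up to `timeout` to fail.
    return period * failure_threshold + timeout


print(worst_case_detection_seconds())  # 35
```

With the defaults in the table, a hung pod could keep receiving traffic for roughly half a minute; lowering the period or threshold shortens that window at the cost of more probe traffic and more sensitivity to brief slowdowns.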

Next steps

Zero-downtime deployments
Horizontal auto-scaling
Attach a persistent disk