kombify TechStack provides comprehensive monitoring for your infrastructure through built-in metrics, health checks, and integrations with popular observability tools.
Monitoring 2.0: TechStack now includes an embedded Prometheus TSDB with a PromQL-compatible API, enabling built-in metric storage and querying without external Prometheus. External Prometheus integration remains supported.
Built-in health checks
API health endpoint
curl http://localhost:5260/api/v1/health
Response:
{
  "status": "healthy",
  "version": "1.0.0",
  "uptime": "72h15m30s",
  "checks": {
    "database": "ok",
    "grpc": "ok",
    "workers": "ok"
  }
}
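As a rough sketch of how a script might consume this response, the Python helper below lists every entry under `checks` that is not `ok`. The function name `failing_checks` is hypothetical, and treating any value other than `"ok"` as a failure is an assumption, since only the all-healthy case is shown above. The `"degraded"` and `"timeout"` values in the sample are made up for illustration.

```python
import json


def failing_checks(payload: str) -> list[str]:
    """Return the names of checks whose value is not "ok" (assumed failure marker)."""
    body = json.loads(payload)
    checks = body.get("checks", {})
    return sorted(name for name, state in checks.items() if state != "ok")


# Sample modeled on the response above, with one check altered for illustration.
sample = (
    '{"status": "degraded", '
    '"checks": {"database": "ok", "grpc": "timeout", "workers": "ok"}}'
)
print(failing_checks(sample))
```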
Health probes
Stack exposes standard health probes (compatible with Kubernetes, Docker healthchecks, and load balancers):
| Endpoint | Purpose | When to use |
| --- | --- | --- |
| /health/live | Liveness probe | Is the process running? |
| /health/ready | Readiness probe | Can it handle traffic? |
| /health/startup | Startup probe | Has it started successfully? |
# Docker Compose healthcheck example
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:5260/health/ready"]
  interval: 30s
  timeout: 10s
  retries: 3
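Outside of Docker, the same retry behavior can be approximated in a deploy script. The sketch below assumes the readiness probe signals "ready" with a 2xx status, which the source does not state explicitly. `http_ready` and `wait_until_ready` are hypothetical helper names; the defaults mirror the `interval`, `timeout`, and `retries` values from the Compose example.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ready(url: str = "http://localhost:5260/health/ready", timeout: float = 10.0) -> bool:
    """True if the probe answers with a 2xx status (assumed meaning of 'ready')."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Covers connection failures, timeouts, and 4xx/5xx responses (HTTPError).
        return False


def wait_until_ready(probe: Callable[[], bool], retries: int = 3, interval: float = 30.0) -> bool:
    """Call probe up to `retries` times, sleeping `interval` seconds after each failure."""
    for attempt in range(retries):
        if probe():
            return True
        if attempt < retries - 1:
            time.sleep(interval)
    return False
```

A deploy script could then call `wait_until_ready(http_ready)` before routing traffic. Passing the probe in as a callable keeps the retry loop testable without a running server.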
Prometheus metrics
Stack exposes metrics in Prometheus format at /metrics:
curl http://localhost:5260/metrics
Key metrics
| Metric | Type | Description |
| --- | --- | --- |
| kombistack_api_requests_total | Counter | Total API requests |
| kombistack_api_request_duration_seconds | Histogram | Request latency |
| kombistack_workers_connected | Gauge | Connected agent count |
| kombistack_workers_healthy | Gauge | Healthy agent count |
| kombistack_jobs_total | Counter | Jobs by type and status |
| kombistack_jobs_duration_seconds | Histogram | Job execution time |
| kombistack_stacks_total | Gauge | Total stacks managed |
| kombistack_drift_detected_total | Counter | Drift detections |
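For quick checks without a Prometheus server, the text returned by /metrics can be parsed directly. The following is a minimal sketch that reads only unlabeled samples and skips everything else. It assumes the two worker gauges are exposed without labels, which the source does not confirm, and the sample text is illustrative rather than captured output. `parse_metrics` is a hypothetical helper name.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Parse unlabeled samples from Prometheus text exposition format."""
    samples: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and HELP/TYPE comments
            continue
        name, _, rest = line.partition(" ")
        if "{" in name:  # labeled series are ignored in this sketch
            continue
        samples[name] = float(rest.split()[0])  # first field after the name is the value
    return samples


# Illustrative scrape output (values are made up).
sample = """\
# TYPE kombistack_workers_connected gauge
kombistack_workers_connected 5
# TYPE kombistack_workers_healthy gauge
kombistack_workers_healthy 4
"""
metrics = parse_metrics(sample)
unhealthy = metrics["kombistack_workers_connected"] - metrics["kombistack_workers_healthy"]
print(unhealthy)
```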
Prometheus configuration
scrape_configs:
  - job_name: 'kombistack'
    static_configs:
      - targets: ['localhost:5260']
    metrics_path: /metrics
    scheme: http
Worker (node) monitoring
Each connected agent reports metrics about its node:
# Get worker status
curl http://localhost:5260/api/v1/workers
Response:
{
  "workers": [
    {
      "id": "worker_abc123",
      "name": "main-server",
      "status": "healthy",
      "last_heartbeat": "2026-02-03T10:30:00Z",
      "metrics": {
        "cpu_percent": 25.5,
        "memory_percent": 45.2,
        "disk_percent": 62.0,
        "containers_running": 12
      }
    }
  ]
}
Worker health states
| State | Description |
| --- | --- |
| healthy | All checks passing |
| degraded | Some non-critical issues |
| unhealthy | Critical issues detected |
| unreachable | No heartbeat received |
| pending | Awaiting approval |
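To illustrate how the worker list and these states might be consumed, the sketch below counts workers per state and flags any whose last heartbeat is older than a cutoff. The two-minute cutoff and the `summarize_workers` name are assumptions for illustration; the source does not state the heartbeat interval or when a worker becomes `unreachable`.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def summarize_workers(
    body: dict, now: datetime, max_age: timedelta = timedelta(minutes=2)
) -> tuple[Counter, list[str]]:
    """Count workers per status and list names whose heartbeat is older than max_age."""
    counts = Counter(w["status"] for w in body["workers"])
    stale = []
    for w in body["workers"]:
        # The sample timestamp ends in "Z"; swap it for an explicit UTC offset
        # so that datetime.fromisoformat also accepts it on older Python versions.
        beat = datetime.fromisoformat(w["last_heartbeat"].replace("Z", "+00:00"))
        if now - beat > max_age:
            stale.append(w["name"])
    return counts, stale


# Trimmed copy of the response shown above.
body = {
    "workers": [
        {
            "id": "worker_abc123",
            "name": "main-server",
            "status": "healthy",
            "last_heartbeat": "2026-02-03T10:30:00Z",
        }
    ]
}
now = datetime(2026, 2, 3, 10, 35, tzinfo=timezone.utc)  # five minutes after the heartbeat
counts, stale = summarize_workers(body, now)
print(dict(counts), stale)
```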
Grafana dashboards
Import our pre-built Grafana dashboards:
Stack Overview
Dashboard ID: kombistack-overview
Shows:
- API request rates
- Error rates
- Active workers
- Job queue status
Workers
Dashboard ID: kombistack-workers
Shows:
- CPU/memory per worker
- Container counts
- Network traffic
- Heartbeat status
Jobs
Dashboard ID: kombistack-jobs
Shows:
- Job success/failure rates
- Duration by type
- Queue depth
- Recent failures
Alerting
Example alert rules
groups:
  - name: kombistack
    rules:
      - alert: WorkerUnhealthy
        expr: kombistack_workers_healthy < kombistack_workers_connected
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Worker unhealthy"
      - alert: HighErrorRate
        expr: rate(kombistack_api_requests_total{status=~"5.."}[5m]) > 0.1
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "High API error rate"
      - alert: DriftDetected
        expr: increase(kombistack_drift_detected_total[1h]) > 0
        labels:
          severity: warning
        annotations:
          summary: "Configuration drift detected"
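The HighErrorRate threshold is easier to tune once it is translated into request counts: `rate(...[5m]) > 0.1` means more than 0.1 matching requests per second, or more than 30 over the five-minute window. The snippet below works through that arithmetic with made-up counter values. It is a deliberate simplification: Prometheus's `rate()` also handles counter resets and extrapolates to the window edges, and it evaluates each label combination as a separate series.

```python
def per_second_rate(first: float, last: float, window_seconds: float) -> float:
    """Simplified rate(): counter increase divided by the window length in seconds."""
    return (last - first) / window_seconds


# Hypothetical counter readings at the start and end of a 5-minute window.
rate = per_second_rate(first=1200, last=1236, window_seconds=300)  # 36 errors in 300 s
fires = rate > 0.1  # same comparison as the HighErrorRate expression
print(rate, fires)
```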
Log aggregation
Stack outputs structured JSON logs that can be collected by any log aggregator:
{
  "time": "2026-02-03T10:30:00Z",
  "level": "INFO",
  "msg": "Job completed",
  "job_id": "job_abc123",
  "job_type": "provision",
  "duration_ms": 45230,
  "worker_id": "worker_xyz789"
}
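Because each record is a single JSON object, ad hoc analysis needs little tooling. As a sketch, the helper below scans log lines for completed jobs that ran longer than a threshold. It assumes one JSON object per line and that `"Job completed"` is the exact message for finished jobs, as in the sample above. The 30-second threshold and the `slow_jobs` name are arbitrary choices for illustration.

```python
import json
from typing import Iterable, Iterator


def slow_jobs(lines: Iterable[str], threshold_ms: int = 30_000) -> Iterator[tuple[str, int]]:
    """Yield (job_id, duration_ms) for "Job completed" records slower than threshold_ms."""
    for raw in lines:
        try:
            rec = json.loads(raw)
        except json.JSONDecodeError:
            continue  # tolerate non-JSON lines mixed into the stream
        if not isinstance(rec, dict):
            continue
        if rec.get("msg") == "Job completed" and rec.get("duration_ms", 0) > threshold_ms:
            yield rec["job_id"], rec["duration_ms"]


sample_lines = [
    '{"time": "2026-02-03T10:30:00Z", "level": "INFO", "msg": "Job completed", '
    '"job_id": "job_abc123", "job_type": "provision", "duration_ms": 45230, '
    '"worker_id": "worker_xyz789"}',
    "plain text line that is not JSON",
]
print(list(slow_jobs(sample_lines)))
```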
Loki configuration
# Promtail scrape config: Promtail discovers the container and ships its logs to Loki
scrape_configs:
  - job_name: kombistack
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
    relabel_configs:
      - source_labels: ['__meta_docker_container_name']
        regex: '/kombistack'
        action: keep
Service monitoring
Monitor deployed services via Stack’s built-in checks:
# Get service health for a stack
curl http://localhost:5260/api/v1/stacks/{stack-id}/health
Response:
{
  "stack_id": "stack_abc123",
  "overall": "healthy",
  "services": [
    {
      "name": "traefik",
      "status": "healthy",
      "uptime": "72h",
      "health_check": {
        "url": "http://traefik:8080/ping",
        "status_code": 200,
        "latency_ms": 5
      }
    },
    {
      "name": "dokploy",
      "status": "healthy",
      "uptime": "71h55m"
    }
  ]
}
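One way to act on this response is to reduce it to a list of problems. The sketch below reports services that are not `healthy` or whose health check is slow. It treats `health_check` as optional, since the `dokploy` entry above has none. The 500 ms latency limit and the `service_problems` name are assumptions for illustration, as is the `"unhealthy"` status used in the second sample.

```python
def service_problems(body: dict, max_latency_ms: int = 500) -> list[str]:
    """Describe services that are not healthy or whose health check is slow."""
    problems = []
    for svc in body.get("services", []):
        if svc.get("status") != "healthy":
            problems.append(f'{svc["name"]}: status {svc.get("status")}')
        check = svc.get("health_check")  # optional: absent for dokploy in the sample
        if check and check.get("latency_ms", 0) > max_latency_ms:
            problems.append(f'{svc["name"]}: slow health check ({check["latency_ms"]} ms)')
    return problems


# Trimmed copy of the response shown above.
healthy_body = {
    "overall": "healthy",
    "services": [
        {"name": "traefik", "status": "healthy", "health_check": {"latency_ms": 5}},
        {"name": "dokploy", "status": "healthy"},
    ],
}
# Hypothetical failure case for illustration.
failing_body = {
    "overall": "unhealthy",
    "services": [
        {"name": "traefik", "status": "unhealthy", "health_check": {"latency_ms": 900}},
    ],
}
print(service_problems(healthy_body))
print(service_problems(failing_body))
```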
Monitoring stack deployment
Deploy a complete monitoring stack with Prometheus, Grafana, and Loki:
stackkit: base-kit
variant: default
services:
  # ... your services
monitoring:
  enabled: true
  prometheus:
    retention: 15d
    storage: 50Gi
  grafana:
    enabled: true
    admin_password: ${GRAFANA_PASSWORD}
  loki:
    enabled: true
    retention: 7d
Next steps
Troubleshooting: Debug common issues
Drift detection: Understand drift detection