kombify TechStack provides comprehensive monitoring for your infrastructure through built-in metrics, health checks, and integrations with popular observability tools.
Monitoring 2.0: TechStack now includes an embedded Prometheus TSDB with a PromQL-compatible API, enabling built-in metric storage and querying without external Prometheus. External Prometheus integration remains supported.

Built-in health checks

API health endpoint

curl http://localhost:5260/api/v1/health
Response:
{
  "status": "healthy",
  "version": "1.0.0",
  "uptime": "72h15m30s",
  "checks": {
    "database": "ok",
    "grpc": "ok",
    "workers": "ok"
  }
}
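
To act on this response in a script, one option is to treat any check whose value is not "ok" as failing. The sketch below assumes the response shape shown above; the helper names `failing_checks` and `is_healthy` are hypothetical and not part of TechStack, and the "degraded" value in the sample is made up for illustration.

```python
import json


def failing_checks(payload: dict) -> list[str]:
    """Return the names of checks whose value is not "ok" (assumed convention)."""
    checks = payload.get("checks", {})
    return sorted(name for name, state in checks.items() if state != "ok")


def is_healthy(payload: dict) -> bool:
    """Healthy only if the top-level status says so and no check is failing."""
    return payload.get("status") == "healthy" and not failing_checks(payload)


# Body copied from the response above, with one check altered for illustration.
sample = json.loads("""
{
  "status": "healthy",
  "version": "1.0.0",
  "uptime": "72h15m30s",
  "checks": {"database": "ok", "grpc": "degraded", "workers": "ok"}
}
""")

print(failing_checks(sample))
print(is_healthy(sample))
```

Checking the individual entries as well as the top-level status guards against the two disagreeing, which this sketch treats as unhealthy.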

Health probes

Stack exposes standard health probes (compatible with Kubernetes, Docker healthchecks, and load balancers):
| Endpoint | Purpose | When to use |
| --- | --- | --- |
| /health/live | Liveness probe | Is the process running? |
| /health/ready | Readiness probe | Can it handle traffic? |
| /health/startup | Startup probe | Has it started successfully? |

# Docker Compose healthcheck example
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:5260/health/ready"]
  interval: 30s
  timeout: 10s
  retries: 3
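
Outside of Docker, a deploy script can poll the readiness probe in a similar way. The following is a minimal sketch that roughly mirrors the healthcheck settings above (three attempts, 30 s apart, 10 s timeout). It assumes the probe signals readiness with a 2xx status code; `http_probe` and `wait_until_ready` are hypothetical names.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

READY_URL = "http://localhost:5260/health/ready"  # readiness endpoint listed above


def http_probe(url: str = READY_URL, timeout: float = 10.0) -> bool:
    """Return True if the probe answers with a 2xx status (assumed success convention)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status all count as "not ready".
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    retries: int = 3,
    interval: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe up to `retries` times, sleeping `interval` seconds after each failure
    except the last one."""
    for attempt in range(retries):
        if probe():
            return True
        if attempt < retries - 1:
            sleep(interval)
    return False
```

A caller would run `wait_until_ready(http_probe)` and abort the deployment if it returns `False`. The probe and the sleep function are passed in so the retry logic can be exercised without a running server.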

Prometheus metrics

Stack exposes metrics in Prometheus format at /metrics:
curl http://localhost:5260/metrics

Key metrics

| Metric | Type | Description |
| --- | --- | --- |
| kombistack_api_requests_total | Counter | Total API requests |
| kombistack_api_request_duration_seconds | Histogram | Request latency |
| kombistack_workers_connected | Gauge | Connected agent count |
| kombistack_workers_healthy | Gauge | Healthy agent count |
| kombistack_jobs_total | Counter | Jobs by type and status |
| kombistack_jobs_duration_seconds | Histogram | Job execution time |
| kombistack_stacks_total | Gauge | Total stacks managed |
| kombistack_drift_detected_total | Counter | Drift detections |
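
For a quick check without a Prometheus server, the text returned by /metrics can be parsed directly. The sketch below is a deliberately simplified reader of the Prometheus text exposition format: it skips comment lines and ignores optional timestamps. The sample values are invented, and `parse_metrics` and `unhealthy_workers` are hypothetical helper names.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Map each sample's 'name{labels}' string to its float value.

    Simplified: '# HELP' / '# TYPE' lines are skipped and timestamps are dropped.
    """
    samples: dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Label values may contain spaces, so split at the closing brace.
            head, _, rest = line.rpartition("}")
            series, value = head + "}", rest.split()[0]
        else:
            parts = line.split()
            series, value = parts[0], parts[1]
        samples[series] = float(value)
    return samples


def unhealthy_workers(samples: dict[str, float]) -> float:
    """Connected workers minus healthy workers, using the gauges listed above."""
    return samples["kombistack_workers_connected"] - samples["kombistack_workers_healthy"]


# Invented sample output for illustration only.
SAMPLE = """\
# HELP kombistack_workers_connected Connected agent count
# TYPE kombistack_workers_connected gauge
kombistack_workers_connected 4
kombistack_workers_healthy 3
kombistack_api_requests_total{status="200"} 1520
kombistack_api_requests_total{status="500"} 12
"""

metrics = parse_metrics(SAMPLE)
print(unhealthy_workers(metrics))
```

For anything beyond ad hoc checks, a maintained parser or a Prometheus query is the safer choice, since this reader does not handle every corner of the format.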

Prometheus configuration

prometheus.yml
scrape_configs:
  - job_name: 'kombistack'
    static_configs:
      - targets: ['localhost:5260']
    metrics_path: /metrics
    scheme: http

Worker (node) monitoring

Each connected agent reports metrics about its node:
# Get worker status
curl http://localhost:5260/api/v1/workers
Response:
{
  "workers": [
    {
      "id": "worker_abc123",
      "name": "main-server",
      "status": "healthy",
      "last_heartbeat": "2026-02-03T10:30:00Z",
      "metrics": {
        "cpu_percent": 25.5,
        "memory_percent": 45.2,
        "disk_percent": 62.0,
        "containers_running": 12
      }
    }
  ]
}
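
The per-node metrics in this response can be compared against resource limits in a few lines. This is a sketch that assumes the response shape shown above; the thresholds in `LIMITS` and the function name `overloaded_workers` are hypothetical and should be adapted to your environment.

```python
# Hypothetical thresholds in percent; not defaults shipped with TechStack.
LIMITS = {"cpu_percent": 80.0, "memory_percent": 85.0, "disk_percent": 60.0}


def overloaded_workers(payload: dict, limits: dict[str, float]) -> dict[str, list[str]]:
    """Map worker name -> list of metric keys whose value exceeds its limit."""
    result: dict[str, list[str]] = {}
    for worker in payload.get("workers", []):
        metrics = worker.get("metrics", {})
        over = [key for key, limit in limits.items() if metrics.get(key, 0) > limit]
        if over:
            result[worker["name"]] = over
    return result


# Worker entry copied from the response above.
sample = {
    "workers": [
        {
            "id": "worker_abc123",
            "name": "main-server",
            "status": "healthy",
            "last_heartbeat": "2026-02-03T10:30:00Z",
            "metrics": {
                "cpu_percent": 25.5,
                "memory_percent": 45.2,
                "disk_percent": 62.0,
                "containers_running": 12,
            },
        }
    ]
}

print(overloaded_workers(sample, LIMITS))
```

With the sample above, only the disk limit is exceeded, even though the worker itself reports `healthy`.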

Worker health states

| State | Description |
| --- | --- |
| healthy | All checks passing |
| degraded | Some non-critical issues |
| unhealthy | Critical issues detected |
| unreachable | No heartbeat received |
| pending | Awaiting approval |

Grafana dashboards

Import our pre-built Grafana dashboards:
Dashboard ID: kombistack-overview
Shows:
  • API request rates
  • Error rates
  • Active workers
  • Job queue status

Alerting

Example alert rules

alerting-rules.yml
groups:
  - name: kombistack
    rules:
      - alert: WorkerUnhealthy
        expr: kombistack_workers_healthy < kombistack_workers_connected
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Worker unhealthy"
          
      - alert: HighErrorRate
        expr: rate(kombistack_api_requests_total{status=~"5.."}[5m]) > 0.1
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "High API error rate"
          
      - alert: DriftDetected
        expr: increase(kombistack_drift_detected_total[1h]) > 0
        labels:
          severity: warning
        annotations:
          summary: "Configuration drift detected"
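
To get a feel for the HighErrorRate threshold: `rate(...[5m]) > 0.1` means more than 0.1 matching requests per second, which is roughly 30 responses with a 5xx status within five minutes for a single series (0.1 × 300 s = 30). The sketch below reproduces that arithmetic from two counter samples. It is a simplification: Prometheus's own `rate()` also handles counter resets and extrapolates to the window boundaries, which this hypothetical `simple_rate` helper does not.

```python
def simple_rate(first: tuple[float, float], last: tuple[float, float]) -> float:
    """Per-second increase between two (timestamp_seconds, counter_value) samples."""
    (t0, v0), (t1, v1) = first, last
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    return (v1 - v0) / (t1 - t0)


THRESHOLD = 0.1  # requests per second, taken from the HighErrorRate rule above

# Invented 5xx counter samples taken 300 s apart: 120 errors, then 165 errors.
rate = simple_rate((0.0, 120.0), (300.0, 165.0))
print(rate, rate > THRESHOLD)
```

Here 45 additional errors over 300 s give 0.15 per second, so the condition holds; the alert fires only if it keeps holding for the rule's `for: 2m` duration.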

Log aggregation

Stack outputs structured JSON logs that can be collected by any log aggregator:
{
  "time": "2026-02-03T10:30:00Z",
  "level": "INFO",
  "msg": "Job completed",
  "job_id": "job_abc123",
  "job_type": "provision",
  "duration_ms": 45230,
  "worker_id": "worker_xyz789"
}
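
Because every entry is a JSON object, the logs can be filtered with ordinary JSON tooling before they reach an aggregator. The sketch below assumes one JSON object per line (the entry above is pretty-printed for readability) and uses the field names from that entry; `slow_jobs` is a hypothetical helper and the second log line is invented.

```python
import json
from typing import Iterable, Iterator


def slow_jobs(lines: Iterable[str], min_duration_ms: int) -> Iterator[dict]:
    """Yield "Job completed" entries that took at least min_duration_ms.

    Lines that are not JSON objects are skipped rather than raising.
    """
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(entry, dict):
            continue
        if entry.get("msg") == "Job completed" and entry.get("duration_ms", 0) >= min_duration_ms:
            yield entry


log_lines = [
    # Entry from the example above, collapsed onto one line.
    '{"time": "2026-02-03T10:30:00Z", "level": "INFO", "msg": "Job completed", '
    '"job_id": "job_abc123", "job_type": "provision", "duration_ms": 45230, '
    '"worker_id": "worker_xyz789"}',
    # Invented entry for a fast job.
    '{"time": "2026-02-03T10:31:00Z", "level": "INFO", "msg": "Job completed", '
    '"job_id": "job_def456", "job_type": "provision", "duration_ms": 800, '
    '"worker_id": "worker_xyz789"}',
    "plain text line that is not JSON",
]

for entry in slow_jobs(log_lines, min_duration_ms=30_000):
    print(entry["job_id"], entry["duration_ms"])
```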

Loki configuration

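Loki does not collect logs itself; the scrape configuration below is for Promtail, the agent that discovers the container and ships its logs to Loki.
promtail-config.yml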
scrape_configs:
  - job_name: kombistack
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
    relabel_configs:
      - source_labels: ['__meta_docker_container_name']
        regex: '/kombistack'
        action: keep

Service monitoring

Monitor deployed services via Stack’s built-in checks:
# Get service health for a stack
curl http://localhost:5260/api/v1/stacks/{stack-id}/health
Response:
{
  "stack_id": "stack_abc123",
  "overall": "healthy",
  "services": [
    {
      "name": "traefik",
      "status": "healthy",
      "uptime": "72h",
      "health_check": {
        "url": "http://traefik:8080/ping",
        "status_code": 200,
        "latency_ms": 5
      }
    },
    {
      "name": "dokploy",
      "status": "healthy",
      "uptime": "71h55m"
    }
  ]
}
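
A script consuming this response might list the services that need attention. The sketch below assumes the response shape shown above, in which `health_check` is optional per service; both helper names are hypothetical.

```python
def unhealthy_services(payload: dict) -> list[str]:
    """Names of services whose status is anything other than "healthy"."""
    return [s["name"] for s in payload.get("services", []) if s.get("status") != "healthy"]


def services_without_check(payload: dict) -> list[str]:
    """Names of services that report no health_check block."""
    return [s["name"] for s in payload.get("services", []) if "health_check" not in s]


# Payload copied from the response above.
sample = {
    "stack_id": "stack_abc123",
    "overall": "healthy",
    "services": [
        {
            "name": "traefik",
            "status": "healthy",
            "uptime": "72h",
            "health_check": {
                "url": "http://traefik:8080/ping",
                "status_code": 200,
                "latency_ms": 5,
            },
        },
        {"name": "dokploy", "status": "healthy", "uptime": "71h55m"},
    ],
}

print(unhealthy_services(sample))
print(services_without_check(sample))
```

Listing services without a `health_check` entry is useful because their `healthy` status is presumably based on less evidence than a service with an HTTP check.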

Monitoring stack deployment

Deploy a complete monitoring stack with Prometheus, Grafana, and Loki:
kombination.yaml
stackkit: base-kit
variant: default

services:
  # ... your services

monitoring:
  enabled: true
  prometheus:
    retention: 15d
    storage: 50Gi
  grafana:
    enabled: true
    admin_password: ${GRAFANA_PASSWORD}
  loki:
    enabled: true
    retention: 7d

Next steps

  • Troubleshooting: Debug common issues
  • Drift detection: Understand drift detection