
Kubernetes Deployment

This guide covers deploying Octokraft on Kubernetes for production environments that need scaling, high availability, and operational maturity. This is the recommended approach for teams with 50+ developers or when uptime requirements are critical.

Prerequisites

  • Kubernetes 1.28+
  • Helm 3.x
  • kubectl configured for your cluster
  • PostgreSQL 16+ (managed or self-hosted)
  • Redis 7+ (managed or self-hosted)
  • FalkorDB instance
  • Temporal server
  • A registered GitHub App (see GitHub Integration)
  • A Clerk account for authentication
  • Access to an OpenAI-compatible AI model API
For production deployments, use managed database services (e.g., Amazon RDS, Cloud SQL, Azure Database) rather than running PostgreSQL and Redis inside the cluster.

Quick Start

1. Add the Octokraft Helm repository

helm repo add octokraft https://charts.octokraft.com
helm repo update

2. Create a namespace

kubectl create namespace octokraft

3. Create secrets

Store sensitive configuration in Kubernetes secrets:
kubectl create secret generic octokraft-config \
  --namespace octokraft \
  --from-literal=DATABASE_URL='postgres://user:pass@host:5432/octokraft?sslmode=require' \
  --from-literal=REDIS_URL='redis://host:6379' \
  --from-literal=SECRET_KEY='your-secret-key' \
  --from-literal=CLERK_SECRET_KEY='your-clerk-secret' \
  --from-literal=GITHUB_WEBHOOK_SECRET='your-webhook-secret' \
  --from-literal=LLM_OPENAI_LARGE_API_KEY='your-api-key' \
  --from-literal=LLM_OPENAI_SMALL_API_KEY='your-api-key'

kubectl create secret generic github-private-key \
  --namespace octokraft \
  --from-file=github-app.pem=/path/to/your/private-key.pem
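
SECRET_KEY and GITHUB_WEBHOOK_SECRET should be long random values. One way to generate one, assuming openssl is available on your machine:

```shell
# Print a 64-character hex string suitable for SECRET_KEY
openssl rand -hex 32
```

Generate a separate value for each secret rather than reusing one.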

4. Create a values file

Create values.yaml with your configuration. See the values reference below.
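
A minimal starting point only overrides the environment-specific settings; every value below (hostnames, Clerk issuer, GitHub App ID) is a placeholder to replace with your own. See the full reference below for all available options:

```yaml
# Minimal values.yaml: override only environment-specific settings
api:
  env:
    CLERK_JWT_ISSUER: https://your-clerk-instance.clerk.accounts.dev
    GITHUB_APP_ID: "123456"
    FRONTEND_URL: https://octokraft.yourcompany.com
    BACKEND_URL: https://api.octokraft.yourcompany.com
    CORS_ORIGINS: https://octokraft.yourcompany.com
```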

5. Install the chart

helm install octokraft octokraft/octokraft \
  --namespace octokraft \
  --values values.yaml

6. Run database migrations

kubectl exec -n octokraft deploy/octokraft-api -- octokraft migrate up

7. Verify the deployment

kubectl exec -n octokraft deploy/octokraft-api -- curl -s http://localhost:8080/healthz

values.yaml Reference

# Octokraft Helm chart values

global:
  image:
    repository: ghcr.io/octokraft/octokraft-api
    tag: latest
    pullPolicy: Always

# API Server
api:
  replicas: 2
  port: 8080
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
    limits:
      cpu: 2000m
      memory: 2Gi
  env:
    APP_ENV: production
    LOG_LEVEL: info
    TEMPORAL_ADDRESS: temporal.temporal:7233
    FALKORDB_HOST: falkordb.octokraft:6379
    CLERK_JWT_ISSUER: https://your-clerk-instance.clerk.accounts.dev
    GITHUB_APP_ID: "123456"
    GITHUB_PRIVATE_KEY_PATH: /keys/github-app.pem
    FRONTEND_URL: https://octokraft.yourcompany.com
    BACKEND_URL: https://api.octokraft.yourcompany.com
    CORS_ORIGINS: https://octokraft.yourcompany.com
    LLM_OPENAI_LARGE_PROVIDER: openai
    LLM_OPENAI_LARGE_MODEL: gpt-4o
    LLM_OPENAI_LARGE_BASE_URL: https://api.openai.com/v1
    LLM_OPENAI_SMALL_PROVIDER: openai
    LLM_OPENAI_SMALL_MODEL: gpt-4o-mini
    LLM_OPENAI_SMALL_BASE_URL: https://api.openai.com/v1
  envFromSecret: octokraft-config
  volumeMounts:
    - name: github-key
      mountPath: /keys
      readOnly: true
  volumes:
    - name: github-key
      secret:
        secretName: github-private-key
  service:
    type: ClusterIP
    port: 8080
  ingress:
    enabled: true
    className: nginx
    hosts:
      - host: api.octokraft.yourcompany.com
        paths:
          - path: /
            pathType: Prefix
    tls:
      - secretName: octokraft-api-tls
        hosts:
          - api.octokraft.yourcompany.com

# Analysis Workers
worker:
  replicas: 2
  resources:
    requests:
      cpu: 1000m
      memory: 1Gi
    limits:
      cpu: 4000m
      memory: 4Gi
  env:
    AGENT_EXECUTION_MODE: k8s
    K8S_AGENT_NAMESPACE: octokraft
  envFromSecret: octokraft-config
  volumeMounts:
    - name: github-key
      mountPath: /keys
      readOnly: true
  volumes:
    - name: github-key
      secret:
        secretName: github-private-key

# Frontend
frontend:
  replicas: 2
  image:
    repository: ghcr.io/octokraft/octokraft-frontend
    tag: latest
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 500m
      memory: 256Mi
  service:
    type: ClusterIP
    port: 80
  ingress:
    enabled: true
    className: nginx
    hosts:
      - host: octokraft.yourcompany.com
        paths:
          - path: /
            pathType: Prefix
    tls:
      - secretName: octokraft-frontend-tls
        hosts:
          - octokraft.yourcompany.com

# Monitoring
metrics:
  enabled: true
  serviceMonitor:
    enabled: true
    interval: 30s
    path: /metrics
    port: 8080

Scaling

Horizontal Scaling

The API server and workers scale independently. Workers are the primary scaling target — add more replicas to process analysis tasks faster.
# Scale workers for higher analysis throughput
kubectl scale deploy octokraft-worker -n octokraft --replicas=5

# Scale API for higher request throughput
kubectl scale deploy octokraft-api -n octokraft --replicas=4

Horizontal Pod Autoscaler

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: octokraft-worker
  namespace: octokraft
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: octokraft-worker
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Resource Sizing Guide

Team Size    | API Replicas | Worker Replicas | Worker CPU | Worker Memory
10-50 devs   | 2            | 2               | 1000m      | 1 Gi
50-200 devs  | 3            | 4               | 2000m      | 2 Gi
200+ devs    | 4+           | 6+              | 2000m      | 4 Gi
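
As an example, the 50-200 developer tier from the table maps onto the chart values like this (only the sizing-related fields are shown):

```yaml
api:
  replicas: 3
worker:
  replicas: 4
  resources:
    requests:
      cpu: 2000m
      memory: 2Gi
```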

High Availability

API Server

Run at least 2 replicas with pod anti-affinity to spread across nodes:
api:
  replicas: 2
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                app: octokraft-api
            topologyKey: kubernetes.io/hostname
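
Anti-affinity spreads replicas across nodes; a PodDisruptionBudget additionally keeps at least one API pod running during voluntary disruptions such as node drains and cluster upgrades. A minimal sketch, assuming the chart labels API pods with app: octokraft-api (adjust the selector to match your release's labels):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: octokraft-api
  namespace: octokraft
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: octokraft-api
```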

Workers

Workers are stateless and can tolerate pod restarts. Temporal automatically retries tasks assigned to workers that go down. Run at least 2 replicas in production.

Infrastructure Services

For high availability of infrastructure services:
  • PostgreSQL: Use a managed service with automated failover (Amazon RDS, Cloud SQL, Azure Database).
  • Redis: Use a managed service with replication (ElastiCache, Memorystore, Azure Cache).
  • Temporal: Deploy the Temporal server cluster with multiple history and matching service replicas.
  • FalkorDB: Run with persistent storage and regular backups.

Operations

Health Checks

The Helm chart configures liveness and readiness probes automatically. The health endpoints are:
Endpoint         | Purpose
/healthz         | Simple health check. Returns 200 if the service is running. Used for liveness probes.
/health/detailed | Component health check. Returns the status of each infrastructure dependency with circuit breaker state. Used for readiness probes.
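
The chart wires these endpoints into the pod spec for you; conceptually, the generated probes look something like this (a sketch of the idea, not the chart's exact output — timings may differ):

```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 15
readinessProbe:
  httpGet:
    path: /health/detailed
    port: 8080
  periodSeconds: 10
```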

Monitoring

Octokraft exposes Prometheus metrics at /metrics on port 8080. The Helm chart can create a ServiceMonitor for automatic Prometheus discovery. Key metrics to monitor:
  • HTTP request latency and error rates
  • Active Temporal workflow counts
  • Analysis task duration and failure rates
  • Database connection pool utilization
OpenTelemetry tracing is supported via standard OTEL_* environment variables. Enable it in your values:
api:
  env:
    OTEL_ENABLED: "true"
    OTEL_ENDPOINT: "otel-collector.monitoring:4317"

Logs

All components write structured JSON logs to stdout. Use your cluster’s log aggregation pipeline (Fluentd, Loki, CloudWatch Logs, etc.) to collect and search logs.
# View API logs
kubectl logs -n octokraft -l app=octokraft-api -f

# View worker logs
kubectl logs -n octokraft -l app=octokraft-worker -f

Upgrades

# Update Helm repository
helm repo update

# Upgrade to latest version
helm upgrade octokraft octokraft/octokraft \
  --namespace octokraft \
  --values values.yaml

# Run any new migrations after upgrade
kubectl exec -n octokraft deploy/octokraft-api -- octokraft migrate up

Backups

Back up the PostgreSQL database regularly using your managed service’s backup features or pg_dump. Redis and FalkorDB contain derived data that can be reconstructed from a fresh analysis run.
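
If you are not on a managed database service, a scheduled pg_dump can run in-cluster. A minimal sketch, assuming the postgres:16 image and the DATABASE_URL key from the octokraft-config secret created earlier; the PVC name is hypothetical and must be created separately:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: octokraft-db-backup
  namespace: octokraft
spec:
  schedule: "0 3 * * *"   # daily at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: pg-dump
              image: postgres:16
              command: ["/bin/sh", "-c"]
              args:
                - pg_dump "$DATABASE_URL" | gzip > /backup/octokraft-$(date +%F).sql.gz
              env:
                - name: DATABASE_URL
                  valueFrom:
                    secretKeyRef:
                      name: octokraft-config
                      key: DATABASE_URL
              volumeMounts:
                - name: backup
                  mountPath: /backup
          volumes:
            - name: backup
              persistentVolumeClaim:
                claimName: octokraft-backups   # hypothetical PVC to create beforehand
```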

Troubleshooting

Pods are crash-looping

Check the pod logs for the specific error:
kubectl logs -n octokraft deploy/octokraft-api --previous
The most common causes are missing environment variables (the application logs which variable is unset) or unreachable infrastructure services.

Pods are failing readiness checks

The readiness probe calls /health/detailed, which checks connectivity to all infrastructure services. Identify which component is unhealthy:
kubectl exec -n octokraft deploy/octokraft-api -- curl -s http://localhost:8080/health/detailed | jq .
The response includes the status of each dependency (PostgreSQL, Redis, FalkorDB, Temporal).

Workers are not picking up tasks

Verify workers are connected to Temporal and using the correct namespace:
kubectl logs -n octokraft -l app=octokraft-worker --tail=50
Confirm that TEMPORAL_ADDRESS and TEMPORAL_NAMESPACE match between the API and worker deployments.

The application is unreachable

Verify the ingress resource is created and has an address assigned:
kubectl get ingress -n octokraft
Confirm your DNS records point to the ingress controller’s external IP or load balancer. Check the ingress controller logs if requests are not reaching the Octokraft pods.

Analysis tasks are timing out

Analysis task duration depends on repository size and AI model response time. If tasks are timing out:
  1. Check AI model API latency — slow responses from the model provider are the most common cause.
  2. Scale workers to reduce queue depth.
  3. Verify workers have sufficient memory — large repositories require more memory during analysis.