Worker Deployment¶

The Floh worker is a standalone Node.js process that consumes BullMQ jobs without running an HTTP server. It shares the API server's codebase, database, and Redis instance.

Architecture¶

┌────────────────┐     ┌────────────┐     ┌────────────────┐
│  HTTP Server   │────>│   Redis    │<────│    Worker(s)   │
│ (Fastify API)  │     │  (BullMQ)  │     │ (job consumer) │
│ Enqueue only   │     └────────────┘     │ No HTTP        │
└────────────────┘            │           └────────────────┘
        │                     │                    │
        └─────────────────────┼────────────────────┘
                              ▼
                      ┌──────────────┐
                      │  PostgreSQL  │
                      └──────────────┘

Running the Worker¶

Development¶

pnpm --filter @floh/server exec tsx src/worker.ts

Production (Docker)¶

The docker-compose.yml includes a worker service:

worker:
  image: ghcr.io/floh/server:latest
  command: node dist/worker.js
  env_file: .env
  deploy:
    replicas: 2

Combined Mode (Default)¶

By default, the HTTP server runs its own BullMQ worker in the same process, so no dedicated worker container is required. This suits single-instance deployments and needs no configuration change.

Separate Mode¶

Set WORKER_MODE=separate on the HTTP server to disable its built-in worker. This is recommended when running dedicated worker containers:

# HTTP server (.env)
WORKER_MODE=separate

# Worker containers use the same .env but run dist/worker.js
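The mode switch can be pictured as a small startup check in the HTTP server. This is only a sketch: the function name `shouldStartEmbeddedWorker` is hypothetical, and it assumes that any value other than `separate` (including an unset variable) means combined mode, which matches the documented default but may not match the real parsing rules.

```typescript
type WorkerMode = "combined" | "separate";

type Env = Record<string, string | undefined>;

// Assumption: only the exact value "separate" disables the built-in worker.
function resolveWorkerMode(env: Env): WorkerMode {
  return env.WORKER_MODE === "separate" ? "separate" : "combined";
}

// Hypothetical helper the HTTP server could call during startup.
function shouldStartEmbeddedWorker(env: Env): boolean {
  return resolveWorkerMode(env) === "combined";
}
```

With this shape, the HTTP server would call `shouldStartEmbeddedWorker(process.env)` once at boot, while `dist/worker.js` always starts a worker regardless of the variable.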

Scaling¶

BullMQ distributes jobs across all connected workers automatically, so scaling is a matter of adding workers:

Increase the replicas count in Docker Compose or Kubernetes.
Each worker processes one job at a time by default, so the number of replicas is also the number of jobs processed in parallel.
Workers share no state; they only need access to Redis and PostgreSQL.
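As a rough capacity estimate, the points above reduce to simple arithmetic: parallel jobs equal replicas times per-worker concurrency. The sketch below uses hypothetical helper names and assumes every job takes about the same time, which real workloads rarely do, so treat the result as an order-of-magnitude guide.

```typescript
// Number of jobs that can run at once across all workers.
// concurrencyPerWorker defaults to 1, matching the default described above.
function parallelJobs(replicas: number, concurrencyPerWorker = 1): number {
  return replicas * concurrencyPerWorker;
}

// Rough time to clear a backlog, assuming uniform job duration.
function estimateDrainSeconds(
  waitingJobs: number,
  avgJobSeconds: number,
  replicas: number,
  concurrencyPerWorker = 1,
): number {
  const slots = parallelJobs(replicas, concurrencyPerWorker);
  if (slots <= 0) {
    throw new RangeError("at least one processing slot is required");
  }
  // Jobs are processed in "rounds" of `slots` jobs each.
  return Math.ceil(waitingJobs / slots) * avgJobSeconds;
}
```

For example, with the two replicas from the Compose snippet, a backlog of 120 jobs averaging 5 seconds each gives `estimateDrainSeconds(120, 5, 2)`, which is 300 seconds.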

Graceful Shutdown¶

On SIGTERM or SIGINT, the worker shuts down in this order:

Stops accepting new jobs
Waits for the currently running job to complete (up to 30 seconds)
Closes Redis and database connections
Exits the process

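Docker and Kubernetes send SIGTERM when stopping a container, then SIGKILL once the grace period expires. Set the grace period to at least 35 seconds so the 30-second job wait can finish: stop_grace_period in Docker Compose (default 10 seconds) or terminationGracePeriodSeconds in Kubernetes (default 30 seconds).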
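The "wait up to 30 seconds" step can be sketched as a race between the close operation and a timer. This is a minimal illustration, not the worker's actual implementation: `closeWithDeadline` is a hypothetical helper, and the commented usage assumes a BullMQ `Worker` instance named `worker`, whose `close()` method waits for active jobs to finish.

```typescript
type ShutdownOutcome = "clean" | "timeout";

// Resolves "clean" if close() settles within timeoutMs, "timeout" otherwise.
// A rejection from close() is propagated to the caller.
async function closeWithDeadline(
  close: () => Promise<void>,
  timeoutMs: number,
): Promise<ShutdownOutcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<ShutdownOutcome>((resolve) => {
    timer = setTimeout(() => resolve("timeout"), timeoutMs);
  });
  const closed = close().then((): ShutdownOutcome => "clean");
  try {
    return await Promise.race([closed, deadline]);
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    if (timer !== undefined) clearTimeout(timer);
  }
}

// Usage sketch inside the worker entry point (names are assumptions):
//
// for (const signal of ["SIGTERM", "SIGINT"] as const) {
//   process.once(signal, async () => {
//     const outcome = await closeWithDeadline(() => worker.close(), 30_000);
//     // ...close Redis and database connections here...
//     process.exit(outcome === "clean" ? 0 : 1);
//   });
// }
```

Keeping the deadline below the container grace period leaves a few seconds for closing connections before the runtime sends SIGKILL.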

Monitoring¶

Job Queue Health¶

Check the BullMQ queue status via Redis. Use --scan rather than KEYS, which blocks Redis while it walks the entire keyspace:

redis-cli --scan --pattern "bull:workflow-scheduler:*" | head -20

Failed Jobs¶

Failed jobs are retained for inspection (up to 5000). BullMQ keeps the IDs of failed jobs in a sorted set, so list them with ZRANGE rather than LRANGE:

redis-cli zrange "bull:workflow-scheduler:failed" 0 10
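The same inspection can be done from Node through BullMQ's `Queue.getFailed(start, end)`, which returns job objects rather than bare IDs. The sketch below is written against a minimal structural interface, so it runs without a Redis connection. The interface names and `describeFailedJobs` are hypothetical, and the assumption that a real `Queue` for `workflow-scheduler` satisfies this shape should be checked against the installed BullMQ version.

```typescript
// Minimal shapes mirroring the BullMQ fields this sketch reads.
interface FailedJobLike {
  id?: string;
  name: string;
  failedReason?: string;
  attemptsMade: number;
}

interface FailedJobSource {
  getFailed(start?: number, end?: number): Promise<FailedJobLike[]>;
}

// Returns one human-readable line per failed job, up to `limit` jobs.
async function describeFailedJobs(
  queue: FailedJobSource,
  limit = 10,
): Promise<string[]> {
  // Range bounds are inclusive, so `limit` jobs span 0..limit-1.
  const jobs = await queue.getFailed(0, limit - 1);
  return jobs.map(
    (job) =>
      `${job.id ?? "?"} ${job.name} attempts=${job.attemptsMade}: ` +
      `${job.failedReason ?? "unknown"}`,
  );
}

// Usage sketch (assumes bullmq is installed and Redis is reachable):
//
// import { Queue } from "bullmq";
// const queue = new Queue("workflow-scheduler", {
//   connection: { host: process.env.REDIS_HOST },
// });
// console.log((await describeFailedJobs(queue)).join("\n"));
```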

Worker Process Health¶

The worker logs to stdout. In Docker, use docker logs floh-worker-1 to check output.

For production monitoring, consider Bull Board for a web-based queue dashboard.

Configuration¶

The worker uses the same environment variables as the API server. Key settings:

Env var	Description
`DB_HOST`	PostgreSQL host
`REDIS_HOST`	Redis host
`STUCK_RUN_TIMEOUT_MINUTES`	Timeout for stuck run recovery (default: 30)
`AUDIT_CHECKPOINT_SCHEDULE`	Cron for audit checkpoints
`LOG_RETENTION_DAYS`	Not used by worker (server only)
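As an illustration of how a setting with a documented default might be read, the sketch below parses `STUCK_RUN_TIMEOUT_MINUTES` and falls back to 30 when it is unset. The function name is hypothetical, and rejecting non-positive or non-numeric values is an assumption about sensible validation, not a description of what the worker actually does.

```typescript
const DEFAULT_STUCK_RUN_TIMEOUT_MINUTES = 30; // default from the table above

// Hypothetical parser for STUCK_RUN_TIMEOUT_MINUTES.
function stuckRunTimeoutMinutes(
  env: Record<string, string | undefined>,
): number {
  const raw = env.STUCK_RUN_TIMEOUT_MINUTES;
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_STUCK_RUN_TIMEOUT_MINUTES;
  }
  const minutes = Number(raw);
  // Assumption: invalid values should fail fast at startup.
  if (!Number.isFinite(minutes) || minutes <= 0) {
    throw new Error(
      `STUCK_RUN_TIMEOUT_MINUTES must be a positive number, got "${raw}"`,
    );
  }
  return minutes;
}
```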