Getting Started
From zero to a running app on your own AWS account

Cumulus is a deployment platform for Rust, Python, and static sites. You connect your own AWS account; Cumulus handles the complexity. Everything runs in your account — EC2 instances, Lambda functions, S3 buckets. You pay AWS directly; Cumulus charges a flat monthly fee.

🦀 Rust apps
Axum, Actix, Tower — no host toolchain needed. EC2 or Lambda.
🐍 Python apps
Django, FastAPI, Flask, bare handlers — uv-managed interpreter, pip/poetry/pipenv support. EC2 or Lambda.
🌐 Static sites
Next.js, Vite, Hugo, or any npm build. Synced to S3 with correct cache headers, optional CloudFront CDN.
🔒 Zero-trust deploys
Health-gated deploys with automatic rollback. Secrets live in SSM, never in build artifacts.

Prerequisites

  • A Cumulus account — sign in with GitHub at cumulus.collectif.dev.
  • An AWS account — your apps run in your own account. Connect it to Cumulus once after signing up:
    cumulus aws connect
    This walks you through creating a cross-account IAM role so Cumulus can deploy on your behalf — no long-lived credentials needed.
  • A KMS key in your AWS account — Cumulus uses it to encrypt your secrets and artifacts at rest. Create one once and note the ARN:
    aws kms create-key \
      --description "cumulus" \
      --query KeyMetadata.Arn --output text
    # → arn:aws:kms:us-east-1:123456789012:key/<uuid>
    
    aws kms create-alias \
      --alias-name alias/cumulus \
      --target-key-id <key-id>

Install the CLI

Download the latest release for your platform from your dashboard, or use the install script:

# macOS / Linux
curl -fsSL https://cumulus.collectif.dev/install.sh | sh

# Verify
cumulus --version
cumulus --help
Deploy Rust, Python, and static sites to your own AWS account

Usage: cumulus [OPTIONS] <COMMAND>

Commands:
  init      Scaffold a cumulus.toml interactively
  deploy    Deploy apps
  rollback  Roll back to a previous release
  logs      Stream an app's logs
  server    Manage EC2 servers
  env       Manage environment variables
  ...

Quick start

  • Log in — this opens your browser to sign in with GitHub:
    cumulus login
  • Scaffold a config in your project directory. Cumulus detects your runtime from project files and asks a few questions:
    cumulus init
    Scaffolding cumulus.toml — press Enter to accept the default.

      Project name [my-api]: my-api
      AWS region [us-east-1]:
      Detected runtime: rust (from Cargo.toml)
      Runtime (rust / python / static) [rust]:
      Target (ec2 / s3 / lambda) [ec2]:
      ...

    wrote cumulus.toml
  • Register the project with Cumulus — reads your cumulus.toml and creates the project and apps on the control plane:
    cumulus link
  • Deploy — see the guides below for each runtime and target.
Tip: Give your app a /health endpoint that returns 200 OK and declare it in [app.health_check]. With a health check configured, Cumulus probes that path after starting the new version and rolls back automatically if it doesn't respond with a 2xx. Without one, Cumulus waits a fixed interval and assumes success.

Axum API on EC2 (Rust, EC2)

Deploy an Axum (or any Rust binary) API to a managed EC2 server. Cumulus cross-compiles the binary in Docker, uploads it, and swaps the running service with zero downtime.

1. Provision a server (once)

Run this once to create the server. Cumulus handles everything — the instance, networking, and the agent that receives deploys.

cumulus server create prod-1 \
  --instance-type t4g.small   # Graviton, ~$12/mo
• launching i-0abc123def456…
• waiting for running…
server prod-1 is ready → 54.12.34.56

2. Configure cumulus.toml

[project]
name   = "my-saas"
region = "us-east-1"

[[app]]
name   = "api"
runtime = "rust"
target  = "ec2"
source  = "."
binary  = "api"          # the [[bin]] name in Cargo.toml
server  = "prod-1"      # the server you just created

[app.env]
DATABASE_URL = { secret = "/cumulus/api/DATABASE_URL" }
LOG_LEVEL    = "info"

3. Set secrets and deploy

cumulus env set api DATABASE_URL="postgres://user:pass@host/db" --secret
cumulus deploy api
→ deploying `api` (Rust → Ec2) [env: production]
  • server `prod-1` → 54.12.34.56 (i-0abc123def456)
  • building `api`…
  • built
  • artifact uploaded
  • agent deployed
  • process settled (no health check)
  • cut over
  ✓ deployed `api` to `prod-1`
Optional: health-gated deploys. Add [app.health_check] to probe a live readiness endpoint instead of waiting a fixed interval. If the probe returns non-2xx, Cumulus automatically rolls back to the previous release.
[app.health_check]
path      = "/health"
port      = 8080
timeout_s = 10

Django on EC2 (Python, EC2)

Django apps run under uvicorn (ASGI) or gunicorn (WSGI), managed by systemd. Dependencies are installed on-server with uv — no Python version needs to be installed on the host.

cumulus.toml

[project]
name   = "my-saas"
region = "us-east-1"

[[app]]
name    = "web"
runtime = "python"
target  = "ec2"
source  = "."
server  = "prod-1"

[app.python]
version = "3.12"
entry   = "myproject.asgi:application"  # Django ASGI entrypoint
server  = "uvicorn"
workers = 2

[app.env]
DJANGO_SETTINGS_MODULE = "myproject.settings.production"
SECRET_KEY             = { secret = "/cumulus/web/SECRET_KEY" }
DATABASE_URL           = { secret = "/cumulus/web/DATABASE_URL" }

[app.tasks]
migrate         = "python manage.py migrate --noinput"
collectstatic   = "python manage.py collectstatic --noinput"

# Run migrations before the new release goes live
release = ["migrate", "collectstatic"]

For WSGI under gunicorn, the [[app]] and [app.python] tables become:

[[app]]
name    = "web"
runtime = "python"
target  = "ec2"
source  = "."
server  = "prod-1"

[app.python]
version = "3.12"
entry   = "myproject.wsgi:application"
server  = "gunicorn"
workers = 4

Dependency management

Cumulus detects your dependency manager automatically:

  • uv.lock → uv
  • poetry.lock → poetry
  • Pipfile.lock → pipenv
  • requirements.txt → pip
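
The detection described above can be sketched in a few lines. This is only an illustration: it assumes the lockfiles are checked in the order listed and that the first match wins, which the list suggests but does not state. The function name is hypothetical.

```python
from pathlib import Path

# Lockfile -> tool, in the order the list above gives them.
# Assumption: first match wins; Cumulus's real precedence may differ.
LOCKFILES = [
    ("uv.lock", "uv"),
    ("poetry.lock", "poetry"),
    ("Pipfile.lock", "pipenv"),
    ("requirements.txt", "pip"),
]

def detect_dependency_manager(project_dir):
    """Return the tool for the first known lockfile found, or None."""
    root = Path(project_dir)
    for filename, tool in LOCKFILES:
        if (root / filename).is_file():
            return tool
    return None
```

If a project carries more than one of these files, this sketch picks the one earliest in the list, so it is worth deleting stale lockfiles rather than relying on precedence.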

Deploy

cumulus env set web SECRET_KEY="your-secret" --secret
cumulus env set web DATABASE_URL="postgres://..." --secret
cumulus deploy web
The Python interpreter is managed by uv on the server — Django and all dependencies are installed into an isolated venv per release. Your declared Python version is fetched and managed automatically; nothing needs to be installed on the host.

FastAPI on EC2 (Python, EC2)

cumulus.toml

[project]
name   = "my-api"
region = "us-east-1"

[[app]]
name    = "api"
runtime = "python"
target  = "ec2"
source  = "."
server  = "prod-1"

[app.python]
version = "3.12"
entry   = "main:app"     # `app = FastAPI()` in main.py
server  = "uvicorn"
workers = 2

[app.env]
DATABASE_URL = { secret = "/cumulus/api/DATABASE_URL" }
ENV          = "production"

Minimal FastAPI app

# main.py
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def root():
    return {"message": "Hello World"}

# requirements.txt (or pyproject.toml, or uv.lock)
fastapi
uvicorn[standard]

cumulus deploy api

FastAPI on Lambda (Python, Lambda)

Run FastAPI on Lambda via Mangum — the ASGI adapter for AWS Lambda. Cumulus packages your source and dependencies in a Lambda-compatible container and deploys them to a managed Python runtime.

cumulus.toml

[project]
name   = "my-api"
region = "us-east-1"

[[app]]
name    = "api"
runtime = "python"
target  = "lambda"
source  = "."

[app.lambda]
handler        = "main.handler"   # module.function
python_version = "3.12"
memory_mb      = 512
timeout_s      = 29
architecture   = "arm64"

[app.lambda.url]
enabled = true    # creates a public Function URL
auth    = "none"

[app.env]
DATABASE_URL = { secret = "/cumulus/api/DATABASE_URL" }

main.py

# main.py
from fastapi import FastAPI
from mangum import Mangum

app = FastAPI()

@app.get("/health")
def health():
    return {"status": "ok"}

@app.get("/items/{item_id}")
def read_item(item_id: int):
    return {"item_id": item_id}

handler = Mangum(app, lifespan="off")

# requirements.txt
fastapi
mangum

cumulus deploy api
→ deploying `api` (Python → Lambda) [env: production]
  • packaging Python 3.12…
  • deploying function `api`…
  ✓ deployed `api` (arn:aws:lambda:us-east-1:123456789012:function:api)
    version: 3
    function URL: https://abc123.lambda-url.us-east-1.on.aws/

Python handler on Lambda (Python, Lambda)

A bare def handler(event, context) function suits event-driven processing: cron jobs, SQS consumers, S3 triggers.

cumulus.toml

[project]
name   = "workers"
region = "us-east-1"

[[app]]
name    = "processor"
runtime = "python"
target  = "lambda"
source  = "."

[app.lambda]
handler        = "handler.process"   # handler.py → def process(event, context)
python_version = "3.12"
memory_mb      = 256
timeout_s      = 60
architecture   = "arm64"

[app.env]
QUEUE_URL = { secret = "/cumulus/processor/QUEUE_URL" }

handler.py

# handler.py
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def process(event, context):
    logger.info("Received %d records", len(event.get("Records", [])))
    for record in event.get("Records", []):
        body = json.loads(record["body"])
        # do the work…
    return {"statusCode": 200, "body": "ok"}
cumulus deploy processor
No Function URL or [app.lambda.url] needed for event-driven handlers — Lambda invokes them directly from the trigger (SQS, S3, EventBridge, etc.).

Multiple handlers, one project

Each handler is a separate [[app]] entry. They can share source or be in separate directories:

[[app]]
name    = "ingest"
runtime = "python"
target  = "lambda"
source  = "handlers/ingest"

[app.lambda]
handler  = "main.handler"
memory_mb = 128
timeout_s = 30

[[app]]
name    = "notify"
runtime = "python"
target  = "lambda"
source  = "handlers/notify"

[app.lambda]
handler  = "main.handler"
memory_mb = 128
timeout_s = 10
cumulus deploy          # deploy all apps
cumulus deploy ingest   # deploy one app

Rust on Lambda (Rust, Lambda)

Rust Lambda handlers use the lambda_runtime crate from the aws-lambda-rust-runtime project. Cumulus handles the build and packaging automatically, so no manual cross-compilation steps are needed.

Cargo.toml

[[bin]]
name = "handler"
path = "src/main.rs"

[dependencies]
lambda_runtime = "0.13"
serde          = { version = "1", features = ["derive"] }
tokio          = { version = "1", features = ["macros"] }

src/main.rs

// src/main.rs
use lambda_runtime::{run, service_fn, Error, LambdaEvent};
use serde::{Deserialize, Serialize};

#[derive(Deserialize)]
struct Request { name: String }

#[derive(Serialize)]
struct Response { message: String }

async fn function_handler(event: LambdaEvent<Request>) -> Result<Response, Error> {
    Ok(Response { message: format!("Hello, {}!", event.payload.name) })
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    run(service_fn(function_handler)).await
}

cumulus.toml

[project]
name   = "my-service"
region = "us-east-1"

[[app]]
name    = "handler"
runtime = "rust"
target  = "lambda"
source  = "."
binary  = "handler"         # the [[bin]] name

[app.lambda]
memory_mb    = 256
timeout_s    = 29
architecture = "arm64"      # Graviton — cheaper and faster for Rust

[app.lambda.url]
enabled = true
auth    = "none"
cumulus deploy handler
→ deploying `handler` (Rust → Lambda) [env: production]
  • building `handler`…
  • deploying function `handler`…
  ✓ deployed `handler` (arn:aws:lambda:...)
    version: 7
    function URL: https://xyz.lambda-url.us-east-1.on.aws/
Arm64 Graviton Lambda is ~20% cheaper and often faster than x86_64 for Rust — prefer it unless you have native extensions that require x86.

Static site on S3 (Static, S3)

Next.js, Vite, Hugo, Gatsby, or any build that outputs a directory of files. Cumulus syncs files to S3 with the right cache headers (HTML no-cache, hashed assets immutable), and optionally provisions a CloudFront distribution with an ACM certificate.

[project]
name   = "my-saas"
region = "us-east-1"

[[app]]
name    = "frontend"
runtime = "static"
target  = "s3"
source  = "."          # or "apps/frontend" in a monorepo

[app.s3]
bucket      = "my-saas-frontend"
kms_key_arn = "arn:aws:kms:us-east-1:123456789012:key/<uuid>"

[app.build]
command    = "npm run build"
output_dir = "out"         # Next.js static export output
node_version = "20"

[app.cdn]
enabled       = true
custom_domain = "app.example.com"
Next.js static export requires output: 'export' in next.config.js.

For a Vite build, [app.build] becomes:

[app.build]
command    = "npm run build"
output_dir = "dist"
node_version = "20"

For Hugo:

[app.build]
command    = "hugo --minify"
output_dir = "public"

Hugo isn't Node-based: run it locally and commit the output, or set up a CI step. The node_version field only applies to npm builds.

cumulus deploy frontend
→ deploying `frontend` (Static → S3) [env: production]
  ✓ uploaded 47, deleted 3 → s3://my-saas-frontend
  • provisioning custom domain app.example.com (ACM + CloudFront; first run takes a few minutes)…
  ✓ app.example.com → d3abc123.cloudfront.net (CloudFront E1ABC123)
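
The cache-header rule described at the top of this section (HTML no-cache, hashed assets immutable) can be pictured as a small mapping from object key to Cache-Control value. The sketch below is an assumption about how such a rule might look, not Cumulus's implementation: the fingerprint pattern, the exact header strings, and the fallback lifetime are all hypothetical.

```python
import re
from pathlib import PurePosixPath

# Assumption: a "hashed" asset has 8+ hex characters before its
# extension, e.g. app.3f9a1c2b.js or index-3f9a1c2b.css.
HASHED = re.compile(r"[.-][0-9a-f]{8,}\.[a-z0-9]+$")

def cache_control(key):
    """Pick a Cache-Control value for one uploaded object key."""
    name = PurePosixPath(key).name
    if name.endswith(".html"):
        # HTML is revalidated on every request so a new deploy shows up at once.
        return "no-cache"
    if HASHED.search(name):
        # The name changes whenever the content does, so cache it indefinitely.
        return "public, max-age=31536000, immutable"
    # Assumption: everything else gets a short, conservative lifetime.
    return "public, max-age=300"
```

The point of the split is that HTML keeps a stable URL and must be revalidated, while a fingerprinted asset gets a new URL on every change and can safely be cached for a year.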

Environment variables

Env vars are stored in AWS SSM Parameter Store under /cumulus/<app>/<KEY>. The agent fetches them at deploy time and writes a .env file (mode 0600) before starting the service. Secrets are encrypted with your KMS key (the customer managed key, or CMK, from the Prerequisites); they never appear in build artifacts.

# Set a plain value
cumulus env set api LOG_LEVEL=info

# Set an encrypted secret (uses your CMK)
cumulus env set api DATABASE_URL="postgres://user:pass@host/db" --secret
cumulus env set api SECRET_KEY="$(openssl rand -hex 32)" --secret

# List names (values are hidden)
cumulus env list api

# Pull all vars as KEY=VALUE (e.g. to populate a local .env)
cumulus env pull api > .env.production
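
To make the storage layout and the file permissions concrete, here is a minimal sketch of the two pieces mentioned above: building the parameter name and writing a .env file that only its owner can read. The function names are hypothetical and the quoting of values is deliberately naive; it is not the agent's actual code.

```python
import os

def ssm_path(app, key, prefix="cumulus"):
    """Build the parameter name, e.g. /cumulus/api/LOG_LEVEL."""
    return f"/{prefix}/{app}/{key}"

def write_env_file(path, values):
    """Write KEY=VALUE lines to `path`, readable by the owner only."""
    # Create the file with 0600 from the start so the secrets are never
    # briefly readable by other users, then enforce the mode in case
    # the file already existed with looser permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        for key in sorted(values):
            f.write(f"{key}={values[key]}\n")
    os.chmod(path, 0o600)
```

The same care applies to the output of cumulus env pull: the redirected file contains decrypted secrets, so keep it out of version control.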

Reference a secret in cumulus.toml

Use a { secret = "..." } reference to an SSM path instead of inline values:

[app.env]
LOG_LEVEL    = "info"                                # inline
DATABASE_URL = { secret = "/cumulus/api/DATABASE_URL" }  # from SSM
SECRET_KEY   = { secret = "/cumulus/api/SECRET_KEY" }

Secrets declared in [app.env] are read by the agent from SSM at deploy time. For Lambda apps, secrets are read and set as the function's environment variables at deploy.
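
As a rough picture of what resolving [app.env] involves, the sketch below walks a parsed table, passes inline strings through, and looks up { secret = "..." } references with a caller-supplied function. The names resolve_env and fetch_secret are hypothetical, and a dictionary stands in for SSM so the example runs offline.

```python
def resolve_env(env_table, fetch_secret):
    """Turn an [app.env] table into plain KEY -> value strings.

    Inline values pass through unchanged; {"secret": "<ssm path>"}
    entries are looked up with `fetch_secret`.
    """
    resolved = {}
    for key, value in env_table.items():
        if isinstance(value, dict) and "secret" in value:
            resolved[key] = fetch_secret(value["secret"])
        else:
            resolved[key] = str(value)
    return resolved

# Stand-in for SSM Parameter Store (hypothetical value).
fake_ssm = {"/cumulus/api/DATABASE_URL": "postgres://user:pass@host/db"}

env = resolve_env(
    {
        "LOG_LEVEL": "info",
        "DATABASE_URL": {"secret": "/cumulus/api/DATABASE_URL"},
    },
    fetch_secret=fake_ssm.__getitem__,
)
```

Against real AWS, fetch_secret would typically wrap the boto3 SSM client's get_parameter call with WithDecryption=True. A missing parameter should fail the deploy loudly rather than start the app with an empty value.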

Rollback

Cumulus keeps the last 5 releases on each server. You can roll back to the previous good release, or to a specific deployment id.

# List retained releases (newest first, current marked)
cumulus releases api

# Roll back to the previous good release
cumulus rollback api

# Roll back to a specific deployment
cumulus rollback api --to dpl-1718732400000
Automatic rollback: If the health check fails during a deploy, the previous binary is restored automatically — no manual intervention needed.
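
The retention and rollback rules above can be modelled with a bounded queue. This is a sketch under stated assumptions, not the agent's logic: the class name is hypothetical, "previous" means the release deployed just before the live one, and release health is not tracked.

```python
from collections import deque

class ReleaseHistory:
    """Keeps the newest `keep` releases and tracks which one is live."""

    def __init__(self, keep=5):
        self.retained = deque(maxlen=keep)  # newest first
        self.live = None

    def deploy(self, deployment_id):
        # appendleft on a full deque drops the oldest release.
        self.retained.appendleft(deployment_id)
        self.live = deployment_id

    def rollback(self, to=None):
        """Make an older retained release live and return its id."""
        ids = list(self.retained)
        if to is None:
            pos = ids.index(self.live) + 1
            if pos >= len(ids):
                raise LookupError("no older release retained")
            to = ids[pos]
        elif to not in ids:
            raise LookupError(f"{to} is not retained")
        self.live = to
        return to
```

The practical consequence is that a deployment id older than the retention window cannot be a rollback target, so check cumulus releases before passing --to.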

Lambda rollback

Lambda rollback points the live alias at the previous published version:

cumulus rollback handler             # previous version
cumulus rollback handler --to 5      # specific version number

Logs

# Stream live logs from an EC2 app (Ctrl-C to stop)
cumulus logs api
Jun 18 14:22:11 prod-1 api[1234]: INFO listening on 0.0.0.0:8080
Jun 18 14:22:43 prod-1 api[1234]: INFO GET /health 200 1ms
Jun 18 14:23:01 prod-1 api[1234]: INFO POST /api/users 201 45ms

# Check a server's resource usage
cumulus server health prod-1

Logs stream directly from journald on the server via the agent — no CloudWatch setup required.

Release tasks

Tasks are commands that run in your app's release context on the server — for database migrations, cache warmup, asset compilation, etc. Declare them in [app.tasks] and list the ones that should run before cutover in release.

[app.tasks]
migrate    = "python manage.py migrate --noinput"
seed       = "python manage.py seed_data"
check      = "python manage.py check --deploy"

# These run before the new release goes live (expand/contract pattern)
release = ["migrate"]

Run a task on demand

# Run a one-off task on the server (exits with the task's exit code)
cumulus run api seed
cumulus run api check
Migration safety: Database migrations run before the new binary starts but while the old binary is still serving traffic. Write backward-compatible migrations (expand/contract pattern) — the old binary must continue to work against the new schema during the health check window.
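
A small, self-contained illustration of the expand step may help. It uses an in-memory SQLite database as a stand-in for Postgres, and the table and function names are hypothetical; the idea carries over, though the exact DDL rules differ between databases.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

def old_release_insert(conn, name):
    # Query shape used by the release that is still serving traffic.
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))

def new_release_insert(conn, name, email):
    # Query shape used by the release being deployed.
    conn.execute("INSERT INTO users (name, email) VALUES (?, ?)", (name, email))

old_release_insert(db, "ada")

# Expand: add the column as nullable, so the old INSERT (which does
# not mention it) keeps working while both releases overlap.
db.execute("ALTER TABLE users ADD COLUMN email TEXT")

old_release_insert(db, "grace")                       # old code, new schema
new_release_insert(db, "linus", "linus@example.com")  # new code, new schema

rows = db.execute("SELECT name, email FROM users ORDER BY id").fetchall()
```

The contract step (making the column NOT NULL, or dropping a column the old release still reads) belongs in a later deploy, once no running release depends on the old shape.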

Restart in place

# Restart the service without redeploying (recover a hung process)
cumulus restart api

Multiple environments

Deploy the same app to separate servers for staging and production. Override the server per-environment in [app.environments.<name>].

[[app]]
name   = "api"
runtime = "rust"
target  = "ec2"
source  = "."
binary  = "api"
server  = "prod-1"         # default (production)

[app.environments.staging]
server = "staging-1"        # override for staging

# [app.health_check]             # optional — probe a live endpoint instead of a fixed wait
# path = "/health"
# port = 8080

# Deploy to staging
cumulus deploy api --env staging

# Deploy to production (default)
cumulus deploy api
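
One way to picture the override is as a shallow merge of the environment's table over the app's defaults. The sketch below assumes exactly that, plus "production" as the name of the default environment; both are assumptions, and the function name is hypothetical.

```python
def resolve_app(app, env="production"):
    """Return the app's settings with one environment's overrides applied."""
    settings = {k: v for k, v in app.items() if k != "environments"}
    overrides = app.get("environments", {}).get(env, {})
    settings.update(overrides)  # an override wins over the default
    return settings

# The [[app]] table above, as a TOML parser would hand it back.
api = {
    "name": "api",
    "runtime": "rust",
    "target": "ec2",
    "source": ".",
    "binary": "api",
    "server": "prod-1",
    "environments": {"staging": {"server": "staging-1"}},
}
```

Here resolve_app(api, "staging") yields the same app with server set to staging-1, while resolve_app(api) keeps prod-1. Everything not overridden, such as the binary name, is shared between environments.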

Custom domains & TLS

Coming soon — the CDN and ALB provisioners exist but aren't wired into cumulus deploy yet, so [app.cdn] / [app.tls] don't provision anything on their own today. The config below is the shape it will take; for now custom domains are set up manually.

Static site (S3 + CloudFront)

Add [app.cdn] to provision a CloudFront distribution with an ACM certificate. First run takes a few minutes while ACM validates the domain via DNS.

[app.cdn]
enabled       = true
custom_domain = "app.example.com"
hosted_zone_id = "Z1ABC123EXAMPLE"  # optional; auto-discovered if absent

EC2 app (ALB + ACM)

Add [app.tls] to provision an Application Load Balancer with a regional ACM certificate. The ALB spans at least two subnets (AZs) and forwards to the app's health-check port.

[app.tls]
enabled  = true
domain   = "api.example.com"
subnets  = ["subnet-az-a", "subnet-az-b"]  # ≥2 AZs, same VPC as the server
hosted_zone_id = "Z1ABC123EXAMPLE"           # optional
Both CDN and TLS provisioning are idempotent — re-running cumulus deploy finds and reuses an existing distribution or load balancer rather than creating a new one.

EC2 or Lambda?

Both targets run your code on AWS, but the operational model is fundamentally different. Picking the wrong one for your workload costs either money or operational headaches — usually both.

EC2
A server that runs continuously. You pay by the hour whether or not traffic is arriving. The process stays alive between requests, so in-memory state, background workers, WebSocket connections, and long-running jobs all work naturally.
Lambda
A function that wakes up per request and goes back to sleep. You pay per request and for execution time. Nothing held in memory is guaranteed to survive between calls, background threads do not keep running once the handler returns, and connections (database, cache) must be re-established on cold starts.
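
A common way to soften the reconnect cost is to create the connection lazily at module scope, so warm invocations reuse it and only a cold start pays for it. The sketch below uses SQLite as a stand-in for a real database client; the helper name is hypothetical.

```python
import sqlite3

_conn = None  # module scope survives between warm invocations

def get_connection():
    """Open the connection on first use, then reuse it."""
    global _conn
    if _conn is None:
        # Stand-in for a real database client; runs once per cold start.
        _conn = sqlite3.connect(":memory:")
    return _conn

def handler(event, context):
    conn = get_connection()
    (one,) = conn.execute("SELECT 1").fetchone()
    return {"statusCode": 200, "body": str(one)}
```

Reuse is an optimization, not a guarantee: the runtime can discard the environment at any time, so the handler still has to cope with a fresh or dropped connection.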

Choose Lambda when

  • Traffic is spiky or unpredictable. Lambda scales to zero between bursts — you pay nothing during quiet periods. An EC2 instance runs (and bills) continuously even at idle.
  • The workload is event-driven. Webhook receivers, S3 event handlers, scheduled jobs, and async processing are a natural fit. Each invocation is independent.
  • You want zero ops overhead. No server to patch, no instance type to size, no systemd to manage. Cumulus handles packaging and deployment; AWS handles the rest.
  • Requests finish in under ~30 seconds. The HTTP examples in this guide set timeout_s = 29, and Lambda itself has a hard 15-minute limit per invocation. Most HTTP APIs are fine; long-running batch jobs are not.

Choose EC2 when

  • You have long-running background workers. Celery workers, in-process schedulers, and daemons that hold a connection open need a process that is always running. Lambda can react to a queue message or a schedule, but it cannot host a persistent process.
  • You need persistent connections. WebSockets, Server-Sent Events, and long-polling require a stable process. Lambda invocations are short-lived.
  • Traffic is steady. At constant load, a single EC2 instance is almost always cheaper than the equivalent Lambda invocation volume. The crossover is typically around a few hundred requests per minute.
  • You are migrating an existing app. A Django, FastAPI, or Axum API already designed around persistent processes will run on EC2 with no code changes.
You do not have to choose one. A common pattern: run your API on EC2 (steady traffic, persistent DB connections) and use Lambda for background tasks like sending email, processing uploads, or firing webhooks. Declare both in the same cumulus.toml.

Cost reference (us-east-1, 2024)

Option             Approx. monthly cost      Best for
t4g.small EC2      ~$12                      Always-on API, workers, early-stage apps
t4g.medium EC2     ~$25                      API + worker on one box, more headroom
Lambda (512 MB)    ~$0 – $5 at low volume    Event handlers, webhooks, low-traffic APIs
Lambda (512 MB)    ~$15 – $60+ at scale      High-traffic APIs; compare against EC2 at this point
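
To see where the crossover sits for your own workload, a back-of-the-envelope calculation is enough. The sketch below assumes arm64 list prices for us-east-1 ($0.20 per million requests and $0.0000133334 per GB-second), ignores the free tier and data transfer, and uses hypothetical function names. Check the current AWS pricing page before relying on the numbers.

```python
# Assumed list prices (us-east-1, arm64); verify against AWS pricing.
PRICE_PER_REQUEST = 0.20 / 1_000_000   # USD per invocation
PRICE_PER_GB_SECOND = 0.0000133334     # USD per GB-second

def lambda_monthly_cost(requests_per_month, avg_duration_s, memory_mb=512):
    """Approximate monthly Lambda bill in USD, ignoring the free tier."""
    gb_seconds = requests_per_month * avg_duration_s * (memory_mb / 1024)
    return (requests_per_month * PRICE_PER_REQUEST
            + gb_seconds * PRICE_PER_GB_SECOND)

def break_even_requests(ec2_monthly_usd, avg_duration_s, memory_mb=512):
    """Requests per month at which Lambda costs as much as the instance."""
    per_request = lambda_monthly_cost(1, avg_duration_s, memory_mb)
    return ec2_monthly_usd / per_request

monthly = break_even_requests(12, avg_duration_s=0.1)
per_minute = monthly / (30 * 24 * 60)
print(f"break-even: {monthly:,.0f} requests/month (~{per_minute:,.0f}/min)")
```

With a 100 ms average duration at 512 MB, the break-even against a ~$12 instance lands at roughly 320 requests per minute. Slower handlers or more memory move the crossover lower; faster ones move it higher.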

Choosing a database

When you provision an EC2 server with --with-postgres, Cumulus installs PostgreSQL on the same instance as your application. It is the fastest way to get a database running, and it costs nothing beyond the instance you are already paying for. But it comes with trade-offs worth understanding before you have paying users.

Concern                      On-instance Postgres                    RDS
Monthly cost                 $0 extra                                ~$15 – $50 (db.t4g.micro – small)
CPU / memory                 Shared with the app                     Dedicated
Survives host replacement    No: DB dies with the EC2 instance       Yes: data is independent of the host
Data loss on host failure    Up to 24 h (daily backup)               Near-zero (continuous point-in-time recovery)
Automated backups            Daily pg_dump → S3, 30-day retention    Automated snapshots + PITR (AWS managed)
Automatic failover           No                                      Optional (Multi-AZ)

On-instance Postgres means the database and the app share the same failure domain. If the EC2 instance is terminated, corrupted, or needs to be replaced, you lose both the running process and the database. Daily backups limit data loss to at most 24 hours, but recovery still requires a restore step. If you cannot afford to lose a day of data, use RDS.

Our recommendation

Start with on-instance Postgres. It is free, zero-configuration, and right-sized for early-stage products where the operational overhead of RDS is not yet worth it. Cumulus takes a daily backup automatically — you are not flying blind.

Upgrade to RDS when any of these apply:

  • You have paying customers and data loss is unacceptable
  • The database has grown past a few gigabytes (disk and CPU contention with the app)
  • You want the database to survive a host replacement automatically
  • You need point-in-time recovery (restore to any second, not just the last daily backup)
Upgrading is a one-line change. Cumulus stores DATABASE_URL in SSM, not in your application code. Run cumulus db create to provision RDS, point DATABASE_URL at the new endpoint, and redeploy: no code changes, no downtime beyond the restart. Data already in the on-instance database has to be moved to the new one as a separate step.

Provision a server with a database

# On-instance Postgres — free, included in the server
cumulus server create prod-1 --instance-type t4g.small --with-postgres

# Separate RDS instance (recommended when data is business-critical)
cumulus server create prod-1 --instance-type t4g.small
cumulus db create prod-db \
  --subnet <subnet-az-a> --subnet <subnet-az-b> \
  --source-sg <server-sg-id> \
  --kms-key-arn <cmk-arn>

Why staging matters

Deploying directly to production on every push is a habit that works fine until it doesn't — and when it doesn't, it is usually at the worst possible time. A staging environment is the smallest change you can make to your workflow that catches the most problems before they reach users.

What staging catches that testing doesn't

  • Database migrations against real data shapes. A migration that passes on a small dev fixture can fail or corrupt a production table with millions of rows and years of accumulated oddities. Run it on staging first, against a recent production snapshot.
  • Environment variable mistakes. A missing secret or a wrong endpoint URL causes a silent failure that unit tests never see. Staging shares the same SSM-based config model as production — a deploy that fails there fails safely.
  • Integration behavior at real scale. Third-party APIs, payment providers, and email services behave differently under load and with production credentials. Staging is where you find that your Stripe webhook handler has a race condition.
  • The deploy process itself. A broken release task (a migration that never exits, a cache warmup that OOMs) stalls the deployment. Finding this on staging means the old binary keeps serving production traffic while you fix it.

Set it up in five minutes

Add a staging server and override it per-environment in cumulus.toml:

[[app]]
name   = "api"
server = "prod-1"          # default: production

[app.environments.staging]
server = "staging-1"       # smaller instance is fine

Then set your webhook to deploy pushes to staging, and promote manually to production:

# GitHub pushes to main land on staging automatically
cumulus project create --name my-saas --deploy-env staging

# When staging looks good, promote to production
cumulus app promote my-saas/api --from staging --to production
Promote re-deploys the same git sha — not a new build. The binary that passed health checks on staging is exactly what goes to production. No surprises from a second build.

Staging does not need to be expensive

A t4g.micro (~$6/mo) running a stripped-down version of your stack is enough to catch most issues. It does not need to handle production load — it needs to run your migration and start your binary. On-instance Postgres is fine for staging even if production uses RDS.

Health checks

Cumulus probes your app after each deploy and rolls back automatically if the check fails. But the value of a health endpoint goes far beyond deployment gating — a well-designed /health route is the single most useful signal you have about whether your app is actually working.

What a good health endpoint checks

Returning 200 OK unconditionally is better than nothing, but it only tells you the process started. A useful health check also verifies the things the process depends on:

# Minimal — proves the process is alive
GET /health → 200 OK

# Better — proves the app can actually serve requests
GET /health → 200 { "status": "ok", "db": "ok", "latency_ms": 3 }

# Failed — the process is up but cannot reach the database
GET /health → 503 { "status": "degraded", "db": "error: connection refused" }

A 503 during the deploy health-check window causes Cumulus to roll back — the old binary stays live. A 503 after cutover is how your monitoring knows something broke between deploys.

What to check

  • Database connectivity. Run a lightweight query — SELECT 1 is enough. A misconfigured DATABASE_URL, a migration that left the schema in a bad state, or an RDS failover all show up here first.
  • Required environment variables. If a missing secret would cause a panic or an unhandled error on the first real request, check for it at startup and surface it in /health.
  • External dependencies (with care). Checking a third-party API in /health means their outage causes your deploy to roll back. Only include dependencies that your app truly cannot function without, and set aggressive timeouts.
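
The first two checks can be combined in a framework-independent helper that returns a status code and a body. This is a sketch: the variable list, the function signature, and the body fields are assumptions chosen to match the example responses above.

```python
import os

REQUIRED_VARS = ("DATABASE_URL", "SECRET_KEY")  # hypothetical list

def health(db_check, environ=os.environ, required=REQUIRED_VARS):
    """Return (http_status, body) for a /health route."""
    body = {"status": "ok"}
    missing = [name for name in required if not environ.get(name)]
    if missing:
        body["missing_env"] = missing
    try:
        db_check()  # e.g. run SELECT 1; should raise on failure
        body["db"] = "ok"
    except Exception as exc:
        body["db"] = f"error: {exc}"
    if missing or body["db"] != "ok":
        body["status"] = "degraded"
        return 503, body
    return 200, body
```

Reporting only the names of missing variables keeps secret values out of the response. Because db_check is passed in, the same helper can sit behind any of the framework routes.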

Configure the health check

[app.health_check]
path      = "/health"
port      = 8080
timeout_s = 10    # per attempt; Cumulus retries for ~60 s before rolling back
Without a health check, Cumulus waits a fixed interval and assumes success. This means a binary that starts but immediately crashes will cut over before the crash is detected. Configure [app.health_check] for any app that connects to a database or external service.
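
The gating behaviour (a per-attempt timeout, retries for about a minute, then rollback) can be sketched as a retry loop around a probe. The function names and the two-second retry interval are assumptions; only the 10-second timeout and the ~60-second window come from the config above.

```python
import time
import urllib.error
import urllib.request

def http_probe(url, timeout_s):
    """True when `url` answers with a 2xx within `timeout_s`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and non-2xx responses.
        return False

def gate_deploy(probe, window_s=60, interval_s=2,
                clock=time.monotonic, sleep=time.sleep):
    """Retry `probe` until it passes or `window_s` elapses.

    Returns "cut over" on the first passing probe, "roll back" otherwise.
    """
    deadline = clock() + window_s
    while True:
        if probe():
            return "cut over"
        if clock() + interval_s > deadline:
            return "roll back"
        sleep(interval_s)
```

A call such as gate_deploy(lambda: http_probe("http://127.0.0.1:8080/health", timeout_s=10)) mirrors the config above. An app that needs longer than the window to become ready will be rolled back even though it would eventually have come up, so keep startup work short or make readiness independent of it.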

Quick implementation examples

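// src/routes/health.rs
use axum::{extract::State, http::StatusCode, response::IntoResponse, Json};
use serde_json::json;
use sqlx::PgPool;

// Returns 200 when the database answers, 503 otherwise.
async fn health(State(pool): State<PgPool>) -> impl IntoResponse {
    match sqlx::query("SELECT 1").execute(&pool).await {
        Ok(_)  => (StatusCode::OK, Json(json!({ "status": "ok" }))),
        Err(e) => (StatusCode::SERVICE_UNAVAILABLE,
                   Json(json!({ "status": "degraded", "db": e.to_string() }))),
    }
}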

# myapp/views.py
from django.http import JsonResponse
from django.db import connection

def health(request):
    try:
        # Opens a connection if none is open; raises if the database is unreachable.
        connection.ensure_connection()
        return JsonResponse({"status": "ok"})
    except Exception as e:
        return JsonResponse({"status": "degraded", "db": str(e)}, status=503)

# urls.py
from django.urls import path
from myapp import views

urlpatterns = [
    path("health", views.health),
]

# main.py
from fastapi import Depends, FastAPI
from fastapi.responses import JSONResponse
from sqlalchemy import text
from sqlalchemy.ext.asyncio import AsyncSession

from db import get_db  # assumption: your own dependency that yields an AsyncSession

app = FastAPI()

@app.get("/health")
async def health(db: AsyncSession = Depends(get_db)):
    try:
        await db.execute(text("SELECT 1"))
        return {"status": "ok"}
    except Exception as e:
        return JSONResponse({"status": "degraded", "db": str(e)}, status_code=503)

Connected mode (control plane)

In connected mode the CLI talks to a Cumulus control plane instead of AWS directly. The control plane manages builds, streaming logs, GitHub webhooks, and team access. It runs in your own AWS account.

  • Log in — opens your browser for GitHub sign-in (or pass --token for CI):
    cumulus login
  • Link this repo (reads cumulus.toml and registers the project and apps):
    cumulus link
  • Trigger a deploy via the control plane:
    # Deploy a specific git sha
    cumulus app deploy my-saas/api --sha "$(git rev-parse HEAD)"
    
    # Watch the deployment log stream
    cumulus app status my-saas/api
    cumulus app logs my-saas/api
  • Promote staging to production (re-deploys the last successful sha):
    cumulus app promote my-saas/api --from staging --to production

GitHub webhooks

Connect your repo so every push to main triggers an automatic deploy.

  • Set the webhook secret on the control plane:
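    WEBHOOK_SECRET="$(openssl rand -hex 32)"
    aws ssm put-parameter --type SecureString --key-id alias/cumulus \
      --name /cumulus/control/CUMULUS_GITHUB_WEBHOOK_SECRET \
      --value "$WEBHOOK_SECRET"
    echo "$WEBHOOK_SECRET"   # paste this into GitHub as the webhook secret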
  • Add the webhook in GitHub:
    • Payload URL: https://cumulus.example.com/webhooks/github
    • Content type: application/json
    • Secret: the value you set above
    • Events: Just the push event
  • Configure the project's branch and deploy target when creating it:
    cumulus project create --name my-saas --region us-east-1 \
      --repo github.com/yourorg/my-saas \
      --branch main \
      --deploy-env staging   # push → staging; promote to reach production
Use --deploy-env staging so pushes land on staging and you promote to production manually with cumulus app promote — a lightweight staging gate before production.
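
The shared secret matters because GitHub signs each delivery with it: the X-Hub-Signature-256 header carries an HMAC-SHA256 of the raw request body. The sketch below shows how a receiver can verify that header in general. It is not the control plane's code, and the function name is hypothetical.

```python
import hashlib
import hmac

def verify_github_signature(secret, body, signature_header):
    """Check GitHub's X-Hub-Signature-256 header against the raw body.

    `secret` is the webhook secret (str), `body` the raw request bytes.
    """
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    expected = ("sha256=" + digest).encode()
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected, signature_header.encode())
```

The signature has to be computed over the exact bytes received. Parsing the JSON and re-serializing it before hashing will produce a mismatch.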

Teams

# Create an organization
cumulus org create --name "My Company" --slug my-company

# Invite a team member (they receive a link to accept)
cumulus org invite-member <org-id> --email teammate@example.com --role member

# Accept your invitation
cumulus org accept <org-id>

# List members
cumulus org members <org-id>

Roles

Owner
Full access including billing, AWS connections, and org deletion.
Admin
Manage projects, apps, servers, and members. No billing.
Member
Deploy and manage apps.
Viewer
Read-only access — logs and status only.

CLI reference

Project & config

cumulus init                            # scaffold cumulus.toml interactively
cumulus validate                        # parse and validate cumulus.toml
cumulus apps                            # list declared apps
cumulus login                           # sign in via the browser (GitHub OAuth)
cumulus link                            # register project from cumulus.toml
cumulus aws connect                     # connect your AWS account

Servers (EC2)

cumulus server create [name]            # provision an EC2 server
cumulus server list                     # list managed servers
cumulus server health <name>           # show CPU/memory/disk

Deploy & operate

cumulus deploy [app] [--env <env>]     # deploy (all apps, or one)
cumulus rollback <app> [--to <id>]    # roll back a release
cumulus restart <app>                   # restart service in place
cumulus logs <app>                      # tail live logs
cumulus releases <app>                  # list retained releases
cumulus run <app> <task>              # run a declared task on the server

cumulus env set <app> KEY=VALUE [--secret]  # set an env var
cumulus env list <app>                      # list env var names
cumulus env pull <app>                      # print KEY=VALUE lines

Projects & apps

cumulus project list
cumulus project create --name ... --region ... --repo ... --branch ...

cumulus app create <project> --name ... --runtime ... --target ...
cumulus app deploy <project/app> --sha <sha>
cumulus app deployments <project/app>
cumulus app status <project/app>
cumulus app logs <project/app>
cumulus app releases <project/app>
cumulus app promote <project/app> --from <env> --to <env>
cumulus app env-set <project/app> KEY=VALUE [--secret]
cumulus app env-pull <project/app>
cumulus app run <project/app> <task>
cumulus app restart <project/app>

Team

cumulus org create --name ... --slug ...
cumulus org invite-member <org-id> --email ... --role ...
cumulus org members <org-id>
cumulus org accept <org-id>

cumulus.toml reference

[project]
name   = "my-saas"            # project name
region = "us-east-1"         # AWS region

[[app]]
name     = "api"              # app name (unique in project)
runtime  = "rust"             # rust | python | static
target   = "ec2"              # ec2 | lambda | s3
source   = "."                # path within the repo (monorepo: subdirectory)
binary   = "api"              # Rust: the [[bin]] name in Cargo.toml
server   = "prod-1"           # EC2: server name from `cumulus server create`
features = ["aws", "postgres"] # Rust: Cargo features to enable

# [app.health_check]            # optional — gates deploys on a live readiness probe
# path      = "/health"         # if omitted, Cumulus waits a fixed interval before cutover
# port      = 8080
# timeout_s = 10

[app.python]                  # python/ec2 only
version = "3.12"
entry   = "main:app"          # module:variable
server  = "uvicorn"           # uvicorn | gunicorn | none
workers = 2

[app.lambda]                  # rust/lambda or python/lambda
handler        = "main.handler" # Python: module.function; Rust: "bootstrap"
python_version = "3.12"       # python/lambda only
memory_mb      = 256           # default 512
timeout_s      = 29            # default 30
architecture   = "arm64"       # arm64 (default) | x86_64

[app.lambda.url]              # optional Function URL
enabled = true
auth    = "none"              # none | iam

[app.s3]                      # static/s3 only
bucket      = "my-site-bucket"
kms_key_arn = "arn:aws:kms:..."

[app.build]                   # static/s3 only
command      = "npm run build"
output_dir   = "out"
node_version = "20"

[app.cdn]                     # static/s3: CloudFront + ACM
enabled        = true
custom_domain  = "app.example.com"
hosted_zone_id = "Z1ABC123"  # optional; auto-discovered

[app.tls]                     # EC2: ALB + ACM
enabled        = true
domain         = "api.example.com"
subnets        = ["subnet-az-a", "subnet-az-b"]
hosted_zone_id = "Z1ABC123"

[app.env]                     # environment variables
LOG_LEVEL    = "info"         # inline value
DATABASE_URL = { secret = "/cumulus/api/DATABASE_URL" }  # SSM SecureString

[app.tasks]                   # on-demand tasks (run with `cumulus run`)
migrate = "python manage.py migrate --noinput"
shell   = "python manage.py shell"

release = ["migrate"]         # tasks run before cutover on each deploy

[app.environments.staging]   # per-environment overrides
server = "staging-1"

Environment variables

Variable                   Purpose
AWS_PROFILE                AWS credential profile
AWS_REGION                 AWS region (overrides cumulus.toml)
CUMULUS_SSM_PREFIX         SSM path prefix (default: cumulus)
CUMULUS_LAMBDA_ROLE_ARN    Lambda execution role ARN (auto-created if absent)
CUMULUS_HOME               Override config directory (default: ~/.cumulus)