System Architecture

This chapter maps the Cirrus CDN system architecture end-to-end, covering deployment topology, component responsibilities, runtime dependencies, and data flows between subsystems.

High-Level Architecture

Cirrus CDN is composed of four primary tiers:

  1. Control Plane – A FastAPI application (src/cirrus/app.py) exposing REST endpoints for configuration and automation.
  2. Data Plane – An OpenResty-based reverse proxy (openresty/) that serves end-user traffic, enforces caching, and surfaces metrics.
  3. Automation Fabric – Celery workers/beat processes (src/cirrus/celery_app.py) that perform ACME issuance, certificate renewals, and node health checks.
  4. Observability & Tooling – Prometheus, Grafana, logrotate, and supporting services defined in docker-compose.yml.

Redis acts as the central state store and pub/sub bus for all tiers.

            +-------------------+
            |     Frontend      |
            |   Next.js (SPA)   |
            +-------------------+
                      |
                      v
            +-------------------+
            |    FastAPI App    |
            |   (src/cirrus/)   |
            +-------------------+
                      |
                    Redis <------> Celery Workers/Beat
                      |
                      v
    +----------------------------------+
    |   OpenResty Edge (Data Plane)    |
    +----------------------------------+
                      |
                      v
              End-user Requests

Deployment Topology

docker-compose.yml defines the local orchestration of services. Key services:

  • redis: Primary data store with AOF persistence (services.redis).
  • api: FastAPI server built from Dockerfile stage 2, mounting ./src read-only and bundling the static Next.js export.
  • openresty: Custom image from openresty/Dockerfile, running on host networking to expose ports 8080/8443 directly.
  • worker & beat: Celery worker/beat commands running from the same Python image as the API.
  • caddy: Internal CA and ACME directory used for local issuance.
  • acmedns: ACME DNS-01 authoritative server.
  • nsd: Authoritative DNS slave receiving NOTIFY from the hidden master.
  • prometheus, grafana, logrotate, and fakedns: Operations support systems.

Services api, worker, beat, caddy, acmedns, and nsd participate in the custom acmetest bridge network with static IP assignments to simplify DNS/MX trust relationships.

Build & Runtime Artefacts

Python & Frontend Image (Dockerfile)

  • Stage 1 builds the Next.js frontend with pnpm build, leveraging cache mounts for .next artifacts.
  • Stage 2 (ghcr.io/astral-sh/uv:python3.13-trixie-slim) installs Python dependencies via uv sync (respecting uv.lock), copies the static export to /app/static, and sets up an entrypoint script (docker/entrypoint.sh) that fixes permissions and optionally copies CA certificates before invoking uvicorn.

OpenResty Image (openresty/Dockerfile)

  • Builder stage installs lua-resty-http, nginx-lua-prometheus, and compiles nginx_cache_multipurge.
  • Templater stage renders nginx.conf from openresty/conf/nginx.conf.j2 with build arguments such as NGX_HTTP_PORT and NGX_METRICS_ALLOW.
  • Runtime stage seeds a dummy TLS certificate, copies Lua scripts, and runs openresty in the foreground.

Prometheus Image (prometheus/Dockerfile)

Prometheus builds are configured to load either prometheus.dev.yml or prometheus.prod.yml, with a default scrape target at 127.0.0.1:9145 for OpenResty metrics.

Configuration & State Management

Redis is the single source of truth. Key namespaces include:

Key Pattern | Purpose
cdn:domains (set) | All managed domain names.
cdn:dom:{domain} | JSON encoded domain configuration (DomainConf).
cdn:nodes (set) | Known edge node IDs.
cdn:node:{id} | Hash containing node IPs, health counters, and active flag.
cdn:cert:{domain} | TLS assets (fullchain PEM, private key, issued timestamp).
cdn:acme:{domain} | ACME registration state, including acme-dns credentials.
cdn:acme:lock:{domain} / cdn:acme:task:{domain} | Concurrency locks for issuance tasks.
cdn:tokens, cdn:token:{id}, cdn:token_hash:{hash} | Service token registry and lookups.
cdn:acmeacct:global | Shared ACME account key material.
cdn:acmecertkey:{domain} | Stored certificate private keys for reuse across renewals.

Pub/sub channels include cdn:cname:dirty (for DNS zone rebuilds) and cdn:purge (cache invalidation).
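
The key patterns and channels above could be wrapped in a few small helpers. The following is a minimal sketch, not the project's actual code: the function names are hypothetical, the client is assumed to be a synchronous redis-py style object (the API itself uses redis.asyncio), and the lock TTL of 300 seconds is an arbitrary illustrative value.

```python
import json
import uuid

# Names taken from the key table and pub/sub list above.
DOMAINS_SET = "cdn:domains"
CHANNEL_CNAME_DIRTY = "cdn:cname:dirty"
CHANNEL_PURGE = "cdn:purge"


def domain_key(domain: str) -> str:
    return f"cdn:dom:{domain}"


def node_key(node_id: str) -> str:
    return f"cdn:node:{node_id}"


def cert_key(domain: str) -> str:
    return f"cdn:cert:{domain}"


def acme_lock_key(domain: str) -> str:
    return f"cdn:acme:lock:{domain}"


def save_domain(client, domain: str, conf: dict) -> None:
    """Store the JSON-encoded configuration and register the domain name.

    Hypothetical helper: `conf` stands in for a serialized DomainConf,
    whose real fields are not described in this chapter.
    """
    client.set(domain_key(domain), json.dumps(conf))
    client.sadd(DOMAINS_SET, domain)


def try_acquire_issuance_lock(client, domain: str, ttl_seconds: int = 300):
    """Return a lock token if the lock was acquired, otherwise None.

    Relies on SET with NX (only set if absent) and EX (expiry), so a
    crashed worker cannot hold the lock forever.
    """
    token = uuid.uuid4().hex
    acquired = client.set(acme_lock_key(domain), token, nx=True, ex=ttl_seconds)
    return token if acquired else None
```

Centralizing key construction in one module keeps the Python side and the Lua side easier to compare when a pattern changes.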

Data Flow Summary

  1. Domain Onboarding – Frontend POSTs to /api/v1/domains/{domain}; API persists configuration in Redis; DNS service rebuilds zones; OpenResty reads config on next request.
  2. Node Lifecycle – Operator submits list to /api/v1/nodes; Redis updates node hashes; DNS zone rebuild triggered; Celery health checks adjust active flags.
  3. Purge Flow – API publishes JSON message to cdn:purge; OpenResty subscriber issues PURGE requests to cache_multipurge.
  4. TLS Issuance – API enqueues Celery task; worker obtains/renews certificates, stores them in Redis; OpenResty loads them dynamically during SNI negotiation.
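
As an illustration of the purge flow (step 3), the publishing side might look like the sketch below. The payload shape (a "domain" string plus a "paths" list) and the function names are assumptions made for this example; the chapter only states that a JSON message is published to cdn:purge. The client is assumed to expose a redis-py style publish(channel, message) method.

```python
import json

PURGE_CHANNEL = "cdn:purge"


def build_purge_message(domain: str, paths: list[str]) -> str:
    """Serialize a purge request; the field names are hypothetical."""
    if not paths:
        raise ValueError("at least one path is required")
    return json.dumps({"domain": domain, "paths": paths}, sort_keys=True)


def publish_purge(client, domain: str, paths: list[str]) -> int:
    """Publish the message and return the subscriber count reported by Redis."""
    return client.publish(PURGE_CHANNEL, build_purge_message(domain, paths))
```

Because Redis pub/sub is fire-and-forget, a return value of 0 would mean that no edge subscriber was connected when the message was sent, which is worth logging on the API side.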

Dependencies & Integrations

  • Redis (redis.asyncio in Python, resty.redis in Lua) – configuration store, cache, and pub/sub.
  • Celery – asynchronous task execution with Redis as broker/backend.
  • acme-dns – handles DNS-01 validation for ACME issuance.
  • Caddy – hosts a local ACME CA; Celery workers trust it via CA bundle copying in docker/entrypoint.sh.
  • NSD – authoritative DNS slave receiving NOTIFY from HiddenMasterServer.
  • Prometheus & Grafana – metrics scraping and visualization.

Environments

The Docker composition targets local development; production deployments leverage Ansible playbooks referenced by the just deploy recipe (see ansible/). Environment variables (such as CNAME_BASE_DOMAIN, ACME_DIRECTORY, DNS_MASTER_PORT) must be tuned per environment. Chapter 11 enumerates these variables.
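
One way to make per-environment tuning explicit is to validate the variables at startup. The sketch below is illustrative only: the Settings class and load_settings function are hypothetical, and treating all three variables as required (with no defaults) is an assumption rather than documented behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED = ("CNAME_BASE_DOMAIN", "ACME_DIRECTORY", "DNS_MASTER_PORT")


@dataclass(frozen=True)
class Settings:
    cname_base_domain: str
    acme_directory: str
    dns_master_port: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the variables named above and fail fast if any are missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    port = int(env["DNS_MASTER_PORT"])
    if not 1 <= port <= 65535:
        raise ValueError(f"DNS_MASTER_PORT out of range: {port}")
    return Settings(
        cname_base_domain=env["CNAME_BASE_DOMAIN"],
        acme_directory=env["ACME_DIRECTORY"],
        dns_master_port=port,
    )
```

Passing a mapping instead of reading os.environ directly keeps the loader easy to exercise in tests for each target environment.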

Scalability Considerations

  • API server processes are stateless aside from an in-memory session cache; they can scale horizontally once the session mechanism is migrated to Redis (see Chapter 11).
  • OpenResty can scale via additional nodes registered through the /api/v1/nodes API; rendezvous hashing keeps per-domain assignment stable under churn.
  • DNS hidden master remains single-instance; NSD can scale out for regional redundancy.
  • Redis is a single point of failure; consider managed or clustered deployments for production workloads.
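
The rendezvous hashing mentioned for OpenResty node assignment can be sketched as follows. This is a generic illustration of the technique, not the project's implementation: the SHA-256 scoring, the "domain|node" input format, and the assign_nodes name are all assumptions.

```python
import hashlib


def _score(domain: str, node_id: str) -> int:
    """Deterministic pseudo-random weight for a (domain, node) pair."""
    digest = hashlib.sha256(f"{domain}|{node_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def assign_nodes(domain: str, node_ids: list[str], replicas: int = 1) -> list[str]:
    """Return the `replicas` highest-scoring nodes for a domain.

    Each node is scored independently, so adding or removing one node
    only affects the domains for which that node ranked in the top set.
    """
    if replicas < 1:
        raise ValueError("replicas must be >= 1")
    ranked = sorted(node_ids, key=lambda n: (_score(domain, n), n), reverse=True)
    return ranked[:replicas]
```

If the chosen node for a domain is removed, the domain falls through to its previous second-ranked node, while domains assigned to other nodes keep their assignment. That property is what keeps per-domain assignment stable under churn.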

The remainder of this white paper explores each tier in depth, starting with the control plane API in Chapter 3.