founderyos-api — Architecture & Operations
Complementary views: System Topology places founderyos-api in the three-machine picture; oracle-bridge VPS covers the DAO-side backend that founderyos-api exchanges cross-domain tokens with; Secret Hygiene details the Pattern A k8s secretKeyRef flow founderyos-api is migrating onto.
What founderyos-api is, and what it is not
founderyos-api is the Python/FastAPI backend for the FounderyOS SaaS product (founderyos.dev / staging.founderyos.dev), the off-chain founder-operations platform owned by Graydon. It is NOT the Think Tank canister (on IC mainnet at mrhyf-jqaaa-aaaab-qgpra-cai, the DAO's AI productivity tool, formerly branded "FounderyOS Suite"). The two share a name ancestor and a cross-domain auth bridge but are operationally independent: different repos, different owners, different hosts, different auth stores. As a rule of thumb, work that touches DOM tokens belongs to Think Tank; work that touches workspaces, CRM, or document editor sessions belongs to founderyos-api.
founderyos-api runs on AX42-U's k3s cluster in the founderyos namespace as a 2-replica Deployment behind a ClusterIP Service. Public entry points: Traefik API gateway at apis.helloworlddao.com / staging-apis.helloworlddao.com (path-routed /fos/*) and direct hostnames founderyos.dev / staging.founderyos.dev. The founderyos-dashboard React SPA is the primary upstream caller.
Deployment topology
graph TB
subgraph cf["Cloudflare DNS (grey-cloud, DNS-only)"]
dns1["staging.founderyos.dev<br/>founderyos.dev"]
dns2["apis.helloworlddao.com<br/>staging-apis.helloworlddao.com"]
end
subgraph ax["AX42-U (157.180.13.84) — k3s"]
subgraph ksys["kube-system"]
traefik["Traefik :80/:443<br/>Let's Encrypt HTTP-01"]
end
subgraph plat["ns: platform"]
xns["founderyos-api-xns<br/>ExternalName shim"]
tokz["token-authz ForwardAuth"]
pg["payment-gateway :3200"]
end
subgraph fos["ns: founderyos"]
fa["founderyos-api :8000<br/>(Deployment, replicas=2)<br/>strategy.maxUnavailable=0<br/>PodDisruptionBudget minAvailable=1"]
nsv["notification-service :3100"]
redis["redis :6379<br/>(cache, pubsub, session store)"]
end
end
subgraph vps["oracle-bridge VPS (10.0.0.2)"]
ob["oracle-bridge :8787 / :8788"]
end
subgraph ext["External managed services"]
neon["Neon Postgres<br/>(founderyos DB — separate from oracle-bridge)"]
anth["Anthropic API<br/>(Claude)"]
ollama["Ollama (Theo)<br/>192.168.2.159:31434"]
stripe["Stripe<br/>(legacy /billing routes)"]
ghoauth["GitHub OAuth"]
ggoauth["Google OAuth"]
resend["Resend (via notification-service)"]
end
dns1 --> traefik
dns2 --> traefik
traefik -->|/fos/* strip prefix| xns
xns --> fa
fa --> redis
fa --> neon
fa -->|Bearer TOKEN_NOTIFICATION_SERVICE| nsv
fa -->|Bearer PAYMENT_GATEWAY_TOKEN| pg
fa -->|Bearer CROSS_DOMAIN_SERVICE_TOKEN| ob
fa --> anth
fa --> ollama
fa --> stripe
fa --> ghoauth
fa --> ggoauth
nsv --> resend
The pod runs with runAsNonRoot, readOnlyRootFilesystem, and allowPrivilegeEscalation: false. The replicas=2 + strategy.maxUnavailable=0 + PodDisruptionBudget minAvailable=1 combination (BL-233) was installed after regression login fixtures repeatedly hit Traefik's "no available server" 503 during single-pod rollouts. Liveness uses /health/live (dependency-free) so a slow DB/Redis check can't restart the pod; readiness uses /health (DB + Redis) so Traefik won't route to a pod whose upstream store is offline.
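The liveness/readiness split described above can be sketched in plain Python. This is a minimal illustration, not the service's actual handlers: the function names, the response bodies, and the 503 status for a failed readiness check are assumptions.

```python
from typing import Callable, Dict, Tuple


def liveness() -> Tuple[int, dict]:
    """Dependency-free probe: answers as long as the process can respond."""
    return 200, {"status": "alive"}


def readiness(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, dict]:
    """Run each dependency check; any failure or exception marks the pod unready."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A crashing check counts as a failed dependency, not a crashed probe.
            results[name] = False
    ok = all(results.values())
    # 503 is an assumed status code for "do not route traffic here".
    return (200 if ok else 503), {"status": "ready" if ok else "unavailable", "checks": results}
```

Because only readiness consults the stores, a slow database removes the pod from rotation without triggering a restart.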
Image: ghcr.io/hello-world-co-op/founderyos-api:sha-<shortsha> (pinned per rollout — BL-203 avoids :staging-tag race). Port 8000 internal, exposed via ClusterIP Service. Traefik reaches the Service cross-namespace through the founderyos-api-xns ExternalName shim in platform (PLATFORM-006.3) because Traefik Ingress cannot target a Service in a different namespace directly.
API gateway routing
At the platform edge, Traefik strips /fos and forwards to founderyos-api:
| Host | Path | Backend | Notes |
|---|---|---|---|
| apis.helloworlddao.com | /fos/* | founderyos-api-xns → founderyos-api.founderyos.svc.cluster.local:8000 | Production (prefix stripped) |
| staging-apis.helloworlddao.com | /fos/* | same shim, staging env | Staging |
| founderyos.dev | /api/v1/* | direct to founderyos-api via nginx on founderyos-dashboard container | Legacy direct-host path; retained for SPA calls that do not route through the gateway |
| staging.founderyos.dev | /api/v1/* | same, staging | Same |
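The gateway's prefix stripping can be illustrated with a small path-rewrite function. This is a sketch of the behaviour the table describes, not Traefik's middleware code; `strip_gateway_prefix` is a hypothetical helper, and mapping a bare `/fos` to `/` is an assumption.

```python
from typing import Optional


def strip_gateway_prefix(path: str, prefix: str = "/fos") -> Optional[str]:
    """Return the path forwarded to founderyos-api, or None if the route does not match."""
    if path == prefix:
        return "/"
    if path.startswith(prefix + "/"):
        # Keep the slash that follows the prefix so the result stays absolute.
        return path[len(prefix):]
    return None
```

Matching on `prefix + "/"` rather than a bare `startswith(prefix)` keeps unrelated paths such as `/fosters` from being routed to the service.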
Internally the router tree has 50+ mounted routers (auth, invitations, users, tasks, agents, documents, events, billing, contacts, deals, workspaces, chat, workflows, terminal, activity_logs, kpi, fleet, intelligence, organizations, templates, webhooks, …). The full set lives in routers/ and is imported in main.py. The OpenAPI spec is served at /docs, /redoc, /api/v1/docs, /api/v1/redoc, and /api/v1/openapi.json.
Authentication model
founderyos-api owns its own session authority — distinct from oracle-bridge's DAO-side store. Passwords hash with Argon2id (OWASP params: memory_cost=65536, time_cost=3, parallelism=4) via passlib.CryptContext in core/security.py. Legacy bcrypt hashes continue to verify and are transparently rehashed on next login.
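Argon2 hashes carry their cost parameters in the encoded string (`$argon2id$v=19$m=...,t=...,p=...$salt$hash`), so parameter parity with the values above can be checked without hashing anything. The sketch below uses only the standard library; `argon2_params` and `params_match` are hypothetical helpers, and the sample strings in the usage note are placeholders, not real hashes.

```python
# Expected Argon2id cost parameters, as stated for core/security.py.
EXPECTED = {"m": 65536, "t": 3, "p": 4}


def argon2_params(encoded: str) -> dict:
    """Extract m (memory KiB), t (iterations), p (lanes) from an encoded Argon2id hash."""
    fields = encoded.split("$")[1:]  # drop the empty string before the first "$"
    if not fields or fields[0] != "argon2id":
        raise ValueError("not an Argon2id hash")
    param_field = next((f for f in fields if f.startswith("m=")), None)
    if param_field is None:
        raise ValueError("missing parameter field")
    return {key: int(value) for key, value in (item.split("=") for item in param_field.split(","))}


def params_match(encoded: str) -> bool:
    """True when the hash was produced with the expected cost parameters."""
    return argon2_params(encoded) == EXPECTED
```

For example, `params_match("$argon2id$v=19$m=65536,t=3,p=4$c2FsdA$aGFzaA")` is true, while a bcrypt-style string raises `ValueError`.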
Sessions are JWT access + refresh token pairs (HS256, access=30min, refresh=7d) in httpOnly cookies. JWT_SECRET_KEY rotation invalidates every in-flight session; production rotations need an announce window. CSRF middleware (X-Requested-With custom header + origin check) runs after CORS.
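To make the token-pair mechanics concrete, here is a standard-library sketch of HS256 signing and verification with the stated lifetimes. It is not the code in core/security.py; the claim names (`sub`, `type`, `iat`, `exp`) and function names are assumptions chosen for illustration.

```python
import base64
import hashlib
import hmac
import json
import time

ACCESS_TTL = 30 * 60            # access token: 30 minutes
REFRESH_TTL = 7 * 24 * 60 * 60  # refresh token: 7 days


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()


def mint(sub: str, kind: str, ttl: int, secret: str, now=None) -> str:
    """Build a compact HS256 JWT; claim names are illustrative."""
    issued = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": sub, "type": kind, "iat": issued, "exp": issued + ttl}
    parts = [_b64url(json.dumps(part, separators=(",", ":")).encode()) for part in (header, payload)]
    signing_input = ".".join(parts)
    return signing_input + "." + _b64url(_sign(signing_input, secret))


def mint_pair(sub: str, secret: str, now=None):
    return mint(sub, "access", ACCESS_TTL, secret, now), mint(sub, "refresh", REFRESH_TTL, secret, now)


def verify(token: str, secret: str, now=None) -> dict:
    """Check algorithm, signature, and expiry; raise ValueError on any failure."""
    try:
        head, body, sig = token.split(".")
    except ValueError:
        raise ValueError("malformed token")
    if json.loads(_b64url_decode(head)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    if not hmac.compare_digest(_sign(head + "." + body, secret), _b64url_decode(sig)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(body))
    current = int(time.time()) if now is None else now
    if current >= claims["exp"]:
        raise ValueError("expired")
    return claims
```

A token minted under one secret fails `verify` under another, which is why rotating JWT_SECRET_KEY ends every in-flight session at once.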
Surface auth matrix:
| Surface | Mechanism | Notes |
|---|---|---|
| POST /api/v1/auth/register | email + password + invitation token | Invite-only: founders/admins mint invitations first |
| POST /api/v1/auth/login | email + password (Argon2id) | 5/min rate limit via slowapi |
| OAuth (/api/v1/oauth/google, /api/v1/oauth/github) | OAuth 2.0 PKCE | GOOGLE_CLIENT_ID/GITHUB_CLIENT_ID + secret; new users auto-registered |
| 2FA (/api/v1/2fa/*) | TOTP + 10 backup codes | Required for roles above member at admin discretion |
| Cross-domain (POST /api/v1/auth/cross-domain-login) | oracle-bridge-minted one-time token | PLATFORM-003.1: 64-hex token exchanged server-to-server, bound to an entry_product |
| API tokens (/api/v1/tokens/*, /api/v1/auth/tokens) | hashed bearer, raw value shown once | For CLI + external automation |
| Admin-elevated (/api/v1/admin/*) | FOUNDER/ADMIN role + optional admin-session escalation | Admin sessions time-boxed separately from base JWT |
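The 5/min login limit in the matrix is enforced by slowapi in the service. The sketch below is a stand-in that shows one way such a limit can behave (a sliding window keyed per client); it is not slowapi's algorithm or API, and `SlidingWindowLimiter` is a hypothetical class.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` accepted requests per key within any `window` seconds."""

    def __init__(self, limit: int = 5, window: float = 60.0):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits[key]
        # Forget accepted requests that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # rejected requests are not recorded
        hits.append(now)
        return True
```

Keying by client address versus by submitted email is a design choice this sketch leaves open; the source does not say which the service uses.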
Three canonical regression test accounts are seeded in staging (BL-275): regression-founder@founderyos.dev, regression-admin@founderyos.dev, regression-member@founderyos.dev. Seeding happens out-of-band via hello-world-workspace/scripts/seed-fos-canonical-accounts.mjs — there is no admin endpoint to create arbitrary users. Signup is either invitation-gated via /api/v1/auth/register or direct SQL through the seeder.
Cross-domain auth exchange
The DAO ↔ FounderyOS cross-domain handoff is the sharpest integration point. A user logged into helloworlddao.com can click "Open in FounderyOS" and land authenticated on founderyos.dev without re-entering credentials, via a one-time token minted by oracle-bridge and exchanged by founderyos-api.
sequenceDiagram
autonumber
participant U as User browser
participant TT as think-tank-suite<br/>(helloworlddao.com)
participant OB as oracle-bridge<br/>(10.0.0.2)
participant FD as founderyos-dashboard<br/>(founderyos.dev)
participant FA as founderyos-api
U->>TT: authenticated session
TT->>OB: POST /api/staging_token/mint<br/>(session cookie, entry_product)
OB->>OB: persist token (single-use, 60s TTL)
OB-->>TT: { token }
TT->>U: redirect founderyos.dev/cross-domain?token=…
U->>FD: GET /cross-domain?token=…
FD->>FA: POST /api/v1/auth/cross-domain-login { token }
FA->>OB: POST /api/staging_token/exchange<br/>(Bearer CROSS_DOMAIN_SERVICE_TOKEN, body { token })
OB->>OB: validate + consume token
OB-->>FA: { email, principal, entry_product }
FA->>FA: find-or-create user (MEMBER role)
FA->>FA: mint fresh FOS JWT pair
FA-->>FD: 200 { user, entry_product } + Set-Cookie
FD-->>U: app shell with session
CROSS_DOMAIN_SERVICE_TOKEN is the shared bearer: rotation is two-sided and must update on both services in the same window. If unset, the endpoint fail-closes with 503; there is no development bypass. Every delegation event writes to audit_log (privacy-safe, no raw IP), and any IC principal returned upserts into the user's IC identity link. Consumer code lives at services/cross_domain_client.py + routers/cross_domain_auth.py; BL-262 covers the document-import variant.
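The exchange semantics (single use, 60s TTL, fail-closed when the service token is unset) can be sketched in-process. This collapses the HTTP hop to oracle-bridge into a direct call and is not the code in routers/cross_domain_auth.py: `OneTimeTokenStore`, `cross_domain_login`, and the 400/401 status codes are assumptions; only the 503 on a missing token comes from the text above.

```python
import re
import secrets
from typing import Optional, Tuple

TOKEN_RE = re.compile(r"[0-9a-fA-F]{64}")  # 64-hex one-time token


class OneTimeTokenStore:
    """Stand-in for the oracle-bridge side: single-use tokens with a short TTL."""

    def __init__(self, ttl_seconds: int = 60):
        self._ttl = ttl_seconds
        self._tokens = {}

    def mint(self, email: str, entry_product: str, now: float) -> str:
        token = secrets.token_hex(32)  # 32 random bytes -> 64 hex characters
        self._tokens[token] = (now + self._ttl, {"email": email, "entry_product": entry_product})
        return token

    def exchange(self, token: str, now: float) -> Optional[dict]:
        entry = self._tokens.pop(token, None)  # consumed on first presentation
        if entry is None:
            return None
        expires_at, identity = entry
        return identity if now < expires_at else None


def cross_domain_login(token: str, service_token: str, store: OneTimeTokenStore, now: float) -> Tuple[int, dict]:
    """Consumer-side flow; returns (status, body)."""
    if not service_token:
        return 503, {"detail": "cross-domain login not configured"}  # fail closed
    if not TOKEN_RE.fullmatch(token):
        return 400, {"detail": "malformed token"}  # assumed status
    identity = store.exchange(token, now)
    if identity is None:
        return 401, {"detail": "token invalid, expired, or already used"}  # assumed status
    user = {"email": identity["email"], "role": "MEMBER"}  # find-or-create elided
    return 200, {"user": user, "entry_product": identity["entry_product"]}
```

Popping the token before checking expiry means a replayed or expired token can never succeed on a later attempt.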
Data model
PostgreSQL on Neon (separate database from oracle-bridge — isolation so a founderyos-api schema change can't break DAO sessions). Migrations: Liquibase, 111+ changesets under liquibase/changelog/, run against liquibase.staging.properties / liquibase.production.properties. Load-bearing surface:
| Table | Purpose |
|---|---|
| users | email + argon2 hash, email_verified gate, role (founder/admin/member/guest), status (ACTIVE/PENDING/SUSPENDED/DEACTIVATED), invited_by_id, invitation_id |
| invitations | invite-only signup chain |
| workspaces + workspace_members | Sophie's "Glass Table" paradigm: shared team workspaces with versioned content, comments, shareable links (Sprint 4.1) |
| documents + content_version | Tiptap-backed collaborative editor with version history (also the source of PLATFORM-004.2 governance-proposal publish flow) |
| agents + agent_fleet | Order of Elephants 32-agent orchestration, live heartbeats, bulk status |
| contacts + interactions + deals + transactions | CRM pipeline + financial tracking (US-AE.1/.2/.3, US-AF.1) |
| events | Calendar with recurrence, agent/task links |
| chat + team_chat + direct_messages | Multi-agent team chat with @-mentions |
| tasks + projects + sprints | Task management with project grouping |
| ideas + frameworks + reflections | Capture surfaces: Ideas Hub, Frameworks Library, reflection journaling |
| api_token + deployment_token | Hashed bearer tokens for CLI + programmatic access |
| oauth_credential + ic_identity_link | OAuth tokens + DAO-side principal linkage (via cross-domain flow) |
| totp (on users) + password_reset + email_verification | Auth side-tables |
| activity_log + audit_log | System-wide activity ledger (PRD Section H) + privacy-safe delegation audit |
| work_queue + capture_event | Autonomous agent job queue + capture taxonomy |
| subscription + Stripe customer_id on users | Billing (legacy; payment-gateway takes over new flows) |
| push_subscription + notification | Web-push + in-app notification fanout |
| feature_flag + ab_test + prediction_model + prediction_config | A/B harness + ML model registry |
Every user-owned table enforces user data isolation — queries scope to current_user.id at the router layer; cross-tenant reads are a CVE-class bug.
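Owner scoping at the query layer might look like the following sketch, which uses an in-memory SQLite table so it runs anywhere. The real service talks to Postgres; the `tasks` schema, column names, and helper functions here are illustrative assumptions.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Tiny stand-in schema: one user-owned table with two tenants' rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, title TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO tasks (id, user_id, title) VALUES (?, ?, ?)",
        [(1, 10, "draft pitch"), (2, 10, "email investor"), (3, 20, "hire designer")],
    )
    return conn


def list_tasks(conn: sqlite3.Connection, current_user_id: int) -> list:
    """Every read is filtered by the caller's id, never by a client-supplied owner."""
    cursor = conn.execute(
        "SELECT id, title FROM tasks WHERE user_id = ? ORDER BY id", (current_user_id,)
    )
    return cursor.fetchall()


def get_task(conn: sqlite3.Connection, current_user_id: int, task_id: int):
    """Lookup by id still includes the owner filter, so foreign ids return None."""
    cursor = conn.execute(
        "SELECT id, title FROM tasks WHERE id = ? AND user_id = ?", (task_id, current_user_id)
    )
    return cursor.fetchone()
```

Returning `None` for another tenant's id (rather than a distinct "forbidden" result) avoids confirming that the row exists.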
Environment variables
Deploy renders two manifests — founderyos-secrets (Opaque) and founderyos-config (ConfigMap) — into the founderyos namespace. Pod env references them via valueFrom.secretKeyRef / valueFrom.configMapKeyRef (Secret Hygiene Pattern A). Some vars historically lived out-of-band via kubectl set env; deploy-staging.yml uses server-side apply with --force-conflicts --field-manager=founderyos-api-ci to preserve them, and PLATFORM-009.2 closes that drift.
| Var | Source | Purpose |
|---|---|---|
| DATABASE_URL, DATABASE_USER, DATABASE_PASSWORD | GH Secret | Neon Postgres (staging DB separate from prod) |
| JWT_SECRET_KEY, JWT_ALGORITHM, ACCESS_TOKEN_EXPIRE_MINUTES, REFRESH_TOKEN_EXPIRE_DAYS | GH Secret / ConfigMap | JWT signing + TTLs |
| ANTHROPIC_API_KEY | GH Secret | Claude API (hybrid LLM primary) |
| SENTRY_DSN | GH Secret | Error tracking (GlitchTip-compatible) |
| ENCRYPTION_KEY | GH Secret | At-rest encryption for sensitive capture fields |
| GOOGLE_CLIENT_ID, GOOGLE_CLIENT_SECRET | GH Secret | Google OAuth (BL-229) |
| GITHUB_CLIENT_ID, GITHUB_CLIENT_SECRET | GH Secret | GitHub OAuth |
| STRIPE_SECRET_KEY | GH Secret | Legacy /billing routes (payment-gateway owns new Stripe flows) |
| CROSS_DOMAIN_SERVICE_TOKEN | GH Secret | Bearer to oracle-bridge /api/staging_token/exchange (PLATFORM-003.3); MUST match oracle-bridge's copy |
| NOTIFICATION_SERVICE_TOKEN | GH Secret (mirror of platform/service-tokens.TOKEN_NOTIFICATION_SERVICE) | Bearer to notification-service |
| PAYMENT_GATEWAY_TOKEN | GH Secret (mirror of platform/service-tokens.TOKEN_PAYMENT_GATEWAY) | Bearer to payment-gateway |
| NOTIFICATION_SERVICE_URL, PAYMENT_GATEWAY_URL, ORACLE_BRIDGE_URL, CROSS_DOMAIN_EXPECTED_TARGET, FRONTEND_URL, REDIS_URL, CORS_ORIGINS | GH Variable → ConfigMap | Non-secret config |
| ENVIRONMENT, APP_NAME, APP_VERSION, DEBUG, LOG_LEVEL, LOG_FORMAT, WORKERS, RATE_LIMIT_*, CACHE_*, METRICS_ENABLED | ConfigMap | Runtime knobs |
| SECURITY_* (HSTS, frame options, CSP, referrer policy) | ConfigMap | Security headers middleware (US-30) |
Rendered manifests under k8s/rendered/ are gitignored; templates (k8s/secret.template.yaml, k8s/configmap.template.yaml) are committed. PLATFORM-009.2 story file has the per-provider rotation checklist.
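For readers unfamiliar with the Pattern A shape, the sketch below builds the container `env` entries that reference founderyos-secrets and founderyos-config. The `valueFrom.secretKeyRef` / `valueFrom.configMapKeyRef` structure is standard Kubernetes; the helper names and the assumption that each key is named after its env var are illustrative, not taken from the templates.

```python
import json


def env_from_secret(var: str, secret_name: str = "founderyos-secrets") -> dict:
    """Env entry whose value is read from a key in a Secret at pod start."""
    return {"name": var, "valueFrom": {"secretKeyRef": {"name": secret_name, "key": var}}}


def env_from_config(var: str, configmap_name: str = "founderyos-config") -> dict:
    """Env entry whose value is read from a key in a ConfigMap at pod start."""
    return {"name": var, "valueFrom": {"configMapKeyRef": {"name": configmap_name, "key": var}}}


def render_env(secret_vars, config_vars) -> list:
    """Combine both sources into one container env list."""
    return [env_from_secret(v) for v in secret_vars] + [env_from_config(v) for v in config_vars]


if __name__ == "__main__":
    print(json.dumps(render_env(["JWT_SECRET_KEY"], ["LOG_LEVEL"]), indent=2))
```

No entry carries a literal `value`, which is what distinguishes this pattern from variables written out-of-band with kubectl set env.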
Deploy flow
Push to main triggers deploy-staging.yml:
- docker build → ghcr.io/hello-world-co-op/founderyos-api:staging + :sha-<shortsha>.
- scp k8s/*.yaml to /tmp/founderyos-api-k8s-${RUN_ID} on AX42-U (run-id-scoped, BL-230).
- SSH to AX42-U, kubectl patch cross-domain keys into the existing Secret/ConfigMap (non-clobbering merge), then kubectl apply --server-side --field-manager=founderyos-api-ci --force-conflicts -f /tmp/.../.
- kubectl set image deployment/founderyos-api api=ghcr.io/.../founderyos-api:sha-${SHORT_SHA}: pulls the exact SHA (bypasses :staging-tag race).
- kubectl rollout status with 300s timeout, then verify the running image matches; ::error:: and fail otherwise.
Production uses deploy-production.yml (workflow_dispatch — never auto), triggered via /deliver slash-comment on a Release Please PR. All third-party actions SHA-pinned per BL-230.
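The final staging step (verify the running image matches the SHA just built) can be sketched as a pure function over the image strings reported by the running pods. How the workflow actually collects those strings is not shown in the source; `image_ref` and `rollout_errors` are hypothetical helpers, and the message wording is an assumption.

```python
from typing import List

REPO = "ghcr.io/hello-world-co-op/founderyos-api"


def image_ref(short_sha: str) -> str:
    """Image reference pinned to one commit, in the sha-<shortsha> tag form."""
    return f"{REPO}:sha-{short_sha}"


def rollout_errors(short_sha: str, running_images: List[str]) -> List[str]:
    """Return GitHub Actions error annotations for pods not on the expected image."""
    expected = image_ref(short_sha)
    if not running_images:
        return ["::error::no running pods reported an image"]
    return [
        f"::error::running image {image} does not match {expected}"
        for image in running_images
        if image != expected
    ]
```

An empty result means the rollout landed on the exact SHA; any entry should fail the job, which is what makes a stale :staging pull visible.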
Operations
- Liquibase migrations: 111+ changesets in liquibase/changelog/. Each env has its own .properties file; run via liquibase --defaults-file=liquibase/liquibase.staging.properties update. Never point a staging properties file at production's NEON_HOST; psql $DATABASE_URL -c "SELECT current_database()" is the pre-flight gate.
- Canonical account seeding: hello-world-workspace/scripts/seed-fos-canonical-accounts.mjs creates/refreshes regression-founder, regression-admin, regression-member. Argon2id parameters baked into the seeder must match core/security.py (BL-275 root cause: a seeder with default bcrypt params produced hashes the server rejected on login).
- Pod health: kubectl -n founderyos get pods -l app.kubernetes.io/name=founderyos,app.kubernetes.io/component=api. Liveness: /health/live. Readiness: /health (DB + Redis). Detailed: /health/detailed (admin only).
- Log tailing: ssh -i ~/.ssh/hetzner_vps root@157.180.13.84 "sudo k3s kubectl -n founderyos logs -f -l app.kubernetes.io/name=founderyos,app.kubernetes.io/component=api".
- Rollback: kubectl -n founderyos rollout undo deployment/founderyos-api. Liquibase tags every changeset with a --rollback block for schema revert.
- Drift audit: ops-infra/scripts/audit-env-drift.sh founderyos-api, a nightly cron (BL-252), catches new live-only vars before PLATFORM-009.2 completion.
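The idea behind the drift audit (find env vars present on the live Deployment but absent from the rendered manifests) reduces to a set difference. The sketch below is not the logic of audit-env-drift.sh, whose internals are not described here; `live_only_vars` and the container-spec dict shape passed to it are assumptions.

```python
from typing import Iterable, List


def live_only_vars(live_container: dict, declared_names: Iterable[str]) -> List[str]:
    """Names set on the running container that no committed template declares."""
    live_names = {entry["name"] for entry in live_container.get("env", [])}
    return sorted(live_names - set(declared_names))


if __name__ == "__main__":
    # Hypothetical live spec: FEATURE_X was added by hand and never templated.
    live = {
        "name": "api",
        "env": [
            {"name": "LOG_LEVEL", "valueFrom": {"configMapKeyRef": {"name": "founderyos-config", "key": "LOG_LEVEL"}}},
            {"name": "FEATURE_X", "value": "on"},
        ],
    }
    print(live_only_vars(live, ["LOG_LEVEL"]))
```

A non-empty result is the signal that a variable still depends on the --force-conflicts carve-out to survive the next apply.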
Known gotchas
- Not Think Tank. founderyos-api is the SaaS on AX42-U; Think Tank is the DAO canister on IC mainnet (mrhyf-jqaaa-aaaab-qgpra-cai). Cross-links flow through oracle-bridge. Conflating them is always wrong.
- Separate Neon database. founderyos-api's Postgres is not oracle-bridge's. A Liquibase run against the wrong connection string silently succeeds against the wrong schema; verify the DATABASE_URL host before any manual migration.
- Argon2id params must match for direct-SQL seeding. core/security.py sets memory_cost=65536, time_cost=3, parallelism=4. A seeder with default passlib params produces rejectable hashes (BL-275). Import the same CryptContext or reproduce the exact numbers.
- Regression accounts seed out-of-band. No admin endpoint creates arbitrary users. Signup = /api/v1/auth/register (invitation-gated) or direct SQL via the seeder. Invite-only is a deliberate product choice; don't add /admin/create-user casually.
- CROSS_DOMAIN_SERVICE_TOKEN is two-sided. Must match oracle-bridge's copy. Rotate in order: set new value on both surfaces, merge both deploy PRs, verify /api/v1/auth/cross-domain-login still 200s, revoke old value on both services.
- Matrix integration is DEPRECATED. routers/matrix.py is retained for read-back compatibility only. New team-chat code lives in routers/team_chat.py + routers/chat.py.
- --force-conflicts on apply is a transitional carve-out. It papers over out-of-band kubectl set env writes pre-dating PLATFORM-009.2. Removal is the positive signal the migration landed.
- OAuth clients are per-environment. Staging Google/GitHub apps have callback URLs bound to staging.founderyos.dev. Reusing staging creds in production fails the callback exchange; always provision fresh clients at cutover.
- Legacy Stripe in founderyos-api, new flows in payment-gateway. /api/v1/billing/history + /api/v1/billing/payment-methods read STRIPE_SECRET_KEY directly (PLATFORM-007.4); new checkout/subscription flows proxy to payment-gateway via PAYMENT_GATEWAY_TOKEN.
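The "verify the DATABASE_URL host" gotcha lends itself to a small guard that can run before any manual migration. This is a sketch using the standard library's URL parser; `assert_database_host` is a hypothetical helper and the hostnames in the usage note are placeholders, not the project's real Neon endpoints.

```python
from urllib.parse import urlsplit


def database_host(database_url: str) -> str:
    """Extract the hostname from a connection URL, ignoring credentials and port."""
    host = urlsplit(database_url).hostname
    if not host:
        raise ValueError("connection URL has no host")
    return host


def assert_database_host(database_url: str, expected_host: str) -> None:
    """Refuse to proceed when the URL points at a different server than intended."""
    actual = database_host(database_url)
    if actual != expected_host:
        raise RuntimeError(f"refusing to migrate: DATABASE_URL host is {actual}, expected {expected_host}")
```

For example, `assert_database_host("postgresql://u:p@staging-db.example.test/founderyos", "staging-db.example.test")` returns quietly, while passing a production hostname as the expected value raises before any changeset runs. The guard complements, rather than replaces, the `SELECT current_database()` check noted under Operations.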
References
| Reference | Purpose |
|---|---|
| System Topology | Three-machine overview with founderyos-api in context |
| oracle-bridge VPS | DAO-side backend for the cross-domain exchange |
| Secret Hygiene | Pattern A secretKeyRef flow, rotation procedure, drift-detection cron |
| founderyos-api/main.py | Router mount tree (50+ routers) + middleware stack |
| founderyos-api/core/security.py | Argon2id + JWT primitives |
| founderyos-api/routers/cross_domain_auth.py | PLATFORM-003.1 consumer endpoint |
| founderyos-api/liquibase/ | 111+ changeset migrations |
| founderyos-api/k8s/ | Deployment + Service + ConfigMap + secret/configmap templates |
| bmad-artifacts/implementation-artifacts/platform-009-2-founderyos-api.md | Secret-hygiene migration story |
| hello-world-workspace/scripts/seed-fos-canonical-accounts.mjs | Regression account seeder (BL-275) |
| BL-223 | Discovery story — 20+ drifted vars; --force-conflicts carve-out |
| BL-229 | OAuth provider secret migration (GitHub + Google) |
| BL-233 | replicas=2 + PDB + rollout strategy (no-503 regression) |
| BL-262 | Cross-domain document-import consumer flow |
| BL-275 | Regression account seeder + Argon2id parity |
| PLATFORM-003.1 / .3 | Cross-domain auth exchange + audit + IC link |
| PLATFORM-004.2 | FOS → governance proposal publish flow |
| PLATFORM-006.3 | ExternalName shim (founderyos-api-xns) for cross-ns Traefik |
| PLATFORM-007.4 | Legacy Stripe routes remaining in founderyos-api |
| PLATFORM-009.2 | Secret-hygiene migration (this service) |