founderyos-api — Architecture & Operations

Complementary views: System Topology places founderyos-api in the three-machine picture; oracle-bridge VPS covers the DAO-side backend that founderyos-api exchanges cross-domain tokens with; Secret Hygiene details the Pattern A k8s secretKeyRef flow founderyos-api is migrating onto.

What founderyos-api is, and what it is not

founderyos-api is the Python/FastAPI backend for the FounderyOS SaaS product (founderyos.dev / staging.founderyos.dev), the off-chain founder-operations platform owned by Graydon. It is NOT the Think Tank canister (on IC mainnet at mrhyf-jqaaa-aaaab-qgpra-cai, the DAO's AI productivity tool, formerly branded "FounderyOS Suite"). The two share a name ancestor and a cross-domain auth bridge but are operationally independent: different repos, different owners, different hosts, different auth stores. Rule of thumb: anything that touches DOM tokens belongs to Think Tank; anything that touches workspaces, CRM, or document editor sessions belongs to founderyos-api.

founderyos-api runs on AX42-U's k3s cluster in the founderyos namespace as a 2-replica Deployment behind a ClusterIP Service. Public entry points: Traefik API gateway at apis.helloworlddao.com / staging-apis.helloworlddao.com (path-routed /fos/*) and direct hostnames founderyos.dev / staging.founderyos.dev. The founderyos-dashboard React SPA is the primary upstream caller.

Deployment topology

mermaid
graph TB
  subgraph cf["Cloudflare DNS (grey-cloud, DNS-only)"]
    dns1["staging.founderyos.dev<br/>founderyos.dev"]
    dns2["apis.helloworlddao.com<br/>staging-apis.helloworlddao.com"]
  end
  subgraph ax["AX42-U (157.180.13.84) — k3s"]
    subgraph ksys["kube-system"]
      traefik["Traefik :80/:443<br/>Let's Encrypt HTTP-01"]
    end
    subgraph plat["ns: platform"]
      xns["founderyos-api-xns<br/>ExternalName shim"]
      tokz["token-authz ForwardAuth"]
      pg["payment-gateway :3200"]
    end
    subgraph fos["ns: founderyos"]
      fa["founderyos-api :8000<br/>(Deployment, replicas=2)<br/>strategy.maxUnavailable=0<br/>PodDisruptionBudget minAvailable=1"]
      nsv["notification-service :3100"]
      redis["redis :6379<br/>(cache, pubsub, session store)"]
    end
  end
  subgraph vps["oracle-bridge VPS (10.0.0.2)"]
    ob["oracle-bridge :8787 / :8788"]
  end
  subgraph ext["External managed services"]
    neon["Neon Postgres<br/>(founderyos DB — separate from oracle-bridge)"]
    anth["Anthropic API<br/>(Claude)"]
    ollama["Ollama (Theo)<br/>192.168.2.159:31434"]
    stripe["Stripe<br/>(legacy /billing routes)"]
    ghoauth["GitHub OAuth"]
    ggoauth["Google OAuth"]
    resend["Resend (via notification-service)"]
  end
  dns1 --> traefik
  dns2 --> traefik
  traefik -->|/fos/* strip prefix| xns
  xns --> fa
  fa --> redis
  fa --> neon
  fa -->|Bearer TOKEN_NOTIFICATION_SERVICE| nsv
  fa -->|Bearer PAYMENT_GATEWAY_TOKEN| pg
  fa -->|Bearer CROSS_DOMAIN_SERVICE_TOKEN| ob
  fa --> anth
  fa --> ollama
  fa --> stripe
  fa --> ghoauth
  fa --> ggoauth
  nsv --> resend

The pod runs with runAsNonRoot, readOnlyRootFilesystem, and allowPrivilegeEscalation: false. The replicas=2 + strategy.maxUnavailable=0 + PodDisruptionBudget minAvailable=1 combination (BL-233) was installed after regression login fixtures repeatedly hit Traefik's "no available server" 503 during single-pod rollouts. Liveness uses /health/live (dependency-free) so a slow DB/Redis check can't restart the pod; readiness uses /health (DB + Redis) so Traefik won't route to a pod whose upstream store is offline.
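The liveness/readiness split described above can be sketched as two plain functions. This is a minimal, framework-free illustration: the function names, the check callables, and the response shapes are assumptions, not the service's actual handlers.

```python
from typing import Callable, Dict, Tuple

# Hypothetical dependency checks: each returns True when the store answers.
Check = Callable[[], bool]


def liveness() -> Tuple[int, Dict[str, str]]:
    """Dependency-free: only proves the process can answer a request."""
    return 200, {"status": "alive"}


def readiness(checks: Dict[str, Check]) -> Tuple[int, Dict[str, str]]:
    """Run every dependency check; any failure or exception yields 503."""
    results: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "down"
        except Exception:  # a raising check counts as a failed dependency
            results[name] = "down"
    status = 200 if all(state == "ok" for state in results.values()) else 503
    return status, results
```

Because liveness() never touches the database or Redis, a slow store can only flip readiness to 503 (taking the pod out of rotation) and cannot trigger a restart.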

Image: ghcr.io/hello-world-co-op/founderyos-api:sha-<shortsha> (pinned per rollout — BL-203 avoids :staging-tag race). Port 8000 internal, exposed via ClusterIP Service. Traefik reaches the Service cross-namespace through the founderyos-api-xns ExternalName shim in platform (PLATFORM-006.3) because Traefik Ingress cannot target a Service in a different namespace directly.

API gateway routing

At the platform edge, Traefik strips /fos and forwards to founderyos-api:

| Host | Path | Backend | Notes |
|---|---|---|---|
| apis.helloworlddao.com | /fos/* | founderyos-api-xns → founderyos-api.founderyos.svc.cluster.local:8000 | Production (prefix stripped) |
| staging-apis.helloworlddao.com | /fos/* | same shim, staging env | Staging |
| founderyos.dev | /api/v1/* | direct to founderyos-api via nginx on founderyos-dashboard container | Legacy direct-host path; retained for SPA calls that do not route through the gateway |
| staging.founderyos.dev | /api/v1/* | same, staging | Same |
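The effect of the /fos prefix strip on the path the backend receives can be illustrated with a few lines of Python. This mirrors the observable behaviour only; it is not Traefik's implementation, and the function name is hypothetical.

```python
def strip_gateway_prefix(path: str, prefix: str = "/fos") -> str:
    """Return the path the backend sees after the gateway strips `prefix`."""
    if path == prefix:
        return "/"
    if path.startswith(prefix + "/"):
        return path[len(prefix):]
    return path  # not a gateway path; left unchanged
```

For example, a gateway request to /fos/api/v1/auth/login reaches founderyos-api as /api/v1/auth/login, which is the same path the direct-host route uses.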

Internally the router tree has 50+ mounted routers (auth, invitations, users, tasks, agents, documents, events, billing, contacts, deals, workspaces, chat, workflows, terminal, activity_logs, kpi, fleet, intelligence, organizations, templates, webhooks, …). The full set lives in routers/ and is imported in main.py. The OpenAPI spec is served at /docs, /redoc, /api/v1/docs, /api/v1/redoc, and /api/v1/openapi.json.

Authentication model

founderyos-api owns its own session authority — distinct from oracle-bridge's DAO-side store. Passwords hash with Argon2id (OWASP params: memory_cost=65536, time_cost=3, parallelism=4) via passlib.CryptContext in core/security.py. Legacy bcrypt hashes continue to verify and are transparently rehashed on next login.

Sessions are JWT access + refresh token pairs (HS256, access=30min, refresh=7d) in httpOnly cookies. JWT_SECRET_KEY rotation invalidates every in-flight session; production rotations need an announce window. CSRF middleware (X-Requested-With custom header + origin check) runs after CORS.
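To make the token-pair mechanics concrete, here is a standard-library sketch of HS256 signing and verification with the TTLs quoted above. It is an illustration under assumptions: the claim names (sub, type, exp) and helper names are hypothetical, and the service itself presumably uses a JWT library in core/security.py rather than hand-rolled code.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

ACCESS_TTL_SECONDS = 30 * 60          # access=30min
REFRESH_TTL_SECONDS = 7 * 24 * 3600   # refresh=7d


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def encode_jwt(claims: dict, secret: str) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def decode_jwt(token: str, secret: str, now: Optional[float] = None) -> dict:
    """Verify signature and expiry; raise ValueError on any failure."""
    header_b64, payload_b64, sig_b64 = token.split(".")  # ValueError if malformed
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    if claims.get("exp", 0) <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims


def mint_pair(user_id: str, secret: str, now: Optional[float] = None) -> dict:
    issued = int(time.time() if now is None else now)
    return {
        "access": encode_jwt(
            {"sub": user_id, "type": "access", "exp": issued + ACCESS_TTL_SECONDS}, secret),
        "refresh": encode_jwt(
            {"sub": user_id, "type": "refresh", "exp": issued + REFRESH_TTL_SECONDS}, secret),
    }
```

Decoding a token with a different secret fails the signature check, which is why rotating JWT_SECRET_KEY invalidates every in-flight session at once.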

Surface auth matrix:

| Surface | Mechanism | Notes |
|---|---|---|
| POST /api/v1/auth/register | email + password + invitation token | Invite-only; founders/admins mint invitations first |
| POST /api/v1/auth/login | email + password (Argon2id) | 5/min rate limit via slowapi |
| OAuth (/api/v1/oauth/google, /api/v1/oauth/github) | OAuth 2.0 PKCE | GOOGLE_CLIENT_ID/GITHUB_CLIENT_ID + secret; new users auto-registered |
| 2FA (/api/v1/2fa/*) | TOTP + 10 backup codes | Required for roles above member at admin discretion |
| Cross-domain (POST /api/v1/auth/cross-domain-login) | oracle-bridge-minted one-time token | PLATFORM-003.1: 64-hex token exchanged server-to-server, bound to an entry_product |
| API tokens (/api/v1/tokens/*, /api/v1/auth/tokens) | hashed bearer, raw value shown once | For CLI + external automation |
| Admin-elevated (/api/v1/admin/*) | FOUNDER/ADMIN role + optional admin-session escalation | Admin sessions time-boxed separately from base JWT |

Three canonical regression test accounts are seeded in staging (BL-275): regression-founder@founderyos.dev, regression-admin@founderyos.dev, regression-member@founderyos.dev. Seeding happens out-of-band via hello-world-workspace/scripts/seed-fos-canonical-accounts.mjs — there is no admin endpoint to create arbitrary users. Signup is either invitation-gated via /api/v1/auth/register or direct SQL through the seeder.

Cross-domain auth exchange

The DAO ↔ FounderyOS cross-domain handoff is the sharpest integration point. A user logged into helloworlddao.com can click "Open in FounderyOS" and land authenticated on founderyos.dev without re-entering credentials, via a one-time token minted by oracle-bridge and exchanged by founderyos-api.

mermaid
sequenceDiagram
  autonumber
  participant U as User browser
  participant TT as think-tank-suite<br/>(helloworlddao.com)
  participant OB as oracle-bridge<br/>(10.0.0.2)
  participant FD as founderyos-dashboard<br/>(founderyos.dev)
  participant FA as founderyos-api
  U->>TT: authenticated session
  TT->>OB: POST /api/staging_token/mint<br/>(session cookie, entry_product)
  OB->>OB: persist token (single-use, 60s TTL)
  OB-->>TT: { token }
  TT->>U: redirect founderyos.dev/cross-domain?token=…
  U->>FD: GET /cross-domain?token=…
  FD->>FA: POST /api/v1/auth/cross-domain-login { token }
  FA->>OB: POST /api/staging_token/exchange<br/>(Bearer CROSS_DOMAIN_SERVICE_TOKEN, body { token })
  OB->>OB: validate + consume token
  OB-->>FA: { email, principal, entry_product }
  FA->>FA: find-or-create user (MEMBER role)
  FA->>FA: mint fresh FOS JWT pair
  FA-->>FD: 200 { user, entry_product } + Set-Cookie
  FD-->>U: app shell with session

CROSS_DOMAIN_SERVICE_TOKEN is the shared bearer — rotation is two-sided and must update on both services in the same window. If unset, the endpoint fail-closes with 503; no development bypass. Every delegation event writes to audit_log (privacy-safe — no raw IP), and any IC principal returned upserts into the user's IC identity link. Consumer code lives at services/cross_domain_client.py + routers/cross_domain_auth.py; BL-262 covers the document-import variant.
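The single-use, 60-second exchange and the fail-closed behaviour can be simulated in-process. This is a sketch under stated assumptions: the class and function names are hypothetical, the real exchange is an authenticated HTTP POST to oracle-bridge rather than a method call, and the 401 used for a rejected token is an assumed status, not one the source specifies.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple


class OneTimeTokenStore:
    """Stand-in for oracle-bridge's token persistence (single-use, 60s TTL)."""

    def __init__(self, ttl_seconds: int = 60, clock: Callable[[], float] = time.time):
        self._ttl = ttl_seconds
        self._clock = clock
        self._tokens: Dict[str, Tuple[float, dict]] = {}

    def mint(self, email: str, entry_product: str) -> str:
        token = secrets.token_hex(32)  # 64-hex one-time token
        payload = {"email": email, "entry_product": entry_product}
        self._tokens[token] = (self._clock() + self._ttl, payload)
        return token

    def exchange(self, token: str) -> Optional[dict]:
        """Consume the token; return its payload only if it has not expired."""
        entry = self._tokens.pop(token, None)  # pop makes the token single-use
        if entry is None:
            return None
        expires_at, payload = entry
        return payload if self._clock() < expires_at else None


def cross_domain_login(token: str, service_token: Optional[str],
                       store: OneTimeTokenStore) -> Tuple[int, dict]:
    """founderyos-api side: fail closed when the shared bearer is unset."""
    if not service_token:
        return 503, {"detail": "cross-domain login not configured"}
    payload = store.exchange(token)
    if payload is None:
        return 401, {"detail": "invalid or expired token"}  # assumed status code
    user = {"email": payload["email"], "role": "member"}  # find-or-create elided
    return 200, {"user": user, "entry_product": payload["entry_product"]}
```

Replaying the same token returns the rejection branch because exchange() removes the entry on first use, which is the property that makes the redirect URL safe to expose in the browser for a short window.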

Data model

PostgreSQL on Neon (separate database from oracle-bridge — isolation so a founderyos-api schema change can't break DAO sessions). Migrations: Liquibase, 111+ changesets under liquibase/changelog/, run against liquibase.staging.properties / liquibase.production.properties. Load-bearing surface:

| Table | Purpose |
|---|---|
| users | email + argon2 hash, email_verified gate, role (founder/admin/member/guest), status (ACTIVE/PENDING/SUSPENDED/DEACTIVATED), invited_by_id, invitation_id |
| invitations | invite-only signup chain |
| workspaces + workspace_members | Sophie's "Glass Table" paradigm: shared team workspaces with versioned content, comments, shareable links (Sprint 4.1) |
| documents + content_version | Tiptap-backed collaborative editor with version history (also the source of PLATFORM-004.2 governance-proposal publish flow) |
| agents + agent_fleet | Order of Elephants 32-agent orchestration, live heartbeats, bulk status |
| contacts + interactions + deals + transactions | CRM pipeline + financial tracking (US-AE.1/.2/.3, US-AF.1) |
| events | Calendar with recurrence, agent/task links |
| chat + team_chat + direct_messages | Multi-agent team chat with @-mentions |
| tasks + projects + sprints | Task management with project grouping |
| ideas + frameworks + reflections | Capture surfaces: Ideas Hub, Frameworks Library, reflection journaling |
| api_token + deployment_token | Hashed bearer tokens for CLI + programmatic access |
| oauth_credential + ic_identity_link | OAuth tokens + DAO-side principal linkage (via cross-domain flow) |
| totp (on users) + password_reset + email_verification | Auth side-tables |
| activity_log + audit_log | System-wide activity ledger (PRD Section H) + privacy-safe delegation audit |
| work_queue + capture_event | Autonomous agent job queue + capture taxonomy |
| subscription + Stripe customer_id on users | Billing (legacy; payment-gateway takes over new flows) |
| push_subscription + notification | Web-push + in-app notification fanout |
| feature_flag + ab_test + prediction_model + prediction_config | A/B harness + ML model registry |

Every user-owned table enforces user data isolation — queries scope to current_user.id at the router layer; cross-tenant reads are a CVE-class bug.
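The isolation rule amounts to never issuing a read without the caller's id in the WHERE clause. The sketch below uses an in-memory SQLite database as a stand-in for Neon Postgres; the user_id column name, the helper names, and the sample rows are assumptions made for the example.

```python
import sqlite3
from typing import List, Optional, Tuple


def list_documents(conn: sqlite3.Connection, current_user_id: int) -> List[Tuple[int, str]]:
    """Every list query is scoped to the authenticated caller."""
    rows = conn.execute(
        "SELECT id, title FROM documents WHERE user_id = ? ORDER BY id",
        (current_user_id,),
    )
    return rows.fetchall()


def get_document(conn: sqlite3.Connection, current_user_id: int,
                 document_id: int) -> Optional[Tuple[int, str]]:
    """Lookup by id is still scoped, so another user's document returns None."""
    return conn.execute(
        "SELECT id, title FROM documents WHERE id = ? AND user_id = ?",
        (document_id, current_user_id),
    ).fetchone()


def demo_connection() -> sqlite3.Connection:
    """Build a throwaway database with rows owned by two different users."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT)"
    )
    conn.executemany(
        "INSERT INTO documents (user_id, title) VALUES (?, ?)",
        [(1, "Roadmap"), (2, "Payroll"), (1, "Pitch")],
    )
    return conn
```

The by-id lookup is the case that most often leaks: filtering on id alone would let user 1 read document 2 simply by guessing its identifier.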

Environment variables

Deploy renders two manifests — founderyos-secrets (Opaque) and founderyos-config (ConfigMap) — into the founderyos namespace. Pod env references them via valueFrom.secretKeyRef / valueFrom.configMapKeyRef (Secret Hygiene Pattern A). Some vars historically lived out-of-band via kubectl set env; deploy-staging.yml uses server-side apply with --force-conflicts --field-manager=founderyos-api-ci to preserve them, and PLATFORM-009.2 closes that drift.

| Var | Source | Purpose |
|---|---|---|
| DATABASE_URL, DATABASE_USER, DATABASE_PASSWORD | GH Secret | Neon Postgres (staging DB separate from prod) |
| JWT_SECRET_KEY, JWT_ALGORITHM, ACCESS_TOKEN_EXPIRE_MINUTES, REFRESH_TOKEN_EXPIRE_DAYS | GH Secret / ConfigMap | JWT signing + TTLs |
| ANTHROPIC_API_KEY | GH Secret | Claude API (hybrid LLM primary) |
| SENTRY_DSN | GH Secret | Error tracking (GlitchTip-compatible) |
| ENCRYPTION_KEY | GH Secret | At-rest encryption for sensitive capture fields |
| GOOGLE_CLIENT_ID, GOOGLE_CLIENT_SECRET | GH Secret | Google OAuth (BL-229) |
| GITHUB_CLIENT_ID, GITHUB_CLIENT_SECRET | GH Secret | GitHub OAuth |
| STRIPE_SECRET_KEY | GH Secret | Legacy /billing routes (payment-gateway owns new Stripe flows) |
| CROSS_DOMAIN_SERVICE_TOKEN | GH Secret | Bearer to oracle-bridge /api/staging_token/exchange (PLATFORM-003.3); MUST match oracle-bridge's copy |
| NOTIFICATION_SERVICE_TOKEN | GH Secret (mirror of platform/service-tokens.TOKEN_NOTIFICATION_SERVICE) | Bearer to notification-service |
| PAYMENT_GATEWAY_TOKEN | GH Secret (mirror of platform/service-tokens.TOKEN_PAYMENT_GATEWAY) | Bearer to payment-gateway |
| NOTIFICATION_SERVICE_URL, PAYMENT_GATEWAY_URL, ORACLE_BRIDGE_URL, CROSS_DOMAIN_EXPECTED_TARGET, FRONTEND_URL, REDIS_URL, CORS_ORIGINS | GH Variable → ConfigMap | Non-secret config |
| ENVIRONMENT, APP_NAME, APP_VERSION, DEBUG, LOG_LEVEL, LOG_FORMAT, WORKERS, RATE_LIMIT_*, CACHE_*, METRICS_ENABLED | ConfigMap | Runtime knobs |
| SECURITY_* (HSTS, frame options, CSP, referrer policy) | ConfigMap | Security headers middleware (US-30) |

Rendered manifests under k8s/rendered/ are gitignored; templates (k8s/secret.template.yaml, k8s/configmap.template.yaml) are committed. PLATFORM-009.2 story file has the per-provider rotation checklist.
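The drift that the out-of-band kubectl set env writes introduce is, at its core, a set difference between the variable names the rendered manifests declare and the names present on the live Deployment. The helper below is a hypothetical illustration of that comparison; it is not the logic of any script in the repo, and how the two name lists are collected is left out.

```python
from typing import Iterable, List


def live_only_vars(declared: Iterable[str], live: Iterable[str]) -> List[str]:
    """Names set on the running Deployment but absent from the rendered manifests."""
    return sorted(set(live) - set(declared))
```

An empty result for founderyos-api would indicate that every live variable is accounted for in the committed templates.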

Deploy flow

Push to main triggers deploy-staging.yml:

  1. docker build produces ghcr.io/hello-world-co-op/founderyos-api:staging and :sha-<shortsha>.
  2. scp k8s/*.yaml to /tmp/founderyos-api-k8s-${RUN_ID} on AX42-U (run-id-scoped — BL-230).
  3. SSH to AX42-U, kubectl patch cross-domain keys into the existing Secret/ConfigMap (non-clobbering merge), then kubectl apply --server-side --field-manager=founderyos-api-ci --force-conflicts -f /tmp/.../.
  4. kubectl set image deployment/founderyos-api api=ghcr.io/.../founderyos-api:sha-${SHORT_SHA} — pulls the exact SHA (bypasses :staging-tag race).
  5. kubectl rollout status with 300s timeout, then verify the running image matches; ::error:: and fail otherwise.
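The image check in step 5 can be sketched as a small comparison that emits a GitHub Actions error annotation on mismatch. The function name and the exact error wording are assumptions; the workflow itself performs this check in shell.

```python
from typing import Optional


def check_rollout_image(running_image: str, expected_short_sha: str) -> Optional[str]:
    """Return an Actions error line when the running tag is not the expected SHA."""
    expected_suffix = f":sha-{expected_short_sha}"
    if running_image.endswith(expected_suffix):
        return None
    return f"::error::running image {running_image} does not match {expected_suffix}"
```

A caller would print the returned line and exit non-zero when it is not None, so a rollout that silently kept the :staging tag fails the job instead of passing.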

Production uses deploy-production.yml (workflow_dispatch — never auto), triggered via /deliver slash-comment on a Release Please PR. All third-party actions SHA-pinned per BL-230.

Operations

  • Liquibase migrations: 111+ changesets in liquibase/changelog/. Each env has its own .properties file; run via liquibase --defaults-file=liquibase/liquibase.staging.properties update. Never point a staging properties file at production's NEON_HOST; psql $DATABASE_URL -c "SELECT current_database()" is the pre-flight gate.
  • Canonical account seeding: hello-world-workspace/scripts/seed-fos-canonical-accounts.mjs — creates/refreshes regression-founder, regression-admin, regression-member. Argon2id parameters baked into the seeder must match core/security.py (BL-275 root cause — a seeder with default bcrypt params produced hashes the server rejected on login).
  • Pod health: kubectl -n founderyos get pods -l app.kubernetes.io/name=founderyos,app.kubernetes.io/component=api. Liveness: /health/live. Readiness: /health (DB + Redis). Detailed: /health/detailed (admin only).
  • Log tailing: ssh -i ~/.ssh/hetzner_vps root@157.180.13.84 "sudo k3s kubectl -n founderyos logs -f -l app.kubernetes.io/name=founderyos,app.kubernetes.io/component=api".
  • Rollback: kubectl -n founderyos rollout undo deployment/founderyos-api. Every Liquibase changeset carries a --rollback block for schema revert.
  • Drift audit: ops-infra/scripts/audit-env-drift.sh founderyos-api — nightly cron (BL-252) catches new live-only vars before PLATFORM-009.2 completion.
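The migration pre-flight mentioned in the list above can be complemented by checking the connection string's host before Liquibase runs. This is a hedged sketch: the function names are hypothetical and the hostnames in the usage note are placeholders, not real Neon endpoints.

```python
from urllib.parse import urlsplit


def database_host(database_url: str) -> str:
    """Extract the host component from a connection URL."""
    host = urlsplit(database_url).hostname
    if not host:
        raise ValueError("DATABASE_URL has no host component")
    return host


def assert_expected_host(database_url: str, expected_host: str) -> None:
    """Refuse to continue when the URL points at a different database host."""
    actual = database_host(database_url)
    if actual != expected_host:
        raise RuntimeError(
            f"refusing to migrate: host {actual!r} is not {expected_host!r}"
        )
```

For example, assert_expected_host("postgresql://u:p@staging-db.example.invalid:5432/founderyos", "staging-db.example.invalid") returns silently, while the same call with a production host as the expected value raises before any changeset is applied.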

Known gotchas

  • Not Think Tank. founderyos-api is the SaaS on AX42-U; Think Tank is the DAO canister on IC mainnet (mrhyf-jqaaa-aaaab-qgpra-cai). Cross-links flow through oracle-bridge. Conflating them is always wrong.
  • Separate Neon database. founderyos-api's Postgres is not oracle-bridge's. A Liquibase run against the wrong connection string silently succeeds against the wrong schema — verify DATABASE_URL host before any manual migration.
  • Argon2id params must match for direct-SQL seeding. core/security.py sets memory_cost=65536, time_cost=3, parallelism=4. A seeder with default passlib params produces rejectable hashes (BL-275). Import the same CryptContext or reproduce the exact numbers.
  • Regression accounts seed out-of-band. No admin endpoint creates arbitrary users. Signup = /api/v1/auth/register (invitation-gated) or direct SQL via the seeder. Invite-only is a deliberate product choice; don't add /admin/create-user casually.
  • CROSS_DOMAIN_SERVICE_TOKEN is two-sided. Must match oracle-bridge's copy. Rotate in order: set new value on both surfaces, merge both deploy PRs, verify /api/v1/auth/cross-domain-login still 200s, revoke old value on both services.
  • Matrix integration is DEPRECATED. routers/matrix.py is retained for read-back compatibility only. New team-chat code lives in routers/team_chat.py + routers/chat.py.
  • --force-conflicts on apply is a transitional carve-out. It papers over out-of-band kubectl set env writes pre-dating PLATFORM-009.2. Removal is the positive signal the migration landed.
  • OAuth clients are per-environment. Staging Google/GitHub apps have callback URLs bound to staging.founderyos.dev. Reusing staging creds in production fails the callback exchange — always provision fresh clients at cutover.
  • Legacy Stripe in founderyos-api, new flows in payment-gateway. /api/v1/billing/history + /api/v1/billing/payment-methods read STRIPE_SECRET_KEY directly (PLATFORM-007.4); new checkout/subscription flows proxy to payment-gateway via PAYMENT_GATEWAY_TOKEN.
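The CROSS_DOMAIN_SERVICE_TOKEN rotation order in the list above implies a window in which both the old and the new value exist. One way a receiving service could honour that window is to compare the presented bearer against every currently valid value in constant time. This is an assumption about how such a window might be implemented, not a description of oracle-bridge's code; the function name is hypothetical.

```python
import hmac
from typing import Iterable


def bearer_is_accepted(presented: str, accepted: Iterable[str]) -> bool:
    """Constant-time check of a presented bearer against each valid value."""
    ok = False
    for candidate in accepted:
        # Empty candidates are skipped so an unset value never matches anything.
        if candidate and hmac.compare_digest(presented.encode(), candidate.encode()):
            ok = True
    return ok
```

During rotation the accepted list would hold both values; revoking the old one is then just removing it from the list, after which callers still sending it are rejected.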

References

| Reference | Purpose |
|---|---|
| System Topology | Three-machine overview with founderyos-api in context |
| oracle-bridge VPS | DAO-side backend for the cross-domain exchange |
| Secret Hygiene | Pattern A secretKeyRef flow, rotation procedure, drift-detection cron |
| founderyos-api/main.py | Router mount tree (50+ routers) + middleware stack |
| founderyos-api/core/security.py | Argon2id + JWT primitives |
| founderyos-api/routers/cross_domain_auth.py | PLATFORM-003.1 consumer endpoint |
| founderyos-api/liquibase/ | 111+ changeset migrations |
| founderyos-api/k8s/ | Deployment + Service + ConfigMap + secret/configmap templates |
| bmad-artifacts/implementation-artifacts/platform-009-2-founderyos-api.md | Secret-hygiene migration story |
| hello-world-workspace/scripts/seed-fos-canonical-accounts.mjs | Regression account seeder (BL-275) |
| BL-223 | Discovery story: 20+ drifted vars; --force-conflicts carve-out |
| BL-229 | OAuth provider secret migration (GitHub + Google) |
| BL-233 | replicas=2 + PDB + rollout strategy (no-503 regression) |
| BL-262 | Cross-domain document-import consumer flow |
| BL-275 | Regression account seeder + Argon2id parity |
| PLATFORM-003.1 / .3 | Cross-domain auth exchange + audit + IC link |
| PLATFORM-004.2 | FOS → governance proposal publish flow |
| PLATFORM-006.3 | ExternalName shim (founderyos-api-xns) for cross-ns Traefik |
| PLATFORM-007.4 | Legacy Stripe routes remaining in founderyos-api |
| PLATFORM-009.2 | Secret-hygiene migration (this service) |

Hello World Co-Op DAO