Operator Runbook¶
Document: docs/docs/operations/operator-runbook.md
Status: Canonical
Last updated: 2026-04-13
Authority: Tim Rignold, RTOpacks Pty Ltd
Who This Is For¶
The person who runs RTOpacks day-to-day. Currently Tim Rignold. This document tells you how to do every operational task without reading architecture docs.
Add a User to the Allowlist¶
- Go to admin.rtopacks.com.au/access
- Select type: Email (specific person) or Domain (entire organisation)
- Enter the value and optional label
- Click Add
The user can now request a magic link at my.rtopacks.com.au/auth. They will not be told they were previously blocked — the experience is seamless.
To remove access: Click Deactivate next to the entry. This blocks new magic link requests but does not invalidate existing sessions.
Add an RTO Client Organisation¶
Currently manual via D1 console:
CLOUDFLARE_API_TOKEN=<token> npx wrangler d1 execute ops-db --remote \
--command="INSERT INTO rto_clients (rto_code, is_client, client_since, client_status, plan) VALUES ('<RTO_CODE>', 1, datetime('now'), 'active', 'essential')"
The RTO code must match a record in the rtos table of rtopacks-db. When a user logs in with an email linked to this org (via workspace-db users.org_id), they will resolve as L4.
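Pasting an RTO code straight into the quoted SQL literal is easy to get wrong. A minimal sketch of a helper that validates the code before building the statement is shown here. The function name build_rto_client_insert is hypothetical, and the alphanumeric-only pattern is an assumption about the shape of RTO codes, so loosen it if real codes differ.

```shell
# Sketch: build the rto_clients INSERT for a validated RTO code.
# Assumption: RTO codes are purely alphanumeric. Adjust the pattern if not.
build_rto_client_insert() {
  case "$1" in
    # Reject empty input or any character that could break out of the SQL literal.
    ''|*[!A-Za-z0-9]*)
      echo "invalid rto_code: $1" >&2
      return 1
      ;;
  esac
  printf "INSERT INTO rto_clients (rto_code, is_client, client_since, client_status, plan) VALUES ('%s', 1, datetime('now'), 'active', 'essential')" "$1"
}

# Usage (the code 12345 is a placeholder):
#   CLOUDFLARE_API_TOKEN=<token> npx wrangler d1 execute ops-db --remote \
#     --command="$(build_rto_client_insert 12345)"
```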
Read the Anomaly Log¶
CLOUDFLARE_API_TOKEN=<token> npx wrangler d1 execute ops-db --remote \
--command="SELECT * FROM anomaly_log ORDER BY created_at DESC LIMIT 20"
Or query via the admin panel (route to be built — ADMIN-UI-01).
Columns: session_id, user_id, org_id, ucca_layer, zone (amber/red), trigger_reason (velocity/sequential/breadth/timing/escalation), detail (JSON), action (latency_injected/session_revoked), created_at.
SEC-02 Red Zone Alert Email¶
Subject: [SEC-02 RED] Session revoked — {user_id}
From: noreply@rtopacks.com.au
To: admin@rtopacks.com.au
What it means: A session exceeded red zone thresholds. The session has been automatically revoked. The user will see a normal "session expired" message — no indication of the revocation reason.
What to do:
1. Check anomaly_log for the session_id in the email
2. Check api_access_log for the full request history of that session
3. Determine if this was a legitimate user (e.g. a power user with many tabs) or an automated agent
4. If legitimate: add a note, no further action needed — the user can re-authenticate
5. If automated: check if the user_id is associated with an org. Consider deactivating the allowlist entry for that email/domain.
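Steps 1 and 2 above can be sketched as two query builders keyed on the session_id from the alert email. The function names are hypothetical. The anomaly_log columns come from the column list in this runbook. Filtering api_access_log by session_id is an assumption, since the runbook only shows a user_id filter; swap the WHERE clause if that column does not exist.

```shell
# Sketch: build the two triage queries for a red-zone session.

# Accept only letters, digits, underscore, and hyphen so the value is safe
# to place inside a quoted SQL literal.
valid_id() {
  case "$1" in
    ''|*[!A-Za-z0-9_-]*) return 1 ;;
  esac
  return 0
}

# Step 1: anomaly rows recorded for the session.
anomaly_query() {
  valid_id "$1" || { echo "invalid session id" >&2; return 1; }
  printf "SELECT zone, trigger_reason, detail, action, created_at FROM anomaly_log WHERE session_id = '%s' ORDER BY created_at DESC" "$1"
}

# Step 2: request history in chronological order.
# Assumption: api_access_log has a session_id column.
access_history_query() {
  valid_id "$1" || { echo "invalid session id" >&2; return 1; }
  printf "SELECT endpoint, method, response_status, timestamp FROM api_access_log WHERE session_id = '%s' ORDER BY timestamp ASC" "$1"
}

# Usage (from apps/admin/):
#   npx wrangler d1 execute ops-db --remote --command="$(anomaly_query <SESSION_ID>)"
```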
Check API Access Log¶
CLOUDFLARE_API_TOKEN=<token> npx wrangler d1 execute ops-db --remote \
--command="SELECT endpoint, method, response_status, timestamp FROM api_access_log WHERE user_id = '<USER_ID>' ORDER BY timestamp DESC LIMIT 50"
Suspicious patterns:
- Hundreds of sequential /rto/{code} or /units/{code} requests
- Sub-second timing between requests
- Requests across many endpoint groups with no repeat pattern
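The sub-second timing pattern can be checked mechanically rather than by eye. The sketch below counts consecutive requests less than one second apart. It assumes the timestamps have been exported one per line as epoch milliseconds; the runbook does not document the stored format of the timestamp column, so convert first if it differs. The function name is hypothetical.

```shell
# Sketch: count consecutive request pairs less than 1000 ms apart.
# Input: one epoch-millisecond timestamp per line on stdin, in either
# ascending or descending order (the absolute gap is used).
count_subsecond_gaps() {
  awk '
    NR > 1 {
      d = $1 - prev
      if (d < 0) d = -d      # tolerate DESC ordering from the log query
      if (d < 1000) n++
    }
    { prev = $1 }
    END { print n + 0 }      # print 0 when there are no gaps at all
  '
}

# Usage:
#   count_subsecond_gaps < timestamps.txt
```

A count that is a large fraction of the 50 returned rows points to an automated agent rather than a person with many tabs open.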
Worker Deployment¶
Always verify the active Cloudflare account before deploying.
The account shown must be the RTOpacks account (f95d45376ebeeeaf011a4f0ec0fb7b38). If it shows e5a9830... (UCCO Foundation), either prefix the command with CLOUDFLARE_API_TOKEN=<token> or cd into a sub-app directory (apps/admin/, apps/workspace/) first so wrangler resolves the correct account from wrangler.jsonc.
Standing rule: for any wrangler d1 execute ... command, always cd apps/admin/ first. See docs/ops/standing-rules.md.
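The account check can be turned into a guard that fails loudly. This is a sketch that assumes the verification command is npx wrangler whoami, which this runbook does not name. The function name is hypothetical. Because whoami lists every account the credential can reach, finding the ID is a necessary check, not proof of which account a deploy will target.

```shell
# Sketch: succeed only if the RTOpacks account ID appears in the text on stdin.
RTOPACKS_ACCOUNT_ID="f95d45376ebeeeaf011a4f0ec0fb7b38"

is_rtopacks_account() {
  grep -q "$RTOPACKS_ACCOUNT_ID"
}

# Usage (assumes `npx wrangler whoami` prints account IDs):
#   npx wrangler whoami | is_rtopacks_account || { echo "wrong account" >&2; exit 1; }
```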
Deploy order (when all surfaces touched):
1. internal-api — always first (other surfaces depend on it)
2. workspace — auth surface
3. admin — operator surface
4. site — public surface (lowest risk)
Commands:
cd workers/internal-api && npx wrangler deploy
cd apps/workspace && npm run deploy
cd apps/admin && npm run deploy
cd apps/site && npm run deploy
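The four commands above use relative paths, so running them back to back in one shell fails after the first cd. A minimal sketch of a wrapper that runs each deploy in a subshell from the repo root, in the documented order, and stops at the first failure is shown here. The function names and the DRY_RUN switch are hypothetical.

```shell
# Sketch: deploy all surfaces in order from the repo root.
# DRY_RUN=1 prints each command instead of running it.

# run_step <dir> <command...>
run_step() {
  step_dir="$1"
  shift
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "cd $step_dir && $*"
  else
    # Subshell keeps the cd from leaking into the next step.
    ( cd "$step_dir" && "$@" )
  fi
}

# The && chain aborts the rollout as soon as one surface fails.
deploy_all() {
  run_step workers/internal-api npx wrangler deploy &&
  run_step apps/workspace npm run deploy &&
  run_step apps/admin npm run deploy &&
  run_step apps/site npm run deploy
}

# Usage:
#   DRY_RUN=1 deploy_all    # preview
#   deploy_all              # real deploy
```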
Key Workers and Their Roles¶
| Worker | Domain | Purpose |
|---|---|---|
| rtopacks-internal-api | internal-api.rtopacks.com.au | Sole access point for rtopacks-db. Auth + logging + anomaly detection. |
| rtopacks-workspace | my.rtopacks.com.au | Workspace app: auth, AppGrid, studio, documents, people |
| rtopacks-admin | admin.rtopacks.com.au | Admin panel, behind CF Access. Org management, CRM, allowlist. |
| rtopacks-site | rtopacks.com.au | Public site: marketing, search, claim flow |
| rtopacks-prelaunch | rtopacks.com.au (front) | Prelaunch blocker; remove at go-live |
Key Database Bindings¶
| Database | ID | Purpose | Access |
|---|---|---|---|
| rtopacks-db | 1249760d-070a-43f8-81d7-de462b626cdf | NRT corpus: qualifications, units, RTOs, scope, CRICOS | READ ONLY from most workers; writers are tga-sync, cricos-sync, tga-ingest, reg-intel, qual-enrichment |
| ops-db | 0692049c-1bf1-49e7-9229-3773eeba1a45 | Operational: billing, sessions, observatory, sync cursors, anomaly/audit logs | Read/Write |
| engine-db-oc | 81e2919a-6587-40a1-b749-0a65103d95f0 | Workspace engine: tier resolution, user sessions, org memberships | Read/Write |
Canonical source for DB IDs: docs/docs/workers/inventory.md.
Billing — Day-to-Day Operations¶
Look up a client's billing state¶
admin → Organisations → [search rto_code] → Billing tab. Shows plan, subscription status, current period, payment method (brand + last4), invoice history with QB sync status.
Source: GET /billing/admin/org/:rto_code on internal-api. See docs/docs/infrastructure/finance-reference.md.
Retry a failed QB push¶
admin → Finance → QuickBooks tab → Retry next to any invoice stuck at qb_sync_status = 'failed'. Or programmatically: POST /billing/qb-retry { invoice_id }.
Reconnect QuickBooks (token died)¶
admin → Finance → QuickBooks tab → amber "Reconnect QuickBooks" button. OAuth flow via CF Access. Full procedure + detection signals: docs/ops/standing-rules.md.
In production, this should rarely be needed — the daily heartbeat cron (qb-reconcile) keeps the rolling 101-day token alive automatically. See docs/docs/infrastructure/qb-config.md for QB-HEARTBEAT-01.
Decovert a test client back to non-client¶
Situation: a test run against a real rto_code flipped is_client=1 on rto_clients. To undo without wiping real paying clients:
admin → Finance → Test Tools → Decovert RTO button (takes rto_code). Calls POST /api/admin/organisations/[code] with action=decovert. Real-path conversions are NOT touched by the bulk reset because they write is_test=0.
Full test data purge¶
admin → Finance → Test Tools → Reset Test Data. Unions three sources (rto_clients.is_test=1, billing_customers.is_test=1, billing_subscriptions.is_test=1 joined), decoverts the union, nukes billing rows. Response reports orgs_decoverted count. Does NOT touch NRT data.
Sync Operations¶
Check if the weekly TGA sync ran¶
- Check inbox for the Sunday [tga-sync] cycle complete email (NOTIFY-01)
- If missing: admin → Observatory and filter by sync_type=tga_sync
- If no run row: the worker cron didn't fire. Run wrangler deployments list --name tga-sync from scripts/workers/tga-sync/ and check queue health
- Manual re-trigger: curl -X POST https://tga-sync.dark-firefly-3289.workers.dev/trigger
Same pattern for cricos-sync (monthly, 1st of month ~4am AEST).
See docs/docs/operations/observatory-guide.md and docs/docs/infrastructure/notifications.md.
Check the QB reconciliation ran¶
Expected: {"status":"ok","synced":N,"failed":0,"abandoned":0,"heartbeat":true}. heartbeat:true means the QB token refresh worked — the most important signal.
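The expected response can be checked mechanically. This sketch reads the JSON body on stdin and succeeds only when status is ok, failed is 0, and heartbeat is true. It uses grep rather than a JSON parser to avoid extra dependencies, so it assumes the flat single-object shape shown above. The function name is hypothetical, and how the body is fetched is left to the operator.

```shell
# Sketch: validate a QB reconciliation response read from stdin.
qb_reconcile_ok() {
  qb_body="$(cat)"
  # Overall status must be "ok".
  printf '%s' "$qb_body" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"' &&
  # No invoice may have failed to sync.
  printf '%s' "$qb_body" | grep -Eq '"failed"[[:space:]]*:[[:space:]]*0[,}[:space:]]' &&
  # Token refresh must have worked (the most important signal).
  printf '%s' "$qb_body" | grep -Eq '"heartbeat"[[:space:]]*:[[:space:]]*true'
}

# Usage:
#   qb_reconcile_ok < response.json || echo "QB reconciliation needs attention" >&2
```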
Investigate a stuck sync phase¶
See Observatory guide. Short version: read tga_sync_cursor.cycle_phase / cricos_sync_cursor.cycle_phase in ops-db to find the phase pointer, then re-trigger the worker to resume from that point.
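Reading the phase pointer can be sketched as a small query builder for the two cursor tables named above. The function name is hypothetical, and the sketch selects only cycle_phase because no other cursor columns are documented here.

```shell
# Sketch: build the query that reads the current phase pointer from ops-db.
phase_query() {
  case "$1" in
    tga)    printf 'SELECT cycle_phase FROM tga_sync_cursor' ;;
    cricos) printf 'SELECT cycle_phase FROM cricos_sync_cursor' ;;
    *)
      echo "usage: phase_query tga|cricos" >&2
      return 1
      ;;
  esac
}

# Usage (from apps/admin/, per the standing rule):
#   npx wrangler d1 execute ops-db --remote --command="$(phase_query tga)"
```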
Amendment Log¶
| Date | Change | Authority |
|---|---|---|
| 2026-04-06 | Initial document — Session 44 | Tim Rignold |
| 2026-04-13 | Session 52 docs pass: corrected DB IDs, added billing + sync + QB operations sections | Alex (Claude Code) |