Operator Runbook

Document: docs/docs/operations/operator-runbook.md
Status: Canonical
Last updated: 2026-04-13
Authority: Tim Rignold, RTOpacks Pty Ltd


Who This Is For

The person who runs RTOpacks day-to-day. Currently Tim Rignold. This document tells you how to do every operational task without reading architecture docs.


Add a User to the Allowlist

  1. Go to admin.rtopacks.com.au/access
  2. Select type: Email (specific person) or Domain (entire organisation)
  3. Enter the value and optional label
  4. Click Add

The user can now request a magic link at my.rtopacks.com.au/auth. They will not be told they were previously blocked — the experience is seamless.

To remove access: Click Deactivate next to the entry. This blocks new magic link requests but does not invalidate existing sessions.


Add an RTO Client Organisation

Currently manual via D1 console:

CLOUDFLARE_API_TOKEN=<token> npx wrangler d1 execute ops-db --remote \
  --command="INSERT INTO rto_clients (rto_code, is_client, client_since, client_status, plan) VALUES ('<RTO_CODE>', 1, datetime('now'), 'active', 'essential')"

The RTO code must match a record in the rtos table of rtopacks-db. When a user logs in with an email linked to this org (via workspace-db users.org_id), they will resolve as L4.
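Because the RTO code is pasted straight into a SQL string, a small wrapper that validates it first can prevent quoting accidents. The following is a minimal sketch, not part of the existing tooling: build_client_insert is a hypothetical helper, and the assumption that RTO codes are strictly alphanumeric should be checked against the rtos table.

```shell
# Hypothetical helper: print the INSERT statement for a new client row.
# Assumption: RTO codes contain only letters and digits, so anything else
# (quotes, spaces, semicolons) is rejected before it reaches the SQL string.
build_client_insert() {
  case "$1" in
    ''|*[!A-Za-z0-9]*)
      echo "invalid rto_code: $1" >&2
      return 1
      ;;
  esac
  printf "INSERT INTO rto_clients (rto_code, is_client, client_since, client_status, plan) VALUES ('%s', 1, datetime('now'), 'active', 'essential')" "$1"
}

# Usage (from apps/admin/, per the standing rule on wrangler d1 execute):
#   sql="$(build_client_insert '<RTO_CODE>')" && \
#     CLOUDFLARE_API_TOKEN=<token> npx wrangler d1 execute ops-db --remote --command="$sql"
```

The helper only prints the statement; nothing is written to ops-db unless the output is passed to wrangler.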


Read the Anomaly Log

CLOUDFLARE_API_TOKEN=<token> npx wrangler d1 execute ops-db --remote \
  --command="SELECT * FROM anomaly_log ORDER BY created_at DESC LIMIT 20"

Or query via the admin panel (route to be built — ADMIN-UI-01).

Columns: session_id, user_id, org_id, ucca_layer, zone (amber/red), trigger_reason (velocity/sequential/breadth/timing/escalation), detail (JSON), action (latency_injected/session_revoked), created_at.


SEC-02 Red Zone Alert Email

Subject: [SEC-02 RED] Session revoked — {user_id}
From: noreply@rtopacks.com.au
To: admin@rtopacks.com.au

What it means: A session exceeded red zone thresholds. The session has been automatically revoked. The user will see a normal "session expired" message — no indication of the revocation reason.

What to do:

  1. Check anomaly_log for the session_id in the email
  2. Check api_access_log for the full request history of that session
  3. Determine if this was a legitimate user (e.g. a power user with many tabs) or an automated agent
  4. If legitimate: add a note, no further action needed; the user can re-authenticate
  5. If automated: check if the user_id is associated with an org. Consider deactivating the allowlist entry for that email/domain.
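The first two triage steps can be sketched as a pair of query builders keyed on the session_id from the alert email. These helpers are hypothetical. They assume session IDs contain only letters, digits, hyphens and underscores, and that api_access_log has a session_id column, which this runbook implies but does not show.

```shell
# Hypothetical helpers for SEC-02 triage. Each prints a query; neither runs it.

# Assumption: session IDs use only letters, digits, '-' and '_'.
check_session_id() {
  case "$1" in
    ''|*[!A-Za-z0-9_-]*)
      echo "invalid session_id" >&2
      return 1
      ;;
  esac
}

# Step 1: anomaly_log rows for the revoked session, newest first.
anomaly_sql() {
  check_session_id "$1" || return 1
  printf "SELECT zone, trigger_reason, detail, action, created_at FROM anomaly_log WHERE session_id = '%s' ORDER BY created_at DESC" "$1"
}

# Step 2: request history in chronological order.
# Assumption: api_access_log carries a session_id column.
access_sql() {
  check_session_id "$1" || return 1
  printf "SELECT endpoint, method, response_status, timestamp FROM api_access_log WHERE session_id = '%s' ORDER BY timestamp ASC" "$1"
}

# Usage (from apps/admin/):
#   CLOUDFLARE_API_TOKEN=<token> npx wrangler d1 execute ops-db --remote \
#     --command="$(anomaly_sql '<SESSION_ID>')"
```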


Check API Access Log

CLOUDFLARE_API_TOKEN=<token> npx wrangler d1 execute ops-db --remote \
  --command="SELECT endpoint, method, response_status, timestamp FROM api_access_log WHERE user_id = '<USER_ID>' ORDER BY timestamp DESC LIMIT 50"

Suspicious patterns:

  - Hundreds of sequential /rto/{code} or /units/{code} requests
  - Sub-second timing between requests
  - Requests across many endpoint groups with no repeat pattern
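One way to surface these patterns without reading 50 rows by eye is to bucket a user's requests per minute and count distinct endpoints in each bucket. This is a sketch under assumptions: burst_sql is a hypothetical helper, the timestamp column is assumed to be in a format SQLite's strftime can parse, and user IDs are assumed to contain only letters, digits and the characters _ . @ -.

```shell
# Hypothetical helper: print a query that groups one user's requests by minute.
# High "requests" with high "endpoints" in one minute suggests automated traversal.
# Assumptions: timestamp is parseable by SQLite strftime; user_id has no quotes.
burst_sql() {
  case "$1" in
    ''|*[!A-Za-z0-9_.@-]*)
      echo "invalid user_id" >&2
      return 1
      ;;
  esac
  # %% in the printf format yields a literal % for strftime.
  printf "SELECT strftime('%%Y-%%m-%%d %%H:%%M', timestamp) AS minute, COUNT(*) AS requests, COUNT(DISTINCT endpoint) AS endpoints FROM api_access_log WHERE user_id = '%s' GROUP BY minute ORDER BY requests DESC LIMIT 20" "$1"
}

# Usage (from apps/admin/):
#   CLOUDFLARE_API_TOKEN=<token> npx wrangler d1 execute ops-db --remote \
#     --command="$(burst_sql '<USER_ID>')"
```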


Worker Deployment

Always verify account before deploying:

npx wrangler whoami

Must show the RTOpacks account (f95d45376ebeeeaf011a4f0ec0fb7b38). If it shows e5a9830... (UCCO Foundation), run with CLOUDFLARE_API_TOKEN=<token> prefix or cd into a sub-app directory (apps/admin/, apps/workspace/) first so wrangler resolves the correct account from wrangler.jsonc.
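The account check can be turned into a guard that refuses to continue when the wrong account is active. A minimal sketch, assuming the account ID appears verbatim somewhere in the wrangler whoami output; is_rtopacks_account is a hypothetical helper name.

```shell
# RTOpacks account ID, as stated in this runbook.
EXPECTED_ACCOUNT_ID="f95d45376ebeeeaf011a4f0ec0fb7b38"

# Hypothetical guard: reads `wrangler whoami` output on stdin and succeeds
# only if the expected account ID appears in it.
is_rtopacks_account() {
  grep -q "$EXPECTED_ACCOUNT_ID"
}

# Usage:
#   npx wrangler whoami 2>&1 | is_rtopacks_account \
#     || { echo "Wrong Cloudflare account; aborting" >&2; exit 1; }
```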

Standing rule: for any wrangler d1 execute ... command, always cd apps/admin/ first. See docs/ops/standing-rules.md.

Deploy order (when all surfaces touched):

  1. internal-api: always first (other surfaces depend on it)
  2. workspace: auth surface
  3. admin: operator surface
  4. site: public surface (lowest risk)

Commands:

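# Run from the repository root. Each subshell leaves the working directory unchanged,
# so the next relative path still resolves.
(cd workers/internal-api && npx wrangler deploy)
(cd apps/workspace && npm run deploy)
(cd apps/admin && npm run deploy)
(cd apps/site && npm run deploy)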
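When all four surfaces are touched, the deploy order could be scripted so that a failure in an earlier surface stops the later ones. This is a sketch, not an existing script: deploy_all and the DRY_RUN switch are hypothetical, and it assumes it is run from the repository root.

```shell
# Deploy targets in dependency order, one "directory|command" pair per line.
DEPLOY_STEPS='workers/internal-api|npx wrangler deploy
apps/workspace|npm run deploy
apps/admin|npm run deploy
apps/site|npm run deploy'

# Hypothetical wrapper: runs each step in its own subshell and stops at the
# first failure. With DRY_RUN=1 it only prints what it would run.
deploy_all() {
  echo "$DEPLOY_STEPS" | while IFS='|' read -r dir cmd; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "(cd $dir && $cmd)"
    else
      # stdin is redirected so the deploy command cannot swallow the remaining steps.
      (cd "$dir" && $cmd) </dev/null || {
        echo "deploy failed in $dir; stopping" >&2
        exit 1
      }
    fi
  done
}

# Usage:
#   DRY_RUN=1 deploy_all   # preview the order
#   deploy_all             # deploy for real
```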


Key Workers and Their Roles

| Worker | Domain | Purpose |
| --- | --- | --- |
| rtopacks-internal-api | internal-api.rtopacks.com.au | Sole access point for rtopacks-db. Auth + logging + anomaly detection. |
| rtopacks-workspace | my.rtopacks.com.au | Workspace app: auth, AppGrid, studio, documents, people |
| rtopacks-admin | admin.rtopacks.com.au | Admin panel, behind CF Access. Org management, CRM, allowlist. |
| rtopacks-site | rtopacks.com.au | Public site: marketing, search, claim flow |
| rtopacks-prelaunch | rtopacks.com.au (front) | Prelaunch blocker; remove at go-live |

Key Database Bindings

| Database | ID | Purpose | Access |
| --- | --- | --- | --- |
| rtopacks-db | 1249760d-070a-43f8-81d7-de462b626cdf | NRT corpus: qualifications, units, RTOs, scope, CRICOS | READ ONLY from most workers; writers are tga-sync, cricos-sync, tga-ingest, reg-intel, qual-enrichment |
| ops-db | 0692049c-1bf1-49e7-9229-3773eeba1a45 | Operational: billing, sessions, observatory, sync cursors, anomaly/audit logs | Read/Write |
| engine-db-oc | 81e2919a-6587-40a1-b749-0a65103d95f0 | Workspace engine: tier resolution, user sessions, org memberships | Read/Write |

Canonical source for DB IDs: docs/docs/workers/inventory.md.


Billing — Day-to-Day Operations

Look up a client's billing state

admin → Organisations → [search rto_code] → Billing tab. Shows plan, subscription status, current period, payment method (brand + last4), invoice history with QB sync status.

Source: GET /billing/admin/org/:rto_code on internal-api. See docs/docs/infrastructure/finance-reference.md.

Retry a failed QB push

admin → Finance → QuickBooks tab → Retry next to any invoice stuck at qb_sync_status = 'failed'. Or programmatically: POST /billing/qb-retry { invoice_id }.

Reconnect QuickBooks (token died)

admin → Finance → QuickBooks tab → amber "Reconnect QuickBooks" button. OAuth flow via CF Access. Full procedure + detection signals: docs/ops/standing-rules.md.

In production, this should rarely be needed — the daily heartbeat cron (qb-reconcile) keeps the rolling 101-day token alive automatically. See docs/docs/infrastructure/qb-config.md for QB-HEARTBEAT-01.

Decovert a test client back to non-client

Situation: a test run against a real rto_code flipped is_client=1 on rto_clients. To undo without wiping real paying clients:

admin → Finance → Test Tools → Decovert RTO button (takes rto_code). Calls POST /api/admin/organisations/[code] with action=decovert. Real-path conversions are NOT touched by the bulk reset because they write is_test=0.

Full test data purge

admin → Finance → Test Tools → Reset Test Data. Unions three sources (rto_clients.is_test=1, billing_customers.is_test=1, billing_subscriptions.is_test=1 joined), decoverts the union, nukes billing rows. Response reports orgs_decoverted count. Does NOT touch NRT data.


Sync Operations

Check if the weekly TGA sync ran

  1. Check inbox for the Sunday [tga-sync] cycle complete email (NOTIFY-01)
  2. If missing: admin → Observatory and filter by sync_type=tga_sync
  3. If no run row: the worker cron didn't fire. Run wrangler deployments list --name tga-sync from scripts/workers/tga-sync/ and check queue health
  4. Manual re-trigger: curl -X POST https://tga-sync.dark-firefly-3289.workers.dev/trigger

Same pattern for cricos-sync (monthly, 1st of month ~4am AEST).

See docs/docs/operations/observatory-guide.md and docs/docs/infrastructure/notifications.md.

Check the QB reconciliation ran

curl -X POST https://qb-reconcile.dark-firefly-3289.workers.dev/_test

Expected: {"status":"ok","synced":N,"failed":0,"abandoned":0,"heartbeat":true}. heartbeat:true means the QB token refresh worked — the most important signal.
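The expected response can be checked mechanically rather than by eye. A minimal sketch, assuming the worker returns compact JSON in exactly the shape shown above; qb_reconcile_ok is a hypothetical helper, and a real JSON parser would be more robust than grep.

```shell
# Hypothetical check: reads the /_test response on stdin and succeeds only if
# status is ok, failed is 0 and heartbeat is true.
# Assumption: compact JSON with no whitespace between keys and values.
qb_reconcile_ok() {
  body="$(cat)"
  echo "$body" | grep -q '"status":"ok"' &&
    echo "$body" | grep -q '"failed":0[,}]' &&
    echo "$body" | grep -q '"heartbeat":true'
}

# Usage:
#   curl -s -X POST https://qb-reconcile.dark-firefly-3289.workers.dev/_test \
#     | qb_reconcile_ok || echo "QB reconcile needs attention" >&2
```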

Investigate a stuck sync phase

See Observatory guide. Short version: read tga_sync_cursor.cycle_phase / cricos_sync_cursor.cycle_phase in ops-db to find the phase pointer, then re-trigger the worker to resume from that point.
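Reading the phase pointer follows the same wrangler pattern as the other ops-db queries. The sketch below only builds the query text. cursor_sql is a hypothetical helper, and it selects cycle_phase alone because that is the only cursor column this runbook names.

```shell
# Hypothetical helper: print the query that reads the phase pointer for a sync worker.
cursor_sql() {
  case "$1" in
    tga)    echo "SELECT cycle_phase FROM tga_sync_cursor" ;;
    cricos) echo "SELECT cycle_phase FROM cricos_sync_cursor" ;;
    *)
      echo "unknown sync: $1 (expected tga or cricos)" >&2
      return 1
      ;;
  esac
}

# Usage (from apps/admin/):
#   CLOUDFLARE_API_TOKEN=<token> npx wrangler d1 execute ops-db --remote \
#     --command="$(cursor_sql tga)"
```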


Amendment Log

| Date | Change | Authority |
| --- | --- | --- |
| 2026-04-06 | Initial document (Session 44) | Tim Rignold |
| 2026-04-13 | Session 52 docs pass: corrected DB IDs, added billing + sync + QB operations sections | Alex (Claude Code) |