Archived Brief
This brief has been completed and is retained as a build record.
B-ENRICH-04b/c/d/e — VOCSTATS Full Collection Ingest¶
SURFACE: rtopacks-db D1 — data layer only
CLOUDFLARE ACCOUNT: e5a9830215a8d88961dc6c80a8c7442a (UCCA)
DO NOT TOUCH: All Workers, public surfaces, engine-db, UCCO account
DB: rtopacks-db D1 334ac8fb-9850-48c0-9da0-b56c55640e98
PREREQUISITE: B-ENRICH-04 complete — VOCSTATS API key confirmed working, schema endpoint explored
BRIEF DRIP RULE: Deploy and confirm before next brief.
Overview¶
This brief completes the VOCSTATS data layer by ingesting the remaining four collections available under the RignoldT account. Run after B-ENRICH-04 (TVA enrolments) is confirmed working — the API key, schema discovery pattern, and base script are already proven.
API key: VOCSTATS_API_KEY — already in Cloudflare Worker secrets from B-ENRICH-04
Base URL: https://vocstats.ncver.edu.au/webapi/rest/v1/
B-ENRICH-04b — TVA Program Completions 2015–2024¶
Collection: Total VET students and courses (TVA)
Table: TVA program completions 2015–2024
Query dimensions: - Qualification/program code - Collection year (2015–2024) - State/territory (for state-level breakdown — optional, do national totals first)
Measure: Program completions count
Schema:
CREATE TABLE IF NOT EXISTS vocstats_completions (
id INTEGER PRIMARY KEY AUTOINCREMENT,
qual_code TEXT NOT NULL,
collection_year INTEGER NOT NULL,
completions INTEGER,
collection TEXT NOT NULL DEFAULT 'TVA',
ingested_at TEXT NOT NULL DEFAULT (datetime('now')),
UNIQUE(qual_code, collection_year, collection)
);
CREATE INDEX IF NOT EXISTS idx_vocstats_comp_qual ON vocstats_completions(qual_code);
Display value (future B-DETAIL-08): Completion rate = completions / enrolments × 100. Show as "X% completion rate (2024)" on qual page.
B-ENRICH-04c — Student Outcomes Survey¶
Collection: Student outcomes
Table: allusers
What this contains: Post-completion survey data — employment outcomes, further study, satisfaction. Survey-based, so different structure from enrolment data. Typically available at AQF level and field of education level, not always at individual qual code level — check schema first.
Query dimensions: - Program/qualification identifier (check if qual code or FOE code) - Survey year - Outcome type (employed, further study, satisfied overall)
Measure: Percentage achieving outcome (perturbed)
Schema:
CREATE TABLE IF NOT EXISTS vocstats_outcomes (
id INTEGER PRIMARY KEY AUTOINCREMENT,
qual_code TEXT,
foe_code TEXT,
aqf_level TEXT,
survey_year INTEGER NOT NULL,
outcome_type TEXT NOT NULL,
outcome_pct REAL,
sample_size INTEGER,
collection TEXT NOT NULL DEFAULT 'SOS',
ingested_at TEXT NOT NULL DEFAULT (datetime('now')),
UNIQUE(qual_code, foe_code, aqf_level, survey_year, outcome_type)
);
CREATE INDEX IF NOT EXISTS idx_vocstats_out_qual ON vocstats_outcomes(qual_code);
CREATE INDEX IF NOT EXISTS idx_vocstats_out_foe ON vocstats_outcomes(foe_code);
Note: If the Student Outcomes data only resolves to FOE (field of education) level rather than individual qual codes, store at FOE level and join to quals via the qualifications.training_package_code → FOE mapping. The qual page can then show "X% of graduates in this field of education found employment" rather than qual-specific data.
Priority outcome types to capture:
- employed_after_training — employed 6 months after completing
- further_study — went on to further study
- satisfied_overall — overall satisfaction with training
- improved_employment — improved employment status after training
B-ENRICH-04d — Apprentices and Trainees¶
Collection: Apprentices and trainees
Table: explore via /schema — likely quarterly commencements, in-training, completions
What this contains: Training contract data — commencements, in-training counts, completions, cancellations. Quarterly from the 1990s. Indexed by occupation (ANZSCO), industry (ANZSIC), state, age, gender.
Note on in-training figures: Cannot be cumulated over time — represent active contracts at end of quarter. Use commencements or completions for trend analysis.
Query dimensions: - ANZSCO occupation code (4-digit) - Year/quarter - State/territory - Contract type (apprenticeship vs traineeship)
Measure: Commencements, completions (not in-training for trend)
Schema:
CREATE TABLE IF NOT EXISTS vocstats_apprentices (
id INTEGER PRIMARY KEY AUTOINCREMENT,
anzsco_code TEXT NOT NULL,
collection_year INTEGER NOT NULL,
collection_quarter INTEGER,
state TEXT,
contract_type TEXT,
commencements INTEGER,
completions INTEGER,
in_training INTEGER,
ingested_at TEXT NOT NULL DEFAULT (datetime('now')),
UNIQUE(anzsco_code, collection_year, collection_quarter, state, contract_type)
);
CREATE INDEX IF NOT EXISTS idx_vocstats_app_anzsco ON vocstats_apprentices(anzsco_code);
Join to qual page: vocstats_apprentices.anzsco_code joins to qualifications.anzsco_code. On the qual page Employment tab: "X apprenticeships/traineeships commenced in this occupation (2024)."
Covid caveat: NCVER flags that Covid (2020 onwards) significantly disrupted apprentice/trainee numbers. Display note: "Data from 2020 onwards reflects COVID-19 disruption to apprenticeship and traineeship activity."
B-ENRICH-04e — VET in Schools (PRIORITY)¶
Collection: VET in Schools
Tables:
- ViS program enrolments 2006–2024
- ViS program completions 2006–2024
- ViS students 2006–2024
- ViS subject enrolments 2006–2023
Why this matters: VET in Schools is delivered to secondary students (Years 10–12) who enrol in nationally recognised VET qualifications as part of their senior secondary certificate. Almost no commercial platform surfaces this data. RTOs that deliver VIS programmes have zero intelligence tooling. 18 years of data back to 2006.
What this contains: VIS enrolments and completions by qualification, year, state, school sector (government/Catholic/independent), year level. Perturbation applied for confidentiality.
Query dimensions: - Program/qualification code - Collection year (2006–2024) - State/territory - School sector (optional — do national totals first)
Measure: Program enrolments, program completions
Schema:
CREATE TABLE IF NOT EXISTS vocstats_vis_enrolments (
id INTEGER PRIMARY KEY AUTOINCREMENT,
qual_code TEXT NOT NULL,
collection_year INTEGER NOT NULL,
state TEXT DEFAULT 'National',
school_sector TEXT DEFAULT 'All',
enrolments INTEGER,
completions INTEGER,
collection TEXT NOT NULL DEFAULT 'VIS',
ingested_at TEXT NOT NULL DEFAULT (datetime('now')),
UNIQUE(qual_code, collection_year, state, school_sector)
);
CREATE INDEX IF NOT EXISTS idx_vocstats_vis_qual ON vocstats_vis_enrolments(qual_code);
CREATE INDEX IF NOT EXISTS idx_vocstats_vis_year ON vocstats_vis_enrolments(collection_year);
Display value (future B-DETAIL-08): On quals that have VIS data, show a distinct badge: "Delivered in secondary schools" with enrolment count and year. This is a differentiator — no competitor has it.
VIS training packages (priority quals): VIS is heavily concentrated in certain training packages — SIT (hospitality), CHC (community services/early childhood), ICT (technology), CPC (construction), HLT (health). These will have the most data.
Implementation approach¶
Run all four as a single extended ingest script: scripts/ingest-vocstats-full.mjs
Order of operations:
1. Discover all dataset IDs via /schema (one call — already done in B-ENRICH-04 if that worked)
2. For each collection, run the query and insert to the appropriate table
3. Log results per collection: rows inserted, year range, qual coverage
4. Single run, ~30–60 minutes depending on API rate limits
Rate limiting: Check /rate_limit endpoint before running. If limits are tight, run each collection separately with a day between runs.
Error handling: Some cells will be suppressed (null) due to perturbation — skip nulls, log count of suppressed cells per collection.
Ops stubs (OPS-AS-OS rule)¶
Add to ops.rtopacks.com.au under Data Enrichment:
| Table | Collection | Rows | Year range | Last ingest |
|---|---|---|---|---|
| vocstats_completions | TVA completions | [n] | 2015–2024 | [ts] |
| vocstats_outcomes | Student outcomes | [n] | [years] | [ts] |
| vocstats_apprentices | Apprentices & trainees | [n] | [years] | [ts] |
| vocstats_vis_enrolments | VET in Schools | [n] | 2006–2024 | [ts] |
Deploy checklist¶
- All four tables created in rtopacks-db
- Schema explored for each collection — dataset IDs and field IDs documented
- TVA completions ingested — spot check
SELECT * FROM vocstats_completions WHERE qual_code = 'PSP50922' - Student outcomes ingested — check if qual-level or FOE-level data, document which
- Apprentices & trainees ingested — spot check ANZSCO 2643 (Interpreters & Translators)
- VET in Schools ingested — spot check a high-VIS qual (e.g. SIT30821 Certificate III in Commercial Cookery)
- VIS coverage report:
SELECT COUNT(DISTINCT qual_code) FROM vocstats_vis_enrolments - All ops stubs added
- Ingest script committed to repo for annual refresh
What this does NOT build¶
Display of this data on the qual page — that's B-DETAIL-08. This brief is data layer only.