Data Sources
Version: 1.0
Updated: 31 March 2026
Maintained by: Tim Rignold / Alex
Full technical reference for every dataset ingested into the RTOpacks platform. All data is Australian government-sourced unless otherwise noted. All ingestion is via automated Workers unless marked manual.
Primary databases
| Database | UUID | Region | Size | Purpose |
|---|---|---|---|---|
| rtopacks-db | 334ac8fb | OC/SYD | 2.9GB | NRT corpus, RTOs, scope, enrichment |
| licensing-db-oc | 6c2abf4d | OC/MEL | ~21MB | NSW Fair Trading, TEQSA TQNR |
| abs-db-oc | 66389f3f | OC/BNE | 620MB | ABS Labour Force, TableBuilder exports |
| calendar-db | c51c8778 | OC/AKL | — | VET industry calendar |
| ops-db | 00daba3d | OC | 373KB | UCCA business ops, Magda discovery |
| microcredentials-db-oc | 3924412d | OC/SYD | 74KB | Non-accredited content |
| engine-db-oc | fb6ddc43 | OC/AKL | 803KB | UCCA engine, users, credentials |
Dataset inventory
1. TGA / Training.gov.au (DEWR)
What: Full NRT corpus — qualifications, units, skill sets, RTOs, scope
Tables: qualifications, units, skill_sets, rtos, rto_scope_v2 (6.2M rows), qualification_units, and ~90 supporting tables
DB: rtopacks-db
Sync: tga-sync Worker — weekly, Sunday 2am AEST
API: training.gov.au REST API
Licence: CC BY 4.0
Volume: ~8,000 quals, 75,000 units, 12,500 RTOs, 6.2M scope rows
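
A post-sync sanity check against the volumes above could look like the following. This is a minimal sketch: the `EXPECTED` figures come from the Volume line, while the function name, the 10% tolerance and the shape of the `counts` input are assumptions for illustration and are not described as part of tga-sync.

```python
# Approximate row counts documented for the TGA dataset.
EXPECTED = {
    "qualifications": 8_000,
    "units": 75_000,
    "rtos": 12_500,
    "rto_scope_v2": 6_200_000,
}


def volume_drift(counts: dict, tolerance: float = 0.10) -> list:
    """Return the tables whose row count deviates from EXPECTED by more than `tolerance`.

    `counts` maps table name to the row count observed after a sync
    (hypothetical input; how the counts are gathered is left open).
    A table missing from `counts` is treated as having zero rows.
    """
    flagged = []
    for table, expected in EXPECTED.items():
        actual = counts.get(table, 0)
        if abs(actual - expected) / expected > tolerance:
            flagged.append(table)
    return flagged
```

A non-empty result would suggest a partial or failed sync worth investigating before the new data is relied on.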
2. Jobs and Skills Australia (JSA)
What: Occupation Shortage List (OSL), Internet Vacancy Index (IVI), employment projections, yourcareer.gov.au course-to-career pathways (detailed under dataset 12)
Tables: osl_ratings, ivi_vacancies, emp_projections, qual_career_pathways
DB: rtopacks-db
Sync: Manual ingest + jsa-ingest Worker (in development)
API: jobsandskills.gov.au — OSL quarterly, IVI monthly
Licence: CC BY 4.0
3. CRICOS / Department of Education
What: Commonwealth Register of Institutions and Courses for Overseas Students
Tables: cricos_providers, cricos_courses, cricos_locations, cricos_course_locations
DB: rtopacks-db
Sync: cricos-sync Worker — 1st of month, 4am AEST
API: data.gov.au (dataset: e5ae7059)
Licence: CC BY 2.5 Australia
Volume: ~1,542 providers, ~26,000 courses
Historical: 48 monthly snapshots back to July 2021 (ingest pending — CRICOS-02)
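
For planning the historical ingest, the snapshot periods can be enumerated ahead of time. The sketch below assumes the 48 snapshots are consecutive calendar months with July 2021 as the earliest; the source does not state the end month, and the function name is hypothetical.

```python
def snapshot_months(start_year: int = 2021, start_month: int = 7, count: int = 48) -> list:
    """List `count` consecutive months as YYYY-MM strings, earliest first."""
    months = []
    for offset in range(count):
        # Zero-based month index counted from January of start_year.
        index = (start_month - 1) + offset
        year = start_year + index // 12
        month = index % 12 + 1
        months.append(f"{year}-{month:02d}")
    return months
```

Under that assumption the list runs from 2021-07 through 2025-06.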
4. Australian Bureau of Statistics (ABS)
What: Labour Force Survey (SDMX API), TableBuilder exports — COE, WRTAL, ERP population
Tables: abs_lf, abs_lf_ages, abs_lf_edu, abs_lf_hours, abs_lf_under, abs_tb_lf_occupation (42,300 rows), abs_tb_coe (14,742 rows), abs_tb_edwork_2025 (405 rows), abs_tb_wrtal_qual_occupation_state (900 rows), abs_wpi, abs_jv, abs_awe, abs_labour_acct, abs_annual_erp_asgs2021, abs_annual_erp_lga2024
DB: abs-db-oc
Sync: abs-monthly-sync, abs-quarterly-sync, abs-annual-sync Workers
API: ABS SDMX API (stat.data.abs.gov.au) + TableBuilder manual exports
Licence: CC BY 4.0
TableBuilder account: User ID 1258668
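
SDMX REST data queries generally follow the pattern `{base}/data/{dataflow}/{key}` with optional `startPeriod` and `endPeriod` parameters. The sketch below builds such a URL; the base URL and the dataflow identifier in the test values are placeholders, not the endpoints the abs-*-sync Workers actually call, and should be checked against the ABS API documentation.

```python
from typing import Optional
from urllib.parse import urlencode


def sdmx_data_url(
    base_url: str,
    dataflow: str,
    key: str = "all",
    start_period: Optional[str] = None,
    end_period: Optional[str] = None,
) -> str:
    """Build an SDMX REST data query URL.

    base_url and dataflow are supplied by the caller; nothing here is
    specific to the ABS service. `key="all"` requests every series in the flow.
    """
    params = {}
    if start_period:
        params["startPeriod"] = start_period
    if end_period:
        params["endPeriod"] = end_period
    url = f"{base_url.rstrip('/')}/data/{dataflow}/{key}"
    return f"{url}?{urlencode(params)}" if params else url
```

Restricting `startPeriod` to the last ingested period keeps a monthly sync incremental rather than re-pulling the full series.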
5. NCVER
What: Apprentice and trainee data (national quarterly series), VOCSTATS enrolment and completion tables
Tables: ncver_apprentice_national, vocstats_completions_foe, vocstats_enrolments_* (8 tables)
DB: rtopacks-db
Sync: Manual ingest — quarterly on NCVER data release
Source: ncver.edu.au/research-and-statistics
Licence: CC BY 4.0
6. State training authorities (×8)
What: State funding schedules — qualification-level subsidy rates for all 8 jurisdictions
Tables: state_funding
DB: rtopacks-db
Sync: Manual — updated when state authorities publish new schedules
Sources: NSW Training Services, Skills Victoria, DESBT QLD, DTWD WA, DITI SA, Skills Tasmania, ACT Skills, DITT NT
7. Australian Business Register (ABR)
What: ABN status, entity type, GST registration, business names — enrichment on all RTOs
Tables: Enrichment columns on rtos table
DB: rtopacks-db
Sync: qual-enrichment Worker — daily 3am AEST
API: abr.business.gov.au
Licence: CC BY 4.0
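
Before an ABN is sent for lookup, it can be screened with the published ABN check-digit rule: subtract 1 from the first digit, multiply each digit by its weight, and test the sum for divisibility by 89. Whether qual-enrichment performs this step is not stated in this reference; the sketch below (with a hypothetical function name) only shows how such a pre-check could work.

```python
# Weighting factors for the 11 ABN digits, left to right.
ABN_WEIGHTS = (10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19)


def is_valid_abn(abn: str) -> bool:
    """Return True if `abn` has 11 digits and passes the mod-89 checksum."""
    compact = "".join(abn.split())  # drop spaces such as "51 824 753 556"
    if len(compact) != 11 or not (compact.isascii() and compact.isdigit()):
        return False
    digits = [int(c) for c in compact]
    digits[0] -= 1  # the rule subtracts 1 from the leading digit
    total = sum(d * w for d, w in zip(digits, ABN_WEIGHTS))
    return total % 89 == 0
```

For example, `is_valid_abn("51 824 753 556")` returns `True`, while changing the final digit to 7 makes it return `False`. A passing checksum only shows the number is well formed; ABN status still has to come from the register.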
8. NSW Fair Trading
What: 12 licensing registers — tradespeople, property agents, motor dealers, etc.
Tables: nsw_licences (29,144 rows)
DB: licensing-db-oc
Sync: Manual — quarterly refresh from data.nsw.gov.au
Licence: CC BY 4.0
Note: Not yet surfaced in the product UI. Licensing data is held for a future licensing intelligence feature.
9. TEQSA: TQNR (Tertiary Quality and National Register)
What: All TEQSA-registered higher education providers, courses, and regulatory decisions
Tables: teqsa_providers (269), teqsa_courses (2,886), teqsa_decisions (5,974)
DB: licensing-db-oc
Sync: rtopacks-teqsa-sync Worker — monthly, 1st of month 5am UTC
API: data.gov.au (dataset: 0c4f6591)
Licence: CC BY 2.5 Australia
Cross-reference: 69 dual-regulated providers matched to rtopacks-db rtos table
10. VET Industry Calendar (compiled)
What: Regulatory deadlines, data release dates, funding rounds, industry conferences
Tables: vet_calendar (255 events), cal_sync_log
DB: calendar-db
Sync: rtopacks-cal-sync Worker — monthly, 1st of month 6am AEST
Sources (auto): NCVER upcoming releases, data.gov.au public holidays API, ABS release calendar
Sources (pending review): TDA newsletter, NCVER VET calendar, ASQA news
Sources (seed): Manually curated ASQA regulatory dates, JSA quarterly pattern, confirmed 2026 conference dates
11. CSIRO Magda: Open data discovery
What: Federated index of 12,500+ Australian government datasets — discovery and monitoring layer
Tables: magda_datasets
DB: ops-db
Sync: magda-monitor Worker — monthly, 1st of month 6am AEST
API: dev.magda.io/api/v0/search/datasets
Purpose: Monitors for new VET-related datasets across data.gov.au, state portals, ABS, AURIN
Seeded: 10 Victorian DJSIR and NSW DoE datasets — status: reviewed, pending ingest
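
A monitoring query against the search endpoint listed above can be expressed as a URL builder. This is a sketch under assumptions: the `https` scheme, the `query`, `start` and `limit` parameter names, and the function name are not given in this reference and should be confirmed against the Magda API documentation before use.

```python
from urllib.parse import urlencode

# Endpoint path as listed in this section; the https scheme is an assumption.
MAGDA_SEARCH = "https://dev.magda.io/api/v0/search/datasets"


def magda_search_url(query: str, start: int = 0, limit: int = 20) -> str:
    """Build a paginated dataset search URL for a free-text query."""
    params = {"query": query, "start": start, "limit": limit}
    return f"{MAGDA_SEARCH}?{urlencode(params)}"
```

Stepping `start` forward by `limit` pages through the results, which lets a monthly run compare the returned dataset identifiers with those already stored in magda_datasets.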
12. yourcareer.gov.au (DEWR)
What: Career pathway data mapping qualifications to careers (1,107 quals, 54,640 pathway rows)
Tables: qual_career_pathways
DB: rtopacks-db
Sync: Manual — annual
Source: yourcareer.gov.au API
Pending ingestion
| Dataset | Brief | Status |
|---|---|---|
| CRICOS historical 48-month snapshots | CRICOS-02 | Briefed, not started |
| Victorian DJSIR VET enrolment series (8 datasets) | DATA-18 seeded | Reviewed, pending ingest |
| NSW Smart & Skilled commencements 2015-2026 | DATA-18 seeded | Reviewed, pending ingest |
| ABS EEH 2023 | DATA-16b | Waiting on TableBuilder queue |
| ABS microdata (9 datasets) | Alan Dailly contact | Awaiting response |
Sync schedule summary
| Worker | Schedule | Sources |
|---|---|---|
| tga-sync | Sun 2am AEST | TGA NRT corpus |
| cricos-sync | 1st/mo 4am AEST | CRICOS data.gov.au |
| rtopacks-teqsa-sync | 1st/mo 5am UTC | TEQSA TQNR data.gov.au |
| abs-monthly-sync | Monthly | ABS SDMX LF/JV/AWE/WPI |
| abs-quarterly-sync | Quarterly | ABS SDMX quarterly series |
| abs-annual-sync | Annual | ABS ERP population |
| qual-enrichment | Daily 3am AEST | ABR enrichment on RTOs |
| stats-cache | Every 6hr | Live stat bar figures |
| rtopacks-d1-warmer | Every 5min | D1 replica warming |
| rtopacks-cal-sync | 1st/mo 6am AEST | Calendar sources (7 sources) |
| magda-monitor | 1st/mo 6am AEST | Magda open data discovery |
| ops-tender-sync | Daily 6pm + Sat 8pm AEST | AusTender |
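
Most schedules in this table are stated in AEST, which is a fixed UTC+10 offset. If the Workers are triggered by cron expressions evaluated in UTC (an assumption; this reference does not say how the schedules are configured), each time shifts back ten hours, and early-morning runs land on the previous UTC day. A minimal sketch of that conversion, using hypothetical helper names:

```python
from datetime import datetime, timedelta, timezone

AEST = timezone(timedelta(hours=10))  # fixed UTC+10, no daylight saving
DAYS = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]  # matches datetime.weekday()


def weekly_cron_utc(day: str, hour_aest: int) -> str:
    """Convert a weekly AEST schedule (e.g. "SUN", 2) to a UTC cron expression."""
    # 1 January 2024 was a Monday, so adding the day index selects the weekday.
    local = datetime(2024, 1, 1 + DAYS.index(day), hour_aest, tzinfo=AEST)
    utc = local.astimezone(timezone.utc)
    return f"0 {utc.hour} * * {DAYS[utc.weekday()]}"


def daily_cron_utc(hour_aest: int) -> str:
    """Convert a daily AEST hour to a UTC cron expression."""
    return f"0 {(hour_aest - 10) % 24} * * *"
```

Under these assumptions, `weekly_cron_utc("SUN", 2)` gives `0 16 * * SAT` for tga-sync and `daily_cron_utc(3)` gives `0 17 * * *` for qual-enrichment. Monthly "1st of month" runs before 10am AEST are deliberately not handled here: in UTC they fall on the last day of the previous month, which a plain day-of-month field cannot express portably.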