Containerized Medallion Architecture
Automated University ETL & AI Auditing
Transforming raw API snapshots into executive-ready "Single Source of Truth" performance metrics for Canadian Higher Education.
Airflow
Orchestration Engine
Daily Automated Extraction & Monitoring
Docker
Environment Stack
Multi-Container (MySQL/Postgres/Airflow)
100%
Schema Integrity
Fail-Fast Logic in Transformation Layer
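One way to read "fail-fast" here is that the transformation layer stops at the first record that breaks the expected schema instead of loading partial data. The sketch below is a minimal illustration under that assumption. The field names and the `SchemaError` and `validate_records` names are hypothetical and are not taken from the project.

```python
# Hypothetical schema: field name -> expected Python type (assumed, not from the project).
REQUIRED_FIELDS = {
    "name": str,
    "country": str,
    "alpha_two_code": str,
    "domains": list,
    "web_pages": list,
}


class SchemaError(ValueError):
    """Raised on the first record that violates the expected schema."""


def validate_records(records):
    """Return the records unchanged, or raise SchemaError at the first violation."""
    for index, record in enumerate(records):
        for field, expected_type in REQUIRED_FIELDS.items():
            if field not in record:
                raise SchemaError(f"record {index}: missing field '{field}'")
            if not isinstance(record[field], expected_type):
                raise SchemaError(
                    f"record {index}: field '{field}' should be {expected_type.__name__}"
                )
    return records
```

Raising on the first violation keeps a malformed batch out of the Silver tables entirely, at the cost of rejecting the whole run rather than skipping individual rows.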
Pipeline Resilience Analysis
Balancing API latency with vectorized transformation efficiency.
Bronze Layer
Raw Immutable JSON
Silver Layer
Clean MySQL Upserts
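The "raw immutable JSON" property of the Bronze layer could be enforced by writing each API response to a new timestamped file and refusing to overwrite an existing one. This is a sketch under that assumption. The function name `write_bronze_snapshot`, the file naming pattern, and the directory layout are illustrative, not the project's actual code.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_bronze_snapshot(payload, bronze_dir, now=None):
    """Write the raw payload to a new timestamped file and return its path."""
    now = now or datetime.now(timezone.utc)
    target_dir = Path(bronze_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: one file per extraction, keyed by UTC timestamp.
    path = target_dir / f"universities_{now:%Y%m%dT%H%M%S%fZ}.json"
    # Mode "x" raises FileExistsError instead of overwriting,
    # so an existing snapshot is never modified.
    with path.open("x", encoding="utf-8") as handle:
        json.dump(payload, handle, ensure_ascii=False)
    return path
```

Because the payload is stored exactly as received, any Silver table can be rebuilt later from the snapshots alone.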
The Medallion Workflow
Step 1: Extract
Hipo Labs API (JSON)
↓
Step 2: Bronze Persistence
Raw Snapshots (Audit Trail)
↓
Step 3: Transform & Validate
Vectorized Pandas Logic
↓
Step 4: Silver Load
Optimized MySQL Production Tables
"Centralized logging tracks API latency and row counts in real time, so pipeline behavior can be compared from run to run."
Core Intelligence Features
Automated Auditor
Filters and validates Canadian institutional data before final persistence.
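A rule-based version of this filtering step might look like the following pandas sketch. It makes no claim about how the project's auditor actually works. The column names (`name`, `country`, `alpha_two_code`) and the function name `audit_canadian_institutions` are assumptions for illustration.

```python
import pandas as pd


def audit_canadian_institutions(raw: pd.DataFrame) -> pd.DataFrame:
    """Keep only named Canadian institutions, one row per institution name."""
    df = raw.copy()
    # Vectorized string cleanup: no Python-level loop over rows.
    df["name"] = df["name"].str.strip()
    df["country"] = df["country"].str.strip()
    # Boolean masks; missing values compare as False and are dropped.
    is_canadian = df["alpha_two_code"].str.upper().eq("CA")
    has_name = df["name"].notna() & df["name"].ne("")
    audited = df.loc[is_canadian & has_name]
    # Keep the first occurrence of each institution name.
    return audited.drop_duplicates(subset="name").reset_index(drop=True)
```

Filtering on the two-letter country code rather than the country string is a design choice in this sketch: it avoids mismatches from spelling or casing variants.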
Airflow DAG Scheduler
Manages daily execution cycles with built-in retry and failure monitoring.
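The scheduling and retry behavior described above maps onto a small set of Airflow settings. The sketch below only builds the argument dictionaries, so it runs without Airflow installed. The `dag_id`, owner, retry count, delay, and start date are placeholder values, and the `schedule` argument name assumes Airflow 2.4 or later.

```python
from datetime import datetime, timedelta

# Task-level defaults: each task is retried before the run is marked failed.
default_args = {
    "owner": "data-engineering",          # placeholder owner
    "retries": 3,                         # assumed retry count
    "retry_delay": timedelta(minutes=5),  # assumed wait between attempts
}

# DAG-level settings for one run per day.
dag_kwargs = {
    "dag_id": "university_etl_daily",     # hypothetical DAG name
    "schedule": "@daily",
    "start_date": datetime(2024, 1, 1),   # placeholder start date
    "catchup": False,                     # do not backfill missed days
    "default_args": default_args,
}

# With Airflow installed, these would be passed as:
#     from airflow import DAG
#     with DAG(**dag_kwargs) as dag:
#         ...  # extract, bronze, transform, and silver tasks
```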
Upsert Intelligence
Custom SQLAlchemy logic preventing record duplication in Silver tables.
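The idea behind an upsert is that a row is inserted when its key is new and updated when the key already exists, so reruns never create duplicates. The sketch below substitutes Python's built-in `sqlite3` for the project's SQLAlchemy and MySQL stack so that it runs without a database server. In MySQL the equivalent statement is `INSERT ... ON DUPLICATE KEY UPDATE`, which SQLAlchemy exposes through `sqlalchemy.dialects.mysql.insert(...).on_duplicate_key_update(...)`. The table and column names are hypothetical.

```python
import sqlite3

# SQLite spelling of an upsert; "excluded" refers to the row that failed to insert.
UPSERT_SQL = """
INSERT INTO universities (name, country, web_page)
VALUES (:name, :country, :web_page)
ON CONFLICT(name) DO UPDATE SET
    country = excluded.country,
    web_page = excluded.web_page
"""


def create_silver_table(conn):
    """Create a hypothetical Silver table keyed by institution name."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS universities ("
        "name TEXT PRIMARY KEY, country TEXT NOT NULL, web_page TEXT)"
    )


def upsert_universities(conn, rows):
    """Insert new rows and update existing ones in a single transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(UPSERT_SQL, rows)
```

Running the same batch twice leaves one row per institution, which is the property the Silver tables rely on.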