Decision Intelligence Orchestration

Healthcare Staffing & Analytics Engine

A side-by-side comparison of three ways to process Payroll-Based Journal (PBJ) data, spanning research, local, and enterprise-distributed environments.

Method 1

Zero-setup prototyping with an in-memory SQLite database.
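A minimal sketch of what in-memory SQLite prototyping might look like, using only the Python standard library. The table layout, the column names (`provnum`, `hrs_rn`, `hrs_lpn`, `hrs_cna`), and the sample rows are illustrative assumptions, not the project's actual schema.

```python
import sqlite3

# Hypothetical sample rows: (provider ID, work date, RN hours, LPN hours, CNA hours).
SAMPLE_ROWS = [
    ("015009", "2024-01-01", 40.0, 60.0, 120.0),
    ("015009", "2024-01-02", 38.5, 55.0, 118.0),
    ("015010", "2024-01-01", 20.0, 30.0, 80.0),
]


def total_hours_by_facility(rows):
    """Load rows into an in-memory SQLite table and sum nursing hours per facility."""
    conn = sqlite3.connect(":memory:")  # nothing is written to disk
    try:
        conn.execute(
            "CREATE TABLE pbj (provnum TEXT, workdate TEXT, "
            "hrs_rn REAL, hrs_lpn REAL, hrs_cna REAL)"
        )
        conn.executemany("INSERT INTO pbj VALUES (?, ?, ?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT provnum, SUM(hrs_rn + hrs_lpn + hrs_cna) AS total_hours "
            "FROM pbj GROUP BY provnum ORDER BY total_hours DESC"
        )
        return cursor.fetchall()
    finally:
        conn.close()
```

Because the database lives only in memory, every run starts from a clean state, which suits throwaway prototyping.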

Method 2

Vectorized pandas execution on the driver node for mid-market-scale datasets.
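A hedged sketch of the vectorized style this method refers to: hours are summed column-wise across the whole frame rather than row by row in a Python loop. The column names (`PROVNUM`, `Hrs_RN`, `Hrs_LPN`, `Hrs_CNA`) are assumptions modeled on common PBJ exports, and the function name is hypothetical.

```python
import pandas as pd

# Assumed nurse-hour columns; adjust to the columns present in the actual file.
HOUR_COLUMNS = ["Hrs_RN", "Hrs_LPN", "Hrs_CNA"]


def total_hours_by_facility_pandas(df):
    """Return total nursing hours per facility as a Series, largest first."""
    # Row-wise sum across the hour columns, computed in one vectorized call.
    total = df[HOUR_COLUMNS].sum(axis=1)
    return (
        df.assign(total_hours=total)
        .groupby("PROVNUM")["total_hours"]
        .sum()
        .sort_values(ascending=False)
    )
```

Reading `PROVNUM` as a string column (for example with `dtype={"PROVNUM": str}` in `pd.read_csv`) keeps leading zeros intact before grouping.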

Method 3

Distributed PySpark SQL for reliable production processing at big-data scale.

Decision Insight: For datasets under 2 GB, Method 2 (pandas on the driver node) outperforms Spark because it avoids distributed shuffle overhead.
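The rule of thumb can be expressed as a small dispatch helper. This is only a sketch: the function name and the returned labels are hypothetical, and reading "2 GB" as 2 GiB is an assumption.

```python
# Threshold taken from the insight; interpreting 2 GB as 2 GiB is an assumption.
PANDAS_LIMIT_BYTES = 2 * 1024**3


def choose_engine(dataset_bytes, limit=PANDAS_LIMIT_BYTES):
    """Pick an execution engine label from the dataset size in bytes."""
    if dataset_bytes < 0:
        raise ValueError("dataset_bytes must be non-negative")
    # Below the limit a single-node pandas job avoids shuffle overhead;
    # at or above it, fall back to the distributed engine.
    return "pandas" if dataset_bytes < limit else "pyspark"
```

For a local file, `os.path.getsize(path)` is one way to obtain the size passed in.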

Ingestion: Remote CSV Retrieval (Requests/IO)
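A sketch of the ingestion step under stated assumptions. To stay dependency-free it swaps in the standard library's `urllib.request` where the pipeline names Requests; the function names are hypothetical, and the caller supplies the URL.

```python
import csv
import io
from urllib.request import urlopen


def fetch_csv_text(url, timeout=30.0):
    """Download a remote CSV and return its body as text."""
    with urlopen(url, timeout=timeout) as response:
        # utf-8-sig drops a leading byte-order mark if the export includes one.
        return response.read().decode("utf-8-sig")


def read_csv_rows(text):
    """Parse CSV text held in memory into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))
```

Keeping download and parsing separate lets the parser be exercised on in-memory strings without network access.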

Cleaning: Regex-based Dynamic Header Detection
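One way dynamic header detection could work is to scan for the first line that contains a known column name and discard everything above it. The pattern below assumes the header row includes a `PROVNUM` field; both the pattern and the function names are illustrative.

```python
import re

# Assumed marker: the header row contains PROVNUM as a complete, optionally quoted field.
HEADER_PATTERN = re.compile(r'(^|,)\s*"?PROVNUM"?\s*(,|$)', re.IGNORECASE)


def find_header_index(lines, pattern=HEADER_PATTERN):
    """Return the index of the first line that looks like the header row."""
    for index, line in enumerate(lines):
        if pattern.search(line):
            return index
    raise ValueError("no header row found")


def strip_preamble(text):
    """Drop any title or metadata lines that precede the header row."""
    lines = text.splitlines()
    return "\n".join(lines[find_header_index(lines):])
```

The cleaned text can then go straight to a CSV parser, regardless of how many preamble lines a given export carries.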

Standard: CMS ID Zero-Padding & Fuzzy Matching
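A sketch of both standardization steps. CMS Certification Numbers are six characters long, so IDs that lost leading zeros when parsed as numbers can be restored with `str.zfill`. The fuzzy matcher uses `difflib` from the standard library as a stand-in for whatever matching library the pipeline uses; the 0.8 cutoff and the function names are assumptions.

```python
import difflib


def pad_cms_id(raw, width=6):
    """Restore leading zeros on a CMS ID, e.g. 15009 or '15009.0' -> '015009'."""
    text = str(raw).strip()
    if text.endswith(".0"):  # ID was read as a float somewhere upstream
        text = text[:-2]
    return text.zfill(width)


def fuzzy_match_name(name, candidates, cutoff=0.8):
    """Return the candidate facility name closest to `name`, or None if none is close enough."""
    # Compare on upper-cased, trimmed names but return the original spelling.
    normalized = {c.upper().strip(): c for c in candidates}
    hits = difflib.get_close_matches(
        name.upper().strip(), list(normalized), n=1, cutoff=cutoff
    )
    return normalized[hits[0]] if hits else None
```

Padding the ID first allows exact joins wherever possible, leaving fuzzy name matching as a fallback for records that have no usable ID.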

Output: PBJ Staffing Market Share Analysis
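A sketch of the output stage, assuming that "market share" means each facility's fraction of the total staffing hours in the dataset; that definition is an assumption, as is the function name. The default of 100 mirrors the Top 100 target facilities.

```python
def market_share(total_hours, top_n=100):
    """Rank facilities by hours and report each one's share of the overall total.

    `total_hours` maps facility ID -> total staffing hours.
    Returns a list of (facility_id, hours, share) tuples, largest first.
    """
    grand_total = sum(total_hours.values())
    if grand_total <= 0:
        return []  # nothing to rank, and avoids division by zero
    ranked = sorted(total_hours.items(), key=lambda item: item[1], reverse=True)
    # Shares are computed against all facilities, not just the top_n that are returned.
    return [(fid, hours, hours / grand_total) for fid, hours in ranked[:top_n]]
```

The input mapping can come from any of the three execution engines, since each one reduces the data to per-facility totals.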

Top 100 Target Facilities

3 Methods (Execution Engines)