Raw data in. Reliable decisions out, at any scale.
Data Engineer with 5+ years of experience designing cloud-native data platforms at Amazon and TouchWorld. I build the pipelines that power ML models, BI dashboards, and regulatory compliance.
high-volume scale
governance built
fraud & risk models
5+ years of engineering work — visualized. Pipeline volumes, tech adoption, and domain coverage across every role.
Healthcare Claims Data Platform
End-to-end platform ingesting, validating, and curating insurance claims. Batch and CDC ingestion with HIPAA-compliant handling, late-arriving data logic, and automated quality gates — producing ML-ready feature stores and analytics datasets.
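A minimal sketch of how an automated quality gate with late-arriving data handling might look. The `Claim` fields, the validation rules, and the 30-day lateness threshold are all assumptions made for illustration; they are not taken from the platform itself.

```python
from dataclasses import dataclass
from datetime import date, timedelta


# Hypothetical claim shape; field names are illustrative only.
@dataclass(frozen=True)
class Claim:
    claim_id: str
    member_id: str
    amount: float
    service_date: date   # when the service happened (event time)
    received_date: date  # when the record reached the platform


def validate(claim: Claim) -> list[str]:
    """Return a list of rule violations; an empty list means the claim passes."""
    errors = []
    if not claim.claim_id:
        errors.append("missing claim_id")
    if not claim.member_id:
        errors.append("missing member_id")
    if claim.amount <= 0:
        errors.append("non-positive amount")
    if claim.service_date > claim.received_date:
        errors.append("service_date after received_date")
    return errors


def route(claims, late_after_days=30):
    """Split claims into curated, late, and quarantined buckets."""
    curated, late, quarantined = [], [], []
    for claim in claims:
        errors = validate(claim)
        if errors:
            # Failed the gate: keep the record and the reasons for review.
            quarantined.append((claim, errors))
        elif claim.received_date - claim.service_date > timedelta(days=late_after_days):
            # Valid but late: handled separately so closed partitions can be restated.
            late.append(claim)
        else:
            curated.append(claim)
    return curated, late, quarantined
```

Keeping rejected records together with their failure reasons, rather than dropping them, is one common way to make a gate auditable.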
Change Data Capture Pipeline
CDC pipeline from PostgreSQL to Snowflake via AWS DMS. Full idempotency, deduplication, schema evolution, and historical preservation — replacing expensive full-loads with sub-minute incremental ingestion and zero data loss.
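The idempotency and deduplication ideas can be illustrated with a small in-memory sketch. It assumes each change event carries an operation code, a primary key, and a monotonically increasing sequence number; the `ChangeEvent` shape and the `_seq` / `_deleted` bookkeeping columns are hypothetical, and a real Snowflake target would typically express the same logic as a `MERGE` statement.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    op: str                     # "I" insert, "U" update, "D" delete (assumed codes)
    key: int                    # primary key of the source row
    seq: int                    # increasing position in the change log
    row: Optional[dict] = None  # new column values; None for deletes


def apply_changes(target: dict, history: list, events) -> None:
    """Apply events to `target` in place; replays and duplicates are no-ops."""
    for ev in sorted(events, key=lambda e: e.seq):
        current = target.get(ev.key)
        if current is not None and ev.seq <= current["_seq"]:
            continue  # already applied: this is what makes a replay safe
        history.append(ev)  # append-only log preserves every accepted change
        if ev.op == "D":
            # Keep a tombstone so an older, replayed event cannot resurrect the row.
            target[ev.key] = {"_seq": ev.seq, "_deleted": True}
        else:
            # Unknown columns are simply carried along (naive schema evolution).
            target[ev.key] = {**ev.row, "_seq": ev.seq, "_deleted": False}
```

Applying the same batch twice leaves both the target and the history unchanged, which is the property that lets a failed load be retried without manual cleanup.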
Event-Driven Processing Pipeline
Near-real-time pipeline triggered on S3 uploads via EventBridge. Lambda and Glue handle automated transformations with zero idle compute — cutting end-to-end latency from hours to under 2 minutes on a fully serverless architecture.
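A hedged sketch of the Lambda side of such a trigger. It assumes the event arrives in the EventBridge "Object Created" format that S3 emits, and that the Lambda starts a Glue job with the object location as arguments. The job name `curate-uploads` and the argument names are hypothetical; `start_job` is injectable so the sketch can run without AWS access.

```python
GLUE_JOB_NAME = "curate-uploads"  # hypothetical job name


def build_job_arguments(event: dict) -> dict:
    """Extract bucket and key from an EventBridge 'Object Created' event."""
    if event.get("source") != "aws.s3" or event.get("detail-type") != "Object Created":
        raise ValueError("not an S3 Object Created event")
    detail = event["detail"]
    return {
        "--source_bucket": detail["bucket"]["name"],
        "--source_key": detail["object"]["key"],
    }


def handler(event, context, start_job=None):
    """Lambda entry point; Lambda itself passes only `event` and `context`."""
    arguments = build_job_arguments(event)
    if start_job is None:
        import boto3  # imported lazily; only needed when running on AWS

        glue = boto3.client("glue")
        start_job = glue.start_job_run
    response = start_job(JobName=GLUE_JOB_NAME, Arguments=arguments)
    return {"job_run_id": response["JobRunId"], "arguments": arguments}
```

Because the handler only forwards the object location and returns, compute is consumed only while an upload is actually being processed.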
Cloud-Native AWS Data Platform
Library of production-ready Glue jobs, Lambda functions, and architecture patterns with IAM least-privilege policies and Lake Formation governance. Annotated architecture diagrams for batch and event-driven designs — built for reuse across projects.
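As one illustration of what least-privilege can mean in practice, the sketch below builds an IAM policy document that grants read access to a single S3 prefix and nothing else. The function name, bucket, and prefix are hypothetical and not taken from the library; the policy grammar itself is standard IAM JSON.

```python
import json


def read_only_prefix_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy allowing list and read on one prefix of one bucket."""
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListOnlyThisPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                # ListBucket applies to the bucket ARN, narrowed by a prefix condition.
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                "Sid": "ReadObjectsUnderPrefix",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                # GetObject applies to object ARNs under the prefix.
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(read_only_prefix_policy("example-bucket", "curated/claims"), indent=2))
```

A Glue job role built this way can read its input prefix but cannot write, delete, or browse the rest of the bucket.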
Spark & Python Utility Library
Reusable PySpark transformation modules and Python utilities with unit tests. Covers window functions, advanced aggregations, partitioning strategies, and common ETL patterns — designed as plug-in components for any pipeline.
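To show what a ranking window computes, here is a pure-Python sketch that mirrors `ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...)`. It is not code from the library: the function names are hypothetical, and in PySpark the same result would normally come from `Window.partitionBy(...).orderBy(...)` combined with `row_number()`.

```python
from collections import defaultdict


def row_number(rows, partition_by, order_by, descending=False):
    """Return copies of `rows` with a 1-based `row_number` within each partition."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_by]].append(row)

    ranked = []
    for key in partitions:  # partitions keep first-seen order
        ordered = sorted(partitions[key], key=lambda r: r[order_by], reverse=descending)
        for i, row in enumerate(ordered, start=1):
            ranked.append({**row, "row_number": i})
    return ranked


def latest_per_key(rows, partition_by, order_by):
    """Common ETL pattern: keep only the newest row for each key."""
    ranked = row_number(rows, partition_by, order_by, descending=True)
    return [r for r in ranked if r["row_number"] == 1]
```

Ranking first and filtering on `row_number == 1` is the usual way to deduplicate to the latest version of each record.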
DevOps & Infrastructure as Code
Complete DevOps layer: Terraform modules for S3, Glue, and IAM; Dockerized PySpark for local dev parity; Jenkins + GitHub Actions CI/CD on every merge; Kubernetes (EKS) orchestration. Production never gets a surprise.
Health OS — Personal Health Data Pipeline
End-to-end personal health data system. Apple Watch → Health Auto Export → Node.js backend → observable dashboard. Built to demonstrate idempotent ingestion, raw/clean/derived data separation, SLA monitoring, anomaly detection, and metric explainability on real biometric data.
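Two of the ideas above, idempotent ingestion and explainable anomaly detection, can be sketched in a few lines. The backend described here is Node.js; the sketch uses Python purely for illustration, and the content-hash key, the z-score rule, and the threshold of 3 are assumptions rather than the project's actual method.

```python
import hashlib
import json
from statistics import mean, stdev


def sample_key(sample: dict) -> str:
    """Derive a stable key from the sample's content so re-exports deduplicate."""
    payload = json.dumps(sample, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def ingest(raw_store: dict, samples) -> int:
    """Insert unseen samples into `raw_store`; return how many were new."""
    added = 0
    for sample in samples:
        key = sample_key(sample)
        if key not in raw_store:
            raw_store[key] = sample
            added += 1
    return added


def explain_anomaly(history, value, threshold=3.0):
    """Score `value` against `history` (at least two points) and report why."""
    baseline = mean(history)
    spread = stdev(history)
    # With zero spread a z-score is undefined; this sketch treats it as 0.
    z = 0.0 if spread == 0 else (value - baseline) / spread
    return {
        "value": value,
        "baseline": baseline,
        "z_score": z,
        "is_anomaly": abs(z) > threshold,
    }
```

Returning the baseline and z-score alongside the flag is one simple way to make a metric explainable: the dashboard can show not only that a reading is unusual but how far it sits from the recent norm.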