Project 01
Real-Time Fraud Signals Pipeline
Simulates fraud-event detection on streaming telecom-style data, surfacing rule-based and statistical anomaly signals within seconds.
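To make the "rules + statistical anomalies" pairing concrete, here is a minimal plain-Python sketch of how the two kinds of signal might be combined for one subscriber. The thresholds, the signal names, and the choice of a z-score as the statistical test are all assumptions made for illustration, not details taken from the project.

```python
from statistics import mean, stdev

# Hypothetical thresholds, chosen only for illustration.
MAX_CALLS_PER_MINUTE = 30
Z_SCORE_THRESHOLD = 3.0


def rule_flag(calls_per_minute: float) -> bool:
    """Rule-based signal: a hard cap on call volume."""
    return calls_per_minute > MAX_CALLS_PER_MINUTE


def zscore_flag(history: list[float], current: float) -> bool:
    """Statistical signal: current value is far from the subscriber's own baseline."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat baseline: any deviation at all is unusual.
        return current != history[0]
    return abs(current - mean(history)) / sigma > Z_SCORE_THRESHOLD


def fraud_signals(history: list[float], current: float) -> list[str]:
    """Return the names of every signal that fires for the current observation."""
    signals = []
    if rule_flag(current):
        signals.append("rule:call_volume")
    if zscore_flag(history, current):
        signals.append("stat:zscore")
    return signals


if __name__ == "__main__":
    baseline = [4, 5, 6, 5, 4, 6]  # calls per minute in earlier windows
    print(fraud_signals(baseline, 12))
    print(fraud_signals(baseline, 40))
```

In this sketch a subscriber who jumps from about 5 to 12 calls per minute trips only the statistical signal, while a jump to 40 trips both, which is the usual reason for running the two kinds of check side by side.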
Kafka → Spark Structured Streaming → Delta → dbt → Streamlit, with watermarked windows, exactly-once writes, and dbt tests on the marts.
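As a rough illustration of what "watermarked windows" means, the sketch below simulates the idea in plain Python rather than Spark: events are counted in tumbling event-time windows, and an event is dropped once the watermark has moved past the end of its window. The window size, lateness allowance, and function name are assumptions for illustration. In Spark Structured Streaming the equivalent behavior would normally come from `withWatermark` together with the `window` function, and Spark advances its watermark per micro-batch rather than per event, so exact drop timing differs from this simplified model.

```python
from collections import defaultdict

WINDOW_SECONDS = 60     # hypothetical tumbling-window size
LATENESS_SECONDS = 30   # hypothetical allowed lateness (watermark delay)


def windowed_counts(events):
    """Count events per (key, window_start), processing them in arrival order.

    events: iterable of (event_time_seconds, key) tuples.
    Returns (counts, dropped), where dropped lists events that arrived
    after the watermark had already passed the end of their window.
    """
    counts = defaultdict(int)
    dropped = []
    max_event_time = None

    for event_time, key in events:
        # Track the largest event time seen so far.
        if max_event_time is None or event_time > max_event_time:
            max_event_time = event_time
        # The watermark trails the largest event time by the allowed lateness.
        watermark = max_event_time - LATENESS_SECONDS

        window_start = event_time - (event_time % WINDOW_SECONDS)
        window_end = window_start + WINDOW_SECONDS

        if window_end <= watermark:
            dropped.append((event_time, key))  # too late: window already closed
        else:
            counts[(key, window_start)] += 1

    return dict(counts), dropped


if __name__ == "__main__":
    arrivals = [(10, "a"), (70, "a"), (65, "a"), (130, "a"), (20, "a")]
    counts, dropped = windowed_counts(arrivals)
    print(counts)
    print(dropped)
```

The event at second 65 arrives out of order but is still counted, because its window has not yet closed. The event at second 20 arrives after the watermark has reached 100, past the end of the 0 to 60 window, so it is dropped. Bounding lateness this way is what lets a streaming engine finalize a window and discard its state.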
- Apache Kafka
- Spark Structured Streaming
- Delta Lake
- dbt
- Streamlit
What this demonstrates
- Streaming ingestion + watermarks
- Data quality / dbt tests
- Fraud analytics architecture
- Production-style local dev