🦀 Event stream processing for developers to collect and transform data in motion to power responsive, data-intensive applications.
Updated Jun 23, 2026 - Rust
Serverless multi-protocol + multi-destination event collection system.
Adapter for dbt that executes dbt pipelines on Apache Flink
📡 Real-time data pipeline with Kafka, Flink, Iceberg, Trino, MinIO, and Superset. Ideal for learning data systems.
Different flavours of CUSUM for change point detection.
This repo helps you build a streaming analytics platform with RisingWave and dbt for real-time data insights.
Real-time ETL pipeline for financial data (Kafka, PySpark).
Extends the standard Cumulocity administration application with a dialog for adding Analytics Builder extensions.
⚡ Real-time fraud & anomaly detection system for streaming transactions. Built with Kafka Streams + Isolation Forest ML. Low-latency processing, online learning, and scalable architecture for detecting fraud patterns in transaction data. 🚨🔍
Online, interpretable anomaly detection for industrial sensor streams — self-adapting operating limits, root-cause isolation, no labels or retraining.
This repo provides a template for developing Cumulocity Streaming Analytics assets such as Blocks and EPL Apps, with a dev container for opening in Visual Studio Code
Source code of a heavy hitter packet streaming application implemented with four stream processing systems: Flink, Spark Streaming, Storm and WindFlow.
Rust stream processing engine for real-time detection. Open-source Apache Flink alternative built for detection engineering, fraud prevention, and MITRE ATT&CK coverage. 1.5M events/sec, single 15MB binary, no JVM.
Real-time monitoring pipeline using Kafka, Flink, PostgreSQL, and Grafana to stream metrics, detect anomalies (EWMA + 3σ), and visualize results.
A large-scale distributed data processing system for streaming analytics, built on the Hadoop ecosystem (Apache Spark and HDFS) in the cloud for real-time spatial analytics.
Full-stack machine learning project that predicts viewer satisfaction (high ratings) on Netflix using demographic data and TMDB movie metadata. Includes EDA, XGBoost modeling, and real-time enrichment using the TMDB API.
Real-time Coinbase market data streaming pipeline with visualizations. Many thanks to the DataTalks.Club Data Engineering Zoomcamp: https://github.com/DataTalksClub/data-engineering-zoomcamp
Real-time network anomaly detection using Go and Kafka
Real-time geopolitical instability prediction system using Apache Kafka, Apache Spark, XGBoost, NLP, and GDELT data for streaming analytics and risk forecasting.
End-to-end real-time analytics and anomaly detection with PySpark Structured Streaming on user activity logs from Kafka.
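
Several entries above mention EWMA with a 3σ threshold for flagging anomalies in a metric stream. As a rough illustration of that general idea only (not the code of any listed repository), here is a minimal Python sketch. The function name `ewma_anomalies` and the parameters `alpha`, `k`, and `warmup` are hypothetical choices made for this example, and skipping baseline updates on flagged points is one possible design among several.

```python
import math


def ewma_anomalies(values, alpha=0.3, k=3.0, warmup=5):
    """Return indices of points farther than k EWMA standard deviations
    from the EWMA mean. All parameter defaults are illustrative assumptions."""
    mean = None   # exponentially weighted moving average
    var = 0.0     # exponentially weighted moving variance
    flagged = []
    for i, x in enumerate(values):
        if mean is None:
            # Seed the baseline with the first observation.
            mean = float(x)
            continue
        deviation = x - mean
        std = math.sqrt(var)
        # Only flag after a short warm-up, and only once some variance exists.
        if i >= warmup and std > 0 and abs(deviation) > k * std:
            flagged.append(i)
        else:
            # Update the baseline with normal points only, so that a single
            # outlier does not inflate the mean and variance.
            mean += alpha * deviation
            var = (1 - alpha) * (var + alpha * deviation ** 2)
    return flagged


if __name__ == "__main__":
    stream = [10.0, 10.2, 9.9, 10.1, 10.0, 9.9, 10.1, 25.0, 10.0, 10.1]
    print(ewma_anomalies(stream))
```

In a real pipeline the same update would typically run per key inside the stream processor's state, with `alpha` and `k` tuned to the metric's noise level.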