eBPF-powered network observability for Kubernetes. Indexes L4/L7 traffic with full K8s context, decrypts TLS without keys. Queryable by AI agents via MCP and humans via dashboard.
Updated Apr 3, 2026 - Go
UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured checks (covering language, code, embedding use-cases), perform root cause analysis on failure cases and give insights on how to resolve them.
A collection of AIOps learning resources. Contributions to help complete this repository are welcome, as are stars.
Build your own AI SRE agents. The open source toolkit for the AI era ✨
[ICLR'25] OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures?
Timeseries Anomaly detection and Root Cause Analysis on data in SQL data warehouses and databases
Papers about Root Cause Analysis in MicroService Systems. Reference to Paper Notes: https://dreamhomes.top/
Official implementation of RiskLoc, a method for localizing multi-dimensional root causes in time-series data.
Aurora — Open source AI-powered agentic incident management & root cause analysis for SREs. LangGraph agents investigate across AWS, Azure, GCP, Kubernetes. Integrates with PagerDuty, Datadog, Grafana, Slack. Apache 2.0.
[FSE'26][WWW'25][ASE'24] RCAEval: A Benchmark for Root Cause Analysis.
Root Cause Analysis for Kubernetes
AERCA: Root Cause Analysis of Anomalies in Multivariate Time Series through Granger Causal Discovery (ICLR 2025 Oral)
This repository contains a reading list of papers on multivariate time series anomaly detection. This repository is still being continuously improved.
Awesome resources for failure diagnosis research.
[FSE'24 - 🏆 Best Artifact Award] BARO: Robust Root Cause Analysis for Time Series Data.
Multi-Agent Debugger: An AI-powered debugging system using CrewAI to orchestrate specialized agents that analyze logs, trace code, and uncover root causes across your stack — powered by LLM providers.
Web UI for OpenRCA
Code for "LEMMA-RCA: A Large Multi-modal Multi-domain Dataset for Root Cause Analysis" paper
RootCLAM: On Root Cause Localization and Anomaly Mitigation through Causal Inference (CIKM 2023)
Systematic debugging methodology for AI agents and developers. Prevents common anti-patterns like patch-chaining and wrong-environment restarts.