Here are 175 public repositories matching this topic...
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
Updated
Apr 23, 2026
TypeScript
The Metadata Platform for your Data and AI Stack
Updated
Apr 23, 2026
Java
Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
Updated
Apr 2, 2026
Python
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
Updated
Apr 23, 2026
Java
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
Intake is a lightweight package for finding, investigating, loading and disseminating data.
Updated
Mar 23, 2026
Python
📙 Awesome Data Catalogs and Observability Platforms.
🐳 The stupidly simple CLI workspace for your data warehouse.
Updated
Feb 8, 2023
Python
Marmot is an open-source data catalog designed for teams who want powerful data discovery without enterprise complexity. Catalog every data asset, enrich it with the context that matters and make it accessible to your team and your AI tools.
Work with your web service, database, and streaming schemas in a single format.
Updated
Dec 30, 2025
Python
Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and DataHub.
Updated
Jan 5, 2024
Python
An intake plugin for parsing an Earth System Model (ESM) catalog and loading assets into xarray datasets.
Updated
Apr 6, 2026
Python
The World's Most Comprehensive, Authoritative, and Structured Open Source Data Source Knowledge Base
Updated
Apr 23, 2026
Python
The GenAI-powered toolkit for automated data intelligence.
Updated
Mar 30, 2026
Jupyter Notebook
Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow.
Metamapper is a data discovery and documentation platform for improving how teams understand and interact with their data.
Updated
Feb 15, 2026
Python
Reference Architectures for Datalakes on AWS
Updated
May 13, 2020
HTML
Modern documentation site generator for dbt Core: lineage explorer, health scoring, full-text search. Live demo: https://demo.docglow.com
Updated
Apr 23, 2026
Python
Sample code with integration between Data Catalog and RDBMS data sources.
Updated
Dec 6, 2021
Python
End-to-end DataOps platform deployed by Terraform.
Updated
Mar 22, 2025
Python