From f3e132030c3af291c1b2e93c26179bd090dc0909 Mon Sep 17 00:00:00 2001
From: Jununn
Date: Mon, 1 Jun 2026 16:10:30 +0800
Subject: [PATCH] Add DataFlow to the list of AI frameworks

Adds [DataFlow](https://github.com/OpenDCAI/DataFlow) to the **LLMOps** section.

## What it is

DataFlow is an open-source data-centric AI platform for LLM data preparation, synthetic data generation, and AI/data pipelines. It provides reusable skills, operator-based pipelines, and a WebUI for constructing and executing data workflows for AI tasks.

## Position in the list

I placed it in the LLMOps section because DataFlow focuses on data preparation, generation, filtering, refinement, and reusable pipelines for LLM training, fine-tuning, and RAG workflows.

## Project status

- Licensed under Apache-2.0.
- Keywords: data-centric AI, LLM data preparation, synthetic data generation, AI/data pipelines, reusable skills, operator-based workflows.
- Provides Python package and Docker-based usage.
- Includes a WebUI for visual pipeline construction via `dataflow webui`.
- Includes DataFlow-Skills for operator development, pipeline construction, and data-centric AI workflows.
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index b8b3381..e41697f 100644
--- a/README.md
+++ b/README.md
@@ -476,4 +476,4 @@ For more information about the above compiled landscape for 2025, please refer t
 - [Superduper](https://github.com/superduper-io/superduper) - a Python based framework for building AI-data workflows and applications
 - [Cognee](https://github.com/topoteretes/cognee) - LLM Memory Engine for implementing LLM Workflows
 - [vLLM](https://github.com/vllm-project/vllm) - A high-throughput and memory-efficient inference and serving engine for LLMs
-
+- [DataFlow](https://github.com/OpenDCAI/DataFlow) - A data-centric AI platform for LLM data preparation, synthetic data generation, and AI/data pipelines with skills and WebUI