Feature/lucene search engine #2892

Open
yanlibert wants to merge 95 commits into MarquezProject:main from libertyann:feature/lucene-search-engine
@yanlibert
Contributor

Problem

Currently, the new version of Marquez uses OpenSearch as the backend for the new search feature.
This might be overkill: it introduces an external dependency, yet only the indexing and search features of OpenSearch are used.

Solution !! Warning: Currently a WiP !!

This is a small Lucene-based implementation that only indexes and searches a dataset index and a job index. It is packaged as a subproject that can run alongside the Marquez API and marquez-web.
It's designed as a drop-in replacement for OpenSearch, so it's easy to switch between this implementation and a full-fledged OpenSearch.
It uses a ByteBuffersDirectory, so all documents are stored in memory. The datasets and jobs are reloaded in the background at startup from the lineage_events table using the Marquez DAO.
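
To make the idea concrete for the discussion, here is a minimal sketch of the shape described above: one search interface, an in-memory implementation with separate dataset and job indexes, and a reload step at startup. It is not the code in this PR and it does not use Lucene. A plain `HashMap` inverted index stands in for the `ByteBuffersDirectory`-backed index so the example runs with the JDK alone. The names `SearchBackend`, `InMemorySearchBackend`, `InMemorySearchSketch` and `reload` are hypothetical, and the sample rows are made up; in the PR the rows would come from the lineage_events table.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical seam: callers depend on this interface, so the backend
// (OpenSearch or an embedded index) can be swapped without touching them.
interface SearchBackend {
  void index(String kind, String name);          // kind is "dataset" or "job"
  List<String> search(String kind, String term); // names containing the term
}

// Stand-in for the in-memory index: one token -> names map per kind.
// Methods are synchronized so a background reload thread could call index()
// while requests call search().
class InMemorySearchBackend implements SearchBackend {
  private final Map<String, Map<String, Set<String>>> indexes = new HashMap<>();

  @Override
  public synchronized void index(String kind, String name) {
    Map<String, Set<String>> postings =
        indexes.computeIfAbsent(kind, k -> new HashMap<>());
    for (String token : tokenize(name)) {
      postings.computeIfAbsent(token, t -> new LinkedHashSet<>()).add(name);
    }
  }

  @Override
  public synchronized List<String> search(String kind, String term) {
    Map<String, Set<String>> postings = indexes.getOrDefault(kind, Map.of());
    String key = term.toLowerCase(Locale.ROOT);
    return new ArrayList<>(postings.getOrDefault(key, Set.of()));
  }

  // Lower-case the text and split on anything that is not a letter or digit.
  private static List<String> tokenize(String text) {
    List<String> tokens = new ArrayList<>();
    for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
      if (!t.isEmpty()) {
        tokens.add(t);
      }
    }
    return tokens;
  }
}

class InMemorySearchSketch {
  // Simulates the startup reload; each row is {kind, name}.
  // This sketch runs synchronously; the PR describes a background reload.
  static SearchBackend reload(List<String[]> rows) {
    SearchBackend backend = new InMemorySearchBackend();
    for (String[] row : rows) {
      backend.index(row[0], row[1]);
    }
    return backend;
  }

  public static void main(String[] args) {
    SearchBackend backend = reload(List.of(
        new String[] {"dataset", "public.delivery_7_days"},
        new String[] {"job", "etl_delivery_7_days"}));
    System.out.println(backend.search("dataset", "delivery"));
  }
}
```

Because nothing is persisted, the index is empty after every restart until the reload finishes, which is one reason the memory management and DB handling items in the note below matter.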

Note: At the time of opening this PR, this is only a PoC meant to open the discussion about creating this new Marquez component. As such, it still lacks some key elements (unit tests, integration tests, memory management, proper DB management, proper config, ...).

phixMe and others added 28 commits August 9, 2024 18:02
Signed-off-by: wslulciuc <willy@datakin.com>
Signed-off-by: Yannick Libert <yannick.libert@gmail.com>
@netlify

netlify bot commented Sep 9, 2024

Deploy Preview for peppy-sprite-186812 canceled.

Name Link
🔨 Latest commit 5689a67
🔍 Latest deploy log https://app.netlify.com/sites/peppy-sprite-186812/deploys/66df15353dd9ab0008478562

@codecov

codecov bot commented Apr 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.25%. Comparing base (a586a89) to head (5689a67).
⚠️ Report is 98 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##               main    #2892   +/-   ##
=========================================
  Coverage     83.25%   83.25%           
  Complexity     1476     1476           
=========================================
  Files           259      259           
  Lines          6785     6785           
  Branches        313      313           
=========================================
  Hits           5649     5649           
  Misses          981      981           
  Partials        155      155           

3 participants