- Segment block compression: LZ4 compression for posting list blocks, reducing index size
- Coverity static analysis integration for detecting code quality issues
- Implemented AMPROP_DISTANCE_ORDERABLE in amproperty for better planner support
- Fixed 'too many LWLocks taken' error on partitioned tables
- Fixed Coverity static analysis issues
- Segment format v3: Indexes created with v0.3.0 or earlier must be recreated after upgrading to v0.4.0. Use `REINDEX INDEX index_name` or drop and recreate the index.
- Block-Max WAND optimization: 4x faster top-k queries using block-level upper bounds to skip non-contributing blocks
- Added code coverage workflow with Codecov integration
- Added segment integrity tests
- Added nightly stress tests with memory leak detection
- Added competitive benchmarks comparing pg_textsearch to other search engines
- Code consolidation and refactoring for better maintainability
- Fixed memory leaks by using private DSA for index builds
- Fixed bm25query varlena detoasting for binary I/O and scoring
- V2 segment format: Block-based posting storage (128 docs/block) with skip index metadata for future Block-Max WAND optimization
- Unlimited indexes: Replaced fixed-size registry with dshash for unlimited concurrent BM25 indexes
- Benchmark suite: MS MARCO and Wikipedia benchmarks with historical performance tracking on GitHub Pages
- Major code refactoring: organized source into am/, memtable/, segment/, types/, state/, planner/, and debug/ directories
- Page reclamation after segment compaction
- Better cost estimation for query planning
- Fixed excessive memory allocation in document scoring
- Fixed buildempty() to write init fork correctly
First open-source release!
- Implicit index resolution: Queries can now use simpler syntax with automatic index lookup:
  `SELECT * FROM docs ORDER BY content <@> 'search terms' LIMIT 10;`
  Instead of the explicit form:
  `SELECT * FROM docs ORDER BY content <@> to_bm25query('search terms', 'docs_idx') LIMIT 10;`
- Partitioned table support: BM25 indexes now work on partitioned tables. Each partition maintains its own index, with the parent index OID automatically mapping to partition indexes.
- Added `bm25_summarize_index()` function for fast index statistics without content dump
- Added `COST 1000` to scoring functions to help the query planner prefer index scans over sequential scans
- Improved page versioning with distinct magic numbers for different page types
- PostgreSQL 18 support verified in CI
- Fixed extension upgrade paths from 0.0.4 and 0.0.5
- Fixed compiler warnings and standardized header guards
- Removed `to_bm25vector()` function (use `to_bm25query()` instead)
- Renamed scoring functions to `bm25_` prefix (`bm25_text_bm25query_score`, `bm25_text_text_score`)
- Existing BM25 indexes must be rebuilt after upgrading
- Added PostgreSQL License and NOTICE file
- Added license headers to source files
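
Two entries above require rebuilding existing indexes: the segment format v3 change in v0.4.0 and the rebuild note in the first open-source release. The following is a minimal sketch of what that rebuild might look like, not an official upgrade script. It reuses the `docs` table and `docs_idx` index names from the query examples above. The access method name `bm25`, the indexed column, and the `text_config` option are assumptions for illustration, so check them against your own index definition before running anything.

```sql
-- List candidate indexes. Assumes the access method is registered as 'bm25'.
SELECT c.oid::regclass AS index_name
FROM pg_class c
JOIN pg_am a ON a.oid = c.relam
WHERE a.amname = 'bm25';

-- Option 1: rebuild in place.
REINDEX INDEX docs_idx;

-- Option 2: drop and recreate.
-- Capture the existing definition first so it can be replayed exactly.
SELECT pg_get_indexdef('docs_idx'::regclass);

DROP INDEX docs_idx;

-- Hypothetical definition; replace with the output of pg_get_indexdef above.
CREATE INDEX docs_idx ON docs USING bm25 (content) WITH (text_config = 'english');
```

Capturing the definition with `pg_get_indexdef` before dropping avoids having to reconstruct the column list and options by hand. Note that `REINDEX INDEX` blocks reads that would use the index while it runs, so either option is best done in a maintenance window on busy tables.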