Releases: Restream/reindexer
v5.13.0
Core
- [fea] Optimized grouping equal positions comparator
- [fea] Added implicit conversion between scalar values and single-value composites
- [fea] Added extendible hashing into `hash`-indexes. This slightly increases mean insertion time, but makes it much more stable, which avoids huge latency spikes on resizing
- [fea] Improved estimate cost calculation for equal_positions
- [fix] Disallowed creating a `sparse` index over JSON fields with type 'object'. Previously this behavior could cause a few critical bugs and inconsistent states, and was mostly unusable. This may break compatibility if you have object fields in your sparse indexes
- [fix] Fixed data migration on PK update, when some of the documents are marked as 'deleted'
- [fix] Fixed support for subqueries on the left side of `WhereExpressions`
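
The trade-off described for extendible hashing above can be illustrated with a toy table. This is a minimal sketch of the general technique, not Reindexer's implementation: the class name `ExtendibleHash`, the bucket capacity and the dict-based buckets are all assumptions made for illustration. The point it shows is that an overflow splits a single bucket (and at most doubles a directory of references) instead of rehashing every stored item at once.

```python
class ExtendibleHash:
    """Toy extendible hash table (illustrative only, hypothetical names).

    The directory has 2**global_depth slots; several slots may share one bucket.
    Assumes keys do not all collide on the same hash value.
    """

    def __init__(self, bucket_capacity=4):
        self.bucket_capacity = bucket_capacity
        self.global_depth = 1
        self.directory = [self._new_bucket(1), self._new_bucket(1)]

    @staticmethod
    def _new_bucket(depth):
        # Each bucket remembers how many hash bits it actually distinguishes.
        return {"depth": depth, "items": {}}

    def _slot(self, key):
        # The low global_depth bits of the hash select the directory slot.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key, default=None):
        return self.directory[self._slot(key)]["items"].get(key, default)

    def put(self, key, value):
        while True:
            bucket = self.directory[self._slot(key)]
            if key in bucket["items"] or len(bucket["items"]) < self.bucket_capacity:
                bucket["items"][key] = value
                return
            self._split(bucket)  # split only the overflowing bucket, then retry

    def _split(self, bucket):
        if bucket["depth"] == self.global_depth:
            # Doubling copies references only; no stored item is moved here.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket["depth"] += 1
        sibling = self._new_bucket(bucket["depth"])
        high_bit = 1 << (bucket["depth"] - 1)
        # Move the items of this one bucket whose new distinguishing bit is set.
        for k in [k for k in bucket["items"] if hash(k) & high_bit]:
            sibling["items"][k] = bucket["items"].pop(k)
        # Repoint the directory slots that now belong to the sibling.
        for i, b in enumerate(self.directory):
            if b is bucket and i & high_bit:
                self.directory[i] = sibling
```

Each insertion pays a small extra cost for the directory lookup, while the work done on overflow is bounded by one bucket's contents plus a directory copy, which is the stability property the entry above refers to.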
Fulltext
- [fea] Reworked initial term variants generation logic. Now the search engine uses variants from previous steps to generate more variants on the current step. For example, variants received from the typos-handling mechanism will be used for stemming, etc.
- [fea] Changed ranking for multiword synonyms. Now rank divides proportionally between all the new terms
- [fea] Changed logic around base ranking config. Now full match will always have the best rank, and all the other base ranks will be proportionally decreased according to the configured values. This may break compatibility in some cases, because from now on there is no way to make full-match rank lower than any other base rank
- [fea] `MinRelevancy` parameter in config has been deprecated. Added new `MinRank` parameter, which covers the full ranking range
- [fea] Deprecated `FuzzyTextIndex` and related configs were removed completely
- [fix] Fixed crash during `IS NOT NULL` filtering condition and `EnablePreselectBeforeFt` index option interaction
Vector indexes
- [fea] Added support for multiple vectors in a single field (i.e. array vector indexes)
- [fix] Fixed crash in dequantizing of empty (null) vectors during quantization config update
Reindexer server
- [fea] Added memory limit for max response size (default value is 1 GB) to avoid unexpected OOMs. Limit may be changed via `net.max_http_rsp_size` yaml-config option, `--max-http-rsp` CLI flag or `RX_MAX_HTTP_RSP` Docker env
- [fea] Optimized tags synchronization for Update/Delete queries (new logic allows reducing response sizes by cutting off CJSON tags dictionary)
- [fea] Updated swagger to v5.32
Go connector
- [fea] Bumped dependency versions and updated min Go version (v1.24.0)
- [fea] Optimized some of the CJSON marshaling logic
- [fea] Updated content returned in `IsSortable` and `Conditions` fields of the `IndexDescription` object. Now it corresponds to the actual index capabilities
- [upd] Updated fulltext config structure according to changes in core fulltext indexes
Face
- [fea] Added vertical resize for the JSON preview window on the `Index edit` page
- [fea] Added new fields to the `Vector index settings`
v5.12.1
Core
- [fix] Fixed crash during error handling in `IndexUpdate` when PK is missing
- [fix] Fixed `composite` values validation in `ALLSET` operator
- [fix] Fixed `composite` indexes error handling in equal_position
- [fix] Fixed negative radius validation in `DWithin` condition for geo index
Vector indexes
- [fix] Fixed disk ANN cache for `composite` primary keys. Previously it could lead to crash on startup
- [fix] Fixed KNN search with radius for quantized HNSW index
Go connector
- [fix] Fixed type tags handling for empty slices
v5.12.0
Core
- [fea] Added now()-function support into `WHERE`-clause. Now it may be used both in `UPDATE SET` and `WHERE` clauses
- [fea] Added flat_array_len()-function into `UPDATE SET`. Now it may be used both in `UPDATE SET` and `WHERE` clauses
- [fea] Added `checksum` field into `#memstats`-namespaces as better alternative for `datahash`
- [fea] Changed grouping logic for equal_position. New syntax/logic has better match with standard json-paths and also supports nested arrays in explicit way
- [fix] Fixed possible memory leak during `composite`-indexes substitution inside WHERE-clauses (in cases, when `int->string` conversion was performed before `composite` substitution)
- [fix] Fixed SQL parsing for queries with combination of `or inner join(...)` and `left join(...)`
- [fix] Fixed storage data migration, when Primary key index was changed
- [fix] Fixed `2D points` conversion on WHERE-clause (previously it could lead to crashes on assertion)
- [fix] Added explicit check for `rtree` Primary keys. Geo-indexes can not be PK anymore
- [fix] Fixed forced sort errors handling for KNN-queries, when query has `LIMIT` and `OFFSET`
- [fix] Fixed `UUID->string` conversions for nested arrays on `UUID`-index deletion
Fulltext
- [fea] Added optional terms boost, which allows setting a rank multiplier for specific terms
Vector indexes
- [fea] Added 8 bit scalar quantization for HNSW-index
- [fea] Added more effective vectorized implementations for `L2`, `IP` and `cosine` metrics
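
For readers unfamiliar with 8 bit scalar quantization, the general idea can be sketched as follows. This is an illustration of the technique under stated assumptions, not the quantizer used by the HNSW index: the function names `quantize`/`dequantize` and the fixed `[lo, hi]` range passed by the caller are hypothetical. Each float component is mapped linearly onto one of 256 integer codes, so a vector takes one byte per dimension at the price of a bounded rounding error.

```python
def quantize(vector, lo, hi):
    """Map floats from [lo, hi] onto codes 0..255 (out-of-range values are clamped).

    Assumes hi > lo. Hypothetical helper for illustration only.
    """
    scale = (hi - lo) / 255.0
    codes = []
    for x in vector:
        x = min(max(x, lo), hi)  # clamp into the quantization range
        codes.append(round((x - lo) / scale))
    return bytes(codes)  # one byte per vector component


def dequantize(codes, lo, hi):
    """Restore approximate floats; the error per component is at most scale / 2."""
    scale = (hi - lo) / 255.0
    return [lo + c * scale for c in codes]


vec = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes = quantize(vec, -1.0, 1.0)
restored = dequantize(codes, -1.0, 1.0)
```

With a range of `[-1, 1]` the step between neighbouring codes is `2 / 255`, so every restored component differs from the original by less than 0.004.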
Replication
- [fea] Added `checksum` check instead of `datahash`: `checksum` implementation has lower collisions rate and higher impact from each document's field
- [fix] Fixed some rare case, when `temporary` namespace could remain alive after replication error
Reindexer server
- [fea] Changed FilterDef in Query DSL: some of the fields were marked as deprecated and `left_expression`/`right_expression` were added as more unified alternatives for better functions support and future filtering expressions development
Go connector
- [fea] Added unified `WhereExpressions` method for better functions support and future filtering expressions development
- [fix] Fixed deserialization crash for queries, where `inner join` stays before equal_position in brackets
Face
- [fea] Added `Explain` visualization for queries with MERGE
- [fea] Added `Boost for specific fulltext terms` into fulltext config tab
- [fix] Fixed up/down buttons for custom field on pagination section
- [fix] Fixed the issue related to the page opening in a new window from left bar
- [fix] Fixed the issue related to the page opening in a new window from `Namespace` tabs
v5.11.1
Fulltext
- [fix] Fixed possible heap-use-after in composite fulltext indexes, created over non-indexed fields
- [fix] Fixed composite fulltext index cache invalidation after `UPDATE`-queries
- [fix] Fixed deleted docs handling, when selection results exceed `merge_limit` and there are multiple build steps in incremental index
Vector indexes
- [fix] Fixed possible buffer overflow in transactions logic in case of multithreading insertion into `HNSW`
v5.11.0
Core
- [fea] Optimized indexes memory layout for namespaces with large amount of items. Index `IdSet`-structures now produce noticeably less overhead
- [fea] Added support for `JOIN` on `composite`-indexes (i.e. queries like `SELECT * FROM ns1 INNER JOIN (SELECT * FROM ns2) ON ns1.composite = ns2.composite`)
- [fea] Added support for serial()/now() precepts with non-indexed fields
- [fea] Added more optimal `preselect` for `JOIN`-queries in cases, when right namespace is small and right query does not have filtering conditions with `IdSets`
- [fea] Added new `EXPLAIN` format for `SELECT`-queries with MERGE. Now it contains aggregated timing information and separate explains for each query
- [fix] Fixed serial()/now() precepts with indexed fields, when target `jsonpath` is missing in document
- [fix] Fixed `UPDATE`-queries for indexed fields, when target `jsonpath` is missing in document
- [fix] Fixed indexing of empty arrays after `UPDATE`-queries: previously those arrays were not selected by `IS NULL` condition
- [fix] Fixed memory leak in `composite`-indexes after particular item update via `UPDATE` query
- [fix] Fixed `UPDATE DROP` for `composite`-index parts, when `jsonpath` of subindex has nested field
- [fix] Fixed `UPDATE`-query interaction with `null`-fields
- [fix] Fixed handling for duplicate `sparse`-indexes in DISTINCT with multiple fields
- [fix] Fixed storage data migration after `Primary key` index update
Fulltext
- [fea] Changed indexing structure for typos handling. New structure has noticeably less memory consumption
- [fea] Added support for `ORDER BY ft_composite` created over non-indexed fields
- [fix] Fixed few incorrect interactions between `UPDATE`-queries and `text composite` index with `null`/missing fields
Vector indexes
- [fix] Fixed situation, when some row IDs in `KNN` results with `range search` could be incorrect (due to missing internal/external index mapping)
Reindexer server
- [fix] Fixed QPS in Prometheus-metrics for `SELECT`-queries (after `5.9.0` it was always equal to `UPDATE`-queries QPS)
- [fix] Fixed `columns` list content in HTTP query results response (now it will contain full list of existing columns)
Face
- [fea] Removed autocomplete from index fields for create/edit index forms
- [fea] Added caching of added float vector data config
- [fea] Deleted `is_appendable` field from index config
v5.10.0
Core
- [fea] Added filtering by field length
- [fea] Optimized selection plan for `tree` sparse-indexes with `is null`/`is not null` conditions
- [fea] Support multifield sort by `tree` sparse-indexes
- [fea] Improved index detection logic for target fields in update-queries in cases, when `jsonpath` is not equal to `index name`
- [fix] Fixed arrays concatenation for `sparse`-indexes
- [fix] Fixed `assertion throw` for non-existing fields in forced sort
- [fix] Fixed multiple issues with `collate numeric` index option: `null`-values handling and space characters handling
- [fix] Fixed original strings content preservation for `collate ascii` and `collate utf-8` (previously those strings could be normalized)
- [fix] Disabled invalid config with multiple `jsonpaths` for geo indexes
- [fix] Fixed `update drop` for heterogeneous arrays with `sparse`-indexes
- [fix] Fixed array fields rollback for unsuccessful update-queries in some corner cases
Fulltext
- [fea] Added terms concatenation. Enabled by default. Check `EnableTermsConcat` flag and `ConcatProc` value
- [fix] Fixed crash on `null`-values with `enable_preselect_before_ft: true` index option
Vector indexes
- [fea] Added `embed_input_traffic` and `output_traffic` prometheus metrics for auto-embedding
- [fea] Added `skip_embedding()` precept for vector fields. Check embedding configuration for details
- [fix] Fixed auto-embedding statistics in `#perfstats` after vector index update
Replication
- [fea] Added `queued_namespace_syncs` field into `#replicationstats`. It shows current `WAL`/force-sync queue size for each node
- [fea] Improved namespaces sync ordering. Now replicator tries to achieve better vectors data sharing
- [fea] Extended admissible_replication_tokens functionality: now those tokens may be used on `leader` to protect it from role switch by other node (useful in scenarios, when `follower` has to become new `leader`)
Go connector
- [fix] Fixed `panic` in case of `inner join` with closed namespace
C++ connector
- [fea] Added support for array-fields setting via `Item::operator[]`
Reindexer tool
- [fea] Improved interaction between DB dump restoration and auto-embedding: auto-embedding will be skipped for existing data
Face
- [fea] Added `enable_terms_concat` and `concat_proc` setting into `text`-index page
- [fix] Fixed item template on `New item` page for interfering nested keys
v5.9.0
Core
- [fea] Added direct support for nested arrays storing/indexing in `JSON`, `CJSON` and `MsgPack` (i.e. JSONs like this `{ "id": 7, "arr": [ 1, "string", [ 1, 2, 3], { "field": 10 }] }` now may be stored into database)
- [fea] Allowed to sort `null`-values in `hash`-indexes (including `null's` inside arrays). `Nulls`-order is now consistent for different indexes/fields: `null` is considered less than any other value. This changes behavior for some queries with `sparse tree` indexes: previously `nulls`-order was inconsistent and depended on the selection plan and index/field type
- [fea] Added `TagsMatcher's` info into `#memstats`
- [fea] Added fields check according to current `StrictMode` for joined fields inside `ON`-clause
- [fea] Added `Distinct`-support for `composite`-indexes
- [fix] Fixed assertion in ordered queries with `Distinct` over fulltext-indexes
- [fix] Fixed background index optimization in cases, when target index contains `null`-values
- [fix] Fixed `CJSON`-corruption after `UPDATE`-queries with non-existing array indexes
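
The ordering rule stated above ("null is considered less than any other value") can be mimicked client-side when comparing results. This is only a sketch of the rule in plain Python, not of the engine's comparator; the helper name `null_first_key` is hypothetical.

```python
def null_first_key(value):
    # False sorts before True, so None precedes every non-null value.
    # The second element is only compared between non-null values.
    return (value is not None, value if value is not None else 0)


values = [3, None, 1, None, 2]
ascending = sorted(values, key=null_first_key)
descending = sorted(values, key=null_first_key, reverse=True)
print(ascending)   # [None, None, 1, 2, 3]
print(descending)  # [3, 2, 1, None, None]
```

Under this rule nulls come first in ascending order and last in descending order, regardless of which index or field type holds the values.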
Fulltext
- [fea] Improved merging logic, when MergeLimit is exceeded. Search engine will try to find documents with maximum corresponding terms. This may be slower, but provides better quality. You may set environment variable `REINDEXER_NO_2PHASE_FT_MERGE=1` to disable 2-phase merging and fallback to the old merge logic
- [fea] Supported select functions for array values in composite indexes
- [fix] Allowed to index `null`-fields in `fulltext composite` indexes
Vector indexes
- [fea] Added performance metrics for auto-embedding logic. Check `indexes` performance stats in `#perfstats` namespace for details (make sure, that `perfstats` are enabled in `#config`)
- [fix] Fixed segmentation fault in KNN-queries with `radius`, when target index is empty
- [fix] Disabled vector indexes update/create operations, when namespace does not have PK-index (it could lead to disk storage corruption)
Go connector
- [fea] Added support for nested arrays into `CJSON`-coding/decoding
- [fix] Fixed Transactions with `Update`/`Delete`-queries. Now such transactions will return actual count of items, affected by the queries
Reindexer server
- [fea] Added Prometheus-metrics for auto-embedding logic
- [fix] Fixed escaping in `api/v1/db/:db/namespaces/:ns/meta*` endpoints
Reindexer tool
- [fix] Fixed escaping in `\meta`-calls
Face
- [fea] Added total for `Memory Statistics` of namespace
- [fix] Fixed issue that appeared on `Performance Statistics` refresh
- [fix] Fixed Embedder's URL validation to allow local domains
v5.8.0
Core
- [fea] Added new EqualPosition syntax to perform grouping conditions over object arrays
- [fea] Added MERGE support for hybrid select results
- [fea] Optimized comparator for multifield `Distinct` (for conditions like `Distinct(field1,field2,...)`)
- [fea] Added `Distinct` support for fulltext indexes (works the same way as `Distinct` for regular indexes)
- [fea] Optimized selection plan for empty query results
- [fix] Fixed error handling in `composite`-index update/delete operations
- [fix] Fixed DSN masking in `#config`-namespace
- [fix] Fixed token's positions in SQL parsing error descriptions
- [fix] Fixed data race in namespaces renaming
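
To make the idea behind EqualPosition grouping concrete, here is a small sketch of the semantics as they are generally described: all grouped conditions must hold at the same array index. It is an illustration in plain Python, not the engine's comparator or the query syntax; the function names and the sample documents are hypothetical.

```python
def matches_anywhere(doc, conds):
    """Plain AND: every condition must match some element of its own array."""
    return all(any(pred(v) for v in doc[field]) for field, pred in conds.items())


def matches_equal_position(doc, conds):
    """Equal-position AND: all conditions must match at one shared array index."""
    fields = list(conds)
    length = min(len(doc[f]) for f in fields)
    return any(all(conds[f](doc[f][i]) for f in fields) for i in range(length))


# Hypothetical conditions: price > 40 and qty > 3.
conds = {"price": lambda p: p > 40, "qty": lambda q: q > 3}
doc_a = {"price": [10, 50], "qty": [5, 0]}  # matches only at different positions
doc_b = {"price": [10, 50], "qty": [0, 5]}  # both conditions hold at index 1

print(matches_anywhere(doc_a, conds), matches_equal_position(doc_a, conds))
print(matches_anywhere(doc_b, conds), matches_equal_position(doc_b, conds))
```

`doc_a` passes the plain check but fails the equal-position one, because no single index satisfies both conditions; `doc_b` passes both.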
Fulltext
- [fea] Added extra strict validation for non-existing fields/indexes in fulltext dsl
Vector indexes
- [fea] Added vector's data sharing between multiple query results to reduce peak memory footprint for results, containing vectors
Replication
- [fix] Fixed possible "split brain" in RAFT-cluster
Sharding
- [fea] Added forced RAFT-leader elections request after proxying errors during `#replicationstats` request for more stable errors handling
- [fix] Disabled sharding by vector indexes
Reindexer tool
- [fix] Fixed interactive mode termination after error
Face
- [fea] Added new `vectors keeper size` field to the Statistics page
v5.7.0
Core
- [fea] Added support for sorting with array fields (i.e. `ORDER BY array_field`)
- [fea] Json-paths ordering for composite indexes was made consistent and now depends on initial json-paths ordering in indexes definition array
- [fea] Improved error messages in cases, when user tries to create new PK-index over the field with duplicated values
- [fea] Optimized dynamic memory allocations count in JOIN-queries
- [fix] Fixed crash during `null`-values handling in equal_position
- [fix] Fixed quotes handling in sort expressions
Fulltext
- [fea] Significantly optimized ranks merging loop for queries with large relevant results count (up to 25% performance boost according to our CPP-benchmarks)
- [fix] Fixed composite fulltext indexes update when target index has individual fields configs
- [fix] Fixed crash when indexing arrays with `enable_numbers_search`
- [fix] Fixed fast-path index update
Reindexer server
- [fea] Added support for transaction in Protobuf and MsgPack format (in `/api/v1/db/:db/namespaces/:ns/transactions` endpoint)
- [fix] Fixed crash on incorrect JSON for `equal_positions` and `join_query` fields in Query DSL parser
- [fix] Fixed response for GRPC EnumNamespaces with `onlyNames`-option
CXX API
- [fea] Added few more safety checks for `client::Reindexer`
- [fix] Fixed handling of nested json-paths in `reindexer::Item::operator[]` (i.e. cases like `item["obj.field"] = 10`)
- [ref] Method `Select(std::string_view sql)` was renamed to `ExecSQL(std::string_view sql)`
- [ref] Removed deprecated `temporary` flag from namespace's `#memstat`
Deploy
- [upd] Updated base docker image from `alpine:3.21` to `alpine:3.22`
v5.6.0
Core
- [fea] Added subqueries and `or inner join` support for UPDATE and DELETE queries
- [fea] Added more informative error message in case of unsuccessful index creation
- [fea] Improved anti-join handling (excessive braces are not required anymore)
- [fea] Added more strict validation for incorrect conditions with LEFT JOINS
- [fea] Improved protobuf/msgpack content validation
- [fea] Added more strict validation for UPDATE-queries targeting non-array fields
- [fix] Fixed SQL/DSL(JSON) parsing of `NOT`-operator inside JOIN's ON-clause
- [fix] Fixed case-insensitive namespaces names in DSL(JSON) queries
- [fix] Fixed automatic indexes substitution for array-indexes with multiple `jsonpaths` and `sparse`-indexes
- [fix] Fixed compatibility in empty arrays JOINs
- [fix] Fixed incorrect LIMIT handling in queries with combination of array-field DISTINCT/multi-DISTINCT and forced sort
- [fix] Fixed `matched` field value in `explain` results for conditions with `NOT` operators
Vector indexes
- [fea] Added automatic fallback in hybrid query, when embedder is not available. This query will be executed as pure fulltext query without KNN-part
- [fix] Fixed incorrect handling of the deleted vectors by KNN-conditions with `radius`
- [fix] Changed embedders validation logic to avoid indexes creation error on startup
Replication
- [fea] Added proxying for UPDATE and DELETE queries with subqueries and inner joins
Reindexer tool
- [fea] Added storage conversion tool
Deploy
- [upd] Added deployment for `debian:13` (trixie)
- [upd] Removed deployment for `debian:11` (bookworm)
Face
- [fea] Added new fields to fulltext index config (`keep_diacritics`, `min_word_part_size` and `word_part_delimiters`)
- [fix] Fixed ms measure for statistics column titles