SOLR-18245: Add reRankCutoff output on LTR queries #4442
import org.apache.solr.ltr.model.LinearModel;
import org.junit.AfterClass;

public abstract class AbstractLTRSolrCloudTestBase extends TestRerankBase {
Created this to share code between TestLTROnSolrCloud and the new TestLTRReRankCutoffOnSolrCloud. The new tests needed to be separate because the existing test can create a SolrCloud with a single shard, and I want to assert behaviour when there are multiple shards.
|
The tests are timing out on the setup stage (removing previous clone). @mkhludnev I can see you retried this and it timed out again. I don't have the option to retry... I don't think this is related to the PR itself, it's not actually running the tests, but is there something I've done or missed to cause this failure? Or something I can do to fix it? |
|
@mkhludnev thanks for the review and re-running the tests. Are you able to merge this? Or anything else I need to do before you can? |
I am not the right person for this area ;-(. I think @cpoerschke is probably the strongest person to review this, she has done the most work in this space. Maybe @alessandrobenedetti could review (or suggest someone!) |
|
Hi @cpoerschke @alessandrobenedetti have you had a chance to look at this? Any thoughts or, alternatively, suggestions of who else might have the expertise? |
Pull request overview
Adds an optional echoReRankCutoff local parameter for LTR rerank queries so clients can see the first-pass cutoff value(s) that determined eligibility for reranking. This is implemented by capturing the cutoff in the rerank collector and exposing it in responseHeader (and, for distributed queries, aggregating shard-local values into a dedicated header field).
Changes:
- Add `echoReRankCutoff` support to LTR rerank parsing and propagate the request intent via request context.
- Compute and emit `reRankCutoff` in the response header (supporting single- and multi-sort cutoffs, including schema-aware sort value marshalling).
- For distributed queries, aggregate shard-reported cutoffs into `responseHeader.reRankCutoffByShard`, and add/refactor tests + reference-guide documentation + changelog entry.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| solr/solr-ref-guide/modules/query-guide/pages/learning-to-rank.adoc | Documents echoReRankCutoff, reRankCutoff, and reRankCutoffByShard with examples. |
| solr/modules/ltr/src/test/org/apache/solr/ltr/TestLTRWithSort.java | Adds tests for cutoff reporting when sorting by function(s). |
| solr/modules/ltr/src/test/org/apache/solr/ltr/TestLTRReRankCutoffOnSolrCloud.java | Adds SolrCloud/distributed coverage asserting shard-local cutoff reporting. |
| solr/modules/ltr/src/test/org/apache/solr/ltr/TestLTRQParserPlugin.java | Adds core LTR tests for header presence/absence and cutoff value types/order. |
| solr/modules/ltr/src/test/org/apache/solr/ltr/TestLTROnSolrCloud.java | Refactors to use a shared SolrCloud LTR test base. |
| solr/modules/ltr/src/test/org/apache/solr/ltr/AbstractLTRSolrCloudTestBase.java | New shared SolrCloud setup/index/model-loading base for LTR tests. |
| solr/modules/ltr/src/java/org/apache/solr/ltr/search/LTRQParserPlugin.java | Adds echoReRankCutoff local param and stores intent in request context. |
| solr/core/src/java/org/apache/solr/search/ReRankCollector.java | Computes cutoff from the last eligible first-pass doc and writes it to responseHeader when enabled. |
| solr/core/src/java/org/apache/solr/search/QueryCommand.java | Adds sortSchemaFields plumbing so cutoff sort values can be marshalled correctly. |
| solr/core/src/java/org/apache/solr/search/AbstractReRankQuery.java | Defines shared constants for cutoff header keys and request-context key. |
| solr/core/src/java/org/apache/solr/handler/component/ResponseBuilder.java | Populates QueryCommand.sortSchemaFields from SortSpec. |
| solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java | Merges shard reRankCutoff into top-level reRankCutoffByShard for distributed queries. |
| changelog/unreleased/SOLR-18245-rerank-cutoff-value.yml | Adds unreleased changelog entry describing the new optional header output. |
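
For readers following the distributed part, the merge of shard-local `reRankCutoff` values into `reRankCutoffByShard` can be pictured roughly as below. This is a minimal sketch, not the `QueryComponent` code from this PR: response headers are modelled as plain maps, keying the result by shard name is an assumption, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: not the QueryComponent code from this PR. */
class ReRankCutoffByShardSketch {

  static final String CUTOFF = "reRankCutoff";
  static final String CUTOFF_BY_SHARD = "reRankCutoffByShard";

  /**
   * Builds a top-level header from per-shard headers.
   * Assumption: headers are plain maps keyed by shard name.
   * Shards that reported no cutoff are skipped.
   */
  static Map<String, Object> mergeHeaders(Map<String, Map<String, Object>> shardHeaders) {
    Map<String, Object> byShard = new LinkedHashMap<>();
    for (Map.Entry<String, Map<String, Object>> entry : shardHeaders.entrySet()) {
      Object cutoff = entry.getValue().get(CUTOFF);
      if (cutoff != null) {
        byShard.put(entry.getKey(), cutoff);
      }
    }
    Map<String, Object> merged = new LinkedHashMap<>();
    if (!byShard.isEmpty()) {
      merged.put(CUTOFF_BY_SHARD, byShard);
    }
    return merged;
  }

  public static void main(String[] args) {
    Map<String, Map<String, Object>> shardHeaders = new LinkedHashMap<>();
    shardHeaders.put("shard1", Map.of(CUTOFF, 7.4));
    shardHeaders.put("shard2", Map.of(CUTOFF, 3.1));
    shardHeaders.put("shard3", Map.of()); // hypothetical shard that reported no cutoff
    // prints {reRankCutoffByShard={shard1=7.4, shard2=3.1}}
    System.out.println(mergeHeaders(shardHeaders));
  }
}
```

Keeping the values per shard rather than collapsing them into one number reflects that each shard applies its own first-pass window, so the cutoffs are shard-local.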
|
@shawdm thank you. |
|
https://ci-builds.apache.org/job/Solr/job/Solr-Test-main/14191/ |
|
ok. tests passed in main https://ci-builds.apache.org/job/Solr/job/Solr-Test-main/14192/ |
Co-authored-by: Darren Shaw <shawdm@gmail.com>
|
Thanks @mkhludnev - I can see you've done the cherry pick to branch_10x too - appreciate your help with this. |
https://issues.apache.org/jira/browse/SOLR-18245
Description
When using LTR, a client may need to know the sort score that a document would have required to have been eligible for rerank by LTR.
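
To make the notion of a cutoff concrete, here is a minimal Java sketch. It is not the `ReRankCollector` code from this PR: it assumes the first-pass results are available as a list of scores in rank order, and the class and method names are hypothetical. The cutoff is taken as the first-pass score of the last document that falls inside the `reRankDocs` window.

```java
import java.util.List;

/** Illustrative only: not the ReRankCollector code from this PR. */
class ReRankCutoffSketch {

  /**
   * Returns the first-pass score of the last document inside the rerank window,
   * or null when there is nothing to rerank.
   *
   * @param firstPassScores scores in first-pass rank order (best first); an assumption
   * @param reRankDocs      size of the rerank window
   */
  static Double cutoff(List<Double> firstPassScores, int reRankDocs) {
    if (firstPassScores.isEmpty() || reRankDocs <= 0) {
      return null;
    }
    // The window cannot extend past the number of first-pass hits.
    int lastEligible = Math.min(reRankDocs, firstPassScores.size()) - 1;
    return firstPassScores.get(lastEligible);
  }

  public static void main(String[] args) {
    List<Double> scores = List.of(9.5, 8.1, 7.4, 6.0, 2.2); // made-up scores
    System.out.println(cutoff(scores, 3));   // prints 7.4: third document closes the window
    System.out.println(cutoff(scores, 100)); // prints 2.2: window is larger than the result set
  }
}
```

A document scoring below the returned value in the first pass would not have been passed to the LTR model.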
Solution
This PR adds an `echoReRankCutoff` local parameter to the LTR rerank query (e.g. `{!ltr model=myModel reRankDocs=100 echoReRankCutoff=true}`). When this parameter is set to true, the `responseHeader` will include a `reRankCutoff` field containing the score of the lowest-ranked document that was included for rerank. When multiple sorts are used (e.g. `sort=price desc,discount asc`), the response header contains the cutoff value for each sort, in order.
Example Response
Tests
`TestLTRQParserPlugin` extended to test core functionality. `TestLTRReRankCutoffOnSolrCloud` added to test for sharded cases. This has also meant I've extracted shared code from `TestLTROnSolrCloud` into a common `AbstractLTRSolrCloudTestBase`.
Checklist
Please review the following and check all that apply:
- `main` branch.
- `./gradlew check`.