feat: implement regions by moflotas · Pull Request #364 · ozontech/seq-db

moflotas · 2026-02-27T13:25:50Z

Description

Fixes #363

I have read and followed all requirements in CONTRIBUTING.md;
I used LLM/AI assistance to make this pull request;

If you have used LLM/AI assistance please provide model name and full prompt:

Model: {{model-name}}
Prompt: {{prompt}}

codecov-commenter · 2026-02-27T13:29:49Z

Codecov Report

❌ Patch coverage is 21.81070% with 190 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.73%. Comparing base (8aefcc9) to head (7a320e4).

Files with missing lines	Patch %	Lines
proxyapi/store_emulator.go	3.77%	51 Missing ⚠️
proxy/search/store_response.go	0.00%	49 Missing ⚠️
cmd/seq-db/seq-db.go	0.00%	47 Missing ⚠️
proxy/search/ingestor.go	47.05%	15 Missing and 3 partials ⚠️
proxyapi/ingestor.go	53.84%	11 Missing and 7 partials ⚠️
proxyapi/grpc_v1.go	22.22%	6 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #364      +/-   ##
==========================================
- Coverage   71.42%   70.73%   -0.70%     
==========================================
  Files         205      207       +2     
  Lines       14910    15088     +178     
==========================================
+ Hits        10650    10672      +22     
- Misses       3484     3634     +150     
- Partials      776      782       +6

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

github-actions · 2026-02-27T13:33:26Z

🔴 Performance Degradation

Some benchmarks have degraded compared to the previous run.
Click on Show table button to see full list of degraded benchmarks.

Show table

Name	Previous	Current	Ratio	Verdict
`AggDeep/size=1000-4`	`8aefcc`	`e5872f`
	`4817.00 ns/op`	`5783.00 ns/op`	`1.20`	`🔴`
`AggDeep/size=1000000-4`	`8aefcc`	`e5872f`
	`4837044.00 ns/op`	`5863059.00 ns/op`	`1.21`	`🔴`
`AggWide/size=1000-4`	`8aefcc`	`e5872f`
	`4804.00 ns/op`	`5829.00 ns/op`	`1.21`	`🔴`
`AggWide/size=10000-4`	`8aefcc`	`e5872f`
	`47761.00 ns/op`	`58337.00 ns/op`	`1.22`	`🔴`
`FindSequence_Random/medium-4`	`8aefcc`	`e5872f`
	`10747.09 MB/s`	`9231.79 MB/s`	`0.86`	`🔴`
	`99.33 ns/op`	`110.90 ns/op`	`1.12`	`🔴`
`FindSequence_Random/small-4`	`8aefcc`	`e5872f`
	`6638.44 MB/s`	`5583.28 MB/s`	`0.84`	`🔴`

github-actions · 2026-02-27T15:20:27Z

🔴 Performance Degradation

Some benchmarks have degraded compared to the previous run.
Click on Show table button to see full list of degraded benchmarks.

Show table

Name	Previous	Current	Ratio	Verdict
`FindSequence_Random/small-4`	`8aefcc`	`bbff1f`
	`6638.44 MB/s`	`4997.87 MB/s`	`0.75`	`🔴`

cheb0 · 2026-03-10T07:07:08Z

config/config.go

+		// When set, hot_stores, hot_read_stores, write_stores, read_stores must be empty. No bulk; search only.
+		Regions struct {
+			// UseRegions is whether to use regions or not
+			UseRegions bool `config:"use_regions"`


currently, config looks like that:

experimental: regions: use_regions: true

We repeat regions twice here.

I think it would be cleaner to have just:

experimental: regions: enabled: true

cheb0 · 2026-03-10T07:08:45Z

proxyapi/store_emulator.go

+
+// storeEmulator implements store API (Search, Fetch, Status) so seq-proxy can be
+// queried as a store for aggregation-friendly cross-region queries (e.g. quantiles).
+type storeEmulator struct {


For me storeAdapter would be a better name.

cheb0 · 2026-03-10T07:11:37Z

proxyapi/store_emulator.go

+	return &storeEmulator{searchIngestor: si}
+}
+
+func (s *storeEmulator) Search(ctx context.Context, req *storeapi.SearchRequest) (*storeapi.SearchResponse, error) {


One thing: explain doesn't work for regions. I set up a simple region cluster and here are the results:

Explained query for regions:

{ "error": { "code": "ERROR_CODE_NO", "message": "" }, "explain": { "duration": "316.815518ms", "message": "proxy/ComplexSearch", "children": [ { "duration": "206.588436ms", "message": "proxy/searchShard", "children": [ { "duration": "17.383µs", "message": "Making search request to localhost:9114" } ] }, { "duration": "316.62144ms", "message": "proxy/searchShard", "children": [ { "duration": "48.615µs", "message": "Making search request to localhost:9124" } ] } ] } }

Explained query for a single region:

{ "error": { "code": "ERROR_CODE_NO", "message": "" }, "explain": { "duration": "1.5405ms", "message": "proxy/ComplexSearch", "children": [ { "duration": "1.41697ms", "message": "proxy/searchShard", "children": [ { "duration": "33.876µs", "message": "Making search request to localhost:9014" }, { "duration": "157.948µs", "message": "store/Search", "children": [ { "duration": "17.734µs", "message": "parse query" }, { "duration": "8.126µs", "message": "search iteratively" }, { "duration": "0s", "message": "search iteratively (cpu time)" }, { "duration": "0s", "message": "waiting goroutines (all cores)" } ] } ] } ] } }

cheb0 · 2026-03-10T07:12:46Z

config/config.go

+			// Regions maps region name to store API gRPC address (e.g. "eu" -> "eu-proxy:9004").
+			Regions map[string]string `config:"regions"`
+			// MaxConcurrent limits how many regions are queried in parallel per search (0 = no limit).
+			MaxConcurrent int `config:"max_concurrent" default:"0"`


I think it's hard to tell what max_concurrent means without reading a code comment.

experimental: regions: use_regions: true ... max_concurrent: 5

cheb0 · 2026-03-10T07:20:23Z

proxyapi/store_emulator.go

+}
+
+func (s *storeEmulator) Search(ctx context.Context, req *storeapi.SearchRequest) (*storeapi.SearchResponse, error) {
+	sr := search.SearchRequestFromStoreAPI(req)


Store adapter doesn't bypass store protocol version. It means regions proxy currently get store protocol version as 2 while the primary proxy gets 1. It's possible that proxy can misinterpret QPRs from regions.

cheb0 · 2026-03-10T07:25:04Z

config/config.go

 		MaxRegexTokensCheck int `config:"max_regex_tokens_check" default:"0"`
+
+		// Regions configures the proxy to query only remote regions (other seq-proxies via store API).
+		// When set, hot_stores, hot_read_stores, write_stores, read_stores must be empty. No bulk; search only.


nit: Even with proxy setup I still can configure hot_stores:

cluster: replicas: 1 hot_stores: - localhost:9004 - localhost:9008

However, I can't use them:

{ "code": 3, "message": "no regions specified in request (required when using experimental.regions)" }

I guess, we could validate config that hot_stores can't be set if regions are enabled.

cheb0 · 2026-03-10T07:30:14Z

proxy/search/ingestor.go

+	// RegionStores is a separate topology for experimental.regions (not mixed with hot_stores). When set, search uses only this.
+	RegionStores *stores.Stores
+	// RegionNames maps shard index to region name (same order as RegionStores.Shards).
+	RegionNames []string


nit: so we basically have region names decoupled from region stores, but indexes are same. This is a bit ugly. I suppose it's not possible to solve this problem altogether, but I would not be against moving regions inside stores.Stores. Default setup could have empty (not used) region names

cheb0 · 2026-03-10T07:31:42Z

proxy/search/ingestor.go

-		searchStores = si.config.HotReadStores
+
+	var searchStores *stores.Stores
+	if config.UseRegions {


nit: I'm not against having longer methods. However, here I would really move this logic of determing which stores we call to some method like locateStores or something like that.

cheb0 · 2026-03-10T07:34:15Z

proxy/search/ingestor.go

+		shards := make([][]string, 0)
+		vers := make([]string, 0)
+		for i, name := range si.config.RegionNames {
+			if _, ok := allowed[name]; ok && i < len(si.config.RegionStores.Shards) {


nit (not really): would be cool to support wildcards in future. Users could specify regions like krd*

cheb0 · 2026-03-10T08:43:35Z

proxyapi/store_emulator.go

+// storeEmulator implements store API (Search, Fetch, Status) so seq-proxy can be
+// queried as a store for aggregation-friendly cross-region queries (e.g. quantiles).
+type storeEmulator struct {
+	storeapi.UnimplementedStoreApiServer


nit: I presume, we do not support async searches with regions enabled. It would be reasonable if we got some errors clarifying that. Currently, I get: can't find store shards in config when async search request is issued.

feat: implement regions

fbb3c5f

fix: rename variable to fix linting

7a320e4

eguguchkin requested review from cheb0, dkharms and eguguchkin and removed request for dkharms March 2, 2026 10:29

cheb0 requested changes Mar 10, 2026

View reviewed changes

Conversation

moflotas commented Feb 27, 2026

Description

Uh oh!

codecov-commenter commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

github-actions bot commented Feb 27, 2026

🔴 Performance Degradation

Uh oh!

github-actions bot commented Feb 27, 2026

🔴 Performance Degradation

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codecov-commenter commented Feb 27, 2026 •

edited

Loading