Description
Bug description
We tried to upgrade SigNoz from v0.99.0 to v0.100.1. Everything except the schema migrator pod works fine.
The schema migrator pod crashes and goes into CrashLoopBackOff.
The error/logs:
{"L":"info","timestamp":"2026-01-08T12:57:51.502Z","C":"schema_migrator/manager.go:972","M":"Inserting migration entry","query":"INSERT INTO signoz_metrics.distributed_schema_migrations_v2 (migration_id, status, created_at) VALUES (1001, 'in-progress', '2026-01-08 12:57:51')"}
{"L":"info","timestamp":"2026-01-08T12:57:51.506Z","C":"schema_migrator/manager.go:1029","M":"Running operation","sql":"ALTER TABLE signoz_metrics.time_series_v4_6hrs_mv_separate_attrs ON CLUSTER cluster MODIFY QUERY SELECT\n\t\t\t\t\t\t\tenv,\n\t\t\t\t\t\t\ttemporality,\n\t\t\t\t\t\t\tmetric_name,\n\t\t\t\t\t\t\tdescription,\n\t\t\t\t\t\t\tunit,\n\t\t\t\t\t\t\ttype,\n\t\t\t\t\t\t\tis_monotonic,\n\t\t\t\t\t\t\tfingerprint,\n\t\t\t\t\t\t\tfloor(unix_milli / 21600000) * 21600000 AS unix_milli,\n\t\t\t\t\t\t\tlabels,\n\t\t\t\t\t\t\tattrs,\n\t\t\t\t\t\t\tscope_attrs,\n\t\t\t\t\t\t\tresource_attrs,\n\t\t\t\t\t\t\t__normalized\n\t\t\t\t\t\tFROM signoz_metrics.time_series_v4"}
{"L":"info","timestamp":"2026-01-08T12:57:51.633Z","C":"schema_migrator/manager.go:978","M":"Updating migration entry","query":"ALTER TABLE signoz_metrics.schema_migrations_v2 ON CLUSTER cluster UPDATE status = $1, error = $2, updated_at = $3 WHERE migration_id = $4","status":"failed","error":"code: 60, message: There was an error on [chi-signoz12-clickhouse-cluster-5-2:9000]: Code: 60. DB::Exception: Could not find table: time_series_v4_6hrs_mv_separate_attrs. (UNKNOWN_TABLE) (version 25.5.6.14 (official build))","migration_id":1001}
Error: code: 60, message: There was an error on [chi-signoz12-clickhouse-cluster-5-2:9000]: Code: 60. DB::Exception: Could not find table: time_series_v4_6hrs_mv_separate_attrs. (UNKNOWN_TABLE) (version 25.5.6.14 (official build))
{"L":"error","timestamp":"2026-01-08T15:24:34.413Z","C":"schema_migrator/manager.go:546","M":"A retryable exception was received while fetching distributed DDL queue","error":"code: 999, message: Coordination error: No node, path /clickhouse/signoz12-clickhouse/task_queue/ddl/query-0000003719/finished/chi%2Dsignoz12%2Dclickhouse%2Dcluster%2D5%2D2:9000","attempt":1,"S":"github.com/SigNoz/signoz-otel-collector/cmd/signozschemamigrator/schema_migrator.(*MigrationManager).getDistributedDDLQueue\n\t/home/runner/work/signoz-otel-collector/signoz-otel-collector/cmd/signozschemamigrator/schema_migrator/manager.go:546\ngithub.com/SigNoz/signoz-otel-collector/cmd/signozschemamigrator/schema_migrator.(*MigrationManager).WaitDistributedDDLQueue\n\t/home/runner/work/signoz-otel-collector/signoz-otel-collector/cmd/signozschemamigrator/schema_migrator/manager.go:508\ngithub.com/SigNoz/signoz-otel-collector/cmd/signozschemamigrator/schema_migrator.(*MigrationManager).RunOperation\n\t/home/runner/work/signoz-otel-collector/signoz-otel-collector/cmd/signozschemamigrator/schema_migrator/manager.go:1002\ngithub.com/SigNoz/signoz-otel-collector/cmd/signozschemamigrator/schema_migrator.(*MigrationManager).MigrateUpSync\n\t/home/runner/work/signoz-otel-collector/signoz-otel-collector/cmd/signozschemamigrator/schema_migrator/manager.go:725\nmain.registerSyncMigrate.func1\n\t/home/runner/work/signoz-otel-collector/signoz-otel-collector/cmd/signozschemamigrator/main.go:163\ngithub.com/spf13/cobra.(*Command).execute\n\t/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.9.1/command.go:1015\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.9.1/command.go:1148\ngithub.com/spf13/cobra.(*Command).Execute\n\t/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.9.1/command.go:1071\nmain.main\n\t/home/runner/work/signoz-otel-collector/signoz-otel-collector/cmd/signozschemamigrator/main.go:70\nruntime.main\n\t/opt/hostedtoolcache/go/1.23.12/x64/src/runtime/proc.go:272"}
Our setup is:
layout:
  shardsCount: 6
  replicasCount: 4
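It seems the schema migrator cannot execute the migration on one shard: the node chi-signoz12-clickhouse-cluster-5-2 reports that the table time_series_v4_6hrs_mv_separate_attrs does not exist.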
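
For anyone triaging similar logs, here is a minimal sketch of how the failing node and table can be pulled out of the migrator error text. The error string is copied from the log above. The helper name `locate_failure` is hypothetical, and reading the trailing `-5-2` of the host name as shard index 5 and replica index 2 is an assumption about the pod naming scheme, not something the log itself confirms.

```python
import re

# Error text copied verbatim from the migrator log above.
ERROR = (
    "code: 60, message: There was an error on "
    "[chi-signoz12-clickhouse-cluster-5-2:9000]: Code: 60. DB::Exception: "
    "Could not find table: time_series_v4_6hrs_mv_separate_attrs. "
    "(UNKNOWN_TABLE) (version 25.5.6.14 (official build))"
)

# Assumption: the host name ends in "-<shard index>-<replica index>".
HOST_RE = re.compile(
    r"\[(?P<prefix>[\w.-]+?)-(?P<shard>\d+)-(?P<replica>\d+):(?P<port>\d+)\]"
)
TABLE_RE = re.compile(r"Could not find table: (?P<table>\w+)")


def locate_failure(message):
    """Return the host, shard/replica indices, and missing table, or None."""
    host = HOST_RE.search(message)
    table = TABLE_RE.search(message)
    if host is None or table is None:
        return None
    return {
        "host": f"{host['prefix']}-{host['shard']}-{host['replica']}",
        "shard": int(host["shard"]),
        "replica": int(host["replica"]),
        "port": int(host["port"]),
        "table": table["table"],
    }


if __name__ == "__main__":
    print(locate_failure(ERROR))
```

If the naming assumption holds, this points at the last of our 6 shards (index 5), which is the node where the existence of the table would be worth checking first.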
Any ideas on how we can solve this issue?
Expected behavior
How to reproduce
Version information
- Signoz version: 0.99.0
- Browser version:
- Your OS and version:
- Your CPU Architecture(ARM/Intel):