
Fully handle type initialization#2778

Merged
Hydrocharged merged 1 commit into main from daylon/full-type-resolution
Jun 3, 2026

Conversation

@Hydrocharged
Collaborator

@Hydrocharged Hydrocharged commented May 29, 2026

We deserialize types such that all related types (the array type, the base type, composite attributes, etc.) are IDs that are stored on the type, requiring a lookup when that related type is needed. This fails in cases where we either do not have a context available, or something in GMS causes incomplete type information to be created after the type resolution analyzer step.

To work around these limitations, we now pass a context to the deserialization step, and allow the collection to cache types that are currently being deserialized to allow for recursive type references (such as the base type and array type referring to one another). All other changes are related to this core idea, being either interface/field changes or bug fixes.
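
As a rough illustration of the caching idea described above, the sketch below registers a type in the collection's cache before its related types are resolved, so a base type and its array type can point at one another without infinite recursion. All names here (`TypeID`, `stored`, `Type`, `Collection`, `Get`) are hypothetical stand-ins and are not taken from the PR's code.

```go
package main

import "fmt"

// TypeID is a hypothetical stand-in for the real type identifier.
type TypeID string

// stored mimics a serialized type: related types are recorded only as IDs.
type stored struct {
	ID, Elem, Array TypeID
}

// Type is the hydrated form, where related types are direct pointers.
type Type struct {
	ID    TypeID
	Elem  *Type
	Array *Type
}

// Collection holds serialized types plus every type that is loaded or is
// currently being deserialized.
type Collection struct {
	disk   map[TypeID]stored
	loaded map[TypeID]*Type
}

func NewCollection(disk map[TypeID]stored) *Collection {
	return &Collection{disk: disk, loaded: make(map[TypeID]*Type)}
}

// Get returns the hydrated type for id. The type is cached before its
// relations are resolved, so a cycle (base -> array -> base) finds the
// partially built pointer instead of recursing forever.
func (c *Collection) Get(id TypeID) (*Type, error) {
	if id == "" {
		return nil, nil // no related type recorded
	}
	if t, ok := c.loaded[id]; ok {
		return t, nil
	}
	s, ok := c.disk[id]
	if !ok {
		return nil, fmt.Errorf("unable to resolve type `%s` during deserialization", id)
	}
	t := &Type{ID: s.ID}
	c.loaded[id] = t
	var err error
	if t.Elem, err = c.Get(s.Elem); err != nil {
		delete(c.loaded, id) // do not keep a half-built type on failure
		return nil, err
	}
	if t.Array, err = c.Get(s.Array); err != nil {
		delete(c.loaded, id)
		return nil, err
	}
	return t, nil
}

func main() {
	c := NewCollection(map[TypeID]stored{
		"int4":  {ID: "int4", Array: "_int4"},
		"_int4": {ID: "_int4", Elem: "int4"},
	})
	t, err := c.Get("int4")
	if err != nil {
		panic(err)
	}
	fmt.Println(t.Array.Elem == t) // the cycle resolves to the same pointer
}
```

The design choice worth noting is that the cache entry is created before recursion and removed again on error, so a failed load does not leave a partially initialized type behind.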

@github-actions
Contributor

github-actions Bot commented May 29, 2026

| Test | Main | PR | Change |
|---|---|---|---|
| covering_index_scan_postgres | 1932.46/s | 1913.44/s | -1.0% |
| groupby_scan_postgres | 123.44/s | 126.10/s | +2.1% |
| index_join_postgres | 679.12/s | 671.58/s | -1.2% |
| index_join_scan_postgres | 865.18/s | 848.88/s | -1.9% |
| index_scan_postgres | 24.55/s | 24.41/s | -0.6% |
| oltp_delete_insert_postgres | 831.30/s | 823.99/s | -0.9% |
| oltp_insert | 742.74/s | 744.34/s | +0.2% |
| oltp_point_select | 3057.50/s | 3039.13/s | -0.7% |
| oltp_read_only | 3017.98/s | 3014.30/s | -0.2% |
| oltp_read_write | 2335.16/s | 2286.87/s | -2.1% |
| oltp_update_index | 760.23/s | 764.03/s | +0.4% |
| oltp_update_non_index | 800.74/s | 816.37/s | +1.9% |
| oltp_write_only | 1772.71/s | 1824.09/s | +2.8% |
| select_random_points | 1888.65/s | 1899.72/s | +0.5% |
| select_random_ranges | 1126.29/s | 1106.29/s | -1.8% |
| table_scan_postgres | 23.32/s | 23.00/s | -1.4% |
| types_delete_insert_postgres | 785.89/s | 821.98/s | +4.5% |
| types_table_scan_postgres | 10.12/s | 10.00/s | -1.2% |

@Hydrocharged Hydrocharged force-pushed the daylon/full-type-resolution branch from e699b6d to 176cc27 Compare May 29, 2026 12:13
@itoqa

itoqa Bot commented May 29, 2026

Ito Test Report ❌

14 test cases ran. 3 failed, 2 additional findings, 9 passed.

Overall, 14 test cases ran: 9 passed and 5 failed. Recursive type-graph resolution, most explicit-cast behavior, custom type creation happy paths, array/ANY runtime checks, and cold-start initialization scenarios all behaved correctly in local verification (with SCRAM authentication disabled for deterministic access). The confirmed defects are four High-severity issues and one Medium-severity issue in production code paths: domain-default explicit casts can panic (unqualified table lookup and nil resolved-type dereference); type serialization and pg_type relationship projection can dereference nil pointer fields; CREATE TYPE/DOMAIN companion-array creation is non-atomic and can leave orphaned or conflicting metadata on failure; and domain merge logic can drop check constraints by appending to the wrong object.

❌ Failed (3)
| Category | Summary | Screenshot |
|---|---|---|
| Cast | ⚠️ Domain default explicit cast path can panic with unqualified table lookup and nil type dereference. | CAST-4 |
| Creation | ⚠️ Pointer-based type serialization and pg_type materialization can dereference nil related-type pointers and panic during metadata/cast flows. | CREATION-2 |
| Metadata | ⚠️ pg_type relationship columns dereference pointer fields without nil guards, which can panic metadata queries. | METADATA-1 |

⚠️ Domain default explicit cast can panic during type resolution
  • What failed: Instead of consistently resolving the cast or returning a stable SQL error, execution can panic with table not found during domain-default analysis and can also hit a nil-pointer path in explicit cast type resolution.
  • Impact: Domain default workflows with explicit casts can fail unpredictably, including panic-recovery errors during normal DDL/query operations. This breaks a core schema-definition path and can block users from safely using domain defaults.
  • Steps to reproduce:
    1. Create a user-defined type in a schema.
    2. Create a domain whose default expression includes an explicit cast to that type.
    3. Create or query objects that force domain default/check planning and observe panic-recovery errors instead of stable cast resolution.
  • Stub / mock context: Real SCRAM authentication was bypassed by forcing server/authentication_scram.go to disable auth, so cast behavior was exercised with deterministic local access. No API or route-response mocks were applied to the cast logic itself.
  • Code analysis: I inspected the domain-constraint analyzer and explicit-cast/type-resolution paths and found two plausible production defects: default analysis builds select <expr> from <table> using an unqualified table token, and explicit cast resolution dereferences a possibly nil resolved type when type lookup returns no concrete match.
  • Why this is likely a bug: The production code paths themselves show unsafe assumptions (unqualified table resolution and dereference without guarding a nil result), which directly explain the observed panic behavior without requiring test harness interference.

Relevant code:

server/analyzer/domain_constraints.go (lines 87-101)

parsed, err := a.Parser.ParseSimple(fmt.Sprintf("select %s from %s", defExpr, tblName))
if err != nil {
    return nil, err
}
selectStmt, ok := parsed.(*vitess.Select)
if !ok || len(selectStmt.SelectExprs) != 1 {
    return nil, sql.ErrInvalidColumnDefaultValue.New(defExpr)
}
builder := planbuilder.New(ctx, a.Catalog, nil)
return builder.BuildColumnDefaultValueWithTable(ae.Expr, selectStmt.From[0], typ, nullable), nil

server/expression/explicit_cast.go (lines 208-214)

resolvedType, err := typeColl.ResolveType(sqlCtx, c.castToType.ID)
if err != nil {
    return nil, err
}
if !resolvedType.IsResolvedType() {
    return nil, errors.Errorf("unable to resolve type `%s`", c.castToType.ID.TypeName())
}

core/typecollection/typecollection.go (lines 224-235)

func (pgs *TypeCollection) ResolveType(ctx context.Context, name id.Type) (*pgtypes.DoltgresType, error) {
    if t, err := pgs.GetType(ctx, name); err != nil {
        return nil, err
    } else if t != nil && t.IsResolvedType() {
        return t, nil
    }
    resolvedId, err := pgs.resolveName(ctx, name.SchemaName(), name.TypeName())
    if err != nil {
        return nil, err
    }
    return pgs.GetType(ctx, resolvedId)
}
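
The nil-dereference concern in the report can be illustrated with a minimal sketch. It assumes a lookup that may return `(nil, nil)` when no concrete match exists, which is what the report suggests is possible; `ResolvedType`, `lookup`, and `resolveCastTarget` are hypothetical names, not the project's API.

```go
package main

import "fmt"

// ResolvedType is a hypothetical stand-in for the real type struct.
type ResolvedType struct {
	Name     string
	Resolved bool
}

// IsResolvedType reads a field, so calling it on a nil pointer panics.
func (t *ResolvedType) IsResolvedType() bool { return t.Resolved }

// lookup is a hypothetical resolver that returns (nil, nil) when no
// concrete match exists.
func lookup(types map[string]*ResolvedType, name string) (*ResolvedType, error) {
	return types[name], nil
}

// resolveCastTarget checks for nil before calling any method on the result,
// turning a potential panic into a stable error.
func resolveCastTarget(types map[string]*ResolvedType, name string) (*ResolvedType, error) {
	t, err := lookup(types, name)
	if err != nil {
		return nil, err
	}
	if t == nil || !t.IsResolvedType() {
		return nil, fmt.Errorf("unable to resolve type `%s`", name)
	}
	return t, nil
}

func main() {
	types := map[string]*ResolvedType{"mood": {Name: "mood", Resolved: true}}
	_, err := resolveCastTarget(types, "missing")
	fmt.Println(err)
}
```

Because `||` short-circuits, the method is never invoked on a nil pointer.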
⚠️ Nil pointer panic in type metadata paths
  • What failed: The server can panic with nil-pointer dereferences instead of safely handling unresolved or missing related-type pointers during serialization and catalog metadata reads.
  • Impact: Core type workflows can crash during normal schema operations and metadata introspection. This can break type creation and downstream client behavior that relies on stable catalog metadata.
  • Steps to reproduce:
    1. Create a composite type and a domain that reference custom types.
    2. Execute DDL or DML paths that serialize type metadata.
    3. Query pg_catalog.pg_type fields such as typelem, typarray, and typbasetype.
  • Stub / mock context: Authentication checks were bypassed during this run so local SQL regression coverage could execute without SCRAM login friction while investigating pointer-wiring behavior.
  • Code analysis: I inspected the type serialization and pg_type row generation paths and found direct pointer-field dereferences (Elem, Array, BaseTypeType, and composite attribute Type) without nil guards. If any related pointer is unset, these paths panic while serializing or rendering type metadata.
  • Why this is likely a bug: These production paths dereference nested type pointers with no nil checks, so partially initialized type metadata can trigger deterministic runtime panics.

Relevant code:

server/types/serialization.go (lines 191-203)

writer.Id(t.Elem.ID.AsId())
writer.Id(t.Array.ID.AsId())
writer.Id(globalFunctionRegistry.GetInternalID(t.InputFunc).AsId())
writer.Id(globalFunctionRegistry.GetInternalID(t.OutputFunc).AsId())
writer.Id(globalFunctionRegistry.GetInternalID(t.ReceiveFunc).AsId())
writer.Id(globalFunctionRegistry.GetInternalID(t.SendFunc).AsId())
writer.Id(globalFunctionRegistry.GetInternalID(t.ModInFunc).AsId())
writer.Id(globalFunctionRegistry.GetInternalID(t.ModOutFunc).AsId())
writer.Id(globalFunctionRegistry.GetInternalID(t.AnalyzeFunc).AsId())
writer.String(string(t.Align))
writer.String(string(t.Storage))
writer.Bool(t.NotNull)
writer.Id(t.BaseTypeType.ID.AsId())

server/tables/pgcatalog/pg_type.go (lines 380-392)

nextType.typ.Elem.ID.AsId(),         // typelem
nextType.typ.Array.ID.AsId(),        // typarray
nextType.typ.InputFuncName(),        // typinput
nextType.typ.OutputFuncName(),       // typoutput
nextType.typ.ReceiveFuncName(),      // typreceive
nextType.typ.SendFuncName(),         // typsend
nextType.typ.ModInFuncName(),        // typmodin
nextType.typ.ModOutFuncName(),       // typmodout
nextType.typ.AnalyzeFuncName(),      // typanalyze
string(nextType.typ.Align),          // typalign
string(nextType.typ.Storage),        // typstorage
nextType.typ.NotNull,                // typnotnull
nextType.typ.BaseTypeType.ID.AsId(), // typbasetype
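
One way to make such reads tolerant of unset pointers is a small accessor that maps a nil related type to a null ID. This is only a sketch under the assumption that an empty ID is an acceptable placeholder for "no related type"; `ID`, `NullID`, `idOf`, and `relationColumns` are hypothetical names, and the struct is a reduced stand-in for the real type.

```go
package main

import "fmt"

// ID is a hypothetical stand-in for the real identifier type.
type ID string

// NullID is the assumed placeholder for "no related type".
const NullID ID = ""

// DoltgresType is reduced to the pointer fields named in the report.
type DoltgresType struct {
	ID           ID
	Elem         *DoltgresType
	Array        *DoltgresType
	BaseTypeType *DoltgresType
}

// idOf returns the related type's ID, or NullID when the pointer is unset.
func idOf(t *DoltgresType) ID {
	if t == nil {
		return NullID
	}
	return t.ID
}

// relationColumns builds the typelem, typarray, and typbasetype values
// without dereferencing nil pointers.
func relationColumns(t *DoltgresType) [3]ID {
	return [3]ID{idOf(t.Elem), idOf(t.Array), idOf(t.BaseTypeType)}
}

func main() {
	arr := &DoltgresType{ID: "_mood"}
	mood := &DoltgresType{ID: "mood", Array: arr}
	arr.Elem = mood
	fmt.Println(relationColumns(mood))
}
```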
⚠️ Nil guard missing in pg_type relationship projection
  • What failed: The metadata projection path dereferences pointer-derived fields directly; when any pointer is unresolved or nil, the query path can panic instead of returning stable catalog rows.
  • Impact: Metadata queries against pg_type can crash the server for affected type graphs. This breaks core introspection and can block clients that depend on catalog reads.
  • Steps to reproduce:
    1. Create enum, composite, domain, and array-related types in a test schema.
    2. Query pg_catalog.pg_type for those objects including typelem, typarray, and typbasetype columns.
    3. Reconnect and re-run metadata queries to validate that relationship IDs still round-trip correctly.
  • Stub / mock context: Authentication was intentionally disabled for the run so local SQL flows could execute without SCRAM login gates; the metadata query path itself was exercised against the real local server logic.
  • Code analysis: I inspected the catalog row builder and type model. DoltgresType stores Elem, Array, and BaseTypeType as pointers, but pgTypeToRow reads nested IDs without nil checks before building output rows.
  • Why this is likely a bug: Pointer fields are accessed unconditionally in production code, so unresolved or nil relationship pointers can trigger panics during normal catalog reads.

Relevant code:

server/tables/pgcatalog/pg_type.go (lines 379-392)

nextType.typ.SubscriptFuncName(),    // typsubscript
nextType.typ.Elem.ID.AsId(),         // typelem
nextType.typ.Array.ID.AsId(),        // typarray
nextType.typ.InputFuncName(),        // typinput
nextType.typ.OutputFuncName(),       // typoutput
nextType.typ.ReceiveFuncName(),      // typreceive
nextType.typ.SendFuncName(),         // typsend
nextType.typ.ModInFuncName(),        // typmodin
nextType.typ.ModOutFuncName(),       // typmodout
nextType.typ.AnalyzeFuncName(),      // typanalyze
string(nextType.typ.Align),          // typalign
string(nextType.typ.Storage),        // typstorage
nextType.typ.NotNull,                // typnotnull
nextType.typ.BaseTypeType.ID.AsId(), // typbasetype

✅ Passed (9)
| Category | Summary | Screenshot |
|---|---|---|
| Cast | Explicit enum cast target resolved correctly in SELECT and UPDATE flows. | CAST-1 |
| Cast | Shell and unresolved cast targets were rejected early with deterministic errors. | CAST-2 |
| Cast | Default-expression and direct explicit casts both resolved and returned matching values. | CAST-3 |
| Creation | Enum/composite/domain creation and companion array linkage worked, with successful insert/query checks and consistent pg_type relationships. | N/A |
| Initialization | Cold-start and second-session type-heavy queries succeeded with no nil-pointer signatures in logs. | INITIALIZATION-1 |
| Initialization | Parallel early and settled startup queries remained consistent with no initialization-order crash evidence. | INITIALIZATION-2 |
| Record | Verified array casting applies per-element conversion on built-in scalar cast path (int4[] -> int8[]), preserves NULLs ({1,NULL,2}), and surfaces element-conversion errors for invalid input. | RECORD-1 |
| Record | ANY with json operands failed immediately with operator resolution error ('operator does not exist: json = json'), while integer control query executed successfully and returned true. | RECORD-2 |
| Recursive | Deep nested domain/array/composite types resolved and round-tripped successfully with coherent pg_type metadata. | RECURSIVE-1 |

ℹ️ Additional Findings (2)

These findings are unrelated to the current changes but were observed during testing.

| Category | Summary | Screenshot |
|---|---|---|
| Creation | ⚠️ CREATE TYPE/DOMAIN writes base type before companion array and does not roll back on second-step failure, leaving conflicting orphaned metadata. | CREATION-3 |
| Metadata | 🟠 Domain merge appends checks to ourType instead of mergedType, so returned merged metadata can lose constraints. | METADATA-3 |

⚠️ Non-atomic companion array type creation
  • What failed: Base type insertion is committed before companion array insertion, and the failure path returns without rollback, leaving orphaned/conflicting metadata that blocks retries.
  • Impact: A failed type-creation attempt can leave persistent inconsistent metadata that breaks retries and destabilizes follow-up introspection. This affects a core schema-management workflow with no practical user workaround besides manual cleanup.
  • Steps to reproduce:
    1. Start CREATE TYPE or CREATE DOMAIN for a user-defined type that requires a companion array type.
    2. Force the companion array creation step to fail after base type insertion.
    3. Retry creation and inspect pg_type metadata and cast behavior for conflicts or orphaned entries.
  • Stub / mock context: Authentication checks were bypassed, and the run intentionally pre-created a conflicting companion-array type name to simulate a second-step create failure and verify rollback behavior.
  • Code analysis: I reviewed CREATE TYPE, CREATE DOMAIN, and type collection insertion logic. Both nodes insert the base type first, then attempt array creation; on second-step failure they return immediately, while TypeCollection.CreateType has already mutated cache state and does not provide rollback semantics.
  • Why this is likely a bug: The production create flow performs multi-step writes without compensating rollback, so an induced second-step error can leave persistent partial state that violates expected atomic type creation.

Relevant code:

server/node/create_type.go (lines 145-158)

err = collection.CreateType(ctx, newType)
if err != nil {
	return nil, err
}

// create array type for defined types
if newType.IsDefined {
	arrayType := types.CreateArrayTypeFromBaseType(newType)
	err = collection.CreateType(ctx, arrayType)
	if err != nil {
		return nil, err
	}
	newType.Array = arrayType
}

server/node/create_domain.go (lines 103-114)

newType := types.NewDomainType(ctx, c.AsType, defExpr, c.IsNotNull, checkDefs, arrayID, internalID)
err = collection.CreateType(ctx, newType)
if err != nil {
	return nil, err
}

// create array type of this type
arrayType := types.CreateArrayTypeFromBaseType(newType)
err = collection.CreateType(ctx, arrayType)
if err != nil {
	return nil, err
}

core/typecollection/typecollection.go (lines 66-77)

// Ensure that the type does not already exist in the cache or underlying map
if _, ok := pgs.accessedMap[typ.ID]; ok {
	return pgtypes.ErrTypeAlreadyExists.New(typ.Name())
}
if ok, err := pgs.underlyingMap.Has(ctx, string(typ.ID)); err != nil {
	return err
} else if ok {
	return pgtypes.ErrTypeAlreadyExists.New(typ.Name())
}
// Add it to our cache, which will be written when we do anything permanent
pgs.accessedMap[typ.ID] = typ
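
The rollback gap described in this finding can be sketched as a compensating delete when the second step fails. This is a simplified model rather than the project's code: `cache`, `DropType`, and `createWithArray` are hypothetical, and the real collection may need more than a map delete to undo an insert.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrTypeAlreadyExists is a hypothetical stand-in for the real error kind.
var ErrTypeAlreadyExists = errors.New("type already exists")

// cache models only the in-memory map that CreateType mutates.
type cache struct{ types map[string]bool }

func (c *cache) CreateType(name string) error {
	if c.types[name] {
		return fmt.Errorf("%w: %s", ErrTypeAlreadyExists, name)
	}
	c.types[name] = true
	return nil
}

// DropType is the hypothetical compensating action.
func (c *cache) DropType(name string) { delete(c.types, name) }

// createWithArray creates a base type and its companion array type
// ("_" + name), undoing the base insert if the array insert fails.
func createWithArray(c *cache, name string) error {
	if err := c.CreateType(name); err != nil {
		return err
	}
	if err := c.CreateType("_" + name); err != nil {
		c.DropType(name) // roll back step one so a retry starts clean
		return err
	}
	return nil
}

func main() {
	// Pre-create a conflicting array name, as the test run did.
	c := &cache{types: map[string]bool{"_mood": true}}
	err := createWithArray(c, "mood")
	fmt.Println(err, c.types["mood"])
}
```

With the rollback in place, the failed attempt leaves no `mood` entry behind, so a later retry is not blocked by an orphaned base type.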
🟠 Domain merge drops checks on returned merged type
  • What failed: Merge logic returns mergedType, but appends theirType.Checks into ourType.Checks; this can drop check constraints from the returned merged object and create metadata/runtime drift.
  • Impact: Merged domains can carry incomplete check metadata, causing inconsistent behavior between expected merged definitions and runtime constraint application. This creates hard-to-debug post-merge validation drift.
  • Steps to reproduce:
    1. Create diverging branches that modify domain default and check definitions in different ways.
    2. Merge those branches through the normal merge flow for the same domain identifier.
    3. Execute inserts and casts against the merged domain and compare runtime behavior to merged metadata expectations.
  • Stub / mock context: Authentication remained bypassed for deterministic local execution, and a temporary bypass of domain-cast constraint rewrite logic was active during this run to avoid a separate analyzer panic while validating merge behavior.
  • Code analysis: I reviewed merge assembly in core/typecollection/collection_funcs.go and verified that checks are appended to the wrong target object right before returning the merged type.
  • Why this is likely a bug: The function mutates ourType.Checks but returns mergedType, so merged results can omit newly merged checks in the object that is actually persisted.

Relevant code:

core/typecollection/collection_funcs.go (lines 61-90)

mergedType := *ourType
switch theirType.TypType {
case pgtypes.TypeType_Domain:
	if ourType.BaseTypeType.ID != theirType.BaseTypeType.ID {
		return nil, nil, errors.Errorf(`base types of domain type "%s" do not match`, theirType.ID.TypeName())
	}
	...
	if len(theirType.Checks) > 0 {
		// TODO: check for duplicate check constraints
		ourType.Checks = append(ourType.Checks, theirType.Checks...)
	}
	return TypeWrapper{Type: &mergedType}, &merge.MergeStats{
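
The reason the append is lost follows from Go's slice semantics: copying a struct copies the slice header (pointer, length, capacity), so growing the slice on the original afterwards does not change the length seen by the copy. The sketch below demonstrates this with a hypothetical `DomainType` reduced to a `Checks` slice of strings; `mergeBuggy` and `mergeFixed` are illustrative names, and the real merge function does more than this.

```go
package main

import "fmt"

// DomainType is a hypothetical, reduced stand-in for the real type struct.
type DomainType struct {
	Name   string
	Checks []string
}

// mergeBuggy mirrors the reported pattern: it copies ours, then appends to
// the original instead of the copy that is returned.
func mergeBuggy(ours, theirs *DomainType) *DomainType {
	merged := *ours
	ours.Checks = append(ours.Checks, theirs.Checks...)
	return &merged
}

// mergeFixed appends to the returned copy, using a fresh backing array so
// the caller's ours.Checks is left untouched.
func mergeFixed(ours, theirs *DomainType) *DomainType {
	merged := *ours
	merged.Checks = make([]string, 0, len(ours.Checks)+len(theirs.Checks))
	merged.Checks = append(merged.Checks, ours.Checks...)
	merged.Checks = append(merged.Checks, theirs.Checks...)
	return &merged
}

func main() {
	theirs := &DomainType{Name: "posint", Checks: []string{"value < 100"}}

	buggy := mergeBuggy(&DomainType{Name: "posint", Checks: []string{"value > 0"}}, theirs)
	fixed := mergeFixed(&DomainType{Name: "posint", Checks: []string{"value > 0"}}, theirs)

	fmt.Println(len(buggy.Checks), len(fixed.Checks)) // prints "1 2"
}
```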

Commit: b33c0af


@github-actions
Contributor

github-actions Bot commented May 29, 2026

| | Main | PR |
|---|---|---|
| Total | 42090 | 42090 |
| Successful | 18128 | 18095 |
| Failures | 23962 | 23995 |
| Partial Successes¹ | 5385 | 5375 |

| | Main | PR |
|---|---|---|
| Successful | 43.0696% | 42.9912% |
| Failures | 56.9304% | 57.0088% |

${\color{red}Regressions (28)}$

create_cast

QUERY:          CREATE FUNCTION int4_casttesttype(int4) RETURNS casttesttype LANGUAGE SQL AS
$$ SELECT ('foo'::text || $1::text)::casttesttype; $$;
RECEIVED ERROR: type "casttesttype" is only a shell (errno 1105) (sqlstate HY000)

create_type

QUERY:          CREATE FUNCTION get_default_test() RETURNS SETOF default_test_row AS '
  SELECT * FROM default_test;
' LANGUAGE SQL;
RECEIVED ERROR: type "default_test_row" does not exist (errno 1105) (sqlstate HY000)
QUERY:          CREATE DOMAIN myvarchardom AS myvarchar;
RECEIVED ERROR: type "myvarchar" is only a shell (errno 1105) (sqlstate HY000)
QUERY:          SELECT typinput, typoutput, typreceive, typsend, typmodin, typmodout,
       typanalyze, typsubscript, typstorage
FROM pg_type WHERE typname = '_myvarchardom';
RECEIVED ERROR: unable to resolve type `text_w_default` during deserialization (errno 1105) (sqlstate HY000)

domain

QUERY:          create table ddtest2(f1 ddtest1);
RECEIVED ERROR: type "ddtest1" does not exist (errno 1105) (sqlstate HY000)
QUERY:          create table ddtest2(f1 ddtest1[]);
RECEIVED ERROR: unable to resolve type `posint` during deserialization (errno 1105) (sqlstate HY000)
QUERY:          create domain ddtest1d as ddtest1;
RECEIVED ERROR: unable to resolve type `posint` during deserialization (errno 1105) (sqlstate HY000)
QUERY:          create table ddtest2(f1 ddtest1d);
RECEIVED ERROR: type "ddtest1d" does not exist (errno 1105) (sqlstate HY000)
QUERY:          drop domain ddtest1d;
RECEIVED ERROR: type "ddtest1d" does not exist (errno 1105) (sqlstate HY000)
QUERY:          create domain ddtest1d as ddtest1[];
RECEIVED ERROR: unable to resolve type `posint` during deserialization (errno 1105) (sqlstate HY000)
QUERY:          create table ddtest2(f1 ddtest1d);
RECEIVED ERROR: type "ddtest1d" does not exist (errno 1105) (sqlstate HY000)
QUERY:          drop domain ddtest1d;
RECEIVED ERROR: type "ddtest1d" does not exist (errno 1105) (sqlstate HY000)
QUERY:          drop type ddtest1;
RECEIVED ERROR: unable to resolve type `posint` during deserialization (errno 1105) (sqlstate HY000)

hash_func

QUERY:          DROP TYPE hash_test_t2;
RECEIVED ERROR: unable to resolve type `money` during deserialization (errno 1105) (sqlstate HY000)

json

QUERY:          DROP TYPE jsrec_i_not_null;
RECEIVED ERROR: unable to resolve type `js_int_not_null` during deserialization (errno 1105) (sqlstate HY000)

jsonb

QUERY:          DROP TYPE jsbrec_i_not_null;
RECEIVED ERROR: unable to resolve type `jsb_int_not_null` during deserialization (errno 1105) (sqlstate HY000)

multirangetypes

QUERY:          create domain restrictedmultirange as int4multirange check (upper(value) < 10);
RECEIVED ERROR: unable to resolve type `int4multirange` (errno 1105) (sqlstate HY000)
QUERY:          drop domain restrictedmultirange;
RECEIVED ERROR: type "restrictedmultirange" does not exist (errno 1105) (sqlstate HY000)

psql

QUERY:          SELECT n.nspname as "Schema",
       t.typname as "Name",
       pg_catalog.format_type(t.typbasetype, t.typtypmod) as "Type",
       (SELECT c.collname FROM pg_catalog.pg_collation c, pg_catalog.pg_type bt
        WHERE c.oid = t.typcollation AND bt.oid = t.typbasetype AND t.typcollation <> bt.typcollation) as "Collation",
       CASE WHEN t.typnotnull THEN 'not null' END as "Nullable",
       t.typdefault as "Default",
       pg_catalog.array_to_string(ARRAY(
         SELECT pg_catalog.pg_get_constraintdef(r.oid, true) FROM pg_catalog.pg_constraint r WHERE t.oid = r.contypid
       ), ' ') as "Check"
FROM pg_catalog.pg_type t
     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = t.typnamespace
WHERE t.typtype = 'd'
  AND t.typname OPERATOR(pg_catalog.~) E'^(no\\.such\\.domain)$' COLLATE pg_catalog.default
  AND pg_catalog.pg_type_is_visible(t.oid)
ORDER BY 1, 2;
RECEIVED ERROR: unable to resolve type `money` during deserialization (errno 1105) (sqlstate HY000)
QUERY:          SELECT n.nspname as "Schema",
  pg_catalog.format_type(t.oid, NULL) AS "Name",
  pg_catalog.obj_description(t.oid, 'pg_type') as "Description"
FROM pg_catalog.pg_type t
     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = t.typnamespace
WHERE (t.typrelid = 0 OR (SELECT c.relkind = 'c' FROM pg_catalog.pg_class c WHERE c.oid = t.typrelid))
  AND NOT EXISTS(SELECT 1 FROM pg_catalog.pg_type el WHERE el.oid = t.typelem AND el.typarray = t.oid)
  AND (t.typname OPERATOR(pg_catalog.~) E'^(no\\.such\\.data\\.type)$' COLLATE pg_catalog.default
        OR pg_catalog.format_type(t.oid, NULL) OPERATOR(pg_catalog.~) E'^(no\\.such\\.data\\.type)$' COLLATE pg_catalog.default)
  AND pg_catalog.pg_type_is_visible(t.oid)
ORDER BY 1, 2;
RECEIVED ERROR: unable to resolve type `money` during deserialization (errno 1105) (sqlstate HY000)
QUERY:          SELECT n.nspname as "Schema",
       t.typname as "Name",
       pg_catalog.format_type(t.typbasetype, t.typtypmod) as "Type",
       (SELECT c.collname FROM pg_catalog.pg_collation c, pg_catalog.pg_type bt
        WHERE c.oid = t.typcollation AND bt.oid = t.typbasetype AND t.typcollation <> bt.typcollation) as "Collation",
       CASE WHEN t.typnotnull THEN 'not null' END as "Nullable",
       t.typdefault as "Default",
       pg_catalog.array_to_string(ARRAY(
         SELECT pg_catalog.pg_get_constraintdef(r.oid, true) FROM pg_catalog.pg_constraint r WHERE t.oid = r.contypid
       ), ' ') as "Check"
FROM pg_catalog.pg_type t
     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = t.typnamespace
WHERE t.typtype = 'd'
  AND t.typname OPERATOR(pg_catalog.~) E'^(no\\.such\\.domain)$' COLLATE pg_catalog.default
  AND n.nspname OPERATOR(pg_catalog.~) E'^(no\\.such\\.schema)$' COLLATE pg_catalog.default
ORDER BY 1, 2;
RECEIVED ERROR: unable to resolve type `money` during deserialization (errno 1105) (sqlstate HY000)
QUERY:          SELECT n.nspname as "Schema",
  pg_catalog.format_type(t.oid, NULL) AS "Name",
  pg_catalog.obj_description(t.oid, 'pg_type') as "Description"
FROM pg_catalog.pg_type t
     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = t.typnamespace
WHERE (t.typrelid = 0 OR (SELECT c.relkind = 'c' FROM pg_catalog.pg_class c WHERE c.oid = t.typrelid))
  AND NOT EXISTS(SELECT 1 FROM pg_catalog.pg_type el WHERE el.oid = t.typelem AND el.typarray = t.oid)
  AND (t.typname OPERATOR(pg_catalog.~) E'^(no\\.such\\.data\\.type)$' COLLATE pg_catalog.default
        OR pg_catalog.format_type(t.oid, NULL) OPERATOR(pg_catalog.~) E'^(no\\.such\\.data\\.type)$' COLLATE pg_catalog.default)
  AND n.nspname OPERATOR(pg_catalog.~) E'^(no\\.such\\.schema)$' COLLATE pg_catalog.default
ORDER BY 1, 2;
RECEIVED ERROR: unable to resolve type `money` during deserialization (errno 1105) (sqlstate HY000)
QUERY:          SELECT n.nspname as "Schema",
       t.typname as "Name",
       pg_catalog.format_type(t.typbasetype, t.typtypmod) as "Type",
       (SELECT c.collname FROM pg_catalog.pg_collation c, pg_catalog.pg_type bt
        WHERE c.oid = t.typcollation AND bt.oid = t.typbasetype AND t.typcollation <> bt.typcollation) as "Collation",
       CASE WHEN t.typnotnull THEN 'not null' END as "Nullable",
       t.typdefault as "Default",
       pg_catalog.array_to_string(ARRAY(
         SELECT pg_catalog.pg_get_constraintdef(r.oid, true) FROM pg_catalog.pg_constraint r WHERE t.oid = r.contypid
       ), ' ') as "Check"
FROM pg_catalog.pg_type t
     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = t.typnamespace
WHERE t.typtype = 'd'
  AND t.typname OPERATOR(pg_catalog.~) E'^(no\\.such\\.domain)$' COLLATE pg_catalog.default
  AND n.nspname OPERATOR(pg_catalog.~) E'^(no\\.such\\.schema)$' COLLATE pg_catalog.default
ORDER BY 1, 2;
RECEIVED ERROR: unable to resolve type `money` during deserialization (errno 1105) (sqlstate HY000)
QUERY:          SELECT n.nspname as "Schema",
  pg_catalog.format_type(t.oid, NULL) AS "Name",
  pg_catalog.obj_description(t.oid, 'pg_type') as "Description"
FROM pg_catalog.pg_type t
     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = t.typnamespace
WHERE (t.typrelid = 0 OR (SELECT c.relkind = 'c' FROM pg_catalog.pg_class c WHERE c.oid = t.typrelid))
  AND NOT EXISTS(SELECT 1 FROM pg_catalog.pg_type el WHERE el.oid = t.typelem AND el.typarray = t.oid)
  AND (t.typname OPERATOR(pg_catalog.~) E'^(no\\.such\\.data\\.type)$' COLLATE pg_catalog.default
        OR pg_catalog.format_type(t.oid, NULL) OPERATOR(pg_catalog.~) E'^(no\\.such\\.data\\.type)$' COLLATE pg_catalog.default)
  AND n.nspname OPERATOR(pg_catalog.~) E'^(no\\.such\\.schema)$' COLLATE pg_catalog.default
ORDER BY 1, 2;
RECEIVED ERROR: unable to resolve type `money` during deserialization (errno 1105) (sqlstate HY000)

rangetypes

QUERY:          create domain restrictedrange as int4range check (upper(value) < 10);
RECEIVED ERROR: unable to resolve type `int4range` (errno 1105) (sqlstate HY000)
QUERY:          drop domain restrictedrange;
RECEIVED ERROR: type "restrictedrange" does not exist (errno 1105) (sqlstate HY000)

rowtypes

QUERY:          create temp table quadtable(f1 int, q quad);
RECEIVED ERROR: type "quad" does not exist (errno 1105) (sqlstate HY000)

union

QUERY:          drop type ct1;
RECEIVED ERROR: unable to resolve type `money` during deserialization (errno 1105) (sqlstate HY000)

${\color{lightgreen}Progressions (7)}$

create_view

QUERY: SELECT * FROM unspecified_types;

domain

QUERY: select makedcomp(1,2);
QUERY: insert into dtest values('x123');
QUERY: insert into ddtest2 values(2);

enum

QUERY: SELECT 'red' = ANY ('{red,green,blue}'::rainbow[]);
QUERY: SELECT 'yellow' = ANY ('{red,green,blue}'::rainbow[]);

rowtypes

QUERY: SELECT (NULL::compositetable).a;

Footnotes

  1. These are tests that we're marking as Successful even though they do not match the expected output in some way. This is due to small differences, such as different wording in the error messages, or incorrect column names where the data itself is correct.

Member

@zachmu zachmu left a comment


Can't find too much to fault. The AI tester's comments seem relevant in light of the regressions.

Comment thread core/typecollection/typecollection.go Outdated
Comment thread server/types/serialization.go Outdated
@Hydrocharged Hydrocharged force-pushed the daylon/full-type-resolution branch 2 times, most recently from 88afb3d to cd4e68e Compare June 2, 2026 12:15
@itoqa

itoqa Bot commented Jun 2, 2026

Ito Test Report ❌

History reset (rebase or force-push detected). Starting test narrative over.

16 test cases ran. 3 failed, 13 passed.

Across the unified verification, 16 test cases were exercised: 13 passed and 3 failed (plus one execution that yielded no verifiable functional outcome), indicating broad stability but not release readiness. Most cast, operator, metadata, and deserialization-guard paths behaved correctly, including cyclic type-graph handling, trailing-byte rejection, composite attribute hydration after restart, explicit cast/domain resolution, assignment-cast and synthesized array-cast behavior, expected ANY/missing-type errors, and consistent pg_type/OID exposure. However, three high-severity regressions introduced by this PR remain: restart-time crashes when persisted enum types or migrated user-defined types are deserialized through recursive, context-dependent type loading with a nil SQL context, and a domain-backed composite literal path that succeeds at DDL creation but fails later with unresolved attribute-type deserialization.

❌ Failed (3)
| Category | Summary | Screenshot |
|---|---|---|
| Deserialization | ⚠️ Restart after enum creation crashes during type deserialization. | DESERIALIZATION-3 |
| Deserialization | ⚠️ Server restart crashes when reading persisted user-defined types. | DESERIALIZATION-5 |
| Operator | ⚠️ Composite literal deserialization fails after successful domain-backed composite DDL. | OPERATOR-4 |

⚠️ Server restart panics while loading persisted enum types
  • What failed: The server crashes during restart instead of reloading enum metadata, so enum casts and enum_range are unavailable after restart.
  • Impact: Environments with persisted user-defined enum types can fail restart and become unavailable until data/state is repaired. This breaks a core reliability workflow for upgrades and restarts.
  • Steps to reproduce:
    1. Create enum mood and persist it.
    2. Restart the server to force type deserialization.
    3. Run SELECT 'happy'::mood; or SELECT enum_range(NULL::mood);
  • Stub / mock context: The run used a temporary startup bypass so local authentication/bootstrap checks and domain-cast constraint rewriting were relaxed while validating type deserialization behavior.
  • Code analysis: I traced the deserialization path in server/types/serialization.go and context access in core/context.go; unresolved related types are recursively resolved using GetTypesCollectionFromContext(ctx) without guarding against nil SQL context during restart-time object loading.
  • Why this is likely a bug: The restart flow can call deserialization with a nil SQL context, and the code dereferences context-dependent loaders in that path, creating a deterministic crash in production code rather than a test harness artifact.

Relevant code:

server/types/serialization.go (lines 154-169)

typ.Elem, typeColl, err = recursiveDeserializeType(ctx, typ, typeColl, typ.Elem)
if err != nil {
    return nil, err
}
typ.Array, typeColl, err = recursiveDeserializeType(ctx, typ, typeColl, typ.Array)
if err != nil {
    return nil, err
}
typ.BaseTypeType, typeColl, err = recursiveDeserializeType(ctx, typ, typeColl, typ.BaseTypeType)
if err != nil {
    return nil, err
}

server/types/serialization.go (lines 260-269)

func recursiveDeserializeType(ctx *sql.Context, typ *DoltgresType, typeColl TypeCollection, target *DoltgresType) (*DoltgresType, TypeCollection, error) {
    if !target.IsUnresolved {
        return target, typeColl, nil
    }
    var recursedType *DoltgresType
    var err error
    if typeColl == nil {
        typeColl, err = GetTypesCollectionFromContext(ctx)
        if err != nil {

core/context.go (lines 383-390)

func GetTypesCollectionFromContext(ctx *sql.Context) (*typecollection.TypeCollection, error) {
    cv, err := getContextValues(ctx)
    if err != nil {
        return nil, err
    }
    if cv.types == nil {
        _, root, err := GetRootFromContext(ctx)
⚠️ Persisted type metadata crashes restart after pointer migration
  • What failed: Restart crashes before the server becomes healthy, so persisted type metadata cannot be read and legacy/migrated objects are inaccessible.
  • Impact: Databases with persisted user-defined types may fail to come back up after restart, blocking normal operations. This creates high operational risk for deployments carrying existing type metadata.
  • Steps to reproduce:
    1. Persist user-defined composite/domain/enum types.
    2. Restart the server so type metadata is deserialized.
    3. Attempt to query or cast values that require those persisted types.
  • Stub / mock context: The run used a temporary startup bypass so local authentication/bootstrap checks and domain-cast constraint rewriting were relaxed while validating type deserialization behavior.
  • Code analysis: I reviewed the same recursive deserialization path and verified that persisted related-type pointers are resolved through context-dependent collection loading; when restart initialization provides nil context, recursiveDeserializeType fails the same way and prevents type graph hydration.
  • Why this is likely a bug: The failure is caused by production deserialization logic that requires context-backed type collection access during restart, which breaks persisted-type hydration and causes a repeatable crash.
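
For readers unfamiliar with the pattern, hydrating a graph in which two types refer to one another (such as a base type and its array type) usually relies on registering a type before its relations are resolved. The sketch below is an assumption-laden illustration of that idea, not the code behind `WithCachedType`: `Type`, `Loader`, `NewLoader`, and `Load` are hypothetical names, and each type has a single related type for brevity.

```go
package main

import "fmt"

// Type is a hypothetical, simplified type record. RelatedID is the stored
// reference; Related is the pointer filled in while loading.
type Type struct {
	ID        string
	RelatedID string
	Related   *Type
}

// Loader turns ID-only records into a pointer-linked graph.
type Loader struct {
	stored map[string]Type  // serialized form, relations kept as IDs
	loaded map[string]*Type // finished types and types still being loaded
}

// NewLoader builds a loader over the given stored records.
func NewLoader(stored map[string]Type) *Loader {
	return &Loader{stored: stored, loaded: make(map[string]*Type)}
}

// Load resolves a type and, recursively, its related type.
func (l *Loader) Load(id string) (*Type, error) {
	if t, ok := l.loaded[id]; ok {
		// May be a type that is still being loaded; returning it here is
		// what stops a cyclic reference from recursing forever.
		return t, nil
	}
	rec, ok := l.stored[id]
	if !ok {
		return nil, fmt.Errorf("unable to resolve type %q", id)
	}
	t := &Type{ID: rec.ID, RelatedID: rec.RelatedID}
	l.loaded[id] = t // register before recursing into relations
	if rec.RelatedID != "" {
		related, err := l.Load(rec.RelatedID)
		if err != nil {
			delete(l.loaded, id) // do not keep a half-built type around
			return nil, err
		}
		t.Related = related
	}
	return t, nil
}

func main() {
	l := NewLoader(map[string]Type{
		"mood":  {ID: "mood", RelatedID: "_mood"},
		"_mood": {ID: "_mood", RelatedID: "mood"},
	})
	mood, err := l.Load("mood")
	if err != nil {
		fmt.Println(err)
		return
	}
	// The array type points back at the very same base type pointer.
	fmt.Println(mood.Related.ID, mood.Related.Related == mood)
}
```

Note that this loader needs no context at all because its storage is in memory; the reported crash concerns the case where the storage lookup itself depends on a context that is nil.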

Relevant code:

server/types/serialization.go (lines 154-169)

typ.Elem, typeColl, err = recursiveDeserializeType(ctx, typ, typeColl, typ.Elem)
if err != nil {
    return nil, err
}
typ.Array, typeColl, err = recursiveDeserializeType(ctx, typ, typeColl, typ.Array)
if err != nil {
    return nil, err
}
typ.BaseTypeType, typeColl, err = recursiveDeserializeType(ctx, typ, typeColl, typ.BaseTypeType)
if err != nil {
    return nil, err
}

server/types/serialization.go (lines 266-274)

if typeColl == nil {
    typeColl, err = GetTypesCollectionFromContext(ctx)
    if err != nil {
        return nil, nil, err
    }
}
typeColl.WithCachedType(typ, func() {
    t, nErr := typeColl.GetType(ctx, target.ID)

core/context.go (lines 383-390)

func GetTypesCollectionFromContext(ctx *sql.Context) (*typecollection.TypeCollection, error) {
    cv, err := getContextValues(ctx)
    if err != nil {
        return nil, err
    }
    if cv.types == nil {
        _, root, err := GetRootFromContext(ctx)
⚠️ Composite literal fails after domain-backed type creation
  • What failed: DDL succeeds, but later composite literal I/O fails with the error unable to resolve type 'pos_i' during deserialization instead of returning the parsed composite fields.
  • Impact: Users can successfully define domain-backed composite types but cannot reliably read/write values through composite literals. This breaks a core SQL type workflow with no practical workaround once such schemas are created.
  • Steps to reproduce:
    1. Create DOMAIN pos_i AS int4 CHECK (VALUE > 0).
    2. Create TYPE comp_t AS (a pos_i, b text).
    3. Run SELECT ('(1,hello)'::comp_t).a, ('(1,hello)'::comp_t).b and observe the error.
  • Stub / mock context: This check ran through a temporary in-process SQL integration harness instead of the normal external startup path because service boot was unstable in this run. Local auth and domain-constraint bypass patches under server/auth/auth_handler.go and server/analyzer/domain_constraints.go were active to keep setup deterministic, while the SQL assertion itself used real engine behavior.
  • Code analysis: CREATE TYPE stores composite attribute types without resolving unresolved user-defined references, while deserialization later requires those attribute types to be fully resolvable; this mismatch causes runtime failure after successful DDL.
  • Why this is likely a bug: The code path accepts unresolved attribute types at composite type creation but later hard-fails when those same attributes must resolve during deserialization, creating an internally inconsistent and user-visible failure mode.
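
The mismatch described above can be illustrated with a small sketch that resolves attribute types when the composite type is created, so that an unknown type is rejected at DDL time rather than at read time. This is only one possible approach under stated assumptions: `Type`, `Attribute`, and `resolveAttributes` are hypothetical, and the catalog is modeled as a map keyed by type name.

```go
package main

import "fmt"

// Type is a hypothetical, simplified type; Unresolved marks a placeholder
// that only carries a name.
type Type struct {
	Name       string
	Unresolved bool
}

// Attribute is a hypothetical composite attribute.
type Attribute struct {
	Name string
	Type *Type
}

// resolveAttributes returns a copy of attrs in which every unresolved
// placeholder is replaced by its catalog entry. It fails if an attribute
// type is missing or unknown, so the error surfaces at creation time.
func resolveAttributes(catalog map[string]*Type, attrs []Attribute) ([]Attribute, error) {
	out := make([]Attribute, len(attrs))
	for i, a := range attrs {
		if a.Type == nil {
			return nil, fmt.Errorf("attribute %q has no type", a.Name)
		}
		if a.Type.Unresolved {
			resolved, ok := catalog[a.Type.Name]
			if !ok {
				return nil, fmt.Errorf("type %q does not exist", a.Type.Name)
			}
			a.Type = resolved
		}
		out[i] = a
	}
	return out, nil
}

func main() {
	catalog := map[string]*Type{
		"pos_i": {Name: "pos_i"},
		"text":  {Name: "text"},
	}
	attrs := []Attribute{
		{Name: "a", Type: &Type{Name: "pos_i", Unresolved: true}},
		{Name: "b", Type: catalog["text"]},
	}
	resolved, err := resolveAttributes(catalog, attrs)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(resolved[0].Type.Unresolved, resolved[0].Type == catalog["pos_i"])
}
```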

Relevant code:

server/ast/resolvable_type_reference.go (lines 52-56)

case *tree.UnresolvedObjectName:
	tn := columnType.ToTableName()
	columnTypeName = tn.Object()
	doltgresType = pgtypes.NewUnresolvedDoltgresType(tn.Schema(), columnTypeName)

server/node/create_type.go (lines 136-140)

attrs := make([]types.CompositeAttribute, len(c.AsTypes))
for i, a := range c.AsTypes {
	attrs[i] = types.NewCompositeAttribute(ctx, relID, a.AttrName, a.Typ, int16(i+1), a.Collation)
}
newType = types.NewCompositeType(ctx, relID, &types.DoltgresType{ID: arrayID, IsUnresolved: true}, typeID, attrs)

server/types/serialization.go (lines 166-170)

for i := range typ.CompositeAttrs {
	typ.CompositeAttrs[i].Type, typeColl, err = recursiveDeserializeType(ctx, typ, typeColl, typ.CompositeAttrs[i].Type)
	if err != nil {
		return nil, err
	}
}
✅ Passed (13)
| Category | Summary | Screenshot |
| --- | --- | --- |
| Deserialization | Cyclic type graph checks stayed deterministic without recursion loops or hangs. | DESERIALIZATION-1 |
| Deserialization | Trailing-byte payloads are rejected by deserialization guards as expected. | DESERIALIZATION-2 |
| Deserialization | Composite type casts and attribute reads succeeded after restart. | DESERIALIZATION-4 |
| Metadata | Domain column metadata correctly exposed the base type OID instead of the domain OID. | METADATA-1 |
| Metadata | Re-run confirmed stable pg_type linked IDs without reproducing the earlier deserialization error. | METADATA-2 |
| Metadata | Code-level safeguards prevent malformed pointer graphs from emitting spoofed catalog or wire OIDs. | METADATA-3 |
| Operator | IN tuple with unknown params compiles via assignment-cast fallback and returns true. | OPERATOR-1 |
| Operator | Array-to-array cast succeeds by using synthesized base-type cast. | OPERATOR-2 |
| Operator | Incompatible ANY comparison correctly returns operator resolution error. | OPERATOR-3 |
| Resolution | Explicit cast to a newly created enum resolved and returned the expected value new. | RESOLUTION-1 |
| Resolution | Domain creation over an enum base type and value round-trip succeeded. | N/A |
| Resolution | Casting to a missing type returned the expected unresolved-type error at bind/resolution time. | N/A |
| Resolution | Alternate cast execution paths returned consistent new results and did not reproduce instability. | RESOLUTION-4 |

Commit: cd4e68e


@Hydrocharged
Collaborator Author

These regressions expose missing logic that did not cause errors before. They cause errors now because we effectively enforce that our created types are actually correct.

@Hydrocharged Hydrocharged force-pushed the daylon/full-type-resolution branch from cd4e68e to c08a1d7 Compare June 3, 2026 07:07
@Hydrocharged Hydrocharged enabled auto-merge June 3, 2026 07:07
@Hydrocharged Hydrocharged merged commit f48dfa2 into main Jun 3, 2026
22 of 23 checks passed
@Hydrocharged Hydrocharged deleted the daylon/full-type-resolution branch June 3, 2026 08:06
@itoqa

itoqa Bot commented Jun 3, 2026

Ito Test Report ❌

History reset (rebase or force-push detected). Starting test narrative over.

17 test cases ran. 2 failed, 2 additional findings, 13 passed.

Overall, the unified run failed: 13 of 17 tests passed and 4 produced high-severity failures. The run was broadly stable in recursive unresolved-type handling, concurrent lookup determinism, array operator error safety, polymorphic anyarray behavior, and pg_type/pgwire metadata integrity across reconnects. The critical defects were: reproducible panics in restart-time and post-restart domain/composite type paths (a nil SQL-context deserialization crash and a cast panic, both marked as introduced by this PR); a recovered insert-time domain-constraint analyzer panic (“table not found”); and non-atomic CREATE DOMAIN behavior that can leave orphaned type state after companion array creation fails.

❌ Failed (2)
| Category | Summary | Screenshot |
| --- | --- | --- |
| Deserialization | ⚠️ Post-restart casts on persisted domain/composite types triggered a server panic. | DESERIALIZATION-1 |
| Migration | ⚠️ Restart after persisting user-defined types can panic during recursive type deserialization when context is not a SQL context. | MIGRATION-4 |
⚠️ Domain cast can panic after restart with persisted user-defined types
  • What failed: After restart, casts that should have succeeded panicked server-side instead of returning typed values.
  • Impact: Persisted user-defined type workflows can fail during normal query execution after restart, making a core schema feature unreliable. Teams using domain/composite arrays may hit runtime panics instead of actionable SQL errors.
  • Steps to reproduce:
    1. Create enum, composite, and domain types plus a table that uses arrays of those types.
    2. Restart the server and reconnect to the database.
    3. Execute casts such as ROW('eve','ok')::person_t and inserts using mood_dom[] values.
    4. Observe the server panic instead of successful typed results.
  • Stub / mock context: Authentication and bootstrap behavior were temporarily bypassed in the application startup/auth paths so the local environment could run deterministically, and domain-cast constraint enforcement was made environment-gated during this run.
  • Code analysis: The cast path always unwraps domain targets via DomainUnderlyingBaseType, but that helper assumes domain base-type pointers are fully resolved. In this PR's pointer-based type model, unresolved placeholders are created during deserialization; if resolution is incomplete on a runtime path, cast evaluation can dereference invalid domain base-type state and panic.
  • Why this is likely a bug: Production cast execution depends on domain base-type resolution, but the current pointer-based path can feed unresolved domain references into unguarded recursive dereferencing, which matches the observed panic.
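
The unguarded recursion described above can be contrasted with a defensive walk over the domain chain. The sketch below is a hypothetical illustration rather than a proposed patch: `Type`, `Kind`, and `underlyingBaseType` are invented names, and unlike the quoted helper it accepts non-domain types and returns them unchanged. It returns an error when the chain reaches a nil pointer, an unresolved placeholder, or a cycle.

```go
package main

import (
	"errors"
	"fmt"
)

// Kind is a hypothetical type category.
type Kind int

const (
	KindBase Kind = iota
	KindDomain
)

// Type is a hypothetical, simplified type with an optional base type.
type Type struct {
	Name       string
	Kind       Kind
	Unresolved bool
	Base       *Type
}

// underlyingBaseType walks a domain chain iteratively and reports an error
// for nil pointers, unresolved placeholders, and cycles instead of panicking.
func underlyingBaseType(t *Type) (*Type, error) {
	seen := make(map[*Type]bool)
	cur := t
	for {
		switch {
		case cur == nil:
			return nil, errors.New("domain chain ends in a nil base type")
		case cur.Unresolved:
			return nil, fmt.Errorf("type %q is unresolved", cur.Name)
		case cur.Kind != KindDomain:
			return cur, nil
		case seen[cur]:
			return nil, fmt.Errorf("cyclic domain chain at %q", cur.Name)
		}
		seen[cur] = true
		cur = cur.Base
	}
}

func main() {
	int4 := &Type{Name: "int4", Kind: KindBase}
	posI := &Type{Name: "pos_i", Kind: KindDomain, Base: int4}
	nested := &Type{Name: "small_pos_i", Kind: KindDomain, Base: posI}

	base, err := underlyingBaseType(nested)
	fmt.Println(base.Name, err)

	// A domain whose base is still a placeholder yields an error.
	broken := &Type{Name: "mood_dom", Kind: KindDomain, Base: &Type{Name: "mood", Unresolved: true}}
	_, err = underlyingBaseType(broken)
	fmt.Println(err)
}
```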

Relevant code:

server/expression/assignment_cast.go (lines 101-105)

func checkForDomainType(t *pgtypes.DoltgresType) *pgtypes.DoltgresType {
	if t.TypType == pgtypes.TypeType_Domain {
		t = t.DomainUnderlyingBaseType()
	}
	return t
}

server/types/type.go (lines 535-539)

func (t *DoltgresType) DomainUnderlyingBaseType() *DoltgresType {
	if t.BaseTypeType.TypType == TypeType_Domain {
		return t.BaseTypeType.DomainUnderlyingBaseType()
	} else {
		return t.BaseTypeType
	}
}

server/types/serialization.go (lines 247-255)

func prefillTypeDuringDeserialization(target id.Type) *DoltgresType {
	if target == id.NullType {
		return internalNullType
	}
	if builtin, ok := IDToBuiltInDoltgresType[target]; ok {
		return builtin
	}
	return NewUnresolvedDoltgresTypeFromID(target)
}
⚠️ Startup type deserialization panics without SQL context
  • What failed: Server startup crashed while resolving recursive type pointers because deserialization proceeded with a nil SQL context and then requested a types collection from that nil context.
  • Impact: Restarting an instance with persisted user-defined type metadata can crash the server and leave SQL unavailable. This blocks normal recovery and upgrade workflows for affected repositories.
  • Steps to reproduce:
    1. Create and persist enum/composite/domain type graph data.
    2. Restart the server so metadata is loaded from storage.
    3. Trigger type loading during startup/reconnect.
    4. Observe panic in recursive deserialization when SQL context is nil.
  • Stub / mock context: Local SCRAM authentication was bypassed (EnableAuthentication = false in server/authentication_scram.go) and restart checks were run against persisted local type metadata; no network/API stubs were used.
  • Code analysis: I followed the deserialization call chain from root-object loading into recursive type resolution; the root-object path accepts non-SQL context and passes nil into deserialization, while recursive resolution unconditionally calls context-dependent type collection access.
  • Why this is likely a bug: The current restart deserialization path can propagate a nil SQL context into mandatory context-dependent resolution, creating a deterministic panic path during startup.
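
The mechanism by which a nil SQL context appears can be shown with standard Go alone: a two-value type assertion whose second result is discarded yields a nil pointer when the dynamic type does not match. In the sketch below, `SQLContext`, `asSQLContext`, and `ErrNotSQLContext` are hypothetical stand-ins for the real types; only the `context` package behavior is taken as given.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// SQLContext is a hypothetical stand-in for *sql.Context. It embeds
// context.Context so that a pointer to it satisfies that interface.
type SQLContext struct {
	context.Context
	Session string
}

// ErrNotSQLContext is a hypothetical error for a mismatched context type.
var ErrNotSQLContext = errors.New("context is not a SQL context")

// asSQLContext inspects the assertion result instead of discarding it, and
// also rejects a typed nil pointer.
func asSQLContext(ctx context.Context) (*SQLContext, error) {
	sqlCtx, ok := ctx.(*SQLContext)
	if !ok || sqlCtx == nil {
		return nil, ErrNotSQLContext
	}
	return sqlCtx, nil
}

func main() {
	// Discarding the ok result silently produces a nil pointer.
	dropped, _ := context.Background().(*SQLContext)
	fmt.Println(dropped == nil)

	// Checking it turns the mismatch into an ordinary error.
	_, err := asSQLContext(context.Background())
	fmt.Println(err)

	ctx := &SQLContext{Context: context.Background(), Session: "s1"}
	got, _ := asSQLContext(ctx)
	fmt.Println(got.Session)
}
```

Any later field access on the nil pointer from the first assertion, such as reading a session from it, panics at run time, which matches the crash path the report describes.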

Relevant code:

core/typecollection/root_object.go (lines 32-35)

func (pgs *TypeCollection) DeserializeRootObject(ctx context.Context, data []byte) (objinterface.RootObject, error) {
    sqlCtx, _ := ctx.(*sql.Context)
    t, err := pgtypes.DeserializeType(sqlCtx, data)
    if err != nil {

server/types/serialization.go (lines 266-269)

if typeColl == nil {
    typeColl, err = GetTypesCollectionFromContext(ctx)
    if err != nil {
        return nil, nil, err

core/context.go (lines 54-56)

func getContextValues(ctx *sql.Context) (*contextValues, error) {
    sess := dsess.DSessFromSess(ctx.Session)
    if sess.DoltgresSessObj == nil {
✅ Passed (13)
| Category | Summary | Screenshot |
| --- | --- | --- |
| Array | Repeated adversarial ANY/IN probes produced consistent invalid-input failures, while valid control queries still succeeded and service health remained stable. | ARRAY-1 |
| Array | Unsupported ANY comparisons consistently returned operator-binding errors, while supported controls returned true and post-check health queries succeeded. | ARRAY-3 |
| Cast | Domain creation, defaults/checks, and catalog entries behaved as expected. | CAST-1 |
| Cast | Missing and shell base types were rejected cleanly; valid control domain succeeded. | CAST-2 |
| Deserialization | Unresolved type request failed fast and the next query succeeded. | DESERIALIZATION-3 |
| Deserialization | Concurrent resolved/unresolved lookups stayed deterministic across 12 workers. | DESERIALIZATION-4 |
| Deserialization | Repeated unresolved lookups remained deterministic and did not poison nearby types. | DESERIALIZATION-5 |
| Metadata | Domain column wire metadata matched the domain base type OID. | METADATA-1 |
| Metadata | pg_type linkage fields (typelem, typarray, typbasetype) were consistent and stable across reconnect. | METADATA-2 |
| Metadata | Code-first re-check showed SQL PREPARE text is unsupported by design, while protocol prepared statements and wire/catalog OID mapping were consistent. | METADATA-3 |
| Metadata | Pointer-linked type graph invariants and decode behavior were coherent after reconnect in remediation validation. | METADATA-4 |
| Migration | Verified reciprocal typarray / typelem linkage for a user-defined enum and its array type, including persistence after reconnect. | MIGRATION-1 |
| Migration | Verified array_position and array_to_string resolved element semantics correctly for user-defined and built-in arrays. | MIGRATION-3 |
ℹ️ Additional Findings (2)

These findings are unrelated to the current changes but were observed during testing.

| Category | Summary | Screenshot |
| --- | --- | --- |
| Cast | ⚠️ CREATE DOMAIN can persist a domain even when companion array creation fails, leaving orphaned state. | CAST-3 |
| Migration | ⚠️ Insert-time domain constraint evaluation can trigger a recovered server panic (table not found) instead of stable constraint handling. | MIGRATION-2 |
⚠️ CREATE DOMAIN leaves orphaned type state after companion array failure
  • What failed: The domain type is created and cached first, then array creation fails; retry reports the domain already exists and the type graph remains partially written instead of rolling back atomically.
  • Impact: A failed CREATE DOMAIN can leave non-retryable catalog state that breaks subsequent domain creation attempts and related array-domain operations. This disrupts a core DDL workflow with no practical workaround besides manual cleanup.
  • Steps to reproduce:
    1. Trigger CREATE DOMAIN while forcing companion array type creation to fail, such as by pre-creating a conflicting _domain_name type.
    2. Retry the same CREATE DOMAIN statement.
    3. Run catalog checks and array-domain casts to verify orphaned state and retry failure behavior.
  • Stub / mock context: Authentication bypass remained enabled for local execution, and this test intentionally injected a companion-array naming conflict to force second-step CREATE DOMAIN failure and verify rollback behavior.
  • Code analysis: I inspected server/node/create_domain.go and core/typecollection/typecollection.go. CREATE DOMAIN performs two sequential CreateType calls without rollback, and CreateType immediately stores the first type in cache; cache flush paths persist cached types later, allowing partial state to survive when the second step fails.
  • Why this is likely a bug: The production code has a clear non-atomic two-step write path with no compensation on second-step failure, which directly explains the observed orphaned/retry-failure behavior.
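
The missing compensation step can be sketched as a two-step create that removes the first type when the second step fails. This is a simplified illustration under stated assumptions, not the Doltgres collection API: `Collection`, `Create`, `Drop`, `Has`, and `createDomainWithArray` are hypothetical, and the cache is a plain in-memory set.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrExists is a hypothetical error for a duplicate type.
var ErrExists = errors.New("type already exists")

// Collection is a hypothetical in-memory type cache.
type Collection struct {
	types map[string]bool
}

func NewCollection() *Collection {
	return &Collection{types: make(map[string]bool)}
}

// Create adds a type and fails if it is already present.
func (c *Collection) Create(id string) error {
	if c.types[id] {
		return fmt.Errorf("%w: %s", ErrExists, id)
	}
	c.types[id] = true
	return nil
}

// Drop removes a type if present.
func (c *Collection) Drop(id string) { delete(c.types, id) }

// Has reports whether a type is present.
func (c *Collection) Has(id string) bool { return c.types[id] }

// createDomainWithArray creates a domain and its companion array type, and
// undoes the first step when the second one fails.
func createDomainWithArray(c *Collection, name string) error {
	if err := c.Create(name); err != nil {
		return err
	}
	if err := c.Create("_" + name); err != nil {
		c.Drop(name) // compensate so a retry starts from a clean state
		return err
	}
	return nil
}

func main() {
	c := NewCollection()
	_ = c.Create("_pos_i") // conflicting companion array name

	err := createDomainWithArray(c, "pos_i")
	fmt.Println(err, c.Has("pos_i"))

	// After the conflict is removed, the same statement can be retried.
	c.Drop("_pos_i")
	fmt.Println(createDomainWithArray(c, "pos_i"), c.Has("pos_i"), c.Has("_pos_i"))
}
```

An alternative with the same effect is to validate both names before writing either one; which approach suits the real collection depends on how its cache is flushed.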

Relevant code:

server/node/create_domain.go (lines 103-114)

newType := pgtypes.NewDomainType(ctx, c.AsType, defExpr, c.IsNotNull, checkDefs, arrayID, internalID)
err = collection.CreateType(ctx, newType)
if err != nil {
    return nil, err
}

// create array type of this type
arrayType := pgtypes.CreateArrayTypeFromBaseType(newType)
err = collection.CreateType(ctx, arrayType)
if err != nil {
    return nil, err
}

core/typecollection/typecollection.go (lines 61-78)

func (pgs *TypeCollection) CreateType(ctx context.Context, typ *pgtypes.DoltgresType) error {
    // First we check the built-in types
    if _, ok := pgtypes.IDToBuiltInDoltgresType[typ.ID]; ok {
        return pgtypes.ErrTypeAlreadyExists.New(typ.Name())
    }
    // Ensure that the type does not already exist in the cache or underlying map
    if _, ok := pgs.accessedMap[typ.ID]; ok {
        return pgtypes.ErrTypeAlreadyExists.New(typ.Name())
    }
    // Add it to our cache, which will be written when we do anything permanent
    pgs.accessedMap[typ.ID] = typ
    return nil
}

core/typecollection/typecollection.go (lines 505-529)

func (pgs *TypeCollection) writeCache(ctx context.Context) (err error) {
    if len(pgs.accessedMap) == 0 {
        return nil
    }
    mapEditor := pgs.underlyingMap.Editor()
    for _, t := range pgs.accessedMap {
        data := t.Serialize()
        h, err := pgs.ns.WriteBytes(ctx, data)
        if err != nil {
            return err
        }
        if err = mapEditor.Update(ctx, string(t.ID), h); err != nil {
            return err
        }
    }
    pgs.underlyingMap = flushed
    clear(pgs.accessedMap)
    return nil
}
⚠️ Domain constraint analysis panics during insert
  • What failed: Domain-check evaluation hit a recovered panic (table not found: t) instead of consistently enforcing domain checks without panicking.
  • Impact: Domain-typed write paths can fail unpredictably and emit panic traces in normal SQL usage. This breaks a core data-write workflow with no practical workaround for affected schemas.
  • Steps to reproduce:
    1. Start the server with a clean data directory.
    2. Create enum/domain/table objects that exercise domain check constraints.
    3. Run insert/select statements that trigger domain check compilation.
    4. Observe recovered panic output from domain-constraint analyzer logic.
  • Stub / mock context: Local SCRAM authentication was bypassed (EnableAuthentication = false in server/authentication_scram.go) and the server was restarted on a clean data directory to exercise first-start type paths; no API or route-response stubs were used.
  • Code analysis: I traced domain-check construction in analyzer code and found table-bound scalar building from a synthesized FROM value derived from column source names; this can fail resolution in insert/update flows and matches the observed table not found panic site.
  • Why this is likely a bug: The analyzer builds domain-check expressions against a synthesized table expression path that can be unresolved at runtime, directly aligning with the observed panic location and violating expected panic-free domain enforcement.

Relevant code:

server/analyzer/domain_constraints.go (lines 105-113)

func getDomainCheckConstraintsForTable(ctx *sql.Context, a *analyzer.Analyzer, colName string, tblName string, checkDefs []*sql.CheckDefinition) (sql.CheckConstraints, error) {
    checks := make(sql.CheckConstraints, len(checkDefs))
    for i, check := range checkDefs {
        q := fmt.Sprintf("select %s from %s", check.CheckExpression, tblName)
        checkExpr, err := parseAndReplaceDomainCheckConstraint(ctx, a, check.CheckExpression, q, &tree.ColumnItem{
            ColumnName: tree.Name(colName),
            TableName:  &tree.UnresolvedObjectName{NumParts: 1, Parts: [3]string{tblName}},
        })

server/analyzer/domain_constraints.go (lines 231-236)

builder := planbuilder.New(ctx, a.Catalog, nil)
var tblExpr vitess.TableExpr
if len(convertedSelectStmt.From) == 1 {
    tblExpr = convertedSelectStmt.From[0]
}
return builder.BuildScalarWithTable(ae.Expr, tblExpr), nil

Commit: c08a1d7

