Skip to content

Add native_uuid connection option for UNIQUEIDENTIFIER return type control #447

@thegoodwinner

Description

@thegoodwinner

Hi,

I'm migrating a product codebase that uses SQLAlchemy mssql+pyodbc - on Azure Functions; primarily due to SIGSEGV crashes, traced back to likely runtime/driver issues.

So far so good - however the refactor has been significant due to UUID handling behaviours (which are extensively used in the ORM) - pyodbc returns string by default (v2 flags via native_uuid).


Is your feature request related to a problem? Please describe.

mssql-python always returns uuid.UUID objects from UNIQUEIDENTIFIER columns. For codebases migrating from pyodbc (which returns strings by default), this causes widespread breakage across every codepath that touches a GUID:

  • AttributeErroe: existing code calls .strip(), .upper(), .replace() on values that are now uuid.UUID, not str
  • TypeError in JSON serialization: uuid.UUID is not JSON-serializable by default; every API response, queue message, and telemetry payload that includes a GUID fails
  • Silent equality mismatches: "ABC..." == UUID("ABC...") is False, causing lookups, dictionary keys, and set membership checks to silently break
  • SQLAlchemy result processor crash: SQLAlchemy's _python_UUID helper calls value.replace("-", "") assuming a string input, raising AttributeError when it receives a uuid.UUID object (this is the same class of bug as pymssql might support native UUID, set attribute sqlalchemy/sqlalchemy#9414 which was filed for pymssql)

This is not a niche concern, pyodbc is the dominant Python MSSQL driver, so most codebases migrating to mssql-python will hit this. In my production codebase (~700 UNIQUEIDENTIFIER columns, ~90 files touching GUIDs), adopting mssql-python required a global monkey-patch on SQLAlchemy internals and an extensive audit of every string operation on GUID values.

Describe the solution you'd like

A native_uuid parameter on connect() that controls whether UNIQUEIDENTIFIER columns are returned as uuid.UUID objects or str:

Current behavior preserved (default): returns uuid.UUID objects

conn = mssql_python.connect(conn_str, native_uuid=True)

Migration-friendly: returns str (hyphenated, e.g. "110e2700-9d34-44e9-ba0e-bd74401a54a4")

conn = mssql_python.connect(conn_str, native_uuid=False)

Both modes should continue to accept uuid.UUID objects as bind parameters (input handling is already correct).

The implementation would be minimal: in the result-row processing where UUID(bytes_le=raw_bytes) is already constructed:

val = uuid.UUID(bytes_le=raw_bytes)
return val if self._native_uuid else str(val)

This should apply consistently across fetchone(), fetchmany(), and fetchall() (the inconsistency from #241 was already fixed).

Describe alternatives you've considered

  1. Monkey-patching SQLAlchemy internals: This is what we currently do. We patch sqlalchemy.sql.sqltypes._python_UUID to short-circuit when it receives a uuid.UUID object instead of a string. It works but is fragile (depends on an internal API that could be renamed/removed) and doesn't help non-SQLAlchemy consumers.

My current workaround in production

original = sqltypes._python_UUID

def _python_uuid(value):
    if isinstance(value, UUID):
        return value  # short-circuit: already a UUID
    return original(value)

sqltypes._python_UUID = _python_uuid
  1. Application-level str() coercion everywhere: Wrapping every DB result in str() at the repository layer. This is error-prone at scale (easy to miss a codepath) and defeats the purpose of having typed UUID support.
  2. SQLAlchemy TypeDecorator: A custom column type that overrides process_result_value to stringify. This works for ORM queries but doesn't cover text() queries or raw cursor usage.
  3. Waiting for the official SQLAlchemy dialect: The dialect being developed (referenced in coercion of UNIQUEIDENTIFIER / uuid values from binary to string only works with fetchone(), not fetchall() or fetchmany() #241) will presumably set supports_native_uuid = True, which solves the SQLAlchemy layer. But it doesn't help raw DB-API consumers, Pydantic serialization, or non-SQLAlchemy frameworks (Django, SQLModel, etc.).

Additional context

Precedent: pyodbc addressed this exact problem with its native_uuid flag (mkleehammer/pyodbc#177, added in v4.0.9). The default is False (strings), with True opting into uuid.UUID objects. mssql-python could use the inverse default (True) since it's a newer driver, while still offering the escape hatch.

Ecosystem pattern:

Driver UNIQUEIDENTIFIER return type Configurable?
pyodbc str (default) / uuid.UUID (opt-in) Yes — pyodbc.native_uuid
pymssql 2.x uuid.UUID (always) No
mssql-python uuid.UUID (always) No (this request)

Migration context: mssql-python positions itself as a modern, Microsoft-supported alternative to pyodbc. The primary adoption path is pyodbc → mssql-python. A compatibility flag would significantly lower the migration
barrier, teams can switch drivers first, then migrate UUID handling incrementally, rather than requiring a big-bang refactor of every GUID touchpoint.

Related issues:

Metadata

Metadata

Assignees

No one assigned

    Labels

    triage neededFor new issues, not triaged yet.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions