Handle SERIAL autoincrement columns when targeting MS-SQL

## Summary

PostgreSQL uses `SERIAL` (and its variants `BIGSERIAL`, `SMALLSERIAL`) as shorthand for autoincrement integer columns. When datafaker reflects a PostgreSQL source schema, these columns appear as `SERIAL` in the generated DDL. The existing code already works around this for DuckDB via a `@compiles` hook, but no equivalent exists for MS-SQL, which uses `IDENTITY(1,1)` instead.

## Problem

In `datafaker/create.py` there are two dialect-specific DDL hooks registered with SQLAlchemy's `@compiles` decorator:

```python
@compiles(CreateColumn, "duckdb")
def remove_serial(element: CreateColumn, compiler: Any, **kw: Any) -> str:
    text: str = compiler.visit_create_column(element, **kw)
    return text.replace(" SERIAL ", " INTEGER ")

@compiles(CreateTable, "duckdb")
def remove_on_delete_cascade(element: CreateTable, compiler: Any, **kw: Any) -> str:
    text: str = compiler.visit_create_table(element, **kw)
    return text.replace(" ON DELETE CASCADE", "")
```

The `remove_serial` hook replaces `SERIAL` with `INTEGER` for DuckDB. MS-SQL does not understand `SERIAL` either, so `create-tables` against an MS-SQL destination will raise a syntax error whenever the source schema has autoincrement columns.

Additionally, `ON DELETE CASCADE` is stripped for DuckDB but not for MS-SQL, where it **is** supported — so that hook does not need an MS-SQL equivalent. However, MS-SQL has its own DDL quirks that may need similar treatment (see proposed steps below).

## How SERIAL arises

When datafaker reflects a PostgreSQL source, SQLAlchemy maps `SERIAL` columns to `INTEGER` columns with a `Sequence` or `autoincrement=True`. When the ORM YAML is then used to `create-tables` on a destination, SQLAlchemy re-generates DDL. For a PostgreSQL destination this renders as `SERIAL`; for MS-SQL the correct rendering is `INTEGER IDENTITY(1,1)`.

In practice, SQLAlchemy **already** handles this correctly when the column is defined with `autoincrement=True` and the engine is an MS-SQL engine — it will emit `IDENTITY(1,1)` automatically. The problem is the `@compiles(CreateColumn, "duckdb")` hook, which replaces the compiled text at the string level. A naive port of the same approach to `"mssql"` is unnecessary and could interfere with SQLAlchemy's correct MS-SQL DDL generation.

The real risk is the reverse: if a source schema is reflected with `SERIAL` stored literally as the column type string in `orm.yaml` (rather than being resolved to `INTEGER + autoincrement`), parsing and re-emitting that string on an MS-SQL engine will produce invalid DDL.

## Proposed steps

### 1. Confirm how SERIAL is stored in orm.yaml

Run `make-tables` against a PostgreSQL source that contains `SERIAL` columns and inspect the resulting `orm.yaml`. Determine whether:

- (a) The type string is `INTEGER` (autoincrement handled separately), or
- (b) The type string is literally `SERIAL` / `BIGSERIAL` / `SMALLSERIAL`.

If (a), SQLAlchemy will generate `IDENTITY(1,1)` for MS-SQL automatically and no further action is needed. If (b), proceed with steps 2–4.

### 2. Add SERIAL type strings to the type parser

If `orm.yaml` contains `SERIAL`, `BIGSERIAL`, or `SMALLSERIAL`, add parsers for them in `datafaker/serialize_metadata.py` that map to the appropriate generic integer types with `autoincrement=True`:

```python
# SERIAL variants → generic integers; autoincrement is set via column flag
parsy.string("BIGSERIAL").result(sqltypes.BigInteger),   # must precede SERIAL
parsy.string("SMALLSERIAL").result(sqltypes.SmallInteger),
parsy.string("SERIAL").result(sqltypes.Integer),
```

Note: longer names (`BIGSERIAL`, `SMALLSERIAL`) must appear **before** `SERIAL` in the `parsy.alt()` list to avoid the prefix-collision issue described in #96.

### 3. Add a @compiles hook for MS-SQL if needed

If SQLAlchemy does not automatically strip `SERIAL` when targeting MS-SQL (i.e. it passes the token through unchanged), add a `@compiles(CreateColumn, "mssql")` hook in `datafaker/create.py` modelled on the existing DuckDB one:

```python
@compiles(CreateColumn, "mssql")
def remove_serial_mssql(element: CreateColumn, compiler: Any, **kw: Any) -> str:
    text: str = compiler.visit_create_column(element, **kw)
    return text.replace(" SERIAL ", " INTEGER ")
```

This is only a safety net; the preferred fix is step 2 (parser-level mapping).

### 4. Add tests

- A unit test confirming that `SERIAL`, `BIGSERIAL`, and `SMALLSERIAL` type strings parse to the correct generic integer types.
- An integration test (or a DDL-compilation test using a mock MS-SQL engine) confirming that `create_all()` on an MS-SQL engine does not emit `SERIAL` in the generated DDL.

## Acceptance criteria

- `create-tables` against an MS-SQL destination succeeds when the `orm.yaml` was produced from a PostgreSQL source containing `SERIAL` / `BIGSERIAL` / `SMALLSERIAL` columns.
- The generated MS-SQL DDL uses `INTEGER IDENTITY(1,1)` (or equivalent) rather than `SERIAL`.
- Existing PostgreSQL and DuckDB behaviour is unchanged.

## Related

- #93 — MS-SQL driver dependency
- #94 — Hardcoded async DSN rewriting
- #95 — PostgreSQL search_path schema selection
- #96 — PostgreSQL-specific column types in the type parser

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Handle SERIAL autoincrement columns when targeting MS-SQL #97

Summary

Problem

How SERIAL arises

Proposed steps

1. Confirm how SERIAL is stored in orm.yaml

2. Add SERIAL type strings to the type parser

3. Add a @compiles hook for MS-SQL if needed

4. Add tests

Acceptance criteria

Related

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Handle SERIAL autoincrement columns when targeting MS-SQL #97

Description

Summary

Problem

How SERIAL arises

Proposed steps

1. Confirm how SERIAL is stored in orm.yaml

2. Add SERIAL type strings to the type parser

3. Add a @compiles hook for MS-SQL if needed

4. Add tests

Acceptance criteria

Related

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions