Releases · dlt-hub/dlt

28 Apr 18:49

rudolfix

1.26.0

b3fc4a9

1.26.0 Latest


dlt 1.26.0 Release Notes

Breaking Changes

Incremental external scheduler now raises instead of silently warning (#3877 @rudolfix) — Untyped/non-coercible cursor values now raise JoinSchedulerError; missing intervals raise ExternalSchedulerNotAvailable. Resources with allow_external_schedulers=True that previously fell back to dlt state will now fail. This is a bugfix that corrects previously incorrect behavior.

Highlights

dlt.Relation.join(...) (#3590 @Travior) — Adds a join() method on dlt.Relation based on the normalizer and table references, enabling fluent relational composition over datasets.
Extended Snowflake query tags (#3759 @Travior) — Snowflake query tagging is extended beyond load jobs to broader dlt operations: storage setup, schema/state reads, schema updates, load execution, load completion, and table drops. TJobQueryTags is generalized to TQueryTags with a new operation field (with a compatibility export).
Time interval context for incrementals (#3877 @rudolfix) — New dlt.current.interval() returns the active (start, end) interval or None, backed by an injectable TimeIntervalContext with optional allow_external_schedulers override and auto-detection from env vars / Airflow.

Core Library

dlt.Relation.join(...) (#3590 @Travior) — see Highlights.
Extended Snowflake query tags beyond load jobs (#3759 @Travior) — see Highlights.
Time interval context for incrementals (#3877 @rudolfix) — see Highlights.
Destination-aware incremental SQL filter (#3877 @rudolfix) — New dlt/extract/incremental/sql.py (to_sqlglot_filter) honors timestamp_precision, supports_tz_aware_datetime_in_cast, and sqlite quirks; works on bound and unbound incrementals.
start_value persisted in incremental state (#3877 @rudolfix) — Only written when rows actually arrive, so it is no longer advanced silently on empty runs.
uuid_to_string PyArrow fast path (#3877 @rudolfix) — Numpy-vectorized with a pure-Python fallback; pyarrow ≥ 24 arrow.uuid extension arrays are coerced to canonical strings, and UUID columns under pyarrow < 24 also take the fast path.
Custom metrics emitted even when a resource produces no data (#3877 @rudolfix) — dlt.current.resource_metrics() counters are no longer dropped when every item is filtered out.
TypedDict validator honors NotRequired[T] (#3877 @rudolfix) — Via __required_keys__.
sqlglot "dremio" dialect literal (#3877 @rudolfix) — Added to TSqlGlotDialect.
Opt-in naming-convention check in Schema.unify_schemas() (#3898 @burnash) — The naming-convention check in Schema.unify_schemas() is now opt-in; also drops the max_length tests workaround.
Fix(clickhouse): correct metadata sort keys to avoid full scans (#3851 @anuunchin) — Resolves #3806 by correcting metadata sort keys so ClickHouse no longer performs full scans.
Fix(pyarrow): preserve string encoding for UUID columns under pyarrow 24+ (#3894 @burnash) — Fixes #3893 so UUID columns keep their string encoding under pyarrow 24+.
Fix: clearer error from dlt.attach() when pipeline cannot be restored (#3890 @bjoaquinc) — Rewrites CannotRestorePipelineException messages to name required inputs, show a concrete dlt.attach(...) example, and offer dlt.dataset() as a lighter alternative; suppresses a redundant inner exception in tracebacks.
Fix: closed-pipe race in iter_std (#3877 @rudolfix) — Reader threads swallow ValueError/OSError and always close the queue.
Fix: active profile shown in ConfigFieldMissingException (#3877 @rudolfix)
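
The pure-Python fallback mentioned for uuid_to_string can be pictured roughly as follows. This is a minimal sketch, not dlt's implementation: the helper name uuid_bytes_to_strings is hypothetical, and it assumes UUID values arrive as 16-byte binary values, with nulls passed through unchanged.

```python
import uuid
from typing import Iterable, List, Optional


def uuid_bytes_to_strings(values: Iterable[Optional[bytes]]) -> List[Optional[str]]:
    """Convert 16-byte UUID values to canonical 8-4-4-4-12 strings, keeping nulls.

    Hypothetical helper for illustration only; not part of dlt.
    """
    out: List[Optional[str]] = []
    for raw in values:
        if raw is None:
            # Preserve nulls so the column length and positions stay intact.
            out.append(None)
        else:
            # uuid.UUID validates the length and formats the canonical string.
            out.append(str(uuid.UUID(bytes=raw)))
    return out


sample = [bytes(range(16)), None]
converted = uuid_bytes_to_strings(sample)
print(converted)
```

A vectorized path would produce the same strings in bulk; the per-value loop above only shows the target representation.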

Docs

Improved dashboard docs (#3484 @VioletM)
Updated documentation for pg_replication (#3900 @dat-a-man) — Adds an "Alternative: CDC with Debezium" section linking to the Debezium + dlt demo.
Snowflake marketplace listing link (#3888 @kaliole)
Quality-checker fixes to pipeline.md (#3885 @ShreyasGS) — Tense, contractions, and grammar cleanups via Harper + Vale Google Developer Docs style.
Remove playground page and associated CI (#3883 @zilto) — Streamlines the "Getting Started" section; interactive material remains on Google Colab and marimo molab.
Pydantic Logfire export example (#3882 @zilto)

New Contributors

@bjoaquinc made their first contribution in #3890

Contributors

burnash, VioletM, and 8 other contributors


15 Apr 07:41

rudolfix

1.25.0

a50ab06

1.25.0

dlt 1.25.0 Release Notes

Breaking Changes

Multischema datasets (#3770 @burnash) — Datasets can now hold multiple schemas. The main benefit is that tables from all sources are visible in multi-source pipelines. This is the new default behavior.
Users can pass a list of schemas to the dataset() method, or return to a single-schema dataset by passing pipeline.default_schema when creating the dataset.

Highlights

lance destination (#3810 @jorritsandbrink) — New destination for the Lance table format with optional vector embedding generation via lancedb. Supports local storage and s3/az/gs, uses the Lance Directory Namespace V2 spec, and supports branching. Complements the existing lancedb destination (which targets LanceDB Cloud).
Multischema datasets (#3770 @burnash) — See Breaking Changes above. Enables sidecar schemas (e.g. data-quality quarantine tables) to live alongside the primary schema in a single dataset.
Improved progress and load metrics (#3768 @rudolfix) — Load metrics now persist across restarts, normalizer metrics are updated via update files, and the follow-up job graph is saved into the trace. Closes the long-standing #853.

Core Library

lance destination (#3810 @jorritsandbrink) — See Highlights.
Multischema datasets (#3770 @burnash) — See Highlights.
Improved progress and load metrics (#3768 @rudolfix) — See Highlights.
ducklake: metadata_schema ATTACH option (#3763 @sangwookWoo) — Adds metadata_schema to DuckLakeCredentials so the DuckLake metadata schema can be configured independently from ducklake_name.
Fix: preserve credential chain in AWS credentials (#3798 @rudolfix) — Default credential mixing applied correctly, STS scoped to Databricks only. Closes #3115.
Fix: replay state transitions after crash (#3767 @rudolfix) — Writes a pending state-transition marker right after the DB commit so an interrupted load no longer leaves the load package in an inconsistent state.
Fix: create all eligible tables on staging dataset (#3765 @rudolfix) — Closes #2862.
Fix: normalize pool workers skip __main__ in orchestrators (#3784 @rudolfix) — Closes #3586.
Fix(clickhouse): lightweight DELETE for single-table merge (#3783 @rudolfix) — Removes the _dlt_id requirement when merging arrow tables without nested tables on ClickHouse.
Fix(clickhouse): pass aws_session_token to staging s3() table function (#3769 @anuunchin) — Temporary AWS credentials now work for ClickHouse staging.
Fix: avoid leaking PUA markers in nested fields (#3760 @serl) — Fixes Pydantic nested-model PUA-marker leak. Closes #3755.
Fix: deepcopy paginator in child resource (#3779 @anuunchin) — Prevents paginator state corruption across child-resource invocations. Closes #3772.
Fix: honor explicit non-utf8 encoding in filesystem read_csv (#3743 @biefan) — File is opened with the requested encoding so SFTP/paramiko stacks no longer pre-decode as UTF-8.
Fix: don't filter out trace steps with exceptions (#3843 @anuunchin) — trace.asdict() now retains pipelines that fail in the sync step before extract.
Fix: check duckdb version when installing lance extension (#3773 @zilto) — Handles the lance extension promotion to built-in in duckdb 1.5.
Fix: transient Windows file-lock PermissionError in rename_tree (#3853 @burnash) — Resolves intermittent Windows CI failures during normalize→loaded rename.
Fix: deprecation warnings across supported package versions (#3831 @anuunchin) — Closes #3785, #3807, #3787, #3794.
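
The paginator fix (#3779) guards against a general hazard with shared mutable state. The sketch below illustrates that hazard only; PagePaginator and fetch_child are hypothetical names and do not come from dlt. Reusing one paginator instance carries its page counter into the next child invocation, while a deep copy per invocation starts from a clean state.

```python
import copy
from typing import Iterator, List


class PagePaginator:
    """Hypothetical paginator that remembers the next page to request."""

    def __init__(self, total_pages: int) -> None:
        self.total_pages = total_pages
        self.page = 1

    def pages(self) -> Iterator[int]:
        # Yield the remaining page numbers, mutating internal state as we go.
        while self.page <= self.total_pages:
            yield self.page
            self.page += 1


def fetch_child(paginator: PagePaginator) -> List[int]:
    """Simulate one child-resource invocation that exhausts the paginator."""
    return list(paginator.pages())


# Shared instance: the second invocation sees an already exhausted paginator.
template = PagePaginator(total_pages=3)
shared_first = fetch_child(template)
shared_second = fetch_child(template)

# Deep copy per invocation: every call starts again from page 1.
fresh = PagePaginator(total_pages=3)
copied_first = fetch_child(copy.deepcopy(fresh))
copied_second = fetch_child(copy.deepcopy(fresh))

print(shared_first, shared_second)
print(copied_first, copied_second)
```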

Docs

Cookbook section (#3860 @zilto) — Tested examples moved to a dedicated top-level tab; dlt tab added for navigation back; UI cleanups.
Same-domain docs button (#3859 @zilto) — Avoids full page reload when navigating.
Explore-and-transform page (#3782 @hibajamal) — New page covering data-exploration and transformations workbench toolkits.
Expand handover-to-other-toolkits section (#3737 @njaltran) — Expands data-exploration toolkit coverage in llm-native-workflow.md.
Add EAI instructions (#3803 @kaliole)
Update name to dlt Connector App (#3857 @kaliole) — Snowflake Native App docs renamed.
Update source count to 8,000+ (#3830 @Pawansingh3889) — Closes #3761.
Rename dltHub Basic tier to dltHub Pro (#3795 @elviskahoro)
Fix outdated hf login command (#3781 @julien-c)

Chores

Move mypy configs to pyproject.toml (#3780 @zilto) — Partially resolves #3346.
Remove Python 3.9 from CI matrices (#3777 @zilto) — Python 3.9 reached EOL in October 2025. Resolves #3587, #3619.
Increase Playwright timeout in e2e dashboard test (#3848 @burnash) — Matches the 15s timeout used elsewhere; reduces Windows CI flakiness.
Silence Airflow 3.2 smoke-test log noise (#3835 @burnash) — Fixes #3834.

New Contributors

@Pawansingh3889 made their first contribution in #3830
@biefan made their first contribution in #3743
@julien-c made their first contribution in #3781
@serl made their first contribution in #3760
@sangwookWoo made their first contribution in #3763
@njaltran made their first contribution in #3737

Contributors

burnash, julien-c, and 12 other contributors


19 Mar 11:59

rudolfix

1.24.0

fa36355

1.24.0

dlt 1.24.0 Release Notes

Breaking Changes

Custom resource metrics now stored as tables (#3718 @rudolfix) — Incremental metrics in the trace are now represented in table format. This changes the location and structure of incremental metrics in the trace object.

Highlights

Insert-only merge strategy (#3741 @rudolfix, based on #3372 by @OnAzart) — New insert-only merge strategy that performs idempotent, key-based appending: inserts records whose primary key doesn't exist in the destination while silently skipping duplicates. No updates or deletes. Supported across all SQL destinations, Delta Lake, and Iceberg.
Parallelize all sources in Airflow (#3652 @JustinSobayo) — In parallel and parallel-isolated decompose modes, all source components now fan out concurrently from a shared start node. Previously the first source had to complete before others could begin, adding unnecessary wall-clock time. This release also adds basic Airflow 3 support with smoke tests.
ClickHouse ReplacingMergeTree support (#3366 @prevostc) — New replacing_merge_tree table engine type for ClickHouse that enables native deduplication and soft deletes via dedup_sort and hard_delete column hints.
Custom resource metrics as tables (#3718 @rudolfix) — Resources can now emit custom metrics that are stored as tables in the trace, enabling richer observability for pipelines.
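
The semantics described for the insert-only strategy (insert rows whose primary key is new, skip the rest, never update or delete) can be sketched in plain Python. This is an illustration of the behavior only, under the assumption that the destination is an in-memory list of dicts; insert_only_merge is a hypothetical function, not a dlt API, and real destinations implement this in SQL or table-format operations.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def insert_only_merge(
    destination: List[Row], incoming: Iterable[Row], primary_key: Tuple[str, ...]
) -> int:
    """Append incoming rows whose key is not yet present; never update or delete.

    Returns the number of rows actually inserted.
    """
    seen = {tuple(row[k] for k in primary_key) for row in destination}
    inserted = 0
    for row in incoming:
        key = tuple(row[k] for k in primary_key)
        if key in seen:
            # Duplicate key: skipped silently, the existing row stays untouched.
            continue
        destination.append(row)
        seen.add(key)
        inserted += 1
    return inserted


table: List[Row] = [{"id": 1, "name": "a"}]
batch: List[Row] = [{"id": 1, "name": "changed"}, {"id": 2, "name": "b"}]

first = insert_only_merge(table, batch, ("id",))
# Replaying the same batch inserts nothing, which is what makes the load idempotent.
second = insert_only_merge(table, batch, ("id",))
print(first, second, table)
```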

Core Library

Insert-only merge strategy (#3741 @rudolfix, based on #3372 by @OnAzart) — See Highlights.
ClickHouse ReplacingMergeTree support (#3366 @prevostc) — See Highlights.
Parallelize all sources in Airflow (#3652 @JustinSobayo) — See Highlights.
Custom resource metrics as tables (#3718 @rudolfix) — See Highlights.
Configurable Arrow table concatenation promote_options (#3701 @AyushPatel101) — arrow_concat_promote_options can now be set to "default" or "permissive" instead of the hardcoded "none", enabling automatic type promotion when yielding multiple Arrow tables with slightly different inferred types.
Fix: CLI info/show fails on custom destinations (#3676 @anuunchin) — dlt pipeline info/show no longer crashes with UnknownDestinationModule on pipelines using @dlt.destination.
Fix: Primary key assignment for incremental resources (#3679 @shnhdan) — Passing primary_key=() to Incremental to disable deduplication is no longer silently overwritten by the resource's own primary key.
Fix: MotherDuck missing catalog validation (#3723 @YuF-9468) — Connection strings that omit the catalog/database name (e.g. bare md:) now raise a clear configuration error instead of a confusing connection failure.
Fix: BigQuery infinite loop on internal error (#3732 @aditypan) — BigQuery jobs that encounter an internal error no longer cause an infinite retry loop.
Fix: SCD2 column order mismatch in SQLAlchemy destinations (#3733 @anuunchin) — SCD2 validity column insert jobs now match the column order of existing tables in SQLAlchemy destinations.
Fix: Timezone mapping in SQL timestamp datatype (#3735 @aditypan) — Timezone is now correctly set for timestamp/datetime column datatypes.

Docs

Realistic closure-based data masking example (#3617 @veeceey) — Replaced the hardcoded example with a reusable mask_columns() function supporting all sql_database backends.
Redirects for removed pages (#3688 @djudjuu)
AI workbench license info (#3729 @lis365b)
Minor doc fixes (#3734 @anuunchin)

Chores

Bumps npm docs deps (#3728 @rudolfix)
Switch lancedb example from Spotify to PodcastIndex (#3736 @Travior)
Adds CLI docs check to docs CI workflow (#3739 @rudolfix)
Moves render CLI docs command to a separate tool (#3740 @rudolfix)

New Contributors

@prevostc made their first contribution in #3366
@veeceey made their first contribution in #3617
@JustinSobayo made their first contribution in #3652
@shnhdan made their first contribution in #3679
@YuF-9468 made their first contribution in #3723

Contributors

prevostc, djudjuu, and 11 other contributors


06 Mar 14:47

rudolfix

1.23.0

b981294

1.23.0

Breaking Changes

Streamlit dashboard removed (#3674 @rudolfix) — The legacy Streamlit-based pipeline dashboard (dlt pipeline show) has been removed. It had been dead code for a long time.
New sources.<name>.<key> configuration lookup path (#3626 @rudolfix) — Source configuration now supports a compact layout. When a source's section name differs from its resource/source name, dlt now also looks up sources.<name>.<key> in addition to the full sources.<section>.<name>.<key> path. For example, for a source registered under section chess_com with name chess:
```
# Before (still works): fully qualified path
[sources.chess_com.chess]
api_key = "secret"

# New (also works now): compact path using just the source name
[sources.chess]
api_key = "secret"

# Credentials follow the same pattern:
# Full:    sources.chess_com.chess.credentials.api_key
# Compact: sources.chess.credentials.api_key
```
This is breaking if you previously had values at sources.<name> that were unrelated to this source — they will now be resolved where they were previously ignored.
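
The two lookup paths can be modeled with a nested dictionary. This is a minimal sketch and not dlt's resolver: get_path and resolve are hypothetical helpers, and the order shown (full path first, compact path second) is an assumption, since the note above only states that both paths are consulted.

```python
from typing import Any, Dict, List, Optional, Sequence, Tuple

config: Dict[str, Any] = {
    "sources": {
        "chess_com": {"chess": {"api_key": "full-path-secret"}},
        "chess": {"api_key": "compact-secret"},
    }
}


def get_path(tree: Dict[str, Any], path: Sequence[str]) -> Optional[Any]:
    """Walk a nested dict; return None if any segment is missing."""
    node: Any = tree
    for part in path:
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def resolve(tree: Dict[str, Any], section: str, name: str, key: str) -> Optional[Any]:
    """Try sources.<section>.<name>.<key>, then sources.<name>.<key>."""
    # ASSUMPTION: the fully qualified path takes precedence over the compact one.
    candidates: List[Tuple[str, ...]] = [("sources", section, name, key)]
    if section != name:
        candidates.append(("sources", name, key))
    for path in candidates:
        value = get_path(tree, path)
        if value is not None:
            return value
    return None


compact_only = {"sources": {"chess": {"api_key": "compact-secret"}}}
print(resolve(config, "chess_com", "chess", "api_key"))
print(resolve(compact_only, "chess_com", "chess", "api_key"))
```

The second call shows the breaking case in miniature: a value at sources.chess that used to be ignored for this source is now picked up.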

Highlights

AI Workbench (#3674 @rudolfix) — New dlt ai CLI command group that turns dlt workspaces into AI-assisted development environments. Includes toolkit system for installing curated skill/rule bundles, pluggable MCP server architecture with composable features (pipeline, workspace, toolkit, secrets), and multi-agent support (Claude Code, Cursor, Codex).
Relational normalizer optimization (#3626 @rudolfix) — Major performance improvements to JSON data normalization and schema evolution: 5x faster on flat data, ~2x on nested REST API data, ~1.8x on wide nested data. ISO timestamp parsing improved 2-3x by removing timezone conversions.
Iceberg table properties (#3699 @rudolfix) — Adds support for setting Iceberg table and namespace properties via the adapter and configuration.

Core Library

Fetch Databricks compute credentials (#3667 @aditypan) — Automatically fetches credentials from Databricks shared/job compute when running dlt in a notebook, fixing the issue of defaulting to SQL warehouse connections.
Add override_data_path option to DuckLake ATTACH (#3709 @udus122) — New override_data_path configuration option that appends OVERRIDE_DATA_PATH true to the ATTACH statement, allowing the current DATA_PATH to override the path stored in catalog metadata.
Add missing parameters in Paginator Configs (#3658 @aditypan) — Adds missing parameters to PageNumberPaginatorConfig, OffsetPaginatorConfig, and JSONResponseCursorPaginatorConfig.
Fix: path traversal in FileStorage (CWE-22) (#3678 @rudolfix) — Replaced os.path.commonprefix() with os.path.commonpath() in FileStorage.is_path_in_storage() to correctly validate path containment using path segments instead of characters.
Fix: monotonic wall clock (#3695 @rudolfix) — Improves elapsed time calculation across several places, ensuring load IDs are always monotonic even on systems with clock jitter.
Fix: threading issues causing potential locking (#3698 @rudolfix) — Fixes async pool shutdown in extract (now closed with timeout) and corrects synchronization sections in various tests.
Fix: dev mode survives attach and reset (#3662 @rudolfix) — Saves dev_mode flag in pipeline local state so it persists across dlt.attach() calls. Detects dev→non-dev transitions and resets working folder cleanly.
Fix: respect custom Hugging Face endpoint for dataset card operations (#3696 @jorritsandbrink) — Fixes custom endpoint support broken by subset/dataset card feature by temporarily setting HF_ENDPOINT env var for card operations.
Fix: explicit dataset name should be authoritative (#3700 @anuunchin) — Makes the dataset argument passed to the pipeline authoritative, always setting pipeline dataset when restoring state.
Fix: start_out_of_range flag with range_start="open" (#3708 @AyushPatel101) — Correctly sets start_out_of_range=True when a row's cursor value equals start_value with range_start="open", fixing delayed can_close() in descending-order pipelines.
Fix: LanceDB SQL view creation with dataset_name=None (#3710 @Travior) — Handles the case where dataset_name is None in LanceDBSqlClient.create_view, preventing None prefix in view names.
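
The FileStorage path traversal fix (#3678) rests on the difference between the two standard-library functions it names. The sketch below shows that difference in isolation; the helper names and example paths are hypothetical, and posixpath is used instead of os.path only so the result is the same on every platform.

```python
import posixpath


def is_inside_by_prefix(storage: str, path: str) -> bool:
    """Character-based containment check (the flawed approach)."""
    return posixpath.commonprefix([storage, path]) == storage


def is_inside_by_path(storage: str, path: str) -> bool:
    """Segment-based containment check (the corrected approach)."""
    return posixpath.commonpath([storage, path]) == storage


storage_root = "/data/storage"
sibling = "/data/storage_evil/secret.txt"  # shares characters, not a parent directory
inside = "/data/storage/load/file.jsonl"

# commonprefix compares character by character, so the sibling directory passes.
print(is_inside_by_prefix(storage_root, sibling))
# commonpath compares whole path segments, so the sibling directory is rejected.
print(is_inside_by_path(storage_root, sibling))
print(is_inside_by_path(storage_root, inside))
```

Both checks assume the paths are already absolute and normalized; relative segments such as ".." need to be resolved before comparing.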

Docs

Fix docstring typo in BigQuery factory (#3705 @dnskr)

New Contributors

@aditypan made their first contribution in #3658
@dnskr made their first contribution in #3705
@udus122 made their first contribution in #3709

Contributors

rudolfix, udus122, and 6 other contributors


01 Mar 18:44

rudolfix

1.22.2

259a36d

1.22.2

Highlights

Hugging Face filesystem destination (#3669 @jorritsandbrink) — Adds hf protocol support to the filesystem destination, enabling direct loading to Hugging Face datasets. Closes #1227.
Composable marimo widgets (#3613 @zilto) — Introduces composable widgets built with marimo for interactive pipeline exploration. Widgets can accept inputs and produce outputs, building on earlier read-only widgets with updated schema viewer, load package viewer, and a new pipeline selector.

Core Library

Hugging Face subsets (#3689 @jorritsandbrink) — Adds dataset cards with metadata to configure a subset for each table, enabling the Hugging Face dataset viewer to display tables properly.
Hugging Face filesystem destination (#3669 @jorritsandbrink) — See Highlights.
Composable marimo widgets (#3613 @zilto) — See Highlights.
Dashboard UX improvements (#3675 @sh-rp) — Collapsed sections show title and subtitle on a single line to reduce vertical space, shortened long subtitles, and improved layout for narrow viewports.

Docs

Hugging Face destination documentation (#3687 @AstrakhantsevaAA) — Rewrote HF destination docs and moved content to a dedicated page.
Remove outdated Motherduck troubleshooting (#3683 @elviskahoro) — Removed read-only database troubleshooting section for deprecated DuckDB versions.
Update DuckLake docs for v1.4 (#3682 @elviskahoro) — Updated DuckLake documentation to reflect Motherduck as catalog database and corrected catalog URI format.

New Contributors

@elviskahoro made their first contribution in #3682

Contributors

sh-rp, AstrakhantsevaAA, and 3 other contributors


23 Feb 20:59

rudolfix

1.22.1

75efe7d

1.22.1

dlt 1.22.1 Release Notes

Core Library

feat(workspace): add default exclude patterns for file selector (#3661 @canassa) — WorkspaceFileSelector now ships with DEFAULT_EXCLUDES (.git/, .venv/, __pycache__/, node_modules/, etc.) so well-known non-deployable paths are always excluded, even without a .gitignore.
feat(workspace): add ignore_file_found attribute to WorkspaceFileSelector (#3663 @canassa) — Consumers can now check whether the configured ignore file (e.g. .gitignore) was actually found.
Dashboard cleanup and refactor (#3660 @sh-rp) — Broke up monolithic utils.py and dlt_dashboard.py into focused modules with simplified UI across all sections.
Sets default MCP transport to http-stream (#3624 @rudolfix) — Swaps sse for http-stream transport for built-in MCP servers and annotates pipeline trace schema.
Fixes data inspection tools (#3664 @rudolfix) — Allows incomplete columns in schema converters, attaches pipeline in every command, adds new allowed layout for sources.<name>.api_key.
Fix: Mermaid doesn't handle incomplete columns (#3659 @anuunchin) — .to_mermaid() now handles columns missing the data_type field instead of crashing.
Fix: ClickHouse makes reads sequential by default (#3651 @rudolfix) — Enforces select_sequential_consistency to fix flaky tests caused by ClickHouse's eventual consistency model.
Fix: data quality checks component (#3647 @zilto) — Fixes silently broken data quality checks code caused by upstream dlthub changes.
Fix: autouse test storage dir not empty (#3648 @tetelio) — Fixes intermittent CI failure in read-only file deletion by aligning with shutil docs.
Fix: dashboard tests (#3672 @sh-rp) — Fixes a few broken tests in the dashboard.
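
The effect of default exclude patterns (#3661) can be sketched without a .gitignore at all. This is a simplified illustration, not the WorkspaceFileSelector implementation: select_files is a hypothetical helper, the set below holds only the four directory names listed in the note (the real list is longer), and matching is done on directory names rather than full ignore-file pattern syntax.

```python
from pathlib import PurePosixPath
from typing import FrozenSet, Iterable, List

# Hypothetical subset of the defaults; the release note names these and more.
DEFAULT_EXCLUDES: FrozenSet[str] = frozenset(
    {".git", ".venv", "__pycache__", "node_modules"}
)


def select_files(
    paths: Iterable[str], excludes: FrozenSet[str] = DEFAULT_EXCLUDES
) -> List[str]:
    """Keep paths that have no excluded directory among their parent segments."""
    kept: List[str] = []
    for p in paths:
        parent_parts = PurePosixPath(p).parts[:-1]
        if any(part in excludes for part in parent_parts):
            continue  # lives under a well-known non-deployable directory
        kept.append(p)
    return kept


candidates = [
    "pipeline.py",
    ".git/config",
    "src/__pycache__/app.cpython-312.pyc",
    "src/app.py",
    "node_modules/pkg/index.js",
]
selected = select_files(candidates)
print(selected)
```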

Docs

Run streamlit/MCPs in runtime (#3510 @tetelio) — Adds documentation for running MCP servers and Streamlit apps in the runtime.
Release highlights 1.18 & 1.19 (#3654 @AstrakhantsevaAA)
Add dlthub metrics section; update checks (#3641 @zilto)

Chores

Apply all docs/ linting in one make command (#3666 @anuunchin) — Introduces an overarching lint target in the docs Makefile. Resolves #3642.
Tests newest dbt on dbt runner, enables fabric (#3656 @rudolfix)
Adjust scaffold api and vibe source tests (#3649 @djudjuu) — Tests no longer expect source.md file.
Add install command for make test-load-local-p (#3645 @tetelio) — Convenience make install target for local load tests on duckdb and filesystem.
Remove license autofixture, add selective license application (#3646 @rudolfix)
Setup Claude and Continue agents (#3622 @rudolfix) — Adds comprehensive AI assistant configuration for Claude Code and Continue IDE.

New Contributors

@canassa made their first contribution in #3661

Contributors

canassa, sh-rp, and 6 other contributors


17 Feb 16:55

rudolfix

1.22.0

15c7b46

1.22.0

Breaking Changes

Pydantic v1 support removed (#3572 @anuunchin) — All Pydantic v1 compatibility code has been removed. The codebase now requires Pydantic v2 only.
data_type contract semantic change (#3572 @anuunchin @rudolfix) — The data_type contract now applies to the full data type (i.e. precision and nullability), not only to variant columns (data type change). Users with data_type: freeze who relied on changing nullable/precision/scale on existing columns will now be blocked.
merge_columns now removes compound properties (#3431 @anuunchin) — Previously merge_columns was purely additive, which caused compound properties like merge_key to be incorrectly replaced rather than properly merged. The function now correctly removes compound properties that should be removed.

Highlights

Pydantic data validation overhaul (#3572 @anuunchin @rudolfix) — Major rework of Pydantic support: discriminated union RootModel types (validation of event streams with various event types), schema contracts that properly separate resource-defined from data-derived hints, and Pydantic model columns that bypass contract checks when authoritative. Supports Pydantic models on arrow and model items with full schema contract enforcement. Prepares for Pydantic v3.
Snowflake atomic table swap for replace (#3540 @Travior) — Uses ALTER TABLE ... SWAP for staging-optimized replace strategy on Snowflake, eliminating table downtime during data replacement.
Custom backends for sql_database (#3595 @rudolfix) — Register custom TableLoader implementations as named backends. ConnectorX backend ported as PoC; ADBC and paginated loader implemented as test cases.
SQLAlchemy destination dialect customization (#3600 @rudolfix) — Customize type mapping, adjust SQLAlchemy table schemas before creation, and override destination capabilities per-dialect.
llms.txt and Markdown docs generation (#3635 @rudolfix) — Generates llms.txt index and Markdown versions of docs pages with a "View Markdown" navigation option, making the docs LLM-friendly.

Core Library

rest_api: parallelized dependent resources (#3574 @Shadesfear) — Add parallelized flag to dependent resources (transformers) so child resource fetches run concurrently.
dlt.Relation: filter by load_id (#3547 @zilto) — Filter dataset relations by load ID (experimental).
dlt.Relation: flatten logic and improve typing (#3578 @zilto) — Remove dynamic methods; explicit return types for .df(), .arrow(), etc.
Source preprocessors on SourceFactory (#3636 @rudolfix) — Add preprocessor hooks to dlt.source factory for modifying source instances.
engine_kwargs for sql_database/sql_table sources (#3414 @tetelio) — Pass SQLAlchemy engine arguments directly to create_engine() for sources.
DECFLOAT support for Snowflake (#3513 @ivasio) — Properly handles DECFLOAT columns via the SQLAlchemy backend.
Athena query_result_bucket now optional (#3566 @arel) — Omit or set to None when using Athena's managed results bucket.
ClickHouse extra_credentials for S3 (#2888 @warje) — Adds extra_credentials config for role-based S3 authentication.
Fix: Snowflake sort column escaping (#3594 @rudolfix)
Fix: BigQuery partition clause on ALTER TABLE (#3571 @kien-truong)
Fix: Redshift schema existence check (#3570 @timH6502)
Fix: _dlt_load_id written as dict on MSSQL + ADBC (#3584 @rudolfix)
Fix: ClickHouse CREATE OR REPLACE for merge temp tables (#3589 @rudolfix)
Fix: read_csv_duckdb respects filename=True (#3606 @karlanka)
Fix: column order mismatch in sql_database (#3638 @rudolfix)
Fix: consistent UUID handling as strings (#3599 @rudolfix)
Fix: managed SQLAlchemy engine ref counting (#3601 @rudolfix)
Fix: suppress psutil warning during dlt init (#3615 @rudolfix)
Fix: query lifecycle cleanup (#3627 @rudolfix)
Fix: Pydantic model synthesis bugs (#3605 @rudolfix)
Detect AI agent execution context (#3628 @rudolfix)
Upgrade ibis-framework, remove sqlglot constraint (#3621 @Travior)
Vibe sources: use new scaffold API (#3512 @djudjuu)
Update GitHub API pipeline template (#3603 @ShreyasGS)

Docs

Oracle usage caveats documented (#3561 @ivasio)
Fabric destination added to sidebars (#3555 @dat-a-man)
Database connector app links added (#3552 @kaliole)
Typo fixes (#3556 @michelzurkirchen, #3616 @thecaptain789)
READMEs updated (#3596 @rudolfix)
dltHub tiers fixed (#3564 @VioletM)

Chores

Parallelized test execution (#3479 @tetelio) — Major CI speedup via pytest-xdist with fully isolated workers.
Self-hosted PokeAPI for tests (#3488 @anuunchin)
Parallel testing documentation (#3577 @tetelio)
Double-encode deploy key to avoid GitHub bans (#3501 @tetelio)
Exclude zendesk tests from CI (#3580 @zilto)
Fix test parallelism race conditions (#3593 @tetelio)
Restore sqlglot version constraints (#3575 @tetelio, #3579 @zilto)
Ignore .zed/ directory (#3633 @zilto)

New Contributors

@Shadesfear made their first contribution in #3574
@ShreyasGS made their first contribution in #3603
@arel made their first contribution in #3566
@karlanka made their first contribution in #3606
@kien-truong made their first contribution in #3571
@thecaptain789 made their first contribution in #3616

Contributors

kien-truong, arel, and 17 other contributors


20 Jan 12:11

rudolfix

1.21.0

ab0459a

1.21.0

This release adds several notable improvements and many bugfixes. The LanceDB destination now uses the duckdb extension so you can query Lance tables with SQL, ibis, or sqlglot via the standard .dataset() interface. We introduced several Iceberg-related improvements (catalog support, S3 Tables for Athena, advanced partitioning). There is also a new fabric destination and additional options in clickhouse_adapter. Finally, we now have a test environment for Oracle and have started fixing Oracle-related bugs.

Core Library

feat(lancedb): implement SqlClientBase - query lance tables with ibis, sqlglot or raw SQL by @zilto in #3527
feat: S3 Tables support for athena destination by @jorritsandbrink in #3434
feat: fabric destination (by @mattiasthalen) by @jorritsandbrink in #3535
set iceberg advanced partitioning options by @rakesh-tmdc in #3053
allow to use any catalog in filesystem+iceberg destination by @lfagliano in #3325
feat: Snowflake-provided OAuth token auth by @jorritsandbrink in #3322
Add column description support for BigQuery destination by @Travior in #3483
feat: clickhouse_adapter extensions by @jorritsandbrink in #3511
3480 chore update weaviate destination to use v4 client by @djudjuu in #3481
add support for gcp adc with iceberg by @benjibuiltit in #3460
3312: feat(snowflake) - add column_comment/description hint support by @somasays in #3462
Fix-3292: Load retrieval to only show loads that contain a schema name that is in the pipeline.schema_names by @alkaline-0 in #3446
feat: Support OAuth and base GCP credentials for BigQuery destination by @daniel-nagish in #3382
helper to filter dlt-changes out of schema updates by @djudjuu in #2968

Bugfixes

pyarrow: respect resource hints before extract by @djudjuu in #3436
Fix: 3490 better error message in schema contract application by @anuunchin in #3498
Fix state file being ignored when pipeline_name includes FILENAME_SEPA… by @Travior in #3448
Fix/3464 sync error results in success label (workspace dashboard) by @anuunchin in #3492
Fix/3376 load state changes in load package that changed it by @anuunchin in #3521
3353 normalize start method spawn seems to ignore environment variables by @djudjuu in #3463
Fix: Connectorx arrow_stream timestamp conversion issue by @louiewhw and @anuunchin in #3528
fix/3141 - process Oracle "table not found" exception by @ivasio in #3509
fix: 3514 mermaid reference label by @zilto in #3515
Fix cluster hint overriding partition hint on bigquery by @Travior in #3497
Fix: Special handling of numeric type for oracle by @ivasio in #3144
Fix/3159 pydantic model incorrect serialization by @tetelio in #3421

Chores

fix(typing): add None to Container._Instance typing by @zilto in #3469
Fix dashboard e2e tests by @burnash in #3453
Chore: poke tests disabled by @anuunchin in #3487
Feat/3362 - add oracle testing setup - able to test oracle-specific edge cases now by @ivasio in #3496
chore/Omit git private key tests by @tetelio in #3502
adds checks for plugins version in hub extra by @rudolfix in #3474
do progress=log in vibe source template by @djudjuu in #3543

Docs

Fix: 3467 PipelineTasksGroup Docstring fix by @anuunchin in #3482
Docs: Converting Jupyter notebooks in education to marimo notebooks by @anuunchin in #3068
Docs/data quality by @adrianbr in #3466
docs: fix typo in dlt fundamentals lesson 8 by @dmschauer in #3508
docs: fix typos on parquet page by @dmschauer in #3506
docs/snowflake-native-app-how-to by @kaliole in #3455
change "remove" to "keep" in Advanced Lesson 1 by @AstrakhantsevaAA in #3522
Docs/explicit note for row hash calculation with non deterministic API responses by @anuunchin in #3524
docs: update lancedb documentation links by @ccmao1130 in #3542
Added docs for chunksize and use_pyarrow for parquet reader by @rishabh1815769 in #3526

New Contributors

@rakesh-tmdc made their first contribution in #3053
@somasays made their first contribution in #3462
@daniel-nagish made their first contribution in #3382
@tetelio made their first contribution in #3502
@dmschauer made their first contribution in #3508
@Travior made their first contribution in #3448
@lfagliano made their first contribution in #3325
@ccmao1130 made their first contribution in #3542
@benjibuiltit made their first contribution in #3460
@rishabh1815769 made their first contribution in #3526

Full Changelog: 1.20.0...1.21.0

Contributors

burnash, mattiasthalen, and 21 other contributors


09 Dec 23:13

rudolfix

1.20.0

a7c3571

1.20.0

Core Library

feat: implement ConfigurationFileSelector by @ivasio in #3418
Fix: reset config in PluggableRunContext.reload_providers by @ivasio in #3409
add runtime CLI configs in WorkspaceRuntimeConfiguration by @ivasio in #3424
implements run artifacts sync to a bucket using filesystem by @ivasio in #3339
Fix: extensive .gitignore for dlt init by @anuunchin in #3437
Fix: Invisible sections are receiving border and background color in dashboard by @anuunchin in #3439
implements cancellation of normalize jobs by @rudolfix in #3444
information on pending and partially loaded packages when pipeline fails by @rudolfix in #3444
Fix race condition in LimitItem by @burnash in #3442
Add offset/limit body_path fields to OffsetPaginatorConfig by @kinghuang in #3260
[fix/3358] add pagination stopping to JSONResponseCursorPaginator by @segetsy in #3374
(feat) small dashboard improvements by @rudolfix in #3450

Chores

Skip doc examples requiring secrets on fork PRs by @burnash in #3438

Docs

docs: LLM workflow update by @zilto in #3422

New Contributors

@kinghuang made their first contribution in #3260
@segetsy made their first contribution in #3374

Full Changelog: 1.19.1...1.20.0

Contributors

kinghuang, burnash, and 5 other contributors


02 Dec 22:02

rudolfix

1.19.1

10cd908

1.19.1

Bugfixes

fixes arrow import in sql_database by @rudolfix in #3411
blocks sqlglot 28.1.0 by @rudolfix in #3420

Full Changelog: 1.19.0...1.19.1

Contributors

rudolfix

