Releases: dlt-hub/dlt
1.26.0
dlt 1.26.0 Release Notes
Breaking Changes
- Incremental external scheduler now raises instead of silently warning (#3877 @rudolfix) — Untyped/non-coercible cursor values now raise `JoinSchedulerError`; missing intervals raise `ExternalSchedulerNotAvailable`. Resources with `allow_external_schedulers=True` that previously fell back to dlt state will now fail. This is a bugfix that corrects previously incorrect behavior.
Highlights
- `dlt.Relation.join(...)` (#3590 @Travior) — Adds a `join()` method on `dlt.Relation` based on the normalizer and table references, enabling fluent relational composition over datasets.
- Extended Snowflake query tags (#3759 @Travior) — Snowflake query tagging is extended beyond load jobs to broader dlt operations: storage setup, schema/state reads, schema updates, load execution, load completion, and table drops. `TJobQueryTags` is generalized to `TQueryTags` with a new `operation` field (with a compatibility export).
- Time interval context for incrementals (#3877 @rudolfix) — New `dlt.current.interval()` returns the active `(start, end)` interval or `None`, backed by an injectable `TimeIntervalContext` with optional `allow_external_schedulers` override and auto-detection from env vars / Airflow.
Core Library
- `dlt.Relation.join(...)` (#3590 @Travior) — see Highlights.
- Extended Snowflake query tags beyond load jobs (#3759 @Travior) — see Highlights.
- Time interval context for incrementals (#3877 @rudolfix) — see Highlights.
- Destination-aware incremental SQL filter (#3877 @rudolfix) — New `dlt/extract/incremental/sql.py` (`to_sqlglot_filter`) honors `timestamp_precision`, `supports_tz_aware_datetime_in_cast`, and sqlite quirks; works on bound and unbound incrementals.
- `start_value` persisted in incremental state (#3877 @rudolfix) — Only written when rows actually arrive, so it is no longer advanced silently on empty runs.
- `uuid_to_string` PyArrow fast path (#3877 @rudolfix) — Numpy-vectorized with a pure-Python fallback; pyarrow ≥ 24 `arrow.uuid` extension arrays are coerced to canonical strings, and UUID columns under pyarrow < 24 also take the fast path.
- Custom metrics emitted even when a resource produces no data (#3877 @rudolfix) — `dlt.current.resource_metrics()` counters are no longer dropped when every item is filtered out.
- TypedDict validator honors `NotRequired[T]` (#3877 @rudolfix) — Via `__required_keys__`.
- sqlglot `"dremio"` dialect literal (#3877 @rudolfix) — Added to `TSqlGlotDialect`.
- Opt-in naming-convention check in `Schema.unify_schemas()` (#3898 @burnash) — The naming-convention check in `Schema.unify_schemas()` is now opt-in; also drops the max_length tests workaround.
- Fix(clickhouse): correct metadata sort keys to avoid full scans (#3851 @anuunchin) — Resolves #3806 by correcting metadata sort keys so ClickHouse no longer performs full scans.
- Fix(pyarrow): preserve string encoding for UUID columns under pyarrow 24+ (#3894 @burnash) — Fixes #3893 so UUID columns keep their string encoding under pyarrow 24+.
- Fix: clearer error from `dlt.attach()` when pipeline cannot be restored (#3890 @bjoaquinc) — Rewrites `CannotRestorePipelineException` messages to name required inputs, show a concrete `dlt.attach(...)` example, and offer `dlt.dataset()` as a lighter alternative; suppresses a redundant inner exception in tracebacks.
- Fix: closed-pipe race in `iter_std` (#3877 @rudolfix) — Reader threads swallow `ValueError`/`OSError` and always close the queue.
- Fix: active profile shown in `ConfigFieldMissingException` (#3877 @rudolfix)
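The `NotRequired[T]` entry above relies on a standard Python feature: a `TypedDict` class exposes `__required_keys__` and `__optional_keys__`, and keys annotated with `NotRequired` land in the optional set. The sketch below shows that mechanism only. `Movie` and `missing_required_keys` are hypothetical names for illustration and are not part of dlt's validator.

```python
try:
    from typing import NotRequired, TypedDict  # available in Python 3.11+
except ImportError:  # older interpreters need the typing_extensions backport
    from typing_extensions import NotRequired, TypedDict


class Movie(TypedDict):
    title: str
    year: NotRequired[int]  # this key may be omitted


def missing_required_keys(data: dict, spec: type) -> set:
    """Return the required keys of `spec` that are absent from `data`."""
    # Hypothetical helper: only checks key presence, not value types.
    return set(spec.__required_keys__) - set(data)


ok = missing_required_keys({"title": "Alien"}, Movie)
bad = missing_required_keys({"year": 1979}, Movie)
```

A validator that consults `__required_keys__` accepts the first dictionary (the `year` key is optional) and reports `title` as missing for the second.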
Docs
- Improved dashboard docs (#3484 @VioletM)
- Updated documentation for pg_replication (#3900 @dat-a-man) — Adds an "Alternative: CDC with Debezium" section linking to the Debezium + dlt demo.
- Snowflake marketplace listing link (#3888 @kaliole)
- Quality-checker fixes to `pipeline.md` (#3885 @ShreyasGS) — Tense, contractions, and grammar cleanups via Harper + Vale Google Developer Docs style.
- Remove playground page and associated CI (#3883 @zilto) — Streamlines the "Getting Started" section; interactive material remains on Google Colab and marimo molab.
- Pydantic Logfire export example (#3882 @zilto)
New Contributors
- @bjoaquinc made their first contribution in #3890
1.25.0
dlt 1.25.0 Release Notes
Breaking Changes
- Multischema datasets (#3770 @burnash) — Datasets can now hold multiple schemas. The main benefit is being able to see tables from all sources in multi-source pipelines. This is the new default behavior. Users can pass a list of schemas to the `dataset()` method, and can still go back to a single-schema dataset by providing `pipeline.default_schema` when creating the dataset.
Highlights
- `lance` destination (#3810 @jorritsandbrink) — New destination for the Lance table format with optional vector embedding generation via `lancedb`. Supports local storage and `s3`/`az`/`gs`, uses the Lance Directory Namespace V2 spec, and supports branching. Complements the existing `lancedb` destination (which targets LanceDB Cloud).
- Multischema datasets (#3770 @burnash) — See Breaking Changes above. Enables sidecar schemas (e.g. data-quality quarantine tables) to live alongside the primary schema in a single dataset.
- Improved progress and load metrics (#3768 @rudolfix) — Load metrics now persist across restarts, normalizer metrics are updated via update files, and the follow-up job graph is saved into the trace. Closes the long-standing #853.
Core Library
- `lance` destination (#3810 @jorritsandbrink) — See Highlights.
- Multischema datasets (#3770 @burnash) — See Highlights.
- Improved progress and load metrics (#3768 @rudolfix) — See Highlights.
- `ducklake`: `metadata_schema` ATTACH option (#3763 @sangwookWoo) — Adds `metadata_schema` to `DuckLakeCredentials` so the DuckLake metadata schema can be configured independently from `ducklake_name`.
- Fix: preserve credential chain in AWS credentials (#3798 @rudolfix) — Default credential mixing applied correctly, STS scoped to Databricks only. Closes #3115.
- Fix: replay state transitions after crash (#3767 @rudolfix) — Writes a pending state-transition marker right after the DB commit so an interrupted load no longer leaves the load package in an inconsistent state.
- Fix: create all eligible tables on staging dataset (#3765 @rudolfix) — Closes #2862.
- Fix: normalize pool workers skip `__main__` in orchestrators (#3784 @rudolfix) — Closes #3586.
- Fix(clickhouse): lightweight DELETE for single-table merge (#3783 @rudolfix) — Removes the `_dlt_id` requirement when merging arrow tables without nested tables on ClickHouse.
- Fix(clickhouse): pass `aws_session_token` to staging `s3()` table function (#3769 @anuunchin) — Temporary AWS credentials now work for ClickHouse staging.
- Fix: avoid leaking PUA markers in nested fields (#3760 @serl) — Fixes Pydantic nested-model PUA-marker leak. Closes #3755.
- Fix: deepcopy paginator in child resource (#3779 @anuunchin) — Prevents paginator state corruption across child-resource invocations. Closes #3772.
- Fix: honor explicit non-utf8 encoding in filesystem `read_csv` (#3743 @biefan) — File is opened with the requested encoding so SFTP/paramiko stacks no longer pre-decode as UTF-8.
- Fix: don't filter out trace steps with exceptions (#3843 @anuunchin) — `trace.asdict()` now retains pipelines that fail in the sync step before extract.
- Fix: check duckdb version when installing lance extension (#3773 @zilto) — Handles the `lance` extension promotion to built-in in duckdb 1.5.
- Fix: transient Windows file-lock `PermissionError` in `rename_tree` (#3853 @burnash) — Resolves intermittent Windows CI failures during normalize→loaded rename.
- Fix: deprecation warnings across supported package versions (#3831 @anuunchin) — Closes #3785, #3807, #3787, #3794.
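The paginator fix in the list above (#3779) addresses a general hazard: a stateful object shared across invocations carries its position from one call into the next. The following is a minimal sketch of that hazard and of the deepcopy remedy. `PagePaginator` and `fetch_pages` are hypothetical stand-ins, not dlt's paginator classes.

```python
import copy


class PagePaginator:
    """Hypothetical stateful paginator, used only for illustration."""

    def __init__(self, total_pages: int) -> None:
        self.total_pages = total_pages
        self.page = 1

    def has_next(self) -> bool:
        return self.page <= self.total_pages

    def advance(self) -> int:
        current = self.page
        self.page += 1
        return current


def fetch_pages(paginator: PagePaginator) -> list:
    """Drain the paginator and return the page numbers it produced."""
    pages = []
    while paginator.has_next():
        pages.append(paginator.advance())
    return pages


# Sharing one instance: the second parent item sees an exhausted paginator.
template = PagePaginator(total_pages=3)
shared = [fetch_pages(template) for _parent in ("a", "b")]

# Copying per invocation: every parent item starts again from page 1.
template = PagePaginator(total_pages=3)
copied = [fetch_pages(copy.deepcopy(template)) for _parent in ("a", "b")]
```

With the shared instance the second invocation returns no pages at all; with a deep copy per invocation both return pages 1 to 3.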
Docs
- Cookbook section (#3860 @zilto) — Tested examples moved to a dedicated top-level tab; dlt tab added for navigation back; UI cleanups.
- Same-domain docs button (#3859 @zilto) — Avoids full page reload when navigating.
- Explore-and-transform page (#3782 @hibajamal) — New page covering data-exploration and transformations workbench toolkits.
- Expand handover-to-other-toolkits section (#3737 @njaltran) — Expands data-exploration toolkit coverage in `llm-native-workflow.md`.
- Add EAI instructions (#3803 @kaliole)
- Update name to dlt Connector App (#3857 @kaliole) — Snowflake Native App docs renamed.
- Update source count to 8,000+ (#3830 @Pawansingh3889) — Closes #3761.
- Rename dltHub Basic tier to dltHub Pro (#3795 @elviskahoro)
- Fix outdated `hf login` command (#3781 @julien-c)
Chores
- Move `mypy` configs to `pyproject.toml` (#3780 @zilto) — Partially resolves #3346.
- Remove Python 3.9 from CI matrices (#3777 @zilto) — Python 3.9 reached EOL in October 2025. Resolves #3587, #3619.
- Increase Playwright timeout in e2e dashboard test (#3848 @burnash) — Matches the 15s timeout used elsewhere; reduces Windows CI flakiness.
- Silence Airflow 3.2 smoke-test log noise (#3835 @burnash) — Fixes #3834.
New Contributors
1.24.0
dlt 1.24.0 Release Notes
Breaking Changes
- Custom resource metrics now stored as tables (#3718 @rudolfix) — Incremental metrics in the trace are now represented in table format. This changes the location and structure of incremental metrics in the trace object.
Highlights
- Insert-only merge strategy (#3741 @rudolfix, based on #3372 by @OnAzart) — New `insert-only` merge strategy that performs idempotent, key-based appending: inserts records whose primary key doesn't exist in the destination while silently skipping duplicates. No updates or deletes. Supported across all SQL destinations, Delta Lake, and Iceberg.
- Parallelize all sources in Airflow (#3652 @JustinSobayo) — In `parallel` and `parallel-isolated` decompose modes, all source components now fan out concurrently from a shared start node. Previously the first source had to complete before others could begin, adding unnecessary wall-clock time. This release also adds basic Airflow 3 support with smoke tests.
- ClickHouse ReplacingMergeTree support (#3366 @prevostc) — New `replacing_merge_tree` table engine type for ClickHouse that enables native deduplication and soft deletes via `dedup_sort` and `hard_delete` column hints.
- Custom resource metrics as tables (#3718 @rudolfix) — Resources can now emit custom metrics that are stored as tables in the trace, enabling richer observability for pipelines.
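The semantics of the insert-only strategy described above (append rows with unseen primary keys, skip the rest, never update or delete) can be modeled in a few lines of plain Python. This is an in-memory sketch of the behavior only, under the assumption of a single-column key. `insert_only_merge` is a hypothetical function and says nothing about how dlt implements the strategy in SQL, Delta Lake, or Iceberg.

```python
def insert_only_merge(destination: list, incoming: list, primary_key: str) -> int:
    """Append rows whose key is new to `destination`; return the insert count."""
    seen = {row[primary_key] for row in destination}
    inserted = 0
    for row in incoming:
        key = row[primary_key]
        if key in seen:
            continue  # duplicate key: skipped silently, existing row untouched
        destination.append(row)
        seen.add(key)
        inserted += 1
    return inserted


table = [{"id": 1, "name": "a"}]
batch = [{"id": 1, "name": "changed"}, {"id": 2, "name": "b"}]

first = insert_only_merge(table, batch, "id")
second = insert_only_merge(table, batch, "id")  # replaying the same batch
```

Replaying the batch inserts nothing the second time, which is the idempotence the release note refers to, and the existing row with `id` 1 keeps its original `name`.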
Core Library
- Insert-only merge strategy (#3741 @rudolfix, based on #3372 by @OnAzart) — See Highlights.
- ClickHouse ReplacingMergeTree support (#3366 @prevostc) — See Highlights.
- Parallelize all sources in Airflow (#3652 @JustinSobayo) — See Highlights.
- Custom resource metrics as tables (#3718 @rudolfix) — See Highlights.
- Configurable Arrow table concatenation promote_options (#3701 @AyushPatel101) — `arrow_concat_promote_options` can now be set to `"default"` or `"permissive"` instead of the hardcoded `"none"`, enabling automatic type promotion when yielding multiple Arrow tables with slightly different inferred types.
- Fix: CLI info/show fails on custom destinations (#3676 @anuunchin) — `dlt pipeline info/show` no longer crashes with `UnknownDestinationModule` on pipelines using `@dlt.destination`.
- Fix: Primary key assignment for incremental resources (#3679 @shnhdan) — Passing `primary_key=()` to `Incremental` to disable deduplication is no longer silently overwritten by the resource's own primary key.
- Fix: MotherDuck missing catalog validation (#3723 @YuF-9468) — Connection strings that omit the catalog/database name (e.g. bare `md:`) now raise a clear configuration error instead of a confusing connection failure.
- Fix: BigQuery infinite loop on internal error (#3732 @aditypan) — BigQuery jobs that encounter an internal error no longer cause an infinite retry loop.
- Fix: SCD2 column order mismatch in SQLAlchemy destinations (#3733 @anuunchin) — SCD2 validity column insert jobs now match the column order of existing tables in SQLAlchemy destinations.
- Fix: Timezone mapping in SQL timestamp datatype (#3735 @aditypan) — Timezone is now correctly set for timestamp/datetime column datatypes.
Docs
- Realistic closure-based data masking example (#3617 @veeceey) — Replaced the hardcoded example with a reusable `mask_columns()` function supporting all `sql_database` backends.
- Redirects for removed pages (#3688 @djudjuu)
- AI workbench license info (#3729 @lis365b)
- Minor doc fixes (#3734 @anuunchin)
Chores
- Bumps npm docs deps (#3728 @rudolfix)
- Switch lancedb example from Spotify to PodcastIndex (#3736 @Travior)
- Adds CLI docs check to docs CI workflow (#3739 @rudolfix)
- Moves render CLI docs command to a separate tool (#3740 @rudolfix)
New Contributors
1.23.0
Breaking Changes
- Streamlit dashboard removed (#3674 @rudolfix) — The legacy Streamlit-based pipeline dashboard (`dlt pipeline show`) has been removed. It had been dead code for a long time.
- New `sources.<name>.<key>` configuration lookup path (#3626 @rudolfix) — Source configuration now supports a compact layout. When a source's section name differs from its resource/source name, dlt now also looks up `sources.<name>.<key>` in addition to the full `sources.<section>.<name>.<key>` path. For example, for a source registered under section `chess_com` with name `chess`:

  ```toml
  # Before (still works): fully qualified path
  [sources.chess_com.chess]
  api_key = "secret"

  # New (also works now): compact path using just the source name
  [sources.chess]
  api_key = "secret"

  # Credentials follow the same pattern:
  # Full: sources.chess_com.chess.credentials.api_key
  # Compact: sources.chess.credentials.api_key
  ```

  This is breaking if you previously had values at `sources.<name>` that were unrelated to this source — they will now be resolved where they were previously ignored.
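The two lookup paths described above can be illustrated with a small resolver over nested dictionaries. This is a sketch, not dlt's configuration resolver. `lookup` and `resolve_source_key` are hypothetical helpers, and the order used here (full path first, compact path as fallback) is an assumption that the release note does not confirm.

```python
from typing import Any, Optional


def lookup(config: dict, path: str) -> Optional[Any]:
    """Walk a dotted path through nested dicts; return None if a part is missing."""
    node: Any = config
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def resolve_source_key(config: dict, section: str, name: str, key: str) -> Optional[Any]:
    """Try the full path, then the compact one (order is an assumption)."""
    candidates = (
        f"sources.{section}.{name}.{key}",  # full: sources.<section>.<name>.<key>
        f"sources.{name}.{key}",            # compact: sources.<name>.<key>
    )
    for path in candidates:
        value = lookup(config, path)
        if value is not None:
            return value
    return None


compact = {"sources": {"chess": {"api_key": "secret"}}}
full = {"sources": {"chess_com": {"chess": {"api_key": "full-secret"}}}}

from_compact = resolve_source_key(compact, "chess_com", "chess", "api_key")
from_full = resolve_source_key(full, "chess_com", "chess", "api_key")
```

The sketch also shows why the change can be breaking: a value stored under `sources.chess` for some unrelated purpose would now be returned for the `chess` source.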
Highlights
- AI Workbench (#3674 @rudolfix) — New `dlt ai` CLI command group that turns dlt workspaces into AI-assisted development environments. Includes a toolkit system for installing curated skill/rule bundles, a pluggable MCP server architecture with composable features (pipeline, workspace, toolkit, secrets), and multi-agent support (Claude Code, Cursor, Codex).
- Relational normalizer optimization (#3626 @rudolfix) — Major performance improvements to JSON data normalization and schema evolution: 5x faster on flat data, ~2x on nested REST API data, ~1.8x on wide nested data. ISO timestamp parsing improved 2-3x by removing timezone conversions.
- Iceberg table properties (#3699 @rudolfix) — Adds support for setting Iceberg table and namespace properties via the `adapter` and configuration.
Core Library
- Fetch Databricks compute credentials (#3667 @aditypan) — Automatically fetches credentials from Databricks shared/job compute when running dlt in a notebook, fixing the issue of defaulting to SQL warehouse connections.
- Add `override_data_path` option to DuckLake ATTACH (#3709 @udus122) — New `override_data_path` configuration option that appends `OVERRIDE_DATA_PATH true` to the ATTACH statement, allowing the current `DATA_PATH` to override the path stored in catalog metadata.
- Add missing parameters in Paginator Configs (#3658 @aditypan) — Adds missing parameters to `PageNumberPaginatorConfig`, `OffsetPaginatorConfig`, and `JSONResponseCursorPaginatorConfig`.
- Fix: path traversal in FileStorage (CWE-22) (#3678 @rudolfix) — Replaced `os.path.commonprefix()` with `os.path.commonpath()` in `FileStorage.is_path_in_storage()` to correctly validate path containment using path segments instead of characters.
- Fix: monotonic wall clock (#3695 @rudolfix) — Improves elapsed time calculation across several places, ensuring load IDs are always monotonic even on systems with clock jitter.
- Fix: threading issues causing potential locking (#3698 @rudolfix) — Fixes async pool shutdown in extract (now closed with timeout) and corrects synchronization sections in various tests.
- Fix: dev mode survives attach and reset (#3662 @rudolfix) — Saves `dev_mode` flag in pipeline local state so it persists across `dlt.attach()` calls. Detects dev→non-dev transitions and resets working folder cleanly.
- Fix: respect custom Hugging Face endpoint for dataset card operations (#3696 @jorritsandbrink) — Fixes custom endpoint support broken by subset/dataset card feature by temporarily setting `HF_ENDPOINT` env var for card operations.
- Fix: explicit dataset name should be authoritative (#3700 @anuunchin) — Makes the dataset argument passed to the pipeline authoritative, always setting pipeline dataset when restoring state.
- Fix: `start_out_of_range` flag with `range_start="open"` (#3708 @AyushPatel101) — Correctly sets `start_out_of_range=True` when a row's cursor value equals `start_value` with `range_start="open"`, fixing delayed `can_close()` in descending-order pipelines.
- Fix: LanceDB SQL view creation with `dataset_name=None` (#3710 @Travior) — Handles the case where `dataset_name` is `None` in `LanceDBSqlClient.create_view`, preventing `None` prefix in view names.
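The path traversal fix in the list above (#3678) comes down to the difference between comparing characters and comparing path segments. The sketch below contrasts the two standard-library calls. The two helper functions are hypothetical and are not dlt's `FileStorage.is_path_in_storage()`. `posixpath` is used instead of `os.path` only so the result is the same on every platform.

```python
import posixpath


def in_storage_by_prefix(storage: str, path: str) -> bool:
    # Character-wise comparison: a sibling directory that merely starts
    # with the same characters is wrongly treated as contained.
    return posixpath.commonprefix([storage, path]) == storage


def in_storage_by_path(storage: str, path: str) -> bool:
    # Segment-wise comparison: the shared ancestor must be the storage root.
    return posixpath.commonpath([storage, path]) == storage


storage = "/data/storage"
sibling = "/data/storage_evil/secret.txt"
inside = "/data/storage/pkg/file.txt"

prefix_says_sibling_inside = in_storage_by_prefix(storage, sibling)
path_says_sibling_inside = in_storage_by_path(storage, sibling)
path_says_inside = in_storage_by_path(storage, inside)
```

The character-wise check accepts the sibling directory `/data/storage_evil`, while the segment-wise check rejects it and still accepts the file that is really inside the storage root. A production check would also normalize both paths (resolving `..` and symlinks) before comparing them.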
Docs
New Contributors
1.22.2
Highlights
- Hugging Face filesystem destination (#3669 @jorritsandbrink) — Adds `hf` protocol support to the `filesystem` destination, enabling direct loading to Hugging Face datasets. Closes #1227.
- Composable marimo widgets (#3613 @zilto) — Introduces composable widgets built with marimo for interactive pipeline exploration. Widgets can accept inputs and produce outputs, building on earlier read-only widgets with updated schema viewer, load package viewer, and a new pipeline selector.
Core Library
- Hugging Face subsets (#3689 @jorritsandbrink) — Adds dataset cards with metadata to configure a subset for each table, enabling the Hugging Face dataset viewer to display tables properly.
- Hugging Face filesystem destination (#3669 @jorritsandbrink) — See Highlights.
- Composable marimo widgets (#3613 @zilto) — See Highlights.
- Dashboard UX improvements (#3675 @sh-rp) — Collapsed sections show title and subtitle on a single line to reduce vertical space, shortened long subtitles, and improved layout for narrow viewports.
Docs
- Hugging Face destination documentation (#3687 @AstrakhantsevaAA) — Rewrote HF destination docs and moved content to a dedicated page.
- Remove outdated Motherduck troubleshooting (#3683 @elviskahoro) — Removed read-only database troubleshooting section for deprecated DuckDB versions.
- Update DuckLake docs for v1.4 (#3682 @elviskahoro) — Updated DuckLake documentation to reflect Motherduck as catalog database and corrected catalog URI format.
New Contributors
- @elviskahoro made their first contribution in #3682
1.22.1
dlt 1.22.1 Release Notes
Core Library
- feat(workspace): add default exclude patterns for file selector (#3661 @canassa) — `WorkspaceFileSelector` now ships with `DEFAULT_EXCLUDES` (`.git/`, `.venv/`, `__pycache__/`, `node_modules/`, etc.) so well-known non-deployable paths are always excluded, even without a `.gitignore`.
- feat(workspace): add `ignore_file_found` attribute to WorkspaceFileSelector (#3663 @canassa) — Consumers can now check whether the configured ignore file (e.g. `.gitignore`) was actually found.
- Dashboard cleanup and refactor (#3660 @sh-rp) — Broke up monolithic `utils.py` and `dlt_dashboard.py` into focused modules with simplified UI across all sections.
- Sets default MCP transport to http-stream (#3624 @rudolfix) — Swaps `sse` for `http-stream` transport for built-in MCP servers and annotates pipeline trace schema.
- Fixes data inspection tools (#3664 @rudolfix) — Allows incomplete columns in schema converters, attaches pipeline in every command, adds new allowed layout for `sources.<name>.api_key`.
- Fix: Mermaid doesn't handle incomplete columns (#3659 @anuunchin) — `.to_mermaid()` now handles columns missing the `data_type` field instead of crashing.
- Fix: ClickHouse makes reads sequential by default (#3651 @rudolfix) — Enforces `select_sequential_consistency` to fix flaky tests caused by ClickHouse's eventual consistency model.
- Fix: data quality checks component (#3647 @zilto) — Fixes silently broken data quality checks code caused by upstream `dlthub` changes.
- Fix: autouse test storage dir not empty (#3648 @tetelio) — Fixes intermittent CI failure in read-only file deletion by aligning with `shutil` docs.
- Fix: dashboard tests (#3672 @sh-rp) — Fixes a few broken tests in the dashboard.
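The default exclude patterns mentioned in the list above (#3661) amount to filtering out any file that lives under a well-known non-deployable directory. The sketch below is a simplified model of that idea, assuming plain directory-name matching. `EXAMPLE_EXCLUDES`, `is_excluded`, and `select_files` are hypothetical names, the set holds only the four entries named in the release note, and the real `WorkspaceFileSelector` may match patterns differently.

```python
from pathlib import PurePosixPath

# Illustrative subset: the release note names these four and adds "etc."
EXAMPLE_EXCLUDES = frozenset({".git", ".venv", "__pycache__", "node_modules"})


def is_excluded(path: str, excludes: frozenset = EXAMPLE_EXCLUDES) -> bool:
    """True if any directory component of `path` is an excluded name."""
    return any(part in excludes for part in PurePosixPath(path).parent.parts)


def select_files(paths: list) -> list:
    """Keep only the paths that are not under an excluded directory."""
    return [p for p in paths if not is_excluded(p)]


files = [
    "pipeline.py",
    ".git/config",
    "src/__pycache__/mod.pyc",
    "src/mod.py",
    "node_modules/x/index.js",
]
selected = select_files(files)
```

Because the exclusion list is built in, the filter applies even when no `.gitignore` exists in the workspace, which is the behavior the release note describes.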
Docs
- Run streamlit/MCPs in runtime (#3510 @tetelio) — Adds documentation for running MCP servers and Streamlit apps in the runtime.
- Release highlights 1.18 & 1.19 (#3654 @AstrakhantsevaAA)
- Add `dlthub` metrics section; update checks (#3641 @zilto)
Chores
- Apply all `docs/` linting in one `make` command (#3666 @anuunchin) — Introduces an overarching lint target in the docs Makefile. Resolves #3642.
- Tests newest dbt on dbt runner, enables fabric (#3656 @rudolfix)
- Adjust scaffold api and vibe source tests (#3649 @djudjuu) — Tests no longer expect `source.md` file.
- Add install command for `make test-load-local-p` (#3645 @tetelio) — Convenience make install target for local load tests on duckdb and filesystem.
- Remove license autofixture, add selective license application (#3646 @rudolfix)
- Setup Claude and Continue agents (#3622 @rudolfix) — Adds comprehensive AI assistant configuration for Claude Code and Continue IDE.
New Contributors
1.22.0
Breaking Changes
- Pydantic v1 support removed (#3572 @anuunchin) — All Pydantic v1 compatibility code has been removed. The codebase now requires Pydantic v2 only.
- `data_type` contract semantic change (#3572 @anuunchin @rudolfix) — The `data_type` contract now applies to the full data type (i.e. precision, nullability), not only to variant columns (data type change). Users with `data_type: freeze` who relied on changing `nullable`/`precision`/`scale` on existing columns will now be blocked.
- `merge_columns` now removes compound properties (#3431 @anuunchin) — Previously `merge_columns` was purely additive, which caused compound properties like `merge_key` to be incorrectly replaced rather than properly merged. The function now correctly removes compound properties that should be removed.
Highlights
- Pydantic data validation overhaul (#3572 @anuunchin @rudolfix) — Major rework of Pydantic support: discriminated union `RootModel` types (validation of event streams with various event types), schema contracts properly separate resource-defined vs data-derived hints, Pydantic model columns bypass contract checks when authoritative. Supports Pydantic models on arrow and model items with full schema contract enforcement. Prepares for Pydantic v3.
- Snowflake atomic table swap for replace (#3540 @Travior) — Uses `ALTER TABLE ... SWAP` for `staging-optimized` replace strategy on Snowflake, eliminating table downtime during data replacement.
- Custom backends for `sql_database` (#3595 @rudolfix) — Register custom `TableLoader` implementations as named backends. ConnectorX backend ported as PoC; ADBC and paginated loader implemented as test cases.
- SQLAlchemy destination dialect customization (#3600 @rudolfix) — Customize type mapping, adjust SQLAlchemy table schemas before creation, and override destination capabilities per-dialect.
- `llms.txt` and Markdown docs generation (#3635 @rudolfix) — Generates `llms.txt` index and Markdown versions of docs pages with a "View Markdown" navigation option, making the docs LLM-friendly.
Core Library
- `rest_api`: parallelized dependent resources (#3574 @Shadesfear) — Add `parallelized` flag to dependent resources (transformers) so child resource fetches run concurrently.
- `dlt.Relation`: filter by `load_id` (#3547 @zilto) — Filter dataset relations by load ID (experimental).
- `dlt.Relation`: flatten logic and improve typing (#3578 @zilto) — Remove dynamic methods; explicit return types for `.df()`, `.arrow()`, etc.
- Source preprocessors on `SourceFactory` (#3636 @rudolfix) — Add preprocessor hooks to `dlt.source` factory for modifying source instances.
- `engine_kwargs` for `sql_database`/`sql_table` sources (#3414 @tetelio) — Pass SQLAlchemy engine arguments directly to `create_engine()` for sources.
- DECFLOAT support for Snowflake (#3513 @ivasio) — Properly handles `DECFLOAT` columns via the SQLAlchemy backend.
- Athena `query_result_bucket` now optional (#3566 @arel) — Omit or set to `None` when using Athena's managed results bucket.
- ClickHouse `extra_credentials` for S3 (#2888 @warje) — Adds `extra_credentials` config for role-based S3 authentication.
- Fix: Snowflake sort column escaping (#3594 @rudolfix)
- Fix: BigQuery partition clause on ALTER TABLE (#3571 @kien-truong)
- Fix: Redshift schema existence check (#3570 @timH6502)
- Fix: `_dlt_load_id` written as dict on MSSQL + ADBC (#3584 @rudolfix)
- Fix: ClickHouse `CREATE OR REPLACE` for merge temp tables (#3589 @rudolfix)
- Fix: `read_csv_duckdb` respects `filename=True` (#3606 @karlanka)
- Fix: column order mismatch in `sql_database` (#3638 @rudolfix)
- Fix: consistent UUID handling as strings (#3599 @rudolfix)
- Fix: managed SQLAlchemy engine ref counting (#3601 @rudolfix)
- Fix: suppress psutil warning during `dlt init` (#3615 @rudolfix)
- Fix: query lifecycle cleanup (#3627 @rudolfix)
- Fix: Pydantic model synthesis bugs (#3605 @rudolfix)
- Detect AI agent execution context (#3628 @rudolfix)
- Upgrade `ibis-framework`, remove `sqlglot` constraint (#3621 @Travior)
- Vibe sources: use new scaffold API (#3512 @djudjuu)
- Update GitHub API pipeline template (#3603 @ShreyasGS)
Docs
- Oracle usage caveats documented (#3561 @ivasio)
- Fabric destination added to sidebars (#3555 @dat-a-man)
- Database connector app links added (#3552 @kaliole)
- Typo fixes (#3556 @michelzurkirchen, #3616 @thecaptain789)
- READMEs updated (#3596 @rudolfix)
- dltHub tiers fixed (#3564 @VioletM)
Chores
- Parallelized test execution (#3479 @tetelio) — Major CI speedup via `pytest-xdist` with fully isolated workers.
- Self-hosted PokeAPI for tests (#3488 @anuunchin)
- Parallel testing documentation (#3577 @tetelio)
- Double-encode deploy key to avoid GitHub bans (#3501 @tetelio)
- Exclude zendesk tests from CI (#3580 @zilto)
- Fix test parallelism race conditions (#3593 @tetelio)
- Restore sqlglot version constraints (#3575 @tetelio, #3579 @zilto)
- Ignore `.zed/` directory (#3633 @zilto)
New Contributors
- @Shadesfear made their first contribution in #3574
- @ShreyasGS made their first contribution in #3603
- @arel made their first contribution in #3566
- @karlanka made their first contribution in #3606
- @kien-truong made their first contribution in #3571
- @thecaptain789 made their first contribution in #3616
1.21.0
This release adds several interesting improvements and many bugfixes. The LanceDB destination now uses a duckdb extension to let you query lance tables with SQL, ibis or sqlglot via our standard `.dataset()` interface. We introduced several Iceberg-related improvements (catalog support, S3 Tables for Athena, advanced partitioning). There's also a new `fabric` destination and additional options in `clickhouse_adapter`. Finally, we have a test environment for Oracle and we started to fix Oracle-related bugs.
Core Library
- feat(lancedb): implement `SqlClientBase` - query lance tables with ibis, sqlglot or raw SQL by @zilto in #3527
- feat: S3 Tables support for `athena` destination by @jorritsandbrink in #3434
- feat: `fabric` destination (by @mattiasthalen) by @jorritsandbrink in #3535
- set iceberg advanced partitioning options by @rakesh-tmdc in #3053
- allow to use any catalog in filesystem+iceberg destination by @lfagliano in #3325
- feat: Snowflake-provided OAuth token auth by @jorritsandbrink in #3322
- Add column description support for BigQuery destination by @Travior in #3483
- feat: `clickhouse_adapter` extensions by @jorritsandbrink in #3511
- 3480 chore update weaviate destination to use v4 client by @djudjuu in #3481
- add support for gcp adc with iceberg by @benjibuiltit in #3460
- 3312: feat(snowflake) - add column_comment/description hint support by @somasays in #3462
- Fix-3292: Load retrieval to only show loads that contain a schema name that is in the pipeline.schema_names by @alkaline-0 in #3446
- feat: Support OAuth and base GCP credentials for BigQuery destination by @daniel-nagish in #3382
- helper to filter dlt-changes out of schema updates by @djudjuu in #2968
Bugfixes
- pyarrow: respect resource hints before extract by @djudjuu in #3436
- Fix: 3490 better error message in schema contract application by @anuunchin in #3498
- Fix state file being ignored when pipline_name includes FILENAME_SEPA… by @Travior in #3448
- Fix/3464 sync error results in success label (workspace dashboard) by @anuunchin in #3492
- Fix/3376 load state changes in load package that changed it by @anuunchin in #3521
- 3353 normalize start method spawn seems to ignore environment variables by @djudjuu in #3463
- Fix: Connectorx arrow_stream timestamp conversion issue by @louiewhw and @anuunchin in #3528
- fix/3141 - process Oracle "table not found" exception by @ivasio in #3509
- fix: 3514 mermaid reference label by @zilto in #3515
- Fix cluster hint overriding partition hint on bigquery by @Travior in #3497
- Fix: Special handling of numeric type for oracle by @ivasio in #3144
- Fix/3159 pydantic model incorrect serialization by @tetelio in #3421
Chores
- fix(typing): add None to Container._Instance typing by @zilto in #3469
- Fix dashboard e2e tests by @burnash in #3453
- Chore: poke tests disabled by @anuunchin in #3487
- Feat/3362 - add oracle testing setup - able to test oracle-specific edge cases now by @ivasio in #3496
- chore/Omit git private key tests by @tetelio in #3502
- adds checks for plugins version in hub extra by @rudolfix in #3474
- do progress=log in vibe source template by @djudjuu in #3543
Docs
- Fix: 3467 PipelineTasksGroup Docstring fix by @anuunchin in #3482
- Docs: Converting Jupyter notebooks in education to marimo notebooks by @anuunchin in #3068
- Docs/data quality by @adrianbr in #3466
- docs: fix typo in dlt fundamentals lesson 8 by @dmschauer in #3508
- docs: fix typos on parquet page by @dmschauer in #3506
- docs/snowflake-native-app-how-to by @kaliole in #3455
- change "remove" to "keep" in Advanced Lesson 1 by @AstrakhantsevaAA in #3522
- Docs/explicit note for row hash calculation with non deterministic API responses by @anuunchin in #3524
- docs: update lancedb documentation links by @ccmao1130 in #3542
- Added docs for chunksize and use_pyarrow for parquet reader by @rishabh1815769 in #3526
New Contributors
- @rakesh-tmdc made their first contribution in #3053
- @somasays made their first contribution in #3462
- @daniel-nagish made their first contribution in #3382
- @tetelio made their first contribution in #3502
- @dmschauer made their first contribution in #3508
- @Travior made their first contribution in #3448
- @lfagliano made their first contribution in #3325
- @ccmao1130 made their first contribution in #3542
- @benjibuiltit made their first contribution in #3460
- @rishabh1815769 made their first contribution in #3526
Full Changelog: 1.20.0...1.21.0
1.20.0
Core Library
- feat: implement ConfigurationFileSelector by @ivasio in #3418
- Fix: reset config in PluggableRunContext.reload_providers by @ivasio in #3409
- add runtime CLI configs in WorkspaceRuntimeConfiguration by @ivasio in #3424
- implements run artifacts sync to a bucket using `filesystem` by @ivasio in #3339
- Fix: extensive .gitignore for dlt init by @anuunchin in #3437
- Fix: Invisible sections are receiving border and background color in dashboard by @anuunchin in #3439
- implements cancellation of normalize jobs by @rudolfix in #3444
- information on pending and partially loaded packages when pipeline fails by @rudolfix in #3444
- Fix race condition in LimitItem by @burnash in #3442
- Add offset/limit body_path fields to OffsetPaginatorConfig by @kinghuang in #3260
- [fix/3358] add pagination stopping to `JSONResponseCursorPaginator` by @segetsy in #3374
- (feat) small dashboard improvements by @rudolfix in #3450
Chores
Docs
New Contributors
- @kinghuang made their first contribution in #3260
- @segetsy made their first contribution in #3374
Full Changelog: 1.19.1...1.20.0