Merged
10 changes: 10 additions & 0 deletions docs/lakehouse/catalogs/iceberg-catalog.mdx
@@ -83,6 +83,16 @@ CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (

If set to `false`, Doris will check the type of each table one by one and only return Iceberg type tables. This mode will have poor performance when there are many tables.

- `iceberg.rest.view-enabled`

Supported since version 4.0.6.

Only effective for Iceberg REST Catalog. Whether to enable View-related operations (including `listViews`, `loadView`, `viewExists`, `dropView`, etc.). Default is `true`.

Some Iceberg REST Catalog implementations expose the Table APIs, but their View APIs are unavailable or return errors. Because Doris calls `ViewCatalog.listViews()` when executing `SHOW TABLES` to filter Views out of the Table list, such REST services can cause `SHOW TABLES` and table metadata loading to fail.

In this case, set this parameter to `false`, and Doris will skip all View-related operations, allowing metadata operations such as `SHOW TABLES` to work normally.

* `{CommonProperties}`

The CommonProperties section is for entering general properties. See the [Catalog Overview](../catalog-overview.md) for details on common properties.
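
As an illustration of the `iceberg.rest.view-enabled` property documented in this hunk, the following is a minimal sketch of a REST catalog definition that turns View operations off. The catalog name, database name, and endpoint are hypothetical placeholders, and the `iceberg.catalog.type` and `iceberg.rest.uri` keys are assumed from the wider Iceberg Catalog documentation rather than shown in this diff.

```sql
-- Hypothetical catalog name and REST endpoint; replace with your own.
-- 'iceberg.rest.view-enabled' = 'false' asks Doris to skip View-related calls
-- (listViews, loadView, viewExists, dropView, and so on).
CREATE CATALOG IF NOT EXISTS iceberg_rest_no_views PROPERTIES (
    'type' = 'iceberg',
    'iceberg.catalog.type' = 'rest',
    'iceberg.rest.uri' = 'http://rest-catalog.example.com:8181',
    'iceberg.rest.view-enabled' = 'false'
);

-- With View operations skipped, listing tables should no longer depend on the View APIs.
SWITCH iceberg_rest_no_views;
SHOW TABLES FROM demo_db;  -- demo_db is a hypothetical database name
```

Leaving the property at its default of `true` remains the natural choice when the REST service implements the View APIs.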
1 change: 1 addition & 0 deletions docusaurus.config.js
@@ -383,6 +383,7 @@ const config = {
searchResultLimits: 100,
searchContextByPaths: ['docs'],
useAllContextsWithNoSearchContext: false,
ignoreFiles: [/^docs\/(?:[^/]+\/)?key-features\//],
},
],
],
@@ -83,6 +83,16 @@ CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (

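This approach gives the best performance. If set to `false`, Doris checks the type of each table one by one and returns only Iceberg tables. This mode performs poorly when there are many tables.

- `iceberg.rest.view-enabled`

Supported since version 4.0.6.

Only effective for Iceberg REST Catalog. Whether to enable View-related operations (including `listViews`, `loadView`, `viewExists`, `dropView`, etc.). Default is `true`.

Some Iceberg REST Catalog implementations expose the Table APIs, but their View APIs are unavailable or return errors. Because Doris calls `ViewCatalog.listViews()` when executing `SHOW TABLES` to filter Views out of the Table list, such REST services can cause `SHOW TABLES` and table metadata loading to fail.

In this case, set this parameter to `false`. Doris then skips all View-related operations, so metadata operations such as `SHOW TABLES` work normally.

* `{CommonProperties}`

The CommonProperties section is for entering general properties. See the Common Properties section of the [Catalog Overview](../catalog-overview.md).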
@@ -1,4 +1,4 @@
---

Check warning on line 1 in i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
GitHub Actions / Build Check: seo-title-duplicate
Rendered SEO title is duplicated across indexable pages: "Iceberg Catalog - Apache Doris". Add a version, locale, or page-specific qualifier. Owner: @apache/doris-website-maintainers
{
"title": "Iceberg Catalog",
"language": "zh-CN",
@@ -55,7 +55,7 @@

* `dlf`: Uses Alibaba Cloud DLF as the metadata service.

* `s3tables`: Uses the AWS S3 Tables Catalog to access an [S3 Table Bucket](https://aws.amazon.com/s3/features/tables/).

Check notice on line 58 in i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
GitHub Actions / Build Check: link-external-report-only
External link is report-only and was not fetched: https://aws.amazon.com/s3/features/tables/. Owner: @apache/doris-website-maintainers

* `<warehouse>`

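@@ -83,6 +83,16 @@

This approach gives the best performance. If set to `false`, Doris checks the type of each table one by one and returns only Iceberg tables. This mode performs poorly when there are many tables.

- `iceberg.rest.view-enabled`

Supported since version 4.0.6.

Only effective for Iceberg REST Catalog. Whether to enable View-related operations (including `listViews`, `loadView`, `viewExists`, `dropView`, etc.). Default is `true`.

Some Iceberg REST Catalog implementations expose the Table APIs, but their View APIs are unavailable or return errors. Because Doris calls `ViewCatalog.listViews()` when executing `SHOW TABLES` to filter Views out of the Table list, such REST services can cause `SHOW TABLES` and table metadata loading to fail.

In this case, set this parameter to `false`. Doris then skips all View-related operations, so metadata operations such as `SHOW TABLES` work normally.

* `{CommonProperties}`

The CommonProperties section is for entering general properties. See the Common Properties section of the [Catalog Overview](../catalog-overview.md).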
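@@ -262,7 +272,7 @@
> Doris does not currently support `Timestamp` types with time zones. All `timestamp` and `timestamptz` values are mapped to `datetime(N)`. However, Doris handles the time zone correctly on read and write according to the actual source type. For example, after a time zone is specified with `SET time_zone=<tz>`, the read and write results of `timestamptz` columns are affected.
>
> The Extra column of the `DESCRIBE table_name` statement shows whether the source type carries time zone information. `WITH_TIMEZONE` indicates that the source type is time-zone-aware (supported since version 3.1.0).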
>

Check warning on line 275 in i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
GitHub Actions / Build Check: markdown-code-fence-language
Code fence should declare a language. Owner: @apache/doris-website-maintainers
> Since version 4.0.3, `timestamptz (Timestamp with timezone)` can be mapped to the Doris `timestamptz` type.
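
The note above mentions `SET time_zone=<tz>` and the Extra column of `DESCRIBE table_name`. A minimal sketch of how the two might be combined follows; the catalog, database, table, and column names are hypothetical, and `Asia/Shanghai` is only an example time zone.

```sql
-- Hypothetical catalog and database names.
SWITCH iceberg_catalog;
USE demo_db;

-- Look for WITH_TIMEZONE in the Extra column to see which columns come from timestamptz.
DESCRIBE events;

-- The session time zone affects how timestamptz source columns are read and written.
SET time_zone = 'Asia/Shanghai';
SELECT event_time FROM events LIMIT 10;  -- event_time is a hypothetical timestamptz column
```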

## Namespace Mapping
@@ -671,7 +681,7 @@
<TabItem value='S3' label='S3' default>
AWS Glue and the S3 storage service share the same credentials.

In non-EC2 environments, use [aws configure](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html) to configure the credentials, which also generates a credentials file in the ~/.aws directory.

Check notice on line 684 in i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
GitHub Actions / Build Check: link-external-report-only
External link is report-only and was not fetched: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html. Owner: @apache/doris-website-maintainers
```sql
CREATE CATALOG glue PROPERTIES (
'type' = 'iceberg',
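@@ -2182,7 +2192,7 @@

* Supports creating tables partitioned by one or more columns.

* Supports partition transform functions, which enable Iceberg hidden partitioning and partition evolution. For the specific Iceberg partition transform functions, see [Iceberg partition transforms](https://iceberg.apache.org/spec/#partition-transforms).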

Check notice on line 2195 in i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
GitHub Actions / Build Check: link-external-report-only
External link is report-only and was not fetched: https://iceberg.apache.org/spec/#partition-transforms. Owner: @apache/doris-website-maintainers

* `year(ts)` or `years(ts)`

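@@ -2762,7 +2772,7 @@
1. The `rewrite_data_files` operation reads data files and rewrites them, which incurs additional I/O and compute overhead. Allocate cluster resources accordingly.
2. Before running it, use the SQL in the [View data file distribution](#查看数据文件分布) section to assess whether a rewrite is needed.
3. The WHERE condition can restrict the partitions or data range to rewrite. It filters out files that contain no data matching the condition, which reduces the number of files and the amount of data rewritten.
4. Before running it, use the SQL in the [Rewrite file selection logic](#重写文件选择逻辑) section to work out which files will be rewritten.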

Check failure on line 2775 in i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
GitHub Actions / Build Check: link-missing-anchor
Anchor #重写文件选择逻辑 does not exist in i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx. Owner: @apache/doris-website-maintainers
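
To make the WHERE note in the list above concrete, here is a hedged sketch of a partition-scoped rewrite. The `ALTER TABLE ... EXECUTE` form, the `target-file-size-bytes` option, and all table and column names are assumptions for illustration and do not appear in this diff; check them against the full `rewrite_data_files` section before use.

```sql
-- Assumed syntax and hypothetical names: rewrite only files that contain rows
-- for a single day, so unrelated data files are left untouched.
ALTER TABLE iceberg_db.iceberg_table
EXECUTE rewrite_data_files ("target-file-size-bytes" = "134217728")
WHERE dt = '2024-01-01';
```

Narrowing the condition to the partitions that actually hold small files keeps the extra I/O mentioned in item 1 proportional to the data that needs compaction.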

### rewrite_manifests

@@ -2999,7 +3009,7 @@

### Dangling Delete

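In some cases, after `rewrite_data_files` has run, references to some Position Delete files may not have been removed from the Snapshot metadata (Dangling Delete). Using the row counts in the metadata directly at that point may give [incorrect](https://iceberg.apache.org/docs/nightly/spark-procedures/#rewrite_position_delete_files) results.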

Check notice on line 3012 in i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
GitHub Actions / Build Check: link-external-report-only
External link is report-only and was not fetched: https://iceberg.apache.org/docs/nightly/spark-procedures/#rewrite_position_delete_files. Owner: @apache/doris-website-maintainers

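Therefore, by default, when a `COUNT(*)` query finds Position Delete files, COUNT pushdown is not enabled and the files are read directly to obtain the true `COUNT(*)` result. This approach takes longer, however.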

@@ -3059,7 +3069,7 @@
WHEN file_size_in_bytes > @max_file_size_bytes THEN 'Too large'
END AS size_issue
FROM iceberg_table$data_files
WHERE file_size_in_bytes < @min_file_size_bytes

Check warning on line 3072 in i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
GitHub Actions / Build Check: markdown-code-fence-language
Code fence should declare a language. Owner: @apache/doris-website-maintainers
OR file_size_in_bytes > @max_file_size_bytes
ORDER BY `partition`, file_size_in_bytes DESC;
```
@@ -3120,7 +3130,7 @@
FROM file_analysis
UNION ALL
SELECT
'Percentage meeting file-level conditions (%)',

Check warning on line 3133 in i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
GitHub Actions / Build Check: markdown-code-fence-language
Code fence should declare a language. Owner: @apache/doris-website-maintainers
ROUND(SUM(CASE WHEN meets_file_level_conditions THEN 1 ELSE 0 END) * 100.0 / COUNT(*), 2)
FROM file_analysis;
```
1 change: 0 additions & 1 deletion sidebars.ts
@@ -716,7 +716,6 @@ const sidebars: SidebarsConfig = {
label: 'Lakehouse Best Practices',
items: [
'lakehouse/best-practices/optimization',
'lakehouse/best-practices/doris-snowflake-catalog',
'lakehouse/best-practices/kerberos',
'lakehouse/best-practices/tpch',
'lakehouse/best-practices/tpcds',