[improvement](be) Track and cap inverted index writer memory #63651
Open
airborne12 wants to merge 3 commits into
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: Add minimal memory-estimate instrumentation for inverted index writing so high-cardinality BKD index build peak memory can be measured.
### Release note
None
### Check List (For Author)
- Test: Unit Test
- PATH=/mnt/disk6/common/ldb_toolchain_toucan/bin:$PATH build-support/clang-format.sh
- PATH=/mnt/disk6/common/ldb_toolchain_toucan/bin:$PATH build-support/check-format.sh
- sh run-be-ut.sh --run --filter=InvertedIndexWriterTest.HighCardinalityBkdMemoryEstimate
- build-support/run-clang-tidy.sh --build-dir be/ut_build_ASAN (attempted; failed because clang-tidy could not analyze the touched files due to pre-existing/environment diagnostics)
- Behavior changed: No
- Does this need documentation: No
### What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary: Large string full-text inverted index builds buffered CLucene postings in SDocumentsWriter, but Doris did not include that RAM in IndexColumnWriter::size(). Account the full-text writer RAM for analyzed slice indexes so segment memory estimation can see the largest buffered source, while leaving numeric/BKD and non-tokenized string paths unchanged.
### Release note
None
### Check List (For Author)
- Test: Unit Test
- sh run-be-ut.sh --run --filter=InvertedIndexWriterTest.FullTextStringMemoryEstimateIncludesBufferedPostings
- PATH=/mnt/disk6/common/ldb_toolchain_toucan/bin:$PATH build-support/check-format.sh
- build-support/run-clang-tidy.sh --build-dir be/ut_build_ASAN --files be/src/storage/index/inverted/inverted_index_writer.cpp be/test/storage/segment/inverted_index_writer_test.cpp (attempted; failed on environment/pre-existing diagnostics including missing stddef.h, unmatched NOLINTEND in be/src/core/types.h, and existing full-file complexity warnings)
- Behavior changed: Yes (full-text inverted index writer memory is now included in segment memory estimation; file format is unchanged)
- Does this need documentation: No
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: Cap CLucene buffered postings memory for analyzed inverted index writers when the actual directory is DorisFSDirectory. This reduces retained writer memory for large fulltext string values when RAM directory is disabled, while leaving RAMDirectory and untokenized index writer buffering unchanged.
### Release note
Add inverted_index_ram_buffer_size_when_ram_dir_disabled to cap analyzed inverted index writer buffering when RAM directory is disabled.
### Check List (For Author)
- Test: Unit Test
- sh run-be-ut.sh --run --filter=InvertedIndexWriterTest.FullTextStringMemoryEstimateIncludesBufferedPostings:InvertedIndexWriterTest.FullTextLargeStringRamDirDisabledCapsBufferedPostings
- PATH=/mnt/disk6/common/ldb_toolchain_toucan/bin:$PATH build-support/check-format.sh
- git diff --check
- CLANG_TIDY_BINARY=/mnt/disk6/common/ldb_toolchain_toucan/bin/clang-tidy-16 build-support/run-clang-tidy.sh --build-dir be/ut_build_ASAN (failed: existing clang-tidy analysis issues in jni-util.h static assertions and existing complexity warnings)
- Behavior changed: Yes. Analyzed inverted index writers use the lower configurable RAM buffer cap when the actual index directory is DorisFSDirectory.
- Does this need documentation: No
### What problem does this PR solve?
Issue Number: close #xxx
Related PRs:
- doris-thirdparty#393: exposes `SDocumentsWriter` buffered postings via `getRAMUsed()`. Must merge before this PR (this PR bumps `contrib/clucene` to that commit).
- […] storage format). Sits on top of this PR.
Problem Summary:
Tighten BE memory tracking and capping for analyzed inverted index writers (fulltext + Chinese tokenization etc.). Today the BE's `IndexColumnWriter::size()` returns 0 for inverted index writers, so `segment_writer.cpp`'s segment memory estimate misses the CLucene buffered postings entirely, by far the largest source of resident bytes during a fulltext flush. Under tight memory pressure (cloud BE with `inverted_index_ram_dir_enable=false`), this causes the segment-builder to overshoot the memory budget and OOM the BE.
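To make the undercount concrete, here is a minimal C++ sketch. It is not the Doris code: `IndexWriterSketch`, `estimate_segment_bytes`, and `should_flush` are hypothetical names, and the flush rule (flush once the estimate reaches a budget) is an assumed simplification of what the segment builder does.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one column's index writer; not the Doris class.
struct IndexWriterSketch {
    int64_t buffered_postings_bytes; // bytes actually resident in the writer
    bool reports_memory;             // false models a size() that returns 0

    int64_t size() const { return reports_memory ? buffered_postings_bytes : 0; }
};

// Segment estimate = row data bytes + whatever each index writer reports.
inline int64_t estimate_segment_bytes(int64_t data_bytes,
                                      const std::vector<IndexWriterSketch>& writers) {
    assert(data_bytes >= 0);
    int64_t total = data_bytes;
    for (const auto& w : writers) {
        total += w.size();
    }
    return total;
}

// Assumed rule: flush once the estimate reaches the budget.
inline bool should_flush(int64_t estimated_bytes, int64_t budget_bytes) {
    return estimated_bytes >= budget_bytes;
}
```

With illustrative numbers (100 MB of row data, 400 MB of buffered postings, a 256 MB budget), the writer that reports 0 yields a 100 MB estimate and no flush, while the writer that reports its postings yields 500 MB and triggers one.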
Three commits, smallest-first:
1. `[improvement](be) Track inverted index writer memory estimate`: Plumb a real `size()` reporter through the writer hierarchy. No behaviour change yet; just the scaffolding the next two commits plug values into.
2. `[improvement](be) Account fulltext inverted index writer memory`: Implement `size()` for the fulltext path: report `_null_bitmap.getSizeInBytes()` + `index_writer->ramSizeInBytes()` (via the clucene `getRAMUsed()` exposed by doris-thirdparty#393) + `ram_directory_memory_size()`. The segment memory estimate now sees fulltext writer memory.
3. `[improvement](be) Cap fulltext index memory without ram dir`: New config `inverted_index_ram_buffer_size_when_ram_dir_disabled` (default 64 MB). When the column has an analyzer AND `is_fs_directory(_dir)` (i.e. `inverted_index_ram_dir_enable=false`), clamp CLucene's `setRAMBufferSizeMB()` down from the global `inverted_index_ram_buffer_size` = 512MB to this new cap. Forces more frequent segment flushes in exchange for tighter per-column RAM peak, the trade-off cloud users want when RAM dir is disabled.
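The clamp in the third commit and its flush trade-off can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `effective_ram_buffer_mb` and `approx_flush_count` are hypothetical helpers, taking the minimum of the two settings is an assumption about how "clamp down" behaves if the cap were set above the global value, and the flush count assumes one flush each time the buffer fills.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper; name and signature are illustrative, not the Doris code.
// Returns the RAM buffer size (in MB) to hand to the CLucene writer.
inline int64_t effective_ram_buffer_mb(bool has_analyzer, bool is_fs_directory,
                                       int64_t global_buffer_mb, int64_t capped_buffer_mb) {
    assert(global_buffer_mb > 0 && capped_buffer_mb > 0);
    // Only analyzed writers backed by an FS directory are clamped.
    if (has_analyzer && is_fs_directory) {
        return std::min(global_buffer_mb, capped_buffer_mb);
    }
    // RAM directory and untokenized writers keep the global setting.
    return global_buffer_mb;
}

// Rough flush count for a given volume of postings: ceil(total / buffer).
// Simplifying assumption: exactly one flush each time the buffer fills.
inline int64_t approx_flush_count(int64_t total_postings_mb, int64_t buffer_mb) {
    assert(total_postings_mb >= 0 && buffer_mb > 0);
    return (total_postings_mb + buffer_mb - 1) / buffer_mb;
}
```

Under these assumptions, an analyzed column on an FS directory gets a 64 MB buffer instead of 512 MB, and a hypothetical 2048 MB of postings goes from about 4 flushes to about 32, which is the cost paid for the lower per-column peak.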
### Why a precursor PR
These three are independently useful as memory-tracking +
capping for the existing V1/V2/V3 CLucene writer path. They were
originally bundled in #63633 (the V4 SPIMI PR) as
prerequisites; lifting them out makes the V4 PR smaller and lets
the cloud teams pick up the memory tracking improvements without
waiting for the larger SPIMI refactor to land.
### Release note
Add inverted index writer memory tracking and a new cap config `inverted_index_ram_buffer_size_when_ram_dir_disabled` (default 64 MB) that limits the CLucene buffered postings size for analyzed columns when RAM directory is disabled. Reduces BE OOM risk under fulltext-heavy concurrent loads in cloud mode.
### Check List (For Author)
- (`InvertedIndexWriterTest.FullTextStringMemoryEstimateIncludesBufferedPostings`, `*RamBufferCap*` tests)
- […] `setRAMBufferSizeMB` when RAM dir is disabled. Default 64 MB. Tunable at runtime (mutable bool).