Skip to content

[GLUTEN-12143][VL] Route bitmap_construct_agg to native Velox execution#12142

Open
minni31 wants to merge 7 commits into
apache:mainfrom
minni31:bitmap-construct-agg
Open

[GLUTEN-12143][VL] Route bitmap_construct_agg to native Velox execution#12142
minni31 wants to merge 7 commits into
apache:mainfrom
minni31:bitmap-construct-agg

Conversation

@minni31
Copy link
Copy Markdown

@minni31 minni31 commented May 26, 2026

Route bitmap_construct_agg to Native Velox Execution

This PR enables native Velox execution for the bitmap_construct_agg aggregate function, avoiding fallback to vanilla Spark.

Changes

  1. Expression Registration (ExpressionNames.scala): Added BITMAP_CONSTRUCT_AGG constant.
  2. Shim Registration (Spark35Shims.scala, Spark40Shims.scala, Spark41Shims.scala): Register Sig[BitmapConstructAgg] mapping to the substrait function name.
  3. Native Validator (SubstraitToVeloxPlanValidator.cc): Added "bitmap_construct_agg" to the supportedAggFuncs set so the native validator accepts this function.
  4. Plan-shape assertion test (GlutenBitmapExpressionsQuerySuite.scala for spark35/40/41): Verifies the query plan contains HashAggregateExecBaseTransformer (native execution) rather than falling back.

Test Results

Spark Version Tests Result
Spark 3.5 11/11 ✅ Pass
Spark 4.0 11/11 ✅ Pass
Spark 4.1 13/13 ✅ Pass

All tests include the new plan-shape assertion test confirming native Velox execution.

Related issue: #12143

Register bitmap_construct_agg as a supported aggregate expression in the
Velox backend, allowing it to be executed natively instead of falling back
to vanilla Spark.

Changes:
- Add BITMAP_CONSTRUCT_AGG constant to ExpressionNames
- Register Sig[BitmapConstructAgg] in Spark 3.5, 4.0, and 4.1 shims
- Add bitmap_construct_agg to C++ plan validator allowed list

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@minni31 minni31 changed the title [CORE] Route bitmap_construct_agg to native Velox execution [VL] Route bitmap_construct_agg to native Velox execution May 26, 2026
@minni31 minni31 changed the title [VL] Route bitmap_construct_agg to native Velox execution [GLUTEN-12143][VL] Route bitmap_construct_agg to native Velox execution May 26, 2026
@github-actions github-actions Bot added CORE works for Gluten Core VELOX labels May 26, 2026
@github-actions
Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

Copy link
Copy Markdown
Contributor

@weiting-chen weiting-chen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clean implementation — the three-layer plumbing (constant + shim registration + C++ validator) follows the established pattern correctly.

Two suggestions for consideration (non-blocking):

Sig[RegrSXY](ExpressionNames.REGR_SXY),
Sig[RegrReplacement](ExpressionNames.REGR_REPLACEMENT)
Sig[RegrReplacement](ExpressionNames.REGR_REPLACEMENT),
Sig[BitmapConstructAgg](ExpressionNames.BITMAP_CONSTRUCT_AGG)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding plan-shape assertion in tests

Problem: GlutenBitmapExpressionsQuerySuite inherits upstream Spark tests that only use checkAnswer() — they verify correctness but do not assert that native Velox execution is actually active. A regression that silently disables offloading would go undetected since vanilla Spark also produces correct results.

Evidence:

class GlutenBitmapExpressionsQuerySuite
  extends BitmapExpressionsQuerySuite
  with GlutenSQLTestsTrait {}
// All upstream tests only call checkAnswer(df, expected)

Suggested Fix: Add at least one Gluten-specific test asserting the native transformer is present:

test("bitmap_construct_agg routes to native") {
  val df = spark.sql("SELECT bitmap_construct_agg(col) FROM values (1L), (2L) AS t(col)")
  val plan = df.queryExecution.executedPlan
  assert(collect(plan) { case h: HashAggregateExecTransformer => h }.nonEmpty,
    "Expected native HashAggregateExecTransformer in plan")
}

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in all spark version. Thanks @weiting-chen

"regr_sxy",
"regr_replacement"};
"regr_replacement",
"bitmap_construct_agg"};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up opportunity: register bitmap_or_agg (and bitmap_and_agg for Spark 4.1)

Problem: bitmap_or_agg was introduced in the same Spark 3.5 commit as bitmap_construct_agg, and bitmap_and_agg in Spark 4.1. If native Velox implementations exist for these, registering them together would avoid unnecessary columnar-to-row transitions when users combine bitmap functions in the same query stage.

Investigation Needed: Confirm whether bitmap_or_agg and bitmap_and_agg have native Velox implementations (look for registration in cpp/velox/udf/ or the Velox aggregate function registry). If yes, consider adding them in a follow-up PR:

      "bitmap_construct_agg",
      "bitmap_or_agg",
      "bitmap_and_agg"};

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've Velox PR for bitmap_or_agg in review phase. Will add follow up PR for Gluten once Velox PR merges. Thanks!

Adds a test verifying that bitmap_construct_agg routes to native Velox
execution (HashAggregateExecBaseTransformer) rather than falling back
to vanilla Spark. Test added for Spark 3.5, 4.0, and 4.1.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

@yaooqinn
Copy link
Copy Markdown
Member

+1. Three-layer plumbing (Sig[BitmapConstructAgg] on all 3 shims + supportedAggFuncs entry + plan-shape assertion) follows the same shape as the existing BloomFilterAgg registration in this file, which is the right precedent here.

Grounded the plan-shape test guard: bitmap_bit_position is registered in gen-function-support-docs.py (BitmapBitPosition -> bitmap_bit_position), so the inner expression won't silently fallback and mask a bitmap_construct_agg regression. Good defensive test shape.

bitmap_construct_agg offloaded to Velox throws GlutenException instead of
SparkArrayIndexOutOfBoundsException. Exclude these error-path tests following
the established pattern for native execution error type mismatches.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

@github-actions
Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CORE works for Gluten Core VELOX

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants