Skip to content

Conversation

@coderfender
Copy link
Contributor

@coderfender coderfender commented Jan 13, 2026

Which issue does this PR close?

  1. Add int (byte/short/int/long) → binary cast support for LEGACY mode only (per Spark)
  2. Add boolean → decimal cast support
  3. Update cast tests to use DSL instead of SQL for regular casts (except try mode since try_cast is only supported starting Spark 4x)
  4. Skip try_cast tests for casts not supported by Spark SQL

Relavent spark code :
https://github.com/apache/spark/blob/f89fbd49c45045b44259341c6c73e30ee9c46dd0/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala#L682

(Note that only legacy mode routes the code here while we Try and ANSI are not supported)

Closes #.

Rationale for this change

  1. Add new comet native cast functionality

What changes are included in this PR?

How are these changes tested?

  1. CastTestSuite tests and additional tests to cast Boolean to various Decimal Types

@coderfender coderfender changed the title feat : implement cast from int (whole numbers) to binary feat : implement cast from int (whole numbers) to binary format Jan 14, 2026
@coderfender coderfender changed the title feat : implement cast from int (whole numbers) to binary format feat: implement cast from int (whole numbers) to binary format Jan 14, 2026
@coderfender
Copy link
Contributor Author

@parthchandra , @andygrove these are the changes to support int to binary cast and bool to decimal cast operation . I have also had to change the base test to make sure that we use DSL (since Int -> Binary casts are only supported in the DSL side of Spark and would raise an encoding exception if we use Spark SQL) . Also, Int -> Binary casts do not support Try or ANSI eval modes in Spark and I have followed the same convention here as well

@coderfender coderfender changed the title feat: implement cast from int (whole numbers) to binary format feat: implement cast from int whole numbers to binary format Jan 14, 2026
@coderfender coderfender changed the title feat: implement cast from int whole numbers to binary format feat: implement cast from whole numbers to binary format Jan 14, 2026
@coderfender
Copy link
Contributor Author

coderfender commented Jan 14, 2026

Fixed format issue with compatibility.md ( I initially didn't add it but turns out the builds are failing with git errors)

@codecov-commenter
Copy link

codecov-commenter commented Jan 14, 2026

Codecov Report

❌ Patch coverage is 58.82353% with 21 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.97%. Comparing base (f09f8af) to head (910fe4f).
⚠️ Report is 847 commits behind head on main.

Files with missing lines Patch % Lines
...scala/org/apache/comet/expressions/CometCast.scala 58.82% 4 Missing and 17 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3083      +/-   ##
============================================
+ Coverage     56.12%   58.97%   +2.84%     
- Complexity      976     1371     +395     
============================================
  Files           119      167      +48     
  Lines         11743    15562    +3819     
  Branches       2251     2586     +335     
============================================
+ Hits           6591     9177    +2586     
- Misses         4012     5086    +1074     
- Partials       1140     1299     +159     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding int→binary and boolean→decimal cast support!

Issue: Dead code

The function canCastToBinary (lines 354-358 in CometCast.scala) is defined but never called anywhere. This appears to be unused code that should be removed.

private def canCastToBinary(fromType: DataType): SupportLevel = fromType match {
  case DataTypes.ByteType | DataTypes.ShortType | DataTypes.IntegerType | DataTypes.LongType =>
    Compatible()
  case _ => Unsupported(Some(s"Cast from BinaryType to $fromType is not supported"))
}

The actual logic for int→binary casts is correctly handled in the canCastFromByte, canCastFromShort, canCastFromInt, and canCastFromLong functions.


Otherwise, the implementation looks good:

  • Correctly limits int→binary to LEGACY mode only (matching Spark)
  • Boolean→Decimal implementation handles precision/scale correctly
  • Tests use column data from parquet (not just literals)
  • Compatibility docs updated
  • ANSI and Try modes correctly disabled for int→binary tests

This review was generated with AI assistance.

@andygrove
Copy link
Member

There is a test failure:

2026-01-14T23:17:19.7092172Z - cast StringType to DecimalType(2,2) *** FAILED *** (684 milliseconds)
2026-01-14T23:17:19.7092965Z   "[[CAST_INVALID_INPUT] The value '
2026-01-14T23:17:19.7100086Z 3' of the type "STRING" cannot be cast to "DECIMAL(2,2)" because it is malformed. Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error]." did not equal "[[NUMERIC_VALUE_OUT_OF_RANGE] 3 cannot be represented as Decimal(2, 2). If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error, and return NULL instead]." (CometCastSuite.scala:1424)
2026-01-14T23:17:19.7102421Z   Analysis:
2026-01-14T23:17:19.7102696Z   "[[CAST_INVALID_INPUT] The value '
2026-01-14T23:17:19.7105340Z 3' of the type "STRING" cannot be cast to "DECIMAL(2,2)" because it is malformed. Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error]." -> "[[NUMERIC_VALUE_OUT_OF_RANGE] 3 cannot be represented as Decimal(2, 2). If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error, and return NULL instead]."

@coderfender
Copy link
Contributor Author

Thank you @andygrove ,I am working on the test failures (I believe there is a bug on the DSL side vs using Spark SQL)

@coderfender coderfender force-pushed the support_short_to_binary branch from 910fe4f to 69d8290 Compare January 17, 2026 02:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants