Interval analysis error when two FilterExec nodes are stacked and the inner filter proves zero selectivity #20742

@haohuaijin

Describe the bug

When two FilterExec nodes are stacked and the inner filter proves zero selectivity (no rows can match), the outer filter panics during interval analysis.

Root cause: when a FilterExec determines that no rows can pass its predicate (e.g., a > 200 when the maximum of a is 100), collect_new_statistics produces column statistics with an untyped ScalarValue::Null for the min, max, and sum values. The data type of ScalarValue::Null is Null, not the column's type.

If an outer FilterExec sits on top and tries to analyze its own predicate (e.g., a = 50), it attempts to intersect intervals from the inner filter's statistics (Null type) with the literal in its predicate (Int32 type). Interval::intersect requires both sides to have the same data type, so it panics with:

"Only intervals with the same data type are intersectable, lhs:Null, rhs:Int32"

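The mismatch described above can be sketched with a small standalone model. This is not DataFusion code: `ToyType`, `ToyScalar`, and `ToyInterval` are hypothetical stand-ins for the real types, and the `intersect` method only imitates the data type check quoted in the error message. The second half of `main` assumes that a typed null bound (treated here as "unbounded") would keep the data types aligned; that is an illustration, not a statement about how the actual fix works.

```rust
// Toy model only: hypothetical stand-ins, not DataFusion's actual types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyType {
    Null,
    Int32,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyScalar {
    Null,               // untyped null: its data type is Null
    Int32(Option<i32>), // typed value; None is a typed null (unbounded here)
}

impl ToyScalar {
    fn data_type(&self) -> ToyType {
        match self {
            ToyScalar::Null => ToyType::Null,
            ToyScalar::Int32(_) => ToyType::Int32,
        }
    }
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct ToyInterval {
    lower: ToyScalar,
    upper: ToyScalar,
}

impl ToyInterval {
    // Returns Err on a data type mismatch, Ok(None) for an empty intersection.
    fn intersect(&self, other: &ToyInterval) -> Result<Option<ToyInterval>, String> {
        let (lhs, rhs) = (self.lower.data_type(), other.lower.data_type());
        if lhs != rhs {
            return Err(format!(
                "Only intervals with the same data type are intersectable, lhs:{lhs:?}, rhs:{rhs:?}"
            ));
        }
        match (self.lower, self.upper, other.lower, other.upper) {
            (
                ToyScalar::Int32(a_lo),
                ToyScalar::Int32(a_hi),
                ToyScalar::Int32(b_lo),
                ToyScalar::Int32(b_hi),
            ) => {
                // A None bound is unbounded, so the other side's bound wins.
                let lo = match (a_lo, b_lo) {
                    (Some(x), Some(y)) => Some(x.max(y)),
                    (x, None) => x,
                    (None, y) => y,
                };
                let hi = match (a_hi, b_hi) {
                    (Some(x), Some(y)) => Some(x.min(y)),
                    (x, None) => x,
                    (None, y) => y,
                };
                if let (Some(l), Some(h)) = (lo, hi) {
                    if l > h {
                        return Ok(None);
                    }
                }
                Ok(Some(ToyInterval {
                    lower: ToyScalar::Int32(lo),
                    upper: ToyScalar::Int32(hi),
                }))
            }
            // Remaining cases (e.g. both sides untyped): return self unchanged.
            _ => Ok(Some(*self)),
        }
    }
}

fn main() {
    // The literal 50 from the outer predicate, as a single-point interval.
    let literal = ToyInterval {
        lower: ToyScalar::Int32(Some(50)),
        upper: ToyScalar::Int32(Some(50)),
    };
    // Statistics carrying untyped nulls: the data types differ.
    let untyped = ToyInterval { lower: ToyScalar::Null, upper: ToyScalar::Null };
    println!("{:?}", untyped.intersect(&literal));
    // Statistics carrying typed nulls: the data types agree.
    let typed = ToyInterval {
        lower: ToyScalar::Int32(None),
        upper: ToyScalar::Int32(None),
    };
    println!("{:?}", typed.intersect(&literal));
}
```

In this model the untyped case returns an error instead of panicking, which is a choice made for the sketch; the point is only that the check fails as soon as one side reports data type Null.
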
To Reproduce

    #[tokio::test]
    async fn test_nested_filter_with_zero_selectivity_inner() -> Result<()> {
        // Input table statistics: a in [1, 100], b in [1, 3]
        let schema = Schema::new(vec![
            Field::new("a", DataType::Int32, false),
            Field::new("b", DataType::Int32, false),
        ]);
        let input = Arc::new(StatisticsExec::new(
            Statistics {
                num_rows: Precision::Inexact(1000),
                total_byte_size: Precision::Inexact(4000),
                column_statistics: vec![
                    ColumnStatistics {
                        min_value: Precision::Inexact(ScalarValue::Int32(Some(1))),
                        max_value: Precision::Inexact(ScalarValue::Int32(Some(100))),
                        ..Default::default()
                    },
                    ColumnStatistics {
                        min_value: Precision::Inexact(ScalarValue::Int32(Some(1))),
                        max_value: Precision::Inexact(ScalarValue::Int32(Some(3))),
                        ..Default::default()
                    },
                ],
            },
            schema,
        ));

        // Inner filter: a > 200 (impossible given a max=100 → zero selectivity)
        let inner_predicate: Arc<dyn PhysicalExpr> = Arc::new(BinaryExpr::new(
            Arc::new(Column::new("a", 0)),
            Operator::Gt,
            Arc::new(Literal::new(ScalarValue::Int32(Some(200)))),
        ));
        let inner_filter: Arc<dyn ExecutionPlan> =
            Arc::new(FilterExec::try_new(inner_predicate, input)?);

        // Outer filter: a = 50 (analyzed against the inner filter's output statistics)
        let outer_predicate: Arc<dyn PhysicalExpr> = Arc::new(BinaryExpr::new(
            Arc::new(Column::new("a", 0)),
            Operator::Eq,
            Arc::new(Literal::new(ScalarValue::Int32(Some(50)))),
        ));
        let outer_filter: Arc<dyn ExecutionPlan> =
            Arc::new(FilterExec::try_new(outer_predicate, inner_filter)?);

        let statistics = outer_filter.partition_statistics(None)?;
        assert_eq!(statistics.num_rows, Precision::Inexact(0));

        Ok(())
    }

Expected behavior

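The outer filter's statistics should be computed without a panic, with an estimated row count of Inexact(0) as asserted in the test above.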
