[SPARK-52484][SQL] Skip child.supportsColumnar assertion from driver side in ColumnarToRowExec
https://issues.apache.org/jira/browse/SPARK-52484
### What changes were proposed in this pull request?
This PR removes the unnecessary assertion in `ColumnarToRowExec`, introduced by #25264, to give third-party Spark plugins more flexibility. In Apache Gluten in particular, the assertion blocks some of our query-optimization work because we need an intermediate state of the query plan that Spark may consider illegal.
Typical reasons Gluten needs this intermediate state:
1. Gluten has a cost evaluator API to evaluate the cost of a `transition rule` (a rule that adds a unary node on top of an input plan). In that case Gluten needs a fake leaf for the rule to apply to during cost evaluation. This leaf node has to be made columnar to bypass the assertion, which is a bit hacky.
2. Gluten has a cascades-style query optimizer (RAS) which can use a dummy, row-based leaf plan node to hide a child tree of a branch query plan node; the leaf then represents a so-called cascades 'group'. Although this pattern (C2R on a row-based plan) is illegal, it can still serve as the input of an optimizer rule, which may match on it and convert it into a valid query plan.
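The second pattern can be sketched with a toy model. The types below (`Plan`, `GroupLeaf`, `ColumnarScan`, `C2R`, `GroupRewrite`) are hypothetical and exist only for illustration; they are not Spark or Gluten classes, and the real RAS planner is considerably more involved.

```scala
// Toy model only: these types are hypothetical, not Spark or Gluten classes.
sealed trait Plan { def supportsColumnar: Boolean }

// Row-based placeholder standing in for a cascades "group" (a hidden subtree).
final case class GroupLeaf(groupId: Int) extends Plan { val supportsColumnar = false }

// A concrete columnar plan that a group may later resolve to.
final case class ColumnarScan(table: String) extends Plan { val supportsColumnar = true }

// Columnar-to-row transition; there is no constructor-time check on the child.
final case class C2R(child: Plan) extends Plan { val supportsColumnar = false }

object GroupRewrite {
  // Hypothetical rule: replace a group placeholder under C2R with a concrete plan.
  def resolve(groups: Map[Int, Plan])(plan: Plan): Plan = plan match {
    case C2R(GroupLeaf(id)) if groups.contains(id) => C2R(groups(id))
    case other => other
  }

  // A plan is valid when every C2R sits on top of a columnar child.
  def isValid(plan: Plan): Boolean = plan match {
    case C2R(child) => child.supportsColumnar && isValid(child)
    case _ => true
  }
}
```

Here `C2R(GroupLeaf(1))` is the illegal intermediate shape: it can only be built because `C2R` does not assert on its child, and the rule then turns it into a valid plan.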
This PR removes the assertion to give third-party plugins this flexibility. This should do no harm to upstream Apache Spark, because even without the assertion, executing an illegal query plan will still fail with [this error](https://github.com/apache/spark/blob/5d0b2f41794bf4dd25b3ce19bc4f634082b40876/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala#L343-L351).
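The behavioral difference can be sketched roughly as follows. This is a simplified toy model, not the actual Spark code: `Node`, `RowLeaf`, `ColumnarLeaf`, `StrictC2R`, and `LenientC2R` are hypothetical names, and the exception type is an assumption chosen for illustration.

```scala
// Toy model only: hypothetical types, not the actual Spark classes.
trait Node {
  def supportsColumnar: Boolean = false
  // A node without columnar support cannot produce columnar batches.
  def executeColumnar(): Seq[String] =
    throw new IllegalStateException(s"${getClass.getName} has no columnar support")
}

final class RowLeaf extends Node

final class ColumnarLeaf extends Node {
  override def supportsColumnar: Boolean = true
  override def executeColumnar(): Seq[String] = Seq("batch-0")
}

// Before: the illegal shape is rejected as soon as the node is constructed.
final class StrictC2R(child: Node) extends Node {
  assert(child.supportsColumnar)
  def execute(): Seq[String] = child.executeColumnar()
}

// After: construction always succeeds; an illegal shape fails only when executed.
final class LenientC2R(child: Node) extends Node {
  def execute(): Seq[String] = child.executeColumnar()
}
```

In this sketch, `new LenientC2R(new RowLeaf)` can be built and passed to an optimizer rule, while calling `execute()` on it still throws, so an illegal plan does not run silently.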
Some workarounds used by Gluten for bypassing this assertion:
1. https://github.com/apache/incubator-gluten/blob/0a1b5c28678653242ab0fd7b28ebba1dca43ccb1/gluten-core/src/main/scala/org/apache/gluten/extension/columnar/transition/package.scala#L83
2. https://github.com/apache/incubator-gluten/blob/0a1b5c28678653242ab0fd7b28ebba1dca43ccb1/gluten-core/src/main/scala/org/apache/gluten/extension/columnar/enumerated/planner/plan/GlutenPlanModel.scala#L51-L55
Once the assertion is removed, Gluten will be able to remove these workarounds and simplify its code.
### Does this PR introduce _any_ user-facing change?
Basically no. When an illegal query plan is generated, the assertion error at plan-building time is replaced by an exception at execution time (still on the driver side).
### How was this patch tested?
Existing UTs.
Closes #51183 from zhztheplayer/wip-rm-c2r-check.
Authored-by: Hongze Zhang <[email protected]>
Signed-off-by: Kent Yao <[email protected]>