Skip to content

[SPARK-52461] [SQL] Collapse inner Cast from DecimalType to DecimalType #51169

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -138,11 +138,28 @@ object DecimalPrecisionTypeCoercion extends SQLConfHelper {
case b @ BinaryComparison(e1 @ DecimalExpression(p1, s1), e2 @ DecimalExpression(p2, s2))
if p1 != p2 || s1 != s2 =>
val resultType = widerDecimalType(p1, s1, p2, s2)
val newE1 = if (e1.dataType == resultType) e1 else Cast(e1, resultType)
val newE2 = if (e2.dataType == resultType) e2 else Cast(e2, resultType)
val newE1 = if (e1.dataType == resultType) e1 else tryCollapseInnerCast(e1, resultType)
val newE2 = if (e2.dataType == resultType) e2 else tryCollapseInnerCast(e2, resultType)
b.withNewChildren(Seq(newE1, newE2))
}

/**
* In case `SQLConf.COLLAPSE_INNER_CAST_TO_DECIMAL` is true and inner cast is redundant (if the
* inner one is not a `USER_SPECIFIED_CAST` has lower precision and scale than the outer one),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has -> and has

* remove it.
*/
private def tryCollapseInnerCast(expression: Expression, decimalType: DecimalType): Cast = {
expression match {
case innerCast @ Cast(_, innerDecimalType: DecimalType, _, _)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inner cast is not redundant if it can cause truncation or overflow. Strictly speaking, we can only remove the inner cast if the final query result remains the same.

I think we should also check Cast.canNullSafeCastToDecimal(child.dataType, innerDecimalType)

if conf.getConf(SQLConf.COLLAPSE_INNER_CAST_TO_DECIMAL) &&
decimalType.precision >= innerDecimalType.precision &&
decimalType.scale >= innerDecimalType.scale &&
innerCast.getTagValue(Cast.USER_SPECIFIED_CAST).isEmpty =>
Cast(innerCast.child, decimalType)
case other => Cast(other, decimalType)
}
}

/**
* Type coercion for BinaryOperator in which one side is a non-decimal numeric, and the other
* side is a decimal.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5949,6 +5949,17 @@ object SQLConf {
.createWithDefault(2)
}

val COLLAPSE_INNER_CAST_TO_DECIMAL =
buildConf("spark.sql.collapseInnerCastToDecimal.enabled")
.internal()
.doc(
"If true, collapse inner cast to DecimalType if it is redundant (if the inner one has " +
"lower precision and scale than the outer one)."
)
.version("4.1.0")
.booleanConf
.createWithDefault(true)

/**
* Holds information about keys that have been deprecated.
*
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -91,7 +91,7 @@ SELECT s AS s, COUNT(*) c FROM stuff GROUP BY ALL HAVING SUM(f) > 0 ORDER BY s
-- !query analysis
Sort [s#x ASC NULLS FIRST], true
+- Project [s#x, c#xL]
+- Filter (sum(f)#x > cast(cast(0 as decimal(1,0)) as decimal(16,4)))
+- Filter (sum(f)#x > cast(0 as decimal(16,4)))
+- Aggregate [s#x], [s#x AS s#x, count(1) AS c#xL, sum(f#x) AS sum(f)#x]
+- SubqueryAlias stuff
+- View (`stuff`, [i#x, f#x, s#x, t#x, d#x, a#x])
Expand Down
Loading