Posted to commits@spark.apache.org by gu...@apache.org on 2022/07/09 05:02:19 UTC

[spark] branch master updated: [SPARK-37623][SPARK-39230][SQL][FOLLOW-UP] Make regr_slope and regr_intercept safe with ANSI mode

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 0bb9d4f355a [SPARK-37623][SPARK-39230][SQL][FOLLOW-UP] Make regr_slope and regr_intercept safe with ANSI mode
0bb9d4f355a is described below

commit 0bb9d4f355a05190f261077e11ccf1e5cdd00505
Author: Hyukjin Kwon <gu...@apache.org>
AuthorDate: Sat Jul 9 14:02:09 2022 +0900

    [SPARK-37623][SPARK-39230][SQL][FOLLOW-UP] Make regr_slope and regr_intercept safe with ANSI mode
    
    ### What changes were proposed in this pull request?
    
    This PR proposes to make `regr_slope` and `regr_intercept` ANSI-safe by checking whether the divisor (`varPop.m2`) is zero, which avoids divide-by-zero exceptions when ANSI mode is on.
    
    ### Why are the changes needed?
    
    To make both expressions work regardless of ANSI mode.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No, neither function has been released yet.
    
    ### How was this patch tested?
    
    `JDBCV2Suite` already covers it when ANSI mode is on. This was found through a broken test case; see also https://github.com/apache/spark/pull/36773/files#r916607535
    
    Closes #37137 from HyukjinKwon/SPARK-39385.
    
    Authored-by: Hyukjin Kwon <gu...@apache.org>
    Signed-off-by: Hyukjin Kwon <gu...@apache.org>
---
 .../spark/sql/catalyst/expressions/aggregate/linearRegression.scala   | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/linearRegression.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/linearRegression.scala
index c371f0b40c2..40518982958 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/linearRegression.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/linearRegression.scala
@@ -278,7 +278,7 @@ case class RegrSlope(left: Expression, right: Expression) extends DeclarativeAgg
     covarPop.mergeExpressions ++ varPop.mergeExpressions
 
   override lazy val evaluateExpression: Expression = {
-    If(covarPop.n === 0.0, Literal.create(null, DoubleType), covarPop.ck / varPop.m2)
+    If(varPop.m2 === 0.0, Literal.create(null, DoubleType), covarPop.ck / varPop.m2)
   }
 
   override lazy val inputAggBufferAttributes: Seq[AttributeReference] =
@@ -331,7 +331,7 @@ case class RegrIntercept(left: Expression, right: Expression) extends Declarativ
     covarPop.mergeExpressions ++ varPop.mergeExpressions
 
   override lazy val evaluateExpression: Expression = {
-    If(covarPop.n === 0.0, Literal.create(null, DoubleType),
+    If(varPop.m2 === 0.0, Literal.create(null, DoubleType),
       covarPop.yAvg - covarPop.ck / varPop.m2 * covarPop.xAvg)
   }
 

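The effect of swapping the guard from `covarPop.n === 0.0` to `varPop.m2 === 0.0` can be illustrated with a small standalone sketch. It is not Spark code: the field names `n`, `ck` and `m2` mirror the diff, but the `Buffers` case class, the helper functions, the two-pass computation, and the throwing `ansiDivide` (a stand-in for ANSI-mode division) are all assumptions made for illustration, with `Option` standing in for a SQL null.

```scala
// Simplified model of the guard change in the diff above.
// Only the names n, ck and m2 come from the diff; everything else here
// (Buffers, buffers, ansiDivide, slopeOldGuard, slopeNewGuard) is hypothetical.
object RegrSlopeGuardSketch {
  // n: row count, ck: co-moment of x and y, m2: sum of squared deviations of x.
  final case class Buffers(n: Double, ck: Double, m2: Double)

  // Computes the buffers from (y, x) pairs with a plain two-pass formula.
  // This is an assumption for readability, not how Spark updates its buffers.
  def buffers(points: Seq[(Double, Double)]): Buffers = {
    val n = points.size.toDouble
    if (n == 0.0) Buffers(0.0, 0.0, 0.0)
    else {
      val yAvg = points.map(_._1).sum / n
      val xAvg = points.map(_._2).sum / n
      val ck = points.map { case (y, x) => (x - xAvg) * (y - yAvg) }.sum
      val m2 = points.map { case (_, x) => (x - xAvg) * (x - xAvg) }.sum
      Buffers(n, ck, m2)
    }
  }

  // Stand-in for division under ANSI mode: a zero divisor raises an error.
  def ansiDivide(a: Double, b: Double): Double =
    if (b == 0.0) throw new ArithmeticException("Division by zero") else a / b

  // Old guard: only protects against an empty input.
  def slopeOldGuard(b: Buffers): Option[Double] =
    if (b.n == 0.0) None else Some(ansiDivide(b.ck, b.m2))

  // New guard: checks the actual divisor, so the division is never reached
  // with a zero m2. In this model an empty input also has m2 == 0.
  def slopeNewGuard(b: Buffers): Option[Double] =
    if (b.m2 == 0.0) None else Some(ansiDivide(b.ck, b.m2))

  def main(args: Array[String]): Unit = {
    // Every x is identical, so m2 is zero although n is not.
    val constantX = buffers(Seq((1.0, 2.0), (3.0, 2.0)))
    println(scala.util.Try(slopeOldGuard(constantX)).isFailure) // old guard hits the division
    println(slopeNewGuard(constantX))                           // new guard yields the null stand-in

    // Points on the line y = 2x - 1 still produce a slope with the new guard.
    val line = buffers(Seq((1.0, 1.0), (3.0, 2.0), (5.0, 3.0)))
    println(slopeNewGuard(line))
  }
}
```

In this model, any input whose x values are all equal (including a single row) has a non-zero `n` but a zero `m2`, which is the case the old guard let through to the division.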
