Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2020/12/30 12:29:00 UTC

[jira] [Assigned] (SPARK-33945) Handles a random seed consisting of an expr tree

     [ https://issues.apache.org/jira/browse/SPARK-33945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-33945:
------------------------------------

    Assignee:     (was: Apache Spark)

> Handles a random seed consisting of an expr tree
> ------------------------------------------------
>
>                 Key: SPARK-33945
>                 URL: https://issues.apache.org/jira/browse/SPARK-33945
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.8, 3.0.2, 3.1.0
>            Reporter: Takeshi Yamamuro
>            Priority: Minor
>
> This ticket aims to fix a minor bug: an analysis exception is thrown when the seed parameter of `rand`/`randn` is an expression tree (e.g., `rand(1 + 1)`) and the constant-folding rules (`ConstantFolding` and `ReorderAssociativeOperator`) are disabled. The following query reproduces the issue:
> {code}
> // v3.1.0, v3.0.2, and v2.4.8
> // With the default optimizer rules, the query succeeds.
> $ ./bin/spark-shell
> scala> sql("select rand(1 + 2)").show()
> +-------------------+
> |      rand((1 + 2))|
> +-------------------+
> |0.25738143505962285|
> +-------------------+
> // With ConstantFolding and ReorderAssociativeOperator excluded, the same query fails.
> $ ./bin/spark-shell --conf spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.ConstantFolding,org.apache.spark.sql.catalyst.optimizer.ReorderAssociativeOperator
> scala> sql("select rand(1 + 2)").show()
> org.apache.spark.sql.AnalysisException: Input argument to rand must be an integer, long or null literal.;
>   at org.apache.spark.sql.catalyst.expressions.RDG.seed$lzycompute(randomExpressions.scala:49)
>   at org.apache.spark.sql.catalyst.expressions.RDG.seed(randomExpressions.scala:46)
>   at org.apache.spark.sql.catalyst.expressions.Rand.doGenCode(randomExpressions.scala:98)
>   at org.apache.spark.sql.catalyst.expressions.Expression.$anonfun$genCode$3(Expression.scala:146)
>   at scala.Option.getOrElse(Option.scala:189)
>   ...
> {code}
> The root cause is that the match-case code below cannot handle a seed that is still an unfolded expression tree, as in the case described above:
> https://github.com/apache/spark/blob/42f5e62403469cec6da680b9fbedd0aa508dcbe5/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/randomExpressions.scala#L46-L51
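The failure mode can be illustrated with a small, self-contained sketch. It does not use Spark's Catalyst classes: `Expr`, `Lit`, `Add`, `seedFromLiteralOnly`, and `seedFromFoldable` are hypothetical names invented for this example, and the second function shows only one possible way to accept a constant seed expression, not necessarily the fix adopted for this ticket. Null seeds are left out for brevity.

```scala
// Toy expression tree. These types are illustrative stand-ins,
// not Spark's Catalyst classes.
sealed trait Expr {
  def foldable: Boolean // true when the value is known without input rows
  def eval(): Any
}

case class Lit(value: Any) extends Expr {
  val foldable = true
  def eval(): Any = value
}

case class Add(left: Expr, right: Expr) extends Expr {
  def foldable: Boolean = left.foldable && right.foldable
  def eval(): Any = (left.eval(), right.eval()) match {
    case (a: Int, b: Int)   => a + b
    case (a: Long, b: Long) => a + b
    case other =>
      throw new IllegalArgumentException(s"Unsupported operands: $other")
  }
}

object SeedResolution {
  // Literal-only matching: any seed that is not already a literal is
  // rejected, which is the behavior reported in the ticket.
  def seedFromLiteralOnly(e: Expr): Long = e match {
    case Lit(s: Int)  => s.toLong
    case Lit(s: Long) => s
    case _ =>
      throw new IllegalArgumentException(
        "Input argument to rand must be an integer, long or null literal.")
  }

  // Assumed alternative: accept any constant (foldable) expression and
  // evaluate it once to obtain the seed.
  def seedFromFoldable(e: Expr): Long = {
    require(e.foldable, "seed must be a constant expression")
    e.eval() match {
      case s: Int  => s.toLong
      case s: Long => s
      case other =>
        throw new IllegalArgumentException(s"Unsupported seed value: $other")
    }
  }
}
```

With these definitions, `SeedResolution.seedFromLiteralOnly(Add(Lit(1), Lit(2)))` throws because the seed has not been folded into a literal, whereas `SeedResolution.seedFromFoldable(Add(Lit(1), Lit(2)))` evaluates the tree and returns a seed of 3.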



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
