Posted to issues@spark.apache.org by "Patrick Cording (Jira)" <ji...@apache.org> on 2020/01/24 10:49:00 UTC
[jira] [Updated] (SPARK-30633) Codegen fails when xxHash seed is not an integer
[ https://issues.apache.org/jira/browse/SPARK-30633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Cording updated SPARK-30633:
------------------------------------
Description:
If the seed for xxHash does not fit in an integer, the generated code does not compile.
Steps to reproduce:
{code:java}
import org.apache.spark.sql.catalyst.expressions.XxHash64
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.col

val file = "..."
val column = col("...")
val df = spark.read.csv(file)
def xxHash(seed: Long, cols: Column*): Column = new Column(
XxHash64(cols.map(_.expr), seed)
)
val seed = (Math.pow(2, 32)+1).toLong
df.select(xxHash(seed, column)).show()
{code}
Appending an L suffix to the generated seed literal when its data type is long fixes the issue.
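The failure mode can be illustrated outside Spark. A minimal sketch (not Spark's actual codegen; `emitSeedLiteral` is a hypothetical helper): Java rejects an integer literal such as 4294967297 without an L suffix because it exceeds the int range, so any code generator emitting a long seed must append the suffix.

{code:java}
// Hypothetical sketch of the fix: emit a Java-compatible literal for a long seed.
// Without the L suffix, a value above Int.MaxValue is an invalid Java int literal
// and the generated code fails to compile.
object SeedLiteral {
  def emitSeedLiteral(seed: Long): String =
    if (seed.isValidInt) seed.toString // fits in a Java int literal
    else s"${seed}L"                   // needs the L suffix to be a valid long literal

  def main(args: Array[String]): Unit = {
    val seed = (math.pow(2, 32) + 1).toLong // 4294967297, does not fit in an Int
    println(emitSeedLiteral(seed))          // prints 4294967297L
  }
}
{code}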
> Codegen fails when xxHash seed is not an integer
> ------------------------------------------------
>
> Key: SPARK-30633
> URL: https://issues.apache.org/jira/browse/SPARK-30633
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.4
> Reporter: Patrick Cording
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org