Posted to issues@spark.apache.org by "Patrick Cording (Jira)" <ji...@apache.org> on 2020/01/24 10:49:00 UTC

[jira] [Updated] (SPARK-30633) Codegen fails when xxHash seed is not an integer

     [ https://issues.apache.org/jira/browse/SPARK-30633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Cording updated SPARK-30633:
------------------------------------
    Description: 
If the seed for XxHash64 does not fit in a 32-bit integer, the generated code does not compile.

Steps to reproduce:
{code:java}
import org.apache.spark.sql.catalyst.expressions.XxHash64
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.Column

val file = "..."
val column = col("...")

val df = spark.read.csv(file)

// Wrap the XxHash64 catalyst expression in a Column so it can be used in select().
def xxHash(seed: Long, cols: Column*): Column = new Column(
  XxHash64(cols.map(_.expr), seed)
)

// 2^32 + 1 = 4294967297, which does not fit in a 32-bit integer.
val seed = (Math.pow(2, 32) + 1).toLong
df.select(xxHash(seed, column)).show()
{code}
The seed appears in the generated Java source as a bare numeric literal; appending an "L" suffix to it when the hash data type is long fixes the issue.
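
For illustration only, here is a minimal self-contained sketch of the idea behind such a fix (this is not the actual Spark codegen code, and the helper name seedLiteral is made up): when the seed is interpolated into the generated Java source, it needs an "L" suffix whenever the hash type is a 64-bit long, because a value above Int.MaxValue is not a legal Java int literal.
{code:java}
// Hypothetical helper, not Spark source: render the seed as a Java literal.
// Without the "L" suffix, a value above Int.MaxValue (2147483647) is rejected
// by the Java compiler as an out-of-range int literal.
def seedLiteral(seed: Long, hashTypeIsLong: Boolean): String =
  if (hashTypeIsLong) s"${seed}L" else seed.toInt.toString

val seed = (Math.pow(2, 32) + 1).toLong  // 4294967297
assert(seed > Int.MaxValue)
assert(seedLiteral(seed, hashTypeIsLong = true) == "4294967297L")
{code}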

> Codegen fails when xxHash seed is not an integer
> ------------------------------------------------
>
>                 Key: SPARK-30633
>                 URL: https://issues.apache.org/jira/browse/SPARK-30633
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.4
>            Reporter: Patrick Cording
>            Priority: Major
>
> If the seed for XxHash64 does not fit in a 32-bit integer, the generated code does not compile.
> Steps to reproduce:
> {code:java}
> import org.apache.spark.sql.catalyst.expressions.XxHash64
> import org.apache.spark.sql.Column
> val file = "..."
> val column = col("...")
> val df = spark.read.csv(file)
> def xxHash(seed: Long, cols: Column*): Column = new Column(
>    XxHash64(cols.map(_.expr), seed)
> )
> val seed = (Math.pow(2, 32)+1).toLong
> df.select(xxHash(seed, column)).show()
> {code}
> The seed appears in the generated Java source as a bare numeric literal; appending an "L" suffix to it when the hash data type is long fixes the issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org