Posted to issues@spark.apache.org by "Guiju Zhang (JIRA)" <ji...@apache.org> on 2019/03/20 07:49:00 UTC
[jira] [Updated] (SPARK-27211) cast error when select column from Row
[ https://issues.apache.org/jira/browse/SPARK-27211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Guiju Zhang updated SPARK-27211:
--------------------------------
Description:
First, I have a class RawLogPayload which has a field: long timestamp
Then I try to join two Dataset<RawLogPayload> and select some of the columns.
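For context, here is a minimal sketch of what such a bean might look like. It is a hypothetical reduction, not the actual class: the only member confirmed by the exception log further down is setTimestamp(long). The other two fields and their String types are assumptions taken from the printed schemas, and the remaining columns are left out. As far as I understand, Encoders.bean works from getter/setter pairs like these.

```java
import java.io.Serializable;

// Hypothetical reduction of RawLogPayload; only three of the twelve columns are shown.
// In a real project the class would be public and live in its own file so that
// code generated for Encoders.bean can access it.
class RawLogPayload implements Serializable {
    private String rawsessionid; // assumed from "rawsessionid: string" in the schema
    private String uid;          // assumed from "uid: string" in the schema
    private long timestamp;      // setTimestamp(long) appears in the exception log

    // A bean needs a public no-argument constructor.
    public RawLogPayload() {}

    public String getRawsessionid() { return rawsessionid; }
    public void setRawsessionid(String rawsessionid) { this.rawsessionid = rawsessionid; }

    public String getUid() { return uid; }
    public void setUid(String uid) { this.uid = uid; }

    public long getTimestamp() { return timestamp; }
    public void setTimestamp(long timestamp) { this.timestamp = timestamp; }
}
```

The compile error reported below means the generated code tried to pass a UTF8String (Spark's internal string type) to setTimestamp, which only accepts a long.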
The following is the code snippet:
extractedRawTc.printSchema(); // output1
Dataset<RawLogPayload> extractedRawW3cFilled = extractedRawW3c.alias("extractedRawW3c")
.join(extractedRawTc.alias("extractedRawTc"), functions.col("extractedRawW3c.rawsessionid").equalTo(functions.col("extractedRawTc.rawsessionid")), "inner")
.select(functions.col("extractedRawW3c.df_logdatetime"), functions.col("extractedRawW3c.rawsessionid"), functions.col("extractedRawTc.uid"),
functions.col("extractedRawW3c.time"),functions.col("extractedRawW3c.T"),functions.col("extractedRawW3c.url"),functions.col("extractedRawW3c.wid"),
functions.col("extractedRawW3c.tid"), functions.col("extractedRawW3c.fid"),functions.col("extractedRawW3c.string1"),
functions.col("extractedRawW3c.curWindow"), *functions.col("extractedRawW3c.timestamp")*)
.as(Encoders.bean(RawLogPayload.class));
extractedRawW3cFilled.printSchema(); // output2
After running this, it throws the following exception:
2019-03-20 15:28:31 ERROR CodeGenerator:91 ## failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 103, Column 32: No applicable constructor/method found for actual parameters "org.apache.spark.unsafe.types.UTF8String"; candidates are: "public void com.microsoft.datamining.spartan.api.core.RawLogPayload.setTimestamp(long)"
org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 103, Column 32: *No applicable constructor/method found for actual parameters "org.apache.spark.unsafe.types.UTF8String"; candidates are: "public void com.**xxxx**.**xxxx**.spartan.api.core.RawLogPayload.setTimestamp(long)"*
at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:11821)
at org.codehaus.janino.UnitCompiler.findMostSpecificIInvocable(UnitCompiler.java:8910)
Output1 extractedRawTc schema
root
|-- curWindow: string (nullable = true)
|-- df_logdatetime: string (nullable = true)
|-- fid: string (nullable = true)
|-- rawsessionid: string (nullable = true)
|-- string1: string (nullable = true)
|-- t: string (nullable = true)
|-- tid: string (nullable = true)
|-- time: string (nullable = true)
|-- *timestamp: long (nullable = true)*
|-- uid: string (nullable = true)
|-- url: string (nullable = true)
|-- wid: string (nullable = true)
Output2 extractedRawW3cFilled schema
root
|-- df_logdatetime: string (nullable = true)
|-- rawsessionid: string (nullable = true)
|-- uid: string (nullable = true)
|-- time: string (nullable = true)
|-- T: string (nullable = true)
|-- url: string (nullable = true)
|-- wid: string (nullable = true)
|-- tid: string (nullable = true)
|-- fid: string (nullable = true)
|-- string1: string (nullable = true)
|-- curWindow: string (nullable = true)
|-- *timestamp: long (nullable = true)*
My question: the schema of the timestamp column is long, but judging from the exception log, the datatype of timestamp seems to become UTF8String after the select. Why would this happen? Is it a bug? If not, could you point out how to use it correctly?
Thanks
was:
(1) RawLogPlayload has an field: long timestamp
(2)
extractedRawTc.printSchema(); // output1
Dataset<RawLogPayload> extractedRawW3cFilled = extractedRawW3c.alias("extractedRawW3c")
.join(extractedRawTc.alias("extractedRawTc"), functions.col("extractedRawW3c.rawsessionid").equalTo(functions.col("extractedRawTc.rawsessionid")), "inner")
.select(functions.col("extractedRawW3c.df_logdatetime"), functions.col("extractedRawW3c.rawsessionid"), functions.col("extractedRawTc.uid"),
functions.col("extractedRawW3c.time"),functions.col("extractedRawW3c.T"),functions.col("extractedRawW3c.url"),functions.col("extractedRawW3c.wid"),
functions.col("extractedRawW3c.tid"), functions.col("extractedRawW3c.fid"),functions.col("extractedRawW3c.string1"),
functions.col("extractedRawW3c.curWindow"), *functions.col("extractedRawW3c.timestamp")*)
.as(Encoders.bean(RawLogPayload.class));
extractedRawW3cFilled.printSchema(); // output2
(4) cast exception
2019-03-20 15:28:31 ERROR CodeGenerator:91 ## failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 103, Column 32: No applicable constructor/method found for actual parameters "org.apache.spark.unsafe.types.UTF8String"; candidates are: "public void com.microsoft.datamining.spartan.api.core.RawLogPayload.setTimestamp(long)"
org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 103, Column 32: *No applicable constructor/method found for actual parameters "org.apache.spark.unsafe.types.UTF8String"; candidates are: "public void com.**xxxx**.**xxxx**.spartan.api.core.RawLogPayload.setTimestamp(long)"*
at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:11821)
at org.codehaus.janino.UnitCompiler.findMostSpecificIInvocable(UnitCompiler.java:8910)
Output1 extractedRawTc schema
root
|-- curWindow: string (nullable = true)
|-- df_logdatetime: string (nullable = true)
|-- fid: string (nullable = true)
|-- rawsessionid: string (nullable = true)
|-- string1: string (nullable = true)
|-- t: string (nullable = true)
|-- tid: string (nullable = true)
|-- time: string (nullable = true)
|-- *timestamp: long (nullable = true)*
|-- uid: string (nullable = true)
|-- url: string (nullable = true)
|-- wid: string (nullable = true)
Output2 extractedRawW3cFilled schema
root
|-- df_logdatetime: string (nullable = true)
|-- rawsessionid: string (nullable = true)
|-- uid: string (nullable = true)
|-- time: string (nullable = true)
|-- T: string (nullable = true)
|-- url: string (nullable = true)
|-- wid: string (nullable = true)
|-- tid: string (nullable = true)
|-- fid: string (nullable = true)
|-- string1: string (nullable = true)
|-- curWindow: string (nullable = true)
|-- *timestamp: long (nullable = true)*
> cast error when select column from Row
> --------------------------------------
>
> Key: SPARK-27211
> URL: https://issues.apache.org/jira/browse/SPARK-27211
> Project: Spark
> Issue Type: Question
> Components: Java API
> Affects Versions: 2.3.0, 2.3.1
> Reporter: Guiju Zhang
> Priority: Major
> Labels: SQL, Spark