Posted to issues@spark.apache.org by "Sean Zhong (JIRA)" <ji...@apache.org> on 2016/06/15 20:50:09 UTC
[jira] [Commented] (SPARK-15786) joinWith bytecode generation calling ByteBuffer.wrap with InternalRow
[ https://issues.apache.org/jira/browse/SPARK-15786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15332527#comment-15332527 ]
Sean Zhong commented on SPARK-15786:
------------------------------------
Basically, what you described can be shortened to:
{code}
scala> val ds = Seq((1,1) -> (1, 1)).toDS()
ds: org.apache.spark.sql.Dataset[((Int, Int), (Int, Int))] = [_1: struct<_1: int, _2: int>, _2: struct<_1: int, _2: int>]
scala> implicit val enc = Encoders.tuple(Encoders.kryo[Option[(Int, Int)]], Encoders.kryo[Option[(Int, Int)]])
enc: org.apache.spark.sql.Encoder[(Option[(Int, Int)], Option[(Int, Int)])] = class[_1[0]: binary, _2[0]: binary]
scala> ds.as[(Option[(Int, Int)], Option[(Int, Int)])].collect()
{code}
> joinWith bytecode generation calling ByteBuffer.wrap with InternalRow
> ---------------------------------------------------------------------
>
> Key: SPARK-15786
> URL: https://issues.apache.org/jira/browse/SPARK-15786
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.6.1, 2.0.0
> Reporter: Richard Marscher
>
> {code}
> java.lang.RuntimeException: Error while decoding: java.util.concurrent.ExecutionException: java.lang.Exception: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 36, Column 107: No applicable constructor/method found for actual parameters "org.apache.spark.sql.catalyst.InternalRow"; candidates are: "public static java.nio.ByteBuffer java.nio.ByteBuffer.wrap(byte[])", "public static java.nio.ByteBuffer java.nio.ByteBuffer.wrap(byte[], int, int)"
> {code}
> I have been trying to use joinWith along with Option data types to approximate the RDD semantics for outer joins with Dataset, in order to get a nicer API for Scala. However, using the Dataset.as[] syntax leads the generated bytecode to pass an InternalRow object into ByteBuffer.wrap, which expects a byte[], optionally followed by two int arguments (offset and length).
> I have a notebook reproducing this against 2.0 preview in Databricks Community Edition: https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/160347920874755/1039589581260901/673639177603143/latest.html
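
For reference, the two candidates named in the compile error are the only ByteBuffer.wrap overloads, and both take a byte[] as the first argument. The minimal sketch below only illustrates those two signatures; it is not the code Spark generates, and the class and helper names (WrapOverloads, wrapAll, wrapSlice) are hypothetical.

```java
import java.nio.ByteBuffer;

public class WrapOverloads {
    // First candidate from the error message: wrap an entire byte array.
    static ByteBuffer wrapAll(byte[] bytes) {
        return ByteBuffer.wrap(bytes);
    }

    // Second candidate: wrap a slice given by an offset and a length.
    static ByteBuffer wrapSlice(byte[] bytes, int offset, int length) {
        return ByteBuffer.wrap(bytes, offset, length);
    }

    public static void main(String[] args) {
        byte[] payload = {1, 2, 3, 4};
        // remaining() reports how many bytes lie between position and limit.
        System.out.println(wrapAll(payload).remaining());
        System.out.println(wrapSlice(payload, 1, 2).remaining());
        // Passing any non-byte[] object (such as a row object) as the first
        // argument matches neither overload, so the call fails to compile.
    }
}
```

Since neither overload accepts anything other than a byte[], a call site that receives a struct-valued row where serialized bytes are expected is rejected by the compiler, which is consistent with the message quoted above.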