Posted to user@spark.apache.org by Daniel Imberman <da...@gmail.com> on 2016/06/27 15:50:35 UTC

Arrays in Datasets (1.6.1)

Hi all,

I've been attempting to rework a project I'm working on to use the
Dataset API and have been running into encoding errors. From what I've
read, I think that I should be able to store Arrays of primitive values
in a Dataset. However, the following class gives me encoding errors:

case class InvertedIndex(partition: Int, docs: Array[Int],
  indices: Array[Long], weights: Array[Double])

val inv: RDD[InvertedIndex] = ??? // an existing RDD; its construction is not shown
val invertedIndexDataset = sqlContext.createDataset(inv)
invertedIndexDataset.groupBy(x => x.partition).mapGroups {
    //...
}

Could someone please help me understand what the issue is here? Can
Datasets not currently handle Arrays of primitives, or is there something
extra that I need to do to make them work?

Thank you

Re: Arrays in Datasets (1.6.1)

Posted by Ted Yu <yu...@gmail.com>.
Can you show the stack trace for the encoding error(s)?

Have you looked at the following test, which involves NestedArray of a
primitive type?

./sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala

Cheers
