Posted to user@spark.apache.org by Jason White <ja...@shopify.com> on 2017/02/12 00:19:26 UTC

Case class with POJO - encoder issues

I'd like to create a Dataset using some classes from Geotools to do some
geospatial analysis. In particular, I'm trying to use Spark to distribute
the work based on ID and label fields that I extract from the polygon data.

My simplified case class looks like this:
implicit val geometryEncoder: Encoder[Geometry] = Encoders.kryo[Geometry]
case class IndexedGeometry(label: String, tract: Geometry)

When I try to create a Dataset using this case class, it gives me this error
message:
Exception in thread "main" java.lang.UnsupportedOperationException: No
Encoder found for com.vividsolutions.jts.geom.Geometry
- field (class: "com.vividsolutions.jts.geom.Geometry", name: "tract")
- root class: "org.me.HelloWorld.IndexedGeometry"

If I add another encoder for my case class...:
implicit val indexedGeometryEncoder: Encoder[IndexedGeometry] =
Encoders.kryo[IndexedGeometry]

...it works, but now the entire dataset has a single field, "value", and
it's a binary blob.

Is there a way to do what I'm trying to do?
I believe this PR is related, but it's been idle since December:
https://github.com/apache/spark/pull/15918
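
In the meantime, the closest I've gotten is to give up on the case class
shape and use a tuple: Encoders.tuple lets me combine the built-in string
encoder with the Kryo one, so the label keeps its own string column (which
I can repartition or group on) and only the geometry stays an opaque binary
blob. A minimal sketch of that workaround, assuming a local SparkSession and
a throwaway JTS point just for illustration:

import com.vividsolutions.jts.geom.{Coordinate, Geometry, GeometryFactory}
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}

object TupleEncoderWorkaround {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("geo-encoders").master("local[*]").getOrCreate()

    // Label keeps its native string type; the geometry is Kryo-serialized
    // into a single binary column.
    val pairEncoder: Encoder[(String, Geometry)] =
      Encoders.tuple(Encoders.STRING, Encoders.kryo[Geometry])

    val tract: Geometry = new GeometryFactory().createPoint(new Coordinate(1.0, 2.0))

    val ds = spark.createDataset(Seq(("tract-1", tract)))(pairEncoder)
    ds.printSchema()
    // root
    //  |-- _1: string (nullable = true)
    //  |-- _2: binary (nullable = true)

    // The label is a real column now, so distributing work by it is possible:
    val byLabel = ds.repartition(ds("_1"))
    byLabel.show()

    spark.stop()
  }
}

That gets me the distribution-by-label part, but I'd still prefer to keep the
IndexedGeometry case class end to end.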





Re: Case class with POJO - encoder issues

Posted by geoHeil <ge...@gmail.com>.
http://stackoverflow.com/questions/36648128/how-to-store-custom-objects-in-a-dataset
describes the problem. Actually, I have the same problem. Is there a simple
way to build such an Encoder that serializes into multiple fields? I would
not want to replicate the whole JTS geometry class hierarchy just for Spark.
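
The closest thing I have found so far is the pattern from that Stack Overflow
answer: keep the Kryo encoder only for the JTS type and combine it with the
built-in encoders via Encoders.tuple, so the plain fields keep their own
columns and only the geometry ends up as a binary blob. A rough sketch of
what I mean (not a true multi-field Encoder for the POJO itself, and the
names here are just my own):

import com.vividsolutions.jts.geom.Geometry
import org.apache.spark.sql.{Encoder, Encoders}

object GeoEncoders {
  // Kryo-backed encoder for the JTS type itself (one opaque binary column).
  implicit val geometryEncoder: Encoder[Geometry] = Encoders.kryo[Geometry]

  // Tuple encoders assembled from the per-field encoders, so e.g. a
  // Dataset[(String, Geometry)] gets a real string column plus the blob.
  implicit def tuple2Encoder[A1, A2](implicit e1: Encoder[A1],
                                     e2: Encoder[A2]): Encoder[(A1, A2)] =
    Encoders.tuple(e1, e2)

  implicit def tuple3Encoder[A1, A2, A3](implicit e1: Encoder[A1],
                                         e2: Encoder[A2],
                                         e3: Encoder[A3]): Encoder[(A1, A2, A3)] =
    Encoders.tuple(e1, e2, e3)
}

It still does not expand the geometry itself into several columns, and mixing
these implicits with import spark.implicits._ may need some care, but at
least the ID and label fields stay queryable.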




Re: Case class with POJO - encoder issues

Posted by Michael Armbrust <mi...@databricks.com>.
You are right, you need that PR.  I pinged the author, but otherwise it
would be great if someone could carry it over the finish line.
