Posted to dev@spark.apache.org by Jacek Laskowski <ja...@japila.pl> on 2016/09/22 10:27:17 UTC

Deserializing InternalRow using a case class - how to avoid creating attrs manually?

Hi,

I've just discovered* that I can SerDe my case classes. What a nice
feature that I can use in spark-shell, too! Thanks a lot for offering
me so much fun!

What I don't really like about the code is the following part
(especially that it conflicts with the implicit conversion for Column):

import org.apache.spark.sql.catalyst.dsl.expressions._
// in spark-shell there are competing implicits
// That's why DslSymbol is used explicitly in the following line
scala> val attrs = Seq(DslSymbol('id).long, DslSymbol('name).string)
attrs: Seq[org.apache.spark.sql.catalyst.expressions.AttributeReference]
= List(id#8L, name#9)

scala> val jacekReborn = personExprEncoder.resolveAndBind(attrs).fromRow(row)
jacekReborn: Person = Person(0,Jacek)

Since I've already got the "schema" in the case class Person, I'd like
to avoid creating the attrs manually. Is there a way to skip that step
and use a "reflection"-like approach so the attrs are built from the
case class?
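To illustrate what I mean by "reflection"-like, here is a minimal
sketch in plain Scala (no Spark) that derives (field name, field type)
pairs from a case class instead of spelling them out by hand; the
helper fieldsOf is hypothetical, just to show the idea I'm after:

```scala
// Sketch only: derive field names and types from a case class via
// Scala runtime reflection, rather than writing attrs manually.
import scala.reflect.runtime.universe._

case class Person(id: Long, name: String)

// Collect the case-class accessors in declaration order and pair
// each field name with a printable rendering of its type.
def fieldsOf[T: TypeTag]: Seq[(String, String)] =
  typeOf[T].decls.sorted.collect {
    case m: MethodSymbol if m.isCaseAccessor =>
      (m.name.toString, m.returnType.toString)
  }

// Prints the (name, type) pairs derived from Person.
println(fieldsOf[Person])
```

Presumably something similar happens inside ExpressionEncoder already,
so I'm hoping the attrs can be produced from Person automatically.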

Also, while we're at it, why is resolveAndBind required? Is it only
for resolving names and their types?

Thanks for your help (and for such a fantastic project, Apache Spark!)

[*] yeah, took me a while, but the happiness is stronger and I'll
remember longer! :-)

Pozdrawiam,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org