Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2017/10/09 11:05:01 UTC
[jira] [Commented] (SPARK-17952) SparkSession createDataFrame method throws exception for nested JavaBeans
[ https://issues.apache.org/jira/browse/SPARK-17952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16196814#comment-16196814 ]
Hyukjin Kwon commented on SPARK-17952:
--------------------------------------
This still happens in the master branch.
> SparkSession createDataFrame method throws exception for nested JavaBeans
> -------------------------------------------------------------------------
>
> Key: SPARK-17952
> URL: https://issues.apache.org/jira/browse/SPARK-17952
> Project: Spark
> Issue Type: Bug
> Affects Versions: 2.0.0, 2.0.1
> Reporter: Amit Baghel
>
> As per latest spark documentation for Java at http://spark.apache.org/docs/latest/sql-programming-guide.html#inferring-the-schema-using-reflection,
> {quote}
> Nested JavaBeans and List or Array fields are supported though.
> {quote}
> However, nested JavaBeans do not work. Please see the code below.
> SubCategory class
> {code}
> import java.io.Serializable;
>
> public class SubCategory implements Serializable {
>     private String id;
>     private String name;
>
>     public String getId() {
>         return id;
>     }
>     public void setId(String id) {
>         this.id = id;
>     }
>     public String getName() {
>         return name;
>     }
>     public void setName(String name) {
>         this.name = name;
>     }
> }
> {code}
> Category class
> {code}
> import java.io.Serializable;
>
> public class Category implements Serializable {
>     private String id;
>     private SubCategory subCategory;
>
>     public String getId() {
>         return id;
>     }
>     public void setId(String id) {
>         this.id = id;
>     }
>     public SubCategory getSubCategory() {
>         return subCategory;
>     }
>     public void setSubCategory(SubCategory subCategory) {
>         this.subCategory = subCategory;
>     }
> }
> {code}
> SparkSample class
> {code}
> import java.io.IOException;
> import java.util.ArrayList;
> import java.util.List;
>
> import org.apache.spark.sql.Dataset;
> import org.apache.spark.sql.Row;
> import org.apache.spark.sql.SparkSession;
>
> public class SparkSample {
>     public static void main(String[] args) throws IOException {
>         SparkSession spark = SparkSession
>             .builder()
>             .appName("SparkSample")
>             .master("local")
>             .getOrCreate();
>
>         // SubCategory
>         SubCategory sub = new SubCategory();
>         sub.setId("sc-111");
>         sub.setName("Sub-1");
>
>         // Category
>         Category category = new Category();
>         category.setId("s-111");
>         category.setSubCategory(sub);
>
>         // categoryList
>         List<Category> categoryList = new ArrayList<Category>();
>         categoryList.add(category);
>
>         // DF
>         Dataset<Row> dframe = spark.createDataFrame(categoryList, Category.class);
>         dframe.show();
>     }
> }
> {code}
> The above code throws the error below.
> {code}
> Exception in thread "main" scala.MatchError: com.sample.SubCategory@e7391d (of class com.sample.SubCategory)
> at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:256)
> at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:251)
> at org.apache.spark.sql.catalyst.CatalystTypeConverters$CatalystTypeConverter.toCatalyst(CatalystTypeConverters.scala:103)
> at org.apache.spark.sql.catalyst.CatalystTypeConverters$$anonfun$createToCatalystConverter$2.apply(CatalystTypeConverters.scala:403)
> at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1106)
> at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1106)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
> at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
> at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
> at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1.apply(SQLContext.scala:1106)
> at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1.apply(SQLContext.scala:1104)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
> at scala.collection.Iterator$class.toStream(Iterator.scala:1322)
> at scala.collection.AbstractIterator.toStream(Iterator.scala:1336)
> at scala.collection.TraversableOnce$class.toSeq(TraversableOnce.scala:298)
> at scala.collection.AbstractIterator.toSeq(Iterator.scala:1336)
> at org.apache.spark.sql.SparkSession.createDataFrame(SparkSession.scala:373)
> at com.sample.SparkSample.main(SparkSample.java:33)
> {code}
> The createDataFrame method throws the above exception. However, I observed that the createDataset method works fine with the code below.
> {code}
> Encoder<Category> encoder = Encoders.bean(Category.class);
> Dataset<Category> dframe = spark.createDataset(categoryList, encoder);
> dframe.show();
> {code}
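> Since the bean-encoder path above handles the nested bean, one possible workaround sketch (not a fix for createDataFrame itself) is to build the typed Dataset via Encoders.bean and then convert it to an untyped Dataset<Row> with toDF(), which yields the same row-shaped result createDataFrame was expected to produce, with the nested bean mapped to a struct column:
> {code}
> // Workaround sketch: go through the bean encoder, then drop to Dataset<Row>.
> // Assumes the Category/SubCategory beans, `spark` session, and `categoryList` from above.
> Encoder<Category> encoder = Encoders.bean(Category.class);
> Dataset<Category> typed = spark.createDataset(categoryList, encoder);
> Dataset<Row> rows = typed.toDF(); // nested SubCategory becomes a struct column
> rows.printSchema();
> rows.show();
> {code}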
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)