Posted to issues@spark.apache.org by "Justin Uang (JIRA)" <ji...@apache.org> on 2015/04/20 17:09:59 UTC
[jira] [Commented] (SPARK-6999) infinite recursion with
createDataFrame(JavaRDD[Row], java.util.List[String])
[ https://issues.apache.org/jira/browse/SPARK-6999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502994#comment-14502994 ]
Justin Uang commented on SPARK-6999:
------------------------------------
Looking at the source, one way to implement this is to extract part of getSchema(), specifically the
{code}
case c: Class[_] if c.isAnnotationPresent(classOf[SQLUserDefinedType]) =>
  (c.getAnnotation(classOf[SQLUserDefinedType]).udt().newInstance(), true)
case c: Class[_] if c == classOf[java.lang.String] => (StringType, true)
case c: Class[_] if c == java.lang.Short.TYPE => (ShortType, false)
case c: Class[_] if c == java.lang.Integer.TYPE => (IntegerType, false)
case c: Class[_] if c == java.lang.Long.TYPE => (LongType, false)
case c: Class[_] if c == java.lang.Double.TYPE => (DoubleType, false)
case c: Class[_] if c == java.lang.Byte.TYPE => (ByteType, false)
case c: Class[_] if c == java.lang.Float.TYPE => (FloatType, false)
case c: Class[_] if c == java.lang.Boolean.TYPE => (BooleanType, false)
case c: Class[_] if c == classOf[java.lang.Short] => (ShortType, true)
case c: Class[_] if c == classOf[java.lang.Integer] => (IntegerType, true)
case c: Class[_] if c == classOf[java.lang.Long] => (LongType, true)
case c: Class[_] if c == classOf[java.lang.Double] => (DoubleType, true)
case c: Class[_] if c == classOf[java.lang.Byte] => (ByteType, true)
case c: Class[_] if c == classOf[java.lang.Float] => (FloatType, true)
case c: Class[_] if c == classOf[java.lang.Boolean] => (BooleanType, true)
case c: Class[_] if c == classOf[java.math.BigDecimal] => (DecimalType(), true)
case c: Class[_] if c == classOf[java.sql.Date] => (DateType, true)
case c: Class[_] if c == classOf[java.sql.Timestamp] => (TimestampType, true)
{code}
section, and then have another method pull the elements out of the first {{Row}} using {{Row.get}} and run each element's class through that match expression to identify its type. Are there any gotchas that I'm missing?
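A rough sketch of that idea, written in plain Java with no Spark dependency so it can run on its own. Everything named here is hypothetical: FirstRowSchemaSketch, inferType and inferSchema are made-up names, a List<Object> stands in for a Row, and plain strings stand in for the DataType objects. Values pulled out of a Row are boxed, so only the boxed-class cases from the match would apply, and the SQLUserDefinedType case is left out. Two gotchas that seem likely: an empty RDD has no first row at all, and a null in the first row carries no type information.

```java
import java.math.BigDecimal;
import java.sql.Date;
import java.sql.Timestamp;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Spark code: strings stand in for DataType objects
// and a List<Object> stands in for the first Row.
final class FirstRowSchemaSketch {

    // Exact-class lookup, mirroring the `c == classOf[...]` guards for boxed types.
    private static final Map<Class<?>, String> TYPE_NAMES = new HashMap<>();
    static {
        TYPE_NAMES.put(String.class, "StringType");
        TYPE_NAMES.put(Short.class, "ShortType");
        TYPE_NAMES.put(Integer.class, "IntegerType");
        TYPE_NAMES.put(Long.class, "LongType");
        TYPE_NAMES.put(Double.class, "DoubleType");
        TYPE_NAMES.put(Byte.class, "ByteType");
        TYPE_NAMES.put(Float.class, "FloatType");
        TYPE_NAMES.put(Boolean.class, "BooleanType");
        TYPE_NAMES.put(BigDecimal.class, "DecimalType");
        TYPE_NAMES.put(Date.class, "DateType");
        TYPE_NAMES.put(Timestamp.class, "TimestampType");
    }

    // Identifies the type name for one value taken from the first row.
    static String inferType(Object value) {
        if (value == null) {
            // Gotcha: a null in the first row gives nothing to infer from.
            throw new IllegalArgumentException("cannot infer a type from null");
        }
        String name = TYPE_NAMES.get(value.getClass());
        if (name == null) {
            throw new IllegalArgumentException(
                "unsupported class: " + value.getClass().getName());
        }
        return name;
    }

    // Pairs each column name with the type inferred from the matching value.
    static List<String> inferSchema(List<Object> firstRow, List<String> columns) {
        if (firstRow.size() != columns.size()) {
            throw new IllegalArgumentException("row width does not match column count");
        }
        List<String> fields = new ArrayList<>();
        for (int i = 0; i < columns.size(); i++) {
            fields.add(columns.get(i) + ": " + inferType(firstRow.get(i)));
        }
        return fields;
    }

    public static void main(String[] args) {
        List<Object> firstRow = Arrays.<Object>asList("a", 1, 2.5);
        System.out.println(inferSchema(firstRow, Arrays.asList("name", "count", "score")));
    }
}
```

In this sketch a null or an unsupported class fails fast rather than guessing a type; a real implementation might instead scan further rows or fall back to a default, which is a design choice the sketch does not settle.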
> infinite recursion with createDataFrame(JavaRDD[Row], java.util.List[String])
> -----------------------------------------------------------------------------
>
> Key: SPARK-6999
> URL: https://issues.apache.org/jira/browse/SPARK-6999
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.3.0
> Reporter: Justin Uang
> Priority: Critical
>
> It looks like
> {code}
> def createDataFrame(rowRDD: JavaRDD[Row], columns: java.util.List[String]): DataFrame = {
>   createDataFrame(rowRDD.rdd, columns.toSeq)
> }
> {code}
> is in fact an infinite recursion: Scala implicit conversions convert the arguments back into a JavaRDD and a java.util.List, so the inner call resolves to this same method and it calls itself.
> {code}
> 15/04/19 16:51:24 INFO BlockManagerMaster: Trying to register BlockManager
> 15/04/19 16:51:24 INFO BlockManagerMasterActor: Registering block manager localhost:53711 with 1966.1 MB RAM, BlockManagerId(<driver>, localhost, 53711)
> 15/04/19 16:51:24 INFO BlockManagerMaster: Registered BlockManager
> Exception in thread "main" java.lang.StackOverflowError
> at scala.collection.mutable.AbstractSeq.<init>(Seq.scala:47)
> at scala.collection.mutable.AbstractBuffer.<init>(Buffer.scala:48)
> at scala.collection.convert.Wrappers$JListWrapper.<init>(Wrappers.scala:84)
> at scala.collection.convert.WrapAsScala$class.asScalaBuffer(WrapAsScala.scala:127)
> at scala.collection.JavaConversions$.asScalaBuffer(JavaConversions.scala:53)
> at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:408)
> at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:408)
> at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:408)
> at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:408)
> {code}
> Here is the code sample I used to reproduce the issue:
> {code}
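> import com.google.common.collect.ImmutableList;
> import com.google.common.collect.Lists;
> import java.util.List;
> import org.apache.spark.api.java.JavaRDD;
> import org.apache.spark.api.java.JavaSparkContext;
> import org.apache.spark.sql.Row;
> import org.apache.spark.sql.SQLContext;
>
> /**
>  * @author juang
>  */
> public final class InfiniteRecursionExample {
>   public static void main(String[] args) {
>     JavaSparkContext sc = new JavaSparkContext("local", "infinite_recursion_example");
>     List<Row> rows = Lists.newArrayList();
>     JavaRDD<Row> rowRDD = sc.parallelize(rows);
>     SQLContext sqlContext = new SQLContext(sc);
>     sqlContext.createDataFrame(rowRDD, ImmutableList.of("myCol"));
>   }
> }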
> {code}