You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Ryan Blue (JIRA)" <ji...@apache.org> on 2016/09/06 23:11:20 UTC

[jira] [Created] (SPARK-17424) Dataset job fails from unsound substitution in ScalaReflect

Ryan Blue created SPARK-17424:
---------------------------------

             Summary: Dataset job fails from unsound substitution in ScalaReflect
                 Key: SPARK-17424
                 URL: https://issues.apache.org/jira/browse/SPARK-17424
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.0.0, 1.6.1
            Reporter: Ryan Blue


I have a job that uses datasets in 1.6.1 and is failing with this error:

{code}
16/09/02 17:02:56 ERROR Driver ApplicationMaster: User class threw exception: java.lang.AssertionError: assertion failed: Unsound substitution from List(type T, type U) to List()
java.lang.AssertionError: assertion failed: Unsound substitution from List(type T, type U) to List()
    at scala.reflect.internal.Types$SubstMap.<init>(Types.scala:4644)
    at scala.reflect.internal.Types$SubstTypeMap.<init>(Types.scala:4761)
    at scala.reflect.internal.Types$Type.subst(Types.scala:796)
    at scala.reflect.internal.Types$TypeApiImpl.substituteTypes(Types.scala:321)
    at scala.reflect.internal.Types$TypeApiImpl.substituteTypes(Types.scala:298)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$getConstructorParameters$1.apply(ScalaReflection.scala:769)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$getConstructorParameters$1.apply(ScalaReflection.scala:768)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.AbstractTraversable.map(Traversable.scala:105)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.getConstructorParameters(ScalaReflection.scala:768)
    at org.apache.spark.sql.catalyst.ScalaReflection$.getConstructorParameters(ScalaReflection.scala:30)
    at org.apache.spark.sql.catalyst.ScalaReflection$.getConstructorParameters(ScalaReflection.scala:610)
    at org.apache.spark.sql.catalyst.trees.TreeNode.org$apache$spark$sql$catalyst$trees$TreeNode$$argNames$lzycompute(TreeNode.scala:418)
    at org.apache.spark.sql.catalyst.trees.TreeNode.org$apache$spark$sql$catalyst$trees$TreeNode$$argNames(TreeNode.scala:418)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$argsMap$1.apply(TreeNode.scala:415)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$argsMap$1.apply(TreeNode.scala:414)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.TraversableOnce$class.toMap(TraversableOnce.scala:279)
    at scala.collection.AbstractIterator.toMap(Iterator.scala:1157)
    at org.apache.spark.sql.catalyst.trees.TreeNode.argsMap(TreeNode.scala:416)
    at org.apache.spark.sql.execution.SparkPlanInfo$.fromSparkPlan(SparkPlanInfo.scala:46)
    at org.apache.spark.sql.execution.SparkPlanInfo$$anonfun$2.apply(SparkPlanInfo.scala:44)
    at org.apache.spark.sql.execution.SparkPlanInfo$$anonfun$2.apply(SparkPlanInfo.scala:44)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.AbstractTraversable.map(Traversable.scala:105)
    at org.apache.spark.sql.execution.SparkPlanInfo$.fromSparkPlan(SparkPlanInfo.scala:44)
    at org.apache.spark.sql.execution.SparkPlanInfo$$anonfun$2.apply(SparkPlanInfo.scala:44)
    at org.apache.spark.sql.execution.SparkPlanInfo$$anonfun$2.apply(SparkPlanInfo.scala:44)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.AbstractTraversable.map(Traversable.scala:105)
    at org.apache.spark.sql.execution.SparkPlanInfo$.fromSparkPlan(SparkPlanInfo.scala:44)
    at org.apache.spark.sql.execution.SparkPlanInfo$$anonfun$2.apply(SparkPlanInfo.scala:44)
    at org.apache.spark.sql.execution.SparkPlanInfo$$anonfun$2.apply(SparkPlanInfo.scala:44)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.AbstractTraversable.map(Traversable.scala:105)
    at org.apache.spark.sql.execution.SparkPlanInfo$.fromSparkPlan(SparkPlanInfo.scala:44)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:51)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:56)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
    at org.apache.spark.sql.DataFrameWriter.insertInto(DataFrameWriter.scala:193)
    at org.apache.spark.sql.DataFrameWriter.insertInto(DataFrameWriter.scala:166)
    at com.netflix.jobs.main(Processing.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:557)
{code}

I think this is the same bug as SPARK-13067. It looks like that issue wasn't fixed, there was just a work-around added to get the test passing.

The problem is that the reflection code is trying to substitute concrete types for type parameters of {{MapPartitions[T, U]}}, but the concrete types aren't known. So Spark ends up calling {{substituteTypes}} to substitute {{T}} and {{U}} with {{Nil}} (which gets shown as {{List()}}).

An easy fix that works for me is this:

{code:lang=scala}
    // if there are type variables to fill in, do the substitution (SomeClass[T] -> SomeClass[Int])
    if (actualTypeArgs.nonEmpty) {
      params.map { p =>
        p.name.toString -> p.typeSignature.substituteTypes(formalTypeArgs, actualTypeArgs)
      }
    } else {
      params.map { p =>
        p.name.toString -> p.typeSignature
      }
    }
{code}

Does this sound like a reasonable solution?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org