Posted to issues@spark.apache.org by "Philipp Poetter (JIRA)" <ji...@apache.org> on 2015/07/14 13:14:04 UTC

[jira] [Created] (SPARK-9032) scala.MatchError in DataFrameReader.json(String path)

Philipp Poetter created SPARK-9032:
--------------------------------------

             Summary: scala.MatchError in DataFrameReader.json(String path)
                 Key: SPARK-9032
                 URL: https://issues.apache.org/jira/browse/SPARK-9032
             Project: Spark
          Issue Type: Bug
          Components: Java API, SQL
    Affects Versions: 1.4.0
         Environment: Ubuntu 15.04
            Reporter: Philipp Poetter


Calling read().json() on a SQLContext (i.e. DataFrameReader.json(String path)) raises a scala.MatchError while reading JSON data. The stack trace is as follows:

15/07/14 11:25:26 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
15/07/14 11:25:26 INFO DAGScheduler: Job 0 finished: json at Example.java:23, took 6.981330 s
Exception in thread "main" scala.MatchError: StringType (of class org.apache.spark.sql.types.StringType$)
	at org.apache.spark.sql.json.InferSchema$.apply(InferSchema.scala:58)
	at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:139)
	at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:138)
	at scala.Option.getOrElse(Option.scala:120)
	at org.apache.spark.sql.json.JSONRelation.schema$lzycompute(JSONRelation.scala:137)
	at org.apache.spark.sql.json.JSONRelation.schema(JSONRelation.scala:137)
	at org.apache.spark.sql.sources.LogicalRelation.<init>(LogicalRelation.scala:30)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:120)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:104)
	at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:213)
	at com.hp.sparkdemo.Example.main(Example.java:23)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
15/07/14 11:25:26 INFO SparkContext: Invoking stop() from shutdown hook
15/07/14 11:25:26 INFO SparkUI: Stopped Spark web UI at http://10.0.2.15:4040
15/07/14 11:25:26 INFO DAGScheduler: Stopping DAGScheduler
15/07/14 11:25:26 INFO SparkDeploySchedulerBackend: Shutting down all executors
15/07/14 11:25:26 INFO SparkDeploySchedulerBackend: Asking each executor to shut down
15/07/14 11:25:26 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!

Offending code snippet (around line 23):
...
        JavaSparkContext sctx = new JavaSparkContext(sparkConf);
        SQLContext ctx = new SQLContext(sctx);
        DataFrame frame = ctx.read().json(facebookJSON);
        frame.printSchema();
...

The exception is reproducible using the following JSON:

{
   "data": [
      {
         "id": "X999_Y999",
         "from": {
            "name": "Tom Brady", "id": "X12"
         },
         "message": "Looking forward to 2010!",
         "actions": [
            {
               "name": "Comment",
               "link": "http://www.facebook.com/X999/posts/Y999"
            },
            {
               "name": "Like",
               "link": "http://www.facebook.com/X999/posts/Y999"
            }
         ],
         "type": "status",
         "created_time": "2010-08-02T21:27:44+0000",
         "updated_time": "2010-08-02T21:27:44+0000"
      },
      {
         "id": "X998_Y998",
         "from": {
            "name": "Peyton Manning", "id": "X18"
         },
         "message": "Where's my contract?",
         "actions": [
            {
               "name": "Comment",
               "link": "http://www.facebook.com/X998/posts/Y998"
            },
            {
               "name": "Like",
               "link": "http://www.facebook.com/X998/posts/Y998"
            }
         ],
         "type": "status",
         "created_time": "2010-08-02T21:27:44+0000",
         "updated_time": "2010-08-02T21:27:44+0000"
      }
   ]
}
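
A possible workaround, offered as a sketch rather than a confirmed fix: the Spark SQL guide states that the JSON data source expects each line of the input to be a separate, self-contained JSON object. The document above is pretty-printed across many lines, so it may be that the reader treats each line as its own record and infers a non-struct root type, which would be consistent with the MatchError on StringType. That diagnosis is an assumption and is not verified in this report. The helper below (the class name JsonOneLine and its command-line arguments are hypothetical, not part of Spark) collapses a pretty-printed document onto a single line by dropping whitespace outside string literals, using only the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of Spark: rewrites a pretty-printed JSON
// document as a single line so that a line-oriented reader sees one record.
public class JsonOneLine {

    /** Removes all whitespace that lies outside JSON string literals. */
    static String collapse(String json) {
        StringBuilder out = new StringBuilder(json.length());
        boolean inString = false; // currently inside a "..." literal
        boolean escaped = false;  // previous character was a backslash
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                out.append(c);            // keep string contents untouched
                if (escaped) {
                    escaped = false;      // this character was escaped
                } else if (c == '\\') {
                    escaped = true;       // next character is escaped
                } else if (c == '"') {
                    inString = false;     // closing quote
                }
            } else if (c == '"') {
                inString = true;          // opening quote
                out.append(c);
            } else if (!Character.isWhitespace(c)) {
                out.append(c);            // structural character or literal
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: JsonOneLine <input.json> <output.json>");
            return;
        }
        // args[0]: pretty-printed input, args[1]: single-line output
        String in = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        String oneLine = collapse(in) + "\n";
        Files.write(Paths.get(args[1]), oneLine.getBytes(StandardCharsets.UTF_8));
    }
}
```

The output file could then be passed to ctx.read().json(...) in place of the original path. Whether this avoids the exception on 1.4.0 has not been tested here.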



