Posted to issues@spark.apache.org by "Frederick Reiss (JIRA)" <ji...@apache.org> on 2015/05/01 18:18:06 UTC

[jira] [Commented] (SPARK-7273) The SQLContext.jsonFile() api has a problem when load a format json file?

    [ https://issues.apache.org/jira/browse/SPARK-7273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523384#comment-14523384 ] 

Frederick Reiss commented on SPARK-7273:
----------------------------------------

The error in the description indicates that there is a character in the middle of the first line of the JSON file that TextInputFormat treats as a line separator. Spark therefore sees the JSON content as multiple records, and the trailing fragment, "age" : "20"}, is not a complete JSON object on its own, which produces the parse error quoted below.

I can think of two potential causes:
a) Steven's JSON content has been run through a pretty-printing function, leaving a newline character between the two halves of the JSON object, or
b) Steven's local Hadoop/YARN configuration has a nonstandard setting for "textinputformat.record.delimiter".
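To illustrate cause (a), here is a minimal Python stand-in (not a Spark API; parse_per_line is a hypothetical helper) for what happens when each line is parsed as a standalone JSON record, the way TextInputFormat plus JsonRDD do by default:

```python
import json

pretty = '{ "name": "steven",\n    "age" : "20"\n}'
one_line = '{"name": "steven", "age": "20"}'

def parse_per_line(text):
    # TextInputFormat (with its default delimiter) yields one record per
    # newline; the JSON parser then sees each fragment in isolation.
    results = []
    for line in text.split("\n"):
        try:
            json.loads(line)
            results.append("ok")
        except ValueError:
            results.append("fail")
    return results

print(parse_per_line(pretty))    # every fragment fails in isolation
print(parse_per_line(one_line))  # a single-line object parses fine
```

Each fragment of the pretty-printed object fails on its own, while the same object on a single line parses cleanly.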

[~jiege]: Can you share a copy of your JSON file?

Technical details:
SQLContext.jsonFile() makes a call to org.apache.spark.sql.json.DefaultSource, which delegates the task to org.apache.spark.sql.json.JSONRelation, which uses SparkContext.textFile() to open the JSON file. SparkContext.textFile() uses TextInputFormat to read the file.
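If cause (a) applies, one workaround (an editorial sketch, not part of the original comment) is to re-serialize the file so each object occupies exactly one line, which is the shape SQLContext.jsonFile() expects. A hypothetical Python helper, assuming the file holds a single pretty-printed object:

```python
import json

def collapse_to_one_line(src_path, dst_path):
    # Load the (possibly pretty-printed) JSON object, then write it back
    # as a single line so TextInputFormat sees exactly one record.
    with open(src_path) as f:
        obj = json.load(f)
    with open(dst_path, "w") as f:
        f.write(json.dumps(obj))
```

Running this over test.json before calling jsonFile() would sidestep the per-line parsing problem.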

> The SQLContext.jsonFile() api has a problem when load a format json file?
> -------------------------------------------------------------------------
>
>                 Key: SPARK-7273
>                 URL: https://issues.apache.org/jira/browse/SPARK-7273
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.1
>            Reporter: steven
>            Priority: Minor
>
> my code is as follows:
>  val df = sqlContext.jsonFile("test.json");
> test.json content is:
>  { "name": "steven",
>     "age" : "20"
> }
> the jsonFile call throws an Exception as follows:
>       java.lang.RuntimeException: Failed to parse record     "age" : "20"}. Please make sure that each line of the file (or each string in the RDD) is a valid JSON object or an array of JSON objects.
> 	at scala.sys.package$.error(package.scala:27)
> 	at org.apache.spark.sql.json.JsonRDD$$anonfun$parseJson$1$$anonfun$apply$2.apply(JsonRDD.scala:313)
> 	at org.apache.spark.sql.json.JsonRDD$$anonfun$parseJson$1$$anonfun$apply$2.apply(JsonRDD.scala:307)
> is it a bug?


