Posted to reviews@spark.apache.org by MaxGekk <gi...@git.apache.org> on 2018/09/10 19:28:27 UTC

[GitHub] spark pull request #22237: [SPARK-25243][SQL] Use FailureSafeParser in from_...

Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22237#discussion_r216444450
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala ---
    @@ -469,4 +470,26 @@ class JsonFunctionsSuite extends QueryTest with SharedSQLContext {
     
         checkAnswer(sql("""select json[0] from jsonTable"""), Seq(Row(null)))
       }
    +
    +  test("from_json invalid json - check modes") {
    +    val df = Seq("""{"a" 1}""", """{"a": 2}""").toDS()
    +    val schema = new StructType().add("a", IntegerType)
    +
    +    checkAnswer(
    +      df.select(from_json($"value", schema, Map("mode" -> "PERMISSIVE"))),
    +      Row(Row(null)) :: Row(Row(2)) :: Nil)
    +
    +    val exception1 = intercept[SparkException] {
    +      df.select(from_json($"value", schema, Map("mode" -> "FAILFAST"))).collect()
    +    }.getMessage
    +    assert(exception1.contains(
    +      "Malformed records are detected in record parsing. Parse Mode: FAILFAST."))
    +
    +    val exception2 = intercept[AnalysisException] {
    +      df.select(from_json($"value", schema, Map("mode" -> "DROPMALFORMED"))).collect()
    --- End diff --
    
    I replaced it with `AnalysisException`, but I think that is the wrong decision. Throwing an `AnalysisException` at run time looks ugly:
    ```
    Caused by: org.apache.spark.sql.AnalysisException: from_json() doesn't support the DROPMALFORMED mode. Acceptable modes are PERMISSIVE and FAILFAST.;
    	at org.apache.spark.sql.catalyst.expressions.JsonToStructs.parser$lzycompute(jsonExpressions.scala:568)
    	at org.apache.spark.sql.catalyst.expressions.JsonToStructs.parser(jsonExpressions.scala:564)
    ...
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    	at java.lang.Thread.run(Thread.java:748)
    ```
    I am going to replace it with something else or revert to `IllegalArgumentException`.
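
    For illustration only, the `IllegalArgumentException` option could look roughly like the sketch below. It assumes nothing about Spark's real code: `ParseMode`, `ModeCheck`, `parseMode` and `requireSupported` are hypothetical stand-ins, not the names used in `jsonExpressions.scala`. The point is that an unsupported mode is rejected with a plain argument error instead of an analysis-phase exception surfacing at run time.
    ```scala
    // Hypothetical stand-ins for the parse modes; not Spark's actual classes.
    sealed trait ParseMode
    case object Permissive extends ParseMode
    case object FailFast extends ParseMode
    case object DropMalformed extends ParseMode

    object ModeCheck {
      // Map the user-supplied "mode" option to a ParseMode, case-insensitively.
      def parseMode(name: String): ParseMode =
        name.toUpperCase(java.util.Locale.ROOT) match {
          case "PERMISSIVE"    => Permissive
          case "FAILFAST"      => FailFast
          case "DROPMALFORMED" => DropMalformed
          case other =>
            throw new IllegalArgumentException(s"Unknown parse mode: $other")
        }

      // Reject modes that from_json cannot honour with a plain
      // IllegalArgumentException rather than an AnalysisException.
      def requireSupported(mode: ParseMode): ParseMode = mode match {
        case Permissive | FailFast => mode
        case DropMalformed =>
          throw new IllegalArgumentException(
            "from_json() doesn't support the DROPMALFORMED mode. " +
              "Acceptable modes are PERMISSIVE and FAILFAST.")
      }
    }
    ```
    With this shape, `ModeCheck.requireSupported(ModeCheck.parseMode("DROPMALFORMED"))` fails with an `IllegalArgumentException`, so the test would intercept that type instead of `AnalysisException`.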


---
