You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by MaxGekk <gi...@git.apache.org> on 2018/05/26 17:32:02 UTC

[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

GitHub user MaxGekk opened a pull request:

    https://github.com/apache/spark/pull/21439

    [SPARK-24391][SQL] Support arrays of any types by from_json

    ## What changes were proposed in this pull request?
    
    The PR removes a restriction for element types of array type which exists in `from_json` for the root type. Currently, the function can handle only arrays of structs. Even array of primitive types is disallowed. The PR allows arrays of any types currently supported by JSON datasource. Here is an example of an array of a primitive type:
    
    ```
    scala> import org.apache.spark.sql.functions._
    scala> val df = Seq("[1, 2, 3]").toDF("a")
    scala> val schema = new ArrayType(IntegerType, false)
    scala> val arr = df.select(from_json($"a", schema))
    scala> arr.printSchema
    root
     |-- jsontostructs(a): array (nullable = true)
     |    |-- element: integer (containsNull = true)
    ```
    and result of converting of the json string to the `ArrayType`:
    ```
    scala> arr.show
    +----------------+
    |jsontostructs(a)|
    +----------------+
    |       [1, 2, 3]|
    +----------------+
    ```
    
    ## How was this patch tested?
    
    I added a few positive and negative tests:
    - array of primitive types
    - array of arrays
    - array of structs
    - array of maps
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/MaxGekk/spark-1 from_json-array

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21439.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21439
    
----
commit 3a7559b8809757ba32491a9c882ec40c8986c3b0
Author: Maxim Gekk <ma...@...>
Date:   2018-05-26T16:46:08Z

    Support arrays by from_json

commit b601a9365e12744c408a8c198a94a5c6a9e4607e
Author: Maxim Gekk <ma...@...>
Date:   2018-05-26T17:08:49Z

    Fix comments

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #92480 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92480/testReport)** for PR 21439 at commit [`fced8ec`](https://github.com/apache/spark/commit/fced8ec9557ad8173a6e1be07d7f4f11369e5727).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91874/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r204233840
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala ---
    @@ -61,6 +61,7 @@ class JacksonParser(
         dt match {
           case st: StructType => makeStructRootConverter(st)
           case mt: MapType => makeMapRootConverter(mt)
    +      case at: ArrayType => makeArrayRootConverter(at)
    --- End diff --
    
    ah, ok. You touched the the `JacksonParser.scala` file, so I though there were some behaivour changes in the json datasorce. But I notice that there are not, so the current tests are enough. Thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #93734 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93734/testReport)** for PR 21439 at commit [`021350b`](https://github.com/apache/spark/commit/021350b29c11c19c5a1b8343553c4a531abe39a2).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    @maropu @gengliangwang I added SQL tests. Please, take a look at the PR again, please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #93734 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93734/testReport)** for PR 21439 at commit [`021350b`](https://github.com/apache/spark/commit/021350b29c11c19c5a1b8343553c4a531abe39a2).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r204553995
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala ---
    @@ -101,6 +102,17 @@ class JacksonParser(
         }
       }
     
    +  private def makeArrayRootConverter(at: ArrayType): JsonParser => Seq[InternalRow] = {
    +    val elemConverter = makeConverter(at.elementType)
    +    (parser: JsonParser) => parseJsonToken[Seq[InternalRow]](parser, at) {
    +      case START_ARRAY => Seq(InternalRow(convertArray(parser, elemConverter)))
    +      case START_OBJECT if at.elementType.isInstanceOf[StructType] =>
    --- End diff --
    
    I can only say that this `case START_OBJECT` was added to handle the case when an user specified a schema as an `array` of `struct`, and a `struct` is found in the input json. See the existing test case which I pointed out above: https://github.com/apache/spark/pull/16929/files#diff-88230f171af0b7a40791a867f9dd3a36R382 . I don't want to change the behavior in the PR and potentially break user's apps. What I would propose is to put the functionality under a `spark.sql.legacy.*` flag which could be deleted in Spark 3.0.
    
    @maropu Initially I put the behavior under a flag in `JsonToStructs` but Reynold asked me to support new behavior without any new sticks/flags.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91668/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Is there anything for now which blocks the PR?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #94659 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94659/testReport)** for PR 21439 at commit [`74a7799`](https://github.com/apache/spark/commit/74a779964b666b36b36a65b2cdd4b47d9df1e04c).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94659/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #92981 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92981/testReport)** for PR 21439 at commit [`82d4fd5`](https://github.com/apache/spark/commit/82d4fd5dfa20b953ec71e2949b7578218ee0b260).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r204300575
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala ---
    @@ -101,6 +102,17 @@ class JacksonParser(
         }
       }
     
    +  private def makeArrayRootConverter(at: ArrayType): JsonParser => Seq[InternalRow] = {
    +    val elemConverter = makeConverter(at.elementType)
    +    (parser: JsonParser) => parseJsonToken[Seq[InternalRow]](parser, at) {
    +      case START_ARRAY => Seq(InternalRow(convertArray(parser, elemConverter)))
    +      case START_OBJECT if at.elementType.isInstanceOf[StructType] =>
    --- End diff --
    
    I think what was done in that PR is pretty different and it is actually the opposite of what we are doing here. Indeed, there we are returning an array of structs when a struct is specified as schema and the JSON contains an array. Here we are returning an array with one struct when the schema is an array of struct and there is a struct instead of an array.
    
    Despite I don't really like the behavior introduced in the PR you mentioned, I can understand it, as it was a way to support array of struct (the only at the moment) and I don't think we can change it before 3.0 at least for backward compatibility. But since here we are introducing a new behavior, if an array is required and a struct is found, I think returning an array with one element is a wrong/unexpected behavior and returning null would be what I'd expect as a user.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r191298921
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
    @@ -523,6 +523,8 @@ case class JsonToStructs(
       // can generate incorrect files if values are missing in columns declared as non-nullable.
       val nullableSchema = if (forceNullableSchema) schema.asNullable else schema
     
    +  val unpackArray: Boolean = options.get("unpackArray").map(_.toBoolean).getOrElse(false)
    --- End diff --
    
    Can you make the option `unpackArray` case-insensitive?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r209461334
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala ---
    @@ -101,6 +102,21 @@ class JacksonParser(
         }
       }
     
    +  private def makeArrayRootConverter(at: ArrayType): JsonParser => Seq[InternalRow] = {
    +    val elemConverter = makeConverter(at.elementType)
    +    (parser: JsonParser) => parseJsonToken[Seq[InternalRow]](parser, at) {
    +      case START_ARRAY => Seq(InternalRow(convertArray(parser, elemConverter)))
    +      case START_OBJECT if at.elementType.isInstanceOf[StructType] =>
    +        // This handles the case when an input JSON object is a structure but
    +        // the specified schema is an array of structures. In that case, the input JSON is
    --- End diff --
    
    Could you add an example here, like what we did in `makeStructRootConverter `?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by gengliangwang <gi...@git.apache.org>.
Github user gengliangwang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r192465130
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
    @@ -548,7 +553,9 @@ case class JsonToStructs(
           forceNullableSchema = SQLConf.get.getConf(SQLConf.FROM_JSON_FORCE_NULLABLE_SCHEMA))
     
       override def checkInputDataTypes(): TypeCheckResult = nullableSchema match {
    -    case _: StructType | ArrayType(_: StructType, _) | _: MapType =>
    +    case ArrayType(_: StructType, _) if unpackArray =>
    --- End diff --
    
    Even if `unpackArray` is `false`, the next branch in line 558 still do `super.checkInputDataTypes()` for any `ArrayType`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r204933936
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
    @@ -544,34 +544,27 @@ case class JsonToStructs(
           timeZoneId = None)
    --- End diff --
    
    Pleas also update the comment of `JsonToStructs`:
    
    `Converts an json input string to a [[StructType]] or [[ArrayType]] of [[StructType]]s`.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92986/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    > How does it relate to #19492?
    
    In my PR, I supported arrays of ANY types.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r204263502
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala ---
    @@ -101,6 +102,17 @@ class JacksonParser(
         }
       }
     
    +  private def makeArrayRootConverter(at: ArrayType): JsonParser => Seq[InternalRow] = {
    +    val elemConverter = makeConverter(at.elementType)
    +    (parser: JsonParser) => parseJsonToken[Seq[InternalRow]](parser, at) {
    +      case START_ARRAY => Seq(InternalRow(convertArray(parser, elemConverter)))
    +      case START_OBJECT if at.elementType.isInstanceOf[StructType] =>
    --- End diff --
    
    If an array of data is empty, and the schema is an array, it should return an empty array, https://github.com/apache/spark/pull/16929/files#diff-88230f171af0b7a40791a867f9dd3a36R389


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged to master.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r191299100
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
    @@ -523,6 +523,8 @@ case class JsonToStructs(
       // can generate incorrect files if values are missing in columns declared as non-nullable.
       val nullableSchema = if (forceNullableSchema) schema.asNullable else schema
     
    +  val unpackArray: Boolean = options.get("unpackArray").map(_.toBoolean).getOrElse(false)
    --- End diff --
    
    If we add this new option here, I feel we'd be better to document somewhere (e.g., `sq/functions.scala`)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Thank you @maropu for your review of the PR.
    
    > better to add tests in json-functions.sql?
    
    What kind of tests would you expect in `json-functions.sql`. Probably you would expect tests that are different from added to `JsonExpressionsSuite.scala`.
    
    > Can we also accept primitive arrays in to_json?
    
    I believe it should be implemented in another PR because the changes required for `to_json` don't intersect with this PR.
    
    
    
    
    



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    > What kind of tests would you expect in json-functions.sql. Probably you would expect tests
    > that are different from added to JsonExpressionsSuite.scala.
    IIUC there is no strict rule there and we tend to put SQL related tests in `SQLQueryTestSuite`.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #92480 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92480/testReport)** for PR 21439 at commit [`fced8ec`](https://github.com/apache/spark/commit/fced8ec9557ad8173a6e1be07d7f4f11369e5727).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    cc @gengliangwang 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    I set the option to the value which keep current behavior. So, it should be absolutely compatibly with current implementation


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92981/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #94659 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94659/testReport)** for PR 21439 at commit [`74a7799`](https://github.com/apache/spark/commit/74a779964b666b36b36a65b2cdd4b47d9df1e04c).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #94655 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94655/testReport)** for PR 21439 at commit [`74a7799`](https://github.com/apache/spark/commit/74a779964b666b36b36a65b2cdd4b47d9df1e04c).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #92986 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92986/testReport)** for PR 21439 at commit [`758d1df`](https://github.com/apache/spark/commit/758d1dfebc604715aac826c88c7ff8421095b05e).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Can we also accept primitive arrays in `to_json`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #91874 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91874/testReport)** for PR 21439 at commit [`ce9918b`](https://github.com/apache/spark/commit/ce9918b2598a4d70a95d2e8269a32afe2db2ec26).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r204274790
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala ---
    @@ -101,6 +102,17 @@ class JacksonParser(
         }
       }
     
    +  private def makeArrayRootConverter(at: ArrayType): JsonParser => Seq[InternalRow] = {
    +    val elemConverter = makeConverter(at.elementType)
    +    (parser: JsonParser) => parseJsonToken[Seq[InternalRow]](parser, at) {
    +      case START_ARRAY => Seq(InternalRow(convertArray(parser, elemConverter)))
    +      case START_OBJECT if at.elementType.isInstanceOf[StructType] =>
    --- End diff --
    
    I think it is a super weird case...can we put this special handling code for back-compatibility in `JsonToStructs`?
    Ether way, I think we should leave code comments here to make others understood easily?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    better to add tests in `json-functions.sql`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by gengliangwang <gi...@git.apache.org>.
Github user gengliangwang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r192502051
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala ---
    @@ -423,7 +423,9 @@ class JsonExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper with
         val input = """{"a": 1}"""
         val schema = ArrayType(StructType(StructField("a", IntegerType) :: Nil))
         val output = InternalRow(1) :: Nil
    -    checkEvaluation(JsonToStructs(schema, Map.empty, Literal(input), gmtId, true), output)
    +    checkEvaluation(
    +      JsonToStructs(schema, Map("unpackArray" -> "true"), Literal(input), gmtId, true),
    --- End diff --
    
    add case for `unpackArray` as `false`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93734/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    @maropu For now it is impossible to specify schema for `from_json` if it is not `StructType`. The PR https://github.com/apache/spark/pull/21472 solves the problem.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    @gatorsmile @HyukjinKwon May I ask you to look at the PR one more time.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #91356 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91356/testReport)** for PR 21439 at commit [`9d0230a`](https://github.com/apache/spark/commit/9d0230aa9c335e69099a0ddfc280244bb42a360b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #91356 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91356/testReport)** for PR 21439 at commit [`9d0230a`](https://github.com/apache/spark/commit/9d0230aa9c335e69099a0ddfc280244bb42a360b).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94680/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r204232671
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala ---
    @@ -61,6 +61,7 @@ class JacksonParser(
         dt match {
           case st: StructType => makeStructRootConverter(st)
           case mt: MapType => makeMapRootConverter(mt)
    +      case at: ArrayType => makeArrayRootConverter(at)
    --- End diff --
    
    Sorry, I didn't catch you meant specific class. Just in case, what is the reason for adding tests to `JsonSuite`? I changed behavior of a function, so, `JsonFunctionsSuite` is perfect place for new tests, I believe. At the moment, you cannot specify a schema of json objects different from `StructType` in `DataFrameReader`. In this way, the `ArrayType` can come into `JacksonParser` only from json functions (`from_json`). I can add similar test to `JsonSuite` as in `JsonFunctionsSuite` but it doesn't make any sense from my point of view.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r204932903
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
    @@ -544,34 +544,27 @@ case class JsonToStructs(
           timeZoneId = None)
     
       override def checkInputDataTypes(): TypeCheckResult = nullableSchema match {
    -    case _: StructType | ArrayType(_: StructType, _) | _: MapType =>
    +    case _: StructType | _: ArrayType | _: MapType =>
           super.checkInputDataTypes()
         case _ => TypeCheckResult.TypeCheckFailure(
           s"Input schema ${nullableSchema.catalogString} must be a struct or an array of structs.")
    --- End diff --
    
    `or an array of structs.` -> `or an array.`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #92981 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92981/testReport)** for PR 21439 at commit [`82d4fd5`](https://github.com/apache/spark/commit/82d4fd5dfa20b953ec71e2949b7578218ee0b260).
     * This patch **fails to build**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `public class JavaSummarizerExample `
      * `  class SerializableConfiguration(@transient var value: Configuration)`
      * `  class IncompatibleSchemaException(msg: String, ex: Throwable = null) extends Exception(msg, ex)`
      * `  case class SchemaType(dataType: DataType, nullable: Boolean)`
      * `  implicit class AvroDataFrameWriter[T](writer: DataFrameWriter[T]) `
      * `  implicit class AvroDataFrameReader(reader: DataFrameReader) `
      * `trait ComplexTypeMergingExpression extends Expression `
      * `case class Size(child: Expression) extends UnaryExpression with ExpectsInputTypes `
      * `case class MapConcat(children: Seq[Expression]) extends Expression `
      * `abstract class ArraySetLike extends BinaryArrayExpressionWithImplicitCast `
      * `case class ArrayUnion(left: Expression, right: Expression) extends ArraySetLike `
      * `  case class StreamingGlobalLimitStrategy(outputMode: OutputMode) extends Strategy `
      * `case class StreamingGlobalLimitExec(`
      * `sealed trait MultipleWatermarkPolicy `
      * `case class WatermarkTracker(policy: MultipleWatermarkPolicy) extends Logging `
      * `trait MemorySinkBase extends BaseStreamingSink `
      * `class MemorySink(val schema: StructType, outputMode: OutputMode) extends Sink`
      * `class MemoryWriter(sink: MemorySinkV2, batchId: Long, outputMode: OutputMode)`
      * `class MemoryStreamWriter(val sink: MemorySinkV2, outputMode: OutputMode)`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by gengliangwang <gi...@git.apache.org>.
Github user gengliangwang commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    I guess it is still controversial to have this new behavior with the new option.  


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #94666 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94666/testReport)** for PR 21439 at commit [`74a7799`](https://github.com/apache/spark/commit/74a779964b666b36b36a65b2cdd4b47d9df1e04c).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #91189 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91189/testReport)** for PR 21439 at commit [`b601a93`](https://github.com/apache/spark/commit/b601a9365e12744c408a8c198a94a5c6a9e4607e).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by huaxingao <gi...@git.apache.org>.
Github user huaxingao commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Sure. I will work on it. Thanks for letting me know. @viirya 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    @HyukjinKwon Are there any chances the PR will be merged? or I should close it?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94677/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #93499 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93499/testReport)** for PR 21439 at commit [`39a0a4e`](https://github.com/apache/spark/commit/39a0a4e96266747674814e5a19821d994e438ea1).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92480/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #93393 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93393/testReport)** for PR 21439 at commit [`2746d35`](https://github.com/apache/spark/commit/2746d35e567ac2eb8e9947e90d3ecf055177b0c4).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by gengliangwang <gi...@git.apache.org>.
Github user gengliangwang commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    retest this please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    @gatorsmile Could you look at the PR, please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by gengliangwang <gi...@git.apache.org>.
Github user gengliangwang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r192501912
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
    @@ -523,6 +523,11 @@ case class JsonToStructs(
       // can generate incorrect files if values are missing in columns declared as non-nullable.
       val nullableSchema = if (forceNullableSchema) schema.asNullable else schema
     
    +  private val caseInsensitiveOptions = CaseInsensitiveMap(options)
    +  private val unpackArray: Boolean = {
    --- End diff --
    
    Why do we need this? Can you add comments about it?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    LGTM 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    How does it relate to https://github.com/apache/spark/pull/19492?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    @gatorsmile May I ask you to look at the PR.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r204250446
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala ---
    @@ -101,6 +102,17 @@ class JacksonParser(
         }
       }
     
    +  private def makeArrayRootConverter(at: ArrayType): JsonParser => Seq[InternalRow] = {
    +    val elemConverter = makeConverter(at.elementType)
    +    (parser: JsonParser) => parseJsonToken[Seq[InternalRow]](parser, at) {
    +      case START_ARRAY => Seq(InternalRow(convertArray(parser, elemConverter)))
    +      case START_OBJECT if at.elementType.isInstanceOf[StructType] =>
    --- End diff --
    
    This is for backward compatibility to the behavior introduced by this PR: https://github.com/apache/spark/pull/16929,  most likely by this: https://github.com/apache/spark/pull/16929/files#diff-6626026091295ad8c0dfb66ecbcd04b1R506 . Even special test was added: https://github.com/apache/spark/pull/16929/files#diff-88230f171af0b7a40791a867f9dd3a36R382 . Please, ask author @HyukjinKwon about the it. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r204231416
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala ---
    @@ -61,6 +61,7 @@ class JacksonParser(
         dt match {
           case st: StructType => makeStructRootConverter(st)
           case mt: MapType => makeMapRootConverter(mt)
    +      case at: ArrayType => makeArrayRootConverter(at)
    --- End diff --
    
    > This change accepts the json datasource form that the master can't parse?
    
    Right, it accept arrays of any types comparing to the master which accepts arrays of structs only
    
    > If so, I think we need tests in JsonSuite ...
    
    I added a few tests to `JsonSuite` or do you mean some concrete test case?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #91874 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91874/testReport)** for PR 21439 at commit [`ce9918b`](https://github.com/apache/spark/commit/ce9918b2598a4d70a95d2e8269a32afe2db2ec26).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    cc @mgaido91 sinceI found you are not cc'ed here.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r204931175
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala ---
    @@ -136,12 +136,11 @@ class JsonFunctionsSuite extends QueryTest with SharedSQLContext {
       test("from_json invalid schema") {
    --- End diff --
    
    Not a invalid schema now.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    I think R side is not update for this yet. @huaxingao would you like to do that?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94655/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93393/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #94680 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94680/testReport)** for PR 21439 at commit [`74a7799`](https://github.com/apache/spark/commit/74a779964b666b36b36a65b2cdd4b47d9df1e04c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #93396 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93396/testReport)** for PR 21439 at commit [`bc3a2dd`](https://github.com/apache/spark/commit/bc3a2ddb870a220cc075337b6ada31614901f591).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #93396 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93396/testReport)** for PR 21439 at commit [`bc3a2dd`](https://github.com/apache/spark/commit/bc3a2ddb870a220cc075337b6ada31614901f591).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #91668 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91668/testReport)** for PR 21439 at commit [`e321e37`](https://github.com/apache/spark/commit/e321e37b0aaf2d17ab267b685ea14ee778f06192).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    LGTM too.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #91189 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91189/testReport)** for PR 21439 at commit [`b601a93`](https://github.com/apache/spark/commit/b601a9365e12744c408a8c198a94a5c6a9e4607e).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #93499 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93499/testReport)** for PR 21439 at commit [`39a0a4e`](https://github.com/apache/spark/commit/39a0a4e96266747674814e5a19821d994e438ea1).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #91191 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91191/testReport)** for PR 21439 at commit [`02a97ac`](https://github.com/apache/spark/commit/02a97acdde938af1479c7bd93236c353fefc35d7).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91356/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #91191 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91191/testReport)** for PR 21439 at commit [`02a97ac`](https://github.com/apache/spark/commit/02a97acdde938af1479c7bd93236c353fefc35d7).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #92708 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92708/testReport)** for PR 21439 at commit [`f3efb1b`](https://github.com/apache/spark/commit/f3efb1b97f9366839eacbc2611e39013f8c1fcfc).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r205953494
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala ---
    @@ -101,6 +102,17 @@ class JacksonParser(
         }
       }
     
    +  private def makeArrayRootConverter(at: ArrayType): JsonParser => Seq[InternalRow] = {
    +    val elemConverter = makeConverter(at.elementType)
    +    (parser: JsonParser) => parseJsonToken[Seq[InternalRow]](parser, at) {
    +      case START_ARRAY => Seq(InternalRow(convertArray(parser, elemConverter)))
    +      case START_OBJECT if at.elementType.isInstanceOf[StructType] =>
    --- End diff --
    
    I am adding a comment for this.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #94677 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94677/testReport)** for PR 21439 at commit [`74a7799`](https://github.com/apache/spark/commit/74a779964b666b36b36a65b2cdd4b47d9df1e04c).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94666/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #92986 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92986/testReport)** for PR 21439 at commit [`758d1df`](https://github.com/apache/spark/commit/758d1dfebc604715aac826c88c7ff8421095b05e).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #93393 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93393/testReport)** for PR 21439 at commit [`2746d35`](https://github.com/apache/spark/commit/2746d35e567ac2eb8e9947e90d3ecf055177b0c4).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #92708 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92708/testReport)** for PR 21439 at commit [`f3efb1b`](https://github.com/apache/spark/commit/f3efb1b97f9366839eacbc2611e39013f8c1fcfc).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #91350 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91350/testReport)** for PR 21439 at commit [`9d0230a`](https://github.com/apache/spark/commit/9d0230aa9c335e69099a0ddfc280244bb42a360b).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r209468355
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql ---
    @@ -39,3 +39,8 @@ select from_json('{"a":1, "b":"2"}', 'struct<a:int,b:string>');
     -- infer schema of json literal
     select schema_of_json('{"c1":0, "c2":[1]}');
     select from_json('{"c1":[1, 2, 3]}', schema_of_json('{"c1":[0]}'));
    +
    +-- from_json - array type
    +select from_json('[1, 2, 3]', 'array<int>');
    --- End diff --
    
    added


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #91350 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91350/testReport)** for PR 21439 at commit [`9d0230a`](https://github.com/apache/spark/commit/9d0230aa9c335e69099a0ddfc280244bb42a360b).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #94655 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94655/testReport)** for PR 21439 at commit [`74a7799`](https://github.com/apache/spark/commit/74a779964b666b36b36a65b2cdd4b47d9df1e04c).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93499/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91191/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/21439


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r204233568
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala ---
    @@ -423,7 +423,9 @@ class JsonExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper with
         val input = """{"a": 1}"""
         val schema = ArrayType(StructType(StructField("a", IntegerType) :: Nil))
         val output = InternalRow(1) :: Nil
    -    checkEvaluation(JsonToStructs(schema, Map.empty, Literal(input), gmtId), output)
    +    checkEvaluation(
    +      JsonToStructs(schema, Map.empty, Literal(input), gmtId),
    +      output)
    --- End diff --
    
    Looks unrelated change.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92708/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by gengliangwang <gi...@git.apache.org>.
Github user gengliangwang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r192488365
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala ---
    @@ -101,6 +102,13 @@ class JacksonParser(
         }
       }
     
    +  private def makeArrayRootConverter(at: ArrayType): JsonParser => Seq[InternalRow] = {
    +    val elemConverter = makeConverter(at.elementType)
    +    (parser: JsonParser) => parseJsonToken[Seq[InternalRow]](parser, at) {
    +      case START_ARRAY => Seq(InternalRow(convertArray(parser, elemConverter)))
    --- End diff --
    
    In line 87:
    ```
            val array = convertArray(parser, elementConverter)
            // Here, as we support reading top level JSON arrays and take every element
            // in such an array as a row, this case is possible.
            if (array.numElements() == 0) {
              Nil
            } else {
              array.toArray[InternalRow](schema).toSeq
            }
    ```
    Should we also follow this?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91189/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r204226404
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala ---
    @@ -61,6 +61,7 @@ class JacksonParser(
         dt match {
           case st: StructType => makeStructRootConverter(st)
           case mt: MapType => makeMapRootConverter(mt)
    +      case at: ArrayType => makeArrayRootConverter(at)
    --- End diff --
    
    This change accepts the json datasource form that the master can't parse? If so, I think we need tests in `JsonSuite`, too. cc: @HyukjinKwon 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #94680 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94680/testReport)** for PR 21439 at commit [`74a7799`](https://github.com/apache/spark/commit/74a779964b666b36b36a65b2cdd4b47d9df1e04c).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93396/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r209461516
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql ---
    @@ -39,3 +39,8 @@ select from_json('{"a":1, "b":"2"}', 'struct<a:int,b:string>');
     -- infer schema of json literal
     select schema_of_json('{"c1":0, "c2":[1]}');
     select from_json('{"c1":[1, 2, 3]}', schema_of_json('{"c1":[0]}'));
    +
    +-- from_json - array type
    +select from_json('[1, 2, 3]', 'array<int>');
    --- End diff --
    
    Add more cases ?
    select from_json('[3, null, 4]', 'array<int>')
    select from_json('[3, "str", 4]', 'array<int>')


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #91668 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91668/testReport)** for PR 21439 at commit [`e321e37`](https://github.com/apache/spark/commit/e321e37b0aaf2d17ab267b685ea14ee778f06192).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #94677 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94677/testReport)** for PR 21439 at commit [`74a7799`](https://github.com/apache/spark/commit/74a779964b666b36b36a65b2cdd4b47d9df1e04c).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    @maropu @gengliangwang May I ask you to look at the PR one more time.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    @HyukjinKwon I have a practical use case where I have json objects of any types `struct`, `map` and `array of any types` in a column. I need the changes to cover the use case fully.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    retest this please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91350/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r204231723
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala ---
    @@ -61,6 +61,7 @@ class JacksonParser(
         dt match {
           case st: StructType => makeStructRootConverter(st)
           case mt: MapType => makeMapRootConverter(mt)
    +      case at: ArrayType => makeArrayRootConverter(at)
    --- End diff --
    
    You've already added tests in `JsonSuite`? It seems there are tests in `JsonFunctionsSuite`, `JsonExpressionSuite`, and `SQLQueryTestSuite` now?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r194491865
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala ---
    @@ -101,6 +102,13 @@ class JacksonParser(
         }
       }
     
    +  private def makeArrayRootConverter(at: ArrayType): JsonParser => Seq[InternalRow] = {
    +    val elemConverter = makeConverter(at.elementType)
    +    (parser: JsonParser) => parseJsonToken[Seq[InternalRow]](parser, at) {
    +      case START_ARRAY => Seq(InternalRow(convertArray(parser, elemConverter)))
    --- End diff --
    
    The code in line 87 returns `null` for json input `[]` if schema is `StructType(StructField("a", IntegerType) :: Nil)`. I would explain why we should return `null` in that case: we *extract* struct from the array. If the array is _empty_, it means there is nothing to extract and we returns `null` for the nothing.
    
    In case when schema is `ArrayType(...)`,  I believe we should return `empty` array for empty JSON array `[]`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r204915146
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala ---
    @@ -101,6 +102,17 @@ class JacksonParser(
         }
       }
     
    +  private def makeArrayRootConverter(at: ArrayType): JsonParser => Seq[InternalRow] = {
    +    val elemConverter = makeConverter(at.elementType)
    +    (parser: JsonParser) => parseJsonToken[Seq[InternalRow]](parser, at) {
    +      case START_ARRAY => Seq(InternalRow(convertArray(parser, elemConverter)))
    +      case START_OBJECT if at.elementType.isInstanceOf[StructType] =>
    --- End diff --
    
    Shall we add a comment on top of this `case` to explain it?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    @gatorsmile Could you tell me, please, what does prevent the PR from getting merged?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r191298844
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
    @@ -523,6 +523,8 @@ case class JsonToStructs(
       // can generate incorrect files if values are missing in columns declared as non-nullable.
       val nullableSchema = if (forceNullableSchema) schema.asNullable else schema
     
    +  val unpackArray: Boolean = options.get("unpackArray").map(_.toBoolean).getOrElse(false)
    --- End diff --
    
    private? (This is not related to this pr though, `nullableSchema` also can be private?)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    @maropu I need changes from this PR https://github.com/apache/spark/pull/21550 (or this https://github.com/apache/spark/pull/21472) to write SQL tests for `ArrayType`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21439#discussion_r204248469
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala ---
    @@ -101,6 +102,17 @@ class JacksonParser(
         }
       }
     
    +  private def makeArrayRootConverter(at: ArrayType): JsonParser => Seq[InternalRow] = {
    +    val elemConverter = makeConverter(at.elementType)
    +    (parser: JsonParser) => parseJsonToken[Seq[InternalRow]](parser, at) {
    +      case START_ARRAY => Seq(InternalRow(convertArray(parser, elemConverter)))
    +      case START_OBJECT if at.elementType.isInstanceOf[StructType] =>
    --- End diff --
    
    why do we need this? This means that if the user asks for an array of data and the data doesn't contain an array we return an array with a single element. This seems wrong to me. I'd rather return null or throw an exception.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    @gatorsmile @gengliangwang @maropu The change doesn't break existing behavior. I set new option to the value which preserve backward compatibly. The PR just extend existing implementation and supports arrays of any types. May I ask you to look at the PR again.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21439
  
    **[Test build #94666 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94666/testReport)** for PR 21439 at commit [`74a7799`](https://github.com/apache/spark/commit/74a779964b666b36b36a65b2cdd4b47d9df1e04c).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org