Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/03/02 11:16:14 UTC

[GitHub] [spark] iRakson opened a new pull request #27759: [SPARK-31008][SQL]Support json_array_length function

iRakson opened a new pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759
 
 
   
   ### What changes were proposed in this pull request?
   At the moment, Spark has no function that computes the length of a JSON array directly.
   I propose a `json_array_length` function that returns the length of the outermost JSON array.
   
   - If the input is a valid JSON array, the function returns the length of the outermost array. If the JSON is invalid, it returns null.
   - For a null array, `null` is returned.
   - If the user passes anything other than a JSON array, an analysis exception is thrown.
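
The behavior listed above can be illustrated with a minimal standalone sketch. This is not the code in this PR: `JsonArrayLengthSketch` and `jsonArrayLength` are hypothetical names, the sketch assumes `jackson-core` is on the classpath, it models null with `Option`, and it uses `IllegalArgumentException` as a stand-in for the analysis exception mentioned in the last bullet.

```scala
import com.fasterxml.jackson.core.{JsonFactory, JsonProcessingException, JsonToken}

// Hypothetical illustration of the described semantics, not the PR's implementation.
object JsonArrayLengthSketch {
  private val factory = new JsonFactory()

  /** Some(n) for a valid JSON array; None for null, empty, or invalid input. */
  def jsonArrayLength(json: String): Option[Int] = {
    if (json == null) return None
    val parser = factory.createParser(json)
    try {
      parser.nextToken() match {
        case null => None // empty input: nothing to parse
        case JsonToken.START_ARRAY =>
          var length = 0
          var token = parser.nextToken()
          while (token != JsonToken.END_ARRAY) {
            if (token == null) return None // input ended before the array closed
            length += 1
            parser.skipChildren() // skip nested arrays/objects so only outer elements count
            token = parser.nextToken()
          }
          Some(length)
        case other =>
          // Stand-in for the analysis exception described above (assumption).
          throw new IllegalArgumentException(s"Expected a JSON array, found token: $other")
      }
    } catch {
      case _: JsonProcessingException => None // malformed JSON maps to null
    } finally {
      parser.close()
    }
  }
}
```

Nested arrays and objects each count as one element because `skipChildren()` advances past them before the next outer token is read.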
   
   
   
   ### Why are the changes needed?
   
   - As mentioned in the JIRA, this function is supported by Presto, PostgreSQL, and Redshift.
   - It improves user experience and ease of use.
   
   
   
   ### Does this PR introduce any user-facing change?
   Yes. Users can now get the length of a JSON array directly by using `json_array_length`.
   
   
   
   ### How was this patch tested?
   Added unit tests.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609037085
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25504/
   Test PASSed.



[GitHub] [spark] HyukjinKwon commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-601023292
 
 
   retest this please



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594337296
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24019/
   Test PASSed.



[GitHub] [spark] iRakson commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-605156118
 
 
   gentle ping @MaxGekk @HyukjinKwon @maropu 



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-597406594
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24374/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609088064
 
 
   **[Test build #120817 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120817/testReport)** for PR 27759 at commit [`391f33d`](https://github.com/apache/spark/commit/391f33d003998f2f5672405dc26ee96e0a1e4333).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609940608
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120872/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-597468277
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609588884
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609037113
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120804/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-597406589
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] iRakson commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-593352709
 
 
   cc @cloud-fan @HyukjinKwon 



[GitHub] [spark] iRakson commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-603342725
 
 
   gentle ping @HyukjinKwon @MaxGekk @maropu 



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r400594807
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala
 ##########
 @@ -791,4 +791,30 @@ class JsonExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper with
         checkDecimalInfer(_, """struct<d:decimal(7,3)>""")
     }
   }
+
+  test("Length of JSON array") {
+    val null_json_array = """"""
+    val simple_json_array = """[1,2,3]"""
+    val empty_json_array = """[]"""
+    val json_array_of_array = """[[1],[2,3],[]]"""
+    val json_array_of_objects = """[{"a":123},{"b":"hello"}]"""
+    val complex_json_array = """[1,2,3,[33,44],{"key":[2,3,4]}]"""
+    val not_a_json_array = """{"key":"not a json array"}"""
+    val invalid_json_array = """[1,2,3,4,5"""
+
+    checkEvaluation(LengthOfJsonArray(Literal(null_json_array)), null)
+    checkEvaluation(LengthOfJsonArray(Literal(simple_json_array)), 3)
+    checkEvaluation(LengthOfJsonArray(Literal(empty_json_array)), 0)
+    checkEvaluation(LengthOfJsonArray(Literal(json_array_of_array)), 3)
+    checkEvaluation(LengthOfJsonArray(Literal(json_array_of_objects)), 2)
+    checkEvaluation(LengthOfJsonArray(Literal(complex_json_array)), 5)
+    checkEvaluation(LengthOfJsonArray(Literal(invalid_json_array)), null)
 
 Review comment:
   Shall we use a shorter pattern?
   ```scala
   Seq(
     ("", null),
     ("[]", 0),
     ("[1,2,3]", 3),
     ("[[1],[2,3],[]]", 3),
     ("""[{"a":123},{"b":"hello"}]""", 2),
     ("""[1,2,3,[33,44],{"key":[2,3,4]}]""", 5),
     ("""[1,2,3,4,5""", null)
   ).foreach { case (literal, expectedValue) =>
     checkEvaluation(LengthOfJsonArray(Literal(literal)), expectedValue)
   }
   
   val not_a_json_array = """{"key":"not a json array"}"""
   ...
   ```



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609753323
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120862/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-603340910
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609604020
 
 
   **[Test build #120860 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120860/testReport)** for PR 27759 at commit [`313151f`](https://github.com/apache/spark/commit/313151f5964ed111b34d3c0dd03305150b2ef0b1).
    * This patch **fails due to an unknown error code, -9**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r390425608
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON Array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array is required as argument. An Exception is thrown if any
+          other valid JSON strings are passed. `NULL` is returned in case of invalid JSON.
 
 Review comment:
   Ideally we should. I just returned `NULL` because other JSON functions return `NULL` as well.



[GitHub] [spark] MaxGekk commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r387498217
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,65 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns length of the jsonArray",
+  arguments = """
+    jsonArray - A JSON array is required as argument. `Analysis Exception` is thrown if any other
+    valid JSON expression is passed. `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+    > SELECT _FUNC_('[1,2,3,4]');
+      4
+    > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+      5
+    > SELECT _FUNC_('[1,2');
+      NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression)
+  extends UnaryExpression with CodegenFallback {
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    @transient
+    val json = child.eval(input).asInstanceOf[UTF8String]
+    try {
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => {
+          // return null if null array is encountered.
+          if (parser.nextToken() == null) {
+            return null
+          }
+          // Parse the array to compute its length.
+          parseCounter(parser, input)
+        }
+      }
+    } catch {
+      case _: JsonProcessingException => null
 
 Review comment:
   Just in case, the exceptions are handled by JacksonParser:
   https://github.com/apache/spark/blob/c1986204e59f1e8cc4b611d5a578cb248cb74c28/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala#L433



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609039222
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-602799639
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120216/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-595084300
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r403509915
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala
 ##########
 @@ -710,4 +710,11 @@ class JsonFunctionsSuite extends QueryTest with SharedSparkSession {
       Seq(Row("string")))
   }
 
+  test("json_array_length") {
+    val df = Seq(1).toDF("json")
+    val errMsg = intercept[AnalysisException] {
+      df.selectExpr("json_array_length(json)")
+    }.getMessage
+    assert(errMsg.contains("due to data type mismatch"))
+  }
 
 Review comment:
   Shall we remove this, since it is already covered more extensively in `json-functions.sql`?



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-595084145
 
 
   **[Test build #119367 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119367/testReport)** for PR 27759 at commit [`0cb4f84`](https://github.com/apache/spark/commit/0cb4f84162a8ed9bd2fb7b11b3a06fe2a702535d).
    * This patch **fails due to an unknown error code, -9**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609062957
 
 
   Build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-601140735
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120033/
   Test PASSed.



[GitHub] [spark] HyukjinKwon removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-601023332
 
 
   @MaxGekk does it look good to you?



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r400592519
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +781,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array. An Exception is thrown if any other valid JSON strings are passed.
+          `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('[1,2,3,4]');
+        4
+      > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+        5
+      > SELECT _FUNC_('[1,2');
+        NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression)
+  extends UnaryExpression with CodegenFallback {
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    val json = child.eval(input).asInstanceOf[UTF8String]
+    try {
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => {
+          // return null if null array is encountered.
+          if (parser.nextToken() == null) {
+            return null
+          }
+          // Parse the array to compute its length.
+          parseCounter(parser, input)
+        }
+      }
+    } catch {
+      case _: JsonProcessingException | _: IOException => null
 
 Review comment:
   `NullPointerException` is not caught here.
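
   A minimal sketch of one way to close that gap, assuming the NPE comes from a null `json` value reaching the parser: check for null before parsing. `NullGuardSketch`, `countElements` and `lengthOrNull` are hypothetical names, and the naive splitter only stands in for the Jackson-based parsing in the diff. This is an illustration, not the PR's code.

   ```scala
   object NullGuardSketch {
     // Hypothetical stand-in for the Jackson-based parsing in the diff.
     // It only handles flat arrays of scalars, which is enough for the sketch.
     def countElements(json: String): Int = {
       val body = json.trim.stripPrefix("[").stripSuffix("]").trim
       if (body.isEmpty) 0 else body.split(",").length
     }

     def lengthOrNull(json: String): Any = {
       // Guard first: a null input never reaches the parser,
       // so no NullPointerException can escape from it.
       if (json == null) return null
       countElements(json)
     }
   }
   ```

   With the guard in place, the `catch` clause only has to deal with parsing failures.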



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-601140735
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120033/
   Test PASSed.



[GitHub] [spark] iRakson commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-600013779
 
 
   gentle ping @HyukjinKwon @maropu @cloud-fan 



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-595084307
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119367/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609940596
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609591018
 
 
   **[Test build #120860 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120860/testReport)** for PR 27759 at commit [`313151f`](https://github.com/apache/spark/commit/313151f5964ed111b34d3c0dd03305150b2ef0b1).



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r387369101
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,59 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns length of the jsonArray",
 
 Review comment:
   Can you also add an `arguments` description?



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-596979155
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24343/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-601024307
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24751/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-593355177
 
 
   Can one of the admins verify this patch?



[GitHub] [spark] iRakson commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-595680224
 
 
   @HyukjinKwon @maropu @MaxGekk I have tried to address the review comments on my side. Please review once more.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594236888
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24002/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-596979143
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-597046784
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119614/
   Test FAILed.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r400594807
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala
 ##########
 @@ -791,4 +791,30 @@ class JsonExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper with
         checkDecimalInfer(_, """struct<d:decimal(7,3)>""")
     }
   }
+
+  test("Length of JSON array") {
+    val null_json_array = """"""
+    val simple_json_array = """[1,2,3]"""
+    val empty_json_array = """[]"""
+    val json_array_of_array = """[[1],[2,3],[]]"""
+    val json_array_of_objects = """[{"a":123},{"b":"hello"}]"""
+    val complex_json_array = """[1,2,3,[33,44],{"key":[2,3,4]}]"""
+    val not_a_json_array = """{"key":"not a json array"}"""
+    val invalid_json_array = """[1,2,3,4,5"""
+
+    checkEvaluation(LengthOfJsonArray(Literal(null_json_array)), null)
+    checkEvaluation(LengthOfJsonArray(Literal(simple_json_array)), 3)
+    checkEvaluation(LengthOfJsonArray(Literal(empty_json_array)), 0)
+    checkEvaluation(LengthOfJsonArray(Literal(json_array_of_array)), 3)
+    checkEvaluation(LengthOfJsonArray(Literal(json_array_of_objects)), 2)
+    checkEvaluation(LengthOfJsonArray(Literal(complex_json_array)), 5)
+    checkEvaluation(LengthOfJsonArray(Literal(invalid_json_array)), null)
 
 Review comment:
   Shall we use a shorter pattern?
   ```scala
   Seq(
     ("", null),
     ("[]", 0),
     ("[1,2,3]", 3),
     ("[[1],[2,3],[]]", 3),
     ("""[{"a":123},{"b":"hello"}]""", 2),
     ("""[1,2,3,[33,44],{"key":[2,3,4]}]""", 5),
     ("""[1,2,3,4,5""", null)
   ).foreach { case (literal, expectedValue) =>
     checkEvaluation(LengthOfJsonArray(Literal(literal)), expectedValue)
   }
   
   val not_a_json_array = """{"key":"not a json array"}"""
   ...
   ```



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-596979143
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] maropu commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r388010199
 
 

 ##########
 File path: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql
 ##########
 @@ -58,5 +58,16 @@ select schema_of_json('{"c1":01, "c2":0.1}', map('allowNumericLeadingZeros', 'tr
 select schema_of_json(null);
 CREATE TEMPORARY VIEW jsonTable(jsonField, a) AS SELECT * FROM VALUES ('{"a": 1, "b": 2}', 'a');
 SELECT schema_of_json(jsonField) FROM jsonTable;
+
+-- json_array_length
+select json_array_length('');
+select json_array_length('[]');
+select json_array_length('[1,2,3]');
+select json_array_length('[[1,2],[5,6,7]]');
+select json_array_length('[{"a":123},{"b":"hello"}]');
+select json_array_length('[1,2,3,[33,44],{"key":[2,3,4]}]');
+select json_array_length('{"key":"not a json array"}');
 
 Review comment:
   Does this function need to be array-specific? Yes, I know this matches the behaviour of PostgreSQL, but isn't `json_length` in MySQL the more general one?
   ```
   mysql> select json_length('[1,2,3]');
   +------------------------+
   | json_length('[1,2,3]') |
   +------------------------+
   |                      3 |
   +------------------------+
   1 row in set (0.01 sec)
   
   mysql> select json_length('{"key":"not a json array"}');
   +-------------------------------------------+
   | json_length('{"key":"not a json array"}') |
   +-------------------------------------------+
   |                                         1 |
   +-------------------------------------------+
   1 row in set (0.00 sec)
   ```
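
   To make the difference concrete, here is a rough sketch of the two semantics over a miniature JSON model. Everything in it (`JsonLengthSketch`, the `Json` ADT, `jsonLength`, `jsonArrayLength`) is hypothetical and exists only for illustration; it is neither Spark's nor MySQL's implementation, and `None` merely stands in for "rejected".

   ```scala
   object JsonLengthSketch {
     // Hypothetical miniature JSON model, used only for this illustration.
     sealed trait Json
     final case class JArr(items: List[Json]) extends Json
     final case class JObj(fields: Map[String, Json]) extends Json
     final case class JNum(value: Double) extends Json
     final case class JStr(value: String) extends Json

     // General, MySQL-style length: arrays count elements,
     // objects count members, scalars count as 1.
     def jsonLength(j: Json): Int = j match {
       case JArr(items)  => items.size
       case JObj(fields) => fields.size
       case _            => 1
     }

     // Array-specific length: anything that is not an array is rejected.
     def jsonArrayLength(j: Json): Option[Int] = j match {
       case JArr(items) => Some(items.size)
       case _           => None
     }
   }
   ```

   Under this model the object `{"key":"not a json array"}` has a general length of 1, while the array-specific variant rejects it.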



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609940608
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120872/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-602753524
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24929/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594501246
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r400592720
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +781,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array. An Exception is thrown if any other valid JSON strings are passed.
+          `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('[1,2,3,4]');
+        4
+      > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+        5
+      > SELECT _FUNC_('[1,2');
+        NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression)
+  extends UnaryExpression with CodegenFallback {
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    val json = child.eval(input).asInstanceOf[UTF8String]
+    try {
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => {
+          // return null if null array is encountered.
+          if (parser.nextToken() == null) {
+            return null
+          }
+          // Parse the array to compute its length.
+          parseCounter(parser, input)
+        }
+      }
+    } catch {
+      case _: JsonProcessingException | _: IOException => null
 
 Review comment:
   Should we catch `case NonFatal(e) =>` in general?
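
   For reference, a minimal sketch of that pattern using `scala.util.control.NonFatal` from the standard library. `NonFatalSketch` and `nullOnFailure` are hypothetical names, not part of the diff. Note that `NonFatal` also matches `NullPointerException`, so a broad catch like this would turn such failures into `null` results as well.

   ```scala
   import scala.util.control.NonFatal

   object NonFatalSketch {
     // Hypothetical helper: evaluates `body` and maps any non-fatal failure to null.
     // Fatal errors (e.g. OutOfMemoryError, InterruptedException) still propagate.
     def nullOnFailure(body: => Any): Any =
       try body
       catch { case NonFatal(_) => null }
   }
   ```

   The trade-off is that a broad catch is simpler than listing exception types but can also hide genuine bugs behind a `null` result.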



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594380143
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119279/
   Test FAILed.



[GitHub] [spark] iRakson commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-593532204
 
 
   > I would add an optimization rule instead of extending public API.
   
   I believe a public API would serve users better: they are already familiar with `json_array_length`, since most database engines support this function. It also seems more straightforward than `size` + `from_json`.
   
   
   



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609752360
 
 
   **[Test build #120862 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120862/testReport)** for PR 27759 at commit [`313151f`](https://github.com/apache/spark/commit/313151f5964ed111b34d3c0dd03305150b2ef0b1).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609037085
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25504/
   Test PASSed.



[GitHub] [spark] MaxGekk commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r387499401
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,65 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns length of the jsonArray",
+  arguments = """
+    jsonArray - A JSON array is required as argument. `Analysis Exception` is thrown if any other
+    valid JSON expression is passed. `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+    > SELECT _FUNC_('[1,2,3,4]');
+      4
+    > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
 
 Review comment:
   Should the expression support arrays with different element types? For instance, `from_json()` can parse only arrays of one particular element type. So you could get the length of such an array but could not parse it with `from_json`.



[GitHub] [spark] SparkQA removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-596977929
 
 
   **[Test build #119614 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119614/testReport)** for PR 27759 at commit [`7dc25b9`](https://github.com/apache/spark/commit/7dc25b9bb3f9cd05d5cb62e06dc552f2781ffab1).



[GitHub] [spark] iRakson commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-593841149
 
 
   > Could you check more databases, e.g., oracle, sql server, snowflake, ...?
   
   Teradata supports `json_array_length`.
   MySQL supports `JSON_LENGTH`, which returns the length of any JSON document.
   Amazon Athena supports it as well.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594501254
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24044/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609037111
 
 
   Build finished. Test FAILed.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r403505679
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -796,3 +796,75 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array. An exception is thrown if any other valid JSON strings are passed.
+          `NULL` is returned in case of an invalid JSON.
 
 Review comment:
   Shall we mention `NULL` input explicitly?
   ```
   - `NULL` is returned in case of an invalid JSON.
   + `NULL` is returned in case of `NULL` or an invalid JSON
   ```



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609591018
 
 
   **[Test build #120860 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120860/testReport)** for PR 27759 at commit [`313151f`](https://github.com/apache/spark/commit/313151f5964ed111b34d3c0dd03305150b2ef0b1).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594501246
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r403505291
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +781,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array. An Exception is thrown if any other valid JSON strings are passed.
+          `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('[1,2,3,4]');
+        4
+      > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+        5
+      > SELECT _FUNC_('[1,2');
+        NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression)
+  extends UnaryExpression with CodegenFallback {
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    val json = child.eval(input).asInstanceOf[UTF8String]
+    try {
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => {
+          // return null if null array is encountered.
+          if (parser.nextToken() == null) {
+            return null
+          }
+          // Parse the array to compute its length.
+          parseCounter(parser, input)
+        }
+      }
+    } catch {
+      case _: JsonProcessingException | _: IOException => null
 
 Review comment:
   The added line 831 looks okay.



[GitHub] [spark] SparkQA removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-595037637
 
 
   **[Test build #119365 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119365/testReport)** for PR 27759 at commit [`d00fe19`](https://github.com/apache/spark/commit/d00fe195eb4e6d95daddf6ca2d0002c1e19ff54e).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-602799635
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-597468277
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609764801
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r390687872
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON Array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array is required as argument. An Exception is thrown if any
+          other valid JSON strings are passed. `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('[1,2,3,4]');
+        4
+      > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+        5
+      > SELECT _FUNC_('[1,2');
+        NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression)
+  extends UnaryExpression with CodegenFallback {
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    val json = child.eval(input).asInstanceOf[UTF8String]
+    try {
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => {
+          // return null if null array is encountered.
+          if (parser.nextToken() == null) {
+            return null
+          }
+          // Parse the array to compute its length.
+          parseCounter(parser, input)
+        }
+      }
+    } catch {
+      case _: JsonProcessingException | _: IOException => null
+    }
+  }
+
+  private def parseCounter(parser: JsonParser, input: InternalRow): Int = {
+    var length: Int = 0;
+    // Only json array are supported for this function.
+    if (parser.currentToken != JsonToken.START_ARRAY) {
+      throw new AnalysisException(s"$prettyName can only be called on JSON Array.")
+    }
+    // Keep traversing until the end of Json Array
 
 Review comment:
   Json Array -> JSON Array
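
   Since the quoted hunk stops right before the traversal loop, here is a minimal sketch of one way to count the elements of the outermost array with Jackson's streaming API. It assumes `jackson-core` 2.x on the classpath; `OutermostArrayLength` and `length` are hypothetical names, `IllegalArgumentException` is a stand-in for the exception the PR throws, and none of this is claimed to match the PR's actual loop.

```scala
import com.fasterxml.jackson.core.{JsonFactory, JsonProcessingException, JsonToken}

object OutermostArrayLength {
  private val factory = new JsonFactory()

  /** Some(n) for a JSON array with n top-level elements; None for empty or malformed input. */
  def length(json: String): Option[Int] = {
    val parser = factory.createParser(json)
    try {
      if (parser.nextToken() == null) {
        None // no tokens at all, e.g. an empty string
      } else if (parser.currentToken() != JsonToken.START_ARRAY) {
        // Valid JSON, but not an array (stand-in for the PR's exception).
        throw new IllegalArgumentException("input is valid JSON but not an array")
      } else {
        var n = 0
        while (parser.nextToken() != JsonToken.END_ARRAY) {
          n += 1
          // If the element is a nested object or array, jump to its closing token
          // so that its children are not counted; a no-op for scalar values.
          parser.skipChildren()
        }
        Some(n)
      }
    } catch {
      // Malformed or truncated input such as "[1,2" ends up here.
      case _: JsonProcessingException => None
    } finally {
      parser.close()
    }
  }
}
```

   The `skipChildren()` call is what keeps the count at the outermost level: a nested object or array contributes one element regardless of how much it contains.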



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-597046495
 
 
   **[Test build #119614 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119614/testReport)** for PR 27759 at commit [`7dc25b9`](https://github.com/apache/spark/commit/7dc25b9bb3f9cd05d5cb62e06dc552f2781ffab1).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] maropu commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r388002965
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,59 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns length of the jsonArray",
 
 Review comment:
   nit: I like a consistent word: `length of` or `number of`?



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609088216
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-601140728
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609604080
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r403506254
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -796,3 +796,75 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON array.
 
 Review comment:
   Since JSON can be a nested structure, there might be multiple `inner` and `outer`. Can we use `the outmost` instead of `outer`?



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609074762
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] maropu commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r400573591
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +781,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array. An Exception is thrown if any other valid JSON strings are passed.
+          `NULL` is returned in case of invalid JSON.
 
 Review comment:
   nit: `invalid JSON` -> `an invalid JSON`?



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594337291
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594337296
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24019/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609611052
 
 
   **[Test build #120862 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120862/testReport)** for PR 27759 at commit [`313151f`](https://github.com/apache/spark/commit/313151f5964ed111b34d3c0dd03305150b2ef0b1).



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-603173223
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-597406589
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r403505679
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -796,3 +796,75 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array. An exception is thrown if any other valid JSON strings are passed.
+          `NULL` is returned in case of an invalid JSON.
 
 Review comment:
   Shall we mention `NULL` input explicitly? Otherwise, the previous sentence, `An exception is thrown if any other valid JSON strings are passed`, may mislead users.
   ```
   - `NULL` is returned in case of an invalid JSON.
   + `NULL` is returned in case of `NULL` or an invalid JSON
   ```
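
The contract this docstring describes has three outcomes: a count for a JSON array, `NULL` for `NULL` or unparseable input, and an exception for valid JSON that is not an array. Below is a minimal standalone sketch in Scala, assuming Jackson 2.x (`jackson-core`) is on the classpath. `JsonArrayLengthSketch` and `jsonArrayLength` are hypothetical names used only for illustration, not the PR's `LengthOfJsonArray` expression, and `None` stands in for SQL `NULL`.

```scala
import com.fasterxml.jackson.core.{JsonFactory, JsonProcessingException, JsonToken}

object JsonArrayLengthSketch {
  private val factory = new JsonFactory()

  // Hypothetical helper: counts the elements of the outermost JSON array.
  // None models SQL NULL (null input, empty input, or invalid JSON).
  def jsonArrayLength(json: String): Option[Int] = {
    if (json == null) {
      None
    } else {
      val parser = factory.createParser(json)
      try {
        val first = parser.nextToken()
        if (first == null) {
          // Empty input: there is no token at all.
          None
        } else if (first != JsonToken.START_ARRAY) {
          // Valid JSON, but not an array: surfaced as an exception, not NULL.
          throw new IllegalArgumentException("json_array_length can only be called on a JSON array.")
        } else {
          var length = 0
          while (parser.nextToken() != JsonToken.END_ARRAY) {
            length += 1
            // Skip nested arrays/objects so only top-level elements are counted.
            parser.skipChildren()
          }
          Some(length)
        }
      } catch {
        // Malformed or truncated JSON maps to NULL.
        case _: JsonProcessingException => None
      } finally {
        parser.close()
      }
    }
  }
}
```

In this sketch the `IllegalArgumentException` deliberately escapes the `catch`, which is what keeps "valid but not an array" distinct from "invalid JSON".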



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r390154084
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer Json Array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array is required as argument. `Analysis Exception` is thrown if any
+        other valid JSON expression is passed. `NULL` is returned in case of invalid JSON.
 
 Review comment:
   `other valid JSON expression is passed.` -> `other valid JSON strings are passed`



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594236878
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609037081
 
 
   Build finished. Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-595038357
 
 
   **[Test build #119365 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119365/testReport)** for PR 27759 at commit [`d00fe19`](https://github.com/apache/spark/commit/d00fe195eb4e6d95daddf6ca2d0002c1e19ff54e).
    * This patch **fails Scala style tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r388082157
 
 

 ##########
 File path: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql
 ##########
 @@ -58,5 +58,16 @@ select schema_of_json('{"c1":01, "c2":0.1}', map('allowNumericLeadingZeros', 'tr
 select schema_of_json(null);
 CREATE TEMPORARY VIEW jsonTable(jsonField, a) AS SELECT * FROM VALUES ('{"a": 1, "b": 2}', 'a');
 SELECT schema_of_json(jsonField) FROM jsonTable;
+
+-- json_array_length
+select json_array_length('');
+select json_array_length('[]');
+select json_array_length('[1,2,3]');
+select json_array_length('[[1,2],[5,6,7]]');
+select json_array_length('[{"a":123},{"b":"hello"}]');
+select json_array_length('[1,2,3,[33,44],{"key":[2,3,4]}]');
+select json_array_length('{"key":"not a json array"}');
 
 Review comment:
   I agree that `json_length` is more generic. I made the function array-specific because most databases support only an array-specific function.



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-597406322
 
 
   **[Test build #119643 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119643/testReport)** for PR 27759 at commit [`5a860d7`](https://github.com/apache/spark/commit/5a860d7c9ef3704737fe4971c704eb5f3a15a295).



[GitHub] [spark] MaxGekk commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r387495643
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,65 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns length of the jsonArray",
+  arguments = """
+    jsonArray - A JSON array is required as argument. `Analysis Exception` is thrown if any other
+    valid JSON expression is passed. `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+    > SELECT _FUNC_('[1,2,3,4]');
+      4
+    > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+      5
+    > SELECT _FUNC_('[1,2');
+      NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression)
+  extends UnaryExpression with CodegenFallback {
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    @transient
+    val json = child.eval(input).asInstanceOf[UTF8String]
+    try {
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => {
+          // return null if null array is encountered.
+          if (parser.nextToken() == null) {
+            return null
+          }
+          // Parse the array to compute its length.
+          parseCounter(parser, input)
+        }
+      }
+    } catch {
+      case _: JsonProcessingException => null
+    }
+  }
+
+  private def parseCounter(parser: JsonParser, input: InternalRow): Int = {
+    // Counter for length of array
 
 Review comment:
   I think the comment can be removed. I would rename `array_length` to `counter` or `length`.



[GitHub] [spark] iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r387471894
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala
 ##########
 @@ -3954,6 +3954,17 @@ object functions {
   def to_json(e: Column): Column =
     to_json(e, Map.empty[String, String])
 
+  /**
 
 Review comment:
   I believe it should be allowed to be used directly. 



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-596979155
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24343/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-597046779
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r403770485
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala
 ##########
 @@ -790,4 +790,27 @@ class JsonExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper with
         checkDecimalInfer(_, """struct<d:decimal(7,3)>""")
     }
   }
+
+  test("Length of JSON array") {
+    Seq(
+      ("""""", null),
+      ("""[1,2,3]""", 3),
+      ("""[]""", 0),
+      ("""[[1],[2,3],[]]""", 3),
 
 Review comment:
   This is different from the example I gave you. Do you have any reason to prefer `"""` here? Usually, Apache Spark prefers the simple form `"` over `"""`.
   - https://github.com/apache/spark/pull/27759#discussion_r400594807
   ```
   Seq(
     ("", null),
     ("[]", 0),
     ("[1,2,3]", 3),
     ("[[1],[2,3],[]]", 3),
   ```
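
For literals without embedded double quotes, the two forms produce the same string, so the choice is purely stylistic. Triple quotes only save escaping once the JSON itself contains `"`. A short Scala illustration follows; the object and value names are made up for this example.

```scala
object StringLiteralForms {
  // No embedded double quotes: both literals denote the same string.
  val tripleQuoted: String = """[[1],[2,3],[]]"""
  val plain: String = "[[1],[2,3],[]]"

  // Embedded double quotes: the plain form needs escapes, the triple-quoted form does not.
  val objectsTriple: String = """[{"a":123},{"b":"hello"}]"""
  val objectsEscaped: String = "[{\"a\":123},{\"b\":\"hello\"}]"
}
```

Under that reading, `"` suits cases such as `[1,2,3]`, while `"""` remains reasonable for the entries that contain JSON object keys.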



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-595038378
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119365/
   Test FAILed.



[GitHub] [spark] iRakson commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r403847787
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -796,3 +796,75 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in the outmost JSON array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in the outmost JSON array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array. An exception is thrown if any other valid JSON strings are passed.
+          `NULL` is returned in case of `NULL` or an invalid JSON.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('[1,2,3,4]');
+        4
+      > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+        5
+      > SELECT _FUNC_('[1,2');
+        NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression) extends UnaryExpression
+  with CodegenFallback with ExpectsInputTypes {
+
+  override def inputTypes: Seq[DataType] = Seq(StringType)
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    val json = child.eval(input).asInstanceOf[UTF8String]
+    // return null for null input
+    if (json == null) {
+      return null
+    }
+
+    try {
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => {
+          // return null if null array is encountered.
+          if (parser.nextToken() == null) {
+            return null
+          }
+          // Parse the array to compute its length.
+          parseCounter(parser, input)
+        }
+      }
+    } catch {
+      case _: JsonProcessingException | _: IOException => null
+    }
+  }
+
+  private def parseCounter(parser: JsonParser, input: InternalRow): Int = {
+    var length = 0
+    // Only JSON array are supported for this function.
+    if (parser.currentToken != JsonToken.START_ARRAY) {
+      throw new IllegalArgumentException(s"$prettyName can only be called on JSON array.")
+    }
+    // Keep traversing until the end of JSON array
+    while(parser.nextToken() != JsonToken.END_ARRAY) {
+      // Null indicates end of input.
+      if (parser.currentToken == null) {
+        throw new IllegalArgumentException("Please provide a valid JSON array.")
 
 Review comment:
   Yeah, it is unreachable code: if we encounter null before `END_ARRAY`, the JSON is invalid. I will remove this check.
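
The reasoning here rests on how the parser handles truncated input: as far as I know, Jackson reports an unclosed array as a parse error rather than returning a null token from inside it, so a null check within the loop never fires. A small sketch, assuming Jackson 2.x (`jackson-core`) on the classpath; `EofInsideArrayDemo` and `drainsWithoutError` are hypothetical names.

```scala
import com.fasterxml.jackson.core.{JsonFactory, JsonProcessingException}

object EofInsideArrayDemo {
  private val factory = new JsonFactory()

  // Returns true if every token can be read up to the null that marks a clean
  // end of input, and false if the parser raises an error along the way.
  def drainsWithoutError(json: String): Boolean = {
    val parser = factory.createParser(json)
    try {
      while (parser.nextToken() != null) {}
      true
    } catch {
      // A truncated array such as "[1,2,3" is expected to end up here.
      case _: JsonProcessingException => false
    } finally {
      parser.close()
    }
  }
}
```

With this behavior, truncated input is already handled by the `catch` clause around the parse, which is why the explicit null check inside the loop is redundant.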



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-603340928
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120271/
   Test PASSed.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r403770169
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala
 ##########
 @@ -790,4 +790,27 @@ class JsonExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper with
         checkDecimalInfer(_, """struct<d:decimal(7,3)>""")
     }
   }
+
+  test("Length of JSON array") {
+    Seq(
+      ("""""", null),
+      ("""[1,2,3]""", 3),
+      ("""[]""", 0),
+      ("""[[1],[2,3],[]]""", 3),
+      ("""[{"a":123},{"b":"hello"}]""", 2),
+      ("""[1,2,3,[33,44],{"key":[2,3,4]}]""", 5),
+      ("""[1,2,3,4,5""", null),
+      ("""Random String""", null)
+    ).foreach{
 
 Review comment:
   nit. `).foreach{` -> `).foreach {`



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609588887
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25554/
   Test PASSed.



[GitHub] [spark] iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r388079357
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,67 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns length of the jsonArray",
+  arguments = """
+    jsonArray - A JSON array is required as argument. `Analysis Exception` is thrown if any other
+    valid JSON expression is passed. `NULL` is returned in case of invalid JSON.
 
 Review comment:
   OK, I will make the changes.



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609039046
 
 
   **[Test build #120806 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120806/testReport)** for PR 27759 at commit [`b0c51dc`](https://github.com/apache/spark/commit/b0c51dcadcf85a11d634f04c730d4546b565c5ca).



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609038166
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25505/
   Test PASSed.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r390687943
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala
 ##########
 @@ -791,4 +791,30 @@ class JsonExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper with
         checkDecimalInfer(_, """struct<d:decimal(7,3)>""")
     }
   }
+
+  test("Length of Json Array") {
 
 Review comment:
   Json Array -> JSON array



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609074767
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120806/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609764211
 
 
   **[Test build #120872 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120872/testReport)** for PR 27759 at commit [`f44e24e`](https://github.com/apache/spark/commit/f44e24ec66e11fdb71c1a9a813f04f6e37244b61).



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609088216
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-602753524
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24929/
   Test PASSed.



[GitHub] [spark] iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r387458054
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,59 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns length of the jsonArray",
 
 Review comment:
   The arguments description has been added.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609113201
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] MaxGekk commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-593511926
 
 
   > At the moment we have to parse all ...
   
   You can avoid deep parsing by specifying string as element type. For example:
   ```scala
   scala> val df = Seq("""[{"a":1}, {"a": 2}]""").toDF("json")
   df: org.apache.spark.sql.DataFrame = [json: string]
   
   scala> df.select(size(from_json($"json", ArrayType(StringType)))).show
   +---------------------+
   |size(from_json(json))|
   +---------------------+
   |                    2|
   +---------------------+
   ``` 
   This does effectively the same thing as your expression. It may be less optimal because `from_json()` materializes the array, but how to optimize the combination of `size` + `from_json` on an array of strings is a separate question. I would add an optimization rule instead of extending the public API.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-601024307
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24751/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609764810
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25565/
   Test PASSed.



[GitHub] [spark] iRakson commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609038794
 
 
   @dongjoon-hyun I have tried to address all your review comments.
   I have added some more test cases in `json-functions.sql`. I think these test cases cover all the corner cases.
   
   Kindly review the changes.



[GitHub] [spark] dongjoon-hyun closed pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun closed pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759
 
 
   



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609764801
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-595038369
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] iRakson edited a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson edited a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-595680224
 
 
   @HyukjinKwon @maropu @MaxGekk I have addressed the review comments from my side. Please review again.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609038166
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25505/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-602799493
 
 
   **[Test build #120216 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120216/testReport)** for PR 27759 at commit [`9c7fc8e`](https://github.com/apache/spark/commit/9c7fc8ebaa24fa96973fa5e288ead686d4081738).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-597468283
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119643/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609611545
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25556/
   Test PASSed.



[GitHub] [spark] iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r387613455
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,65 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns length of the jsonArray",
+  arguments = """
+    jsonArray - A JSON array is required as argument. `Analysis Exception` is thrown if any other
+    valid JSON expression is passed. `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+    > SELECT _FUNC_('[1,2,3,4]');
+      4
+    > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+      5
+    > SELECT _FUNC_('[1,2');
+      NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression)
+  extends UnaryExpression with CodegenFallback {
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    @transient
+    val json = child.eval(input).asInstanceOf[UTF8String]
+    try {
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => {
+          // return null if null array is encountered.
+          if (parser.nextToken() == null) {
+            return null
+          }
+          // Parse the array to compute its length.
+          parseCounter(parser, input)
+        }
+      }
+    } catch {
+      case _: JsonProcessingException => null
+    }
+  }
+
+  private def parseCounter(parser: JsonParser, input: InternalRow): Int = {
+    // Counter for length of array
+    var array_length: Int = 0;
 
 Review comment:
   I forgot to change the name. I will rename it and remove the comment as well.



[GitHub] [spark] SparkQA removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594236428
 
 
   **[Test build #119263 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119263/testReport)** for PR 27759 at commit [`d8ec950`](https://github.com/apache/spark/commit/d8ec9504e27a4097d7b0997d5601105607f26dec).



[GitHub] [spark] iRakson commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r403515624
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -796,3 +796,75 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array. An exception is thrown if any other valid JSON strings are passed.
+          `NULL` is returned in case of an invalid JSON.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('[1,2,3,4]');
+        4
+      > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+        5
+      > SELECT _FUNC_('[1,2');
+        NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression) extends UnaryExpression
+  with CodegenFallback with ExpectsInputTypes {
+
+  override def inputTypes: Seq[AbstractDataType] = Seq(StringType)
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    val json = child.eval(input).asInstanceOf[UTF8String]
+    // return null for null input
+    if (json == null) {
+      return null
+    }
+
+    try {
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => {
+          // return null if null array is encountered.
+          if (parser.nextToken() == null) {
+            return null
+          }
+          // Parse the array to compute its length.
+          parseCounter(parser, input)
+        }
+      }
+    } catch {
+      case _: JsonProcessingException | _: IOException => null
+    }
+  }
+
+  private def parseCounter(parser: JsonParser, input: InternalRow): Int = {
+    var length = 0;
+    // Only JSON array are supported for this function.
+    if (parser.currentToken != JsonToken.START_ARRAY) {
+      throw new IllegalArgumentException(s"$prettyName can only be called on JSON array.")
+    }
+    // Keep traversing until the end of JSON array
+    while(parser.nextToken() != JsonToken.END_ARRAY) {
+      // Null indicates end of input.
+      if (parser.currentToken == null) {
+        throw new IllegalArgumentException("Please provide a valid JSON array.")
+      }
+      length += 1
+      // skip all the child of inner object or array
+      parser.skipChildren()
 
 Review comment:
   Yes.
   Actually, the Jackson parser throws `JsonProcessingException` for invalid JSON, so the explicit check is not required here.
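   
   A minimal standalone sketch of this point, not the PR's actual code. It assumes only `jackson-core` on the classpath; the object `OuterArrayLength` and the helper `outerArrayLength` are hypothetical names, and the helper returns `Option[Int]` where the Spark expression would return `null` or an `Int`. It illustrates that, inside an open array, `nextToken()` raises a `JsonProcessingException` subtype on premature end of input rather than returning `null`, so the loop needs no separate end-of-input check:
   ```scala
   import com.fasterxml.jackson.core.{JsonFactory, JsonProcessingException, JsonToken}

   object OuterArrayLength {
     private val factory = new JsonFactory()

     // Hypothetical helper: counts the elements of the outermost JSON array.
     // Returns None for empty or malformed input; throws for valid non-array JSON.
     def outerArrayLength(json: String): Option[Int] = {
       val parser = factory.createParser(json)
       try {
         val first = parser.nextToken()
         if (first == null) {
           None // empty input: there is no token at all
         } else if (first != JsonToken.START_ARRAY) {
           throw new IllegalArgumentException("expected a JSON array")
         } else {
           var length = 0
           // No null check in the loop: a truncated array makes nextToken() throw.
           while (parser.nextToken() != JsonToken.END_ARRAY) {
             length += 1
             parser.skipChildren() // skip over nested objects and arrays
           }
           Some(length)
         }
       } catch {
         case _: JsonProcessingException => None
       } finally {
         parser.close()
       }
     }

     def main(args: Array[String]): Unit = {
       println(outerArrayLength("[1,2,3,4]"))                         // Some(4)
       println(outerArrayLength("""[1,2,3,{"f1":1,"f2":[5,6]},4]""")) // Some(5)
       println(outerArrayLength("[1,2"))                              // None
     }
   }
   ```
   The three inputs are the ones from the `examples` section of the diff; the truncated `'[1,2'` case ends up in the `catch` branch, which corresponds to the `NULL` result documented there.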



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609764810
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25565/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594501254
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24044/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609062677
 
 
   **[Test build #120805 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120805/testReport)** for PR 27759 at commit [`a70dc7e`](https://github.com/apache/spark/commit/a70dc7e2634f2af56f089c12395eae797f690a88).
    * This patch passes all tests.
    * This patch **does not merge cleanly**.
    * This patch adds no public classes.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r390150210
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer Json Array.
 
 Review comment:
   Json -> JSON



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609604080
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r390368102
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON Array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array is required as argument. An Exception is thrown if any
+          other valid JSON strings are passed. `NULL` is returned in case of invalid JSON.
 
 Review comment:
   Wouldn't it be good to inform users that this function accepts only a JSON array?



[GitHub] [spark] iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r387613709
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,65 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns length of the jsonArray",
+  arguments = """
+    jsonArray - A JSON array is required as argument. `Analysis Exception` is thrown if any other
+    valid JSON expression is passed. `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+    > SELECT _FUNC_('[1,2,3,4]');
+      4
+    > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+      5
+    > SELECT _FUNC_('[1,2');
+      NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression)
+  extends UnaryExpression with CodegenFallback {
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    @transient
+    val json = child.eval(input).asInstanceOf[UTF8String]
+    try {
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => {
+          // return null if null array is encountered.
+          if (parser.nextToken() == null) {
+            return null
+          }
+          // Parse the array to compute its length.
+          parseCounter(parser, input)
+        }
+      }
+    } catch {
+      case _: JsonProcessingException => null
 
 Review comment:
   I will handle IOException; I missed that. Thanks for pointing it out.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-595038295
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24102/
   Test PASSed.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r390687811
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON Array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array is required as argument. An Exception is thrown if any
+          other valid JSON strings are passed. `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('[1,2,3,4]');
+        4
+      > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+        5
+      > SELECT _FUNC_('[1,2');
+        NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression)
+  extends UnaryExpression with CodegenFallback {
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    val json = child.eval(input).asInstanceOf[UTF8String]
+    try {
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => {
+          // return null if null array is encountered.
+          if (parser.nextToken() == null) {
+            return null
+          }
+          // Parse the array to compute its length.
+          parseCounter(parser, input)
+        }
+      }
+    } catch {
+      case _: JsonProcessingException | _: IOException => null
+    }
+  }
+
+  private def parseCounter(parser: JsonParser, input: InternalRow): Int = {
+    var length: Int = 0;
+    // Only json array are supported for this function.
 
 Review comment:
   json -> JSON



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609604089
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120860/
   Test FAILed.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r400588661
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +781,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array. An Exception is thrown if any other valid JSON strings are passed.
 
 Review comment:
   Technically, this function expects `String` as an input parameter, doesn't it?



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609038162
 
 
   Build finished. Test PASSed.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r390687886
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON Array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array is required as argument. An Exception is thrown if any
+          other valid JSON strings are passed. `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('[1,2,3,4]');
+        4
+      > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+        5
+      > SELECT _FUNC_('[1,2');
+        NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression)
+  extends UnaryExpression with CodegenFallback {
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    val json = child.eval(input).asInstanceOf[UTF8String]
+    try {
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => {
+          // return null if null array is encountered.
+          if (parser.nextToken() == null) {
+            return null
+          }
+          // Parse the array to compute its length.
+          parseCounter(parser, input)
+        }
+      }
+    } catch {
+      case _: JsonProcessingException | _: IOException => null
+    }
+  }
+
+  private def parseCounter(parser: JsonParser, input: InternalRow): Int = {
+    var length: Int = 0;
+    // Only json array are supported for this function.
+    if (parser.currentToken != JsonToken.START_ARRAY) {
+      throw new AnalysisException(s"$prettyName can only be called on JSON Array.")
+    }
+    // Keep traversing until the end of Json Array
+    while(parser.nextToken() != JsonToken.END_ARRAY) {
+      // Null indicates end of input.
+      if (parser.currentToken == null) {
+        throw new AnalysisException("Please provide a valid JSON Array.")
 
 Review comment:
   Array -> array
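
For reference, the counting approach in this hunk can be sketched outside Spark roughly as follows. This is not the PR's code: `JsonArrayLengthSketch` and `outerLength` are hypothetical names, `Option` stands in for SQL `NULL`, and `IllegalArgumentException` stands in for whichever exception type the PR settles on. It assumes Jackson 2.x on the classpath.

```scala
import com.fasterxml.jackson.core.{JsonFactory, JsonProcessingException, JsonToken}

object JsonArrayLengthSketch {
  private val factory = new JsonFactory()

  /**
   * Some(n) for a JSON array with n top-level elements, None for malformed
   * or empty input. Throws IllegalArgumentException for valid JSON that is
   * not an array.
   */
  def outerLength(json: String): Option[Int] = {
    val parser = factory.createParser(json)
    try {
      parser.nextToken() match {
        case null => None // no tokens at all, e.g. an empty string
        case JsonToken.START_ARRAY =>
          var length = 0
          while (parser.nextToken() != JsonToken.END_ARRAY) {
            length += 1
            // Jump over nested objects/arrays so only outer elements count.
            parser.skipChildren()
          }
          Some(length)
        case other =>
          throw new IllegalArgumentException(s"Expected a JSON array but found $other")
      }
    } catch {
      case _: JsonProcessingException => None // malformed or truncated JSON
    } finally {
      parser.close()
    }
  }
}
```

Because `skipChildren()` is a no-op on scalar tokens and advances to the matching end token on `START_OBJECT` or `START_ARRAY`, each pass of the loop consumes exactly one outer element regardless of nesting depth.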



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609062960
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120805/
   Test PASSed.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r400590956
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala
 ##########
 @@ -791,4 +791,30 @@ class JsonExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper with
         checkDecimalInfer(_, """struct<d:decimal(7,3)>""")
     }
   }
+
+  test("Length of JSON array") {
+    val null_json_array = """"""
+    val simple_json_array = """[1,2,3]"""
+    val empty_json_array = """[]"""
+    val json_array_of_array = """[[1],[2,3],[]]"""
+    val json_array_of_objects = """[{"a":123},{"b":"hello"}]"""
+    val complex_json_array = """[1,2,3,[33,44],{"key":[2,3,4]}]"""
+    val not_a_json_array = """{"key":"not a json array"}"""
+    val invalid_json_array = """[1,2,3,4,5"""
+
+    checkEvaluation(LengthOfJsonArray(Literal(null_json_array)), null)
+    checkEvaluation(LengthOfJsonArray(Literal(simple_json_array)), 3)
+    checkEvaluation(LengthOfJsonArray(Literal(empty_json_array)), 0)
+    checkEvaluation(LengthOfJsonArray(Literal(json_array_of_array)), 3)
+    checkEvaluation(LengthOfJsonArray(Literal(json_array_of_objects)), 2)
+    checkEvaluation(LengthOfJsonArray(Literal(complex_json_array)), 5)
+    checkEvaluation(LengthOfJsonArray(Literal(invalid_json_array)), null)
+
+    val exception = intercept[TestFailedException]{
+      checkEvaluation(LengthOfJsonArray(Literal(not_a_json_array)), null)
+    }.getCause
+
+    assert(exception.isInstanceOf[IllegalArgumentException])
 
 Review comment:
   Shall we remove this? You can use `val exception = intercept[IllegalArgumentException]{` at line 813.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r403505679
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -796,3 +796,75 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array. An exception is thrown if any other valid JSON strings are passed.
+          `NULL` is returned in case of an invalid JSON.
 
 Review comment:
   Shall we mention `NULL` input explicitly? Otherwise, the above sentence `An exception is thrown if any other valid JSON strings are passed` will mislead the users.
   ```
   - `NULL` is returned in case of an invalid JSON.
   + `NULL` is returned in case of `NULL` or an invalid JSON
   ```



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609039222
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594337291
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] maropu commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
maropu commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609609667
 
 
   retest this please



[GitHub] [spark] maropu commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r388004965
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,67 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns length of the jsonArray",
+  arguments = """
+    jsonArray - A JSON array is required as argument. `Analysis Exception` is thrown if any other
+    valid JSON expression is passed. `NULL` is returned in case of invalid JSON.
 
 Review comment:
   Can you follow the other format? e.g., https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L292



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r400592519
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +781,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array. An Exception is thrown if any other valid JSON strings are passed.
+          `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('[1,2,3,4]');
+        4
+      > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+        5
+      > SELECT _FUNC_('[1,2');
+        NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression)
+  extends UnaryExpression with CodegenFallback {
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    val json = child.eval(input).asInstanceOf[UTF8String]
+    try {
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => {
+          // return null if null array is encountered.
+          if (parser.nextToken() == null) {
+            return null
+          }
+          // Parse the array to compute its length.
+          parseCounter(parser, input)
+        }
+      }
+    } catch {
+      case _: JsonProcessingException | _: IOException => null
 
 Review comment:
   `NullPointerException` is not caught here.
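
A small sketch of the distinction raised here, assuming Jackson 2.x on the classpath (`NullInputSketch` and `firstToken` are hypothetical names, not from the PR): an empty string produces no tokens, so `nextToken()` returns null, whereas a null reference never reaches the parser at all. The sketch guards the null reference with `Option` before the parser is created instead of relying on the catch clause.

```scala
import com.fasterxml.jackson.core.{JsonFactory, JsonToken}

object NullInputSketch {
  private val factory = new JsonFactory()

  /** First token of `json`, or None for a null reference or token-free input. */
  def firstToken(json: String): Option[JsonToken] =
    Option(json).flatMap { text =>
      val parser = factory.createParser(text)
      // Option(...) turns the null returned at end of input into None.
      try Option(parser.nextToken()) finally parser.close()
    }
}
```

With this shape, both `firstToken(null)` and `firstToken("")` come back as `None` without any exception handling, which matches the documented behaviour of returning `NULL` for `NULL` input.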



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-593354666
 
 
   Can one of the admins verify this patch?



[GitHub] [spark] HyukjinKwon commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-601023332
 
 
   @MaxGekk does it look good to you?



[GitHub] [spark] MaxGekk commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r387493820
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,65 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns length of the jsonArray",
+  arguments = """
+    jsonArray - A JSON array is required as argument. `Analysis Exception` is thrown if any other
+    valid JSON expression is passed. `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+    > SELECT _FUNC_('[1,2,3,4]');
+      4
+    > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+      5
+    > SELECT _FUNC_('[1,2');
+      NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression)
+  extends UnaryExpression with CodegenFallback {
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    @transient
 
 Review comment:
   Is this annotation really needed?



[GitHub] [spark] iRakson commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-610762829
 
 
   Thank you all for patiently reviewing the PR.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594236888
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24002/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594326795
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119263/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-601140728
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609037081
 
 
   Build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-601024301
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] maropu commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r400573531
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +781,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array. An Exception is thrown if any other valid JSON strings are passed.
 
 Review comment:
   nit: `Exception` -> `exception`?



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594729588
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] iRakson commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-593461310
 
 
   > @iRakson What is the use case when you need to know length w/o full parsing? Whereas you can apply `size` after `from_json()`.
   
   There are cases where we only need JSON arrays whose size is greater than a certain threshold. At the moment we have to fully parse every JSON array before we can decide whether it is needed.
   This function solves that problem.
   There can be many other scenarios where users first want to know the size of a JSON array.
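
To make the "length without full parsing" idea concrete, here is a minimal sketch that counts the elements of the outer array with Jackson's streaming parser, skipping nested values instead of materializing them. It assumes Jackson core is on the classpath (as it is in Spark). `OuterArrayLengthSketch` and `outerArrayLength` are hypothetical names and not part of this PR; unlike the proposed expression, the sketch returns `None` for a valid non-array JSON value rather than throwing.

```scala
import com.fasterxml.jackson.core.{JsonFactory, JsonToken}
import java.io.IOException

// Hypothetical helper for illustration only; not the code in this PR.
object OuterArrayLengthSketch {
  private val factory = new JsonFactory()

  // Returns the number of elements in the outer JSON array, or None for
  // null input, malformed JSON, or (an assumption of this sketch) non-array JSON.
  def outerArrayLength(json: String): Option[Int] = {
    if (json == null) return None
    try {
      val parser = factory.createParser(json)
      try {
        if (parser.nextToken() != JsonToken.START_ARRAY) return None
        var length = 0
        while (parser.nextToken() != JsonToken.END_ARRAY) {
          // A null token means the input ended before the array was closed.
          if (parser.currentToken() == null) return None
          length += 1
          // Jump over nested objects/arrays without building any values.
          parser.skipChildren()
        }
        Some(length)
      } finally {
        parser.close()
      }
    } catch {
      // Jackson's parse exceptions are subclasses of IOException.
      case _: IOException => None
    }
  }
}
```

A caller could then keep only the rows where `outerArrayLength(json).exists(_ > threshold)` and run the more expensive `from_json` on those rows alone.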



[GitHub] [spark] iRakson commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-595092856
 
 
   retest this please



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609113204
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120817/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-603176360
 
 
   **[Test build #120271 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120271/testReport)** for PR 27759 at commit [`c17556e`](https://github.com/apache/spark/commit/c17556e24691807c373fd399d76cf6809cf7e0cb).



[GitHub] [spark] maropu commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r388119969
 
 

 ##########
 File path: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql
 ##########
 @@ -58,5 +58,16 @@ select schema_of_json('{"c1":01, "c2":0.1}', map('allowNumericLeadingZeros', 'tr
 select schema_of_json(null);
 CREATE TEMPORARY VIEW jsonTable(jsonField, a) AS SELECT * FROM VALUES ('{"a": 1, "b": 2}', 'a');
 SELECT schema_of_json(jsonField) FROM jsonTable;
+
+-- json_array_length
+select json_array_length('');
+select json_array_length('[]');
+select json_array_length('[1,2,3]');
+select json_array_length('[[1,2],[5,6,7]]');
+select json_array_length('[{"a":123},{"b":"hello"}]');
+select json_array_length('[1,2,3,[33,44],{"key":[2,3,4]}]');
+select json_array_length('{"key":"not a json array"}');
 
 Review comment:
   WDYT? @HyukjinKwon @MaxGekk 



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r390150061
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer Json Array.",
 
 Review comment:
   Json Array -> JSON array



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594380143
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119279/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-602799639
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120216/
   Test FAILed.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r403771024
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -796,3 +796,75 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in the outmost JSON array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in the outmost JSON array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array. An exception is thrown if any other valid JSON strings are passed.
+          `NULL` is returned in case of `NULL` or an invalid JSON.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('[1,2,3,4]');
+        4
+      > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+        5
+      > SELECT _FUNC_('[1,2');
+        NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression) extends UnaryExpression
+  with CodegenFallback with ExpectsInputTypes {
+
+  override def inputTypes: Seq[DataType] = Seq(StringType)
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    val json = child.eval(input).asInstanceOf[UTF8String]
+    // return null for null input
+    if (json == null) {
+      return null
+    }
+
+    try {
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => {
+          // return null if null array is encountered.
+          if (parser.nextToken() == null) {
+            return null
+          }
+          // Parse the array to compute its length.
+          parseCounter(parser, input)
+        }
+      }
+    } catch {
+      case _: JsonProcessingException | _: IOException => null
+    }
+  }
+
+  private def parseCounter(parser: JsonParser, input: InternalRow): Int = {
+    var length = 0
+    // Only JSON array are supported for this function.
+    if (parser.currentToken != JsonToken.START_ARRAY) {
+      throw new IllegalArgumentException(s"$prettyName can only be called on JSON array.")
+    }
+    // Keep traversing until the end of JSON array
+    while(parser.nextToken() != JsonToken.END_ARRAY) {
+      // Null indicates end of input.
+      if (parser.currentToken == null) {
+        throw new IllegalArgumentException("Please provide a valid JSON array.")
 
 Review comment:
   I'm wondering if we can have test coverage for this code path. Otherwise, this code path can be considered dead code.
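
One way to check whether this branch is reachable is a small probe that mirrors the loop in `parseCounter` and reports how the parser ends on truncated input. This is only a sketch, assuming Jackson core is on the classpath; `TruncatedArrayProbe` and `probe` are hypothetical names and not part of this PR. If the probe reports an exception rather than a null token for an input such as `[1,2`, then that input cannot reach the `IllegalArgumentException` branch.

```scala
import com.fasterxml.jackson.core.{JsonFactory, JsonToken}
import java.io.IOException

// Hypothetical probe for illustration only; not the code in this PR.
object TruncatedArrayProbe {
  private val factory = new JsonFactory()

  // Walks the outer array the same way parseCounter does and describes
  // how the walk ended.
  def probe(json: String): String = {
    try {
      val parser = factory.createParser(json)
      try {
        if (parser.nextToken() != JsonToken.START_ARRAY) return "not an array"
        while (parser.nextToken() != JsonToken.END_ARRAY) {
          // This is the branch under discussion: end of input seen as a null token.
          if (parser.currentToken() == null) return "null token"
          parser.skipChildren()
        }
        "reached END_ARRAY"
      } finally {
        parser.close()
      }
    } catch {
      // Truncated or malformed input may surface as a parse exception instead.
      case e: IOException => s"exception: ${e.getClass.getSimpleName}"
    }
  }
}
```

Running `TruncatedArrayProbe.probe("[1,2")` and similar inputs would show which of the two outcomes a unit test for this path has to expect.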



[GitHub] [spark] iRakson edited a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson edited a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-593352709
 
 
   cc @cloud-fan @HyukjinKwon @dongjoon-hyun @maropu 



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-595048017
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-601023752
 
 
   **[Test build #120033 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120033/testReport)** for PR 27759 at commit [`5a860d7`](https://github.com/apache/spark/commit/5a860d7c9ef3704737fe4971c704eb5f3a15a295).



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609074513
 
 
   **[Test build #120806 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120806/testReport)** for PR 27759 at commit [`b0c51dc`](https://github.com/apache/spark/commit/b0c51dcadcf85a11d634f04c730d4546b565c5ca).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594729588
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-595038290
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594326795
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119263/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-603339540
 
 
   **[Test build #120271 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120271/testReport)** for PR 27759 at commit [`c17556e`](https://github.com/apache/spark/commit/c17556e24691807c373fd399d76cf6809cf7e0cb).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r400591063
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala
 ##########
 @@ -791,4 +791,30 @@ class JsonExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper with
         checkDecimalInfer(_, """struct<d:decimal(7,3)>""")
     }
   }
+
+  test("Length of JSON array") {
+    val null_json_array = """"""
+    val simple_json_array = """[1,2,3]"""
+    val empty_json_array = """[]"""
+    val json_array_of_array = """[[1],[2,3],[]]"""
+    val json_array_of_objects = """[{"a":123},{"b":"hello"}]"""
+    val complex_json_array = """[1,2,3,[33,44],{"key":[2,3,4]}]"""
+    val not_a_json_array = """{"key":"not a json array"}"""
+    val invalid_json_array = """[1,2,3,4,5"""
+
+    checkEvaluation(LengthOfJsonArray(Literal(null_json_array)), null)
+    checkEvaluation(LengthOfJsonArray(Literal(simple_json_array)), 3)
+    checkEvaluation(LengthOfJsonArray(Literal(empty_json_array)), 0)
+    checkEvaluation(LengthOfJsonArray(Literal(json_array_of_array)), 3)
+    checkEvaluation(LengthOfJsonArray(Literal(json_array_of_objects)), 2)
+    checkEvaluation(LengthOfJsonArray(Literal(complex_json_array)), 5)
+    checkEvaluation(LengthOfJsonArray(Literal(invalid_json_array)), null)
+
+    val exception = intercept[TestFailedException]{
+      checkEvaluation(LengthOfJsonArray(Literal(not_a_json_array)), null)
+    }.getCause
+
+    assert(exception.isInstanceOf[IllegalArgumentException])
+    assert(exception.getMessage.contains("can only be called on JSON array"))
 
 Review comment:
   Shall we add one more negative case for an invalid input type?



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r403506487
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -796,3 +796,75 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array. An exception is thrown if any other valid JSON strings are passed.
+          `NULL` is returned in case of an invalid JSON.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('[1,2,3,4]');
+        4
+      > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+        5
+      > SELECT _FUNC_('[1,2');
+        NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression) extends UnaryExpression
+  with CodegenFallback with ExpectsInputTypes {
+
+  override def inputTypes: Seq[AbstractDataType] = Seq(StringType)
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    val json = child.eval(input).asInstanceOf[UTF8String]
+    // return null for null input
+    if (json == null) {
+      return null
+    }
+
+    try {
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => {
+          // return null if null array is encountered.
+          if (parser.nextToken() == null) {
+            return null
+          }
+          // Parse the array to compute its length.
+          parseCounter(parser, input)
+        }
+      }
+    } catch {
+      case _: JsonProcessingException | _: IOException => null
+    }
+  }
+
+  private def parseCounter(parser: JsonParser, input: InternalRow): Int = {
+    var length = 0;
 
 Review comment:
   Shall we remove the `;`, since this is **Scala**? Please check the rest of the code as well.



[GitHub] [spark] iRakson commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609090131
 
 
   > Please update the PR description, too. It's important because it will be a commit log.
   
   Updated. :)



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r387370446
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala
 ##########
 @@ -3954,6 +3954,17 @@ object functions {
   def to_json(e: Column): Column =
     to_json(e, Map.empty[String, String])
 
+  /**
 
 Review comment:
   Let's not add it here, per https://github.com/apache/spark/blob/d8ec9504e27a4097d7b0997d5601105607f26dec/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L42-L60



[GitHub] [spark] iRakson commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r403478544
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -796,3 +796,75 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array. An exception is thrown if any other valid JSON strings are passed.
+          `NULL` is returned in case of an invalid JSON.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('[1,2,3,4]');
+        4
+      > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+        5
+      > SELECT _FUNC_('[1,2');
+        NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression) extends UnaryExpression
+  with CodegenFallback with ExpectsInputTypes {
+
+  override def inputTypes: Seq[AbstractDataType] = Seq(StringType)
 
 Review comment:
   `AbstractDataType` -> `DataType` 
   I will fix this while handling review comments, if any.



[GitHub] [spark] SparkQA removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594336960
 
 
   **[Test build #119279 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119279/testReport)** for PR 27759 at commit [`58a9e0d`](https://github.com/apache/spark/commit/58a9e0dfdb7d72bc8ffd67b5c69f4b52a65e4b5e).



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r390150061
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer Json Array.",
 
 Review comment:
   Json -> JSON



[GitHub] [spark] iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r387614304
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,65 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns length of the jsonArray",
+  arguments = """
+    jsonArray - A JSON array is required as argument. `Analysis Exception` is thrown if any other
+    valid JSON expression is passed. `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+    > SELECT _FUNC_('[1,2,3,4]');
+      4
+    > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
 
 Review comment:
   Actually, I wanted it to work like other `json_array_length` functions, which take any input and return the length. I can change the implementation if required.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609113204
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120817/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609062957
 
 
   Build finished. Test PASSed.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r400587840
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +781,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array. An Exception is thrown if any other valid JSON strings are passed.
+          `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('[1,2,3,4]');
+        4
+      > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+        5
+      > SELECT _FUNC_('[1,2');
+        NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression)
+  extends UnaryExpression with CodegenFallback {
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    val json = child.eval(input).asInstanceOf[UTF8String]
 
 Review comment:
   Hi, @iRakson. Please use `ExpectsInputTypes` to prevent a `ClassCastException` like the following.
   ```
   spark-sql> SELECT json_array_length(1);
   20/03/30 18:19:11 ERROR SparkSQLDriver: Failed in [SELECT json_array_length(1)]
   java.lang.ClassCastException: class java.lang.Integer cannot be cast to class org.apache.spark.unsafe.types.UTF8String (java.lang.Integer is in module java.base of loader 'bootstrap'; org.apache.spark.unsafe.types.UTF8String is in unnamed module of loader 'app')
   ```
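
The failure mode in the log above can be illustrated with a toy model. This is only a sketch under loud assumptions: `ToyType`, `ToyLiteral`, and `InputTypeCheckSketch` are hypothetical names invented here and are not Spark classes, and the real `ExpectsInputTypes` check runs during analysis rather than inside evaluation. The point is simply that an unchecked cast fails at run time, while a declared input type lets the mismatch be rejected with a readable message.

```scala
// Toy model only: none of these names exist in Spark.
sealed trait ToyType
case object ToyString extends ToyType
case object ToyInt extends ToyType

// A literal value together with its declared type.
final case class ToyLiteral(value: Any, dataType: ToyType)

object InputTypeCheckSketch {
  // Unchecked evaluation: the cast throws ClassCastException for a non-string value.
  def evalUnchecked(arg: ToyLiteral): Int =
    arg.value.asInstanceOf[String].length

  // Checked evaluation: inspect the declared type first and report a clear error.
  def evalChecked(arg: ToyLiteral): Either[String, Int] =
    arg.dataType match {
      case ToyString => Right(arg.value.asInstanceOf[String].length)
      case other     => Left(s"argument 1 requires string type, but got $other")
    }
}
```

With this model, `evalUnchecked(ToyLiteral(1, ToyInt))` throws a `ClassCastException`, whereas `evalChecked` returns a `Left` describing the mismatch.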



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594236428
 
 
   **[Test build #119263 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119263/testReport)** for PR 27759 at commit [`d8ec950`](https://github.com/apache/spark/commit/d8ec9504e27a4097d7b0997d5601105607f26dec).



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r390687796
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON Array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array is required as argument. An Exception is thrown if any
+          other valid JSON strings are passed. `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('[1,2,3,4]');
+        4
+      > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+        5
+      > SELECT _FUNC_('[1,2');
+        NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression)
+  extends UnaryExpression with CodegenFallback {
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    val json = child.eval(input).asInstanceOf[UTF8String]
+    try {
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => {
+          // return null if null array is encountered.
+          if (parser.nextToken() == null) {
+            return null
+          }
+          // Parse the array to compute its length.
+          parseCounter(parser, input)
+        }
+      }
+    } catch {
+      case _: JsonProcessingException | _: IOException => null
+    }
+  }
+
+  private def parseCounter(parser: JsonParser, input: InternalRow): Int = {
+    var length: Int = 0;
+    // Only json array are supported for this function.
+    if (parser.currentToken != JsonToken.START_ARRAY) {
+      throw new AnalysisException(s"$prettyName can only be called on JSON Array.")
 
 Review comment:
   Array -> array
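
For readers following the truncated diff, the counting step can be sketched with Jackson's streaming API. This is a minimal sketch and not the code in the PR: `JsonArrayLengthSketch` and `outerLength` are hypothetical names, the result is an `Option[Int]` rather than a nullable value, and a non-array input raises `IllegalArgumentException` (an assumption that mirrors the assertion in the test suite quoted elsewhere in this thread).

```scala
import java.io.IOException

import com.fasterxml.jackson.core.{JsonFactory, JsonToken}

// Hypothetical helper, not part of Spark.
object JsonArrayLengthSketch {
  private val factory = new JsonFactory()

  // Some(n) for a well-formed outer array, None for null, empty, or malformed input.
  def outerLength(json: String): Option[Int] = {
    if (json == null) {
      None
    } else {
      val parser = factory.createParser(json)
      try {
        val first = parser.nextToken()
        if (first == null) {
          None // empty input: there is no token at all
        } else if (first != JsonToken.START_ARRAY) {
          throw new IllegalArgumentException("can only be called on JSON array")
        } else {
          var length = 0
          var token = parser.nextToken()
          while (token != null && token != JsonToken.END_ARRAY) {
            length += 1
            parser.skipChildren() // jump over a nested array or object in one step
            token = parser.nextToken()
          }
          if (token == null) None else Some(length)
        }
      } catch {
        // JsonProcessingException is a subclass of IOException.
        case _: IOException => None
      } finally {
        parser.close()
      }
    }
  }
}
```

Under these assumptions, `outerLength("[1,2,3,{\"f1\":1,\"f2\":[5,6]},4]")` yields `Some(5)`, because `skipChildren()` keeps nested elements from being counted.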



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609588887
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25554/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-597406322
 
 
   **[Test build #119643 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119643/testReport)** for PR 27759 at commit [`5a860d7`](https://github.com/apache/spark/commit/5a860d7c9ef3704737fe4971c704eb5f3a15a295).



[GitHub] [spark] MaxGekk commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r387493158
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,65 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns length of the jsonArray",
+  arguments = """
+    jsonArray - A JSON array is required as argument. `Analysis Exception` is thrown if any other
+    valid JSON expression is passed. `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+    > SELECT _FUNC_('[1,2,3,4]');
+      4
+    > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+      5
+    > SELECT _FUNC_('[1,2');
+      NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression)
 
 Review comment:
   The semantics of `LengthOfJsonArray` are similar to `Size` + `JsonToStructs`. What about extending `RuntimeReplaceable` and mapping `LengthOfJsonArray` to a combination of existing expressions?



[GitHub] [spark] SparkQA removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609037984
 
 
   **[Test build #120805 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120805/testReport)** for PR 27759 at commit [`a70dc7e`](https://github.com/apache/spark/commit/a70dc7e2634f2af56f089c12395eae797f690a88).



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-603340910
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-595038369
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-601140083
 
 
   **[Test build #120033 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120033/testReport)** for PR 27759 at commit [`5a860d7`](https://github.com/apache/spark/commit/5a860d7c9ef3704737fe4971c704eb5f3a15a295).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-595084300
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609113030
 
 
   **[Test build #120817 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120817/testReport)** for PR 27759 at commit [`391f33d`](https://github.com/apache/spark/commit/391f33d003998f2f5672405dc26ee96e0a1e4333).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609764211
 
 
   **[Test build #120872 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120872/testReport)** for PR 27759 at commit [`f44e24e`](https://github.com/apache/spark/commit/f44e24ec66e11fdb71c1a9a813f04f6e37244b61).



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-601023752
 
 
   **[Test build #120033 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120033/testReport)** for PR 27759 at commit [`5a860d7`](https://github.com/apache/spark/commit/5a860d7c9ef3704737fe4971c704eb5f3a15a295).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594236878
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609039227
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25506/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-595084307
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119367/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-593355177
 
 
   Can one of the admins verify this patch?



[GitHub] [spark] iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r387471894
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala
 ##########
 @@ -3954,6 +3954,17 @@ object functions {
   def to_json(e: Column): Column =
     to_json(e, Map.empty[String, String])
 
+  /**
 
 Review comment:
   I believe it should be allowed to be used directly. 



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-602799635
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-603176360
 
 
   **[Test build #120271 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120271/testReport)** for PR 27759 at commit [`c17556e`](https://github.com/apache/spark/commit/c17556e24691807c373fd399d76cf6809cf7e0cb).



[GitHub] [spark] MaxGekk commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r387497915
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,65 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns length of the jsonArray",
+  arguments = """
+    jsonArray - A JSON array is required as argument. `Analysis Exception` is thrown if any other
+    valid JSON expression is passed. `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+    > SELECT _FUNC_('[1,2,3,4]');
+      4
+    > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+      5
+    > SELECT _FUNC_('[1,2');
+      NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression)
+  extends UnaryExpression with CodegenFallback {
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    @transient
+    val json = child.eval(input).asInstanceOf[UTF8String]
+    try {
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => {
+          // return null if null array is encountered.
+          if (parser.nextToken() == null) {
+            return null
+          }
+          // Parse the array to compute its length.
+          parseCounter(parser, input)
+        }
+      }
+    } catch {
+      case _: JsonProcessingException => null
 
 Review comment:
   `nextToken` can also throw `IOException`, which this `catch` does not handle. See:
   ```
       public abstract JsonToken nextToken() throws IOException;
   ```
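
A minimal sketch of the point above, not the PR's code: in Jackson 2.x `JsonProcessingException` is a subclass of `java.io.IOException`, so a handler for `JsonProcessingException` alone lets any other `IOException` from `nextToken()` escape. The object `CatchSketch` and the helper `firstToken` are hypothetical names used only for illustration.

```scala
import java.io.IOException
import com.fasterxml.jackson.core.{JsonFactory, JsonProcessingException, JsonToken}

object CatchSketch {
  // Hypothetical stand-in for the shared factory used in the diff.
  private val factory = new JsonFactory()

  // Returns the first token of `json`, or None when the input is empty
  // or when reading/parsing fails.
  def firstToken(json: String): Option[JsonToken] = {
    val parser = factory.createParser(json)
    try {
      // nextToken() returns null at end of input, hence Option(...).
      Option(parser.nextToken())
    } catch {
      // JsonProcessingException is already an IOException in Jackson 2.x;
      // listing both mirrors the form suggested in the review.
      case _: JsonProcessingException | _: IOException => None
    } finally {
      parser.close()
    }
  }
}
```

Catching `IOException` alone would be equivalent here; naming both types only makes the intent explicit.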



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609588884
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609036874
 
 
   **[Test build #120804 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120804/testReport)** for PR 27759 at commit [`780c47b`](https://github.com/apache/spark/commit/780c47bc57b8f5f21440080d42d20c354301fa3e).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-593354666
 
 
   Can one of the admins verify this patch?



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-602752773
 
 
   **[Test build #120216 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120216/testReport)** for PR 27759 at commit [`9c7fc8e`](https://github.com/apache/spark/commit/9c7fc8ebaa24fa96973fa5e288ead686d4081738).



[GitHub] [spark] iRakson commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r403477043
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +781,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array. An Exception is thrown if any other valid JSON strings are passed.
+          `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('[1,2,3,4]');
+        4
+      > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+        5
+      > SELECT _FUNC_('[1,2');
+        NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression)
+  extends UnaryExpression with CodegenFallback {
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    val json = child.eval(input).asInstanceOf[UTF8String]
+    try {
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => {
+          // return null if null array is encountered.
+          if (parser.nextToken() == null) {
+            return null
+          }
+          // Parse the array to compute its length.
+          parseCounter(parser, input)
+        }
+      }
+    } catch {
+      case _: JsonProcessingException | _: IOException => null
 
 Review comment:
   I think we do not need to catch `NullPointerException` here.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609611545
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25556/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609037984
 
 
   **[Test build #120805 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120805/testReport)** for PR 27759 at commit [`a70dc7e`](https://github.com/apache/spark/commit/a70dc7e2634f2af56f089c12395eae797f690a88).



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r390687766
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON Array.",
 
 Review comment:
   Array -> array



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-602753512
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-595048026
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24104/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609074762
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r390149955
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer Json Array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array is required as argument. `Analysis Exception` is thrown if any
+        other valid JSON expression is passed. `NULL` is returned in case of invalid JSON.
 
 Review comment:
   Can you indent this by two more spaces? See also the documentation of `ExpressionDescription`.



[GitHub] [spark] cloud-fan commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r390259376
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON Array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array is required as argument. An Exception is thrown if any
+          other valid JSON strings are passed. `NULL` is returned in case of invalid JSON.
 
 Review comment:
   If any other valid JSON string is passed, shouldn't we return null as well?
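
A rough sketch of what that behavior could look like, assuming plain Jackson rather than Spark's internal helpers. `JsonArrayLengthSketch` and `outerArrayLength` are hypothetical names, and `Option[Int]` stands in for a nullable SQL integer. This is not the code in the PR.

```scala
import java.io.IOException
import com.fasterxml.jackson.core.{JsonFactory, JsonToken}

object JsonArrayLengthSketch {
  private val factory = new JsonFactory()

  // Counts the elements of the outermost JSON array.
  // Yields None for null input, invalid JSON, and valid JSON that is not an array.
  def outerArrayLength(json: String): Option[Int] = {
    if (json == null) return None
    val parser = factory.createParser(json)
    try {
      if (parser.nextToken() != JsonToken.START_ARRAY) {
        // Valid non-array JSON (object, string, number) or empty input.
        None
      } else {
        var length = 0
        var token = parser.nextToken()
        while (token != null && token != JsonToken.END_ARRAY) {
          length += 1
          // Skip nested objects/arrays so only outer elements are counted.
          parser.skipChildren()
          token = parser.nextToken()
        }
        // A null token means the input ended before the array was closed.
        if (token == null) None else Some(length)
      }
    } catch {
      // Malformed or truncated input surfaces as an IOException subtype.
      case _: IOException => None
    } finally {
      parser.close()
    }
  }
}
```

With this shape, `'[1,2,3,4]'` gives 4, `'[1,2'` gives no value, and `'{"a":1}'` also gives no value instead of raising an error.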



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-597046784
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119614/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-603173230
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24981/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-602753512
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609611535
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609036874
 
 
   **[Test build #120804 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120804/testReport)** for PR 27759 at commit [`780c47b`](https://github.com/apache/spark/commit/780c47bc57b8f5f21440080d42d20c354301fa3e).



[GitHub] [spark] dongjoon-hyun commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609501989
 
 
   Hi, @iRakson . Thank you for updating. I left only a few comments. The other things look okay to me.



[GitHub] [spark] HyukjinKwon commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594234554
 
 
   ok to test



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-597468283
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119643/
   Test PASSed.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r390150210
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer Json Array.
 
 Review comment:
   Json Array -> JSON array



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609088218
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25516/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609074767
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120806/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594729605
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119303/
   Test PASSed.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r390150563
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,68 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer Json Array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array is required as argument. `Analysis Exception` is thrown if any
 
 Review comment:
   `Analysis Exception` is thrown -> An exception is thrown



[GitHub] [spark] maropu commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
maropu commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-593822631
 
 
   Could you check more databases, e.g., Oracle, SQL Server, Snowflake, ...?



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-595038295
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24102/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609037106
 
 
   **[Test build #120804 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120804/testReport)** for PR 27759 at commit [`780c47b`](https://github.com/apache/spark/commit/780c47bc57b8f5f21440080d42d20c354301fa3e).
    * This patch **fails Scala style tests**.
    * This patch **does not merge cleanly**.
    * This patch adds the following public classes _(experimental)_:
     * `case class LengthOfJsonArray(child: Expression) extends UnaryExpression`



[GitHub] [spark] maropu commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r388119969
 
 

 ##########
 File path: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql
 ##########
 @@ -58,5 +58,16 @@ select schema_of_json('{"c1":01, "c2":0.1}', map('allowNumericLeadingZeros', 'tr
 select schema_of_json(null);
 CREATE TEMPORARY VIEW jsonTable(jsonField, a) AS SELECT * FROM VALUES ('{"a": 1, "b": 2}', 'a');
 SELECT schema_of_json(jsonField) FROM jsonTable;
+
+-- json_array_length
+select json_array_length('');
+select json_array_length('[]');
+select json_array_length('[1,2,3]');
+select json_array_length('[[1,2],[5,6,7]]');
+select json_array_length('[{"a":123},{"b":"hello"}]');
+select json_array_length('[1,2,3,[33,44],{"key":[2,3,4]}]');
+select json_array_length('{"key":"not a json array"}');
 
 Review comment:
   WDYT? @HyukjinKwon 



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r403505679
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -796,3 +796,75 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns the number of elements in outer JSON array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns the number of elements in outer JSON array.",
+  arguments = """
+    Arguments:
+      * jsonArray - A JSON array. An exception is thrown if any other valid JSON strings are passed.
+          `NULL` is returned in case of an invalid JSON.
 
 Review comment:
   Shall we mention `NULL` input explicitly?
   ```
   - `NULL` is returned in case of an invalid JSON.
   + `NULL` is returned in case of `NULL` or an invalid JSON
   ```



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609753309
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r387616235
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,65 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns length of the jsonArray",
+  arguments = """
+    jsonArray - A JSON array is required as argument. `Analysis Exception` is thrown if any other
+    valid JSON expression is passed. `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+    > SELECT _FUNC_('[1,2,3,4]');
+      4
+    > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+      5
+    > SELECT _FUNC_('[1,2');
+      NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression)
+  extends UnaryExpression with CodegenFallback {
+  override def dataType: DataType = IntegerType
+  override def nullable: Boolean = true
+  override def prettyName: String = "json_array_length"
+
+  override def eval(input: InternalRow): Any = {
+    @transient
+    val json = child.eval(input).asInstanceOf[UTF8String]
+    try {
+      Utils.tryWithResource(CreateJacksonParser.utf8String(SharedFactory.jsonFactory, json)) {
+        parser => {
+          // return null if null array is encountered.
+          if (parser.nextToken() == null) {
+            return null
+          }
+          // Parse the array to compute its length.
+          parseCounter(parser, input)
+        }
+      }
+    } catch {
+      case _: JsonProcessingException => null
+    }
+  }
+
+  private def parseCounter(parser: JsonParser, input: InternalRow): Int = {
+    // Counter for length of array
+    var array_length: Int = 0;
+    // Only json array are supported for this function.
+    if (parser.getCurrentToken != JsonToken.START_ARRAY) {
+      throw new AnalysisException(s"$prettyName can only be called on Json Array.")
+    }
+    // Keep traversing until the end of Json Array
+    while(parser.nextToken() != JsonToken.END_ARRAY) {
 
 Review comment:
   `nextToken()` returns null when the end of input is reached.
   If it returns null before returning `END_ARRAY`, then the JSON is invalid, and invalid input is already handled.
   Anyway, I will add one more check for null.
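
The null check discussed above can be sketched as follows. This is a minimal illustration that uses Jackson directly and is not the code in the PR: the object name `JsonArrayLengthSketch`, the method `jsonArrayLength`, and the `Option[Int]` return type are assumptions made for the example. It returns `None` for a non-array input, following the reviewer's suggestion, rather than throwing.

```scala
import com.fasterxml.jackson.core.{JsonFactory, JsonParser, JsonProcessingException, JsonToken}

// Hypothetical helper for illustration; not the expression added by the PR.
object JsonArrayLengthSketch {
  private val factory = new JsonFactory()

  /** Some(n) for a JSON array with n top-level elements, None otherwise. */
  def jsonArrayLength(json: String): Option[Int] = {
    if (json == null) return None
    val parser: JsonParser = factory.createParser(json)
    try {
      // Empty input yields a null first token; anything but '[' is not an array.
      if (parser.nextToken() != JsonToken.START_ARRAY) return None
      var length = 0
      var token = parser.nextToken()
      while (token != JsonToken.END_ARRAY) {
        // Defensive check: a null token means the input ended before ']'.
        if (token == null) return None
        // Skip nested arrays and objects so only top-level elements are counted.
        parser.skipChildren()
        length += 1
        token = parser.nextToken()
      }
      Some(length)
    } catch {
      // Malformed or truncated JSON is reported as "no length".
      case _: JsonProcessingException => None
    } finally {
      parser.close()
    }
  }
}
```

Calling `skipChildren()` on each element keeps the count at the outer level, so nested arrays and objects each count as one element.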



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r390155298
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,65 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns length of the jsonArray",
+  arguments = """
+    jsonArray - A JSON array is required as argument. `Analysis Exception` is thrown if any other
+    valid JSON expression is passed. `NULL` is returned in case of invalid JSON.
+  """,
+  examples = """
+    Examples:
+    > SELECT _FUNC_('[1,2,3,4]');
+      4
+    > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+      5
+    > SELECT _FUNC_('[1,2');
+      NULL
+  """,
+  since = "3.1.0"
+)
+case class LengthOfJsonArray(child: Expression)
 
 Review comment:
   I was thinking about this, but it seems `JsonToStructs` requires a schema, which is unknown in this expression.



[GitHub] [spark] iRakson edited a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
iRakson edited a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-593532204
 
 
   > I would add an optimization rule instead of extending public API.
   
   I believe a public API might serve better, as users are more familiar with `json_array_length`: this function is supported by most database engines. Also, it seems more intuitive than `size` + `from_json`.
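
For comparison, the `size` + `from_json` route mentioned above might look like the sketch below. It assumes a local Spark build on the classpath; the object name `SizeFromJsonSketch` and the helper `lengthViaFromJson` are made up for the example. The element type has to be supplied up front (here `IntegerType`), which is the schema requirement that the proposed function avoids.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_json, size}
import org.apache.spark.sql.types.{ArrayType, IntegerType}

// Hypothetical helper for illustration only.
object SizeFromJsonSketch {
  // Length of a JSON array of integers using the existing functions.
  // The caller must know the element type in advance.
  def lengthViaFromJson(spark: SparkSession, json: String): Int = {
    import spark.implicits._
    Seq(json).toDF("j")
      .select(size(from_json(col("j"), ArrayType(IntegerType))))
      .head()
      .getInt(0)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("size-from-json-sketch")
      .getOrCreate()
    try {
      println(lengthViaFromJson(spark, "[1,2,3]"))
    } finally {
      spark.stop()
    }
  }
}
```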
   
   
   



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609611052
 
 
   **[Test build #120862 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120862/testReport)** for PR 27759 at commit [`313151f`](https://github.com/apache/spark/commit/313151f5964ed111b34d3c0dd03305150b2ef0b1).



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609939410
 
 
   **[Test build #120872 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120872/testReport)** for PR 27759 at commit [`f44e24e`](https://github.com/apache/spark/commit/f44e24ec66e11fdb71c1a9a813f04f6e37244b61).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594380130
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609037113
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120804/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609753309
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609753323
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120862/



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r387368971
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,59 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns number of elements in outer Json Array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(jsonArray) - Returns length of the jsonArray",
+  examples = """
+    Examples:
+    > SELECT _FUNC_('[1,2,3,4]');
+      4
+    > SELECT _FUNC_('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+      5
+  """,
+  since = "3.0.0"
 
 Review comment:
   `since` should be `3.1.0` here, not `3.0.0`.
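
   For readers following the thread, the behaviour documented in the `examples` block above can be sketched in plain Scala. This is only an illustration under stated assumptions, not the code in this PR: `JsonArrayLengthSketch` and `jsonArrayLength` are hypothetical names, `Option[Int]` stands in for SQL null handling, and the scanner only tracks nesting depth and string literals rather than fully validating the JSON.

```scala
// Illustrative sketch only; names are hypothetical and this is not the PR's implementation.
object JsonArrayLengthSketch {

  /** Counts the elements of the outer JSON array.
    * Returns None for null input or for text that is not bracketed as an array.
    * Assumption: input is otherwise well-formed JSON (no full validation is done).
    */
  def jsonArrayLength(json: String): Option[Int] =
    Option(json)
      .map(_.trim)
      .filter(s => s.length >= 2 && s.head == '[' && s.last == ']')
      .flatMap(s => countTopLevel(s.substring(1, s.length - 1)))

  // Counts commas at nesting depth 0, ignoring anything inside string literals.
  private def countTopLevel(body: String): Option[Int] =
    if (body.trim.isEmpty) Some(0)
    else {
      var depth = 0
      var inString = false
      var escaped = false
      var commas = 0
      var i = 0
      while (i < body.length && depth >= 0) {
        val c = body.charAt(i)
        if (inString) {
          if (escaped) escaped = false
          else if (c == '\\') escaped = true
          else if (c == '"') inString = false
        } else c match {
          case '"'               => inString = true
          case '[' | '{'         => depth += 1
          case ']' | '}'         => depth -= 1
          case ',' if depth == 0 => commas += 1
          case _                 => ()
        }
        i += 1
      }
      // Unbalanced brackets or an unterminated string: treat as invalid.
      if (depth != 0 || inString) None else Some(commas + 1)
    }
}
```

   Only commas at depth 0 are counted, so the nested object and inner array in the second documented example do not add to the length of the outer array.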



[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-594379904
 
 
   **[Test build #119279 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119279/testReport)** for PR 27759 at commit [`58a9e0d`](https://github.com/apache/spark/commit/58a9e0dfdb7d72bc8ffd67b5c69f4b52a65e4b5e).
    * This patch **fails due to an unknown error code, -9**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] maropu commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#discussion_r388002544
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ##########
 @@ -781,3 +782,67 @@ case class SchemaOfJson(
 
   override def prettyName: String = "schema_of_json"
 }
+
+/**
+ * A function that returns number of elements in outer Json Array.
 
 Review comment:
   nit: `the number of`?



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609604089
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120860/



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609038162
 
 
   Build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-609611535
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-595038290
 
 
   Merged build finished. Test PASSed.
