Posted to reviews@spark.apache.org by mgaido91 <gi...@git.apache.org> on 2018/04/09 16:22:05 UTC

[GitHub] spark pull request #21011: [SPARK-23916][SQL] Add array_join function

GitHub user mgaido91 opened a pull request:

    https://github.com/apache/spark/pull/21011

    [SPARK-23916][SQL] Add array_join function

    ## What changes were proposed in this pull request?
    
    The PR adds the SQL function `array_join`. Its behavior follows Presto's function of the same name.
    
    The function accepts an `array` of `string` values to be joined, a `string` delimiter to place between the items, and, optionally, a `string` used to replace `null` values.
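    For illustration only (this is not the patch's Scala implementation, and the helper name is hypothetical), the intended semantics can be sketched in Python:

    ```python
    def array_join(arr, delimiter, null_replacement=None):
        """Sketch of the proposed array_join semantics (hypothetical helper).

        Without null_replacement, null (None) elements are filtered out;
        with it, nulls are replaced by that string. A null array or
        delimiter yields null, mirroring the SQL function's null handling.
        """
        if arr is None or delimiter is None:
            return None
        if null_replacement is None:
            items = [x for x in arr if x is not None]
        else:
            items = [null_replacement if x is None else x for x in arr]
        return delimiter.join(items)
    ```

    For example, `array_join(['hello', None, 'world'], ' ')` drops the null, while passing `','` as the third argument substitutes it instead.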
    
    ## How was this patch tested?
    
    Added unit tests.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mgaido91/spark SPARK-23916

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21011.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21011
    
----
commit 1408cfcd8f0ad2a571d29b57d71128584ea4b4f0
Author: Marco Gaido <ma...@...>
Date:   2018-04-09T16:16:43Z

    [SPARK-23916][SQL] Add array_join function

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21011: [SPARK-23916][SQL] Add array_join function

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21011#discussion_r181380233
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -287,3 +288,173 @@ case class ArrayContains(left: Expression, right: Expression)
     
       override def prettyName: String = "array_contains"
     }
    +
    +/**
    + * Creates a String containing all the elements of the input array separated by the delimiter.
    + */
    +@ExpressionDescription(
    +  usage = """
    +    _FUNC_(array, delimiter[, nullReplacement]) - Concatenates the elements of the given array
    +      using the delimiter and an optional string to replace nulls. If no value is set for
    +      nullReplacement, any null value is filtered.""",
    +  examples = """
    +    Examples:
    +      > SELECT _FUNC_(array('hello', 'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ', ',');
    +       hello , world
    +  """, since = "2.4.0")
    +case class ArrayJoin(
    +    array: Expression,
    +    delimiter: Expression,
    +    nullReplacement: Option[Expression]) extends Expression with ExpectsInputTypes {
    +
    +  def this(array: Expression, delimiter: Expression) = this(array, delimiter, None)
    +
    +  def this(array: Expression, delimiter: Expression, nullReplacement: Expression) =
    +    this(array, delimiter, Some(nullReplacement))
    +
    +  override def inputTypes: Seq[AbstractDataType] = if (nullReplacement.isDefined) {
    +      Seq(ArrayType(StringType), StringType, StringType)
    +    } else {
    +      Seq(ArrayType(StringType), StringType)
    +    }
    --- End diff --
    
    Hmm, I think the indent is 2 spaces in this case. For example, [namedExpressions.scala#L170-L174](https://github.com/mgaido91/spark/blob/e52ff856d42adc5af2e2b2593c2e63d5c3f3a205/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala#L170-L174) or [regexpExpressions.scala#L46-L51](https://github.com/mgaido91/spark/blob/e52ff856d42adc5af2e2b2593c2e63d5c3f3a205/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala#L46-L51).


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2538/
    Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    kindly ping @ueshin 


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    **[Test build #89094 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89094/testReport)** for PR 21011 at commit [`e52ff85`](https://github.com/apache/spark/commit/e52ff856d42adc5af2e2b2593c2e63d5c3f3a205).


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    **[Test build #89534 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89534/testReport)** for PR 21011 at commit [`b1597d7`](https://github.com/apache/spark/commit/b1597d791b0302a4541cf545956ecf0f3454f056).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89342/
    Test FAILed.


---



[GitHub] spark pull request #21011: [SPARK-23916][SQL] Add array_join function

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/21011


---



[GitHub] spark pull request #21011: [SPARK-23916][SQL] Add array_join function

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21011#discussion_r181371786
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -287,3 +288,173 @@ case class ArrayContains(left: Expression, right: Expression)
     
       override def prettyName: String = "array_contains"
     }
    +
    +/**
    + * Creates a String containing all the elements of the input array separated by the delimiter.
    + */
    +@ExpressionDescription(
    +  usage = """
    +    _FUNC_(array, delimiter[, nullReplacement]) - Concatenates the elements of the given array
    +      using the delimiter and an optional string to replace nulls. If no value is set for
    +      nullReplacement, any null value is filtered.""",
    +  examples = """
    +    Examples:
    +      > SELECT _FUNC_(array('hello', 'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ', ',');
    +       hello , world
    +  """, since = "2.4.0")
    +case class ArrayJoin(
    +    array: Expression,
    +    delimiter: Expression,
    +    nullReplacement: Option[Expression]) extends Expression with ExpectsInputTypes {
    +
    +  def this(array: Expression, delimiter: Expression) = this(array, delimiter, None)
    +
    +  def this(array: Expression, delimiter: Expression, nullReplacement: Expression) =
    +    this(array, delimiter, Some(nullReplacement))
    +
    +  override def inputTypes: Seq[AbstractDataType] = if (nullReplacement.isDefined) {
    +      Seq(ArrayType(StringType), StringType, StringType)
    +    } else {
    +      Seq(ArrayType(StringType), StringType)
    +    }
    --- End diff --
    
    nit: indent?


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2456/
    Test PASSed.


---



[GitHub] spark pull request #21011: [SPARK-23916][SQL] Add array_join function

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21011#discussion_r181376485
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -287,3 +288,173 @@ case class ArrayContains(left: Expression, right: Expression)
     
       override def prettyName: String = "array_contains"
     }
    +
    +/**
    + * Creates a String containing all the elements of the input array separated by the delimiter.
    + */
    +@ExpressionDescription(
    +  usage = """
    +    _FUNC_(array, delimiter[, nullReplacement]) - Concatenates the elements of the given array
    +      using the delimiter and an optional string to replace nulls. If no value is set for
    +      nullReplacement, any null value is filtered.""",
    +  examples = """
    +    Examples:
    +      > SELECT _FUNC_(array('hello', 'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ', ',');
    +       hello , world
    +  """, since = "2.4.0")
    +case class ArrayJoin(
    +    array: Expression,
    +    delimiter: Expression,
    +    nullReplacement: Option[Expression]) extends Expression with ExpectsInputTypes {
    +
    +  def this(array: Expression, delimiter: Expression) = this(array, delimiter, None)
    +
    +  def this(array: Expression, delimiter: Expression, nullReplacement: Expression) =
    +    this(array, delimiter, Some(nullReplacement))
    +
    +  override def inputTypes: Seq[AbstractDataType] = if (nullReplacement.isDefined) {
    +      Seq(ArrayType(StringType), StringType, StringType)
    +    } else {
    +      Seq(ArrayType(StringType), StringType)
    +    }
    --- End diff --
    
    I don't think the indent is wrong since this is for the if...else and not for the method itself


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    **[Test build #89451 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89451/testReport)** for PR 21011 at commit [`703c09c`](https://github.com/apache/spark/commit/703c09c4e9da2b96c7a5f445fd5a1d30cdc29c03).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89534/
    Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2137/
    Test PASSed.


---



[GitHub] spark pull request #21011: [SPARK-23916][SQL] Add array_join function

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21011#discussion_r181371816
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -287,3 +288,173 @@ case class ArrayContains(left: Expression, right: Expression)
     
       override def prettyName: String = "array_contains"
     }
    +
    +/**
    + * Creates a String containing all the elements of the input array separated by the delimiter.
    + */
    +@ExpressionDescription(
    +  usage = """
    +    _FUNC_(array, delimiter[, nullReplacement]) - Concatenates the elements of the given array
    +      using the delimiter and an optional string to replace nulls. If no value is set for
    +      nullReplacement, any null value is filtered.""",
    +  examples = """
    +    Examples:
    +      > SELECT _FUNC_(array('hello', 'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ', ',');
    +       hello , world
    +  """, since = "2.4.0")
    +case class ArrayJoin(
    +    array: Expression,
    +    delimiter: Expression,
    +    nullReplacement: Option[Expression]) extends Expression with ExpectsInputTypes {
    +
    +  def this(array: Expression, delimiter: Expression) = this(array, delimiter, None)
    +
    +  def this(array: Expression, delimiter: Expression, nullReplacement: Expression) =
    +    this(array, delimiter, Some(nullReplacement))
    +
    +  override def inputTypes: Seq[AbstractDataType] = if (nullReplacement.isDefined) {
    +      Seq(ArrayType(StringType), StringType, StringType)
    +    } else {
    +      Seq(ArrayType(StringType), StringType)
    +    }
    +
    +  override def children: Seq[Expression] = if (nullReplacement.isDefined) {
    +      Seq(array, delimiter, nullReplacement.get)
    +    } else {
    +      Seq(array, delimiter)
    +    }
    --- End diff --
    
    nit: indent?


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89451/
    Test PASSed.


---



[GitHub] spark pull request #21011: [SPARK-23916][SQL] Add array_join function

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21011#discussion_r181380011
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -287,3 +288,173 @@ case class ArrayContains(left: Expression, right: Expression)
     
       override def prettyName: String = "array_contains"
     }
    +
    +/**
    + * Creates a String containing all the elements of the input array separated by the delimiter.
    + */
    +@ExpressionDescription(
    +  usage = """
    +    _FUNC_(array, delimiter[, nullReplacement]) - Concatenates the elements of the given array
    +      using the delimiter and an optional string to replace nulls. If no value is set for
    +      nullReplacement, any null value is filtered.""",
    +  examples = """
    +    Examples:
    +      > SELECT _FUNC_(array('hello', 'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ', ',');
    +       hello , world
    +  """, since = "2.4.0")
    +case class ArrayJoin(
    +    array: Expression,
    +    delimiter: Expression,
    +    nullReplacement: Option[Expression]) extends Expression with ExpectsInputTypes {
    +
    +  def this(array: Expression, delimiter: Expression) = this(array, delimiter, None)
    +
    +  def this(array: Expression, delimiter: Expression, nullReplacement: Expression) =
    +    this(array, delimiter, Some(nullReplacement))
    +
    +  override def inputTypes: Seq[AbstractDataType] = if (nullReplacement.isDefined) {
    +      Seq(ArrayType(StringType), StringType, StringType)
    +    } else {
    +      Seq(ArrayType(StringType), StringType)
    +    }
    +
    +  override def children: Seq[Expression] = if (nullReplacement.isDefined) {
    +      Seq(array, delimiter, nullReplacement.get)
    +    } else {
    +      Seq(array, delimiter)
    +    }
    +
    +  override def nullable: Boolean = children.exists(_.nullable)
    +
    +  override def foldable: Boolean = children.forall(_.foldable)
    +
    +  override def eval(input: InternalRow): Any = {
    +    val arrayEval = array.eval(input)
    +    if (arrayEval == null) return null
    +    val delimiterEval = delimiter.eval(input)
    +    if (delimiterEval == null) return null
    +    val nullReplacementEval = nullReplacement.map(_.eval(input))
    +    if (nullReplacementEval.contains(null)) return null
    +
    --- End diff --
    
    I removed the other one... :)
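    As an aside, the `nullHandling` closure quoted in the diff above picks the per-null behavior once, before the loop, instead of re-checking `nullReplacement` on every element. A rough Python rendering of that pattern (illustrative only, names are not from the patch):

    ```python
    # Sketch of ArrayJoin.eval's null handling: select the null strategy
    # once, then reuse the resulting closure inside the element loop.
    def make_null_handler(buffer, delimiter, replacement):
        if replacement is not None:
            def handle(first_item):
                # Consume the null: append delimiter (unless first) + replacement.
                if not first_item:
                    buffer.append(delimiter)
                buffer.append(replacement)
                return True  # an element was emitted
            return handle
        return lambda first_item: False  # nulls are skipped entirely

    def join_array(arr, delimiter, replacement=None):
        buffer = []
        handle_null = make_null_handler(buffer, delimiter, replacement)
        first = True
        for item in arr:
            if item is None:
                if handle_null(first):
                    first = False
            else:
                if not first:
                    buffer.append(delimiter)
                buffer.append(item)
                first = False
        return ''.join(buffer)
    ```

    The return value of the handler signals whether the null produced output, which is what keeps the `firstItem` bookkeeping correct when nulls are filtered rather than replaced.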


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2104/
    Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2128/
    Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    any more comments @ueshin ?


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    cc @ueshin 


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    **[Test build #89351 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89351/testReport)** for PR 21011 at commit [`ad0d4aa`](https://github.com/apache/spark/commit/ad0d4aa5d671b3a99fa1bd30dc833a8b75444f6c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #21011: [SPARK-23916][SQL] Add array_join function

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21011#discussion_r180311653
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -287,3 +288,173 @@ case class ArrayContains(left: Expression, right: Expression)
     
       override def prettyName: String = "array_contains"
     }
    +
    +/**
    + * Creates a String containing all the elements of the input array separated by the delimiter.
    + */
    +@ExpressionDescription(
    +  usage = """
    +    _FUNC_(array, delimiter[, nullReplacement]) - Concatenates the elements of the given array
    +      using the delimiter and an optional string to replace nulls. If no value is set for
    +      nullReplacement, any null value is filtered.""",
    +  examples = """
    +    Examples:
    +      > SELECT _FUNC_(array('hello', 'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ', ',');
    +       hello , world
    +  """)
    --- End diff --
    
    and `since`.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    **[Test build #89645 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89645/testReport)** for PR 21011 at commit [`e9d7baa`](https://github.com/apache/spark/commit/e9d7baa09ecee2456d48bf7de71330714dba4c4c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `class Summarizer(object):`
      * `class SummaryBuilder(JavaWrapper):`
      * `case class Reverse(child: Expression) extends UnaryExpression with ImplicitCastInputTypes `
      * `case class ArrayPosition(left: Expression, right: Expression)`
      * `case class ElementAt(left: Expression, right: Expression) extends GetMapValueUtil `
      * `case class Concat(children: Seq[Expression]) extends Expression `
      * `abstract class GetMapValueUtil extends BinaryExpression with ImplicitCastInputTypes `
      * `case class GetMapValue(child: Expression, key: Expression)`
      * `class ArrayDataIndexedSeq[T](arrayData: ArrayData, dataType: DataType) extends IndexedSeq[T] `


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Thanks! merging to master.


---



[GitHub] spark pull request #21011: [SPARK-23916][SQL] Add array_join function

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21011#discussion_r181371299
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -287,3 +288,173 @@ case class ArrayContains(left: Expression, right: Expression)
     
       override def prettyName: String = "array_contains"
     }
    +
    +/**
    + * Creates a String containing all the elements of the input array separated by the delimiter.
    + */
    +@ExpressionDescription(
    +  usage = """
    +    _FUNC_(array, delimiter[, nullReplacement]) - Concatenates the elements of the given array
    +      using the delimiter and an optional string to replace nulls. If no value is set for
    +      nullReplacement, any null value is filtered.""",
    +  examples = """
    +    Examples:
    +      > SELECT _FUNC_(array('hello', 'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ', ',');
    +       hello , world
    +  """, since = "2.4.0")
    +case class ArrayJoin(
    +    array: Expression,
    +    delimiter: Expression,
    +    nullReplacement: Option[Expression]) extends Expression with ExpectsInputTypes {
    +
    +  def this(array: Expression, delimiter: Expression) = this(array, delimiter, None)
    +
    +  def this(array: Expression, delimiter: Expression, nullReplacement: Expression) =
    +    this(array, delimiter, Some(nullReplacement))
    +
    +  override def inputTypes: Seq[AbstractDataType] = if (nullReplacement.isDefined) {
    +      Seq(ArrayType(StringType), StringType, StringType)
    +    } else {
    +      Seq(ArrayType(StringType), StringType)
    +    }
    +
    +  override def children: Seq[Expression] = if (nullReplacement.isDefined) {
    +      Seq(array, delimiter, nullReplacement.get)
    +    } else {
    +      Seq(array, delimiter)
    +    }
    +
    +  override def nullable: Boolean = children.exists(_.nullable)
    +
    +  override def foldable: Boolean = children.forall(_.foldable)
    +
    +  override def eval(input: InternalRow): Any = {
    +    val arrayEval = array.eval(input)
    +    if (arrayEval == null) return null
    +    val delimiterEval = delimiter.eval(input)
    +    if (delimiterEval == null) return null
    +    val nullReplacementEval = nullReplacement.map(_.eval(input))
    +    if (nullReplacementEval.contains(null)) return null
    +
    +
    +    val buffer = new UTF8StringBuilder()
    +    var firstItem = true
    +    val nullHandling = nullReplacementEval match {
    +      case Some(rep) => (prependDelimiter: Boolean) => {
    +        if (!prependDelimiter) {
    +          buffer.append(delimiterEval.asInstanceOf[UTF8String])
    +        }
    +        buffer.append(rep.asInstanceOf[UTF8String])
    +        true
    +      }
    +      case None => (_: Boolean) => false
    +    }
    +    arrayEval.asInstanceOf[ArrayData].foreach(StringType, (_, item) => {
    +      if (item == null) {
    +        if (nullHandling(firstItem)) {
    +          firstItem = false
    +        }
    +      } else {
    +        if (!firstItem) {
    +          buffer.append(delimiterEval.asInstanceOf[UTF8String])
    +        }
    +        buffer.append(item.asInstanceOf[UTF8String])
    +        firstItem = false
    +      }
    +    })
    +    buffer.build()
    +  }
    +
    +  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
    +    val code = nullReplacement match {
    +      case Some(replacement) =>
    +        val replacementGen = replacement.genCode(ctx)
    +        val nullHandling = (buffer: String, delimiter: String, firstItem: String) => {
    +          s"""
    +             |if (!$firstItem) {
    +             |  $buffer.append($delimiter);
    +             |}
    +             |$buffer.append(${replacementGen.value});
    +             |$firstItem = false;
    +               """.stripMargin
    +        }
    +        val execCode = if (replacement.nullable) {
    +          ctx.nullSafeExec(replacement.nullable, replacementGen.isNull) {
    +            genCodeForArrayAndDelimiter(ctx, ev, nullHandling)
    +          }
    +        } else {
    +          genCodeForArrayAndDelimiter(ctx, ev, nullHandling)
    +        }
    +        s"""
    +           |${replacementGen.code}
    +           |$execCode
    +           """.stripMargin
    +      case None => genCodeForArrayAndDelimiter(ctx, ev,
    +        (_: String, _: String, _: String) => "// nulls are ignored")
    +    }
    +    if (nullable) {
    +      ev.copy(
    +        s"""
    +           |boolean ${ev.isNull} = true;
    +           |UTF8String ${ev.value} = null;
    +           |$code
    +         """.stripMargin)
    +    } else {
    +      ev.copy(s"""
    --- End diff --
    
    nit: maybe we need a line break between `copy(` and `s"""`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
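
The null handling in `ArrayJoin.eval` quoted in the diff above (drop null elements, or substitute the replacement string, while emitting the delimiter only between kept items) can be sketched as a plain-Python reference model. This is an illustration of the documented semantics only, not Spark code; `None` stands in for SQL `NULL`:

```python
def array_join(arr, delimiter, null_replacement=None):
    # A NULL array or NULL delimiter yields NULL, mirroring the Scala eval().
    if arr is None or delimiter is None:
        return None
    if null_replacement is None:
        # No replacement given: null elements are filtered out entirely.
        items = [x for x in arr if x is not None]
    else:
        # Replacement given: each null element becomes the replacement string.
        items = [null_replacement if x is None else x for x in arr]
    return delimiter.join(items)

print(array_join(["hello", "world"], " "))             # hello world
print(array_join(["hello", None, "world"], " "))       # hello world
print(array_join(["hello", None, "world"], " ", ","))  # hello , world
```

Note how the third call reproduces the `hello , world` example from the expression's docstring: the replacement `,` is itself joined with the space delimiter on both sides, which is why the output contains ` , ` rather than `,`.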


[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2322/
    Test FAILed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2390/
    Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Merged build finished. Test FAILed.


---



[GitHub] spark pull request #21011: [SPARK-23916][SQL] Add array_join function

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21011#discussion_r180311581
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala ---
    @@ -413,6 +413,29 @@ class DataFrameFunctionsSuite extends QueryTest with SharedSQLContext {
         )
       }
     
    +  test("array_join function") {
    +    val df = Seq(
    +      (Seq[String]("a", "b"), ","),
    +      (Seq[String]("a", null, "b"), ","),
    +      (Seq[String](), ",")
    --- End diff --
    
    Maybe `Seq.empty[String]`


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    **[Test build #89534 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89534/testReport)** for PR 21011 at commit [`b1597d7`](https://github.com/apache/spark/commit/b1597d791b0302a4541cf545956ecf0f3454f056).


---



[GitHub] spark pull request #21011: [SPARK-23916][SQL] Add array_join function

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21011#discussion_r181371352
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -287,3 +288,173 @@ case class ArrayContains(left: Expression, right: Expression)
     
       override def prettyName: String = "array_contains"
     }
    [...]
    +        s"""
    +           |${replacementGen.code}
    +           |$execCode
    +           """.stripMargin
    --- End diff --
    
    nit: indent


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Merged build finished. Test PASSed.


---



[GitHub] spark pull request #21011: [SPARK-23916][SQL] Add array_join function

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21011#discussion_r181371571
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -287,3 +288,173 @@ case class ArrayContains(left: Expression, right: Expression)
     
       override def prettyName: String = "array_contains"
     }
    [...]
    +    val delimiterEval = delimiter.eval(input)
    +    if (delimiterEval == null) return null
    +    val nullReplacementEval = nullReplacement.map(_.eval(input))
    +    if (nullReplacementEval.contains(null)) return null
    +
    --- End diff --
    
    nit: remove an extra line.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    **[Test build #89340 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89340/testReport)** for PR 21011 at commit [`dd9482e`](https://github.com/apache/spark/commit/dd9482e230c9efdab66609639d633a207e47f736).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Merged build finished. Test PASSed.


---



[GitHub] spark pull request #21011: [SPARK-23916][SQL] Add array_join function

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21011#discussion_r181381594
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -287,3 +288,173 @@ case class ArrayContains(left: Expression, right: Expression)
     
       override def prettyName: String = "array_contains"
     }
    [...]
    +  override def inputTypes: Seq[AbstractDataType] = if (nullReplacement.isDefined) {
    +      Seq(ArrayType(StringType), StringType, StringType)
    +    } else {
    +      Seq(ArrayType(StringType), StringType)
    +    }
    --- End diff --
    
    oops, you are right... I am not sure where I saw it differently... maybe I just got confused... sorry, I am fixing it


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89067/
    Test FAILed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    **[Test build #89342 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89342/testReport)** for PR 21011 at commit [`ad0d4aa`](https://github.com/apache/spark/commit/ad0d4aa5d671b3a99fa1bd30dc833a8b75444f6c).


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89645/
    Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    **[Test build #89340 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89340/testReport)** for PR 21011 at commit [`dd9482e`](https://github.com/apache/spark/commit/dd9482e230c9efdab66609639d633a207e47f736).


---



[GitHub] spark pull request #21011: [SPARK-23916][SQL] Add array_join function

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21011#discussion_r181376517
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -287,3 +288,173 @@ case class ArrayContains(left: Expression, right: Expression)
     
       override def prettyName: String = "array_contains"
     }
    [...]
    +  override def children: Seq[Expression] = if (nullReplacement.isDefined) {
    +      Seq(array, delimiter, nullReplacement.get)
    +    } else {
    +      Seq(array, delimiter)
    +    }
    --- End diff --
    
    ditto


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    any more comments?


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2317/
    Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2319/
    Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    **[Test build #89094 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89094/testReport)** for PR 21011 at commit [`e52ff85`](https://github.com/apache/spark/commit/e52ff856d42adc5af2e2b2593c2e63d5c3f3a205).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    **[Test build #89106 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89106/testReport)** for PR 21011 at commit [`e52ff85`](https://github.com/apache/spark/commit/e52ff856d42adc5af2e2b2593c2e63d5c3f3a205).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #21011: [SPARK-23916][SQL] Add array_join function

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21011#discussion_r182333814
  
    --- Diff: python/pyspark/sql/functions.py ---
    @@ -1846,6 +1846,27 @@ def array_contains(col, value):
         return Column(sc._jvm.functions.array_contains(_to_java_column(col), value))
     
     
    +@ignore_unicode_prefix
    +@since(2.4)
    +def array_join(col, delimiter, null_replacement=None):
    +    """
    +    Concatenates the elements of `column` using the `delimiter`. Null values are replaced with
    +    `nullReplacement` if set, otherwise they are ignored.
    --- End diff --
    
    nit: `null_replacement`?


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89094/
    Test FAILed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    **[Test build #89342 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89342/testReport)** for PR 21011 at commit [`ad0d4aa`](https://github.com/apache/spark/commit/ad0d4aa5d671b3a99fa1bd30dc833a8b75444f6c).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #21011: [SPARK-23916][SQL] Add array_join function

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21011#discussion_r180317060
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -287,3 +288,173 @@ case class ArrayContains(left: Expression, right: Expression)
     
       override def prettyName: String = "array_contains"
     }
    +
    +/**
    + * Creates a String containing all the elements of the input array separated by the delimiter.
    + */
    +@ExpressionDescription(
    +  usage = """
    +    _FUNC_(array, delimiter[, nullReplacement]) - Concatenates the elements of the given array
    +      using the delimiter and an optional string to replace nulls. If no value is set for
    +      nullReplacement, any null value is filtered.""",
    +  examples = """
    +    Examples:
    +      > SELECT _FUNC_(array('hello', 'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ', ',');
    +       hello , world
    +  """)
    --- End diff --
    
    add `since`. see [this discussion](https://github.com/apache/spark/pull/21021#discussion_r180309744).


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89106/
    Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    **[Test build #89351 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89351/testReport)** for PR 21011 at commit [`ad0d4aa`](https://github.com/apache/spark/commit/ad0d4aa5d671b3a99fa1bd30dc833a8b75444f6c).


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Merged build finished. Test PASSed.


---



[GitHub] spark pull request #21011: [SPARK-23916][SQL] Add array_join function

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21011#discussion_r181371114
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -287,3 +288,173 @@ case class ArrayContains(left: Expression, right: Expression)
     
       override def prettyName: String = "array_contains"
     }
    +
    +/**
    + * Creates a String containing all the elements of the input array separated by the delimiter.
    + */
    +@ExpressionDescription(
    +  usage = """
    +    _FUNC_(array, delimiter[, nullReplacement]) - Concatenates the elements of the given array
    +      using the delimiter and an optional string to replace nulls. If no value is set for
    +      nullReplacement, any null value is filtered.""",
    +  examples = """
    +    Examples:
    +      > SELECT _FUNC_(array('hello', 'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ', ',');
    +       hello , world
    +  """, since = "2.4.0")
    +case class ArrayJoin(
    +    array: Expression,
    +    delimiter: Expression,
    +    nullReplacement: Option[Expression]) extends Expression with ExpectsInputTypes {
    +
    +  def this(array: Expression, delimiter: Expression) = this(array, delimiter, None)
    +
    +  def this(array: Expression, delimiter: Expression, nullReplacement: Expression) =
    +    this(array, delimiter, Some(nullReplacement))
    +
    +  override def inputTypes: Seq[AbstractDataType] = if (nullReplacement.isDefined) {
    +      Seq(ArrayType(StringType), StringType, StringType)
    +    } else {
    +      Seq(ArrayType(StringType), StringType)
    +    }
    +
    +  override def children: Seq[Expression] = if (nullReplacement.isDefined) {
    +      Seq(array, delimiter, nullReplacement.get)
    +    } else {
    +      Seq(array, delimiter)
    +    }
    +
    +  override def nullable: Boolean = children.exists(_.nullable)
    +
    +  override def foldable: Boolean = children.forall(_.foldable)
    +
    +  override def eval(input: InternalRow): Any = {
    +    val arrayEval = array.eval(input)
    +    if (arrayEval == null) return null
    +    val delimiterEval = delimiter.eval(input)
    +    if (delimiterEval == null) return null
    +    val nullReplacementEval = nullReplacement.map(_.eval(input))
    +    if (nullReplacementEval.contains(null)) return null
    +
    +
    +    val buffer = new UTF8StringBuilder()
    +    var firstItem = true
    +    val nullHandling = nullReplacementEval match {
    +      case Some(rep) => (prependDelimiter: Boolean) => {
    +        if (!prependDelimiter) {
    +          buffer.append(delimiterEval.asInstanceOf[UTF8String])
    +        }
    +        buffer.append(rep.asInstanceOf[UTF8String])
    +        true
    +      }
    +      case None => (_: Boolean) => false
    +    }
    +    arrayEval.asInstanceOf[ArrayData].foreach(StringType, (_, item) => {
    +      if (item == null) {
    +        if (nullHandling(firstItem)) {
    +          firstItem = false
    +        }
    +      } else {
    +        if (!firstItem) {
    +          buffer.append(delimiterEval.asInstanceOf[UTF8String])
    +        }
    +        buffer.append(item.asInstanceOf[UTF8String])
    +        firstItem = false
    +      }
    +    })
    +    buffer.build()
    +  }
    +
    +  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
    +    val code = nullReplacement match {
    +      case Some(replacement) =>
    +        val replacementGen = replacement.genCode(ctx)
    +        val nullHandling = (buffer: String, delimiter: String, firstItem: String) => {
    +          s"""
    +             |if (!$firstItem) {
    +             |  $buffer.append($delimiter);
    +             |}
    +             |$buffer.append(${replacementGen.value});
    +             |$firstItem = false;
    +               """.stripMargin
    --- End diff --
    
    nit: indent
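The `eval()` in the diff above streams through the array once, using a `firstItem` flag to decide when to prepend the delimiter. A minimal Python sketch of that bookkeeping (hypothetical helper, written for illustration only — the real code builds a `UTF8StringBuilder`):

```python
def join_streaming(arr, delimiter, null_replacement=None):
    # Mirrors the buffer + firstItem bookkeeping in ArrayJoin.eval():
    # a delimiter is prepended before every appended item except the first.
    parts = []
    first_item = True
    for item in arr:
        if item is None:
            if null_replacement is None:
                # Nulls are skipped entirely; first_item stays unchanged,
                # so no stray delimiter is emitted for a leading null.
                continue
            item = null_replacement
        if not first_item:
            parts.append(delimiter)
        parts.append(item)
        first_item = False
    return "".join(parts)

print(join_streaming(["a", None, "b"], "-"))       # a-b
print(join_streaming(["a", None, "b"], "-", "?"))  # a-?-b
```

The subtle case the flag handles is a null in the first position: when nulls are filtered, the next real element must still be treated as the first item, which is why the `None`/filtered branch leaves `first_item` untouched.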


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    retest this please


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    **[Test build #89645 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89645/testReport)** for PR 21011 at commit [`e9d7baa`](https://github.com/apache/spark/commit/e9d7baa09ecee2456d48bf7de71330714dba4c4c).


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89351/
    Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89340/
    Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    **[Test build #89451 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89451/testReport)** for PR 21011 at commit [`703c09c`](https://github.com/apache/spark/commit/703c09c4e9da2b96c7a5f445fd5a1d30cdc29c03).


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    retest this please


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    **[Test build #89067 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89067/testReport)** for PR 21011 at commit [`1408cfc`](https://github.com/apache/spark/commit/1408cfcd8f0ad2a571d29b57d71128584ea4b4f0).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class ArrayJoin(`


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    **[Test build #89106 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89106/testReport)** for PR 21011 at commit [`e52ff85`](https://github.com/apache/spark/commit/e52ff856d42adc5af2e2b2593c2e63d5c3f3a205).


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Merged build finished. Test PASSed.


---



[GitHub] spark pull request #21011: [SPARK-23916][SQL] Add array_join function

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21011#discussion_r181363454
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -287,3 +288,173 @@ case class ArrayContains(left: Expression, right: Expression)
     
       override def prettyName: String = "array_contains"
     }
    +
    +/**
    + * Creates a String containing all the elements of the input array separated by the delimiter.
    + */
    +@ExpressionDescription(
    +  usage = """
    +    _FUNC_(array, delimiter[, nullReplacement]) - Concatenates the elements of the given array
    +      using the delimiter and an optional string to replace nulls. If no value is set for
    +      nullReplacement, any null value is filtered.""",
    +  examples = """
    +    Examples:
    +      > SELECT _FUNC_(array('hello', 'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ');
    +       hello world
    +      > SELECT _FUNC_(array('hello', null ,'world'), ' ', ',');
    +       hello , world
    +  """, since = "2.4.0")
    +case class ArrayJoin(
    +    array: Expression,
    +    delimiter: Expression,
    +    nullReplacement: Option[Expression]) extends Expression with ExpectsInputTypes {
    +
    +  def this(array: Expression, delimiter: Expression) = this(array, delimiter, None)
    +
    +  def this(array: Expression, delimiter: Expression, nullReplacement: Expression) =
    +    this(array, delimiter, Some(nullReplacement))
    +
    +  override def inputTypes: Seq[AbstractDataType] = if (nullReplacement.isDefined) {
    +      Seq(ArrayType(StringType), StringType, StringType)
    +    } else {
    +      Seq(ArrayType(StringType), StringType)
    +    }
    +
    +  override def children: Seq[Expression] = if (nullReplacement.isDefined) {
    +      Seq(array, delimiter, nullReplacement.get)
    +    } else {
    +      Seq(array, delimiter)
    +    }
    +
    +  override def nullable: Boolean = children.exists(_.nullable)
    +
    +  override def foldable: Boolean = children.forall(_.foldable)
    +
    +  override def eval(input: InternalRow): Any = {
    +    val arrayEval = array.eval(input)
    +    if (arrayEval == null) return null
    +    val delimiterEval = delimiter.eval(input)
    +    if (delimiterEval == null) return null
    +    val nullReplacementEval = nullReplacement.map(_.eval(input))
    +    if (nullReplacementEval.contains(null)) return null
    +
    +
    +    val buffer = new UTF8StringBuilder()
    +    var firstItem = true
    +    val nullHandling = nullReplacementEval match {
    +      case Some(rep) => (prependDelimiter: Boolean) => {
    +        if (!prependDelimiter) {
    +          buffer.append(delimiterEval.asInstanceOf[UTF8String])
    +        }
    +        buffer.append(rep.asInstanceOf[UTF8String])
    +        true
    +      }
    +      case None => (_: Boolean) => false
    +    }
    +    arrayEval.asInstanceOf[ArrayData].foreach(StringType, (_, item) => {
    +      if (item == null) {
    +        if (nullHandling(firstItem)) {
    +          firstItem = false
    +        }
    +      } else {
    +        if (!firstItem) {
    +          buffer.append(delimiterEval.asInstanceOf[UTF8String])
    +        }
    +        buffer.append(item.asInstanceOf[UTF8String])
    +        firstItem = false
    +      }
    +    })
    +    buffer.build()
    +  }
    +
    +  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
    +    val code = nullReplacement match {
    +      case Some(replacement) =>
    +        val replacementGen = replacement.genCode(ctx)
    +        val nullHandling = (buffer: String, delimiter: String, firstItem: String) => {
    +          s"""
    +             |if (!$firstItem) {
    +             |  $buffer.append($delimiter);
    +             |}
    +             |$buffer.append(${replacementGen.value});
    +             |$firstItem = false;
    +               """.stripMargin
    +        }
    +        val execCode = if (replacement.nullable) {
    +          ctx.nullSafeExec(replacement.nullable, replacementGen.isNull) {
    +            genCodeForArrayAndDelimiter(ctx, ev, nullHandling)
    +          }
    +        } else {
    +          genCodeForArrayAndDelimiter(ctx, ev, nullHandling)
    +        }
    +        s"""
    +           |${replacementGen.code}
    +           |$execCode
    +           """.stripMargin
    +      case None => genCodeForArrayAndDelimiter(ctx, ev,
    +        (_: String, _: String, _: String) => "// nulls are ignored")
    +    }
    +    if (nullable) {
    +      ev.copy(
    +        s"""
    +           |boolean ${ev.isNull} = true;
    +           |UTF8String ${ev.value} = null;
    +           |$code
    +         """.stripMargin)
    +    } else {
    +      ev.copy(s"""
    +           |boolean ${ev.isNull} = false;
    --- End diff --
    
    nit: I guess we can remove this?


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21011
  
    **[Test build #89067 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89067/testReport)** for PR 21011 at commit [`1408cfc`](https://github.com/apache/spark/commit/1408cfcd8f0ad2a571d29b57d71128584ea4b4f0).


---
