You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by cloud-fan <gi...@git.apache.org> on 2015/12/10 03:46:29 UTC

[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/10239

    [SPARK-12252][SQL] refactor MapObjects to make it less hacky

    in https://github.com/apache/spark/pull/10133 we found that, we shoud ensure the children of `TreeNode` are all accessible in the `productIterator`, or the behavior will be very confusing.
    
    In this PR, I try to fix this problem by expsing the `loopVar`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark map-objects

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10239.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10239
    
----
commit f764e3d59e47eb94d3cd23fe669f45ee44255a47
Author: Wenchen Fan <we...@databricks.com>
Date:   2015-12-10T02:41:47Z

    refactor MapObjects to make it less hacky

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163488143
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163496616
  
    Regarding the JIRA SPARK-12131, how about the following statement? 
    ```scala
    implicit val kryoEncoder = Encoders.kryo[KryoClassData]
    val ds = Seq((KryoClassData("a", 1), KryoClassData("b", 2)).toDS()
    ```
    This does not work. @liancheng just closed this PR, which sounds a similar issue for tuple, right? Will this PR resolves this issue? @cloud-fan 
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163486564
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47478/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163472214
  
    **[Test build #47478 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47478/consoleFull)** for PR 10239 at commit [`4f0f9fa`](https://github.com/apache/spark/commit/4f0f9faac26095ac2e714c4d52ab566ed9fcc451).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163488144
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47479/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163493628
  
    Great, so the inner class issue is also resolved. I'm closing #10133 then.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SPARK-12131][SQL] refactor MapOb...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/10239


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163488086
  
    **[Test build #47479 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47479/consoleFull)** for PR 10239 at commit [`d1992a9`](https://github.com/apache/spark/commit/d1992a955ee02bf9aea2213387652e1e6242b125).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `case class LambdaVariable(value: String, isNull: String, dataType: DataType) extends LeafExpression`\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163470973
  
    This seems fine, but I agree we should hide the construction of the loop variable.  There is nothing wrong with the apply taking a `Expression => Expression`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163484091
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163519747
  
    @cloud-fan Could you please mention SPARK-12131 in the PR title and description since it's also fixed in this PR? Then I can merge it to master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SPARK-12131][SQL] refactor MapOb...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163519992
  
    Thanks, merging to master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163468871
  
    cc @liancheng @marmbrus 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163469980
  
    This does look much better. Can we add a `MapObjects.apply` to hide `LoopVar` so that users don't need to create them manually?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163484019
  
    **[Test build #47472 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47472/consoleFull)** for PR 10239 at commit [`f764e3d`](https://github.com/apache/spark/commit/f764e3d59e47eb94d3cd23fe669f45ee44255a47).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `public class JavaBinarizerExample `\n  * `public class JavaBucketizerExample `\n  * `public class JavaDCTExample `\n  * `public class JavaElementwiseProductExample `\n  * `public class JavaMinMaxScalerExample `\n  * `public class JavaNGramExample `\n  * `public class JavaNormalizerExample `\n  * `public class JavaOneHotEncoderExample `\n  * `public class JavaPCAExample `\n  * `public class JavaPolynomialExpansionExample `\n  * `public class JavaRFormulaExample `\n  * `public class JavaStandardScalerExample `\n  * `public class JavaStopWordsRemoverExample `\n  * `public class JavaStringIndexerExample `\n  * `public class JavaTokenizerExample `\n  * `public class JavaVectorAssemblerExample `\n  * `public class JavaVectorIndexerExample `\n  * `public class JavaVectorSlicerExample `\n  * `case class LoopVar(value: String, isNull: String, dataType: DataType) extends LeafExpression`\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163514629
  
    @cloud-fan Sure, let me open it now. Thank you!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10239#discussion_r47183508
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects.scala ---
    @@ -323,22 +323,26 @@ case class WrapOption(child: Expression)
     }
     
     /**
    - * A place holder for the loop variable used in [[MapObjects]].  This should never be constructed
    - * manually, but will instead be passed into the provided lambda function.
    + * A place holder for the loop variable used in [[MapObjects]].
      */
    -case class LambdaVariable(value: String, isNull: String, dataType: DataType) extends Expression {
    +case class LoopVar(value: String, isNull: String, dataType: DataType) extends LeafExpression
    +  with Unevaluable {
     
    -  override def genCode(ctx: CodeGenContext, ev: GeneratedExpressionCode): String =
    -    throw new UnsupportedOperationException("Only calling gen() is supported.")
    +  override def nullable: Boolean = true
     
    -  override def children: Seq[Expression] = Nil
    -  override def gen(ctx: CodeGenContext): GeneratedExpressionCode =
    +  override def gen(ctx: CodeGenContext): GeneratedExpressionCode = {
         GeneratedExpressionCode(code = "", value = value, isNull = isNull)
    +  }
    +}
     
    -  override def nullable: Boolean = false
    -  override def eval(input: InternalRow): Any =
    -    throw new UnsupportedOperationException("Only code-generated evaluation is supported.")
    +object LoopVar {
    +  private val curId = new java.util.concurrent.atomic.AtomicInteger()
     
    +  def apply(dataType: DataType): LoopVar = {
    +    val loopValue = "MapObjects_loopValue" + curId.getAndIncrement()
    +    val loopIsNull = "MapObjects_loopIsNull" + curId.getAndIncrement()
    +    LoopVar(loopValue, loopIsNull, dataType)
    --- End diff --
    
    Here is a little hacky, but I wanna argue that it's way less hacky than the previous implementation and I make the variable name verbose to make it less possible to get conflicted.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163484093
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47472/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163486563
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163473273
  
    **[Test build #47479 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47479/consoleFull)** for PR 10239 at commit [`d1992a9`](https://github.com/apache/spark/commit/d1992a955ee02bf9aea2213387652e1e6242b125).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163486448
  
    **[Test build #47478 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47478/consoleFull)** for PR 10239 at commit [`4f0f9fa`](https://github.com/apache/spark/commit/4f0f9faac26095ac2e714c4d52ab566ed9fcc451).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `case class LambdaVariable(value: String, isNull: String, dataType: DataType) extends LeafExpression`\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163513683
  
    @gatorsmile , the problem is: currently kryo/javaSerialization encoder is not composable, so it's another story, feel free to open a JIRA for it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10239#discussion_r47184645
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects.scala ---
    @@ -326,19 +326,28 @@ case class WrapOption(child: Expression)
      * A place holder for the loop variable used in [[MapObjects]].  This should never be constructed
      * manually, but will instead be passed into the provided lambda function.
      */
    -case class LambdaVariable(value: String, isNull: String, dataType: DataType) extends Expression {
    +case class LambdaVariable(value: String, isNull: String, dataType: DataType) extends LeafExpression
    +  with Unevaluable {
     
    -  override def genCode(ctx: CodeGenContext, ev: GeneratedExpressionCode): String =
    -    throw new UnsupportedOperationException("Only calling gen() is supported.")
    +  override def nullable: Boolean = true
     
    -  override def children: Seq[Expression] = Nil
    -  override def gen(ctx: CodeGenContext): GeneratedExpressionCode =
    +  override def gen(ctx: CodeGenContext): GeneratedExpressionCode = {
         GeneratedExpressionCode(code = "", value = value, isNull = isNull)
    +  }
    +}
     
    -  override def nullable: Boolean = false
    -  override def eval(input: InternalRow): Any =
    -    throw new UnsupportedOperationException("Only code-generated evaluation is supported.")
    +object MapObjects {
    +  private val curId = new java.util.concurrent.atomic.AtomicInteger()
     
    +  def apply(
    +      function: Expression => Expression,
    +      inputData: Expression,
    +      elementType: DataType): MapObjects = {
    +    val loopValue = "MapObjects_loopValue" + curId.getAndIncrement()
    +    val loopIsNull = "MapObjects_loopIsNull" + curId.getAndIncrement()
    +    val loopVar = LambdaVariable(loopValue, loopIsNull, elementType)
    --- End diff --
    
    Here is a little hacky, but I wanna argue that it's way less hacky than the previous implementation and I make the variable name verbose to make it less possible to get conflicted.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10239#discussion_r47184698
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects.scala ---
    @@ -326,19 +326,28 @@ case class WrapOption(child: Expression)
      * A place holder for the loop variable used in [[MapObjects]].  This should never be constructed
      * manually, but will instead be passed into the provided lambda function.
      */
    -case class LambdaVariable(value: String, isNull: String, dataType: DataType) extends Expression {
    +case class LambdaVariable(value: String, isNull: String, dataType: DataType) extends LeafExpression
    +  with Unevaluable {
     
    -  override def genCode(ctx: CodeGenContext, ev: GeneratedExpressionCode): String =
    -    throw new UnsupportedOperationException("Only calling gen() is supported.")
    +  override def nullable: Boolean = true
    --- End diff --
    
    I think the `loopVar` is nullable?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12252][SQL] refactor MapObjects to make...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10239#issuecomment-163468990
  
    **[Test build #47472 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47472/consoleFull)** for PR 10239 at commit [`f764e3d`](https://github.com/apache/spark/commit/f764e3d59e47eb94d3cd23fe669f45ee44255a47).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org