You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by dongjoon-hyun <gi...@git.apache.org> on 2016/05/19 08:12:05 UTC

[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

GitHub user dongjoon-hyun opened a pull request:

    https://github.com/apache/spark/pull/13192

    [SPARK-13135][SQL] Don't print expressions recursively in generated code

    ## What changes were proposed in this pull request?
    
    This PR is an up-to-date and a little bit improved version of #11019 of @rxin for
    - (1) preventing recursive printing of expressions in generated code.
    
    In addition to #11019, this PR improves the followings in code generation.
    - (2) Improve multiline comment about `Codegened pipeline`.
    - (3) Improve multiline comment indentation.
    - (4) Reduce the number of empty lines (mainly consequtive empty lines).
    - (5) Remove all space characters on empty lines.
    
    **Example**
    ```
    spark.range(1, 1000).select('id+1+2+3+4)
    ```
    
    **Before**
    ```
    Generated code:
    /* 001 */ public Object generate(Object[] references) {
    ...
    /* 005 */ /** Codegened pipeline for:
    /* 006 */ * Project [((((id#0L + 1) + 2) + 3) + 4) AS ((((id + 1) + 2) + 3) + 4)#3L]
    /* 007 */ +- Range 1, 1, 8, 999, [id#0L]
    /* 008 */ */
    ...
    /* 074 */     /*** PRODUCE: Project [((((id#0L + 1) + 2) + 3) + 4) AS ((((id + 1) + 2) + 3) + 4)#3L] */
    /* 075 */     
    /* 076 */     /*** PRODUCE: Range 1, 1, 8, 999, [id#0L] */
    /* 077 */     
    /* 078 */     // initialize Range
    ...
    /* 091 */       /*** CONSUME: Project [((((id#0L + 1) + 2) + 3) + 4) AS ((((id + 1) + 2) + 3) + 4)#3L] */
    /* 092 */       
    /* 093 */       /*** CONSUME: WholeStageCodegen */
    /* 094 */       
    /* 095 */       /* ((((input[0, bigint] + 1) + 2) + 3) + 4) */
    /* 096 */       /* (((input[0, bigint] + 1) + 2) + 3) */
    /* 097 */       /* ((input[0, bigint] + 1) + 2) */
    /* 098 */       /* (input[0, bigint] + 1) */
    ...
    /* 109 */       project_value = project_value1 + 4L;
    /* 110 */       project_rowWriter.write(0, project_value);
    ...
    /* 116 */ }
    ```
    
    **After**
    ```
    Generated code:
    /* 001 */ public Object generate(Object[] references) {
    ...
    /* 005 */ /** Codegened pipeline for:
    /* 006 */  * Project [((((id#0L + 1) + 2) + 3) + 4) AS ((((id + 1) + 2) + 3) + 4)#3L]
    /* 007 */  * +- Range 1, 1, 8, 999, [id#0L]
    /* 008 */  */
    ...
    /* 074 */     /*** PRODUCE: Project [((((id#0L + 1) + 2) + 3) + 4) AS ((((id + 1) + 2) + 3) + 4)#3L] */
    /* 075 */     /*** PRODUCE: Range 1, 1, 8, 999, [id#0L] */
    /* 076 */     // initialize Range
    ...
    /* 089 */       /*** CONSUME: Project [((((id#0L + 1) + 2) + 3) + 4) AS ((((id + 1) + 2) + 3) + 4)#3L] */
    /* 090 */       /*** project list: [((((input[0, bigint, false] + 1) + 2) + 3) + 4)] */
    /* 091 */       /*** CONSUME: WholeStageCodegen */
    ...
    /* 102 */       project_value = project_value1 + 4L;
    /* 103 */       /*** project list: [input[0, bigint, false]] */
    /* 104 */       project_rowWriter.write(0, project_value);
    ...
    /* 110 */ }
    ```
    
    ## How was this patch tested?
    
    Pass the Jenkins tests and see the result of the following command manually.
    ```scala
    scala> spark.range(1, 1000).select('id+1+2+3+4).queryExecution.debug.codegen()
    ```
    
    Author: Dongjoon Hyun <do...@apache.org>
    Author: Reynold Xin <rx...@databricks.com>

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dongjoon-hyun/spark SPARK-13135

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13192.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13192
    
----
commit ea7de3b92d413352cec2520119a4ab9b3f930cf3
Author: Dongjoon Hyun <do...@apache.org>
Date:   2016-05-19T07:47:45Z

    [SPARK-13135][SQL] Don't print expressions recursively in generated code
    
    - Don't print expressions recursively
    - Reduce the number of empty lines
    - Remove all space characters on empty lines
    - Improve multiline comment indentation

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13192#discussion_r64449670
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatter.scala ---
    @@ -100,8 +118,11 @@ private class CodeFormatter {
           indentString
         }
         code.append(f"/* ${currentLine}%03d */ ")
    -    code.append(thisLineIndent)
    -    code.append(line)
    +    if (line.trim().length > 0) {
    --- End diff --
    
    Can you explain a bit more about this logic?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220740772
  
    Hi, @davies and @rxin .
    I updated the code and description again according to the current master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220642836
  
    The PySpark failure is fixed as a HOTFIX.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220274764
  
    **[Test build #58855 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58855/consoleFull)** for PR 13192 at commit [`ea7de3b`](https://github.com/apache/spark/commit/ea7de3b92d413352cec2520119a4ab9b3f930cf3).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220563590
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13192#discussion_r63918110
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateMutableProjection.scala ---
    @@ -124,6 +124,7 @@ object GenerateMutableProjection extends CodeGenerator[Seq[Expression], MutableP
     
             public java.lang.Object apply(java.lang.Object _i) {
               InternalRow ${ctx.INPUT_ROW} = (InternalRow) _i;
    +          /*** project list: ${expressions.map(_.toCommentSafeString).mkString(", ")} */
    --- End diff --
    
    Davies' suggestion makes sense to me. I just don't want to print the expressions recursively.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220678036
  
    @dongjoon-hyun Maybe we could have a method Expression.genCodeWithComment() that is used by generated projections and operators, it requires change more places, not sure it's a good idea or not.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220563440
  
    **[Test build #58971 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58971/consoleFull)** for PR 13192 at commit [`5ffb249`](https://github.com/apache/spark/commit/5ffb24922c8f508b7e517c5fd3d1280409dc0315).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220257517
  
    Hi, @rxin .
    This is the first attempt according to your request.
    I removed some obsolete code in #11019 in order to pass the tests.
    Please let me know if there is something I missed mistakenly.
    
    cc @cloud-fan @nongli 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13192#discussion_r64440872
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatter.scala ---
    @@ -49,6 +49,24 @@ object CodeFormatter {
         }
         code.result()
       }
    +
    +  def stripOverlappingComments(codeAndComment: CodeAndComment): CodeAndComment = {
    +    val code = new StringBuilder
    +    val map = codeAndComment.comment
    +    var lastLine: String = "dummy"
    +    codeAndComment.body.split('\n').foreach { l =>
    +      val line = l.trim()
    +      val skip = lastLine.startsWith("/*") && lastLine.endsWith("*/") &&
    --- End diff --
    
    are we assuming the comment holder will always take an entire line?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220272109
  
    cc @sameeragarwal  / @davies


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220666706
  
    **[Test build #59005 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59005/consoleFull)** for PR 13192 at commit [`2018d9f`](https://github.com/apache/spark/commit/2018d9f3358de91557b8a37ad116d685840cb6e2).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220274966
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220544912
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220746249
  
    **[Test build #59039 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59039/consoleFull)** for PR 13192 at commit [`8257e78`](https://github.com/apache/spark/commit/8257e7873db9c134dfd168389cefdd90bcfa2e7e).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13192#discussion_r64333811
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatter.scala ---
    @@ -49,6 +49,24 @@ object CodeFormatter {
         }
         code.result()
       }
    +
    +  def stripOverlappingComments(codeAndComment: CodeAndComment): CodeAndComment = {
    +    val code = new StringBuilder
    +    val map = codeAndComment.comment
    +    var lastLine: String = "dummy"
    +    codeAndComment.body.split('\n').foreach { l =>
    +      val line = l.trim()
    +      val skip = lastLine.startsWith("/*") && lastLine.endsWith("*/") &&
    +        line.startsWith("/*") && line.endsWith("*/") &&
    +        map(lastLine).substring(3).contains(map(line).substring(3))
    --- End diff --
    
    I think it's okay for the performance.
    - This function is used for at every `CodeAndComment` creation once.
    - It scans `codeAndComment.body` once.
    - Map lookup occurs on each line at most once. Also, it does not cost much in this case.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220746368
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59039/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220544914
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58960/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220667027
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-221338736
  
    LGTM, 
    Merging this into master and 2.0, thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/13192


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13192#discussion_r64442432
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatterSuite.scala ---
    @@ -36,6 +36,22 @@ class CodeFormatterSuite extends SparkFunSuite {
         }
       }
     
    +  test("removing overlapping comments") {
    +    val code = new CodeAndComment(
    +      """/*project_c4*/
    +        |/*project_c3*/
    +        |/*project_c2*/
    +      """.stripMargin,
    +      Map(
    +        "/*project_c4*/" -> "// (((input[0, bigint, false] + 1) + 2) + 3))",
    --- End diff --
    
    what if we have multi-line comment? Does it still work?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13192#discussion_r63916714
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateMutableProjection.scala ---
    @@ -124,6 +124,7 @@ object GenerateMutableProjection extends CodeGenerator[Seq[Expression], MutableP
     
             public java.lang.Object apply(java.lang.Object _i) {
               InternalRow ${ctx.INPUT_ROW} = (InternalRow) _i;
    +          /*** project list: ${expressions.map(_.toCommentSafeString).mkString(", ")} */
    --- End diff --
    
    No problem.
    But, it is directly related with the intention of original @rxin 's PR, too.
    I think we need to make some consensus on how to improve that.
    I'll wait and collect other people's opinions until this afternoon, and do that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13192#discussion_r64446968
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatterSuite.scala ---
    @@ -36,6 +36,22 @@ class CodeFormatterSuite extends SparkFunSuite {
         }
       }
     
    +  test("removing overlapping comments") {
    +    val code = new CodeAndComment(
    +      """/*project_c4*/
    +        |/*project_c3*/
    +        |/*project_c2*/
    +      """.stripMargin,
    +      Map(
    +        "/*project_c4*/" -> "// (((input[0, bigint, false] + 1) + 2) + 3))",
    --- End diff --
    
    Hi, @cloud-fan .
    This is only targeting for those overlapping expressions.
    It's not for multi-line comments.
    Could you give me some kind of `generated overlapping multi-line comment`?
    If then, I can grasp your concerns more fast.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220644473
  
    **[Test build #59005 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59005/consoleFull)** for PR 13192 at commit [`2018d9f`](https://github.com/apache/spark/commit/2018d9f3358de91557b8a37ad116d685840cb6e2).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220746367
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-221344268
  
    Thank you, @davies !


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13192#discussion_r64332536
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatter.scala ---
    @@ -49,6 +49,24 @@ object CodeFormatter {
         }
         code.result()
       }
    +
    +  def stripOverlappingComments(codeAndComment: CodeAndComment): CodeAndComment = {
    +    val code = new StringBuilder
    +    val map = codeAndComment.comment
    +    var lastLine: String = "dummy"
    +    codeAndComment.body.split('\n').foreach { l =>
    +      val line = l.trim()
    +      val skip = lastLine.startsWith("/*") && lastLine.endsWith("*/") &&
    +        line.startsWith("/*") && line.endsWith("*/") &&
    +        map(lastLine).substring(3).contains(map(line).substring(3))
    --- End diff --
    
    Oh, it should work, I missed the `map`. Will it have performance issue?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220548264
  
    **[Test build #58971 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58971/consoleFull)** for PR 13192 at commit [`5ffb249`](https://github.com/apache/spark/commit/5ffb24922c8f508b7e517c5fd3d1280409dc0315).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220547752
  
    Retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13192#discussion_r64462691
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatter.scala ---
    @@ -100,8 +118,11 @@ private class CodeFormatter {
           indentString
         }
         code.append(f"/* ${currentLine}%03d */ ")
    -    code.append(thisLineIndent)
    -    code.append(line)
    +    if (line.trim().length > 0) {
    --- End diff --
    
    Sure! This `if` makes blank lines print **only line numbers**, e.g.,
    **Before**
    ```
    "/* 001 */ <space><space><space><space> ..."
    ```
    **After**
    ```
    "/* 001 */ "
    ```
    As you see, the redundant spaces for indentation go away.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13192#discussion_r64354339
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatter.scala ---
    @@ -49,6 +49,24 @@ object CodeFormatter {
         }
         code.result()
       }
    +
    +  def stripOverlappingComments(codeAndComment: CodeAndComment): CodeAndComment = {
    +    val code = new StringBuilder
    +    val map = codeAndComment.comment
    +    var lastLine: String = "dummy"
    +    codeAndComment.body.split('\n').foreach { l =>
    +      val line = l.trim()
    +      val skip = lastLine.startsWith("/*") && lastLine.endsWith("*/") &&
    +        line.startsWith("/*") && line.endsWith("*/") &&
    +        map(lastLine).substring(3).contains(map(line).substring(3))
    --- End diff --
    
    Also, the skip condition is checking only consecutive comments lines.
    If there is something to do more, please let me know, @davies .


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220274969
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58855/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220667029
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59005/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220851707
  
    Hi, @davies and @rxin .
    Could you review this PR again when you have some time?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220531788
  
    **[Test build #58960 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58960/consoleFull)** for PR 13192 at commit [`5ffb249`](https://github.com/apache/spark/commit/5ffb24922c8f508b7e517c5fd3d1280409dc0315).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220544714
  
    **[Test build #58960 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58960/consoleFull)** for PR 13192 at commit [`5ffb249`](https://github.com/apache/spark/commit/5ffb24922c8f508b7e517c5fd3d1280409dc0315).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220682262
  
    Ya. There were a huge change. I've saw the PR before, but I didn't consider that in this PR. 
    My bad. Let me think how to solve the original goal with new master branch. 
    Thank you, @davies .


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220563593
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58971/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220736209
  
    **[Test build #59039 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59039/consoleFull)** for PR 13192 at commit [`8257e78`](https://github.com/apache/spark/commit/8257e7873db9c134dfd168389cefdd90bcfa2e7e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13192#discussion_r63913005
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatter.scala ---
    @@ -24,13 +24,13 @@ package org.apache.spark.sql.catalyst.expressions.codegen
      * Written by Matei Zaharia.
      */
     object CodeFormatter {
    -  def format(code: String): String = new CodeFormatter().addLines(code).result()
    +  def format(code: String): String = new CodeFormatter().addLines(stripExtraNewLines(code)).result()
    --- End diff --
    
    We can't remove the empty lines here, or LINENO of the compiled code will be different than the formatted code.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13192#discussion_r64082929
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatter.scala ---
    @@ -39,6 +40,23 @@ object CodeFormatter {
         }
         code.result()
       }
    +
    +  def stripOverlappingComments(input: String): String = {
    --- End diff --
    
    After #12979 is merged, this may not work now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13192#discussion_r63915288
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatter.scala ---
    @@ -24,13 +24,13 @@ package org.apache.spark.sql.catalyst.expressions.codegen
      * Written by Matei Zaharia.
      */
     object CodeFormatter {
    -  def format(code: String): String = new CodeFormatter().addLines(code).result()
    +  def format(code: String): String = new CodeFormatter().addLines(stripExtraNewLines(code)).result()
    --- End diff --
    
    Thank you for review, @davies .
    Oh, I thought `CodeFormatter.format` is called before Janino and Guava loading cache, too.
    I'll make that consistent in this afternoon. If then, it'll be okay.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13192#discussion_r64332407
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatter.scala ---
    @@ -49,6 +49,24 @@ object CodeFormatter {
         }
         code.result()
       }
    +
    +  def stripOverlappingComments(codeAndComment: CodeAndComment): CodeAndComment = {
    +    val code = new StringBuilder
    +    val map = codeAndComment.comment
    +    var lastLine: String = "dummy"
    +    codeAndComment.body.split('\n').foreach { l =>
    +      val line = l.trim()
    +      val skip = lastLine.startsWith("/*") && lastLine.endsWith("*/") &&
    +        line.startsWith("/*") && line.endsWith("*/") &&
    +        map(lastLine).substring(3).contains(map(line).substring(3))
    --- End diff --
    
    Have you check that this actually work? I think we have placeholders here so will not find any duplicated comments to skip.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13192#discussion_r64449626
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatterSuite.scala ---
    @@ -36,6 +36,22 @@ class CodeFormatterSuite extends SparkFunSuite {
         }
       }
     
    +  test("removing overlapping comments") {
    +    val code = new CodeAndComment(
    +      """/*project_c4*/
    +        |/*project_c3*/
    +        |/*project_c2*/
    +      """.stripMargin,
    +      Map(
    +        "/*project_c4*/" -> "// (((input[0, bigint, false] + 1) + 2) + 3))",
    --- End diff --
    
    seems no such case, then it's fine


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13192#discussion_r63918627
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateMutableProjection.scala ---
    @@ -124,6 +124,7 @@ object GenerateMutableProjection extends CodeGenerator[Seq[Expression], MutableP
     
             public java.lang.Object apply(java.lang.Object _i) {
               InternalRow ${ctx.INPUT_ROW} = (InternalRow) _i;
    +          /*** project list: ${expressions.map(_.toCommentSafeString).mkString(", ")} */
    --- End diff --
    
    Thank you for fast decision, @rxin ! :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220532622
  
    As @rxin told, what was really needed is removing `overlapping` comments.
    So, I rethink about that and revert the change on `Expression.gen` which removes the `code` field.
    It has its own values. Instead, I can achieve that goal simply by adding `CodeFormatter. stripOverlappingComments`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220256665
  
    **[Test build #58855 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58855/consoleFull)** for PR 13192 at commit [`ea7de3b`](https://github.com/apache/spark/commit/ea7de3b92d413352cec2520119a4ab9b3f930cf3).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13192#discussion_r64432741
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatter.scala ---
    @@ -49,6 +49,24 @@ object CodeFormatter {
         }
         code.result()
       }
    +
    +  def stripOverlappingComments(codeAndComment: CodeAndComment): CodeAndComment = {
    +    val code = new StringBuilder
    +    val map = codeAndComment.comment
    +    var lastLine: String = "dummy"
    +    codeAndComment.body.split('\n').foreach { l =>
    +      val line = l.trim()
    +      val skip = lastLine.startsWith("/*") && lastLine.endsWith("*/") &&
    +        line.startsWith("/*") && line.endsWith("*/") &&
    +        map(lastLine).substring(3).contains(map(line).substring(3))
    --- End diff --
    
    SGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/13192#issuecomment-220667461
  
    Hi, @davies .
    It's ready for review, again!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13135][SQL] Don't print expressions rec...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13192#discussion_r63913511
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateMutableProjection.scala ---
    @@ -124,6 +124,7 @@ object GenerateMutableProjection extends CodeGenerator[Seq[Expression], MutableP
     
             public java.lang.Object apply(java.lang.Object _i) {
               InternalRow ${ctx.INPUT_ROW} = (InternalRow) _i;
    +          /*** project list: ${expressions.map(_.toCommentSafeString).mkString(", ")} */
    --- End diff --
    
    When there are many columns, the comments are far away from their code, it's better to interweave them.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org