You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by kiszk <gi...@git.apache.org> on 2016/05/21 18:09:13 UTC

[GitHub] spark pull request: [SPARK-15258][SQL] Nested/Chained case stateme...

GitHub user kiszk opened a pull request:

    https://github.com/apache/spark/pull/13243

    [SPARK-15258][SQL] Nested/Chained case statements generate codegen over 64k exception

    ## What changes were proposed in this pull request?
    
    This PR avoid code generation for SafeProjection if the if the total number of NewInstances whose code has been generated in ```SpecificSafeProjection.apply()``` exceeds a pre-defined threshold (400). If the total number exceeds the threshold, ```CodegenFallback.doGenCode``` is called.  This feature avoids an error that bytecode size is larger than 64KB.
    
    Generated code by this PR
    ````java
    /* 001 */
    /* 002 */ public java.lang.Object generate(Object[] references) {
    /* 003 */   return new SpecificSafeProjection(references);
    /* 004 */ }
    /* 005 */
    /* 006 */ class SpecificSafeProjection extends org.apache.spark.sql.catalyst.expressions.codegen.BaseProjection {
    /* 007 */
    /* 008 */   private Object[] references;
    /* 009 */   private MutableRow mutableRow;
    /* 010 */
    /* 011 */
    /* 012 */
    /* 013 */   public SpecificSafeProjection(Object[] references) {
    /* 014 */     this.references = references;
    /* 015 */     mutableRow = (MutableRow) references[references.length - 1];
    /* 016 */
    /* 017 */   }
    /* 018 */
    /* 019 */   public java.lang.Object apply(java.lang.Object _i) {
    /* 020 */     InternalRow i = (InternalRow) _i;
    /* 021 */     boolean isNull3 = i.isNullAt(0);
    /* 022 */     InternalRow value3 = isNull3 ? null : (i.getStruct(0, 100));
    /* 023 */     boolean isNull1 = false;
    /* 024 */     org.apache.spark.sql.S100 value1 = null;
    /* 025 */     if (!false && isNull3) {
    /* 026 */       final org.apache.spark.sql.S100 value4 = null;
    /* 027 */       isNull1 = true;
    /* 028 */       value1 = value4;
    /* 029 */     } else {
    /* 030 */       // newInstance(class org.apache.spark.sql.S100)
    /* 031 */       Object obj = ((Expression) references[0]).eval(i);
    /* 032 */       org.apache.spark.sql.S100 value5 = (org.apache.spark.sql.S100) obj;
    /* 033 */       isNull1 = false;
    /* 034 */       value1 = value5;
    /* 035 */     }
    /* 036 */     boolean isNull8 = i.isNullAt(1);
    /* 037 */     InternalRow value8 = isNull8 ? null : (i.getStruct(1, 100));
    /* 038 */     boolean isNull6 = false;
    /* 039 */     org.apache.spark.sql.S100 value6 = null;
    /* 040 */     if (!false && isNull8) {
    /* 041 */       final org.apache.spark.sql.S100 value9 = null;
    /* 042 */       isNull6 = true;
    /* 043 */       value6 = value9;
    /* 044 */     } else {
    /* 045 */       // newInstance(class org.apache.spark.sql.S100)
    /* 046 */       Object obj1 = ((Expression) references[1]).eval(i);
    /* 047 */       org.apache.spark.sql.S100 value10 = (org.apache.spark.sql.S100) obj1;
    /* 048 */       isNull6 = false;
    /* 049 */       value6 = value10;
    /* 050 */     }
    /* 051 */     boolean isNull13 = i.isNullAt(2);
    /* 052 */     InternalRow value13 = isNull13 ? null : (i.getStruct(2, 100));
    /* 053 */     boolean isNull11 = false;
    /* 054 */     org.apache.spark.sql.S100 value11 = null;
    /* 055 */     if (!false && isNull13) {
    /* 056 */       final org.apache.spark.sql.S100 value14 = null;
    /* 057 */       isNull11 = true;
    /* 058 */       value11 = value14;
    /* 059 */     } else {
    /* 060 */       // newInstance(class org.apache.spark.sql.S100)
    /* 061 */       Object obj2 = ((Expression) references[2]).eval(i);
    /* 062 */       org.apache.spark.sql.S100 value15 = (org.apache.spark.sql.S100) obj2;
    /* 063 */       isNull11 = false;
    /* 064 */       value11 = value15;
    /* 065 */     }
    /* 066 */     boolean isNull18 = i.isNullAt(3);
    /* 067 */     InternalRow value18 = isNull18 ? null : (i.getStruct(3, 100));
    /* 068 */     boolean isNull16 = false;
    /* 069 */     org.apache.spark.sql.S100 value16 = null;
    /* 070 */     if (!false && isNull18) {
    /* 071 */       final org.apache.spark.sql.S100 value19 = null;
    /* 072 */       isNull16 = true;
    /* 073 */       value16 = value19;
    /* 074 */     } else {
    /* 075 */       // newInstance(class org.apache.spark.sql.S100)
    /* 076 */       Object obj3 = ((Expression) references[3]).eval(i);
    /* 077 */       org.apache.spark.sql.S100 value20 = (org.apache.spark.sql.S100) obj3;
    /* 078 */       isNull16 = false;
    /* 079 */       value16 = value20;
    /* 080 */     }
    /* 081 */     boolean isNull23 = i.isNullAt(4);
    /* 082 */     InternalRow value23 = isNull23 ? null : (i.getStruct(4, 100));
    /* 083 */     boolean isNull21 = false;
    /* 084 */     org.apache.spark.sql.S100 value21 = null;
    /* 085 */     if (!false && isNull23) {
    /* 086 */       final org.apache.spark.sql.S100 value24 = null;
    /* 087 */       isNull21 = true;
    /* 088 */       value21 = value24;
    /* 089 */     } else {
    /* 090 */       boolean isNull28 = i.isNullAt(4);
    /* 091 */       InternalRow value28 = isNull28 ? null : (i.getStruct(4, 100));
    /* 092 */       boolean isNull27 = isNull28;
    /* 093 */       UTF8String value27 = null;
    /* 094 */
    /* 095 */       if (!isNull28) {
    /* 096 */
    /* 097 */         if (value28.isNullAt(0)) {
    /* 098 */           isNull27 = true;
    /* 099 */         } else {
    /* 100 */           value27 = value28.getUTF8String(0);
    /* 101 */         }
    /* 102 */
    /* 103 */       }
    /* 104 */
    /* 105 */       boolean isNull26 = isNull27;
    /* 106 */       final java.lang.String value26 = isNull26 ? null : (java.lang.String) value27.toString();
    /* 107 */       isNull26 = value26 == null;
    /* 108 */       boolean isNull31 = i.isNullAt(4);
    /* 109 */       InternalRow value31 = isNull31 ? null : (i.getStruct(4, 100));
    /* 110 */       boolean isNull30 = isNull31;
    /* 111 */       UTF8String value30 = null;
    /* 112 */
    /* 113 */       if (!isNull31) {
    /* 114 */
    /* 115 */         if (value31.isNullAt(1)) {
    /* 116 */           isNull30 = true;
    /* 117 */         } else {
    /* 118 */           value30 = value31.getUTF8String(1);
    /* 119 */         }
    /* 120 */
    /* 121 */       }
    /* 122 */
    /* 123 */       boolean isNull29 = isNull30;
    /* 124 */       final java.lang.String value29 = isNull29 ? null : (java.lang.String) value30.toString();
    /* 125 */       isNull29 = value29 == null;
    /* 126 */       boolean isNull34 = i.isNullAt(4);
    /* 127 */       InternalRow value34 = isNull34 ? null : (i.getStruct(4, 100));
    /* 128 */       boolean isNull33 = isNull34;
    /* 129 */       UTF8String value33 = null;
    /* 130 */
    /* 131 */       if (!isNull34) {
    /* 132 */
    /* 133 */         if (value34.isNullAt(2)) {
    /* 134 */           isNull33 = true;
    /* 135 */         } else {
    /* 136 */           value33 = value34.getUTF8String(2);
    /* 137 */         }
    /* 138 */
    /* 139 */       }
    /* 140 */
    /* 141 */       boolean isNull32 = isNull33;
    /* 142 */       final java.lang.String value32 = isNull32 ? null : (java.lang.String) value33.toString();
    /* 143 */       isNull32 = value32 == null;
    /* 144 */       boolean isNull37 = i.isNullAt(4);
    /* 145 */       InternalRow value37 = isNull37 ? null : (i.getStruct(4, 100));
    /* 146 */       boolean isNull36 = isNull37;
    /* 147 */       UTF8String value36 = null;
    /* 148 */
    /* 149 */       if (!isNull37) {
    /* 150 */
    /* 151 */         if (value37.isNullAt(3)) {
    /* 152 */           isNull36 = true;
    /* 153 */         } else {
    /* 154 */           value36 = value37.getUTF8String(3);
    /* 155 */         }
    /* 156 */
    /* 157 */       }
    ...
    /* 1869 */       boolean isNull320 = isNull321;
    /* 1870 */       final java.lang.String value320 = isNull320 ? null : (java.lang.String) value321.toString();
    /* 1871 */       isNull320 = value320 == null;
    /* 1872 */       boolean isNull325 = i.isNullAt(4);
    /* 1873 */       InternalRow value325 = isNull325 ? null : (i.getStruct(4, 100));
    /* 1874 */       boolean isNull324 = isNull325;
    /* 1875 */       UTF8String value324 = null;
    /* 1876 */
    /* 1877 */       if (!isNull325) {
    /* 1878 */
    /* 1879 */         if (value325.isNullAt(99)) {
    /* 1880 */           isNull324 = true;
    /* 1881 */         } else {
    /* 1882 */           value324 = value325.getUTF8String(99);
    /* 1883 */         }
    /* 1884 */
    /* 1885 */       }
    /* 1886 */
    /* 1887 */       boolean isNull323 = isNull324;
    /* 1888 */       final java.lang.String value323 = isNull323 ? null : (java.lang.String) value324.toString();
    /* 1889 */       isNull323 = value323 == null;
    /* 1890 */
    /* 1891 */
    /* 1892 */       final org.apache.spark.sql.S100 value25 = false ? null : new org.apache.spark.sql.S100(value26, value29, value32, value35, value38, value41, value44, value47, value50, value53, value56, value59, value62, value65, value68, value71, value74, value77, value80, value83, value86, value89, value92, value95, value98, value101, value104, value107, value110, value113, value116, value119, value122, value125, value128, value131, value134, value137, value140, value143, value146, value149, value152, value155, value158, value161, value164, value167, value170, value173, value176, value179, value182, value185, value188, value191, value194, value197, value200, value203, value206, value209, value212, value215, value218, value221, value224, value227, value230, value233, value236, value239, value242, value245, value248, value251, value254, value257, value260, value263, value266, value269, value272, value275, value278, value281, value284, value287, value290, value293, value296, value
 299, value302, value305, value308, value311, value314, value317, value320, value323);
    /* 1893 */       isNull21 = false;
    /* 1894 */       value21 = value25;
    /* 1895 */     }
    /* 1896 */
    /* 1897 */
    /* 1898 */     final org.apache.spark.sql.S100_5 value = false ? null : new org.apache.spark.sql.S100_5(value1, value6, value11, value16, value21);
    /* 1899 */     if (false) {
    /* 1900 */       mutableRow.setNullAt(0);
    /* 1901 */     } else {
    /* 1902 */
    /* 1903 */       mutableRow.update(0, value);
    /* 1904 */     }
    /* 1905 */
    /* 1906 */     return mutableRow;
    /* 1907 */   }
    /* 1908 */ }
    /* 1909 */
    
    ````
    
    ## How was this patch tested?
    
    Added new tests


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kiszk/spark SPARK-15285

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13243.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13243
    
----
commit 8c72486b753f94fc8346524fe6cb074b94ae555c
Author: Kazuaki Ishizaki <is...@jp.ibm.com>
Date:   2016-05-21T17:56:26Z

    Go to codegen fallback if the total number of NewInstances whose code has been generated exceed the pre-defined threshold

commit e32a2371b34dc1de8b226b1d58ca76a12a5e580e
Author: Kazuaki Ishizaki <is...@jp.ibm.com>
Date:   2016-05-21T17:59:04Z

    revert blank line

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13243#discussion_r64166173
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -232,27 +232,54 @@ case class NewInstance(
     
       override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
         val javaType = ctx.javaType(dataType)
    -    val argGen = arguments.map(_.genCode(ctx))
    -    val argString = argGen.map(_.value).mkString(", ")
    +    val argIsNulls = ctx.freshName("argIsNulls")
    +    ctx.addMutableState("boolean[]", argIsNulls,
    +      s"$argIsNulls = new boolean[${arguments.size}];")
    +    val argValues = arguments.zipWithIndex.map { case (e, i) =>
    +      val argValue = ctx.freshName("argValue")
    +      ctx.addMutableState(ctx.javaType(e.dataType), argValue, "")
    +      argValue
    +    }
    +
    +    val argCodes = arguments.zipWithIndex.map { case (e, i) =>
    +      val expr = e.genCode(ctx)
    +      expr.code + s"""
    +       $argIsNulls[$i] = ${expr.isNull};
    +       ${argValues(i)} = ${expr.value};
    +     """
    +    }
    +    val argCode = ctx.splitExpressions(ctx.INPUT_ROW, argCodes)
     
         val outer = outerPointer.map(func => Literal.fromObject(func()).genCode(ctx))
     
         var isNull = ev.isNull
         val setIsNull = if (propagateNull && arguments.nonEmpty) {
    -      s"final boolean $isNull = ${argGen.map(_.isNull).mkString(" || ")};"
    +      if (arguments.length <= 10) {
    +        val argIsNull = arguments.zipWithIndex.map { case (e, i) =>
    +          s"$argIsNulls[$i]"
    +        }
    +        s"final boolean $isNull = ${argIsNull.mkString(" || ")};"
    +      } else {
    +        s"""
    +         boolean $isNull = false;
    +         for (int idx = 0; idx < ${arguments.length}; idx++) {
    +           if ($argIsNulls[idx]) { $isNull = true; break; }
    +         }
    +       """
    +      }
         } else {
           isNull = "false"
           ""
         }
     
         val constructorCall = outer.map { gen =>
    -      s"""${gen.value}.new ${cls.getSimpleName}($argString)"""
    +      s"""${gen.value}.new ${cls.getSimpleName}(${argValues.mkString(", ")})"""
         }.getOrElse {
    -      s"new $className($argString)"
    +      s"new $className(${argValues.mkString(", ")})"
         }
     
         val code = s"""
    -      ${argGen.map(_.code).mkString("\n")}
    +      ${argCode.mkString("")}
    --- End diff --
    
    Isn't `argCode` already a string?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13243#discussion_r64153170
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -232,27 +232,55 @@ case class NewInstance(
     
       override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
         val javaType = ctx.javaType(dataType)
    -    val argGen = arguments.map(_.genCode(ctx))
    -    val argString = argGen.map(_.value).mkString(", ")
    +    val argIsNulls = ctx.freshName("argIsNulls")
    +    ctx.addMutableState("boolean[]", argIsNulls, "")
    +    val argTypes = arguments.map(e => ctx.javaType(e.dataType))
    +    val argValues = arguments.zipWithIndex.map { case (e, i) =>
    +      val argValue = ctx.freshName("argValue")
    +      ctx.addMutableState(argTypes(i), argValue, "")
    +      argValue
    +    }
    +
    +    val argCodes = arguments.zipWithIndex.map { case (e, i) =>
    +      val expr = e.genCode(ctx)
    +      expr.code + s"""
    +       $argIsNulls[$i] = ${expr.isNull};
    +       ${argValues(i)} = ${expr.value};
    +     """
    +    }
    +    val argCode = ctx.splitExpressions(ctx.INPUT_ROW, argCodes)
     
         val outer = outerPointer.map(func => Literal.fromObject(func()).genCode(ctx))
     
         var isNull = ev.isNull
         val setIsNull = if (propagateNull && arguments.nonEmpty) {
    -      s"final boolean $isNull = ${argGen.map(_.isNull).mkString(" || ")};"
    +      if (arguments.length <= 10) {
    --- End diff --
    
    any particular reason to pick `10`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220815424
  
    The fallback approach doesn't look that simple and clean, can you try split the generated code like we did in `CreateExternalRow`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-221158001
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-221083826
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59149/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15258][SQL] Nested/Chained case stateme...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220792581
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59076/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13243#discussion_r64266488
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameComplexTypeSuite.scala ---
    @@ -58,4 +58,38 @@ class DataFrameComplexTypeSuite extends QueryTest with SharedSQLContext {
         val nullIntRow = df.selectExpr("i[1]").collect()(0)
         assert(nullIntRow == org.apache.spark.sql.Row(null))
       }
    +
    +  test("SPARK-15285 Generated SpecificSafeProjection.apply method grows beyond 64KB") {
    +    val ds100_5 = Seq(S100_5()).toDS()
    +    ds100_5.show
    --- End diff --
    
    To use ``collect`` cannot reproduce this bug. I think that to use ``show`` involves code generation for ```Projection```.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220835818
  
    **[Test build #59109 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59109/consoleFull)** for PR 13243 at commit [`5bdfaa7`](https://github.com/apache/spark/commit/5bdfaa73f8d458b9994dd1a11b535d9413674484).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13243#discussion_r64245614
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameComplexTypeSuite.scala ---
    @@ -58,4 +58,38 @@ class DataFrameComplexTypeSuite extends QueryTest with SharedSQLContext {
         val nullIntRow = df.selectExpr("i[1]").collect()(0)
         assert(nullIntRow == org.apache.spark.sql.Row(null))
       }
    +
    +  test("SPARK-15285 Generated SpecificSafeProjection.apply method grows beyond 64KB") {
    +    val ds100_5 = Seq(S100_5()).toDS()
    +    ds100_5.show
    +  }
     }
    +
    +case class S100(
    +  s1: String = "1", s2: String = "2", s3: String = "3", s4: String = "4",
    +  s5: String = "5", s6: String = "6", s7: String = "7", s8: String = "8",
    +  s9: String = "9", s10: String = "10", s11: String = "11", s12: String = "12",
    +  s13: String = "13", s14: String = "14", s15: String = "15", s16: String = "16",
    +  s17: String = "17", s18: String = "18", s19: String = "19", s20: String = "20",
    +  s21: String = "21", s22: String = "22", s23: String = "23", s24: String = "24",
    +  s25: String = "25", s26: String = "26", s27: String = "27", s28: String = "28",
    +  s29: String = "29", s30: String = "30", s31: String = "31", s32: String = "32",
    +  s33: String = "33", s34: String = "34", s35: String = "35", s36: String = "36",
    +  s37: String = "37", s38: String = "38", s39: String = "39", s40: String = "40",
    +  s41: String = "41", s42: String = "42", s43: String = "43", s44: String = "44",
    +  s45: String = "45", s46: String = "46", s47: String = "47", s48: String = "48",
    +  s49: String = "49", s50: String = "50", s51: String = "51", s52: String = "52",
    +  s53: String = "53", s54: String = "54", s55: String = "55", s56: String = "56",
    +  s57: String = "57", s58: String = "58", s59: String = "59", s60: String = "60",
    +  s61: String = "61", s62: String = "62", s63: String = "63", s64: String = "64",
    +  s65: String = "65", s66: String = "66", s67: String = "67", s68: String = "68",
    +  s69: String = "69", s70: String = "70", s71: String = "71", s72: String = "72",
    +  s73: String = "73", s74: String = "74", s75: String = "75", s76: String = "76",
    +  s77: String = "77", s78: String = "78", s79: String = "79", s80: String = "80",
    +  s81: String = "81", s82: String = "82", s83: String = "83", s84: String = "84",
    +  s85: String = "85", s86: String = "86", s87: String = "87", s88: String = "88",
    +  s89: String = "89", s90: String = "90", s91: String = "91", s92: String = "92",
    +  s93: String = "93", s94: String = "94", s95: String = "95", s96: String = "96",
    +  s97: String = "97", s98: String = "98", s99: String = "99", s100: String = "100")
    +case class S100_5(
    --- End diff --
    
    add a blank line between these 2 classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13243#discussion_r64245550
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameComplexTypeSuite.scala ---
    @@ -58,4 +58,38 @@ class DataFrameComplexTypeSuite extends QueryTest with SharedSQLContext {
         val nullIntRow = df.selectExpr("i[1]").collect()(0)
         assert(nullIntRow == org.apache.spark.sql.Row(null))
       }
    +
    +  test("SPARK-15285 Generated SpecificSafeProjection.apply method grows beyond 64KB") {
    +    val ds100_5 = Seq(S100_5()).toDS()
    +    ds100_5.show
    --- End diff --
    
    we should not use `show` in test, can `collect` reproduce this bug?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13243: [SPARK-15285][SQL] Generated SpecificSafeProjecti...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13243#discussion_r74974149
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameComplexTypeSuite.scala ---
    @@ -58,4 +58,39 @@ class DataFrameComplexTypeSuite extends QueryTest with SharedSQLContext {
         val nullIntRow = df.selectExpr("i[1]").collect()(0)
         assert(nullIntRow == org.apache.spark.sql.Row(null))
       }
    +
    +  test("SPARK-15285 Generated SpecificSafeProjection.apply method grows beyond 64KB") {
    +    val ds100_5 = Seq(S100_5()).toDS()
    +    ds100_5.rdd.count
    +  }
     }
    +
    +case class S100(
    --- End diff --
    
    I will take this approach for running test suite without scala 2.10.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220840523
  
    **[Test build #59109 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59109/consoleFull)** for PR 13243 at commit [`5bdfaa7`](https://github.com/apache/spark/commit/5bdfaa73f8d458b9994dd1a11b535d9413674484).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220833378
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59098/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220952730
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220902485
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59122/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-221058608
  
    **[Test build #59149 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59149/consoleFull)** for PR 13243 at commit [`6f5bda3`](https://github.com/apache/spark/commit/6f5bda30d2e24347501f02881132a734b0ed1d7c).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220902483
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13243#discussion_r64196573
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -232,27 +232,47 @@ case class NewInstance(
     
       override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
         val javaType = ctx.javaType(dataType)
    -    val argGen = arguments.map(_.genCode(ctx))
    -    val argString = argGen.map(_.value).mkString(", ")
    +    val argIsNulls = ctx.freshName("argIsNulls")
    +    ctx.addMutableState("boolean[]", argIsNulls,
    +      s"$argIsNulls = new boolean[${arguments.size}];")
    +    val argValues = arguments.zipWithIndex.map { case (e, i) =>
    +      val argValue = ctx.freshName("argValue")
    +      ctx.addMutableState(ctx.javaType(e.dataType), argValue, "")
    +      argValue
    +    }
    +
    +    val argCodes = arguments.zipWithIndex.map { case (e, i) =>
    +      val expr = e.genCode(ctx)
    +      expr.code + s"""
    +       $argIsNulls[$i] = ${expr.isNull};
    +       ${argValues(i)} = ${expr.value};
    +     """
    +    }
    +    val argCode = ctx.splitExpressions(ctx.INPUT_ROW, argCodes)
     
         val outer = outerPointer.map(func => Literal.fromObject(func()).genCode(ctx))
     
         var isNull = ev.isNull
         val setIsNull = if (propagateNull && arguments.nonEmpty) {
    -      s"final boolean $isNull = ${argGen.map(_.isNull).mkString(" || ")};"
    +      s"""
    +       boolean $isNull = false;
    +       for (int idx = 0; idx < ${arguments.length}; idx++) {
    --- End diff --
    
    I was afraid of increasing code size. However, in the case of 1 or 2 arguments, the code size of generating argument value are relatively small than the cases with large number of arguments. As a result, code size here would not be an issue.
    Thus, the latest version always uses for-loop.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13243#discussion_r64318271
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameComplexTypeSuite.scala ---
    @@ -58,4 +58,38 @@ class DataFrameComplexTypeSuite extends QueryTest with SharedSQLContext {
         val nullIntRow = df.selectExpr("i[1]").collect()(0)
         assert(nullIntRow == org.apache.spark.sql.Row(null))
       }
    +
    +  test("SPARK-15285 Generated SpecificSafeProjection.apply method grows beyond 64KB") {
    +    val ds100_5 = Seq(S100_5()).toDS()
    +    ds100_5.show
    --- End diff --
    
    Again, I conformed that to call ```collect``` cannot reproduce this bug in the latest master branch. However, I confirmed that to call ```ds.rdd.collect``` can reproduce this bug. The latest code uses ```ds.rdd.collect```.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15258][SQL] Nested/Chained case stateme...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220792515
  
    **[Test build #59076 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59076/consoleFull)** for PR 13243 at commit [`e32a237`](https://github.com/apache/spark/commit/e32a2371b34dc1de8b226b1d58ca76a12a5e580e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13243#discussion_r64162830
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -232,27 +232,55 @@ case class NewInstance(
     
       override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
         val javaType = ctx.javaType(dataType)
    -    val argGen = arguments.map(_.genCode(ctx))
    -    val argString = argGen.map(_.value).mkString(", ")
    +    val argIsNulls = ctx.freshName("argIsNulls")
    +    ctx.addMutableState("boolean[]", argIsNulls, "")
    +    val argTypes = arguments.map(e => ctx.javaType(e.dataType))
    +    val argValues = arguments.zipWithIndex.map { case (e, i) =>
    +      val argValue = ctx.freshName("argValue")
    +      ctx.addMutableState(argTypes(i), argValue, "")
    +      argValue
    +    }
    +
    +    val argCodes = arguments.zipWithIndex.map { case (e, i) =>
    +      val expr = e.genCode(ctx)
    +      expr.code + s"""
    +       $argIsNulls[$i] = ${expr.isNull};
    +       ${argValues(i)} = ${expr.value};
    +     """
    +    }
    +    val argCode = ctx.splitExpressions(ctx.INPUT_ROW, argCodes)
     
         val outer = outerPointer.map(func => Literal.fromObject(func()).genCode(ctx))
     
         var isNull = ev.isNull
         val setIsNull = if (propagateNull && arguments.nonEmpty) {
    -      s"final boolean $isNull = ${argGen.map(_.isNull).mkString(" || ")};"
    +      if (arguments.length <= 10) {
    --- End diff --
    
    BTW how useful is this optimization? What if we always use the for-loop version? It will be better if we can have a small benchmark here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/13243


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220883772
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220952732
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59126/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220833377
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-221165640
  
    Sorry we have to revert it as it breaks scala-2.10 build


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220823691
  
    **[Test build #59098 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59098/consoleFull)** for PR 13243 at commit [`7c35c12`](https://github.com/apache/spark/commit/7c35c12d7bf0709d1dfc1f62167aa5a82a1ac2a6).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220902329
  
    **[Test build #59122 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59122/consoleFull)** for PR 13243 at commit [`9cc4d41`](https://github.com/apache/spark/commit/9cc4d41ecb69c0b6fecf11da5683f1114d418e4d).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220883773
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59118/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15258][SQL] Nested/Chained case stateme...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220796188
  
    **[Test build #59077 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59077/consoleFull)** for PR 13243 at commit [`59b0a76`](https://github.com/apache/spark/commit/59b0a76a2dc2ed6484f005880001c273536088ae).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-221162353
  
    thanks, merging to master and 2.0!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13243#discussion_r64172857
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -232,27 +232,47 @@ case class NewInstance(
     
       override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
         val javaType = ctx.javaType(dataType)
    -    val argGen = arguments.map(_.genCode(ctx))
    -    val argString = argGen.map(_.value).mkString(", ")
    +    val argIsNulls = ctx.freshName("argIsNulls")
    +    ctx.addMutableState("boolean[]", argIsNulls,
    +      s"$argIsNulls = new boolean[${arguments.size}];")
    +    val argValues = arguments.zipWithIndex.map { case (e, i) =>
    +      val argValue = ctx.freshName("argValue")
    +      ctx.addMutableState(ctx.javaType(e.dataType), argValue, "")
    +      argValue
    +    }
    +
    +    val argCodes = arguments.zipWithIndex.map { case (e, i) =>
    +      val expr = e.genCode(ctx)
    +      expr.code + s"""
    +       $argIsNulls[$i] = ${expr.isNull};
    +       ${argValues(i)} = ${expr.value};
    +     """
    +    }
    +    val argCode = ctx.splitExpressions(ctx.INPUT_ROW, argCodes)
     
         val outer = outerPointer.map(func => Literal.fromObject(func()).genCode(ctx))
     
         var isNull = ev.isNull
         val setIsNull = if (propagateNull && arguments.nonEmpty) {
    -      s"final boolean $isNull = ${argGen.map(_.isNull).mkString(" || ")};"
    +      s"""
    +       boolean $isNull = false;
    +       for (int idx = 0; idx < ${arguments.length}; idx++) {
    --- End diff --
    
    Do you have some benchmark number for it? Especially for 1 or 2 arguments. If there is a lot of improvement, we can specialize the code when there are 1 or 2 arguments. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15258][SQL] Nested/Chained case stateme...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220796217
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59077/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15258][SQL] Nested/Chained case stateme...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220796216
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-221158003
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59173/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220952549
  
    **[Test build #59126 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59126/consoleFull)** for PR 13243 at commit [`6e730ca`](https://github.com/apache/spark/commit/6e730ca6215afca78c0d3d34bd07218962475aaa).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-221172065
  
    Can we just use DefinedByConstructorParams rather than using case classes?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13243#discussion_r64277601
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameComplexTypeSuite.scala ---
    @@ -58,4 +58,38 @@ class DataFrameComplexTypeSuite extends QueryTest with SharedSQLContext {
         val nullIntRow = df.selectExpr("i[1]").collect()(0)
         assert(nullIntRow == org.apache.spark.sql.Row(null))
       }
    +
    +  test("SPARK-15285 Generated SpecificSafeProjection.apply method grows beyond 64KB") {
    +    val ds100_5 = Seq(S100_5()).toDS()
    +    ds100_5.show
    --- End diff --
    
    hmm, this is weird, the main logic in `show` is calling `collect` to get the data, can you double check it? and how about `ds.rdd.collect`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-221330231
  
    `DefinedByConstructorParams` is a good idea


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-221147028
  
    **[Test build #59173 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59173/consoleFull)** for PR 13243 at commit [`b92ab8c`](https://github.com/apache/spark/commit/b92ab8cb8fd4d9aabda7486cf9d2d5dd7b98f0bf).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-221083534
  
    **[Test build #59149 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59149/consoleFull)** for PR 13243 at commit [`6f5bda3`](https://github.com/apache/spark/commit/6f5bda30d2e24347501f02881132a734b0ed1d7c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13243#discussion_r64327921
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameComplexTypeSuite.scala ---
    @@ -58,4 +58,39 @@ class DataFrameComplexTypeSuite extends QueryTest with SharedSQLContext {
         val nullIntRow = df.selectExpr("i[1]").collect()(0)
         assert(nullIntRow == org.apache.spark.sql.Row(null))
       }
    +
    +  test("SPARK-15285 Generated SpecificSafeProjection.apply method grows beyond 64KB") {
    +    val ds100_5 = Seq(S100_5()).toDS()
    +    ds100_5.rdd.count
    +  }
     }
    +
    +case class S100(
    --- End diff --
    
    scala 2.10 doesn't support large case class. We can create a new test suite under `scala-2.11/src/test` and put this test there, so that we only run it under scala 2.10. `repl` module is a good example about it. cc @kiszk do you mind resend this PR with the fix?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13243#discussion_r64164066
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -232,27 +232,55 @@ case class NewInstance(
     
       override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
         val javaType = ctx.javaType(dataType)
    -    val argGen = arguments.map(_.genCode(ctx))
    -    val argString = argGen.map(_.value).mkString(", ")
    +    val argIsNulls = ctx.freshName("argIsNulls")
    +    ctx.addMutableState("boolean[]", argIsNulls, "")
    +    val argTypes = arguments.map(e => ctx.javaType(e.dataType))
    +    val argValues = arguments.zipWithIndex.map { case (e, i) =>
    +      val argValue = ctx.freshName("argValue")
    +      ctx.addMutableState(argTypes(i), argValue, "")
    +      argValue
    +    }
    +
    +    val argCodes = arguments.zipWithIndex.map { case (e, i) =>
    +      val expr = e.genCode(ctx)
    +      expr.code + s"""
    +       $argIsNulls[$i] = ${expr.isNull};
    +       ${argValues(i)} = ${expr.value};
    +     """
    +    }
    +    val argCode = ctx.splitExpressions(ctx.INPUT_ROW, argCodes)
     
         val outer = outerPointer.map(func => Literal.fromObject(func()).genCode(ctx))
     
         var isNull = ev.isNull
         val setIsNull = if (propagateNull && arguments.nonEmpty) {
    -      s"final boolean $isNull = ${argGen.map(_.isNull).mkString(" || ")};"
    +      if (arguments.length <= 10) {
    --- End diff --
    
    Sure. But, do we use the for-loop version for 1 or 2? For 1, clearly, we do not need a loop. For 2, we always exit a loop only for executing one iteration.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15258][SQL] Nested/Chained case stateme...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220792578
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220883664
  
    **[Test build #59118 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59118/consoleFull)** for PR 13243 at commit [`74d8764`](https://github.com/apache/spark/commit/74d876453cf853a97884ff195616e909e5873a2c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15258][SQL] Nested/Chained case stateme...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220792577
  
    **[Test build #59076 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59076/consoleFull)** for PR 13243 at commit [`e32a237`](https://github.com/apache/spark/commit/e32a2371b34dc1de8b226b1d58ca76a12a5e580e).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220844617
  
    mostly LGTM except some minor comments, thanks for working on it!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15258][SQL] Nested/Chained case stateme...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220792969
  
    **[Test build #59077 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59077/consoleFull)** for PR 13243 at commit [`59b0a76`](https://github.com/apache/spark/commit/59b0a76a2dc2ed6484f005880001c273536088ae).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13243#discussion_r64163700
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -232,27 +232,55 @@ case class NewInstance(
     
       override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
         val javaType = ctx.javaType(dataType)
    -    val argGen = arguments.map(_.genCode(ctx))
    -    val argString = argGen.map(_.value).mkString(", ")
    +    val argIsNulls = ctx.freshName("argIsNulls")
    +    ctx.addMutableState("boolean[]", argIsNulls, "")
    +    val argTypes = arguments.map(e => ctx.javaType(e.dataType))
    +    val argValues = arguments.zipWithIndex.map { case (e, i) =>
    +      val argValue = ctx.freshName("argValue")
    +      ctx.addMutableState(argTypes(i), argValue, "")
    --- End diff --
    
    Sure, I did.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-221083825
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220876025
  
    **[Test build #59118 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59118/consoleFull)** for PR 13243 at commit [`74d8764`](https://github.com/apache/spark/commit/74d876453cf853a97884ff195616e909e5873a2c).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13243#discussion_r64172891
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -232,27 +232,47 @@ case class NewInstance(
     
       override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
         val javaType = ctx.javaType(dataType)
    -    val argGen = arguments.map(_.genCode(ctx))
    -    val argString = argGen.map(_.value).mkString(", ")
    +    val argIsNulls = ctx.freshName("argIsNulls")
    +    ctx.addMutableState("boolean[]", argIsNulls,
    +      s"$argIsNulls = new boolean[${arguments.size}];")
    +    val argValues = arguments.zipWithIndex.map { case (e, i) =>
    +      val argValue = ctx.freshName("argValue")
    +      ctx.addMutableState(ctx.javaType(e.dataType), argValue, "")
    +      argValue
    +    }
    +
    +    val argCodes = arguments.zipWithIndex.map { case (e, i) =>
    +      val expr = e.genCode(ctx)
    +      expr.code + s"""
    +       $argIsNulls[$i] = ${expr.isNull};
    +       ${argValues(i)} = ${expr.value};
    +     """
    +    }
    +    val argCode = ctx.splitExpressions(ctx.INPUT_ROW, argCodes)
     
         val outer = outerPointer.map(func => Literal.fromObject(func()).genCode(ctx))
     
         var isNull = ev.isNull
         val setIsNull = if (propagateNull && arguments.nonEmpty) {
    -      s"final boolean $isNull = ${argGen.map(_.isNull).mkString(" || ")};"
    +      s"""
    +       boolean $isNull = false;
    +       for (int idx = 0; idx < ${arguments.length}; idx++) {
    +         if ($argIsNulls[idx]) { $isNull = true; break; }
    +       }
    +      """
         } else {
           isNull = "false"
           ""
         }
     
         val constructorCall = outer.map { gen =>
    -      s"""${gen.value}.new ${cls.getSimpleName}($argString)"""
    +      s"""${gen.value}.new ${cls.getSimpleName}(${argValues.mkString(", ")})"""
         }.getOrElse {
    -      s"new $className($argString)"
    +      s"new $className(${argValues.mkString(", ")})"
         }
     
         val code = s"""
    -      ${argGen.map(_.code).mkString("\n")}
    +      ${argCode}
    --- End diff --
    
    nit: `$argCode`, no need of brace. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13243#discussion_r64162262
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -232,27 +232,55 @@ case class NewInstance(
     
       override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
         val javaType = ctx.javaType(dataType)
    -    val argGen = arguments.map(_.genCode(ctx))
    -    val argString = argGen.map(_.value).mkString(", ")
    +    val argIsNulls = ctx.freshName("argIsNulls")
    +    ctx.addMutableState("boolean[]", argIsNulls, "")
    +    val argTypes = arguments.map(e => ctx.javaType(e.dataType))
    +    val argValues = arguments.zipWithIndex.map { case (e, i) =>
    +      val argValue = ctx.freshName("argValue")
    +      ctx.addMutableState(argTypes(i), argValue, "")
    +      argValue
    +    }
    +
    +    val argCodes = arguments.zipWithIndex.map { case (e, i) =>
    +      val expr = e.genCode(ctx)
    +      expr.code + s"""
    +       $argIsNulls[$i] = ${expr.isNull};
    +       ${argValues(i)} = ${expr.value};
    +     """
    +    }
    +    val argCode = ctx.splitExpressions(ctx.INPUT_ROW, argCodes)
     
         val outer = outerPointer.map(func => Literal.fromObject(func()).genCode(ctx))
     
         var isNull = ev.isNull
         val setIsNull = if (propagateNull && arguments.nonEmpty) {
    -      s"final boolean $isNull = ${argGen.map(_.isNull).mkString(" || ")};"
    +      if (arguments.length <= 10) {
    --- End diff --
    
    I have no particular reason for pick ```10```. ```100``` seems to be too long.
    Do you have any suggestion to pick the value?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220890649
  
    **[Test build #59122 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59122/consoleFull)** for PR 13243 at commit [`9cc4d41`](https://github.com/apache/spark/commit/9cc4d41ecb69c0b6fecf11da5683f1114d418e4d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13243#discussion_r64196297
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -232,27 +232,47 @@ case class NewInstance(
     
       override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
         val javaType = ctx.javaType(dataType)
    -    val argGen = arguments.map(_.genCode(ctx))
    -    val argString = argGen.map(_.value).mkString(", ")
    +    val argIsNulls = ctx.freshName("argIsNulls")
    +    ctx.addMutableState("boolean[]", argIsNulls,
    +      s"$argIsNulls = new boolean[${arguments.size}];")
    +    val argValues = arguments.zipWithIndex.map { case (e, i) =>
    +      val argValue = ctx.freshName("argValue")
    +      ctx.addMutableState(ctx.javaType(e.dataType), argValue, "")
    +      argValue
    +    }
    +
    +    val argCodes = arguments.zipWithIndex.map { case (e, i) =>
    +      val expr = e.genCode(ctx)
    +      expr.code + s"""
    +       $argIsNulls[$i] = ${expr.isNull};
    +       ${argValues(i)} = ${expr.value};
    +     """
    +    }
    +    val argCode = ctx.splitExpressions(ctx.INPUT_ROW, argCodes)
     
         val outer = outerPointer.map(func => Literal.fromObject(func()).genCode(ctx))
     
         var isNull = ev.isNull
         val setIsNull = if (propagateNull && arguments.nonEmpty) {
    -      s"final boolean $isNull = ${argGen.map(_.isNull).mkString(" || ")};"
    +      s"""
    +       boolean $isNull = false;
    +       for (int idx = 0; idx < ${arguments.length}; idx++) {
    +         if ($argIsNulls[idx]) { $isNull = true; break; }
    +       }
    +      """
         } else {
           isNull = "false"
           ""
         }
     
         val constructorCall = outer.map { gen =>
    -      s"""${gen.value}.new ${cls.getSimpleName}($argString)"""
    +      s"""${gen.value}.new ${cls.getSimpleName}(${argValues.mkString(", ")})"""
         }.getOrElse {
    -      s"new $className($argString)"
    +      s"new $className(${argValues.mkString(", ")})"
         }
     
         val code = s"""
    -      ${argGen.map(_.code).mkString("\n")}
    +      ${argCode}
    --- End diff --
    
    Good catch, thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220840619
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59109/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13243#discussion_r64153124
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -232,27 +232,55 @@ case class NewInstance(
     
       override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
         val javaType = ctx.javaType(dataType)
    -    val argGen = arguments.map(_.genCode(ctx))
    -    val argString = argGen.map(_.value).mkString(", ")
    +    val argIsNulls = ctx.freshName("argIsNulls")
    +    ctx.addMutableState("boolean[]", argIsNulls, "")
    +    val argTypes = arguments.map(e => ctx.javaType(e.dataType))
    +    val argValues = arguments.zipWithIndex.map { case (e, i) =>
    +      val argValue = ctx.freshName("argValue")
    +      ctx.addMutableState(argTypes(i), argValue, "")
    --- End diff --
    
    nit: `ctx.addMutableState(ctx.javaType(e.dataType), argValue, "")`, then we don't need to define `argTypes`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13243#discussion_r64153183
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -232,27 +232,55 @@ case class NewInstance(
     
       override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
         val javaType = ctx.javaType(dataType)
    -    val argGen = arguments.map(_.genCode(ctx))
    -    val argString = argGen.map(_.value).mkString(", ")
    +    val argIsNulls = ctx.freshName("argIsNulls")
    +    ctx.addMutableState("boolean[]", argIsNulls, "")
    +    val argTypes = arguments.map(e => ctx.javaType(e.dataType))
    +    val argValues = arguments.zipWithIndex.map { case (e, i) =>
    +      val argValue = ctx.freshName("argValue")
    +      ctx.addMutableState(argTypes(i), argValue, "")
    +      argValue
    +    }
    +
    +    val argCodes = arguments.zipWithIndex.map { case (e, i) =>
    +      val expr = e.genCode(ctx)
    +      expr.code + s"""
    +       $argIsNulls[$i] = ${expr.isNull};
    +       ${argValues(i)} = ${expr.value};
    +     """
    +    }
    +    val argCode = ctx.splitExpressions(ctx.INPUT_ROW, argCodes)
     
         val outer = outerPointer.map(func => Literal.fromObject(func()).genCode(ctx))
     
         var isNull = ev.isNull
         val setIsNull = if (propagateNull && arguments.nonEmpty) {
    -      s"final boolean $isNull = ${argGen.map(_.isNull).mkString(" || ")};"
    +      if (arguments.length <= 10) {
    +        val argIsNull = arguments.zipWithIndex.map { case (e, i) =>
    +          s"$argIsNulls[$i]"
    +        }
    +        s"final boolean $isNull = ${argIsNull.mkString(" || ")};"
    +      } else {
    +        s"""
    +         boolean $isNull = false;
    +         for (int idx = 0; idx < ${arguments.length}; idx++) {
    +           if ($argIsNulls[idx]) { $isNull = true; break; }
    +         }
    +       """
    +      }
         } else {
           isNull = "false"
           ""
         }
     
         val constructorCall = outer.map { gen =>
    -      s"""${gen.value}.new ${cls.getSimpleName}($argString)"""
    +      s"""${gen.value}.new ${cls.getSimpleName}(${argValues.mkString(", ")})"""
         }.getOrElse {
    -      s"new $className($argString)"
    +      s"new $className(${argValues.mkString(", ")})"
         }
     
         val code = s"""
    -      ${argGen.map(_.code).mkString("\n")}
    +      $argIsNulls = new boolean[${arguments.size}];
    --- End diff --
    
    do we need to create a new boolean array every round?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220840617
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220824933
  
    Let me see the approach, which you proposed, later.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220833313
  
    **[Test build #59098 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59098/consoleFull)** for PR 13243 at commit [`7c35c12`](https://github.com/apache/spark/commit/7c35c12d7bf0709d1dfc1f62167aa5a82a1ac2a6).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-220935411
  
    **[Test build #59126 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59126/consoleFull)** for PR 13243 at commit [`6e730ca`](https://github.com/apache/spark/commit/6e730ca6215afca78c0d3d34bd07218962475aaa).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13243#issuecomment-221157890
  
    **[Test build #59173 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59173/consoleFull)** for PR 13243 at commit [`b92ab8c`](https://github.com/apache/spark/commit/b92ab8cb8fd4d9aabda7486cf9d2d5dd7b98f0bf).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org