Posted to user@spark.apache.org by Efe Selcuk <ef...@gmail.com> on 2016/10/25 01:21:59 UTC

[Spark 2.0.1] Error in generated code, possible regression?

I have an application that works in 2.0.0 but has been dying at runtime on
the 2.0.1 distribution.

at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:893)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:950)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:947)
at org.spark_project.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
at org.spark_project.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
... 30 more
Caused by: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 74, Column 145: Unknown variable or type "value4"

The log also includes a massive 1800-line dump of the generated code (which
repeats over and over, even on one thread, which makes this a pain), but
fortunately the error occurs early, so I can give at least some context.

/* 001 */ public java.lang.Object generate(Object[] references) {
/* 002 */   return new SpecificMutableProjection(references);
/* 003 */ }
/* 004 */
/* 005 */ class SpecificMutableProjection extends org.apache.spark.sql.catalyst.expressions.codegen.BaseMutableProjection {
/* 006 */
/* 007 */   private Object[] references;
/* 008 */   private MutableRow mutableRow;
/* 009 */   private Object[] values;
... // many lines of class variables, mostly errMsg strings and Object[]
/* 071 */   private void apply2_7(InternalRow i) {
/* 072 */
/* 073 */     boolean isNull215 = false;
/* 074 */     final com.mypackage.MyThing value215 = isNull215 ? null : (com.mypackage.MyThing) value4._2();
/* 075 */     isNull215 = value215 == null;
/* 076 */
...

As you can see, line 74 references value4, but nothing called value4 has
been defined. I have no idea where to even begin looking for what caused
this, or even whether it's my fault or a bug in the code generation. Any
help is appreciated.

Efe
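
On the question of where to begin looking: since the compiler message names
a line in the dumped code, one possible first step is to pull just that line
and its neighbours out of the log. The following is a minimal sketch under
stated assumptions, not a Spark facility. The class GeneratedCodeContext and
its methods errorLine and context are hypothetical names made up for
illustration; the only things taken from the post above are the message
format ("Line 74, Column 145") and the "/* 074 */" prefix on each dumped line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Spark: finds the failing line in a codegen dump.
public class GeneratedCodeContext {

    // Location inside a message such as:
    //   File 'generated.java', Line 74, Column 145: Unknown variable or type "value4"
    private static final Pattern LOCATION = Pattern.compile("Line (\\d+), Column (\\d+)");

    // The "/* 074 */" prefix in front of each dumped line of generated code.
    private static final Pattern NUMBERED = Pattern.compile("/\\* (\\d+) \\*/.*");

    // Returns the line number named in the compiler message, or -1 if there is none.
    public static int errorLine(String compileMessage) {
        Matcher m = LOCATION.matcher(compileMessage);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    // Returns the dumped lines within `radius` of `target`, prefixing the target with ">> ".
    // Stops after the first copy, because the dump may be repeated in the log.
    public static List<String> context(List<String> log, int target, int radius) {
        List<String> out = new ArrayList<>();
        for (String line : log) {
            Matcher m = NUMBERED.matcher(line);
            if (!m.matches()) {
                continue; // not a numbered line of generated code
            }
            int n = Integer.parseInt(m.group(1));
            if (n > target + radius && !out.isEmpty()) {
                break; // past the window in the first copy of the dump
            }
            if (Math.abs(n - target) <= radius) {
                out.add((n == target ? ">> " : "   ") + line);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String message = "File 'generated.java', Line 74, Column 145: "
                + "Unknown variable or type \"value4\"";
        List<String> log = Arrays.asList(
                "/* 071 */   private void apply2_7(InternalRow i) {",
                "/* 072 */",
                "/* 073 */     boolean isNull215 = false;",
                "/* 074 */     final com.mypackage.MyThing value215 = isNull215 ? null : "
                        + "(com.mypackage.MyThing) value4._2();",
                "/* 075 */     isNull215 = value215 == null;",
                "/* 076 */");
        for (String line : context(log, errorLine(message), 2)) {
            System.out.println(line);
        }
    }
}
```

This only narrows the dump down to the failing line. It does not say which
query expression produced that line, which is the part that still needs the
full generated code or a reproduction.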

Re: [Spark 2.0.1] Error in generated code, possible regression?

Posted by Michael Armbrust <mi...@databricks.com>.
I think that there should be comments that show the expressions that are
getting compiled.  Maybe make a gist with the whole generated code fragment?

On Wed, Oct 26, 2016 at 3:45 PM, Efe Selcuk <ef...@gmail.com> wrote:

> I do plan to do that, Michael. Do you happen to know of any guidelines for
> tracking down the context of this generated code?
>
> On Wed, Oct 26, 2016 at 3:42 PM Michael Armbrust <mi...@databricks.com>
> wrote:
>
>> If you have a reproduction you can post for this, it would be great if
>> you could open a JIRA.

Re: [Spark 2.0.1] Error in generated code, possible regression?

Posted by Efe Selcuk <ef...@gmail.com>.
I do plan to do that, Michael. Do you happen to know of any guidelines for
tracking down the context of this generated code?

On Wed, Oct 26, 2016 at 3:42 PM Michael Armbrust <mi...@databricks.com>
wrote:

> If you have a reproduction you can post for this, it would be great if you
> could open a JIRA.

Re: [Spark 2.0.1] Error in generated code, possible regression?

Posted by Michael Armbrust <mi...@databricks.com>.
If you have a reproduction you can post for this, it would be great if you
could open a JIRA.

Re: [Spark 2.0.1] Error in generated code, possible regression?

Posted by Efe Selcuk <ef...@gmail.com>.
I'd like to do that, but are there any guidelines for tracking down the
context of the generated code?

On Mon, Oct 24, 2016 at 11:44 PM Kazuaki Ishizaki <IS...@jp.ibm.com>
wrote:

> Do you have a smaller program that can reproduce the same error? If you
> could also create a JIRA entry, that would be great.
>
> Kazuaki Ishizaki


Re: [Spark 2.0.1] Error in generated code, possible regression?

Posted by Kazuaki Ishizaki <IS...@jp.ibm.com>.
Do you have a smaller program that can reproduce the same error? If you
could also create a JIRA entry, that would be great.

Kazuaki Ishizaki


