You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by vvysotskyi <gi...@git.apache.org> on 2017/04/24 16:50:36 UTC

[GitHub] drill pull request #818: DRILL-5140: Fix CompileException in run-time genera...

GitHub user vvysotskyi opened a pull request:

    https://github.com/apache/drill/pull/818

    DRILL-5140: Fix CompileException in run-time generated code when record batch has\u2026

    \u2026rd batch has large number of fields.
    
    Implement recursive splitting of run-time generated code to the inner classes when class members number is close to the class constant pool size.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vvysotskyi/drill DRILL-5140

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/818.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #818
    
----
commit 2dd81eb89d56f3dbfc86037a5fd8d6bfe026dd45
Author: Volodymyr Vysotskyi <vv...@gmail.com>
Date:   2017-04-12T16:07:39Z

    DRILL-5140: Fix CompileException in run-time generated code when record batch has large number of fields.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #818: DRILL-5140: Fix CompileException in run-time genera...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/818#discussion_r113831429
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/compile/ClassTransformer.java ---
    @@ -264,7 +265,13 @@ public boolean equals(Object obj) {
             final ClassNames nextPrecompiled = nextSet.precompiled;
             final byte[] precompiledBytes = byteCodeLoader.getClassByteCodeFromPath(nextPrecompiled.clazz);
             final ClassNames nextGenerated = nextSet.generated;
    -        final ClassNode generatedNode = classesToMerge.get(nextGenerated.slash);
    +        Pair<byte[], ClassNode> classNodePair = classesToMerge.remove(nextGenerated.slash);
    +        final ClassNode generatedNode;
    +        if (classNodePair != null) {
    +          generatedNode = classNodePair.getValue();
    +        } else {
    +          generatedNode = null;
    --- End diff --
    
    What does it mean by "generatedNode = null"? Will it cause problem in subsequent call at line 288?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #818: DRILL-5140: Fix CompileException in run-time genera...

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/818#discussion_r114600979
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/compile/ClassTransformer.java ---
    @@ -309,6 +316,9 @@ public boolean equals(Object obj) {
             namesCompleted.add(nextSet);
           }
     
    +      for (Map.Entry<String, Pair<byte[], ClassNode>> clazz : classesToMerge.entrySet()) {
    +        classLoader.injectByteCode(clazz.getKey().replace(FileUtils.separatorChar, '.'), clazz.getValue().getKey());
    --- End diff --
    
    Added.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #818: DRILL-5140: Fix CompileException in run-time genera...

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/818#discussion_r114600961
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/compile/ClassTransformer.java ---
    @@ -264,7 +265,13 @@ public boolean equals(Object obj) {
             final ClassNames nextPrecompiled = nextSet.precompiled;
             final byte[] precompiledBytes = byteCodeLoader.getClassByteCodeFromPath(nextPrecompiled.clazz);
             final ClassNames nextGenerated = nextSet.generated;
    -        final ClassNode generatedNode = classesToMerge.get(nextGenerated.slash);
    +        Pair<byte[], ClassNode> classNodePair = classesToMerge.remove(nextGenerated.slash);
    +        final ClassNode generatedNode;
    +        if (classNodePair != null) {
    +          generatedNode = classNodePair.getValue();
    +        } else {
    +          generatedNode = null;
    --- End diff --
    
    It is needed for the case when generated class was empty. Then `classesToMerge` would be empty too and `classesToMerge.remove(nextGenerated.slash)` would return `null`.
    No, it wouldn't. In the method `MergeAdapter.getMergedClass()` there are a many checks for this value.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #818: DRILL-5140: Fix CompileException in run-time genera...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/818#discussion_r117573874
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/ClassGenerator.java ---
    @@ -77,10 +81,43 @@
       private final CodeGenerator<T> codeGenerator;
     
       public final JDefinedClass clazz;
    -  private final LinkedList<SizedJBlock>[] blocks;
    +
       private final JCodeModel model;
       private final OptionSet optionManager;
     
    +  private ClassGenerator<T> innerClassGenerator;
    +  private LinkedList<SizedJBlock>[] blocks;
    +  private LinkedList<SizedJBlock>[] oldBlocks;
    +
    +  /**
    +   * Assumed that field has 3 indexes within the constant pull: index of the CONSTANT_Fieldref_info +
    --- End diff --
    
    I'm not entirely sure the calculation is correct, in terms of # of entries per field in constant pool of a class. 
    
    Per JVM spec, each class field has CONSTANT_Fieldref_info (1 entry), which has class_index and name_and_type_index. The class_index points CONSTANT_Class_info, which is shared by across all the class fields. The second points to CONSTANT_NameAndType_info (1 entry), which points to name (1 entry) and descriptor (1 entry). Therefore, for each class field, at least 4 entries are required in constant pool.  Similarly, we could get 4 entries for each method.
    
    Besides fields and methods, we also have to take constant literal into account, like int, float , string ... constant. For constant literals, since we apply source-code copy for build-in-function /udf, it's hard to figure out exactly how many constants are used in the generated class.
    
    Given the above reasons, I'm not sure whether it makes sense to try to come up with a formula to estimate the maximum # of fields a  generated class could have. If the estimation is not accurate, then what if we just provides a ballpark estimation and put some 'magic' number here?   


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #818: DRILL-5140: Fix CompileException in run-time genera...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/818#discussion_r113831292
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/compile/ClassTransformer.java ---
    @@ -264,7 +265,13 @@ public boolean equals(Object obj) {
             final ClassNames nextPrecompiled = nextSet.precompiled;
             final byte[] precompiledBytes = byteCodeLoader.getClassByteCodeFromPath(nextPrecompiled.clazz);
             final ClassNames nextGenerated = nextSet.generated;
    -        final ClassNode generatedNode = classesToMerge.get(nextGenerated.slash);
    +        Pair<byte[], ClassNode> classNodePair = classesToMerge.remove(nextGenerated.slash);
    +        final ClassNode generatedNode;
    --- End diff --
    
    Can you add comments to explain why you are calling remove() on classesToMerge? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill issue #818: DRILL-5140: Fix CompileException in run-time generated cod...

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on the issue:

    https://github.com/apache/drill/pull/818
  
    @paul-rogers Yes, I tested this fix with turned on the "plain Java" mode. Debugging generated code using an IDE considerably helped me. Thanks for offering assistance.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #818: DRILL-5140: Fix CompileException in run-time genera...

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/818#discussion_r117708028
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/ClassGenerator.java ---
    @@ -77,10 +81,43 @@
       private final CodeGenerator<T> codeGenerator;
     
       public final JDefinedClass clazz;
    -  private final LinkedList<SizedJBlock>[] blocks;
    +
       private final JCodeModel model;
       private final OptionSet optionManager;
     
    +  private ClassGenerator<T> innerClassGenerator;
    +  private LinkedList<SizedJBlock>[] blocks;
    +  private LinkedList<SizedJBlock>[] oldBlocks;
    +
    +  /**
    +   * Assumed that field has 3 indexes within the constant pull: index of the CONSTANT_Fieldref_info +
    --- End diff --
    
    In this calculation is taken into account that CONSTANT_NameAndType_info.descriptor has limited range of its values, so it was taken 3 entries for the each class field and method.
    
    In this formula supposed that each class field and local variable use different literal values that have two entries. I am agree with you that there may be cases that have not been covered by this formula.
    
    The formula is needed for at least to consider the number of generated methods, difference between entries count for class fields and local variables. The 'magic' number 1000 was used in this formula to reserve constant pool for class references and unaccounted cases.
     


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #818: DRILL-5140: Fix CompileException in run-time genera...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/818#discussion_r113825753
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/ClassGenerator.java ---
    @@ -66,7 +68,17 @@
       public static final GeneratorMapping DEFAULT_CONSTANT_MAP = GM("doSetup", "doSetup", null, null);
     
       static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(ClassGenerator.class);
    -  public static enum BlockType {SETUP, EVAL, RESET, CLEANUP};
    +
    +  /**
    +   * Field has 2 indexes within the constant pull: field item + name and type item.
    --- End diff --
    
    I could not make sense how you calculate this number of 26767. According to JVM spec [1], the constant pool table consists of 2 or more bytes "info" section for each item. In some case like class name, one item could use 2 bytes, while in other cases like field, method, one item could use 4 bytes. Integer uses 4 bytes, and Long constants uses 8 bytes. 
    
    That is, if I have a class with fields < 26767, we may still hit the complain of "too many constants".    
     
    
    1. http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.html#jvms-4.4


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #818: DRILL-5140: Fix CompileException in run-time genera...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/818


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill issue #818: DRILL-5140: Fix CompileException in run-time generated cod...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/818
  
    Was this improvement tested with the "plain Java" mode turned on? Most of the work is in the byte-code fixup, but the generated Java must be validated that it works when compiled directly. Let me know if you need assistance.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill issue #818: DRILL-5140: Fix CompileException in run-time generated cod...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on the issue:

    https://github.com/apache/drill/pull/818
  
    +1



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #818: DRILL-5140: Fix CompileException in run-time genera...

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/818#discussion_r114600611
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/compile/ClassTransformer.java ---
    @@ -264,7 +265,13 @@ public boolean equals(Object obj) {
             final ClassNames nextPrecompiled = nextSet.precompiled;
             final byte[] precompiledBytes = byteCodeLoader.getClassByteCodeFromPath(nextPrecompiled.clazz);
             final ClassNames nextGenerated = nextSet.generated;
    -        final ClassNode generatedNode = classesToMerge.get(nextGenerated.slash);
    +        Pair<byte[], ClassNode> classNodePair = classesToMerge.remove(nextGenerated.slash);
    +        final ClassNode generatedNode;
    --- End diff --
    
    Done.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #818: DRILL-5140: Fix CompileException in run-time genera...

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/818#discussion_r114603169
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/ClassGenerator.java ---
    @@ -66,7 +68,17 @@
       public static final GeneratorMapping DEFAULT_CONSTANT_MAP = GM("doSetup", "doSetup", null, null);
     
       static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(ClassGenerator.class);
    -  public static enum BlockType {SETUP, EVAL, RESET, CLEANUP};
    +
    +  /**
    +   * Field has 2 indexes within the constant pull: field item + name and type item.
    --- End diff --
    
    I missed the third index. Fixed estimation of max index value, thanks. Also added comment that explains the way, how it is calculating.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #818: DRILL-5140: Fix CompileException in run-time genera...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/818#discussion_r118755301
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/ClassGenerator.java ---
    @@ -77,10 +81,43 @@
       private final CodeGenerator<T> codeGenerator;
     
       public final JDefinedClass clazz;
    -  private final LinkedList<SizedJBlock>[] blocks;
    +
       private final JCodeModel model;
       private final OptionSet optionManager;
     
    +  private ClassGenerator<T> innerClassGenerator;
    +  private LinkedList<SizedJBlock>[] blocks;
    +  private LinkedList<SizedJBlock>[] oldBlocks;
    +
    +  /**
    +   * Assumed that field has 3 indexes within the constant pull: index of the CONSTANT_Fieldref_info +
    --- End diff --
    
    Although the proposed formula might still hit problem in certain cases,  it seems to work fine for normal cases. Also, for normal queries over hundreds of columns, the run-time generated code remains same as before. Given that, it makes sense to me to merge this code change. 
     


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #818: DRILL-5140: Fix CompileException in run-time genera...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/818#discussion_r113831491
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/compile/ClassTransformer.java ---
    @@ -309,6 +316,9 @@ public boolean equals(Object obj) {
             namesCompleted.add(nextSet);
           }
     
    +      for (Map.Entry<String, Pair<byte[], ClassNode>> clazz : classesToMerge.entrySet()) {
    +        classLoader.injectByteCode(clazz.getKey().replace(FileUtils.separatorChar, '.'), clazz.getValue().getKey());
    --- End diff --
    
    Add comments? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---