You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by Serhii-Harnyk <gi...@git.apache.org> on 2016/12/09 16:16:03 UTC

[GitHub] drill pull request #686: DRILL-5117: Compile error when query a json file wi...

GitHub user Serhii-Harnyk opened a pull request:

    https://github.com/apache/drill/pull/686

    DRILL-5117: Compile error when query a json file with 1000+columns

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Serhii-Harnyk/drill DRILL-5117

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/686.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #686
    
----
commit 00eaf30fd662530d8bd62059b85b0ad179768fdb
Author: Serhii-Harnyk <se...@gmail.com>
Date:   2016-12-08T20:08:34Z

    DRILL-5117: Compile error when query a json file with 1000+columns

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill issue #686: DRILL-5117: Compile error when query a json file with 1000...

Posted by Serhii-Harnyk <gi...@git.apache.org>.
Github user Serhii-Harnyk commented on the issue:

    https://github.com/apache/drill/pull/686
  
    @sudheeshkatkam, when running for example test testEXTERNAL_SORT(), generates class CopierGen4, in which methods doSetup() and doEval() does not splits correctly. But with this fix them both splits into the smaller methods.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #686: DRILL-5117: Compile error when query a json file wi...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/686#discussion_r92292702
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/compile/TestLargeFileCompilation.java ---
    @@ -154,4 +158,20 @@ public void testProject() throws Exception {
         testNoResult(ITERATION_COUNT, LARGE_QUERY_SELECT_LIST);
       }
     
    +  @Test
    +  public void testSelectAllFromFileWithManyColumns() throws Exception {
    +    File path = new File(BaseTestQuery.getTempDir("json/input"));
    --- End diff --
    
    You do not have to generate a new json file for writing a unit test. Not sure how the new created file is cleaned up if the testcase failed.
    
    You may consider using the existing tpch sample file to do unit test, or re-enable the originally ignored test (testTOP_N_SORT).
    
    ```java
      @Test
      public void testLargeListColWithLimit() throws Exception {
        final int nCol = 1000;
        final StringBuilder sb = new StringBuilder();
    
        sb.append(" select n_nationkey ");
        for (int i = 0; i< nCol; i++) {
          sb.append(", " + "Col" + i );
        }
        sb.append(" from cp.`tpch/nation.parquet`");
        sb.append(" limit 1");
    
        test(sb.toString());
      }
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill issue #686: DRILL-5117: Compile error when query a json file with 1000...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on the issue:

    https://github.com/apache/drill/pull/686
  
    I think the cause of DRILL-1808 is same as DRILL-5117. We should mark them as related or duplicated in the JIRA. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill issue #686: DRILL-5117: Compile error when query a json file with 1000...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on the issue:

    https://github.com/apache/drill/pull/686
  
    +1
    
    LGTM. Thanks for the PR. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #686: DRILL-5117: Compile error when query a json file wi...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/686#discussion_r92290434
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/SizedJBlock.java ---
    @@ -32,7 +32,11 @@
     
       public SizedJBlock(JBlock block) {
         this.block = block;
    -    this.count = 0;
    +    // count should be 1 because in some cases it is hard to increase it when
    +    // Logical Expressions are added to JBlock.
    +    // To avoid taking into account of some extra count from empty JBlock,
    --- End diff --
    
    I spent some time to understand why modifying this count makes the query compilation works.  
    
    The failing case encounter compilation failure in Copier (for SVR operator). Turns out that the idea of SizedJBlock (DRILL-4715) only works when we call ClassGenerator.addExpr(). This is fine with Project, Filter, Aggregator, etc, but not for Copier.  The Copier is doing kind of short-cut handling, by accessing the eval() and setup() directly [1].
    
    Ideally, we probably should try to see if we can convert Copier using same mechanism in Project/Filter. After some thoughts, I realized doing that might add additional overhead, as the current way is doing the copying directly. 
    
    Given that, I'm fine with this proposed change. Please add comments to explain why we set count = 1. 
    
    [1] https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/vector/CopyUtil.java#L45-L50  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #686: DRILL-5117: Compile error when query a json file wi...

Posted by Serhii-Harnyk <gi...@git.apache.org>.
Github user Serhii-Harnyk commented on a diff in the pull request:

    https://github.com/apache/drill/pull/686#discussion_r92452238
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/compile/TestLargeFileCompilation.java ---
    @@ -154,4 +158,20 @@ public void testProject() throws Exception {
         testNoResult(ITERATION_COUNT, LARGE_QUERY_SELECT_LIST);
       }
     
    +  @Test
    +  public void testSelectAllFromFileWithManyColumns() throws Exception {
    +    File path = new File(BaseTestQuery.getTempDir("json/input"));
    --- End diff --
    
    You are right, with this changes both tests testEXTERNAL_SORT and testTOP_N_SORT work. So I enabled them.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #686: DRILL-5117: Compile error when query a json file wi...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/686


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill issue #686: DRILL-5117: Compile error when query a json file with 1000...

Posted by Serhii-Harnyk <gi...@git.apache.org>.
Github user Serhii-Harnyk commented on the issue:

    https://github.com/apache/drill/pull/686
  
    @jinfengni, could you please review new changes? I squashed all changes into single commit and rebased into master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill issue #686: DRILL-5117: Compile error when query a json file with 1000...

Posted by sudheeshkatkam <gi...@git.apache.org>.
Github user sudheeshkatkam commented on the issue:

    https://github.com/apache/drill/pull/686
  
    @Serhii-Harnyk I am curious how this resolves DRILL-1808. Can you provide some details?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---