You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by minji-kim <gi...@git.apache.org> on 2016/02/18 07:35:36 UTC

[GitHub] drill pull request: DRILL-4410: ListVector should initialize bits ...

GitHub user minji-kim opened a pull request:

    https://github.com/apache/drill/pull/380

    DRILL-4410: ListVector should initialize bits in allocateNew

    ListVector does not initialize bits, but various ValueVectors depend on bits being initialized.  ONe of the side effects of not having bits set is that for each batch, the ValueVector will reAlloc() with 2x the previous buffer size (regardless of whether it needs it or not).  If there are enough batches, this results in reAlloc() request of size > Integer.MAX, which triggers out-of-memory/max-buffer exception.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/minji-kim/drill DRILL-4410

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/380.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #380
    
----
commit ca4ca48482c5066a07cd2727853c5a73df334519
Author: Minji Kim <mi...@dremio.com>
Date:   2016-02-18T06:29:11Z

    DRILL-4410: ListVector should initialize bits in allocateNew

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-4410: ListVector should initialize bits ...

Posted by jaltekruse <gi...@git.apache.org>.
Github user jaltekruse commented on the pull request:

    https://github.com/apache/drill/pull/380#issuecomment-185568844
  
    Can you generate these files in the tests rather than check them in?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-4410: ListVector should initialize bits ...

Posted by minji-kim <gi...@git.apache.org>.
Github user minji-kim commented on a diff in the pull request:

    https://github.com/apache/drill/pull/380#discussion_r53359912
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/vector/complex/writer/TestComplexTypeReader.java ---
    @@ -241,4 +252,49 @@ public void testRepeatedJson() throws Exception {
                 .go();
       }
     
    +  @Test  // DRILL-4410
    +  // ListVector allocation
    +  public void test_array() throws Exception{
    +
    +    long numRecords = 100000;
    +    String file1 = "/tmp/" + TestComplexTypeReader.class.getName() + "arrays1.json";
    --- End diff --
    
    ParquetRecordReaderTest also uses "/tmp", so I think this should also work.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-4410: ListVector should initialize bits ...

Posted by minji-kim <gi...@git.apache.org>.
Github user minji-kim commented on the pull request:

    https://github.com/apache/drill/pull/380#issuecomment-186020856
  
    I made a change in the test to use the temporary directory, since that seems to be questionable.  Also I added another test in TestValueVector.  Both tests fail without the change in ListVector. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-4410: ListVector should initialize bits ...

Posted by minji-kim <gi...@git.apache.org>.
Github user minji-kim commented on a diff in the pull request:

    https://github.com/apache/drill/pull/380#discussion_r53366006
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/vector/complex/writer/TestComplexTypeReader.java ---
    @@ -241,4 +252,49 @@ public void testRepeatedJson() throws Exception {
                 .go();
       }
     
    +  @Test  // DRILL-4410
    +  // ListVector allocation
    +  public void test_array() throws Exception{
    +
    +    long numRecords = 100000;
    +    String file1 = "/tmp/" + TestComplexTypeReader.class.getName() + "arrays1.json";
    --- End diff --
    
    I think these tests all use /tmp. 
    
    https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/exec/impersonation/TestImpersonationMetadata.java#L64
    
    https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/TestDropTable.java#L166
    
    https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestWriter.java#L60


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-4410: ListVector should initialize bits ...

Posted by jaltekruse <gi...@git.apache.org>.
Github user jaltekruse commented on the pull request:

    https://github.com/apache/drill/pull/380#issuecomment-185825063
  
    Could you also add result verification? While there are some older tests that just run queries to verify that errors that previously occurred are gone, we have been enforcing that new tests since the test builder was added verify their results. You can use the test builder to add records in a loop to the expected result set.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-4410: ListVector should initialize bits ...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/380#discussion_r53362337
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/vector/complex/writer/TestComplexTypeReader.java ---
    @@ -241,4 +252,49 @@ public void testRepeatedJson() throws Exception {
                 .go();
       }
     
    +  @Test  // DRILL-4410
    +  // ListVector allocation
    +  public void test_array() throws Exception{
    +
    +    long numRecords = 100000;
    +    String file1 = "/tmp/" + TestComplexTypeReader.class.getName() + "arrays1.json";
    --- End diff --
    
    Seems ParquetRecordReaderTest is ignored?
    
    [1] https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/ParquetRecordReaderTest.java#L84


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-4410: ListVector should initialize bits ...

Posted by minji-kim <gi...@git.apache.org>.
Github user minji-kim commented on the pull request:

    https://github.com/apache/drill/pull/380#issuecomment-185577973
  
    That's a good point.  Just uploaded small patch to generate it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-4410: ListVector should initialize bits ...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/380#discussion_r53357932
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/vector/complex/writer/TestComplexTypeReader.java ---
    @@ -241,4 +252,49 @@ public void testRepeatedJson() throws Exception {
                 .go();
       }
     
    +  @Test  // DRILL-4410
    +  // ListVector allocation
    +  public void test_array() throws Exception{
    +
    +    long numRecords = 100000;
    +    String file1 = "/tmp/" + TestComplexTypeReader.class.getName() + "arrays1.json";
    --- End diff --
    
    Will this work correctly on windows environment? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-4410: ListVector should initialize bits ...

Posted by minji-kim <gi...@git.apache.org>.
Github user minji-kim commented on the pull request:

    https://github.com/apache/drill/pull/380#issuecomment-185846185
  
    Added the checks in the test.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-4410: ListVector should initialize bits ...

Posted by adeneche <gi...@git.apache.org>.
Github user adeneche commented on a diff in the pull request:

    https://github.com/apache/drill/pull/380#discussion_r53365438
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/vector/complex/writer/TestComplexTypeReader.java ---
    @@ -241,4 +252,49 @@ public void testRepeatedJson() throws Exception {
                 .go();
       }
     
    +  @Test  // DRILL-4410
    +  // ListVector allocation
    +  public void test_array() throws Exception{
    +
    +    long numRecords = 100000;
    +    String file1 = "/tmp/" + TestComplexTypeReader.class.getName() + "arrays1.json";
    --- End diff --
    
    an alternative is to use BaseTestQuery.getTempDir("ComplexTypeWriter")


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-4410: ListVector should initialize bits ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/380


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-4410: ListVector should initialize bits ...

Posted by vkorukanti <gi...@git.apache.org>.
Github user vkorukanti commented on a diff in the pull request:

    https://github.com/apache/drill/pull/380#discussion_r53366855
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/vector/complex/writer/TestComplexTypeReader.java ---
    @@ -241,4 +252,49 @@ public void testRepeatedJson() throws Exception {
                 .go();
       }
     
    +  @Test  // DRILL-4410
    +  // ListVector allocation
    +  public void test_array() throws Exception{
    +
    +    long numRecords = 100000;
    +    String file1 = "/tmp/" + TestComplexTypeReader.class.getName() + "arrays1.json";
    --- End diff --
    
    TestImpersonationMetadata refers to a directory on HDFS which should be fine as the paths are unix style.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-4410: ListVector should initialize bits ...

Posted by jaltekruse <gi...@git.apache.org>.
Github user jaltekruse commented on the pull request:

    https://github.com/apache/drill/pull/380#issuecomment-187942345
  
    +1 @hsuanyi Does this second test look good to you? I will test this on windows before merging, but I think it should be fine. Will do a small refactoring to use the generated dfs_test tmp space instead if it's an issue.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---