Posted to dev@drill.apache.org by Jason Altekruse <al...@gmail.com> on 2015/04/22 03:06:28 UTC

Review Request 33423: DRILL-2842: issue reading metadata footer in some parquet files

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33423/
-----------------------------------------------------------

Review request for drill and Steven Phillips.


Bugs: DRILL-2842
    https://issues.apache.org/jira/browse/DRILL-2842


Repository: drill-git


Description
-------

Parquet files with large footers could not be read. The length of the footer is written at the end of the file. To avoid excessive reads for smaller files, we first read a reasonably sized chunk from the end of the file that may contain the whole footer; the exact footer length appears at the end of that chunk. If the footer turns out to be longer than the chunk, we read the remaining portion that precedes what was already read and splice the two together. The offset at which the bytes from the first read were placed in the combined buffer was wrong.
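
For readers unfamiliar with the read pattern, here is a minimal sketch of the two-pass footer read and the splice. It is not the code in FooterGatherer.java: the class name FooterSpliceSketch, the methods readFooter and buildFile, and the use of a byte[] in place of a file stream are all assumptions made for illustration. The only format detail relied on is that a Parquet file ends with the footer, a 4-byte little-endian footer length, and the 4-byte magic "PAR1".

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; a byte[] stands in for the file on disk.
final class FooterSpliceSketch {
    static final byte[] MAGIC = "PAR1".getBytes(StandardCharsets.US_ASCII);
    // Trailer: 4-byte little-endian footer length followed by the magic.
    static final int TAIL_SIZE = 4 + MAGIC.length;

    /** Returns the footer, reading at most speculativeSize bytes in the first pass. */
    static byte[] readFooter(byte[] file, int speculativeSize) {
        if (file.length < TAIL_SIZE || speculativeSize < TAIL_SIZE) {
            throw new IllegalArgumentException("need at least " + TAIL_SIZE + " bytes");
        }
        // First pass: one speculative read from the end of the file.
        int firstLen = Math.min(file.length, speculativeSize);
        byte[] first = Arrays.copyOfRange(file, file.length - firstLen, file.length);

        if (!Arrays.equals(Arrays.copyOfRange(first, firstLen - MAGIC.length, firstLen), MAGIC)) {
            throw new IllegalArgumentException("missing trailing magic");
        }
        int footerLen = ByteBuffer.wrap(first, firstLen - TAIL_SIZE, 4)
                .order(ByteOrder.LITTLE_ENDIAN).getInt();
        if (footerLen < 0 || footerLen > file.length - TAIL_SIZE) {
            throw new IllegalArgumentException("invalid footer length " + footerLen);
        }

        int available = firstLen - TAIL_SIZE; // footer bytes already in hand
        byte[] footer = new byte[footerLen];
        if (footerLen <= available) {
            // The whole footer was covered by the first read.
            System.arraycopy(first, available - footerLen, footer, 0, footerLen);
            return footer;
        }

        // Second pass: fetch only the bytes that precede the first read.
        int missing = footerLen - available;
        int footerStart = file.length - TAIL_SIZE - footerLen;
        System.arraycopy(file, footerStart, footer, 0, missing);
        // The bytes read first belong AFTER the newly read ones, so their
        // destination offset is `missing`, not 0.
        System.arraycopy(first, 0, footer, missing, available);
        return footer;
    }

    /** Test helper: body + footer + length + magic (the leading magic is omitted). */
    static byte[] buildFile(byte[] body, byte[] footer) {
        ByteBuffer buf = ByteBuffer.allocate(body.length + footer.length + TAIL_SIZE)
                .order(ByteOrder.LITTLE_ENDIAN);
        buf.put(body).put(footer).putInt(footer.length).put(MAGIC);
        return buf.array();
    }
}
```

The large-footer case is the one that matters here: when the footer is longer than the speculative read, the destination offset for the first chunk has to equal the number of bytes fetched in the second read.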


Diffs
-----

  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/FooterGatherer.java 0bb86e1 
  exec/java-exec/src/test/java/org/apache/drill/exec/fn/interp/TestConstantFolding.java b17935a 
  exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java 89837e7 

Diff: https://reviews.apache.org/r/33423/diff/


Testing
-------

In progress


Thanks,

Jason Altekruse


Re: Review Request 33423: DRILL-2842: issue reading metadata footer in some parquet files

Posted by Steven Phillips <sp...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33423/#review81734
-----------------------------------------------------------

Ship it!


- Steven Phillips


On April 22, 2015, 1:06 a.m., Jason Altekruse wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33423/
> -----------------------------------------------------------
> 
> (Updated April 22, 2015, 1:06 a.m.)
> 
> 
> Review request for drill and Steven Phillips.
> 
> 
> Bugs: DRILL-2842
>     https://issues.apache.org/jira/browse/DRILL-2842
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> Parquet files with large footers could not be read. The length of the footer is written at the end of the file. To avoid excessive reads for smaller files, we first read a reasonably sized chunk from the end of the file that may contain the whole footer; the exact footer length appears at the end of that chunk. If the footer turns out to be longer than the chunk, we read the remaining portion that precedes what was already read and splice the two together. The offset at which the bytes from the first read were placed in the combined buffer was wrong.
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/FooterGatherer.java 0bb86e1 
>   exec/java-exec/src/test/java/org/apache/drill/exec/fn/interp/TestConstantFolding.java b17935a 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java 89837e7 
> 
> Diff: https://reviews.apache.org/r/33423/diff/
> 
> 
> Testing
> -------
> 
> In progress
> 
> 
> Thanks,
> 
> Jason Altekruse
> 
>