Posted to dev@drill.apache.org by Jason Altekruse <al...@gmail.com> on 2014/05/23 18:22:11 UTC
Review Request 21868: Drill 827
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21868/
-----------------------------------------------------------
Review request for drill.
Repository: drill-git
Description
-------
The Parquet reader was previously reading too far into an RLE stream. It now saves each value the same way I am saving each definition level. If the read loop exits because not all of the variable-length values in a record will fit in the current batch, the last value read is still available for insertion into the next batch. Previously that value was lost and another was always read at the start of the next loop, so the reader tried to read too many values out of the stream.
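To illustrate the idea, here is a minimal sketch of the carry-over pattern described above. It is not the code in the diff: the class CarryOverReaderSketch, the BatchReader type, its fields, and the byte-capacity rule are all hypothetical stand-ins, and a plain Iterator plays the role of the RLE stream. The point is only that a value pulled from a consume-once stream is held in a field until it has actually been placed in a batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch; none of these names come from Drill itself.
final class CarryOverReaderSketch {

    static final class BatchReader {
        private final Iterator<byte[]> stream;   // stands in for a consume-once value stream
        private final int batchCapacityBytes;    // assumed per-batch size limit
        private byte[] pending;                  // value read but not yet placed in a batch
        int valuesPulledFromStream = 0;          // counts reads from the stream

        BatchReader(Iterator<byte[]> stream, int batchCapacityBytes) {
            this.stream = stream;
            this.batchCapacityBytes = batchCapacityBytes;
        }

        List<byte[]> nextBatch() {
            List<byte[]> batch = new ArrayList<>();
            int used = 0;
            while (true) {
                // Only touch the stream when no value was carried over.
                if (pending == null) {
                    if (!stream.hasNext()) {
                        break;
                    }
                    pending = stream.next();
                    valuesPulledFromStream++;
                }
                // The value does not fit: leave it in 'pending' for the next batch.
                // An oversized value is still accepted into an empty batch so the
                // loop always makes progress.
                if (used + pending.length > batchCapacityBytes && !batch.isEmpty()) {
                    break;
                }
                batch.add(pending);
                used += pending.length;
                pending = null;
            }
            return batch;
        }
    }

    public static void main(String[] args) {
        List<byte[]> values = Arrays.asList(new byte[3], new byte[3], new byte[3], new byte[3]);
        BatchReader reader = new BatchReader(values.iterator(), 7);
        System.out.println(reader.nextBatch().size());
        System.out.println(reader.nextBatch().size());
        System.out.println(reader.valuesPulledFromStream);
    }
}
```

Without the pending field, the third value in this example would be dropped when the first batch fills up, and the next call would read a fresh value from the stream, which is the over-read described above.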
Diffs
-----
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/PageReadStatus.java e4081d9
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordReader.java 0996620
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/VarLenBinaryReader.java 4efcdaf
exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/ParquetRecordReaderTest.java dec4b15
exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/ParquetResultListener.java a533117
Diff: https://reviews.apache.org/r/21868/diff/
Testing
-------
Added a test for the file generated by Steven.
Thanks,
Jason Altekruse
Re: Review Request 21868: Drill 827
Posted by Jason Altekruse <al...@gmail.com>.
(Updated May 23, 2014, 4:27 p.m.)
Review request for drill.
Changes
-------
Marked the new test as ignored, since it relies on a binary file that is not in git.
Repository: drill-git
Description
-------
The Parquet reader was previously reading too far into an RLE stream. It now saves each value the same way I am saving each definition level. If the read loop exits because not all of the variable-length values in a record will fit in the current batch, the last value read is still available for insertion into the next batch. Previously that value was lost and another was always read at the start of the next loop, so the reader tried to read too many values out of the stream.
Diffs (updated)
-----
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/PageReadStatus.java e4081d9
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordReader.java 0996620
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/VarLenBinaryReader.java 4efcdaf
exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/ParquetRecordReaderTest.java dec4b15
exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/ParquetResultListener.java a533117
Diff: https://reviews.apache.org/r/21868/diff/
Testing
-------
Added a test for the file generated by Steven.
Thanks,
Jason Altekruse