Posted to commits@parquet.apache.org by bl...@apache.org on 2015/07/02 01:46:29 UTC

parquet-mr git commit: PARQUET-325: Always use row group size when padding is 0.

Repository: parquet-mr
Updated Branches:
  refs/heads/master 9fde65345 -> c7720ca4c


PARQUET-325: Always use row group size when padding is 0.

On block file systems, when the space left in the current block is
greater than the max padding, the next row group is targeted at that
remaining space. When padding is turned off by setting it to 0, the
remaining bytes are always greater than the padding, so row groups can
be targeted at very small spaces. With padding off, the next row
group's size should always be the default row group size.
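
The behavior described above can be illustrated with a minimal,
self-contained sketch. It is not the code from ParquetFileWriter: the
class name PaddingSketch, the withFix flag, and passing the stream
position as a plain long are all hypothetical, and the logic after the
guard is an assumption reconstructed from the description in this
message rather than taken from the source file.

```java
// Hypothetical model of the sizing decision; not the parquet-mr class.
final class PaddingSketch {

    // pos stands in for out.getPos(); withFix toggles the guard added by this commit.
    static long nextRowGroupSize(long pos, long dfsBlockSize, long rowGroupSize,
                                 long maxPaddingSize, boolean withFix) {
        if (withFix && maxPaddingSize <= 0) {
            // Padding is off: always target the default row group size.
            return rowGroupSize;
        }

        // Bytes left before the next block boundary.
        long remaining = dfsBlockSize - (pos % dfsBlockSize);

        // Assumption: padding is used when the remainder fits within the
        // max padding, so the next row group starts on a fresh block.
        if (remaining <= maxPaddingSize) {
            return rowGroupSize;
        }

        // Otherwise the row group is targeted at the space that is left.
        return Math.min(remaining, rowGroupSize);
    }

    public static void main(String[] args) {
        long block = 128L * 1024 * 1024;
        long rowGroup = 128L * 1024 * 1024;
        long pos = block - 100; // only 100 bytes left in the current block

        // Padding set to 0, without the guard: the target shrinks to the remainder.
        System.out.println(nextRowGroupSize(pos, block, rowGroup, 0, false)); // 100

        // Padding set to 0, with the guard: the default size is used.
        System.out.println(nextRowGroupSize(pos, block, rowGroup, 0, true)); // 134217728
    }
}
```

In this sketch, a writer positioned 100 bytes before a block boundary
with padding set to 0 would target a 100-byte row group without the
guard, and the full row group size with it.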

Author: Ryan Blue <bl...@apache.org>

Closes #234 from rdblue/PARQUET-325-padding-0-fix and squashes the following commits:

f4b3c2b [Ryan Blue] PARQUET-325: Always use row group size when padding is 0.


Project: http://git-wip-us.apache.org/repos/asf/parquet-mr/repo
Commit: http://git-wip-us.apache.org/repos/asf/parquet-mr/commit/c7720ca4
Tree: http://git-wip-us.apache.org/repos/asf/parquet-mr/tree/c7720ca4
Diff: http://git-wip-us.apache.org/repos/asf/parquet-mr/diff/c7720ca4

Branch: refs/heads/master
Commit: c7720ca4c232d317cfc800a04eda4a1d5a44a944
Parents: 9fde653
Author: Ryan Blue <bl...@apache.org>
Authored: Wed Jul 1 16:46:23 2015 -0700
Committer: Ryan Blue <bl...@apache.org>
Committed: Wed Jul 1 16:46:23 2015 -0700

----------------------------------------------------------------------
 .../main/java/org/apache/parquet/hadoop/ParquetFileWriter.java   | 4 ++++
 1 file changed, 4 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/parquet-mr/blob/c7720ca4/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileWriter.java
----------------------------------------------------------------------
diff --git a/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileWriter.java b/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileWriter.java
index e285376..5f265e0 100644
--- a/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileWriter.java
+++ b/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileWriter.java
@@ -687,6 +687,10 @@ public class ParquetFileWriter {
 
     @Override
     public long nextRowGroupSize(FSDataOutputStream out) throws IOException {
+      if (maxPaddingSize <= 0) {
+        return rowGroupSize;
+      }
+
       long remaining = dfsBlockSize - (out.getPos() % dfsBlockSize);
 
       if (isPaddingNeeded(remaining)) {