Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2021/09/20 13:54:47 UTC

[GitHub] [hive] pgaref commented on a change in pull request #2639: HIVE-25521 - Fix concatenate file handling when files of different compressions are in same table/partition.

pgaref commented on a change in pull request #2639:
URL: https://github.com/apache/hive/pull/2639#discussion_r712189151



##########
File path: ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFileStripeMergeRecordReader.java
##########
@@ -63,8 +63,20 @@ public void testSplitStartsWithOffset() throws IOException {
     FileSplit split = new FileSplit(tmpPath, offset, length, (String[])null);
     OrcFileStripeMergeRecordReader reader = new OrcFileStripeMergeRecordReader(conf, split);
     reader.next(key, value);
+    // since offset is non-zero this file will not be processed.
+    Assert.assertNull(key.getInputPath());
+    split = new FileSplit(tmpPath, 0, length, (String[]) null);

Review comment:
       The offset is actually zero here. What am I missing?

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileStripeMergeRecordReader.java
##########
@@ -80,7 +80,7 @@ public boolean next(OrcFileKeyWrapper key, OrcFileValueWrapper value) throws IOE
   }
 
   protected boolean nextStripe(OrcFileKeyWrapper keyWrapper, OrcFileValueWrapper valueWrapper)
-      throws IOException {
+        throws IOException {
     // missing stripe stats (old format). If numRows is 0 then its an empty file and no statistics
     // is present. We have to differentiate no stats (empty file) vs missing stats (old format).
     if ((stripeStatistics == null || stripeStatistics.isEmpty()) && reader.getNumberOfRows() > 0) {

Review comment:
       Would it make sense to have the `start > 0` check here instead?
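
       To make the question concrete, here is a minimal sketch of the idea, assuming the intent stated in the test comment above is that a file is handled only by the split that begins at offset 0. `SplitOffsetSketch` and `shouldProcess` are hypothetical names for illustration only; they are not part of Hive or of this PR.

```java
final class SplitOffsetSketch {
    // Hypothetical helper, not Hive code: a merge reader works on whole files,
    // so only the split that starts at byte 0 should trigger processing.
    // Splits with a non-zero start are skipped so the file is merged once.
    static boolean shouldProcess(long splitStart) {
        return splitStart == 0L;
    }

    public static void main(String[] args) {
        System.out.println(shouldProcess(0L));   // split covering the file start
        System.out.println(shouldProcess(128L)); // split starting mid-file
    }
}
```

       Whether this check belongs in `next` or in `nextStripe` is exactly the open question; the sketch only illustrates the condition itself.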




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
