Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/02/15 03:25:54 UTC

[GitHub] [hudi] vinothchandar commented on a change in pull request #4667: [HUDI-3276] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

vinothchandar commented on a change in pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#discussion_r806414751



##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java
##########
@@ -161,6 +173,15 @@ protected FileSplit makeSplit(Path file, long start, long length,
     return returns.toArray(new FileStatus[0]);
   }
 
+  @Override
+  public RecordReader<NullWritable, ArrayWritable> getRecordReader(InputSplit split, JobConf job, Reporter reporter) throws IOException {
+    throw new UnsupportedEncodingException("not implemented");

Review comment:
       Why are we throwing an `EncodingException` here? Can we throw `HoodieUnsupported..` instead?
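
To illustrate the concern: `UnsupportedEncodingException` is a subclass of `IOException`, so it compiles under `throws IOException`, but its type tells the caller that a character encoding is unsupported, which is not the case here. A minimal, self-contained sketch of the difference follows. `DemoUnsupportedException` and `UnsupportedReaderSketch` are hypothetical names used only for illustration; they are not Hudi classes, and the actual exception to use is whichever `HoodieUnsupported..` type the project already provides.

```java
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.util.concurrent.Callable;

// Hypothetical stand-in for a project-specific "unsupported" exception.
// The name is illustrative only and is NOT a Hudi class.
class DemoUnsupportedException extends RuntimeException {
  DemoUnsupportedException(String message) {
    super(message);
  }
}

class UnsupportedReaderSketch {

  // Mirrors the shape of the code under review: it compiles because
  // UnsupportedEncodingException extends IOException, but the type
  // suggests a charset problem rather than a missing implementation.
  static Object misleadingReader() throws IOException {
    throw new UnsupportedEncodingException("not implemented");
  }

  // Alternative: the exception type itself says the operation is unsupported.
  static Object explicitReader() {
    throw new DemoUnsupportedException("getRecordReader is not implemented");
  }

  // Returns the simple class name of whatever the call throws,
  // or "nothing" if it completes normally.
  static String thrownBy(Callable<Object> call) {
    try {
      call.call();
      return "nothing";
    } catch (Exception e) {
      return e.getClass().getSimpleName();
    }
  }

  public static void main(String[] args) {
    System.out.println(thrownBy(UnsupportedReaderSketch::misleadingReader));
    System.out.println(thrownBy(UnsupportedReaderSketch::explicitReader));
  }
}
```

A caller catching the first exception would reasonably go looking for a bad charset name; the second one points straight at the unimplemented method.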

##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java
##########
@@ -65,12 +71,32 @@
  *   <li>Incremental mode: reading table's state as of particular timestamp (or instant, in Hudi's terms)</li>
  *   <li>External mode: reading non-Hudi partitions</li>
  * </ul>
+ *
+ * NOTE: This class is invariant of the underlying file-format of the files being read
  */
-public abstract class HoodieFileInputFormatBase extends FileInputFormat<NullWritable, ArrayWritable>
+public class HoodieCopyOnWriteTableInputFormat extends FileInputFormat<NullWritable, ArrayWritable>

Review comment:
       Actually, table types are ONLY about how data is stored, not how it is queried. MOR's read-optimized queries are now served via "CopyOnWrite"? See https://hudi.apache.org/docs/table_types#table-and-query-types. So I'm not sure I agree with the reasoning above.
   
   But the incremental query is now overloaded onto the input format, so this renaming is OK for now.
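
The storage-versus-query distinction can be sketched as a small lookup. This assumes the semantics described on the linked table_types page: a read-optimized query reads base files only, while a snapshot query on a MOR table merges base files with log files. All names below (`TableType`, `QueryType`, `ReadPath`, `QueryRoutingSketch`) are hypothetical and local to the sketch; they are not Hudi's own types, and incremental queries are left out for brevity.

```java
// How the data is stored.
enum TableType { COPY_ON_WRITE, MERGE_ON_READ }

// How the data is queried (incremental omitted in this sketch).
enum QueryType { SNAPSHOT, READ_OPTIMIZED }

// What the reader actually has to do on disk.
enum ReadPath { BASE_FILES_ONLY, BASE_FILES_MERGED_WITH_LOGS }

class QueryRoutingSketch {

  // The read path depends on the (table type, query type) pair,
  // not on the table type alone.
  static ReadPath readPathFor(TableType table, QueryType query) {
    if (table == TableType.MERGE_ON_READ && query == QueryType.SNAPSHOT) {
      // Only a snapshot query on MOR needs base files merged with log files.
      return ReadPath.BASE_FILES_MERGED_WITH_LOGS;
    }
    // COW tables, and read-optimized queries on MOR, read base files only.
    return ReadPath.BASE_FILES_ONLY;
  }

  public static void main(String[] args) {
    for (TableType table : TableType.values()) {
      for (QueryType query : QueryType.values()) {
        System.out.println(table + " + " + query + " -> " + readPathFor(table, query));
      }
    }
  }
}
```

Under these assumptions, a MOR read-optimized query takes the same base-files-only path as a COW table, which is why an input format named after "CopyOnWrite" ends up serving a MOR table.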




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org