Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/01/21 19:33:35 UTC

[GitHub] [hudi] alexeykudinkin opened a new pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

alexeykudinkin opened a new pull request #4667:
URL: https://github.com/apache/hudi/pull/4667


   ## What is the purpose of the pull request
   
   Rebased the Parquet-based `FileInputFormat` implementations to inherit from `MapredParquetInputFormat`, so that Hive recognizes them correctly and applies the corresponding optimizations.
   
   ## Brief change log
   
    - Converted `HoodieRealtimeFileInputFormatBase` and `HoodieFileInputFormatBase` into standalone implementations that can be instantiated directly (and used for delegation)
    - Renamed `HoodieFileInputFormatBase` to `HoodieCopyOnWriteTableInputFormat` and `HoodieRealtimeFileInputFormatBase` to `HoodieMergeOnReadTableInputFormat`
    - Scaffolded `HoodieParquetFileInputFormatBase` for all Parquet impls to inherit from
    - Rebased Parquet impls onto `HoodieParquetFileInputFormatBase`
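
The inherit-and-delegate shape described in the change log can be sketched roughly as below. This is a minimal, self-contained illustration and not the actual Hudi code: `EngineParquetFormat` stands in for Hive's `MapredParquetInputFormat`, `TableFormat` for the file-format-agnostic table input format, and every class and method name here is hypothetical. It also assumes the engine recognizes its Parquet format through a plain type check, which the thread does not spell out.

```java
import java.util.ArrayList;
import java.util.List;

// Entry point for the sketch; every type in this file is a hypothetical stand-in.
class DelegationSketch {
  public static void main(String[] args) {
    EngineParquetFormat format = new SketchParquetInputFormat();
    System.out.println("recognized as Parquet: " + Engine.isParquet(format));
    System.out.println("splits: " + format.listSplits("/tbl"));
  }
}

// Stand-in for the engine-provided Parquet input format.
class EngineParquetFormat {
  List<String> listSplits(String path) {
    List<String> splits = new ArrayList<>();
    splits.add(path + "/raw.parquet");
    return splits;
  }
}

// Stand-in for the engine; assumed to special-case its own format by type.
class Engine {
  static boolean isParquet(Object inputFormat) {
    return inputFormat instanceof EngineParquetFormat;
  }
}

// Stand-in for a table-aware input format that can be instantiated on its own.
class TableFormat {
  List<String> listSplits(String path) {
    List<String> splits = new ArrayList<>();
    splits.add(path + "/latest-file-slice.parquet");
    return splits;
  }
}

// Base for Parquet impls: inherits the engine's type, delegates the table logic.
abstract class SketchParquetInputFormatBase extends EngineParquetFormat {
  private final TableFormat delegate;

  protected SketchParquetInputFormatBase(TableFormat delegate) {
    this.delegate = delegate;
  }

  @Override
  List<String> listSplits(String path) {
    return delegate.listSplits(path);
  }
}

// Concrete impl wires in the delegate it wants to use.
class SketchParquetInputFormat extends SketchParquetInputFormatBase {
  SketchParquetInputFormat() {
    super(new TableFormat());
  }
}
```

The point of the shape is that the type hierarchy satisfies the engine's check while the behavior comes from the delegate, so the table-level logic stays independent of the file format.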
   
   ## Verify this pull request
   
   This pull request is a trivial rework / code cleanup without any test coverage.
   This pull request is already covered by existing tests, such as *(please describe tests)*.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1018844558


   ## CI report:
   
   * c1939da73b134b7d99239f6099fb50476de7b9b2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5420) 
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 87e007b66be9767c93af1d498906b35bfc384c15 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5422) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1020664709


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 87e007b66be9767c93af1d498906b35bfc384c15 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5422) 
   * 93e76959ffa1ae8408e191cf6ab9beea3fabe2c3 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5483) 
   



[GitHub] [hudi] alexeykudinkin commented on pull request #4667: [HUDI-3276] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1033011132


   They are not used to project what's read by the FileReader, but to construct the reader schema within the `RecordReader`:
   https://github.com/apache/hudi/blob/master/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/AbstractRealtimeRecordReader.java#L107
   
   Hive architecture is unfortunate





[GitHub] [hudi] hudi-bot commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1028367974


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * b504aa798fa399e7b162203627490f9090656a32 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5495) 
   * 29733a0d437997485b21327a6c256233e35c4d3b UNKNOWN
   



[GitHub] [hudi] yihua commented on a change in pull request #4667: [HUDI-3276] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
yihua commented on a change in pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#discussion_r802049943



##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java
##########
@@ -65,12 +71,32 @@
  *   <li>Incremental mode: reading table's state as of particular timestamp (or instant, in Hudi's terms)</li>
  *   <li>External mode: reading non-Hudi partitions</li>
  * </ul>
+ *
+ * NOTE: This class is invariant of the underlying file-format of the files being read
  */
-public abstract class HoodieFileInputFormatBase extends FileInputFormat<NullWritable, ArrayWritable>
+public class HoodieCopyOnWriteTableInputFormat extends FileInputFormat<NullWritable, ArrayWritable>

Review comment:
       I agree that the `Realtime` naming can be removed.
   
   My angle is that the InputFormat classes could be named after the Hudi file layout, i.e., file groups with file slices, each containing a base file and a set of log files. The "CopyOnWrite" and "MergeOnRead" names sit one layer above, in the write/read logic. They are cleaner than the previous naming, but I think "BaseFile" and "BaseAndLogFile" prefixes may be a better fit here.
   
   Since the changes are fundamental, I'd prefer that the naming be finalized in 0.11.0 and not change for some time, so consensus is needed here.







[GitHub] [hudi] nsivabalan commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1030415010


   @alexeykudinkin: Do you know why `HoodieHFileInputFormat` does not have the `UseRecordReaderFromInputFormat` annotation in master? CC @prashantwason
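
For context on why a missing annotation matters: a marker annotation carries no behavior of its own, so a consumer has to look it up reflectively on the class. The sketch below is a minimal illustration under stated assumptions: `UseRecordReaderSketch` is a hypothetical runtime-retained annotation, not Hudi's actual definition, and `usesOwnRecordReader` is a made-up helper rather than how any query engine really inspects input formats.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Entry point for the sketch; all names below are hypothetical.
class MarkerAnnotationSketch {
  public static void main(String[] args) {
    System.out.println(usesOwnRecordReader(AnnotatedFormat.class));
    System.out.println(usesOwnRecordReader(PlainFormat.class));
    System.out.println(usesOwnRecordReader(SubFormat.class));
  }

  // A consumer can only see the marker if it is retained at runtime.
  static boolean usesOwnRecordReader(Class<?> inputFormatClass) {
    return inputFormatClass.isAnnotationPresent(UseRecordReaderSketch.class);
  }
}

// Marker annotation; deliberately NOT meta-annotated with @Inherited.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface UseRecordReaderSketch {}

@UseRecordReaderSketch
class AnnotatedFormat {}

class PlainFormat {}

// Subclass of an annotated class: without @Inherited the marker does not carry over.
class SubFormat extends AnnotatedFormat {}
```

One consequence worth keeping in mind when a class hierarchy is rebased: unless the annotation type is declared `@Inherited`, each concrete class has to carry the marker itself.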


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] nsivabalan commented on a change in pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#discussion_r799855748



##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieParquetInputFormat.java
##########
@@ -54,20 +45,16 @@
  */
 @UseRecordReaderFromInputFormat
 @UseFileSplitsFromInputFormat
-public class HoodieParquetInputFormat extends HoodieFileInputFormatBase implements Configurable {
+public class HoodieParquetInputFormat extends HoodieParquetInputFormatBase {
 
   private static final Logger LOG = LogManager.getLogger(HoodieParquetInputFormat.class);
 
-  // NOTE: We're only using {@code MapredParquetInputFormat} to compose vectorized
-  //       {@code RecordReader}
-  private final MapredParquetInputFormat mapredParquetInputFormat = new MapredParquetInputFormat();
-
-  protected HoodieDefaultTimeline filterInstantsTimeline(HoodieDefaultTimeline timeline) {
-    return HoodieInputFormatUtils.filterInstantsTimeline(timeline);
+  public HoodieParquetInputFormat() {
+    super(new HoodieCopyOnWriteTableInputFormat());

Review comment:
       why not 
   ```
   this(new HoodieCopyOnWriteTableInputFormat())
   ```
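
The two constructor styles under discussion can be compared in a small sketch. This is only an illustration with hypothetical classes (`Delegate`, `FormatBase`, `ViaSuper`, `ViaThis`); whether the real `HoodieParquetInputFormat` has, or needs, a constructor overload that accepts the delegate is not stated in this thread.

```java
// Entry point for the sketch; all types here are hypothetical stand-ins.
class ConstructorChainingSketch {
  public static void main(String[] args) {
    System.out.println(new ViaSuper().delegateName());
    System.out.println(new ViaThis().delegateName());
  }
}

// Stand-in for the delegate input format.
class Delegate {
  String name() {
    return "cow-delegate";
  }
}

// Stand-in for the shared base class that stores the delegate.
abstract class FormatBase {
  private final Delegate delegate;

  protected FormatBase(Delegate delegate) {
    this.delegate = delegate;
  }

  String delegateName() {
    return delegate.name();
  }
}

// Style from the diff: the no-arg constructor hands the delegate straight to super.
class ViaSuper extends FormatBase {
  ViaSuper() {
    super(new Delegate());
  }
}

// Style from the review comment: the no-arg constructor chains to an overload in
// the same class, which only compiles if that overload exists.
class ViaThis extends FormatBase {
  ViaThis() {
    this(new Delegate());
  }

  protected ViaThis(Delegate delegate) {
    super(delegate);
  }
}
```

Both variants end up with the same delegate. The `this(...)` form mainly pays off when subclasses or tests need to inject a different delegate through the extra constructor.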







[GitHub] [hudi] hudi-bot removed a comment on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1020698615


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 93e76959ffa1ae8408e191cf6ab9beea3fabe2c3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5483) 
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1018804638


   ## CI report:
   
   * c1939da73b134b7d99239f6099fb50476de7b9b2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5420) 
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1020782017


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 93e76959ffa1ae8408e191cf6ab9beea3fabe2c3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5483) 
   * b504aa798fa399e7b162203627490f9090656a32 UNKNOWN
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1029485331


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 29733a0d437997485b21327a6c256233e35c4d3b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5686) 
   * 0ea9c4ad5c3eaf9fb2ba2796e6a751296a3a73cc Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5716) 
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1029465563


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 29733a0d437997485b21327a6c256233e35c4d3b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5686) 
   * 0ea9c4ad5c3eaf9fb2ba2796e6a751296a3a73cc UNKNOWN
   



[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4667: [HUDI-3276] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#discussion_r807182343



##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java
##########
@@ -65,12 +71,32 @@
  *   <li>Incremental mode: reading table's state as of particular timestamp (or instant, in Hudi's terms)</li>
  *   <li>External mode: reading non-Hudi partitions</li>
  * </ul>
+ *
+ * NOTE: This class is invariant of the underlying file-format of the files being read
  */
-public abstract class HoodieFileInputFormatBase extends FileInputFormat<NullWritable, ArrayWritable>
+public class HoodieCopyOnWriteTableInputFormat extends FileInputFormat<NullWritable, ArrayWritable>

Review comment:
       Yeah, I've recently come to the same realization: this dichotomy doesn't line up well with the read path. I think the crux of the problem is that COW is a purely write-side semantic, so saying COW on the read path doesn't really make sense.
   
   I'm touching up the sibling hierarchy on the Spark side and will think about better terminology there; afterwards we can carry it over here as well.







[GitHub] [hudi] hudi-bot commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1020782017


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 93e76959ffa1ae8408e191cf6ab9beea3fabe2c3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5483) 
   * b504aa798fa399e7b162203627490f9090656a32 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1020784173


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 93e76959ffa1ae8408e191cf6ab9beea3fabe2c3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5483) 
   * b504aa798fa399e7b162203627490f9090656a32 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5495) 
   



[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4667: [HUDI-3276] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#discussion_r802056107



##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java
##########
@@ -65,12 +71,32 @@
  *   <li>Incremental mode: reading table's state as of particular timestamp (or instant, in Hudi's terms)</li>
  *   <li>External mode: reading non-Hudi partitions</li>
  * </ul>
+ *
+ * NOTE: This class is invariant of the underlying file-format of the files being read
  */
-public abstract class HoodieFileInputFormatBase extends FileInputFormat<NullWritable, ArrayWritable>
+public class HoodieCopyOnWriteTableInputFormat extends FileInputFormat<NullWritable, ArrayWritable>

Review comment:
       I see your point. However, the COW/MOR naming here is not about semantics; it refers to the _table's type_, which is pivotal: InputFormats are about reading Hudi's tables, so it seems odd not to name the abstraction after the very table it is supposed to read.
   
   Happy to jump on a call and jam more to find the convergence here.







[GitHub] [hudi] alexeykudinkin commented on pull request #4667: [HUDI-3276] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1032057281


   @hudi-bot run azure





[GitHub] [hudi] nsivabalan merged pull request #4667: [HUDI-3276] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
nsivabalan merged pull request #4667:
URL: https://github.com/apache/hudi/pull/4667


   





[GitHub] [hudi] hudi-bot commented on pull request #4667: [HUDI-3276] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1031874211


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 0ea9c4ad5c3eaf9fb2ba2796e6a751296a3a73cc Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5716) 
   * 56050e9e5accefad8861064f74eabe3f5d6f92c4 UNKNOWN
   



[GitHub] [hudi] vinothchandar commented on a change in pull request #4667: [HUDI-3276] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#discussion_r806414751



##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java
##########
@@ -161,6 +173,15 @@ protected FileSplit makeSplit(Path file, long start, long length,
     return returns.toArray(new FileStatus[0]);
   }
 
+  @Override
+  public RecordReader<NullWritable, ArrayWritable> getRecordReader(InputSplit split, JobConf job, Reporter reporter) throws IOException {
+    throw new UnsupportedEncodingException("not implemented");

Review comment:
       Why are we throwing an `EncodingException`? Can we throw `HoodieUnsupported..` instead?
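
The concern above can be illustrated with a minimal sketch. It uses plain JDK types only: `FormatAgnosticInputFormatSketch` and both method names are hypothetical stand-ins, not Hudi classes, and the standard `UnsupportedOperationException` is used in place of the Hudi-specific exception the reviewer alludes to, whose exact name is not given here. `UnsupportedEncodingException` is an `IOException` subtype meant for character-encoding problems, so throwing it for an unimplemented method is likely to mislead callers.

```java
import java.io.IOException;
import java.io.UnsupportedEncodingException;

// Hypothetical stand-in for an input format that does not implement record reading itself.
class FormatAgnosticInputFormatSketch {

    // Mirrors the diff: a checked exception whose name suggests a charset problem.
    Object getRecordReaderAsInDiff() throws IOException {
        throw new UnsupportedEncodingException("not implemented");
    }

    // Assumed alternative: an unchecked exception that names the actual condition.
    Object getRecordReaderAlternative() {
        throw new UnsupportedOperationException("record reading is not implemented by this class");
    }
}
```

With the second variant, a caller that catches `IOException` to handle real I/O failures no longer swallows the "not implemented" case by accident.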

##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java
##########
@@ -65,12 +71,32 @@
  *   <li>Incremental mode: reading table's state as of particular timestamp (or instant, in Hudi's terms)</li>
  *   <li>External mode: reading non-Hudi partitions</li>
  * </ul>
+ *
+ * NOTE: This class is invariant of the underlying file-format of the files being read
  */
-public abstract class HoodieFileInputFormatBase extends FileInputFormat<NullWritable, ArrayWritable>
+public class HoodieCopyOnWriteTableInputFormat extends FileInputFormat<NullWritable, ArrayWritable>

Review comment:
       Actually, table types are ONLY about how data is stored, not how it is queried. Are MOR's read-optimized queries now served via "CopyOnWrite"? https://hudi.apache.org/docs/table_types#table-and-query-types  So I'm not sure I agree with the reasoning above.
   
   But incremental query is now overloaded onto the input format, so this renaming is OK for now.







[GitHub] [hudi] hudi-bot commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1020784173


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 93e76959ffa1ae8408e191cf6ab9beea3fabe2c3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5483) 
   * b504aa798fa399e7b162203627490f9090656a32 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5495) 
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1028367974


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * b504aa798fa399e7b162203627490f9090656a32 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5495) 
   * 29733a0d437997485b21327a6c256233e35c4d3b UNKNOWN
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1028372936


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * b504aa798fa399e7b162203627490f9090656a32 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5495) 
   * 29733a0d437997485b21327a6c256233e35c4d3b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5686) 
   



[GitHub] [hudi] nsivabalan commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1030414675


   @bvaradar @n3nash @vinothchandar: There is quite a bit of change in the input format layer. Would appreciate it if you could take a look. There are 4 stacked PRs in this regard by the same author.
   
   
   
   





[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#discussion_r800912024



##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java
##########
@@ -65,12 +71,32 @@
  *   <li>Incremental mode: reading table's state as of particular timestamp (or instant, in Hudi's terms)</li>
  *   <li>External mode: reading non-Hudi partitions</li>
  * </ul>
+ *
+ * NOTE: This class is invariant of the underlying file-format of the files being read
  */
-public abstract class HoodieFileInputFormatBase extends FileInputFormat<NullWritable, ArrayWritable>
+public class HoodieCopyOnWriteTableInputFormat extends FileInputFormat<NullWritable, ArrayWritable>

Review comment:
       Can you elaborate on what you find confusing there? We already have such a split in Spark, for example (`MergeOnReadSnapshotRelation`, etc.).
   
   I actually think it's much cleaner to align with the MOR/COW dichotomy rather than the previous one of Realtime/non-Realtime.







[GitHub] [hudi] alexeykudinkin commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1031750399


   @nsivabalan in general, components should be named for what they actually _are_, not how they are used.
   
   Keep in mind that all this monkey-juggling with delegating is only required b/c Hive's optimizations are predicated on inheritance from `MapredParquetInputFormat`, which should not be necessary for other formats.
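
A minimal sketch of the inherit-and-delegate arrangement described above, using stand-in classes only. `ParquetBaseFormat` stands in for Hive's `MapredParquetInputFormat`, `TableInputFormat` for a standalone, file-format-agnostic table input format, and `Engine.canOptimize` for an engine-side check keyed on the class hierarchy. All of these names and the `listFiles` method are hypothetical and do not reflect the actual Hive or Hudi APIs.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the Parquet base class that the engine recognizes.
class ParquetBaseFormat {
    List<String> listFiles(String dir) {
        return Collections.singletonList(dir + "/plain.parquet");
    }
}

// Hypothetical stand-in for a standalone table input format that is unaware of the file format.
class TableInputFormat {
    List<String> listFiles(String dir) {
        return Collections.singletonList(dir + "/latest-file-slice");
    }
}

// Inherits from the Parquet base so hierarchy checks pass, but delegates the table-specific work.
class DelegatingParquetFormat extends ParquetBaseFormat {
    private final TableInputFormat delegate;

    DelegatingParquetFormat(TableInputFormat delegate) {
        this.delegate = delegate;
    }

    @Override
    List<String> listFiles(String dir) {
        return delegate.listFiles(dir);
    }
}

// Hypothetical stand-in for an engine that enables optimizations based on inheritance alone.
class Engine {
    static boolean canOptimize(Object inputFormat) {
        return inputFormat instanceof ParquetBaseFormat;
    }
}
```

In this sketch the delegating class is the only one tied to Parquet; the table-level logic lives in the standalone object, so a format without such an inheritance requirement could presumably use that object directly.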





[GitHub] [hudi] hudi-bot commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1028372936


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * b504aa798fa399e7b162203627490f9090656a32 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5495) 
   * 29733a0d437997485b21327a6c256233e35c4d3b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5686) 
   



[GitHub] [hudi] hudi-bot commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1028440018


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 29733a0d437997485b21327a6c256233e35c4d3b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5686) 
   



[GitHub] [hudi] hudi-bot commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1020698615


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 93e76959ffa1ae8408e191cf6ab9beea3fabe2c3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5483) 
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1018806703


   ## CI report:
   
   * c1939da73b134b7d99239f6099fb50476de7b9b2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5420) 
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   



[GitHub] [hudi] hudi-bot commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1018884955


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 87e007b66be9767c93af1d498906b35bfc384c15 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5422) 
   



[GitHub] [hudi] hudi-bot commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1018806703


   ## CI report:
   
   * c1939da73b134b7d99239f6099fb50476de7b9b2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5420) 
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1018884955









[GitHub] [hudi] hudi-bot removed a comment on pull request #4667: [HUDI-3276] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1029531052


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 0ea9c4ad5c3eaf9fb2ba2796e6a751296a3a73cc Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5716) 
   



[GitHub] [hudi] hudi-bot commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1018802633


   ## CI report:
   
   * c1939da73b134b7d99239f6099fb50476de7b9b2 UNKNOWN
   



[GitHub] [hudi] hudi-bot commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1020833432


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * b504aa798fa399e7b162203627490f9090656a32 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5495) 
   



[GitHub] [hudi] hudi-bot commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1020638631


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 87e007b66be9767c93af1d498906b35bfc384c15 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5422) 
   * 93e76959ffa1ae8408e191cf6ab9beea3fabe2c3 UNKNOWN
   






[GitHub] [hudi] hudi-bot commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1029465563


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 29733a0d437997485b21327a6c256233e35c4d3b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5686) 
   * 0ea9c4ad5c3eaf9fb2ba2796e6a751296a3a73cc UNKNOWN
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1028440018


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 29733a0d437997485b21327a6c256233e35c4d3b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5686) 
   



[GitHub] [hudi] hudi-bot commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1029531052


   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 0ea9c4ad5c3eaf9fb2ba2796e6a751296a3a73cc Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5716) 
   



[GitHub] [hudi] nsivabalan commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1030628126


   Just a thought: should we consider renaming HoodieCopyOnWriteTableInputFormat to HoodieCopyOnWriteTableInputFormat**Delegatee** or something similar? Strictly speaking, these are not the InputFormats used by the query engines. The ones actually used are HoodieParquetInputFormat and HoodieParquetRealtimeInputFormat, which delegate to HoodieCopyOnWriteTableInputFormat and HoodieMergeOnReadTableInputFormat.
   
   What do you think? I don't feel strongly about the suggestion, though.
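
The delegation relationship discussed in this comment can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Hudi's code: `SplitSource`, `CopyOnWriteListing`, `RecognizedParquetFormat`, and `EngineFacingFormat` are hypothetical stand-ins (for the input-format contract, the copy-on-write delegate, the Parquet base type the engine recognizes, and the engine-facing format, respectively), and no Hadoop or Hive types are used.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical contract; a stand-in for an input-format interface, not Hadoop's.
interface SplitSource {
  List<String> listSplits(String tablePath);
}

// Hypothetical delegate: a standalone object that owns the table-specific listing logic.
final class CopyOnWriteListing implements SplitSource {
  @Override
  public List<String> listSplits(String tablePath) {
    // Placeholder file names, purely illustrative.
    return Arrays.asList(tablePath + "/part-0.parquet", tablePath + "/part-1.parquet");
  }
}

// Hypothetical stand-in for the Parquet base type that the query engine checks for.
class RecognizedParquetFormat {
  boolean isParquet() {
    return true;
  }
}

// Engine-facing format: inherits from the recognized type and forwards listing to the delegate.
final class EngineFacingFormat extends RecognizedParquetFormat implements SplitSource {
  private final SplitSource delegate;

  EngineFacingFormat(SplitSource delegate) {
    this.delegate = delegate;
  }

  @Override
  public List<String> listSplits(String tablePath) {
    return delegate.listSplits(tablePath);
  }
}

final class DelegationDemo {
  public static void main(String[] args) {
    EngineFacingFormat format = new EngineFacingFormat(new CopyOnWriteListing());
    System.out.println(format.isParquet() + " " + format.listSplits("/tmp/table"));
  }
}
```

In this sketch only `EngineFacingFormat` is handed to the engine; the delegate is never registered as a format itself, which is the distinction the proposed `Delegatee` suffix is meant to signal.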
   





[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4667: [HUDI-3276] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#discussion_r806428520



##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java
##########
@@ -161,6 +173,15 @@ protected FileSplit makeSplit(Path file, long start, long length,
     return returns.toArray(new FileStatus[0]);
   }
 
+  @Override
+  public RecordReader<NullWritable, ArrayWritable> getRecordReader(InputSplit split, JobConf job, Reporter reporter) throws IOException {
+    throw new UnsupportedEncodingException("not implemented");

Review comment:
       Good catch! Just a typo, will address in follow-up
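
For context on why this typo compiles: `java.io.UnsupportedEncodingException` is a checked subclass of `IOException` meant for unknown character encodings, so it is accepted under `throws IOException` even though it says nothing about an unimplemented method. The sketch below contrasts it with `java.lang.UnsupportedOperationException`, which is only an assumption about the intended type, since the thread does not name the replacement. `StubReader` is a hypothetical class that mirrors the shape of the stub in the diff.

```java
import java.io.IOException;
import java.io.UnsupportedEncodingException;

// Hypothetical stub; not the Hudi class from the diff.
final class StubReader {
  // As written in the diff: compiles because UnsupportedEncodingException is an IOException.
  static void asInDiff() throws IOException {
    throw new UnsupportedEncodingException("not implemented");
  }

  // Assumed intent: an unchecked exception that marks the call as unsupported.
  static void assumedIntent() {
    throw new UnsupportedOperationException("not implemented");
  }
}

final class StubReaderDemo {
  public static void main(String[] args) {
    try {
      StubReader.asInDiff();
    } catch (IOException e) {
      System.out.println("caught as IOException: " + e.getClass().getSimpleName());
    }
    try {
      StubReader.assumedIntent();
    } catch (UnsupportedOperationException e) {
      System.out.println("caught as unchecked: " + e.getClass().getSimpleName());
    }
  }
}
```

A caller that catches `IOException` to handle read failures would silently absorb the first variant, while the second propagates as a programming error.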










[GitHub] [hudi] hudi-bot commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1018804638


   ## CI report:
   
   * c1939da73b134b7d99239f6099fb50476de7b9b2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5420) 
   



[GitHub] [hudi] hudi-bot commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1018844558


   ## CI report:
   
   * c1939da73b134b7d99239f6099fb50476de7b9b2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5420) 
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 87e007b66be9767c93af1d498906b35bfc384c15 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5422) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1018834700


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "c1939da73b134b7d99239f6099fb50476de7b9b2",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5420",
       "triggerID" : "c1939da73b134b7d99239f6099fb50476de7b9b2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ed1df9c2c6a5c79a5b450cf37e783fddfe861d35",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "ed1df9c2c6a5c79a5b450cf37e783fddfe861d35",
       "triggerType" : "PUSH"
     }, {
       "hash" : "87e007b66be9767c93af1d498906b35bfc384c15",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "87e007b66be9767c93af1d498906b35bfc384c15",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * c1939da73b134b7d99239f6099fb50476de7b9b2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5420) 
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 87e007b66be9767c93af1d498906b35bfc384c15 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1020638631


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "c1939da73b134b7d99239f6099fb50476de7b9b2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5420",
       "triggerID" : "c1939da73b134b7d99239f6099fb50476de7b9b2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ed1df9c2c6a5c79a5b450cf37e783fddfe861d35",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "ed1df9c2c6a5c79a5b450cf37e783fddfe861d35",
       "triggerType" : "PUSH"
     }, {
       "hash" : "87e007b66be9767c93af1d498906b35bfc384c15",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5422",
       "triggerID" : "87e007b66be9767c93af1d498906b35bfc384c15",
       "triggerType" : "PUSH"
     }, {
       "hash" : "93e76959ffa1ae8408e191cf6ab9beea3fabe2c3",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "93e76959ffa1ae8408e191cf6ab9beea3fabe2c3",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 87e007b66be9767c93af1d498906b35bfc384c15 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5422) 
   * 93e76959ffa1ae8408e191cf6ab9beea3fabe2c3 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1018884955


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "c1939da73b134b7d99239f6099fb50476de7b9b2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5420",
       "triggerID" : "c1939da73b134b7d99239f6099fb50476de7b9b2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ed1df9c2c6a5c79a5b450cf37e783fddfe861d35",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "ed1df9c2c6a5c79a5b450cf37e783fddfe861d35",
       "triggerType" : "PUSH"
     }, {
       "hash" : "87e007b66be9767c93af1d498906b35bfc384c15",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5422",
       "triggerID" : "87e007b66be9767c93af1d498906b35bfc384c15",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 87e007b66be9767c93af1d498906b35bfc384c15 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5422) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1020833432


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "c1939da73b134b7d99239f6099fb50476de7b9b2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5420",
       "triggerID" : "c1939da73b134b7d99239f6099fb50476de7b9b2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ed1df9c2c6a5c79a5b450cf37e783fddfe861d35",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "ed1df9c2c6a5c79a5b450cf37e783fddfe861d35",
       "triggerType" : "PUSH"
     }, {
       "hash" : "87e007b66be9767c93af1d498906b35bfc384c15",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5422",
       "triggerID" : "87e007b66be9767c93af1d498906b35bfc384c15",
       "triggerType" : "PUSH"
     }, {
       "hash" : "93e76959ffa1ae8408e191cf6ab9beea3fabe2c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5483",
       "triggerID" : "93e76959ffa1ae8408e191cf6ab9beea3fabe2c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b504aa798fa399e7b162203627490f9090656a32",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5495",
       "triggerID" : "b504aa798fa399e7b162203627490f9090656a32",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * b504aa798fa399e7b162203627490f9090656a32 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5495) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] vinothchandar commented on a change in pull request #4667: [HUDI-3276] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#discussion_r806414751



##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java
##########
@@ -161,6 +173,15 @@ protected FileSplit makeSplit(Path file, long start, long length,
     return returns.toArray(new FileStatus[0]);
   }
 
+  @Override
+  public RecordReader<NullWritable, ArrayWritable> getRecordReader(InputSplit split, JobConf job, Reporter reporter) throws IOException {
+    throw new UnsupportedEncodingException("not implemented");

Review comment:
       Why are we throwing an `UnsupportedEncodingException` here? Can we throw `HoodieUnsupported..` instead?
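
To illustrate the concern raised in this comment, here is a minimal sketch, not the PR's code. `java.io.UnsupportedEncodingException` is a checked `IOException` that signals an unsupported character encoding, so it is a misleading way to say "not implemented". Because the Hudi exception name is elided in the comment, the sketch uses the JDK's `UnsupportedOperationException` as a stand-in; `FormatAgnosticInputFormat` and `thrownBy` are hypothetical names introduced only for this example.

```java
import java.io.IOException;
import java.io.UnsupportedEncodingException;

public class UnsupportedReaderSketch {

  // Hypothetical stand-in for a format-agnostic input format that has no reader of its own.
  static class FormatAgnosticInputFormat {
    // Declares IOException only to mirror the shape of the method in the diff.
    Object getRecordReader(String split) throws IOException {
      // JDK stand-in (an assumption): the reviewer's preferred exception name is elided.
      throw new UnsupportedOperationException("getRecordReader is not implemented for " + split);
    }
  }

  // Returns the simple class name of whatever getRecordReader throws, or "none".
  static String thrownBy(FormatAgnosticInputFormat format) {
    try {
      format.getRecordReader("split-0");
      return "none";
    } catch (Exception e) {
      return e.getClass().getSimpleName();
    }
  }

  public static void main(String[] args) {
    // UnsupportedEncodingException is an IOException about character encodings.
    System.out.println(IOException.class.isAssignableFrom(UnsupportedEncodingException.class));
    // The sketch reports the unsupported call with an exception whose name matches the intent.
    System.out.println(thrownBy(new FormatAgnosticInputFormat()));
  }
}
```

An unchecked exception also keeps callers from mistaking an unsupported call for a recoverable I/O failure.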

##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java
##########
@@ -65,12 +71,32 @@
  *   <li>Incremental mode: reading table's state as of particular timestamp (or instant, in Hudi's terms)</li>
  *   <li>External mode: reading non-Hudi partitions</li>
  * </ul>
+ *
+ * NOTE: This class is invariant of the underlying file-format of the files being read
  */
-public abstract class HoodieFileInputFormatBase extends FileInputFormat<NullWritable, ArrayWritable>
+public class HoodieCopyOnWriteTableInputFormat extends FileInputFormat<NullWritable, ArrayWritable>

Review comment:
       Actually, table types are ONLY about how data is stored, not how it is queried. Aren't MOR's read-optimized queries now served via "CopyOnWrite"? See https://hudi.apache.org/docs/table_types#table-and-query-types. So I'm not sure I agree with the reasoning above.
   
   But the incremental query is now overloaded onto the input format, so this renaming is OK for now.







[GitHub] [hudi] hudi-bot removed a comment on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1020664709


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "c1939da73b134b7d99239f6099fb50476de7b9b2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5420",
       "triggerID" : "c1939da73b134b7d99239f6099fb50476de7b9b2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ed1df9c2c6a5c79a5b450cf37e783fddfe861d35",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "ed1df9c2c6a5c79a5b450cf37e783fddfe861d35",
       "triggerType" : "PUSH"
     }, {
       "hash" : "87e007b66be9767c93af1d498906b35bfc384c15",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5422",
       "triggerID" : "87e007b66be9767c93af1d498906b35bfc384c15",
       "triggerType" : "PUSH"
     }, {
       "hash" : "93e76959ffa1ae8408e191cf6ab9beea3fabe2c3",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5483",
       "triggerID" : "93e76959ffa1ae8408e191cf6ab9beea3fabe2c3",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 87e007b66be9767c93af1d498906b35bfc384c15 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5422) 
   * 93e76959ffa1ae8408e191cf6ab9beea3fabe2c3 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5483) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1029485331


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "c1939da73b134b7d99239f6099fb50476de7b9b2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5420",
       "triggerID" : "c1939da73b134b7d99239f6099fb50476de7b9b2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ed1df9c2c6a5c79a5b450cf37e783fddfe861d35",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "ed1df9c2c6a5c79a5b450cf37e783fddfe861d35",
       "triggerType" : "PUSH"
     }, {
       "hash" : "87e007b66be9767c93af1d498906b35bfc384c15",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5422",
       "triggerID" : "87e007b66be9767c93af1d498906b35bfc384c15",
       "triggerType" : "PUSH"
     }, {
       "hash" : "93e76959ffa1ae8408e191cf6ab9beea3fabe2c3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5483",
       "triggerID" : "93e76959ffa1ae8408e191cf6ab9beea3fabe2c3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b504aa798fa399e7b162203627490f9090656a32",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5495",
       "triggerID" : "b504aa798fa399e7b162203627490f9090656a32",
       "triggerType" : "PUSH"
     }, {
       "hash" : "29733a0d437997485b21327a6c256233e35c4d3b",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5686",
       "triggerID" : "29733a0d437997485b21327a6c256233e35c4d3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0ea9c4ad5c3eaf9fb2ba2796e6a751296a3a73cc",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5716",
       "triggerID" : "0ea9c4ad5c3eaf9fb2ba2796e6a751296a3a73cc",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 29733a0d437997485b21327a6c256233e35c4d3b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5686) 
   * 0ea9c4ad5c3eaf9fb2ba2796e6a751296a3a73cc Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5716) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] yihua commented on a change in pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
yihua commented on a change in pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#discussion_r799913844



##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieParquetInputFormatBase.java
##########
@@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.hadoop;
+
+import org.apache.hadoop.conf.Configurable;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat;
+import org.apache.hadoop.mapred.FileInputFormat;
+import org.apache.hadoop.mapred.FileSplit;
+import org.apache.hadoop.mapred.InputSplit;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hudi.hadoop.realtime.HoodieMergeOnReadTableInputFormat;
+
+import java.io.IOException;
+
+/**
+ * !!! PLEASE READ CAREFULLY !!!

Review comment:
       👍 

##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java
##########
@@ -65,12 +71,32 @@
  *   <li>Incremental mode: reading table's state as of particular timestamp (or instant, in Hudi's terms)</li>
  *   <li>External mode: reading non-Hudi partitions</li>
  * </ul>
+ *
+ * NOTE: This class is invariant of the underlying file-format of the files being read
  */
-public abstract class HoodieFileInputFormatBase extends FileInputFormat<NullWritable, ArrayWritable>
+public class HoodieCopyOnWriteTableInputFormat extends FileInputFormat<NullWritable, ArrayWritable>

Review comment:
       Mixing the table type into the naming might cause confusion. What do you think of the following naming?
   ```
   HoodieCopyOnWriteTableInputFormat -> HoodieBaseFileInputFormat
   HoodieMergeOnReadTableInputFormat -> HoodieBaseAndLogFileInputFormat
   ```
   Then, `HoodieParquetInputFormat` aligns with `HoodieBaseFileInputFormat` since parquet is a base file format.







[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on a change in pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#discussion_r799902667



##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieParquetInputFormat.java
##########
@@ -54,20 +45,16 @@
  */
 @UseRecordReaderFromInputFormat
 @UseFileSplitsFromInputFormat
-public class HoodieParquetInputFormat extends HoodieFileInputFormatBase implements Configurable {
+public class HoodieParquetInputFormat extends HoodieParquetInputFormatBase {
 
   private static final Logger LOG = LogManager.getLogger(HoodieParquetInputFormat.class);
 
-  // NOTE: We're only using {@code MapredParquetInputFormat} to compose vectorized
-  //       {@code RecordReader}
-  private final MapredParquetInputFormat mapredParquetInputFormat = new MapredParquetInputFormat();
-
-  protected HoodieDefaultTimeline filterInstantsTimeline(HoodieDefaultTimeline timeline) {
-    return HoodieInputFormatUtils.filterInstantsTimeline(timeline);
+  public HoodieParquetInputFormat() {
+    super(new HoodieCopyOnWriteTableInputFormat());

Review comment:
       Because there's no point in that indirection: we can go straight to the superclass.
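
The constructor in this hunk passes a standalone input format up to the Parquet base class. The sketch below is one way to picture that inherit-and-delegate arrangement. It is an assumption-laden illustration, not the Hudi implementation: `FrameworkParquetFormat` stands in for `MapredParquetInputFormat`, `TableListing` for `HoodieCopyOnWriteTableInputFormat`, and the `getSplits(String)` signature is invented for brevity.

```java
import java.util.Arrays;
import java.util.List;

public class DelegationSketch {

  // Hypothetical stand-in for the framework class a query engine looks for by type.
  static class FrameworkParquetFormat {
    List<String> getSplits(String path) {
      return Arrays.asList(path + "#framework-default");
    }
  }

  // Hypothetical stand-in for the standalone, file-format-agnostic implementation.
  static class TableListing {
    List<String> getSplits(String path) {
      return Arrays.asList(path + "/base-file-1", path + "/base-file-2");
    }
  }

  // Inherits the framework type for recognition, but forwards the work to the delegate.
  static class ParquetFormatBase extends FrameworkParquetFormat {
    private final TableListing delegate;

    ParquetFormatBase(TableListing delegate) {
      this.delegate = delegate;
    }

    @Override
    List<String> getSplits(String path) {
      return delegate.getSplits(path);
    }
  }

  // Concrete format: picks its delegate in the constructor, as the hunk above does.
  static class ConcreteParquetFormat extends ParquetFormatBase {
    ConcreteParquetFormat() {
      super(new TableListing());
    }
  }

  public static void main(String[] args) {
    FrameworkParquetFormat format = new ConcreteParquetFormat();
    // The type check succeeds through inheritance; the behaviour comes from the delegate.
    System.out.println(format instanceof FrameworkParquetFormat);
    System.out.println(format.getSplits("/tbl"));
  }
}
```

Under these assumptions, the concrete class satisfies a type check against the framework class while the listing logic stays in a class that does not depend on the file format.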







[GitHub] [hudi] hudi-bot commented on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1018834700


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "c1939da73b134b7d99239f6099fb50476de7b9b2",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5420",
       "triggerID" : "c1939da73b134b7d99239f6099fb50476de7b9b2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ed1df9c2c6a5c79a5b450cf37e783fddfe861d35",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "ed1df9c2c6a5c79a5b450cf37e783fddfe861d35",
       "triggerType" : "PUSH"
     }, {
       "hash" : "87e007b66be9767c93af1d498906b35bfc384c15",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "87e007b66be9767c93af1d498906b35bfc384c15",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * c1939da73b134b7d99239f6099fb50476de7b9b2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5420) 
   * ed1df9c2c6a5c79a5b450cf37e783fddfe861d35 UNKNOWN
   * 87e007b66be9767c93af1d498906b35bfc384c15 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4667: [HUDI-3276][Stacked on 4559] Rebased Parquet-based `FileInputFormat` impls to inherit from `MapredParquetInputFormat`

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#issuecomment-1018802633


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "c1939da73b134b7d99239f6099fb50476de7b9b2",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "c1939da73b134b7d99239f6099fb50476de7b9b2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * c1939da73b134b7d99239f6099fb50476de7b9b2 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>

