Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/01/28 15:15:53 UTC

[GitHub] [hudi] YannByron opened a new pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

YannByron opened a new pull request #4714:
URL: https://github.com/apache/hudi/pull/4714


   …sult when query by partition column
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1034475793


   ## CI report:
   
   * 4ba5756a483f5f5fab6878e4dc055f4116377650 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5858) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan commented on a change in pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#discussion_r796026389



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableConfig.java
##########
@@ -183,6 +185,14 @@
   public static final ConfigProperty<String> URL_ENCODE_PARTITIONING = KeyGeneratorOptions.URL_ENCODE_PARTITIONING;
   public static final ConfigProperty<String> HIVE_STYLE_PARTITIONING_ENABLE = KeyGeneratorOptions.HIVE_STYLE_PARTITIONING_ENABLE;
 
+  public static final List<String> PERSISTED_CONFIG_LIST = Arrays.asList(
+      Config.DATE_TIME_PARSER_PROP,

Review comment:
       Does this mean we are going to persist all key generator properties to the table config?
   Also, do we impose any constraint that the key generator properties cannot be changed once the table is created?

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieFileIndex.scala
##########
@@ -282,6 +296,41 @@ object HoodieFileIndex {
     properties
   }
 
+  def convertFilterForTimestampKeyGenerator(metaClient: HoodieTableMetaClient,
+      partitionFilters: Seq[Expression]): Seq[Expression] = {
+
+    val tableConfig = metaClient.getTableConfig
+    val keyGenerator = tableConfig.getKeyGeneratorClassName
+
+    if (keyGenerator.equals(classOf[TimestampBasedKeyGenerator].getCanonicalName) ||
+        keyGenerator.equals(classOf[TimestampBasedAvroKeyGenerator].getCanonicalName)) {
+      val inputFormat = tableConfig.getString(KeyGeneratorOptions.Config.TIMESTAMP_INPUT_DATE_FORMAT_PROP)
+      val outputFormat = tableConfig.getString(KeyGeneratorOptions.Config.TIMESTAMP_OUTPUT_DATE_FORMAT_PROP)
+      if (StringUtils.isNullOrEmpty(inputFormat) || StringUtils.isNullOrEmpty(outputFormat) ||
+          inputFormat.equals(outputFormat)) {
+        partitionFilters
+      } else {
+        try {
+          val inDateFormat = new SimpleDateFormat(inputFormat)
+          val outDateFormat = new SimpleDateFormat(outputFormat)
+          partitionFilters.toArray.map {

Review comment:
       Do we know how this pans out if the input format is EPOCHMILLISECONDS? In that case the input date format will be empty, so will we fall back to using the original partition filters?
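
To make the fallback in question concrete, here is a minimal standalone sketch of the guard shown in the diff: when either format is missing, or both are equal, the literal is returned unchanged; otherwise it is re-rendered in the output format. `PartitionLiteralConverter` and `convert` are hypothetical names, not Hudi APIs, and pinning both formatters to UTC is an assumption made only to keep the example deterministic.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

// Hypothetical helper for illustration only; not part of Hudi.
final class PartitionLiteralConverter {

  // Converts a filter literal written in the input date format into the
  // output (partition path) format. Returns the literal unchanged when no
  // conversion applies.
  static String convert(String literal, String inputFormat, String outputFormat) {
    // Same guard as in the diff: missing or identical formats mean
    // there is nothing to convert.
    if (inputFormat == null || inputFormat.isEmpty()
        || outputFormat == null || outputFormat.isEmpty()
        || inputFormat.equals(outputFormat)) {
      return literal;
    }
    try {
      SimpleDateFormat in = new SimpleDateFormat(inputFormat);
      SimpleDateFormat out = new SimpleDateFormat(outputFormat);
      // Assumption: use one fixed zone on both sides so the date does not shift.
      in.setTimeZone(TimeZone.getTimeZone("UTC"));
      out.setTimeZone(TimeZone.getTimeZone("UTC"));
      return out.format(in.parse(literal));
    } catch (ParseException e) {
      // Unparsable literal: keep the original value rather than fail the query.
      return literal;
    }
  }
}
```

With the formats used in the test in this PR (`yyyy-MM-dd` in, `yyyy/MM/dd` out), `convert("2022-01-01", ...)` would yield `2022/01/01`; with an empty input format the literal passes through untouched.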

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java
##########
@@ -759,6 +780,11 @@ public PropertyBuilder fromMetaClient(HoodieTableMetaClient metaClient) {
 
     public PropertyBuilder fromProperties(Properties properties) {
       HoodieConfig hoodieConfig = new HoodieConfig(properties);
+
+      for (String key : hoodieConfig.getProps().stringPropertyNames()) {

Review comment:
       It might be easier to iterate over PERSISTED_CONFIG_LIST instead of all the hoodie write configs.
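
A rough sketch of this suggestion, assuming the persisted keys are available as a plain list of strings: loop over the short list and look each key up, rather than scanning every property. `PersistedConfigFilter`, `selectPersisted`, and the key names in the usage note are hypothetical and stand in for whatever the PR actually uses.

```java
import java.util.List;
import java.util.Properties;

// Hypothetical helper for illustration only; not part of Hudi.
final class PersistedConfigFilter {

  // Copies only the keys named in persistedKeys from the full property set.
  // Keys that are absent from the input are skipped.
  static Properties selectPersisted(Properties all, List<String> persistedKeys) {
    Properties selected = new Properties();
    for (String key : persistedKeys) {
      String value = all.getProperty(key);
      if (value != null) {
        selected.setProperty(key, value);
      }
    }
    return selected;
  }
}
```

For example, given properties `a=1`, `b=2`, `c=3` and a persisted list of `a` and `x`, only `a=1` would be carried over.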

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieFileIndex.scala
##########
@@ -282,6 +296,41 @@ object HoodieFileIndex {
     properties
   }
 
+  def convertFilterForTimestampKeyGenerator(metaClient: HoodieTableMetaClient,
+      partitionFilters: Seq[Expression]): Seq[Expression] = {
+
+    val tableConfig = metaClient.getTableConfig
+    val keyGenerator = tableConfig.getKeyGeneratorClassName
+
+    if (keyGenerator.equals(classOf[TimestampBasedKeyGenerator].getCanonicalName) ||

Review comment:
       Let's account for a NullPointerException here. The key generator property may not be set in all code paths.
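
One way to make the check null-safe, sketched as standalone Java: test for null before comparing, so an unset key generator simply means "not a timestamp key generator". `KeyGenCheck` and `isTimestampKeyGenerator` are hypothetical names; the first class name is taken from the test in this PR, and the fully qualified name of the Avro variant is an assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Hudi.
final class KeyGenCheck {

  // The first name appears in the PR's test options; the second is assumed
  // to live in the same package.
  private static final Set<String> TIMESTAMP_KEY_GENERATORS = new HashSet<>(Arrays.asList(
      "org.apache.hudi.keygen.TimestampBasedKeyGenerator",
      "org.apache.hudi.keygen.TimestampBasedAvroKeyGenerator"));

  // Returns false instead of throwing when the table config has no
  // key generator class set.
  static boolean isTimestampKeyGenerator(String keyGeneratorClassName) {
    return keyGeneratorClassName != null
        && TIMESTAMP_KEY_GENERATORS.contains(keyGeneratorClassName);
  }
}
```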

##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestMORDataSource.scala
##########
@@ -770,4 +775,79 @@ class TestMORDataSource extends HoodieClientTestBase {
       .load(basePath + "/*/*/*/*")
     assertEquals(numRecords - numRecordsToDelete, snapshotDF2.count())
   }
+
+  /**
+   * This tests the case that query by with a specified partition condition on hudi table which is
+   * different between the value of the partition field and the actual partition path,
+   * like hudi table written by TimestampBasedKeyGenerator.
+   *
+   * For MOR table, test all the three query modes.
+   */
+  @Test
+  def testPrunePartitionForTimestampBasedKeyGenerator(): Unit = {
+    val options = commonOpts ++ Map(
+      "hoodie.compact.inline" -> "false",
+      DataSourceWriteOptions.TABLE_TYPE.key -> DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL,
+      DataSourceWriteOptions.KEYGENERATOR_CLASS_NAME.key -> "org.apache.hudi.keygen.TimestampBasedKeyGenerator",
+      Config.TIMESTAMP_TYPE_FIELD_PROP -> "DATE_STRING",
+      Config.TIMESTAMP_OUTPUT_DATE_FORMAT_PROP -> "yyyy/MM/dd",
+      Config.TIMESTAMP_TIMEZONE_FORMAT_PROP -> "GMT+8:00",
+      Config.TIMESTAMP_INPUT_DATE_FORMAT_PROP -> "yyyy-MM-dd"
+    )
+
+    val dataGen1 = new HoodieTestDataGenerator(Array("2022-01-01"))
+    val records1 = recordsToStrings(dataGen1.generateInserts("001", 50)).toList
+    val inputDF1 = spark.read.json(spark.sparkContext.parallelize(records1, 2))
+    inputDF1.write.format("org.apache.hudi")
+      .options(options)
+      .mode(SaveMode.Overwrite)
+      .save(basePath)
+    metaClient = HoodieTableMetaClient.builder()
+      .setBasePath(basePath)
+      .setConf(spark.sessionState.newHadoopConf)
+      .build()
+    val commit1Time = metaClient.getActiveTimeline.lastInstant().get().getTimestamp
+
+    val dataGen2 = new HoodieTestDataGenerator(Array("2022-01-02"))
+    val records2 = recordsToStrings(dataGen2.generateInserts("002", 50)).toList

Review comment:
       Can we ingest a different number of records in each batch, so that our assertions are meaningful? Right now we assert 50 for both 2022-01-01 and 2022-01-02. Maybe use 40 and 60.







[GitHub] [hudi] YannByron commented on a change in pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
YannByron commented on a change in pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#discussion_r798474157



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableConfig.java
##########
@@ -183,6 +185,14 @@
   public static final ConfigProperty<String> URL_ENCODE_PARTITIONING = KeyGeneratorOptions.URL_ENCODE_PARTITIONING;
   public static final ConfigProperty<String> HIVE_STYLE_PARTITIONING_ENABLE = KeyGeneratorOptions.HIVE_STYLE_PARTITIONING_ENABLE;
 
+  public static final List<String> PERSISTED_CONFIG_LIST = Arrays.asList(
+      Config.DATE_TIME_PARSER_PROP,

Review comment:
       Yes. These keygen props are immutable.
   There are also other configs that shouldn't be changed once the table is created. Maybe we should open another ticket to track this and manage them together.







[GitHub] [hudi] YannByron commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
YannByron commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1040092239


   > Hello, can someone check the build cc @nsivabalan @YannByron !
   
   https://github.com/apache/hudi/pull/4822 can fix this.





[GitHub] [hudi] hudi-bot removed a comment on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1029617902


   ## CI report:
   
   * 8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5726) 
   



[GitHub] [hudi] nsivabalan merged pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
nsivabalan merged pull request #4714:
URL: https://github.com/apache/hudi/pull/4714


   





[GitHub] [hudi] hudi-bot commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1024319777


   ## CI report:
   
   * 1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34 UNKNOWN
   



[GitHub] [hudi] hudi-bot commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1024409363


   ## CI report:
   
   * 1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582) 
   



[GitHub] [hudi] danny0405 commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1040042258


   Hello, can someone check the build cc @nsivabalan @YannByron !





[GitHub] [hudi] hudi-bot removed a comment on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1024850151


   ## CI report:
   
   * e8999e4928debb876332f287a1584cc7cbd69c85 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599) 
   



[GitHub] [hudi] hudi-bot commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1029617902


   ## CI report:
   
   * 8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5726) 
   



[GitHub] [hudi] hudi-bot commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1028935199


   ## CI report:
   
   * e8999e4928debb876332f287a1584cc7cbd69c85 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599) 
   * 8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e UNKNOWN
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1024838933


   ## CI report:
   
   * 1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582) 
   * e8999e4928debb876332f287a1584cc7cbd69c85 UNKNOWN
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1029581692


   ## CI report:
   
   * 8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5726) 
   



[GitHub] [hudi] YannByron commented on a change in pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
YannByron commented on a change in pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#discussion_r798492743



##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieFileIndex.scala
##########
@@ -282,6 +296,41 @@ object HoodieFileIndex {
     properties
   }
 
+  def convertFilterForTimestampKeyGenerator(metaClient: HoodieTableMetaClient,
+      partitionFilters: Seq[Expression]): Seq[Expression] = {
+
+    val tableConfig = metaClient.getTableConfig
+    val keyGenerator = tableConfig.getKeyGeneratorClassName
+
+    if (keyGenerator.equals(classOf[TimestampBasedKeyGenerator].getCanonicalName) ||

Review comment:
       OK. I'll check whether `keyGenerator` is null.







[GitHub] [hudi] hudi-bot removed a comment on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1024839306


   ## CI report:
   
   * 1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582) 
   * e8999e4928debb876332f287a1584cc7cbd69c85 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1024319777


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1034450979


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599",
       "triggerID" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705",
       "triggerID" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5726",
       "triggerID" : "1029580885",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "6103ba6800a244253f9e7150f2ee90dc8019df61",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5836",
       "triggerID" : "6103ba6800a244253f9e7150f2ee90dc8019df61",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4ba5756a483f5f5fab6878e4dc055f4116377650",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5858",
       "triggerID" : "4ba5756a483f5f5fab6878e4dc055f4116377650",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 6103ba6800a244253f9e7150f2ee90dc8019df61 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5836) 
   * 4ba5756a483f5f5fab6878e4dc055f4116377650 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5858) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1034450979







[GitHub] [hudi] YannByron commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
YannByron commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1032223263


   @nsivabalan
   I think it is better, and easier to use, if configs such as these key-generator-related ones and `hive_style_partitioning` are restrictive and cannot be updated during the table's lifecycle. Otherwise, users may be confused when different keygen configs are used on the same table.
   If the existing table isn't what the user expected, deleting and re-creating (or overwriting) it is reasonable.
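
One way to picture the restriction argued for above is a check that compares the configs persisted with the table against the configs of an incoming write. The following is only a sketch under stated assumptions: `ImmutableConfigCheckSketch`, `conflictingKeys`, and the idea of passing the restricted keys as a set are all hypothetical and do not describe how Hudi implements this.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; names and structure are assumptions for illustration.
final class ImmutableConfigCheckSketch {

  // Returns the restricted keys whose incoming value differs from the value
  // already persisted with the table. Keys missing on either side are skipped.
  static List<String> conflictingKeys(Map<String, String> persisted,
                                      Map<String, String> incoming,
                                      Set<String> restrictedKeys) {
    List<String> conflicts = new ArrayList<>();
    for (String key : restrictedKeys) {
      if (persisted.containsKey(key) && incoming.containsKey(key)
          && !Objects.equals(persisted.get(key), incoming.get(key))) {
        conflicts.add(key);
      }
    }
    return conflicts;
  }

  public static void main(String[] args) {
    // "hive_style_partitioning" is used as a short illustrative key here.
    Set<String> restricted = new TreeSet<>();
    restricted.add("hive_style_partitioning");

    Map<String, String> persisted = new HashMap<>();
    persisted.put("hive_style_partitioning", "true");

    Map<String, String> incoming = new HashMap<>();
    incoming.put("hive_style_partitioning", "false");

    // A non-empty result means the write tries to change a restricted config.
    System.out.println(conflictingKeys(persisted, incoming, restricted));
  }
}
```

A writer could reject the operation when the returned list is non-empty, leaving delete-and-recreate (or overwrite) as the path for changing such settings.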





[GitHub] [hudi] hudi-bot commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1033585806


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599",
       "triggerID" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705",
       "triggerID" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5726",
       "triggerID" : "1029580885",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "6103ba6800a244253f9e7150f2ee90dc8019df61",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "6103ba6800a244253f9e7150f2ee90dc8019df61",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5726) 
   * 6103ba6800a244253f9e7150f2ee90dc8019df61 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] YannByron commented on a change in pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
YannByron commented on a change in pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#discussion_r802485256



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java
##########
@@ -750,6 +757,22 @@ public PropertyBuilder setCommitTimezone(HoodieTimelineTimeZone timelineTimeZone
       return this;
     }
 
+    public PropertyBuilder set(String key, Object value) {
+      if (HoodieTableConfig.PERSISTED_CONFIG_LIST.contains(key)) {
+        this.others.put(key, value);
+      }
+      return this;
+    }
+
+    public PropertyBuilder set(Map<String, Object> props) {
+      for (Map.Entry<String, Object> entry : props.entrySet()) {

Review comment:
       done.







[GitHub] [hudi] hudi-bot commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1033588821


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599",
       "triggerID" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705",
       "triggerID" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5726",
       "triggerID" : "1029580885",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "6103ba6800a244253f9e7150f2ee90dc8019df61",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5836",
       "triggerID" : "6103ba6800a244253f9e7150f2ee90dc8019df61",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5726) 
   * 6103ba6800a244253f9e7150f2ee90dc8019df61 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5836) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] YannByron commented on a change in pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
YannByron commented on a change in pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#discussion_r798492743



##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieFileIndex.scala
##########
@@ -282,6 +296,41 @@ object HoodieFileIndex {
     properties
   }
 
+  def convertFilterForTimestampKeyGenerator(metaClient: HoodieTableMetaClient,
+      partitionFilters: Seq[Expression]): Seq[Expression] = {
+
+    val tableConfig = metaClient.getTableConfig
+    val keyGenerator = tableConfig.getKeyGeneratorClassName
+
+    if (keyGenerator.equals(classOf[TimestampBasedKeyGenerator].getCanonicalName) ||

Review comment:
       ok







[GitHub] [hudi] hudi-bot commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1029581692


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599",
       "triggerID" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705",
       "triggerID" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5726",
       "triggerID" : "1029580885",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5726) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan merged pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
nsivabalan merged pull request #4714:
URL: https://github.com/apache/hudi/pull/4714


   





[GitHub] [hudi] hudi-bot commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1024839306


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599",
       "triggerID" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582) 
   * e8999e4928debb876332f287a1584cc7cbd69c85 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1024838933


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582) 
   * e8999e4928debb876332f287a1584cc7cbd69c85 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1024850151


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599",
       "triggerID" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e8999e4928debb876332f287a1584cc7cbd69c85 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] YannByron commented on a change in pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
YannByron commented on a change in pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#discussion_r798478214



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java
##########
@@ -759,6 +780,11 @@ public PropertyBuilder fromMetaClient(HoodieTableMetaClient metaClient) {
 
     public PropertyBuilder fromProperties(Properties properties) {
       HoodieConfig hoodieConfig = new HoodieConfig(properties);
+
+      for (String key : hoodieConfig.getProps().stringPropertyNames()) {

Review comment:
       You're right. The key-value pair is set only if the key configured in `hoodieConfig` is in `PERSISTED_CONFIG_LIST`.
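
The filtering behaviour described in this comment can be sketched as a small builder that silently ignores keys outside an allow-list. This is an illustrative stand-in only: `PersistedPropsSketch` and its `others()` accessor are hypothetical, and the allow-list is passed in rather than read from `HoodieTableConfig.PERSISTED_CONFIG_LIST`.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the builder in the diff; not Hudi's actual class.
final class PersistedPropsSketch {
  private final Set<String> persistedKeys;
  private final Map<String, Object> others = new HashMap<>();

  PersistedPropsSketch(Set<String> persistedKeys) {
    this.persistedKeys = persistedKeys;
  }

  // Stores the pair only when the key is in the allow-list.
  PersistedPropsSketch set(String key, Object value) {
    if (persistedKeys.contains(key)) {
      others.put(key, value);
    }
    return this;
  }

  // Applies the same filter to every entry of the given map.
  PersistedPropsSketch set(Map<String, Object> props) {
    for (Map.Entry<String, Object> entry : props.entrySet()) {
      set(entry.getKey(), entry.getValue());
    }
    return this;
  }

  Map<String, Object> others() {
    return others;
  }

  public static void main(String[] args) {
    // The key names below are made up for the example.
    PersistedPropsSketch builder =
        new PersistedPropsSketch(Collections.singleton("example.persisted.key"));
    builder.set("example.persisted.key", "kept").set("example.other.key", "dropped");
    System.out.println(builder.others());
  }
}
```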

##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestMORDataSource.scala
##########
@@ -770,4 +775,79 @@ class TestMORDataSource extends HoodieClientTestBase {
       .load(basePath + "/*/*/*/*")
     assertEquals(numRecords - numRecordsToDelete, snapshotDF2.count())
   }
+
+  /**
+   * This tests the case that query by with a specified partition condition on hudi table which is
+   * different between the value of the partition field and the actual partition path,
+   * like hudi table written by TimestampBasedKeyGenerator.
+   *
+   * For MOR table, test all the three query modes.
+   */
+  @Test
+  def testPrunePartitionForTimestampBasedKeyGenerator(): Unit = {
+    val options = commonOpts ++ Map(
+      "hoodie.compact.inline" -> "false",
+      DataSourceWriteOptions.TABLE_TYPE.key -> DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL,
+      DataSourceWriteOptions.KEYGENERATOR_CLASS_NAME.key -> "org.apache.hudi.keygen.TimestampBasedKeyGenerator",
+      Config.TIMESTAMP_TYPE_FIELD_PROP -> "DATE_STRING",
+      Config.TIMESTAMP_OUTPUT_DATE_FORMAT_PROP -> "yyyy/MM/dd",
+      Config.TIMESTAMP_TIMEZONE_FORMAT_PROP -> "GMT+8:00",
+      Config.TIMESTAMP_INPUT_DATE_FORMAT_PROP -> "yyyy-MM-dd"
+    )
+
+    val dataGen1 = new HoodieTestDataGenerator(Array("2022-01-01"))
+    val records1 = recordsToStrings(dataGen1.generateInserts("001", 50)).toList
+    val inputDF1 = spark.read.json(spark.sparkContext.parallelize(records1, 2))
+    inputDF1.write.format("org.apache.hudi")
+      .options(options)
+      .mode(SaveMode.Overwrite)
+      .save(basePath)
+    metaClient = HoodieTableMetaClient.builder()
+      .setBasePath(basePath)
+      .setConf(spark.sessionState.newHadoopConf)
+      .build()
+    val commit1Time = metaClient.getActiveTimeline.lastInstant().get().getTimestamp
+
+    val dataGen2 = new HoodieTestDataGenerator(Array("2022-01-02"))
+    val records2 = recordsToStrings(dataGen2.generateInserts("002", 50)).toList

Review comment:
       ok
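
The mismatch this test targets can be illustrated with the formats from the test options above: the partition field holds a value in the input format `yyyy-MM-dd`, while the partition path is written in the output format `yyyy/MM/dd`. The sketch below uses `java.time` and deliberately ignores the timezone option; `PartitionPathSketch` and `toPartitionPath` are hypothetical names, and this is not how `TimestampBasedKeyGenerator` itself is implemented.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration of the field-value vs. partition-path mismatch.
final class PartitionPathSketch {
  private static final DateTimeFormatter INPUT = DateTimeFormatter.ofPattern("yyyy-MM-dd");
  private static final DateTimeFormatter OUTPUT = DateTimeFormatter.ofPattern("yyyy/MM/dd");

  // Re-formats a partition field value into the layout used for the path.
  static String toPartitionPath(String partitionFieldValue) {
    return LocalDate.parse(partitionFieldValue, INPUT).format(OUTPUT);
  }

  public static void main(String[] args) {
    // The field value as a user would write it in a query filter.
    System.out.println(toPartitionPath("2022-01-01")); // prints 2022/01/01
  }
}
```

A filter that compares the raw field value `2022-01-01` against partition paths such as `2022/01/01` matches nothing, which is consistent with the empty query result named in the pull request title.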







[GitHub] [hudi] hudi-bot removed a comment on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1028935199


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599",
       "triggerID" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e8999e4928debb876332f287a1584cc7cbd69c85 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599) 
   * 8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1028974180


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599",
       "triggerID" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705",
       "triggerID" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1033585806


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599",
       "triggerID" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705",
       "triggerID" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5726",
       "triggerID" : "1029580885",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "6103ba6800a244253f9e7150f2ee90dc8019df61",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "6103ba6800a244253f9e7150f2ee90dc8019df61",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5726) 
   * 6103ba6800a244253f9e7150f2ee90dc8019df61 UNKNOWN
   



[GitHub] [hudi] nsivabalan commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1033159454


   sounds good 👍 





[GitHub] [hudi] hudi-bot removed a comment on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1034449656


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599",
       "triggerID" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705",
       "triggerID" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5726",
       "triggerID" : "1029580885",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "6103ba6800a244253f9e7150f2ee90dc8019df61",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5836",
       "triggerID" : "6103ba6800a244253f9e7150f2ee90dc8019df61",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4ba5756a483f5f5fab6878e4dc055f4116377650",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "4ba5756a483f5f5fab6878e4dc055f4116377650",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 6103ba6800a244253f9e7150f2ee90dc8019df61 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5836) 
   * 4ba5756a483f5f5fab6878e4dc055f4116377650 UNKNOWN
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1033668183


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599",
       "triggerID" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705",
       "triggerID" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5726",
       "triggerID" : "1029580885",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "6103ba6800a244253f9e7150f2ee90dc8019df61",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5836",
       "triggerID" : "6103ba6800a244253f9e7150f2ee90dc8019df61",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 6103ba6800a244253f9e7150f2ee90dc8019df61 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5836) 
   



[GitHub] [hudi] nsivabalan commented on a change in pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#discussion_r802141826



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java
##########
@@ -750,6 +757,22 @@ public PropertyBuilder setCommitTimezone(HoodieTimelineTimeZone timelineTimeZone
       return this;
     }
 
+    public PropertyBuilder set(String key, Object value) {
+      if (HoodieTableConfig.PERSISTED_CONFIG_LIST.contains(key)) {
+        this.others.put(key, value);
+      }
+      return this;
+    }
+
+    public PropertyBuilder set(Map<String, Object> props) {
+      for (Map.Entry<String, Object> entry : props.entrySet()) {

Review comment:
       Can we iterate over PERSISTED_CONFIG_LIST in this for loop and, on the next line, check whether each key is present in the incoming props? Iterating over the smaller list is better :)
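
A minimal sketch of the suggestion above, assuming the persisted-key list is much shorter than the incoming props. `PersistedConfigFilter`, `PERSISTED_KEYS`, `selectPersisted`, and the key strings are hypothetical stand-ins for illustration; they are not names or values from the PR.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PersistedConfigFilter {
  // Hypothetical stand-in for HoodieTableConfig.PERSISTED_CONFIG_LIST; the keys are illustrative only.
  static final List<String> PERSISTED_KEYS = Arrays.asList(
      "example.keygen.input.dateformat",
      "example.keygen.output.dateformat");

  // Walk the short allow-list and probe the (possibly large) props map,
  // instead of walking every incoming entry and probing the allow-list.
  static Map<String, Object> selectPersisted(Map<String, Object> props) {
    Map<String, Object> selected = new LinkedHashMap<>();
    for (String key : PERSISTED_KEYS) {
      if (props.containsKey(key)) {
        selected.put(key, props.get(key));
      }
    }
    return selected;
  }
}
```

With a hash-based props map, each lookup is constant time on average, so the work scales with the size of the allow-list rather than with the number of incoming properties.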

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableConfig.java
##########
@@ -183,6 +185,14 @@
   public static final ConfigProperty<String> URL_ENCODE_PARTITIONING = KeyGeneratorOptions.URL_ENCODE_PARTITIONING;
   public static final ConfigProperty<String> HIVE_STYLE_PARTITIONING_ENABLE = KeyGeneratorOptions.HIVE_STYLE_PARTITIONING_ENABLE;
 
+  public static final List<String> PERSISTED_CONFIG_LIST = Arrays.asList(
+      Config.DATE_TIME_PARSER_PROP,

Review comment:
       yes, please do open one. 

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java
##########
@@ -643,6 +644,12 @@ public static PropertyBuilder withPropertyBuilder() {
     private Boolean urlEncodePartitioning;
     private HoodieTimelineTimeZone commitTimeZone;
 
+    /**
+     * Persist the configs that is written at the first time, and should not be changed.
+     * Like KeyGenerator's configs.
+     */
+    private Properties others = new Properties();

Review comment:
       So far we have never added loose configs (arbitrary key-value pairs) to our tableConfig, and these will be empty unless the timestamp-based key generator is used. That said, I can't think of a better way.
   
   By the way, @xushiyan @codope @yihua: adding new properties to tableConfig does not warrant a new table version, right? I wanted to confirm with you folks.
   

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java
##########
@@ -759,6 +780,11 @@ public PropertyBuilder fromMetaClient(HoodieTableMetaClient metaClient) {
 
     public PropertyBuilder fromProperties(Properties properties) {
       HoodieConfig hoodieConfig = new HoodieConfig(properties);
+
+      for (String key : hoodieConfig.getProps().stringPropertyNames()) {

Review comment:
       Do you mean that hoodieConfig.getProps().stringPropertyNames() will be the same as PERSISTED_CONFIG_LIST? I am just trying to see whether we can iterate over the smaller of PERSISTED_CONFIG_LIST and hoodieConfig.getProps().stringPropertyNames().







[GitHub] [hudi] YannByron commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
YannByron commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1039162595


   @nsivabalan do you have time to continue reviewing this?





[GitHub] [hudi] hudi-bot removed a comment on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1028974180


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599",
       "triggerID" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705",
       "triggerID" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705) 
   



[GitHub] [hudi] YannByron commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
YannByron commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1029580885


   @hudi-bot run azure





[GitHub] [hudi] YannByron commented on a change in pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
YannByron commented on a change in pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#discussion_r802475227



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableConfig.java
##########
@@ -183,6 +185,14 @@
   public static final ConfigProperty<String> URL_ENCODE_PARTITIONING = KeyGeneratorOptions.URL_ENCODE_PARTITIONING;
   public static final ConfigProperty<String> HIVE_STYLE_PARTITIONING_ENABLE = KeyGeneratorOptions.HIVE_STYLE_PARTITIONING_ENABLE;
 
+  public static final List<String> PERSISTED_CONFIG_LIST = Arrays.asList(
+      Config.DATE_TIME_PARSER_PROP,

Review comment:
       https://issues.apache.org/jira/browse/HUDI-3403







[GitHub] [hudi] hudi-bot commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1034449656


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599",
       "triggerID" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705",
       "triggerID" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5726",
       "triggerID" : "1029580885",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "6103ba6800a244253f9e7150f2ee90dc8019df61",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5836",
       "triggerID" : "6103ba6800a244253f9e7150f2ee90dc8019df61",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4ba5756a483f5f5fab6878e4dc055f4116377650",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "4ba5756a483f5f5fab6878e4dc055f4116377650",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 6103ba6800a244253f9e7150f2ee90dc8019df61 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5836) 
   * 4ba5756a483f5f5fab6878e4dc055f4116377650 UNKNOWN
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1033588821


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599",
       "triggerID" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705",
       "triggerID" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5726",
       "triggerID" : "1029580885",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "6103ba6800a244253f9e7150f2ee90dc8019df61",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5836",
       "triggerID" : "6103ba6800a244253f9e7150f2ee90dc8019df61",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5726) 
   * 6103ba6800a244253f9e7150f2ee90dc8019df61 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5836) 
   



[GitHub] [hudi] hudi-bot commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1033668183


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599",
       "triggerID" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705",
       "triggerID" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5726",
       "triggerID" : "1029580885",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "6103ba6800a244253f9e7150f2ee90dc8019df61",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5836",
       "triggerID" : "6103ba6800a244253f9e7150f2ee90dc8019df61",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 6103ba6800a244253f9e7150f2ee90dc8019df61 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5836) 
   



[GitHub] [hudi] YannByron commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
YannByron commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1040092239


   > Hello, can someone check the build cc @nsivabalan @YannByron !
   
   https://github.com/apache/hudi/pull/4822 can fix this.





[GitHub] [hudi] YannByron commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
YannByron commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1024810709


   @nsivabalan @leesf could you help review this?





[GitHub] [hudi] hudi-bot removed a comment on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1024409363


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582) 
   



[GitHub] [hudi] danny0405 commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1040042258


   Hello, can someone check the build cc @nsivabalan @YannByron !





[GitHub] [hudi] hudi-bot removed a comment on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1024322851


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582) 
   



[GitHub] [hudi] YannByron commented on a change in pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
YannByron commented on a change in pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#discussion_r798482835



##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieFileIndex.scala
##########
@@ -282,6 +296,41 @@ object HoodieFileIndex {
     properties
   }
 
+  def convertFilterForTimestampKeyGenerator(metaClient: HoodieTableMetaClient,
+      partitionFilters: Seq[Expression]): Seq[Expression] = {
+
+    val tableConfig = metaClient.getTableConfig
+    val keyGenerator = tableConfig.getKeyGeneratorClassName
+
+    if (keyGenerator.equals(classOf[TimestampBasedKeyGenerator].getCanonicalName) ||
+        keyGenerator.equals(classOf[TimestampBasedAvroKeyGenerator].getCanonicalName)) {
+      val inputFormat = tableConfig.getString(KeyGeneratorOptions.Config.TIMESTAMP_INPUT_DATE_FORMAT_PROP)
+      val outputFormat = tableConfig.getString(KeyGeneratorOptions.Config.TIMESTAMP_OUTPUT_DATE_FORMAT_PROP)
+      if (StringUtils.isNullOrEmpty(inputFormat) || StringUtils.isNullOrEmpty(outputFormat) ||
+          inputFormat.equals(outputFormat)) {
+        partitionFilters
+      } else {
+        try {
+          val inDateFormat = new SimpleDateFormat(inputFormat)
+          val outDateFormat = new SimpleDateFormat(outputFormat)
+          partitionFilters.toArray.map {

Review comment:
       Yes. Filters are converted only if both input.format and output.format are provided.
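
The condition described in this comment can be illustrated with a minimal, self-contained sketch. It is not the PR's implementation: `TimestampFilterSketch` and `convertLiteral` are hypothetical names, plain strings stand in for Spark `Expression` literals, the UTC time zone is an arbitrary choice, and the direction of conversion (parse with the input format, render with the output format) is an assumption based on the variable names in the diff.

```scala
import java.text.SimpleDateFormat
import java.util.TimeZone
import scala.util.Try

// Hypothetical helper for illustration only; not part of Hudi.
object TimestampFilterSketch {

  private def isBlank(s: String): Boolean = s == null || s.isEmpty

  /** Rewrites a partition-filter literal from inputFormat to outputFormat.
    * The value is returned unchanged when either format is missing, when
    * both formats are identical, or when the value cannot be parsed.
    */
  def convertLiteral(value: String, inputFormat: String, outputFormat: String): String = {
    if (isBlank(inputFormat) || isBlank(outputFormat) || inputFormat == outputFormat) {
      // Nothing to convert: mirrors the early return in the diff above.
      value
    } else {
      Try {
        val utc = TimeZone.getTimeZone("UTC") // assumption: fixed zone for determinism
        val in = new SimpleDateFormat(inputFormat)
        val out = new SimpleDateFormat(outputFormat)
        in.setTimeZone(utc)
        out.setTimeZone(utc)
        out.format(in.parse(value))
      }.getOrElse(value) // fall back to the original literal on any parse error
    }
  }
}
```

For example, `TimestampFilterSketch.convertLiteral("2022-01-28 15:15:53", "yyyy-MM-dd HH:mm:ss", "yyyy/MM/dd")` yields `2022/01/28`, while a call with an empty input or output format returns the literal untouched.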







[GitHub] [hudi] hudi-bot commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1028937703


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599",
       "triggerID" : "e8999e4928debb876332f287a1584cc7cbd69c85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705",
       "triggerID" : "8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e8999e4928debb876332f287a1584cc7cbd69c85 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5599) 
   * 8ee6a4e07374f18a35f3f1807b11fa5b6570cb6e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5705) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1028937703







[GitHub] [hudi] hudi-bot commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1024322851


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582",
       "triggerID" : "1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 1dc6c2f464e45fcbffe5d7fde2fbb1c66a6fca34 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5582) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan commented on pull request #4714: [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re…

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #4714:
URL: https://github.com/apache/hudi/pull/4714#issuecomment-1031936997


   @xushiyan @YannByron: I wanted your thoughts in general on disallowing updates to key gen props. 
   Let's say someone tried a timestamp-based key gen and the output was not what they expected, so next time they issue an insert_overwrite_table. Do we allow a different set of key generator configs now? If not, that is too restrictive, right? Is the only way around it to explicitly delete the table and then change the key gen configs? 
   
   

