Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/06/24 10:30:06 UTC

[GitHub] [hudi] wangxianghu opened a new pull request #3147: [HUDI-1983] Add key gen type config to Configurations page in hudi we…

wangxianghu opened a new pull request #3147:
URL: https://github.com/apache/hudi/pull/3147


   …bsite
   
   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a pull request.*
   
   ## What is the purpose of the pull request
   
   Update the key generator docs to reflect the changes made in https://github.com/apache/hudi/commit/7261f0850727aea611cd34e1bb07d684b44534f6
   
   ## Brief change log
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
     - *Added integration tests for end-to-end.*
     - *Added HoodieClientWriteTest to verify the change.*
     - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] nsivabalan commented on a change in pull request #3147: [HUDI-1983] [DOC]Add key gen type config to Configurations page in hudi we…

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3147:
URL: https://github.com/apache/hudi/pull/3147#discussion_r659880019



##########
File path: docs/_docs/2_4_configurations.md
##########
@@ -55,7 +55,8 @@ Options useful for writing tables via `write.format.option(...)`
 | RECORDKEY_FIELD_OPT_KEY | hoodie.datasource.write.recordkey.field | NO | uuid | Record key field. Value to be used as the recordKey component of HoodieKey. Actual value will be obtained by invoking .toString() on the field value. Nested fields can be specified using the dot notation eg: a.b.c |
 | PARTITIONPATH_FIELD_OPT_KEY | hoodie.datasource.write.partitionpath.field | NO | partitionpath | Partition path field. Value to be used at the partitionPath component of HoodieKey. Actual value ontained by invoking .toString() |
 | HIVE_STYLE_PARTITIONING_OPT_KEY | hoodie.datasource.write.hive_style_partitioning | NO | false | If set true, the names of partition folders follow <partition_column_name>=<partition_value> format. |
-| KEYGENERATOR_CLASS_OPT_KEY | hoodie.datasource.write.keygenerator.class | NO | org.apache.hudi.keygen.SimpleKeyGenerator | Key generator class, that implements will extract the key out of incoming Row object. |
+| KEYGENERATOR_CLASS_OPT_KEY |  | NO | org.apache.hudi.keygen.SimpleKeyGenerator | Key generator class, that implements will extract the key out of incoming `Row` object. This config has higher precedence over keygen type, and it is used for user-defined KeyGenerator . |

Review comment:
       I thought we should remove "org.apache.hudi.keygen.SimpleKeyGenerator" from here and say "No default" 







[GitHub] [hudi] wangxianghu commented on pull request #3147: [HUDI-1983] Add key gen type config to Configurations page in hudi we…

Posted by GitBox <gi...@apache.org>.
wangxianghu commented on pull request #3147:
URL: https://github.com/apache/hudi/pull/3147#issuecomment-867526758


   @nsivabalan, please take a look when you are free.





[GitHub] [hudi] wangxianghu commented on a change in pull request #3147: [HUDI-1983] [DOC]Add key gen type config to Configurations page in hudi we…

Posted by GitBox <gi...@apache.org>.
wangxianghu commented on a change in pull request #3147:
URL: https://github.com/apache/hudi/pull/3147#discussion_r660206895



##########
File path: docs/_docs/2_4_configurations.md
##########
@@ -55,7 +55,8 @@ Options useful for writing tables via `write.format.option(...)`
 | RECORDKEY_FIELD_OPT_KEY | hoodie.datasource.write.recordkey.field | NO | uuid | Record key field. Value to be used as the recordKey component of HoodieKey. Actual value will be obtained by invoking .toString() on the field value. Nested fields can be specified using the dot notation eg: a.b.c |
 | PARTITIONPATH_FIELD_OPT_KEY | hoodie.datasource.write.partitionpath.field | NO | partitionpath | Partition path field. Value to be used at the partitionPath component of HoodieKey. Actual value ontained by invoking .toString() |
 | HIVE_STYLE_PARTITIONING_OPT_KEY | hoodie.datasource.write.hive_style_partitioning | NO | false | If set true, the names of partition folders follow <partition_column_name>=<partition_value> format. |
-| KEYGENERATOR_CLASS_OPT_KEY | hoodie.datasource.write.keygenerator.class | NO | org.apache.hudi.keygen.SimpleKeyGenerator | Key generator class, that implements will extract the key out of incoming Row object. |
+| KEYGENERATOR_CLASS_OPT_KEY |  | NO | org.apache.hudi.keygen.SimpleKeyGenerator | Key generator class, that implements will extract the key out of incoming `Row` object. This config has higher precedence over keygen type, and it is used for user-defined KeyGenerator . |

Review comment:
       > I thought we should remove "org.apache.hudi.keygen.SimpleKeyGenerator" from here and say "No default"
   
   done







[GitHub] [hudi] wangxianghu commented on a change in pull request #3147: [HUDI-1983] [DOC]Add key gen type config to Configurations page in hudi we…

Posted by GitBox <gi...@apache.org>.
wangxianghu commented on a change in pull request #3147:
URL: https://github.com/apache/hudi/pull/3147#discussion_r659499344



##########
File path: docs/_docs/2_4_configurations.md
##########
@@ -83,8 +83,12 @@ Actual value ontained by invoking .toString()</span>
 
 #### KEYGENERATOR_CLASS_OPT_KEY {#KEYGENERATOR_CLASS_OPT_KEY}
   Property: `hoodie.datasource.write.keygenerator.class`, Default: `org.apache.hudi.keygen.SimpleKeyGenerator` <br/>
-  <span style="color:grey">Key generator class, that implements will extract the key out of incoming `Row` object</span>
-  
+  <span style="color:grey">Key generator class, that implements will extract the key out of incoming `Row` object. This config option is used for user-defined KeyGenerator</span>

Review comment:
       > Also, if I understand correctly, there is no longer a default for hoodie.datasource.write.keygenerator.class. Can we fix that as well?
   
   done







[GitHub] [hudi] nsivabalan commented on a change in pull request #3147: [HUDI-1983] [DOC]Add key gen type config to Configurations page in hudi we…

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3147:
URL: https://github.com/apache/hudi/pull/3147#discussion_r659453008



##########
File path: docs/_docs/2_4_configurations.md
##########
@@ -83,8 +83,12 @@ Actual value ontained by invoking .toString()</span>
 
 #### KEYGENERATOR_CLASS_OPT_KEY {#KEYGENERATOR_CLASS_OPT_KEY}
   Property: `hoodie.datasource.write.keygenerator.class`, Default: `org.apache.hudi.keygen.SimpleKeyGenerator` <br/>
-  <span style="color:grey">Key generator class, that implements will extract the key out of incoming `Row` object</span>
-  
+  <span style="color:grey">Key generator class, that implements will extract the key out of incoming `Row` object. This config option is used for user-defined KeyGenerator</span>

Review comment:
       Can you also add a line there stating that this has higher precedence over keygen type?

##########
File path: docs/_docs/2_4_configurations.md
##########
@@ -83,8 +83,12 @@ Actual value ontained by invoking .toString()</span>
 
 #### KEYGENERATOR_CLASS_OPT_KEY {#KEYGENERATOR_CLASS_OPT_KEY}
   Property: `hoodie.datasource.write.keygenerator.class`, Default: `org.apache.hudi.keygen.SimpleKeyGenerator` <br/>
-  <span style="color:grey">Key generator class, that implements will extract the key out of incoming `Row` object</span>
-  
+  <span style="color:grey">Key generator class, that implements will extract the key out of incoming `Row` object. This config option is used for user-defined KeyGenerator</span>

Review comment:
       Also, if I understand correctly, there is no longer a default for hoodie.datasource.write.keygenerator.class. Can we fix that as well?
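
The precedence rule requested in these comments can be illustrated with a small sketch. It is a model of the documented behaviour only, not Hudi's implementation: the class option key is taken from the diff above, while the type option key (`hoodie.datasource.write.keygenerator.type`), the `SIMPLE` default type, the type-to-class table, and the function name `resolve_key_generator` are assumptions made for illustration.

```python
from typing import Mapping

# The class key appears in the diff above. The type key is an assumed name.
KEYGEN_CLASS_KEY = "hoodie.datasource.write.keygenerator.class"
KEYGEN_TYPE_KEY = "hoodie.datasource.write.keygenerator.type"  # assumption

# Hypothetical lookup table from keygen type to a built-in class name.
BUILT_IN_KEYGENS = {
    "SIMPLE": "org.apache.hudi.keygen.SimpleKeyGenerator",
    "COMPLEX": "org.apache.hudi.keygen.ComplexKeyGenerator",
}
DEFAULT_KEYGEN_TYPE = "SIMPLE"  # assumed default type


def resolve_key_generator(options: Mapping[str, str]) -> str:
    """Return the key generator class name implied by the write options."""
    # The class option has no default. When the user sets it, it takes
    # precedence over the keygen type.
    class_name = options.get(KEYGEN_CLASS_KEY, "").strip()
    if class_name:
        return class_name

    # Otherwise fall back to the keygen type and map it to a built-in class.
    keygen_type = options.get(KEYGEN_TYPE_KEY, DEFAULT_KEYGEN_TYPE).strip().upper()
    if keygen_type not in BUILT_IN_KEYGENS:
        raise ValueError(f"unknown key generator type: {keygen_type!r}")
    return BUILT_IN_KEYGENS[keygen_type]
```

Under these assumptions, passing both options with a user-defined class such as `com.example.MyKeyGenerator` (a made-up name) resolves to that class, and the type is consulted only when the class option is absent or empty.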







[GitHub] [hudi] nsivabalan merged pull request #3147: [HUDI-1983] [DOC]Add key gen type config to Configurations page in hudi we…

Posted by GitBox <gi...@apache.org>.
nsivabalan merged pull request #3147:
URL: https://github.com/apache/hudi/pull/3147


   

