Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/10/31 10:10:50 UTC

[GitHub] [hudi] dheerajpanangat opened a new pull request, #7097: [HUDI-5103] - Pass root configuration from Context to Hudi and Hadoop

dheerajpanangat opened a new pull request, #7097:
URL: https://github.com/apache/hudi/pull/7097

   Pass root configuration from Context to Hudi and Hadoop
   
   ### Change Logs
   
   _Describe context and summary for this change. Highlight if any code was copied._
   
   ### Impact
   
   _Describe any public API or user-facing feature change or any performance impact._
   
   ### Risk level (write none, low medium or high below)
   
   _If medium or high, explain what verification was done to mitigate the risks._
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
     ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make
     changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #7097: [HUDI-5103] - Pass root configuration from Context to Hudi and Hadoop

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7097:
URL: https://github.com/apache/hudi/pull/7097#issuecomment-1298342623

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f6110b6b823d4d3165ffac87420a466050cffc42",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12687",
       "triggerID" : "f6110b6b823d4d3165ffac87420a466050cffc42",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d563eb857fef029de868f281ce58cf1682e263f7",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12702",
       "triggerID" : "d563eb857fef029de868f281ce58cf1682e263f7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d563eb857fef029de868f281ce58cf1682e263f7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12702) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7097: [HUDI-5103] - Pass root configuration from Context to Hudi and Hadoop

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7097:
URL: https://github.com/apache/hudi/pull/7097#issuecomment-1298198731

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f6110b6b823d4d3165ffac87420a466050cffc42",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12687",
       "triggerID" : "f6110b6b823d4d3165ffac87420a466050cffc42",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d563eb857fef029de868f281ce58cf1682e263f7",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12702",
       "triggerID" : "d563eb857fef029de868f281ce58cf1682e263f7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f6110b6b823d4d3165ffac87420a466050cffc42 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12687) 
   * d563eb857fef029de868f281ce58cf1682e263f7 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12702) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] danny0405 commented on a diff in pull request #7097: [HUDI-5103] - Pass root configuration from Context to Hudi and Hadoop

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #7097:
URL: https://github.com/apache/hudi/pull/7097#discussion_r1010222957


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java:
##########
@@ -347,4 +351,12 @@ private static void inferAvroSchema(Configuration conf, LogicalType rowType) {
       conf.setString(FlinkOptions.SOURCE_AVRO_SCHEMA, inferredSchema);
     }
   }
+
+  private static void setupRootOptions(Configuration conf, ReadableConfig configuration) {
+    if (configuration instanceof TableConfig) {
+      ((Configuration)((TableConfig) configuration).getRootConfiguration()).toMap().forEach((rootConfigKey, rootConfigValue) -> {
+        conf.setString(rootConfigKey, rootConfigValue);

Review Comment:
   `TableConfig` is meant for passing Flink SQL job config options around, so it is not a good fit for Hadoop configuration.
   
   Can you put the specific Hadoop config options into the Hadoop `core-site.xml` file instead? It lives under one of the directories `HADOOP_CONF_DIR`, `HADOOP_HOME/conf`, or `HADOOP_HOME/etc/hadoop`.
   
   Or just use the table options, which are already supported.
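
To check which `core-site.xml` a job is likely to pick up, the three directories named in this comment can be listed from the environment. The following is a minimal sketch using only the Java standard library. The class `HadoopConfDirs` and its `candidates` method are hypothetical names introduced for illustration, and the lookup order shown is an assumption, not a statement of how Flink or Hudi resolve the file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of Hudi, Flink, or Hadoop. */
final class HadoopConfDirs {

  private HadoopConfDirs() {}

  /**
   * Lists the directories mentioned in the review comment, in the order
   * HADOOP_CONF_DIR, HADOOP_HOME/conf, HADOOP_HOME/etc/hadoop.
   * The order is an assumption made for this sketch.
   */
  static List<String> candidates(Map<String, String> env) {
    List<String> dirs = new ArrayList<>();
    String confDir = env.get("HADOOP_CONF_DIR");
    if (confDir != null && !confDir.isEmpty()) {
      dirs.add(confDir);
    }
    String home = env.get("HADOOP_HOME");
    if (home != null && !home.isEmpty()) {
      dirs.add(home + "/conf");
      dirs.add(home + "/etc/hadoop");
    }
    return dirs;
  }

  public static void main(String[] args) {
    // Print the core-site.xml locations implied by the current environment.
    for (String dir : candidates(System.getenv())) {
      System.out.println(dir + "/core-site.xml");
    }
  }
}
```

If neither variable is set, the list is empty, which is one way the Azure options could go missing before Hadoop ever sees them.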





[GitHub] [hudi] hudi-bot commented on pull request #7097: [HUDI-5103] - Pass root configuration from Context to Hudi and Hadoop

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7097:
URL: https://github.com/apache/hudi/pull/7097#issuecomment-1316238244

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f6110b6b823d4d3165ffac87420a466050cffc42",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12687",
       "triggerID" : "f6110b6b823d4d3165ffac87420a466050cffc42",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d563eb857fef029de868f281ce58cf1682e263f7",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12702",
       "triggerID" : "d563eb857fef029de868f281ce58cf1682e263f7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7cc927203542907331a54108d9db3a8f49596021",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "7cc927203542907331a54108d9db3a8f49596021",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d563eb857fef029de868f281ce58cf1682e263f7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12702) 
   * 7cc927203542907331a54108d9db3a8f49596021 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7097: [HUDI-5103] - Pass root configuration from Context to Hudi and Hadoop

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7097:
URL: https://github.com/apache/hudi/pull/7097#issuecomment-1296934345

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f6110b6b823d4d3165ffac87420a466050cffc42",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f6110b6b823d4d3165ffac87420a466050cffc42",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f6110b6b823d4d3165ffac87420a466050cffc42 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7097: [HUDI-5103] - Pass root configuration from Context to Hudi and Hadoop

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7097:
URL: https://github.com/apache/hudi/pull/7097#issuecomment-1298193408

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f6110b6b823d4d3165ffac87420a466050cffc42",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12687",
       "triggerID" : "f6110b6b823d4d3165ffac87420a466050cffc42",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d563eb857fef029de868f281ce58cf1682e263f7",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "d563eb857fef029de868f281ce58cf1682e263f7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f6110b6b823d4d3165ffac87420a466050cffc42 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12687) 
   * d563eb857fef029de868f281ce58cf1682e263f7 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7097: [HUDI-5103] - Pass root configuration from Context to Hudi and Hadoop

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7097:
URL: https://github.com/apache/hudi/pull/7097#issuecomment-1297261446

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f6110b6b823d4d3165ffac87420a466050cffc42",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12687",
       "triggerID" : "f6110b6b823d4d3165ffac87420a466050cffc42",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f6110b6b823d4d3165ffac87420a466050cffc42 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12687) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] danny0405 commented on a diff in pull request #7097: [HUDI-5103] - Pass root configuration from Context to Hudi and Hadoop

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #7097:
URL: https://github.com/apache/hudi/pull/7097#discussion_r1010129745


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java:
##########
@@ -347,4 +351,12 @@ private static void inferAvroSchema(Configuration conf, LogicalType rowType) {
       conf.setString(FlinkOptions.SOURCE_AVRO_SCHEMA, inferredSchema);
     }
   }
+
+  private static void setupRootOptions(Configuration conf, ReadableConfig configuration) {
+    if (configuration instanceof TableConfig) {
+      ((Configuration)((TableConfig) configuration).getRootConfiguration()).toMap().forEach((rootConfigKey, rootConfigValue) -> {
+        conf.setString(rootConfigKey, rootConfigValue);

Review Comment:
   I'm not sure what kind of options are passed through `context.getConfiguration()`; it seems to be the options from `TableConfig`? If you use SQL, options with the prefix `hadoop.` are already passed through correctly.
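
The `hadoop.` prefix convention mentioned here can be pictured as a filter over the table options: keys that start with `hadoop.` are selected, the prefix is stripped, and the remainder is handed to Hadoop. The following is a minimal sketch of that idea using plain Java maps. The class `HadoopPrefixedOptions` and its `extract` method are hypothetical and are not Hudi's implementation; the option values are placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not Hudi's actual code. */
final class HadoopPrefixedOptions {

  static final String PREFIX = "hadoop.";

  private HadoopPrefixedOptions() {}

  /** Returns the entries whose key starts with "hadoop.", with the prefix stripped. */
  static Map<String, String> extract(Map<String, String> tableOptions) {
    Map<String, String> hadoopOptions = new LinkedHashMap<>();
    for (Map.Entry<String, String> entry : tableOptions.entrySet()) {
      String key = entry.getKey();
      // Skip keys without the prefix, and the degenerate key "hadoop." itself.
      if (key.startsWith(PREFIX) && key.length() > PREFIX.length()) {
        hadoopOptions.put(key.substring(PREFIX.length()), entry.getValue());
      }
    }
    return hadoopOptions;
  }

  public static void main(String[] args) {
    Map<String, String> tableOptions = new LinkedHashMap<>();
    tableOptions.put("connector", "hudi");
    tableOptions.put("hadoop.fs.azure.account.auth.type", "OAuth"); // placeholder value
    System.out.println(extract(tableOptions)); // prints {fs.azure.account.auth.type=OAuth}
  }
}
```

Under this convention only explicitly prefixed keys reach Hadoop, whereas copying the whole root configuration forwards every job option.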





[GitHub] [hudi] hudi-bot commented on pull request #7097: [HUDI-5103] - Pass root configuration from Context to Hudi and Hadoop

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7097:
URL: https://github.com/apache/hudi/pull/7097#issuecomment-1296940600

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f6110b6b823d4d3165ffac87420a466050cffc42",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12687",
       "triggerID" : "f6110b6b823d4d3165ffac87420a466050cffc42",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f6110b6b823d4d3165ffac87420a466050cffc42 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12687) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] dheerajpanangat commented on a diff in pull request #7097: [HUDI-5103] - Pass root configuration from Context to Hudi and Hadoop

Posted by GitBox <gi...@apache.org>.
dheerajpanangat commented on code in PR #7097:
URL: https://github.com/apache/hudi/pull/7097#discussion_r1010174663


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java:
##########
@@ -347,4 +351,12 @@ private static void inferAvroSchema(Configuration conf, LogicalType rowType) {
       conf.setString(FlinkOptions.SOURCE_AVRO_SCHEMA, inferredSchema);
     }
   }
+
+  private static void setupRootOptions(Configuration conf, ReadableConfig configuration) {
+    if (configuration instanceof TableConfig) {
+      ((Configuration)((TableConfig) configuration).getRootConfiguration()).toMap().forEach((rootConfigKey, rootConfigValue) -> {
+        conf.setString(rootConfigKey, rootConfigValue);

Review Comment:
   Hi @danny0405,
   Thanks for looking at this.
   
   The issue arises when we use Hudi with Flink and Azure.
   The Flink configuration includes properties that Hadoop needs in order to connect to Azure.
   In the flow Flink -> Hudi -> Hadoop -> storage for the table, these configurations are not passed along.
   
   The hadoop-azure library never gets the configs for ClientId, Credentials, AuthType, etc.
   I made this change to pass these configurations from the Flink library to the Hadoop library.
   
   Another workaround is to send the configs as part of the CatalogTable options, but that does not seem correct.
   Let me know your thoughts, though.





[GitHub] [hudi] dheerajpanangat closed pull request #7097: [HUDI-5103] - Pass root configuration from Context to Hudi and Hadoop

Posted by GitBox <gi...@apache.org>.
dheerajpanangat closed pull request #7097: [HUDI-5103] - Pass root configuration from Context to Hudi and Hadoop
URL: https://github.com/apache/hudi/pull/7097




[GitHub] [hudi] dheerajpanangat commented on a diff in pull request #7097: [HUDI-5103] - Pass root configuration from Context to Hudi and Hadoop

Posted by GitBox <gi...@apache.org>.
dheerajpanangat commented on code in PR #7097:
URL: https://github.com/apache/hudi/pull/7097#discussion_r1023456743


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java:
##########
@@ -347,4 +351,12 @@ private static void inferAvroSchema(Configuration conf, LogicalType rowType) {
       conf.setString(FlinkOptions.SOURCE_AVRO_SCHEMA, inferredSchema);
     }
   }
+
+  private static void setupRootOptions(Configuration conf, ReadableConfig configuration) {
+    if (configuration instanceof TableConfig) {
+      ((Configuration)((TableConfig) configuration).getRootConfiguration()).toMap().forEach((rootConfigKey, rootConfigValue) -> {
+        conf.setString(rootConfigKey, rootConfigValue);

Review Comment:
   Sure, @danny0405.
   Can I close the merge request then?
   Can we add these to the Hudi documentation?


