Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/07/25 08:05:27 UTC

[GitHub] [hudi] TJX2014 opened a new pull request, #6207: [HUDI-4461] Fix org.apache.hudi.sink.TestWriteCopyOnWrite will failed when local hadoop env exists

TJX2014 opened a new pull request, #6207:
URL: https://github.com/apache/hudi/pull/6207

   ## What is the purpose of the pull request
   Fix `org.apache.hudi.sink.TestWriteCopyOnWrite`, which fails when a local Hadoop environment is present.
   
   ## Brief change log
   Remove `HADOOP_CONF_DIR` and `HADOOP_HOME` from the environment in `TestWriteBase`.
   
   ## Verify this pull request
   Tested locally.
   
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] TJX2014 commented on a diff in pull request #6207: [HUDI-4461] Fix org.apache.hudi.sink.TestWriteCopyOnWrite will failed when local hadoop env exists

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on code in PR #6207:
URL: https://github.com/apache/hudi/pull/6207#discussion_r946496367


##########
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/TestWriteBase.java:
##########
@@ -102,6 +103,23 @@ public class TestWriteBase {
         "id1,par1,id1,Danny,23,3,par1",
         "id1,par1,id1,Danny,23,4,par1",
         "id1,par1,id1,Danny,23,4,par1"));
+
+    removeHadoopConf();
+  }
+
+  private static void removeHadoopConf() {
+    Map<String, String> env = System.getenv();
+    Class<?> clazz = env.getClass();
+    Field field = null;
+    try {
+      field = clazz.getDeclaredField("m");
+      field.setAccessible(true);
+      Map<String, String> map = (Map<String, String>) field.get(env);
+      map.remove("HADOOP_CONF_DIR");

Review Comment:
   It seems the test module should also go through `HadoopConfigurations.getHadoopConf`? If not, the test module is inconsistent with the hudi core module.
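
The hunk quoted above breaks off after the first `map.remove` call. As a rough illustration of the technique it uses, here is a minimal, self-contained sketch; `EnvScrubber` and `tryRemove` are hypothetical names, not code from this PR. It assumes the map returned by `System.getenv()` is an unmodifiable wrapper whose delegate sits in a private field named `m`, which holds on OpenJDK but is not a documented contract. On JDK 16 and later the reflective access is refused unless `java.base/java.util` is opened to the caller, so the sketch reports the outcome instead of assuming success.

```java
import java.lang.reflect.Field;
import java.util.Map;

/** Hypothetical helper, not part of Hudi: best-effort removal of variables from this JVM's view of the environment. */
final class EnvScrubber {

  private EnvScrubber() {
  }

  /**
   * Tries to drop the given keys from the map behind System.getenv().
   * Returns true only if none of the keys is visible afterwards.
   */
  @SuppressWarnings("unchecked")
  static boolean tryRemove(String... keys) {
    Map<String, String> env = System.getenv();
    try {
      // Assumption: the unmodifiable wrapper keeps its delegate in a field named "m".
      Field field = env.getClass().getDeclaredField("m");
      field.setAccessible(true);
      Map<String, String> writable = (Map<String, String>) field.get(env);
      for (String key : keys) {
        writable.remove(key);
      }
    } catch (ReflectiveOperationException | RuntimeException e) {
      // Access was refused or the field layout differs; fall through and
      // report what is actually visible.
    }
    for (String key : keys) {
      if (System.getenv(key) != null) {
        return false;
      }
    }
    return true;
  }
}
```

Because the helper returns `false` rather than throwing, a caller can decide what to do when the environment cannot be scrubbed, for example `EnvScrubber.tryRemove("HADOOP_CONF_DIR", "HADOOP_HOME")` followed by a skip or a warning.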





[GitHub] [hudi] TJX2014 commented on a diff in pull request #6207: [HUDI-4461] Fix org.apache.hudi.sink.TestWriteCopyOnWrite will failed when local hadoop env exists

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on code in PR #6207:
URL: https://github.com/apache/hudi/pull/6207#discussion_r959283718


##########
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/TestWriteBase.java:
##########
@@ -102,6 +103,23 @@ public class TestWriteBase {
         "id1,par1,id1,Danny,23,3,par1",
         "id1,par1,id1,Danny,23,4,par1",
         "id1,par1,id1,Danny,23,4,par1"));
+
+    removeHadoopConf();
+  }
+
+  private static void removeHadoopConf() {
+    Map<String, String> env = System.getenv();
+    Class<?> clazz = env.getClass();
+    Field field = null;
+    try {
+      field = clazz.getDeclaredField("m");
+      field.setAccessible(true);
+      Map<String, String> map = (Map<String, String>) field.get(env);
+      map.remove("HADOOP_CONF_DIR");

Review Comment:
   OK, I will fix the test code later; it seems there are other issues in the hudi-flink module that I need to fix.





[GitHub] [hudi] danny0405 commented on a diff in pull request #6207: [HUDI-4461] Fix org.apache.hudi.sink.TestWriteCopyOnWrite will failed when local hadoop env exists

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #6207:
URL: https://github.com/apache/hudi/pull/6207#discussion_r946501259


##########
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/TestWriteBase.java:
##########
@@ -102,6 +103,23 @@ public class TestWriteBase {
         "id1,par1,id1,Danny,23,3,par1",
         "id1,par1,id1,Danny,23,4,par1",
         "id1,par1,id1,Danny,23,4,par1"));
+
+    removeHadoopConf();
+  }
+
+  private static void removeHadoopConf() {
+    Map<String, String> env = System.getenv();
+    Class<?> clazz = env.getClass();
+    Field field = null;
+    try {
+      field = clazz.getDeclaredField("m");
+      field.setAccessible(true);
+      Map<String, String> map = (Map<String, String>) field.get(env);
+      map.remove("HADOOP_CONF_DIR");

Review Comment:
   We should. We can fix the test code if you find code that does not use `HadoopConfigurations.getHadoopConf`.





[GitHub] [hudi] TJX2014 commented on pull request #6207: [HUDI-4461] Fix org.apache.hudi.sink.TestWriteCopyOnWrite will failed when local hadoop env exists

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on PR #6207:
URL: https://github.com/apache/hudi/pull/6207#issuecomment-1193745982

   If we need to use the local Hadoop env, I think `org.apache.hudi.utils.TestData#checkWrittenData(java.io.File, java.util.Map<java.lang.String,java.lang.String>, int)` should also follow the local Hadoop env rather than the dev env.




[GitHub] [hudi] hudi-bot commented on pull request #6207: [HUDI-4461] Fix org.apache.hudi.sink.TestWriteCopyOnWrite will failed when local hadoop env exists

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6207:
URL: https://github.com/apache/hudi/pull/6207#issuecomment-1193925591

   ## CI report:
   
   * 8450ba2fceee31211b34cbb5e9b97008b9aa5c01 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10310) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] danny0405 commented on a diff in pull request #6207: [HUDI-4461] Fix org.apache.hudi.sink.TestWriteCopyOnWrite will failed when local hadoop env exists

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #6207:
URL: https://github.com/apache/hudi/pull/6207#discussion_r934087828


##########
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/TestWriteBase.java:
##########
@@ -102,6 +103,23 @@ public class TestWriteBase {
         "id1,par1,id1,Danny,23,3,par1",
         "id1,par1,id1,Danny,23,4,par1",
         "id1,par1,id1,Danny,23,4,par1"));
+
+    removeHadoopConf();
+  }
+
+  private static void removeHadoopConf() {
+    Map<String, String> env = System.getenv();
+    Class<?> clazz = env.getClass();
+    Field field = null;
+    try {
+      field = clazz.getDeclaredField("m");
+      field.setAccessible(true);
+      Map<String, String> map = (Map<String, String>) field.get(env);
+      map.remove("HADOOP_CONF_DIR");

Review Comment:
   Should we fix this? Can you make your local env correct?
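
One reflection-free reading of "make your local env correct" is to start the test run from a process whose environment simply lacks the two variables. The sketch below shows that idea with `ProcessBuilder`; `CleanEnvLauncher` and `withoutHadoopEnv` are hypothetical names, and the command to run is left to the caller rather than assumed here.

```java
import java.util.List;

/** Hypothetical helper, not part of Hudi. */
final class CleanEnvLauncher {

  private CleanEnvLauncher() {
  }

  /** Prepares (but does not start) a child process whose environment lacks the Hadoop variables. */
  static ProcessBuilder withoutHadoopEnv(List<String> command) {
    ProcessBuilder builder = new ProcessBuilder(command);
    // environment() starts as a copy of the parent's environment and is
    // private to this builder, so the parent JVM is left untouched.
    builder.environment().remove("HADOOP_CONF_DIR");
    builder.environment().remove("HADOOP_HOME");
    // Forward the child's stdin/stdout/stderr to the current process.
    return builder.inheritIO();
  }
}
```

Calling `start()` on the returned builder launches the command; nothing in the current JVM's own environment changes, which is the main difference from the reflection approach in the diff.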





[GitHub] [hudi] danny0405 commented on a diff in pull request #6207: [HUDI-4461] Fix org.apache.hudi.sink.TestWriteCopyOnWrite will failed when local hadoop env exists

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #6207:
URL: https://github.com/apache/hudi/pull/6207#discussion_r935167478


##########
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/TestWriteBase.java:
##########
@@ -102,6 +103,23 @@ public class TestWriteBase {
         "id1,par1,id1,Danny,23,3,par1",
         "id1,par1,id1,Danny,23,4,par1",
         "id1,par1,id1,Danny,23,4,par1"));
+
+    removeHadoopConf();
+  }
+
+  private static void removeHadoopConf() {
+    Map<String, String> env = System.getenv();
+    Class<?> clazz = env.getClass();
+    Field field = null;
+    try {
+      field = clazz.getDeclaredField("m");
+      field.setAccessible(true);
+      Map<String, String> map = (Map<String, String>) field.get(env);
+      map.remove("HADOOP_CONF_DIR");

Review Comment:
   In the flink module, we use a static code block to fetch the Hadoop configuration; see `HadoopConfigurations.getHadoopConf` for details.
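
To make the consequence of a static lookup concrete, here is a small sketch. It is not Hudi's `HadoopConfigurations` class; `HadoopConfDirs`, `candidates`, and `FROM_PROCESS_ENV` are hypothetical names, and the `conf` and `etc/hadoop` subdirectories are assumed as the conventional Hadoop layouts. A value captured in a static initializer is fixed for the lifetime of the JVM, which is how a developer machine's `HADOOP_CONF_DIR` can reach a test; taking the environment as a parameter lets a test pass an empty map instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; this is not Hudi's HadoopConfigurations class. */
final class HadoopConfDirs {

  private HadoopConfDirs() {
  }

  /** Resolved once, when the class is initialized, from the real process environment. */
  static final List<String> FROM_PROCESS_ENV = candidates(System.getenv());

  /**
   * Lists directories where Hadoop config files might live, most specific first.
   * Assumption: HADOOP_HOME/conf and HADOOP_HOME/etc/hadoop are the conventional layouts.
   */
  static List<String> candidates(Map<String, String> env) {
    List<String> dirs = new ArrayList<>();
    String confDir = env.get("HADOOP_CONF_DIR");
    if (confDir != null && !confDir.isEmpty()) {
      dirs.add(confDir);
    }
    String home = env.get("HADOOP_HOME");
    if (home != null && !home.isEmpty()) {
      dirs.add(home + "/conf");
      dirs.add(home + "/etc/hadoop");
    }
    return Collections.unmodifiableList(dirs);
  }
}
```

With this shape, production code can read `HadoopConfDirs.FROM_PROCESS_ENV` while a test calls `HadoopConfDirs.candidates(Collections.emptyMap())` and gets an empty list regardless of the machine it runs on.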





[GitHub] [hudi] hudi-bot commented on pull request #6207: [HUDI-4461] Fix org.apache.hudi.sink.TestWriteCopyOnWrite will failed when local hadoop env exists

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6207:
URL: https://github.com/apache/hudi/pull/6207#issuecomment-1193728955

   ## CI report:
   
   * 8450ba2fceee31211b34cbb5e9b97008b9aa5c01 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6207: [HUDI-4461] Fix org.apache.hudi.sink.TestWriteCopyOnWrite will failed when local hadoop env exists

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6207:
URL: https://github.com/apache/hudi/pull/6207#issuecomment-1193733918

   ## CI report:
   
   * 8450ba2fceee31211b34cbb5e9b97008b9aa5c01 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10310) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] TJX2014 commented on a diff in pull request #6207: [HUDI-4461] Fix org.apache.hudi.sink.TestWriteCopyOnWrite will failed when local hadoop env exists

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on code in PR #6207:
URL: https://github.com/apache/hudi/pull/6207#discussion_r935063834


##########
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/TestWriteBase.java:
##########
@@ -102,6 +103,23 @@ public class TestWriteBase {
         "id1,par1,id1,Danny,23,3,par1",
         "id1,par1,id1,Danny,23,4,par1",
         "id1,par1,id1,Danny,23,4,par1"));
+
+    removeHadoopConf();
+  }
+
+  private static void removeHadoopConf() {
+    Map<String, String> env = System.getenv();
+    Class<?> clazz = env.getClass();
+    Field field = null;
+    try {
+      field = clazz.getDeclaredField("m");
+      field.setAccessible(true);
+      Map<String, String> map = (Map<String, String>) field.get(env);
+      map.remove("HADOOP_CONF_DIR");

Review Comment:
   @danny0405 This seems to be a bug in the test module, which should also respect `HADOOP_CONF_DIR`. Or should the hudi module not rely on the external env at all?


