Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/09/29 09:04:54 UTC

[GitHub] [hudi] TJX2014 opened a new pull request, #6827: [HUDI-4950] Fix read log lead to oom not be catched issue

TJX2014 opened a new pull request, #6827:
URL: https://github.com/apache/hudi/pull/6827

   ### Change Logs
   Make `org.apache.hudi.util.StreamerUtil#getLatestTableSchema` catch the OOM error raised while reading log/base files to infer the schema.
   
   ### Impact
   Schema inference through HoodieHiveCatalog becomes consistent with HoodieCatalog, which can already catch OOM.
   
   **Risk level: none | low | medium | high**
   none
   
   ### Documentation Update
   none
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] xushiyan commented on a diff in pull request #6827: [HUDI-4950] Fix read log lead to oom not be catched issue

Posted by GitBox <gi...@apache.org>.
xushiyan commented on code in PR #6827:
URL: https://github.com/apache/hudi/pull/6827#discussion_r983427263


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/util/StreamerUtil.java:
##########
@@ -557,8 +557,9 @@ public static Schema getLatestTableSchema(String path, org.apache.hadoop.conf.Co
     try {
       HoodieTableMetaClient metaClient = StreamerUtil.createMetaClient(path, hadoopConf);
       return getTableAvroSchema(metaClient, false);
-    } catch (Exception e) {
-      LOG.warn("Error while resolving the latest table schema", e);
+    } catch (Throwable throwable) {
+      LOG.warn("Error while resolving the latest table schema.", throwable);
+      // ignored
     }

Review Comment:
   It's not good practice to ignore `Throwable`. Why don't you want to fail loudly here?





[GitHub] [hudi] xushiyan closed pull request #6827: [HUDI-4950] Fix read log lead to oom not be catched issue

Posted by GitBox <gi...@apache.org>.
xushiyan closed pull request #6827: [HUDI-4950] Fix read log lead to oom not be catched issue
URL: https://github.com/apache/hudi/pull/6827




[GitHub] [hudi] hudi-bot commented on pull request #6827: [HUDI-4950] Fix read log lead to oom not be catched issue

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6827:
URL: https://github.com/apache/hudi/pull/6827#issuecomment-1262153936

   ## CI report:
   
   * 8b315a5c99640eac71d6c9fa5a64af5445701562 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11891) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] xushiyan commented on pull request #6827: [HUDI-4950] Fix read log lead to oom not be catched issue

Posted by GitBox <gi...@apache.org>.
xushiyan commented on PR #6827:
URL: https://github.com/apache/hudi/pull/6827#issuecomment-1283746816

   As mentioned in this comment, https://github.com/apache/hudi/pull/6827/files#r985062923, we should improve the places that catch `Throwable`. I'm closing this now. If further discussion is needed, we can continue it here.




[GitHub] [hudi] TJX2014 commented on a diff in pull request #6827: [HUDI-4950] Fix read log lead to oom not be catched issue

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on code in PR #6827:
URL: https://github.com/apache/hudi/pull/6827#discussion_r984309593


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/util/StreamerUtil.java:
##########
@@ -557,8 +557,9 @@ public static Schema getLatestTableSchema(String path, org.apache.hadoop.conf.Co
     try {
       HoodieTableMetaClient metaClient = StreamerUtil.createMetaClient(path, hadoopConf);
       return getTableAvroSchema(metaClient, false);
-    } catch (Exception e) {
-      LOG.warn("Error while resolving the latest table schema", e);
+    } catch (Throwable throwable) {
+      LOG.warn("Error while resolving the latest table schema.", throwable);
+      // ignored
     }

Review Comment:
   This is just to be consistent with `org.apache.hudi.table.catalog.HoodieCatalog#getLatestTableSchema`, which catches `Throwable` rather than `Exception`. Why ignore only `Exception`? Many failures, OOM for example, are not subclasses of `Exception`.
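
The point about the class hierarchy can be checked with a minimal sketch. `ThrowableHierarchySketch` and `classify` are hypothetical names used only for illustration; they are not part of Hudi. Only standard `java.lang` types are used.

```java
public class ThrowableHierarchySketch {

  /**
   * Hypothetical helper: reports which top-level branch of the
   * Throwable hierarchy a given failure belongs to.
   */
  public static String classify(Throwable t) {
    if (t instanceof Exception) {
      return "Exception";
    }
    if (t instanceof Error) {
      return "Error";
    }
    return "Throwable";
  }

  public static void main(String[] args) {
    // OutOfMemoryError extends VirtualMachineError, which extends Error,
    // so a catch (Exception e) block never sees it.
    System.out.println(classify(new OutOfMemoryError("simulated")));
    // RuntimeException extends Exception, so catch (Exception e) handles it.
    System.out.println(classify(new RuntimeException("simulated")));
  }
}
```

This only shows why `catch (Exception e)` does not intercept an OOM; whether such an error should be intercepted at all is a separate question.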





[GitHub] [hudi] xushiyan commented on a diff in pull request #6827: [HUDI-4950] Fix read log lead to oom not be catched issue

Posted by GitBox <gi...@apache.org>.
xushiyan commented on code in PR #6827:
URL: https://github.com/apache/hudi/pull/6827#discussion_r985062923


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/util/StreamerUtil.java:
##########
@@ -557,8 +557,9 @@ public static Schema getLatestTableSchema(String path, org.apache.hadoop.conf.Co
     try {
       HoodieTableMetaClient metaClient = StreamerUtil.createMetaClient(path, hadoopConf);
       return getTableAvroSchema(metaClient, false);
-    } catch (Exception e) {
-      LOG.warn("Error while resolving the latest table schema", e);
+    } catch (Throwable throwable) {
+      LOG.warn("Error while resolving the latest table schema.", throwable);
+      // ignored
     }

Review Comment:
   @TJX2014 then `HoodieCatalog#getLatestTableSchema` should be fixed too. We don't catch `Throwable` because an `Error` should not be caught. Quoting the Javadoc:
   
   > An Error is a subclass of Throwable that indicates serious problems that a reasonable application should not try to catch
   
   So what is the strong reason to catch and ignore errors like OOM? You'd need to fail loudly in that case.
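
A minimal sketch of the "fail loudly" behaviour described above, under stated assumptions: `SchemaLookupSketch` and `resolveOrNull` are hypothetical stand-ins, a `Supplier<String>` replaces the real schema lookup, and `System.err` replaces the logger. It is not the Hudi implementation.

```java
import java.util.function.Supplier;

public class SchemaLookupSketch {

  /**
   * Hypothetical stand-in for a schema lookup with a fallback.
   * Exceptions are logged and turned into null; Errors are not caught.
   */
  public static String resolveOrNull(Supplier<String> lookup) {
    try {
      return lookup.get();
    } catch (Exception e) {
      // Recoverable failure: warn and fall back to "no schema".
      System.err.println("Error while resolving the latest table schema: " + e);
      return null;
    }
    // Deliberately no catch (Throwable): an OutOfMemoryError or any other
    // Error propagates to the caller instead of being swallowed.
  }

  public static void main(String[] args) {
    System.out.println(resolveOrNull(() -> "schema-v1"));
    System.out.println(resolveOrNull(() -> {
      throw new IllegalStateException("simulated missing schema");
    }));
    try {
      resolveOrNull(() -> {
        throw new OutOfMemoryError("simulated");
      });
    } catch (OutOfMemoryError oom) {
      // Caught here only so the demo can show that the Error escaped.
      System.out.println("propagated: " + oom.getMessage());
    }
  }
}
```

With this shape, a missing or unreadable schema still degrades gracefully, while a JVM-level failure surfaces to whoever called the lookup.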





[GitHub] [hudi] hudi-bot commented on pull request #6827: [HUDI-4950] Fix read log lead to oom not be catched issue

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6827:
URL: https://github.com/apache/hudi/pull/6827#issuecomment-1261995759

   ## CI report:
   
   * 8b315a5c99640eac71d6c9fa5a64af5445701562 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6827: [HUDI-4950] Fix read log lead to oom not be catched issue

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6827:
URL: https://github.com/apache/hudi/pull/6827#issuecomment-1262001897

   ## CI report:
   
   * 8b315a5c99640eac71d6c9fa5a64af5445701562 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11891) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>

