You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/08/05 13:38:03 UTC

[GitHub] [hudi] Mathieu1124 opened a new pull request #1921: [HUDI-1151]Fix NPE when no new data in kafka using HoodieDeltaStreamer

Mathieu1124 opened a new pull request #1921:
URL: https://github.com/apache/hudi/pull/1921


   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a pull request.*
   
   ## What is the purpose of the pull request
   
   *Fix NPE when no new data in kafka using HoodieDeltaStreamer*
   
   ## Brief change log
   
   *I get this NPE when using HoodieDeltaStreamer to etl kafka data to hudi table. the topic is newly created, and there is no data in the topic when the job is started*
   
   `20/08/05 21:18:00 INFO helpers.KafkaOffsetGen: SourceLimit not configured, set numEvents to default value : 5000000
   20/08/05 21:18:00 INFO sources.JsonKafkaSource: About to read 0 from Kafka for topic :hudi_test_callback
   20/08/05 21:18:00 INFO deltastreamer.DeltaSync: No new data, source checkpoint has not changed. Nothing to commit. Old checkpoint=(Option{val=hudi_test_callback,0:0}). New Checkpoint=(hudi_test_callback,0:0)
   20/08/05 21:18:00 ERROR deltastreamer.HoodieDeltaStreamer: Shutting down delta-sync due to exception
   java.lang.NullPointerException
   	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:570)
   	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:748)
   20/08/05 21:18:00 INFO deltastreamer.HoodieDeltaStreamer: Delta Sync shutdown. Error ?true
   20/08/05 21:18:00 ERROR async.AbstractAsyncService: Monitor noticed one or more threads failed. Requesting graceful shutdown of other threads
   java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieException
   	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
   	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
   	at org.apache.hudi.async.AbstractAsyncService.lambda$monitorThreads$0(AbstractAsyncService.java:136)
   	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
   	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:748)
   Caused by: org.apache.hudi.exception.HoodieException
   	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:585)
   	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
   	... 3 more
   Caused by: java.lang.NullPointerException
   	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:570)
   	... 4 more
   20/08/05 21:18:00 ERROR async.AbstractAsyncService: Service shutdown with error
   java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieException
   	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
   	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
   	at org.apache.hudi.async.AbstractAsyncService.waitForShutdown(AbstractAsyncService.java:72)
   	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$1(HoodieDeltaStreamer.java:152)
   	at org.apache.hudi.common.util.Option.ifPresent(Option.java:96)
   	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:149)
   	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:454)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:673)
   Caused by: org.apache.hudi.exception.HoodieException
   	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:585)
   	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:748)
   Caused by: java.lang.NullPointerException
   	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:570)
   	... 4 more`
    
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
     - *Added integration tests for end-to-end.*
     - *Added HoodieClientWriteTest to verify the change.*
     - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] Mathieu1124 commented on pull request #1921: [HUDI-1151]Fix NPE when no new data in kafka using HoodieDeltaStreamer

Posted by GitBox <gi...@apache.org>.
Mathieu1124 commented on pull request #1921:
URL: https://github.com/apache/hudi/pull/1921#issuecomment-669198707


   @yanghua please take a look when free


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] yanghua commented on a change in pull request #1921: [HUDI-1151]Fix NPE when no new data in kafka using HoodieDeltaStreamer

Posted by GitBox <gi...@apache.org>.
yanghua commented on a change in pull request #1921:
URL: https://github.com/apache/hudi/pull/1921#discussion_r465795800



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -567,7 +567,7 @@ public DeltaSync getDeltaSync() {
             try {
               long start = System.currentTimeMillis();
               Pair<Option<String>, JavaRDD<WriteStatus>> scheduledCompactionInstantAndRDD = deltaSync.syncOnce();
-              if (scheduledCompactionInstantAndRDD.getLeft().isPresent()) {
+              if (null != scheduledCompactionInstantAndRDD && scheduledCompactionInstantAndRDD.getLeft().isPresent()) {

Review comment:
       Good catch!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] yanghua merged pull request #1921: [HUDI-1151]Fix NPE when no new data in kafka using HoodieDeltaStreamer

Posted by GitBox <gi...@apache.org>.
yanghua merged pull request #1921:
URL: https://github.com/apache/hudi/pull/1921


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org