Posted to commits@hudi.apache.org by "pravin1406 (via GitHub)" <gi...@apache.org> on 2023/04/19 20:06:12 UTC

[GitHub] [hudi] pravin1406 opened a new issue, #8504: [SUPPORT]

pravin1406 opened a new issue, #8504:
URL: https://github.com/apache/hudi/issues/8504

   **Describe the problem you faced**
   When I pass wrong (non-existent) record keys or pre-combine keys to the Hudi Spark job, the job fails with the appropriate exception and the Spark context is stopped. However, other JettyServer threads keep running in the background.
   This in turn keeps my OCP pod running; it never stops.
   
   Points to Note.
   I ran the job in overwrite mode on a path where another table already existed, along with a corresponding Hive table. Going through the logs, I can see Hudi trying to clean the older data and overwriting it successfully.
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Create a simple input file.
   2. Create a Hudi Spark job whose record key or pre-combine key refers to columns that do not exist in the input file.
   3. Launch the Spark job in a Kubernetes cluster.
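
   One way to avoid hitting this failure path at all is to check the configured key columns against the input schema before starting the write. The following is only a minimal sketch: `missing_key_columns` is a hypothetical helper (not part of Hudi), it assumes a comma-separated record key and a single pre-combine field, and it only checks top-level column names.

```python
def missing_key_columns(input_columns, record_key, precombine_key):
    """Return configured key columns that are absent from the input.

    Hypothetical helper, not a Hudi API. Assumes record_key may be a
    comma-separated list and precombine_key is a single column name.
    Nested (dotted) field names are not resolved.
    """
    configured = [c.strip() for c in record_key.split(",") if c.strip()]
    if precombine_key and precombine_key.strip():
        configured.append(precombine_key.strip())
    available = set(input_columns)
    # Preserve configuration order so the error message is predictable.
    return [c for c in configured if c not in available]


# Example with made-up column names; in a Spark job the first argument
# would typically be df.columns.
missing = missing_key_columns(["id", "ts", "name"], "uuid, id", "updated_at")
if missing:
    print("Refusing to start the Hudi write; missing columns:", missing)
```

   Failing fast in the driver like this does not fix the lingering threads, but it keeps a misconfigured job from reaching the point where they are started.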
   
   
   **Environment Description**
   Hudi version : 0.12.2
   
   Spark version : 3.2.0
   
   Hive version : 3.1.2_1
   
   Hadoop version : Hadoop 3.2.1
   
   Storage (HDFS/S3/GCS..) : HDFS
   
   Running on Docker? (yes/no) : yes
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] stream2000 commented on issue #8504: [SUPPORT] JettyServer Threadpool keeps running and ocp job hangs

Posted by "stream2000 (via GitHub)" <gi...@apache.org>.
stream2000 commented on issue #8504:
URL: https://github.com/apache/hudi/issues/8504#issuecomment-1515646260

   Hi, does PR #8335 fix your problem? You can also try setting `hoodie.embed.timeline.server=false` to disable the timeline service and check whether your app is being hung by the timeline server thread.
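
   For reference, a rough sketch of how that setting could be passed from a PySpark job. The helper names `build_hudi_options` and `write_overwrite`, the table name, and the column names are placeholders chosen for illustration; only the `hoodie.*` option keys and the standard DataFrame writer calls are assumed to match your setup.

```python
def build_hudi_options(table_name, record_key, precombine_key,
                       embed_timeline_server=False):
    """Build Hudi write options as a plain dict of strings.

    Hypothetical helper; the values passed in are placeholders.
    """
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine_key,
        # Disable the embedded timeline server to test whether its
        # threads are what keeps the driver alive.
        "hoodie.embed.timeline.server": str(embed_timeline_server).lower(),
    }


def write_overwrite(df, path, options):
    """Write a Spark DataFrame to a Hudi table path in overwrite mode."""
    df.write.format("hudi").options(**options).mode("overwrite").save(path)


opts = build_hudi_options("demo_table", "id", "ts")
print(opts["hoodie.embed.timeline.server"])
```

   This is meant as a diagnostic switch: if the pod exits cleanly with the timeline server disabled, that points at the timeline server threads.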




[GitHub] [hudi] pravin1406 commented on issue #8504: [SUPPORT]

Posted by "pravin1406 (via GitHub)" <gi...@apache.org>.
pravin1406 commented on issue #8504:
URL: https://github.com/apache/hudi/issues/8504#issuecomment-1515313612

   [hudi-stacktrace.txt](https://github.com/apache/hudi/files/11277176/hudi-stacktrace.txt)
   [threaddump.txt](https://github.com/apache/hudi/files/11277178/threaddump.txt)
   <img width="551" alt="Screenshot 2023-04-20 at 1 35 12 AM" src="https://user-images.githubusercontent.com/25177655/233188062-ed60110d-a5d1-49e6-b512-25dd917bb2ee.png">
   
   
   I have attached the stacktrace, thread dump, and Hudi configurations as evidence.




[GitHub] [hudi] danny0405 commented on issue #8504: [SUPPORT] JettyServer Threadpool keeps running and ocp job hangs

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8504:
URL: https://github.com/apache/hudi/issues/8504#issuecomment-1516073734

   No, the embedded timeline server can cache the filesystem file handles and can therefore improve write performance.




[GitHub] [hudi] pravin1406 commented on issue #8504: [SUPPORT] JettyServer Threadpool keeps running and ocp job hangs

Posted by "pravin1406 (via GitHub)" <gi...@apache.org>.
pravin1406 commented on issue #8504:
URL: https://github.com/apache/hudi/issues/8504#issuecomment-1516066536

   @stream2000 I believe #8335 may solve the problem. Also, is setting `hoodie.embed.timeline.server=false` recommended in a production environment?

