Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/06/07 10:20:07 UTC

[GitHub] [hudi] parisni opened a new issue, #5780: [SUPPORT] Embded timeline server limitations

parisni opened a new issue, #5780:
URL: https://github.com/apache/hudi/issues/5780

   hudi 0.11.0
   spark 3.2.0
   
   I had to turn the embedded timeline server OFF (`"hoodie.embed.timeline.server", "false"`) because of the limitations below:
   
   
   1. The server is not resilient to S3 `slow down` 500 errors. It is supposed to limit the risk of triggering such errors, but when one does happen, the timeline server does not catch it and goes down. (With the S3-based timeline, S3 slow down can be mitigated with fs.s3 configs, which makes it much more robust.)
   2. It makes `cleaning` MUCH slower.
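
For illustration, here is a minimal sketch of the configuration described above. Only `hoodie.embed.timeline.server` comes from the report. The table name is hypothetical, and the `fs.s3a.*` keys assume the Hadoop S3A connector; the report only says "fs.s3 configs", so the actual keys and values depend on the filesystem connector in use.

```python
# Hudi write options with the embedded timeline server turned off,
# as described in the report.
hudi_options = {
    "hoodie.table.name": "my_table",  # hypothetical table name
    "hoodie.embed.timeline.server": "false",
}

# ASSUMPTION: Hadoop S3A connector. These retry/throttle keys are passed
# through Spark with the "spark.hadoop." prefix; values are illustrative only.
s3a_retry_conf = {
    "spark.hadoop.fs.s3a.attempts.maximum": "20",
    "spark.hadoop.fs.s3a.retry.throttle.limit": "20",
    "spark.hadoop.fs.s3a.retry.throttle.interval": "1000ms",
}


def apply_options(writer, options):
    """Apply each key/value pair to a DataFrameWriter-like object via .option()."""
    for key, value in options.items():
        writer = writer.option(key, value)
    return writer
```

With a real Spark session this could be used as `apply_options(df.write.format("hudi"), hudi_options).mode("append").save(path)`, where `df` and `path` are placeholders.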
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] parisni commented on issue #5780: [SUPPORT] Embded timeline server limitations

Posted by GitBox <gi...@apache.org>.
parisni commented on issue #5780:
URL: https://github.com/apache/hudi/issues/5780#issuecomment-1274180934

   
   I don't have time to get the stack trace these days. The way to reproduce is to run concurrent read queries through Athena on a given table while ingesting with a Spark job that has the timeline server enabled. Once you get S3 slow down on the Athena side, the ingest Spark job will fail.
   
   On September 13, 2022 9:51:29 PM UTC, Y Ethan Guo ***@***.***> wrote:
   >@parisni @codope talks about the configs to retry the requests to the timeline server if the server fails to fulfill the request at the first time.  For the 500s, could you paste the stacktrace from the timeline server? The timeline server refreshes the file system view only upon the changes of the requested instant.  Wondering if it's caused by a bug triggering repeated refresh.




[GitHub] [hudi] nsivabalan closed issue #5780: [SUPPORT] Embded timeline server limitations

Posted by GitBox <gi...@apache.org>.
nsivabalan closed issue #5780: [SUPPORT] Embded timeline server limitations
URL: https://github.com/apache/hudi/issues/5780




[GitHub] [hudi] codope commented on issue #5780: [SUPPORT] Embded timeline server limitations

Posted by GitBox <gi...@apache.org>.
codope commented on issue #5780:
URL: https://github.com/apache/hudi/issues/5780#issuecomment-1156309172

   @parisni Can you give more details about your setup? If you can share steps to reproduce, that would be great.
   
   @yihua Are you aware of these limitations?




[GitHub] [hudi] nsivabalan commented on issue #5780: [SUPPORT] Embded timeline server limitations

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #5780:
URL: https://github.com/apache/hudi/issues/5780#issuecomment-1216237783

   Closing this out as we have a tracking Jira. @yihua: I know you are investigating some timeline server related issues. Please make sure HUDI-4342 is also covered as part of that investigation.




[GitHub] [hudi] parisni commented on issue #5780: [SUPPORT] Embded timeline server limitations

Posted by GitBox <gi...@apache.org>.
parisni commented on issue #5780:
URL: https://github.com/apache/hudi/issues/5780#issuecomment-1164453404

   > @parisni Can you give more details about your setup? If you can share, steps to reproduce that would be great.
   
   Well, to simulate S3 slow down, I ran multiple Athena read queries on the table while ingesting.
   
   I got the S3 slow down message on Athena, and at the same time the Hudi logs said the timeline server was not reachable anymore, so it fell back to scanning files.
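
The read-side load described above could be generated with something like the following sketch. It is only a generic harness: `run_query` is a hypothetical callable that would submit one read query against the table (for example through an Athena client), and the counts are illustrative. Nothing here is taken from the reporter's actual setup.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrent_reads(run_query, num_queries=50, max_workers=16):
    """Run `run_query(i)` for i in range(num_queries) concurrently.

    `run_query` is a hypothetical callable supplied by the caller.
    Returns (results, errors) so throttling failures can be inspected.
    """
    results, errors = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_query, i) for i in range(num_queries)]
        for future in futures:
            try:
                results.append(future.result())
            except Exception as exc:  # e.g. a slow down / throttling error
                errors.append(exc)
    return results, errors
```

Running this while the ingesting Spark job is active, and checking `errors` for slow down responses, mirrors the scenario in the comment.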




[GitHub] [hudi] yihua commented on issue #5780: [SUPPORT] Embded timeline server limitations

Posted by GitBox <gi...@apache.org>.
yihua commented on issue #5780:
URL: https://github.com/apache/hudi/issues/5780#issuecomment-1245988548

   @parisni @codope is referring to the configs that retry requests to the timeline server if the server fails to fulfill a request the first time. For the 500s, could you paste the stack trace from the timeline server? The timeline server refreshes the file system view only when the requested instant changes. I wonder whether this is caused by a bug that triggers repeated refreshes.
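
To make the refresh behavior described above concrete, here is a toy sketch of a view cache that reloads only when the requested instant changes. The class and its `load_view` callable are hypothetical and are not Hudi's implementation; they only illustrate why a correctly working server should not list storage repeatedly for the same instant.

```python
class CachedFileSystemView:
    """Hypothetical cache that reloads only when the requested instant changes."""

    def __init__(self, load_view):
        self._load_view = load_view  # callable: instant -> view (e.g. a file listing)
        self._instant = None
        self._view = None
        self.refresh_count = 0  # number of (expensive) reloads performed

    def get(self, instant):
        # Reload from storage only if a different instant is requested.
        if instant != self._instant:
            self._view = self._load_view(instant)
            self._instant = instant
            self.refresh_count += 1
        return self._view
```

If a bug caused the requested instant to look different on every call, `refresh_count` would grow with each request, which is the kind of repeated refresh suspected in the comment.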




[GitHub] [hudi] codope commented on issue #5780: [SUPPORT] Embded timeline server limitations

Posted by GitBox <gi...@apache.org>.
codope commented on issue #5780:
URL: https://github.com/apache/hudi/issues/5780#issuecomment-1245166831

   The two most common kinds of 5xx errors are 500 and 503.
   In the case of 500 (internal server error), the cause could be any kind of service failure (but not degradation).
   In the case of 503 (slowdown/degradation), most likely the rate limiter is not allowing the request through.
   
   Either way, Hudi has an exponential backoff [retry mechanism](https://github.com/apache/hudi/blob/1b2179269efab4eb419fdcbc2a4fad717ee8b343/hudi-common/src/main/java/org/apache/hudi/common/table/view/RemoteHoodieTableFileSystemView.java#L181) to handle such scenarios. 
   
   But it needs to be configured using [retry config](https://hudi.apache.org/docs/configurations#hoodiefilesystemoperationretryenable).
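
As a rough illustration of the idea, the sketch below shows a generic exponential backoff loop. It is not Hudi's code: the function, its parameter names, and the default values are assumptions made for the example. The option key is inferred from the anchor of the linked retry config and should be checked against the configuration docs for the Hudi version in use.

```python
import time

# Key inferred from the linked docs anchor; verify it for your Hudi version.
retry_options = {"hoodie.filesystem.operation.retry.enable": "true"}


def retry_with_backoff(fn, max_retries=4, initial_interval_ms=100,
                       max_interval_ms=2000, sleep=time.sleep):
    """Call fn(), retrying on any exception with exponentially growing waits.

    Generic sketch: the wait doubles after each failed attempt and is capped
    at max_interval_ms. The last exception is re-raised once max_retries
    retries have been used up.
    """
    attempt = 0
    while True:
        try:
            return fn()
        except Exception:
            if attempt >= max_retries:
                raise
            wait_ms = min(initial_interval_ms * (2 ** attempt), max_interval_ms)
            sleep(wait_ms / 1000.0)
            attempt += 1
```

A caller would wrap a single storage or HTTP request in `fn`, so that transient 500/503 responses are retried with increasing delays instead of failing the job on the first error.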




[GitHub] [hudi] parisni commented on issue #5780: [SUPPORT] Embded timeline server limitations

Posted by GitBox <gi...@apache.org>.
parisni commented on issue #5780:
URL: https://github.com/apache/hudi/issues/5780#issuecomment-1245404738

   > But it needs to be configured using retry config.
   Thanks. Not sure whether this also applies to the timeline server?
   
   On September 13, 2022 9:49:08 AM UTC, Sagar Sumit ***@***.***> wrote:
   >There could be two most common kinds of 5xx errors: 500 and 503.
   >In case of 500 (internal service error), the reasons could be any kind of service failure (but not degradation).
   >In case of 503 (slowdown/degradation), most likely the rate limiter is not allowing the request through.
   >
   >Either way, Hudi has an exponential backoff [retry mechanism](https://github.com/apache/hudi/blob/1b2179269efab4eb419fdcbc2a4fad717ee8b343/hudi-common/src/main/java/org/apache/hudi/common/table/view/RemoteHoodieTableFileSystemView.java#L181) to handle such scenarios. 
   >
   >But it needs to be configured using [retry config](https://hudi.apache.org/docs/configurations#hoodiefilesystemoperationretryenable).

