Posted to commits@hudi.apache.org by "Raymond Xu (Jira)" <ji...@apache.org> on 2021/09/11 18:58:00 UTC

[jira] [Commented] (HUDI-2387) Too many HEAD requests from Hudi to S3

    [ https://issues.apache.org/jira/browse/HUDI-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413595#comment-17413595 ] 

Raymond Xu commented on HUDI-2387:
----------------------------------

[~uditme] would you please raise this with the AWS team?

> Too many HEAD requests from Hudi to S3 
> ---------------------------------------
>
>                 Key: HUDI-2387
>                 URL: https://issues.apache.org/jira/browse/HUDI-2387
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Common Core, Spark Integration
>    Affects Versions: 0.8.0
>         Environment: AWS Glue with PySpark
>            Reporter: Sourav T
>            Priority: Major
>
> We are using Apache Hudi from AWS Glue (with the PySpark runtime) to store data in an S3 bucket. We are observing a very high number of S3 HEAD requests that we believe originate from Hudi.
> Because of this high request volume, S3 frequently throws "Status Code: 503; Error Code: SlowDown", causing data loss.
> Is there any out-of-the-box feature to debug this further and confirm which Hudi feature is causing this?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)