Posted to general@hadoop.apache.org by Wei-Chiu Chuang <we...@apache.org> on 2022/10/10 23:24:44 UTC

Re: Performance with large no of files

Do you have security enabled?

We did some preliminary benchmarks around webhdfs (I really want to revisit
them) and found that with security enabled, a lot of the overhead is between
the client and the KDC (SPNEGO). Running webhdfs with delegation tokens should
help remove that bottleneck.
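
As a rough illustration of the delegation token approach, here is a minimal
Python sketch. It assumes the standard WebHDFS REST operations
(GETDELEGATIONTOKEN, and the "delegation" query parameter on later calls).
The host name, port, renewer and file path are placeholders, and the helper
function names are made up for this example. The idea is to authenticate via
SPNEGO once to fetch a token, then reuse that token for every per-file
request so the client does not go back to the KDC each time.

```python
import json
from urllib.parse import urlencode


def webhdfs_url(base, path, op, **params):
    """Build a WebHDFS REST URL: <base>/webhdfs/v1/<path>?op=<OP>&..."""
    query = urlencode({"op": op, **params})
    return f"{base.rstrip('/')}/webhdfs/v1/{path.lstrip('/')}?{query}"


def parse_delegation_token(body):
    """Extract the token string from a GETDELEGATIONTOKEN JSON response."""
    return json.loads(body)["Token"]["urlString"]


def open_url_with_token(base, path, token):
    """URL that reads a file using the token instead of SPNEGO."""
    return webhdfs_url(base, path, "OPEN", delegation=token)


# Placeholder NameNode address (assumption, adjust for your cluster).
BASE = "http://namenode.example.com:9870"

# Step 1: this single request still needs SPNEGO, e.g. sent with a
# Kerberos-capable HTTP client after kinit.
token_request = webhdfs_url(BASE, "/", "GETDELEGATIONTOKEN", renewer="backup")

# Step 2: parse the JSON reply (a sample body is used here for illustration).
sample_reply = '{"Token": {"urlString": "abc123"}}'
token = parse_delegation_token(sample_reply)

# Step 3: every per-file request reuses the token, with no KDC round trip.
file_request = open_url_with_token(BASE, "/data/part-00000", token)
print(token_request)
print(file_request)
```

With ~1 million small files, step 3 is the part that repeats, so that is
where avoiding the per-request negotiation should matter most.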

On Sat, Oct 8, 2022 at 8:26 PM Abhishek <ah...@gmail.com> wrote:

> Hi,
> We want to back up a large number of small Hadoop files (~1 million) with
> the webhdfs API. We are hitting a performance bottleneck and the backup is
> taking days.
> Does anyone know of a solution where performance could be improved using
> any XML settings?
> This would really help us.
> Version 3.1.1
>
> Appreciate your help !!
>
> --
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> *Abhishek...*
>