Posted to common-user@hadoop.apache.org by Lavanya Ramakrishnan <ra...@gmail.com> on 2010/05/14 19:51:14 UTC

TestDFSIO

Hello,

 I am running org.apache.hadoop.fs.TestDFSIO to benchmark our HDFS
installation and have a couple of questions about it.

a) If I run the benchmark back to back in the same directory, I start seeing
strange errors such as NotReplicatedYetException or
AlreadyBeingCreatedException (failed to create file  .... on client 5,
because this file is already being created by DFSClient_.... on ...).  It
seems like there might be some kind of race condition between the
replication from a previous run and subsequent runs. Is there any way to
avoid this?

b) I have been testing with concurrent writers and see a significant drop in
throughput. I get about 60 MB/s for 1 writer and about 8 MB/s for 50
concurrent writers. Is this a known scalability limit for HDFS? Is there
any way to configure this to perform better?
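
For reference, the invocations look roughly like the following (the test jar
name and the file size are illustrative, based on a stock 0.20-style install,
not our exact values):

    # single writer: one map task writing one file
    hadoop jar hadoop-*-test.jar TestDFSIO -write -nrFiles 1 -fileSize 1000

    # 50 concurrent writers: one map task per file
    hadoop jar hadoop-*-test.jar TestDFSIO -write -nrFiles 50 -fileSize 1000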

thanks
LR

Re: TestDFSIO

Posted by Lavanya Ramakrishnan <ra...@gmail.com>.
On Fri, May 14, 2010 at 5:57 PM, Konstantin Shvachko <sh...@yahoo-inc.com> wrote:

> On second thought, there should not be any race.
> You probably restart the HDFS cluster between the runs.
> When you shut down the cluster after the first run, some files
> may still remain unclosed.


No, we are not restarting the cluster. It is up and running.

Re: TestDFSIO

Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.
On second thought, there should not be any race.
You probably restart the HDFS cluster between the runs.
When you shut down the cluster after the first run, some files
may still remain unclosed. Then, after restarting the cluster,
all their leases will be renewed, and if somebody tries
to recreate an unclosed file, they will fail with AlreadyBeingCreatedException.

If my guess is correct, then you should keep the cluster running
between consecutive DFSIO runs.

Cleaning up will still help keep the benchmark data consistent.
If a bunch of files is recreated, HDFS will start removing the old files' blocks.
This increases the internal load and skews the performance results.
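
A minimal sketch of the cleanup step, assuming the stock test jar (the jar
name pattern may differ on your build):

    # removes the benchmark's working directory, /benchmarks/TestDFSIO by default
    hadoop jar hadoop-*-test.jar TestDFSIO -clean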

--Konstantin


On 5/14/2010 2:26 PM, Konstantin Shvachko wrote:
> Hi Lavanya,
>
> On 5/14/2010 10:51 AM, Lavanya Ramakrishnan wrote:
>  > Hello,
>  >
>  > I am running org.apache.hadoop.fs.TestDFSIO to benchmark our HDFS
>  > installation and have a couple of questions about it.
>  >
>  > a) If I run the benchmark back to back in the same directory, I start
>  > seeing strange errors such as NotReplicatedYetException or
>  > AlreadyBeingCreatedException (failed to create file .... on client 5,
>  > because this file is already being created by DFSClient_.... on ...). It
>  > seems like there might be some kind of race condition between the
>  > replication from a previous run and subsequent runs. Is there any way
>  > to avoid this?
>
> Yes, this looks like a race with the previous run.
> You can just wait or run TestDFSIO -clean before the second run.
>
>  > b) I have been testing with concurrent writers and see a significant
>  > drop in throughput. I get about 60 MB/s for 1 writer and about 8 MB/s
>  > for 50 concurrent writers. Is this a known scalability limit for HDFS?
>  > Is there any way to configure this to perform better?
>
> It depends on the size and the configuration of your cluster.
> In general, for consistent results with DFSIO it is better to set up 1 or 2
> tasks per node, and to specify as many files for DFSIO as you have map slots.
> The idea is that all maps finish in one wave; then you should get optimal
> performance.
>
> Thanks,
> --Konstantin
>
>


Re: TestDFSIO

Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.
Hi Lavanya,

On 5/14/2010 10:51 AM, Lavanya Ramakrishnan wrote:
 > Hello,
 >
 >   I am running org.apache.hadoop.fs.TestDFSIO to benchmark our HDFS
 > installation and have a couple of questions about it.
 >
 > a) If I run the benchmark back to back in the same directory, I start seeing
 > strange errors such as NotReplicatedYetException or
 > AlreadyBeingCreatedException (failed to create file  .... on client 5,
 > because this file is already being created by DFSClient_.... on ...).  It
 > seems like there might be some kind of race condition between the
 > replication from a previous run and subsequent runs. Is there any way to
 > avoid this?

Yes, this looks like a race with the previous run.
You can just wait or run TestDFSIO -clean before the second run.

 > b) I have been testing with concurrent writers and see a significant drop in
 > throughput. I get about 60 MB/s for 1 writer and about 8 MB/s for 50
 > concurrent writers. Is this a known scalability limit for HDFS? Is there
 > any way to configure this to perform better?

It depends on the size and the configuration of your cluster.
In general, for consistent results with DFSIO it is better to set up 1 or 2
tasks per node, and to specify as many files for DFSIO as you have map slots.
The idea is that all maps finish in one wave; then you should get optimal
performance.
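
For example, on a hypothetical 10-node cluster with 2 map slots per node
(mapred.tasktracker.map.tasks.maximum set to 2), that would be:

    # 10 nodes x 2 map slots = 20 files, so all 20 maps finish in one wave
    hadoop jar hadoop-*-test.jar TestDFSIO -write -nrFiles 20 -fileSize 1000
    hadoop jar hadoop-*-test.jar TestDFSIO -read  -nrFiles 20 -fileSize 1000

The read test here reuses the 20 files created by the write test.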

Thanks,
--Konstantin