Posted to common-user@hadoop.apache.org by Jon Graham <sj...@gmail.com> on 2010/05/11 01:16:50 UTC

Any plans to provide 0.20.x patch for HADOOP-4584 - Slow generation of Block Report at DataNode causes delay of sending heartbeat to NameNode

Hello Everyone,

Is there a patch available for HADOOP-4584 that can be used on 0.20.2?

The issue at https://issues.apache.org/jira/browse/HADOOP-4584 seems to
indicate that a patch is available for version 0.21, but that
version has not been released yet.

Block reports are taking several minutes on our cluster, which causes
timeouts and many retries.

Thanks for your help,
Jon

Re: Any plans to provide 0.20.x patch for HADOOP-4584 - Slow generation of Block Report at DataNode causes delay of sending heartbeat to NameNode

Posted by Jon Graham <sj...@gmail.com>.
Hello Everyone,

We found a workaround to speed up datanode block reports on Linux without
patching Hadoop.

Our block report times were greatly reduced by running a "find" command over
the Hadoop data storage area 5-10 minutes prior to the block report running.
This is done on each datanode in an attempt to cache file and directory
information that may be used by the block report process. By default, each
datanode runs a block report every hour.
We used the last BlockReport timestamp in the datanode log to help compute
when the "find" command should run.
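
The scheduling step described above could be sketched roughly as follows.
This is only an illustration, not the exact script we ran: the log4j-style
timestamp prefix, the "BlockReport" marker string, the 7-minute lead time,
and the function names are all assumptions, so adjust them to match your own
datanode log and block report interval.

```python
import re
import subprocess
from datetime import datetime, timedelta

# Assumed log4j-style prefix, e.g. "2010-05-10 15:00:05,123 INFO ...".
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def last_block_report_time(log_lines, marker="BlockReport"):
    """Return the timestamp of the last line containing marker, or None."""
    last = None
    for line in log_lines:
        if marker in line:
            match = TIMESTAMP_RE.match(line)
            if match:
                last = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return last


def next_warmup_time(last_report, now,
                     interval=timedelta(hours=1),
                     lead=timedelta(minutes=7)):
    """Return the first time at or after now that is lead before an expected report.

    Assumes reports recur exactly every interval after last_report.
    """
    warmup = last_report + interval - lead
    while warmup < now:
        warmup += interval
    return warmup


def warm_cache(data_dir):
    """Walk data_dir with find, discarding the output, to touch file metadata."""
    subprocess.run(["find", data_dir], stdout=subprocess.DEVNULL, check=False)
```

A wrapper run from cron every minute or so could read the datanode log, call
next_warmup_time(), and invoke warm_cache() on each data directory once the
returned time is reached.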

Thanks,
Jon

- - - - - - - - - - - - - - - - - - -

On Mon, May 10, 2010 at 4:16 PM, Jon Graham <sj...@gmail.com> wrote:

> Hello Everyone,
>
> Is there a patch available for HADOOP-4584 that can be used on 0.20.2?
>
> Link https://issues.apache.org/jira/browse/HADOOP-4584 seems to indicate
> that a patch is available for 0.21 version but this
> version is not release yet.
>
> Block reports are taking several minutes on our cluster and this causes
> time out conditions and lots of retry conditions.
>
> Thanks for your help,
> Jon