Posted to hdfs-user@hadoop.apache.org by Zl...@barclayscapital.com on 2010/01/21 20:41:13 UTC

RE: Exponential performance decay - mystery solved


Alright, the problem was caused by me setting the block report interval
to 30 seconds.  The idea was to create more load on the Namenode, but I
didn't notice that those block reports were taking increasing amounts of
time to generate.  While a report is being generated, a lock is held,
which I'm guessing prevents the reporting datanode from performing its
other functions.

On my hardware, with 100,000 blocks the report takes over 7 seconds.  So
every datanode was unavailable for 7 out of every 30 seconds.  Changing
the interval to a more reasonable value restored the insertion speed to
linear.
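
To make the arithmetic above concrete, here is a minimal sketch of the
duty-cycle calculation.  The class and method names are hypothetical, and
the one-hour interval is only an assumed example of "a more reasonable
value"; the thread does not say which value was actually used.  With a
7-second report every 30 seconds, the datanode is busy for roughly 23% of
the time.

```java
// Illustrative arithmetic only; the class and method names are hypothetical.
final class BlockReportDutyCycle {

    /** Fraction of each interval the datanode spends generating the report. */
    static double unavailableFraction(double reportSeconds, double intervalSeconds) {
        if (intervalSeconds <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        // A report that takes longer than the interval keeps the node busy all the time.
        return Math.min(1.0, reportSeconds / intervalSeconds);
    }

    public static void main(String[] args) {
        double reportSeconds = 7.0; // reported in this thread for 100,000 blocks
        // 30 s is the interval from the benchmark; 3600 s is an assumed alternative.
        for (double interval : new double[] {30.0, 3600.0}) {
            System.out.printf("interval=%.0fs unavailable=%.1f%%%n",
                    interval, 100.0 * unavailableFraction(reportSeconds, interval));
        }
    }
}
```

Since the report time was observed to keep increasing, the fraction for
a fixed 30-second interval would keep rising as well.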

Apologies for creating this confusion; nevertheless, it was a useful
thing to learn.

Regards,
Zlatin

-----Original Message-----
From: Eli Collins [mailto:eli@cloudera.com] 
Sent: Thursday, January 21, 2010 2:02 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: Exponential performance decay - possible lead

>
> The messages are of the following:
>
> 2010-01-18 14:51:25,694 WARN org.apache.hadoop.hdfs.StateChange: 
> BLOCK* NameSystem.addStoredBlock: Redundant addStoredBlock request 
> received for blk_-5804440919363539694_1026 on ip.removed:port.removed 
> size 1024

This is odd; you shouldn't be getting this warning. I don't see it when
running your benchmark on my cluster. Are there other relevant
warnings/errors in the NN or DN logs?

Thanks,
Eli
_______________________________________________

This e-mail may contain information that is confidential, privileged or otherwise protected from disclosure. If you are not an intended recipient of this e-mail, do not duplicate or redistribute it by any means. Please delete it and any attachments and notify the sender that you have received it in error. Unless specifically indicated, this e-mail is not an offer to buy or sell or a solicitation to buy or sell any securities, investment products or other financial product or service, an official confirmation of any transaction, or an official statement of Barclays. Any views or opinions presented are solely those of the author and do not necessarily represent those of Barclays. This e-mail is subject to terms available at the following link: www.barcap.com/emaildisclaimer. By messaging with Barclays you consent to the foregoing.  Barclays Capital is the investment banking division of Barclays Bank PLC, a company registered in England (number 1026167) with its registered office at 1 Churchill Place, London, E14 5HP.  This email may relate to or be sent from other members of the Barclays Group.
_______________________________________________

Re: Exponential performance decay - mystery solved

Posted by Eli Collins <el...@cloudera.com>.
Hey Zlatin,

That makes sense. No apologies necessary, was a very useful exercise.

Thanks,
Eli



RE: Exponential performance decay - mystery solved

Posted by Zl...@barclayscapital.com.
Happy to report this doesn't happen with 0.21, even with a block report interval of 30 seconds.

Zlatin

________________________________
From: Raghu Angadi [mailto:rangadi@apache.org]
Sent: Thursday, January 21, 2010 7:19 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: Exponential performance decay - mystery solved


http://issues.apache.org/jira/browse/HADOOP-4584 is supposed to fix this exact problem with the block reports. Were you running 0.21 or 0.20?

Raghu.


Re: Exponential performance decay - mystery solved

Posted by Raghu Angadi <ra...@apache.org>.
http://issues.apache.org/jira/browse/HADOOP-4584 is supposed to fix this
exact problem with the block reports. Were you running 0.21 or 0.20?

Raghu.


RE: Exponential performance decay - mystery solved

Posted by Zl...@barclayscapital.com.
My 2c: if it is not possible to move the I/O operations listFiles() and
length() outside the lock on FSVolumeSet, maybe set a flag indicating
that a block report is in progress, so that the rest of the datanode
doesn't just hang.
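
As a rough illustration of both ideas, here is a minimal, self-contained
sketch.  It is not the Hadoop datanode code: VolumeScanSketch, its fields,
and its methods are all hypothetical, and the plain Object lock merely
stands in for the lock on FSVolumeSet.  Only a cheap in-memory copy is
made under the lock, the listFiles() and length() calls run after the
lock is released, and an AtomicBoolean records that a scan is in progress.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; this is NOT the Hadoop datanode code or its FSVolumeSet class.
final class VolumeScanSketch {
    private final Object lock = new Object();               // stands in for the volume-set lock
    private final List<File> blockDirs = new ArrayList<>(); // guarded by lock
    private final AtomicBoolean reportInProgress = new AtomicBoolean(false);

    void addDir(File dir) {
        synchronized (lock) {
            blockDirs.add(dir);
        }
    }

    /** Lets other threads check for a running scan without taking the lock. */
    boolean isReportInProgress() {
        return reportInProgress.get();
    }

    /** Returns the total size in bytes of the regular files in the registered directories. */
    long scan() {
        // Refuse to start a second scan while one is already running.
        if (!reportInProgress.compareAndSet(false, true)) {
            throw new IllegalStateException("scan already in progress");
        }
        try {
            List<File> snapshot;
            synchronized (lock) {
                // Only the cheap in-memory copy happens while the lock is held.
                snapshot = new ArrayList<>(blockDirs);
            }
            long total = 0;
            for (File dir : snapshot) {
                File[] files = dir.listFiles();  // disk I/O, lock not held
                if (files == null) {
                    continue;                    // not a directory, or an I/O error
                }
                for (File f : files) {
                    if (f.isFile()) {
                        total += f.length();     // disk I/O, lock not held
                    }
                }
            }
            return total;
        } finally {
            reportInProgress.set(false);
        }
    }
}
```

The trade-off in this sketch is that the scan works from a snapshot, so
directories added while it runs are only picked up by the next scan.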
 
Thanks,
Zlatin

________________________________

From: Dhruba Borthakur [mailto:dhruba@gmail.com] 
Sent: Thursday, January 21, 2010 3:38 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: Exponential performance decay - mystery solved


Some of this delay in generating block reports might be mitigated via
http://issues.apache.org/jira/browse/HDFS-854 

thanks,
dhruba



Re: Exponential performance decay - mystery solved

Posted by Dhruba Borthakur <dh...@gmail.com>.
Some of this delay in generating block reports might be mitigated via
http://issues.apache.org/jira/browse/HDFS-854

thanks,
dhruba





-- 
Connect to me at http://www.facebook.com/dhruba