Posted to user@hbase.apache.org by Bryan Beaudreault <bb...@gmail.com> on 2012/07/17 19:11:30 UTC

Lowering HDFS socket timeouts

Today I needed to restart one of my region servers, and did so without gracefully shutting down the datanode.  For the next 1-2 minutes we had a bunch of failed queries from various other region servers trying to access that datanode.  Looking at the logs, I saw that they were all socket timeouts after 60000 milliseconds. 

We use HBase mostly as an online datastore, with various APIs powering various web apps and external consumers.  Writes come from the APIs in some cases, but we also have continuous Hadoop jobs feeding data in.

Since we have web app consumers, this 60 second timeout seems unreasonably long.  If a datanode goes down, ideally the impact would be much smaller than that.  I want to lower the dfs.socket.timeout to something like 5-10 seconds, but do not know the implications of this.
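
For reference, the change being proposed would look roughly like the sketch below, assuming the Hadoop 1.x property name dfs.socket.timeout used throughout this thread (the value is in milliseconds; the class name and the 10s figure are only illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class LowerDfsSocketTimeout {
    public static void main(String[] args) {
        // HBaseConfiguration.create() loads hbase-default.xml and hbase-site.xml from
        // the classpath, so the same override could live in hbase-site.xml as
        // <property><name>dfs.socket.timeout</name><value>10000</value></property>.
        Configuration conf = HBaseConfiguration.create();

        // Programmatic equivalent of the proposed change: 10s instead of the 60s default.
        conf.setInt("dfs.socket.timeout", 10000);

        System.out.println("dfs.socket.timeout = "
                + conf.getInt("dfs.socket.timeout", 60000) + " ms");
    }
}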

In googling I did not find much precedent for this, but I did find some people talking about upping the timeout to much longer than 60 seconds.  Is it generally safe to lower this timeout dramatically if you want faster failures? Are there any downsides to this?

Thanks 

-- 
Bryan Beaudreault


Fw: Lowering HDFS socket timeouts

Posted by Bryan Beaudreault <bb...@hubspot.com>.
Cross posting this to the hdfs-user group.

See below for the context, but basically I'm wondering if it is safe to lower dfs.socket.timeout to something like 5-10 seconds in my hbase-site.xml.  I'm thinking this would only affect the HDFS client calls that come from HBase, so it wouldn't affect communication between datanodes and other HDFS services. 

Can anyone see any problems with doing this?  I'm trying to avoid 60-second pauses in my APIs (and thus web apps) when we lose a datanode.  Ideally the HBase connection to the datanode would fail fast and move on to another replica. 
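
As a toy sketch (plain sockets, not DFSClient internals; the helper below is purely illustrative), that fail-fast behavior amounts to trying one replica with a short connect/read timeout and moving on to the next when it times out:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.List;

public class FailFastToReplica {
    // Try each replica in turn with a short timeout; a dead datanode then costs
    // roughly timeoutMs before we move on, instead of the 60s default.
    static Socket connectToAnyReplica(List<InetSocketAddress> replicas, int timeoutMs)
            throws IOException {
        IOException last = new IOException("no replicas given");
        for (InetSocketAddress dn : replicas) {
            Socket s = new Socket();
            try {
                s.connect(dn, timeoutMs);   // connect timeout
                s.setSoTimeout(timeoutMs);  // read timeout for subsequent reads
                return s;                   // first reachable replica wins
            } catch (IOException e) {       // includes SocketTimeoutException
                last = e;
                try { s.close(); } catch (IOException ignored) { }
            }
        }
        throw last;
    }
}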

Thanks! 

-- 
Bryan Beaudreault


Forwarded message:

> From: N Keywal <nk...@gmail.com>
> Reply To: user@hbase.apache.org
> To: user@hbase.apache.org
> Date: Wednesday, July 18, 2012 12:44:53 PM
> Subject: Re: Lowering HDFS socket timeouts
> 
> I don't know. The question is mainly about the read timeout: you will
> connect to the ipc.Client with a read timeout of, let's say, 10s. Server
> side, the implementation may do something with another server, with a
> connect & read timeout of 60s. So if you have:
> HBase --> live DN --> dead DN
> 
> The timeout will be triggered in HBase while the live DN is still
> waiting for the answer from the dead DN. It could even retry on
> another node.
> On paper, this should work, as this could happen in real life without
> changing the dfs timeouts. And maybe this case does not even exist.
> But as the extension mechanism is designed to add some extra seconds,
> it could exist for this reason or something like that. Worth asking on the
> hdfs mailing list, I would say.



Re: Lowering HDFS socket timeouts

Posted by N Keywal <nk...@gmail.com>.
I don't know. The question is mainly about the read timeout: you will
connect to the ipc.Client with a read timeout of, let's say, 10s. Server
side, the implementation may do something with another server, with a
connect & read timeout of 60s. So if you have:
HBase --> live DN --> dead DN

The timeout will be triggered in HBase while the live DN is still
waiting for the answer from the dead DN. It could even retry on
another node.
On paper, this should work, as this could happen in real life without
changing the dfs timeouts. And maybe this case does not even exist.
But as the extension mechanism is designed to add some extra seconds,
it could exist for this reason or something like that. Worth asking on the
hdfs mailing list, I would say.
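
A toy illustration of that chain (plain sockets, not HDFS code; the host name is a placeholder): the caller's shorter read timeout fires while the middle hop is still stuck in its own longer wait.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class NestedTimeouts {
    public static void main(String[] args) throws IOException {
        Socket toLiveDn = new Socket();
        // "live-dn.example.com" stands in for the healthy datanode in the chain.
        toLiveDn.connect(new InetSocketAddress("live-dn.example.com", 50010), 10000);
        toLiveDn.setSoTimeout(10000); // caller-side read timeout: 10s
        try {
            // If the live DN is itself blocked for up to 60s waiting on a dead DN,
            // this read gives up after ~10s with a SocketTimeoutException, long
            // before the live DN abandons its own downstream read.
            toLiveDn.getInputStream().read();
        } catch (SocketTimeoutException e) {
            System.out.println("caller timed out first: " + e.getMessage());
        } finally {
            toLiveDn.close();
        }
    }
}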


Re: Lowering HDFS socket timeouts

Posted by Bryan Beaudreault <bb...@hubspot.com>.
Thanks for the response, N.  I could be wrong here, but since this problem is in the HDFS client code, couldn't I set dfs.socket.timeout in my hbase-site.xml so that it only affects HBase connections to HDFS?  I.e., we wouldn't have to worry about affecting connections between datanodes, etc. 
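
One way to convince yourself of that scope is to look at the Configuration the HBase process actually hands to its HDFS client; a minimal sketch, assuming the setting has been added to hbase-site.xml (the namenode URI is a placeholder):

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class ClientScopeCheck {
    public static void main(String[] args) throws Exception {
        // Built from hbase-site.xml (plus the Hadoop config files on the classpath);
        // this is the Configuration the region server passes to the HDFS client it
        // creates, so a dfs.socket.timeout override here only applies to that client.
        Configuration conf = HBaseConfiguration.create();
        System.out.println("client-side dfs.socket.timeout = "
                + conf.getInt("dfs.socket.timeout", 60000) + " ms");

        // The FileSystem obtained from this Configuration is what HBase uses to talk
        // to datanodes; datanode-to-datanode traffic keeps using whatever value the
        // datanodes read from their own hdfs-site.xml.
        FileSystem fs = FileSystem.get(new URI("hdfs://namenode.example.com:8020"), conf);
        System.out.println("fs = " + fs.getUri());
        fs.close();
    }
}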

-- 
Bryan Beaudreault





Re: Lowering HDFS socket timeouts

Posted by N Keywal <nk...@gmail.com>.
Hi Bryan,

It's a difficult question, because dfs.socket.timeout is used all over
the place in hdfs. I'm currently documenting this.
In particular:
- It's used for connections between datanodes, and not only for
connections between hdfs clients & hdfs datanodes.
- It's also used for the two types of datanode connections (ports
being 50010 & 50020 by default).
- It's used as a connect timeout, but also as a read timeout
(the socket is connected, but the application does not write for a while).
- It's used with various extensions, so when you're seeing values like
69000 or 66000 it's often the same setting: timeout + 3s (hardcoded) *
#replica (see the arithmetic sketch just below).
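
Taking the "3s (hardcoded) * #replica" extension in the last bullet at face value, the log values fall out of a one-line calculation (a sketch of the arithmetic, not code lifted from HDFS):

public class TimeoutArithmetic {
    // Effective timeout as described above: the configured base value plus a
    // hardcoded per-replica extension (3s according to this thread).
    static int effectiveTimeoutMs(int dfsSocketTimeoutMs, int replicas) {
        return dfsSocketTimeoutMs + 3000 * replicas;
    }

    public static void main(String[] args) {
        System.out.println(effectiveTimeoutMs(60000, 3)); // 69000, as seen in logs
        System.out.println(effectiveTimeoutMs(60000, 2)); // 66000
        System.out.println(effectiveTimeoutMs(10000, 3)); // 19000 with the proposed lower base
    }
}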

For a single-datanode issue, with everything else going well, it will make
the cluster much more reactive: HBase will go to another node
immediately instead of waiting. But it will also make the cluster much more
sensitive to GC and network issues. If you have a major hardware
issue, something like 10% of your cluster going down, this setting
will multiply the number of retries and add a lot of load to
your already damaged cluster, which could make things worse.

That said, I think we will need to make it shorter sooner or later, so
if you do it on your cluster, it will be helpful...

N.
