Posted to user@hbase.apache.org by Billy Pearson <sa...@pearsonwholesale.com> on 2009/04/12 02:02:08 UTC
WARN org.apache.hadoop.hdfs.DFSClient: NotReplicatedYetException sleeping
I am getting a bunch of WARNs:
WARN org.apache.hadoop.hdfs.DFSClient: NotReplicatedYetException sleeping
This is only happening on the hlogs on the servers while under heavy import,
30K/sec on 7 servers.
I tried bumping the hlog size between rolls to 100K instead of 10K, thinking
that would help. The problem is still there, but not as much, since the logs
are not rolling as often.
Not sure if hbase.regionserver.flushlogentries would help. Has anyone else seen
this? I am running the 0.19.2-dev branch.
Billy
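The effect of the roll threshold can be estimated with simple arithmetic. The sketch below assumes that the 30K/sec figure is a cluster-wide total spread evenly over the 7 servers, that each imported row produces one log entry, and that the roll threshold counts entries. None of this is confirmed in the thread, and the function name is made up for illustration.

```python
def seconds_between_rolls(cluster_rows_per_sec, servers, entries_per_roll):
    """Estimate how often one region server rolls its hlog.

    Assumes an even spread of writes across servers and one log entry
    per row (both are assumptions, not facts from the thread).
    """
    per_server_rate = cluster_rows_per_sec / servers
    return entries_per_roll / per_server_rate


if __name__ == "__main__":
    for threshold in (10_000, 100_000):
        secs = seconds_between_rolls(30_000, 7, threshold)
        print(f"{threshold} entries per roll: one roll every {secs:.1f} s per server")
```

Under these assumptions each server would start a new log file, and so ask HDFS for new blocks, roughly every 2 seconds at 10K and roughly every 23 seconds at 100K. That would be consistent with the warnings becoming less frequent but not disappearing.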
Re: WARN org.apache.hadoop.hdfs.DFSClient: NotReplicatedYetException sleeping
Posted by Billy Pearson <sa...@pearsonwholesale.com>.
Grepping the datanode log, it looks like I get these messages when it happens:
[root@server-5 hadoop]# tail -n500 -f hadoop-root-datanode-server-5.log | grep WARN
2009-04-12 01:06:51,099 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.0.1.5:50010, storageID=DS-234949010-10.0.1.5-50010-1237522267977, infoPort=50075, ipcPort=50020):Failed to transfer blk_9059760482849889388_248126 to 10.0.1.4:50010 got java.net.SocketException: Original Exception : java.io.IOException: Connection reset by peer
2009-04-12 01:06:53,033 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.0.1.5:50010, storageID=DS-234949010-10.0.1.5-50010-1237522267977, infoPort=50075, ipcPort=50020):Got exception while serving blk_865228606208483123_247552 to /10.0.1.5:
2009-04-12 01:06:58,400 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.0.1.5:50010, storageID=DS-234949010-10.0.1.5-50010-1237522267977, infoPort=50075, ipcPort=50020):Got exception while serving blk_-2405312561519352544_247560 to /10.0.1.5:
2009-04-12 01:07:06,154 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.0.1.5:50010, storageID=DS-234949010-10.0.1.5-50010-1237522267977, infoPort=50075, ipcPort=50020):Got exception while serving blk_-7270111728792506289_247565 to /10.0.1.5:
[root@server-5 hadoop]# tail -n500 -f hadoop-root-datanode-server-5.log | grep ERROR
2009-04-12 01:06:58,400 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.0.1.5:50010, storageID=DS-234949010-10.0.1.5-50010-1237522267977, infoPort=50075, ipcPort=50020):DataXceiver
2009-04-12 01:07:06,154 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.0.1.5:50010, storageID=DS-234949010-10.0.1.5-50010-1237522267977, infoPort=50075, ipcPort=50020):DataXceiver
Not sure whether I need to bump the handlers for the datanodes or not.
Billy
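Rather than grepping for one level at a time, the datanode entries can be tallied by kind in a single pass. The following is a minimal sketch, assuming each entry sits on one line in the actual log file; the `summarize` function, the category labels and the three sample lines (copied from the output above) are illustrative only.

```python
import re
from collections import Counter

# Start of a datanode log entry: timestamp, level, logger name, message.
ENTRY = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (WARN|ERROR)\s+(\S+):\s*(.*)$"
)
# Block IDs look like blk_<signed number>_<generation stamp>.
BLOCK = re.compile(r"blk_-?\d+_\d+")

LOGGER = "org.apache.hadoop.hdfs.server.datanode.DataNode"
REG = ("DatanodeRegistration(10.0.1.5:50010, "
       "storageID=DS-234949010-10.0.1.5-50010-1237522267977, "
       "infoPort=50075, ipcPort=50020)")

# Three entries taken from the datanode log quoted in this message.
SAMPLE = [
    f"2009-04-12 01:06:51,099 WARN {LOGGER}: {REG}:Failed to transfer "
    "blk_9059760482849889388_248126 to 10.0.1.4:50010 got "
    "java.net.SocketException: Original Exception : "
    "java.io.IOException: Connection reset by peer",
    f"2009-04-12 01:06:58,400 WARN {LOGGER}: {REG}:Got exception while "
    "serving blk_-2405312561519352544_247560 to /10.0.1.5:",
    f"2009-04-12 01:06:58,400 ERROR {LOGGER}: {REG}:DataXceiver",
]


def summarize(lines):
    """Count WARN/ERROR entries by kind and collect the block IDs mentioned."""
    kinds = Counter()
    blocks = set()
    for line in lines:
        m = ENTRY.match(line)
        if not m:
            continue  # skip stack-trace lines and other levels
        level, message = m.group(2), m.group(4)
        if "Failed to transfer" in message:
            kind = "failed transfer"
        elif "Got exception while serving" in message:
            kind = "exception while serving"
        elif "DataXceiver" in message:
            kind = "DataXceiver"
        else:
            kind = "other"
        kinds[(level, kind)] += 1
        blocks.update(BLOCK.findall(message))
    return kinds, blocks


if __name__ == "__main__":
    kinds, blocks = summarize(SAMPLE)
    for (level, kind), count in sorted(kinds.items()):
        print(level, kind, count)
    print(sorted(blocks))
```

Feeding it the whole log file (for example `summarize(open(path))`, with `path` pointing at the datanode log) would show whether the serving exceptions dominate. The matching timestamps on the ERROR and WARN entries above hint that each DataXceiver error accompanies a failed block read, though the log excerpt alone does not prove it.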
"Andrew Purtell" <ap...@apache.org> wrote in
message news:479243.71024.qm@web65516.mail.ac4.yahoo.com...
>
> The "Blocks not replicated yet" message is an HDFS problem.
> Maybe I am not understanding what you are saying?
>
> So you have not increased the number of xceivers in
> the datanode configs? Are there any messages of
> interest in the datanode logs?
>
> - Andy
Re: WARN org.apache.hadoop.hdfs.DFSClient: NotReplicatedYetException sleeping
Posted by Andrew Purtell <ap...@apache.org>.
The "Blocks not replicated yet" message is an HDFS problem.
Maybe I am not understanding what you are saying?
So you have not increased the number of xceivers in
the datanode configs? Are there any messages of
interest in the datanode logs?
- Andy
> From: Billy Pearson
> Subject: Re: WARN org.apache.hadoop.hdfs.DFSClient: NotReplicatedYetException sleeping
> To: hbase-user@hadoop.apache.org
> Date: Saturday, April 11, 2009, 8:00 PM
> Everything is default on them except max open files, which is
> set to some really high number.
> The only change I know of that could be affecting it is the nice
> level of hbase and hadoop:
> hadoop nice = 5
> hbase nice = 10
>
> That way hbase runs slower than the rest when we get a load.
> I run other stuff on the nodes about 6 hours out of the day,
> but this is happening when there is spare CPU.
>
> Running dual 2.4GHz with 4GB mem and dual 250GB 7200 RPM RAID 0
> drives; 3 are running with a 147GB 15K RPM SCSI drive.
> About 8 regions on average; heap on datanodes and regionservers
> is still 1GB.
>
> Flushing is happening often with these high import speeds,
> so could that be blocking the hlog?
> Since flushing is happening often, minor compactions
> are running almost all the time to keep up.
>
> Billy
Re: WARN org.apache.hadoop.hdfs.DFSClient: NotReplicatedYetException sleeping
Posted by Billy Pearson <sa...@pearsonwholesale.com>.
Everything is default on them except max open files, which is set to some
really high number.
The only change I know of that could be affecting it is the nice level of
hbase and hadoop:
hadoop nice = 5
hbase nice = 10
That way hbase runs slower than the rest when we get a load. I run other
stuff on the nodes about 6 hours out of the day, but this is happening when
there is spare CPU.
Running dual 2.4GHz with 4GB mem and dual 250GB 7200 RPM RAID 0 drives; 3 are
running with a 147GB 15K RPM SCSI drive.
About 8 regions on average; heap on datanodes and regionservers is still 1GB.
Flushing is happening often with these high import speeds, so could that be
blocking the hlog?
Since flushing is happening often, minor compactions are running almost
all the time to keep up.
Billy
"Andrew Purtell" <ap...@apache.org> wrote in
message news:899617.77011.qm@web65501.mail.ac4.yahoo.com...
>
> Hi Billy,
>
> It makes sense to me that you'd see this on the HLogs
> first. HDFS blocks are allocated most frequently for
> them, except during compaction.
>
> Seems like a classic sign of DFS stress to me. What are
> your configuration details in terms of max open files,
> maximum xceiver limit, and datanode handlers?
>
> - Andy
Re: WARN org.apache.hadoop.hdfs.DFSClient: NotReplicatedYetException sleeping
Posted by Andrew Purtell <ap...@apache.org>.
Hi Billy,
It makes sense to me that you'd see this on the HLogs
first. HDFS blocks are allocated most frequently for
them, except during compaction.
Seems like a classic sign of DFS stress to me. What are
your configuration details in terms of max open files,
maximum xceiver limit, and datanode handlers?
- Andy
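The two datanode settings asked about here can be read straight out of the site configuration file. The sketch below is only an illustration: the property names `dfs.datanode.max.xcievers` (with that spelling) and `dfs.datanode.handler.count` are believed to be the ones used by Hadoop releases of that period but should be checked against your own hadoop-default.xml, the value 2048 is an arbitrary example, and `read_properties` is a made-up helper.

```python
import xml.etree.ElementTree as ET

# Assumed property names; verify against the hadoop-default.xml you run.
WANTED = ["dfs.datanode.max.xcievers", "dfs.datanode.handler.count"]

# Illustrative stand-in for the contents of a site configuration file.
EXAMPLE_CONFIG = """
<configuration>
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>2048</value>
  </property>
</configuration>
"""


def read_properties(xml_text, names):
    """Return {name: value} for the given names, None where a name is unset."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            found[name.strip()] = (prop.findtext("value") or "").strip()
    return {name: found.get(name) for name in names}


if __name__ == "__main__":
    for name, value in read_properties(EXAMPLE_CONFIG, WANTED).items():
        print(name, "=", value if value is not None else "not set (default applies)")
```

The open-file limit is an operating-system setting rather than a Hadoop property; running `ulimit -n` as the user that starts the daemons reports it.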
> From: Billy Pearson
> Subject: WARN org.apache.hadoop.hdfs.DFSClient: NotReplicatedYetException sleeping
> To: hbase-user@hadoop.apache.org
> Date: Saturday, April 11, 2009, 5:02 PM
> I am getting a bunch of WARNs:
> WARN org.apache.hadoop.hdfs.DFSClient:
> NotReplicatedYetException sleeping
>
> This is only happening on the hlogs on the servers while
> under heavy import, 30K/sec on 7 servers.
> I tried bumping the hlog size between rolls to 100K
> instead of 10K, thinking that would help. The problem is still
> there, but not as much, since the logs are not rolling as
> often.
>
> Not sure if hbase.regionserver.flushlogentries would help.
> Has anyone else seen this? I am running the 0.19.2-dev branch.
>
> Billy