Posted to user@hadoop.apache.org by Vimal Jain <vk...@gmail.com> on 2013/10/22 09:02:16 UTC

High Full GC count for Region server

Hi,
I am running HBase in pseudo-distributed mode (Hadoop version 1.1.2, HBase
version 0.94.7).
I am getting a few exceptions in both the Hadoop (namenode, datanode) logs and
the HBase (region server) log.
When I searched for these exceptions on Google, I concluded that the problem is
mainly due to a large number of full GCs in the region server process.

I used jstat and found that there were a total of 950 full GCs over a span of 4
days for the region server process. Is this OK?
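
For reference, a command along these lines reports that count (<RS_PID> is a
placeholder for the region server's JVM pid):

    jstat -gcutil <RS_PID> 5000
    # prints a sample every 5 s; the FGC column is the cumulative full-GC count,
    # FGCT the total time spent in full GCs (seconds)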

I am totally confused by the number of exceptions I am getting.
I also get the below exceptions intermittently.


Region server:-

2013-10-22 12:00:26,627 WARN org.apache.hadoop.ipc.HBaseServer:
(responseTooSlow):
{"processingtimems":15312,"call":"next(-6681408251916104762, 1000), rpc
version=1, client version=29, methodsFingerPrint=-1368823753","client":"
192.168.20.31:48270
","starttimems":1382423411293,"queuetimems":0,"class":"HRegionServer","responsesize":4808556,"method":"next"}
2013-10-22 12:06:17,606 WARN org.apache.hadoop.ipc.HBaseServer:
(operationTooSlow): {"processingtimems":14759,"client":"192.168.20.31:48247
","timeRange":[0,9223372036854775807],"starttimems":1382423762845,"responsesize":61,"class":"HRegionServer","table":"event_data","cacheBlocks":true,"families":{"ginfo":["netGainPool"]},"row":"1629657","queuetimems":0,"method":"get","totalColumns":1,"maxVersions":1}

2013-10-18 10:37:45,008 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer
Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
/hbase/event_data/4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
could only be replicated to 0 nodes, instead of 1
    at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639)

Name node :-
java.io.IOException: File
/hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
could only be replicated to 0 nodes, instead of 1

java.io.IOException: Got blockReceived message from unregistered or dead
node blk_-2949905629769882833_52274

Data node :-
480000 millis timeout while waiting for channel to be ready for write. ch :
java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010remote=/
192.168.20.30:36188]

ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
192.168.20.30:50010,
storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075,
ipcPort=50020):DataXceiver
java.io.EOFException: while trying to read 39309 bytes


-- 
Thanks and Regards,
Vimal Jain

Re: High Full GC count for Region server

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Can you stop HBase and run fsck on Hadoop to see how healthy your HDFS is?
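
Something along these lines (with HBase stopped, run as the HDFS user) will
report corrupt, missing, or under-replicated blocks; the extra flags are optional:

    hadoop fsck / -files -blocks -locations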


2013/10/24 Vimal Jain <vk...@gmail.com>

> Hi Ted/Jean,
> Can you please help here ?
>
>
> On Tue, Oct 22, 2013 at 10:29 PM, Vimal Jain <vk...@gmail.com> wrote:
>
> > Hi Ted,
> > Yes i checked namenode and datanode logs and i found below exceptions in
> > both the logs:-
> >
> > Name node :-
> > java.io.IOException: File
> >
> /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
> > could only be replicated to 0 nodes, instead of 1
> >
> > java.io.IOException: Got blockReceived message from unregistered or dead
> > node blk_-2949905629769882833_52274
> >
> > Data node :-
> > 480000 millis timeout while waiting for channel to be ready for write. ch
> > : java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010
> >  remote=/192.168.20.30:36188]
> >
> > ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> > DatanodeRegistration(192.168.20.30:50010,
> > storageID=DS-1816106352-192.168.20.30-50010-1369314076237,
> infoPort=50075,
> > ipcPort=50020):DataXceiver
> >
> > java.io.EOFException: while trying to read 39309 bytes
> >
> >
> > On Tue, Oct 22, 2013 at 10:19 PM, Ted Yu <yu...@gmail.com> wrote:
> >
> >> bq. java.io.IOException: File /hbase/event_data/
> >> 4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
> >> could
> >> only be replicated to 0 nodes, instead of 1
> >>
> >> Have you checked Namenode / Datanode logs ?
> >> Looks like hdfs was not stable.
> >>
> >>
> >> On Tue, Oct 22, 2013 at 9:01 AM, Vimal Jain <vk...@gmail.com> wrote:
> >>
> >> > HI Jean,
> >> > Thanks for your reply.
> >> > I have total 8 GB memory and distribution is as follows:-
> >> >
> >> > Region server  - 2 GB
> >> > Master,Namenode,Datanode,Secondary Namenode,Zookepeer - 1 GB
> >> > OS - 1 GB
> >> >
> >> > Please let me know if you need more information.
> >> >
> >> >
> >> > On Tue, Oct 22, 2013 at 8:15 PM, Jean-Marc Spaggiari <
> >> > jean-marc@spaggiari.org> wrote:
> >> >
> >> > > Hi Vimal,
> >> > >
> >> > > What are your settings? Memory of the host, and memory allocated for
> >> the
> >> > > different HBase services?
> >> > >
> >> > > Thanks,
> >> > >
> >> > > JM
> >> > >
> >> > >
> >> > > 2013/10/22 Vimal Jain <vk...@gmail.com>
> >> > >
> >> > > > Hi,
> >> > > > I am running in Hbase in pseudo distributed mode. ( Hadoop
> version -
> >> > > 1.1.2
> >> > > > , Hbase version - 0.94.7 )
> >> > > > I am getting few exceptions in both hadoop ( namenode , datanode)
> >> logs
> >> > > and
> >> > > > hbase(region server).
> >> > > > When i search for these exceptions on google , i concluded  that
> >> > problem
> >> > > is
> >> > > > mainly due to large number of full GC in region server process.
> >> > > >
> >> > > > I used jstat and found that there are total of 950 full GCs in
> span
> >> of
> >> > 4
> >> > > > days for region server process.Is this ok?
> >> > > >
> >> > > > I am totally confused by number of exceptions i am getting.
> >> > > > Also i get below exceptions intermittently.
> >> > > >
> >> > > >
> >> > > > Region server:-
> >> > > >
> >> > > > 2013-10-22 12:00:26,627 WARN org.apache.hadoop.ipc.HBaseServer:
> >> > > > (responseTooSlow):
> >> > > > {"processingtimems":15312,"call":"next(-6681408251916104762,
> 1000),
> >> rpc
> >> > > > version=1, client version=29,
> >> > methodsFingerPrint=-1368823753","client":"
> >> > > > 192.168.20.31:48270
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> ","starttimems":1382423411293,"queuetimems":0,"class":"HRegionServer","responsesize":4808556,"method":"next"}
> >> > > > 2013-10-22 12:06:17,606 WARN org.apache.hadoop.ipc.HBaseServer:
> >> > > > (operationTooSlow): {"processingtimems":14759,"client":"
> >> > > > 192.168.20.31:48247
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> ","timeRange":[0,9223372036854775807],"starttimems":1382423762845,"responsesize":61,"class":"HRegionServer","table":"event_data","cacheBlocks":true,"families":{"ginfo":["netGainPool"]},"row":"1629657","queuetimems":0,"method":"get","totalColumns":1,"maxVersions":1}
> >> > > >
> >> > > > 2013-10-18 10:37:45,008 WARN org.apache.hadoop.hdfs.DFSClient:
> >> > > DataStreamer
> >> > > > Exception: org.apache.hadoop.ipc.RemoteException:
> >> java.io.IOException:
> >> > > File
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> /hbase/event_data/4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
> >> > > > could only be replicated to 0 nodes, instead of 1
> >> > > >     at
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639)
> >> > > >
> >> > > > Name node :-
> >> > > > java.io.IOException: File
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
> >> > > > could only be replicated to 0 nodes, instead of 1
> >> > > >
> >> > > > java.io.IOException: Got blockReceived message from unregistered
> or
> >> > dead
> >> > > > node blk_-2949905629769882833_52274
> >> > > >
> >> > > > Data node :-
> >> > > > 480000 millis timeout while waiting for channel to be ready for
> >> write.
> >> > > ch :
> >> > > > java.nio.channels.SocketChannel[connected local=/
> >> 192.168.20.30:50010
> >> > > > remote=/
> >> > > > 192.168.20.30:36188]
> >> > > >
> >> > > > ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> >> > > > DatanodeRegistration(
> >> > > > 192.168.20.30:50010,
> >> > > > storageID=DS-1816106352-192.168.20.30-50010-1369314076237,
> >> > > infoPort=50075,
> >> > > > ipcPort=50020):DataXceiver
> >> > > > java.io.EOFException: while trying to read 39309 bytes
> >> > > >
> >> > > >
> >> > > > --
> >> > > > Thanks and Regards,
> >> > > > Vimal Jain
> >> > > >
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > Thanks and Regards,
> >> > Vimal Jain
> >> >
> >>
> >
> >
> >
> > --
> > Thanks and Regards,
> > Vimal Jain
> >
>
>
>
> --
> Thanks and Regards,
> Vimal Jain
>

Re: High Full GC count for Region server

Posted by Adrien Mogenet <ad...@gmail.com>.
The "responseTooSlow" message is triggered whenever a bunch of operations
is taking more than a configured amount of time. In your case, processing
15827 elements can lead into long response time, so no worry about this.

However, your SocketTimeoutException might be due to long GC pauses. I
guess it might also be due to network failures or RS contention (too many
requests on this RS, no more IPC slot...)
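
For reference, the threshold behind that warning is configurable in
hbase-site.xml; a minimal sketch, assuming the 0.94-era property name applies to
your build:

    <property>
      <name>hbase.ipc.warn.response.time</name>
      <!-- milliseconds; calls slower than this are logged as (responseTooSlow) -->
      <value>10000</value>
    </property>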


On Thu, Oct 31, 2013 at 9:52 AM, Vimal Jain <vk...@gmail.com> wrote:

> Hi,
> Can anyone please reply to the above query ?
>
>
> On Tue, Oct 29, 2013 at 10:48 AM, Vimal Jain <vk...@gmail.com> wrote:
>
> > Hi,
> > Here is my analysis of this problem.Please correct me if i wrong
> somewhere.
> > I have assigned 2 GB to region server process.I think its sufficient
> > enough to handle around 9GB of data.
> > I have not changed much of the parameters , especially memstore size
> which
> > is 128 GB for 0.94.7 by default.
> > Also as per my understanding , each col-family has one memstore
> associated
> > with it.So my memstores are taking 128*3 = 384 MB ( I have 3 column
> > families).
> > So i think i should reduce memstore size to something like 32/64 MB so
> > that data is flushed to disk at higher frequency then current
> > frequency.This will save some memory.
> > Is there any other parameter other then memstore size which affects
> memory
> > utilization.
> >
> > Also I am getting below exceptions in data node log and region server log
> > every day.Is it due to long GC pauses ?
> >
> > Data node logs :-
> >
> > hadoop-hadoop-datanode-woody.log:2013-10-29 00:12:13,127 WARN
> > org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
> > 192.168.20.30:5001
> > 0, storageID=DS-1816106352-192.168.20.30-50010-1369314076237,
> > infoPort=50075, ipcPort=50020):Got exception while serving
> > blk_-560908881317618221_58058
> >  to /192.168.20.30:
> > hadoop-hadoop-datanode-woody.log:java.net.SocketTimeoutException: 480000
> > millis timeout while waiting for channel to be ready for write. ch :
> > java.nio
> > .channels.SocketChannel[connected local=/192.168.20.30:50010 remote=/
> > 192.168.20.30:39413]
> > hadoop-hadoop-datanode-woody.log:2013-10-29 00:12:13,127 ERROR
> > org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
> > 192.168.20.30:500
> >
> > 10, storageID=DS-1816106352-192.168.20.30-50010-1369314076237,
> > infoPort=50075, ipcPort=50020):DataXceiver
> > hadoop-hadoop-datanode-woody.log:java.net.SocketTimeoutException: 480000
> > millis timeout while waiting for channel to be ready for write. ch :
> > java.nio
> > .channels.SocketChannel[connected local=/192.168.20.30:50010 remote=/
> > 192.168.20.30:39413]
> >
> >
> > Region server logs :-
> >
> > hbase-hadoop-regionserver-woody.log:2013-10-29 01:01:16,475 WARN
> > org.apache.hadoop.ipc.HBaseServer: (responseTooSlow):
> > {"processingtimems":15827,"call
> > ":"multi(org.apache.hadoop.hbase.client.MultiAction@2918e464), rpc
> > version=1, client version=29,
> > methodsFingerPrint=-1368823753","client":"192.168.20.
> >
> >
> 31:50619","starttimems":1382988660645,"queuetimems":0,"class":"HRegionServer","responsesize":0,"method":"multi"}
> > hbase-hadoop-regionserver-woody.log:2013-10-29 06:01:27,459 WARN
> > org.apache.hadoop.ipc.HBaseServer: (operationTooSlow):
> > {"processingtimems":14745,"cli
> > ent":"192.168.20.31:50908
> >
> ","timeRange":[0,9223372036854775807],"starttimems":1383006672707,"responsesize":55,"class":"HRegionServer","table":"event_da
> >
> >
> ta","cacheBlocks":true,"families":{"oinfo":["clubStatus"]},"row":"1752869","queuetimems":1,"method":"get","totalColumns":1,"maxVersions":1}
> >
> >
> >
> >
> >
> > On Mon, Oct 28, 2013 at 11:55 PM, Asaf Mesika <asaf.mesika@gmail.com
> >wrote:
> >
> >> Check through HDFS UI that your cluster haven't reached maximum disk
> >> capacity
> >>
> >> On Thursday, October 24, 2013, Vimal Jain wrote:
> >>
> >> > Hi Ted/Jean,
> >> > Can you please help here ?
> >> >
> >> >
> >> > On Tue, Oct 22, 2013 at 10:29 PM, Vimal Jain <vkjk89@gmail.com
> >> <javascript:;>>
> >> > wrote:
> >> >
> >> > > Hi Ted,
> >> > > Yes i checked namenode and datanode logs and i found below
> exceptions
> >> in
> >> > > both the logs:-
> >> > >
> >> > > Name node :-
> >> > > java.io.IOException: File
> >> > >
> >> >
> >>
> /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
> >> > > could only be replicated to 0 nodes, instead of 1
> >> > >
> >> > > java.io.IOException: Got blockReceived message from unregistered or
> >> dead
> >> > > node blk_-2949905629769882833_52274
> >> > >
> >> > > Data node :-
> >> > > 480000 millis timeout while waiting for channel to be ready for
> >> write. ch
> >> > > : java.nio.channels.SocketChannel[connected local=/
> >> 192.168.20.30:50010
> >> > >  remote=/192.168.20.30:36188]
> >> > >
> >> > > ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> >> > > DatanodeRegistration(192.168.20.30:50010,
> >> > > storageID=DS-1816106352-192.168.20.30-50010-1369314076237,
> >> > infoPort=50075,
> >> > > ipcPort=50020):DataXceiver
> >> > >
> >> > > java.io.EOFException: while trying to read 39309 bytes
> >> > >
> >> > >
> >> > > On Tue, Oct 22, 2013 at 10:19 PM, Ted Yu <yu...@gmail.com>
> wrote:
> >> > >
> >> > >> bq. java.io.IOException: File /hbase/event_data/
> >> > >>
> >> 4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
> >> > >> could
> >> > >> only be replicated to 0 nodes, instead of 1
> >> > >>
> >> > >> Have you checked Namenode / Datanode logs ?
> >> > >> Looks like hdfs was not stable.
> >> > >>
> >> > >>
> >> > >> On Tue, Oct 22, 2013 at 9:01 AM, Vimal Jain <vk...@gmail.com>
> >> wrote:
> >> > >>
> >> > >> > HI Jean,
> >> > >> > Thanks for your reply.
> >> > >> > I have total 8 GB memory and distribution is as follows:-
> >> > >> >
> >> > >> > Region server  - 2 GB
> >> > >> > Master,Namenode,Datanode,Secondary Namenode,Zookepeer - 1 GB
> >> > >> > OS - 1 GB
> >> > >> >
> >> > >> > Please let me know if you need more information.
> >> > >> >
> >> > >> >
> >> > >> > On Tue, Oct 22, 2013 at 8:15 PM, Jean-Marc Spaggiari <
> >> > >> > jean-marc@spaggiari.org> wrote:
> >> > >> >
> >> > >> > > Hi Vimal,
> >> > >> > >
> >> > >> > > What are your settings? Memory of the host, and memory
> allocated
> >> for
> >> > >> the
> >> > >> > > different HBase services?
> >> > >> > >
> >> > >> > > Thanks,
> >> > >> > >
> >> > >> > > JM
> >> > >> > >
> >> > >> > >
> >> > >> > > 2013/10/22 Vimal Jain <vk...@gmail.com>
> >> > >> > >
> >> > >> > > > Hi,
> >> > >> > > > I am running in Hbase in pseudo distributed mode. ( Hadoop
> >> > version -
> >> > >> > > 1.1.2
> >> > >> > > > , Hbase version - 0.94.7 )
> >> > >> > > > I am getting few exceptions in both hadoop ( namenode ,
> >> datanode)
> >> > >> logs
> >> > >> > > and
> >> > >> > > > hbase(region server).
> >> > >> > > > When i search for these exceptions on google , i concluded
> >>  that
> >> > >> > problem
> >> > >> > > is
> >> > >> > > > mainly due to large number of full GC in region server
> process.
> >> > >> > > >
> >> > >> > > > I used jstat and found that there are total of 950 full GCs
> in
> >> > span
> >> > >> of
> >> > >> > 4
> >> > >> > > > days for region server process.Is this ok?
> >> > >> > > >
> >> > >> > > > I am totally confused by number of exceptions i am getting.
> >> > >> > > > Also i get below exceptions intermittently.
> >> > >> > > >
> >> > >> > > >
> >> > >> > > > Region server:-
> >> > >> > > >
> >> > >> > > > 2013-10-22 12:00:26,627 WARN
> org.apache.hadoop.ipc.HBaseServer:
> >> > >> > > > (responseTooSlow):
> >> > >> > > > {"processingtimems":15312,"call":"next(-6681408251916104762,
> >> > 1000),
> >> > >> rpc
> >> > >> > > > version=1, client version=29,
> >> > >> > methodsFingerPrint=-1368823753","client":"
> >> > >> > > > 192.168.20.31:48270
> >> > >> > > >
> >> > >> > > >
> >> > >> > >
> >> > >> >
> >> > >>
> >> >
> >>
> ","starttimems":1382423411293,"queuetimems":0,"class":"HRegionServer","responsesize":4808556,"method":"next"}
> >> > >> > > > 2013-10-22 12:06:17,606 WARN
> org.apache.hadoop.ipc.HBaseServer:
> >> > >> > > > (operationTooSlow): {"processingtimems":14759,"client":"
> >> > >> > > > 192.168.20.31:48247
> >> > >> > > >
> >> > >> > > >
> >> > >> > >
> >> > >> >
> >> > >>
> >> >
> >>
> ","timeRange":[0,9223372036854775807],"starttimems":1382423762845,"responsesize":61,"class":"HRegionServer","table":"event_data","cacheBlocks":true,"families":{"gin
> >>
> >
> >
> >
> > --
> > Thanks and Regards,
> > Vimal Jain
> >
>
>
>
> --
> Thanks and Regards,
> Vimal Jain
>



-- 
Adrien Mogenet
http://www.borntosegfault.com

Re: High Full GC count for Region server

Posted by Vimal Jain <vk...@gmail.com>.
Hi,
Can anyone please reply to the above query?


On Tue, Oct 29, 2013 at 10:48 AM, Vimal Jain <vk...@gmail.com> wrote:

> Hi,
> Here is my analysis of this problem.Please correct me if i wrong somewhere.
> I have assigned 2 GB to region server process.I think its sufficient
> enough to handle around 9GB of data.
> I have not changed much of the parameters , especially memstore size which
> is 128 GB for 0.94.7 by default.
> Also as per my understanding , each col-family has one memstore associated
> with it.So my memstores are taking 128*3 = 384 MB ( I have 3 column
> families).
> So i think i should reduce memstore size to something like 32/64 MB so
> that data is flushed to disk at higher frequency then current
> frequency.This will save some memory.
> Is there any other parameter other then memstore size which affects memory
> utilization.
>
> Also I am getting below exceptions in data node log and region server log
> every day.Is it due to long GC pauses ?
>
> Data node logs :-
>
> hadoop-hadoop-datanode-woody.log:2013-10-29 00:12:13,127 WARN
> org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
> 192.168.20.30:5001
> 0, storageID=DS-1816106352-192.168.20.30-50010-1369314076237,
> infoPort=50075, ipcPort=50020):Got exception while serving
> blk_-560908881317618221_58058
>  to /192.168.20.30:
> hadoop-hadoop-datanode-woody.log:java.net.SocketTimeoutException: 480000
> millis timeout while waiting for channel to be ready for write. ch :
> java.nio
> .channels.SocketChannel[connected local=/192.168.20.30:50010 remote=/
> 192.168.20.30:39413]
> hadoop-hadoop-datanode-woody.log:2013-10-29 00:12:13,127 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
> 192.168.20.30:500
>
> 10, storageID=DS-1816106352-192.168.20.30-50010-1369314076237,
> infoPort=50075, ipcPort=50020):DataXceiver
> hadoop-hadoop-datanode-woody.log:java.net.SocketTimeoutException: 480000
> millis timeout while waiting for channel to be ready for write. ch :
> java.nio
> .channels.SocketChannel[connected local=/192.168.20.30:50010 remote=/
> 192.168.20.30:39413]
>
>
> Region server logs :-
>
> hbase-hadoop-regionserver-woody.log:2013-10-29 01:01:16,475 WARN
> org.apache.hadoop.ipc.HBaseServer: (responseTooSlow):
> {"processingtimems":15827,"call
> ":"multi(org.apache.hadoop.hbase.client.MultiAction@2918e464), rpc
> version=1, client version=29,
> methodsFingerPrint=-1368823753","client":"192.168.20.
>
> 31:50619","starttimems":1382988660645,"queuetimems":0,"class":"HRegionServer","responsesize":0,"method":"multi"}
> hbase-hadoop-regionserver-woody.log:2013-10-29 06:01:27,459 WARN
> org.apache.hadoop.ipc.HBaseServer: (operationTooSlow):
> {"processingtimems":14745,"cli
> ent":"192.168.20.31:50908
> ","timeRange":[0,9223372036854775807],"starttimems":1383006672707,"responsesize":55,"class":"HRegionServer","table":"event_da
>
> ta","cacheBlocks":true,"families":{"oinfo":["clubStatus"]},"row":"1752869","queuetimems":1,"method":"get","totalColumns":1,"maxVersions":1}
>
>
>
>
>
> On Mon, Oct 28, 2013 at 11:55 PM, Asaf Mesika <as...@gmail.com>wrote:
>
>> Check through HDFS UI that your cluster haven't reached maximum disk
>> capacity
>>
>> On Thursday, October 24, 2013, Vimal Jain wrote:
>>
>> > Hi Ted/Jean,
>> > Can you please help here ?
>> >
>> >
>> > On Tue, Oct 22, 2013 at 10:29 PM, Vimal Jain <vkjk89@gmail.com
>> <javascript:;>>
>> > wrote:
>> >
>> > > Hi Ted,
>> > > Yes i checked namenode and datanode logs and i found below exceptions
>> in
>> > > both the logs:-
>> > >
>> > > Name node :-
>> > > java.io.IOException: File
>> > >
>> >
>> /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
>> > > could only be replicated to 0 nodes, instead of 1
>> > >
>> > > java.io.IOException: Got blockReceived message from unregistered or
>> dead
>> > > node blk_-2949905629769882833_52274
>> > >
>> > > Data node :-
>> > > 480000 millis timeout while waiting for channel to be ready for
>> write. ch
>> > > : java.nio.channels.SocketChannel[connected local=/
>> 192.168.20.30:50010
>> > >  remote=/192.168.20.30:36188]
>> > >
>> > > ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
>> > > DatanodeRegistration(192.168.20.30:50010,
>> > > storageID=DS-1816106352-192.168.20.30-50010-1369314076237,
>> > infoPort=50075,
>> > > ipcPort=50020):DataXceiver
>> > >
>> > > java.io.EOFException: while trying to read 39309 bytes
>> > >
>> > >
>> > > On Tue, Oct 22, 2013 at 10:19 PM, Ted Yu <yu...@gmail.com> wrote:
>> > >
>> > >> bq. java.io.IOException: File /hbase/event_data/
>> > >>
>> 4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
>> > >> could
>> > >> only be replicated to 0 nodes, instead of 1
>> > >>
>> > >> Have you checked Namenode / Datanode logs ?
>> > >> Looks like hdfs was not stable.
>> > >>
>> > >>
>> > >> On Tue, Oct 22, 2013 at 9:01 AM, Vimal Jain <vk...@gmail.com>
>> wrote:
>> > >>
>> > >> > HI Jean,
>> > >> > Thanks for your reply.
>> > >> > I have total 8 GB memory and distribution is as follows:-
>> > >> >
>> > >> > Region server  - 2 GB
>> > >> > Master,Namenode,Datanode,Secondary Namenode,Zookepeer - 1 GB
>> > >> > OS - 1 GB
>> > >> >
>> > >> > Please let me know if you need more information.
>> > >> >
>> > >> >
>> > >> > On Tue, Oct 22, 2013 at 8:15 PM, Jean-Marc Spaggiari <
>> > >> > jean-marc@spaggiari.org> wrote:
>> > >> >
>> > >> > > Hi Vimal,
>> > >> > >
>> > >> > > What are your settings? Memory of the host, and memory allocated
>> for
>> > >> the
>> > >> > > different HBase services?
>> > >> > >
>> > >> > > Thanks,
>> > >> > >
>> > >> > > JM
>> > >> > >
>> > >> > >
>> > >> > > 2013/10/22 Vimal Jain <vk...@gmail.com>
>> > >> > >
>> > >> > > > Hi,
>> > >> > > > I am running in Hbase in pseudo distributed mode. ( Hadoop
>> > version -
>> > >> > > 1.1.2
>> > >> > > > , Hbase version - 0.94.7 )
>> > >> > > > I am getting few exceptions in both hadoop ( namenode ,
>> datanode)
>> > >> logs
>> > >> > > and
>> > >> > > > hbase(region server).
>> > >> > > > When i search for these exceptions on google , i concluded
>>  that
>> > >> > problem
>> > >> > > is
>> > >> > > > mainly due to large number of full GC in region server process.
>> > >> > > >
>> > >> > > > I used jstat and found that there are total of 950 full GCs in
>> > span
>> > >> of
>> > >> > 4
>> > >> > > > days for region server process.Is this ok?
>> > >> > > >
>> > >> > > > I am totally confused by number of exceptions i am getting.
>> > >> > > > Also i get below exceptions intermittently.
>> > >> > > >
>> > >> > > >
>> > >> > > > Region server:-
>> > >> > > >
>> > >> > > > 2013-10-22 12:00:26,627 WARN org.apache.hadoop.ipc.HBaseServer:
>> > >> > > > (responseTooSlow):
>> > >> > > > {"processingtimems":15312,"call":"next(-6681408251916104762,
>> > 1000),
>> > >> rpc
>> > >> > > > version=1, client version=29,
>> > >> > methodsFingerPrint=-1368823753","client":"
>> > >> > > > 192.168.20.31:48270
>> > >> > > >
>> > >> > > >
>> > >> > >
>> > >> >
>> > >>
>> >
>> ","starttimems":1382423411293,"queuetimems":0,"class":"HRegionServer","responsesize":4808556,"method":"next"}
>> > >> > > > 2013-10-22 12:06:17,606 WARN org.apache.hadoop.ipc.HBaseServer:
>> > >> > > > (operationTooSlow): {"processingtimems":14759,"client":"
>> > >> > > > 192.168.20.31:48247
>> > >> > > >
>> > >> > > >
>> > >> > >
>> > >> >
>> > >>
>> >
>> ","timeRange":[0,9223372036854775807],"starttimems":1382423762845,"responsesize":61,"class":"HRegionServer","table":"event_data","cacheBlocks":true,"families":{"gin
>>
>
>
>
> --
> Thanks and Regards,
> Vimal Jain
>



-- 
Thanks and Regards,
Vimal Jain

Re: High Full GC count for Region server

Posted by Vimal Jain <vk...@gmail.com>.
Hi,
Here is my analysis of this problem. Please correct me if I am wrong somewhere.
I have assigned 2 GB to the region server process. I think that is sufficient to
handle around 9 GB of data.
I have not changed many of the parameters, in particular the memstore flush size,
which is 128 MB by default in 0.94.7.
Also, as per my understanding, each column family has one memstore associated
with it, so my memstores are taking 128*3 = 384 MB (I have 3 column families).
So I think I should reduce the memstore size to something like 32/64 MB so that
data is flushed to disk at a higher frequency than it is currently. This should
save some memory.
Is there any other parameter, besides memstore size, that affects memory
utilization?
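
A minimal sketch of the change I have in mind for hbase-site.xml is below (the 64 MB
value is only my guess, not something I have tested yet, and the region server would
need a restart to pick it up):

  <property>
    <!-- default in 0.94.x is 134217728 (128 MB); a smaller value flushes memstores sooner -->
    <name>hbase.hregion.memstore.flush.size</name>
    <value>67108864</value>
  </property>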

Also, I am getting the below exceptions in the data node log and region server log
every day. Are they due to long GC pauses?

Data node logs :-

hadoop-hadoop-datanode-woody.log:2013-10-29 00:12:13,127 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.20.30:50010, storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075, ipcPort=50020):Got exception while serving blk_-560908881317618221_58058 to /192.168.20.30:
hadoop-hadoop-datanode-woody.log:java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010 remote=/192.168.20.30:39413]
hadoop-hadoop-datanode-woody.log:2013-10-29 00:12:13,127 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.20.30:50010, storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075, ipcPort=50020):DataXceiver
hadoop-hadoop-datanode-woody.log:java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010 remote=/192.168.20.30:39413]


Region server logs :-

hbase-hadoop-regionserver-woody.log:2013-10-29 01:01:16,475 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":15827,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@2918e464), rpc version=1, client version=29, methodsFingerPrint=-1368823753","client":"192.168.20.31:50619","starttimems":1382988660645,"queuetimems":0,"class":"HRegionServer","responsesize":0,"method":"multi"}
hbase-hadoop-regionserver-woody.log:2013-10-29 06:01:27,459 WARN org.apache.hadoop.ipc.HBaseServer: (operationTooSlow): {"processingtimems":14745,"client":"192.168.20.31:50908","timeRange":[0,9223372036854775807],"starttimems":1383006672707,"responsesize":55,"class":"HRegionServer","table":"event_data","cacheBlocks":true,"families":{"oinfo":["clubStatus"]},"row":"1752869","queuetimems":1,"method":"get","totalColumns":1,"maxVersions":1}





On Mon, Oct 28, 2013 at 11:55 PM, Asaf Mesika <as...@gmail.com> wrote:

> Check through HDFS UI that your cluster haven't reached maximum disk
> capacity
>
> On Thursday, October 24, 2013, Vimal Jain wrote:
>
> > Hi Ted/Jean,
> > Can you please help here ?
> >
> >
> > On Tue, Oct 22, 2013 at 10:29 PM, Vimal Jain <vkjk89@gmail.com>
> > wrote:
> >
> > > Hi Ted,
> > > Yes i checked namenode and datanode logs and i found below exceptions
> in
> > > both the logs:-
> > >
> > > Name node :-
> > > java.io.IOException: File
> > >
> >
> /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
> > > could only be replicated to 0 nodes, instead of 1
> > >
> > > java.io.IOException: Got blockReceived message from unregistered or
> dead
> > > node blk_-2949905629769882833_52274
> > >
> > > Data node :-
> > > 480000 millis timeout while waiting for channel to be ready for write.
> ch
> > > : java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010
> > >  remote=/192.168.20.30:36188]
> > >
> > > ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> > > DatanodeRegistration(192.168.20.30:50010,
> > > storageID=DS-1816106352-192.168.20.30-50010-1369314076237,
> > infoPort=50075,
> > > ipcPort=50020):DataXceiver
> > >
> > > java.io.EOFException: while trying to read 39309 bytes
> > >
> > >
> > > On Tue, Oct 22, 2013 at 10:19 PM, Ted Yu <yu...@gmail.com> wrote:
> > >
> > >> bq. java.io.IOException: File /hbase/event_data/
> > >> 4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
> > >> could
> > >> only be replicated to 0 nodes, instead of 1
> > >>
> > >> Have you checked Namenode / Datanode logs ?
> > >> Looks like hdfs was not stable.
> > >>
> > >>
> > >> On Tue, Oct 22, 2013 at 9:01 AM, Vimal Jain <vk...@gmail.com> wrote:
> > >>
> > >> > HI Jean,
> > >> > Thanks for your reply.
> > >> > I have total 8 GB memory and distribution is as follows:-
> > >> >
> > >> > Region server  - 2 GB
> > >> > Master,Namenode,Datanode,Secondary Namenode,Zookepeer - 1 GB
> > >> > OS - 1 GB
> > >> >
> > >> > Please let me know if you need more information.
> > >> >
> > >> >
> > >> > On Tue, Oct 22, 2013 at 8:15 PM, Jean-Marc Spaggiari <
> > >> > jean-marc@spaggiari.org> wrote:
> > >> >
> > >> > > Hi Vimal,
> > >> > >
> > >> > > What are your settings? Memory of the host, and memory allocated
> for
> > >> the
> > >> > > different HBase services?
> > >> > >
> > >> > > Thanks,
> > >> > >
> > >> > > JM
> > >> > >
> > >> > >
> > >> > > 2013/10/22 Vimal Jain <vk...@gmail.com>
> > >> > >
> > >> > > > Hi,
> > >> > > > I am running in Hbase in pseudo distributed mode. ( Hadoop
> > version -
> > >> > > 1.1.2
> > >> > > > , Hbase version - 0.94.7 )
> > >> > > > I am getting few exceptions in both hadoop ( namenode ,
> datanode)
> > >> logs
> > >> > > and
> > >> > > > hbase(region server).
> > >> > > > When i search for these exceptions on google , i concluded  that
> > >> > problem
> > >> > > is
> > >> > > > mainly due to large number of full GC in region server process.
> > >> > > >
> > >> > > > I used jstat and found that there are total of 950 full GCs in
> > span
> > >> of
> > >> > 4
> > >> > > > days for region server process.Is this ok?
> > >> > > >
> > >> > > > I am totally confused by number of exceptions i am getting.
> > >> > > > Also i get below exceptions intermittently.
> > >> > > >
> > >> > > >
> > >> > > > Region server:-
> > >> > > >
> > >> > > > 2013-10-22 12:00:26,627 WARN org.apache.hadoop.ipc.HBaseServer:
> > >> > > > (responseTooSlow):
> > >> > > > {"processingtimems":15312,"call":"next(-6681408251916104762,
> > 1000),
> > >> rpc
> > >> > > > version=1, client version=29,
> > >> > methodsFingerPrint=-1368823753","client":"
> > >> > > > 192.168.20.31:48270
> > >> > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> ","starttimems":1382423411293,"queuetimems":0,"class":"HRegionServer","responsesize":4808556,"method":"next"}
> > >> > > > 2013-10-22 12:06:17,606 WARN org.apache.hadoop.ipc.HBaseServer:
> > >> > > > (operationTooSlow): {"processingtimems":14759,"client":"
> > >> > > > 192.168.20.31:48247
> > >> > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> ","timeRange":[0,9223372036854775807],"starttimems":1382423762845,"responsesize":61,"class":"HRegionServer","table":"event_data","cacheBlocks":true,"families":{"gin
>



-- 
Thanks and Regards,
Vimal Jain

Re: High Full GC count for Region server

Posted by Asaf Mesika <as...@gmail.com>.
Check through the HDFS UI that your cluster hasn't reached maximum disk capacity.
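
If the web UI is not handy, something like the following from the command line should
show the same picture on Hadoop 1.x (just a sketch, adjust to your install):

  hadoop dfsadmin -report    # per-datanode "DFS Used%" and "DFS Remaining"
  hadoop fsck /              # also reports under-replicated and corrupt blocks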

On Thursday, October 24, 2013, Vimal Jain wrote:

> Hi Ted/Jean,
> Can you please help here ?
>
>
> On Tue, Oct 22, 2013 at 10:29 PM, Vimal Jain <vkjk89@gmail.com>
> wrote:
>
> > Hi Ted,
> > Yes i checked namenode and datanode logs and i found below exceptions in
> > both the logs:-
> >
> > Name node :-
> > java.io.IOException: File
> >
> /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
> > could only be replicated to 0 nodes, instead of 1
> >
> > java.io.IOException: Got blockReceived message from unregistered or dead
> > node blk_-2949905629769882833_52274
> >
> > Data node :-
> > 480000 millis timeout while waiting for channel to be ready for write. ch
> > : java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010
> >  remote=/192.168.20.30:36188]
> >
> > ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> > DatanodeRegistration(192.168.20.30:50010,
> > storageID=DS-1816106352-192.168.20.30-50010-1369314076237,
> infoPort=50075,
> > ipcPort=50020):DataXceiver
> >
> > java.io.EOFException: while trying to read 39309 bytes
> >
> >
> > On Tue, Oct 22, 2013 at 10:19 PM, Ted Yu <yu...@gmail.com> wrote:
> >
> >> bq. java.io.IOException: File /hbase/event_data/
> >> 4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
> >> could
> >> only be replicated to 0 nodes, instead of 1
> >>
> >> Have you checked Namenode / Datanode logs ?
> >> Looks like hdfs was not stable.
> >>
> >>
> >> On Tue, Oct 22, 2013 at 9:01 AM, Vimal Jain <vk...@gmail.com> wrote:
> >>
> >> > HI Jean,
> >> > Thanks for your reply.
> >> > I have total 8 GB memory and distribution is as follows:-
> >> >
> >> > Region server  - 2 GB
> >> > Master,Namenode,Datanode,Secondary Namenode,Zookepeer - 1 GB
> >> > OS - 1 GB
> >> >
> >> > Please let me know if you need more information.
> >> >
> >> >
> >> > On Tue, Oct 22, 2013 at 8:15 PM, Jean-Marc Spaggiari <
> >> > jean-marc@spaggiari.org> wrote:
> >> >
> >> > > Hi Vimal,
> >> > >
> >> > > What are your settings? Memory of the host, and memory allocated for
> >> the
> >> > > different HBase services?
> >> > >
> >> > > Thanks,
> >> > >
> >> > > JM
> >> > >
> >> > >
> >> > > 2013/10/22 Vimal Jain <vk...@gmail.com>
> >> > >
> >> > > > Hi,
> >> > > > I am running in Hbase in pseudo distributed mode. ( Hadoop
> version -
> >> > > 1.1.2
> >> > > > , Hbase version - 0.94.7 )
> >> > > > I am getting few exceptions in both hadoop ( namenode , datanode)
> >> logs
> >> > > and
> >> > > > hbase(region server).
> >> > > > When i search for these exceptions on google , i concluded  that
> >> > problem
> >> > > is
> >> > > > mainly due to large number of full GC in region server process.
> >> > > >
> >> > > > I used jstat and found that there are total of 950 full GCs in
> span
> >> of
> >> > 4
> >> > > > days for region server process.Is this ok?
> >> > > >
> >> > > > I am totally confused by number of exceptions i am getting.
> >> > > > Also i get below exceptions intermittently.
> >> > > >
> >> > > >
> >> > > > Region server:-
> >> > > >
> >> > > > 2013-10-22 12:00:26,627 WARN org.apache.hadoop.ipc.HBaseServer:
> >> > > > (responseTooSlow):
> >> > > > {"processingtimems":15312,"call":"next(-6681408251916104762,
> 1000),
> >> rpc
> >> > > > version=1, client version=29,
> >> > methodsFingerPrint=-1368823753","client":"
> >> > > > 192.168.20.31:48270
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> ","starttimems":1382423411293,"queuetimems":0,"class":"HRegionServer","responsesize":4808556,"method":"next"}
> >> > > > 2013-10-22 12:06:17,606 WARN org.apache.hadoop.ipc.HBaseServer:
> >> > > > (operationTooSlow): {"processingtimems":14759,"client":"
> >> > > > 192.168.20.31:48247
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> ","timeRange":[0,9223372036854775807],"starttimems":1382423762845,"responsesize":61,"class":"HRegionServer","table":"event_data","cacheBlocks":true,"families":{"gin

Re: High Full GC count for Region server

Posted by Vimal Jain <vk...@gmail.com>.
Hi Ted/Jean,
Can you please help here?


On Tue, Oct 22, 2013 at 10:29 PM, Vimal Jain <vk...@gmail.com> wrote:

> Hi Ted,
> Yes i checked namenode and datanode logs and i found below exceptions in
> both the logs:-
>
> Name node :-
> java.io.IOException: File
> /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
> could only be replicated to 0 nodes, instead of 1
>
> java.io.IOException: Got blockReceived message from unregistered or dead
> node blk_-2949905629769882833_52274
>
> Data node :-
> 480000 millis timeout while waiting for channel to be ready for write. ch
> : java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010
>  remote=/192.168.20.30:36188]
>
> ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(192.168.20.30:50010,
> storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075,
> ipcPort=50020):DataXceiver
>
> java.io.EOFException: while trying to read 39309 bytes
>
>
> On Tue, Oct 22, 2013 at 10:19 PM, Ted Yu <yu...@gmail.com> wrote:
>
>> bq. java.io.IOException: File /hbase/event_data/
>> 4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
>> could
>> only be replicated to 0 nodes, instead of 1
>>
>> Have you checked Namenode / Datanode logs ?
>> Looks like hdfs was not stable.
>>
>>
>> On Tue, Oct 22, 2013 at 9:01 AM, Vimal Jain <vk...@gmail.com> wrote:
>>
>> > HI Jean,
>> > Thanks for your reply.
>> > I have total 8 GB memory and distribution is as follows:-
>> >
>> > Region server  - 2 GB
>> > Master,Namenode,Datanode,Secondary Namenode,Zookepeer - 1 GB
>> > OS - 1 GB
>> >
>> > Please let me know if you need more information.
>> >
>> >
>> > On Tue, Oct 22, 2013 at 8:15 PM, Jean-Marc Spaggiari <
>> > jean-marc@spaggiari.org> wrote:
>> >
>> > > Hi Vimal,
>> > >
>> > > What are your settings? Memory of the host, and memory allocated for
>> the
>> > > different HBase services?
>> > >
>> > > Thanks,
>> > >
>> > > JM
>> > >
>> > >
>> > > 2013/10/22 Vimal Jain <vk...@gmail.com>
>> > >
>> > > > Hi,
>> > > > I am running in Hbase in pseudo distributed mode. ( Hadoop version -
>> > > 1.1.2
>> > > > , Hbase version - 0.94.7 )
>> > > > I am getting few exceptions in both hadoop ( namenode , datanode)
>> logs
>> > > and
>> > > > hbase(region server).
>> > > > When i search for these exceptions on google , i concluded  that
>> > problem
>> > > is
>> > > > mainly due to large number of full GC in region server process.
>> > > >
>> > > > I used jstat and found that there are total of 950 full GCs in span
>> of
>> > 4
>> > > > days for region server process.Is this ok?
>> > > >
>> > > > I am totally confused by number of exceptions i am getting.
>> > > > Also i get below exceptions intermittently.
>> > > >
>> > > >
>> > > > Region server:-
>> > > >
>> > > > 2013-10-22 12:00:26,627 WARN org.apache.hadoop.ipc.HBaseServer:
>> > > > (responseTooSlow):
>> > > > {"processingtimems":15312,"call":"next(-6681408251916104762, 1000),
>> rpc
>> > > > version=1, client version=29,
>> > methodsFingerPrint=-1368823753","client":"
>> > > > 192.168.20.31:48270
>> > > >
>> > > >
>> > >
>> >
>> ","starttimems":1382423411293,"queuetimems":0,"class":"HRegionServer","responsesize":4808556,"method":"next"}
>> > > > 2013-10-22 12:06:17,606 WARN org.apache.hadoop.ipc.HBaseServer:
>> > > > (operationTooSlow): {"processingtimems":14759,"client":"
>> > > > 192.168.20.31:48247
>> > > >
>> > > >
>> > >
>> >
>> ","timeRange":[0,9223372036854775807],"starttimems":1382423762845,"responsesize":61,"class":"HRegionServer","table":"event_data","cacheBlocks":true,"families":{"ginfo":["netGainPool"]},"row":"1629657","queuetimems":0,"method":"get","totalColumns":1,"maxVersions":1}
>> > > >
>> > > > 2013-10-18 10:37:45,008 WARN org.apache.hadoop.hdfs.DFSClient:
>> > > DataStreamer
>> > > > Exception: org.apache.hadoop.ipc.RemoteException:
>> java.io.IOException:
>> > > File
>> > > >
>> > > >
>> > >
>> >
>> /hbase/event_data/4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
>> > > > could only be replicated to 0 nodes, instead of 1
>> > > >     at
>> > > >
>> > > >
>> > >
>> >
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639)
>> > > >
>> > > > Name node :-
>> > > > java.io.IOException: File
>> > > >
>> > > >
>> > >
>> >
>> /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
>> > > > could only be replicated to 0 nodes, instead of 1
>> > > >
>> > > > java.io.IOException: Got blockReceived message from unregistered or
>> > dead
>> > > > node blk_-2949905629769882833_52274
>> > > >
>> > > > Data node :-
>> > > > 480000 millis timeout while waiting for channel to be ready for
>> write.
>> > > ch :
>> > > > java.nio.channels.SocketChannel[connected local=/
>> 192.168.20.30:50010
>> > > > remote=/
>> > > > 192.168.20.30:36188]
>> > > >
>> > > > ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
>> > > > DatanodeRegistration(
>> > > > 192.168.20.30:50010,
>> > > > storageID=DS-1816106352-192.168.20.30-50010-1369314076237,
>> > > infoPort=50075,
>> > > > ipcPort=50020):DataXceiver
>> > > > java.io.EOFException: while trying to read 39309 bytes
>> > > >
>> > > >
>> > > > --
>> > > > Thanks and Regards,
>> > > > Vimal Jain
>> > > >
>> > >
>> >
>> >
>> >
>> > --
>> > Thanks and Regards,
>> > Vimal Jain
>> >
>>
>
>
>
> --
> Thanks and Regards,
> Vimal Jain
>



-- 
Thanks and Regards,
Vimal Jain

Re: High Full GC count for Region server

Posted by Vimal Jain <vk...@gmail.com>.
Hi Ted/Jean,
Can you please help here ?


On Tue, Oct 22, 2013 at 10:29 PM, Vimal Jain <vk...@gmail.com> wrote:

> Hi Ted,
> Yes i checked namenode and datanode logs and i found below exceptions in
> both the logs:-
>
> Name node :-
> java.io.IOException: File
> /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
> could only be replicated to 0 nodes, instead of 1
>
> java.io.IOException: Got blockReceived message from unregistered or dead
> node blk_-2949905629769882833_52274
>
> Data node :-
> 480000 millis timeout while waiting for channel to be ready for write. ch
> : java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010
>  remote=/192.168.20.30:36188]
>
> ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(192.168.20.30:50010,
> storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075,
> ipcPort=50020):DataXceiver
>
> java.io.EOFException: while trying to read 39309 bytes
>
>
> On Tue, Oct 22, 2013 at 10:19 PM, Ted Yu <yu...@gmail.com> wrote:
>
>> bq. java.io.IOException: File /hbase/event_data/
>> 4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
>> could
>> only be replicated to 0 nodes, instead of 1
>>
>> Have you checked Namenode / Datanode logs ?
>> Looks like hdfs was not stable.
>>
>>
>> On Tue, Oct 22, 2013 at 9:01 AM, Vimal Jain <vk...@gmail.com> wrote:
>>
>> > HI Jean,
>> > Thanks for your reply.
>> > I have total 8 GB memory and distribution is as follows:-
>> >
>> > Region server  - 2 GB
>> > Master,Namenode,Datanode,Secondary Namenode,Zookepeer - 1 GB
>> > OS - 1 GB
>> >
>> > Please let me know if you need more information.
>> >
>> >
>> > On Tue, Oct 22, 2013 at 8:15 PM, Jean-Marc Spaggiari <
>> > jean-marc@spaggiari.org> wrote:
>> >
>> > > Hi Vimal,
>> > >
>> > > What are your settings? Memory of the host, and memory allocated for
>> the
>> > > different HBase services?
>> > >
>> > > Thanks,
>> > >
>> > > JM
>> > >
>> > >
>> > > 2013/10/22 Vimal Jain <vk...@gmail.com>
>> > >
>> > > > Hi,
>> > > > I am running in Hbase in pseudo distributed mode. ( Hadoop version -
>> > > 1.1.2
>> > > > , Hbase version - 0.94.7 )
>> > > > I am getting few exceptions in both hadoop ( namenode , datanode)
>> logs
>> > > and
>> > > > hbase(region server).
>> > > > When i search for these exceptions on google , i concluded  that
>> > problem
>> > > is
>> > > > mainly due to large number of full GC in region server process.
>> > > >
>> > > > I used jstat and found that there are total of 950 full GCs in span
>> of
>> > 4
>> > > > days for region server process.Is this ok?
>> > > >
>> > > > I am totally confused by number of exceptions i am getting.
>> > > > Also i get below exceptions intermittently.
>> > > >
>> > > >
>> > > > Region server:-
>> > > >
>> > > > 2013-10-22 12:00:26,627 WARN org.apache.hadoop.ipc.HBaseServer:
>> > > > (responseTooSlow):
>> > > > {"processingtimems":15312,"call":"next(-6681408251916104762, 1000),
>> rpc
>> > > > version=1, client version=29,
>> > methodsFingerPrint=-1368823753","client":"
>> > > > 192.168.20.31:48270
>> > > >
>> > > >
>> > >
>> >
>> ","starttimems":1382423411293,"queuetimems":0,"class":"HRegionServer","responsesize":4808556,"method":"next"}
>> > > > 2013-10-22 12:06:17,606 WARN org.apache.hadoop.ipc.HBaseServer:
>> > > > (operationTooSlow): {"processingtimems":14759,"client":"
>> > > > 192.168.20.31:48247
>> > > >
>> > > >
>> > >
>> >
>> ","timeRange":[0,9223372036854775807],"starttimems":1382423762845,"responsesize":61,"class":"HRegionServer","table":"event_data","cacheBlocks":true,"families":{"ginfo":["netGainPool"]},"row":"1629657","queuetimems":0,"method":"get","totalColumns":1,"maxVersions":1}
>> > > >
>> > > > 2013-10-18 10:37:45,008 WARN org.apache.hadoop.hdfs.DFSClient:
>> > > DataStreamer
>> > > > Exception: org.apache.hadoop.ipc.RemoteException:
>> java.io.IOException:
>> > > File
>> > > >
>> > > >
>> > >
>> >
>> /hbase/event_data/4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
>> > > > could only be replicated to 0 nodes, instead of 1
>> > > >     at
>> > > >
>> > > >
>> > >
>> >
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639)
>> > > >
>> > > > Name node :-
>> > > > java.io.IOException: File
>> > > >
>> > > >
>> > >
>> >
>> /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
>> > > > could only be replicated to 0 nodes, instead of 1
>> > > >
>> > > > java.io.IOException: Got blockReceived message from unregistered or
>> > dead
>> > > > node blk_-2949905629769882833_52274
>> > > >
>> > > > Data node :-
>> > > > 480000 millis timeout while waiting for channel to be ready for
>> write.
>> > > ch :
>> > > > java.nio.channels.SocketChannel[connected local=/
>> 192.168.20.30:50010
>> > > > remote=/
>> > > > 192.168.20.30:36188]
>> > > >
>> > > > ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
>> > > > DatanodeRegistration(
>> > > > 192.168.20.30:50010,
>> > > > storageID=DS-1816106352-192.168.20.30-50010-1369314076237,
>> > > infoPort=50075,
>> > > > ipcPort=50020):DataXceiver
>> > > > java.io.EOFException: while trying to read 39309 bytes
>> > > >
>> > > >
>> > > > --
>> > > > Thanks and Regards,
>> > > > Vimal Jain
>> > > >
>> > >
>> >
>> >
>> >
>> > --
>> > Thanks and Regards,
>> > Vimal Jain
>> >
>>
>
>
>
> --
> Thanks and Regards,
> Vimal Jain
>



-- 
Thanks and Regards,
Vimal Jain

Re: High Full GC count for Region server

Posted by Vimal Jain <vk...@gmail.com>.
Hi Ted/Jean,
Can you please help here ?


On Tue, Oct 22, 2013 at 10:29 PM, Vimal Jain <vk...@gmail.com> wrote:

> Hi Ted,
> Yes i checked namenode and datanode logs and i found below exceptions in
> both the logs:-
>
> Name node :-
> java.io.IOException: File
> /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
> could only be replicated to 0 nodes, instead of 1
>
> java.io.IOException: Got blockReceived message from unregistered or dead
> node blk_-2949905629769882833_52274
>
> Data node :-
> 480000 millis timeout while waiting for channel to be ready for write. ch
> : java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010
>  remote=/192.168.20.30:36188]
>
> ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(192.168.20.30:50010,
> storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075,
> ipcPort=50020):DataXceiver
>
> java.io.EOFException: while trying to read 39309 bytes
>
>
> On Tue, Oct 22, 2013 at 10:19 PM, Ted Yu <yu...@gmail.com> wrote:
>
>> bq. java.io.IOException: File /hbase/event_data/
>> 4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
>> could
>> only be replicated to 0 nodes, instead of 1
>>
>> Have you checked Namenode / Datanode logs ?
>> Looks like hdfs was not stable.
>>
>>
>> On Tue, Oct 22, 2013 at 9:01 AM, Vimal Jain <vk...@gmail.com> wrote:
>>
>> > HI Jean,
>> > Thanks for your reply.
>> > I have total 8 GB memory and distribution is as follows:-
>> >
>> > Region server  - 2 GB
>> > Master,Namenode,Datanode,Secondary Namenode,Zookepeer - 1 GB
>> > OS - 1 GB
>> >
>> > Please let me know if you need more information.
>> >
>> >
>> > On Tue, Oct 22, 2013 at 8:15 PM, Jean-Marc Spaggiari <
>> > jean-marc@spaggiari.org> wrote:
>> >
>> > > Hi Vimal,
>> > >
>> > > What are your settings? Memory of the host, and memory allocated for
>> the
>> > > different HBase services?
>> > >
>> > > Thanks,
>> > >
>> > > JM
>> > >
>> > >
>> > > 2013/10/22 Vimal Jain <vk...@gmail.com>
>> > >
>> > > > Hi,
>> > > > I am running in Hbase in pseudo distributed mode. ( Hadoop version -
>> > > 1.1.2
>> > > > , Hbase version - 0.94.7 )
>> > > > I am getting few exceptions in both hadoop ( namenode , datanode)
>> logs
>> > > and
>> > > > hbase(region server).
>> > > > When i search for these exceptions on google , i concluded  that
>> > problem
>> > > is
>> > > > mainly due to large number of full GC in region server process.
>> > > >
>> > > > I used jstat and found that there are total of 950 full GCs in span
>> of
>> > 4
>> > > > days for region server process.Is this ok?
>> > > >
>> > > > I am totally confused by number of exceptions i am getting.
>> > > > Also i get below exceptions intermittently.
>> > > >
>> > > >
>> > > > Region server:-
>> > > >
>> > > > 2013-10-22 12:00:26,627 WARN org.apache.hadoop.ipc.HBaseServer:
>> > > > (responseTooSlow):
>> > > > {"processingtimems":15312,"call":"next(-6681408251916104762, 1000),
>> rpc
>> > > > version=1, client version=29,
>> > methodsFingerPrint=-1368823753","client":"
>> > > > 192.168.20.31:48270
>> > > >
>> > > >
>> > >
>> >
>> ","starttimems":1382423411293,"queuetimems":0,"class":"HRegionServer","responsesize":4808556,"method":"next"}
>> > > > 2013-10-22 12:06:17,606 WARN org.apache.hadoop.ipc.HBaseServer:
>> > > > (operationTooSlow): {"processingtimems":14759,"client":"
>> > > > 192.168.20.31:48247
>> > > >
>> > > >
>> > >
>> >
>> ","timeRange":[0,9223372036854775807],"starttimems":1382423762845,"responsesize":61,"class":"HRegionServer","table":"event_data","cacheBlocks":true,"families":{"ginfo":["netGainPool"]},"row":"1629657","queuetimems":0,"method":"get","totalColumns":1,"maxVersions":1}
>> > > >
>> > > > 2013-10-18 10:37:45,008 WARN org.apache.hadoop.hdfs.DFSClient:
>> > > DataStreamer
>> > > > Exception: org.apache.hadoop.ipc.RemoteException:
>> java.io.IOException:
>> > > File
>> > > >
>> > > >
>> > >
>> >
>> /hbase/event_data/4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
>> > > > could only be replicated to 0 nodes, instead of 1
>> > > >     at
>> > > >
>> > > >
>> > >
>> >
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639)
>> > > >
>> > > > Name node :-
>> > > > java.io.IOException: File
>> > > >
>> > > >
>> > >
>> >
>> /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
>> > > > could only be replicated to 0 nodes, instead of 1
>> > > >
>> > > > java.io.IOException: Got blockReceived message from unregistered or
>> > dead
>> > > > node blk_-2949905629769882833_52274
>> > > >
>> > > > Data node :-
>> > > > 480000 millis timeout while waiting for channel to be ready for
>> write.
>> > > ch :
>> > > > java.nio.channels.SocketChannel[connected local=/
>> 192.168.20.30:50010
>> > > > remote=/
>> > > > 192.168.20.30:36188]
>> > > >
>> > > > ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
>> > > > DatanodeRegistration(
>> > > > 192.168.20.30:50010,
>> > > > storageID=DS-1816106352-192.168.20.30-50010-1369314076237,
>> > > infoPort=50075,
>> > > > ipcPort=50020):DataXceiver
>> > > > java.io.EOFException: while trying to read 39309 bytes
>> > > >
>> > > >
>> > > > --
>> > > > Thanks and Regards,
>> > > > Vimal Jain
>> > > >
>> > >
>> >
>> >
>> >
>> > --
>> > Thanks and Regards,
>> > Vimal Jain
>> >
>>
>
>
>
> --
> Thanks and Regards,
> Vimal Jain
>



-- 
Thanks and Regards,
Vimal Jain

Re: High Full GC count for Region server

Posted by Vimal Jain <vk...@gmail.com>.
Hi Ted,
Yes, I checked the namenode and datanode logs and found the exceptions below in
both of them:

Name node :-
java.io.IOException: File
/hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
could only be replicated to 0 nodes, instead of 1

java.io.IOException: Got blockReceived message from unregistered or dead
node blk_-2949905629769882833_52274

Data node :-
480000 millis timeout while waiting for channel to be ready for write. ch :
java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010
 remote=/192.168.20.30:36188]

ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
192.168.20.30:50010,
storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075,
ipcPort=50020):DataXceiver
java.io.EOFException: while trying to read 39309 bytes


On Tue, Oct 22, 2013 at 10:19 PM, Ted Yu <yu...@gmail.com> wrote:

> bq. java.io.IOException: File /hbase/event_data/
> 4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
> could
> only be replicated to 0 nodes, instead of 1
>
> Have you checked Namenode / Datanode logs ?
> Looks like hdfs was not stable.
>
>
> On Tue, Oct 22, 2013 at 9:01 AM, Vimal Jain <vk...@gmail.com> wrote:
>
> > HI Jean,
> > Thanks for your reply.
> > I have total 8 GB memory and distribution is as follows:-
> >
> > Region server  - 2 GB
> > Master,Namenode,Datanode,Secondary Namenode,Zookepeer - 1 GB
> > OS - 1 GB
> >
> > Please let me know if you need more information.
> >
> >
> > On Tue, Oct 22, 2013 at 8:15 PM, Jean-Marc Spaggiari <
> > jean-marc@spaggiari.org> wrote:
> >
> > > Hi Vimal,
> > >
> > > What are your settings? Memory of the host, and memory allocated for
> the
> > > different HBase services?
> > >
> > > Thanks,
> > >
> > > JM
> > >
> > >
> > > 2013/10/22 Vimal Jain <vk...@gmail.com>
> > >
> > > > Hi,
> > > > I am running in Hbase in pseudo distributed mode. ( Hadoop version -
> > > 1.1.2
> > > > , Hbase version - 0.94.7 )
> > > > I am getting few exceptions in both hadoop ( namenode , datanode)
> logs
> > > and
> > > > hbase(region server).
> > > > When i search for these exceptions on google , i concluded  that
> > problem
> > > is
> > > > mainly due to large number of full GC in region server process.
> > > >
> > > > I used jstat and found that there are total of 950 full GCs in span
> of
> > 4
> > > > days for region server process.Is this ok?
> > > >
> > > > I am totally confused by number of exceptions i am getting.
> > > > Also i get below exceptions intermittently.
> > > >
> > > >
> > > > Region server:-
> > > >
> > > > 2013-10-22 12:00:26,627 WARN org.apache.hadoop.ipc.HBaseServer:
> > > > (responseTooSlow):
> > > > {"processingtimems":15312,"call":"next(-6681408251916104762, 1000),
> rpc
> > > > version=1, client version=29,
> > methodsFingerPrint=-1368823753","client":"
> > > > 192.168.20.31:48270
> > > >
> > > >
> > >
> >
> ","starttimems":1382423411293,"queuetimems":0,"class":"HRegionServer","responsesize":4808556,"method":"next"}
> > > > 2013-10-22 12:06:17,606 WARN org.apache.hadoop.ipc.HBaseServer:
> > > > (operationTooSlow): {"processingtimems":14759,"client":"
> > > > 192.168.20.31:48247
> > > >
> > > >
> > >
> >
> ","timeRange":[0,9223372036854775807],"starttimems":1382423762845,"responsesize":61,"class":"HRegionServer","table":"event_data","cacheBlocks":true,"families":{"ginfo":["netGainPool"]},"row":"1629657","queuetimems":0,"method":"get","totalColumns":1,"maxVersions":1}
> > > >
> > > > 2013-10-18 10:37:45,008 WARN org.apache.hadoop.hdfs.DFSClient:
> > > DataStreamer
> > > > Exception: org.apache.hadoop.ipc.RemoteException:
> java.io.IOException:
> > > File
> > > >
> > > >
> > >
> >
> /hbase/event_data/4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
> > > > could only be replicated to 0 nodes, instead of 1
> > > >     at
> > > >
> > > >
> > >
> >
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639)
> > > >
> > > > Name node :-
> > > > java.io.IOException: File
> > > >
> > > >
> > >
> >
> /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
> > > > could only be replicated to 0 nodes, instead of 1
> > > >
> > > > java.io.IOException: Got blockReceived message from unregistered or
> > dead
> > > > node blk_-2949905629769882833_52274
> > > >
> > > > Data node :-
> > > > 480000 millis timeout while waiting for channel to be ready for
> write.
> > > ch :
> > > > java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010
> > > > remote=/
> > > > 192.168.20.30:36188]
> > > >
> > > > ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> > > > DatanodeRegistration(
> > > > 192.168.20.30:50010,
> > > > storageID=DS-1816106352-192.168.20.30-50010-1369314076237,
> > > infoPort=50075,
> > > > ipcPort=50020):DataXceiver
> > > > java.io.EOFException: while trying to read 39309 bytes
> > > >
> > > >
> > > > --
> > > > Thanks and Regards,
> > > > Vimal Jain
> > > >
> > >
> >
> >
> >
> > --
> > Thanks and Regards,
> > Vimal Jain
> >
>



-- 
Thanks and Regards,
Vimal Jain
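
The namenode error above ("could only be replicated to 0 nodes, instead of 1")
means the namenode could not find any live datanode to accept the block, so the
single datanode was either down, unregistered, or out of space at that moment.
A quick way to check this on a Hadoop 1.x single-node setup (standard commands,
not output captured from this cluster):

    jps                            # confirm the NameNode and DataNode processes are up
    hadoop dfsadmin -report        # live/dead datanodes and remaining capacity
    hadoop fsck / -files -blocks   # filesystem health; best run while HBase is stopped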

Re: High Full GC count for Region server

Posted by Ted Yu <yu...@gmail.com>.
bq. java.io.IOException: File /hbase/event_data/
4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0 could
only be replicated to 0 nodes, instead of 1

Have you checked the Namenode / Datanode logs?
It looks like HDFS was not stable.


On Tue, Oct 22, 2013 at 9:01 AM, Vimal Jain <vk...@gmail.com> wrote:

> HI Jean,
> Thanks for your reply.
> I have total 8 GB memory and distribution is as follows:-
>
> Region server  - 2 GB
> Master,Namenode,Datanode,Secondary Namenode,Zookepeer - 1 GB
> OS - 1 GB
>
> Please let me know if you need more information.
>
>
> On Tue, Oct 22, 2013 at 8:15 PM, Jean-Marc Spaggiari <
> jean-marc@spaggiari.org> wrote:
>
> > Hi Vimal,
> >
> > What are your settings? Memory of the host, and memory allocated for the
> > different HBase services?
> >
> > Thanks,
> >
> > JM
> >
> >
> > 2013/10/22 Vimal Jain <vk...@gmail.com>
> >
> > > Hi,
> > > I am running in Hbase in pseudo distributed mode. ( Hadoop version -
> > 1.1.2
> > > , Hbase version - 0.94.7 )
> > > I am getting few exceptions in both hadoop ( namenode , datanode) logs
> > and
> > > hbase(region server).
> > > When i search for these exceptions on google , i concluded  that
> problem
> > is
> > > mainly due to large number of full GC in region server process.
> > >
> > > I used jstat and found that there are total of 950 full GCs in span of
> 4
> > > days for region server process.Is this ok?
> > >
> > > I am totally confused by number of exceptions i am getting.
> > > Also i get below exceptions intermittently.
> > >
> > >
> > > Region server:-
> > >
> > > 2013-10-22 12:00:26,627 WARN org.apache.hadoop.ipc.HBaseServer:
> > > (responseTooSlow):
> > > {"processingtimems":15312,"call":"next(-6681408251916104762, 1000), rpc
> > > version=1, client version=29,
> methodsFingerPrint=-1368823753","client":"
> > > 192.168.20.31:48270
> > >
> > >
> >
> ","starttimems":1382423411293,"queuetimems":0,"class":"HRegionServer","responsesize":4808556,"method":"next"}
> > > 2013-10-22 12:06:17,606 WARN org.apache.hadoop.ipc.HBaseServer:
> > > (operationTooSlow): {"processingtimems":14759,"client":"
> > > 192.168.20.31:48247
> > >
> > >
> >
> ","timeRange":[0,9223372036854775807],"starttimems":1382423762845,"responsesize":61,"class":"HRegionServer","table":"event_data","cacheBlocks":true,"families":{"ginfo":["netGainPool"]},"row":"1629657","queuetimems":0,"method":"get","totalColumns":1,"maxVersions":1}
> > >
> > > 2013-10-18 10:37:45,008 WARN org.apache.hadoop.hdfs.DFSClient:
> > DataStreamer
> > > Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException:
> > File
> > >
> > >
> >
> /hbase/event_data/4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
> > > could only be replicated to 0 nodes, instead of 1
> > >     at
> > >
> > >
> >
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639)
> > >
> > > Name node :-
> > > java.io.IOException: File
> > >
> > >
> >
> /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
> > > could only be replicated to 0 nodes, instead of 1
> > >
> > > java.io.IOException: Got blockReceived message from unregistered or
> dead
> > > node blk_-2949905629769882833_52274
> > >
> > > Data node :-
> > > 480000 millis timeout while waiting for channel to be ready for write.
> > ch :
> > > java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010
> > > remote=/
> > > 192.168.20.30:36188]
> > >
> > > ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> > > DatanodeRegistration(
> > > 192.168.20.30:50010,
> > > storageID=DS-1816106352-192.168.20.30-50010-1369314076237,
> > infoPort=50075,
> > > ipcPort=50020):DataXceiver
> > > java.io.EOFException: while trying to read 39309 bytes
> > >
> > >
> > > --
> > > Thanks and Regards,
> > > Vimal Jain
> > >
> >
>
>
>
> --
> Thanks and Regards,
> Vimal Jain
>
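
One note on the datanode message quoted in this thread: the "480000 millis
timeout while waiting for channel to be ready for write" matches the default
HDFS write timeout (dfs.datanode.socket.write.timeout, 480000 ms), i.e. the
datanode waited about eight minutes for a stalled peer. That is consistent with
a client paused in long GCs rather than a network fault. The property lives in
hdfs-site.xml; raising it only masks the symptom (the value below is purely
illustrative):

    <property>
      <name>dfs.datanode.socket.write.timeout</name>
      <value>600000</value>
    </property>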

Re: High Full GC count for Region server

Posted by Vimal Jain <vk...@gmail.com>.
Hi Jean,
Thanks for your reply.
I have 8 GB of memory in total, and the distribution is as follows:

Region server - 2 GB
Master, Namenode, Datanode, Secondary Namenode, Zookeeper - 1 GB
OS - 1 GB

Please let me know if you need more information.


On Tue, Oct 22, 2013 at 8:15 PM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> Hi Vimal,
>
> What are your settings? Memory of the host, and memory allocated for the
> different HBase services?
>
> Thanks,
>
> JM
>
>
> 2013/10/22 Vimal Jain <vk...@gmail.com>
>
> > Hi,
> > I am running in Hbase in pseudo distributed mode. ( Hadoop version -
> 1.1.2
> > , Hbase version - 0.94.7 )
> > I am getting few exceptions in both hadoop ( namenode , datanode) logs
> and
> > hbase(region server).
> > When i search for these exceptions on google , i concluded  that problem
> is
> > mainly due to large number of full GC in region server process.
> >
> > I used jstat and found that there are total of 950 full GCs in span of 4
> > days for region server process.Is this ok?
> >
> > I am totally confused by number of exceptions i am getting.
> > Also i get below exceptions intermittently.
> >
> >
> > Region server:-
> >
> > 2013-10-22 12:00:26,627 WARN org.apache.hadoop.ipc.HBaseServer:
> > (responseTooSlow):
> > {"processingtimems":15312,"call":"next(-6681408251916104762, 1000), rpc
> > version=1, client version=29, methodsFingerPrint=-1368823753","client":"
> > 192.168.20.31:48270
> >
> >
> ","starttimems":1382423411293,"queuetimems":0,"class":"HRegionServer","responsesize":4808556,"method":"next"}
> > 2013-10-22 12:06:17,606 WARN org.apache.hadoop.ipc.HBaseServer:
> > (operationTooSlow): {"processingtimems":14759,"client":"
> > 192.168.20.31:48247
> >
> >
> ","timeRange":[0,9223372036854775807],"starttimems":1382423762845,"responsesize":61,"class":"HRegionServer","table":"event_data","cacheBlocks":true,"families":{"ginfo":["netGainPool"]},"row":"1629657","queuetimems":0,"method":"get","totalColumns":1,"maxVersions":1}
> >
> > 2013-10-18 10:37:45,008 WARN org.apache.hadoop.hdfs.DFSClient:
> DataStreamer
> > Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException:
> File
> >
> >
> /hbase/event_data/4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
> > could only be replicated to 0 nodes, instead of 1
> >     at
> >
> >
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639)
> >
> > Name node :-
> > java.io.IOException: File
> >
> >
> /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
> > could only be replicated to 0 nodes, instead of 1
> >
> > java.io.IOException: Got blockReceived message from unregistered or dead
> > node blk_-2949905629769882833_52274
> >
> > Data node :-
> > 480000 millis timeout while waiting for channel to be ready for write.
> ch :
> > java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010
> > remote=/
> > 192.168.20.30:36188]
> >
> > ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> > DatanodeRegistration(
> > 192.168.20.30:50010,
> > storageID=DS-1816106352-192.168.20.30-50010-1369314076237,
> infoPort=50075,
> > ipcPort=50020):DataXceiver
> > java.io.EOFException: while trying to read 39309 bytes
> >
> >
> > --
> > Thanks and Regards,
> > Vimal Jain
> >
>



-- 
Thanks and Regards,
Vimal Jain
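
For the 2 GB region server figure above, the heap and GC options would normally
be set in conf/hbase-env.sh. A minimal sketch for HBase 0.94 (the flags and log
path are assumptions, not the poster's actual configuration); CMS plus GC
logging makes it much easier to see whether full GC pauses line up with the
responseTooSlow warnings:

    # conf/hbase-env.sh (sketch; adjust values and paths for the actual install)
    export HBASE_REGIONSERVER_OPTS="-Xms2g -Xmx2g \
      -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 \
      -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
      -Xloggc:/path/to/logs/regionserver-gc.log"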

Re: High Full GC count for Region server

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Vimal,

What are your settings? How much memory does the host have, and how much is
allocated to the different HBase services?

Thanks,

JM


2013/10/22 Vimal Jain <vk...@gmail.com>

> Hi,
> I am running in Hbase in pseudo distributed mode. ( Hadoop version - 1.1.2
> , Hbase version - 0.94.7 )
> I am getting few exceptions in both hadoop ( namenode , datanode) logs and
> hbase(region server).
> When i search for these exceptions on google , i concluded  that problem is
> mainly due to large number of full GC in region server process.
>
> I used jstat and found that there are total of 950 full GCs in span of 4
> days for region server process.Is this ok?
>
> I am totally confused by number of exceptions i am getting.
> Also i get below exceptions intermittently.
>
>
> Region server:-
>
> 2013-10-22 12:00:26,627 WARN org.apache.hadoop.ipc.HBaseServer:
> (responseTooSlow):
> {"processingtimems":15312,"call":"next(-6681408251916104762, 1000), rpc
> version=1, client version=29, methodsFingerPrint=-1368823753","client":"
> 192.168.20.31:48270
>
> ","starttimems":1382423411293,"queuetimems":0,"class":"HRegionServer","responsesize":4808556,"method":"next"}
> 2013-10-22 12:06:17,606 WARN org.apache.hadoop.ipc.HBaseServer:
> (operationTooSlow): {"processingtimems":14759,"client":"
> 192.168.20.31:48247
>
> ","timeRange":[0,9223372036854775807],"starttimems":1382423762845,"responsesize":61,"class":"HRegionServer","table":"event_data","cacheBlocks":true,"families":{"ginfo":["netGainPool"]},"row":"1629657","queuetimems":0,"method":"get","totalColumns":1,"maxVersions":1}
>
> 2013-10-18 10:37:45,008 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer
> Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
>
> /hbase/event_data/4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
> could only be replicated to 0 nodes, instead of 1
>     at
>
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639)
>
> Name node :-
> java.io.IOException: File
>
> /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
> could only be replicated to 0 nodes, instead of 1
>
> java.io.IOException: Got blockReceived message from unregistered or dead
> node blk_-2949905629769882833_52274
>
> Data node :-
> 480000 millis timeout while waiting for channel to be ready for write. ch :
> java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010
> remote=/
> 192.168.20.30:36188]
>
> ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(
> 192.168.20.30:50010,
> storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075,
> ipcPort=50020):DataXceiver
> java.io.EOFException: while trying to read 39309 bytes
>
>
> --
> Thanks and Regards,
> Vimal Jain
>
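
On the question of memory allocated to the different services: in a
pseudo-distributed setup each Hadoop daemon (NameNode, SecondaryNameNode,
DataNode) takes its heap from HADOOP_HEAPSIZE in conf/hadoop-env.sh, and each
HBase daemon from HBASE_HEAPSIZE in conf/hbase-env.sh unless overridden per
process. A short sketch of where those knobs live (values are illustrative
only, not a recommendation for this host):

    # conf/hadoop-env.sh
    export HADOOP_HEAPSIZE=1000    # MB, applied to each Hadoop daemon

    # conf/hbase-env.sh
    export HBASE_HEAPSIZE=2000     # MB, applied to each HBase daemon unless
                                   # overridden (e.g. via HBASE_REGIONSERVER_OPTS)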