
Posted to user@hbase.apache.org by Jean-Adrien <ad...@jeanjean.ch> on 2008/10/16 16:32:44 UTC

Regionserver sleeps too much

Hello again.

My second question concerns one of my region servers (usually the same one),
which often shuts down because it misses the window to send its heartbeat to
the master. Maybe it is overloaded, but it misses the window by about 6 min.

I switched the log to debug level, but I haven't found anything more
interesting.
The last action is a compaction, but it ends normally. Maybe it is followed
by a heavy Hadoop task?
Or maybe it is linked to the fact that there is only 1 GB of free disk space?
That is the only difference I notice between this node and the others. Note
that its hostname is the first in the regionservers list. Does this position
increase the amount of work (e.g. is the META table always loaded here)?
By the way, on a computer that has only 1 GB of RAM, should I decrease the
maximum JVM heap allowed for the Hadoop datanode and the HBase
regionserver (the default is 1 GB for each, I think) to avoid endless swapping?
Nothing in JIRA seems to match my problem. Any other ideas?

--- region server log ---
2008-10-16 15:18:45,812 DEBUG
org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction
requested for region: table-0.3,PLQ80+70101200
:key/miss;j1DB44040DD81BA02D4E0E9A0D8698DA9
2008-10-16 15:18:45,812 INFO org.apache.hadoop.hbase.regionserver.HRegion:
starting compaction on region table-0.3,PLQ80+70101200
:key/miss;j1DB44040DD81BA02D4E0E9A0D8698DA9,1224096999059
2008-10-16 15:18:45,820 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
Skipped compaction of 1 file; compaction size of 1082805005/header: 4
83.5k; Skipped 3 files, size: 488461
2008-10-16 15:18:45,826 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
Skipped compaction of 1 file; compaction size of 1082805005/bytes: 13
6.5m; Skipped 2 files, size: 141612841
2008-10-16 15:18:45,833 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
Skipped compaction of 1 file; compaction size of 1082805005/info: 1.1
m; Skipped 3 files, size: 1109592
2008-10-16 15:18:45,833 INFO org.apache.hadoop.hbase.regionserver.HRegion:
compaction completed on region table-0.3,PLQ80+70101200
:key/miss;j1DB44040DD81BA02D4E0E9A0D8698DA9,1224096999059
 in 0sec
2008-10-16 15:24:32,656 WARN org.apache.hadoop.hbase.util.Sleeper: We slept
265463ms, ten times longer than scheduled: 3000
2008-10-16 15:24:32,656 FATAL
org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report to
master for 265463 milliseconds - aborting
server
2008-10-16 15:24:32,656 DEBUG org.apache.hadoop.hbase.RegionHistorian:
Offlined
2008-10-16 15:24:32,657 INFO org.apache.hadoop.ipc.Server: Stopping server
on 60020
2
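
As a quick check of the figures in the log above, the following Python sketch recomputes the silent gap between the last compaction message and the Sleeper warning, and how far the sleep overshot its schedule. The timestamps and millisecond values are copied from the log; the function names are made up for this illustration.

```python
from datetime import datetime

# Values copied from the region server log above.
COMPACTION_DONE = "2008-10-16 15:18:45,833"
SLEEPER_WARNING = "2008-10-16 15:24:32,656"
SLEPT_MS = 265463
SCHEDULED_MS = 3000


def parse_ts(text):
    """Parse a timestamp such as '2008-10-16 15:18:45,833'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S,%f")


def silent_gap_seconds(before, after):
    """Seconds elapsed between two log timestamps."""
    return (parse_ts(after) - parse_ts(before)).total_seconds()


def overshoot_ratio(slept_ms, scheduled_ms):
    """How many times longer than scheduled the thread slept."""
    return slept_ms / scheduled_ms


gap = silent_gap_seconds(COMPACTION_DONE, SLEEPER_WARNING)
ratio = overshoot_ratio(SLEPT_MS, SCHEDULED_MS)
print(f"silent gap: {gap:.1f} s, sleep overshoot: {ratio:.1f}x")
```

With the values from the log, the gap is about 347 seconds (roughly the 6 minutes mentioned above), and the 265463 ms sleep is about 88 times the scheduled 3000 ms.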

Again, thank you for your advice.

-- Jean-Adrien

Cluster setup:
Ubuntu Linux
4 regionservers / datanodes
1 is master / namenode as well.
java-6-sun
Total size of hdfs: 81.98 GB (replication factor 3)
fsck -> healthy
hadoop: 0.18.1
hbase: 0.18.0 (jar of hadoop replaced with 0.18.1)
1 GB RAM per node
-- 
View this message in context: http://www.nabble.com/Regionserver-sleeps-too-much-tp20014722p20014722.html
Sent from the HBase User mailing list archive at Nabble.com.

RE: Regionserver sleeps too much

Posted by Jonathan Gray <jl...@streamy.com>.

With only 1GB of memory, you definitely want to reduce the heap size given
to the JVMs running regionservers and datanodes.  For hbase, you can change
this in conf/hbase-env.sh with export HBASE_HEAPSIZE=SIZE_IN_MB.  I'd say you
want to configure everything so you won't be out of memory when every java
process is maxed out.  Also, 1GB is on the low end of the spectrum when you
want to run both an RS and DataNode on one machine.  At least 2GB is
preferred.
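
To make the "won't be out of memory when every java process is maxed out" rule concrete, here is a rough Python sketch of the arithmetic. Every number in it is an assumption for illustration: the 1000 MB default heaps come from the guess in the question, and the OS reserve and per-JVM overhead are placeholders to adjust for the actual machine. The function names are hypothetical.

```python
def heap_budget_mb(total_ram_mb, os_reserve_mb, jvm_overhead_mb, processes):
    """Split the RAM left after the OS reserve evenly between the JVMs.

    jvm_overhead_mb is a per-process allowance for non-heap memory
    (thread stacks, native buffers); it is an assumed figure.
    """
    usable = total_ram_mb - os_reserve_mb - jvm_overhead_mb * len(processes)
    if usable <= 0:
        raise ValueError("not enough RAM for the requested processes")
    per_heap = usable // len(processes)
    return {name: per_heap for name in processes}


def overcommit_mb(total_ram_mb, heaps_mb):
    """How far the summed maximum heaps exceed physical RAM (0 if they fit)."""
    return max(0, sum(heaps_mb) - total_ram_mb)


# Two JVMs at an assumed 1000 MB default heap each, on a 1 GB node.
print(overcommit_mb(1024, [1000, 1000]))

# Assumed 200 MB for the OS and 64 MB of non-heap overhead per JVM.
print(heap_budget_mb(1024, 200, 64, ["datanode", "regionserver"]))
```

Under these assumptions the two default heaps could ask for about 976 MB more than the machine has, and an even split leaves roughly 348 MB per process. The regionserver figure is what would go into HBASE_HEAPSIZE. The node that also runs the master and namenode has four JVMs to fit, so its share per process is smaller still.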

Can you confirm if the machine is swapping during this time?
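
One way to check for swapping on a Linux node is to sample the cumulative pswpin and pswpout counters in /proc/vmstat twice and compare them. The Python sketch below assumes a Linux kernel that exposes those two counters; the function names and the 5 second interval are arbitrary choices for illustration.

```python
import time


def read_swap_counters(vmstat_text):
    """Extract the cumulative swap-in/swap-out page counts from vmstat text."""
    counters = {"pswpin": 0, "pswpout": 0}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in counters:
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before, after):
    """Pages swapped in and out between two samples."""
    return {key: after[key] - before[key] for key in ("pswpin", "pswpout")}


def watch(interval_s=5.0, path="/proc/vmstat"):
    """Take two samples interval_s seconds apart and return the difference."""
    with open(path) as f:
        first = read_swap_counters(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        second = read_swap_counters(f.read())
    return swap_delta(first, second)
```

Calling watch() on the affected node while the regionserver is under load returns the number of pages swapped in and out during the interval. Values that stay well above zero across repeated samples would suggest the machine is swapping.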

Java/HBase will try to make use of all the heap it's been given, so it seems
likely that this is the issue.

What are the other hardware specs on this machine?  Number of cores, for
example?  Also, I'd be very careful with only leaving 1GB of space free on a
machine... Are you still writing to this instance when that is the free
space on this machine?

JG

