You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hbase.apache.org by Ey-Chih chow <ey...@gmail.com> on 2012/07/23 21:39:01 UTC

hbase threw NotServingRegionException

Hi,

We got a Map/Reduce job that threw NotServingRegionException when the reducer was about to insert data into a Hbase table. The error message is as follows.  I also copied the corresponding region server log at the end of the message.  Also, we browsed through the hbase administrative page of the table. We couldn't see the list of Table Regions (the table is pre-splitted.) Is there anybody who knows what's happening?  Thanks.

Ey-Chih Chow

========= log from the map/reduce job =========================

12:59:06 [dba@dba@h01 1-exec][INFO] 12/07/22 00:50:35 INFO mapred.JobClient: Task Id : attempt_201206142240_19696_r_000005_0, Status : FAILED 
12:59:06 [dba@dba@h01 1-exec][INFO] org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 110 actions: NotServingRegionException: 110 times, servers with issues: h07.mtv.byah.net:60020, 
12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1227) 
12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1241) 
12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:826) 
12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:682) 
12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HTable.put(HTable.java:667) 
12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127) 
12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82) 
12:59:06 [dba@dba@h01 1-exec][INFO] at com.booyah.analytics.mapreduce.AvroHbaseTableOutputFormat$AvroHbaseTableRecordWriter.write(AvroHbaseTableOutputFormat.java:75) 
12:59:06 [dba@dba@h01 1-exec][INFO] at com.booyah.analytics.mapred.AdaptAvroHbaseTableOu 
12:59:06 [dba@dba@h01 1-exec][INFO] attempt_201206142240_19696_r_000005_0: INFO 22-07 00:45:54,927 - Loaded the native-hadoop library

=================================================================




========log from the corresponding region server==========================

2012-07-22T09:08:25.833-0700: [GC [ParNew: 136739K->138K(153344K), 0.0027890 secs] 316329K->179757K(1621376K) icms_dc=0 , 0.0028470 secs] [Times: user=0.03 sys=0.00, real=0.01 secs] 
2012-07-22T09:25:00.822-0700: [GC [ParNew: 136458K->82K(153344K), 0.0028930 secs] 316077K->179701K(1621376K) icms_dc=0 , 0.0029500 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
2012-07-22T09:41:35.638-0700: [GC [ParNew: 136402K->49K(153344K), 0.0030770 secs] 316021K->179668K(1621376K) icms_dc=0 , 0.0031310 secs] [Times: user=0.02 sys=0.00, real=0.01 secs] 
2012-07-22T09:58:10.796-0700: [GC [ParNew: 136351K->44K(153344K), 0.0028190 secs] 315970K->179663K(1621376K) icms_dc=0 , 0.0028750 secs] [Times: user=0.03 sys=0.00, real=0.01 secs] 
2012-07-22T10:14:45.638-0700: [GC [ParNew: 136364K->66K(153344K), 0.0031410 secs] 315983K->179694K(1621376K) icms_dc=0 , 0.0031960 secs] [Times: user=0.02 sys=0.00, real=0.01 secs] 
2012-07-22T10:31:15.761-0700: [GC [ParNew: 136346K->52K(153344K), 0.0029310 secs] 315974K->179680K(1621376K) icms_dc=0 , 0.0029870 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
2012-07-22T10:47:48.745-0700: [GC [ParNew: 136341K->37K(153344K), 0.0031490 secs] 315969K->179665K(1621376K) icms_dc=0 , 0.0032070 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
2012-07-22T11:04:25.008-0700: [GC [ParNew: 136357K->39K(153344K), 0.0027710 secs] 315985K->179667K(1621376K) icms_dc=0 , 0.0028260 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2012-07-22T11:20:55.715-0700: [GC [ParNew: 136337K->39K(153344K), 0.0032670 secs] 315965K->179667K(1621376K) icms_dc=0 , 0.0033210 secs] [Times: user=0.03 sys=0.00, real=0.00 secs] 
2012-07-22T11:37:28.701-0700: [GC [ParNew: 136327K->39K(153344K), 0.0027510 secs] 315955K->179667K(1621376K) icms_dc=0 , 0.0028070 secs] [Times: user=0.02 sys=0.00, real=0.01 secs] 
2012-07-22T11:54:02.688-0700: [GC [ParNew: 136342K->39K(153344K), 0.0033410 secs] 315971K->179667K(1621376K) icms_dc=0 , 0.0033980 secs] [Times: user=0.03 sys=0.00, real=0.00 secs] 
2012-07-22T12:10:35.639-0700: [GC [ParNew: 136359K->39K(153344K), 0.0026440 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0027000 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
2012-07-22T12:27:05.649-0700: [GC [ParNew: 136359K->39K(153344K), 0.0027960 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0028520 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
2012-07-22T12:43:35.627-0700: [GC [ParNew: 136340K->39K(153344K), 0.0030750 secs] 315968K->179667K(1621376K) icms_dc=0 , 0.0031320 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
2012-07-22T13:00:05.607-0700: [GC [ParNew: 136359K->39K(153344K), 0.0032030 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0032770 secs] [Times: user=0.03 sys=0.01, real=0.00 secs] 
2012-07-22T13:16:35.587-0700: [GC [ParNew: 136359K->39K(153344K), 0.0027270 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0027820 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
2012-07-22T13:33:05.565-0700: [GC [ParNew: 136325K->39K(153344K), 0.0028510 secs] 315953K->179667K(1621376K) icms_dc=0 , 0.0029060 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
2012-07-22T13:49:35.545-0700: [GC [ParNew: 136359K->39K(153344K), 0.0030620 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0031170 secs] [Times: user=0.03 sys=0.00, real=0.01 secs] 
2012-07-22T14:06:05.524-0700: [GC [ParNew: 136359K->56K(153344K), 0.0029310 secs] 315987K->179684K(1621376K) icms_dc=0 , 0.0029870 secs] [Times: user=0.03 sys=0.00, real=0.01 secs] 
2012-07-22T14:22:35.502-0700: [GC [ParNew: 136376K->55K(153344K), 0.0028840 secs] 316004K->179684K(1621376K) icms_dc=0 , 0.0029400 secs] [Times: user=0.02 sys=0.00, real=0.01 secs] 
2012-07-22T14:39:06.248-0700: [GC [ParNew: 136375K->55K(153344K), 0.0032820 secs] 316004K->179684K(1621376K) icms_dc=0 , 0.0033360 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
12/07/22 14:47:33 INFO regionserver.Store: Renaming flushed file at hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/.tmp/7069292654066145383 to hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/M/8185820606610414476
12/07/22 14:47:33 INFO regionserver.Store: Added hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/M/8185820606610414476, entries=91568, sequenceid=15360597, memsize=22.4m, filesize=1.4m
2012-07-22T14:47:33.177-0700: [GC [ParNew: 136324K->12390K(153344K), 0.0061720 secs] 315953K->192019K(1621376K) icms_dc=0 , 0.0062440 secs] [Times: user=0.04 sys=0.00, real=0.01 secs] 
12/07/22 14:47:33 INFO regionserver.Store: Renaming flushed file at hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/.tmp/4192914148964281571 to hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/P/5830576607855695317
12/07/22 14:47:33 INFO regionserver.Store: Added hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/P/5830576607855695317, entries=101249, sequenceid=15360597, memsize=46.1m, filesize=4.2m
12/07/22 14:47:33 INFO regionserver.HRegion: Finished memstore flush of ~68.5m for region session_prod2,2EEEEEEC-0000-0000-0000-000000000000,1342965495011.010f43c1636e86b8546bd480158df536. in 275ms, sequenceid=15360597, compaction requested=false
2012-07-22T14:47:35.138-0700: [GC [ParNew: 148634K->8478K(153344K), 0.0116530 secs] 328263K->198054K(1621376K) icms_dc=0 , 0.0117260 secs] [Times: user=0.06 sys=0.00, real=0.01 secs] 
2012-07-22T14:47:43.631-0700: [GC [ParNew: 144640K->17024K(153344K), 0.0302560 secs] 334216K->238447K(1621376K) icms_dc=0 , 0.0303350 secs] [Times: user=0.20 sys=0.00, real=0.03 secs]
=================================================================

Re: hbase threw NotServingRegionException

Posted by Ey-Chih chow <ey...@gmail.com>.
Besides GC info, the master log is as follows.

Ey-Chih

===================================

12/07/22 00:47:18 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in keyvalues={session_1_prod2,11111110-0000-0000-0000-000000000000,1339745664043.27d8af632e8a790d4a72254cadbdcb0c./info:server/1339746517691/Put/vlen=22, session_1_prod2,11111110-0000-0000-0000-000000000000,1339745664043.27d8af632e8a790d4a72254cadbdcb0c./info:serverstartcode/1339746517691/Put/vlen=8}
12/07/22 00:47:18 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in keyvalues={session_1_prod2,1DDDDDDC-0000-0000-0000-000000000000,1339745664043.287beb1a649505008edb34564b67dbaa./info:server/1339790024840/Put/vlen=22, session_1_prod2,1DDDDDDC-0000-0000-0000-000000000000,1339745664043.287beb1a649505008edb34564b67dbaa./info:serverstartcode/1339790024840/Put/vlen=8}
2012-07-22T00:47:18.399-0700: [GC [ParNew: 17219K->172K(19136K), 0.0013380 secs] 23565K->6518K(346732K) icms_dc=0 , 0.0014010 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
12/07/22 00:47:18 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in keyvalues={session_1_prod2,3FFFFFFC-0000-0000-0000-000000000000,1339745664044.655378f6ee550200f85415aad81a876a./info:server/1339746494906/Put/vlen=22, session_1_prod2,3FFFFFFC-0000-0000-0000-000000000000,1339745664044.655378f6ee550200f85415aad81a876a./info:serverstartcode/1339746494906/Put/vlen=8}
12/07/22 00:47:18 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in keyvalues={session_prod2,0CCCCCCC-0000-0000-0000-000000000000,1339612083332.2437e319dd4ae29ab9d43bc7962c022e./info:server/1339614727706/Put/vlen=22, session_prod2,0CCCCCCC-0000-0000-0000-000000000000,1339612083332.2437e319dd4ae29ab9d43bc7962c022e./info:serverstartcode/1339614727706/Put/vlen=8}
12/07/22 00:47:18 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in keyvalues={session_prod2,2AAAAAA8-0000-0000-0000-000000000000,1339612083333.71a8929cb03baf96b3a5d2c2ddfda600./info:server/1339614726844/Put/vlen=22, session_prod2,2AAAAAA8-0000-0000-0000-000000000000,1339612083333.71a8929cb03baf96b3a5d2c2ddfda600./info:serverstartcode/1339614726844/Put/vlen=8}
12/07/22 00:47:18 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in keyvalues={session_prod2,3FFFFFFC-0000-0000-0000-000000000000,1339612083333.18ec0eae6fdb1c720684b8e85f02f482./info:server/1339613525749/Put/vlen=22, session_prod2,3FFFFFFC-0000-0000-0000-000000000000,1339612083333.18ec0eae6fdb1c720684b8e85f02f482./info:serverstartcode/1339613525749/Put/vlen=8}
12/07/22 00:47:18 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in keyvalues={session_prod2,44444440-0000-0000-0000-000000000000,1339612083333.e55655bcdb5f369158ef80b9a8f83a71./info:server/1339614726816/Put/vlen=22, session_prod2,44444440-0000-0000-0000-000000000000,1339612083333.e55655bcdb5f369158ef80b9a8f83a71./info:serverstartcode/1339614726816/Put/vlen=8}
12/07/22 00:47:18 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in keyvalues={session_prod2,4CCCCCC8-0000-0000-0000-000000000000,1339612083333.29a52f98f640f35e187de9208a16f77a./info:server/1339614727337/Put/vlen=22, session_prod2,4CCCCCC8-0000-0000-0000-000000000000,1339612083333.29a52f98f640f35e187de9208a16f77a./info:serverstartcode/1339614727337/Put/vlen=8}
12/07/22 00:47:18 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in keyvalues={session_prod2,5111110C-0000-0000-0000-000000000000,1339612083333.4cf3dbac075c508e6bb4868d3138d223./info:server/1339614730155/Put/vlen=22, session_prod2,5111110C-0000-0000-0000-000000000000,1339612083333.4cf3dbac075c508e6bb4868d3138d223./info:serverstartcode/1339614730155/Put/vlen=8}
12/07/22 00:47:18 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in keyvalues={session_prod2,6EEEEEE8-0000-0000-0000-000000000000,1339612083333.b90def2e52482c15a1ea275aa8bcabbf./info:server/1339614728640/Put/vlen=22, session_prod2,6EEEEEE8-0000-0000-0000-000000000000,1339612083333.b90def2e52482c15a1ea275aa8bcabbf./info:serverstartcode/1339614728640/Put/vlen=8}
12/07/22 00:47:18 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in keyvalues={session_prod2,7333332C-0000-0000-0000-000000000000,1339612083333.8cdb04c17aa9dd9e9eb05b8be0d5b58c./info:server/1339614728349/Put/vlen=22, session_prod2,7333332C-0000-0000-0000-000000000000,1339612083333.8cdb04c17aa9dd9e9eb05b8be0d5b58c./info:serverstartcode/1339614728349/Put/vlen=8}
12/07/22 00:47:18 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in keyvalues={session_prod2,8444443C-0000-0000-0000-000000000000,1339612083333.17609b20443e66270e3bc08ee8d1b31c./info:server/1339614728931/Put/vlen=22, session_prod2,8444443C-0000-0000-0000-000000000000,1339612083333.17609b20443e66270e3bc08ee8d1b31c./info:serverstartcode/1339614728931/Put/vlen=8}
12/07/22 00:47:18 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in keyvalues={session_prod2,9DDDDDD4-0000-0000-0000-000000000000,1339612083334.df6df275cd52f326f7f1098379db626b./info:server/1339614728399/Put/vlen=22, session_prod2,9DDDDDD4-0000-0000-0000-000000000000,1339612083334.df6df275cd52f326f7f1098379db626b./info:serverstartcode/1339614728399/Put/vlen=8}
12/07/22 00:47:18 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in keyvalues={session_prod2,AAAAAAA0-0000-0000-0000-000000000000,1339612083334.1ff35698a6f6e332a72ec7769b805343./info:server/1339614730650/Put/vlen=22, session_prod2,AAAAAAA0-0000-0000-0000-000000000000,1339612083334.1ff35698a6f6e332a72ec7769b805343./info:serverstartcode/1339614730650/Put/vlen=8}
12/07/22 00:47:18 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in keyvalues={session_prod2,D999998C-0000-0000-0000-000000000000,1339612083334.b4691c824d6e611b2210f90af9c3d7ac./info:server/1339614728301/Put/vlen=22, session_prod2,D999998C-0000-0000-0000-000000000000,1339612083334.b4691c824d6e611b2210f90af9c3d7ac./info:serverstartcode/1339614728301/Put/vlen=8}
12/07/22 00:47:18 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in keyvalues={session_prod2,EEEEEEE0-0000-0000-0000-000000000000,1339612083334.da306c199cfa04e6e2aa58807f753abe./info:server/1339614727959/Put/vlen=22, session_prod2,EEEEEEE0-0000-0000-0000-000000000000,1339612083334.da306c199cfa04e6e2aa58807f753abe./info:serverstartcode/1339614727959/Put/vlen=8}
=======================================================================

On Jul 23, 2012, at 3:01 PM, Elliott Clark wrote:

> hbck should help expose more problems than a single scan would.  With that
> said the logs are the best bet in understanding what is going on with the
> cluster at the time of the issue.  You posted logs that seem to only really
> contain GC info.  Do you have more information about the cluster state when
> the exceptions were thrown ?  Do you have the logs from the master? Was the
> RS responding at the time? Were there compactions or balancer actions
> happening ?
> 
> On Mon, Jul 23, 2012 at 2:05 PM, Ey-Chih chow <ey...@gmail.com> wrote:
> 
>> Thanks.  But if we do a scan on the table via the hbase shell, data in the
>> table did show up.
>> 
>> Ey-Chih
>> 
>> On Jul 23, 2012, at 1:10 PM, Mohammad Tariq wrote:
>> 
>>> Hello sir,
>>> 
>>>    A possible reason could be, your client is contacting the given
>>> regionserver, and that regionserver kept on rejecting the requests.
>>> Are you sure your table and all regions online? Use hbck once and see
>>> if you find anything interesting.
>>> 
>>> Regards,
>>>   Mohammad Tariq
>>> 
>>> 
>>> On Tue, Jul 24, 2012 at 1:26 AM, Ey-Chih chow <ey...@gmail.com> wrote:
>>>> Sorry I pasted the wrong portion of the region server log.  The right
>> portion should be as follows:
>>>> 
>>>> 
>>>> ======================================
>>>> 2012-07-22T00:48:57.147-0700: [GC [ParNew: 18863K->2112K(19136K),
>> 0.0029870 secs] 57106K->42831K(97048K) icms_dc=0 , 0.0030480 secs] [Times:
>> user=0.01 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:49:09.475-0700: [GC [ParNew: 18686K->2112K(19136K),
>> 0.0219960 secs] 59405K->43951K(97048K) icms_dc=0 , 0.0220530 secs] [Times:
>> user=0.15 sys=0.00, real=0.02 secs]
>>>> 2012-07-22T00:49:20.872-0700: [GC [ParNew: 19135K->2112K(19136K),
>> 0.0091670 secs] 60974K->44659K(97048K) icms_dc=0 , 0.0092480 secs] [Times:
>> user=0.06 sys=0.00, real=0.01 secs]
>>>> 2012-07-22T00:49:22.285-0700: [GC [ParNew: 19136K->2112K(19136K),
>> 0.0021870 secs] 61683K->46380K(97048K) icms_dc=0 , 0.0022480 secs] [Times:
>> user=0.01 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:49:24.999-0700: [GC [ParNew: 19136K->2087K(19136K),
>> 0.0019340 secs] 63404K->47842K(97048K) icms_dc=0 , 0.0019910 secs] [Times:
>> user=0.01 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:49:26.618-0700: [GC [ParNew: 18002K->2112K(19136K),
>> 0.0016160 secs] 63757K->48065K(97048K) icms_dc=0 , 0.0016780 secs] [Times:
>> user=0.01 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:49:28.089-0700: [GC [ParNew: 19136K->1413K(19136K),
>> 0.0033140 secs] 65089K->48748K(97048K) icms_dc=0 , 0.0033810 secs] [Times:
>> user=0.01 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:49:29.352-0700: [GC [ParNew: 18423K->2112K(19136K),
>> 0.0063630 secs] 65759K->50827K(97048K) icms_dc=0 , 0.0064210 secs] [Times:
>> user=0.05 sys=0.00, real=0.01 secs]
>>>> 2012-07-22T00:49:30.226-0700: [GC [ParNew: 19136K->1897K(19136K),
>> 0.0021470 secs] 67851K->52058K(97048K) icms_dc=0 , 0.0022010 secs] [Times:
>> user=0.01 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:49:31.190-0700: [GC [ParNew: 18921K->1606K(19136K),
>> 0.0013270 secs] 69082K->52336K(97048K) icms_dc=0 , 0.0013780 secs] [Times:
>> user=0.01 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:49:33.111-0700: [GC [ParNew: 18482K->2112K(19136K),
>> 0.0017970 secs] 69212K->53496K(97048K) icms_dc=0 , 0.0018510 secs] [Times:
>> user=0.01 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:49:34.396-0700: [GC [ParNew: 19136K->1468K(19136K),
>> 0.0120290 secs] 70520K->54718K(97048K) icms_dc=0 , 0.0120820 secs] [Times:
>> user=0.07 sys=0.00, real=0.02 secs]
>>>> 2012-07-22T00:49:35.294-0700: [GC [ParNew: 18181K->2112K(19136K),
>> 0.0011290 secs] 71431K->55399K(97048K) icms_dc=0 , 0.0011840 secs] [Times:
>> user=0.01 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:49:39.165-0700: [GC [ParNew: 19003K->547K(19136K),
>> 0.0130200 secs] 72291K->55364K(97048K) icms_dc=0 , 0.0130880 secs] [Times:
>> user=0.08 sys=0.00, real=0.02 secs]
>>>> 2012-07-22T00:49:40.642-0700: [GC [ParNew: 17571K->646K(19136K),
>> 0.0006740 secs] 72388K->55463K(97048K) icms_dc=0 , 0.0007220 secs] [Times:
>> user=0.00 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:49:41.541-0700: [GC [ParNew: 17670K->2112K(19136K),
>> 0.0104430 secs] 72487K->58179K(97048K) icms_dc=0 , 0.0105000 secs] [Times:
>> user=0.07 sys=0.00, real=0.01 secs]
>>>> 2012-07-22T00:49:42.972-0700: [GC [ParNew: 19136K->868K(19136K),
>> 0.0015360 secs] 75203K->58718K(97048K) icms_dc=0 , 0.0015880 secs] [Times:
>> user=0.02 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:49:43.355-0700: [GC [ParNew: 17892K->2112K(19136K),
>> 0.0017450 secs] 75742K->60552K(97048K) icms_dc=0 , 0.0018070 secs] [Times:
>> user=0.01 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:49:46.711-0700: [GC [ParNew: 18900K->644K(19136K),
>> 0.0015250 secs] 77340K->60427K(97048K) icms_dc=0 , 0.0015760 secs] [Times:
>> user=0.01 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:49:47.824-0700: [GC [ParNew: 17668K->1986K(19136K),
>> 0.0015260 secs] 77451K->61943K(97048K) icms_dc=0 , 0.0015780 secs] [Times:
>> user=0.00 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:49:51.552-0700: [GC [ParNew: 19010K->752K(19136K),
>> 0.0011550 secs] 78967K->60715K(97048K) icms_dc=0 , 0.0012500 secs] [Times:
>> user=0.00 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:49:58.460-0700: [GC [ParNew: 17776K->2055K(19136K),
>> 0.0010860 secs] 77739K->62018K(97048K) icms_dc=0 , 0.0011440 secs] [Times:
>> user=0.01 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:50:03.692-0700: [GC [ParNew: 19079K->2112K(19136K),
>> 0.0257530 secs] 79042K->63417K(97048K) icms_dc=0 , 0.0258130 secs] [Times:
>> user=0.18 sys=0.00, real=0.02 secs]
>>>> 2012-07-22T00:50:07.522-0700: [GC [ParNew: 19136K->2112K(19136K),
>> 0.0028960 secs] 80441K->67709K(97048K) icms_dc=0 , 0.0029600 secs] [Times:
>> user=0.02 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:50:30.491-0700: [GC [ParNew: 19136K->2112K(19136K),
>> 0.0023910 secs] 84733K->70543K(97048K) icms_dc=0 , 0.0024560 secs] [Times:
>> user=0.02 sys=0.01, real=0.00 secs]
>>>> 2012-07-22T00:50:39.242-0700: [GC [ParNew: 18974K->1947K(19136K),
>> 0.0009880 secs] 87406K->70737K(97048K) icms_dc=0 , 0.0010440 secs] [Times:
>> user=0.01 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:50:51.654-0700: [GC [ParNew: 18765K->2112K(19136K),
>> 0.0017700 secs] 87555K->71087K(97048K) icms_dc=0 , 0.0018250 secs] [Times:
>> user=0.01 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:51:01.245-0700: [GC [ParNew: 18136K->1984K(19136K),
>> 0.0015550 secs] 87112K->70974K(97048K) icms_dc=0 , 0.0016230 secs] [Times:
>> user=0.00 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:51:03.300-0700: [GC [ParNew: 19008K->2112K(19136K),
>> 0.0193190 secs] 87998K->73184K(97048K) icms_dc=0 , 0.0193830 secs] [Times:
>> user=0.10 sys=0.00, real=0.01 secs]
>>>> 2012-07-22T00:51:08.001-0700: [GC [ParNew: 19111K->2112K(19136K),
>> 0.0014340 secs] 90183K->73830K(97048K) icms_dc=0 , 0.0014900 secs] [Times:
>> user=0.01 sys=0.00, real=0.00 secs]
>>>> 2012-07-22T00:51:10.532-0700: [GC [ParNew: 19136K->2112K(19136K),
>> 0.0037150 secs] 90854K->76592K(97048K) icms_dc=15 , 0.0037930 secs] [Times:
>> user=0.01 sys=0.00, real=0.00 secs]
>>>> ==========================================
>>>> 
>>>> 
>>>> On Jul 23, 2012, at 12:39 PM, Ey-Chih chow wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> We got a Map/Reduce job that threw NotServingRegionException when the
>> reducer was about to insert data into a Hbase table. The error message is
>> as follows.  I also copied the corresponding region server log at the end
>> of the message.  Also, we browsed through the hbase administrative page of
>> the table. We couldn't see the list of Table Regions (the table is
>> pre-splitted.) Is there anybody who knows what's happening?  Thanks.
>>>>> 
>>>>> Ey-Chih Chow
>>>>> 
>>>>> ========= log from the map/reduce job =========================
>>>>> 
>>>>> 12:59:06 [dba@dba@h01 1-exec][INFO] 12/07/22 00:50:35 INFO
>> mapred.JobClient: Task Id : attempt_201206142240_19696_r_000005_0, Status :
>> FAILED
>>>>> 12:59:06 [dba@dba@h01 1-exec][INFO]
>> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed
>> 110 actions: NotServingRegionException: 110 times, servers with issues:
>> h07.mtv.byah.net:60020,
>>>>> 12:59:06 [dba@dba@h01 1-exec][INFO] at
>> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1227)
>>>>> 12:59:06 [dba@dba@h01 1-exec][INFO] at
>> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1241)
>>>>> 12:59:06 [dba@dba@h01 1-exec][INFO] at
>> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:826)
>>>>> 12:59:06 [dba@dba@h01 1-exec][INFO] at
>> org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:682)
>>>>> 12:59:06 [dba@dba@h01 1-exec][INFO] at
>> org.apache.hadoop.hbase.client.HTable.put(HTable.java:667)
>>>>> 12:59:06 [dba@dba@h01 1-exec][INFO] at
>> org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
>>>>> 12:59:06 [dba@dba@h01 1-exec][INFO] at
>> org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
>>>>> 12:59:06 [dba@dba@h01 1-exec][INFO] at
>> com.booyah.analytics.mapreduce.AvroHbaseTableOutputFormat$AvroHbaseTableRecordWriter.write(AvroHbaseTableOutputFormat.java:75)
>>>>> 12:59:06 [dba@dba@h01 1-exec][INFO] at
>> com.booyah.analytics.mapred.AdaptAvroHbaseTableOu
>>>>> 12:59:06 [dba@dba@h01 1-exec][INFO]
>> attempt_201206142240_19696_r_000005_0: INFO 22-07 00:45:54,927 - Loaded the
>> native-hadoop library
>>>>> 
>>>>> =================================================================
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> ========log from the corresponding region
>> server==========================
>>>>> 
>>>>> 2012-07-22T09:08:25.833-0700: [GC [ParNew: 136739K->138K(153344K),
>> 0.0027890 secs] 316329K->179757K(1621376K) icms_dc=0 , 0.0028470 secs]
>> [Times: user=0.03 sys=0.00, real=0.01 secs]
>>>>> 2012-07-22T09:25:00.822-0700: [GC [ParNew: 136458K->82K(153344K),
>> 0.0028930 secs] 316077K->179701K(1621376K) icms_dc=0 , 0.0029500 secs]
>> [Times: user=0.02 sys=0.00, real=0.00 secs]
>>>>> 2012-07-22T09:41:35.638-0700: [GC [ParNew: 136402K->49K(153344K),
>> 0.0030770 secs] 316021K->179668K(1621376K) icms_dc=0 , 0.0031310 secs]
>> [Times: user=0.02 sys=0.00, real=0.01 secs]
>>>>> 2012-07-22T09:58:10.796-0700: [GC [ParNew: 136351K->44K(153344K),
>> 0.0028190 secs] 315970K->179663K(1621376K) icms_dc=0 , 0.0028750 secs]
>> [Times: user=0.03 sys=0.00, real=0.01 secs]
>>>>> 2012-07-22T10:14:45.638-0700: [GC [ParNew: 136364K->66K(153344K),
>> 0.0031410 secs] 315983K->179694K(1621376K) icms_dc=0 , 0.0031960 secs]
>> [Times: user=0.02 sys=0.00, real=0.01 secs]
>>>>> 2012-07-22T10:31:15.761-0700: [GC [ParNew: 136346K->52K(153344K),
>> 0.0029310 secs] 315974K->179680K(1621376K) icms_dc=0 , 0.0029870 secs]
>> [Times: user=0.02 sys=0.00, real=0.00 secs]
>>>>> 2012-07-22T10:47:48.745-0700: [GC [ParNew: 136341K->37K(153344K),
>> 0.0031490 secs] 315969K->179665K(1621376K) icms_dc=0 , 0.0032070 secs]
>> [Times: user=0.02 sys=0.00, real=0.00 secs]
>>>>> 2012-07-22T11:04:25.008-0700: [GC [ParNew: 136357K->39K(153344K),
>> 0.0027710 secs] 315985K->179667K(1621376K) icms_dc=0 , 0.0028260 secs]
>> [Times: user=0.01 sys=0.00, real=0.00 secs]
>>>>> 2012-07-22T11:20:55.715-0700: [GC [ParNew: 136337K->39K(153344K),
>> 0.0032670 secs] 315965K->179667K(1621376K) icms_dc=0 , 0.0033210 secs]
>> [Times: user=0.03 sys=0.00, real=0.00 secs]
>>>>> 2012-07-22T11:37:28.701-0700: [GC [ParNew: 136327K->39K(153344K),
>> 0.0027510 secs] 315955K->179667K(1621376K) icms_dc=0 , 0.0028070 secs]
>> [Times: user=0.02 sys=0.00, real=0.01 secs]
>>>>> 2012-07-22T11:54:02.688-0700: [GC [ParNew: 136342K->39K(153344K),
>> 0.0033410 secs] 315971K->179667K(1621376K) icms_dc=0 , 0.0033980 secs]
>> [Times: user=0.03 sys=0.00, real=0.00 secs]
>>>>> 2012-07-22T12:10:35.639-0700: [GC [ParNew: 136359K->39K(153344K),
>> 0.0026440 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0027000 secs]
>> [Times: user=0.02 sys=0.00, real=0.00 secs]
>>>>> 2012-07-22T12:27:05.649-0700: [GC [ParNew: 136359K->39K(153344K),
>> 0.0027960 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0028520 secs]
>> [Times: user=0.02 sys=0.00, real=0.00 secs]
>>>>> 2012-07-22T12:43:35.627-0700: [GC [ParNew: 136340K->39K(153344K),
>> 0.0030750 secs] 315968K->179667K(1621376K) icms_dc=0 , 0.0031320 secs]
>> [Times: user=0.02 sys=0.00, real=0.00 secs]
>>>>> 2012-07-22T13:00:05.607-0700: [GC [ParNew: 136359K->39K(153344K),
>> 0.0032030 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0032770 secs]
>> [Times: user=0.03 sys=0.01, real=0.00 secs]
>>>>> 2012-07-22T13:16:35.587-0700: [GC [ParNew: 136359K->39K(153344K),
>> 0.0027270 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0027820 secs]
>> [Times: user=0.02 sys=0.00, real=0.00 secs]
>>>>> 2012-07-22T13:33:05.565-0700: [GC [ParNew: 136325K->39K(153344K),
>> 0.0028510 secs] 315953K->179667K(1621376K) icms_dc=0 , 0.0029060 secs]
>> [Times: user=0.02 sys=0.00, real=0.00 secs]
>>>>> 2012-07-22T13:49:35.545-0700: [GC [ParNew: 136359K->39K(153344K),
>> 0.0030620 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0031170 secs]
>> [Times: user=0.03 sys=0.00, real=0.01 secs]
>>>>> 2012-07-22T14:06:05.524-0700: [GC [ParNew: 136359K->56K(153344K),
>> 0.0029310 secs] 315987K->179684K(1621376K) icms_dc=0 , 0.0029870 secs]
>> [Times: user=0.03 sys=0.00, real=0.01 secs]
>>>>> 2012-07-22T14:22:35.502-0700: [GC [ParNew: 136376K->55K(153344K),
>> 0.0028840 secs] 316004K->179684K(1621376K) icms_dc=0 , 0.0029400 secs]
>> [Times: user=0.02 sys=0.00, real=0.01 secs]
>>>>> 2012-07-22T14:39:06.248-0700: [GC [ParNew: 136375K->55K(153344K),
>> 0.0032820 secs] 316004K->179684K(1621376K) icms_dc=0 , 0.0033360 secs]
>> [Times: user=0.02 sys=0.00, real=0.00 secs]
>>>>> 12/07/22 14:47:33 INFO regionserver.Store: Renaming flushed file at
>> hdfs://
>> h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/.tmp/7069292654066145383to hdfs://
>> h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/M/8185820606610414476
>>>>> 12/07/22 14:47:33 INFO regionserver.Store: Added hdfs://
>> h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/M/8185820606610414476,
>> entries=91568, sequenceid=15360597, memsize=22.4m, filesize=1.4m
>>>>> 2012-07-22T14:47:33.177-0700: [GC [ParNew: 136324K->12390K(153344K),
>> 0.0061720 secs] 315953K->192019K(1621376K) icms_dc=0 , 0.0062440 secs]
>> [Times: user=0.04 sys=0.00, real=0.01 secs]
>>>>> 12/07/22 14:47:33 INFO regionserver.Store: Renaming flushed file at
>> hdfs://
>> h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/.tmp/4192914148964281571to hdfs://
>> h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/P/5830576607855695317
>>>>> 12/07/22 14:47:33 INFO regionserver.Store: Added hdfs://
>> h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/P/5830576607855695317,
>> entries=101249, sequenceid=15360597, memsize=46.1m, filesize=4.2m
>>>>> 12/07/22 14:47:33 INFO regionserver.HRegion: Finished memstore flush
>> of ~68.5m for region
>> session_prod2,2EEEEEEC-0000-0000-0000-000000000000,1342965495011.010f43c1636e86b8546bd480158df536.
>> in 275ms, sequenceid=15360597, compaction requested=false
>>>>> 2012-07-22T14:47:35.138-0700: [GC [ParNew: 148634K->8478K(153344K),
>> 0.0116530 secs] 328263K->198054K(1621376K) icms_dc=0 , 0.0117260 secs]
>> [Times: user=0.06 sys=0.00, real=0.01 secs]
>>>>> 2012-07-22T14:47:43.631-0700: [GC [ParNew: 144640K->17024K(153344K),
>> 0.0302560 secs] 334216K->238447K(1621376K) icms_dc=0 , 0.0303350 secs]
>> [Times: user=0.20 sys=0.00, real=0.03 secs]
>>>>> =================================================================
>>>> 
>> 
>> 


Re: hbase threw NotServingRegionException

Posted by Elliott Clark <ec...@stumbleupon.com>.
hbck should help expose more problems than a single scan would.  With that
said the logs are the best bet in understanding what is going on with the
cluster at the time of the issue.  You posted logs that seem to only really
contain GC info.  Do you have more information about the cluster state when
the exceptions were thrown ?  Do you have the logs from the master? Was the
RS responding at the time? Were there compactions or balancer actions
happening ?

On Mon, Jul 23, 2012 at 2:05 PM, Ey-Chih chow <ey...@gmail.com> wrote:

> Thanks.  But if we do a scan on the table via the hbase shell, data in the
> table did show up.
>
> Ey-Chih
>
> On Jul 23, 2012, at 1:10 PM, Mohammad Tariq wrote:
>
> > Hello sir,
> >
> >     A possible reason could be, your client is contacting the given
> > regionserver, and that regionserver kept on rejecting the requests.
> > Are you sure your table and all regions online? Use hbck once and see
> > if you find anything interesting.
> >
> > Regards,
> >    Mohammad Tariq
> >
> >
> > On Tue, Jul 24, 2012 at 1:26 AM, Ey-Chih chow <ey...@gmail.com> wrote:
> >> Sorry I pasted the wrong portion of the region server log.  The right
> portion should be as follows:
> >>
> >>
> >> ======================================
> >> 2012-07-22T00:48:57.147-0700: [GC [ParNew: 18863K->2112K(19136K),
> 0.0029870 secs] 57106K->42831K(97048K) icms_dc=0 , 0.0030480 secs] [Times:
> user=0.01 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:49:09.475-0700: [GC [ParNew: 18686K->2112K(19136K),
> 0.0219960 secs] 59405K->43951K(97048K) icms_dc=0 , 0.0220530 secs] [Times:
> user=0.15 sys=0.00, real=0.02 secs]
> >> 2012-07-22T00:49:20.872-0700: [GC [ParNew: 19135K->2112K(19136K),
> 0.0091670 secs] 60974K->44659K(97048K) icms_dc=0 , 0.0092480 secs] [Times:
> user=0.06 sys=0.00, real=0.01 secs]
> >> 2012-07-22T00:49:22.285-0700: [GC [ParNew: 19136K->2112K(19136K),
> 0.0021870 secs] 61683K->46380K(97048K) icms_dc=0 , 0.0022480 secs] [Times:
> user=0.01 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:49:24.999-0700: [GC [ParNew: 19136K->2087K(19136K),
> 0.0019340 secs] 63404K->47842K(97048K) icms_dc=0 , 0.0019910 secs] [Times:
> user=0.01 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:49:26.618-0700: [GC [ParNew: 18002K->2112K(19136K),
> 0.0016160 secs] 63757K->48065K(97048K) icms_dc=0 , 0.0016780 secs] [Times:
> user=0.01 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:49:28.089-0700: [GC [ParNew: 19136K->1413K(19136K),
> 0.0033140 secs] 65089K->48748K(97048K) icms_dc=0 , 0.0033810 secs] [Times:
> user=0.01 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:49:29.352-0700: [GC [ParNew: 18423K->2112K(19136K),
> 0.0063630 secs] 65759K->50827K(97048K) icms_dc=0 , 0.0064210 secs] [Times:
> user=0.05 sys=0.00, real=0.01 secs]
> >> 2012-07-22T00:49:30.226-0700: [GC [ParNew: 19136K->1897K(19136K),
> 0.0021470 secs] 67851K->52058K(97048K) icms_dc=0 , 0.0022010 secs] [Times:
> user=0.01 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:49:31.190-0700: [GC [ParNew: 18921K->1606K(19136K),
> 0.0013270 secs] 69082K->52336K(97048K) icms_dc=0 , 0.0013780 secs] [Times:
> user=0.01 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:49:33.111-0700: [GC [ParNew: 18482K->2112K(19136K),
> 0.0017970 secs] 69212K->53496K(97048K) icms_dc=0 , 0.0018510 secs] [Times:
> user=0.01 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:49:34.396-0700: [GC [ParNew: 19136K->1468K(19136K),
> 0.0120290 secs] 70520K->54718K(97048K) icms_dc=0 , 0.0120820 secs] [Times:
> user=0.07 sys=0.00, real=0.02 secs]
> >> 2012-07-22T00:49:35.294-0700: [GC [ParNew: 18181K->2112K(19136K),
> 0.0011290 secs] 71431K->55399K(97048K) icms_dc=0 , 0.0011840 secs] [Times:
> user=0.01 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:49:39.165-0700: [GC [ParNew: 19003K->547K(19136K),
> 0.0130200 secs] 72291K->55364K(97048K) icms_dc=0 , 0.0130880 secs] [Times:
> user=0.08 sys=0.00, real=0.02 secs]
> >> 2012-07-22T00:49:40.642-0700: [GC [ParNew: 17571K->646K(19136K),
> 0.0006740 secs] 72388K->55463K(97048K) icms_dc=0 , 0.0007220 secs] [Times:
> user=0.00 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:49:41.541-0700: [GC [ParNew: 17670K->2112K(19136K),
> 0.0104430 secs] 72487K->58179K(97048K) icms_dc=0 , 0.0105000 secs] [Times:
> user=0.07 sys=0.00, real=0.01 secs]
> >> 2012-07-22T00:49:42.972-0700: [GC [ParNew: 19136K->868K(19136K),
> 0.0015360 secs] 75203K->58718K(97048K) icms_dc=0 , 0.0015880 secs] [Times:
> user=0.02 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:49:43.355-0700: [GC [ParNew: 17892K->2112K(19136K),
> 0.0017450 secs] 75742K->60552K(97048K) icms_dc=0 , 0.0018070 secs] [Times:
> user=0.01 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:49:46.711-0700: [GC [ParNew: 18900K->644K(19136K),
> 0.0015250 secs] 77340K->60427K(97048K) icms_dc=0 , 0.0015760 secs] [Times:
> user=0.01 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:49:47.824-0700: [GC [ParNew: 17668K->1986K(19136K),
> 0.0015260 secs] 77451K->61943K(97048K) icms_dc=0 , 0.0015780 secs] [Times:
> user=0.00 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:49:51.552-0700: [GC [ParNew: 19010K->752K(19136K),
> 0.0011550 secs] 78967K->60715K(97048K) icms_dc=0 , 0.0012500 secs] [Times:
> user=0.00 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:49:58.460-0700: [GC [ParNew: 17776K->2055K(19136K),
> 0.0010860 secs] 77739K->62018K(97048K) icms_dc=0 , 0.0011440 secs] [Times:
> user=0.01 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:50:03.692-0700: [GC [ParNew: 19079K->2112K(19136K),
> 0.0257530 secs] 79042K->63417K(97048K) icms_dc=0 , 0.0258130 secs] [Times:
> user=0.18 sys=0.00, real=0.02 secs]
> >> 2012-07-22T00:50:07.522-0700: [GC [ParNew: 19136K->2112K(19136K),
> 0.0028960 secs] 80441K->67709K(97048K) icms_dc=0 , 0.0029600 secs] [Times:
> user=0.02 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:50:30.491-0700: [GC [ParNew: 19136K->2112K(19136K),
> 0.0023910 secs] 84733K->70543K(97048K) icms_dc=0 , 0.0024560 secs] [Times:
> user=0.02 sys=0.01, real=0.00 secs]
> >> 2012-07-22T00:50:39.242-0700: [GC [ParNew: 18974K->1947K(19136K),
> 0.0009880 secs] 87406K->70737K(97048K) icms_dc=0 , 0.0010440 secs] [Times:
> user=0.01 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:50:51.654-0700: [GC [ParNew: 18765K->2112K(19136K),
> 0.0017700 secs] 87555K->71087K(97048K) icms_dc=0 , 0.0018250 secs] [Times:
> user=0.01 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:51:01.245-0700: [GC [ParNew: 18136K->1984K(19136K),
> 0.0015550 secs] 87112K->70974K(97048K) icms_dc=0 , 0.0016230 secs] [Times:
> user=0.00 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:51:03.300-0700: [GC [ParNew: 19008K->2112K(19136K),
> 0.0193190 secs] 87998K->73184K(97048K) icms_dc=0 , 0.0193830 secs] [Times:
> user=0.10 sys=0.00, real=0.01 secs]
> >> 2012-07-22T00:51:08.001-0700: [GC [ParNew: 19111K->2112K(19136K),
> 0.0014340 secs] 90183K->73830K(97048K) icms_dc=0 , 0.0014900 secs] [Times:
> user=0.01 sys=0.00, real=0.00 secs]
> >> 2012-07-22T00:51:10.532-0700: [GC [ParNew: 19136K->2112K(19136K),
> 0.0037150 secs] 90854K->76592K(97048K) icms_dc=15 , 0.0037930 secs] [Times:
> user=0.01 sys=0.00, real=0.00 secs]
> >> ==========================================
> >>
> >>
> >> On Jul 23, 2012, at 12:39 PM, Ey-Chih chow wrote:
> >>
> >>> Hi,
> >>>
> >>> We got a Map/Reduce job that threw NotServingRegionException when the
> reducer was about to insert data into a Hbase table. The error message is
> as follows.  I also copied the corresponding region server log at the end
> of the message.  Also, we browsed through the hbase administrative page of
> the table. We couldn't see the list of Table Regions (the table is
> pre-splitted.) Is there anybody who knows what's happening?  Thanks.
> >>>
> >>> Ey-Chih Chow
> >>>
> >>> ========= log from the map/reduce job =========================
> >>>
> >>> 12:59:06 [dba@dba@h01 1-exec][INFO] 12/07/22 00:50:35 INFO
> mapred.JobClient: Task Id : attempt_201206142240_19696_r_000005_0, Status :
> FAILED
> >>> 12:59:06 [dba@dba@h01 1-exec][INFO]
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed
> 110 actions: NotServingRegionException: 110 times, servers with issues:
> h07.mtv.byah.net:60020,
> >>> 12:59:06 [dba@dba@h01 1-exec][INFO] at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1227)
> >>> 12:59:06 [dba@dba@h01 1-exec][INFO] at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1241)
> >>> 12:59:06 [dba@dba@h01 1-exec][INFO] at
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:826)
> >>> 12:59:06 [dba@dba@h01 1-exec][INFO] at
> org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:682)
> >>> 12:59:06 [dba@dba@h01 1-exec][INFO] at
> org.apache.hadoop.hbase.client.HTable.put(HTable.java:667)
> >>> 12:59:06 [dba@dba@h01 1-exec][INFO] at
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
> >>> 12:59:06 [dba@dba@h01 1-exec][INFO] at
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
> >>> 12:59:06 [dba@dba@h01 1-exec][INFO] at
> com.booyah.analytics.mapreduce.AvroHbaseTableOutputFormat$AvroHbaseTableRecordWriter.write(AvroHbaseTableOutputFormat.java:75)
> >>> 12:59:06 [dba@dba@h01 1-exec][INFO] at
> com.booyah.analytics.mapred.AdaptAvroHbaseTableOu
> >>> 12:59:06 [dba@dba@h01 1-exec][INFO]
> attempt_201206142240_19696_r_000005_0: INFO 22-07 00:45:54,927 - Loaded the
> native-hadoop library
> >>>
> >>> =================================================================
> >>>
> >>>
> >>>
> >>>
> >>> ========log from the corresponding region
> server==========================
> >>>
> >>> 2012-07-22T09:08:25.833-0700: [GC [ParNew: 136739K->138K(153344K),
> 0.0027890 secs] 316329K->179757K(1621376K) icms_dc=0 , 0.0028470 secs]
> [Times: user=0.03 sys=0.00, real=0.01 secs]
> >>> 2012-07-22T09:25:00.822-0700: [GC [ParNew: 136458K->82K(153344K),
> 0.0028930 secs] 316077K->179701K(1621376K) icms_dc=0 , 0.0029500 secs]
> [Times: user=0.02 sys=0.00, real=0.00 secs]
> >>> 2012-07-22T09:41:35.638-0700: [GC [ParNew: 136402K->49K(153344K),
> 0.0030770 secs] 316021K->179668K(1621376K) icms_dc=0 , 0.0031310 secs]
> [Times: user=0.02 sys=0.00, real=0.01 secs]
> >>> 2012-07-22T09:58:10.796-0700: [GC [ParNew: 136351K->44K(153344K),
> 0.0028190 secs] 315970K->179663K(1621376K) icms_dc=0 , 0.0028750 secs]
> [Times: user=0.03 sys=0.00, real=0.01 secs]
> >>> 2012-07-22T10:14:45.638-0700: [GC [ParNew: 136364K->66K(153344K),
> 0.0031410 secs] 315983K->179694K(1621376K) icms_dc=0 , 0.0031960 secs]
> [Times: user=0.02 sys=0.00, real=0.01 secs]
> >>> 2012-07-22T10:31:15.761-0700: [GC [ParNew: 136346K->52K(153344K),
> 0.0029310 secs] 315974K->179680K(1621376K) icms_dc=0 , 0.0029870 secs]
> [Times: user=0.02 sys=0.00, real=0.00 secs]
> >>> 2012-07-22T10:47:48.745-0700: [GC [ParNew: 136341K->37K(153344K),
> 0.0031490 secs] 315969K->179665K(1621376K) icms_dc=0 , 0.0032070 secs]
> [Times: user=0.02 sys=0.00, real=0.00 secs]
> >>> 2012-07-22T11:04:25.008-0700: [GC [ParNew: 136357K->39K(153344K),
> 0.0027710 secs] 315985K->179667K(1621376K) icms_dc=0 , 0.0028260 secs]
> [Times: user=0.01 sys=0.00, real=0.00 secs]
> >>> 2012-07-22T11:20:55.715-0700: [GC [ParNew: 136337K->39K(153344K),
> 0.0032670 secs] 315965K->179667K(1621376K) icms_dc=0 , 0.0033210 secs]
> [Times: user=0.03 sys=0.00, real=0.00 secs]
> >>> 2012-07-22T11:37:28.701-0700: [GC [ParNew: 136327K->39K(153344K),
> 0.0027510 secs] 315955K->179667K(1621376K) icms_dc=0 , 0.0028070 secs]
> [Times: user=0.02 sys=0.00, real=0.01 secs]
> >>> 2012-07-22T11:54:02.688-0700: [GC [ParNew: 136342K->39K(153344K),
> 0.0033410 secs] 315971K->179667K(1621376K) icms_dc=0 , 0.0033980 secs]
> [Times: user=0.03 sys=0.00, real=0.00 secs]
> >>> 2012-07-22T12:10:35.639-0700: [GC [ParNew: 136359K->39K(153344K),
> 0.0026440 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0027000 secs]
> [Times: user=0.02 sys=0.00, real=0.00 secs]
> >>> 2012-07-22T12:27:05.649-0700: [GC [ParNew: 136359K->39K(153344K),
> 0.0027960 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0028520 secs]
> [Times: user=0.02 sys=0.00, real=0.00 secs]
> >>> 2012-07-22T12:43:35.627-0700: [GC [ParNew: 136340K->39K(153344K),
> 0.0030750 secs] 315968K->179667K(1621376K) icms_dc=0 , 0.0031320 secs]
> [Times: user=0.02 sys=0.00, real=0.00 secs]
> >>> 2012-07-22T13:00:05.607-0700: [GC [ParNew: 136359K->39K(153344K),
> 0.0032030 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0032770 secs]
> [Times: user=0.03 sys=0.01, real=0.00 secs]
> >>> 2012-07-22T13:16:35.587-0700: [GC [ParNew: 136359K->39K(153344K),
> 0.0027270 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0027820 secs]
> [Times: user=0.02 sys=0.00, real=0.00 secs]
> >>> 2012-07-22T13:33:05.565-0700: [GC [ParNew: 136325K->39K(153344K),
> 0.0028510 secs] 315953K->179667K(1621376K) icms_dc=0 , 0.0029060 secs]
> [Times: user=0.02 sys=0.00, real=0.00 secs]
> >>> 2012-07-22T13:49:35.545-0700: [GC [ParNew: 136359K->39K(153344K),
> 0.0030620 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0031170 secs]
> [Times: user=0.03 sys=0.00, real=0.01 secs]
> >>> 2012-07-22T14:06:05.524-0700: [GC [ParNew: 136359K->56K(153344K),
> 0.0029310 secs] 315987K->179684K(1621376K) icms_dc=0 , 0.0029870 secs]
> [Times: user=0.03 sys=0.00, real=0.01 secs]
> >>> 2012-07-22T14:22:35.502-0700: [GC [ParNew: 136376K->55K(153344K),
> 0.0028840 secs] 316004K->179684K(1621376K) icms_dc=0 , 0.0029400 secs]
> [Times: user=0.02 sys=0.00, real=0.01 secs]
> >>> 2012-07-22T14:39:06.248-0700: [GC [ParNew: 136375K->55K(153344K),
> 0.0032820 secs] 316004K->179684K(1621376K) icms_dc=0 , 0.0033360 secs]
> [Times: user=0.02 sys=0.00, real=0.00 secs]
> >>> 12/07/22 14:47:33 INFO regionserver.Store: Renaming flushed file at
> hdfs://
> h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/.tmp/7069292654066145383to hdfs://
> h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/M/8185820606610414476
> >>> 12/07/22 14:47:33 INFO regionserver.Store: Added hdfs://
> h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/M/8185820606610414476,
> entries=91568, sequenceid=15360597, memsize=22.4m, filesize=1.4m
> >>> 2012-07-22T14:47:33.177-0700: [GC [ParNew: 136324K->12390K(153344K),
> 0.0061720 secs] 315953K->192019K(1621376K) icms_dc=0 , 0.0062440 secs]
> [Times: user=0.04 sys=0.00, real=0.01 secs]
> >>> 12/07/22 14:47:33 INFO regionserver.Store: Renaming flushed file at
> hdfs://
> h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/.tmp/4192914148964281571to hdfs://
> h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/P/5830576607855695317
> >>> 12/07/22 14:47:33 INFO regionserver.Store: Added hdfs://
> h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/P/5830576607855695317,
> entries=101249, sequenceid=15360597, memsize=46.1m, filesize=4.2m
> >>> 12/07/22 14:47:33 INFO regionserver.HRegion: Finished memstore flush
> of ~68.5m for region
> session_prod2,2EEEEEEC-0000-0000-0000-000000000000,1342965495011.010f43c1636e86b8546bd480158df536.
> in 275ms, sequenceid=15360597, compaction requested=false
> >>> 2012-07-22T14:47:35.138-0700: [GC [ParNew: 148634K->8478K(153344K),
> 0.0116530 secs] 328263K->198054K(1621376K) icms_dc=0 , 0.0117260 secs]
> [Times: user=0.06 sys=0.00, real=0.01 secs]
> >>> 2012-07-22T14:47:43.631-0700: [GC [ParNew: 144640K->17024K(153344K),
> 0.0302560 secs] 334216K->238447K(1621376K) icms_dc=0 , 0.0303350 secs]
> [Times: user=0.20 sys=0.00, real=0.03 secs]
> >>> =================================================================
> >>
>
>

Re: hbase threw NotServingRegionException

Posted by Ey-Chih chow <ey...@gmail.com>.
Thanks.  But if we do a scan on the table via the hbase shell, data in the table did show up.

Ey-Chih
 
On Jul 23, 2012, at 1:10 PM, Mohammad Tariq wrote:

> Hello sir,
> 
>     A possible reason could be, your client is contacting the given
> regionserver, and that regionserver kept on rejecting the requests.
> Are you sure your table and all regions online? Use hbck once and see
> if you find anything interesting.
> 
> Regards,
>    Mohammad Tariq
> 
> 
> On Tue, Jul 24, 2012 at 1:26 AM, Ey-Chih chow <ey...@gmail.com> wrote:
>> Sorry I pasted the wrong portion of the region server log.  The right portion should be as follows:
>> 
>> 
>> ======================================
>> 2012-07-22T00:48:57.147-0700: [GC [ParNew: 18863K->2112K(19136K), 0.0029870 secs] 57106K->42831K(97048K) icms_dc=0 , 0.0030480 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:49:09.475-0700: [GC [ParNew: 18686K->2112K(19136K), 0.0219960 secs] 59405K->43951K(97048K) icms_dc=0 , 0.0220530 secs] [Times: user=0.15 sys=0.00, real=0.02 secs]
>> 2012-07-22T00:49:20.872-0700: [GC [ParNew: 19135K->2112K(19136K), 0.0091670 secs] 60974K->44659K(97048K) icms_dc=0 , 0.0092480 secs] [Times: user=0.06 sys=0.00, real=0.01 secs]
>> 2012-07-22T00:49:22.285-0700: [GC [ParNew: 19136K->2112K(19136K), 0.0021870 secs] 61683K->46380K(97048K) icms_dc=0 , 0.0022480 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:49:24.999-0700: [GC [ParNew: 19136K->2087K(19136K), 0.0019340 secs] 63404K->47842K(97048K) icms_dc=0 , 0.0019910 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:49:26.618-0700: [GC [ParNew: 18002K->2112K(19136K), 0.0016160 secs] 63757K->48065K(97048K) icms_dc=0 , 0.0016780 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:49:28.089-0700: [GC [ParNew: 19136K->1413K(19136K), 0.0033140 secs] 65089K->48748K(97048K) icms_dc=0 , 0.0033810 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:49:29.352-0700: [GC [ParNew: 18423K->2112K(19136K), 0.0063630 secs] 65759K->50827K(97048K) icms_dc=0 , 0.0064210 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]
>> 2012-07-22T00:49:30.226-0700: [GC [ParNew: 19136K->1897K(19136K), 0.0021470 secs] 67851K->52058K(97048K) icms_dc=0 , 0.0022010 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:49:31.190-0700: [GC [ParNew: 18921K->1606K(19136K), 0.0013270 secs] 69082K->52336K(97048K) icms_dc=0 , 0.0013780 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:49:33.111-0700: [GC [ParNew: 18482K->2112K(19136K), 0.0017970 secs] 69212K->53496K(97048K) icms_dc=0 , 0.0018510 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:49:34.396-0700: [GC [ParNew: 19136K->1468K(19136K), 0.0120290 secs] 70520K->54718K(97048K) icms_dc=0 , 0.0120820 secs] [Times: user=0.07 sys=0.00, real=0.02 secs]
>> 2012-07-22T00:49:35.294-0700: [GC [ParNew: 18181K->2112K(19136K), 0.0011290 secs] 71431K->55399K(97048K) icms_dc=0 , 0.0011840 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:49:39.165-0700: [GC [ParNew: 19003K->547K(19136K), 0.0130200 secs] 72291K->55364K(97048K) icms_dc=0 , 0.0130880 secs] [Times: user=0.08 sys=0.00, real=0.02 secs]
>> 2012-07-22T00:49:40.642-0700: [GC [ParNew: 17571K->646K(19136K), 0.0006740 secs] 72388K->55463K(97048K) icms_dc=0 , 0.0007220 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:49:41.541-0700: [GC [ParNew: 17670K->2112K(19136K), 0.0104430 secs] 72487K->58179K(97048K) icms_dc=0 , 0.0105000 secs] [Times: user=0.07 sys=0.00, real=0.01 secs]
>> 2012-07-22T00:49:42.972-0700: [GC [ParNew: 19136K->868K(19136K), 0.0015360 secs] 75203K->58718K(97048K) icms_dc=0 , 0.0015880 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:49:43.355-0700: [GC [ParNew: 17892K->2112K(19136K), 0.0017450 secs] 75742K->60552K(97048K) icms_dc=0 , 0.0018070 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:49:46.711-0700: [GC [ParNew: 18900K->644K(19136K), 0.0015250 secs] 77340K->60427K(97048K) icms_dc=0 , 0.0015760 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:49:47.824-0700: [GC [ParNew: 17668K->1986K(19136K), 0.0015260 secs] 77451K->61943K(97048K) icms_dc=0 , 0.0015780 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:49:51.552-0700: [GC [ParNew: 19010K->752K(19136K), 0.0011550 secs] 78967K->60715K(97048K) icms_dc=0 , 0.0012500 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:49:58.460-0700: [GC [ParNew: 17776K->2055K(19136K), 0.0010860 secs] 77739K->62018K(97048K) icms_dc=0 , 0.0011440 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:50:03.692-0700: [GC [ParNew: 19079K->2112K(19136K), 0.0257530 secs] 79042K->63417K(97048K) icms_dc=0 , 0.0258130 secs] [Times: user=0.18 sys=0.00, real=0.02 secs]
>> 2012-07-22T00:50:07.522-0700: [GC [ParNew: 19136K->2112K(19136K), 0.0028960 secs] 80441K->67709K(97048K) icms_dc=0 , 0.0029600 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:50:30.491-0700: [GC [ParNew: 19136K->2112K(19136K), 0.0023910 secs] 84733K->70543K(97048K) icms_dc=0 , 0.0024560 secs] [Times: user=0.02 sys=0.01, real=0.00 secs]
>> 2012-07-22T00:50:39.242-0700: [GC [ParNew: 18974K->1947K(19136K), 0.0009880 secs] 87406K->70737K(97048K) icms_dc=0 , 0.0010440 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:50:51.654-0700: [GC [ParNew: 18765K->2112K(19136K), 0.0017700 secs] 87555K->71087K(97048K) icms_dc=0 , 0.0018250 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:51:01.245-0700: [GC [ParNew: 18136K->1984K(19136K), 0.0015550 secs] 87112K->70974K(97048K) icms_dc=0 , 0.0016230 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:51:03.300-0700: [GC [ParNew: 19008K->2112K(19136K), 0.0193190 secs] 87998K->73184K(97048K) icms_dc=0 , 0.0193830 secs] [Times: user=0.10 sys=0.00, real=0.01 secs]
>> 2012-07-22T00:51:08.001-0700: [GC [ParNew: 19111K->2112K(19136K), 0.0014340 secs] 90183K->73830K(97048K) icms_dc=0 , 0.0014900 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>> 2012-07-22T00:51:10.532-0700: [GC [ParNew: 19136K->2112K(19136K), 0.0037150 secs] 90854K->76592K(97048K) icms_dc=15 , 0.0037930 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>> ==========================================
>> 
>> 
>> On Jul 23, 2012, at 12:39 PM, Ey-Chih chow wrote:
>> 
>>> Hi,
>>> 
>>> We got a Map/Reduce job that threw NotServingRegionException when the reducer was about to insert data into a Hbase table. The error message is as follows.  I also copied the corresponding region server log at the end of the message.  Also, we browsed through the hbase administrative page of the table. We couldn't see the list of Table Regions (the table is pre-splitted.) Is there anybody who knows what's happening?  Thanks.
>>> 
>>> Ey-Chih Chow
>>> 
>>> ========= log from the map/reduce job =========================
>>> 
>>> 12:59:06 [dba@dba@h01 1-exec][INFO] 12/07/22 00:50:35 INFO mapred.JobClient: Task Id : attempt_201206142240_19696_r_000005_0, Status : FAILED
>>> 12:59:06 [dba@dba@h01 1-exec][INFO] org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 110 actions: NotServingRegionException: 110 times, servers with issues: h07.mtv.byah.net:60020,
>>> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1227)
>>> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1241)
>>> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:826)
>>> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:682)
>>> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HTable.put(HTable.java:667)
>>> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
>>> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
>>> 12:59:06 [dba@dba@h01 1-exec][INFO] at com.booyah.analytics.mapreduce.AvroHbaseTableOutputFormat$AvroHbaseTableRecordWriter.write(AvroHbaseTableOutputFormat.java:75)
>>> 12:59:06 [dba@dba@h01 1-exec][INFO] at com.booyah.analytics.mapred.AdaptAvroHbaseTableOu
>>> 12:59:06 [dba@dba@h01 1-exec][INFO] attempt_201206142240_19696_r_000005_0: INFO 22-07 00:45:54,927 - Loaded the native-hadoop library
>>> 
>>> =================================================================
>>> 
>>> 
>>> 
>>> 
>>> ========log from the corresponding region server==========================
>>> 
>>> 2012-07-22T09:08:25.833-0700: [GC [ParNew: 136739K->138K(153344K), 0.0027890 secs] 316329K->179757K(1621376K) icms_dc=0 , 0.0028470 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
>>> 2012-07-22T09:25:00.822-0700: [GC [ParNew: 136458K->82K(153344K), 0.0028930 secs] 316077K->179701K(1621376K) icms_dc=0 , 0.0029500 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>>> 2012-07-22T09:41:35.638-0700: [GC [ParNew: 136402K->49K(153344K), 0.0030770 secs] 316021K->179668K(1621376K) icms_dc=0 , 0.0031310 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
>>> 2012-07-22T09:58:10.796-0700: [GC [ParNew: 136351K->44K(153344K), 0.0028190 secs] 315970K->179663K(1621376K) icms_dc=0 , 0.0028750 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
>>> 2012-07-22T10:14:45.638-0700: [GC [ParNew: 136364K->66K(153344K), 0.0031410 secs] 315983K->179694K(1621376K) icms_dc=0 , 0.0031960 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
>>> 2012-07-22T10:31:15.761-0700: [GC [ParNew: 136346K->52K(153344K), 0.0029310 secs] 315974K->179680K(1621376K) icms_dc=0 , 0.0029870 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>>> 2012-07-22T10:47:48.745-0700: [GC [ParNew: 136341K->37K(153344K), 0.0031490 secs] 315969K->179665K(1621376K) icms_dc=0 , 0.0032070 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>>> 2012-07-22T11:04:25.008-0700: [GC [ParNew: 136357K->39K(153344K), 0.0027710 secs] 315985K->179667K(1621376K) icms_dc=0 , 0.0028260 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>>> 2012-07-22T11:20:55.715-0700: [GC [ParNew: 136337K->39K(153344K), 0.0032670 secs] 315965K->179667K(1621376K) icms_dc=0 , 0.0033210 secs] [Times: user=0.03 sys=0.00, real=0.00 secs]
>>> 2012-07-22T11:37:28.701-0700: [GC [ParNew: 136327K->39K(153344K), 0.0027510 secs] 315955K->179667K(1621376K) icms_dc=0 , 0.0028070 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
>>> 2012-07-22T11:54:02.688-0700: [GC [ParNew: 136342K->39K(153344K), 0.0033410 secs] 315971K->179667K(1621376K) icms_dc=0 , 0.0033980 secs] [Times: user=0.03 sys=0.00, real=0.00 secs]
>>> 2012-07-22T12:10:35.639-0700: [GC [ParNew: 136359K->39K(153344K), 0.0026440 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0027000 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>>> 2012-07-22T12:27:05.649-0700: [GC [ParNew: 136359K->39K(153344K), 0.0027960 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0028520 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>>> 2012-07-22T12:43:35.627-0700: [GC [ParNew: 136340K->39K(153344K), 0.0030750 secs] 315968K->179667K(1621376K) icms_dc=0 , 0.0031320 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>>> 2012-07-22T13:00:05.607-0700: [GC [ParNew: 136359K->39K(153344K), 0.0032030 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0032770 secs] [Times: user=0.03 sys=0.01, real=0.00 secs]
>>> 2012-07-22T13:16:35.587-0700: [GC [ParNew: 136359K->39K(153344K), 0.0027270 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0027820 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>>> 2012-07-22T13:33:05.565-0700: [GC [ParNew: 136325K->39K(153344K), 0.0028510 secs] 315953K->179667K(1621376K) icms_dc=0 , 0.0029060 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>>> 2012-07-22T13:49:35.545-0700: [GC [ParNew: 136359K->39K(153344K), 0.0030620 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0031170 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
>>> 2012-07-22T14:06:05.524-0700: [GC [ParNew: 136359K->56K(153344K), 0.0029310 secs] 315987K->179684K(1621376K) icms_dc=0 , 0.0029870 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
>>> 2012-07-22T14:22:35.502-0700: [GC [ParNew: 136376K->55K(153344K), 0.0028840 secs] 316004K->179684K(1621376K) icms_dc=0 , 0.0029400 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
>>> 2012-07-22T14:39:06.248-0700: [GC [ParNew: 136375K->55K(153344K), 0.0032820 secs] 316004K->179684K(1621376K) icms_dc=0 , 0.0033360 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>>> 12/07/22 14:47:33 INFO regionserver.Store: Renaming flushed file at hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/.tmp/7069292654066145383 to hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/M/8185820606610414476
>>> 12/07/22 14:47:33 INFO regionserver.Store: Added hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/M/8185820606610414476, entries=91568, sequenceid=15360597, memsize=22.4m, filesize=1.4m
>>> 2012-07-22T14:47:33.177-0700: [GC [ParNew: 136324K->12390K(153344K), 0.0061720 secs] 315953K->192019K(1621376K) icms_dc=0 , 0.0062440 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
>>> 12/07/22 14:47:33 INFO regionserver.Store: Renaming flushed file at hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/.tmp/4192914148964281571 to hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/P/5830576607855695317
>>> 12/07/22 14:47:33 INFO regionserver.Store: Added hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/P/5830576607855695317, entries=101249, sequenceid=15360597, memsize=46.1m, filesize=4.2m
>>> 12/07/22 14:47:33 INFO regionserver.HRegion: Finished memstore flush of ~68.5m for region session_prod2,2EEEEEEC-0000-0000-0000-000000000000,1342965495011.010f43c1636e86b8546bd480158df536. in 275ms, sequenceid=15360597, compaction requested=false
>>> 2012-07-22T14:47:35.138-0700: [GC [ParNew: 148634K->8478K(153344K), 0.0116530 secs] 328263K->198054K(1621376K) icms_dc=0 , 0.0117260 secs] [Times: user=0.06 sys=0.00, real=0.01 secs]
>>> 2012-07-22T14:47:43.631-0700: [GC [ParNew: 144640K->17024K(153344K), 0.0302560 secs] 334216K->238447K(1621376K) icms_dc=0 , 0.0303350 secs] [Times: user=0.20 sys=0.00, real=0.03 secs]
>>> =================================================================
>> 


Re: hbase threw NotServingRegionException

Posted by Mohammad Tariq <do...@gmail.com>.
Hello sir,

     A possible reason could be, your client is contacting the given
regionserver, and that regionserver kept on rejecting the requests.
Are you sure your table and all regions online? Use hbck once and see
if you find anything interesting.

Regards,
    Mohammad Tariq


On Tue, Jul 24, 2012 at 1:26 AM, Ey-Chih chow <ey...@gmail.com> wrote:
> Sorry I pasted the wrong portion of the region server log.  The right portion should be as follows:
>
>
> ======================================
> 2012-07-22T00:48:57.147-0700: [GC [ParNew: 18863K->2112K(19136K), 0.0029870 secs] 57106K->42831K(97048K) icms_dc=0 , 0.0030480 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
> 2012-07-22T00:49:09.475-0700: [GC [ParNew: 18686K->2112K(19136K), 0.0219960 secs] 59405K->43951K(97048K) icms_dc=0 , 0.0220530 secs] [Times: user=0.15 sys=0.00, real=0.02 secs]
> 2012-07-22T00:49:20.872-0700: [GC [ParNew: 19135K->2112K(19136K), 0.0091670 secs] 60974K->44659K(97048K) icms_dc=0 , 0.0092480 secs] [Times: user=0.06 sys=0.00, real=0.01 secs]
> 2012-07-22T00:49:22.285-0700: [GC [ParNew: 19136K->2112K(19136K), 0.0021870 secs] 61683K->46380K(97048K) icms_dc=0 , 0.0022480 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
> 2012-07-22T00:49:24.999-0700: [GC [ParNew: 19136K->2087K(19136K), 0.0019340 secs] 63404K->47842K(97048K) icms_dc=0 , 0.0019910 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
> 2012-07-22T00:49:26.618-0700: [GC [ParNew: 18002K->2112K(19136K), 0.0016160 secs] 63757K->48065K(97048K) icms_dc=0 , 0.0016780 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
> 2012-07-22T00:49:28.089-0700: [GC [ParNew: 19136K->1413K(19136K), 0.0033140 secs] 65089K->48748K(97048K) icms_dc=0 , 0.0033810 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
> 2012-07-22T00:49:29.352-0700: [GC [ParNew: 18423K->2112K(19136K), 0.0063630 secs] 65759K->50827K(97048K) icms_dc=0 , 0.0064210 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]
> 2012-07-22T00:49:30.226-0700: [GC [ParNew: 19136K->1897K(19136K), 0.0021470 secs] 67851K->52058K(97048K) icms_dc=0 , 0.0022010 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
> 2012-07-22T00:49:31.190-0700: [GC [ParNew: 18921K->1606K(19136K), 0.0013270 secs] 69082K->52336K(97048K) icms_dc=0 , 0.0013780 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
> 2012-07-22T00:49:33.111-0700: [GC [ParNew: 18482K->2112K(19136K), 0.0017970 secs] 69212K->53496K(97048K) icms_dc=0 , 0.0018510 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
> 2012-07-22T00:49:34.396-0700: [GC [ParNew: 19136K->1468K(19136K), 0.0120290 secs] 70520K->54718K(97048K) icms_dc=0 , 0.0120820 secs] [Times: user=0.07 sys=0.00, real=0.02 secs]
> 2012-07-22T00:49:35.294-0700: [GC [ParNew: 18181K->2112K(19136K), 0.0011290 secs] 71431K->55399K(97048K) icms_dc=0 , 0.0011840 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
> 2012-07-22T00:49:39.165-0700: [GC [ParNew: 19003K->547K(19136K), 0.0130200 secs] 72291K->55364K(97048K) icms_dc=0 , 0.0130880 secs] [Times: user=0.08 sys=0.00, real=0.02 secs]
> 2012-07-22T00:49:40.642-0700: [GC [ParNew: 17571K->646K(19136K), 0.0006740 secs] 72388K->55463K(97048K) icms_dc=0 , 0.0007220 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
> 2012-07-22T00:49:41.541-0700: [GC [ParNew: 17670K->2112K(19136K), 0.0104430 secs] 72487K->58179K(97048K) icms_dc=0 , 0.0105000 secs] [Times: user=0.07 sys=0.00, real=0.01 secs]
> 2012-07-22T00:49:42.972-0700: [GC [ParNew: 19136K->868K(19136K), 0.0015360 secs] 75203K->58718K(97048K) icms_dc=0 , 0.0015880 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
> 2012-07-22T00:49:43.355-0700: [GC [ParNew: 17892K->2112K(19136K), 0.0017450 secs] 75742K->60552K(97048K) icms_dc=0 , 0.0018070 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
> 2012-07-22T00:49:46.711-0700: [GC [ParNew: 18900K->644K(19136K), 0.0015250 secs] 77340K->60427K(97048K) icms_dc=0 , 0.0015760 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
> 2012-07-22T00:49:47.824-0700: [GC [ParNew: 17668K->1986K(19136K), 0.0015260 secs] 77451K->61943K(97048K) icms_dc=0 , 0.0015780 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
> 2012-07-22T00:49:51.552-0700: [GC [ParNew: 19010K->752K(19136K), 0.0011550 secs] 78967K->60715K(97048K) icms_dc=0 , 0.0012500 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
> 2012-07-22T00:49:58.460-0700: [GC [ParNew: 17776K->2055K(19136K), 0.0010860 secs] 77739K->62018K(97048K) icms_dc=0 , 0.0011440 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
> 2012-07-22T00:50:03.692-0700: [GC [ParNew: 19079K->2112K(19136K), 0.0257530 secs] 79042K->63417K(97048K) icms_dc=0 , 0.0258130 secs] [Times: user=0.18 sys=0.00, real=0.02 secs]
> 2012-07-22T00:50:07.522-0700: [GC [ParNew: 19136K->2112K(19136K), 0.0028960 secs] 80441K->67709K(97048K) icms_dc=0 , 0.0029600 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
> 2012-07-22T00:50:30.491-0700: [GC [ParNew: 19136K->2112K(19136K), 0.0023910 secs] 84733K->70543K(97048K) icms_dc=0 , 0.0024560 secs] [Times: user=0.02 sys=0.01, real=0.00 secs]
> 2012-07-22T00:50:39.242-0700: [GC [ParNew: 18974K->1947K(19136K), 0.0009880 secs] 87406K->70737K(97048K) icms_dc=0 , 0.0010440 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
> 2012-07-22T00:50:51.654-0700: [GC [ParNew: 18765K->2112K(19136K), 0.0017700 secs] 87555K->71087K(97048K) icms_dc=0 , 0.0018250 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
> 2012-07-22T00:51:01.245-0700: [GC [ParNew: 18136K->1984K(19136K), 0.0015550 secs] 87112K->70974K(97048K) icms_dc=0 , 0.0016230 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
> 2012-07-22T00:51:03.300-0700: [GC [ParNew: 19008K->2112K(19136K), 0.0193190 secs] 87998K->73184K(97048K) icms_dc=0 , 0.0193830 secs] [Times: user=0.10 sys=0.00, real=0.01 secs]
> 2012-07-22T00:51:08.001-0700: [GC [ParNew: 19111K->2112K(19136K), 0.0014340 secs] 90183K->73830K(97048K) icms_dc=0 , 0.0014900 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
> 2012-07-22T00:51:10.532-0700: [GC [ParNew: 19136K->2112K(19136K), 0.0037150 secs] 90854K->76592K(97048K) icms_dc=15 , 0.0037930 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
> ==========================================
>
>
> On Jul 23, 2012, at 12:39 PM, Ey-Chih chow wrote:
>
>> Hi,
>>
>> We got a Map/Reduce job that threw NotServingRegionException when the reducer was about to insert data into a Hbase table. The error message is as follows.  I also copied the corresponding region server log at the end of the message.  Also, we browsed through the hbase administrative page of the table. We couldn't see the list of Table Regions (the table is pre-splitted.) Is there anybody who knows what's happening?  Thanks.
>>
>> Ey-Chih Chow
>>
>> ========= log from the map/reduce job =========================
>>
>> 12:59:06 [dba@dba@h01 1-exec][INFO] 12/07/22 00:50:35 INFO mapred.JobClient: Task Id : attempt_201206142240_19696_r_000005_0, Status : FAILED
>> 12:59:06 [dba@dba@h01 1-exec][INFO] org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 110 actions: NotServingRegionException: 110 times, servers with issues: h07.mtv.byah.net:60020,
>> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1227)
>> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1241)
>> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:826)
>> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:682)
>> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HTable.put(HTable.java:667)
>> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
>> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
>> 12:59:06 [dba@dba@h01 1-exec][INFO] at com.booyah.analytics.mapreduce.AvroHbaseTableOutputFormat$AvroHbaseTableRecordWriter.write(AvroHbaseTableOutputFormat.java:75)
>> 12:59:06 [dba@dba@h01 1-exec][INFO] at com.booyah.analytics.mapred.AdaptAvroHbaseTableOu
>> 12:59:06 [dba@dba@h01 1-exec][INFO] attempt_201206142240_19696_r_000005_0: INFO 22-07 00:45:54,927 - Loaded the native-hadoop library
>>
>> =================================================================
>>
>>
>>
>>
>> ========log from the corresponding region server==========================
>>
>> 2012-07-22T09:08:25.833-0700: [GC [ParNew: 136739K->138K(153344K), 0.0027890 secs] 316329K->179757K(1621376K) icms_dc=0 , 0.0028470 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
>> 2012-07-22T09:25:00.822-0700: [GC [ParNew: 136458K->82K(153344K), 0.0028930 secs] 316077K->179701K(1621376K) icms_dc=0 , 0.0029500 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>> 2012-07-22T09:41:35.638-0700: [GC [ParNew: 136402K->49K(153344K), 0.0030770 secs] 316021K->179668K(1621376K) icms_dc=0 , 0.0031310 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
>> 2012-07-22T09:58:10.796-0700: [GC [ParNew: 136351K->44K(153344K), 0.0028190 secs] 315970K->179663K(1621376K) icms_dc=0 , 0.0028750 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
>> 2012-07-22T10:14:45.638-0700: [GC [ParNew: 136364K->66K(153344K), 0.0031410 secs] 315983K->179694K(1621376K) icms_dc=0 , 0.0031960 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
>> 2012-07-22T10:31:15.761-0700: [GC [ParNew: 136346K->52K(153344K), 0.0029310 secs] 315974K->179680K(1621376K) icms_dc=0 , 0.0029870 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>> 2012-07-22T10:47:48.745-0700: [GC [ParNew: 136341K->37K(153344K), 0.0031490 secs] 315969K->179665K(1621376K) icms_dc=0 , 0.0032070 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>> 2012-07-22T11:04:25.008-0700: [GC [ParNew: 136357K->39K(153344K), 0.0027710 secs] 315985K->179667K(1621376K) icms_dc=0 , 0.0028260 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>> 2012-07-22T11:20:55.715-0700: [GC [ParNew: 136337K->39K(153344K), 0.0032670 secs] 315965K->179667K(1621376K) icms_dc=0 , 0.0033210 secs] [Times: user=0.03 sys=0.00, real=0.00 secs]
>> 2012-07-22T11:37:28.701-0700: [GC [ParNew: 136327K->39K(153344K), 0.0027510 secs] 315955K->179667K(1621376K) icms_dc=0 , 0.0028070 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
>> 2012-07-22T11:54:02.688-0700: [GC [ParNew: 136342K->39K(153344K), 0.0033410 secs] 315971K->179667K(1621376K) icms_dc=0 , 0.0033980 secs] [Times: user=0.03 sys=0.00, real=0.00 secs]
>> 2012-07-22T12:10:35.639-0700: [GC [ParNew: 136359K->39K(153344K), 0.0026440 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0027000 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>> 2012-07-22T12:27:05.649-0700: [GC [ParNew: 136359K->39K(153344K), 0.0027960 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0028520 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>> 2012-07-22T12:43:35.627-0700: [GC [ParNew: 136340K->39K(153344K), 0.0030750 secs] 315968K->179667K(1621376K) icms_dc=0 , 0.0031320 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>> 2012-07-22T13:00:05.607-0700: [GC [ParNew: 136359K->39K(153344K), 0.0032030 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0032770 secs] [Times: user=0.03 sys=0.01, real=0.00 secs]
>> 2012-07-22T13:16:35.587-0700: [GC [ParNew: 136359K->39K(153344K), 0.0027270 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0027820 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>> 2012-07-22T13:33:05.565-0700: [GC [ParNew: 136325K->39K(153344K), 0.0028510 secs] 315953K->179667K(1621376K) icms_dc=0 , 0.0029060 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>> 2012-07-22T13:49:35.545-0700: [GC [ParNew: 136359K->39K(153344K), 0.0030620 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0031170 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
>> 2012-07-22T14:06:05.524-0700: [GC [ParNew: 136359K->56K(153344K), 0.0029310 secs] 315987K->179684K(1621376K) icms_dc=0 , 0.0029870 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
>> 2012-07-22T14:22:35.502-0700: [GC [ParNew: 136376K->55K(153344K), 0.0028840 secs] 316004K->179684K(1621376K) icms_dc=0 , 0.0029400 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
>> 2012-07-22T14:39:06.248-0700: [GC [ParNew: 136375K->55K(153344K), 0.0032820 secs] 316004K->179684K(1621376K) icms_dc=0 , 0.0033360 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
>> 12/07/22 14:47:33 INFO regionserver.Store: Renaming flushed file at hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/.tmp/7069292654066145383 to hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/M/8185820606610414476
>> 12/07/22 14:47:33 INFO regionserver.Store: Added hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/M/8185820606610414476, entries=91568, sequenceid=15360597, memsize=22.4m, filesize=1.4m
>> 2012-07-22T14:47:33.177-0700: [GC [ParNew: 136324K->12390K(153344K), 0.0061720 secs] 315953K->192019K(1621376K) icms_dc=0 , 0.0062440 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
>> 12/07/22 14:47:33 INFO regionserver.Store: Renaming flushed file at hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/.tmp/4192914148964281571 to hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/P/5830576607855695317
>> 12/07/22 14:47:33 INFO regionserver.Store: Added hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/P/5830576607855695317, entries=101249, sequenceid=15360597, memsize=46.1m, filesize=4.2m
>> 12/07/22 14:47:33 INFO regionserver.HRegion: Finished memstore flush of ~68.5m for region session_prod2,2EEEEEEC-0000-0000-0000-000000000000,1342965495011.010f43c1636e86b8546bd480158df536. in 275ms, sequenceid=15360597, compaction requested=false
>> 2012-07-22T14:47:35.138-0700: [GC [ParNew: 148634K->8478K(153344K), 0.0116530 secs] 328263K->198054K(1621376K) icms_dc=0 , 0.0117260 secs] [Times: user=0.06 sys=0.00, real=0.01 secs]
>> 2012-07-22T14:47:43.631-0700: [GC [ParNew: 144640K->17024K(153344K), 0.0302560 secs] 334216K->238447K(1621376K) icms_dc=0 , 0.0303350 secs] [Times: user=0.20 sys=0.00, real=0.03 secs]
>> =================================================================
>

Re: hbase threw NotServingRegionException

Posted by Ey-Chih chow <ey...@gmail.com>.
Sorry I pasted the wrong portion of the region server log.  The right portion should be as follows:


======================================
2012-07-22T00:48:57.147-0700: [GC [ParNew: 18863K->2112K(19136K), 0.0029870 secs] 57106K->42831K(97048K) icms_dc=0 , 0.0030480 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2012-07-22T00:49:09.475-0700: [GC [ParNew: 18686K->2112K(19136K), 0.0219960 secs] 59405K->43951K(97048K) icms_dc=0 , 0.0220530 secs] [Times: user=0.15 sys=0.00, real=0.02 secs] 
2012-07-22T00:49:20.872-0700: [GC [ParNew: 19135K->2112K(19136K), 0.0091670 secs] 60974K->44659K(97048K) icms_dc=0 , 0.0092480 secs] [Times: user=0.06 sys=0.00, real=0.01 secs] 
2012-07-22T00:49:22.285-0700: [GC [ParNew: 19136K->2112K(19136K), 0.0021870 secs] 61683K->46380K(97048K) icms_dc=0 , 0.0022480 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2012-07-22T00:49:24.999-0700: [GC [ParNew: 19136K->2087K(19136K), 0.0019340 secs] 63404K->47842K(97048K) icms_dc=0 , 0.0019910 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2012-07-22T00:49:26.618-0700: [GC [ParNew: 18002K->2112K(19136K), 0.0016160 secs] 63757K->48065K(97048K) icms_dc=0 , 0.0016780 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2012-07-22T00:49:28.089-0700: [GC [ParNew: 19136K->1413K(19136K), 0.0033140 secs] 65089K->48748K(97048K) icms_dc=0 , 0.0033810 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2012-07-22T00:49:29.352-0700: [GC [ParNew: 18423K->2112K(19136K), 0.0063630 secs] 65759K->50827K(97048K) icms_dc=0 , 0.0064210 secs] [Times: user=0.05 sys=0.00, real=0.01 secs] 
2012-07-22T00:49:30.226-0700: [GC [ParNew: 19136K->1897K(19136K), 0.0021470 secs] 67851K->52058K(97048K) icms_dc=0 , 0.0022010 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2012-07-22T00:49:31.190-0700: [GC [ParNew: 18921K->1606K(19136K), 0.0013270 secs] 69082K->52336K(97048K) icms_dc=0 , 0.0013780 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2012-07-22T00:49:33.111-0700: [GC [ParNew: 18482K->2112K(19136K), 0.0017970 secs] 69212K->53496K(97048K) icms_dc=0 , 0.0018510 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2012-07-22T00:49:34.396-0700: [GC [ParNew: 19136K->1468K(19136K), 0.0120290 secs] 70520K->54718K(97048K) icms_dc=0 , 0.0120820 secs] [Times: user=0.07 sys=0.00, real=0.02 secs] 
2012-07-22T00:49:35.294-0700: [GC [ParNew: 18181K->2112K(19136K), 0.0011290 secs] 71431K->55399K(97048K) icms_dc=0 , 0.0011840 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2012-07-22T00:49:39.165-0700: [GC [ParNew: 19003K->547K(19136K), 0.0130200 secs] 72291K->55364K(97048K) icms_dc=0 , 0.0130880 secs] [Times: user=0.08 sys=0.00, real=0.02 secs] 
2012-07-22T00:49:40.642-0700: [GC [ParNew: 17571K->646K(19136K), 0.0006740 secs] 72388K->55463K(97048K) icms_dc=0 , 0.0007220 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
2012-07-22T00:49:41.541-0700: [GC [ParNew: 17670K->2112K(19136K), 0.0104430 secs] 72487K->58179K(97048K) icms_dc=0 , 0.0105000 secs] [Times: user=0.07 sys=0.00, real=0.01 secs] 
2012-07-22T00:49:42.972-0700: [GC [ParNew: 19136K->868K(19136K), 0.0015360 secs] 75203K->58718K(97048K) icms_dc=0 , 0.0015880 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
2012-07-22T00:49:43.355-0700: [GC [ParNew: 17892K->2112K(19136K), 0.0017450 secs] 75742K->60552K(97048K) icms_dc=0 , 0.0018070 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2012-07-22T00:49:46.711-0700: [GC [ParNew: 18900K->644K(19136K), 0.0015250 secs] 77340K->60427K(97048K) icms_dc=0 , 0.0015760 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2012-07-22T00:49:47.824-0700: [GC [ParNew: 17668K->1986K(19136K), 0.0015260 secs] 77451K->61943K(97048K) icms_dc=0 , 0.0015780 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
2012-07-22T00:49:51.552-0700: [GC [ParNew: 19010K->752K(19136K), 0.0011550 secs] 78967K->60715K(97048K) icms_dc=0 , 0.0012500 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
2012-07-22T00:49:58.460-0700: [GC [ParNew: 17776K->2055K(19136K), 0.0010860 secs] 77739K->62018K(97048K) icms_dc=0 , 0.0011440 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2012-07-22T00:50:03.692-0700: [GC [ParNew: 19079K->2112K(19136K), 0.0257530 secs] 79042K->63417K(97048K) icms_dc=0 , 0.0258130 secs] [Times: user=0.18 sys=0.00, real=0.02 secs] 
2012-07-22T00:50:07.522-0700: [GC [ParNew: 19136K->2112K(19136K), 0.0028960 secs] 80441K->67709K(97048K) icms_dc=0 , 0.0029600 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
2012-07-22T00:50:30.491-0700: [GC [ParNew: 19136K->2112K(19136K), 0.0023910 secs] 84733K->70543K(97048K) icms_dc=0 , 0.0024560 secs] [Times: user=0.02 sys=0.01, real=0.00 secs] 
2012-07-22T00:50:39.242-0700: [GC [ParNew: 18974K->1947K(19136K), 0.0009880 secs] 87406K->70737K(97048K) icms_dc=0 , 0.0010440 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2012-07-22T00:50:51.654-0700: [GC [ParNew: 18765K->2112K(19136K), 0.0017700 secs] 87555K->71087K(97048K) icms_dc=0 , 0.0018250 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2012-07-22T00:51:01.245-0700: [GC [ParNew: 18136K->1984K(19136K), 0.0015550 secs] 87112K->70974K(97048K) icms_dc=0 , 0.0016230 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
2012-07-22T00:51:03.300-0700: [GC [ParNew: 19008K->2112K(19136K), 0.0193190 secs] 87998K->73184K(97048K) icms_dc=0 , 0.0193830 secs] [Times: user=0.10 sys=0.00, real=0.01 secs] 
2012-07-22T00:51:08.001-0700: [GC [ParNew: 19111K->2112K(19136K), 0.0014340 secs] 90183K->73830K(97048K) icms_dc=0 , 0.0014900 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2012-07-22T00:51:10.532-0700: [GC [ParNew: 19136K->2112K(19136K), 0.0037150 secs] 90854K->76592K(97048K) icms_dc=15 , 0.0037930 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
==========================================


On Jul 23, 2012, at 12:39 PM, Ey-Chih chow wrote:

> Hi,
> 
> We got a Map/Reduce job that threw NotServingRegionException when the reducer was about to insert data into a Hbase table. The error message is as follows.  I also copied the corresponding region server log at the end of the message.  Also, we browsed through the hbase administrative page of the table. We couldn't see the list of Table Regions (the table is pre-splitted.) Is there anybody who knows what's happening?  Thanks.
> 
> Ey-Chih Chow
> 
> ========= log from the map/reduce job =========================
> 
> 12:59:06 [dba@dba@h01 1-exec][INFO] 12/07/22 00:50:35 INFO mapred.JobClient: Task Id : attempt_201206142240_19696_r_000005_0, Status : FAILED 
> 12:59:06 [dba@dba@h01 1-exec][INFO] org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 110 actions: NotServingRegionException: 110 times, servers with issues: h07.mtv.byah.net:60020, 
> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1227) 
> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1241) 
> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:826) 
> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:682) 
> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HTable.put(HTable.java:667) 
> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127) 
> 12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82) 
> 12:59:06 [dba@dba@h01 1-exec][INFO] at com.booyah.analytics.mapreduce.AvroHbaseTableOutputFormat$AvroHbaseTableRecordWriter.write(AvroHbaseTableOutputFormat.java:75) 
> 12:59:06 [dba@dba@h01 1-exec][INFO] at com.booyah.analytics.mapred.AdaptAvroHbaseTableOu 
> 12:59:06 [dba@dba@h01 1-exec][INFO] attempt_201206142240_19696_r_000005_0: INFO 22-07 00:45:54,927 - Loaded the native-hadoop library
> 
> =================================================================
> 
> 
> 
> 
> ========log from the corresponding region server==========================
> 
> 2012-07-22T09:08:25.833-0700: [GC [ParNew: 136739K->138K(153344K), 0.0027890 secs] 316329K->179757K(1621376K) icms_dc=0 , 0.0028470 secs] [Times: user=0.03 sys=0.00, real=0.01 secs] 
> 2012-07-22T09:25:00.822-0700: [GC [ParNew: 136458K->82K(153344K), 0.0028930 secs] 316077K->179701K(1621376K) icms_dc=0 , 0.0029500 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
> 2012-07-22T09:41:35.638-0700: [GC [ParNew: 136402K->49K(153344K), 0.0030770 secs] 316021K->179668K(1621376K) icms_dc=0 , 0.0031310 secs] [Times: user=0.02 sys=0.00, real=0.01 secs] 
> 2012-07-22T09:58:10.796-0700: [GC [ParNew: 136351K->44K(153344K), 0.0028190 secs] 315970K->179663K(1621376K) icms_dc=0 , 0.0028750 secs] [Times: user=0.03 sys=0.00, real=0.01 secs] 
> 2012-07-22T10:14:45.638-0700: [GC [ParNew: 136364K->66K(153344K), 0.0031410 secs] 315983K->179694K(1621376K) icms_dc=0 , 0.0031960 secs] [Times: user=0.02 sys=0.00, real=0.01 secs] 
> 2012-07-22T10:31:15.761-0700: [GC [ParNew: 136346K->52K(153344K), 0.0029310 secs] 315974K->179680K(1621376K) icms_dc=0 , 0.0029870 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
> 2012-07-22T10:47:48.745-0700: [GC [ParNew: 136341K->37K(153344K), 0.0031490 secs] 315969K->179665K(1621376K) icms_dc=0 , 0.0032070 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
> 2012-07-22T11:04:25.008-0700: [GC [ParNew: 136357K->39K(153344K), 0.0027710 secs] 315985K->179667K(1621376K) icms_dc=0 , 0.0028260 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
> 2012-07-22T11:20:55.715-0700: [GC [ParNew: 136337K->39K(153344K), 0.0032670 secs] 315965K->179667K(1621376K) icms_dc=0 , 0.0033210 secs] [Times: user=0.03 sys=0.00, real=0.00 secs] 
> 2012-07-22T11:37:28.701-0700: [GC [ParNew: 136327K->39K(153344K), 0.0027510 secs] 315955K->179667K(1621376K) icms_dc=0 , 0.0028070 secs] [Times: user=0.02 sys=0.00, real=0.01 secs] 
> 2012-07-22T11:54:02.688-0700: [GC [ParNew: 136342K->39K(153344K), 0.0033410 secs] 315971K->179667K(1621376K) icms_dc=0 , 0.0033980 secs] [Times: user=0.03 sys=0.00, real=0.00 secs] 
> 2012-07-22T12:10:35.639-0700: [GC [ParNew: 136359K->39K(153344K), 0.0026440 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0027000 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
> 2012-07-22T12:27:05.649-0700: [GC [ParNew: 136359K->39K(153344K), 0.0027960 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0028520 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
> 2012-07-22T12:43:35.627-0700: [GC [ParNew: 136340K->39K(153344K), 0.0030750 secs] 315968K->179667K(1621376K) icms_dc=0 , 0.0031320 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
> 2012-07-22T13:00:05.607-0700: [GC [ParNew: 136359K->39K(153344K), 0.0032030 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0032770 secs] [Times: user=0.03 sys=0.01, real=0.00 secs] 
> 2012-07-22T13:16:35.587-0700: [GC [ParNew: 136359K->39K(153344K), 0.0027270 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0027820 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
> 2012-07-22T13:33:05.565-0700: [GC [ParNew: 136325K->39K(153344K), 0.0028510 secs] 315953K->179667K(1621376K) icms_dc=0 , 0.0029060 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
> 2012-07-22T13:49:35.545-0700: [GC [ParNew: 136359K->39K(153344K), 0.0030620 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0031170 secs] [Times: user=0.03 sys=0.00, real=0.01 secs] 
> 2012-07-22T14:06:05.524-0700: [GC [ParNew: 136359K->56K(153344K), 0.0029310 secs] 315987K->179684K(1621376K) icms_dc=0 , 0.0029870 secs] [Times: user=0.03 sys=0.00, real=0.01 secs] 
> 2012-07-22T14:22:35.502-0700: [GC [ParNew: 136376K->55K(153344K), 0.0028840 secs] 316004K->179684K(1621376K) icms_dc=0 , 0.0029400 secs] [Times: user=0.02 sys=0.00, real=0.01 secs] 
> 2012-07-22T14:39:06.248-0700: [GC [ParNew: 136375K->55K(153344K), 0.0032820 secs] 316004K->179684K(1621376K) icms_dc=0 , 0.0033360 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
> 12/07/22 14:47:33 INFO regionserver.Store: Renaming flushed file at hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/.tmp/7069292654066145383 to hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/M/8185820606610414476
> 12/07/22 14:47:33 INFO regionserver.Store: Added hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/M/8185820606610414476, entries=91568, sequenceid=15360597, memsize=22.4m, filesize=1.4m
> 2012-07-22T14:47:33.177-0700: [GC [ParNew: 136324K->12390K(153344K), 0.0061720 secs] 315953K->192019K(1621376K) icms_dc=0 , 0.0062440 secs] [Times: user=0.04 sys=0.00, real=0.01 secs] 
> 12/07/22 14:47:33 INFO regionserver.Store: Renaming flushed file at hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/.tmp/4192914148964281571 to hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/P/5830576607855695317
> 12/07/22 14:47:33 INFO regionserver.Store: Added hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/P/5830576607855695317, entries=101249, sequenceid=15360597, memsize=46.1m, filesize=4.2m
> 12/07/22 14:47:33 INFO regionserver.HRegion: Finished memstore flush of ~68.5m for region session_prod2,2EEEEEEC-0000-0000-0000-000000000000,1342965495011.010f43c1636e86b8546bd480158df536. in 275ms, sequenceid=15360597, compaction requested=false
> 2012-07-22T14:47:35.138-0700: [GC [ParNew: 148634K->8478K(153344K), 0.0116530 secs] 328263K->198054K(1621376K) icms_dc=0 , 0.0117260 secs] [Times: user=0.06 sys=0.00, real=0.01 secs] 
> 2012-07-22T14:47:43.631-0700: [GC [ParNew: 144640K->17024K(153344K), 0.0302560 secs] 334216K->238447K(1621376K) icms_dc=0 , 0.0303350 secs] [Times: user=0.20 sys=0.00, real=0.03 secs]
> =================================================================