Posted to user@hbase.apache.org by Schubert Zhang <zs...@gmail.com> on 2011/01/26 16:58:25 UTC

HBase 0.90.0 cannot put more data after running for hours

I am using HBase 0.90.0 (8 region servers + 1 master),
and the HDFS is CDH3b3.

During the first hours of running, I put many entities (tens of millions,
each about 200 bytes) and it worked well.

But then the client could not put any more data.

I checked all the HBase log files and found nothing abnormal; I will continue
to investigate this issue.

It seems related to ZooKeeper...

Re: HBase 0.90.0 cannot put more data after running for hours

Posted by Ryan Rawson <ry...@gmail.com>.
The client synchronizes so that only one thread does the actual
META lookup, to reduce extra traffic on the META table.  You can use
the object ids in the thread dump to find out which threads are blocked and
which is/are the blocker(s).  Once you poke at it, it's not too hard to
figure out.  Personally I use 'less' and its / search feature, which also
highlights matches.
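
An equivalent view can also be pulled programmatically from inside the client
JVM with the standard ThreadMXBean API, which reports each thread's lock and
the lock's owner directly; a minimal sketch (class name is arbitrary):

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    public class BlockerDump {
        // Print, for every live thread, the lock it is waiting on and the
        // thread that currently owns that lock -- the same information the
        // object ids in a jstack dump encode.
        public static void main(String[] args) {
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            for (ThreadInfo ti : mx.dumpAllThreads(true, true)) {
                System.out.printf("%-30s %-15s waiting on %s owned by %s%n",
                        ti.getThreadName(), ti.getThreadState(),
                        ti.getLockName(), ti.getLockOwnerName());
            }
        }
    }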

You can see this kind of behaviour during splits; does the client
"unstick" itself and move on?

-ryan

On Thu, Jan 27, 2011 at 10:33 PM, Schubert Zhang <zs...@gmail.com> wrote:
> 1. The .META. table seems ok
>     I can read my data table (one thread for reading).
>     I can use hbase shell to scan my data table.
>     And I can use 1~4 threads to put more data into my data table.
>
>    Before this issue happen, about 800 millions entities (column) have been
> put into the table successfully, and there are 253 regions for this table.
>
> 2. There is  no strange things in logs (INFO level).
> 3. All clients use HBaseConfiguration.create() for a new Configuration
> instance.
>
> 4. The 8+ client threads running on a single machine and a single JVM.
>
> 5. Seems all 8+ threads are blocked in same location waiting on call to
> return.
>
> Currently, there is no more clue, and I am digging for more clue.
>
> On Fri, Jan 28, 2011 at 12:02 PM, Stack <st...@duboce.net> wrote:
>
>> Thats a lookup on the .META. table.  Is the region hosting .META. OK?
>> Anything in its logs?  Do your clients share a Configuration instance
>> or do you make a new one of these each time you make an HTable?   Your
>> threaded client is running on single machine?  Can we see full stack
>> trace?  Are all 8+ threads blocked in same location waiting on call to
>> return?
>>
>> St.Ack
>>
>>
>>
>> On Wed, Jan 26, 2011 at 9:19 AM, Schubert Zhang <zs...@gmail.com> wrote:
>> > The "Thread-Opr0" the client thread to put data into hbase, it is
>> waiting.
>> >
>> > "Thread-Opr0-EventThread" daemon prio=10 tid=0x00002aaafc7a8000 nid=0xe08
>> > waiting on condition [0x000000004383f000]
>> >   java.lang.Thread.State: WAITING (parking)
>> >        at sun.misc.Unsafe.park(Native Method)
>> >        - parking to wait for  <0x00002aaab632ae50> (a
>> > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>> >        at
>> java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>> >        at
>> >
>> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>> >        at
>> >
>> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>> >        at
>> > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
>> > "Thread-Opr0-SendThread(nd1-rack2-cloud:2181)" daemon prio=10
>> > tid=0x00002aaafc7a6800 nid=0xe07 runnable [0x000000004373e000]
>> >   java.lang.Thread.State: RUNNABLE
>> >        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>> >        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
>> >        at
>> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
>> >        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
>> >        - locked <0x00002aaab6304410> (a sun.nio.ch.Util$1)
>> >        - locked <0x00002aaab6304428> (a
>> > java.util.Collections$UnmodifiableSet)
>> >        - locked <0x00002aaab632abd0> (a sun.nio.ch.EPollSelectorImpl)
>> >        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
>> >        at
>> > org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)
>> >
>> > "Thread-Opr0" prio=10 tid=0x00002aab0402a000 nid=0xdf2 in Object.wait()
>> > [0x000000004262d000]
>> >   java.lang.Thread.State: WAITING (on object monitor)
>> >        at java.lang.Object.wait(Native Method)
>> >        - waiting on <0x00002aaab04302d0> (a
>> > org.apache.hadoop.hbase.ipc.HBaseClient$Call)
>> >        at java.lang.Object.wait(Object.java:485)
>> >        at
>> > org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:739)
>> >        - locked <0x00002aaab04302d0> (a
>> > org.apache.hadoop.hbase.ipc.HBaseClient$Call)
>> >        at
>> > org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>> >        at $Proxy0.getClosestRowBefore(Unknown Source)
>> >        at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:517)
>> >        at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:515)
>> >        at
>> >
>> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1000)
>> >        at
>> > org.apache.hadoop.hbase.client.HTable.getRowOrBefore(HTable.java:514)
>> >        at
>> > org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:133)
>> >        at
>> > org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:95)
>> >        at
>> >
>> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:645)
>> >        at
>> >
>> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:699)
>> >        - locked <0x00002aaab6294660> (a java.lang.Object)
>> >        at
>> >
>> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:590)
>> >        at
>> >
>> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1114)
>> >        at
>> >
>> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1234)
>> >        at
>> > org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:819)
>> >        at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:675)
>> >        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:660)
>> >        at com.bigdata.bench.hbase.HBaseWriter$Operator.operateTo(Unknown
>> > Source)
>> >        at com.bigdata.bench.hbase.HBaseWriter$Operator.run(Unknown
>> Source)
>> >
>> > On Thu, Jan 27, 2011 at 12:06 AM, Schubert Zhang <zs...@gmail.com>
>> wrote:
>> >
>> >> Even though cannot put more data into table, I can read the existing
>> data.
>> >>
>> >> And I stop and re-start the HBase, still cannot put more data.
>> >>
>> >> hbase(main):031:0> status 'simple'
>> >> 8 live servers
>> >>     nd5-rack2-cloud:60020 1296057544120
>> >>         requests=0, regions=32, usedHeap=130, maxHeap=8973
>> >>     nd8-rack2-cloud:60020 1296057544350
>> >>         requests=0, regions=31, usedHeap=128, maxHeap=8983
>> >>     nd2-rack2-cloud:60020 1296057543346
>> >>         requests=0, regions=32, usedHeap=130, maxHeap=8973
>> >>     nd3-rack2-cloud:60020 1296057544224
>> >>         requests=0, regions=32, usedHeap=133, maxHeap=8973
>> >>     nd6-rack2-cloud:60020 1296057544482
>> >>         requests=0, regions=32, usedHeap=130, maxHeap=8983
>> >>     nd9-rack2-cloud:60020 1296057544565
>> >>         requests=174, regions=32, usedHeap=180, maxHeap=8983
>> >>     nd7-rack2-cloud:60020 1296057544617
>> >>         requests=0, regions=32, usedHeap=126, maxHeap=8983
>> >>     nd4-rack2-cloud:60020 1296057544138
>> >>         requests=0, regions=32, usedHeap=126, maxHeap=8973
>> >> 0 dead servers
>> >> Aggregate load: 174, regions: 255
>> >>
>> >>
>> >> On Wed, Jan 26, 2011 at 11:58 PM, Schubert Zhang <zsongbo@gmail.com
>> >wrote:
>> >>
>> >>> I am using 0.90.0 (8 RS + 1Master)
>> >>> and the HDFS is CDH3b3
>> >>>
>> >>> During the first hours of running, I puts many (tens of millions
>> entites,
>> >>> each about 200 bytes), it worked well.
>> >>>
>> >>> But then, the client cannot put more data.
>> >>>
>> >>> I checked all log files of hbase, no abnormal is found, I will continue
>> to
>> >>> check this issue.
>> >>>
>> >>> It seems related to ZooKeeper......
>> >>>
>> >>
>> >>
>> >
>>
>

Re: HBase 0.90.0 cannot put more data after running for hours

Posted by Anty <an...@gmail.com>.
Sorry, the vmstat output for 2) is wrong.
1) When there are only 2 client threads, the vmstat output is:

procs -----------memory---------- ---swap-- -----io---- --system--
-----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 29236 1610000 557488 12027072 0 0 19 53 0 0 3 1 97 0 0
0 0 29236 1610152 557488 12027076 0 0 0 0 4630 23177 2 1 97 0 0
0 0 29236 1610484 557488 12027076 0 0 0 603 4604 22758 2 1 97 0 0
0 0 29236 1610544 557488 12027076 0 0 0 11 4643 23009 2 1 97 0 0
0 0 29236 1610912 557488 12027080 0 0 0 5 4578 22900 2 1 97 0 0
3 0 29236 1610600 557488 12027080 0 0 0 21 4627 22922 2 1 97 0 0
1 0 29236 1610616 557488 12027080 0 0 0 0 4552 23077 2 1 97 0 0
0 0 29236 1610992 557488 12027084 0 0 0 608 4608 22995 2 1 97 0 0
0 0 29236 1611900 557488 12027084 0 0 0 144 4641 23000 2 1 97 0 0
0 0 29236 1612040 557488 12027084 0 0 0 0 4621 22952 2 1 97 0 0
0 0 29236 1611428 557488 12027088 0 0 0 593 4668 23133 2 1 96 0 0
1 0 29236 1611812 557488 12027088 0 0 0 0 4623 23673 2 1 97 0 0
0 0 29236 1612068 557488 12027088 0 0 0 145 4627 23245 2 1 97 0 0
1 0 29236 1612440 557488 12027092 0 0 0 1 4654 23351 2 1 97 0 0
1 0 29236 1612120 557488 12027092 0 0 0 131 4695 23137 4 1 95 0 0
2 0 29236 1612492 557492 12027096 0 0 0 39 4647 22818 2 1 97 0 0
1 0 29236 1610164 557492 12027096 0 0 0 633 4666 22914 2 1 97 0 0
0 0 29236 1610380 557492 12027096 0 0 0 73 4668 22957 2 1 97 0 0
1 0 29236 1610420 557492 12027100 0 0 0 47 4642 22907 2 1 97 0 0
0 0 29236 1611596 557492 12027100 0 0 0 0 4664 22999 2 1 97 0 0
1 0 29236 1612236 557492 12027084 0 0 0 657 4665 22552 2 1 97 0 0
1 0 29236 1612732 557492 12027088 0 0 0 0 4586 22640 2 1 97 0 0
0 0 29236 1612504 557492 12027088 0 0 0 33 4605 22670 2 1 97 0 0
2 0 29236 1611500 557492 12027092 0 0 0 161 4575 22763 2 1 97 0 0
0 0 29236 1612160 557500 12027092 0 0 0 5 4570 23265 2 1 97 0 0
0 0 29236 1612568 557500 12027092 0 0 0 592 4616 23285 2 1 97 0 0
0 0 29236 1612700 557500 12027096 0 0 0 0 4688 23754 3 1 96 0 0
1 0 29236 1612192 557504 12027092 0 0 0 27 4649 23501 2 1 97 0 0
1 0 29236 1611748 557504 12027104 0 0 0 12 4612 23664 2 1 97 0 0
0 0 29236 1611932 557504 12027116 0 0 0 768 4606 22910 2 1 97 0 0
2 0 29236 1611788 557504 12027116 0 0 0 25 4545 22991 2 1 97 0 0
1 0 29236 1611916 557504 12027120 0 0 0 0 4615 23138 2 1 97 0 0
0 0 29236 1612176 557504 12027120 0 0 0 139 4604 23231 2 1 97 0 0
0 0 29236 1612848 557504 12027108 0 0 0 0 4624 23673 2 1 97 0 0
0 0 29236 1612820 557504 12027108 0 0 0 616 4657 23247 2 1 97 0 0
0 0 29236 1613196 557508 12027108 0 0 0 107 4581 23193 2 1 97 0 0
1 0 29236 1611772 557508 12027112 0 0 0 185 4594 22807 2 1 97 0 0
2 0 29236 1610432 557508 12027116 0 0 0 1 4603 23395 2 1 97 0 0
2 0 29236 1610348 557512 12027116 0 0 0 653 4724 23562 2 1 97 0 0
0 0 29236 1610612 557512 12027132 0 0 0 0 4621 23533 2 1 97 0 0
1 0 29236 1612368 557512 12027116 0 0 0 77 4609 23223 2 1 97 0 0
0 0 29236 1612628 557512 12027120 0 0 0 0 4571 22697 2 1 97 0 0
1 0 29236 1610908 557512 12027120 0 0 0 19 4585 22614 2 1 97 0 0
1 0 29236 1610020 557512 12027124 0 0 0 629 4656 23382 2 1 97 0 0
On Thu, Feb 24, 2011 at 11:39 AM, Anty <an...@gmail.com> wrote:


2) With the number of client threads increased to 8, the vmstat output is:

procs -----------memory---------- ---swap-- -----io---- --system--
-----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 29236 1442908 557276 12026508 0 0 19 53 0 0 3 1 97 0 0
1 0 29236 1443652 557276 12026508 0 0 0 0 1362 4740 8 0 91 0 0
1 0 29236 1443676 557276 12026508 0 0 0 36 1530 7361 10 0 90 0 0
0 0 29236 1443384 557284 12026512 0 0 0 728 1327 4565 8 0 92 0 0
1 0 29236 1442864 557284 12026512 0 0 0 5 1558 4889 10 0 90 0 0
1 0 29236 1443028 557284 12026516 0 0 0 24 1321 5751 9 0 91 0 0
2 0 29236 1440228 557296 12026512 0 0 0 61 1399 5529 9 0 90 0 0
2 0 29236 1439756 557296 12026528 0 0 0 91 1450 3485 8 0 92 0 0
1 0 29236 1440164 557296 12026528 0 0 0 217 1433 7146 10 0 90 0 0
1 0 29236 1439604 557296 12026528 0 0 0 4 1387 3420 8 0 92 0 0
4 0 29236 1439604 557296 12026532 0 0 0 593 1438 6291 9 0 90 0 0
1 0 29236 1439624 557296 12026532 0 0 0 41 1296 7312 9 0 91 0 0
1 0 29236 1439632 557296 12026532 0 0 0 1 1417 6495 9 0 91 0 0
1 0 29236 1439820 557296 12026536 0 0 0 59 1379 5482 10 0 90 0 0
1 0 29236 1439112 557296 12026540 0 0 0 12 1409 3470 8 0 92 0 0
1 0 29236 1438024 557296 12026556 0 0 0 93 1384 4741 8 0 91 0 0
1 0 29236 1437672 557296 12026556 0 0 0 627 1298 6614 10 0 90 0 0
1 0 29236 1438288 557296 12026556 0 0 0 4 1482 4857 8 0 91 0 0
1 0 29236 1438636 557296 12026560 0 0 0 169 1421 6282 10 0 90 0 0
1 0 29236 1438576 557296 12026560 0 0 0 431 1309 6078 9 0 91 0 0
1 0 29236 1438744 557296 12026560 0 0 0 728 1357 7340 9 0 91 0 0
1 0 29236 1439008 557296 12026564 0 0 0 12 1446 6676 10 0 90 0 0
4 0 29236 1438384 557296 12026564 0 0 0 3 1350 4046 9 0 91 0 0
2 0 29236 1438644 557296 12026568 0 0 0 144 1429 3761 8 0 92 0 0
1 0 29236 1438776 557296 12026568 0 0 0 1 1285 5099 8 0 91 0 0
1 0 29236 1438760 557296 12026568 0 0 0 853 1491 6029 10 0 90 0 0
1 0 29236 1438752 557296 12026572 0 0 0 0 1398 6067 8 0 91 0 0
1 0 29236 1438864 557296 12026572 0 0 0 148 1318 4662 8 0 92 0 0
1 0 29236 1439104 557296 12026572 0 0 0 0 1493 5119 10 0 90 0 0
1 0 29236 1438116 557296 12026576 0 0 0 20 1382 3487 8 0 92 0 0
1 0 29236 1438260 557296 12026576 0 0 0 139 1371 7088 10 0 90 0 0
4 0 29236 1438428 557296 12026580 0 0 0 0 1360 3528 9 0 91 0 0
1 0 29236 1438200 557296 12026580 0 0 0 597 1418 3866 8 0 92 0 0
2 0 29236 1438368 557296 12026580 0 0 0 0 1366 7512 10 0 90 0 0
1 0 29236 1438732 557300 12026584 0 0 0 47 1358 4756 9 0 91 0 0
1 0 29236 1438412 557300 12026584 0 0 0 32 1399 6982 10 0 90 0 0
1 0 29236 1438104 557300 12026584 0 0 0 609 1358 5713 8 0 91 0 0
1 0 29236 1438580 557300 12026588 0 0 0 11 1375 6874 9 0 91 0 0
1 0 29236 1438644 557300 12026588 0 0 0 176 1338 6014 10 0 90 0 0
1 0 29236 1438576 557300 12026592 0 0 0 5 1349 3276 8 0 92 0 0
4 0 29236 1438128 557300 12026592 0 0 0 601 2145 28840 17 1 82 0 0
1 0 29236 1438128 557300 12026592 0 0 0 0 1409 5143 9 0 91 0 0
1 0 29236 1438248 557300 12026596 0 0 0 20 1411 7039 10 0 90 0 0
1 0 29236 1438256 557308 12026596 0 0 0 5 1271 7634 9 0 91 0 0

and the "VM Thread" is very busing, take up ~90% one cpu time.




> 1) when there are only 2 client threads, vmstat output is
> procs -----------memory---------- ---swap-- -----io---- --system--
> -----cpu------
> r b swpd free buff cache si so bi bo in cs us sy id wa st
> 0 0 29236 1610000 557488 12027072 0 0 19 53 0 0 3 1 97 0 0
> 0 0 29236 1610152 557488 12027076 0 0 0 0 4630 23177 2 1 97 0 0
> 0 0 29236 1610484 557488 12027076 0 0 0 603 4604 22758 2 1 97 0 0
> 0 0 29236 1610544 557488 12027076 0 0 0 11 4643 23009 2 1 97 0 0
> 0 0 29236 1610912 557488 12027080 0 0 0 5 4578 22900 2 1 97 0 0
> 3 0 29236 1610600 557488 12027080 0 0 0 21 4627 22922 2 1 97 0 0
> 1 0 29236 1610616 557488 12027080 0 0 0 0 4552 23077 2 1 97 0 0
> 0 0 29236 1610992 557488 12027084 0 0 0 608 4608 22995 2 1 97 0 0
> 0 0 29236 1611900 557488 12027084 0 0 0 144 4641 23000 2 1 97 0 0
> 0 0 29236 1612040 557488 12027084 0 0 0 0 4621 22952 2 1 97 0 0
> 0 0 29236 1611428 557488 12027088 0 0 0 593 4668 23133 2 1 96 0 0
> 1 0 29236 1611812 557488 12027088 0 0 0 0 4623 23673 2 1 97 0 0
> 0 0 29236 1612068 557488 12027088 0 0 0 145 4627 23245 2 1 97 0 0
> 1 0 29236 1612440 557488 12027092 0 0 0 1 4654 23351 2 1 97 0 0
> 1 0 29236 1612120 557488 12027092 0 0 0 131 4695 23137 4 1 95 0 0
> 2 0 29236 1612492 557492 12027096 0 0 0 39 4647 22818 2 1 97 0 0
> 1 0 29236 1610164 557492 12027096 0 0 0 633 4666 22914 2 1 97 0 0
> 0 0 29236 1610380 557492 12027096 0 0 0 73 4668 22957 2 1 97 0 0
> 1 0 29236 1610420 557492 12027100 0 0 0 47 4642 22907 2 1 97 0 0
> 0 0 29236 1611596 557492 12027100 0 0 0 0 4664 22999 2 1 97 0 0
> 1 0 29236 1612236 557492 12027084 0 0 0 657 4665 22552 2 1 97 0 0
> 1 0 29236 1612732 557492 12027088 0 0 0 0 4586 22640 2 1 97 0 0
> 0 0 29236 1612504 557492 12027088 0 0 0 33 4605 22670 2 1 97 0 0
> 2 0 29236 1611500 557492 12027092 0 0 0 161 4575 22763 2 1 97 0 0
> 0 0 29236 1612160 557500 12027092 0 0 0 5 4570 23265 2 1 97 0 0
> 0 0 29236 1612568 557500 12027092 0 0 0 592 4616 23285 2 1 97 0 0
> 0 0 29236 1612700 557500 12027096 0 0 0 0 4688 23754 3 1 96 0 0
> 1 0 29236 1612192 557504 12027092 0 0 0 27 4649 23501 2 1 97 0 0
> 1 0 29236 1611748 557504 12027104 0 0 0 12 4612 23664 2 1 97 0 0
> 0 0 29236 1611932 557504 12027116 0 0 0 768 4606 22910 2 1 97 0 0
> 2 0 29236 1611788 557504 12027116 0 0 0 25 4545 22991 2 1 97 0 0
> 1 0 29236 1611916 557504 12027120 0 0 0 0 4615 23138 2 1 97 0 0
> 0 0 29236 1612176 557504 12027120 0 0 0 139 4604 23231 2 1 97 0 0
> 0 0 29236 1612848 557504 12027108 0 0 0 0 4624 23673 2 1 97 0 0
> 0 0 29236 1612820 557504 12027108 0 0 0 616 4657 23247 2 1 97 0 0
> 0 0 29236 1613196 557508 12027108 0 0 0 107 4581 23193 2 1 97 0 0
> 1 0 29236 1611772 557508 12027112 0 0 0 185 4594 22807 2 1 97 0 0
> 2 0 29236 1610432 557508 12027116 0 0 0 1 4603 23395 2 1 97 0 0
> 2 0 29236 1610348 557512 12027116 0 0 0 653 4724 23562 2 1 97 0 0
> 0 0 29236 1610612 557512 12027132 0 0 0 0 4621 23533 2 1 97 0 0
> 1 0 29236 1612368 557512 12027116 0 0 0 77 4609 23223 2 1 97 0 0
> 0 0 29236 1612628 557512 12027120 0 0 0 0 4571 22697 2 1 97 0 0
> 1 0 29236 1610908 557512 12027120 0 0 0 19 4585 22614 2 1 97 0 0
> 1 0 29236 1610020 557512 12027124 0 0 0 629 4656 23382 2 1 97 0 0
>
>
> 2) up client threads number to 8, vmstat output is
>
>
> procs -----------memory---------- ---swap-- -----io---- --system--
> -----cpu------
> r b swpd free buff cache si so bi bo in cs us sy id wa st
> 0 0 29236 1610000 557488 12027072 0 0 19 53 0 0 3 1 97 0 0
> 0 0 29236 1610152 557488 12027076 0 0 0 0 4630 23177 2 1 97 0 0
> 0 0 29236 1610484 557488 12027076 0 0 0 603 4604 22758 2 1 97 0 0
> 0 0 29236 1610544 557488 12027076 0 0 0 11 4643 23009 2 1 97 0 0
> 0 0 29236 1610912 557488 12027080 0 0 0 5 4578 22900 2 1 97 0 0
> 3 0 29236 1610600 557488 12027080 0 0 0 21 4627 22922 2 1 97 0 0
> 1 0 29236 1610616 557488 12027080 0 0 0 0 4552 23077 2 1 97 0 0
> 0 0 29236 1610992 557488 12027084 0 0 0 608 4608 22995 2 1 97 0 0
> 0 0 29236 1611900 557488 12027084 0 0 0 144 4641 23000 2 1 97 0 0
> 0 0 29236 1612040 557488 12027084 0 0 0 0 4621 22952 2 1 97 0 0
> 0 0 29236 1611428 557488 12027088 0 0 0 593 4668 23133 2 1 96 0 0
> 1 0 29236 1611812 557488 12027088 0 0 0 0 4623 23673 2 1 97 0 0
> 0 0 29236 1612068 557488 12027088 0 0 0 145 4627 23245 2 1 97 0 0
> 1 0 29236 1612440 557488 12027092 0 0 0 1 4654 23351 2 1 97 0 0
> 1 0 29236 1612120 557488 12027092 0 0 0 131 4695 23137 4 1 95 0 0
> 2 0 29236 1612492 557492 12027096 0 0 0 39 4647 22818 2 1 97 0 0
> 1 0 29236 1610164 557492 12027096 0 0 0 633 4666 22914 2 1 97 0 0
> 0 0 29236 1610380 557492 12027096 0 0 0 73 4668 22957 2 1 97 0 0
> 1 0 29236 1610420 557492 12027100 0 0 0 47 4642 22907 2 1 97 0 0
> 0 0 29236 1611596 557492 12027100 0 0 0 0 4664 22999 2 1 97 0 0
> 1 0 29236 1612236 557492 12027084 0 0 0 657 4665 22552 2 1 97 0 0
> 1 0 29236 1612732 557492 12027088 0 0 0 0 4586 22640 2 1 97 0 0
> 0 0 29236 1612504 557492 12027088 0 0 0 33 4605 22670 2 1 97 0 0
> 2 0 29236 1611500 557492 12027092 0 0 0 161 4575 22763 2 1 97 0 0
> 0 0 29236 1612160 557500 12027092 0 0 0 5 4570 23265 2 1 97 0 0
> 0 0 29236 1612568 557500 12027092 0 0 0 592 4616 23285 2 1 97 0 0
> 0 0 29236 1612700 557500 12027096 0 0 0 0 4688 23754 3 1 96 0 0
> 1 0 29236 1612192 557504 12027092 0 0 0 27 4649 23501 2 1 97 0 0
> 1 0 29236 1611748 557504 12027104 0 0 0 12 4612 23664 2 1 97 0 0
> 0 0 29236 1611932 557504 12027116 0 0 0 768 4606 22910 2 1 97 0 0
> 2 0 29236 1611788 557504 12027116 0 0 0 25 4545 22991 2 1 97 0 0
> 1 0 29236 1611916 557504 12027120 0 0 0 0 4615 23138 2 1 97 0 0
> 0 0 29236 1612176 557504 12027120 0 0 0 139 4604 23231 2 1 97 0 0
> 0 0 29236 1612848 557504 12027108 0 0 0 0 4624 23673 2 1 97 0 0
> 0 0 29236 1612820 557504 12027108 0 0 0 616 4657 23247 2 1 97 0 0
> 0 0 29236 1613196 557508 12027108 0 0 0 107 4581 23193 2 1 97 0 0
> 1 0 29236 1611772 557508 12027112 0 0 0 185 4594 22807 2 1 97 0 0
> 2 0 29236 1610432 557508 12027116 0 0 0 1 4603 23395 2 1 97 0 0
> 2 0 29236 1610348 557512 12027116 0 0 0 653 4724 23562 2 1 97 0 0
> 0 0 29236 1610612 557512 12027132 0 0 0 0 4621 23533 2 1 97 0 0
> 1 0 29236 1612368 557512 12027116 0 0 0 77 4609 23223 2 1 97 0 0
> 0 0 29236 1612628 557512 12027120 0 0 0 0 4571 22697 2 1 97 0 0
> 1 0 29236 1610908 557512 12027120 0 0 0 19 4585 22614 2 1 97 0 0
> 1 0 29236 1610020 557512 12027124 0 0 0 629 4656 23382 2 1 97 0 0
>
> and the "VM Thread" is very busing, take up ~90% one cpu time.
>
>
>
>
> On Thu, Feb 24, 2011 at 10:55 AM, Schubert Zhang <zs...@gmail.com>wrote:
>
>> Now, I am trying the 0.90.1, but this issue is still there.
>>
>> I attach the jstack output. Coud you please help me analyze it.
>>
>> Seems all the 8 client threads are doing metaScan!
>>
>> On Sat, Jan 29, 2011 at 1:02 AM, Stack <st...@duboce.net> wrote:
>>
>>> On Thu, Jan 27, 2011 at 10:33 PM, Schubert Zhang <zs...@gmail.com>
>>> wrote:
>>> > 1. The .META. table seems ok
>>> >     I can read my data table (one thread for reading).
>>> >     I can use hbase shell to scan my data table.
>>> >     And I can use 1~4 threads to put more data into my data table.
>>> >
>>>
>>> Good.  This would seem to say that .META. is not locked out (You are
>>> doing these scans while your 8+client process is hung?).
>>>
>>>
>>> >    Before this issue happen, about 800 millions entities (column) have
>>> been
>>> > put into the table successfully, and there are 253 regions for this
>>> table.
>>> >
>>>
>>>
>>> So, you were running fine with 8+ clients until you hit the 800million
>>> entries?
>>>
>>>
>>> > 3. All clients use HBaseConfiguration.create() for a new Configuration
>>> > instance.
>>> >
>>>
>>> Do you do this for each new instance of HTable or do you pass them all
>>> the same Configuration instance?
>>>
>>>
>>> > 4. The 8+ client threads running on a single machine and a single JVM.
>>> >
>>>
>>> How many instances of this process?  One or many?
>>>
>>>
>>> > 5. Seems all 8+ threads are blocked in same location waiting on call to
>>> > return.
>>> >
>>>
>>> If you want to paste a thread dump of your client, some one of us will
>>> give it a gander.
>>>
>>> St.Ack
>>>
>>
>>
>
>
> --
> Best Regards
> Anty Rao
>



-- 
Best Regards
Anty Rao

Re: HBase 0.90.0 cannot put more data after running for hours

Posted by Anty <an...@gmail.com>.
1) When there are only 2 client threads, the vmstat output is:
procs -----------memory---------- ---swap-- -----io---- --system--
-----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 29236 1610000 557488 12027072 0 0 19 53 0 0 3 1 97 0 0
0 0 29236 1610152 557488 12027076 0 0 0 0 4630 23177 2 1 97 0 0
0 0 29236 1610484 557488 12027076 0 0 0 603 4604 22758 2 1 97 0 0
0 0 29236 1610544 557488 12027076 0 0 0 11 4643 23009 2 1 97 0 0
0 0 29236 1610912 557488 12027080 0 0 0 5 4578 22900 2 1 97 0 0
3 0 29236 1610600 557488 12027080 0 0 0 21 4627 22922 2 1 97 0 0
1 0 29236 1610616 557488 12027080 0 0 0 0 4552 23077 2 1 97 0 0
0 0 29236 1610992 557488 12027084 0 0 0 608 4608 22995 2 1 97 0 0
0 0 29236 1611900 557488 12027084 0 0 0 144 4641 23000 2 1 97 0 0
0 0 29236 1612040 557488 12027084 0 0 0 0 4621 22952 2 1 97 0 0
0 0 29236 1611428 557488 12027088 0 0 0 593 4668 23133 2 1 96 0 0
1 0 29236 1611812 557488 12027088 0 0 0 0 4623 23673 2 1 97 0 0
0 0 29236 1612068 557488 12027088 0 0 0 145 4627 23245 2 1 97 0 0
1 0 29236 1612440 557488 12027092 0 0 0 1 4654 23351 2 1 97 0 0
1 0 29236 1612120 557488 12027092 0 0 0 131 4695 23137 4 1 95 0 0
2 0 29236 1612492 557492 12027096 0 0 0 39 4647 22818 2 1 97 0 0
1 0 29236 1610164 557492 12027096 0 0 0 633 4666 22914 2 1 97 0 0
0 0 29236 1610380 557492 12027096 0 0 0 73 4668 22957 2 1 97 0 0
1 0 29236 1610420 557492 12027100 0 0 0 47 4642 22907 2 1 97 0 0
0 0 29236 1611596 557492 12027100 0 0 0 0 4664 22999 2 1 97 0 0
1 0 29236 1612236 557492 12027084 0 0 0 657 4665 22552 2 1 97 0 0
1 0 29236 1612732 557492 12027088 0 0 0 0 4586 22640 2 1 97 0 0
0 0 29236 1612504 557492 12027088 0 0 0 33 4605 22670 2 1 97 0 0
2 0 29236 1611500 557492 12027092 0 0 0 161 4575 22763 2 1 97 0 0
0 0 29236 1612160 557500 12027092 0 0 0 5 4570 23265 2 1 97 0 0
0 0 29236 1612568 557500 12027092 0 0 0 592 4616 23285 2 1 97 0 0
0 0 29236 1612700 557500 12027096 0 0 0 0 4688 23754 3 1 96 0 0
1 0 29236 1612192 557504 12027092 0 0 0 27 4649 23501 2 1 97 0 0
1 0 29236 1611748 557504 12027104 0 0 0 12 4612 23664 2 1 97 0 0
0 0 29236 1611932 557504 12027116 0 0 0 768 4606 22910 2 1 97 0 0
2 0 29236 1611788 557504 12027116 0 0 0 25 4545 22991 2 1 97 0 0
1 0 29236 1611916 557504 12027120 0 0 0 0 4615 23138 2 1 97 0 0
0 0 29236 1612176 557504 12027120 0 0 0 139 4604 23231 2 1 97 0 0
0 0 29236 1612848 557504 12027108 0 0 0 0 4624 23673 2 1 97 0 0
0 0 29236 1612820 557504 12027108 0 0 0 616 4657 23247 2 1 97 0 0
0 0 29236 1613196 557508 12027108 0 0 0 107 4581 23193 2 1 97 0 0
1 0 29236 1611772 557508 12027112 0 0 0 185 4594 22807 2 1 97 0 0
2 0 29236 1610432 557508 12027116 0 0 0 1 4603 23395 2 1 97 0 0
2 0 29236 1610348 557512 12027116 0 0 0 653 4724 23562 2 1 97 0 0
0 0 29236 1610612 557512 12027132 0 0 0 0 4621 23533 2 1 97 0 0
1 0 29236 1612368 557512 12027116 0 0 0 77 4609 23223 2 1 97 0 0
0 0 29236 1612628 557512 12027120 0 0 0 0 4571 22697 2 1 97 0 0
1 0 29236 1610908 557512 12027120 0 0 0 19 4585 22614 2 1 97 0 0
1 0 29236 1610020 557512 12027124 0 0 0 629 4656 23382 2 1 97 0 0


2) With the number of client threads increased to 8, the vmstat output is:


procs -----------memory---------- ---swap-- -----io---- --system--
-----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 29236 1610000 557488 12027072 0 0 19 53 0 0 3 1 97 0 0
0 0 29236 1610152 557488 12027076 0 0 0 0 4630 23177 2 1 97 0 0
0 0 29236 1610484 557488 12027076 0 0 0 603 4604 22758 2 1 97 0 0
0 0 29236 1610544 557488 12027076 0 0 0 11 4643 23009 2 1 97 0 0
0 0 29236 1610912 557488 12027080 0 0 0 5 4578 22900 2 1 97 0 0
3 0 29236 1610600 557488 12027080 0 0 0 21 4627 22922 2 1 97 0 0
1 0 29236 1610616 557488 12027080 0 0 0 0 4552 23077 2 1 97 0 0
0 0 29236 1610992 557488 12027084 0 0 0 608 4608 22995 2 1 97 0 0
0 0 29236 1611900 557488 12027084 0 0 0 144 4641 23000 2 1 97 0 0
0 0 29236 1612040 557488 12027084 0 0 0 0 4621 22952 2 1 97 0 0
0 0 29236 1611428 557488 12027088 0 0 0 593 4668 23133 2 1 96 0 0
1 0 29236 1611812 557488 12027088 0 0 0 0 4623 23673 2 1 97 0 0
0 0 29236 1612068 557488 12027088 0 0 0 145 4627 23245 2 1 97 0 0
1 0 29236 1612440 557488 12027092 0 0 0 1 4654 23351 2 1 97 0 0
1 0 29236 1612120 557488 12027092 0 0 0 131 4695 23137 4 1 95 0 0
2 0 29236 1612492 557492 12027096 0 0 0 39 4647 22818 2 1 97 0 0
1 0 29236 1610164 557492 12027096 0 0 0 633 4666 22914 2 1 97 0 0
0 0 29236 1610380 557492 12027096 0 0 0 73 4668 22957 2 1 97 0 0
1 0 29236 1610420 557492 12027100 0 0 0 47 4642 22907 2 1 97 0 0
0 0 29236 1611596 557492 12027100 0 0 0 0 4664 22999 2 1 97 0 0
1 0 29236 1612236 557492 12027084 0 0 0 657 4665 22552 2 1 97 0 0
1 0 29236 1612732 557492 12027088 0 0 0 0 4586 22640 2 1 97 0 0
0 0 29236 1612504 557492 12027088 0 0 0 33 4605 22670 2 1 97 0 0
2 0 29236 1611500 557492 12027092 0 0 0 161 4575 22763 2 1 97 0 0
0 0 29236 1612160 557500 12027092 0 0 0 5 4570 23265 2 1 97 0 0
0 0 29236 1612568 557500 12027092 0 0 0 592 4616 23285 2 1 97 0 0
0 0 29236 1612700 557500 12027096 0 0 0 0 4688 23754 3 1 96 0 0
1 0 29236 1612192 557504 12027092 0 0 0 27 4649 23501 2 1 97 0 0
1 0 29236 1611748 557504 12027104 0 0 0 12 4612 23664 2 1 97 0 0
0 0 29236 1611932 557504 12027116 0 0 0 768 4606 22910 2 1 97 0 0
2 0 29236 1611788 557504 12027116 0 0 0 25 4545 22991 2 1 97 0 0
1 0 29236 1611916 557504 12027120 0 0 0 0 4615 23138 2 1 97 0 0
0 0 29236 1612176 557504 12027120 0 0 0 139 4604 23231 2 1 97 0 0
0 0 29236 1612848 557504 12027108 0 0 0 0 4624 23673 2 1 97 0 0
0 0 29236 1612820 557504 12027108 0 0 0 616 4657 23247 2 1 97 0 0
0 0 29236 1613196 557508 12027108 0 0 0 107 4581 23193 2 1 97 0 0
1 0 29236 1611772 557508 12027112 0 0 0 185 4594 22807 2 1 97 0 0
2 0 29236 1610432 557508 12027116 0 0 0 1 4603 23395 2 1 97 0 0
2 0 29236 1610348 557512 12027116 0 0 0 653 4724 23562 2 1 97 0 0
0 0 29236 1610612 557512 12027132 0 0 0 0 4621 23533 2 1 97 0 0
1 0 29236 1612368 557512 12027116 0 0 0 77 4609 23223 2 1 97 0 0
0 0 29236 1612628 557512 12027120 0 0 0 0 4571 22697 2 1 97 0 0
1 0 29236 1610908 557512 12027120 0 0 0 19 4585 22614 2 1 97 0 0
1 0 29236 1610020 557512 12027124 0 0 0 629 4656 23382 2 1 97 0 0

and the "VM Thread" is very busing, take up ~90% one cpu time.




On Thu, Feb 24, 2011 at 10:55 AM, Schubert Zhang <zs...@gmail.com> wrote:

> Now, I am trying the 0.90.1, but this issue is still there.
>
> I attach the jstack output. Coud you please help me analyze it.
>
> Seems all the 8 client threads are doing metaScan!
>
> On Sat, Jan 29, 2011 at 1:02 AM, Stack <st...@duboce.net> wrote:
>
>> On Thu, Jan 27, 2011 at 10:33 PM, Schubert Zhang <zs...@gmail.com>
>> wrote:
>> > 1. The .META. table seems ok
>> >     I can read my data table (one thread for reading).
>> >     I can use hbase shell to scan my data table.
>> >     And I can use 1~4 threads to put more data into my data table.
>> >
>>
>> Good.  This would seem to say that .META. is not locked out (You are
>> doing these scans while your 8+client process is hung?).
>>
>>
>> >    Before this issue happen, about 800 millions entities (column) have
>> been
>> > put into the table successfully, and there are 253 regions for this
>> table.
>> >
>>
>>
>> So, you were running fine with 8+ clients until you hit the 800million
>> entries?
>>
>>
>> > 3. All clients use HBaseConfiguration.create() for a new Configuration
>> > instance.
>> >
>>
>> Do you do this for each new instance of HTable or do you pass them all
>> the same Configuration instance?
>>
>>
>> > 4. The 8+ client threads running on a single machine and a single JVM.
>> >
>>
>> How many instances of this process?  One or many?
>>
>>
>> > 5. Seems all 8+ threads are blocked in same location waiting on call to
>> > return.
>> >
>>
>> If you want to paste a thread dump of your client, some one of us will
>> give it a gander.
>>
>> St.Ack
>>
>
>


-- 
Best Regards
Anty Rao

Re: HBase 0.90.0 cannot put more data after running for hours

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Oh I see, I was looking at the jstack and thought that you must have a
ton of column families, and you confirm it.

The fact that we store the table schema with EVERY meta row isn't
usually such a bad thing, but in your case I guess it's becoming huge
and it's taking a long time to deserialize!

I think you should review your schema to use at most a handful of families.
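
For a purely illustrative sketch of what "a handful of families" can look
like (table name, family name and qualifiers below are made up, not the
poster's actual schema): keep one family and move the day-of-year into the
column qualifier instead of using 366 per-day families.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SingleFamilySchema {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();

            // One small family "d" instead of 366 per-day families, so the
            // schema carried in every .META. row stays tiny.
            HTableDescriptor desc = new HTableDescriptor("events");
            desc.addFamily(new HColumnDescriptor("d"));
            new HBaseAdmin(conf).createTable(desc);

            // The day-of-year moves into the column qualifier, e.g. "d035:temp".
            HTable table = new HTable(conf, "events");
            Put put = new Put(Bytes.toBytes("entity-0001"));
            put.add(Bytes.toBytes("d"), Bytes.toBytes("d035:temp"),
                    Bytes.toBytes("about 200 bytes of payload"));
            table.put(put);
            table.close();
        }
    }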

> Seems the client-side metaCache (for region infos) is not work, and then
> every submit of puts will do metaScan.

My guess is that you're splitting a lot since you're inserting a lot
of data. If you still wish to continue with your current schema,
pre-splitting the table would probably help a lot (check out HBaseAdmin).
Also, the prefetching of .META. rows is killing your client performance,
so set hbase.client.prefetch.limit to 1 or 2 instead of the
default of 10.
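
A rough sketch of both of those knobs (the table name and split keys are
placeholders; real split keys depend on your row-key distribution):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PresplitAndPrefetch {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            // Prefetch only a couple of .META. rows per region lookup
            // instead of the default 10.
            conf.setInt("hbase.client.prefetch.limit", 2);

            // Create the table pre-split so early heavy writes do not have
            // to wait for a burst of region splits.
            byte[][] splits = new byte[][] {
                    Bytes.toBytes("20000000"),
                    Bytes.toBytes("40000000"),
                    Bytes.toBytes("60000000"),
                    Bytes.toBytes("80000000")
            };
            HTableDescriptor desc = new HTableDescriptor("events");
            desc.addFamily(new HColumnDescriptor("d"));
            new HBaseAdmin(conf).createTable(desc, splits);
        }
    }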

J-D

On Thu, Feb 24, 2011 at 12:37 AM, Schubert Zhang <zs...@gmail.com> wrote:
> New clues:
>
> Seems the client-side metaCache (for region infos) is not work, and then
> every submit of puts will do metaScan.
> The specific of my test is:
>
> The table have many column family (366 cfs for every day of a year), but
> only one column family is active now for writing data, so the memory usage
> for memstore is ok.
>
> Then, when do metaScan for regioninfos, the code will run into large loop to
> get and deserialize every column family info.
>
> When the number of regions increase (64 in my test), the loop will be 366*64
> for each put submit. Then the client thread become very busy.
>
>
> Now, we should determine why to do metaScan for each submit of puts.
>
>
> On Thu, Feb 24, 2011 at 11:53 AM, Schubert Zhang <zs...@gmail.com> wrote:
>
>> Currently, with 0.90.1, this issue happen when there is only 8 regions in
>> each RS, and totally 64 regions in all totally 8 RS.
>>
>> Ths CPU% of the client is very high.
>>
>>   On Thu, Feb 24, 2011 at 10:55 AM, Schubert Zhang <zs...@gmail.com>wrote:
>>
>>> Now, I am trying the 0.90.1, but this issue is still there.
>>>
>>> I attach the jstack output. Coud you please help me analyze it.
>>>
>>> Seems all the 8 client threads are doing metaScan!
>>>
>>>   On Sat, Jan 29, 2011 at 1:02 AM, Stack <st...@duboce.net> wrote:
>>>
>>>> On Thu, Jan 27, 2011 at 10:33 PM, Schubert Zhang <zs...@gmail.com>
>>>> wrote:
>>>> > 1. The .META. table seems ok
>>>> >     I can read my data table (one thread for reading).
>>>> >     I can use hbase shell to scan my data table.
>>>> >     And I can use 1~4 threads to put more data into my data table.
>>>> >
>>>>
>>>> Good.  This would seem to say that .META. is not locked out (You are
>>>> doing these scans while your 8+client process is hung?).
>>>>
>>>>
>>>> >    Before this issue happen, about 800 millions entities (column) have
>>>> been
>>>> > put into the table successfully, and there are 253 regions for this
>>>> table.
>>>> >
>>>>
>>>>
>>>> So, you were running fine with 8+ clients until you hit the 800million
>>>> entries?
>>>>
>>>>
>>>> > 3. All clients use HBaseConfiguration.create() for a new Configuration
>>>> > instance.
>>>> >
>>>>
>>>> Do you do this for each new instance of HTable or do you pass them all
>>>> the same Configuration instance?
>>>>
>>>>
>>>> > 4. The 8+ client threads running on a single machine and a single JVM.
>>>> >
>>>>
>>>> How many instances of this process?  One or many?
>>>>
>>>>
>>>> > 5. Seems all 8+ threads are blocked in same location waiting on call to
>>>> > return.
>>>> >
>>>>
>>>> If you want to paste a thread dump of your client, some one of us will
>>>> give it a gander.
>>>>
>>>> St.Ack
>>>>
>>>
>>>
>>
>

Re: HBase 0.90.0 cannot put more data after running for hours

Posted by Schubert Zhang <zs...@gmail.com>.
New clues:

It seems the client-side meta cache (for region infos) is not working, so
every submit of puts does a metaScan.
The specifics of my test are:

The table has many column families (366 CFs, one for every day of a year), but
only one column family is currently active for writing data, so the memory
usage for the memstores is OK.

Then, when doing a metaScan for region infos, the code runs a large loop to
fetch and deserialize every column family descriptor.

When the number of regions increases (64 in my test), the loop becomes 366*64
iterations for each put submit. Then the client threads become very busy.


Now, we should determine why a metaScan is done for each submit of puts.


On Thu, Feb 24, 2011 at 11:53 AM, Schubert Zhang <zs...@gmail.com> wrote:

> Currently, with 0.90.1, this issue happen when there is only 8 regions in
> each RS, and totally 64 regions in all totally 8 RS.
>
> Ths CPU% of the client is very high.
>
>   On Thu, Feb 24, 2011 at 10:55 AM, Schubert Zhang <zs...@gmail.com>wrote:
>
>> Now, I am trying the 0.90.1, but this issue is still there.
>>
>> I attach the jstack output. Coud you please help me analyze it.
>>
>> Seems all the 8 client threads are doing metaScan!
>>
>>   On Sat, Jan 29, 2011 at 1:02 AM, Stack <st...@duboce.net> wrote:
>>
>>> On Thu, Jan 27, 2011 at 10:33 PM, Schubert Zhang <zs...@gmail.com>
>>> wrote:
>>> > 1. The .META. table seems ok
>>> >     I can read my data table (one thread for reading).
>>> >     I can use hbase shell to scan my data table.
>>> >     And I can use 1~4 threads to put more data into my data table.
>>> >
>>>
>>> Good.  This would seem to say that .META. is not locked out (You are
>>> doing these scans while your 8+client process is hung?).
>>>
>>>
>>> >    Before this issue happen, about 800 millions entities (column) have
>>> been
>>> > put into the table successfully, and there are 253 regions for this
>>> table.
>>> >
>>>
>>>
>>> So, you were running fine with 8+ clients until you hit the 800million
>>> entries?
>>>
>>>
>>> > 3. All clients use HBaseConfiguration.create() for a new Configuration
>>> > instance.
>>> >
>>>
>>> Do you do this for each new instance of HTable or do you pass them all
>>> the same Configuration instance?
>>>
>>>
>>> > 4. The 8+ client threads running on a single machine and a single JVM.
>>> >
>>>
>>> How many instances of this process?  One or many?
>>>
>>>
>>> > 5. Seems all 8+ threads are blocked in same location waiting on call to
>>> > return.
>>> >
>>>
>>> If you want to paste a thread dump of your client, some one of us will
>>> give it a gander.
>>>
>>> St.Ack
>>>
>>
>>
>

Re: HBase 0.90.0 cannot put more data after running for hours

Posted by Schubert Zhang <zs...@gmail.com>.
Currently, with 0.90.1, this issue happens when there are only 8 regions in
each RS, and 64 regions in total across the 8 RSs.

The CPU% of the client is very high.

On Thu, Feb 24, 2011 at 10:55 AM, Schubert Zhang <zs...@gmail.com> wrote:

> Now, I am trying the 0.90.1, but this issue is still there.
>
> I attach the jstack output. Coud you please help me analyze it.
>
> Seems all the 8 client threads are doing metaScan!
>
>   On Sat, Jan 29, 2011 at 1:02 AM, Stack <st...@duboce.net> wrote:
>
>> On Thu, Jan 27, 2011 at 10:33 PM, Schubert Zhang <zs...@gmail.com>
>> wrote:
>> > 1. The .META. table seems ok
>> >     I can read my data table (one thread for reading).
>> >     I can use hbase shell to scan my data table.
>> >     And I can use 1~4 threads to put more data into my data table.
>> >
>>
>> Good.  This would seem to say that .META. is not locked out (You are
>> doing these scans while your 8+client process is hung?).
>>
>>
>> >    Before this issue happen, about 800 millions entities (column) have
>> been
>> > put into the table successfully, and there are 253 regions for this
>> table.
>> >
>>
>>
>> So, you were running fine with 8+ clients until you hit the 800million
>> entries?
>>
>>
>> > 3. All clients use HBaseConfiguration.create() for a new Configuration
>> > instance.
>> >
>>
>> Do you do this for each new instance of HTable or do you pass them all
>> the same Configuration instance?
>>
>>
>> > 4. The 8+ client threads running on a single machine and a single JVM.
>> >
>>
>> How many instances of this process?  One or many?
>>
>>
>> > 5. Seems all 8+ threads are blocked in same location waiting on call to
>> > return.
>> >
>>
>> If you want to paste a thread dump of your client, some one of us will
>> give it a gander.
>>
>> St.Ack
>>
>
>

Re: HBase 0.90.0 cannot put more data after running for hours

Posted by Schubert Zhang <zs...@gmail.com>.
Now I am trying 0.90.1, but this issue is still there.

I attach the jstack output. Could you please help me analyze it?

It seems all 8 client threads are doing a metaScan!

On Sat, Jan 29, 2011 at 1:02 AM, Stack <st...@duboce.net> wrote:

> On Thu, Jan 27, 2011 at 10:33 PM, Schubert Zhang <zs...@gmail.com>
> wrote:
> > 1. The .META. table seems ok
> >     I can read my data table (one thread for reading).
> >     I can use hbase shell to scan my data table.
> >     And I can use 1~4 threads to put more data into my data table.
> >
>
> Good.  This would seem to say that .META. is not locked out (You are
> doing these scans while your 8+client process is hung?).
>
>
> >    Before this issue happen, about 800 millions entities (column) have
> been
> > put into the table successfully, and there are 253 regions for this
> table.
> >
>
>
> So, you were running fine with 8+ clients until you hit the 800million
> entries?
>
>
> > 3. All clients use HBaseConfiguration.create() for a new Configuration
> > instance.
> >
>
> Do you do this for each new instance of HTable or do you pass them all
> the same Configuration instance?
>
>
> > 4. The 8+ client threads running on a single machine and a single JVM.
> >
>
> How many instances of this process?  One or many?
>
>
> > 5. Seems all 8+ threads are blocked in same location waiting on call to
> > return.
> >
>
> If you want to paste a thread dump of your client, some one of us will
> give it a gander.
>
> St.Ack
>

Re: HBase 0.90.0 cannot put more data after running for hours

Posted by Stack <st...@duboce.net>.
On Wed, Feb 23, 2011 at 7:48 PM, Schubert Zhang <zs...@gmail.com> wrote:
>> > 3. All clients use HBaseConfiguration.create() for a new Configuration
>> > instance.
>> >
>>
>> Do you do this for each new instance of HTable or do you pass them all
>> the same Configuration instance?
>
>
> Every client thread use HBaseConfiguration.create() to create a new
> Configuration and use it to new HTable.
>

8 threads total.  Are these 8 threads long-lived, or are you
continuously creating them?


St.Ack

Re: HBase 0.90.0 cannot put more data after running for hours

Posted by Schubert Zhang <zs...@gmail.com>.
On Sat, Jan 29, 2011 at 1:02 AM, Stack <st...@duboce.net> wrote:

> On Thu, Jan 27, 2011 at 10:33 PM, Schubert Zhang <zs...@gmail.com>
> wrote:
> > 1. The .META. table seems ok
> >     I can read my data table (one thread for reading).
> >     I can use hbase shell to scan my data table.
> >     And I can use 1~4 threads to put more data into my data table.
> >
>
> Good.  This would seem to say that .META. is not locked out (You are
> doing these scans while your 8+client process is hung?).
>

Yes, write and read at the same time.

>
>
> >    Before this issue happen, about 800 millions entities (column) have
> been
> > put into the table successfully, and there are 253 regions for this
> table.
> >
>
>
> So, you were running fine with 8+ clients until you hit the 800million
> entries?


Yes, it was running fine before this issue happened.


>
>
> > 3. All clients use HBaseConfiguration.create() for a new Configuration
> > instance.
> >
>
> Do you do this for each new instance of HTable or do you pass them all
> the same Configuration instance?


Every client thread uses HBaseConfiguration.create() to create a new
Configuration and uses it to create a new HTable.
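
For comparison, a minimal sketch (class and table names are made up) of the
other option Stack mentions: one Configuration shared by all threads, so all
HTable instances reuse a single connection and region-location cache:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SharedConfWriter implements Runnable {
        // One Configuration for the whole process: HTables built from the same
        // Configuration share one HConnection and one region-location cache.
        private static final Configuration CONF = HBaseConfiguration.create();

        public void run() {
            try {
                // HTable itself is not thread-safe, so still one instance per thread.
                HTable table = new HTable(CONF, "events");   // "events" is a placeholder
                Put put = new Put(Bytes.toBytes("entity-0001"));
                put.add(Bytes.toBytes("d"), Bytes.toBytes("q1"), Bytes.toBytes("value"));
                table.put(put);
                table.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        public static void main(String[] args) {
            for (int i = 0; i < 8; i++) {
                new Thread(new SharedConfWriter(), "Thread-Opr" + i).start();
            }
        }
    }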


>
> > 4. The 8+ client threads running on a single machine and a single JVM.
> >
>
> How many instances of this process?  One or many?
>
>

In this process, there are 8 threads, each with its own instance of HTable (i.e. 8
HTable instances in total).



>
> > 5. Seems all 8+ threads are blocked in same location waiting on call to
> > return.
> >
>
> If you want to paste a thread dump of your client, some one of us will
> give it a gander.
>
> St.Ack
>

Re: HBase 0.90.0 cannot put more data after running for hours

Posted by Stack <st...@duboce.net>.
On Thu, Jan 27, 2011 at 10:33 PM, Schubert Zhang <zs...@gmail.com> wrote:
> 1. The .META. table seems ok
>     I can read my data table (one thread for reading).
>     I can use hbase shell to scan my data table.
>     And I can use 1~4 threads to put more data into my data table.
>

Good.  This would seem to say that .META. is not locked out (You are
doing these scans while your 8+ client process is hung?).


>    Before this issue happen, about 800 millions entities (column) have been
> put into the table successfully, and there are 253 regions for this table.
>


So, you were running fine with 8+ clients until you hit the 800 million entries?


> 3. All clients use HBaseConfiguration.create() for a new Configuration
> instance.
>

Do you do this for each new instance of HTable or do you pass them all
the same Configuration instance?


> 4. The 8+ client threads running on a single machine and a single JVM.
>

How many instances of this process?  One or many?


> 5. Seems all 8+ threads are blocked in same location waiting on call to
> return.
>

If you want to paste a thread dump of your client, one of us will
give it a gander.

St.Ack

Re: HBase 0.90.0 cannot put more data after running for hours

Posted by Schubert Zhang <zs...@gmail.com>.
1. The .META. table seems OK.
     I can read my data table (one thread for reading).
     I can use the hbase shell to scan my data table.
     And I can use 1~4 threads to put more data into my data table.

    Before this issue happened, about 800 million entities (columns) had been
put into the table successfully, and there are 253 regions for this table.

2. There are no strange things in the logs (INFO level).
3. All clients use HBaseConfiguration.create() to get a new Configuration
instance.

4. The 8+ client threads run on a single machine in a single JVM.

5. It seems all 8+ threads are blocked in the same location, waiting on a call
to return.

Currently, there are no more clues, and I am digging for more.

On Fri, Jan 28, 2011 at 12:02 PM, Stack <st...@duboce.net> wrote:

> Thats a lookup on the .META. table.  Is the region hosting .META. OK?
> Anything in its logs?  Do your clients share a Configuration instance
> or do you make a new one of these each time you make an HTable?   Your
> threaded client is running on single machine?  Can we see full stack
> trace?  Are all 8+ threads blocked in same location waiting on call to
> return?
>
> St.Ack
>
>
>
> On Wed, Jan 26, 2011 at 9:19 AM, Schubert Zhang <zs...@gmail.com> wrote:
> > The "Thread-Opr0" the client thread to put data into hbase, it is
> waiting.
> >
> > "Thread-Opr0-EventThread" daemon prio=10 tid=0x00002aaafc7a8000 nid=0xe08
> > waiting on condition [0x000000004383f000]
> >   java.lang.Thread.State: WAITING (parking)
> >        at sun.misc.Unsafe.park(Native Method)
> >        - parking to wait for  <0x00002aaab632ae50> (a
> > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> >        at
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
> >        at
> >
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
> >        at
> >
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
> >        at
> > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
> > "Thread-Opr0-SendThread(nd1-rack2-cloud:2181)" daemon prio=10
> > tid=0x00002aaafc7a6800 nid=0xe07 runnable [0x000000004373e000]
> >   java.lang.Thread.State: RUNNABLE
> >        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> >        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
> >        at
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
> >        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
> >        - locked <0x00002aaab6304410> (a sun.nio.ch.Util$1)
> >        - locked <0x00002aaab6304428> (a
> > java.util.Collections$UnmodifiableSet)
> >        - locked <0x00002aaab632abd0> (a sun.nio.ch.EPollSelectorImpl)
> >        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
> >        at
> > org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)
> >
> > "Thread-Opr0" prio=10 tid=0x00002aab0402a000 nid=0xdf2 in Object.wait()
> > [0x000000004262d000]
> >   java.lang.Thread.State: WAITING (on object monitor)
> >        at java.lang.Object.wait(Native Method)
> >        - waiting on <0x00002aaab04302d0> (a
> > org.apache.hadoop.hbase.ipc.HBaseClient$Call)
> >        at java.lang.Object.wait(Object.java:485)
> >        at
> > org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:739)
> >        - locked <0x00002aaab04302d0> (a
> > org.apache.hadoop.hbase.ipc.HBaseClient$Call)
> >        at
> > org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
> >        at $Proxy0.getClosestRowBefore(Unknown Source)
> >        at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:517)
> >        at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:515)
> >        at
> >
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1000)
> >        at
> > org.apache.hadoop.hbase.client.HTable.getRowOrBefore(HTable.java:514)
> >        at
> > org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:133)
> >        at
> > org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:95)
> >        at
> >
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:645)
> >        at
> >
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:699)
> >        - locked <0x00002aaab6294660> (a java.lang.Object)
> >        at
> >
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:590)
> >        at
> >
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1114)
> >        at
> >
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1234)
> >        at
> > org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:819)
> >        at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:675)
> >        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:660)
> >        at com.bigdata.bench.hbase.HBaseWriter$Operator.operateTo(Unknown
> > Source)
> >        at com.bigdata.bench.hbase.HBaseWriter$Operator.run(Unknown
> Source)
> >
> > On Thu, Jan 27, 2011 at 12:06 AM, Schubert Zhang <zs...@gmail.com>
> wrote:
> >
> >> Even though cannot put more data into table, I can read the existing
> data.
> >>
> >> And I stop and re-start the HBase, still cannot put more data.
> >>
> >> hbase(main):031:0> status 'simple'
> >> 8 live servers
> >>     nd5-rack2-cloud:60020 1296057544120
> >>         requests=0, regions=32, usedHeap=130, maxHeap=8973
> >>     nd8-rack2-cloud:60020 1296057544350
> >>         requests=0, regions=31, usedHeap=128, maxHeap=8983
> >>     nd2-rack2-cloud:60020 1296057543346
> >>         requests=0, regions=32, usedHeap=130, maxHeap=8973
> >>     nd3-rack2-cloud:60020 1296057544224
> >>         requests=0, regions=32, usedHeap=133, maxHeap=8973
> >>     nd6-rack2-cloud:60020 1296057544482
> >>         requests=0, regions=32, usedHeap=130, maxHeap=8983
> >>     nd9-rack2-cloud:60020 1296057544565
> >>         requests=174, regions=32, usedHeap=180, maxHeap=8983
> >>     nd7-rack2-cloud:60020 1296057544617
> >>         requests=0, regions=32, usedHeap=126, maxHeap=8983
> >>     nd4-rack2-cloud:60020 1296057544138
> >>         requests=0, regions=32, usedHeap=126, maxHeap=8973
> >> 0 dead servers
> >> Aggregate load: 174, regions: 255
> >>
> >>
> >> On Wed, Jan 26, 2011 at 11:58 PM, Schubert Zhang <zsongbo@gmail.com
> >wrote:
> >>
> >>> I am using 0.90.0 (8 RS + 1Master)
> >>> and the HDFS is CDH3b3
> >>>
> >>> During the first hours of running, I puts many (tens of millions
> entites,
> >>> each about 200 bytes), it worked well.
> >>>
> >>> But then, the client cannot put more data.
> >>>
> >>> I checked all log files of hbase, no abnormal is found, I will continue
> to
> >>> check this issue.
> >>>
> >>> It seems related to ZooKeeper......
> >>>
> >>
> >>
> >
>

Re: HBase 0.90.0 cannot put more data after running for hours

Posted by Stack <st...@duboce.net>.
That's a lookup on the .META. table.  Is the region hosting .META. OK?
Anything in its logs?  Do your clients share a Configuration instance,
or do you make a new one each time you make an HTable?  Is your
threaded client running on a single machine?  Can we see a full stack
trace?  Are all 8+ threads blocked in the same location, waiting on a call
to return?

St.Ack



On Wed, Jan 26, 2011 at 9:19 AM, Schubert Zhang <zs...@gmail.com> wrote:
> The "Thread-Opr0" the client thread to put data into hbase, it is waiting.
>
> "Thread-Opr0-EventThread" daemon prio=10 tid=0x00002aaafc7a8000 nid=0xe08
> waiting on condition [0x000000004383f000]
>   java.lang.Thread.State: WAITING (parking)
>        at sun.misc.Unsafe.park(Native Method)
>        - parking to wait for  <0x00002aaab632ae50> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>        at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>        at
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>        at
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
> "Thread-Opr0-SendThread(nd1-rack2-cloud:2181)" daemon prio=10
> tid=0x00002aaafc7a6800 nid=0xe07 runnable [0x000000004373e000]
>   java.lang.Thread.State: RUNNABLE
>        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
>        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
>        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
>        - locked <0x00002aaab6304410> (a sun.nio.ch.Util$1)
>        - locked <0x00002aaab6304428> (a
> java.util.Collections$UnmodifiableSet)
>        - locked <0x00002aaab632abd0> (a sun.nio.ch.EPollSelectorImpl)
>        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)
>
> "Thread-Opr0" prio=10 tid=0x00002aab0402a000 nid=0xdf2 in Object.wait()
> [0x000000004262d000]
>   java.lang.Thread.State: WAITING (on object monitor)
>        at java.lang.Object.wait(Native Method)
>        - waiting on <0x00002aaab04302d0> (a
> org.apache.hadoop.hbase.ipc.HBaseClient$Call)
>        at java.lang.Object.wait(Object.java:485)
>        at
> org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:739)
>        - locked <0x00002aaab04302d0> (a
> org.apache.hadoop.hbase.ipc.HBaseClient$Call)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>        at $Proxy0.getClosestRowBefore(Unknown Source)
>        at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:517)
>        at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:515)
>        at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1000)
>        at
> org.apache.hadoop.hbase.client.HTable.getRowOrBefore(HTable.java:514)
>        at
> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:133)
>        at
> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:95)
>        at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:645)
>        at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:699)
>        - locked <0x00002aaab6294660> (a java.lang.Object)
>        at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:590)
>        at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1114)
>        at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1234)
>        at
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:819)
>        at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:675)
>        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:660)
>        at com.bigdata.bench.hbase.HBaseWriter$Operator.operateTo(Unknown
> Source)
>        at com.bigdata.bench.hbase.HBaseWriter$Operator.run(Unknown Source)
>
> On Thu, Jan 27, 2011 at 12:06 AM, Schubert Zhang <zs...@gmail.com> wrote:
>
>> Even though cannot put more data into table, I can read the existing data.
>>
>> And I stop and re-start the HBase, still cannot put more data.
>>
>> hbase(main):031:0> status 'simple'
>> 8 live servers
>>     nd5-rack2-cloud:60020 1296057544120
>>         requests=0, regions=32, usedHeap=130, maxHeap=8973
>>     nd8-rack2-cloud:60020 1296057544350
>>         requests=0, regions=31, usedHeap=128, maxHeap=8983
>>     nd2-rack2-cloud:60020 1296057543346
>>         requests=0, regions=32, usedHeap=130, maxHeap=8973
>>     nd3-rack2-cloud:60020 1296057544224
>>         requests=0, regions=32, usedHeap=133, maxHeap=8973
>>     nd6-rack2-cloud:60020 1296057544482
>>         requests=0, regions=32, usedHeap=130, maxHeap=8983
>>     nd9-rack2-cloud:60020 1296057544565
>>         requests=174, regions=32, usedHeap=180, maxHeap=8983
>>     nd7-rack2-cloud:60020 1296057544617
>>         requests=0, regions=32, usedHeap=126, maxHeap=8983
>>     nd4-rack2-cloud:60020 1296057544138
>>         requests=0, regions=32, usedHeap=126, maxHeap=8973
>> 0 dead servers
>> Aggregate load: 174, regions: 255
>>
>>
>> On Wed, Jan 26, 2011 at 11:58 PM, Schubert Zhang <zs...@gmail.com>wrote:
>>
>>> I am using 0.90.0 (8 RS + 1Master)
>>> and the HDFS is CDH3b3
>>>
>>> During the first hours of running, I puts many (tens of millions entites,
>>> each about 200 bytes), it worked well.
>>>
>>> But then, the client cannot put more data.
>>>
>>> I checked all log files of hbase, no abnormal is found, I will continue to
>>> check this issue.
>>>
>>> It seems related to ZooKeeper......
>>>
>>
>>
>

Re: HBase 0.90.0 cannot be put more data after running hours

Posted by Ryan Rawson <ry...@gmail.com>.
why do they hang up?

On Thu, Jan 27, 2011 at 12:25 AM, Schubert Zhang <zs...@gmail.com> wrote:
> The update:
>
> If I start 1 or 2 or 4 client threads (each have a HTable instance), normal.
>
> If I start 8 or more client threads (each have a HTable instance), the put
> operations hang-up.
>
> On Thu, Jan 27, 2011 at 1:19 AM, Schubert Zhang <zs...@gmail.com> wrote:
>
>> The "Thread-Opr0" the client thread to put data into hbase, it is waiting.
>>
>> "Thread-Opr0-EventThread" daemon prio=10 tid=0x00002aaafc7a8000 nid=0xe08
>> waiting on condition [0x000000004383f000]
>>    java.lang.Thread.State: WAITING (parking)
>>         at sun.misc.Unsafe.park(Native Method)
>>         - parking to wait for  <0x00002aaab632ae50> (a
>> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>>         at
>> java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>>         at
>> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>>         at
>> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>>         at
>> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
>> "Thread-Opr0-SendThread(nd1-rack2-cloud:2181)" daemon prio=10
>> tid=0x00002aaafc7a6800 nid=0xe07 runnable [0x000000004373e000]
>>    java.lang.Thread.State: RUNNABLE
>>         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>>         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
>>         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
>>         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
>>         - locked <0x00002aaab6304410> (a sun.nio.ch.Util$1)
>>         - locked <0x00002aaab6304428> (a
>> java.util.Collections$UnmodifiableSet)
>>         - locked <0x00002aaab632abd0> (a sun.nio.ch.EPollSelectorImpl)
>>         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
>>         at
>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)
>>
>> "Thread-Opr0" prio=10 tid=0x00002aab0402a000 nid=0xdf2 in Object.wait()
>> [0x000000004262d000]
>>    java.lang.Thread.State: WAITING (on object monitor)
>>         at java.lang.Object.wait(Native Method)
>>         - waiting on <0x00002aaab04302d0> (a
>> org.apache.hadoop.hbase.ipc.HBaseClient$Call)
>>         at java.lang.Object.wait(Object.java:485)
>>         at
>> org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:739)
>>         - locked <0x00002aaab04302d0> (a
>> org.apache.hadoop.hbase.ipc.HBaseClient$Call)
>>         at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>>         at $Proxy0.getClosestRowBefore(Unknown Source)
>>         at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:517)
>>         at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:515)
>>         at
>> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1000)
>>         at
>> org.apache.hadoop.hbase.client.HTable.getRowOrBefore(HTable.java:514)
>>         at
>> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:133)
>>         at
>> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:95)
>>         at
>> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:645)
>>         at
>> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:699)
>>         - locked <0x00002aaab6294660> (a java.lang.Object)
>>         at
>> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:590)
>>         at
>> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1114)
>>         at
>> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1234)
>>         at
>> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:819)
>>         at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:675)
>>         at org.apache.hadoop.hbase.client.HTable.put(HTable.java:660)
>>         at com.bigdata.bench.hbase.HBaseWriter$Operator.operateTo(Unknown
>> Source)
>>         at com.bigdata.bench.hbase.HBaseWriter$Operator.run(Unknown Source)
>>
>>   On Thu, Jan 27, 2011 at 12:06 AM, Schubert Zhang <zs...@gmail.com>wrote:
>>
>>> Even though cannot put more data into table, I can read the existing data.
>>>
>>> And I stop and re-start the HBase, still cannot put more data.
>>>
>>> hbase(main):031:0> status 'simple'
>>> 8 live servers
>>>     nd5-rack2-cloud:60020 1296057544120
>>>         requests=0, regions=32, usedHeap=130, maxHeap=8973
>>>     nd8-rack2-cloud:60020 1296057544350
>>>         requests=0, regions=31, usedHeap=128, maxHeap=8983
>>>     nd2-rack2-cloud:60020 1296057543346
>>>         requests=0, regions=32, usedHeap=130, maxHeap=8973
>>>     nd3-rack2-cloud:60020 1296057544224
>>>         requests=0, regions=32, usedHeap=133, maxHeap=8973
>>>     nd6-rack2-cloud:60020 1296057544482
>>>         requests=0, regions=32, usedHeap=130, maxHeap=8983
>>>     nd9-rack2-cloud:60020 1296057544565
>>>         requests=174, regions=32, usedHeap=180, maxHeap=8983
>>>     nd7-rack2-cloud:60020 1296057544617
>>>         requests=0, regions=32, usedHeap=126, maxHeap=8983
>>>     nd4-rack2-cloud:60020 1296057544138
>>>         requests=0, regions=32, usedHeap=126, maxHeap=8973
>>> 0 dead servers
>>> Aggregate load: 174, regions: 255
>>>
>>>
>>> On Wed, Jan 26, 2011 at 11:58 PM, Schubert Zhang <zs...@gmail.com>wrote:
>>>
>>>> I am using 0.90.0 (8 RS + 1Master)
>>>> and the HDFS is CDH3b3
>>>>
>>>> During the first hours of running, I puts many (tens of millions entites,
>>>> each about 200 bytes), it worked well.
>>>>
>>>> But then, the client cannot put more data.
>>>>
>>>> I checked all log files of hbase, no abnormal is found, I will continue
>>>> to check this issue.
>>>>
>>>> It seems related to ZooKeeper......
>>>>
>>>
>>>
>>
>

Re: HBase 0.90.0 cannot be put more data after running hours

Posted by tsuna <ts...@gmail.com>.
On Thu, Jan 27, 2011 at 5:43 PM, Schubert Zhang <zs...@gmail.com> wrote:
> I know the HTable class is not thread-safe, so in my test code, each thread
> new a HTable instance to put data. I think there is no thread-safe issue.

OK, as long as you make sure each instance is isolated in its own
thread, then you're fine; the problem is elsewhere.

-- 
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com

Re: HBase 0.90.0 cannot be put more data after running hours

Posted by Schubert Zhang <zs...@gmail.com>.
Thanks tsuna,

I know the HTable class is not thread-safe, so in my test code each thread
creates its own HTable instance to put data. I think there is no thread-safety issue.

Last night, I used the same test program to test a previous version,
hbase-0.89.20100924+28, and everything was fine.

I am also interested in asynchbase .... Thanks.
On Fri, Jan 28, 2011 at 8:22 AM, tsuna <ts...@gmail.com> wrote:

> On Thu, Jan 27, 2011 at 12:25 AM, Schubert Zhang <zs...@gmail.com>
> wrote:
> > The update:
> >
> > If I start 1 or 2 or 4 client threads (each have a HTable instance),
> normal.
> >
> > If I start 8 or more client threads (each have a HTable instance), the
> put
> > operations hang-up.
>
> Your stack trace seems to indicate that you didn't synchronize on the
> HTable instance before calling put().
> HTable isn't thread-safe, you need some synchronization in there.
> Alternatively, if you don't mind rewriting the part of your code that
> interacts with HTable, you could replace it with asynchbase
> (https://github.com/stumbleupon/asynchbase), an alternative HBase
> client library that was written to be thread-safe from the ground up.
>
> --
> Benoit "tsuna" Sigoure
> Software Engineer @ www.StumbleUpon.com <http://www.stumbleupon.com/>
>

Re: HBase 0.90.0 cannot be put more data after running hours

Posted by tsuna <ts...@gmail.com>.
On Thu, Jan 27, 2011 at 12:25 AM, Schubert Zhang <zs...@gmail.com> wrote:
> The update:
>
> If I start 1 or 2 or 4 client threads (each have a HTable instance), normal.
>
> If I start 8 or more client threads (each have a HTable instance), the put
> operations hang-up.

Your stack trace seems to indicate that you didn't synchronize on the
HTable instance before calling put().
HTable isn't thread-safe, you need some synchronization in there.
Alternatively, if you don't mind rewriting the part of your code that
interacts with HTable, you could replace it with asynchbase
(https://github.com/stumbleupon/asynchbase), an alternative HBase
client library that was written to be thread-safe from the ground up.

-- 
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com
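
(For illustration, a minimal sketch of a put through asynchbase, per its documented
API; the quorum host is taken from the SendThread name in the earlier thread dump and
the table/column names are made up. A single HBaseClient is thread-safe and is meant
to be shared by all threads, unlike HTable.)

    import org.hbase.async.HBaseClient;
    import org.hbase.async.PutRequest;

    public class AsyncPut {
      public static void main(String[] args) throws Exception {
        // One client for the whole application; it is safe to share across threads.
        final HBaseClient client = new HBaseClient("nd1-rack2-cloud");
        PutRequest put = new PutRequest("test_table", "row-1", "f", "q", "value");
        // put() is asynchronous and returns a Deferred; join() blocks until it completes.
        client.put(put).join();
        client.shutdown().join();
      }
    }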

Re: HBase 0.90.0 cannot be put more data after running hours

Posted by Schubert Zhang <zs...@gmail.com>.
The update:

If I start 1, 2, or 4 client threads (each with its own HTable instance), everything is normal.

If I start 8 or more client threads (each with its own HTable instance), the put
operations hang up.
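
(A minimal sketch of the kind of per-thread writer described above, against the 0.90
client API; class, table, and column names are made up. Each thread owns its HTable,
and with auto-flush off the puts are buffered and sent through flushCommits(), which
is the path visible in the stack trace quoted below.)

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseWriterSketch implements Runnable {
      private static final Configuration CONF = HBaseConfiguration.create();
      private final int id;

      HBaseWriterSketch(int id) { this.id = id; }

      public void run() {
        try {
          // One HTable per thread; HTable itself is not thread-safe.
          HTable table = new HTable(CONF, "test_table");
          table.setAutoFlush(false);                  // buffer puts on the client
          table.setWriteBufferSize(2 * 1024 * 1024);  // flush roughly every 2 MB
          for (long i = 0; i < 1000000; i++) {
            Put put = new Put(Bytes.toBytes("row-" + id + "-" + i));
            put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("value"));
            table.put(put);                           // flushCommits() fires when the buffer fills
          }
          table.flushCommits();
          table.close();
        } catch (Exception e) {
          e.printStackTrace();
        }
      }

      public static void main(String[] args) {
        int threads = Integer.parseInt(args[0]);      // 1-4 behaves, 8+ reportedly hangs
        for (int i = 0; i < threads; i++) {
          new Thread(new HBaseWriterSketch(i), "Thread-Opr" + i).start();
        }
      }
    }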

On Thu, Jan 27, 2011 at 1:19 AM, Schubert Zhang <zs...@gmail.com> wrote:

> The "Thread-Opr0" the client thread to put data into hbase, it is waiting.
>
> "Thread-Opr0-EventThread" daemon prio=10 tid=0x00002aaafc7a8000 nid=0xe08
> waiting on condition [0x000000004383f000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00002aaab632ae50> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>         at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>         at
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>         at
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
> "Thread-Opr0-SendThread(nd1-rack2-cloud:2181)" daemon prio=10
> tid=0x00002aaafc7a6800 nid=0xe07 runnable [0x000000004373e000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
>         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
>         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
>         - locked <0x00002aaab6304410> (a sun.nio.ch.Util$1)
>         - locked <0x00002aaab6304428> (a
> java.util.Collections$UnmodifiableSet)
>         - locked <0x00002aaab632abd0> (a sun.nio.ch.EPollSelectorImpl)
>         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)
>
> "Thread-Opr0" prio=10 tid=0x00002aab0402a000 nid=0xdf2 in Object.wait()
> [0x000000004262d000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x00002aaab04302d0> (a
> org.apache.hadoop.hbase.ipc.HBaseClient$Call)
>         at java.lang.Object.wait(Object.java:485)
>         at
> org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:739)
>         - locked <0x00002aaab04302d0> (a
> org.apache.hadoop.hbase.ipc.HBaseClient$Call)
>         at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>         at $Proxy0.getClosestRowBefore(Unknown Source)
>         at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:517)
>         at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:515)
>         at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1000)
>         at
> org.apache.hadoop.hbase.client.HTable.getRowOrBefore(HTable.java:514)
>         at
> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:133)
>         at
> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:95)
>         at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:645)
>         at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:699)
>         - locked <0x00002aaab6294660> (a java.lang.Object)
>         at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:590)
>         at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1114)
>         at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1234)
>         at
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:819)
>         at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:675)
>         at org.apache.hadoop.hbase.client.HTable.put(HTable.java:660)
>         at com.bigdata.bench.hbase.HBaseWriter$Operator.operateTo(Unknown
> Source)
>         at com.bigdata.bench.hbase.HBaseWriter$Operator.run(Unknown Source)
>
>   On Thu, Jan 27, 2011 at 12:06 AM, Schubert Zhang <zs...@gmail.com>wrote:
>
>> Even though cannot put more data into table, I can read the existing data.
>>
>> And I stop and re-start the HBase, still cannot put more data.
>>
>> hbase(main):031:0> status 'simple'
>> 8 live servers
>>     nd5-rack2-cloud:60020 1296057544120
>>         requests=0, regions=32, usedHeap=130, maxHeap=8973
>>     nd8-rack2-cloud:60020 1296057544350
>>         requests=0, regions=31, usedHeap=128, maxHeap=8983
>>     nd2-rack2-cloud:60020 1296057543346
>>         requests=0, regions=32, usedHeap=130, maxHeap=8973
>>     nd3-rack2-cloud:60020 1296057544224
>>         requests=0, regions=32, usedHeap=133, maxHeap=8973
>>     nd6-rack2-cloud:60020 1296057544482
>>         requests=0, regions=32, usedHeap=130, maxHeap=8983
>>     nd9-rack2-cloud:60020 1296057544565
>>         requests=174, regions=32, usedHeap=180, maxHeap=8983
>>     nd7-rack2-cloud:60020 1296057544617
>>         requests=0, regions=32, usedHeap=126, maxHeap=8983
>>     nd4-rack2-cloud:60020 1296057544138
>>         requests=0, regions=32, usedHeap=126, maxHeap=8973
>> 0 dead servers
>> Aggregate load: 174, regions: 255
>>
>>
>> On Wed, Jan 26, 2011 at 11:58 PM, Schubert Zhang <zs...@gmail.com>wrote:
>>
>>> I am using 0.90.0 (8 RS + 1Master)
>>> and the HDFS is CDH3b3
>>>
>>> During the first hours of running, I puts many (tens of millions entites,
>>> each about 200 bytes), it worked well.
>>>
>>> But then, the client cannot put more data.
>>>
>>> I checked all log files of hbase, no abnormal is found, I will continue
>>> to check this issue.
>>>
>>> It seems related to ZooKeeper......
>>>
>>
>>
>

Re: HBase 0.90.0 cannot be put more data after running hours

Posted by Schubert Zhang <zs...@gmail.com>.
The "Thread-Opr0" the client thread to put data into hbase, it is waiting.

"Thread-Opr0-EventThread" daemon prio=10 tid=0x00002aaafc7a8000 nid=0xe08
waiting on condition [0x000000004383f000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00002aaab632ae50> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
        at
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at
org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
"Thread-Opr0-SendThread(nd1-rack2-cloud:2181)" daemon prio=10
tid=0x00002aaafc7a6800 nid=0xe07 runnable [0x000000004373e000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
        - locked <0x00002aaab6304410> (a sun.nio.ch.Util$1)
        - locked <0x00002aaab6304428> (a
java.util.Collections$UnmodifiableSet)
        - locked <0x00002aaab632abd0> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
        at
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)

"Thread-Opr0" prio=10 tid=0x00002aab0402a000 nid=0xdf2 in Object.wait()
[0x000000004262d000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00002aaab04302d0> (a
org.apache.hadoop.hbase.ipc.HBaseClient$Call)
        at java.lang.Object.wait(Object.java:485)
        at
org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:739)
        - locked <0x00002aaab04302d0> (a
org.apache.hadoop.hbase.ipc.HBaseClient$Call)
        at
org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
        at $Proxy0.getClosestRowBefore(Unknown Source)
        at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:517)
        at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:515)
        at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1000)
        at
org.apache.hadoop.hbase.client.HTable.getRowOrBefore(HTable.java:514)
        at
org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:133)
        at
org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:95)
        at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:645)
        at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:699)
        - locked <0x00002aaab6294660> (a java.lang.Object)
        at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:590)
        at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1114)
        at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1234)
        at
org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:819)
        at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:675)
        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:660)
        at com.bigdata.bench.hbase.HBaseWriter$Operator.operateTo(Unknown
Source)
        at com.bigdata.bench.hbase.HBaseWriter$Operator.run(Unknown Source)
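
(As an aside, a dump like the one above can also be captured from inside the client
JVM with the standard java.lang.management API, which is handy when the hang is hard
to catch with an external jstack. Note that ThreadInfo.toString() may truncate very
deep stacks.)

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    public class DumpThreads {
      public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // true, true: include locked monitors and ownable synchronizers so the
        // "- locked <...>" / "- waiting on <...>" lines are reported as well.
        for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
          System.out.print(info);
        }
      }
    }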

On Thu, Jan 27, 2011 at 12:06 AM, Schubert Zhang <zs...@gmail.com> wrote:

> Even though cannot put more data into table, I can read the existing data.
>
> And I stop and re-start the HBase, still cannot put more data.
>
> hbase(main):031:0> status 'simple'
> 8 live servers
>     nd5-rack2-cloud:60020 1296057544120
>         requests=0, regions=32, usedHeap=130, maxHeap=8973
>     nd8-rack2-cloud:60020 1296057544350
>         requests=0, regions=31, usedHeap=128, maxHeap=8983
>     nd2-rack2-cloud:60020 1296057543346
>         requests=0, regions=32, usedHeap=130, maxHeap=8973
>     nd3-rack2-cloud:60020 1296057544224
>         requests=0, regions=32, usedHeap=133, maxHeap=8973
>     nd6-rack2-cloud:60020 1296057544482
>         requests=0, regions=32, usedHeap=130, maxHeap=8983
>     nd9-rack2-cloud:60020 1296057544565
>         requests=174, regions=32, usedHeap=180, maxHeap=8983
>     nd7-rack2-cloud:60020 1296057544617
>         requests=0, regions=32, usedHeap=126, maxHeap=8983
>     nd4-rack2-cloud:60020 1296057544138
>         requests=0, regions=32, usedHeap=126, maxHeap=8973
> 0 dead servers
> Aggregate load: 174, regions: 255
>
>
> On Wed, Jan 26, 2011 at 11:58 PM, Schubert Zhang <zs...@gmail.com>wrote:
>
>> I am using 0.90.0 (8 RS + 1Master)
>> and the HDFS is CDH3b3
>>
>> During the first hours of running, I puts many (tens of millions entites,
>> each about 200 bytes), it worked well.
>>
>> But then, the client cannot put more data.
>>
>> I checked all log files of hbase, no abnormal is found, I will continue to
>> check this issue.
>>
>> It seems related to ZooKeeper......
>>
>
>

Re: HBase 0.90.0 cannot be put more data after running hours

Posted by Schubert Zhang <zs...@gmail.com>.
Even though I cannot put more data into the table, I can read the existing data.

And after I stop and restart HBase, I still cannot put more data.

hbase(main):031:0> status 'simple'
8 live servers
    nd5-rack2-cloud:60020 1296057544120
        requests=0, regions=32, usedHeap=130, maxHeap=8973
    nd8-rack2-cloud:60020 1296057544350
        requests=0, regions=31, usedHeap=128, maxHeap=8983
    nd2-rack2-cloud:60020 1296057543346
        requests=0, regions=32, usedHeap=130, maxHeap=8973
    nd3-rack2-cloud:60020 1296057544224
        requests=0, regions=32, usedHeap=133, maxHeap=8973
    nd6-rack2-cloud:60020 1296057544482
        requests=0, regions=32, usedHeap=130, maxHeap=8983
    nd9-rack2-cloud:60020 1296057544565
        requests=174, regions=32, usedHeap=180, maxHeap=8983
    nd7-rack2-cloud:60020 1296057544617
        requests=0, regions=32, usedHeap=126, maxHeap=8983
    nd4-rack2-cloud:60020 1296057544138
        requests=0, regions=32, usedHeap=126, maxHeap=8973
0 dead servers
Aggregate load: 174, regions: 255
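
(The same summary that the shell's status 'simple' prints can also be read
programmatically; a sketch against the 0.90 client API, with the accessor names I
believe ClusterStatus exposes.)

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.ClusterStatus;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class ClusterSummary {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        ClusterStatus status = admin.getClusterStatus();
        System.out.println(status.getServers() + " live servers, "
            + status.getDeadServers() + " dead servers, "
            + status.getRegionsCount() + " regions");
      }
    }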


On Wed, Jan 26, 2011 at 11:58 PM, Schubert Zhang <zs...@gmail.com> wrote:

> I am using 0.90.0 (8 RS + 1Master)
> and the HDFS is CDH3b3
>
> During the first hours of running, I puts many (tens of millions entites,
> each about 200 bytes), it worked well.
>
> But then, the client cannot put more data.
>
> I checked all log files of hbase, no abnormal is found, I will continue to
> check this issue.
>
> It seems related to ZooKeeper......
>