Posted to user@hbase.apache.org by Schubert Zhang <zs...@gmail.com> on 2011/02/24 03:55:42 UTC

Re: HBase 0.90.0 cannot be put more data after running hours

Now I am trying 0.90.1, but this issue is still there.

I attach the jstack output. Could you please help me analyze it?

It seems all 8 client threads are doing a metaScan!

On Sat, Jan 29, 2011 at 1:02 AM, Stack <st...@duboce.net> wrote:

> On Thu, Jan 27, 2011 at 10:33 PM, Schubert Zhang <zs...@gmail.com>
> wrote:
> > 1. The .META. table seems ok
> >     I can read my data table (one thread for reading).
> >     I can use hbase shell to scan my data table.
> >     And I can use 1~4 threads to put more data into my data table.
> >
>
> Good.  This would seem to say that .META. is not locked out (you are
> doing these scans while your 8+ client process is hung?).
>
>
> >    Before this issue happened, about 800 million entities (columns) had
> been
> > put into the table successfully, and there are 253 regions for this
> table.
> >
>
>
> So, you were running fine with 8+ clients until you hit the 800 million
> entries?
>
>
> > 3. All clients use HBaseConfiguration.create() for a new Configuration
> > instance.
> >
>
> Do you do this for each new instance of HTable or do you pass them all
> the same Configuration instance?
>
>
> > 4. The 8+ client threads running on a single machine and a single JVM.
> >
>
> How many instances of this process?  One or many?
>
>
> > 5. Seems all 8+ threads are blocked in the same location, waiting on a
> > call to return.
> >
>
> If you want to paste a thread dump of your client, one of us will
> give it a gander.
>
> St.Ack
>
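
For reference, a minimal sketch of the two patterns Stack asks about above,
against the 0.90-era client API (the table name "mytable" and the thread count
are invented for illustration): HTable instances built from the same
Configuration share the client-side connection and cached region locations,
while calling HBaseConfiguration.create() per HTable gives each one its own
connection and an initially empty cache.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;

    public class SharedConfSketch {
      public static void main(String[] args) throws Exception {
        // One Configuration shared by all HTable instances: they reuse the
        // same underlying connection and the same region location cache.
        Configuration conf = HBaseConfiguration.create();
        for (int i = 0; i < 8; i++) {
          HTable table = new HTable(conf, "mytable"); // hypothetical table name
          // ... hand "table" to a worker thread (one HTable per thread).
        }

        // By contrast, a fresh Configuration per HTable means a separate
        // connection and a separate, initially empty, meta cache each time:
        // HTable t = new HTable(HBaseConfiguration.create(), "mytable");
      }
    }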

Re: HBase 0.90.0 cannot be put more data after running hours

Posted by Anty <an...@gmail.com>.
Sorry, the vmstat output for 2) is wrong.
1) when there are only 2 client threads, vmstat output is

procs -----------memory---------- ---swap-- -----io---- --system--
-----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 29236 1610000 557488 12027072 0 0 19 53 0 0 3 1 97 0 0
0 0 29236 1610152 557488 12027076 0 0 0 0 4630 23177 2 1 97 0 0
0 0 29236 1610484 557488 12027076 0 0 0 603 4604 22758 2 1 97 0 0
0 0 29236 1610544 557488 12027076 0 0 0 11 4643 23009 2 1 97 0 0
0 0 29236 1610912 557488 12027080 0 0 0 5 4578 22900 2 1 97 0 0
3 0 29236 1610600 557488 12027080 0 0 0 21 4627 22922 2 1 97 0 0
1 0 29236 1610616 557488 12027080 0 0 0 0 4552 23077 2 1 97 0 0
0 0 29236 1610992 557488 12027084 0 0 0 608 4608 22995 2 1 97 0 0
0 0 29236 1611900 557488 12027084 0 0 0 144 4641 23000 2 1 97 0 0
0 0 29236 1612040 557488 12027084 0 0 0 0 4621 22952 2 1 97 0 0
0 0 29236 1611428 557488 12027088 0 0 0 593 4668 23133 2 1 96 0 0
1 0 29236 1611812 557488 12027088 0 0 0 0 4623 23673 2 1 97 0 0
0 0 29236 1612068 557488 12027088 0 0 0 145 4627 23245 2 1 97 0 0
1 0 29236 1612440 557488 12027092 0 0 0 1 4654 23351 2 1 97 0 0
1 0 29236 1612120 557488 12027092 0 0 0 131 4695 23137 4 1 95 0 0
2 0 29236 1612492 557492 12027096 0 0 0 39 4647 22818 2 1 97 0 0
1 0 29236 1610164 557492 12027096 0 0 0 633 4666 22914 2 1 97 0 0
0 0 29236 1610380 557492 12027096 0 0 0 73 4668 22957 2 1 97 0 0
1 0 29236 1610420 557492 12027100 0 0 0 47 4642 22907 2 1 97 0 0
0 0 29236 1611596 557492 12027100 0 0 0 0 4664 22999 2 1 97 0 0
1 0 29236 1612236 557492 12027084 0 0 0 657 4665 22552 2 1 97 0 0
1 0 29236 1612732 557492 12027088 0 0 0 0 4586 22640 2 1 97 0 0
0 0 29236 1612504 557492 12027088 0 0 0 33 4605 22670 2 1 97 0 0
2 0 29236 1611500 557492 12027092 0 0 0 161 4575 22763 2 1 97 0 0
0 0 29236 1612160 557500 12027092 0 0 0 5 4570 23265 2 1 97 0 0
0 0 29236 1612568 557500 12027092 0 0 0 592 4616 23285 2 1 97 0 0
0 0 29236 1612700 557500 12027096 0 0 0 0 4688 23754 3 1 96 0 0
1 0 29236 1612192 557504 12027092 0 0 0 27 4649 23501 2 1 97 0 0
1 0 29236 1611748 557504 12027104 0 0 0 12 4612 23664 2 1 97 0 0
0 0 29236 1611932 557504 12027116 0 0 0 768 4606 22910 2 1 97 0 0
2 0 29236 1611788 557504 12027116 0 0 0 25 4545 22991 2 1 97 0 0
1 0 29236 1611916 557504 12027120 0 0 0 0 4615 23138 2 1 97 0 0
0 0 29236 1612176 557504 12027120 0 0 0 139 4604 23231 2 1 97 0 0
0 0 29236 1612848 557504 12027108 0 0 0 0 4624 23673 2 1 97 0 0
0 0 29236 1612820 557504 12027108 0 0 0 616 4657 23247 2 1 97 0 0
0 0 29236 1613196 557508 12027108 0 0 0 107 4581 23193 2 1 97 0 0
1 0 29236 1611772 557508 12027112 0 0 0 185 4594 22807 2 1 97 0 0
2 0 29236 1610432 557508 12027116 0 0 0 1 4603 23395 2 1 97 0 0
2 0 29236 1610348 557512 12027116 0 0 0 653 4724 23562 2 1 97 0 0
0 0 29236 1610612 557512 12027132 0 0 0 0 4621 23533 2 1 97 0 0
1 0 29236 1612368 557512 12027116 0 0 0 77 4609 23223 2 1 97 0 0
0 0 29236 1612628 557512 12027120 0 0 0 0 4571 22697 2 1 97 0 0
1 0 29236 1610908 557512 12027120 0 0 0 19 4585 22614 2 1 97 0 0
1 0 29236 1610020 557512 12027124 0 0 0 629 4656 23382 2 1 97 0 0
On Thu, Feb 24, 2011 at 11:39 AM, Anty <an...@gmail.com> wrote:


2) after increasing the number of client threads to 8, vmstat output is

procs -----------memory---------- ---swap-- -----io---- --system--
-----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 29236 1442908 557276 12026508 0 0 19 53 0 0 3 1 97 0 0
1 0 29236 1443652 557276 12026508 0 0 0 0 1362 4740 8 0 91 0 0
1 0 29236 1443676 557276 12026508 0 0 0 36 1530 7361 10 0 90 0 0
0 0 29236 1443384 557284 12026512 0 0 0 728 1327 4565 8 0 92 0 0
1 0 29236 1442864 557284 12026512 0 0 0 5 1558 4889 10 0 90 0 0
1 0 29236 1443028 557284 12026516 0 0 0 24 1321 5751 9 0 91 0 0
2 0 29236 1440228 557296 12026512 0 0 0 61 1399 5529 9 0 90 0 0
2 0 29236 1439756 557296 12026528 0 0 0 91 1450 3485 8 0 92 0 0
1 0 29236 1440164 557296 12026528 0 0 0 217 1433 7146 10 0 90 0 0
1 0 29236 1439604 557296 12026528 0 0 0 4 1387 3420 8 0 92 0 0
4 0 29236 1439604 557296 12026532 0 0 0 593 1438 6291 9 0 90 0 0
1 0 29236 1439624 557296 12026532 0 0 0 41 1296 7312 9 0 91 0 0
1 0 29236 1439632 557296 12026532 0 0 0 1 1417 6495 9 0 91 0 0
1 0 29236 1439820 557296 12026536 0 0 0 59 1379 5482 10 0 90 0 0
1 0 29236 1439112 557296 12026540 0 0 0 12 1409 3470 8 0 92 0 0
1 0 29236 1438024 557296 12026556 0 0 0 93 1384 4741 8 0 91 0 0
1 0 29236 1437672 557296 12026556 0 0 0 627 1298 6614 10 0 90 0 0
1 0 29236 1438288 557296 12026556 0 0 0 4 1482 4857 8 0 91 0 0
1 0 29236 1438636 557296 12026560 0 0 0 169 1421 6282 10 0 90 0 0
1 0 29236 1438576 557296 12026560 0 0 0 431 1309 6078 9 0 91 0 0
1 0 29236 1438744 557296 12026560 0 0 0 728 1357 7340 9 0 91 0 0
1 0 29236 1439008 557296 12026564 0 0 0 12 1446 6676 10 0 90 0 0
4 0 29236 1438384 557296 12026564 0 0 0 3 1350 4046 9 0 91 0 0
2 0 29236 1438644 557296 12026568 0 0 0 144 1429 3761 8 0 92 0 0
1 0 29236 1438776 557296 12026568 0 0 0 1 1285 5099 8 0 91 0 0
1 0 29236 1438760 557296 12026568 0 0 0 853 1491 6029 10 0 90 0 0
1 0 29236 1438752 557296 12026572 0 0 0 0 1398 6067 8 0 91 0 0
1 0 29236 1438864 557296 12026572 0 0 0 148 1318 4662 8 0 92 0 0
1 0 29236 1439104 557296 12026572 0 0 0 0 1493 5119 10 0 90 0 0
1 0 29236 1438116 557296 12026576 0 0 0 20 1382 3487 8 0 92 0 0
1 0 29236 1438260 557296 12026576 0 0 0 139 1371 7088 10 0 90 0 0
4 0 29236 1438428 557296 12026580 0 0 0 0 1360 3528 9 0 91 0 0
1 0 29236 1438200 557296 12026580 0 0 0 597 1418 3866 8 0 92 0 0
2 0 29236 1438368 557296 12026580 0 0 0 0 1366 7512 10 0 90 0 0
1 0 29236 1438732 557300 12026584 0 0 0 47 1358 4756 9 0 91 0 0
1 0 29236 1438412 557300 12026584 0 0 0 32 1399 6982 10 0 90 0 0
1 0 29236 1438104 557300 12026584 0 0 0 609 1358 5713 8 0 91 0 0
1 0 29236 1438580 557300 12026588 0 0 0 11 1375 6874 9 0 91 0 0
1 0 29236 1438644 557300 12026588 0 0 0 176 1338 6014 10 0 90 0 0
1 0 29236 1438576 557300 12026592 0 0 0 5 1349 3276 8 0 92 0 0
4 0 29236 1438128 557300 12026592 0 0 0 601 2145 28840 17 1 82 0 0
1 0 29236 1438128 557300 12026592 0 0 0 0 1409 5143 9 0 91 0 0
1 0 29236 1438248 557300 12026596 0 0 0 20 1411 7039 10 0 90 0 0
1 0 29236 1438256 557308 12026596 0 0 0 5 1271 7634 9 0 91 0 0

and the "VM Thread" is very busing, take up ~90% one cpu time.




> 1) when there are only 2 client threads, vmstat output is
> procs -----------memory---------- ---swap-- -----io---- --system--
> -----cpu------
> r b swpd free buff cache si so bi bo in cs us sy id wa st
> 0 0 29236 1610000 557488 12027072 0 0 19 53 0 0 3 1 97 0 0
> 0 0 29236 1610152 557488 12027076 0 0 0 0 4630 23177 2 1 97 0 0
> 0 0 29236 1610484 557488 12027076 0 0 0 603 4604 22758 2 1 97 0 0
> 0 0 29236 1610544 557488 12027076 0 0 0 11 4643 23009 2 1 97 0 0
> 0 0 29236 1610912 557488 12027080 0 0 0 5 4578 22900 2 1 97 0 0
> 3 0 29236 1610600 557488 12027080 0 0 0 21 4627 22922 2 1 97 0 0
> 1 0 29236 1610616 557488 12027080 0 0 0 0 4552 23077 2 1 97 0 0
> 0 0 29236 1610992 557488 12027084 0 0 0 608 4608 22995 2 1 97 0 0
> 0 0 29236 1611900 557488 12027084 0 0 0 144 4641 23000 2 1 97 0 0
> 0 0 29236 1612040 557488 12027084 0 0 0 0 4621 22952 2 1 97 0 0
> 0 0 29236 1611428 557488 12027088 0 0 0 593 4668 23133 2 1 96 0 0
> 1 0 29236 1611812 557488 12027088 0 0 0 0 4623 23673 2 1 97 0 0
> 0 0 29236 1612068 557488 12027088 0 0 0 145 4627 23245 2 1 97 0 0
> 1 0 29236 1612440 557488 12027092 0 0 0 1 4654 23351 2 1 97 0 0
> 1 0 29236 1612120 557488 12027092 0 0 0 131 4695 23137 4 1 95 0 0
> 2 0 29236 1612492 557492 12027096 0 0 0 39 4647 22818 2 1 97 0 0
> 1 0 29236 1610164 557492 12027096 0 0 0 633 4666 22914 2 1 97 0 0
> 0 0 29236 1610380 557492 12027096 0 0 0 73 4668 22957 2 1 97 0 0
> 1 0 29236 1610420 557492 12027100 0 0 0 47 4642 22907 2 1 97 0 0
> 0 0 29236 1611596 557492 12027100 0 0 0 0 4664 22999 2 1 97 0 0
> 1 0 29236 1612236 557492 12027084 0 0 0 657 4665 22552 2 1 97 0 0
> 1 0 29236 1612732 557492 12027088 0 0 0 0 4586 22640 2 1 97 0 0
> 0 0 29236 1612504 557492 12027088 0 0 0 33 4605 22670 2 1 97 0 0
> 2 0 29236 1611500 557492 12027092 0 0 0 161 4575 22763 2 1 97 0 0
> 0 0 29236 1612160 557500 12027092 0 0 0 5 4570 23265 2 1 97 0 0
> 0 0 29236 1612568 557500 12027092 0 0 0 592 4616 23285 2 1 97 0 0
> 0 0 29236 1612700 557500 12027096 0 0 0 0 4688 23754 3 1 96 0 0
> 1 0 29236 1612192 557504 12027092 0 0 0 27 4649 23501 2 1 97 0 0
> 1 0 29236 1611748 557504 12027104 0 0 0 12 4612 23664 2 1 97 0 0
> 0 0 29236 1611932 557504 12027116 0 0 0 768 4606 22910 2 1 97 0 0
> 2 0 29236 1611788 557504 12027116 0 0 0 25 4545 22991 2 1 97 0 0
> 1 0 29236 1611916 557504 12027120 0 0 0 0 4615 23138 2 1 97 0 0
> 0 0 29236 1612176 557504 12027120 0 0 0 139 4604 23231 2 1 97 0 0
> 0 0 29236 1612848 557504 12027108 0 0 0 0 4624 23673 2 1 97 0 0
> 0 0 29236 1612820 557504 12027108 0 0 0 616 4657 23247 2 1 97 0 0
> 0 0 29236 1613196 557508 12027108 0 0 0 107 4581 23193 2 1 97 0 0
> 1 0 29236 1611772 557508 12027112 0 0 0 185 4594 22807 2 1 97 0 0
> 2 0 29236 1610432 557508 12027116 0 0 0 1 4603 23395 2 1 97 0 0
> 2 0 29236 1610348 557512 12027116 0 0 0 653 4724 23562 2 1 97 0 0
> 0 0 29236 1610612 557512 12027132 0 0 0 0 4621 23533 2 1 97 0 0
> 1 0 29236 1612368 557512 12027116 0 0 0 77 4609 23223 2 1 97 0 0
> 0 0 29236 1612628 557512 12027120 0 0 0 0 4571 22697 2 1 97 0 0
> 1 0 29236 1610908 557512 12027120 0 0 0 19 4585 22614 2 1 97 0 0
> 1 0 29236 1610020 557512 12027124 0 0 0 629 4656 23382 2 1 97 0 0
>
>
> 2) after increasing the number of client threads to 8, vmstat output is
>
>
> procs -----------memory---------- ---swap-- -----io---- --system--
> -----cpu------
> r b swpd free buff cache si so bi bo in cs us sy id wa st
> 0 0 29236 1610000 557488 12027072 0 0 19 53 0 0 3 1 97 0 0
> 0 0 29236 1610152 557488 12027076 0 0 0 0 4630 23177 2 1 97 0 0
> 0 0 29236 1610484 557488 12027076 0 0 0 603 4604 22758 2 1 97 0 0
> 0 0 29236 1610544 557488 12027076 0 0 0 11 4643 23009 2 1 97 0 0
> 0 0 29236 1610912 557488 12027080 0 0 0 5 4578 22900 2 1 97 0 0
> 3 0 29236 1610600 557488 12027080 0 0 0 21 4627 22922 2 1 97 0 0
> 1 0 29236 1610616 557488 12027080 0 0 0 0 4552 23077 2 1 97 0 0
> 0 0 29236 1610992 557488 12027084 0 0 0 608 4608 22995 2 1 97 0 0
> 0 0 29236 1611900 557488 12027084 0 0 0 144 4641 23000 2 1 97 0 0
> 0 0 29236 1612040 557488 12027084 0 0 0 0 4621 22952 2 1 97 0 0
> 0 0 29236 1611428 557488 12027088 0 0 0 593 4668 23133 2 1 96 0 0
> 1 0 29236 1611812 557488 12027088 0 0 0 0 4623 23673 2 1 97 0 0
> 0 0 29236 1612068 557488 12027088 0 0 0 145 4627 23245 2 1 97 0 0
> 1 0 29236 1612440 557488 12027092 0 0 0 1 4654 23351 2 1 97 0 0
> 1 0 29236 1612120 557488 12027092 0 0 0 131 4695 23137 4 1 95 0 0
> 2 0 29236 1612492 557492 12027096 0 0 0 39 4647 22818 2 1 97 0 0
> 1 0 29236 1610164 557492 12027096 0 0 0 633 4666 22914 2 1 97 0 0
> 0 0 29236 1610380 557492 12027096 0 0 0 73 4668 22957 2 1 97 0 0
> 1 0 29236 1610420 557492 12027100 0 0 0 47 4642 22907 2 1 97 0 0
> 0 0 29236 1611596 557492 12027100 0 0 0 0 4664 22999 2 1 97 0 0
> 1 0 29236 1612236 557492 12027084 0 0 0 657 4665 22552 2 1 97 0 0
> 1 0 29236 1612732 557492 12027088 0 0 0 0 4586 22640 2 1 97 0 0
> 0 0 29236 1612504 557492 12027088 0 0 0 33 4605 22670 2 1 97 0 0
> 2 0 29236 1611500 557492 12027092 0 0 0 161 4575 22763 2 1 97 0 0
> 0 0 29236 1612160 557500 12027092 0 0 0 5 4570 23265 2 1 97 0 0
> 0 0 29236 1612568 557500 12027092 0 0 0 592 4616 23285 2 1 97 0 0
> 0 0 29236 1612700 557500 12027096 0 0 0 0 4688 23754 3 1 96 0 0
> 1 0 29236 1612192 557504 12027092 0 0 0 27 4649 23501 2 1 97 0 0
> 1 0 29236 1611748 557504 12027104 0 0 0 12 4612 23664 2 1 97 0 0
> 0 0 29236 1611932 557504 12027116 0 0 0 768 4606 22910 2 1 97 0 0
> 2 0 29236 1611788 557504 12027116 0 0 0 25 4545 22991 2 1 97 0 0
> 1 0 29236 1611916 557504 12027120 0 0 0 0 4615 23138 2 1 97 0 0
> 0 0 29236 1612176 557504 12027120 0 0 0 139 4604 23231 2 1 97 0 0
> 0 0 29236 1612848 557504 12027108 0 0 0 0 4624 23673 2 1 97 0 0
> 0 0 29236 1612820 557504 12027108 0 0 0 616 4657 23247 2 1 97 0 0
> 0 0 29236 1613196 557508 12027108 0 0 0 107 4581 23193 2 1 97 0 0
> 1 0 29236 1611772 557508 12027112 0 0 0 185 4594 22807 2 1 97 0 0
> 2 0 29236 1610432 557508 12027116 0 0 0 1 4603 23395 2 1 97 0 0
> 2 0 29236 1610348 557512 12027116 0 0 0 653 4724 23562 2 1 97 0 0
> 0 0 29236 1610612 557512 12027132 0 0 0 0 4621 23533 2 1 97 0 0
> 1 0 29236 1612368 557512 12027116 0 0 0 77 4609 23223 2 1 97 0 0
> 0 0 29236 1612628 557512 12027120 0 0 0 0 4571 22697 2 1 97 0 0
> 1 0 29236 1610908 557512 12027120 0 0 0 19 4585 22614 2 1 97 0 0
> 1 0 29236 1610020 557512 12027124 0 0 0 629 4656 23382 2 1 97 0 0
>
> and the "VM Thread" is very busing, take up ~90% one cpu time.
>
>
>
>
> On Thu, Feb 24, 2011 at 10:55 AM, Schubert Zhang <zs...@gmail.com> wrote:
>
>> Now I am trying 0.90.1, but this issue is still there.
>>
>> I attach the jstack output. Could you please help me analyze it?
>>
>> It seems all 8 client threads are doing a metaScan!
>>
>> On Sat, Jan 29, 2011 at 1:02 AM, Stack <st...@duboce.net> wrote:
>>
>>> On Thu, Jan 27, 2011 at 10:33 PM, Schubert Zhang <zs...@gmail.com>
>>> wrote:
>>> > 1. The .META. table seems ok
>>> >     I can read my data table (one thread for reading).
>>> >     I can use hbase shell to scan my data table.
>>> >     And I can use 1~4 threads to put more data into my data table.
>>> >
>>>
>>> Good.  This would seem to say that .META. is not locked out (you are
>>> doing these scans while your 8+ client process is hung?).
>>>
>>>
>>> >    Before this issue happened, about 800 million entities (columns) had
>>> been
>>> > put into the table successfully, and there are 253 regions for this
>>> table.
>>> >
>>>
>>>
>>> So, you were running fine with 8+ clients until you hit the 800 million
>>> entries?
>>>
>>>
>>> > 3. All clients use HBaseConfiguration.create() for a new Configuration
>>> > instance.
>>> >
>>>
>>> Do you do this for each new instance of HTable or do you pass them all
>>> the same Configuration instance?
>>>
>>>
>>> > 4. The 8+ client threads running on a single machine and a single JVM.
>>> >
>>>
>>> How many instances of this process?  One or many?
>>>
>>>
>>> > 5. Seems all 8+ threads are blocked in the same location, waiting on a
>>> > call to return.
>>> >
>>>
>>> If you want to paste a thread dump of your client, one of us will
>>> give it a gander.
>>>
>>> St.Ack
>>>
>>
>>
>
>
> --
> Best Regards
> Anty Rao
>



-- 
Best Regards
Anty Rao

Re: HBase 0.90.0 cannot be put more data after running hours

Posted by Anty <an...@gmail.com>.
1) when there are only 2 client threads, vmstat output is
procs -----------memory---------- ---swap-- -----io---- --system--
-----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 29236 1610000 557488 12027072 0 0 19 53 0 0 3 1 97 0 0
0 0 29236 1610152 557488 12027076 0 0 0 0 4630 23177 2 1 97 0 0
0 0 29236 1610484 557488 12027076 0 0 0 603 4604 22758 2 1 97 0 0
0 0 29236 1610544 557488 12027076 0 0 0 11 4643 23009 2 1 97 0 0
0 0 29236 1610912 557488 12027080 0 0 0 5 4578 22900 2 1 97 0 0
3 0 29236 1610600 557488 12027080 0 0 0 21 4627 22922 2 1 97 0 0
1 0 29236 1610616 557488 12027080 0 0 0 0 4552 23077 2 1 97 0 0
0 0 29236 1610992 557488 12027084 0 0 0 608 4608 22995 2 1 97 0 0
0 0 29236 1611900 557488 12027084 0 0 0 144 4641 23000 2 1 97 0 0
0 0 29236 1612040 557488 12027084 0 0 0 0 4621 22952 2 1 97 0 0
0 0 29236 1611428 557488 12027088 0 0 0 593 4668 23133 2 1 96 0 0
1 0 29236 1611812 557488 12027088 0 0 0 0 4623 23673 2 1 97 0 0
0 0 29236 1612068 557488 12027088 0 0 0 145 4627 23245 2 1 97 0 0
1 0 29236 1612440 557488 12027092 0 0 0 1 4654 23351 2 1 97 0 0
1 0 29236 1612120 557488 12027092 0 0 0 131 4695 23137 4 1 95 0 0
2 0 29236 1612492 557492 12027096 0 0 0 39 4647 22818 2 1 97 0 0
1 0 29236 1610164 557492 12027096 0 0 0 633 4666 22914 2 1 97 0 0
0 0 29236 1610380 557492 12027096 0 0 0 73 4668 22957 2 1 97 0 0
1 0 29236 1610420 557492 12027100 0 0 0 47 4642 22907 2 1 97 0 0
0 0 29236 1611596 557492 12027100 0 0 0 0 4664 22999 2 1 97 0 0
1 0 29236 1612236 557492 12027084 0 0 0 657 4665 22552 2 1 97 0 0
1 0 29236 1612732 557492 12027088 0 0 0 0 4586 22640 2 1 97 0 0
0 0 29236 1612504 557492 12027088 0 0 0 33 4605 22670 2 1 97 0 0
2 0 29236 1611500 557492 12027092 0 0 0 161 4575 22763 2 1 97 0 0
0 0 29236 1612160 557500 12027092 0 0 0 5 4570 23265 2 1 97 0 0
0 0 29236 1612568 557500 12027092 0 0 0 592 4616 23285 2 1 97 0 0
0 0 29236 1612700 557500 12027096 0 0 0 0 4688 23754 3 1 96 0 0
1 0 29236 1612192 557504 12027092 0 0 0 27 4649 23501 2 1 97 0 0
1 0 29236 1611748 557504 12027104 0 0 0 12 4612 23664 2 1 97 0 0
0 0 29236 1611932 557504 12027116 0 0 0 768 4606 22910 2 1 97 0 0
2 0 29236 1611788 557504 12027116 0 0 0 25 4545 22991 2 1 97 0 0
1 0 29236 1611916 557504 12027120 0 0 0 0 4615 23138 2 1 97 0 0
0 0 29236 1612176 557504 12027120 0 0 0 139 4604 23231 2 1 97 0 0
0 0 29236 1612848 557504 12027108 0 0 0 0 4624 23673 2 1 97 0 0
0 0 29236 1612820 557504 12027108 0 0 0 616 4657 23247 2 1 97 0 0
0 0 29236 1613196 557508 12027108 0 0 0 107 4581 23193 2 1 97 0 0
1 0 29236 1611772 557508 12027112 0 0 0 185 4594 22807 2 1 97 0 0
2 0 29236 1610432 557508 12027116 0 0 0 1 4603 23395 2 1 97 0 0
2 0 29236 1610348 557512 12027116 0 0 0 653 4724 23562 2 1 97 0 0
0 0 29236 1610612 557512 12027132 0 0 0 0 4621 23533 2 1 97 0 0
1 0 29236 1612368 557512 12027116 0 0 0 77 4609 23223 2 1 97 0 0
0 0 29236 1612628 557512 12027120 0 0 0 0 4571 22697 2 1 97 0 0
1 0 29236 1610908 557512 12027120 0 0 0 19 4585 22614 2 1 97 0 0
1 0 29236 1610020 557512 12027124 0 0 0 629 4656 23382 2 1 97 0 0


2) after increasing the number of client threads to 8, vmstat output is


procs -----------memory---------- ---swap-- -----io---- --system--
-----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 29236 1610000 557488 12027072 0 0 19 53 0 0 3 1 97 0 0
0 0 29236 1610152 557488 12027076 0 0 0 0 4630 23177 2 1 97 0 0
0 0 29236 1610484 557488 12027076 0 0 0 603 4604 22758 2 1 97 0 0
0 0 29236 1610544 557488 12027076 0 0 0 11 4643 23009 2 1 97 0 0
0 0 29236 1610912 557488 12027080 0 0 0 5 4578 22900 2 1 97 0 0
3 0 29236 1610600 557488 12027080 0 0 0 21 4627 22922 2 1 97 0 0
1 0 29236 1610616 557488 12027080 0 0 0 0 4552 23077 2 1 97 0 0
0 0 29236 1610992 557488 12027084 0 0 0 608 4608 22995 2 1 97 0 0
0 0 29236 1611900 557488 12027084 0 0 0 144 4641 23000 2 1 97 0 0
0 0 29236 1612040 557488 12027084 0 0 0 0 4621 22952 2 1 97 0 0
0 0 29236 1611428 557488 12027088 0 0 0 593 4668 23133 2 1 96 0 0
1 0 29236 1611812 557488 12027088 0 0 0 0 4623 23673 2 1 97 0 0
0 0 29236 1612068 557488 12027088 0 0 0 145 4627 23245 2 1 97 0 0
1 0 29236 1612440 557488 12027092 0 0 0 1 4654 23351 2 1 97 0 0
1 0 29236 1612120 557488 12027092 0 0 0 131 4695 23137 4 1 95 0 0
2 0 29236 1612492 557492 12027096 0 0 0 39 4647 22818 2 1 97 0 0
1 0 29236 1610164 557492 12027096 0 0 0 633 4666 22914 2 1 97 0 0
0 0 29236 1610380 557492 12027096 0 0 0 73 4668 22957 2 1 97 0 0
1 0 29236 1610420 557492 12027100 0 0 0 47 4642 22907 2 1 97 0 0
0 0 29236 1611596 557492 12027100 0 0 0 0 4664 22999 2 1 97 0 0
1 0 29236 1612236 557492 12027084 0 0 0 657 4665 22552 2 1 97 0 0
1 0 29236 1612732 557492 12027088 0 0 0 0 4586 22640 2 1 97 0 0
0 0 29236 1612504 557492 12027088 0 0 0 33 4605 22670 2 1 97 0 0
2 0 29236 1611500 557492 12027092 0 0 0 161 4575 22763 2 1 97 0 0
0 0 29236 1612160 557500 12027092 0 0 0 5 4570 23265 2 1 97 0 0
0 0 29236 1612568 557500 12027092 0 0 0 592 4616 23285 2 1 97 0 0
0 0 29236 1612700 557500 12027096 0 0 0 0 4688 23754 3 1 96 0 0
1 0 29236 1612192 557504 12027092 0 0 0 27 4649 23501 2 1 97 0 0
1 0 29236 1611748 557504 12027104 0 0 0 12 4612 23664 2 1 97 0 0
0 0 29236 1611932 557504 12027116 0 0 0 768 4606 22910 2 1 97 0 0
2 0 29236 1611788 557504 12027116 0 0 0 25 4545 22991 2 1 97 0 0
1 0 29236 1611916 557504 12027120 0 0 0 0 4615 23138 2 1 97 0 0
0 0 29236 1612176 557504 12027120 0 0 0 139 4604 23231 2 1 97 0 0
0 0 29236 1612848 557504 12027108 0 0 0 0 4624 23673 2 1 97 0 0
0 0 29236 1612820 557504 12027108 0 0 0 616 4657 23247 2 1 97 0 0
0 0 29236 1613196 557508 12027108 0 0 0 107 4581 23193 2 1 97 0 0
1 0 29236 1611772 557508 12027112 0 0 0 185 4594 22807 2 1 97 0 0
2 0 29236 1610432 557508 12027116 0 0 0 1 4603 23395 2 1 97 0 0
2 0 29236 1610348 557512 12027116 0 0 0 653 4724 23562 2 1 97 0 0
0 0 29236 1610612 557512 12027132 0 0 0 0 4621 23533 2 1 97 0 0
1 0 29236 1612368 557512 12027116 0 0 0 77 4609 23223 2 1 97 0 0
0 0 29236 1612628 557512 12027120 0 0 0 0 4571 22697 2 1 97 0 0
1 0 29236 1610908 557512 12027120 0 0 0 19 4585 22614 2 1 97 0 0
1 0 29236 1610020 557512 12027124 0 0 0 629 4656 23382 2 1 97 0 0

and the "VM Thread" is very busing, take up ~90% one cpu time.




On Thu, Feb 24, 2011 at 10:55 AM, Schubert Zhang <zs...@gmail.com> wrote:

> Now I am trying 0.90.1, but this issue is still there.
>
> I attach the jstack output. Could you please help me analyze it?
>
> It seems all 8 client threads are doing a metaScan!
>
> On Sat, Jan 29, 2011 at 1:02 AM, Stack <st...@duboce.net> wrote:
>
>> On Thu, Jan 27, 2011 at 10:33 PM, Schubert Zhang <zs...@gmail.com>
>> wrote:
>> > 1. The .META. table seems ok
>> >     I can read my data table (one thread for reading).
>> >     I can use hbase shell to scan my data table.
>> >     And I can use 1~4 threads to put more data into my data table.
>> >
>>
>> Good.  This would seem to say that .META. is not locked out (you are
>> doing these scans while your 8+ client process is hung?).
>>
>>
>> >    Before this issue happened, about 800 million entities (columns) had
>> been
>> > put into the table successfully, and there are 253 regions for this
>> table.
>> >
>>
>>
>> So, you were running fine with 8+ clients until you hit the 800 million
>> entries?
>>
>>
>> > 3. All clients use HBaseConfiguration.create() for a new Configuration
>> > instance.
>> >
>>
>> Do you do this for each new instance of HTable or do you pass them all
>> the same Configuration instance?
>>
>>
>> > 4. The 8+ client threads running on a single machine and a single JVM.
>> >
>>
>> How many instances of this process?  One or many?
>>
>>
>> > 5. Seems all 8+ threads are blocked in the same location, waiting on a
>> > call to return.
>> >
>>
>> If you want to paste a thread dump of your client, one of us will
>> give it a gander.
>>
>> St.Ack
>>
>
>


-- 
Best Regards
Anty Rao

Re: HBase 0.90.0 cannot be put more data after running hours

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Oh I see, I was looking at the jstack and thought that you must be
having a ton of families, and you confirm that.

The fact that we store the table schema with EVERY meta row isn't
usually such a bad thing, but in your case I guess it's becoming huge
and it's taking a long time to deserialize!

I think you should review your schema to use at most a handful of families.
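
For example, one way to collapse the 366 per-day families into a single family
is to move the day into the column qualifier. A sketch only, against the
0.90-era API; the table name "events", family "d", and qualifier layout are
invented for illustration:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SingleFamilySchema {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();

        // One family instead of 366; the day moves into the column qualifier.
        HTableDescriptor desc = new HTableDescriptor("events"); // hypothetical table name
        desc.addFamily(new HColumnDescriptor("d"));
        new HBaseAdmin(conf).createTable(desc);

        // A cell that used to live in family "2011-02-24" becomes
        // family "d", qualifier "2011-02-24:<column>".
        HTable table = new HTable(conf, "events");
        Put put = new Put(Bytes.toBytes("row-key"));
        put.add(Bytes.toBytes("d"), Bytes.toBytes("2011-02-24:temp"), Bytes.toBytes("42"));
        table.put(put);
      }
    }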

> It seems the client-side metaCache (for region infos) is not working, so
> every submit of puts does a metaScan.

My guess is that you're splitting a lot since you're inserting a lot
of data? If you still wish to continue with your current schema, maybe
pre-splitting the table would help a lot (check out HBaseAdmin)? Also,
the prefetching of .META. rows is killing your client performance, so
set hbase.client.prefetch.limit to something like 1 or 2 instead of the
default of 10.
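
A sketch of both suggestions, assuming the 0.90-era API (the table name,
family, and split keys are invented; check that your HBaseAdmin has the
split-keys variant of createTable):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PresplitAndPrefetch {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();

        // Fetch only 1 .META. row per lookup instead of the default 10,
        // so a cache miss deserializes far fewer HColumnDescriptors.
        conf.setInt("hbase.client.prefetch.limit", 1);

        // Pre-split the table at creation time so it is not constantly
        // splitting (and invalidating cached locations) during the load.
        HTableDescriptor desc = new HTableDescriptor("mytable"); // hypothetical
        desc.addFamily(new HColumnDescriptor("d"));
        byte[][] splits = new byte[][] {
            Bytes.toBytes("2"), Bytes.toBytes("4"), Bytes.toBytes("6"), Bytes.toBytes("8")
        };
        new HBaseAdmin(conf).createTable(desc, splits); // 5 regions up front
      }
    }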

J-D

On Thu, Feb 24, 2011 at 12:37 AM, Schubert Zhang <zs...@gmail.com> wrote:
> New clues:
>
> It seems the client-side metaCache (for region infos) is not working, so
> every submit of puts does a metaScan.
> The specifics of my test are:
>
> The table has many column families (366 CFs, one for each day of a year), but
> only one column family is currently active for writing data, so the memory
> usage for the memstore is OK.
>
> Then, when doing a metaScan for region infos, the code runs a large loop to
> get and deserialize every column family's info.
>
> When the number of regions increases (64 in my test), the loop runs 366*64
> times for each put submit, and the client threads become very busy.
>
>
> Now we should determine why a metaScan is done for each submit of puts.
>
>
> On Thu, Feb 24, 2011 at 11:53 AM, Schubert Zhang <zs...@gmail.com> wrote:
>
>> Currently, with 0.90.1, this issue happens when there are only 8 regions in
>> each RS, 64 regions in total across all 8 RSs.
>>
>> The CPU% of the client is very high.
>>
>>   On Thu, Feb 24, 2011 at 10:55 AM, Schubert Zhang <zs...@gmail.com> wrote:
>>
>>> Now I am trying 0.90.1, but this issue is still there.
>>>
>>> I attach the jstack output. Could you please help me analyze it?
>>>
>>> It seems all 8 client threads are doing a metaScan!
>>>
>>>   On Sat, Jan 29, 2011 at 1:02 AM, Stack <st...@duboce.net> wrote:
>>>
>>>> On Thu, Jan 27, 2011 at 10:33 PM, Schubert Zhang <zs...@gmail.com>
>>>> wrote:
>>>> > 1. The .META. table seems ok
>>>> >     I can read my data table (one thread for reading).
>>>> >     I can use hbase shell to scan my data table.
>>>> >     And I can use 1~4 threads to put more data into my data table.
>>>> >
>>>>
>>>> Good.  This would seem to say that .META. is not locked out (you are
>>>> doing these scans while your 8+ client process is hung?).
>>>>
>>>>
>>>> >    Before this issue happened, about 800 million entities (columns) had
>>>> been
>>>> > put into the table successfully, and there are 253 regions for this
>>>> table.
>>>> >
>>>>
>>>>
>>>> So, you were running fine with 8+ clients until you hit the 800 million
>>>> entries?
>>>>
>>>>
>>>> > 3. All clients use HBaseConfiguration.create() for a new Configuration
>>>> > instance.
>>>> >
>>>>
>>>> Do you do this for each new instance of HTable or do you pass them all
>>>> the same Configuration instance?
>>>>
>>>>
>>>> > 4. The 8+ client threads running on a single machine and a single JVM.
>>>> >
>>>>
>>>> How many instances of this process?  One or many?
>>>>
>>>>
>>>> > 5. Seems all 8+ threads are blocked in the same location, waiting on a
>>>> > call to return.
>>>> >
>>>>
>>>> If you want to paste a thread dump of your client, one of us will
>>>> give it a gander.
>>>>
>>>> St.Ack
>>>>
>>>
>>>
>>
>

Re: HBase 0.90.0 cannot be put more data after running hours

Posted by Schubert Zhang <zs...@gmail.com>.
New clues:

It seems the client-side metaCache (for region infos) is not working, so
every submit of puts does a metaScan.
The specifics of my test are:

The table has many column families (366 CFs, one for each day of a year), but
only one column family is currently active for writing data, so the memory
usage for the memstore is OK.

Then, when doing a metaScan for region infos, the code runs a large loop to
get and deserialize every column family's info.

When the number of regions increases (64 in my test), the loop runs 366*64
times for each put submit, and the client threads become very busy.


Now we should determine why a metaScan is done for each submit of puts.
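
Whatever the reason for the cache misses, one way to cut down how often puts
are submitted (and therefore how often the expensive meta lookup can be
triggered) is to buffer puts on the client. A sketch only, against the
0.90-era API; the table name, family, and buffer size are arbitrary:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BufferedPuts {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable");   // hypothetical table name

        // Buffer puts client-side instead of flushing on every put(),
        // so whatever work happens per submit happens far less often.
        table.setAutoFlush(false);
        table.setWriteBufferSize(4 * 1024 * 1024);    // 4 MB, an arbitrary choice

        for (int i = 0; i < 100000; i++) {
          Put put = new Put(Bytes.toBytes("row-" + i));
          put.add(Bytes.toBytes("d"), Bytes.toBytes("q"), Bytes.toBytes("v" + i));
          table.put(put);                             // queued, not sent immediately
        }
        table.flushCommits();                         // push out whatever is left
        table.close();
      }
    }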


On Thu, Feb 24, 2011 at 11:53 AM, Schubert Zhang <zs...@gmail.com> wrote:

> Currently, with 0.90.1, this issue happens when there are only 8 regions in
> each RS, 64 regions in total across all 8 RSs.
>
> The CPU% of the client is very high.
>
>   On Thu, Feb 24, 2011 at 10:55 AM, Schubert Zhang <zs...@gmail.com> wrote:
>
>> Now I am trying 0.90.1, but this issue is still there.
>>
>> I attach the jstack output. Could you please help me analyze it?
>>
>> It seems all 8 client threads are doing a metaScan!
>>
>>   On Sat, Jan 29, 2011 at 1:02 AM, Stack <st...@duboce.net> wrote:
>>
>>> On Thu, Jan 27, 2011 at 10:33 PM, Schubert Zhang <zs...@gmail.com>
>>> wrote:
>>> > 1. The .META. table seems ok
>>> >     I can read my data table (one thread for reading).
>>> >     I can use hbase shell to scan my data table.
>>> >     And I can use 1~4 threads to put more data into my data table.
>>> >
>>>
>>> Good.  This would seem to say that .META. is not locked out (you are
>>> doing these scans while your 8+ client process is hung?).
>>>
>>>
>>> >    Before this issue happened, about 800 million entities (columns) had
>>> been
>>> > put into the table successfully, and there are 253 regions for this
>>> table.
>>> >
>>>
>>>
>>> So, you were running fine with 8+ clients until you hit the 800 million
>>> entries?
>>>
>>>
>>> > 3. All clients use HBaseConfiguration.create() for a new Configuration
>>> > instance.
>>> >
>>>
>>> Do you do this for each new instance of HTable or do you pass them all
>>> the same Configuration instance?
>>>
>>>
>>> > 4. The 8+ client threads running on a single machine and a single JVM.
>>> >
>>>
>>> How many instances of this process?  One or many?
>>>
>>>
>>> > 5. Seems all 8+ threads are blocked in the same location, waiting on a
>>> > call to return.
>>> >
>>>
>>> If you want to paste a thread dump of your client, one of us will
>>> give it a gander.
>>>
>>> St.Ack
>>>
>>
>>
>

Re: HBase 0.90.0 cannot be put more data after running hours

Posted by Schubert Zhang <zs...@gmail.com>.
Currently, with 0.90.1, this issue happens when there are only 8 regions in
each RS, 64 regions in total across all 8 RSs.

The CPU% of the client is very high.

On Thu, Feb 24, 2011 at 10:55 AM, Schubert Zhang <zs...@gmail.com> wrote:

> Now I am trying 0.90.1, but this issue is still there.
>
> I attach the jstack output. Could you please help me analyze it?
>
> It seems all 8 client threads are doing a metaScan!
>
>   On Sat, Jan 29, 2011 at 1:02 AM, Stack <st...@duboce.net> wrote:
>
>> On Thu, Jan 27, 2011 at 10:33 PM, Schubert Zhang <zs...@gmail.com>
>> wrote:
>> > 1. The .META. table seems ok
>> >     I can read my data table (one thread for reading).
>> >     I can use hbase shell to scan my data table.
>> >     And I can use 1~4 threads to put more data into my data table.
>> >
>>
>> Good.  This would seem to say that .META. is not locked out (you are
>> doing these scans while your 8+ client process is hung?).
>>
>>
>> >    Before this issue happened, about 800 million entities (columns) had
>> been
>> > put into the table successfully, and there are 253 regions for this
>> table.
>> >
>>
>>
>> So, you were running fine with 8+ clients until you hit the 800 million
>> entries?
>>
>>
>> > 3. All clients use HBaseConfiguration.create() for a new Configuration
>> > instance.
>> >
>>
>> Do you do this for each new instance of HTable or do you pass them all
>> the same Configuration instance?
>>
>>
>> > 4. The 8+ client threads running on a single machine and a single JVM.
>> >
>>
>> How many instances of this process?  One or many?
>>
>>
>> > 5. Seems all 8+ threads are blocked in the same location, waiting on a
>> > call to return.
>> >
>>
>> If you want to paste a thread dump of your client, one of us will
>> give it a gander.
>>
>> St.Ack
>>
>
>