Posted to user@hbase.apache.org by Omkar Joshi <Om...@lntinfotech.com> on 2013/04/17 11:47:28 UTC

Speeding up the row count

Hi,

I have two tables - CUSTOMERS (60000+ rows) and PRODUCTS (1000851 rows).

The table structures are:

CUSTOMERS
rowkey : CUSTOMER_ID

column family : CUSTOMER_INFO

    columns : NAME
              EMAIL
              ADDRESS
              MOBILE


PRODUCTS
rowkey : PRODUCT_ID

column family : PRODUCT_INFO

    columns : NAME
              CATEGORY
              GROUP
              COMPANY
              COST
              COLOR

I'm trying to get the row count for each table using the following snippet :
.
.
.
hbaseCRUD.getTableCount(args[1], "CUSTOMER_INFO","NAME");
.
.
hbaseCRUD.getTableCount(args[1], "PRODUCT_INFO","NAME");

public long getTableCount(String tableName, String columnFamilyName,
        String columnName) {
    AggregationClient aggregationClient = new AggregationClient(config);
    Scan scan = new Scan();
    scan.addFamily(Bytes.toBytes(columnFamilyName));
    if (columnName != null && !columnName.isEmpty()) {
        scan.addColumn(Bytes.toBytes(columnFamilyName),
                Bytes.toBytes(columnName));
    }

    long rowCount = 0;
    try {
        rowCount = aggregationClient.rowCount(Bytes.toBytes(tableName),
                null, scan);
    } catch (Throwable e) {
        // AggregationClient.rowCount() is declared to throw Throwable
        e.printStackTrace();
    }
    System.out.println("row count is " + rowCount);

    return rowCount;
}

For CUSTOMERS, the response time is acceptable, but for PRODUCTS it is timing out (even in the shell: 1000851 row(s) in 258.9220 seconds).

What needs to be done to get a response quickly? Is there an approach other than AggregationClient, or a way of tweaking the Scan in the above code snippet?

Regards,
Omkar Joshi

________________________________
The contents of this e-mail and any attachment(s) may contain confidential or privileged information for the intended recipient(s). Unintended recipients are prohibited from taking action on the basis of information in this e-mail and using or disseminating the information, and must notify the sender and delete it from their system. L&T Infotech will not accept responsibility or liability for the accuracy or completeness of, or the presence of any virus or disabling code in this e-mail.

Re: Speeding up the row count

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi,

You might want to take a look at this:

http://hbase.apache.org/book/ops_mgt.html#rowcounter
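
For reference, the counter described there is launched from the command line, roughly like:

  $ bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter <tablename>

which runs the count as a MapReduce job (roughly one map task per region) instead of a single client-side scan.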

JM



Re: Speeding up the row count

Posted by James Taylor <jt...@salesforce.com>.
Phoenix will parallelize within a region:

SELECT count(1) FROM orders
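
For illustration, a minimal sketch of issuing that query through the Phoenix JDBC driver (assuming the Phoenix client jar, which registers the driver, is on the classpath, that ORDERS has been created or mapped as a Phoenix table, and with "localhost" as a placeholder for the real ZooKeeper quorum):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class OrdersRowCount {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper quorum; point this at the actual cluster.
        Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
        Statement stmt = conn.createStatement();
        ResultSet rs = stmt.executeQuery("SELECT count(1) FROM orders");
        if (rs.next()) {
            System.out.println("row count is " + rs.getLong(1));
        }
        rs.close();
        stmt.close();
        conn.close();
    }
}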

I agree with Ted, though: even serially, 100,000 rows shouldn't take anywhere near 6 mins. You say > 100,000 rows. Can you tell us what it's < ?

Thanks,
James

On Apr 19, 2013, at 2:37 AM, "Ted Yu" <yu...@gmail.com> wrote:

> Since there is only one region in your table, using aggregation coprocessor has no advantage. 
> I think there may be some issue with your cluster - row count should finish within 6 minutes.
> 
> Have you checked server logs ?
> 
> Thanks
> 
> On Apr 19, 2013, at 12:33 AM, Omkar Joshi <Om...@lntinfotech.com> wrote:
> 
>> Hi,
>> 
>> I have a 2-node (VM) Hadoop cluster on top of which HBase is running in distributed mode.
>> 
>> I have a table named ORDERS with >100000 rows.
>> 
>> NOTE : Since my cluster is ultra-small, I didn't pre-split the table.
>> 
>> ORDERS
>> rowkey : ORDER_ID
>> 
>> column family : ORDER_DETAILS
>>     columns : CUSTOMER_ID
>>               PRODUCT_ID
>>               REQUEST_DATE
>>               PRODUCT_QUANTITY
>>               PRICE
>>               PAYMENT_MODE
>> 
>> The Java client code to simply check the count of the records is:
>> 
>> public long getTableCount(String tableName, String columnFamilyName) {
>> 
>>     AggregationClient aggregationClient = new AggregationClient(config);
>>     Scan scan = new Scan();
>>     scan.addFamily(Bytes.toBytes(columnFamilyName));
>>     scan.setFilter(new FirstKeyOnlyFilter());
>> 
>>     long rowCount = 0;
>> 
>>     try {
>>         rowCount = aggregationClient.rowCount(Bytes.toBytes(tableName),
>>                 null, scan);
>>         System.out.println("No. of rows in " + tableName + " is "
>>                 + rowCount);
>>     } catch (Throwable e) {
>>         // AggregationClient.rowCount() is declared to throw Throwable
>>         e.printStackTrace();
>>     }
>> 
>>     return rowCount;
>> }
>> 
>> It is running for more than 6 minutes now :(
>> 
>> What shall I do to speed up the execution to milliseconds (or at least to a couple of seconds)?
>> 
>> Regards,
>> Omkar Joshi
>> 
>> 
>> -----Original Message-----
>> From: Vedad Kirlic [mailto:kirlich@gmail.com]
>> Sent: Thursday, April 18, 2013 12:22 AM
>> To: user@hbase.apache.org
>> Subject: Re: Speeding up the row count
>> 
>> Hi Omkar,
>> 
>> If you are not interested in occurrences of a specific column (e.g. name,
>> email ...) and just want the total number of rows (regardless of their
>> content, i.e. columns), you should avoid adding any columns to the Scan. In
>> that case the coprocessor implementation behind AggregationClient will add a
>> FirstKeyOnlyFilter to the Scan, to avoid loading unnecessary columns, which
>> should result in some speed-up.
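
With the getTableCount() method from the first mail in this thread, that amounts to passing null (or an empty string) for columnName, so that only addFamily() is called on the Scan, e.g.:

    hbaseCRUD.getTableCount(args[1], "PRODUCT_INFO", null);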
>> 
>> This is a similar approach to what the hbase shell 'count' implementation
>> does, although the reduction in overhead is bigger in that case, since data
>> transfer from the region server to the client (shell) is minimized, whereas
>> with the coprocessor the data does not leave the region server, so most of
>> the improvement there should come from avoiding the loading of unnecessary
>> files. Not sure how this will apply to your particular case, given that the
>> data set per row seems to be rather small. Also, with AggregationClient you
>> will benefit if/when your tables span multiple regions. Essentially, the
>> performance of this approach will 'degrade' as your table gets bigger, but
>> only up to the point when it splits, after which it should be pretty
>> constant. With this in mind, and given your type of data, you might consider
>> pre-splitting your tables.
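
As a rough sketch of what pre-splitting at table-creation time looks like with the HBaseAdmin API (the table name, family and split points below are placeholders to be chosen from the real row-key distribution; config is the same Configuration object used in the earlier snippets):

    HBaseAdmin admin = new HBaseAdmin(config);
    HTableDescriptor desc = new HTableDescriptor("PRODUCTS_PRESPLIT");
    desc.addFamily(new HColumnDescriptor("PRODUCT_INFO"));
    // Hand-picked split points; each boundary starts a new region.
    byte[][] splitKeys = new byte[][] {
            Bytes.toBytes("250000"),
            Bytes.toBytes("500000"),
            Bytes.toBytes("750000") };
    admin.createTable(desc, splitKeys);
    admin.close();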
>> 
>> DISCLAIMER: this is mostly theoretical, since I'm not an expert in hbase
>> internals :), so your best bet is to try it - I'm too lazy to verify impact
>> my self ;)
>> 
>> Finally, if your case can tolerate eventual consistency of counters with
>> actual number of rows, you can, as already suggested, have RowCounter map
>> reduce run every once in a while, write the counter(s) back to hbase, and
>> read those when you need to obtain the number of rows.
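
A rough sketch of that last idea, assuming a small, hypothetical stats table named ROW_COUNTS with a column family INFO ('count' below stands for the long value produced by the counting job, and config is the same Configuration object as in the earlier snippets):

    // Store the result once the RowCounter job (or AggregationClient) finishes.
    HTable statsTable = new HTable(config, "ROW_COUNTS");
    Put put = new Put(Bytes.toBytes("PRODUCTS"));   // one row per counted table
    put.add(Bytes.toBytes("INFO"), Bytes.toBytes("COUNT"), Bytes.toBytes(count));
    statsTable.put(put);

    // Reading it back later is a single Get instead of a full table scan.
    Result result = statsTable.get(new Get(Bytes.toBytes("PRODUCTS")));
    long cachedCount = Bytes.toLong(result.getValue(Bytes.toBytes("INFO"), Bytes.toBytes("COUNT")));
    statsTable.close();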
>> 
>> Regards,
>> Vedad
>> 
>> 
>> 
>> --
>> View this message in context: http://apache-hbase.679495.n3.nabble.com/Speeding-up-the-row-count-tp4042378p4042415.html
>> Sent from the HBase User mailing list archive at Nabble.com.
>> 

RE: Speeding up the row count

Posted by Omkar Joshi <Om...@lntinfotech.com>.
Hi Ted,

I observed 3 logs:

1. hbase-hduser-master-cldx-1139-1033.log
        The log in the logs directory of the HBase master
2. hbase-hduser-regionserver-cldx-1140-1034.log
        The log in the logs directory of the HBase slave (region server)
3. hbase-hduser-zookeeper-cldx-1140-1034.log
        The log in the logs directory of the HBase slave (region server)

I executed the following steps :

1. Restarted HBase so that I could view the logs afresh. Right after start-up, I observed the following errors, per log:

=> hbase-hduser-master-cldx-1139-1033.log

2013-04-22 16:56:24,440 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment in: cldx-1140-1034,60020,1366629888662 due to org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException: Received:OPEN for the region:bulk_import,,1366106606639.9581c73100211b61522030d517ecaa6e. for the region:bulk_import,,1366106606639.9581c73100211b61522030d517ecaa6e., which we are already trying to OPEN.
2013-04-22 16:56:24,440 WARN org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of bulk_import,,1366106606639.9581c73100211b61522030d517ecaa6e. to cldx-1140-1034,60020,1366629888662, trying to assign to the same region server because of RegionAlreadyInTransitionException;retry=0
org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException: org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException: Received:OPEN for the region:bulk_import,,1366106606639.9581c73100211b61522030d517ecaa6e. for the region:bulk_import,,1366106606639.9581c73100211b61522030d517ecaa6e., which we are already trying to OPEN.
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1742)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
        at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException: Received:OPEN for the region:bulk_import,,1366106606639.9581c73100211b61522030d517ecaa6e. for the region:bulk_import,,1366106606639.9581c73100211b61522030d517ecaa6e., which we are already trying to OPEN.
        at org.apache.hadoop.hbase.regionserver.HRegionServer.addRegionsInTransition(HRegionServer.java:2951)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2861)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2837)
        at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)

        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:995)
        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86)
        at $Proxy11.openRegion(Unknown Source)
        at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:501)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1697)
        ... 8 more


=> hbase-hduser-regionserver-cldx-1140-1034.log

2013-04-22 16:54:56,593 DEBUG org.apache.hadoop.hbase.util.FSTableDescriptors: Exception during readTableDecriptor. Current table name = bulk_import
org.apache.hadoop.hbase.TableInfoMissingException: No .tableinfo file under hdfs://cldx-1139-1033:9000/hbase/bulk_import
        at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorModtime(FSTableDescriptors.java:416)
        at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorModtime(FSTableDescriptors.java:408)
        at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:163)
        at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:126)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2871)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegions(HRegionServer.java:2964)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
2013-04-22 16:54:56,595 WARN org.apache.hadoop.hbase.util.FSTableDescriptors: The following folder is in HBase's root directory and doesn't contain a table descriptor, do consider deleting it: bulk_import
2013-04-22 16:54:56,595 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Attempting to transition node 9581c73100211b61522030d517ecaa6e from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
2013-04-22 16:54:56,597 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store ORDER_DETAILS
2013-04-22 16:54:56,600 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Successfully transitioned node 9581c73100211b61522030d517ecaa6e from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
2013-04-22 16:54:56,600 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open region: users,,1365525193570.dc78fa7cfda5397f59d0930957f635a8.
2013-04-22 16:54:56,604 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store CUSTOMER_INFO
2013-04-22 16:54:56,607 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Attempting to transition node dc78fa7cfda5397f59d0930957f635a8 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
2013-04-22 16:54:56,610 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Successfully transitioned node dc78fa7cfda5397f59d0930957f635a8 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
2013-04-22 16:54:56,618 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store ORDER_DETAILS
2013-04-22 16:54:56,662 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://cldx-1139-1033:9000/hbase/CUSTOMERS/69e390864dab55e234696786f44fe5b5/CUSTOMER_INFO/79e3f98ca4b74200861d1bb32accd858, isReference=false, isBulkLoadResult=true, bulkLoadTS=1366118957032, majorCompaction=true
2013-04-22 16:54:56,682 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://cldx-1139-1033:9000/hbase/ORDERS/9ac7375923f9dea94a8ac64ff9414d22/ORDER_DETAILS/371b6964eda444fa98b653a3638503d4, isReference=false, isBulkLoadResult=true, bulkLoadTS=1366629061328, majorCompaction=true
2013-04-22 16:54:56,687 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined CUSTOMERS,,1366118797565.69e390864dab55e234696786f44fe5b5.; next sequenceid=1
2013-04-22 16:54:56,688 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Attempting to transition node 69e390864dab55e234696786f44fe5b5 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2013-04-22 16:54:56,692 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Successfully transitioned node 69e390864dab55e234696786f44fe5b5 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2013-04-22 16:54:56,693 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Post open deploy tasks for region=CUSTOMERS,,1366118797565.69e390864dab55e234696786f44fe5b5., daughter=false
2013-04-22 16:54:56,694 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://cldx-1139-1033:9000/hbase/ORDERS_BIG/04513e1297ad0fe75f834b7f86ff22ab/ORDER_DETAILS/61b9222574d645a89d3fb940b727a455, isReference=false, isBulkLoadResult=true, bulkLoadTS=1366298481718, majorCompaction=true
2013-04-22 16:54:56,697 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined ORDERS,,1366627622969.9ac7375923f9dea94a8ac64ff9414d22.; next sequenceid=1
2013-04-22 16:54:56,697 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Attempting to transition node 9ac7375923f9dea94a8ac64ff9414d22 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2013-04-22 16:54:56,705 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Successfully transitioned node 9ac7375923f9dea94a8ac64ff9414d22 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2013-04-22 16:54:56,711 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Post open deploy tasks for region=ORDERS,,1366627622969.9ac7375923f9dea94a8ac64ff9414d22., daughter=false
2013-04-22 16:54:56,712 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined ORDERS_BIG,,1366287893803.04513e1297ad0fe75f834b7f86ff22ab.; next sequenceid=1
2013-04-22 16:54:56,712 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Attempting to transition node 04513e1297ad0fe75f834b7f86ff22ab from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2013-04-22 16:54:56,726 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Looked up root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@476bff8d; serverName=cldx-1140-1034,60020,1366629888662
2013-04-22 16:54:56,734 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Looked up root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@476bff8d; serverName=cldx-1140-1034,60020,1366629888662
2013-04-22 16:54:56,735 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Cached location for .META.,,1.1028785192 is cldx-1140-1034:60020
2013-04-22 16:54:56,737 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Successfully transitioned node 04513e1297ad0fe75f834b7f86ff22ab from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2013-04-22 16:54:56,738 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Post open deploy tasks for region=ORDERS_BIG,,1366287893803.04513e1297ad0fe75f834b7f86ff22ab., daughter=false
2013-04-22 16:54:56,760 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error detected. Found 2 replicas but expecting no less than 4 replicas.  Requesting close of hlog.
2013-04-22 16:54:56,762 DEBUG org.apache.hadoop.hbase.regionserver.LogRoller: HLog roll requested
2013-04-22 16:54:56,764 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Updated row CUSTOMERS,,1366118797565.69e390864dab55e234696786f44fe5b5. with server=cldx-1140-1034,60020,1366629888662
2013-04-22 16:54:56,764 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Done with post open deploy task for region=CUSTOMERS,,1366118797565.69e390864dab55e234696786f44fe5b5., daughter=false
2013-04-22 16:54:56,765 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Attempting to transition node 69e390864dab55e234696786f44fe5b5 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
2013-04-22 16:54:56,768 DEBUG org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: using new createWriter -- HADOOP-6840
2013-04-22 16:54:56,768 DEBUG org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Path=hdfs://cldx-1139-1033:9000/hbase/.logs/cldx-1140-1034,60020,1366629888662/cldx-1140-1034%2C60020%2C1366629888662.1366629896763, syncFs=true, hflush=false, compression=false
2013-04-22 16:54:56,768 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: cleanupCurrentWriter  waiting for transactions to get synced  total 4 synced till here 2
2013-04-22 16:54:56,771 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Updated row ORDERS,,1366627622969.9ac7375923f9dea94a8ac64ff9414d22. with server=cldx-1140-1034,60020,1366629888662
2013-04-22 16:54:56,771 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Done with post open deploy task for region=ORDERS,,1366627622969.9ac7375923f9dea94a8ac64ff9414d22., daughter=false
2013-04-22 16:54:56,772 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Attempting to transition node 9ac7375923f9dea94a8ac64ff9414d22 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
2013-04-22 16:54:56,776 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Successfully transitioned node 69e390864dab55e234696786f44fe5b5 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
2013-04-22 16:54:56,776 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: region transitioned to opened in zookeeper: {NAME => 'CUSTOMERS,,1366118797565.69e390864dab55e234696786f44fe5b5.', STARTKEY => '', ENDKEY => '', ENCODED => 69e390864dab55e234696786f44fe5b5,}, server: cldx-1140-1034,60020,1366629888662
2013-04-22 16:54:56,776 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opened CUSTOMERS,,1366118797565.69e390864dab55e234696786f44fe5b5. on server:cldx-1140-1034,60020,1366629888662
2013-04-22 16:54:56,776 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Opening region: {NAME => 'ORDERS_HIVE,,1366374759692.9870fabb39dd37993459cae5a5f64ceb.', STARTKEY => '', ENDKEY => '', ENCODED => 9870fabb39dd37993459cae5a5f64ceb,}
2013-04-22 16:54:56,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor config now ...
2013-04-22 16:54:56,778 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Registered protocol handler: region=ORDERS_HIVE,,1366374759692.9870fabb39dd37993459cae5a5f64ceb. protocol=org.apache.hadoop.hbase.coprocessor.AggregateProtocol
2013-04-22 16:54:56,778 INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: System coprocessor org.apache.hadoop.hbase.coprocessor.AggregateImplementation was loaded successfully with priority (536870911).
2013-04-22 16:54:56,778 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated ORDERS_HIVE,,1366374759692.9870fabb39dd37993459cae5a5f64ceb.
2013-04-22 16:54:56,782 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Successfully transitioned node 9ac7375923f9dea94a8ac64ff9414d22 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
2013-04-22 16:54:56,782 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: region transitioned to opened in zookeeper: {NAME => 'ORDERS,,1366627622969.9ac7375923f9dea94a8ac64ff9414d22.', STARTKEY => '', ENDKEY => '', ENCODED => 9ac7375923f9dea94a8ac64ff9414d22,}, server: cldx-1140-1034,60020,1366629888662
2013-04-22 16:54:56,784 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opened ORDERS,,1366627622969.9ac7375923f9dea94a8ac64ff9414d22. on server:cldx-1140-1034,60020,1366629888662
2013-04-22 16:54:56,785 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Opening region: {NAME => 'ORDERS_HIVE_DUMMY,,1366382877311.1d484c3c63185a3ab689726fcb535fcd.', STARTKEY => '', ENDKEY => '', ENCODED => 1d484c3c63185a3ab689726fcb535fcd,}
2013-04-22 16:54:56,786 INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor config now ...
2013-04-22 16:54:56,787 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Registered protocol handler: region=ORDERS_HIVE_DUMMY,,1366382877311.1d484c3c63185a3ab689726fcb535fcd. protocol=org.apache.hadoop.hbase.coprocessor.AggregateProtocol
2013-04-22 16:54:56,787 INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: System coprocessor org.apache.hadoop.hbase.coprocessor.AggregateImplementation was loaded successfully with priority (536870911).
2013-04-22 16:54:56,787 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated ORDERS_HIVE_DUMMY,,1366382877311.1d484c3c63185a3ab689726fcb535fcd.
2013-04-22 16:54:56,800 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store ORDER_DETAILS
2013-04-22 16:54:56,800 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/cldx-1140-1034,60020,1366629888662/cldx-1140-1034%2C60020%2C1366629888662.1366629896241, entries=3, filesize=961.  for /hbase/.logs/cldx-1140-1034,60020,1366629888662/cldx-1140-1034%2C60020%2C1366629888662.1366629896763
2013-04-22 16:54:56,807 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store ORDER_DETAILS
2013-04-22 16:54:56,813 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined ORDERS_HIVE,,1366374759692.9870fabb39dd37993459cae5a5f64ceb.; next sequenceid=1
2013-04-22 16:54:56,813 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Attempting to transition node 9870fabb39dd37993459cae5a5f64ceb from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2013-04-22 16:54:56,814 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Updated row ORDERS_BIG,,1366287893803.04513e1297ad0fe75f834b7f86ff22ab. with server=cldx-1140-1034,60020,1366629888662
2013-04-22 16:54:56,814 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Done with post open deploy task for region=ORDERS_BIG,,1366287893803.04513e1297ad0fe75f834b7f86ff22ab., daughter=false
2013-04-22 16:54:56,818 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Attempting to transition node 04513e1297ad0fe75f834b7f86ff22ab from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
2013-04-22 16:54:56,826 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Successfully transitioned node 04513e1297ad0fe75f834b7f86ff22ab from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
2013-04-22 16:54:56,827 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: region transitioned to opened in zookeeper: {NAME => 'ORDERS_BIG,,1366287893803.04513e1297ad0fe75f834b7f86ff22ab.', STARTKEY => '', ENDKEY => '', ENCODED => 04513e1297ad0fe75f834b7f86ff22ab,}, server: cldx-1140-1034,60020,1366629888662
2013-04-22 16:54:56,827 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opened ORDERS_BIG,,1366287893803.04513e1297ad0fe75f834b7f86ff22ab. on server:cldx-1140-1034,60020,1366629888662
2013-04-22 16:54:56,828 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Opening region: {NAME => 'PRODUCTS,,1366192676314.5fef4835dd5f8f5043c26a0f7e2bb812.', STARTKEY => '', ENDKEY => '', ENCODED => 5fef4835dd5f8f5043c26a0f7e2bb812,}
2013-04-22 16:54:56,828 INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor config now ...
2013-04-22 16:54:56,829 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Successfully transitioned node 9870fabb39dd37993459cae5a5f64ceb from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2013-04-22 16:54:56,829 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Registered protocol handler: region=PRODUCTS,,1366192676314.5fef4835dd5f8f5043c26a0f7e2bb812. protocol=org.apache.hadoop.hbase.coprocessor.AggregateProtocol
2013-04-22 16:54:56,829 INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: System coprocessor org.apache.hadoop.hbase.coprocessor.AggregateImplementation was loaded successfully with priority (536870911).
2013-04-22 16:54:56,829 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated PRODUCTS,,1366192676314.5fef4835dd5f8f5043c26a0f7e2bb812.
2013-04-22 16:54:56,829 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Post open deploy tasks for region=ORDERS_HIVE,,1366374759692.9870fabb39dd37993459cae5a5f64ceb., daughter=false
2013-04-22 16:54:56,833 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined ORDERS_HIVE_DUMMY,,1366382877311.1d484c3c63185a3ab689726fcb535fcd.; next sequenceid=1
2013-04-22 16:54:56,833 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Attempting to transition node 1d484c3c63185a3ab689726fcb535fcd from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2013-04-22 16:54:56,839 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Successfully transitioned node 1d484c3c63185a3ab689726fcb535fcd from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2013-04-22 16:54:56,845 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Post open deploy tasks for region=ORDERS_HIVE_DUMMY,,1366382877311.1d484c3c63185a3ab689726fcb535fcd., daughter=false
2013-04-22 16:54:56,857 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store PRODUCT_INFO
2013-04-22 16:54:56,871 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error detected. Found 2 replicas but expecting no less than 4 replicas.  Requesting close of hlog.
2013-04-22 16:54:56,872 DEBUG org.apache.hadoop.hbase.regionserver.LogRoller: HLog roll requested
2013-04-22 16:54:56,872 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Updated row ORDERS_HIVE,,1366374759692.9870fabb39dd37993459cae5a5f64ceb. with server=cldx-1140-1034,60020,1366629888662
2013-04-22 16:54:56,872 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Done with post open deploy task for region=ORDERS_HIVE,,1366374759692.9870fabb39dd37993459cae5a5f64ceb., daughter=false
2013-04-22 16:54:56,874 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Attempting to transition node 9870fabb39dd37993459cae5a5f64ceb from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
2013-04-22 16:54:56,883 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Updated row ORDERS_HIVE_DUMMY,,1366382877311.1d484c3c63185a3ab689726fcb535fcd. with server=cldx-1140-1034,60020,1366629888662
2013-04-22 16:54:56,883 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Done with post open deploy task for region=ORDERS_HIVE_DUMMY,,1366382877311.1d484c3c63185a3ab689726fcb535fcd., daughter=false
2013-04-22 16:54:56,883 DEBUG org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: using new createWriter -- HADOOP-6840
2013-04-22 16:54:56,887 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Successfully transitioned node 9870fabb39dd37993459cae5a5f64ceb from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
2013-04-22 16:54:56,887 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e317d2a010001 Attempting to transition node 1d484c3c63185a3ab689726fcb535fcd from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
2013-04-22 16:54:56,887 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: region transitioned to opened in zookeeper: {NAME => 'ORDERS_HIVE,,1366374759692.9870fabb39dd37993459cae5a5f64ceb.', STARTKEY => '', ENDKEY => '', ENCODED => 9870fabb39dd37993459cae5a5f64ceb,}, server: cldx-1140-1034,60020,1366629888662
2013-04-22 16:54:56,888 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opened ORDERS_HIVE,,1366374759692.9870fabb39dd37993459cae5a5f64ceb. on server:cldx-1140-1034,60020,1366629888662
2013-04-22 16:54:56,888 DEBUG org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Path=hdfs://cldx-1139-1033:9000/hbase/.logs/cldx-1140-1034,60020,1366629888662/cldx-1140-1034%2C60020%2C1366629888662.1366629896872, syncFs=true, hflush=false, compression=false
2013-04-22 16:54:56,888 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Opening region: {NAME => 'bulk_import,,1366106606639.9581c73100211b61522030d517ecaa6e.', STARTKEY => '', ENDKEY => '', ENCODED => 9581c73100211b61522030d517ecaa6e,}
2013-04-22 16:54:56,891 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Registered protocol handler: region=bulk_import,,1366106606639.9581c73100211b61522030d517ecaa6e. protocol=org.apache.hadoop.hbase.coprocessor.AggregateProtocol
2013-04-22 16:54:56,891 INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: System coprocessor org.apache.hadoop.hbase.coprocessor.AggregateImplementation was loaded successfully with priority (536870911).
2013-04-22 16:54:56,895 ERROR org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of region=bulk_import,,1366106606639.9581c73100211b61522030d517ecaa6e., starting to roll back the global memstore size.
java.lang.IllegalStateException: Could not instantiate a region instance.
        at org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:4073)
        at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4253)
        at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:329)
        at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:100)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
        at org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:4070)
        ... 7 more
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.loadTableCoprocessors(RegionCoprocessorHost.java:162)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.<init>(RegionCoprocessorHost.java:154)
        at org.apache.hadoop.hbase.regionserver.HRegion.<init>(HRegion.java:473)
        ... 12 more


2. Executed the client code as in the previous mails and got the same client-side trace. Below is what appeared in the logs after the client error:

=>hbase-hduser-master-cldx-1139-1033.log

2013-04-22 16:59:57,275 DEBUG org.apache.hadoop.hbase.master.HMaster: Not running balancer because 1 region(s) in transition: {9581c73100211b61522030d517ecaa6e=bulk_import,,1366106606639.9581c73100211b61522030d517ecaa6e. state=OFFLINE, ts=1366629984518, server=null}
2013-04-22 16:59:57,285 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Creating scanner over .META. starting at key ''
2013-04-22 16:59:57,285 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Advancing internal scanner to startKey at ''
2013-04-22 16:59:57,303 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Finished with scanning at {NAME => '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192,}
2013-04-22 16:59:57,304 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 8 catalog row(s) and gc'd 0 unreferenced parent region(s)
2013-04-22 17:04:57,276 DEBUG org.apache.hadoop.hbase.master.HMaster: Not running balancer because 1 region(s) in transition: {9581c73100211b61522030d517ecaa6e=bulk_import,,1366106606639.9581c73100211b61522030d517ecaa6e. state=OFFLINE, ts=1366629984518, server=null}
2013-04-22 17:04:57,284 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Creating scanner over .META. starting at key ''
2013-04-22 17:04:57,284 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Advancing internal scanner to startKey at ''
2013-04-22 17:04:57,301 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Finished with scanning at {NAME => '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192,}
2013-04-22 17:04:57,301 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 8 catalog row(s) and gc'd 0 unreferenced parent region(s)
2013-04-22 17:09:57,276 DEBUG org.apache.hadoop.hbase.master.HMaster: Not running balancer because 1 region(s) in transition: {9581c73100211b61522030d517ecaa6e=bulk_import,,1366106606639.9581c73100211b61522030d517ecaa6e. state=OFFLINE, ts=1366629984518, server=null}
2013-04-22 17:09:57,285 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Creating scanner over .META. starting at key ''
2013-04-22 17:09:57,285 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Advancing internal scanner to startKey at ''
2013-04-22 17:09:57,296 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Finished with scanning at {NAME => '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192,}
2013-04-22 17:09:57,297 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 8 catalog row(s) and gc'd 0 unreferenced parent region(s)
2013-04-22 17:14:57,276 DEBUG org.apache.hadoop.hbase.master.HMaster: Not running balancer because 1 region(s) in transition: {9581c73100211b61522030d517ecaa6e=bulk_import,,1366106606639.9581c73100211b61522030d517ecaa6e. state=OFFLINE, ts=1366629984518, server=null}
2013-04-22 17:14:57,284 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Creating scanner over .META. starting at key ''
2013-04-22 17:14:57,284 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Advancing internal scanner to startKey at ''
2013-04-22 17:14:57,293 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Finished with scanning at {NAME => '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192,}
2013-04-22 17:14:57,294 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 8 catalog row(s) and gc'd 0 unreferenced parent region(s)

=>hbase-hduser-regionserver-cldx-1140-1034.log

2013-04-22 16:59:48,848 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=2.03 MB, free=243.82 MB, max=245.84 MB, blocks=6, accesses=93, hits=87, hitRatio=93.54%, , cachingAccesses=93, cachingHits=87, cachingHitsRatio=93.54%, , evictions=0, evicted=0, evictedPerRun=NaN
2013-04-22 17:04:06,372 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Received dynamic protocol exec call with protocolName org.apache.hadoop.hbase.coprocessor.AggregateProtocol
2013-04-22 17:04:12,481 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:12,685 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=192.79 MB, single=213.77 MB, multi=257.12 KB, memory=10.65 KB
2013-04-22 17:04:13,230 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:13,257 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.66 MB, single=206.97 MB, multi=257.12 KB, memory=10.65 KB
2013-04-22 17:04:13,991 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:14,039 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=185.86 MB, single=207.34 MB, multi=257.12 KB, memory=10.65 KB
2013-04-22 17:04:25,177 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:25,192 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.41 MB, single=206.59 MB, multi=385.67 KB, memory=10.65 KB
2013-04-22 17:04:26,166 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:26,180 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=185.1 MB, single=206.9 MB, multi=385.67 KB, memory=10.65 KB
2013-04-22 17:04:27,187 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:27,195 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.79 MB, single=206.78 MB, multi=385.67 KB, memory=10.65 KB
2013-04-22 17:04:27,908 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:27,920 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.47 MB, single=206.59 MB, multi=385.67 KB, memory=10.65 KB
2013-04-22 17:04:28,902 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:28,908 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.66 MB, single=206.59 MB, multi=385.67 KB, memory=10.65 KB
2013-04-22 17:04:29,766 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:29,775 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=206.59 MB, multi=385.67 KB, memory=10.65 KB
2013-04-22 17:04:30,744 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:30,748 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.59 MB, single=206.59 MB, multi=514.21 KB, memory=10.65 KB
2013-04-22 17:04:31,753 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:31,756 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.41 MB, single=206.46 MB, multi=514.21 KB, memory=10.65 KB
2013-04-22 17:04:32,641 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:32,651 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=206.46 MB, multi=514.21 KB, memory=10.65 KB
2013-04-22 17:04:33,429 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:33,432 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.53 MB, single=206.52 MB, multi=514.21 KB, memory=10.65 KB
2013-04-22 17:04:34,249 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:34,252 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.4 MB, single=206.52 MB, multi=514.21 KB, memory=10.65 KB
2013-04-22 17:04:34,966 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.67 MB of total=209.05 MB
2013-04-22 17:04:34,970 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.71 MB, total=184.59 MB, single=206.59 MB, multi=514.21 KB, memory=10.65 KB
2013-04-22 17:04:35,732 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:35,736 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.59 MB, single=206.4 MB, multi=642.79 KB, memory=10.65 KB
2013-04-22 17:04:36,490 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:36,492 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=206.33 MB, multi=642.79 KB, memory=10.65 KB
2013-04-22 17:04:36,892 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:36,895 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.47 MB, single=206.46 MB, multi=642.79 KB, memory=10.65 KB
2013-04-22 17:04:37,654 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:37,656 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=206.33 MB, multi=642.79 KB, memory=10.65 KB
2013-04-22 17:04:38,151 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:38,153 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=206.34 MB, multi=642.79 KB, memory=10.65 KB
2013-04-22 17:04:39,498 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:39,500 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=206.21 MB, multi=771.32 KB, memory=10.65 KB
2013-04-22 17:04:40,308 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:40,310 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=206.21 MB, multi=771.32 KB, memory=10.65 KB
2013-04-22 17:04:40,933 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:41,123 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=189.51 MB, single=211.38 MB, multi=771.32 KB, memory=10.65 KB
2013-04-22 17:04:42,113 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:42,115 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.41 MB, single=206.27 MB, multi=771.32 KB, memory=10.65 KB
2013-04-22 17:04:42,711 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:42,714 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.53 MB, single=206.27 MB, multi=771.32 KB, memory=10.65 KB
2013-04-22 17:04:43,607 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:43,612 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=206.21 MB, multi=771.32 KB, memory=10.65 KB
2013-04-22 17:04:44,573 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:44,575 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=206.08 MB, multi=899.85 KB, memory=10.65 KB
2013-04-22 17:04:44,947 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:44,949 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=206.08 MB, multi=899.85 KB, memory=10.65 KB
2013-04-22 17:04:45,701 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:45,703 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.47 MB, single=206.08 MB, multi=899.85 KB, memory=10.65 KB
2013-04-22 17:04:46,360 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:46,493 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=187.05 MB, single=208.79 MB, multi=899.85 KB, memory=10.65 KB
2013-04-22 17:04:47,395 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:47,397 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=206.08 MB, multi=899.85 KB, memory=10.65 KB
2013-04-22 17:04:48,075 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:48,077 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=205.96 MB, multi=1 MB, memory=10.65 KB
2013-04-22 17:04:48,599 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:48,601 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=205.96 MB, multi=1 MB, memory=10.65 KB
2013-04-22 17:04:48,847 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=192.85 MB, free=52.99 MB, max=245.84 MB, blocks=3026, accesses=31948, hits=16018, hitRatio=50.13%, , cachingAccesses=31948, cachingHits=16018, cachingHitsRatio=50.13%, , evictions=33, evicted=12904, evictedPerRun=391.0303039550781
2013-04-22 17:04:49,140 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:49,142 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.4 MB, single=205.96 MB, multi=1 MB, memory=10.65 KB
2013-04-22 17:04:49,755 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:49,761 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.59 MB, single=206.08 MB, multi=1 MB, memory=10.65 KB
2013-04-22 17:04:50,109 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:50,111 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.53 MB, single=206.08 MB, multi=1 MB, memory=10.65 KB
2013-04-22 17:04:50,724 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:50,726 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=205.96 MB, multi=1 MB, memory=10.65 KB
2013-04-22 17:04:51,527 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:51,539 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.53 MB, single=205.89 MB, multi=1.13 MB, memory=10.65 KB
2013-04-22 17:04:52,138 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:52,140 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.53 MB, single=205.96 MB, multi=1.13 MB, memory=10.65 KB
2013-04-22 17:04:52,996 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:52,998 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=205.83 MB, multi=1.13 MB, memory=10.65 KB
2013-04-22 17:04:53,401 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:53,403 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.4 MB, single=205.83 MB, multi=1.13 MB, memory=10.65 KB
2013-04-22 17:04:54,195 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:54,198 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.53 MB, single=205.96 MB, multi=1.13 MB, memory=10.65 KB
2013-04-22 17:04:54,619 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:54,636 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.53 MB, single=205.83 MB, multi=1.26 MB, memory=10.65 KB
2013-04-22 17:04:55,223 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:55,225 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=205.71 MB, multi=1.26 MB, memory=10.65 KB
2013-04-22 17:04:55,830 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:55,832 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=205.71 MB, multi=1.26 MB, memory=10.65 KB
2013-04-22 17:04:56,392 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:56,395 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.53 MB, single=205.77 MB, multi=1.26 MB, memory=10.65 KB
2013-04-22 17:04:57,526 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:57,528 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=205.71 MB, multi=1.26 MB, memory=10.65 KB
2013-04-22 17:04:57,992 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:57,995 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.66 MB, single=205.71 MB, multi=1.26 MB, memory=10.65 KB
2013-04-22 17:04:59,030 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:59,032 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.53 MB, single=205.58 MB, multi=1.38 MB, memory=10.65 KB
2013-04-22 17:04:59,490 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:04:59,492 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.46 MB, single=205.71 MB, multi=1.38 MB, memory=10.65 KB
2013-04-22 17:05:00,089 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:05:00,091 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.53 MB, single=205.71 MB, multi=1.38 MB, memory=10.65 KB
2013-04-22 17:05:00,662 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:05:00,664 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=205.58 MB, multi=1.38 MB, memory=10.65 KB
2013-04-22 17:05:01,183 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:05:01,186 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=205.58 MB, multi=1.38 MB, memory=10.65 KB
2013-04-22 17:05:01,939 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.6 MB of total=208.99 MB
2013-04-22 17:05:01,941 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.46 MB, single=205.64 MB, multi=1.38 MB, memory=10.65 KB
2013-04-22 17:05:02,390 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.6 MB of total=208.99 MB
2013-04-22 17:05:02,393 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.53 MB, single=205.52 MB, multi=1.51 MB, memory=10.65 KB
2013-04-22 17:05:03,031 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.6 MB of total=208.99 MB
2013-04-22 17:05:03,033 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=205.45 MB, multi=1.51 MB, memory=10.65 KB
2013-04-22 17:05:03,629 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.6 MB of total=208.99 MB
2013-04-22 17:05:03,631 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.53 MB, single=205.52 MB, multi=1.51 MB, memory=10.65 KB
2013-04-22 17:05:04,172 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.61 MB of total=208.99 MB
2013-04-22 17:05:04,175 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.46 MB, single=205.52 MB, multi=1.51 MB, memory=10.65 KB
2013-04-22 17:05:05,116 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.6 MB of total=208.99 MB
2013-04-22 17:05:05,117 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=205.45 MB, multi=1.51 MB, memory=10.65 KB
2013-04-22 17:05:05,569 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.6 MB of total=208.99 MB
2013-04-22 17:05:05,571 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.34 MB, single=205.33 MB, multi=1.63 MB, memory=10.65 KB
2013-04-22 17:05:06,114 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 24.6 MB of total=208.99 MB
2013-04-22 17:05:06,116 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=24.65 MB, total=184.46 MB, single=205.39 MB, multi=1.63 MB, memory=10.65 KB
2013-04-22 17:05:06,569 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.ipc.CallerDisconnectedException: Aborting call execCoprocessor([B@4a65f2be, getRowNum(null, {"timeRange":[0,9223372036854775807],"batch":-1,"startRow":"","stopRow":"","totalColumns":0,"cacheBlocks":true,"families":{"ORDER_DETAILS":[]},"maxVersions":1,"filter":"FirstKeyOnlyFilter","caching":-1}), rpc version=1, client version=0, methodsFingerPrint=0), rpc version=1, client version=29, methodsFingerPrint=-1368823753 from 172.17.89.68:4636 after 59995 ms, since caller disconnected
        at org.apache.hadoop.hbase.ipc.HBaseServer$Call.throwExceptionIfCallerDisconnected(HBaseServer.java:436)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3851)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3771)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3754)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3785)
        at org.apache.hadoop.hbase.coprocessor.AggregateImplementation.getRowNum(AggregateImplementation.java:158)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.hbase.regionserver.HRegion.exec(HRegion.java:5476)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.execCoprocessor(HRegionServer.java:3719)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
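
For reference, the failure above is the client-side RPC timeout: the region server aborts the getRowNum() coprocessor call after roughly 60000 ms because the caller has already given up and disconnected. A minimal sketch of raising that timeout before building the AggregationClient is shown below; the property name hbase.rpc.timeout and the ten-minute value are assumptions for this HBase version, and raising the timeout only buys the scan more time rather than making it faster:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.coprocessor.AggregationClient;

// Sketch only: give the coprocessor rowCount() call more time than the default
// 60000 ms client RPC timeout that expires in the logs above and below.
Configuration config = HBaseConfiguration.create();
config.setLong("hbase.rpc.timeout", 10 * 60 * 1000L); // illustrative 10-minute value
AggregationClient aggregationClient = new AggregationClient(config);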

Regards,
Omkar Joshi


-----Original Message-----
From: Ted Yu [mailto:yuzhihong@gmail.com]
Sent: Friday, April 19, 2013 7:25 PM
To: user@hbase.apache.org
Cc: user@hbase.apache.org
Subject: Re: Speeding up the row count

The stack trace was from your HBase client.

Can you check server log ?

Thanks

On Apr 19, 2013, at 2:55 AM, Omkar Joshi <Om...@lntinfotech.com> wrote:

> Hi Ted,
>
> 6 minutes is too long :(
> Will this decrease to seconds if more nodes are added in the cluster?
>
> I finally got this exception (I faintly recall that some timeout parameter can be increased for such queries, but I didn't want to set it to a very high value):
>
> Apr 19, 2013 1:05:43 PM org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation processExecs
> WARNING: Error executing for row
> java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
> Fri Apr 19 12:56:01 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1770 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 12:57:02 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1782 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 12:58:04 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1785 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 12:59:05 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1794 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 13:00:08 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1800 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 13:01:10 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1802 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 13:02:14 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1804 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 13:03:19 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1809 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 13:04:27 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1812 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 13:05:43 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1829 remote=cldx-1140-1034/172.25.6.71:60020]
>
>        at java.util.concurrent.FutureTask$Sync.innerGet(Unknown Source)
>        at java.util.concurrent.FutureTask.get(Unknown Source)
>        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processExecs(HConnectionManager.java:1475)
>        at org.apache.hadoop.hbase.client.HTable.coprocessorExec(HTable.java:1236)
>        at org.apache.hadoop.hbase.client.coprocessor.AggregationClient.rowCount(AggregationClient.java:216)
>        at client.hbase.HBaseCRUD.getTableCount(HBaseCRUD.java:307)
>        at client.hbase.HBaseCRUD.main(HBaseCRUD.java:117)
> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
> Fri Apr 19 12:56:01 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1770 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 12:57:02 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1782 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 12:58:04 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1785 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 12:59:05 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1794 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 13:00:08 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1800 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 13:01:10 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1802 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 13:02:14 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1804 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 13:03:19 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1809 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 13:04:27 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1812 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 13:05:43 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1829 remote=cldx-1140-1034/172.25.6.71:60020]
>
>        at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:183)
>        at org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
>        at $Proxy6.getRowNum(Unknown Source)
>        at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:220)
>        at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:217)
>        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$4.call(HConnectionManager.java:1463)
>        at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>        at java.util.concurrent.FutureTask.run(Unknown Source)
>        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>        at java.lang.Thread.run(Unknown Source)
>
> Apr 19, 2013 1:05:43 PM org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation internalClose
> INFO: Closed zookeeper sessionid=0x13e185b8ee8003a
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
> Fri Apr 19 12:56:01 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1770 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 12:57:02 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1782 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 12:58:04 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1785 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 12:59:05 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1794 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 13:00:08 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1800 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 13:01:10 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1802 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 13:02:14 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1804 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 13:03:19 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1809 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 13:04:27 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1812 remote=cldx-1140-1034/172.25.6.71:60020]
> Fri Apr 19 13:05:43 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1829 remote=cldx-1140-1034/172.25.6.71:60020]
>
>        at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:183)
>        at org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
>        at $Proxy6.getRowNum(Unknown Source)
>        at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:220)
>        at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:217)
>        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$4.call(HConnectionManager.java:1463)
>        at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>        at java.util.concurrent.FutureTask.run(Unknown Source)
>        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>        at java.lang.Thread.run(Unknown Source)
>
> Regards,
> Omkar Joshi
>
>
>
> -----Original Message-----
> From: Ted Yu [mailto:yuzhihong@gmail.com]
> Sent: Friday, April 19, 2013 3:00 PM
> To: user@hbase.apache.org
> Cc: user@hbase.apache.org
> Subject: Re: Speeding up the row count
>
> Since there is only one region in your table, using aggregation coprocessor has no advantage.
> I think there may be some issue with your cluster - row count should finish within 6 minutes.
>
> Have you checked server logs ?
>
> Thanks
>
> On Apr 19, 2013, at 12:33 AM, Omkar Joshi <Om...@lntinfotech.com> wrote:
>
>> Hi,
>>
>> I have a 2-node (VM-based) Hadoop cluster on top of which HBase is running in distributed mode.
>>
>> I have a table named ORDERS with more than 100,000 rows.
>>
>> NOTE: Since my cluster is ultra-small, I didn't pre-split the table.
>>
>> ORDERS
>> rowkey :                ORDER_ID
>>
>> column family : ORDER_DETAILS
>>       columns : CUSTOMER_ID
>>                       PRODUCT_ID
>>                       REQUEST_DATE
>>                       PRODUCT_QUANTITY
>>                       PRICE
>>                       PAYMENT_MODE
>>
>> The java client code to simply check the count of the records is :
>>
>> public long getTableCount(String tableName, String columnFamilyName) {
>>
>>               AggregationClient aggregationClient = new AggregationClient(config);
>>               Scan scan = new Scan();
>>               scan.addFamily(Bytes.toBytes(columnFamilyName));
>>               scan.setFilter(new FirstKeyOnlyFilter());
>>
>>               long rowCount = 0;
>>
>>               try {
>>                       rowCount = aggregationClient.rowCount(Bytes.toBytes(tableName),
>>                                       null, scan);
>>                       System.out.println("No. of rows in " + tableName + " is "
>>                                       + rowCount);
>>               } catch (Throwable e) {
>>                       // rowCount() declares Throwable; log it and return the default value
>>                       e.printStackTrace();
>>               }
>>
>>               return rowCount;
>>       }
>>
>> It is running for more than 6 minutes now :(
>>
>> What shall I do to speed up the execution to milliseconds (or at least a couple of seconds)?
>>
>> Regards,
>> Omkar Joshi
>>
>>
>> -----Original Message-----
>> From: Vedad Kirlic [mailto:kirlich@gmail.com]
>> Sent: Thursday, April 18, 2013 12:22 AM
>> To: user@hbase.apache.org
>> Subject: Re: Speeding up the row count
>>
>> Hi Omkar,
>>
>> If you are not interested in occurrences of a specific column (e.g. name,
>> email ...) and just want the total number of rows (regardless of their
>> content, i.e. columns), you should avoid adding any columns to the Scan. In
>> that case the coprocessor implementation behind AggregationClient will add a
>> FirstKeyOnlyFilter to the Scan to avoid loading unnecessary columns, which
>> should result in some speed-up.
>>
>> This is a similar approach to what the hbase shell 'count' implementation
>> does, although the reduction in overhead is bigger there, since the data
>> transferred from the region server to the client (shell) is minimized,
>> whereas with the coprocessor the data never leaves the region server, so most
>> of the improvement should come from avoiding the loading of unnecessary
>> files. Not sure how this will apply to your particular case, given that the
>> data set per row seems rather small. Also, with AggregationClient you will
>> benefit if/when your tables span multiple regions. Essentially, the
>> performance of this approach will 'degrade' as your table gets bigger, but
>> only up to the point when it splits, after which it should be fairly
>> constant. With this in mind, and given your type of data, you might consider
>> pre-splitting your tables.
>>
>> DISCLAIMER: this is mostly theoretical, since I'm not an expert in hbase
>> internals :), so your best bet is to try it - I'm too lazy to verify the
>> impact myself ;)
>>
>> Finally, if your case can tolerate the counters being only eventually
>> consistent with the actual number of rows, you can, as already suggested, run
>> the RowCounter map reduce job every once in a while, write the counter(s)
>> back to hbase, and read those when you need the number of rows.
>>
>> Regards,
>> Vedad
>>
>>
>>
>> --
>> View this message in context: http://apache-hbase.679495.n3.nabble.com/Speeding-up-the-row-count-tp4042378p4042415.html
>> Sent from the HBase User mailing list archive at Nabble.com.
>>
>> The contents of this e-mail and any attachment(s) may contain confidential or privileged information for the intended recipient(s). Unintended recipients are prohibited from taking action on the basis of information in this e-mail and  using or disseminating the information,  and must notify the sender and delete it from their system. L&T Infotech will not accept responsibility or liability for the accuracy or completeness of, or the presence of any virus or disabling code in this e-mail"
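
Tying together the two suggestions quoted above - Ted's point that the aggregation coprocessor can only parallelize across regions, and Vedad's suggestion to pre-split - here is a minimal sketch of creating a pre-split ORDERS table and counting it with an empty Scan. The split keys, the assumption that ORDER_ID row keys are evenly distributed numeric strings, and the omission of try/catch around the checked exceptions are illustrative shortcuts, not details taken from this thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.coprocessor.AggregationClient;
import org.apache.hadoop.hbase.util.Bytes;

Configuration config = HBaseConfiguration.create();

// Sketch only: create ORDERS pre-split into four regions so that
// AggregationClient.rowCount() can run one coprocessor scan per region in
// parallel. The split points are hypothetical; derive real ones from the
// actual ORDER_ID key distribution.
HBaseAdmin admin = new HBaseAdmin(config);
HTableDescriptor desc = new HTableDescriptor("ORDERS");
desc.addFamily(new HColumnDescriptor("ORDER_DETAILS"));
byte[][] splitKeys = new byte[][] { Bytes.toBytes("250000"),
        Bytes.toBytes("500000"), Bytes.toBytes("750000") };
admin.createTable(desc, splitKeys);
admin.close();

// Count with an empty Scan (no family or column added), letting the
// coprocessor fall back to a FirstKeyOnlyFilter as described by Vedad above.
// rowCount() declares Throwable, so wrap it in try/catch as in the snippets
// earlier in this thread.
AggregationClient aggregationClient = new AggregationClient(config);
long rowCount = aggregationClient.rowCount(Bytes.toBytes("ORDERS"), null, new Scan());
System.out.println("ORDERS row count = " + rowCount);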

Re: Speeding up the row count

Posted by Ted Yu <yu...@gmail.com>.
The stack trace was from your HBase client. 

Can you check server log ?

Thanks


Re: Speeding up the row count

Posted by lars hofhansl <la...@apache.org>.
You should expect to be able to scan about 1-2 million small rows per second per core if everything is in the cache.
Something is definitely wrong in your setup. Can you post your config files (HBase and HDFS) via pastebin?
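
As a rough back-of-the-envelope check against that figure: at 1-2 million small rows per second per core, counting the >100,000-row ORDERS table should take well under a second per core once the data is cached, so the multi-minute timings reported earlier in this thread point at a configuration or cluster problem rather than an inherent cost of the row count itself.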


-- Lars




        at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:183)
        at org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
        at $Proxy6.getRowNum(Unknown Source)
        at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:220)
        at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:217)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$4.call(HConnectionManager.java:1463)
        at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)

Apr 19, 2013 1:05:43 PM org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation internalClose
INFO: Closed zookeeper sessionid=0x13e185b8ee8003a
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
Fri Apr 19 12:56:01 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1770 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:57:02 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1782 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:58:04 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1785 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:59:05 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1794 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:00:08 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1800 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:01:10 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1802 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:02:14 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1804 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:03:19 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1809 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:04:27 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1812 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:05:43 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1829 remote=cldx-1140-1034/172.25.6.71:60020]

        at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:183)
        at org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
        at $Proxy6.getRowNum(Unknown Source)
        at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:220)
        at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:217)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$4.call(HConnectionManager.java:1463)
        at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)

Regards,
Omkar Joshi



-----Original Message-----
From: Ted Yu [mailto:yuzhihong@gmail.com]
Sent: Friday, April 19, 2013 3:00 PM
To: user@hbase.apache.org
Cc: user@hbase.apache.org
Subject: Re: Speeding up the row count

Since there is only one region in your table, using aggregation coprocessor has no advantage.
I think there may be some issue with your cluster - row count should finish within 6 minutes.

Have you checked server logs ?

Thanks

On Apr 19, 2013, at 12:33 AM, Omkar Joshi <Om...@lntinfotech.com> wrote:

> Hi,
>
> I'm having a 2-node(VMs) Hadoop cluster atop which HBase is running in the distributed mode.
>
> I'm having a table named ORDERS with >100000 rows.
>
> NOTE : Since my cluster is ultra-small, I didn't pre-split the table.
>
> ORDERS
> rowkey :                ORDER_ID
>
> column family : ORDER_DETAILS
>        columns : CUSTOMER_ID
>                        PRODUCT_ID
>                        REQUEST_DATE
>                        PRODUCT_QUANTITY
>                        PRICE
>                        PAYMENT_MODE
>
> The java client code to simply check the count of the records is :
>
> public long getTableCount(String tableName, String columnFamilyName) {
>
>                AggregationClient aggregationClient = new AggregationClient(config);
>                Scan scan = new Scan();
>                scan.addFamily(Bytes.toBytes(columnFamilyName));
>                scan.setFilter(new FirstKeyOnlyFilter());
>
>                long rowCount = 0;
>
>                try {
>                        rowCount = aggregationClient.rowCount(Bytes.toBytes(tableName),
>                                        null, scan);
>                        System.out.println("No. of rows in " + tableName + " is "
>                                        + rowCount);
>                } catch (Throwable e) {
>                        // TODO Auto-generated catch block
>                        e.printStackTrace();
>                }
>
>                return rowCount;
>        }
>
> It is running for more than 6 minutes now :(
>
> What shall I do to speed up the execution to milliseconds(at least a couple of seconds)?
>
> Regards,
> Omkar Joshi
>
>
> -----Original Message-----
> From: Vedad Kirlic [mailto:kirlich@gmail.com]
> Sent: Thursday, April 18, 2013 12:22 AM
> To: user@hbase.apache.org
> Subject: Re: Speeding up the row count
>
> Hi Omkar,
>
> If you are not interested in occurrences of specific column (e.g. name,
> email ... ), and just want to get total number of rows (regardless of their
> content - i.e. columns), you should avoid adding any columns to the Scan, in
> which case coprocessor implementation for AggregateClient, will add
> FirstKeyOnlyFilter to the Scan, so to avoid loading unnecessary columns, so
> this should result in some speed up.
>
> This is similar approach to what hbase shell 'count' implementation does,
> although reduction in overhead in that case is bigger, since data transfer
> from region server to client (shell) is minimized, whereas in case of
> coprocessor, data does not leave region server, so most of the improvement
> in that case should come from avoiding loading of unnecessary files. Not
> sure how this will apply to your particular case, given that data set per
> row seems to be rather small. Also, in case of AggregateClient you will
> benefit if/when your tables span multiple regions. Essentially, performance
> of this approach will 'degrade' as your table gets bigger, but only to the
> point when it splits, from which point it should be pretty constant. Having
> this in mind, and your type of data, you might consider pre-splitting your
> tables.
>
> DISCLAIMER: this is mostly theoretical, since I'm not an expert in hbase
> internals :), so your best bet is to try it - I'm too lazy to verify impact
> my self ;)
>
> Finally, if your case can tolerate eventual consistency of counters with
> actual number of rows, you can, as already suggested, have RowCounter map
> reduce run every once in a while, write the counter(s) back to hbase, and
> read those when you need to obtain the number of rows.
>
> Regards,
> Vedad
>
>
>
> --
> View this message in context: http://apache-hbase.679495.n3.nabble.com/Speeding-up-the-row-count-tp4042378p4042415.html
> Sent from the HBase User mailing list archive at Nabble.com.
>
> The contents of this e-mail and any attachment(s) may contain confidential or privileged information for the intended recipient(s). Unintended recipients are prohibited from taking action on the basis of information in this e-mail and  using or disseminating the information,  and must notify the sender and delete it from their system. L&T Infotech will not accept responsibility or liability for the accuracy or completeness of, or the presence of any virus or disabling code in this e-mail"

RE: Speeding up the row count

Posted by Omkar Joshi <Om...@lntinfotech.com>.
Hi Ted,

6 minutes is too long :(
Will this decrease to seconds if more nodes are added in the cluster?

I finally got this exception (I vaguely recall there being some timeout parameter that can be increased for such queries, but I didn't want to set it to a very high value) :

Apr 19, 2013 1:05:43 PM org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation processExecs
WARNING: Error executing for row
java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
Fri Apr 19 12:56:01 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1770 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:57:02 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1782 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:58:04 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1785 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:59:05 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1794 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:00:08 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1800 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:01:10 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1802 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:02:14 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1804 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:03:19 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1809 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:04:27 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1812 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:05:43 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1829 remote=cldx-1140-1034/172.25.6.71:60020]

        at java.util.concurrent.FutureTask$Sync.innerGet(Unknown Source)
        at java.util.concurrent.FutureTask.get(Unknown Source)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processExecs(HConnectionManager.java:1475)
        at org.apache.hadoop.hbase.client.HTable.coprocessorExec(HTable.java:1236)
        at org.apache.hadoop.hbase.client.coprocessor.AggregationClient.rowCount(AggregationClient.java:216)
        at client.hbase.HBaseCRUD.getTableCount(HBaseCRUD.java:307)
        at client.hbase.HBaseCRUD.main(HBaseCRUD.java:117)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
Fri Apr 19 12:56:01 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1770 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:57:02 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1782 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:58:04 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1785 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:59:05 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1794 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:00:08 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1800 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:01:10 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1802 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:02:14 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1804 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:03:19 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1809 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:04:27 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1812 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:05:43 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1829 remote=cldx-1140-1034/172.25.6.71:60020]

        at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:183)
        at org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
        at $Proxy6.getRowNum(Unknown Source)
        at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:220)
        at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:217)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$4.call(HConnectionManager.java:1463)
        at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)

Apr 19, 2013 1:05:43 PM org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation internalClose
INFO: Closed zookeeper sessionid=0x13e185b8ee8003a
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
Fri Apr 19 12:56:01 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1770 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:57:02 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1782 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:58:04 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1785 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:59:05 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1794 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:00:08 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1800 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:01:10 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1802 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:02:14 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1804 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:03:19 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1809 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:04:27 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1812 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:05:43 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1829 remote=cldx-1140-1034/172.25.6.71:60020]

        at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:183)
        at org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
        at $Proxy6.getRowNum(Unknown Source)
        at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:220)
        at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:217)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$4.call(HConnectionManager.java:1463)
        at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
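
For reference, the parameter I was thinking of seems to be hbase.rpc.timeout - the "60000 millis timeout" in the trace above matches its default. A minimal client-side sketch of raising it (the 10-minute value is only an illustration, not a recommendation):

        // Sketch only: raise the client RPC timeout before creating the
        // AggregationClient, so a long-running coprocessor call is not cut off
        // at the default 60000 ms. The value below is purely illustrative.
        Configuration config = HBaseConfiguration.create();
        config.setInt("hbase.rpc.timeout", 600000);
        AggregationClient aggregationClient = new AggregationClient(config);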

Regards,
Omkar Joshi



-----Original Message-----
From: Ted Yu [mailto:yuzhihong@gmail.com]
Sent: Friday, April 19, 2013 3:00 PM
To: user@hbase.apache.org
Cc: user@hbase.apache.org
Subject: Re: Speeding up the row count

Since there is only one region in your table, using aggregation coprocessor has no advantage.
I think there may be some issue with your cluster - row count should finish within 6 minutes.

Have you checked server logs ?

Thanks

On Apr 19, 2013, at 12:33 AM, Omkar Joshi <Om...@lntinfotech.com> wrote:

> Hi,
>
> I'm having a 2-node(VMs) Hadoop cluster atop which HBase is running in the distributed mode.
>
> I'm having a table named ORDERS with >100000 rows.
>
> NOTE : Since my cluster is ultra-small, I didn't pre-split the table.
>
> ORDERS
> rowkey :                ORDER_ID
>
> column family : ORDER_DETAILS
>        columns : CUSTOMER_ID
>                        PRODUCT_ID
>                        REQUEST_DATE
>                        PRODUCT_QUANTITY
>                        PRICE
>                        PAYMENT_MODE
>
> The java client code to simply check the count of the records is :
>
> public long getTableCount(String tableName, String columnFamilyName) {
>
>                AggregationClient aggregationClient = new AggregationClient(config);
>                Scan scan = new Scan();
>                scan.addFamily(Bytes.toBytes(columnFamilyName));
>                scan.setFilter(new FirstKeyOnlyFilter());
>
>                long rowCount = 0;
>
>                try {
>                        rowCount = aggregationClient.rowCount(Bytes.toBytes(tableName),
>                                        null, scan);
>                        System.out.println("No. of rows in " + tableName + " is "
>                                        + rowCount);
>                } catch (Throwable e) {
>                        // TODO Auto-generated catch block
>                        e.printStackTrace();
>                }
>
>                return rowCount;
>        }
>
> It is running for more than 6 minutes now :(
>
> What shall I do to speed up the execution to milliseconds(at least a couple of seconds)?
>
> Regards,
> Omkar Joshi
>
>
> -----Original Message-----
> From: Vedad Kirlic [mailto:kirlich@gmail.com]
> Sent: Thursday, April 18, 2013 12:22 AM
> To: user@hbase.apache.org
> Subject: Re: Speeding up the row count
>
> Hi Omkar,
>
> If you are not interested in occurrences of specific column (e.g. name,
> email ... ), and just want to get total number of rows (regardless of their
> content - i.e. columns), you should avoid adding any columns to the Scan, in
> which case coprocessor implementation for AggregateClient, will add
> FirstKeyOnlyFilter to the Scan, so to avoid loading unnecessary columns, so
> this should result in some speed up.
>
> This is similar approach to what hbase shell 'count' implementation does,
> although reduction in overhead in that case is bigger, since data transfer
> from region server to client (shell) is minimized, whereas in case of
> coprocessor, data does not leave region server, so most of the improvement
> in that case should come from avoiding loading of unnecessary files. Not
> sure how this will apply to your particular case, given that data set per
> row seems to be rather small. Also, in case of AggregateClient you will
> benefit if/when your tables span multiple regions. Essentially, performance
> of this approach will 'degrade' as your table gets bigger, but only to the
> point when it splits, from which point it should be pretty constant. Having
> this in mind, and your type of data, you might consider pre-splitting your
> tables.
>
> DISCLAIMER: this is mostly theoretical, since I'm not an expert in hbase
> internals :), so your best bet is to try it - I'm too lazy to verify impact
> my self ;)
>
> Finally, if your case can tolerate eventual consistency of counters with
> actual number of rows, you can, as already suggested, have RowCounter map
> reduce run every once in a while, write the counter(s) back to hbase, and
> read those when you need to obtain the number of rows.
>
> Regards,
> Vedad
>
>
>
> --
> View this message in context: http://apache-hbase.679495.n3.nabble.com/Speeding-up-the-row-count-tp4042378p4042415.html
> Sent from the HBase User mailing list archive at Nabble.com.
>
> The contents of this e-mail and any attachment(s) may contain confidential or privileged information for the intended recipient(s). Unintended recipients are prohibited from taking action on the basis of information in this e-mail and  using or disseminating the information,  and must notify the sender and delete it from their system. L&T Infotech will not accept responsibility or liability for the accuracy or completeness of, or the presence of any virus or disabling code in this e-mail"

Re: Speeding up the row count

Posted by Ted Yu <yu...@gmail.com>.
Since there is only one region in your table, using the aggregation coprocessor has no advantage.
I think there may be some issue with your cluster - the row count should finish well within 6 minutes.

Have you checked server logs ?
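
If you want to confirm how many regions the table currently has from the client side, something along these lines should do (a rough sketch - "ORDERS" and the config object are just placeholders, and exception handling is omitted):

        // Rough sketch: list the regions of a table and the servers hosting them.
        HTable table = new HTable(config, "ORDERS");
        NavigableMap<HRegionInfo, ServerName> regions = table.getRegionLocations();
        System.out.println("ORDERS has " + regions.size() + " region(s)");
        for (Map.Entry<HRegionInfo, ServerName> entry : regions.entrySet()) {
            System.out.println(entry.getKey().getRegionNameAsString()
                    + " -> " + entry.getValue());
        }
        table.close();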

Thanks

On Apr 19, 2013, at 12:33 AM, Omkar Joshi <Om...@lntinfotech.com> wrote:

> Hi,
> 
> I'm having a 2-node(VMs) Hadoop cluster atop which HBase is running in the distributed mode.
> 
> I'm having a table named ORDERS with >100000 rows.
> 
> NOTE : Since my cluster is ultra-small, I didn't pre-split the table.
> 
> ORDERS
> rowkey :                ORDER_ID
> 
> column family : ORDER_DETAILS
>        columns : CUSTOMER_ID
>                        PRODUCT_ID
>                        REQUEST_DATE
>                        PRODUCT_QUANTITY
>                        PRICE
>                        PAYMENT_MODE
> 
> The java client code to simply check the count of the records is :
> 
> public long getTableCount(String tableName, String columnFamilyName) {
> 
>                AggregationClient aggregationClient = new AggregationClient(config);
>                Scan scan = new Scan();
>                scan.addFamily(Bytes.toBytes(columnFamilyName));
>                scan.setFilter(new FirstKeyOnlyFilter());
> 
>                long rowCount = 0;
> 
>                try {
>                        rowCount = aggregationClient.rowCount(Bytes.toBytes(tableName),
>                                        null, scan);
>                        System.out.println("No. of rows in " + tableName + " is "
>                                        + rowCount);
>                } catch (Throwable e) {
>                        // TODO Auto-generated catch block
>                        e.printStackTrace();
>                }
> 
>                return rowCount;
>        }
> 
> It is running for more than 6 minutes now :(
> 
> What shall I do to speed up the execution to milliseconds(at least a couple of seconds)?
> 
> Regards,
> Omkar Joshi
> 
> 
> -----Original Message-----
> From: Vedad Kirlic [mailto:kirlich@gmail.com]
> Sent: Thursday, April 18, 2013 12:22 AM
> To: user@hbase.apache.org
> Subject: Re: Speeding up the row count
> 
> Hi Omkar,
> 
> If you are not interested in occurrences of specific column (e.g. name,
> email ... ), and just want to get total number of rows (regardless of their
> content - i.e. columns), you should avoid adding any columns to the Scan, in
> which case coprocessor implementation for AggregateClient, will add
> FirstKeyOnlyFilter to the Scan, so to avoid loading unnecessary columns, so
> this should result in some speed up.
> 
> This is similar approach to what hbase shell 'count' implementation does,
> although reduction in overhead in that case is bigger, since data transfer
> from region server to client (shell) is minimized, whereas in case of
> coprocessor, data does not leave region server, so most of the improvement
> in that case should come from avoiding loading of unnecessary files. Not
> sure how this will apply to your particular case, given that data set per
> row seems to be rather small. Also, in case of AggregateClient you will
> benefit if/when your tables span multiple regions. Essentially, performance
> of this approach will 'degrade' as your table gets bigger, but only to the
> point when it splits, from which point it should be pretty constant. Having
> this in mind, and your type of data, you might consider pre-splitting your
> tables.
> 
> DISCLAIMER: this is mostly theoretical, since I'm not an expert in hbase
> internals :), so your best bet is to try it - I'm too lazy to verify impact
> my self ;)
> 
> Finally, if your case can tolerate eventual consistency of counters with
> actual number of rows, you can, as already suggested, have RowCounter map
> reduce run every once in a while, write the counter(s) back to hbase, and
> read those when you need to obtain the number of rows.
> 
> Regards,
> Vedad
> 
> 
> 
> --
> View this message in context: http://apache-hbase.679495.n3.nabble.com/Speeding-up-the-row-count-tp4042378p4042415.html
> Sent from the HBase User mailing list archive at Nabble.com.
> 
> The contents of this e-mail and any attachment(s) may contain confidential or privileged information for the intended recipient(s). Unintended recipients are prohibited from taking action on the basis of information in this e-mail and  using or disseminating the information,  and must notify the sender and delete it from their system. L&T Infotech will not accept responsibility or liability for the accuracy or completeness of, or the presence of any virus or disabling code in this e-mail"

RE: Speeding up the row count

Posted by Omkar Joshi <Om...@lntinfotech.com>.
Hi,

I have a 2-node (VM-based) Hadoop cluster on top of which HBase is running in distributed mode.

I have a table named ORDERS with more than 100,000 rows.

NOTE: Since my cluster is very small, I didn't pre-split the table.

ORDERS
rowkey :                ORDER_ID

column family : ORDER_DETAILS
        columns : CUSTOMER_ID
                        PRODUCT_ID
                        REQUEST_DATE
                        PRODUCT_QUANTITY
                        PRICE
                        PAYMENT_MODE

The Java client code that simply checks the record count is:

public long getTableCount(String tableName, String columnFamilyName) {

                AggregationClient aggregationClient = new AggregationClient(config);
                Scan scan = new Scan();
                scan.addFamily(Bytes.toBytes(columnFamilyName));
                scan.setFilter(new FirstKeyOnlyFilter());

                long rowCount = 0;

                try {
                        rowCount = aggregationClient.rowCount(Bytes.toBytes(tableName),
                                        null, scan);
                        System.out.println("No. of rows in " + tableName + " is "
                                        + rowCount);
                } catch (Throwable e) {
                        // rowCount() declares Throwable; log it and fall through with rowCount = 0
                        e.printStackTrace();
                }

                return rowCount;
        }

It is running for more than 6 minutes now :(

What should I do to speed up the execution to milliseconds (or at least a couple of seconds)?

Regards,
Omkar Joshi


-----Original Message-----
From: Vedad Kirlic [mailto:kirlich@gmail.com]
Sent: Thursday, April 18, 2013 12:22 AM
To: user@hbase.apache.org
Subject: Re: Speeding up the row count

Hi Omkar,

If you are not interested in occurrences of specific column (e.g. name,
email ... ), and just want to get total number of rows (regardless of their
content - i.e. columns), you should avoid adding any columns to the Scan, in
which case coprocessor implementation for AggregateClient, will add
FirstKeyOnlyFilter to the Scan, so to avoid loading unnecessary columns, so
this should result in some speed up.

This is similar approach to what hbase shell 'count' implementation does,
although reduction in overhead in that case is bigger, since data transfer
from region server to client (shell) is minimized, whereas in case of
coprocessor, data does not leave region server, so most of the improvement
in that case should come from avoiding loading of unnecessary files. Not
sure how this will apply to your particular case, given that data set per
row seems to be rather small. Also, in case of AggregateClient you will
benefit if/when your tables span multiple regions. Essentially, performance
of this approach will 'degrade' as your table gets bigger, but only to the
point when it splits, from which point it should be pretty constant. Having
this in mind, and your type of data, you might consider pre-splitting your
tables.

DISCLAIMER: this is mostly theoretical, since I'm not an expert in hbase
internals :), so your best bet is to try it - I'm too lazy to verify impact
my self ;)

Finally, if your case can tolerate eventual consistency of counters with
actual number of rows, you can, as already suggested, have RowCounter map
reduce run every once in a while, write the counter(s) back to hbase, and
read those when you need to obtain the number of rows.

Regards,
Vedad



--
View this message in context: http://apache-hbase.679495.n3.nabble.com/Speeding-up-the-row-count-tp4042378p4042415.html
Sent from the HBase User mailing list archive at Nabble.com.

The contents of this e-mail and any attachment(s) may contain confidential or privileged information for the intended recipient(s). Unintended recipients are prohibited from taking action on the basis of information in this e-mail and  using or disseminating the information,  and must notify the sender and delete it from their system. L&T Infotech will not accept responsibility or liability for the accuracy or completeness of, or the presence of any virus or disabling code in this e-mail"

Re: Speeding up the row count

Posted by Vedad Kirlic <ki...@gmail.com>.
Hi Omkar,

If you are not interested in occurrences of a specific column (e.g. name,
email ...), and just want the total number of rows (regardless of their
content, i.e. columns), you should avoid adding any columns to the Scan. In
that case the coprocessor implementation behind AggregationClient will add a
FirstKeyOnlyFilter to the Scan by itself, avoiding the loading of unnecessary
columns, and this should result in some speed-up.
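
In other words, roughly something like this (a sketch only, reusing the
ORDERS example and the config object from the snippets earlier in this
thread):

        // Sketch: count rows without naming any column, so the coprocessor can
        // apply FirstKeyOnlyFilter itself and skip loading the remaining columns.
        // (Exception handling is kept minimal; rowCount() declares Throwable.)
        AggregationClient aggregationClient = new AggregationClient(config);
        Scan scan = new Scan();
        scan.addFamily(Bytes.toBytes("ORDER_DETAILS"));  // family only, no addColumn()
        try {
            long rowCount = aggregationClient.rowCount(Bytes.toBytes("ORDERS"),
                    null, scan);
            System.out.println("row count is " + rowCount);
        } catch (Throwable t) {
            t.printStackTrace();
        }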

This is a similar approach to what the hbase shell 'count' implementation
does, although the reduction in overhead is bigger in that case, since the
data transfer from the region server to the client (the shell) is minimized,
whereas with the coprocessor the data never leaves the region server anyway,
so most of the improvement there comes from avoiding the loading of
unnecessary files. I am not sure how much this will help in your particular
case, given that the data set per row seems to be rather small. Also, with
AggregationClient you will benefit if/when your tables span multiple
regions. Essentially, the performance of this approach will 'degrade' as
your table gets bigger, but only up to the point where it splits, after
which it should be pretty constant. With this in mind, and given your type
of data, you might consider pre-splitting your tables.
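
For completeness, pre-splitting is easiest at table creation time, roughly
like this (a rough sketch - the split keys are invented and would have to
match the real row key distribution, and exception handling is omitted):

        // Rough sketch: create the ORDERS table pre-split into four regions.
        // The split keys below are made up for illustration only.
        HBaseAdmin admin = new HBaseAdmin(config);
        HTableDescriptor desc = new HTableDescriptor("ORDERS");
        desc.addFamily(new HColumnDescriptor("ORDER_DETAILS"));
        byte[][] splits = new byte[][] { Bytes.toBytes("25000"),
                Bytes.toBytes("50000"), Bytes.toBytes("75000") };
        admin.createTable(desc, splits);
        admin.close();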

DISCLAIMER: this is mostly theoretical, since I'm not an expert in hbase
internals :), so your best bet is to try it - I'm too lazy to verify impact
myself ;)

Finally, if your case can tolerate the counters being only eventually
consistent with the actual number of rows, you can, as already suggested,
run the RowCounter map reduce job every once in a while, write the
counter(s) back to hbase, and read those whenever you need the number of
rows.
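
(The map reduce job itself can be started from the command line, roughly as
hbase org.apache.hadoop.hbase.mapreduce.RowCounter <tablename> - ORDERS in
the example above - and the resulting row count read from the job's
counters.)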

Regards,
Vedad



--
View this message in context: http://apache-hbase.679495.n3.nabble.com/Speeding-up-the-row-count-tp4042378p4042415.html
Sent from the HBase User mailing list archive at Nabble.com.