Posted to user@hbase.apache.org by Zaharije Pasalic <pa...@gmail.com> on 2010/01/18 17:47:08 UTC

Configuration limits for hbase and hadoop ...

Hi

we are using a 7-node HBase cluster plus 1 master. Each node and the
master have 8GB of memory and a 4-core CPU. The one master is used for
Hadoop and also as the HBase master, and the 7 nodes are shared between
Hadoop and HBase. In the configuration files we set 2GB of memory for
HBase and an additional 2GB for Hadoop. HDFS has 1.6TB of free space.

Now we are trying to import 50 million rows of data. Each row has 100
columns (in reality we will have a sparsely populated table, but for now
we are testing the worst-case scenario). The 50 million records are
encoded in about 100 CSV files stored in HDFS.

The importing process is a really simple one: a small MapReduce program
reads the CSV files, splits each line, and inserts the fields into the
table (Map only, no Reduce part). We are using the default Hadoop
configuration (on 7 nodes we can run 14 maps). We also use a 32MB
writeBufferSize on the HBase client and set setWriteToWAL to false.
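
Roughly, the map task looks like this (a simplified sketch; the table and
column-family names here are made up, but the buffer and WAL settings are
the ones described above):

    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Map-only import: one Put per CSV line, buffered on the client side.
    public class CsvImportMapper
        extends Mapper<LongWritable, Text, LongWritable, Text> {
      private static final byte[] FAMILY = Bytes.toBytes("data");
      private HTable table;

      protected void setup(Context context) throws IOException {
        table = new HTable(new HBaseConfiguration(), "profiles");
        table.setAutoFlush(false);                  // buffer puts client-side
        table.setWriteBufferSize(32 * 1024 * 1024); // 32MB write buffer
      }

      protected void map(LongWritable offset, Text line, Context context)
          throws IOException {
        String[] fields = line.toString().split(",");
        Put put = new Put(Bytes.toBytes(fields[0])); // first field is the row key
        put.setWriteToWAL(false);                    // we disable the WAL
        for (int i = 1; i < fields.length; i++) {
          put.add(FAMILY, Bytes.toBytes("col" + i), Bytes.toBytes(fields[i]));
        }
        table.put(put);
      }

      protected void cleanup(Context context) throws IOException {
        table.flushCommits();                        // push the last buffer
      }
    }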

At the beginning everything looks fine, but after ~33 million records we
encounter strange behavior in HBase.

First, the node where the META table resides has a high load. The status
web page shows ~1700 requests on that node even when we are not running
any MapReduce job (0 requests on the other nodes). Also, I do not see any
activity in the log files on that node. Here are the last couple of lines
from the log on that node:

2010-01-18 14:46:26,666 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
profiles2,1a2e1b7a-a43e-4e4f-9f84-40b4662cc4e0,1263825424277
2010-01-18 14:46:26,667 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_CLOSE:
profiles2,1a2e1b7a-a43e-4e4f-9f84-40b4662cc4e0,1263825424277
2010-01-18 14:46:27,441 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Closed
profiles2,1a2e1b7a-a43e-4e4f-9f84-40b4662cc4e0,1263825424277
2010-01-18 14:47:38,773 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,0d9deec6-b6df-43a3-ab94-685dade5af61,1263825533141 in
2mins, 18sec
2010-01-18 14:47:38,773 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,0dbd64eb-3e59-4b35-af4a-92a83a1e1858,1263825533141
2010-01-18 14:49:01,881 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,0dbd64eb-3e59-4b35-af4a-92a83a1e1858,1263825533141 in
1mins, 23sec
2010-01-18 14:49:01,883 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,3f726ebf-2ec8-43a0-bd50-d40bec1776d4,1263825595669
2010-01-18 14:49:52,186 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,3f726ebf-2ec8-43a0-bd50-d40bec1776d4,1263825595669 in
50sec
2010-01-18 14:49:52,186 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,3f5303ee-4729-4ab9-bfd6-3c319d429c4f,1263825595669
2010-01-18 14:50:57,328 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,3f5303ee-4729-4ab9-bfd6-3c319d429c4f,1263825595669 in
1mins, 5sec
2010-01-18 14:50:57,328 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,8f50a54d-e8d5-4dec-84a4-05a468fbf8e1,1263825624515
2010-01-18 14:51:24,508 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,8f50a54d-e8d5-4dec-84a4-05a468fbf8e1,1263825624515 in
27sec
2010-01-18 14:51:24,508 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,8f309cdb-eb70-49e0-90d4-d2510e38ae51,1263825624515
2010-01-18 14:52:19,736 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,8f309cdb-eb70-49e0-90d4-d2510e38ae51,1263825624515 in
55sec
2010-01-18 14:52:19,736 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,84bd729a-c64b-4d75-8189-e828dbf06797,1263825639973
2010-01-18 14:53:44,053 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,84bd729a-c64b-4d75-8189-e828dbf06797,1263825639973 in
1mins, 24sec
2010-01-18 14:53:44,053 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,84dcfe35-e488-4eec-99d8-83be178f1b22,1263825639973
2010-01-18 14:55:09,999 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,84dcfe35-e488-4eec-99d8-83be178f1b22,1263825639973 in
1mins, 25sec
2010-01-18 14:55:09,999 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,a6252b0c-b2b1-4bd2-acf4-522065a2a3be,1263825653683
2010-01-18 14:56:22,364 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,a6252b0c-b2b1-4bd2-acf4-522065a2a3be,1263825653683 in
1mins, 12sec
2010-01-18 14:56:22,364 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,a644b61a-f2c0-4855-ad99-1e6ab2d82e61,1263825653683
2010-01-18 14:57:41,518 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,a644b61a-f2c0-4855-ad99-1e6ab2d82e61,1263825653683 in
1mins, 19sec

The second symptom is that I can create a new empty table and start
importing data into it normally, but if I try to import more data into
the same table (now holding ~33 million rows) I get really bad
performance, and the HBase status page does not work at all (it will not
load in the browser).

Currently the ~33 million records use 800GB of disk, and I have 1.1TB of
free HDFS storage.

So my question is: what am I doing wrong? Is the current cluster good
enough to support 50 million records, or is ~33 million the limit for
this configuration? Also, I'm getting about 800 inserts per second; is
this slow? Any hint is appreciated.

Best
Zaharije

RE: Support for MultiGet / SQL In clause

Posted by Sriram Muthuswamy Chittathoor <sr...@ivycomptech.com>.
Thanks, will check it out.

Also, if my key is a compound key and I do a scan on a partial key range,
will it work?  Meaning the key is a concat of {a,b} and my scan specifies
"get me all rows between {a1,a2}".  Is this still as fast as doing
{a1b1, a2b2}?
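
In code, the partial-key scan I have in mind would be something like this
(an untested sketch with made-up names):

    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PartialKeyScan {
      // Keys are concat(a, b); bound the scan by the leading component only.
      // Every key whose 'a' part falls in [a1, a2) lies inside this range.
      public static void scanRange(HTable table) throws Exception {
        Scan scan = new Scan(Bytes.toBytes("a1"), Bytes.toBytes("a2"));
        ResultScanner scanner = table.getScanner(scan);
        try {
          for (Result row : scanner) {
            // process row...
          }
        } finally {
          scanner.close();
        }
      }
    }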

-----Original Message-----
From: Marc Limotte [mailto:mslimotte@gmail.com] 
Sent: Tuesday, January 19, 2010 10:26 PM
To: hbase-user@hadoop.apache.org
Subject: Re: Support for MultiGet / SQL In clause

Sriram,

Would a secondary index help you:
http://hadoop.apache.org/hbase/docs/r0.20.2/api/org/apache/hadoop/hbase/client/tableindexed/package-summary.html#package_description

The index is stored in a separate table, but the index is managed for
you.

I don't think you can do an arbitrary "in" query, though.  If the keys
that you want to include in the "in" are reasonably close neighbors, you
could do a scan and skip ones that are uninteresting.  You could also try
a batch Get by applying a separate patch, see
http://issues.apache.org/jira/browse/HBASE-1845.
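
For the scan-and-skip approach, roughly (an untested sketch, with made-up
table and key names):

    import java.util.Set;
    import java.util.TreeSet;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class InClauseScan {
      // Emulate "WHERE rowkey IN (...)" with one scan over
      // [min(keys), max(keys)] and client-side skipping of other rows.
      public static void scanIn(HTable table, Set<String> wantedKeys)
          throws Exception {
        TreeSet<String> sorted = new TreeSet<String>(wantedKeys);
        Scan scan = new Scan(Bytes.toBytes(sorted.first()),
            Bytes.toBytes(sorted.last() + "\0")); // stop row is exclusive
        ResultScanner scanner = table.getScanner(scan);
        try {
          for (Result row : scanner) {
            if (!wantedKeys.contains(Bytes.toString(row.getRow()))) {
              continue;                           // skip uninteresting rows
            }
            // process the matching row...
          }
        } finally {
          scanner.close();
        }
      }
    }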

Marc Limotte

On Tue, Jan 19, 2010 at 8:45 AM, Sriram Muthuswamy Chittathoor <
sriramc@ivycomptech.com> wrote:

> Is there any support for this?  I want to do this:
>
> 1.  Create a second table to maintain a mapping between a secondary column
> and the rowids of the primary table
>
> 2.  Use this second table to get the rowids to look up from the primary
> table using a SQL IN-like clause
>
> Basically I am doing this to speed up querying by non-row-key columns.
>
> Thanks
>
> Sriram C
>
>

Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

Posted by Dan Washusen <da...@reactive.org>.
Yes, that's roughly what I was thinking...


RE: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

Posted by Sriram Muthuswamy Chittathoor <sr...@ivycomptech.com>.
So in the reporting tables I will have to store data by the keys I want
to look up by, for example:

1.  One reporting table keyed by gameid

2.  Another one keyed by some other column, like tournamentid

So basically I create a reporting table based on how I want to query, and
this reporting table will be queried by its rowKey (which is native), and
the column values will be what I want.

Etc.  Is that right ?
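
As a concrete sketch of what I understand (untested; table, family, and
qualifier names are made up), the map step of such a job might be:

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.hbase.util.Bytes;

    // Project rows of the main table (keyed by user+date) into a reporting
    // table keyed by gameid, so "all details for gameid X" becomes a plain
    // row-key lookup.
    public class GameIdReportMapper
        extends TableMapper<ImmutableBytesWritable, Put> {
      private static final byte[] FAMILY = Bytes.toBytes("details");

      protected void map(ImmutableBytesWritable mainKey, Result columns,
          Context context) throws IOException, InterruptedException {
        byte[] gameId = columns.getValue(FAMILY, Bytes.toBytes("gameid"));
        if (gameId == null) {
          return;                       // row has no gameid cell
        }
        Put put = new Put(gameId);      // reporting row key = gameid
        // keep the original key as a column so the main row can be found
        put.add(FAMILY, Bytes.toBytes("mainKey"), mainKey.get());
        context.write(new ImmutableBytesWritable(gameId), put);
      }
    }

(In practice the reporting row key would probably be gameid plus something
unique, e.g. gameid+user, so multiple rows for the same game don't collide.)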



-----Original Message-----
From: Daniel Washusen [mailto:dan@reactive.org] 
Sent: Sunday, January 24, 2010 2:00 PM
To: hbase-user@hadoop.apache.org
Subject: Re: Support for MultiGet / SQL In clause -- error in patch
HBASE-1845

Sounds like it's some sort of reporting system. Have you considered
duplicating data into reporting tables?

Write all the game details into the main table then map reduce into
your reporting tables...

On 24/01/2010, at 7:07 PM, "Sriram Muthuswamy Chittathoor"
<sriramc@ivycomptech.com> wrote:

> However, I'd only recommend using secondary index as a last resort.
> First I'd try doing everything I can to work with the index I get for
> free. The row key.  It sounds like you have done this already...
> --
>
> The only reason why this is important to me is because of the
> following
>
> 1.  I am storing a minimum of 1 year's worth of data (small rows --  10
> billion)
>
> 2.  Row key is   user + date   (columns  --   gameid ,  opponent etc)
>
> 3.  Queries may be something like give me details for a particular
> "gameid"
>
> 4.  To do step 3 I am assuming I need something like a secondary index,
> or else, given my row key, how else can I do it?
>
>
>
> -----Original Message-----
> From: Daniel Washusen [mailto:dan@reactive.org]
> Sent: Sunday, January 24, 2010 3:16 AM
> To: hbase-user@hadoop.apache.org
> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
> HBASE-1845
>
> Well, it CAN be a RAM hog ;-). It depends what you're indexing. Each
> unique value in the indexed column resides in memory. If you index a
> column that contains 1 million random 1KB values then the index will
> require at least 1GB of memory. Also it *can* slow down writes,
> especially when bulk loading sequential keys.
>
> On the up side, it can make scans dramatically faster.
>
> However, I'd only recommend using secondary index as a last resort.
> First I'd try doing everything I can to work with the index I get for
> free. The row key.  It sounds like you have done this already...
>
> Cheers,
> Dan
>
> On 24/01/2010, at 7:02 AM, Stack <st...@duboce.net> wrote:
>
>> On Sat, Jan 23, 2010 at 2:52 AM, Sriram Muthuswamy Chittathoor
>> <sr...@ivycomptech.com> wrote:
>>> Thanks all.  I messed it up when I was trying to upgrade to
>>> 0.20.3.  I deleted the data directory and formatted it thinking it
>>> will reset the whole cluster.
>>>
>>> I started fresh by deleting the data directory on all the nodes and
>>> then everything worked.  I was also able to create the indexed
>>> table using the 0.20.3 patch.  Let me run some tests on a few
>>> million rows and see how it holds up.
>>>
>>> BTW --  what would be the right way when I moved versions.  Do I
>>> run migrate scripts to migrate the data to newer versions ?
>>>
>> Just install the new binaries everywhere and restart, or perform a rolling
>> restart -- see http://wiki.apache.org/hadoop/Hbase/RollingRestart --
>> if you would avoid taking down your cluster during the upgrade.
>>
>> You'll be flagged on start if you need to run a migration, but the
>> general rule is that there (should) never be need of a migration between
>> patch releases, e.g. between 0.20.2 and 0.20.3.  There may be need of
>> migrations moving between minor numbers, e.g. from 0.19 to 0.20.
>>
>> Let us know how IHBase works out for you (indexed hbase).  It's a RAM
>> hog but the speed improvement finding matching cells can be startling.
>>
>> St.Ack
>>
>>> -----Original Message-----
>>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
>>> Stack
>>> Sent: Saturday, January 23, 2010 5:00 AM
>>> To: hbase-user@hadoop.apache.org
>>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>>> HBASE-1845
>>>
>>> Check your master log.  Something is seriously off if you do not have
>>> a reachable .META. table.
>>> St.Ack
>>>
>>> On Fri, Jan 22, 2010 at 1:09 PM, Sriram Muthuswamy Chittathoor
>>> <sr...@ivycomptech.com> wrote:
>>>> I applied the hbase-0.20.3 version / hadoop 0.20.1.  But after starting
>>>> hbase I keep getting the error below when I go to the hbase shell
>>>>
>>>> [ppoker@karisimbivir1 hbase-0.20.3]$ ./bin/hbase shell
>>>> HBase Shell; enter 'help<RETURN>' for list of supported commands.
>>>> Version: 0.20.3, r900041, Sat Jan 16 17:20:21 PST 2010
>>>> hbase(main):001:0> list
>>>> NativeException:
>>>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
>>>> contact region server null for region , row '', but failed after 7
>>>> attempts.
>>>> Exceptions:
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>>
>>>>
>>>>
>>>> Also when I try to create a table programmatically I get this --
>>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Attempting connection to
>>>> server localhost/127.0.0.1:2181
>>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Priming connection to
>>>> java.nio.channels.SocketChannel[connected local=/127.0.0.1:43775
>>>> remote=localhost/127.0.0.1:2181]
>>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Server connection successful
>>>> Exception in thread "main" org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:684)
>>>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:634)
>>>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:601)
>>>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:675)
>>>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:638)
>>>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:601)
>>>>        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:128)
>>>>        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:106)
>>>>        at test.CreateTable.main(CreateTable.java:36)
>>>>
>>>>
>>>>
>>>> Any clues ?
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Dan Washusen [mailto:dan@reactive.org]
>>>> Sent: Friday, January 22, 2010 4:53 AM
>>>> To: hbase-user@hadoop.apache.org
>>>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>>>> HBASE-1845
>>>>
>>>> If you want to give the "indexed" contrib package a try you'll need to
>>>> do the following:
>>>>
>>>>  1. Include the contrib jars (export HBASE_CLASSPATH=`find
>>>>     /path/to/hbase/hbase-0.20.3/contrib/indexed -name '*jar' | tr -s "\n" ":"`)
>>>>  2. Set the 'hbase.hregion.impl' property to
>>>>     'org.apache.hadoop.hbase.regionserver.IdxRegion' in your hbase-site.xml
>>>>
>>>> Once you've done that you can create a table with an index using:
>>>>
>>>>>    // define which qualifiers need an index (choosing the correct type)
>>>>>    IdxColumnDescriptor columnDescriptor =
>>>>>        new IdxColumnDescriptor("columnFamily");
>>>>>    columnDescriptor.addIndexDescriptor(
>>>>>      new IdxIndexDescriptor("qualifier", IdxQualifierType.BYTE_ARRAY)
>>>>>    );
>>>>>
>>>>>    HTableDescriptor tableDescriptor = new HTableDescriptor("table");
>>>>>    tableDescriptor.addFamily(columnDescriptor);
>>>>>
>>>>
>>>> Then when you want to perform a scan with an index hint:
>>>>
>>>>>    Scan scan = new IdxScan(
>>>>>          new Comparison("columnFamily", "qualifier",
>>>>>              Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>>>      );
>>>>>
>>>>
>>>> You have to keep in mind that the index hint is only a hint.  It
>>>> guarantees that your scan will get all rows that match the hint but
>>>> you'll more than likely receive rows that don't.  For this reason I'd
>>>> suggest that you also include a filter along with the scan:
>>>>
>>>>>      Scan scan = new IdxScan(
>>>>>          new Comparison("columnFamily", "qualifier",
>>>>>              Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>>>      );
>>>>>      scan.setFilter(
>>>>>          new SingleColumnValueFilter(
>>>>>              Bytes.toBytes("columnFamily"), Bytes.toBytes("qualifier"),
>>>>>              CompareFilter.CompareOp.EQUAL,
>>>>>              new BinaryComparator(Bytes.toBytes("foo"))
>>>>>          )
>>>>>      );
>>>>>
>>>>
>>>> Cheers,
>>>> Dan
>>>>
>>>>
>>>> 2010/1/22 stack <st...@duboce.net>
>>>>
>>>>>
>>>>
>
>>>>> http://people.apache.org/~jdcryans/hbase-0.20.3-candidate-2/
>>>>>
>>>>> There is a bit of documentation if you look at the javadoc for the
>>>>> 'indexed' contrib (this is what hbase-2037 is called on commit).
>>>>>
>>>>> St.Ack
>>>>>
>>>>> P.S. We had a thread going named "HBase bulk load".  You got all
>>>>> the
>>>>> answers you need on that one?
>>>>>
>>>>> On Thu, Jan 21, 2010 at 11:19 AM, Sriram Muthuswamy Chittathoor
>>>>> <sr...@ivycomptech.com> wrote:
>>>>>>
>>>>>> Great.  Can I migrate to 0.20.3RC2 easily?  I am on 0.20.2.  Can you
>>>>>> pass me the link?
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf
>>>>>> Of
>>>>>> stack
>>>>>> Sent: Friday, January 22, 2010 12:42 AM
>>>>>> To: hbase-user@hadoop.apache.org
>>>>>> Subject: Re: Support for MultiGet / SQL In clause -- error in
>>>>>> patch
>>>>>> HBASE-1845
>>>>>>
>>>>>> IIRC, hbase-1845 was a sketch only and not yet complete.  It's
>>>>>> probably rotted since, anyway.
>>>>>>
>>>>>> Have you looked at hbase-2037 since committed and available in
>>>>>> 0.20.3RC2.
>>>>>> Would this help you with your original problem?
>>>>>>
>>>>>> St.Ack
>>>>>>
>>>>>> On Thu, Jan 21, 2010 at 9:10 AM, Sriram Muthuswamy Chittathoor <
>>>>>> sriramc@ivycomptech.com> wrote:
>>>>>>
>>>>>>> I tried applying the patch to the hbase source code (hbase 0.20.2) and
>>>>>>> I get the errors below.  Do you know if this needs to be applied to a
>>>>>>> specific hbase version?  Is there a version which works with 0.20.2 or
>>>>>>> later?
>>>>>>> Basically HRegionServer and HTable patching fails.
>>>>>>>
>>>>>>>
>>>>>>> Thanks for the help
>>>>>>>
>>>>>>> patch -p0 -i batch.patch
>>>>>>>
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Get.java
>>>>>>> Hunk #1 succeeded at 61 (offset 2 lines).
>>>>>>> Hunk #2 succeeded at 347 (offset 31 lines).
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/HConnection.java
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/HConnectionManager.java
>>>>>>> Hunk #3 succeeded at 1244 (offset 6 lines).
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/HTable.java
>>>>>>> Hunk #2 succeeded at 73 (offset 8 lines).
>>>>>>> Hunk #4 FAILED at 405.
>>>>>>> Hunk #5 succeeded at 671 with fuzz 2 (offset 26 lines).
>>>>>>> 1 out of 5 hunks FAILED -- saving rejects to file
>>>>>>> src/java/org/apache/hadoop/hbase/client/HTable.java.rej
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Multi.java
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/MultiCallable.java
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/MultiResult.java
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Row.java
>>>>>>> patching file src/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
>>>>>>> Hunk #2 succeeded at 156 with fuzz 1 (offset 3 lines).
>>>>>>> patching file src/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
>>>>>>> Hunk #2 succeeded at 247 (offset 2 lines).
>>>>>>> patching file src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
>>>>>>> Hunk #1 succeeded at 78 (offset -1 lines).
>>>>>>> Hunk #2 FAILED at 2515.
>>>>>>> 1 out of 2 hunks FAILED -- saving rejects to file
>>>>>>> src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java.rej
>>>>>>> patching file src/test/org/apache/hadoop/hbase/client/TestHTable.java
>>>>>>> Hunk #2 FAILED at 333.
>>>>>>> 1 out of 2 hunks FAILED -- saving rejects to file
>>>>>>> src/test/org/apache/hadoop/hbase/client/TestHTable.java.rej
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>

RE: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

Posted by Sriram Muthuswamy Chittathoor <sr...@ivycomptech.com>.
I increased my heap to 2GB and I am able to index slightly more rows, but
I still keep getting errors like the ones below.  How many GB is the
recommended configuration?

java.io.IOException: Call to /10.1.37.155:60020 failed on local exception: java.io.EOFException
java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused

        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1048)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$3.doCall(HConnectionManager.java:1239)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1161)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1247)
        at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:609)
        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:474)




-----Original Message-----
From: Daniel Washusen [mailto:dan@reactive.org] 
Sent: Sunday, January 24, 2010 2:00 PM
To: hbase-user@hadoop.apache.org
Subject: Re: Support for MultiGet / SQL In clause -- error in patch
HBASE-1845

Sounds like it's some sort of reporting system. Have you considered
duplicating data into reporting tables?

Write all the game details into the main table then map reduce into
your reporting tables...

On 24/01/2010, at 7:07 PM, "Sriram Muthuswamy Chittathoor"
<sriramc@ivycomptech.com
 > wrote:

> However, I'd only recommend using secondary index as a last resort.
> First I'd try doing everything I can to work with the index I get for
> free. The row key.  It sounds like you have done this already...
> --
>
> The only reason why this is important to me is because of the
> following
>
> 1.  I am storing at a minimal 1 yrs worth of data (small rows --  10
> billion)
>
> 2.  Row key is   user + date   (columns  --   gameid ,  opponent etc)
>
> 3.  Queries may be something like give me details for a particular
> "gameid"
>
> 4.  To do step 3  I am assuming I need something like a secondary
> index
> or else given my row key  how else can I do it
>
>
>
> -----Original Message-----
> From: Daniel Washusen [mailto:dan@reactive.org]
> Sent: Sunday, January 24, 2010 3:16 AM
> To: hbase-user@hadoop.apache.org
> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
> HBASE-1845
>
> Well, it CAN be a RAM hog ;-). It depends what you're indexing. Each
> unique value in the indexed column resides in memory. If you index a
> column that contains 1 million random 1KB values then the index will
> require at least 1GB of memory. Also it *can* slow down writes,
> especially when bulk loading sequential keys.
>
> On the up side, it can make scans dramatically faster.
>
> However, I'd only recommend using secondary index as a last resort.
> First I'd try doing everything I can to work with the index I get for
> free. The row key.  It sounds like you have done this already...
>
> Cheers,
> Dan
>
> On 24/01/2010, at 7:02 AM, Stack <st...@duboce.net> wrote:
>
>> On Sat, Jan 23, 2010 at 2:52 AM, Sriram Muthuswamy Chittathoor
>> <sr...@ivycomptech.com> wrote:
>>> Thanks all.  I messed it up when I was trying to upgrade to
>>> 0.20.3.  I deleted the data directory and formatted it thinking it
>>> will reset the whole cluster.
>>>
>>> I started fresh by deleting the data directory on all the nodes and
>>> then everything worked.  I was also able to create the indexed
>>> table using the 0.20.3 patch.  Let me run some tests on a few
>>> million rows and see how it holds up.
>>>
>>> BTW --  what would be the right way when I moved versions.  Do I
>>> run migrate scripts to migrate the data to newer versions ?
>>>
>> Just install the new binaries every and restart or perform a rolling
>> restart -- see http://wiki.apache.org/hadoop/Hbase/RollingRestart --
>> if you would avoid taking down your cluster during the upgrade.
>>
>> You'll be flagged on start if you need to run a migration but general
>> rule is that there (should) never be need of a migration between
>> patch
>> releases: e.g. between 0.20.2 to 0.20.3.  There may be need of
>> migrations moving between minor numbers; e.g. from 0.19 to 0.20.
>>
>> Let us know how IHBase works out for you (indexed hbase).  Its a RAM
>> hog but the speed improvement finding matching cells can be
>> startling.
>>
>> St.Ack
>>
>>> -----Original Message-----
>>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
>>> Stack
>>> Sent: Saturday, January 23, 2010 5:00 AM
>>> To: hbase-user@hadoop.apache.org
>>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>>> HBASE-1845
>>>
>>> Check your master log.  Something is seriously off if you do not
>>> have
>>> a reachable .META. table.
>>> St.Ack
>>>
>>> On Fri, Jan 22, 2010 at 1:09 PM, Sriram Muthuswamy Chittathoor
>>> <sr...@ivycomptech.com> wrote:
>>>> I applied the hbase-0.20.3 version / hadoop 0.20.1.  But after
>>>> starting
>>>> hbase I keep getting the error below when I go to the hbase shell
>>>>
>>>> [ppoker@karisimbivir1 hbase-0.20.3]$ ./bin/hbase shell
>>>> HBase Shell; enter 'help<RETURN>' for list of supported commands.
>>>> Version: 0.20.3, r900041, Sat Jan 16 17:20:21 PST 2010
>>>> hbase(main):001:0> list
>>>> NativeException:
>>>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
>>>> contact region server null for region , row '', but failed after 7
>>>> attempts.
>>>> Exceptions:
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>>
>>>>
>>>>
>>>> Also when I try to create a table programatically I get this --
>>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Attempting connection
>>>> to
>>>> server localhost/127.0.0.1:2181
>>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Priming connection to
>>>> java.nio.channels.SocketChannel[connected local=/127.0.0.1:43775
>>>> remote=localhost/127.0.0.1:2181]
>>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Server connection
>>>> successful
>>>> Exception in thread "main"
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>>       at
>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>> $TableServers.locateReg
>>>> ionInMeta(HConnectionManager.java:684)
>>>>       at
>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>> $TableServers.locateReg
>>>> ion(HConnectionManager.java:634)
>>>>       at
>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>> $TableServers.locateReg
>>>> ion(HConnectionManager.java:601)
>>>>       at
>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>> $TableServers.locateReg
>>>> ionInMeta(HConnectionManager.java:675)
>>>>       at
>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>> $TableServers.locateReg
>>>> ion(HConnectionManager.java:638)
>>>>       at
>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>> $TableServers.locateReg
>>>> ion(HConnectionManager.java:601)
>>>>       at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:
>>>> 128)
>>>>       at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:
>>>> 106)
>>>>       at test.CreateTable.main(CreateTable.java:36)
>>>>
>>>>
>>>>
>>>> Any clues ?
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Dan Washusen [mailto:dan@reactive.org]
>>>> Sent: Friday, January 22, 2010 4:53 AM
>>>> To: hbase-user@hadoop.apache.org
>>>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>>>> HBASE-1845
>>>>
>>>> If you want to give the "indexed" contrib package a try you'll
>>>> need to
>>>> do
>>>> the following:
>>>>
>>>>  1. Include the contrib jars (export HBASE_CLASSPATH=(`find
>>>>  /path/to/hbase/hbase-0.20.3/contrib/indexed -name '*jar' | tr -s
>>>> "\n"
>>>> ":"`)
>>>>  2. Set the 'hbase.hregion.impl' property to
>>>>  'org.apache.hadoop.hbase.regionserver.IdxRegion' in your
>>>> hbase-site.xml
>>>>
>>>> Once you've done that you can create a table with an index using:
>>>>
>>>>>    // define which qualifiers need an index (choosing the correct
>>>> type)
>>>>>    IdxColumnDescriptor columnDescriptor = new
>>>>> IdxColumnDescriptor("columnFamily");
>>>>>    columnDescriptor.addIndexDescriptor(
>>>>>      new IdxIndexDescriptor("qualifier",
>>>>> IdxQualifierType.BYTE_ARRAY)
>>>>>    );
>>>>>
>>>>>    HTableDescriptor tableDescriptor = new HTableDescriptor
>>>>> ("table");
>>>>>    tableDescriptor.addFamily(columnDescriptor);
>>>>>
>>>>
>>>> Then when you want to perform a scan with an index hint:
>>>>
>>>>>    Scan scan = new IdxScan(
>>>>>          new Comparison("columnFamily", "qualifier",
>>>>>              Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>>>      );
>>>>>
>>>>
>>>> You have to keep in mind that the index hint is only a hint.  It
>>>> guarantees
>>>> that your scan will get all rows that match the hint but you'll
>>>> more
>>>> than
>>>> likely receive rows that don't.  For this reason I'd suggest that
>>>> you
>>>> also
>>>> include a filter along with the scan:
>>>>
>>>>>      Scan scan = new IdxScan(
>>>>>          new Comparison("columnFamily", "qualifier",
>>>>>              Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>>>      );
>>>>>      scan.setFilter(
>>>>>          new SingleColumnValueFilter(
>>>>>              "columnFamily", "qualifer",
>>>> CompareFilter.CompareOp.EQUAL,
>>>>>              new BinaryComparator("foo")
>>>>>          )
>>>>>      );
>>>>>
>>>>
>>>> Cheers,
>>>> Dan
>>>>
>>>>
>>>> 2010/1/22 stack <st...@duboce.net>
>>>>
>>>>>
>>>>
>
http://people.apache.org/~jdcryans/hbase-0.20.3-candidate-2/<http://peop
>>>> le.apache.org/%7Ejdcryans/hbase-0.20.3-candidate-2/>
>>>>>
>>>>> There is a bit of documentation if you look at javadoc for the
>>>>> 'indexed' contrib (This is what hbase-2073 is called on commit).
>>>>>
>>>>> St.Ack
>>>>>
>>>>> P.S. We had a thread going named "HBase bulk load".  You got all
>>>>> the
>>>>> answers you need on that one?
>>>>>
>>>>> On Thu, Jan 21, 2010 at 11:19 AM, Sriram Muthuswamy Chittathoor
>>>>> <sr...@ivycomptech.com> wrote:
>>>>>>
>>>>>> Great.  Can I migrate to 0.20.3RC2 easily.  I am on 0.20.2. Can u
>>>> pass
>>>>>> me the link
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf
>>>>>> Of
>>>>>> stack
>>>>>> Sent: Friday, January 22, 2010 12:42 AM
>>>>>> To: hbase-user@hadoop.apache.org
>>>>>> Subject: Re: Support for MultiGet / SQL In clause -- error in
>>>>>> patch
>>>>>> HBASE-1845
>>>>>>
>>>>>> IIRC, hbase-1845 was a sketch only and not yet complete.  Its
>>>> probably
>>>>>> rotted since any ways.
>>>>>>
>>>>>> Have you looked at hbase-2037 since committed and available in
>>>>>> 0.20.3RC2.
>>>>>> Would this help you with your original problem?
>>>>>>
>>>>>> St.Ack
>>>>>>
>>>>>> On Thu, Jan 21, 2010 at 9:10 AM, Sriram Muthuswamy Chittathoor <
>>>>>> sriramc@ivycomptech.com> wrote:
>>>>>>
>>>>>>> I tried applying the patch to the hbase source code  hbase
>>>>>>> 0.20.2
>>>> and
>>>>>> I
>>>>>>> get the errors below.  Do you know if this needs to be applied
>>>>>>> to
>>>> a
>>>>>>> specific hbase version. Is there a version which works with
>>>>>>> 0.20.2
>>>> or
>>>>>>> later ??
>>>>>>> Basically HRegionServer  and HTable patching fails.
>>>>>>>
>>>>>>>
>>>>>>> Thanks for the help
>>>>>>>
>>>>>>> patch -p0 -i batch.patch
>>>>>>>
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Get.java
>>>>>>> Hunk #1 succeeded at 61 (offset 2 lines).
>>>>>>> Hunk #2 succeeded at 347 (offset 31 lines).
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/HConnection.java
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/HConnectionManager.java
>>>>>>> Hunk #3 succeeded at 1244 (offset 6 lines).
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/HTable.java
>>>>>>> Hunk #2 succeeded at 73 (offset 8 lines).
>>>>>>> Hunk #4 FAILED at 405.
>>>>>>> Hunk #5 succeeded at 671 with fuzz 2 (offset 26 lines).
>>>>>>> 1 out of 5 hunks FAILED -- saving rejects to file
>>>>>>> src/java/org/apache/hadoop/hbase/client/HTable.java.rej
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Multi.java
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/MultiCallable.java
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/MultiResult.java
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Row.java
>>>>>>> patching file src/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
>>>>>>> Hunk #2 succeeded at 156 with fuzz 1 (offset 3 lines).
>>>>>>> patching file src/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
>>>>>>> Hunk #2 succeeded at 247 (offset 2 lines).
>>>>>>> patching file src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
>>>>>>> Hunk #1 succeeded at 78 (offset -1 lines).
>>>>>>> Hunk #2 FAILED at 2515.
>>>>>>> 1 out of 2 hunks FAILED -- saving rejects to file
>>>>>>> src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java.rej
>>>>>>> patching file src/test/org/apache/hadoop/hbase/client/TestHTable.java
>>>>>>> Hunk #2 FAILED at 333.
>>>>>>> 1 out of 2 hunks FAILED -- saving rejects to file
>>>>>>> src/test/org/apache/hadoop/hbase/client/TestHTable.java.rej
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Marc Limotte [mailto:mslimotte@gmail.com]
>>>>>>> Sent: Tuesday, January 19, 2010 10:26 PM
>>>>>>> To: hbase-user@hadoop.apache.org
>>>>>>> Subject: Re: Support for MultiGet / SQL In clause
>>>>>>>
>>>>>>> Sriram,
>>>>>>>
>>>>>>> Would a secondary index help you:
>>>>>>> http://hadoop.apache.org/hbase/docs/r0.20.2/api/org/apache/hadoop/hbase/client/tableindexed/package-summary.html#package_description
>>>>>>>
>>>>>>> The index is stored in a separate table, but the index is managed
>>>>>>> for you.
>>>>>>>
>>>>>>> I don't think you can do an arbitrary "in" query, though.  If the
>>>>>>> keys that you want to include in the "in" are reasonably close
>>>>>>> neighbors, you could do a scan and skip ones that are uninteresting.
>>>>>>> You could also try a batch Get by applying a separate patch, see
>>>>>>> http://issues.apache.org/jira/browse/HBASE-1845.
>>>>>>>
>>>>>>> Marc Limotte
>>>>>>>
>>>>>>> On Tue, Jan 19, 2010 at 8:45 AM, Sriram Muthuswamy Chittathoor <
>>>>>>> sriramc@ivycomptech.com> wrote:
>>>>>>>
>>>>>>>> Is there any support for this?  I want to do this:
>>>>>>>>
>>>>>>>> 1.  Create a second table to maintain a mapping between a secondary
>>>>>>>> column and the rowids of the primary table
>>>>>>>>
>>>>>>>> 2.  Use this second table to get the rowids to look up from the
>>>>>>>> primary table using a SQL IN-like clause
>>>>>>>>
>>>>>>>> Basically I am doing this to speed up querying by non-row-key
>>>>>>>> columns.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> Sriram C
>>>>>>>>
>>>>>>>>

Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

Posted by Daniel Washusen <da...@reactive.org>.
Sounds like it's some sort of reporting system. Have you considered
duplicating data into reporting tables?

Write all the game details into the main table then map reduce into
your reporting tables...
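
Something like this, as a rough sketch (untested; it assumes the 0.20
org.apache.hadoop.hbase.mapreduce API, and the table and column names --
"games", "games_by_gameid", "details"/"gameid" -- are made up for
illustration):

    import java.io.IOException;

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.IdentityTableReducer;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.mapreduce.Job;

    public class BuildReportingTable {
      // Re-keys every row of the main table by gameid so that a lookup
      // by gameid becomes a plain get/scan on the reporting table.
      static class RekeyByGameIdMapper
          extends TableMapper<ImmutableBytesWritable, Put> {
        protected void map(ImmutableBytesWritable userDateKey, Result row,
            Context context) throws IOException, InterruptedException {
          byte[] gameId =
              row.getValue(Bytes.toBytes("details"), Bytes.toBytes("gameid"));
          if (gameId == null) {
            return; // no gameid recorded for this row
          }
          // gameid + original (user+date) key keeps reporting rows unique.
          Put put = new Put(Bytes.add(gameId, userDateKey.get()));
          put.add(Bytes.toBytes("details"), Bytes.toBytes("primary"),
              userDateKey.get());
          context.write(new ImmutableBytesWritable(put.getRow()), put);
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = new Job(new HBaseConfiguration(), "build games_by_gameid");
        job.setJarByClass(BuildReportingTable.class);
        TableMapReduceUtil.initTableMapperJob("games", new Scan(),
            RekeyByGameIdMapper.class, ImmutableBytesWritable.class,
            Put.class, job);
        TableMapReduceUtil.initTableReducerJob("games_by_gameid",
            IdentityTableReducer.class, job);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }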

On 24/01/2010, at 7:07 PM, "Sriram Muthuswamy Chittathoor"
<sriramc@ivycomptech.com> wrote:

> However, I'd only recommend using a secondary index as a last resort.
> First I'd try doing everything I can to work with the index I get for
> free: the row key.  It sounds like you have done this already...
> --
>
> The only reason why this is important to me is because of the following:
>
> 1.  I am storing at a minimum 1 year's worth of data (small rows -- 10
> billion)
>
> 2.  Row key is user + date (columns -- gameid, opponent etc.)
>
> 3.  Queries may be something like: give me details for a particular
> "gameid"
>
> 4.  To do step 3 I am assuming I need something like a secondary
> index, or else, given my row key, how else can I do it?

RE: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

Posted by Sriram Muthuswamy Chittathoor <sr...@ivycomptech.com>.
However, I'd only recommend using a secondary index as a last resort.
First I'd try doing everything I can to work with the index I get for
free: the row key.  It sounds like you have done this already...
--  

The only reason why this is important to me is because of the following:

1.  I am storing at a minimum 1 year's worth of data (small rows -- 10
billion)

2.  Row key is user + date (columns -- gameid, opponent etc.)

3.  Queries may be something like: give me details for a particular
"gameid"

4.  To do step 3 I am assuming I need something like a secondary index,
or else, given my row key, how else can I do it?
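
A sketch of the two-table scheme from the first mail in this thread
(untested; "games", "gameid_index" and the "refs" family are illustrative
names, and it assumes the index table's row key is the gameid with one
qualifier per matching primary-table row key):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.NavigableMap;

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class GameIdLookup {
      public static List<Result> lookup(String gameId) throws Exception {
        HBaseConfiguration conf = new HBaseConfiguration();
        HTable index = new HTable(conf, "gameid_index");
        HTable primary = new HTable(conf, "games");

        // 1. One get on the index table; each qualifier in the "refs"
        // family is the row key of a matching row in the primary table.
        Result indexRow = index.get(new Get(Bytes.toBytes(gameId)));
        List<Result> matches = new ArrayList<Result>();
        NavigableMap<byte[], byte[]> refs =
            indexRow.getFamilyMap(Bytes.toBytes("refs"));
        if (refs == null) {
          return matches;
        }

        // 2. The "SQL IN" part: one get per referenced primary row.
        // hbase-1845 sketched a batched multi-get for exactly this, but
        // it never went in, so this loops.
        for (byte[] primaryKey : refs.keySet()) {
          matches.add(primary.get(new Get(primaryKey)));
        }
        return matches;
      }
    }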




Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

Posted by Daniel Washusen <da...@reactive.org>.
Well, it CAN be a RAM hog ;-). It depends on what you're indexing. Each
unique value in the indexed column resides in memory. If you index a
column that contains 1 million random 1KB values then the index will
require at least 1GB of memory. Also it *can* slow down writes,
especially when bulk loading sequential keys.
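
The arithmetic behind that, as a sketch (the per-entry overhead constant
is a guess for illustration, not a measured IHBase figure):

    public class IdxHeapEstimate {
      public static void main(String[] args) {
        long uniqueValues = 1000L * 1000L; // 1 million distinct indexed values
        long bytesPerValue = 1024L;        // random 1KB values, as above
        long perEntryOverhead = 64L;       // guessed object/row-reference cost
        long heapBytes = uniqueValues * (bytesPerValue + perEntryOverhead);
        // Prints 1037 MB -- i.e. "at least 1GB" before anything else runs.
        System.out.println((heapBytes >> 20) + " MB of index heap, minimum");
      }
    }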

On the up side, it can make scans dramatically faster.

However, I'd only recommend using a secondary index as a last resort.
First I'd try doing everything I can to work with the index I get for
free: the row key.  It sounds like you have done this already...

Cheers,
Dan

On 24/01/2010, at 7:02 AM, Stack <st...@duboce.net> wrote:

> On Sat, Jan 23, 2010 at 2:52 AM, Sriram Muthuswamy Chittathoor
> <sr...@ivycomptech.com> wrote:
>> Thanks all.  I messed it up when I was trying to upgrade to
>> 0.20.3.  I deleted the data directory and formatted it thinking it
>> would reset the whole cluster.
>>
>> I started fresh by deleting the data directory on all the nodes and
>> then everything worked.  I was also able to create the indexed
>> table using the 0.20.3 patch.  Let me run some tests on a few
>> million rows and see how it holds up.
>>
>> BTW --  what would be the right way when I move versions?  Do I
>> run migrate scripts to migrate the data to newer versions?
>>
> Just install the new binaries everywhere and restart, or perform a rolling
> restart -- see http://wiki.apache.org/hadoop/Hbase/RollingRestart --
> if you would avoid taking down your cluster during the upgrade.
>
> You'll be flagged on start if you need to run a migration, but the general
> rule is that there (should) never be need of a migration between patch
> releases: e.g. from 0.20.2 to 0.20.3.  There may be need of
> migrations moving between minor numbers; e.g. from 0.19 to 0.20.
>
> Let us know how IHBase works out for you (indexed hbase).  It's a RAM
> hog but the speed improvement finding matching cells can be startling.
>
> St.Ack
>

Re: public numbers for IHBase? (was Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845)

Posted by Dan Washusen <da...@reactive.org>.
Hi Sriram,
I can't really provide a recommended heap size at the moment.  For my tests
I'm using 5 nodes, each with 48GB of memory (the region servers get 8GB).  My
table contains about 30 million rows with two columns that require an
index.  One column is a short integer and the other a byte array.  Both my
indexed columns contain a lot of repetition (the byte[] can contain one of
thirty possible values, the short is a range between 1 and 20) which means
my index memory footprint isn't very big.  The region server VMs seem to
settle at around 6GB used (although I have increased the
hfile.block.cache.size property to 0.4).

Just doing some quick sums, I don't think you are going to be able to use
IHBase in its current state with your current hardware.  Assuming your user
+ date is something like "igorthebrave" + currentTimeMillis, then you are
looking at roughly 250GB (25 bytes * 10 billion) of memory for the keys alone.
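
Spelling those sums out (a sketch; the 25 bytes is a 12-character user
name like "igorthebrave" plus an 8-byte epoch-millis date, rounded up for
key overhead):

    public class RowKeyMemoryEstimate {
      public static void main(String[] args) {
        long rows = 10L * 1000 * 1000 * 1000; // "10 billion" small rows
        long bytesPerKey = "igorthebrave".length() + 8 + 5; // ~25 bytes
        long totalBytes = rows * bytesPerKey; // 250,000,000,000 bytes
        // Prints 232 GiB (~250GB) -- far past a 32-bit JVM's ~2.7GB ceiling.
        System.out.println((totalBytes >> 30) + " GiB for the row keys alone");
      }
    }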

I'm sure as the IHBase contrib matures it will become better at this kind of
use case (for example, a disk-backed index) but at the moment you'll have to
either add considerably more resources to your region servers or try to
work with the row key alone...

Cheers,
Dan


2010/1/26 Sriram Muthuswamy Chittathoor <sr...@ivycomptech.com>

> What I am finding is that it really hogs memeory when I was trying to
> insert rows and the region server kept crashing on me.  For example I was
> able to successfully create 2 million rows with around 2 GB but after that
> it keep needing more memory.
>
> Is this the experience with everyone or I am doing something wrong.
>  Basically at most on my Linux box I can go upto 2.7 GB on a 32 b it JVM.
>  For 1 billion rows and using IHBase -- what kind of memeory do I need
>
> Thanks
>
>
>
> -----Original Message-----
> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
> Sent: Tuesday, January 26, 2010 12:00 AM
> To: hbase-user@hadoop.apache.org
> Subject: Re: public numbers for IHBase? (was Re: Support for MultiGet / SQL
> In clause -- error in patch HBASE-1845)
>
> No real numbers at the moment.  HBASE-2167 adds a
> PerformanceEvaluation for IHBase (Indexed HBase).  PE is sort of not
> the right use-case for IHBase with its largish, random values -- the
> latter requires RAM and writes are slowed.  Nonetheless, search for
> random values with the IHBase index can be up to two orders of
> magnitude better in this hostile test: e.g.  20 scans for 20 random
> values on a single node cluster with 1.5GB of memory allocated to the
> RS VM.
>
> Without an index: 732989ms at offset 0 for 1048576 rows
> With an index: 2160ms at offset 0 for 1048576 rows
>
> St.Ack
>
> On Sun, Jan 24, 2010 at 1:17 AM, Andrew Purtell <ap...@apache.org>
> wrote:
> > Stack, any way you might persuade the IHBase guys to post some numbers
> publicly?
> > I'd like to know more.
> >
> >   - Andy
> >
> >
> >
> > ----- Original Message ----
> >> From: Stack <st...@duboce.net>
> >> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
> HBASE-1845
> > [...]
> >> Let us know how IHBase works out for you (indexed hbase).  Its a RAM
> >> hog but the speed improvement finding matching cells can be startling.
> >
> >
> >
> >
> >

RE: public numbers for IHBase? (was Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845)

Posted by Sriram Muthuswamy Chittathoor <sr...@ivycomptech.com>.
What I am finding is that it really hogs memory when I was trying to insert rows, and the region server kept crashing on me.  For example, I was able to successfully create 2 million rows with around 2 GB, but after that it kept needing more memory.

Is this the experience for everyone, or am I doing something wrong?  Basically, at most on my Linux box I can go up to 2.7 GB on a 32-bit JVM.  For 1 billion rows and using IHBase -- what kind of memory do I need?

Thanks



-----Original Message-----
From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
Sent: Tuesday, January 26, 2010 12:00 AM
To: hbase-user@hadoop.apache.org
Subject: Re: public numbers for IHBase? (was Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845)

No real numbers at the moment.  HBASE-2167 adds a
PerformanceEvaluation for IHBase (Indexed HBase).  PE is sort of not
the right use-case for IHBase with its largish, random values -- the
latter requires RAM and writes are slowed.  Nonetheless, search for
random values with the IHBase index can be up to two orders of
magnitude better in this hostile test: e.g.  20 scans for 20 random
values on a single node cluster with 1.5GB of memory allocated to the
RS VM.

Without an index: 732989ms at offset 0 for 1048576 rows
With an index: 2160ms at offset 0 for 1048576 rows

St.Ack

On Sun, Jan 24, 2010 at 1:17 AM, Andrew Purtell <ap...@apache.org> wrote:
> Stack, any way you might persuade the IHBase guys to post some numbers publicly?
> I'd like to know more.
>
>   - Andy
>
>
>
> ----- Original Message ----
>> From: Stack <st...@duboce.net>
>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845
> [...]
>> Let us know how IHBase works out for you (indexed hbase).  Its a RAM
>> hog but the speed improvement finding matching cells can be startling.
>
>
>
>
>



Re: public numbers for IHBase? (was Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845)

Posted by Stack <st...@duboce.net>.
No real numbers at the moment.  HBASE-2167 adds a
PerformanceEvaluation for IHBase (Indexed HBase).  PE is sort of not
the right use case for IHBase, with its largish, random values -- the
latter require RAM, and writes are slowed.  Nonetheless, searching for
random values with the IHBase index can be up to two orders of
magnitude better in this hostile test: e.g.  20 scans for 20 random
values on a single-node cluster with 1.5GB of memory allocated to the
RS VM.

Without an index: 732989ms at offset 0 for 1048576 rows
With an index: 2160ms at offset 0 for 1048576 rows
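
(That is roughly a 340x speedup.  For reference, a minimal sketch of the
kind of client-side timing loop behind the no-index number -- the table and
column names here are hypothetical assumptions, and this is not the actual
PerformanceEvaluation code:)

    // Hypothetical timing of a full scan with a value filter,
    // in the style of the no-index measurement above.
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.CompareFilter;
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ScanTimer {
      public static void main(String[] args) throws Exception {
        HTable table = new HTable(new HBaseConfiguration(), "TestTable");
        Scan scan = new Scan();
        scan.setFilter(new SingleColumnValueFilter(
            Bytes.toBytes("info"), Bytes.toBytes("data"),
            CompareFilter.CompareOp.EQUAL, Bytes.toBytes("someRandomValue")));
        long start = System.currentTimeMillis();
        ResultScanner scanner = table.getScanner(scan);
        int rows = 0;
        for (Result r : scanner) rows++;  // drain the scanner, counting rows
        scanner.close();
        System.out.println(rows + " rows in "
            + (System.currentTimeMillis() - start) + "ms");
      }
    }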

St.Ack

On Sun, Jan 24, 2010 at 1:17 AM, Andrew Purtell <ap...@apache.org> wrote:
> Stack, any way you might persuade the IHBase guys to post some numbers publicly?
> I'd like to know more.
>
>   - Andy
>
>
>
> ----- Original Message ----
>> From: Stack <st...@duboce.net>
>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845
> [...]
>> Let us know how IHBase works out for you (indexed hbase).  Its a RAM
>> hog but the speed improvement finding matching cells can be startling.
>
>
>
>
>

public numbers for IHBase? (was Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845)

Posted by Andrew Purtell <ap...@apache.org>.
Stack, any way you might persuade the IHBase guys to post some numbers publicly? 
I'd like to know more. 

   - Andy



----- Original Message ----
> From: Stack <st...@duboce.net>
> Subject: Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845
[...]
> Let us know how IHBase works out for you (indexed hbase).  Its a RAM
> hog but the speed improvement finding matching cells can be startling.


      


Re: Support for MultiGet / SQL In clause

Posted by Stack <st...@duboce.net>.
Are they recent?  Seems like you have a mismatch in versions between
what wrote the data and what is reading it.  Are you sure you installed
the same hadoop everywhere?  Do the following exceptions come up in
hbase?  Is hbase running ok?

St.Ack

On Sat, Jan 23, 2010 at 11:35 PM, Sriram Muthuswamy Chittathoor
<sr...@ivycomptech.com> wrote:
> What do errors like this in my datanode logs mean (they come up on all the
> boxes)?  Is something corrupted -- should I clean everything and restart?
>
> 2010-01-24 02:27:46,349 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /home/ppoker/test/hadoop-0.20.1/data/data: namenode namespaceID = 1033461714; datanode namespaceID = 1483221888
>        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:233)
>        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:148)
>        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298)
>        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:216)
>        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283)
>        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238)
>        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1246)
>        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1368)
>
> 2010-01-24 02:27:46,349 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>
>
>
>
>
> -----Original Message-----
> From: Daniel Washusen [mailto:dan@reactive.org]
> Sent: Sunday, January 24, 2010 12:41 PM
> To: hbase-user@hadoop.apache.org
> Subject: Re: Support for MultiGet / SQL In clause
>
> Also, I would suggest that you allocate one of the machines to be
> hmaster & namenode. The other three machines run rserver, datanode &
> zookeeper.
>
> On 24/01/2010, at 5:36 PM, "Sriram Muthuswamy Chittathoor"
> <sriramc@ivycomptech.com
>  > wrote:
>
>> You were right. This is what I have in the regionserver.out file
>>
>> java.lang.OutOfMemoryError: Java heap space
>> Dumping heap to java_pid28602.hprof ...
>> Heap dump file created [1081182427 bytes in 53.046 secs]
>>
>> Do I increase the jvm heapsize for all the machines.  This is my
>> config
>>
>> 1.  4 boxes running the cluster
>> 2.  HBase Regionserver and HDFS data node on all the boxes
>> 3.  One of the boxes where this error occurred also has the NameNode
>> and
>> the HBasemaster running
>>
>> How do I selectively increase the heapsize just for the regionserver?  I
>> changed it in bin/hbase.
>>
>> -----Original Message-----
>> From: Daniel Washusen [mailto:dan@reactive.org]
>> Sent: Sunday, January 24, 2010 3:44 AM
>> To: hbase-user@hadoop.apache.org
>> Subject: Re: Support for MultiGet / SQL In clause
>>
>> What does the regionserver.out log file say?  Maybe you are out of
>> memory...  The master web ui will report how much heap space the
>> region server is using...
>>
>> On 24/01/2010, at 8:43 AM, "Sriram Muthuswamy Chittathoor"
>> <sriramc@ivycomptech.com
>>> wrote:
>>
>>> I created a new table with indexes.  Initially created 100000 rows
>>> and then did a scan.  At that time it was okay.  Then I started
>>> creating a million rows in a loop and then after some time I get
>>> this exception and the table disappeared (even from the hbase
>>> shell).  One other table also disappeared.
>>>
>>> This is very consistent.  I tried a few times and every time it is
>>> the same on creating a lot of rows.  Rows are not too big (Just some
>>> 6 columns in one family) each of say type long or string.  Created 3
>>> indexes -- 2 byte array and 1 long.
>>>
>>>
>>>
>>> Cur Rows : 349999
>>> Cur : 359999
>>> Cur : 369999
>>> Cur : 379999
>>> Cur Rows : 389999   <--  Crashed after this
>>> Exception in thread "main"
>>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
>>> contact region server 10.1.162.25:60020 for region
>> POKERHH6,,1264279773695
>>> , row '0000392413', but failed after 10 attempts.
>>> Exceptions:
>>> java.io.IOException: Call to /10.1.162.25:60020 failed on local
>>> exception: java.io.EOFException
>>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>>>
>>>       at org.apache.hadoop.hbase.client.HConnectionManager
>>> $TableServers.getRegionServerWithRetries(HConnectionManager.java:
>>> 1048)
>>>       at org.apache.hadoop.hbase.client.HConnectionManager
>>> $TableServers$3.doCall(HConnectionManager.java:1239)
>>>       at org.apache.hadoop.hbase.client.HConnectionManager
>>> $TableServers$Batch.process(HConnectionManager.java:1161)
>>>       at org.apache.hadoop.hbase.client.HConnectionManager
>>> $TableServers.processBatchOfRows(HConnectionManager.java:1247)
>>>       at org.apache.hadoop.hbase.client.HTable.flushCommits
>>> (HTable.java:609)
>>>       at org.apache.hadoop.hbase.client.HTable.put(HTable.java:474)
>>>       at test.TestExtIndexedTable.main(TestExtIndexedTable.java:110)
>>> 10/01/23 16:21:28 INFO zookeeper.ZooKeeper: Closing session:
>>> 0x265caacc7a001b
>>> 10/01/23 16:21:28 INFO zookeeper.ClientCnxn: Closing ClientCnxn for
>>> session: 0x265caacc7a001b
>>> 10/01/23 16:21:28 INFO zookeeper.ClientCnxn: Exception while closing
>>> send thread for session 0x265caacc7a001b : Read error rc = -1
>>> java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
>>> Stack
>>> Sent: Sunday, January 24, 2010 1:33 AM
>>> To: hbase-user@hadoop.apache.org
>>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>>> HBASE-1845
>>>
>>> On Sat, Jan 23, 2010 at 2:52 AM, Sriram Muthuswamy Chittathoor
>>> <sr...@ivycomptech.com> wrote:
>>>> Thanks all.  I messed it up when I was trying to upgrade to
>>>> 0.20.3.  I deleted the data directory and formatted it thinking it
>>>> will reset the whole cluster.
>>>>
>>>> I started fresh by deleting the data directory on all the nodes and
>>>> then everything worked.  I was also able to create the indexed
>>>> table using the 0.20.3 patch.  Let me run some tests on a few
>>>> million rows and see how it holds up.
>>>>
>>>> BTW --  what would be the right way when I moved versions.  Do I
>>>> run migrate scripts to migrate the data to newer versions ?
>>>>
>>> Just install the new binaries everywhere and restart, or perform a rolling
>>> restart -- see http://wiki.apache.org/hadoop/Hbase/RollingRestart --
>>> if you would avoid taking down your cluster during the upgrade.
>>>
>>> You'll be flagged on start if you need to run a migration but general
>>> rule is that there (should) never be need of a migration between
>>> patch
>>> releases: e.g. between 0.20.2 to 0.20.3.  There may be need of
>>> migrations moving between minor numbers; e.g. from 0.19 to 0.20.
>>>
>>> Let us know how IHBase works out for you (indexed hbase).  Its a RAM
>>> hog but the speed improvement finding matching cells can be
>>> startling.
>>>
>>> St.Ack
>>>
>>>> -----Original Message-----
>>>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
>>>> Stack
>>>> Sent: Saturday, January 23, 2010 5:00 AM
>>>> To: hbase-user@hadoop.apache.org
>>>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>>>> HBASE-1845
>>>>
>>>> Check your master log.  Something is seriously off if you do not
>>>> have
>>>> a reachable .META. table.
>>>> St.Ack
>>>>
>>>> On Fri, Jan 22, 2010 at 1:09 PM, Sriram Muthuswamy Chittathoor
>>>> <sr...@ivycomptech.com> wrote:
>>>>> I applied the hbase-0.20.3 version / hadoop 0.20.1.  But after
>>>>> starting
>>>>> hbase I keep getting the error below when I go to the hbase shell
>>>>>
>>>>> [ppoker@karisimbivir1 hbase-0.20.3]$ ./bin/hbase shell
>>>>> HBase Shell; enter 'help<RETURN>' for list of supported commands.
>>>>> Version: 0.20.3, r900041, Sat Jan 16 17:20:21 PST 2010
>>>>> hbase(main):001:0> list
>>>>> NativeException:
>>>>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
>>>>> contact region server null for region , row '', but failed after 7
>>>>> attempts.
>>>>> Exceptions:
>>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>>>
>>>>>
>>>>>
>>>>> Also when I try to create a table programatically I get this --
>>>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Attempting connection
>>>>> to
>>>>> server localhost/127.0.0.1:2181
>>>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Priming connection to
>>>>> java.nio.channels.SocketChannel[connected local=/127.0.0.1:43775
>>>>> remote=localhost/127.0.0.1:2181]
>>>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Server connection
>>>>> successful
>>>>> Exception in thread "main"
>>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>>>       at
>>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>>> $TableServers.locateReg
>>>>> ionInMeta(HConnectionManager.java:684)
>>>>>       at
>>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>>> $TableServers.locateReg
>>>>> ion(HConnectionManager.java:634)
>>>>>       at
>>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>>> $TableServers.locateReg
>>>>> ion(HConnectionManager.java:601)
>>>>>       at
>>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>>> $TableServers.locateReg
>>>>> ionInMeta(HConnectionManager.java:675)
>>>>>       at
>>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>>> $TableServers.locateReg
>>>>> ion(HConnectionManager.java:638)
>>>>>       at
>>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>>> $TableServers.locateReg
>>>>> ion(HConnectionManager.java:601)
>>>>>       at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:
>>>>> 128)
>>>>>       at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:
>>>>> 106)
>>>>>       at test.CreateTable.main(CreateTable.java:36)
>>>>>
>>>>>
>>>>>
>>>>> Any clues ?
>>>>>
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Dan Washusen [mailto:dan@reactive.org]
>>>>> Sent: Friday, January 22, 2010 4:53 AM
>>>>> To: hbase-user@hadoop.apache.org
>>>>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>>>>> HBASE-1845
>>>>>
>>>>> If you want to give the "indexed" contrib package a try you'll
>>>>> need to
>>>>> do
>>>>> the following:
>>>>>
>>>>>  1. Include the contrib jars (export HBASE_CLASSPATH=(`find
>>>>>  /path/to/hbase/hbase-0.20.3/contrib/indexed -name '*jar' | tr -s
>>>>> "\n"
>>>>> ":"`)
>>>>>  2. Set the 'hbase.hregion.impl' property to
>>>>>  'org.apache.hadoop.hbase.regionserver.IdxRegion' in your
>>>>> hbase-site.xml
>>>>>
>>>>> Once you've done that you can create a table with an index using:
>>>>>
>>>>>>    // define which qualifiers need an index (choosing the correct
>>>>> type)
>>>>>>    IdxColumnDescriptor columnDescriptor = new
>>>>>> IdxColumnDescriptor("columnFamily");
>>>>>>    columnDescriptor.addIndexDescriptor(
>>>>>>      new IdxIndexDescriptor("qualifier",
>>>>>> IdxQualifierType.BYTE_ARRAY)
>>>>>>    );
>>>>>>
>>>>>>    HTableDescriptor tableDescriptor = new HTableDescriptor
>>>>>> ("table");
>>>>>>    tableDescriptor.addFamily(columnDescriptor);
>>>>>>
>>>>>
>>>>> Then when you want to perform a scan with an index hint:
>>>>>
>>>>>>    Scan scan = new IdxScan(
>>>>>>          new Comparison("columnFamily", "qualifier",
>>>>>>              Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>>>>      );
>>>>>>
>>>>>
>>>>> You have to keep in mind that the index hint is only a hint.  It
>>>>> guarantees
>>>>> that your scan will get all rows that match the hint but you'll
>>>>> more
>>>>> than
>>>>> likely receive rows that don't.  For this reason I'd suggest that
>>>>> you
>>>>> also
>>>>> include a filter along with the scan:
>>>>>
>>>>>>      Scan scan = new IdxScan(
>>>>>>          new Comparison("columnFamily", "qualifier",
>>>>>>              Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>>>>      );
>>>>>>      scan.setFilter(
>>>>>>          new SingleColumnValueFilter(
>>>>>>              "columnFamily", "qualifer",
>>>>> CompareFilter.CompareOp.EQUAL,
>>>>>>              new BinaryComparator("foo")
>>>>>>          )
>>>>>>      );
>>>>>>
>>>>>
>>>>> Cheers,
>>>>> Dan
>>>>>
>>>>>
>>>>> 2010/1/22 stack <st...@duboce.net>
>>>>>
>>>>>>
>>>>>
>>
> http://people.apache.org/~jdcryans/hbase-0.20.3-candidate-2/<http://peop
>>>>> le.apache.org/%7Ejdcryans/hbase-0.20.3-candidate-2/>
>>>>>>
>>>>>> There is a bit of documentation if you look at javadoc for the
>>>>>> 'indexed' contrib (This is what hbase-2073 is called on commit).
>>>>>>
>>>>>> St.Ack
>>>>>>
>>>>>> P.S. We had a thread going named "HBase bulk load".  You got all
>>>>>> the
>>>>>> answers you need on that one?
>>>>>>
>>>>>> On Thu, Jan 21, 2010 at 11:19 AM, Sriram Muthuswamy Chittathoor
>>>>>> <sr...@ivycomptech.com> wrote:
>>>>>>>
>>>>>>> Great.  Can I migrate to 0.20.3RC2 easily.  I am on 0.20.2. Can u
>>>>> pass
>>>>>>> me the link
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf
>>>>>>> Of
>>>>>>> stack
>>>>>>> Sent: Friday, January 22, 2010 12:42 AM
>>>>>>> To: hbase-user@hadoop.apache.org
>>>>>>> Subject: Re: Support for MultiGet / SQL In clause -- error in
>>>>>>> patch
>>>>>>> HBASE-1845
>>>>>>>
>>>>>>> IIRC, hbase-1845 was a sketch only and not yet complete.  Its
>>>>> probably
>>>>>>> rotted since any ways.
>>>>>>>
>>>>>>> Have you looked at hbase-2037 since committed and available in
>>>>>>> 0.20.3RC2.
>>>>>>> Would this help you with your original problem?
>>>>>>>
>>>>>>> St.Ack
>>>>>>>
>>>>>>> On Thu, Jan 21, 2010 at 9:10 AM, Sriram Muthuswamy Chittathoor <
>>>>>>> sriramc@ivycomptech.com> wrote:
>>>>>>>
>>>>>>>> I tried applying the patch to the hbase source code  hbase
>>>>>>>> 0.20.2
>>>>> and
>>>>>>> I
>>>>>>>> get the errors below.  Do you know if this needs to be applied
>>>>>>>> to
>>>>> a
>>>>>>>> specific hbase version. Is there a version which works with
>>>>>>>> 0.20.2
>>>>> or
>>>>>>>> later ??
>>>>>>>> Basically HRegionServer  and HTable patching fails.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks for the help
>>>>>>>>
>>>>>>>> patch -p0 -i batch.patch
>>>>>>>>
>>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Get.java
>>>>>>>> Hunk #1 succeeded at 61 (offset 2 lines).
>>>>>>>> Hunk #2 succeeded at 347 (offset 31 lines).
>>>>>>>> patching file
>>>>> src/java/org/apache/hadoop/hbase/client/HConnection.java
>>>>>>>> patching file
>>>>>>>> src/java/org/apache/hadoop/hbase/client/HConnectionManager.java
>>>>>>>> Hunk #3 succeeded at 1244 (offset 6 lines).
>>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/
>>>>>>>> HTable.java
>>>>>>>> Hunk #2 succeeded at 73 (offset 8 lines).
>>>>>>>> Hunk #4 FAILED at 405.
>>>>>>>> Hunk #5 succeeded at 671 with fuzz 2 (offset 26 lines).
>>>>>>>> 1 out of 5 hunks FAILED -- saving rejects to file
>>>>>>>> src/java/org/apache/hadoop/hbase/client/HTable.java.rej
>>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Multi.java
>>>>>>>> patching file
>>>>>>> src/java/org/apache/hadoop/hbase/client/MultiCallable.java
>>>>>>>> patching file
>>>>> src/java/org/apache/hadoop/hbase/client/MultiResult.java
>>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Row.java
>>>>>>>> patching file
>>>>>>>> src/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
>>>>>>>> Hunk #2 succeeded at 156 with fuzz 1 (offset 3 lines).
>>>>>>>> patching file
>>>>>>> src/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
>>>>>>>> Hunk #2 succeeded at 247 (offset 2 lines).
>>>>>>>> patching file
>>>>>>>> src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
>>>>>>>> Hunk #1 succeeded at 78 (offset -1 lines).
>>>>>>>> Hunk #2 FAILED at 2515.
>>>>>>>> 1 out of 2 hunks FAILED -- saving rejects to file
>>>>>>>>
>>>>> src/java/org/apache/hadoop/hbase/regionserver/
>>>>> HRegionServer.java.rej
>>>>>>>> patching file
>>>>> src/test/org/apache/hadoop/hbase/client/TestHTable.java
>>>>>>>> Hunk #2 FAILED at 333.
>>>>>>>> 1 out of 2 hunks FAILED -- saving rejects to file
>>>>>>>> src/test/org/apache/hadoop/hbase/client/TestHTable.java.rej
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Marc Limotte [mailto:mslimotte@gmail.com]
>>>>>>>> Sent: Tuesday, January 19, 2010 10:26 PM
>>>>>>>> To: hbase-user@hadoop.apache.org
>>>>>>>> Subject: Re: Support for MultiGet / SQL In clause
>>>>>>>>
>>>>>>>> Sriram,
>>>>>>>>
>>>>>>>> Would a secondary index help you:
>>>>>>>>
>>>>>>>
>>>>>
>>
> http://hadoop.apache.org/hbase/docs/r0.20.2/api/org/apache/hadoop/hbase/
>>>>>>>> client/tableindexed/package-summary.html#package_description
>>>>>>>> .
>>>>>>>>
>>>>>>>> The index is stored in a separate table, but the index is
>>>>>>>> managed
>>>>> for
>>>>>>>> you.
>>>>>>>>
>>>>>>>> I don't think you can do an arbitrary "in" query, though.  If
>>>>>>>> the
>>>>> keys
>>>>>>>> that
>>>>>>>> you want to include in the "in" are reasonably close neighbors,
>>>>> you
>>>>>>>> could do
>>>>>>>> a scan and skip ones that are uninteresting.  You could also
>>>>>>>> try a
>>>>>>> batch
>>>>>>>> Get
>>>>>>>> by applying a separate patch, see
>>>>>>>> http://issues.apache.org/jira/browse/HBASE-1845.
>>>>>>>>
>>>>>>>> Marc Limotte
>>>>>>>>
>>>>>>>> On Tue, Jan 19, 2010 at 8:45 AM, Sriram Muthuswamy Chittathoor <
>>>>>>>> sriramc@ivycomptech.com> wrote:
>>>>>>>>
>>>>>>>>> Is there any support for this.  I want to do this
>>>>>>>>>
>>>>>>>>> 1.  Create a second table to maintain mapping between secondary
>>>>>>> column
>>>>>>>>> and the rowid's of the primary table
>>>>>>>>>
>>>>>>>>> 2.  Use this second table to get the rowid's to lookup from the
>>>>>>>> primary
>>>>>>>>> table using a SQL In like clause ---
>>>>>>>>>
>>>>>>>>> Basically I am doing this to speed up querying by  Non-row key
>>>>>>>> columns.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> Sriram C
>>>>>>>>>
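
To make the two-table plan quoted above concrete, here is a minimal
client-side sketch of the secondary-index pattern, standing in for the
batched Get of HBASE-1845 with plain per-row Gets.  The table names, column
layout and index key format ("<value>/<primaryRowId>") are hypothetical
assumptions, not anything from the thread:

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class InClauseLookup {
      public static void main(String[] args) throws Exception {
        HBaseConfiguration conf = new HBaseConfiguration();
        HTable index = new HTable(conf, "profiles-by-email"); // hypothetical index table
        HTable primary = new HTable(conf, "profiles");        // hypothetical primary table

        // 1. Scan the index table for one value of the secondary column;
        //    index rows are assumed to be "<value>/<primaryRowId>".
        byte[] prefix = Bytes.toBytes("foo@example.com/");
        List<byte[]> rowIds = new ArrayList<byte[]>();
        ResultScanner scanner = index.getScanner(
            new Scan(prefix, Bytes.add(prefix, Bytes.toBytes("~"))));
        for (Result r : scanner) {
          byte[] indexRow = r.getRow();
          // The primary row id is the suffix after "<value>/".
          byte[] rowId = new byte[indexRow.length - prefix.length];
          System.arraycopy(indexRow, prefix.length, rowId, 0, rowId.length);
          rowIds.add(rowId);
        }
        scanner.close();

        // 2. Emulate "WHERE rowid IN (...)" with one Get per row id.
        for (byte[] rowId : rowIds) {
          Result row = primary.get(new Get(rowId));
          // process row...
        }
      }
    }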

RE: Support for MultiGet / SQL In clause

Posted by Sriram Muthuswamy Chittathoor <sr...@ivycomptech.com>.
What do errors like this in my datanode logs mean (they come up on all the
boxes)?  Is something corrupted -- should I clean everything and restart?

2010-01-24 02:27:46,349 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /home/ppoker/test/hadoop-0.20.1/data/data: namenode namespaceID = 1033461714; datanode namespaceID = 1483221888
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:233)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:148)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:216)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1246)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1368)

2010-01-24 02:27:46,349 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:





-----Original Message-----
From: Daniel Washusen [mailto:dan@reactive.org] 
Sent: Sunday, January 24, 2010 12:41 PM
To: hbase-user@hadoop.apache.org
Subject: Re: Support for MultiGet / SQL In clause

Also, I would suggest that you allocate one of the machines to be
hmaster & namenode. The other three machines run rserver, datanode &
zookeeper.

On 24/01/2010, at 5:36 PM, "Sriram Muthuswamy Chittathoor"
<sriramc@ivycomptech.com
 > wrote:

> You were right. This is what I have in the regionserver.out file
>
> java.lang.OutOfMemoryError: Java heap space
> Dumping heap to java_pid28602.hprof ...
> Heap dump file created [1081182427 bytes in 53.046 secs]
>
> Do I increase the jvm heapsize for all the machines.  This is my
> config
>
> 1.  4 boxes running the cluster
> 2.  HBase Regionserver and HDFS data node on all the boxes
> 3.  One of the boxes where this error occurred also has the NameNode
> and
> the HBasemaster running
>
> How do I selectively increase the heapsize just for the regionserver?  I
> changed it in bin/hbase.
>
> -----Original Message-----
> From: Daniel Washusen [mailto:dan@reactive.org]
> Sent: Sunday, January 24, 2010 3:44 AM
> To: hbase-user@hadoop.apache.org
> Subject: Re: Support for MultiGet / SQL In clause
>
> What does the regionserver.out log file say?  Maybe you are out of
> memory...  The master web ui will report how much heap space the
> region server is using...
>
> On 24/01/2010, at 8:43 AM, "Sriram Muthuswamy Chittathoor"
> <sriramc@ivycomptech.com
>> wrote:
>
>> I created a new table with indexes.  Initially created 100000 rows
>> and then did a scan.  At that time it was okay.  Then I started
>> creating a million rows in a loop and then after some time I get
>> this exception and the table disappeared (even from the hbase
>> shell).  One other table also disappeared.
>>
>> This is very consistent.  I tried a few times and every time it is
>> the same on creating a lot of rows.  Rows are not too big (Just some
>> 6 columns in one family) each of say type long or string.  Created 3
>> indexes -- 2 byte array and 1 long.
>>
>>
>>
>> Cur Rows : 349999
>> Cur : 359999
>> Cur : 369999
>> Cur : 379999
>> Cur Rows : 389999   <--  Crashed after this
>> Exception in thread "main"
>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
>> contact region server 10.1.162.25:60020 for region
> POKERHH6,,1264279773695
>> , row '0000392413', but failed after 10 attempts.
>> Exceptions:
>> java.io.IOException: Call to /10.1.162.25:60020 failed on local
>> exception: java.io.EOFException
>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>>
>>       at org.apache.hadoop.hbase.client.HConnectionManager
>> $TableServers.getRegionServerWithRetries(HConnectionManager.java:
>> 1048)
>>       at org.apache.hadoop.hbase.client.HConnectionManager
>> $TableServers$3.doCall(HConnectionManager.java:1239)
>>       at org.apache.hadoop.hbase.client.HConnectionManager
>> $TableServers$Batch.process(HConnectionManager.java:1161)
>>       at org.apache.hadoop.hbase.client.HConnectionManager
>> $TableServers.processBatchOfRows(HConnectionManager.java:1247)
>>       at org.apache.hadoop.hbase.client.HTable.flushCommits
>> (HTable.java:609)
>>       at org.apache.hadoop.hbase.client.HTable.put(HTable.java:474)
>>       at test.TestExtIndexedTable.main(TestExtIndexedTable.java:110)
>> 10/01/23 16:21:28 INFO zookeeper.ZooKeeper: Closing session:
>> 0x265caacc7a001b
>> 10/01/23 16:21:28 INFO zookeeper.ClientCnxn: Closing ClientCnxn for
>> session: 0x265caacc7a001b
>> 10/01/23 16:21:28 INFO zookeeper.ClientCnxn: Exception while closing
>> send thread for session 0x265caacc7a001b : Read error rc = -1
>> java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
>>
>>
>>
>> -----Original Message-----
>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
>> Stack
>> Sent: Sunday, January 24, 2010 1:33 AM
>> To: hbase-user@hadoop.apache.org
>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>> HBASE-1845
>>
>> On Sat, Jan 23, 2010 at 2:52 AM, Sriram Muthuswamy Chittathoor
>> <sr...@ivycomptech.com> wrote:
>>> Thanks all.  I messed it up when I was trying to upgrade to
>>> 0.20.3.  I deleted the data directory and formatted it thinking it
>>> will reset the whole cluster.
>>>
>>> I started fresh by deleting the data directory on all the nodes and
>>> then everything worked.  I was also able to create the indexed
>>> table using the 0.20.3 patch.  Let me run some tests on a few
>>> million rows and see how it holds up.
>>>
>>> BTW --  what would be the right way when I moved versions.  Do I
>>> run migrate scripts to migrate the data to newer versions ?
>>>
>> Just install the new binaries everywhere and restart, or perform a rolling
>> restart -- see http://wiki.apache.org/hadoop/Hbase/RollingRestart --
>> if you would avoid taking down your cluster during the upgrade.
>>
>> You'll be flagged on start if you need to run a migration but general
>> rule is that there (should) never be need of a migration between
>> patch
>> releases: e.g. between 0.20.2 to 0.20.3.  There may be need of
>> migrations moving between minor numbers; e.g. from 0.19 to 0.20.
>>
>> Let us know how IHBase works out for you (indexed hbase).  Its a RAM
>> hog but the speed improvement finding matching cells can be
>> startling.
>>
>> St.Ack
>>
>>> -----Original Message-----
>>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
>>> Stack
>>> Sent: Saturday, January 23, 2010 5:00 AM
>>> To: hbase-user@hadoop.apache.org
>>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>>> HBASE-1845
>>>
>>> Check your master log.  Something is seriously off if you do not
>>> have
>>> a reachable .META. table.
>>> St.Ack
>>>
>>> On Fri, Jan 22, 2010 at 1:09 PM, Sriram Muthuswamy Chittathoor
>>> <sr...@ivycomptech.com> wrote:
>>>> I applied the hbase-0.20.3 version / hadoop 0.20.1.  But after
>>>> starting
>>>> hbase I keep getting the error below when I go to the hbase shell
>>>>
>>>> [ppoker@karisimbivir1 hbase-0.20.3]$ ./bin/hbase shell
>>>> HBase Shell; enter 'help<RETURN>' for list of supported commands.
>>>> Version: 0.20.3, r900041, Sat Jan 16 17:20:21 PST 2010
>>>> hbase(main):001:0> list
>>>> NativeException:
>>>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
>>>> contact region server null for region , row '', but failed after 7
>>>> attempts.
>>>> Exceptions:
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>>
>>>>
>>>>
>>>> Also when I try to create a table programatically I get this --
>>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Attempting connection
>>>> to
>>>> server localhost/127.0.0.1:2181
>>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Priming connection to
>>>> java.nio.channels.SocketChannel[connected local=/127.0.0.1:43775
>>>> remote=localhost/127.0.0.1:2181]
>>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Server connection
>>>> successful
>>>> Exception in thread "main"
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>>       at
>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>> $TableServers.locateReg
>>>> ionInMeta(HConnectionManager.java:684)
>>>>       at
>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>> $TableServers.locateReg
>>>> ion(HConnectionManager.java:634)
>>>>       at
>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>> $TableServers.locateReg
>>>> ion(HConnectionManager.java:601)
>>>>       at
>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>> $TableServers.locateReg
>>>> ionInMeta(HConnectionManager.java:675)
>>>>       at
>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>> $TableServers.locateReg
>>>> ion(HConnectionManager.java:638)
>>>>       at
>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>> $TableServers.locateReg
>>>> ion(HConnectionManager.java:601)
>>>>       at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:
>>>> 128)
>>>>       at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:
>>>> 106)
>>>>       at test.CreateTable.main(CreateTable.java:36)
>>>>
>>>>
>>>>
>>>> Any clues ?
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Dan Washusen [mailto:dan@reactive.org]
>>>> Sent: Friday, January 22, 2010 4:53 AM
>>>> To: hbase-user@hadoop.apache.org
>>>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>>>> HBASE-1845
>>>>
>>>> If you want to give the "indexed" contrib package a try you'll
>>>> need to
>>>> do
>>>> the following:
>>>>
>>>>  1. Include the contrib jars (export HBASE_CLASSPATH=(`find
>>>>  /path/to/hbase/hbase-0.20.3/contrib/indexed -name '*jar' | tr -s
>>>> "\n"
>>>> ":"`)
>>>>  2. Set the 'hbase.hregion.impl' property to
>>>>  'org.apache.hadoop.hbase.regionserver.IdxRegion' in your
>>>> hbase-site.xml
>>>>
>>>> Once you've done that you can create a table with an index using:
>>>>
>>>>>    // define which qualifiers need an index (choosing the correct
>>>> type)
>>>>>    IdxColumnDescriptor columnDescriptor = new
>>>>> IdxColumnDescriptor("columnFamily");
>>>>>    columnDescriptor.addIndexDescriptor(
>>>>>      new IdxIndexDescriptor("qualifier",
>>>>> IdxQualifierType.BYTE_ARRAY)
>>>>>    );
>>>>>
>>>>>    HTableDescriptor tableDescriptor = new HTableDescriptor
>>>>> ("table");
>>>>>    tableDescriptor.addFamily(columnDescriptor);
>>>>>
>>>>
>>>> Then when you want to perform a scan with an index hint:
>>>>
>>>>>    Scan scan = new IdxScan(
>>>>>          new Comparison("columnFamily", "qualifier",
>>>>>              Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>>>      );
>>>>>
>>>>
>>>> You have to keep in mind that the index hint is only a hint.  It
>>>> guarantees
>>>> that your scan will get all rows that match the hint but you'll
>>>> more
>>>> than
>>>> likely receive rows that don't.  For this reason I'd suggest that
>>>> you
>>>> also
>>>> include a filter along with the scan:
>>>>
>>>>>      Scan scan = new IdxScan(
>>>>>          new Comparison("columnFamily", "qualifier",
>>>>>              Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>>>      );
>>>>>      scan.setFilter(
>>>>>          new SingleColumnValueFilter(
>>>>>              "columnFamily", "qualifer",
>>>> CompareFilter.CompareOp.EQUAL,
>>>>>              new BinaryComparator("foo")
>>>>>          )
>>>>>      );
>>>>>
>>>>
>>>> Cheers,
>>>> Dan
>>>>
>>>>
>>>> 2010/1/22 stack <st...@duboce.net>
>>>>
>>>>>
>>>>
>
http://people.apache.org/~jdcryans/hbase-0.20.3-candidate-2/<http://peop
>>>> le.apache.org/%7Ejdcryans/hbase-0.20.3-candidate-2/>
>>>>>
>>>>> There is a bit of documentation if you look at javadoc for the
>>>>> 'indexed' contrib (This is what hbase-2073 is called on commit).
>>>>>
>>>>> St.Ack
>>>>>
>>>>> P.S. We had a thread going named "HBase bulk load".  You got all
>>>>> the
>>>>> answers you need on that one?
>>>>>
>>>>> On Thu, Jan 21, 2010 at 11:19 AM, Sriram Muthuswamy Chittathoor
>>>>> <sr...@ivycomptech.com> wrote:
>>>>>>
>>>>>> Great.  Can I migrate to 0.20.3RC2 easily.  I am on 0.20.2. Can u
>>>> pass
>>>>>> me the link
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf
>>>>>> Of
>>>>>> stack
>>>>>> Sent: Friday, January 22, 2010 12:42 AM
>>>>>> To: hbase-user@hadoop.apache.org
>>>>>> Subject: Re: Support for MultiGet / SQL In clause -- error in
>>>>>> patch
>>>>>> HBASE-1845
>>>>>>
>>>>>> IIRC, hbase-1845 was a sketch only and not yet complete.  Its
>>>> probably
>>>>>> rotted since any ways.
>>>>>>
>>>>>> Have you looked at hbase-2037 since committed and available in
>>>>>> 0.20.3RC2.
>>>>>> Would this help you with your original problem?
>>>>>>
>>>>>> St.Ack
>>>>>>
>>>>>> On Thu, Jan 21, 2010 at 9:10 AM, Sriram Muthuswamy Chittathoor <
>>>>>> sriramc@ivycomptech.com> wrote:
>>>>>>
>>>>>>> I tried applying the patch to the hbase source code  hbase
>>>>>>> 0.20.2
>>>> and
>>>>>> I
>>>>>>> get the errors below.  Do you know if this needs to be applied
>>>>>>> to
>>>> a
>>>>>>> specific hbase version. Is there a version which works with
>>>>>>> 0.20.2
>>>> or
>>>>>>> later ??
>>>>>>> Basically HRegionServer  and HTable patching fails.
>>>>>>>
>>>>>>>
>>>>>>> Thanks for the help
>>>>>>>
>>>>>>> patch -p0 -i batch.patch
>>>>>>>
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Get.java
>>>>>>> Hunk #1 succeeded at 61 (offset 2 lines).
>>>>>>> Hunk #2 succeeded at 347 (offset 31 lines).
>>>>>>> patching file
>>>> src/java/org/apache/hadoop/hbase/client/HConnection.java
>>>>>>> patching file
>>>>>>> src/java/org/apache/hadoop/hbase/client/HConnectionManager.java
>>>>>>> Hunk #3 succeeded at 1244 (offset 6 lines).
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/
>>>>>>> HTable.java
>>>>>>> Hunk #2 succeeded at 73 (offset 8 lines).
>>>>>>> Hunk #4 FAILED at 405.
>>>>>>> Hunk #5 succeeded at 671 with fuzz 2 (offset 26 lines).
>>>>>>> 1 out of 5 hunks FAILED -- saving rejects to file
>>>>>>> src/java/org/apache/hadoop/hbase/client/HTable.java.rej
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Multi.java
>>>>>>> patching file
>>>>>> src/java/org/apache/hadoop/hbase/client/MultiCallable.java
>>>>>>> patching file
>>>> src/java/org/apache/hadoop/hbase/client/MultiResult.java
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Row.java
>>>>>>> patching file
>>>>>>> src/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
>>>>>>> Hunk #2 succeeded at 156 with fuzz 1 (offset 3 lines).
>>>>>>> patching file
>>>>>> src/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
>>>>>>> Hunk #2 succeeded at 247 (offset 2 lines).
>>>>>>> patching file
>>>>>>> src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
>>>>>>> Hunk #1 succeeded at 78 (offset -1 lines).
>>>>>>> Hunk #2 FAILED at 2515.
>>>>>>> 1 out of 2 hunks FAILED -- saving rejects to file
>>>>>>>
>>>> src/java/org/apache/hadoop/hbase/regionserver/
>>>> HRegionServer.java.rej
>>>>>>> patching file
>>>> src/test/org/apache/hadoop/hbase/client/TestHTable.java
>>>>>>> Hunk #2 FAILED at 333.
>>>>>>> 1 out of 2 hunks FAILED -- saving rejects to file
>>>>>>> src/test/org/apache/hadoop/hbase/client/TestHTable.java.rej
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Marc Limotte [mailto:mslimotte@gmail.com]
>>>>>>> Sent: Tuesday, January 19, 2010 10:26 PM
>>>>>>> To: hbase-user@hadoop.apache.org
>>>>>>> Subject: Re: Support for MultiGet / SQL In clause
>>>>>>>
>>>>>>> Sriram,
>>>>>>>
>>>>>>> Would a secondary index help you:
>>>>>>>
>>>>>>
>>>>
>
http://hadoop.apache.org/hbase/docs/r0.20.2/api/org/apache/hadoop/hbase/
>>>>>>> client/tableindexed/package-summary.html#package_description
>>>>>>> .
>>>>>>>
>>>>>>> The index is stored in a separate table, but the index is
>>>>>>> managed
>>>> for
>>>>>>> you.
>>>>>>>
>>>>>>> I don't think you can do an arbitrary "in" query, though.  If
>>>>>>> the
>>>> keys
>>>>>>> that
>>>>>>> you want to include in the "in" are reasonably close neighbors,
>>>> you
>>>>>>> could do
>>>>>>> a scan and skip ones that are uninteresting.  You could also
>>>>>>> try a
>>>>>> batch
>>>>>>> Get
>>>>>>> by applying a separate patch, see
>>>>>>> http://issues.apache.org/jira/browse/HBASE-1845.
>>>>>>>
>>>>>>> Marc Limotte
>>>>>>>
>>>>>>> On Tue, Jan 19, 2010 at 8:45 AM, Sriram Muthuswamy Chittathoor <
>>>>>>> sriramc@ivycomptech.com> wrote:
>>>>>>>
>>>>>>>> Is there any support for this.  I want to do this
>>>>>>>>
>>>>>>>> 1.  Create a second table to maintain mapping between secondary
>>>>>> column
>>>>>>>> and the rowid's of the primary table
>>>>>>>>
>>>>>>>> 2.  Use this second table to get the rowid's to lookup from the
>>>>>>> primary
>>>>>>>> table using a SQL In like clause ---
>>>>>>>>
>>>>>>>> Basically I am doing this to speed up querying by  Non-row key
>>>>>>> columns.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> Sriram C
>>>>>>>>


Re: Support for MultiGet / SQL In clause

Posted by Daniel Washusen <da...@reactive.org>.
Also, I would suggest that you allocate one of the machines to be
hmaster & namenode. The other three machines run rserver, datanode &
zookeeper.

On 24/01/2010, at 5:36 PM, "Sriram Muthuswamy Chittathoor"
<sriramc@ivycomptech.com
 > wrote:

> You were right. This is what I have in the regionserver.out file
>
> java.lang.OutOfMemoryError: Java heap space
> Dumping heap to java_pid28602.hprof ...
> Heap dump file created [1081182427 bytes in 53.046 secs]
>
> Do I increase the jvm heapsize for all the machines.  This is my
> config
>
> 1.  4 boxes running the cluster
> 2.  HBase Regionserver and HDFS data node on all the boxes
> 3.  One of the boxes where this error occurred also has the NameNode
> and
> the HBasemaster running
>
> How do I selectively increase the heapsize just for the regionserver?  I
> changed it in bin/hbase.
>
> -----Original Message-----
> From: Daniel Washusen [mailto:dan@reactive.org]
> Sent: Sunday, January 24, 2010 3:44 AM
> To: hbase-user@hadoop.apache.org
> Subject: Re: Support for MultiGet / SQL In clause
>
> What does the regionserver.out log file say?  Maybe you are out of
> memory...  The master web ui will report how much heap space the
> region server is using...
>
> On 24/01/2010, at 8:43 AM, "Sriram Muthuswamy Chittathoor"
> <sriramc@ivycomptech.com
>> wrote:
>
>> I created a new table with indexes.  Initially created 100000 rows
>> and then did a scan.  At that time it was okay.  Then I started
>> creating a million rows in a loop and then after some time I get
>> this exception and the table disappeared (even from the hbase
>> shell).  One other table also disappeared.
>>
>> This is very consistent.  I tried a few times and every time it is
>> the same on creating a lot of rows.  Rows are not too big (Just some
>> 6 columns in one family) each of say type long or string.  Created 3
>> indexes -- 2 byte array and 1 long.
>>
>>
>>
>> Cur Rows : 349999
>> Cur : 359999
>> Cur : 369999
>> Cur : 379999
>> Cur Rows : 389999   <--  Crashed after this
>> Exception in thread "main"
>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
>> contact region server 10.1.162.25:60020 for region
>> POKERHH6,,1264279773695, row '0000392413', but failed after 10 attempts.
>> Exceptions:
>> java.io.IOException: Call to /10.1.162.25:60020 failed on local
>> exception: java.io.EOFException
>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>>
>>       at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1048)
>>       at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$3.doCall(HConnectionManager.java:1239)
>>       at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1161)
>>       at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1247)
>>       at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:609)
>>       at org.apache.hadoop.hbase.client.HTable.put(HTable.java:474)
>>       at test.TestExtIndexedTable.main(TestExtIndexedTable.java:110)
>> 10/01/23 16:21:28 INFO zookeeper.ZooKeeper: Closing session:
>> 0x265caacc7a001b
>> 10/01/23 16:21:28 INFO zookeeper.ClientCnxn: Closing ClientCnxn for
>> session: 0x265caacc7a001b
>> 10/01/23 16:21:28 INFO zookeeper.ClientCnxn: Exception while closing
>> send thread for session 0x265caacc7a001b : Read error rc = -1
>> java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
>>
>>
>>
>> -----Original Message-----
>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
>> Stack
>> Sent: Sunday, January 24, 2010 1:33 AM
>> To: hbase-user@hadoop.apache.org
>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>> HBASE-1845
>>
>> On Sat, Jan 23, 2010 at 2:52 AM, Sriram Muthuswamy Chittathoor
>> <sr...@ivycomptech.com> wrote:
>>> Thanks all.  I messed it up when I was trying to upgrade to
>>> 0.20.3.  I deleted the data directory and formatted it, thinking it
>>> would reset the whole cluster.
>>>
>>> I started fresh by deleting the data directory on all the nodes and
>>> then everything worked.  I was also able to create the indexed
>>> table using the 0.20.3 patch.  Let me run some tests on a few
>>> million rows and see how it holds up.
>>>
>>> BTW -- what would be the right way when I move versions?  Do I
>>> run migrate scripts to migrate the data to newer versions?
>>>
>> Just install the new binaries everywhere and restart, or perform a
>> rolling restart -- see
>> http://wiki.apache.org/hadoop/Hbase/RollingRestart -- if you want to
>> avoid taking down your cluster during the upgrade.
>>
>> You'll be flagged on start if you need to run a migration, but the
>> general rule is that there (should) never be need of a migration
>> between patch releases, e.g. between 0.20.2 and 0.20.3.  There may be
>> need of migrations moving between minor numbers, e.g. from 0.19 to 0.20.
>>
>> Let us know how IHBase works out for you (indexed hbase).  It's a RAM
>> hog, but the speed improvement finding matching cells can be
>> startling.
>>
>> St.Ack
>>
>>> -----Original Message-----
>>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
>>> Stack
>>> Sent: Saturday, January 23, 2010 5:00 AM
>>> To: hbase-user@hadoop.apache.org
>>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>>> HBASE-1845
>>>
>>> Check your master log.  Something is seriously off if you do not
>>> have
>>> a reachable .META. table.
>>> St.Ack
>>>
>>> On Fri, Jan 22, 2010 at 1:09 PM, Sriram Muthuswamy Chittathoor
>>> <sr...@ivycomptech.com> wrote:
>>>> I applied the hbase-0.20.3 version / hadoop 0.20.1.  But after
>>>> starting
>>>> hbase I keep getting the error below when I go to the hbase shell
>>>>
>>>> [ppoker@karisimbivir1 hbase-0.20.3]$ ./bin/hbase shell
>>>> HBase Shell; enter 'help<RETURN>' for list of supported commands.
>>>> Version: 0.20.3, r900041, Sat Jan 16 17:20:21 PST 2010
>>>> hbase(main):001:0> list
>>>> NativeException:
>>>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
>>>> contact region server null for region , row '', but failed after 7
>>>> attempts.
>>>> Exceptions:
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>>
>>>>
>>>>
>>>> Also when I try to create a table programatically I get this --
>>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Attempting connection
>>>> to
>>>> server localhost/127.0.0.1:2181
>>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Priming connection to
>>>> java.nio.channels.SocketChannel[connected local=/127.0.0.1:43775
>>>> remote=localhost/127.0.0.1:2181]
>>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Server connection
>>>> successful
>>>> Exception in thread "main"
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>>       at
>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>> $TableServers.locateReg
>>>> ionInMeta(HConnectionManager.java:684)
>>>>       at
>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>> $TableServers.locateReg
>>>> ion(HConnectionManager.java:634)
>>>>       at
>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>> $TableServers.locateReg
>>>> ion(HConnectionManager.java:601)
>>>>       at
>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>> $TableServers.locateReg
>>>> ionInMeta(HConnectionManager.java:675)
>>>>       at
>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>> $TableServers.locateReg
>>>> ion(HConnectionManager.java:638)
>>>>       at
>>>> org.apache.hadoop.hbase.client.HConnectionManager
>>>> $TableServers.locateReg
>>>> ion(HConnectionManager.java:601)
>>>>       at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:
>>>> 128)
>>>>       at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:
>>>> 106)
>>>>       at test.CreateTable.main(CreateTable.java:36)
>>>>
>>>>
>>>>
>>>> Any clues ?
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Dan Washusen [mailto:dan@reactive.org]
>>>> Sent: Friday, January 22, 2010 4:53 AM
>>>> To: hbase-user@hadoop.apache.org
>>>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>>>> HBASE-1845
>>>>
>>>> If you want to give the "indexed" contrib package a try, you'll
>>>> need to do the following:
>>>>
>>>>  1. Include the contrib jars: export HBASE_CLASSPATH=`find
>>>>     /path/to/hbase/hbase-0.20.3/contrib/indexed -name '*.jar' | tr -s
>>>>     "\n" ":"`
>>>>  2. Set the 'hbase.hregion.impl' property to
>>>>     'org.apache.hadoop.hbase.regionserver.IdxRegion' in your
>>>>     hbase-site.xml
>>>>
>>>> Once you've done that you can create a table with an index using:
>>>>
>>>>>    // define which qualifiers need an index (choosing the correct type)
>>>>>    IdxColumnDescriptor columnDescriptor =
>>>>>        new IdxColumnDescriptor("columnFamily");
>>>>>    columnDescriptor.addIndexDescriptor(
>>>>>      new IdxIndexDescriptor("qualifier", IdxQualifierType.BYTE_ARRAY)
>>>>>    );
>>>>>
>>>>>    HTableDescriptor tableDescriptor = new HTableDescriptor("table");
>>>>>    tableDescriptor.addFamily(columnDescriptor);
>>>>>
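A hedged completion, since the quoted snippet stops before the table is
actually created -- assuming the stock 0.20.x admin API, the descriptor
would then be handed to HBaseAdmin:

    // hypothetical follow-on to the snippet above, using
    // org.apache.hadoop.hbase.HBaseConfiguration and
    // org.apache.hadoop.hbase.client.HBaseAdmin
    HBaseConfiguration conf = new HBaseConfiguration();
    HBaseAdmin admin = new HBaseAdmin(conf);
    admin.createTable(tableDescriptor);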
>>>>
>>>> Then when you want to perform a scan with an index hint:
>>>>
>>>>>    Scan scan = new IdxScan(
>>>>>          new Comparison("columnFamily", "qualifier",
>>>>>              Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>>>      );
>>>>>
>>>>
>>>> You have to keep in mind that the index hint is only a hint.  It
>>>> guarantees that your scan will get all rows that match the hint, but
>>>> you'll more than likely receive rows that don't.  For this reason
>>>> I'd suggest that you also include a filter along with the scan:
>>>>
>>>>>      Scan scan = new IdxScan(
>>>>>          new Comparison("columnFamily", "qualifier",
>>>>>              Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>>>      );
>>>>>      scan.setFilter(
>>>>>          new SingleColumnValueFilter(
>>>>>              Bytes.toBytes("columnFamily"), Bytes.toBytes("qualifier"),
>>>>>              CompareFilter.CompareOp.EQUAL,
>>>>>              new BinaryComparator(Bytes.toBytes("foo"))
>>>>>          )
>>>>>      );
>>>>>
>>>>
>>>> Cheers,
>>>> Dan
>>>>
>>>>
>>>> 2010/1/22 stack <st...@duboce.net>
>>>>
>>>>>
>>>>
>>>> http://people.apache.org/~jdcryans/hbase-0.20.3-candidate-2/
>>>>>
>>>>> There is a bit of documentation if you look at javadoc for the
>>>>> 'indexed' contrib (This is what hbase-2073 is called on commit).
>>>>>
>>>>> St.Ack
>>>>>
>>>>> P.S. We had a thread going named "HBase bulk load".  You got all
>>>>> the
>>>>> answers you need on that one?
>>>>>
>>>>> On Thu, Jan 21, 2010 at 11:19 AM, Sriram Muthuswamy Chittathoor
>>>>> <sr...@ivycomptech.com> wrote:
>>>>>>
>>>>>> Great.  Can I migrate to 0.20.3RC2 easily?  I am on 0.20.2.  Can
>>>>>> you pass me the link?
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf
>>>>>> Of
>>>>>> stack
>>>>>> Sent: Friday, January 22, 2010 12:42 AM
>>>>>> To: hbase-user@hadoop.apache.org
>>>>>> Subject: Re: Support for MultiGet / SQL In clause -- error in
>>>>>> patch
>>>>>> HBASE-1845
>>>>>>
>>>>>> IIRC, hbase-1845 was a sketch only and not yet complete.  It's
>>>>>> probably rotted since then, anyway.
>>>>>>
>>>>>> Have you looked at hbase-2037, since committed and available in
>>>>>> 0.20.3RC2?  Would this help you with your original problem?
>>>>>>
>>>>>> St.Ack
>>>>>>
>>>>>> On Thu, Jan 21, 2010 at 9:10 AM, Sriram Muthuswamy Chittathoor <
>>>>>> sriramc@ivycomptech.com> wrote:
>>>>>>
>>>>>>> I tried applying the patch to the hbase source code (hbase 0.20.2)
>>>>>>> and I get the errors below.  Do you know if this needs to be
>>>>>>> applied to a specific hbase version?  Is there a version which
>>>>>>> works with 0.20.2 or later?
>>>>>>> Basically HRegionServer and HTable patching fails.
>>>>>>>
>>>>>>>
>>>>>>> Thanks for the help
>>>>>>>
>>>>>>> patch -p0 -i batch.patch
>>>>>>>
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Get.java
>>>>>>> Hunk #1 succeeded at 61 (offset 2 lines).
>>>>>>> Hunk #2 succeeded at 347 (offset 31 lines).
>>>>>>> patching file
>>>> src/java/org/apache/hadoop/hbase/client/HConnection.java
>>>>>>> patching file
>>>>>>> src/java/org/apache/hadoop/hbase/client/HConnectionManager.java
>>>>>>> Hunk #3 succeeded at 1244 (offset 6 lines).
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/
>>>>>>> HTable.java
>>>>>>> Hunk #2 succeeded at 73 (offset 8 lines).
>>>>>>> Hunk #4 FAILED at 405.
>>>>>>> Hunk #5 succeeded at 671 with fuzz 2 (offset 26 lines).
>>>>>>> 1 out of 5 hunks FAILED -- saving rejects to file
>>>>>>> src/java/org/apache/hadoop/hbase/client/HTable.java.rej
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Multi.java
>>>>>>> patching file
>>>>>> src/java/org/apache/hadoop/hbase/client/MultiCallable.java
>>>>>>> patching file
>>>> src/java/org/apache/hadoop/hbase/client/MultiResult.java
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Row.java
>>>>>>> patching file
>>>>>>> src/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
>>>>>>> Hunk #2 succeeded at 156 with fuzz 1 (offset 3 lines).
>>>>>>> patching file
>>>>>> src/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
>>>>>>> Hunk #2 succeeded at 247 (offset 2 lines).
>>>>>>> patching file
>>>>>>> src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
>>>>>>> Hunk #1 succeeded at 78 (offset -1 lines).
>>>>>>> Hunk #2 FAILED at 2515.
>>>>>>> 1 out of 2 hunks FAILED -- saving rejects to file
>>>>>>>
>>>> src/java/org/apache/hadoop/hbase/regionserver/
>>>> HRegionServer.java.rej
>>>>>>> patching file
>>>> src/test/org/apache/hadoop/hbase/client/TestHTable.java
>>>>>>> Hunk #2 FAILED at 333.
>>>>>>> 1 out of 2 hunks FAILED -- saving rejects to file
>>>>>>> src/test/org/apache/hadoop/hbase/client/TestHTable.java.rej
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Marc Limotte [mailto:mslimotte@gmail.com]
>>>>>>> Sent: Tuesday, January 19, 2010 10:26 PM
>>>>>>> To: hbase-user@hadoop.apache.org
>>>>>>> Subject: Re: Support for MultiGet / SQL In clause
>>>>>>>
>>>>>>> Sriram,
>>>>>>>
>>>>>>> Would a secondary index help you:
>>>>>>>
>>>>>>
>>>>
>>>>>>> http://hadoop.apache.org/hbase/docs/r0.20.2/api/org/apache/hadoop/hbase/client/tableindexed/package-summary.html#package_description
>>>>>>>
>>>>>>> The index is stored in a separate table, but the index is
>>>>>>> managed for you.
>>>>>>>
>>>>>>> I don't think you can do an arbitrary "in" query, though.  If the
>>>>>>> keys that you want to include in the "in" are reasonably close
>>>>>>> neighbors, you could do a scan and skip ones that are
>>>>>>> uninteresting.  You could also try a batch Get by applying a
>>>>>>> separate patch, see
>>>>>>> http://issues.apache.org/jira/browse/HBASE-1845.
>>>>>>>
>>>>>>> Marc Limotte
>>>>>>>
>>>>>>> On Tue, Jan 19, 2010 at 8:45 AM, Sriram Muthuswamy Chittathoor <
>>>>>>> sriramc@ivycomptech.com> wrote:
>>>>>>>
>>>>>>>> Is there any support for this?  I want to do this:
>>>>>>>>
>>>>>>>> 1.  Create a second table to maintain a mapping between a
>>>>>>>> secondary column and the rowids of the primary table
>>>>>>>>
>>>>>>>> 2.  Use this second table to get the rowids to look up from the
>>>>>>>> primary table using a SQL IN-like clause
>>>>>>>>
>>>>>>>> Basically I am doing this to speed up querying by non-row-key
>>>>>>>> columns.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> Sriram C
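
A hedged sketch of what that lookup looks like on a stock 0.20.x
client, where there is no MultiGet yet: read the rowids from the
secondary (index) table, then fetch each one from the primary table
with an individual Get.  Table and column names here are hypothetical:

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class InClauseLookup {
      public static void main(String[] args) throws Exception {
        HBaseConfiguration conf = new HBaseConfiguration();
        HTable primary = new HTable(conf, "primary");
        // rowids previously read from the secondary (index) table
        String[] rowids = { "row-17", "row-42", "row-99" };
        List<Result> hits = new ArrayList<Result>();
        for (String rowid : rowids) {
          // one RPC per key -- the SQL "IN" is emulated client-side
          Result r = primary.get(new Get(Bytes.toBytes(rowid)));
          if (!r.isEmpty()) {
            hits.add(r);
          }
        }
        System.out.println("matched " + hits.size() + " rows");
      }
    }

Each Get is a separate round trip, which is exactly why a server-side
batch (the hbase-1845 sketch discussed above) is attractive here.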

Re: Support for MultiGet / SQL In clause

Posted by Daniel Washusen <da...@reactive.org>.
One of the downsides of the indexed contrib is that it makes memory
planning and management much more crucial. I imagine this will improve
as it matures.

I would allocate as much memory as you can to the region servers.  In
your hbase-env.sh, increase the HBASE_HEAPSIZE env variable.  As I
said, give them as much as you can, but obviously take into account
other processes you have running on the machine.
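
For example, a minimal hbase-env.sh sketch -- 4000 MB is an assumed
figure, not one from this thread; size it to what the box can spare:

    # conf/hbase-env.sh
    # The maximum amount of heap to use, in MB. Default is 1000.
    export HBASE_HEAPSIZE=4000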

Cheers,
Dan


RE: Support for MultiGet / SQL In clause

Posted by Sriram Muthuswamy Chittathoor <sr...@ivycomptech.com>.
You were right. This is what I have in the regionserver.out file:

java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid28602.hprof ...
Heap dump file created [1081182427 bytes in 53.046 secs]

Do I increase the JVM heap size for all the machines?  This is my config:

1.  4 boxes running the cluster
2.  HBase region server and HDFS DataNode on all the boxes
3.  One of the boxes where this error occurred also has the NameNode and
the HBase master running

How do I selectively increase the heap size just for the region server?
I changed it in bin/hbase.
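
One hedged way to do that without editing bin/hbase, sketched under the
assumption that your bin/hbase honours a per-command variable such as
HBASE_REGIONSERVER_OPTS (later releases do; check your copy of the
script before relying on it):

    # conf/hbase-env.sh
    # global default used by the master and other daemons
    export HBASE_HEAPSIZE=1000
    # region server only: a later -Xmx wins, so this overrides the global
    export HBASE_REGIONSERVER_OPTS="-Xmx3000m"

Failing that, since hbase-env.sh is read per machine, you can simply
set a larger HBASE_HEAPSIZE on the boxes that run only a region server.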

-----Original Message-----
From: Daniel Washusen [mailto:dan@reactive.org] 
Sent: Sunday, January 24, 2010 3:44 AM
To: hbase-user@hadoop.apache.org
Subject: Re: Support for MultiGet / SQL In clause

What does the regionserver.out log file say?  Maybe you are out of
memory...  The master web ui will report how much heap space the
region server is using...

On 24/01/2010, at 8:43 AM, "Sriram Muthuswamy Chittathoor"
<sriramc@ivycomptech.com
 > wrote:

> I created a new table with indexes.  Initially created 100000 rows
> and then did a scan.  At that time it was okay.  Then I started
> creating a million rows in a loop and then after some time I get
> this exception and the table disappeared (even from the hbase
> shell).  One other table also disappeared.
>
> This is very consistent.  I tried a few times and every time it is
> the same on creating a lot of rows.  Rows are not too big (Just some
> 6 columns in one family) each of say type long or string.  Created 3
> indexes -- 2 byte array and 1 long.
>
>
>
> Cur Rows : 349999
> Cur : 359999
> Cur : 369999
> Cur : 379999
> Cur Rows : 389999   <--  Crashed after this
> Exception in thread "main"
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
> contact region server 10.1.162.25:60020 for region
POKERHH6,,1264279773695
> , row '0000392413', but failed after 10 attempts.
> Exceptions:
> java.io.IOException: Call to /10.1.162.25:60020 failed on local
> exception: java.io.EOFException
> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>
>        at org.apache.hadoop.hbase.client.HConnectionManager
> $TableServers.getRegionServerWithRetries(HConnectionManager.java:1048)
>        at org.apache.hadoop.hbase.client.HConnectionManager
> $TableServers$3.doCall(HConnectionManager.java:1239)
>        at org.apache.hadoop.hbase.client.HConnectionManager
> $TableServers$Batch.process(HConnectionManager.java:1161)
>        at org.apache.hadoop.hbase.client.HConnectionManager
> $TableServers.processBatchOfRows(HConnectionManager.java:1247)
>        at org.apache.hadoop.hbase.client.HTable.flushCommits
> (HTable.java:609)
>        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:474)
>        at test.TestExtIndexedTable.main(TestExtIndexedTable.java:110)
> 10/01/23 16:21:28 INFO zookeeper.ZooKeeper: Closing session:
> 0x265caacc7a001b
> 10/01/23 16:21:28 INFO zookeeper.ClientCnxn: Closing ClientCnxn for
> session: 0x265caacc7a001b
> 10/01/23 16:21:28 INFO zookeeper.ClientCnxn: Exception while closing
> send thread for session 0x265caacc7a001b : Read error rc = -1
> java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
>
>
>
> -----Original Message-----
> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
> Stack
> Sent: Sunday, January 24, 2010 1:33 AM
> To: hbase-user@hadoop.apache.org
> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
> HBASE-1845
>
> On Sat, Jan 23, 2010 at 2:52 AM, Sriram Muthuswamy Chittathoor
> <sr...@ivycomptech.com> wrote:
>> Thanks all.  I messed it up when I was trying to upgrade to
>> 0.20.3.  I deleted the data directory and formatted it thinking it
>> will reset the whole cluster.
>>
>> I started fresh by deleting the data directory on all the nodes and
>> then everything worked.  I was also able to create the indexed
>> table using the 0.20.3 patch.  Let me run some tests on a few
>> million rows and see how it holds up.
>>
>> BTW --  what would be the right way when I moved versions.  Do I
>> run migrate scripts to migrate the data to newer versions ?
>>
> Just install the new binaries every and restart or perform a rolling
> restart -- see http://wiki.apache.org/hadoop/Hbase/RollingRestart --
> if you would avoid taking down your cluster during the upgrade.
>
> You'll be flagged on start if you need to run a migration but general
> rule is that there (should) never be need of a migration between patch
> releases: e.g. between 0.20.2 to 0.20.3.  There may be need of
> migrations moving between minor numbers; e.g. from 0.19 to 0.20.
>
> Let us know how IHBase works out for you (indexed hbase).  Its a RAM
> hog but the speed improvement finding matching cells can be startling.
>
> St.Ack
>
>> -----Original Message-----
>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
>> Stack
>> Sent: Saturday, January 23, 2010 5:00 AM
>> To: hbase-user@hadoop.apache.org
>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>> HBASE-1845
>>
>> Check your master log.  Something is seriously off if you do not have
>> a reachable .META. table.
>> St.Ack
>>
>> On Fri, Jan 22, 2010 at 1:09 PM, Sriram Muthuswamy Chittathoor
>> <sr...@ivycomptech.com> wrote:
>>> I applied the hbase-0.20.3 version / hadoop 0.20.1.  But after
>>> starting
>>> hbase I keep getting the error below when I go to the hbase shell
>>>
>>> [ppoker@karisimbivir1 hbase-0.20.3]$ ./bin/hbase shell
>>> HBase Shell; enter 'help<RETURN>' for list of supported commands.
>>> Version: 0.20.3, r900041, Sat Jan 16 17:20:21 PST 2010
>>> hbase(main):001:0> list
>>> NativeException:
>>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
>>> contact region server null for region , row '', but failed after 7
>>> attempts.
>>> Exceptions:
>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>
>>>
>>>
>>> Also when I try to create a table programatically I get this --
>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Attempting connection
>>> to
>>> server localhost/127.0.0.1:2181
>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Priming connection to
>>> java.nio.channels.SocketChannel[connected local=/127.0.0.1:43775
>>> remote=localhost/127.0.0.1:2181]
>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Server connection
>>> successful
>>> Exception in thread "main"
>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>        at
>>> org.apache.hadoop.hbase.client.HConnectionManager
>>> $TableServers.locateReg
>>> ionInMeta(HConnectionManager.java:684)
>>>        at
>>> org.apache.hadoop.hbase.client.HConnectionManager
>>> $TableServers.locateReg
>>> ion(HConnectionManager.java:634)
>>>        at
>>> org.apache.hadoop.hbase.client.HConnectionManager
>>> $TableServers.locateReg
>>> ion(HConnectionManager.java:601)
>>>        at
>>> org.apache.hadoop.hbase.client.HConnectionManager
>>> $TableServers.locateReg
>>> ionInMeta(HConnectionManager.java:675)
>>>        at
>>> org.apache.hadoop.hbase.client.HConnectionManager
>>> $TableServers.locateReg
>>> ion(HConnectionManager.java:638)
>>>        at
>>> org.apache.hadoop.hbase.client.HConnectionManager
>>> $TableServers.locateReg
>>> ion(HConnectionManager.java:601)
>>>        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:
>>> 128)
>>>        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:
>>> 106)
>>>        at test.CreateTable.main(CreateTable.java:36)
>>>
>>>
>>>
>>> Any clues ?
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Dan Washusen [mailto:dan@reactive.org]
>>> Sent: Friday, January 22, 2010 4:53 AM
>>> To: hbase-user@hadoop.apache.org
>>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>>> HBASE-1845
>>>
>>> If you want to give the "indexed" contrib package a try you'll
>>> need to
>>> do
>>> the following:
>>>
>>>   1. Include the contrib jars (export HBASE_CLASSPATH=(`find
>>>   /path/to/hbase/hbase-0.20.3/contrib/indexed -name '*jar' | tr -s
>>> "\n"
>>> ":"`)
>>>   2. Set the 'hbase.hregion.impl' property to
>>>   'org.apache.hadoop.hbase.regionserver.IdxRegion' in your
>>> hbase-site.xml
>>>
>>> Once you've done that you can create a table with an index using:
>>>
>>>>     // define which qualifiers need an index (choosing the correct
>>> type)
>>>>     IdxColumnDescriptor columnDescriptor = new
>>>> IdxColumnDescriptor("columnFamily");
>>>>     columnDescriptor.addIndexDescriptor(
>>>>       new IdxIndexDescriptor("qualifier",
>>>> IdxQualifierType.BYTE_ARRAY)
>>>>     );
>>>>
>>>>     HTableDescriptor tableDescriptor = new HTableDescriptor
>>>> ("table");
>>>>     tableDescriptor.addFamily(columnDescriptor);
>>>>
>>>
>>> Then when you want to perform a scan with an index hint:
>>>
>>>>     Scan scan = new IdxScan(
>>>>           new Comparison("columnFamily", "qualifier",
>>>>               Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>>       );
>>>>
>>>
>>> You have to keep in mind that the index hint is only a hint.  It
>>> guarantees
>>> that your scan will get all rows that match the hint but you'll more
>>> than
>>> likely receive rows that don't.  For this reason I'd suggest that
>>> you
>>> also
>>> include a filter along with the scan:
>>>
>>>>       Scan scan = new IdxScan(
>>>>           new Comparison("columnFamily", "qualifier",
>>>>               Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>>       );
>>>>       scan.setFilter(
>>>>           new SingleColumnValueFilter(
>>>>               "columnFamily", "qualifer",
>>> CompareFilter.CompareOp.EQUAL,
>>>>               new BinaryComparator("foo")
>>>>           )
>>>>       );
>>>>
>>>
>>> Cheers,
>>> Dan
>>>
>>>
>>> 2010/1/22 stack <st...@duboce.net>
>>>
>>>>
>>>
http://people.apache.org/~jdcryans/hbase-0.20.3-candidate-2/<http://peop
>>> le.apache.org/%7Ejdcryans/hbase-0.20.3-candidate-2/>
>>>>
>>>> There is a bit of documentation if you look at javadoc for the
>>>> 'indexed' contrib (This is what hbase-2073 is called on commit).
>>>>
>>>> St.Ack
>>>>
>>>> P.S. We had a thread going named "HBase bulk load".  You got all
>>>> the
>>>> answers you need on that one?
>>>>
>>>> On Thu, Jan 21, 2010 at 11:19 AM, Sriram Muthuswamy Chittathoor
>>>> <sr...@ivycomptech.com> wrote:
>>>>>
>>>>> Great.  Can I migrate to 0.20.3RC2 easily.  I am on 0.20.2. Can u
>>> pass
>>>>> me the link
>>>>>
>>>>> -----Original Message-----
>>>>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf
>>>>> Of
>>>>> stack
>>>>> Sent: Friday, January 22, 2010 12:42 AM
>>>>> To: hbase-user@hadoop.apache.org
>>>>> Subject: Re: Support for MultiGet / SQL In clause -- error in
>>>>> patch
>>>>> HBASE-1845
>>>>>
>>>>> IIRC, hbase-1845 was a sketch only and not yet complete.  Its
>>> probably
>>>>> rotted since any ways.
>>>>>
>>>>> Have you looked at hbase-2037 since committed and available in
>>>>> 0.20.3RC2.
>>>>>  Would this help you with your original problem?
>>>>>
>>>>> St.Ack
>>>>>
>>>>> On Thu, Jan 21, 2010 at 9:10 AM, Sriram Muthuswamy Chittathoor <
>>>>> sriramc@ivycomptech.com> wrote:
>>>>>
>>>>>> I tried applying the patch to the hbase source code  hbase 0.20.2
>>> and
>>>>> I
>>>>>> get the errors below.  Do you know if this needs to be applied to
>>> a
>>>>>> specific hbase version. Is there a version which works with
>>>>>> 0.20.2
>>> or
>>>>>> later ??
>>>>>> Basically HRegionServer  and HTable patching fails.
>>>>>>
>>>>>>
>>>>>> Thanks for the help
>>>>>>
>>>>>> patch -p0 -i batch.patch
>>>>>>
>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Get.java
>>>>>> Hunk #1 succeeded at 61 (offset 2 lines).
>>>>>> Hunk #2 succeeded at 347 (offset 31 lines).
>>>>>> patching file
>>> src/java/org/apache/hadoop/hbase/client/HConnection.java
>>>>>> patching file
>>>>>> src/java/org/apache/hadoop/hbase/client/HConnectionManager.java
>>>>>> Hunk #3 succeeded at 1244 (offset 6 lines).
>>>>>> patching file src/java/org/apache/hadoop/hbase/client/HTable.java
>>>>>> Hunk #2 succeeded at 73 (offset 8 lines).
>>>>>> Hunk #4 FAILED at 405.
>>>>>> Hunk #5 succeeded at 671 with fuzz 2 (offset 26 lines).
>>>>>> 1 out of 5 hunks FAILED -- saving rejects to file
>>>>>> src/java/org/apache/hadoop/hbase/client/HTable.java.rej
>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Multi.java
>>>>>> patching file
>>>>> src/java/org/apache/hadoop/hbase/client/MultiCallable.java
>>>>>> patching file
>>> src/java/org/apache/hadoop/hbase/client/MultiResult.java
>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Row.java
>>>>>> patching file
>>>>>> src/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
>>>>>> Hunk #2 succeeded at 156 with fuzz 1 (offset 3 lines).
>>>>>> patching file
>>>>> src/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
>>>>>> Hunk #2 succeeded at 247 (offset 2 lines).
>>>>>> patching file
>>>>>> src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
>>>>>> Hunk #1 succeeded at 78 (offset -1 lines).
>>>>>> Hunk #2 FAILED at 2515.
>>>>>> 1 out of 2 hunks FAILED -- saving rejects to file
>>>>>>
>>> src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java.rej
>>>>>> patching file
>>> src/test/org/apache/hadoop/hbase/client/TestHTable.java
>>>>>> Hunk #2 FAILED at 333.
>>>>>> 1 out of 2 hunks FAILED -- saving rejects to file
>>>>>> src/test/org/apache/hadoop/hbase/client/TestHTable.java.rej
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Marc Limotte [mailto:mslimotte@gmail.com]
>>>>>> Sent: Tuesday, January 19, 2010 10:26 PM
>>>>>> To: hbase-user@hadoop.apache.org
>>>>>> Subject: Re: Support for MultiGet / SQL In clause
>>>>>>
>>>>>> Sriram,
>>>>>>
>>>>>> Would a secondary index help you:
>>>>>>
>>>>>
>>>
http://hadoop.apache.org/hbase/docs/r0.20.2/api/org/apache/hadoop/hbase/
>>>>>> client/tableindexed/package-summary.html#package_description
>>>>>> .
>>>>>>
>>>>>> The index is stored in a separate table, but the index is managed
>>> for
>>>>>> you.
>>>>>>
>>>>>> I don't think you can do an arbitrary "in" query, though.  If the
>>> keys
>>>>>> that
>>>>>> you want to include in the "in" are reasonably close neighbors,
>>> you
>>>>>> could do
>>>>>> a scan and skip ones that are uninteresting.  You could also
>>>>>> try a
>>>>> batch
>>>>>> Get
>>>>>> by applying a separate patch, see
>>>>>> http://issues.apache.org/jira/browse/HBASE-1845.
>>>>>>
>>>>>> Marc Limotte
>>>>>>
>>>>>> On Tue, Jan 19, 2010 at 8:45 AM, Sriram Muthuswamy Chittathoor <
>>>>>> sriramc@ivycomptech.com> wrote:
>>>>>>
>>>>>>> Is there any support for this.  I want to do this
>>>>>>>
>>>>>>> 1.  Create a second table to maintain mapping between secondary
>>>>> column
>>>>>>> and the rowid's of the primary table
>>>>>>>
>>>>>>> 2.  Use this second table to get the rowid's to lookup from the
>>>>>> primary
>>>>>>> table using a SQL In like clause ---
>>>>>>>
>>>>>>> Basically I am doing this to speed up querying by  Non-row key
>>>>>> columns.
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Sriram C


Re: Support for MultiGet / SQL In clause

Posted by Daniel Washusen <da...@reactive.org>.
What does the regionserver.out log file say?  Maybe you are out of
memory...  The master web ui will report how much heap space the
region server is using...

On 24/01/2010, at 8:43 AM, "Sriram Muthuswamy Chittathoor"
<sriramc@ivycomptech.com
 > wrote:

> I created a new table with indexes.  Initially created 100000 rows
> and then did a scan.  At that time it was okay.  Then I started
> creating a million rows in a loop and then after some time I get
> this exception and the table disappeared (even from the hbase
> shell).  One other table also disappeared.
>
> This is very consistent.  I tried a few times and every time it is
> the same on creating a lot of rows.  Rows are not too big (just
> 6 columns in one family), each of, say, type long or string.  Created 3
> indexes -- 2 byte arrays and 1 long.
>
>
>
> Cur Rows : 349999
> Cur : 359999
> Cur : 369999
> Cur : 379999
> Cur Rows : 389999   <--  Crashed after this
> Exception in thread "main"
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
> contact region server 10.1.162.25:60020 for region POKERHH6,,1264279773695
> , row '0000392413', but failed after 10 attempts.
> Exceptions:
> java.io.IOException: Call to /10.1.162.25:60020 failed on local
> exception: java.io.EOFException
> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
> org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
>
>        at org.apache.hadoop.hbase.client.HConnectionManager
> $TableServers.getRegionServerWithRetries(HConnectionManager.java:1048)
>        at org.apache.hadoop.hbase.client.HConnectionManager
> $TableServers$3.doCall(HConnectionManager.java:1239)
>        at org.apache.hadoop.hbase.client.HConnectionManager
> $TableServers$Batch.process(HConnectionManager.java:1161)
>        at org.apache.hadoop.hbase.client.HConnectionManager
> $TableServers.processBatchOfRows(HConnectionManager.java:1247)
>        at org.apache.hadoop.hbase.client.HTable.flushCommits
> (HTable.java:609)
>        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:474)
>        at test.TestExtIndexedTable.main(TestExtIndexedTable.java:110)
> 10/01/23 16:21:28 INFO zookeeper.ZooKeeper: Closing session:
> 0x265caacc7a001b
> 10/01/23 16:21:28 INFO zookeeper.ClientCnxn: Closing ClientCnxn for
> session: 0x265caacc7a001b
> 10/01/23 16:21:28 INFO zookeeper.ClientCnxn: Exception while closing
> send thread for session 0x265caacc7a001b : Read error rc = -1
> java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
>
>
>
> -----Original Message-----
> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
> Stack
> Sent: Sunday, January 24, 2010 1:33 AM
> To: hbase-user@hadoop.apache.org
> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
> HBASE-1845
>
> On Sat, Jan 23, 2010 at 2:52 AM, Sriram Muthuswamy Chittathoor
> <sr...@ivycomptech.com> wrote:
>> Thanks all.  I messed it up when I was trying to upgrade to
>> 0.20.3.  I deleted the data directory and formatted it thinking it
>> would reset the whole cluster.
>>
>> I started fresh by deleting the data directory on all the nodes and
>> then everything worked.  I was also able to create the indexed
>> table using the 0.20.3 patch.  Let me run some tests on a few
>> million rows and see how it holds up.
>>
>> BTW --  what would be the right way when I move versions?  Do I
>> run migrate scripts to migrate the data to newer versions?
>>
> Just install the new binaries everywhere and restart or perform a rolling
> restart -- see http://wiki.apache.org/hadoop/Hbase/RollingRestart --
> if you want to avoid taking down your cluster during the upgrade.
>
> You'll be flagged on start if you need to run a migration but the general
> rule is that there (should) never be need of a migration between patch
> releases: e.g. between 0.20.2 and 0.20.3.  There may be need of
> migrations moving between minor numbers; e.g. from 0.19 to 0.20.
>
> Let us know how IHBase works out for you (indexed hbase).  It's a RAM
> hog but the speed improvement finding matching cells can be startling.
>
> St.Ack
>
>> -----Original Message-----
>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
>> Stack
>> Sent: Saturday, January 23, 2010 5:00 AM
>> To: hbase-user@hadoop.apache.org
>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>> HBASE-1845
>>
>> Check your master log.  Something is seriously off if you do not have
>> a reachable .META. table.
>> St.Ack
>>
>> On Fri, Jan 22, 2010 at 1:09 PM, Sriram Muthuswamy Chittathoor
>> <sr...@ivycomptech.com> wrote:
>>> I applied the hbase-0.20.3 version / hadoop 0.20.1.  But after
>>> starting
>>> hbase I keep getting the error below when I go to the hbase shell
>>>
>>> [ppoker@karisimbivir1 hbase-0.20.3]$ ./bin/hbase shell
>>> HBase Shell; enter 'help<RETURN>' for list of supported commands.
>>> Version: 0.20.3, r900041, Sat Jan 16 17:20:21 PST 2010
>>> hbase(main):001:0> list
>>> NativeException:
>>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
>>> contact region server null for region , row '', but failed after 7
>>> attempts.
>>> Exceptions:
>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>
>>>
>>>
>>> Also when I try to create a table programmatically I get this --
>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Attempting connection
>>> to
>>> server localhost/127.0.0.1:2181
>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Priming connection to
>>> java.nio.channels.SocketChannel[connected local=/127.0.0.1:43775
>>> remote=localhost/127.0.0.1:2181]
>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Server connection
>>> successful
>>> Exception in thread "main"
>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>        at
>>> org.apache.hadoop.hbase.client.HConnectionManager
>>> $TableServers.locateReg
>>> ionInMeta(HConnectionManager.java:684)
>>>        at
>>> org.apache.hadoop.hbase.client.HConnectionManager
>>> $TableServers.locateReg
>>> ion(HConnectionManager.java:634)
>>>        at
>>> org.apache.hadoop.hbase.client.HConnectionManager
>>> $TableServers.locateReg
>>> ion(HConnectionManager.java:601)
>>>        at
>>> org.apache.hadoop.hbase.client.HConnectionManager
>>> $TableServers.locateReg
>>> ionInMeta(HConnectionManager.java:675)
>>>        at
>>> org.apache.hadoop.hbase.client.HConnectionManager
>>> $TableServers.locateReg
>>> ion(HConnectionManager.java:638)
>>>        at
>>> org.apache.hadoop.hbase.client.HConnectionManager
>>> $TableServers.locateReg
>>> ion(HConnectionManager.java:601)
>>>        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:
>>> 128)
>>>        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:
>>> 106)
>>>        at test.CreateTable.main(CreateTable.java:36)
>>>
>>>
>>>
>>> Any clues ?
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Dan Washusen [mailto:dan@reactive.org]
>>> Sent: Friday, January 22, 2010 4:53 AM
>>> To: hbase-user@hadoop.apache.org
>>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>>> HBASE-1845
>>>
>>> If you want to give the "indexed" contrib package a try you'll
>>> need to
>>> do
>>> the following:
>>>
>>>   1. Include the contrib jars (export HBASE_CLASSPATH=`find
>>>   /path/to/hbase/hbase-0.20.3/contrib/indexed -name '*jar' | tr -s
>>> "\n"
>>> ":"`)
>>>   2. Set the 'hbase.hregion.impl' property to
>>>   'org.apache.hadoop.hbase.regionserver.IdxRegion' in your
>>> hbase-site.xml
>>>
>>> Once you've done that you can create a table with an index using:
>>>
>>>>     // define which qualifiers need an index (choosing the correct
>>> type)
>>>>     IdxColumnDescriptor columnDescriptor = new
>>>> IdxColumnDescriptor("columnFamily");
>>>>     columnDescriptor.addIndexDescriptor(
>>>>       new IdxIndexDescriptor("qualifier",
>>>> IdxQualifierType.BYTE_ARRAY)
>>>>     );
>>>>
>>>>     HTableDescriptor tableDescriptor = new HTableDescriptor
>>>> ("table");
>>>>     tableDescriptor.addFamily(columnDescriptor);
>>>>
>>>
>>> Then when you want to perform a scan with an index hint:
>>>
>>>>     Scan scan = new IdxScan(
>>>>           new Comparison("columnFamily", "qualifier",
>>>>               Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>>       );
>>>>
>>>
>>> You have to keep in mind that the index hint is only a hint.  It
>>> guarantees
>>> that your scan will get all rows that match the hint but you'll more
>>> than
>>> likely receive rows that don't.  For this reason I'd suggest that
>>> you
>>> also
>>> include a filter along with the scan:
>>>
>>>>       Scan scan = new IdxScan(
>>>>           new Comparison("columnFamily", "qualifier",
>>>>               Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>>       );
>>>>       scan.setFilter(
>>>>           new SingleColumnValueFilter(
>>>>               "columnFamily", "qualifer",
>>> CompareFilter.CompareOp.EQUAL,
>>>>               new BinaryComparator("foo")
>>>>           )
>>>>       );
>>>>
>>>
>>> Cheers,
>>> Dan
>>>
>>>
>>> 2010/1/22 stack <st...@duboce.net>
>>>
>>>>
>>> http://people.apache.org/~jdcryans/hbase-0.20.3-candidate-2/<http://peop
>>> le.apache.org/%7Ejdcryans/hbase-0.20.3-candidate-2/>
>>>>
>>>> There is a bit of documentation if you look at javadoc for the
>>>> 'indexed' contrib (This is what hbase-2073 is called on commit).
>>>>
>>>> St.Ack
>>>>
>>>> P.S. We had a thread going named "HBase bulk load".  You got all
>>>> the
>>>> answers you need on that one?
>>>>
>>>> On Thu, Jan 21, 2010 at 11:19 AM, Sriram Muthuswamy Chittathoor
>>>> <sr...@ivycomptech.com> wrote:
>>>>>
>>>>> Great.  Can I migrate to 0.20.3RC2 easily?  I am on 0.20.2.  Can you
>>>>> pass me the link?
>>>>>
>>>>> -----Original Message-----
>>>>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf
>>>>> Of
>>>>> stack
>>>>> Sent: Friday, January 22, 2010 12:42 AM
>>>>> To: hbase-user@hadoop.apache.org
>>>>> Subject: Re: Support for MultiGet / SQL In clause -- error in
>>>>> patch
>>>>> HBASE-1845
>>>>>
>>>>> IIRC, hbase-1845 was a sketch only and not yet complete.  It's
>>>>> probably rotted since then, anyway.
>>>>>
>>>>> Have you looked at hbase-2037, since committed and available in
>>>>> 0.20.3RC2?
>>>>>  Would this help you with your original problem?
>>>>>
>>>>> St.Ack
>>>>>
>>>>> On Thu, Jan 21, 2010 at 9:10 AM, Sriram Muthuswamy Chittathoor <
>>>>> sriramc@ivycomptech.com> wrote:
>>>>>
>>>>>> I tried applying the patch to the hbase source code (hbase 0.20.2)
>>>>>> and I get the errors below.  Do you know if this needs to be applied
>>>>>> to a specific hbase version?  Is there a version which works with
>>>>>> 0.20.2 or later?
>>>>>> Basically HRegionServer and HTable patching fails.
>>>>>>
>>>>>>
>>>>>> Thanks for the help
>>>>>>
>>>>>> patch -p0 -i batch.patch
>>>>>>
>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Get.java
>>>>>> Hunk #1 succeeded at 61 (offset 2 lines).
>>>>>> Hunk #2 succeeded at 347 (offset 31 lines).
>>>>>> patching file
>>> src/java/org/apache/hadoop/hbase/client/HConnection.java
>>>>>> patching file
>>>>>> src/java/org/apache/hadoop/hbase/client/HConnectionManager.java
>>>>>> Hunk #3 succeeded at 1244 (offset 6 lines).
>>>>>> patching file src/java/org/apache/hadoop/hbase/client/HTable.java
>>>>>> Hunk #2 succeeded at 73 (offset 8 lines).
>>>>>> Hunk #4 FAILED at 405.
>>>>>> Hunk #5 succeeded at 671 with fuzz 2 (offset 26 lines).
>>>>>> 1 out of 5 hunks FAILED -- saving rejects to file
>>>>>> src/java/org/apache/hadoop/hbase/client/HTable.java.rej
>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Multi.java
>>>>>> patching file
>>>>> src/java/org/apache/hadoop/hbase/client/MultiCallable.java
>>>>>> patching file
>>> src/java/org/apache/hadoop/hbase/client/MultiResult.java
>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Row.java
>>>>>> patching file
>>>>>> src/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
>>>>>> Hunk #2 succeeded at 156 with fuzz 1 (offset 3 lines).
>>>>>> patching file
>>>>> src/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
>>>>>> Hunk #2 succeeded at 247 (offset 2 lines).
>>>>>> patching file
>>>>>> src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
>>>>>> Hunk #1 succeeded at 78 (offset -1 lines).
>>>>>> Hunk #2 FAILED at 2515.
>>>>>> 1 out of 2 hunks FAILED -- saving rejects to file
>>>>>>
>>> src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java.rej
>>>>>> patching file
>>> src/test/org/apache/hadoop/hbase/client/TestHTable.java
>>>>>> Hunk #2 FAILED at 333.
>>>>>> 1 out of 2 hunks FAILED -- saving rejects to file
>>>>>> src/test/org/apache/hadoop/hbase/client/TestHTable.java.rej
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Marc Limotte [mailto:mslimotte@gmail.com]
>>>>>> Sent: Tuesday, January 19, 2010 10:26 PM
>>>>>> To: hbase-user@hadoop.apache.org
>>>>>> Subject: Re: Support for MultiGet / SQL In clause
>>>>>>
>>>>>> Sriram,
>>>>>>
>>>>>> Would a secondary index help you:
>>>>>>
>>>>>
>>> http://hadoop.apache.org/hbase/docs/r0.20.2/api/org/apache/hadoop/hbase/
>>>>>> client/tableindexed/package-summary.html#package_description
>>>>>> .
>>>>>>
>>>>>> The index is stored in a separate table, but the index is managed
>>> for
>>>>>> you.
>>>>>>
>>>>>> I don't think you can do an arbitrary "in" query, though.  If the
>>> keys
>>>>>> that
>>>>>> you want to include in the "in" are reasonably close neighbors,
>>> you
>>>>>> could do
>>>>>> a scan and skip ones that are uninteresting.  You could also
>>>>>> try a
>>>>> batch
>>>>>> Get
>>>>>> by applying a separate patch, see
>>>>>> http://issues.apache.org/jira/browse/HBASE-1845.
>>>>>>
>>>>>> Marc Limotte
>>>>>>
>>>>>> On Tue, Jan 19, 2010 at 8:45 AM, Sriram Muthuswamy Chittathoor <
>>>>>> sriramc@ivycomptech.com> wrote:
>>>>>>
>>>>>>> Is there any support for this?  I want to do this:
>>>>>>>
>>>>>>> 1.  Create a second table to maintain a mapping between a secondary
>>>>>>> column and the rowids of the primary table
>>>>>>>
>>>>>>> 2.  Use this second table to get the rowids to look up from the
>>>>>>> primary table using a SQL IN-like clause ---
>>>>>>>
>>>>>>> Basically I am doing this to speed up querying by non-row-key
>>>>>>> columns.
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Sriram C
>>>>>>>
>>>>>>>

RE: Support for MultiGet / SQL In clause

Posted by Sriram Muthuswamy Chittathoor <sr...@ivycomptech.com>.
I created a new table with indexes.  Initially created 100000 rows and then did a scan.  At that time it was okay.  Then I started creating a million rows in a loop and then after some time I get this exception and the table disappeared (even from the hbase shell).  One other table also disappeared.

This is very consistent.  I tried a few times and every time it is the same on creating a lot of rows.  Rows are not too big (just 6 columns in one family), each of, say, type long or string.  Created 3 indexes -- 2 byte arrays and 1 long.

  

Cur Rows : 349999
Cur : 359999
Cur : 369999
Cur : 379999
Cur Rows : 389999   <--  Crashed after this
Exception in thread "main" org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server 10.1.162.25:60020 for region POKERHH6,,1264279773695, row '0000392413', but failed after 10 attempts.
Exceptions:
java.io.IOException: Call to /10.1.162.25:60020 failed on local exception: java.io.EOFException
org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
org.apache.hadoop.hbase.TableNotFoundException: POKERHH6
org.apache.hadoop.hbase.TableNotFoundException: POKERHH6

        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1048)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$3.doCall(HConnectionManager.java:1239)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1161)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1247)
        at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:609)
        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:474)
        at test.TestExtIndexedTable.main(TestExtIndexedTable.java:110)
10/01/23 16:21:28 INFO zookeeper.ZooKeeper: Closing session: 0x265caacc7a001b
10/01/23 16:21:28 INFO zookeeper.ClientCnxn: Closing ClientCnxn for session: 0x265caacc7a001b
10/01/23 16:21:28 INFO zookeeper.ClientCnxn: Exception while closing send thread for session 0x265caacc7a001b : Read error rc = -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
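
For reference, a minimal sketch of the kind of load loop under discussion, assuming the stock 0.20.x client API; the family and qualifier names, buffer size, and flush interval are illustrative.  Flushing explicitly every N rows bounds the client-side write buffer and narrows down which batch was in flight when a region server drops:

import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class LoadLoopSketch {
  public static void main(String[] args) throws IOException {
    HTable table = new HTable(new HBaseConfiguration(), "POKERHH6");
    table.setAutoFlush(false);                 // buffer puts client-side
    table.setWriteBufferSize(2 * 1024 * 1024); // 2MB buffer (illustrative)
    for (long i = 0; i < 1000000; i++) {
      // Zero-padded row keys, matching the '0000392413' style above.
      Put put = new Put(Bytes.toBytes(String.format("%010d", i)));
      put.add(Bytes.toBytes("family"), Bytes.toBytes("qualifier"),
          Bytes.toBytes(i));
      table.put(put);
      if (i % 10000 == 9999) {
        table.flushCommits();                  // bound the in-flight batch
        System.out.println("Cur Rows : " + i);
      }
    }
    table.flushCommits();                      // push the final partial batch
  }
}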



-----Original Message-----
From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
Sent: Sunday, January 24, 2010 1:33 AM
To: hbase-user@hadoop.apache.org
Subject: Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

On Sat, Jan 23, 2010 at 2:52 AM, Sriram Muthuswamy Chittathoor
<sr...@ivycomptech.com> wrote:
> Thanks all.  I messed it up when I was trying to upgrade to 0.20.3.  I deleted the data directory and formatted it thinking it would reset the whole cluster.
>
> I started fresh by deleting the data directory on all the nodes and then everything worked.  I was also able to create the indexed table using the 0.20.3 patch.  Let me run some tests on a few million rows and see how it holds up.
>
> BTW --  what would be the right way when I move versions?  Do I run migrate scripts to migrate the data to newer versions?
>
Just install the new binaries everywhere and restart or perform a rolling
restart -- see http://wiki.apache.org/hadoop/Hbase/RollingRestart --
if you want to avoid taking down your cluster during the upgrade.

You'll be flagged on start if you need to run a migration but the general
rule is that there (should) never be need of a migration between patch
releases: e.g. between 0.20.2 and 0.20.3.  There may be need of
migrations moving between minor numbers; e.g. from 0.19 to 0.20.

Let us know how IHBase works out for you (indexed hbase).  It's a RAM
hog but the speed improvement finding matching cells can be startling.

St.Ack

> -----Original Message-----
> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
> Sent: Saturday, January 23, 2010 5:00 AM
> To: hbase-user@hadoop.apache.org
> Subject: Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845
>
> Check your master log.  Something is seriously off if you do not have
> a reachable .META. table.
> St.Ack
>
> On Fri, Jan 22, 2010 at 1:09 PM, Sriram Muthuswamy Chittathoor
> <sr...@ivycomptech.com> wrote:
>> I applied the hbase-0.20.3 version / hadoop 0.20.1.  But after starting
>> hbase I keep getting the error below when I go to the hbase shell
>>
>> [ppoker@karisimbivir1 hbase-0.20.3]$ ./bin/hbase shell
>> HBase Shell; enter 'help<RETURN>' for list of supported commands.
>> Version: 0.20.3, r900041, Sat Jan 16 17:20:21 PST 2010
>> hbase(main):001:0> list
>> NativeException:
>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
>> contact region server null for region , row '', but failed after 7
>> attempts.
>> Exceptions:
>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>
>>
>>
> Also when I try to create a table programmatically I get this --
>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Attempting connection to
>> server localhost/127.0.0.1:2181
>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Priming connection to
>> java.nio.channels.SocketChannel[connected local=/127.0.0.1:43775
>> remote=localhost/127.0.0.1:2181]
>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Server connection
>> successful
>> Exception in thread "main"
>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>        at
>> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateReg
>> ionInMeta(HConnectionManager.java:684)
>>        at
>> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateReg
>> ion(HConnectionManager.java:634)
>>        at
>> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateReg
>> ion(HConnectionManager.java:601)
>>        at
>> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateReg
>> ionInMeta(HConnectionManager.java:675)
>>        at
>> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateReg
>> ion(HConnectionManager.java:638)
>>        at
>> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateReg
>> ion(HConnectionManager.java:601)
>>        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:128)
>>        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:106)
>>        at test.CreateTable.main(CreateTable.java:36)
>>
>>
>>
>> Any clues ?
>>
>>
>>
>> -----Original Message-----
>> From: Dan Washusen [mailto:dan@reactive.org]
>> Sent: Friday, January 22, 2010 4:53 AM
>> To: hbase-user@hadoop.apache.org
>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>> HBASE-1845
>>
>> If you want to give the "indexed" contrib package a try you'll need to
>> do
>> the following:
>>
>>   1. Include the contrib jars (export HBASE_CLASSPATH=`find
>>   /path/to/hbase/hbase-0.20.3/contrib/indexed -name '*jar' | tr -s "\n"
>> ":"`)
>>   2. Set the 'hbase.hregion.impl' property to
>>   'org.apache.hadoop.hbase.regionserver.IdxRegion' in your
>> hbase-site.xml
>>
>> Once you've done that you can create a table with an index using:
>>
>>>     // define which qualifiers need an index (choosing the correct
>> type)
>>>     IdxColumnDescriptor columnDescriptor = new
>>> IdxColumnDescriptor("columnFamily");
>>>     columnDescriptor.addIndexDescriptor(
>>>       new IdxIndexDescriptor("qualifier", IdxQualifierType.BYTE_ARRAY)
>>>     );
>>>
>>>     HTableDescriptor tableDescriptor = new HTableDescriptor("table");
>>>     tableDescriptor.addFamily(columnDescriptor);
>>>
>>
>> Then when you want to perform a scan with an index hint:
>>
>>>     Scan scan = new IdxScan(
>>>           new Comparison("columnFamily", "qualifier",
>>>               Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>       );
>>>
>>
>> You have to keep in mind that the index hint is only a hint.  It
>> guarantees
>> that your scan will get all rows that match the hint but you'll more
>> than
>> likely receive rows that don't.  For this reason I'd suggest that you
>> also
>> include a filter along with the scan:
>>
>>>       Scan scan = new IdxScan(
>>>           new Comparison("columnFamily", "qualifier",
>>>               Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>       );
>>>       scan.setFilter(
>>>           new SingleColumnValueFilter(
>>>               "columnFamily", "qualifer",
>> CompareFilter.CompareOp.EQUAL,
>>>               new BinaryComparator("foo")
>>>           )
>>>       );
>>>
>>
>> Cheers,
>> Dan
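
Stitching Dan's fragments together, a hedged end-to-end sketch.  The Idx* classes and Comparison come from the 0.20.3 'indexed' contrib; their package paths and byte[] constructor arguments are assumptions here (the fragments above used plain strings), while HBaseAdmin, HTable, and the filter are the stock 0.20.x client API.  Table, family, and qualifier names are illustrative:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
// Contrib classes; package paths assumed:
import org.apache.hadoop.hbase.client.idx.IdxColumnDescriptor;
import org.apache.hadoop.hbase.client.idx.IdxIndexDescriptor;
import org.apache.hadoop.hbase.client.idx.IdxQualifierType;
import org.apache.hadoop.hbase.client.idx.IdxScan;
import org.apache.hadoop.hbase.client.idx.exp.Comparison;
import org.apache.hadoop.hbase.filter.CompareFilter;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class IdxScanSketch {
  public static void main(String[] args) throws Exception {
    HBaseConfiguration conf = new HBaseConfiguration();

    // 1. Declare an indexed qualifier on the family and create the table.
    IdxColumnDescriptor family = new IdxColumnDescriptor("columnFamily");
    family.addIndexDescriptor(new IdxIndexDescriptor(
        Bytes.toBytes("qualifier"), IdxQualifierType.BYTE_ARRAY));
    HTableDescriptor descriptor = new HTableDescriptor("table");
    descriptor.addFamily(family);
    new HBaseAdmin(conf).createTable(descriptor);

    // 2. Scan with the index hint, plus the filter Dan recommends so that
    //    rows the hint lets through are re-checked server-side.
    IdxScan scan = new IdxScan(new Comparison(
        Bytes.toBytes("columnFamily"), Bytes.toBytes("qualifier"),
        Comparison.Operator.EQ, Bytes.toBytes("foo")));
    scan.setFilter(new SingleColumnValueFilter(
        Bytes.toBytes("columnFamily"), Bytes.toBytes("qualifier"),
        CompareFilter.CompareOp.EQUAL, Bytes.toBytes("foo")));

    HTable table = new HTable(conf, "table");
    ResultScanner scanner = table.getScanner(scan);
    for (Result result : scanner) {
      System.out.println(result);
    }
    scanner.close();
  }
}
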
>>
>>
>> 2010/1/22 stack <st...@duboce.net>
>>
>>>
>> http://people.apache.org/~jdcryans/hbase-0.20.3-candidate-2/<http://peop
>> le.apache.org/%7Ejdcryans/hbase-0.20.3-candidate-2/>
>>>
>>> There is a bit of documentation if you look at javadoc for the
>>> 'indexed' contrib (This is what hbase-2073 is called on commit).
>>>
>>> St.Ack
>>>
>>> P.S. We had a thread going named "HBase bulk load".  You got all the
>>> answers you need on that one?
>>>
>>> On Thu, Jan 21, 2010 at 11:19 AM, Sriram Muthuswamy Chittathoor
>>> <sr...@ivycomptech.com> wrote:
>>> >
>>> > Great.  Can I migrate to 0.20.3RC2 easily?  I am on 0.20.2.  Can you
>>> > pass me the link?
>>> >
>>> > -----Original Message-----
>>> > From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
>>> > stack
>>> > Sent: Friday, January 22, 2010 12:42 AM
>>> > To: hbase-user@hadoop.apache.org
>>> > Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>>> > HBASE-1845
>>> >
>>> > IIRC, hbase-1845 was a sketch only and not yet complete.  It's
>>> > probably rotted since then, anyway.
>>> >
>>> > Have you looked at hbase-2037, since committed and available in
>>> > 0.20.3RC2?
>>> >  Would this help you with your original problem?
>>> >
>>> > St.Ack
>>> >
>>> > On Thu, Jan 21, 2010 at 9:10 AM, Sriram Muthuswamy Chittathoor <
>>> > sriramc@ivycomptech.com> wrote:
>>> >
>>> > > I tried applying the patch to the hbase source code (hbase 0.20.2)
>>> > > and I get the errors below.  Do you know if this needs to be applied
>>> > > to a specific hbase version?  Is there a version which works with
>>> > > 0.20.2 or later?
>>> > > Basically HRegionServer and HTable patching fails.
>>> > >
>>> > >
>>> > > Thanks for the help
>>> > >
>>> > > patch -p0 -i batch.patch
>>> > >
>>> > > patching file src/java/org/apache/hadoop/hbase/client/Get.java
>>> > > Hunk #1 succeeded at 61 (offset 2 lines).
>>> > > Hunk #2 succeeded at 347 (offset 31 lines).
>>> > > patching file
>> src/java/org/apache/hadoop/hbase/client/HConnection.java
>>> > > patching file
>>> > > src/java/org/apache/hadoop/hbase/client/HConnectionManager.java
>>> > > Hunk #3 succeeded at 1244 (offset 6 lines).
>>> > > patching file src/java/org/apache/hadoop/hbase/client/HTable.java
>>> > > Hunk #2 succeeded at 73 (offset 8 lines).
>>> > > Hunk #4 FAILED at 405.
>>> > > Hunk #5 succeeded at 671 with fuzz 2 (offset 26 lines).
>>> > > 1 out of 5 hunks FAILED -- saving rejects to file
>>> > > src/java/org/apache/hadoop/hbase/client/HTable.java.rej
>>> > > patching file src/java/org/apache/hadoop/hbase/client/Multi.java
>>> > > patching file
>>> > src/java/org/apache/hadoop/hbase/client/MultiCallable.java
>>> > > patching file
>> src/java/org/apache/hadoop/hbase/client/MultiResult.java
>>> > > patching file src/java/org/apache/hadoop/hbase/client/Row.java
>>> > > patching file
>>> > > src/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
>>> > > Hunk #2 succeeded at 156 with fuzz 1 (offset 3 lines).
>>> > > patching file
>>> > src/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
>>> > > Hunk #2 succeeded at 247 (offset 2 lines).
>>> > > patching file
>>> > > src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
>>> > > Hunk #1 succeeded at 78 (offset -1 lines).
>>> > > Hunk #2 FAILED at 2515.
>>> > > 1 out of 2 hunks FAILED -- saving rejects to file
>>> > >
>> src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java.rej
>>> > > patching file
>> src/test/org/apache/hadoop/hbase/client/TestHTable.java
>>> > > Hunk #2 FAILED at 333.
>>> > > 1 out of 2 hunks FAILED -- saving rejects to file
>>> > > src/test/org/apache/hadoop/hbase/client/TestHTable.java.rej
>>> > >
>>> > >
>>> > >
>>> > >
>>> > > -----Original Message-----
>>> > > From: Marc Limotte [mailto:mslimotte@gmail.com]
>>> > > Sent: Tuesday, January 19, 2010 10:26 PM
>>> > > To: hbase-user@hadoop.apache.org
>>> > > Subject: Re: Support for MultiGet / SQL In clause
>>> > >
>>> > > Sriram,
>>> > >
>>> > > Would a secondary index help you:
>>> > >
>>> >
>> http://hadoop.apache.org/hbase/docs/r0.20.2/api/org/apache/hadoop/hbase/
>>> > > client/tableindexed/package-summary.html#package_description
>>> > > .
>>> > >
>>> > > The index is stored in a separate table, but the index is managed
>> for
>>> > > you.
>>> > >
>>> > > I don't think you can do an arbitrary "in" query, though.  If the
>> keys
>>> > > that
>>> > > you want to include in the "in" are reasonably close neighbors,
>> you
>>> > > could do
>>> > > a scan and skip ones that are uninteresting.  You could also try a
>>> > batch
>>> > > Get
>>> > > by applying a separate patch, see
>>> > > http://issues.apache.org/jira/browse/HBASE-1845.
>>> > >
>>> > > Marc Limotte
>>> > >
>>> > > On Tue, Jan 19, 2010 at 8:45 AM, Sriram Muthuswamy Chittathoor <
>>> > > sriramc@ivycomptech.com> wrote:
>>> > >
>>> > > > Is there any support for this?  I want to do this:
>>> > > >
>>> > > > 1.  Create a second table to maintain a mapping between a secondary
>>> > > > column and the rowids of the primary table
>>> > > >
>>> > > > 2.  Use this second table to get the rowids to look up from the
>>> > > > primary table using a SQL IN-like clause ---
>>> > > >
>>> > > > Basically I am doing this to speed up querying by non-row-key
>>> > > > columns.
>>> > > >
>>> > > > Thanks
>>> > > >
>>> > > > Sriram C
>>> > > >
>>> > > >


Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

Posted by Stack <st...@duboce.net>.
On Sat, Jan 23, 2010 at 2:52 AM, Sriram Muthuswamy Chittathoor
<sr...@ivycomptech.com> wrote:
> Thanks all.  I messed it up when I was trying to upgrade to 0.20.3.  I deleted the data directory and formatted it thinking it would reset the whole cluster.
>
> I started fresh by deleting the data directory on all the nodes and then everything worked.  I was also able to create the indexed table using the 0.20.3 patch.  Let me run some tests on a few million rows and see how it holds up.
>
> BTW --  what would be the right way when I move versions?  Do I run migrate scripts to migrate the data to newer versions?
>
Just install the new binaries everywhere and restart or perform a rolling
restart -- see http://wiki.apache.org/hadoop/Hbase/RollingRestart --
if you want to avoid taking down your cluster during the upgrade.

You'll be flagged on start if you need to run a migration but the general
rule is that there (should) never be need of a migration between patch
releases: e.g. between 0.20.2 and 0.20.3.  There may be need of
migrations moving between minor numbers; e.g. from 0.19 to 0.20.

Let us know how IHBase works out for you (indexed hbase).  It's a RAM
hog but the speed improvement finding matching cells can be startling.

St.Ack

> -----Original Message-----
> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
> Sent: Saturday, January 23, 2010 5:00 AM
> To: hbase-user@hadoop.apache.org
> Subject: Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845
>
> Check your master log.  Something is seriously off if you do not have
> a reachable .META. table.
> St.Ack
>
> On Fri, Jan 22, 2010 at 1:09 PM, Sriram Muthuswamy Chittathoor
> <sr...@ivycomptech.com> wrote:
>> I applied the hbase-0.20.3 version / hadoop 0.20.1.  But after starting
>> hbase I keep getting the error below when I go to the hbase shell
>>
>> [ppoker@karisimbivir1 hbase-0.20.3]$ ./bin/hbase shell
>> HBase Shell; enter 'help<RETURN>' for list of supported commands.
>> Version: 0.20.3, r900041, Sat Jan 16 17:20:21 PST 2010
>> hbase(main):001:0> list
>> NativeException:
>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
>> contact region server null for region , row '', but failed after 7
>> attempts.
>> Exceptions:
>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>
>>
>>
>> Also when I try to create a table programmatically I get this --
>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Attempting connection to
>> server localhost/127.0.0.1:2181
>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Priming connection to
>> java.nio.channels.SocketChannel[connected local=/127.0.0.1:43775
>> remote=localhost/127.0.0.1:2181]
>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Server connection
>> successful
>> Exception in thread "main"
>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>        at
>> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateReg
>> ionInMeta(HConnectionManager.java:684)
>>        at
>> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateReg
>> ion(HConnectionManager.java:634)
>>        at
>> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateReg
>> ion(HConnectionManager.java:601)
>>        at
>> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateReg
>> ionInMeta(HConnectionManager.java:675)
>>        at
>> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateReg
>> ion(HConnectionManager.java:638)
>>        at
>> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateReg
>> ion(HConnectionManager.java:601)
>>        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:128)
>>        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:106)
>>        at test.CreateTable.main(CreateTable.java:36)
>>
>>
>>
>> Any clues ?
>>
>>
>>
>> -----Original Message-----
>> From: Dan Washusen [mailto:dan@reactive.org]
>> Sent: Friday, January 22, 2010 4:53 AM
>> To: hbase-user@hadoop.apache.org
>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>> HBASE-1845
>>
>> If you want to give the "indexed" contrib package a try you'll need to
>> do
>> the following:
>>
>>   1. Include the contrib jars (export HBASE_CLASSPATH=`find
>>   /path/to/hbase/hbase-0.20.3/contrib/indexed -name '*jar' | tr -s "\n"
>> ":"`)
>>   2. Set the 'hbase.hregion.impl' property to
>>   'org.apache.hadoop.hbase.regionserver.IdxRegion' in your
>> hbase-site.xml
>>
>> Once you've done that you can create a table with an index using:
>>
>>>     // define which qualifiers need an index (choosing the correct
>> type)
>>>     IdxColumnDescriptor columnDescriptor = new
>>> IdxColumnDescriptor("columnFamily");
>>>     columnDescriptor.addIndexDescriptor(
>>>       new IdxIndexDescriptor("qualifier", IdxQualifierType.BYTE_ARRAY)
>>>     );
>>>
>>>     HTableDescriptor tableDescriptor = new HTableDescriptor("table");
>>>     tableDescriptor.addFamily(columnDescriptor);
>>>
>>
>> Then when you want to perform a scan with an index hint:
>>
>>>     Scan scan = new IdxScan(
>>>           new Comparison("columnFamily", "qualifier",
>>>               Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>       );
>>>
>>
>> You have to keep in mind that the index hint is only a hint.  It
>> guarantees
>> that your scan will get all rows that match the hint but you'll more
>> than
>> likely receive rows that don't.  For this reason I'd suggest that you
>> also
>> include a filter along with the scan:
>>
>>>       Scan scan = new IdxScan(
>>>           new Comparison("columnFamily", "qualifier",
>>>               Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>       );
>>>       scan.setFilter(
>>>           new SingleColumnValueFilter(
>>>               "columnFamily", "qualifer",
>> CompareFilter.CompareOp.EQUAL,
>>>               new BinaryComparator("foo")
>>>           )
>>>       );
>>>
>>
>> Cheers,
>> Dan
>>
>>
>> 2010/1/22 stack <st...@duboce.net>
>>
>>>
>> http://people.apache.org/~jdcryans/hbase-0.20.3-candidate-2/<http://peop
>> le.apache.org/%7Ejdcryans/hbase-0.20.3-candidate-2/>
>>>
>>> There is a bit of documentation if you look at javadoc for the
>>> 'indexed' contrib (This is what hbase-2073 is called on commit).
>>>
>>> St.Ack
>>>
>>> P.S. We had a thread going named "HBase bulk load".  You got all the
>>> answers you need on that one?
>>>
>>> On Thu, Jan 21, 2010 at 11:19 AM, Sriram Muthuswamy Chittathoor
>>> <sr...@ivycomptech.com> wrote:
>>> >
>>> > Great.  Can I migrate to 0.20.3RC2 easily?  I am on 0.20.2.  Can you
>>> > pass me the link?
>>> >
>>> > -----Original Message-----
>>> > From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
>>> > stack
>>> > Sent: Friday, January 22, 2010 12:42 AM
>>> > To: hbase-user@hadoop.apache.org
>>> > Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>>> > HBASE-1845
>>> >
>>> > IIRC, hbase-1845 was a sketch only and not yet complete.  It's
>>> > probably rotted since then, anyway.
>>> >
>>> > Have you looked at hbase-2037, since committed and available in
>>> > 0.20.3RC2?
>>> >  Would this help you with your original problem?
>>> >
>>> > St.Ack
>>> >
>>> > On Thu, Jan 21, 2010 at 9:10 AM, Sriram Muthuswamy Chittathoor <
>>> > sriramc@ivycomptech.com> wrote:
>>> >
>>> > > I tried applying the patch to the hbase source code (hbase 0.20.2)
>>> > > and I get the errors below.  Do you know if this needs to be applied
>>> > > to a specific hbase version?  Is there a version which works with
>>> > > 0.20.2 or later?
>>> > > Basically HRegionServer and HTable patching fails.
>>> > >
>>> > >
>>> > > Thanks for the help
>>> > >
>>> > > patch -p0 -i batch.patch
>>> > >
>>> > > patching file src/java/org/apache/hadoop/hbase/client/Get.java
>>> > > Hunk #1 succeeded at 61 (offset 2 lines).
>>> > > Hunk #2 succeeded at 347 (offset 31 lines).
>>> > > patching file
>> src/java/org/apache/hadoop/hbase/client/HConnection.java
>>> > > patching file
>>> > > src/java/org/apache/hadoop/hbase/client/HConnectionManager.java
>>> > > Hunk #3 succeeded at 1244 (offset 6 lines).
>>> > > patching file src/java/org/apache/hadoop/hbase/client/HTable.java
>>> > > Hunk #2 succeeded at 73 (offset 8 lines).
>>> > > Hunk #4 FAILED at 405.
>>> > > Hunk #5 succeeded at 671 with fuzz 2 (offset 26 lines).
>>> > > 1 out of 5 hunks FAILED -- saving rejects to file
>>> > > src/java/org/apache/hadoop/hbase/client/HTable.java.rej
>>> > > patching file src/java/org/apache/hadoop/hbase/client/Multi.java
>>> > > patching file
>>> > src/java/org/apache/hadoop/hbase/client/MultiCallable.java
>>> > > patching file
>> src/java/org/apache/hadoop/hbase/client/MultiResult.java
>>> > > patching file src/java/org/apache/hadoop/hbase/client/Row.java
>>> > > patching file
>>> > > src/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
>>> > > Hunk #2 succeeded at 156 with fuzz 1 (offset 3 lines).
>>> > > patching file
>>> > src/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
>>> > > Hunk #2 succeeded at 247 (offset 2 lines).
>>> > > patching file
>>> > > src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
>>> > > Hunk #1 succeeded at 78 (offset -1 lines).
>>> > > Hunk #2 FAILED at 2515.
>>> > > 1 out of 2 hunks FAILED -- saving rejects to file
>>> > >
>> src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java.rej
>>> > > patching file
>> src/test/org/apache/hadoop/hbase/client/TestHTable.java
>>> > > Hunk #2 FAILED at 333.
>>> > > 1 out of 2 hunks FAILED -- saving rejects to file
>>> > > src/test/org/apache/hadoop/hbase/client/TestHTable.java.rej
>>> > >
>>> > >
>>> > >
>>> > >
>>> > > -----Original Message-----
>>> > > From: Marc Limotte [mailto:mslimotte@gmail.com]
>>> > > Sent: Tuesday, January 19, 2010 10:26 PM
>>> > > To: hbase-user@hadoop.apache.org
>>> > > Subject: Re: Support for MultiGet / SQL In clause
>>> > >
>>> > > Sriram,
>>> > >
>>> > > Would a secondary index help you:
>>> > >
>>> >
>> http://hadoop.apache.org/hbase/docs/r0.20.2/api/org/apache/hadoop/hbase/
>>> > > client/tableindexed/package-summary.html#package_description
>>> > > .
>>> > >
>>> > > The index is stored in a separate table, but the index is managed
>> for
>>> > > you.
>>> > >
>>> > > I don't think you can do an arbitrary "in" query, though.  If the
>> keys
>>> > > that
>>> > > you want to include in the "in" are reasonably close neighbors,
>> you
>>> > > could do
>>> > > a scan and skip ones that are uninteresting.  You could also try a
>>> > batch
>>> > > Get
>>> > > by applying a separate patch, see
>>> > > http://issues.apache.org/jira/browse/HBASE-1845.
>>> > >
>>> > > Marc Limotte
>>> > >
>>> > > On Tue, Jan 19, 2010 at 8:45 AM, Sriram Muthuswamy Chittathoor <
>>> > > sriramc@ivycomptech.com> wrote:
>>> > >
>>> > > > Is there any support for this?  I want to do this:
>>> > > >
>>> > > > 1.  Create a second table to maintain a mapping between a secondary
>>> > > > column and the rowids of the primary table
>>> > > >
>>> > > > 2.  Use this second table to get the rowids to look up from the
>>> > > > primary table using a SQL IN-like clause ---
>>> > > >
>>> > > > Basically I am doing this to speed up querying by non-row-key
>>> > > > columns.
>>> > > >
>>> > > > Thanks
>>> > > >
>>> > > > Sriram C
>>> > > >
>>> > > >

RE: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

Posted by Sriram Muthuswamy Chittathoor <sr...@ivycomptech.com>.
Thanks all.  I messed it up when I was trying to upgrade to 0.20.3.  I deleted the data directory and formatted it thinking it would reset the whole cluster.

I started fresh by deleting the data directory on all the nodes and then everything worked.  I was also able to create the indexed table using the 0.20.3 patch.  Let me run some tests on a few million rows and see how it holds up. 

BTW --  what would be the right way when I move versions?  Do I run migrate scripts to migrate the data to newer versions?

-----Original Message-----
From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
Sent: Saturday, January 23, 2010 5:00 AM
To: hbase-user@hadoop.apache.org
Subject: Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

Check your master log.  Something is seriously off if you do not have
a reachable .META. table.
St.Ack



Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

Posted by Stack <st...@duboce.net>.
Check your master log.  Something is seriously off if you do not have
a reachable .META. table.
St.Ack


Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

Posted by Dan Washusen <da...@reactive.org>.
Not sure what's going on here...

Could you send along the logs/*regionserver*.log and logs/*master*.log
files?


RE: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

Posted by Sriram Muthuswamy Chittathoor <sr...@ivycomptech.com>.
I applied the hbase-0.20.3 version / hadoop 0.20.1.  But after starting
hbase I keep getting the error below when I go to the hbase shell:

[ppoker@karisimbivir1 hbase-0.20.3]$ ./bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Version: 0.20.3, r900041, Sat Jan 16 17:20:21 PST 2010
hbase(main):001:0> list
NativeException:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
contact region server null for region , row '', but failed after 7
attempts.
Exceptions:
org.apache.hadoop.hbase.TableNotFoundException: .META.
org.apache.hadoop.hbase.TableNotFoundException: .META.
org.apache.hadoop.hbase.TableNotFoundException: .META.
org.apache.hadoop.hbase.TableNotFoundException: .META.
org.apache.hadoop.hbase.TableNotFoundException: .META.
org.apache.hadoop.hbase.TableNotFoundException: .META.
org.apache.hadoop.hbase.TableNotFoundException: .META.



Also when I try to create a table programmatically I get this --
10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Attempting connection to
server localhost/127.0.0.1:2181
10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Priming connection to
java.nio.channels.SocketChannel[connected local=/127.0.0.1:43775
remote=localhost/127.0.0.1:2181]
10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Server connection
successful
Exception in thread "main"
org.apache.hadoop.hbase.TableNotFoundException: .META.
        at
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateReg
ionInMeta(HConnectionManager.java:684)
        at
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateReg
ion(HConnectionManager.java:634)
        at
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateReg
ion(HConnectionManager.java:601)
        at
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateReg
ionInMeta(HConnectionManager.java:675)
        at
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateReg
ion(HConnectionManager.java:638)
        at
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateReg
ion(HConnectionManager.java:601)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:128)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:106)
        at test.CreateTable.main(CreateTable.java:36)



Any clues?



-----Original Message-----
From: Dan Washusen [mailto:dan@reactive.org] 
Sent: Friday, January 22, 2010 4:53 AM
To: hbase-user@hadoop.apache.org
Subject: Re: Support for MultiGet / SQL In clause -- error in patch
HBASE-1845

If you want to give the "indexed" contrib package a try you'll need to do
the following:

   1. Include the contrib jars (export HBASE_CLASSPATH=(`find
   /path/to/hbase/hbase-0.20.3/contrib/indexed -name '*jar' | tr -s "\n"
":"`)
   2. Set the 'hbase.hregion.impl' property to
   'org.apache.hadoop.hbase.regionserver.IdxRegion' in your
hbase-site.xml

Once you've done that you can create a table with an index using:

>     // define which qualifiers need an index (choosing the correct type)
>     IdxColumnDescriptor columnDescriptor = new
> IdxColumnDescriptor("columnFamily");
>     columnDescriptor.addIndexDescriptor(
>       new IdxIndexDescriptor("qualifier", IdxQualifierType.BYTE_ARRAY)
>     );
>
>     HTableDescriptor tableDescriptor = new HTableDescriptor("table");
>     tableDescriptor.addFamily(columnDescriptor);
>

Then when you want to perform a scan with an index hint:

>     Scan scan = new IdxScan(
>           new Comparison("columnFamily", "qualifier",
>               Comparison.Operator.EQ, Bytes.toBytes("foo"))
>       );
>

You have to keep in mind that the index hint is only a hint.  It guarantees
that your scan will get all rows that match the hint but you'll more than
likely receive rows that don't.  For this reason I'd suggest that you also
include a filter along with the scan:

>       Scan scan = new IdxScan(
>           new Comparison("columnFamily", "qualifier",
>               Comparison.Operator.EQ, Bytes.toBytes("foo"))
>       );
>       scan.setFilter(
>           new SingleColumnValueFilter(
>               "columnFamily", "qualifier", CompareFilter.CompareOp.EQUAL,
>               new BinaryComparator("foo")
>           )
>       );
>

Cheers,
Dan
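
Pulled together, the whole flow looks roughly like the following sketch.
The idx package names and constructor signatures are taken on faith from
the snippets above (check them against the contrib's javadoc; the real
API may want byte[] rather than String arguments), and the table, family
and qualifier names are made up.  It assumes the two setup steps listed
earlier (contrib jars on the classpath, 'hbase.hregion.impl' set):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.idx.IdxColumnDescriptor;
    import org.apache.hadoop.hbase.client.idx.IdxIndexDescriptor;
    import org.apache.hadoop.hbase.client.idx.IdxQualifierType;
    import org.apache.hadoop.hbase.client.idx.IdxScan;
    import org.apache.hadoop.hbase.client.idx.exp.Comparison;
    import org.apache.hadoop.hbase.filter.BinaryComparator;
    import org.apache.hadoop.hbase.filter.CompareFilter;
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public class IdxExample {
      public static void main(String[] args) throws Exception {
        HBaseConfiguration conf = new HBaseConfiguration();

        // Create a table whose 'columnFamily' carries an index on
        // 'qualifier'.
        IdxColumnDescriptor family = new IdxColumnDescriptor("columnFamily");
        family.addIndexDescriptor(
            new IdxIndexDescriptor("qualifier", IdxQualifierType.BYTE_ARRAY));
        HTableDescriptor descriptor = new HTableDescriptor("table");
        descriptor.addFamily(family);
        new HBaseAdmin(conf).createTable(descriptor);

        // Scan with the index hint, and keep the filter: the hint only
        // guarantees no misses, not an absence of false positives.
        IdxScan scan = new IdxScan(
            new Comparison("columnFamily", "qualifier",
                Comparison.Operator.EQ, Bytes.toBytes("foo")));
        scan.setFilter(new SingleColumnValueFilter(
            Bytes.toBytes("columnFamily"), Bytes.toBytes("qualifier"),
            CompareFilter.CompareOp.EQUAL,
            new BinaryComparator(Bytes.toBytes("foo"))));

        HTable table = new HTable(conf, "table");
        ResultScanner scanner = table.getScanner(scan);
        for (Result row : scanner) {
          System.out.println(row);
        }
        scanner.close();
      }
    }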


2010/1/22 stack <st...@duboce.net>

>
> http://people.apache.org/~jdcryans/hbase-0.20.3-candidate-2/
>
> There is a bit of documentation if you look at javadoc for the
> 'indexed' contrib (This is what hbase-2073 is called on commit).
>
> St.Ack
>
> P.S. We had a thread going named "HBase bulk load".  You got all the
> answers you need on that one?
>
> On Thu, Jan 21, 2010 at 11:19 AM, Sriram Muthuswamy Chittathoor
> <sr...@ivycomptech.com> wrote:
> >
> > Great.  Can I migrate to 0.20.3RC2 easily.  I am on 0.20.2. Can u pass
> > me the link
> >
> > -----Original Message-----
> > From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
> > stack
> > Sent: Friday, January 22, 2010 12:42 AM
> > To: hbase-user@hadoop.apache.org
> > Subject: Re: Support for MultiGet / SQL In clause -- error in patch
> > HBASE-1845
> >
> > IIRC, hbase-1845 was a sketch only and not yet complete.  Its probably
> > rotted since any ways.
> >
> > Have you looked at hbase-2037 since committed and available in
> > 0.20.3RC2.
> >  Would this help you with your original problem?
> >
> > St.Ack
> >
> > On Thu, Jan 21, 2010 at 9:10 AM, Sriram Muthuswamy Chittathoor <
> > sriramc@ivycomptech.com> wrote:
> >
> > > I tried applying the patch to the hbase source code  hbase 0.20.2 and
> > > I get the errors below.  Do you know if this needs to be applied to a
> > > specific hbase version. Is there a version which works with 0.20.2 or
> > > later ??
> > > Basically HRegionServer  and HTable patching fails.
> > >
> > >
> > > Thanks for the help
> > >
> > > patch -p0 -i batch.patch
> > >
> > > patching file src/java/org/apache/hadoop/hbase/client/Get.java
> > > Hunk #1 succeeded at 61 (offset 2 lines).
> > > Hunk #2 succeeded at 347 (offset 31 lines).
> > > patching file
src/java/org/apache/hadoop/hbase/client/HConnection.java
> > > patching file
> > > src/java/org/apache/hadoop/hbase/client/HConnectionManager.java
> > > Hunk #3 succeeded at 1244 (offset 6 lines).
> > > patching file src/java/org/apache/hadoop/hbase/client/HTable.java
> > > Hunk #2 succeeded at 73 (offset 8 lines).
> > > Hunk #4 FAILED at 405.
> > > Hunk #5 succeeded at 671 with fuzz 2 (offset 26 lines).
> > > 1 out of 5 hunks FAILED -- saving rejects to file
> > > src/java/org/apache/hadoop/hbase/client/HTable.java.rej
> > > patching file src/java/org/apache/hadoop/hbase/client/Multi.java
> > > patching file
> > src/java/org/apache/hadoop/hbase/client/MultiCallable.java
> > > patching file
src/java/org/apache/hadoop/hbase/client/MultiResult.java
> > > patching file src/java/org/apache/hadoop/hbase/client/Row.java
> > > patching file
> > > src/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
> > > Hunk #2 succeeded at 156 with fuzz 1 (offset 3 lines).
> > > patching file
> > src/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
> > > Hunk #2 succeeded at 247 (offset 2 lines).
> > > patching file
> > > src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
> > > Hunk #1 succeeded at 78 (offset -1 lines).
> > > Hunk #2 FAILED at 2515.
> > > 1 out of 2 hunks FAILED -- saving rejects to file
> > >
src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java.rej
> > > patching file
src/test/org/apache/hadoop/hbase/client/TestHTable.java
> > > Hunk #2 FAILED at 333.
> > > 1 out of 2 hunks FAILED -- saving rejects to file
> > > src/test/org/apache/hadoop/hbase/client/TestHTable.java.rej
> > >
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: Marc Limotte [mailto:mslimotte@gmail.com]
> > > Sent: Tuesday, January 19, 2010 10:26 PM
> > > To: hbase-user@hadoop.apache.org
> > > Subject: Re: Support for MultiGet / SQL In clause
> > >
> > > Sriram,
> > >
> > > Would a secondary index help you:
> > >
> >
> > > http://hadoop.apache.org/hbase/docs/r0.20.2/api/org/apache/hadoop/hbase/client/tableindexed/package-summary.html#package_description
> > > .
> > >
> > > The index is stored in a separate table, but the index is managed
for
> > > you.
> > >
> > > I don't think you can do an arbitrary "in" query, though.  If the
keys
> > > that
> > > you want to include in the "in" are reasonably close neighbors,
you
> > > could do
> > > a scan and skip ones that are uninteresting.  You could also try a
> > batch
> > > Get
> > > by applying a separate patch, see
> > > http://issues.apache.org/jira/browse/HBASE-1845.
> > >
> > > Marc Limotte
> > >
> > > On Tue, Jan 19, 2010 at 8:45 AM, Sriram Muthuswamy Chittathoor <
> > > sriramc@ivycomptech.com> wrote:
> > >
> > > > Is there any support for this.  I want to do this
> > > >
> > > > 1.  Create a second table to maintain mapping between secondary
> > column
> > > > and the rowid's of the primary table
> > > >
> > > > 2.  Use this second table to get the rowid's to lookup from the
> > > primary
> > > > table using a SQL In like clause ---
> > > >
> > > > Basically I am doing this to speed up querying by  Non-row key
> > > columns.
> > > >
> > > > Thanks
> > > >
> > > > Sriram C
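
For concreteness, the two-table scheme proposed in the quoted message
can be driven with the stock 0.20 client API.  A rough sketch follows
(made-up table, family and qualifier names, no error handling); note
that the "In" clause ends up as a client-side loop of Gets, which is
exactly the round-trip cost that HBASE-1845/HBASE-2037 aim to batch
away:

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ManualSecondaryIndex {
      // On write: index row key = secondary value + primary row key, so
      // all primary keys for one value sort together in the index table.
      static void index(HTable indexTable, byte[] value, byte[] primaryKey)
          throws Exception {
        Put put = new Put(Bytes.add(value, primaryKey));
        put.add(Bytes.toBytes("ref"), Bytes.toBytes("pk"), primaryKey);
        indexTable.put(put);
      }

      // On read: scan the index for one value, then Get each primary row.
      // 0.20 has no multi-get, hence the loop.
      static List<Result> lookup(HTable indexTable, HTable primaryTable,
          byte[] value) throws Exception {
        // Crude stop row: one byte past the value prefix.
        Scan scan = new Scan(value, Bytes.add(value, new byte[] {(byte) 0xff}));
        List<Result> results = new ArrayList<Result>();
        ResultScanner scanner = indexTable.getScanner(scan);
        for (Result indexRow : scanner) {
          byte[] primaryKey =
              indexRow.getValue(Bytes.toBytes("ref"), Bytes.toBytes("pk"));
          results.add(primaryTable.get(new Get(primaryKey)));
        }
        scanner.close();
        return results;
      }
    }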

Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

Posted by Dan Washusen <da...@reactive.org>.
This error looks like you might be setting the 'hbase.regionserver.impl'
property to "org.apache.hadoop.hbase.regionserver.IdxRegion" instead of the
'hbase.hregion.impl' property.  Could you ensure that you have the following
in your hbase-site.xml:

> ...
>
> <property>
>     <name>hbase.hregion.impl</name>
>     <value>org.apache.hadoop.hbase.regionserver.IdxRegion</value>
> </property>
> ...
>


2010/1/23 Sriram Muthuswamy Chittathoor <sr...@ivycomptech.com>

> I installed hbase-0.20.3.
>
> 1. Can I copy my hbase data directory which had old data.
> 2. I get this regionserver error --
>
> Fri Jan 22 11:03:33 EST 2010 Starting regionserver on
> morungole.ivycomptech.co.in
> ulimit -n 8192
> 2010-01-22 11:03:34,694 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> vmInputArguments=[-Xmx1000m, -XX:+HeapDumpOnOutOfMemoryError,
> -XX:+UseConcMarkSweepGC, -XX:+CMSIncrementalMode,
> -Dhbase.log.dir=/home/ppoker/test/hbase-0.20.3/bin/../logs,
> -Dhbase.log.file=hbase-ppoker-regionserver-morungole.ivycomptech.co.in.l
> og, -Dhbase.home.dir=/home/ppoker/test/hbase-0.20.3/bin/..,
> -Dhbase.id.str=ppoker, -Dhbase.root.logger=INFO,DRFA,
> -Djava.library.path=/home/ppoker/test/hbase-0.20.3/bin/../lib/native/Lin
> ux-i386-32]
> 2010-01-22 11:03:34,705 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer: Can not start region
> server because java.lang.NoSuchMethodException:
> org.apache.hadoop.hbase.regionserver.IdxRegion.<init>(org.apache.hadoop.
> hbase.HBaseConfiguration)
>        at java.lang.Class.getConstructor0(Class.java:2706)
>        at java.lang.Class.getConstructor(Class.java:1657)
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.doMain(HRegionServer.
> java:2429)
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.ja
> va:2499)
>

RE: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

Posted by Sriram Muthuswamy Chittathoor <sr...@ivycomptech.com>.
I installed hbase-0.20.3.  

1. Can I copy my hbase data directory which had old data?
2. I get this regionserver error --   

Fri Jan 22 11:03:33 EST 2010 Starting regionserver on morungole.ivycomptech.co.in
ulimit -n 8192
2010-01-22 11:03:34,694 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: vmInputArguments=[-Xmx1000m, -XX:+HeapDumpOnOutOfMemoryError, -XX:+UseConcMarkSweepGC, -XX:+CMSIncrementalMode, -Dhbase.log.dir=/home/ppoker/test/hbase-0.20.3/bin/../logs, -Dhbase.log.file=hbase-ppoker-regionserver-morungole.ivycomptech.co.in.log, -Dhbase.home.dir=/home/ppoker/test/hbase-0.20.3/bin/.., -Dhbase.id.str=ppoker, -Dhbase.root.logger=INFO,DRFA, -Djava.library.path=/home/ppoker/test/hbase-0.20.3/bin/../lib/native/Linux-i386-32]
2010-01-22 11:03:34,705 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Can not start region server because java.lang.NoSuchMethodException: org.apache.hadoop.hbase.regionserver.IdxRegion.<init>(org.apache.hadoop.hbase.HBaseConfiguration)
        at java.lang.Class.getConstructor0(Class.java:2706)
        at java.lang.Class.getConstructor(Class.java:1657)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.doMain(HRegionServer.java:2429)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2499)






Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

Posted by Dan Washusen <da...@reactive.org>.
If you want to give the "indexed" contrib package a try you'll need to do
the following:

   1. Include the contrib jars (export HBASE_CLASSPATH=`find
   /path/to/hbase/hbase-0.20.3/contrib/indexed -name '*jar' | tr -s "\n" ":"`)
   2. Set the 'hbase.hregion.impl' property to
   'org.apache.hadoop.hbase.regionserver.IdxRegion' in your hbase-site.xml
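
For reference, the hbase-site.xml entry from step 2 is the usual Hadoop-style
configuration XML; the property name and value are exactly the ones above,
the surrounding markup is boilerplate:

   <property>
     <name>hbase.hregion.impl</name>
     <value>org.apache.hadoop.hbase.regionserver.IdxRegion</value>
   </property>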

Once you've done that you can create a table with an index using:

>     // define which qualifiers need an index (choosing the correct type)
>     IdxColumnDescriptor columnDescriptor = new IdxColumnDescriptor("columnFamily");
>     columnDescriptor.addIndexDescriptor(
>       new IdxIndexDescriptor("qualifier", IdxQualifierType.BYTE_ARRAY)
>     );
>
>     HTableDescriptor tableDescriptor = new HTableDescriptor("table");
>     tableDescriptor.addFamily(columnDescriptor);
>
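
The descriptor then still has to be handed to the master to actually create
the table; a minimal sketch using the stock client API (HBaseAdmin and
createTable are standard 0.20 calls, error handling omitted):

>     HBaseAdmin admin = new HBaseAdmin(new HBaseConfiguration());
>     admin.createTable(tableDescriptor);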

Then when you want to perform a scan with an index hint:

>     Scan scan = new IdxScan(
>           new Comparison("columnFamily", "qualifier",
>               Comparison.Operator.EQ, Bytes.toBytes("foo"))
>       );
>

You have to keep in mind that the index hint is only a hint.  It guarantees
that your scan will get all rows that match the hint but you'll more than
likely receive rows that don't.  For this reason I'd suggest that you also
include a filter along with the scan:

>       Scan scan = new IdxScan(
>           new Comparison("columnFamily", "qualifier",
>               Comparison.Operator.EQ, Bytes.toBytes("foo"))
>       );
>       scan.setFilter(
>           new SingleColumnValueFilter(
>               Bytes.toBytes("columnFamily"), Bytes.toBytes("qualifier"),
>               CompareFilter.CompareOp.EQUAL,
>               new BinaryComparator(Bytes.toBytes("foo"))
>           )
>       );
>

Cheers,
Dan
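
Putting the pieces together, a sketch of actually running the hinted scan
(HTable, ResultScanner and Result are stock 0.20 client API; the idx class
signatures above follow Dan's snippets and are worth checking against the
contrib javadoc):

>     HTable table = new HTable(new HBaseConfiguration(), "table");
>     ResultScanner scanner = table.getScanner(scan);
>     try {
>       for (Result result : scanner) {
>         // rows here matched the index hint and passed the filter
>         byte[] value = result.getValue(Bytes.toBytes("columnFamily"),
>             Bytes.toBytes("qualifier"));
>       }
>     } finally {
>       scanner.close();
>     }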



Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

Posted by stack <st...@duboce.net>.
http://people.apache.org/~jdcryans/hbase-0.20.3-candidate-2/

There is a bit of documentation if you look at the javadoc for the
'indexed' contrib (this is what hbase-2037 is called on commit).

St.Ack

P.S. We had a thread going named "HBase bulk load".  You got all the
answers you need on that one?


RE: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

Posted by Sriram Muthuswamy Chittathoor <sr...@ivycomptech.com>.
Great.  Can I migrate to 0.20.3RC2 easily?  I am on 0.20.2. Can you pass
me the link?


Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

Posted by stack <st...@duboce.net>.
IIRC, hbase-1845 was a sketch only and not yet complete.  It has probably
rotted since, anyway.

Have you looked at hbase-2037, since committed and available in 0.20.3RC2?
Would this help you with your original problem?

St.Ack


RE: Support for MultiGet / SQL In clause -- error in patch HBASE-1845

Posted by Sriram Muthuswamy Chittathoor <sr...@ivycomptech.com>.
I tried applying the patch to the hbase source code (hbase 0.20.2) and I
get the errors below.  Do you know if this needs to be applied to a
specific hbase version?  Is there a version which works with 0.20.2 or
later?
Basically, patching HRegionServer and HTable fails.


Thanks for the help

patch -p0 -i batch.patch

patching file src/java/org/apache/hadoop/hbase/client/Get.java
Hunk #1 succeeded at 61 (offset 2 lines).
Hunk #2 succeeded at 347 (offset 31 lines).
patching file src/java/org/apache/hadoop/hbase/client/HConnection.java
patching file
src/java/org/apache/hadoop/hbase/client/HConnectionManager.java
Hunk #3 succeeded at 1244 (offset 6 lines).
patching file src/java/org/apache/hadoop/hbase/client/HTable.java
Hunk #2 succeeded at 73 (offset 8 lines).
Hunk #4 FAILED at 405.
Hunk #5 succeeded at 671 with fuzz 2 (offset 26 lines).
1 out of 5 hunks FAILED -- saving rejects to file
src/java/org/apache/hadoop/hbase/client/HTable.java.rej
patching file src/java/org/apache/hadoop/hbase/client/Multi.java
patching file src/java/org/apache/hadoop/hbase/client/MultiCallable.java
patching file src/java/org/apache/hadoop/hbase/client/MultiResult.java
patching file src/java/org/apache/hadoop/hbase/client/Row.java
patching file
src/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
Hunk #2 succeeded at 156 with fuzz 1 (offset 3 lines).
patching file src/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
Hunk #2 succeeded at 247 (offset 2 lines).
patching file
src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
Hunk #1 succeeded at 78 (offset -1 lines).
Hunk #2 FAILED at 2515.
1 out of 2 hunks FAILED -- saving rejects to file
src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java.rej
patching file src/test/org/apache/hadoop/hbase/client/TestHTable.java
Hunk #2 FAILED at 333.
1 out of 2 hunks FAILED -- saving rejects to file
src/test/org/apache/hadoop/hbase/client/TestHTable.java.rej





Re: Support for MultiGet / SQL In clause

Posted by Marc Limotte <ms...@gmail.com>.
Sriram,

Would a secondary index help you:
http://hadoop.apache.org/hbase/docs/r0.20.2/api/org/apache/hadoop/hbase/client/tableindexed/package-summary.html#package_description
.

The index is stored in a separate table, but the index is managed for you.

I don't think you can do an arbitrary "in" query, though.  If the keys that
you want to include in the "in" are reasonably close neighbors, you could do
a scan and skip ones that are uninteresting.  You could also try a batch Get
by applying a separate patch, see
http://issues.apache.org/jira/browse/HBASE-1845.

Marc Limotte
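
To make the manual two-table lookup concrete, here is a rough sketch against
the plain 0.20 client API. Table names, the key layout and the sample value
are made up for illustration, and each row id costs a separate Get round
trip, which is exactly what the batch work in HBASE-1845/HBASE-2037 is meant
to avoid:

    HBaseConfiguration conf = new HBaseConfiguration();
    byte[] secondaryValue = Bytes.toBytes("foo@bar.com");  // the "IN" value
    HTable index = new HTable(conf, "profiles_by_email");  // hypothetical names
    HTable primary = new HTable(conf, "profiles");

    // Index rows are keyed "<secondary value>/<primary row id>", so all the
    // ids for one value form a contiguous range we can scan.
    byte[] start = Bytes.add(secondaryValue, Bytes.toBytes("/"));
    byte[] stop = Bytes.add(secondaryValue, Bytes.toBytes("0")); // '0' = '/' + 1
    List<Result> rows = new ArrayList<Result>();
    ResultScanner scanner = index.getScanner(new Scan(start, stop));
    try {
      for (Result r : scanner) {
        byte[] rowId = Arrays.copyOfRange(r.getRow(), start.length,
            r.getRow().length);
        rows.add(primary.get(new Get(rowId)));  // one RPC per key in 0.20.2
      }
    } finally {
      scanner.close();
    }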


Support for MultiGet / SQL In clause

Posted by Sriram Muthuswamy Chittathoor <sr...@ivycomptech.com>.
Is there any support for this?  I want to do this:

1.  Create a second table to maintain a mapping between a secondary column
and the rowid's of the primary table

2.  Use this second table to get the rowid's to look up from the primary
table, using a SQL IN-like clause

Basically I am doing this to speed up querying by non-row-key columns.

Thanks

Sriram C
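
For the write side of step 1, a sketch with illustrative names (the index
row key is the secondary value plus the primary row id, matching the lookup
sketch in Marc's reply above; note the two puts are not atomic across
tables):

    HBaseConfiguration conf = new HBaseConfiguration();
    byte[] rowId = Bytes.toBytes("row-0001");      // primary row key (example)
    byte[] email = Bytes.toBytes("foo@bar.com");   // secondary value (example)
    HTable primary = new HTable(conf, "profiles");         // hypothetical
    HTable index = new HTable(conf, "profiles_by_email");  // hypothetical

    Put row = new Put(rowId);
    row.add(Bytes.toBytes("attr"), Bytes.toBytes("email"), email);
    primary.put(row);

    // composite index key: "<secondary value>/<primary row id>"
    Put idx = new Put(Bytes.add(email, Bytes.toBytes("/"), rowId));
    idx.add(Bytes.toBytes("attr"), Bytes.toBytes("rowid"), rowId);
    index.put(idx);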




Re: Configuration limits for hbase and hadoop ...

Posted by Jean-Daniel Cryans <jd...@apache.org>.
> 2010-01-19 09:27:48,382 WARN
> org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 4 on
> 60020 took 1338ms appending an edit to hlog; editcount=222803

This usually means that your HDFS is a tad slow, probably very loaded.

> 2010-01-19 09:28:05,251 INFO org.apache.hadoop.ipc.HBaseServer: IPC
> Server handler 4 on 60020 caught:
> java.nio.channels.ClosedChannelException
>        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
>        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>        at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1125)
>        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:615)
>        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:679)
>        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:943)

This happens when a client (like a mapreduce task) gets killed while
the operation is happening on the region server.

J-D

Re: Configuration limits for hbase and hadoop ...

Posted by Zaharije Pasalic <pa...@gmail.com>.
On Tue, Jan 19, 2010 at 10:12 AM, Zaharije Pasalic
<pa...@gmail.com> wrote:
> On Tue, Jan 19, 2010 at 2:38 AM, stack <st...@duboce.net> wrote:
>> On Mon, Jan 18, 2010 at 5:18 PM, Zaharije Pasalic <
>> pasalic.zaharije@gmail.com> wrote:
>>
>>> On Tue, Jan 19, 2010 at 12:13 AM, stack <st...@duboce.net> wrote:
>>> > On Mon, Jan 18, 2010 at 8:47 AM, Zaharije Pasalic <
>>> > pasalic.zaharije@gmail.com> wrote:
>>> >> Importing process is really simple one: small map reduce program will
>>> >> read CSV file, split lines and insert it into table (only Map, no
>>> >> Reduce parts). We are using default hadoop configuration (on 7 nodes
>>> >> we can run 14 maps). Also we are using 32MB for writeBufferSize on
>>> >> HBase and also we set setWriteToWAL to false.
>>> >>
>>> >>
>>> > The mapreduce tasks are running on same nodes as hbase+datanodes?  WIth
>>> 8G
>>> > of RAM only, that might be a bit of a stretch.  You have monitoring on
>>> these
>>> > machines?  Any swapping?   Or are they fine?
>>> >
>>> >
>>>
>>> No, there is no swapping at all. Also cpu usage is really small.
>>>
>>>
>> OK.  Then it unlikely MapReduce is robbing resources from datanodes (whats
>> i/o like on these machines?  Load?).
>
> we are using RackSpace cloud, so I'm not sure about i/o (I will try to
> check with their support). Currently there is no other load on those
> servers except when I run MapReduce.
>
>>
>>> Are you inserting one row only per map task or more than this?  You are
>>> > reusing an HTable instance?  Or failing that passing the same
>>> > HBaseConfiguration each time?  If you make a new HTable with a new
>>> > HBaseConfiguration each time then it does not make use of cache of region
>>> > locations; it has to go fetch them again.  This can make for extra
>>> loading
>>> > on .META. table.
>>> >
>>>
>>> We are having 500000 lines per single CSV file ~518MB. Default
>>> splitting is used.
>>
>>
>> Whats that?  A task per line?  Does the line have 100 columns on it?  Is
>> that a MR task per line of a CSV file?  Is the HTable being created per
>> Task?
>>
>>
>
> Not sure that I understand "task per line". Did you mean one map per
> line? If so, no: one map will parse ~6K lines
> (so ~6K rows are written in one map).
>
> Here is a snippet of the main createJobConfiguration method:
>
>    // Job configuration
>    Job job = new Job(conf, "hbase import");
>    job.setJarByClass(HBaseImport2.class);
>    job.setMapperClass(ImportMapper.class);
>
>    // INPUT
>    FileInputFormat.addInputPath(job, new Path(fileName));
>
>    // OUTPUT
>    job.setOutputFormatClass(CustomTableOutputFormat.class);
>    job.getConfiguration().set(CustomTableOutputFormat.OUTPUT_TABLE, tableName);
>    job.setOutputKeyClass(ImmutableBytesWritable.class);
>    job.setOutputValueClass(Writable.class);
>
>    // MISC
>    job.setNumReduceTasks(0);
>
> main method looks like:
>
>    HBaseConfiguration conf = new HBaseConfiguration();
>    // parse coimmand line args ...
>    Job job = createJob(conf, fileNameFromArgs, tableNameFromArgs);
>
> and map part:
>
>    public void map(Object key, Text value, Context context)
>            throws IOException, InterruptedException {
>        int i = 0;
>        String name = "";
>        try {
>            String[] values = value.toString().split(",");
>
>            context.getCounter(Counters.ROWS_WRITTEN).increment(1);
>
>            Put put = new Put(values[0].getBytes());
>            put.setWriteToWAL(false);
>            for (i = 1; i < values.length; i++) {
>                name = values[i];
>                put.add("attr".getBytes(),
>                        context.getConfiguration().get("column_name_" + (i - 1)).getBytes(),
>                        values[i].getBytes());
>            }
>
>            context.write(key, put);
>        } catch (Exception e) {
>            throw new RuntimeException("Values: '" + value + "' [" + i
>                    + ":" + name + "]" + "\n" + e.getMessage());
>        }
>    }
>
>>
>>
>>
>>> We are using a little modified TableOutputFormat
>>> class (I added support for write buffer size).
>>>
>>> So, we are instantiating HBaseConfiguration only in main method, and
>>> leaving rest to (Custom)TableOutputFormat.
>>>
>>
>> So, you have TOF hooked up as the MR Map output?
>>
>
> Yes. Check upper code.
>
>>
>>
>>>
>>> > Regards logs, enable DEBUG if you can (See FAQ for how).
>>> >
>>>
>>> Will provide logs soon ...
>>>
>>
>>
>> Thanks.
>>
>>

Also, on our regionservers we are encountering this:
2010-01-19 09:19:06,814 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
profiles,356e989b-56b0-424e-b161-f1f150edfdb0,1263866025810
2010-01-19 09:19:06,814 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
profiles,430248c6-5f7f-409f-838b-4f06755103d9,1263866025810
2010-01-19 09:19:06,814 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_CLOSE:
profiles,356e989b-56b0-424e-b161-f1f150edfdb0,1263866025810
2010-01-19 09:19:06,815 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Closed
profiles,356e989b-56b0-424e-b161-f1f150edfdb0,1263866025810
2010-01-19 09:19:06,815 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_CLOSE:
profiles,430248c6-5f7f-409f-838b-4f06755103d9,1263866025810
2010-01-19 09:19:06,815 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Closed
profiles,430248c6-5f7f-409f-838b-4f06755103d9,1263866025810
2010-01-19 09:19:10,231 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region .META.,,1
2010-01-19 09:19:10,289 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region .META.,,1 in 0sec
2010-01-19 09:19:12,824 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_FLUSH:
.META.,,1
2010-01-19 09:19:12,824 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer:
MSG_REGION_MAJOR_COMPACT: .META.,,1
2010-01-19 09:19:12,824 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_FLUSH: .META.,,1
2010-01-19 09:19:12,854 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_MAJOR_COMPACT: .META.,,1
2010-01-19 09:19:12,854 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting major
compaction on region .META.,,1
2010-01-19 09:19:12,901 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region .META.,,1 in 0sec
2010-01-19 09:20:05,359 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
-5249079272973123665 lease expired
2010-01-19 09:20:07,403 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
5066754693501358278 lease expired
2010-01-19 09:20:09,451 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
1341126026956906568 lease expired
2010-01-19 09:20:11,495 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
5981325577571503497 lease expired
2010-01-19 09:20:11,531 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
-1394218678167923901 lease expired
2010-01-19 09:20:11,567 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
3498595592686506630 lease expired
2010-01-19 09:24:35,565 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region .META.,,1
2010-01-19 09:24:35,641 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region .META.,,1 in 0sec
2010-01-19 09:27:07,693 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region .META.,,1
2010-01-19 09:27:07,725 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region .META.,,1 in 0sec
2010-01-19 09:27:32,517 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN:
profiles,9174b61d-cb48-4eab-9a60-5a686b10b308,1263893244465
2010-01-19 09:27:32,517 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN:
profiles,a2673314-aff3-4bef-a86f-202591991a90,1263893244465
2010-01-19 09:27:32,518 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_OPEN:
profiles,9174b61d-cb48-4eab-9a60-5a686b10b308,1263893244465
2010-01-19 09:27:33,239 INFO
org.apache.hadoop.hbase.regionserver.HRegion: region
profiles,9174b61d-cb48-4eab-9a60-5a686b10b308,1263893244465/1388385637
available; sequence id is 35054928
2010-01-19 09:27:33,239 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_OPEN:
profiles,a2673314-aff3-4bef-a86f-202591991a90,1263893244465
2010-01-19 09:27:33,239 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,9174b61d-cb48-4eab-9a60-5a686b10b308,1263893244465
2010-01-19 09:27:33,311 INFO
org.apache.hadoop.hbase.regionserver.HRegion: region
profiles,a2673314-aff3-4bef-a86f-202591991a90,1263893244465/1543981704
available; sequence id is 35054929
2010-01-19 09:27:48,317 WARN
org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 1 on
60020 took 1467ms appending an edit to hlog; editcount=221807
2010-01-19 09:27:48,317 WARN
org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 1 on
60020 took 1467ms appending an edit to hlog; editcount=221808
2010-01-19 09:27:48,317 WARN
org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 1 on
60020 took 1467ms appending an edit to hlog; editcount=221809
2010-01-19 09:27:48,317 WARN
org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 1 on
60020 took 1467ms appending an edit to hlog; editcount=221810

// bunch of same lines

2010-01-19 09:27:48,382 WARN
org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 4 on
60020 took 1338ms appending an edit to hlog; editcount=222803
2010-01-19 09:27:48,382 WARN
org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 4 on
60020 took 1338ms appending an edit to hlog; editcount=222804

// bunch of same lines

2010-01-19 09:27:48,888 INFO
org.apache.hadoop.hbase.regionserver.HLog: Roll
/hbase/.logs/hadoop-node08,60020,1263861311176/hlog.dat.1263868517809,
entries=228673, calcsize=63867472, filesize=39216210. New hlog
/hbase/.logs/hadoop-node08,60020,1263861311176/hlog.dat.1263893268885
2010-01-19 09:27:48,888 INFO
org.apache.hadoop.hbase.regionserver.HLog: removing old hlog file
/hbase/.logs/hadoop-node08,60020,1263861311176/hlog.dat.1263861311541
whose highest sequence/edit id is 16347832
2010-01-19 09:28:05,250 WARN org.apache.hadoop.ipc.HBaseServer: IPC
Server Responder, call put([B@5b47f8aa,
[Lorg.apache.hadoop.hbase.client.Put;@52168fb7) from
10.177.88.207:49677: output error
2010-01-19 09:28:05,251 INFO org.apache.hadoop.ipc.HBaseServer: IPC
Server handler 4 on 60020 caught:
java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
        at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1125)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:615)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:679)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:943)

2010-01-19 09:28:05,871 WARN org.apache.hadoop.ipc.HBaseServer: IPC
Server Responder, call put([B@70fc63b4,
[Lorg.apache.hadoop.hbase.client.Put;@49f5f85f) from
10.177.88.55:45340: output error
2010-01-19 09:28:05,871 INFO org.apache.hadoop.ipc.HBaseServer: IPC
Server handler 5 on 60020 caught:
java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
        at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1125)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:615)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:679)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:943)

2010-01-19 09:28:13,766 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,9174b61d-cb48-4eab-9a60-5a686b10b308,1263893244465 in
40sec
2010-01-19 09:28:13,766 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,a2673314-aff3-4bef-a86f-202591991a90,1263893244465
2010-01-19 09:28:24,949 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,a2673314-aff3-4bef-a86f-202591991a90,1263893244465 in
11sec


and also in hbase-hadoop-zookeeper-hadoop-master01 I'm having:

2010-01-19 09:28:05,139 INFO
org.apache.zookeeper.server.NIOServerCnxn: closing
session:0x12643a36e430082 NIOServerCnxn:
java.nio.channels.SocketChannel[connected local=/10.177.88.51:2181
remote=/10.177.88.207:43529]
2010-01-19 09:28:05,177 WARN
org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of
session 0x12643a36e430083 due to java.io.IOException: Read error
2010-01-19 09:28:05,177 INFO
org.apache.zookeeper.server.NIOServerCnxn: closing
session:0x12643a36e430083 NIOServerCnxn:
java.nio.channels.SocketChannel[connected local=/10.177.88.51:2181
remote=/10.177.88.207:43531]
2010-01-19 09:28:05,477 WARN
org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of
session 0x12643a36e430085 due to java.io.IOException: Read error
2010-01-19 09:28:05,478 INFO
org.apache.zookeeper.server.NIOServerCnxn: closing
session:0x12643a36e430085 NIOServerCnxn:
java.nio.channels.SocketChannel[connected local=/10.177.88.51:2181
remote=/10.177.88.55:58731]
2010-01-19 09:28:05,517 WARN
org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of
session 0x12643a36e430084 due to java.io.IOException: Read error

and so on, across the whole log file.

>>
>>>
>>> >
>>> >> second manifestation is that i can create new empty table and start
>>> >> importing data normaly, but if i try to import more data into same
>>> >> table (now having ~33 millions) i'm having really bad performance and
>>> >> hbase status page does not work at all (will not load into browser).
>>> >>
>>> >> Thats bad.  Can you tell how many regions you have on your cluster?  How
>>> > many per server?
>>> >
>>>
>>> ~1800 regions on cluster and ~250 per node. We are using replication
>>> by factor of 2 (there is
>>> no reason why we used 2 instead of default 3)
>>>
>>> Also, if I leave maps to run i will got following errors in datanode logs:
>>>
>>> 2010-01-18 23:15:15,795 ERROR
>>> org.apache.hadoop.hdfs.server.datanode.DataNode:
>>> DatanodeRegistration(10.177.88.209:50010,
>>> storageID=DS-515966566-10.177.88.209-50010-1263597214826,
>>> infoPort=50075, ipcPort=50020):DataXceiver
>>> java.io.IOException: Block blk_3350193476599136386_135159 is not valid.
>>>        at
>>> org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockFile(FSDataset.java:734)
>>>        at
>>> org.apache.hadoop.hdfs.server.datanode.FSDataset.getLength(FSDataset.java:722)
>>>        at
>>> org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:92)
>>>        at
>>> org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:172)
>>>        at
>>> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
>>>        at java.lang.Thread.run(Thread.java:619)
>>>
>>>
>> But this does not show up in the regionserver, right?  My guess is that HDFS
>> deals with the broken block.
>>
>
> No, nothing in regionserver.
>
>> St.Ack
>>
>>
>>> >
>>> >
>>> >> So my questions is: what i'm doing wrong? Is current cluster good
>>> >> enough to support 50millions records or my current 33 millions is
>>> >> limit on current configuration? Any hints. Also, I'm getting about 800
>>> >> inserts per second, is this slow?   Any hint is appreciated.
>>> >>
>>> >> An insert has 100 columns?  Is this 800/second across the whole cluster?
>>> >
>>> > St.Ack
>>> >
>>>
>>
>

Re: Configuration limits for hbase and hadoop ...

Posted by Zaharije Pasalic <pa...@gmail.com>.
On Tue, Jan 19, 2010 at 2:38 AM, stack <st...@duboce.net> wrote:
> On Mon, Jan 18, 2010 at 5:18 PM, Zaharije Pasalic <
> pasalic.zaharije@gmail.com> wrote:
>
>> On Tue, Jan 19, 2010 at 12:13 AM, stack <st...@duboce.net> wrote:
>> > On Mon, Jan 18, 2010 at 8:47 AM, Zaharije Pasalic <
>> > pasalic.zaharije@gmail.com> wrote:
>> >> Importing process is really simple one: small map reduce program will
>> >> read CSV file, split lines and insert it into table (only Map, no
>> >> Reduce parts). We are using default hadoop configuration (on 7 nodes
>> >> we can run 14 maps). Also we are using 32MB for writeBufferSize on
>> >> HBase and also we set setWriteToWAL to false.
>> >>
>> >>
>> > The mapreduce tasks are running on same nodes as hbase+datanodes?  WIth
>> 8G
>> > of RAM only, that might be a bit of a stretch.  You have monitoring on
>> these
>> > machines?  Any swapping?   Or are they fine?
>> >
>> >
>>
>> No, there is no swapping at all. Also cpu usage is really small.
>>
>>
> OK.  Then it unlikely MapReduce is robbing resources from datanodes (whats
> i/o like on these machines?  Load?).

We are using the Rackspace cloud, so I'm not sure about I/O (I will try
to check with their support). Currently there is no load on those
servers except when I run the MapReduce job.

>
>> Are you inserting one row only per map task or more than this?  You are
>> > reusing an HTable instance?  Or failing that passing the same
>> > HBaseConfiguration each time?  If you make a new HTable with a new
>> > HBaseConfiguration each time then it does not make use of cache of region
>> > locations; it has to go fetch them again.  This can make for extra
>> loading
>> > on .META. table.
>> >
>>
>> We are having 500000 lines per single CSV file ~518MB. Default
>> splitting is used.
>
>
> Whats that?  A task per line?  Does the line have 100 columns on it?  Is
> that a MR task per line of a CSV file?  Is the HTable being created per
> Task?
>
>

Not sure I understand "task per line". Do you mean one map per line? If
so, no: one map will parse ~6K lines (so ~6K rows are written per map).

Here is a snippet of the main job configuration (createJob):

    // Job configuration
    Job job = new Job(conf, "hbase import");
    job.setJarByClass(HBaseImport2.class);
    job.setMapperClass(ImportMapper.class);

    // INPUT
    FileInputFormat.addInputPath(job, new Path(fileName));

    // OUTPUT
    job.setOutputFormatClass(CustomTableOutputFormat.class);
    job.getConfiguration().set(CustomTableOutputFormat.OUTPUT_TABLE, tableName);
    job.setOutputKeyClass(ImmutableBytesWritable.class);
    job.setOutputValueClass(Writable.class);

    // MISC
    job.setNumReduceTasks(0);

main method looks like:

    HBaseConfiguration conf = new HBaseConfiguration();
    // parse command line args ...
    Job job = createJob(conf, fileNameFromArgs, tableNameFromArgs);
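
plus the usual submission at the end (a sketch -- assuming createJob
returns the configured Job):

    // Submit the job and block until it finishes; exit nonzero on failure.
    boolean success = job.waitForCompletion(true);
    System.exit(success ? 0 : 1);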

and the map part:

    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        int i = 0;
        String name = "";
        try {
            String[] values = value.toString().split(",");

            context.getCounter(Counters.ROWS_WRITTEN).increment(1);

            // The first CSV field is the row key; the rest become cells
            // in the "attr" family.
            Put put = new Put(values[0].getBytes());
            put.setWriteToWAL(false);
            for (i = 1; i < values.length; i++) {
                name = values[i];
                // Column qualifiers are looked up from the job configuration.
                put.add("attr".getBytes(),
                        context.getConfiguration().get("column_name_" + (i - 1)).getBytes(),
                        values[i].getBytes());
            }

            // TableOutputFormat uses only the Put; the key is ignored.
            context.write(key, put);
        } catch (Exception e) {
            throw new RuntimeException("Values: '" + value + "' [" + i
                    + ":" + name + "]\n" + e.getMessage());
        }
    }

>
>
>
>> We are using a little modified TableOutputFormat
>> class (I added support for write buffer size).
>>
>> So, we are instantiating HBaseConfiguration only in main method, and
>> leaving rest to (Custom)TableOutputFormat.
>>
>
> So, you have TOF hooked up as the MR Map output?
>

Yes, see the code above.

>
>
>>
>> > Regards logs, enable DEBUG if you can (See FAQ for how).
>> >
>>
>> Will provide logs soon ...
>>
>
>
> Thanks.
>
>
>
>>
>> >
>> >> second manifestation is that i can create new empty table and start
>> >> importing data normaly, but if i try to import more data into same
>> >> table (now having ~33 millions) i'm having really bad performance and
>> >> hbase status page does not work at all (will not load into browser).
>> >>
>> >> Thats bad.  Can you tell how many regions you have on your cluster?  How
>> > many per server?
>> >
>>
>> ~1800 regions on cluster and ~250 per node. We are using replication
>> by factor of 2 (there is
>> no reason why we used 2 instead of default 3)
>>
>> Also, if I leave maps to run i will got following errors in datanode logs:
>>
>> 2010-01-18 23:15:15,795 ERROR
>> org.apache.hadoop.hdfs.server.datanode.DataNode:
>> DatanodeRegistration(10.177.88.209:50010,
>> storageID=DS-515966566-10.177.88.209-50010-1263597214826,
>> infoPort=50075, ipcPort=50020):DataXceiver
>> java.io.IOException: Block blk_3350193476599136386_135159 is not valid.
>>        at
>> org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockFile(FSDataset.java:734)
>>        at
>> org.apache.hadoop.hdfs.server.datanode.FSDataset.getLength(FSDataset.java:722)
>>        at
>> org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:92)
>>        at
>> org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:172)
>>        at
>> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
>>        at java.lang.Thread.run(Thread.java:619)
>>
>>
> But this does not show up in the regionserver, right?  My guess is that HDFS
> deals with the broken block.
>

No, nothing in the regionserver logs.

> St.Ack
>
>
>> >
>> >
>> >> So my questions is: what i'm doing wrong? Is current cluster good
>> >> enough to support 50millions records or my current 33 millions is
>> >> limit on current configuration? Any hints. Also, I'm getting about 800
>> >> inserts per second, is this slow?   Any hint is appreciated.
>> >>
>> >> An insert has 100 columns?  Is this 800/second across the whole cluster?
>> >
>> > St.Ack
>> >
>>
>

Re: Configuration limits for hbase and hadoop ...

Posted by stack <st...@duboce.net>.
On Mon, Jan 18, 2010 at 5:18 PM, Zaharije Pasalic <
pasalic.zaharije@gmail.com> wrote:

> On Tue, Jan 19, 2010 at 12:13 AM, stack <st...@duboce.net> wrote:
> > On Mon, Jan 18, 2010 at 8:47 AM, Zaharije Pasalic <
> > pasalic.zaharije@gmail.com> wrote:
> >> Importing process is really simple one: small map reduce program will
> >> read CSV file, split lines and insert it into table (only Map, no
> >> Reduce parts). We are using default hadoop configuration (on 7 nodes
> >> we can run 14 maps). Also we are using 32MB for writeBufferSize on
> >> HBase and also we set setWriteToWAL to false.
> >>
> >>
> > The mapreduce tasks are running on same nodes as hbase+datanodes?  WIth
> 8G
> > of RAM only, that might be a bit of a stretch.  You have monitoring on
> these
> > machines?  Any swapping?   Or are they fine?
> >
> >
>
> No, there is no swapping at all. Also cpu usage is really small.
>
>
OK.  Then it's unlikely MapReduce is robbing resources from the datanodes
(what's I/O like on these machines?  Load?).

> Are you inserting one row only per map task or more than this?  You are
> > reusing an HTable instance?  Or failing that passing the same
> > HBaseConfiguration each time?  If you make a new HTable with a new
> > HBaseConfiguration each time then it does not make use of cache of region
> > locations; it has to go fetch them again.  This can make for extra
> loading
> > on .META. table.
> >
>
> We are having 500000 lines per single CSV file ~518MB. Default
> splitting is used.


What's that?  A task per line?  Does the line have 100 columns on it?  Is
that an MR task per line of a CSV file?  Is the HTable being created per
task?





> We are using a little modified TableOutputFormat
> class (I added support for write buffer size).
>
> So, we are instantiating HBaseConfiguration only in main method, and
> leaving rest to (Custom)TableOutputFormat.
>

So, you have TOF hooked up as the MR Map output?



>
> > Regards logs, enable DEBUG if you can (See FAQ for how).
> >
>
> Will provide logs soon ...
>


Thanks.



>
> >
> >> second manifestation is that i can create new empty table and start
> >> importing data normaly, but if i try to import more data into same
> >> table (now having ~33 millions) i'm having really bad performance and
> >> hbase status page does not work at all (will not load into browser).
> >>
> >> Thats bad.  Can you tell how many regions you have on your cluster?  How
> > many per server?
> >
>
> ~1800 regions on cluster and ~250 per node. We are using replication
> by factor of 2 (there is
> no reason why we used 2 instead of default 3)
>
> Also, if I leave maps to run i will got following errors in datanode logs:
>
> 2010-01-18 23:15:15,795 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(10.177.88.209:50010,
> storageID=DS-515966566-10.177.88.209-50010-1263597214826,
> infoPort=50075, ipcPort=50020):DataXceiver
> java.io.IOException: Block blk_3350193476599136386_135159 is not valid.
>        at
> org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockFile(FSDataset.java:734)
>        at
> org.apache.hadoop.hdfs.server.datanode.FSDataset.getLength(FSDataset.java:722)
>        at
> org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:92)
>        at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:172)
>        at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
>        at java.lang.Thread.run(Thread.java:619)
>
>
But this does not show up in the regionserver, right?  My guess is that HDFS
deals with the broken block.

St.Ack


> >
> >
> >> So my questions is: what i'm doing wrong? Is current cluster good
> >> enough to support 50millions records or my current 33 millions is
> >> limit on current configuration? Any hints. Also, I'm getting about 800
> >> inserts per second, is this slow?   Any hint is appreciated.
> >>
> >> An insert has 100 columns?  Is this 800/second across the whole cluster?
> >
> > St.Ack
> >
>

Re: Configuration limits for hbase and hadoop ...

Posted by Zaharije Pasalic <pa...@gmail.com>.
On Tue, Jan 19, 2010 at 12:13 AM, stack <st...@duboce.net> wrote:
> On Mon, Jan 18, 2010 at 8:47 AM, Zaharije Pasalic <
> pasalic.zaharije@gmail.com> wrote:
>
>> Now we are trying to import 50 millions rows of data. Each row have
>> 100 columns (in reality we will have sparsely populated table, but now
>> we are testing worst-case scenario). We are having 50 million records
>> encoded in about 100 CSV files stored in HDFS.
>>
>
>
> 50Millions for such a cluster is a small number of rows.  100 columns per
> row should work out fine.  You have one column family only, right?
>

Yes. For now we are doing a proof of concept and used one family for
everything. In reality we will have ~10 families covering the 100 columns.
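
(For the proof of concept the table is created with just the one family --
roughly like this; names illustrative:)

    HBaseConfiguration conf = new HBaseConfiguration();
    HBaseAdmin admin = new HBaseAdmin(conf);
    HTableDescriptor desc = new HTableDescriptor("profiles");
    // Single family holding all ~100 columns for the worst-case test.
    desc.addFamily(new HColumnDescriptor("attr"));
    admin.createTable(desc);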

>
>
>>
>> Importing process is really simple one: small map reduce program will
>> read CSV file, split lines and insert it into table (only Map, no
>> Reduce parts). We are using default hadoop configuration (on 7 nodes
>> we can run 14 maps). Also we are using 32MB for writeBufferSize on
>> HBase and also we set setWriteToWAL to false.
>>
>>
> The mapreduce tasks are running on same nodes as hbase+datanodes?  WIth 8G
> of RAM only, that might be a bit of a stretch.  You have monitoring on these
> machines?  Any swapping?   Or are they fine?
>
>

No, there is no swapping at all. CPU usage is also really low.

>
>
>> At the beginning everything looks fine, but after ~33 millions of
>> records we are encounter strange behavior of HBase.
>>
>> Firstly one of nodes where META table resides have high load. Status
>> web page shows ~1700 requests on that node even if we are not running
>> any MapReduce (0 request on other nodes).
>
>
> See other message.
>
> Are you inserting one row only per map task or more than this?  You are
> reusing an HTable instance?  Or failing that passing the same
> HBaseConfiguration each time?  If you make a new HTable with a new
> HBaseConfiguration each time then it does not make use of cache of region
> locations; it has to go fetch them again.  This can make for extra loading
> on .META. table.
>

We have 500,000 lines per CSV file (~518MB each); default input
splitting is used. We are using a slightly modified TableOutputFormat
class (I added support for setting the write buffer size).

So we instantiate HBaseConfiguration only in the main method and leave
the rest to the (Custom)TableOutputFormat.
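
Roughly, the customized class looks like this (a from-scratch sketch, not
the exact code; the WRITE_BUFFER_SIZE key name is illustrative):

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.mapreduce.JobContext;
    import org.apache.hadoop.mapreduce.OutputCommitter;
    import org.apache.hadoop.mapreduce.OutputFormat;
    import org.apache.hadoop.mapreduce.RecordWriter;
    import org.apache.hadoop.mapreduce.TaskAttemptContext;

    public class CustomTableOutputFormat
            extends OutputFormat<ImmutableBytesWritable, Writable> {

        public static final String OUTPUT_TABLE = "hbase.mapred.outputtable";
        // Illustrative key name -- not part of stock HBase.
        public static final String WRITE_BUFFER_SIZE = "custom.write.buffer.size";

        @Override
        public RecordWriter<ImmutableBytesWritable, Writable> getRecordWriter(
                TaskAttemptContext context) throws IOException {
            Configuration c = context.getConfiguration();
            final HTable table =
                new HTable(new HBaseConfiguration(c), c.get(OUTPUT_TABLE));
            table.setAutoFlush(false); // buffer puts client-side
            table.setWriteBufferSize(c.getLong(WRITE_BUFFER_SIZE, 32 * 1024 * 1024));
            return new RecordWriter<ImmutableBytesWritable, Writable>() {
                public void write(ImmutableBytesWritable key, Writable value)
                        throws IOException {
                    table.put((Put) value); // flushed whenever the buffer fills
                }
                public void close(TaskAttemptContext ctx) throws IOException {
                    table.flushCommits(); // push whatever is still buffered
                }
            };
        }

        @Override
        public void checkOutputSpecs(JobContext context) {
            // Nothing to verify: output goes to an existing HBase table.
        }

        @Override
        public OutputCommitter getOutputCommitter(TaskAttemptContext context) {
            // No-op committer; the HTable write buffer does the real work.
            return new OutputCommitter() {
                public void setupJob(JobContext c) { }
                public void cleanupJob(JobContext c) { }
                public void setupTask(TaskAttemptContext c) { }
                public boolean needsTaskCommit(TaskAttemptContext c) { return false; }
                public void commitTask(TaskAttemptContext c) { }
                public void abortTask(TaskAttemptContext c) { }
            };
        }
    }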

> Regards logs, enable DEBUG if you can (See FAQ for how).
>

Will provide logs soon ...

>
>> second manifestation is that i can create new empty table and start
>> importing data normaly, but if i try to import more data into same
>> table (now having ~33 millions) i'm having really bad performance and
>> hbase status page does not work at all (will not load into browser).
>>
>> Thats bad.  Can you tell how many regions you have on your cluster?  How
> many per server?
>

~1800 regions on the cluster, ~250 per node. We are using a replication
factor of 2 (there is no particular reason we chose 2 over the default 3).

Also, if I leave the maps running I get the following errors in the datanode logs:

2010-01-18 23:15:15,795 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode:
DatanodeRegistration(10.177.88.209:50010,
storageID=DS-515966566-10.177.88.209-50010-1263597214826,
infoPort=50075, ipcPort=50020):DataXceiver
java.io.IOException: Block blk_3350193476599136386_135159 is not valid.
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockFile(FSDataset.java:734)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.getLength(FSDataset.java:722)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:92)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:172)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
        at java.lang.Thread.run(Thread.java:619)


>
>
>> So my questions is: what i'm doing wrong? Is current cluster good
>> enough to support 50millions records or my current 33 millions is
>> limit on current configuration? Any hints. Also, I'm getting about 800
>> inserts per second, is this slow?   Any hint is appreciated.
>>
>> An insert has 100 columns?  Is this 800/second across the whole cluster?
>
> St.Ack
>

Re: Configuration limits for hbase and hadoop ...

Posted by stack <st...@duboce.net>.
On Mon, Jan 18, 2010 at 8:47 AM, Zaharije Pasalic <
pasalic.zaharije@gmail.com> wrote:

> Now we are trying to import 50 millions rows of data. Each row have
> 100 columns (in reality we will have sparsely populated table, but now
> we are testing worst-case scenario). We are having 50 million records
> encoded in about 100 CSV files stored in HDFS.
>


50 million rows is a small number for such a cluster.  100 columns per
row should work out fine.  You have one column family only, right?



>
> Importing process is really simple one: small map reduce program will
> read CSV file, split lines and insert it into table (only Map, no
> Reduce parts). We are using default hadoop configuration (on 7 nodes
> we can run 14 maps). Also we are using 32MB for writeBufferSize on
> HBase and also we set setWriteToWAL to false.
>
>
The mapreduce tasks are running on the same nodes as hbase+datanodes?  With
only 8G of RAM, that might be a bit of a stretch.  You have monitoring on
these machines?  Any swapping?  Or are they fine?




> At the beginning everything looks fine, but after ~33 millions of
> records we are encounter strange behavior of HBase.
>
> Firstly one of nodes where META table resides have high load. Status
> web page shows ~1700 requests on that node even if we are not running
> any MapReduce (0 request on other nodes).


See other message.

Are you inserting only one row per map task or more than that?  Are you
reusing an HTable instance?  Or, failing that, passing the same
HBaseConfiguration each time?  If you make a new HTable with a new
HBaseConfiguration each time then it does not make use of the cache of
region locations; it has to go fetch them again.  This can make for extra
loading on the .META. table.
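
For example (a sketch -- table/family names made up):

    // One HBaseConfiguration per JVM: the client keys its cache of region
    // locations off the configuration, so sharing it shares the cache.
    HBaseConfiguration conf = new HBaseConfiguration();

    // Likewise create the HTable once per task and reuse it for every put.
    HTable table = new HTable(conf, "profiles");
    table.setAutoFlush(false);                   // buffer puts client-side
    table.setWriteBufferSize(32 * 1024 * 1024);  // e.g. your 32MB buffer

    Put put = new Put("some-row-key".getBytes());
    put.add("attr".getBytes(), "col_0".getBytes(), "value".getBytes());
    table.put(put);
    // ... many more puts against the same table instance ...
    table.flushCommits();  // flush anything still buffered when done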

Regarding logs, enable DEBUG if you can (see the FAQ for how).
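
(Typically that is one line in conf/log4j.properties, e.g.:

    log4j.logger.org.apache.hadoop.hbase=DEBUG

and a restart of the daemons.)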


> second manifestation is that i can create new empty table and start
> importing data normaly, but if i try to import more data into same
> table (now having ~33 millions) i'm having really bad performance and
> hbase status page does not work at all (will not load into browser).
>
That's bad.  Can you tell how many regions you have on your cluster?  How
many per server?



> So my questions is: what i'm doing wrong? Is current cluster good
> enough to support 50millions records or my current 33 millions is
> limit on current configuration? Any hints. Also, I'm getting about 800
> inserts per second, is this slow?   Any hint is appreciated.
>
An insert has 100 columns?  Is this 800/second across the whole cluster?

St.Ack