Posted to user@hbase.apache.org by "Sharma, Avani" <ag...@ebay.com> on 2010/09/15 20:06:03 UTC

RE: HBase table lost on upgrade - compiling HBaseFsck.java

Ted,

I am trying to compile the file and am getting the same errors you mentioned, and more:
[javac] symbol  : method
metaScan(org.apache.hadoop.conf.Configuration,org.apache.hadoop.hbase.client.MetaScanner.MetaScannerVisitor)
    [javac] location: class org.apache.hadoop.hbase.client.MetaScanner
    [javac]       MetaScanner.metaScan(conf, visitor);
    [javac]                  ^
    [javac]
/Users/tyu/hbase-0.20.5/src/java/org/apache/hadoop/hbase/client/HBaseFsck.java:503:
cannot find symbol
    [javac] symbol  : method create()
    [javac] location: class org.apache.hadoop.hbase.HBaseConfiguration
    [javac]     Configuration conf = HBaseConfiguration.create();

I got around a few of these by adding the logging jar to the CLASSPATH, but I still have some. I see that you sent out a fix, but I am unable to see the attachment.

I have the conf dirs in the CLASSPATH, as well as the hadoop, zk, and hbase jars.

Would you recall how these can be fixed? I guess some jars are needed in the CLASSPATH.
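
For the record, the last two errors above are not missing jars: HBaseConfiguration.create() and the Configuration-based MetaScanner.metaScan() only exist in trunk, so the source itself has to change to the 0.20.x equivalents. A minimal sketch of the 0.20.x-style calls (the visitor method name and parameter type are assumptions from the 0.20-era client API, not verified against the exact source):

    import java.io.IOException;

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.MetaScanner;
    import org.apache.hadoop.hbase.client.MetaScanner.MetaScannerVisitor;
    import org.apache.hadoop.hbase.client.Result;

    public class MetaScanSketch {
      public static void main(String[] args) throws IOException {
        // Trunk uses HBaseConfiguration.create(); 0.20.x has no such
        // factory method, so construct the configuration directly:
        HBaseConfiguration conf = new HBaseConfiguration();

        // The 0.20.x metaScan() takes the HBaseConfiguration built above
        // rather than a plain Configuration:
        MetaScanner.metaScan(conf, new MetaScannerVisitor() {
          public boolean processRow(Result rowResult) throws IOException {
            return true; // keep scanning .META.
          }
        });
      }
    }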

-----Original Message-----
From: Ted Yu [mailto:yuzhihong@gmail.com] 
Sent: Wednesday, September 08, 2010 10:11 PM
To: user@hbase.apache.org
Subject: Re: HBase table lost on upgrade

You can copy HBaseFsck.java from trunk and compile it in 0.20.6.
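
At the time of writing, trunk is browsable via Apache svn, so something like the following should fetch the file (the URL is an assumption pieced together from the path quoted below):

    svn export http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseFsck.java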

On Wed, Sep 8, 2010 at 3:43 PM, Sharma, Avani <ag...@ebay.com> wrote:

> Right.
>
> Anyway, where can I get this file from ? Any pointers?
> I can't find it at
> src/main/java/org/apache/hadoop/hbase/client/HBaseFsck.java in 0.20.6.
>
> -----Original Message-----
> From: Ted Yu [mailto:yuzhihong@gmail.com]
> Sent: Wednesday, September 08, 2010 3:09 PM
> To: user@hbase.apache.org
> Subject: Re: HBase table lost on upgrade
>
> master.jsp shows tables, not regions.
> I personally haven't encountered the problem you're facing.
>
> On Wed, Sep 8, 2010 at 2:36 PM, Sharma, Avani <ag...@ebay.com> wrote:
>
> > Ted,
> > I did look at that thread. It seems I need to modify the code in that
> > file?
> > Could you point me to the exact steps to get it and compile it?
> >
> > Did you get through the issue of regions being added to the catalog, but
> > not showing up in master.jsp?
> >
> >
> >
> >
> > On Sep 4, 2010, at 9:24 PM, Ted Yu <yu...@gmail.com> wrote:
> >
> > > The tool Stack mentioned is hbck. If you want to port it to 0.20, see the
> > > email thread entitled "compiling HBaseFsck.java for 0.20.5". You should
> > > try reducing the number of tables in your system, possibly through
> > > HBASE-2473
> > >
> > > Cheers
> > >
> > > On Thu, Sep 2, 2010 at 11:41 AM, Sharma, Avani <ag...@ebay.com>
> > wrote:
> > >
> > >>
> > >>
> > >>
> > >> -----Original Message-----
> > >> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
> > Stack
> > >> Sent: Wednesday, September 01, 2010 10:45 PM
> > >> To: user@hbase.apache.org
> > >> Subject: Re: HBase table lost on upgrade
> > >>
> > >> On Wed, Sep 1, 2010 at 5:49 PM, Sharma, Avani <ag...@ebay.com>
> > wrote:
> > >>> That email was just informational. Below are the details on my
> > >>> cluster - let me know if more is needed.
> > >>>
> > >>> I have 2 hbase clusters set up:
> > >>> -       for production, a 6-node cluster, 32G RAM, 8 processors
> > >>> -       for dev, a 3-node cluster, 16G RAM, 4 processors
> > >>>
> > >>> 1. I installed hadoop 0.20.2 and hbase 0.20.3 on both these clusters,
> > >> successfully.
> > >>
> > >> Why not latest stable version, 0.20.6?
> > >>
> > >> This was a couple of months ago.
> > >>
> > >>
> > >>> 2. After that I loaded 2G+ of files into HDFS and an HBASE table.
> > >>
> > >>
> > >> What's this mean?  Each of the .5M cells was 2G in size, or the total
> > >> size was 2G?
> > >>
> > >> The total file size is 2G. Cells are on the order of hundreds of
> > >> bytes.
> > >>
> > >>
> > >>>       An example Hbase table looks like this:
> > >>>               {NAME => 'TABLE', FAMILIES => [{NAME => 'data',
> > >>>                VERSIONS => '100', COMPRESSION => 'NONE',
> > >>>                TTL => '2147483647', BLOCKSIZE => '65536',
> > >>>                IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
> > >>
> > >> That looks fine.
> > >>
> > >>> 3. I started stargate on one server and accessed Hbase for reading
> > >> from another 3rd party application successfully.
> > >>>       It took 600 seconds on dev cluster and 250 on production to
> > >> read .5M records from Hbase via stargate.
> > >>
> > >>
> > >> That doesn't sound so good.
> > >>
> > >>
> > >>
> > >>> 4. Later, to boost read performance, it was suggested that upgrading
> > >> to Hbase 0.20.6 would be helpful. I did that on production (w/o running
> > >> the migrate script) and re-started stargate, and everything was running
> > >> fine, though I did not see a bump in performance.
> > >>>
> > >>> 5. Eventually, I had to move to the dev cluster from production because
> > >> of some resource issues at our end. The dev cluster had 0.20.3 at this
> > >> time. As I started loading more files into Hbase (<10 versions of <1G
> > >> files) and converting my app to use hbase more heavily (via more stargate
> > >> clients), the performance started degrading. I decided it was time to
> > >> upgrade the dev cluster as well, to 0.20.6. (I did not run the migrate
> > >> script here either; I missed this step in the doc.)
> > >>>
> > >>
> > >> What kinda perf you looking for from REST?
> > >>
> > >> Do you have to use REST?  All is base64'd so it's safe to transport.
> > >>
> > >> I also have the Java API code (for testing purposes) and that gave
> > >> similar performance results (520 seconds on dev and 250 on the production
> > >> cluster). Is there a way to flush the cache before we run the next
> > >> experiment? I suspect that the first lookup always takes longer and then
> > >> the later ones perform better.
> > >>
> > >> I need something that can integrate with C++ - libcurl and stargate were
> > >> the easiest to start with. I could look at thrift or anything else the
> > >> Hbase gurus think might be a better fit performance-wise.
> > >>
> > >>
> > >>> 6. When Hbase 0.20.6 came back up on the dev cluster (with increased
> > >> block cache (.6) and region server handler count (75)), pointing to the
> > >> same rootdir, I noticed that some tables were missing. I could see a
> > >> mention of them in the logs, but not when I did 'list' in the shell. I
> > >> recovered those tables using the add_table.rb script.
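
For reference, the usual invocation of that script in the 0.20 era went something like this (the table path is a placeholder for the actual HDFS table directory):

    ${HBASE_HOME}/bin/hbase org.jruby.Main add_table.rb /hbase/TABLE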
> > >>
> > >>
> > >> How did you shutdown this cluster?  Did you reboot machines?  Was your
> > >> hdfs homed on /tmp?  What is going on on your systems?  Are they
> > >> swapping?  Did you give HBase more than its default memory?  You read
> > >> the requirements and made sure ulimit and xceivers had been upped on
> > >> these machines?
> > >>
> > >>
> > >> Did not reboot machines. Neither hdfs nor hbase stores data/logs in
> > >> /tmp. They are not swapping.
> > >> Hbase heap size is 2G.  I have upped the xcievers now on your
> > >> recommendation.  Do I need to restart hdfs after making this change in
> > >> hdfs-site.xml ?
> > >> ulimit -n
> > >> 2048
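
dfs.datanode.max.xcievers is read by the datanodes at startup, so yes, the datanodes need a restart to pick up the change. A typical hdfs-site.xml entry from that era (the 4096 value is a common suggestion from the HBase requirements docs, not a number from this thread):

    <property>
      <!-- Upper bound on concurrent transceiver threads per DataNode;
           HBase needs this raised well above the Hadoop default. -->
      <name>dfs.datanode.max.xcievers</name>
      <value>4096</value>
    </property>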
> > >>
> > >>
> > >>
> > >>>       a. Is there a way to check the health of all Hbase tables in
> > >> the cluster after an upgrade, or even periodically, to make sure that
> > >> everything is healthy?
> > >>>       b. I would like to be able to force this error again and check
> > >> the health of hbase, and I want it to report to me that some tables were
> > >> lost. Currently, I just found out because I had very little data and it
> > >> was easy to tell.
> > >>>
> > >>
> > >> In trunk there is such a tool.  In 0.20.x, run a count against your
> > >> table.  See the hbase shell.  Type help to see how.
> > >>
> > >>
> > >> What tool are you talking about here? It wasn't clear. Count against
> > >> which table? I want hbase to check all tables, and I don't know how many
> > >> tables I have since there are too many - is that possible?
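
In the shell, 'list' prints every table and 'count' scans one table, so a crude whole-cluster check is to run a count for each name that list prints, e.g.:

    hbase(main):001:0> list
    hbase(main):002:0> count 'TABLE'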
> > >>
> > >>> 7. Here are the issues I face after this upgrade
> > >>>       a. when I run stop-hbase.sh, it does not stop my regionservers
> > >> on other boxes.
> > >>
> > >> Why not?  What's going on on those machines?  If you tail the logs on
> > >> the hosts that won't go down and/or on master, what do they say?
> > >> Tail the logs.  Should give you (us) a clue.
> > >>
> > >> They do go down, with some errors in the log, but don't report it on
> > >> the terminal.
> > >> http://pastebin.com/0hYwaffL  regionserver log
> > >>
> > >>
> > >>
> > >>>       b. It does start them using start-hbase.sh.
> > >>>       c. Is it that stopping regionservers is not reported, but it
> > >> does stop them (I see that happening on the production cluster)?
> > >>>
> > >>
> > >>
> > >>
> > >>> 8. I started stargate in the upgraded 0.20.6 on the dev cluster.
> > >>>       a. Earlier, when I sent a URL to look for a data row that did
> > >> not exist, the return value was NULL; now I get an XML response stating
> > >> HTTP error 404/405.  Everything works as expected for an existing data
> > >> row.
> > >>
> > >> The latter sounds RESTy.  What would you expect of it?  The null?
> > >>
> > >>
> > >> Yes, it should send NULL like it does on the production server. Is
> > >> there anyone else you can point to who has used REST? This is the main
> > >> showstopper for me currently.
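
One quick way to see what the server actually returns is to request a row directly and look at the status line (host, port, table, and row below are placeholders):

    curl -i -H "Accept: text/xml" http://stargate-host:8080/TABLE/some-row

A libcurl client can then treat a 404 status on this request as "row not found" rather than as a failure, which restores the old NULL-style behavior without any server change.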
> > >>
> > >>
> > >>
> >
>

Re: HBase table lost on upgrade - compiling HBaseFsck.java

Posted by Ted Yu <yu...@gmail.com>.
Here is the file from my earlier post which compiles in 0.20.5.


RE: HBase table lost on upgrade - compiling HBaseFsck.java

Posted by "Sharma, Avani" <ag...@ebay.com>.
Hello Ted,

Here are the actual errors I get -

HBaseFsck.java:46: cannot find symbol
symbol  : class ZooKeeperConnectionException
location: package org.apache.hadoop.hbase
import org.apache.hadoop.hbase.ZooKeeperConnectionException;
                              ^
HBaseFsck.java:81: cannot find symbol
symbol  : class ZooKeeperConnectionException
location: class org.apache.hadoop.hbase.client.HBaseFsck
    throws MasterNotRunningException, ZooKeeperConnectionException, IOException {
                                      ^
HBaseFsck.java:82: cannot find symbol
symbol  : constructor HBaseAdmin(org.apache.hadoop.conf.Configuration)
location: class org.apache.hadoop.hbase.client.HBaseAdmin
    super(conf);
    ^
HBaseFsck.java:267: cannot find symbol
symbol  : method getOnlineRegions()
location: interface org.apache.hadoop.hbase.ipc.HRegionInterface
        NavigableSet<HRegionInfo> regions = server.getOnlineRegions();
                                                  ^
HBaseFsck.java:440: cannot find symbol
symbol  : method metaScan(org.apache.hadoop.conf.Configuration,org.apache.hadoop.hbase.client.MetaScanner.MetaScannerVisitor)
location: class org.apache.hadoop.hbase.client.MetaScanner
      MetaScanner.metaScan(conf, visitor);
                 ^
HBaseFsck.java:496: cannot find symbol
symbol  : method create()
location: class org.apache.hadoop.hbase.HBaseConfiguration
    Configuration conf = HBaseConfiguration.create();
                                           ^
6 errors
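
Most of these are trunk-vs-0.20 API gaps rather than classpath problems: ZooKeeperConnectionException does not exist before 0.90-era trunk, the 0.20 HBaseAdmin constructor takes an HBaseConfiguration rather than a plain Configuration, and HBaseConfiguration.create() is trunk-only. A hedged sketch of the kind of source edits involved (0.20-era names, worth checking against the 0.20 javadoc; the getOnlineRegions() and metaScan() calls need 0.20 equivalents as well and are not shown):

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.MasterNotRunningException;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class HBaseFsck extends HBaseAdmin {
      // Trunk also declares ZooKeeperConnectionException; that class only
      // arrived after 0.20.x, so the throws clause shrinks to these two.
      public HBaseFsck(Configuration conf)
          throws MasterNotRunningException, IOException {
        // 0.20.x HBaseAdmin has no Configuration constructor; wrap the
        // plain Configuration in an HBaseConfiguration instead.
        super(new HBaseConfiguration(conf));
      }

      public static void main(String[] args) throws Exception {
        // Trunk: Configuration conf = HBaseConfiguration.create();
        // 0.20.x has no create() factory, so use the constructor:
        new HBaseFsck(new HBaseConfiguration());
      }
    }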



Re: HBase table lost on upgrade - compiling HBaseFsck.java

Posted by Ted Yu <yu...@gmail.com>.
The scope of change needed to compile HBaseFsck.java in 0.20.x is bigger than it
used to be.
Here are the errors I got - the last 3 depend on other HBase files.

compile-core:
    [javac] Compiling 2 source files to
/Users/tyu/hbase-0.20.5/build/classes
    [javac]
/Users/tyu/hbase-0.20.5/src/java/org/apache/hadoop/hbase/client/HBaseFsck.java:46:
cannot find symbol
    [javac] symbol  : class ZooKeeperConnectionException
    [javac] location: package org.apache.hadoop.hbase
    [javac] import org.apache.hadoop.hbase.ZooKeeperConnectionException;
    [javac]                               ^
    [javac]
/Users/tyu/hbase-0.20.5/src/java/org/apache/hadoop/hbase/client/HBaseFsck.java:81:
cannot find symbol
    [javac] symbol  : class ZooKeeperConnectionException
    [javac] location: class org.apache.hadoop.hbase.client.HBaseFsck
    [javac]     throws MasterNotRunningException,
ZooKeeperConnectionException, IOException {
    [javac]                                       ^
    [javac]
/Users/tyu/hbase-0.20.5/src/java/org/apache/hadoop/hbase/client/HBaseFsck.java:82:
cannot find symbol
    [javac] symbol  : constructor
HBaseAdmin(org.apache.hadoop.conf.Configuration)
    [javac] location: class org.apache.hadoop.hbase.client.HBaseAdmin
    [javac]     super(conf);
    [javac]     ^
    [javac]
/Users/tyu/hbase-0.20.5/src/java/org/apache/hadoop/hbase/client/HBaseFsck.java:267:
cannot find symbol
    [javac] symbol  : method getOnlineRegions()
    [javac] location: interface org.apache.hadoop.hbase.ipc.HRegionInterface
    [javac]         NavigableSet<HRegionInfo> regions =
server.getOnlineRegions();
    [javac]                                                   ^
    [javac]
/Users/tyu/hbase-0.20.5/src/java/org/apache/hadoop/hbase/client/HBaseFsck.java:440:
cannot find symbol
    [javac] symbol  : method
metaScan(org.apache.hadoop.conf.Configuration,org.apache.hadoop.hbase.client.MetaScanner.MetaScannerVisitor)
    [javac] location: class org.apache.hadoop.hbase.client.MetaScanner
    [javac]       MetaScanner.metaScan(conf, visitor);
    [javac]                  ^


Re: HBase table lost on upgrade - compiling HBaseFsck.java

Posted by Ted Yu <yu...@gmail.com>.
If you show us the errors, that would help me understand your situation
better.
HBaseFsck.java has changed a lot since I last tried to compile it.
