Posted to user@hbase.apache.org by Something Something <ma...@gmail.com> on 2011/02/01 03:27:04 UTC

Re: Tables & rows disappear

1)  Version numbers:

hadoop-0.20.2
hbase-0.20.6


2)  Setting autoFlush to 'true' works, but wouldn't that slow down the
insertion process?

3)  Here's how I had set it up:

In my Mapper's setup method:

        table = new HTable(new HBaseConfiguration(), XYZ_TABLE);
        table.setAutoFlush(false);
        table.setWriteBufferSize(1024 * 1024 * 12);

In my Mapper's cleanup method:

        table.flushCommits();
        table.close();

At the time of writing:

        Put put = new Put(Bytes.toBytes(key));
        put.setWriteToWAL(false);
        put.add(Bytes.toBytes("info"), Bytes.toBytes("code"),
                Bytes.toBytes(code));

...and so on. And at the end:

        table.put(put);


Is this not the right way to do it?  Please let me know.  Thanks for the
help.
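
For completeness, here is the whole flow in one piece. This is a
simplified sketch of my real class, not the exact code: the XyzMapper
name, the Text input, and how key/code are derived from the input are
stand-ins.

    // Sketch of the full Mapper implied by the snippets above
    // (HBase 0.20.x / Hadoop 0.20.x APIs). XyzMapper, the Text input,
    // and deriving key/code from the map value are illustrative
    // assumptions.
    import java.io.IOException;

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class XyzMapper
            extends Mapper<LongWritable, Text, NullWritable, NullWritable> {

        private static final String XYZ_TABLE = "xyz";  // placeholder name
        private HTable table;

        @Override
        protected void setup(Context context) throws IOException {
            table = new HTable(new HBaseConfiguration(), XYZ_TABLE);
            table.setAutoFlush(false);                  // buffer puts client-side
            table.setWriteBufferSize(1024 * 1024 * 12); // 12 MB write buffer
        }

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException {
            String key = line.toString();   // placeholder: derive key from input
            String code = line.toString();  // placeholder: derive code from input
            Put put = new Put(Bytes.toBytes(key));
            put.setWriteToWAL(false);       // skips the write-ahead log
            put.add(Bytes.toBytes("info"), Bytes.toBytes("code"),
                    Bytes.toBytes(code));
            table.put(put);                 // buffered until the buffer fills
        }

        @Override
        protected void cleanup(Context context) throws IOException {
            table.flushCommits();           // push any still-buffered puts
            table.close();
        }
    }

With autoFlush off, puts accumulate on the client, so the flushCommits()
in cleanup is what actually ships the tail of the batch; if it is skipped
(or a task dies first), the buffered rows are silently dropped.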


On Sun, Jan 30, 2011 at 3:03 PM, Stack <st...@duboce.net> wrote:

> What version of hbase+hadoop?
> St.Ack
>
> On Fri, Jan 28, 2011 at 8:37 PM, Something Something
> <ma...@gmail.com> wrote:
> > Apologies for my dumbness.  I know it's some property that I am not
> > setting correctly.  But every time I stop & start HBase & Hadoop I
> > either lose all my tables or lose rows from tables in HBase.
> >
> > Here's what various files contain:
> >
> > *core-site.xml*
> > <configuration>
> >  <property>
> >    <name>fs.default.name</name>
> >    <value>hdfs://localhost:9000</value>
> >  </property>
> >  <property>
> >    <name>hadoop.tmp.dir</name>
> >    <value>/usr/xxx/hdfs</value>
> >  </property>
> > </configuration>
> >
> > *hdfs-site.xml*
> > <configuration>
> >  <property>
> >    <name>dfs.replication</name>
> >    <value>1</value>
> >  </property>
> >  <property>
> >    <name>dfs.name.dir</name>
> >    <value>/usr/xxx/hdfs/name</value>
> >  </property>
> >
> >  <property>
> >    <name>dfs.data.dir</name>
> >    <value>/usr/xxx/hdfs/data</value>
> >  </property>
> > </configuration>
> >
> > *mapred-site.xml*
> > <configuration>
> >  <property>
> >    <name>mapred.job.tracker</name>
> >    <value>localhost:9001</value>
> >  </property>
> > </configuration>
> >
> > *hbase-site.xml*
> > <configuration>
> >  <property>
> >    <name>hbase.rootdir</name>
> >    <value>hdfs://localhost:9000/hbase</value>
> >  </property>
> >  <property>
> >    <name>hbase.tmp.dir</name>
> >    <value>/usr/xxx/hdfs/hbase</value>
> >  </property>
> > </configuration>
> >
> >
> > What am I missing?  Please help.  Thanks.
> >
>

Re: Tables & rows disappear

Posted by Ryan Rawson <ry...@gmail.com>.
I'm guessing that you aren't having as clean a shutdown as you might
think if you are seeing tables disappear.  Here is a quick way to
tell: if you think table 'x' should exist, but it doesn't seem to, do
this:


bin/hadoop fs -ls /hbase/x

If that directory exists, I think you might be running into the hadoop
data loss bug.  This is a known problem with earlier versions, where
Hadoop wouldn't allow HBase to read the write-ahead log fully, so you
would end up with missing data; in this case the missing data is the
.META. entries telling HBase about that table.

The good news is that newer versions fix this: the Hadoop 0.20-append
branch (or CDH3b2+) together with HBase 0.90.0 solves this issue.

If you are unable to upgrade, you will need to be careful about how you
shut down your cluster.  bin/stop-hbase.sh kicks off an orderly
shutdown, and it can take a long time (it is flushing data to disk).
Doing a kill -9 on regionservers will dump what's in memory and leave
you with data loss, so don't do that.
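
One more thing about the code earlier in this thread:
put.setWriteToWAL(false) means those edits are never written to the log
at all, so a kill -9 (or crash) of a regionserver loses whatever hadn't
been flushed, on any version.  Roughly, the durable variant of that
write path looks like this (same placeholder names as before):

    // Durable variant of the earlier write path: leave the WAL on
    // (the default) instead of calling put.setWriteToWAL(false).
    Put put = new Put(Bytes.toBytes(key));
    // put.setWriteToWAL(false);  // omitted: edits are logged, and are
    //                            // recoverable on the fixed versions above
    put.add(Bytes.toBytes("info"), Bytes.toBytes("code"), Bytes.toBytes(code));
    table.put(put);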




Re: Tables & rows disappear

Posted by Something Something <ma...@gmail.com>.
Stack - Any thoughts on this?
