Posted to user@hbase.apache.org by Something Something <ma...@gmail.com> on 2011/01/29 02:37:21 UTC

Tables & rows disappear

Apologies for my dumbness.  I know it's some property that I am not setting
correctly.  But every time I stop & start HBase & Hadoop, I either lose all
my tables or lose rows from tables in HBase.

Here's what various files contain:

*core-site.xml*
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/xxx/hdfs</value>
  </property>
</configuration>

*hdfs-site.xml*
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/xxx/hdfs/name</value>
  </property>

  <property>
    <name>dfs.data.dir</name>
    <value>/usr/xxx/hdfs/data</value>
  </property>
</configuration>

*mapred-site.xml*
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>

*hbase-site.xml*
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <property>
    <name>hbase.tmp.dir</name>
    <value>/usr/xxx/hdfs/hbase</value>
  </property>
</configuration>


What am I missing?  Please help.  Thanks.

Re: Tables & rows disappear

Posted by Dani Rayan <da...@gmail.com>.
Hey,

This can happen in a couple of scenarios:

1. The "writeBuffer" is quite large and the writes are too small for an
automatic flush to be triggered [the default writeBuffer size is 2 MB].
2. You have set "autoFlush" to false and never call flushCommits.

If you haven't configured these properties in hbase-site.xml, an immediate
solution is:
> table.setAutoFlush(true);  // for your table in the code
> put.setWriteToWAL(true);   // this is for more reliability
This should persist every single write to the table, but you will need to
fine-tune these properties for your reliability and performance requirements.

-Thanks,
Dani
http://www.cc.gatech.edu/~iar3/
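The failure mode in scenario 2 is easy to see with a toy model of the
client-side write buffer.  This is not the real HTable API — just a minimal
sketch of the autoFlush/flushCommits contract, with a made-up ToyTable class
for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for HTable's client-side write buffer; not the real HBase API.
public class Main {
    static class ToyTable {
        private final List<String> buffer = new ArrayList<>(); // client memory only
        private final List<String> server = new ArrayList<>(); // what reached the cluster
        private final boolean autoFlush;

        ToyTable(boolean autoFlush) { this.autoFlush = autoFlush; }

        void put(String row) {
            buffer.add(row);
            if (autoFlush) flushCommits(); // autoFlush=true ships every put immediately
        }

        void flushCommits() {              // ships buffered puts to the server
            server.addAll(buffer);
            buffer.clear();
        }

        int persisted() { return server.size(); }
    }

    public static void main(String[] args) {
        ToyTable buffered = new ToyTable(false); // autoFlush off, as in the mapper code
        for (int i = 0; i < 10; i++) buffered.put("row-" + i);
        // If the process dies here, before flushCommits runs, all 10 rows
        // are lost: they never left client memory.
        System.out.println("no flush: " + buffered.persisted());

        ToyTable safe = new ToyTable(true);      // autoFlush on, Dani's suggestion
        for (int i = 0; i < 10; i++) safe.put("row-" + i);
        System.out.println("autoFlush: " + safe.persisted());
    }
}
```

The trade-off is exactly the one discussed in this thread: the buffered mode
is faster because puts travel in batches, but anything still in the buffer
when the process dies is simply gone.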


Re: Tables & rows disappear

Posted by Ryan Rawson <ry...@gmail.com>.
I'm guessing that you aren't having as clean a shutdown as you might
think if you are seeing tables disappear.  Here is a quick way to
tell: if you think table 'x' should exist but it doesn't seem to, do
this:


bin/hadoop fs -ls /hbase/x

If that directory exists, I think you might be running into the hadoop
data-loss bug.  This is a known problem with earlier versions of HBase,
where hadoop wouldn't allow HBase to read the write-ahead log fully,
and you would end up with missing data; in this case the missing data
is the META entries telling hbase about that table.

The good news is that newer versions of HBase fix this.  The Hadoop 0.20
append branch or CDH3b2+, together with HBase 0.90.0, solve this issue.

If you are unable to upgrade, you will need to be careful about how you
shut down your cluster.  Running bin/stop-hbase.sh kicks off the shutdown
process, and it can take a long time (it is flushing data to disk).
Doing a kill -9 on regionservers will dump what's in memory and
leave you with data loss, so don't do that.
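If you do move to an append-capable Hadoop, the durable-log behaviour
generally has to be switched on explicitly.  A sketch of the hdfs-site.xml
fragment used in the 0.20-append/CDH3 era — verify the property name against
your distribution's documentation before relying on it:

```xml
<!-- Sketch only: enables the append/sync support that HBase's
     write-ahead log depends on. -->
<property>
  <name>dfs.support.append</name>
  <value>true</value>
</property>
```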

Re: Tables & rows disappear

Posted by Something Something <ma...@gmail.com>.
Stack - Any thoughts on this?

Re: Tables & rows disappear

Posted by Something Something <ma...@gmail.com>.
1)  Version numbers:

hadoop-0.20.2
hbase-0.20.6


2)  Setting autoFlush to 'true' works, but wouldn't that slow down the
insertion process?

3)  Here's how I had set it up:

In my Mapper's setup method:

    table = new HTable(new HBaseConfiguration(), XYZ_TABLE);
    table.setAutoFlush(false);
    table.setWriteBufferSize(1024 * 1024 * 12);

In my Mapper's cleanup method:

    table.flushCommits();
    table.close();

At the time of writing:

    Put put = new Put(Bytes.toBytes(key));
    put.setWriteToWAL(false);
    put.add(Bytes.toBytes("info"), Bytes.toBytes("code"), Bytes.toBytes(code));
    // ... and so on ... and at the end:
    table.put(put);


Is this not the right way to do it?  Please let me know.  Thanks for the
help.


Re: Tables & rows disappear

Posted by Stack <st...@duboce.net>.
What version of hbase+hadoop?
St.Ack
