Posted to user@hbase.apache.org by Jahangir Mohammed <md...@gmail.com> on 2012/01/12 05:13:55 UTC

HBase Export

Hello,

I have a couple of questions about the HBase "export" facility:
1. I was looking at the Export code:
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/Export.html
I don't see "setCaching" called in the code. Wouldn't that have sped up
the HBase export? Is this intentional, or is the user expected to change
this setting in hbase-site.xml according to their memory constraints?
Am I missing anything here?

2. While running HBase export to back up data, I am also getting scanner
timeout exceptions. I can increase hbase.regionserver.lease.period, but I am
not sure whether that's a good idea.

Would be nice to hear some opinions. Any help greatly appreciated.

Thanks,
Jahangir.

Re: HBase in Hadoop-1.0.0

Posted by Harsh J <ha...@cloudera.com>.
You are right in your understanding. Apache HBase is not bundled with the Apache Hadoop release; the article more likely meant that Hadoop now supports HBase better.

Also, Apache Hadoop 1.0.0, being a rename of the 0.20.205 series, has the append/sync APIs from branch-0.20-append, not hflush/hsync, which are newer APIs available only in 0.22 onwards.

On 12-Jan-2012, at 3:06 PM, Graeme Seaton wrote:

> Hi,
> 
> I have seen in the media coverage of the Hadoop-1.0.0 release that "HBase is officially part of the 1.0 release".  My understanding from the release notes is that it includes changes (append/hsync/hflush, and security) to support HBase more effectively, but that HBase is still bundled separately.
> 
> Probably being slightly OCD, but which is correct?
> 
> Regards,
> 
> Graeme


HBase in Hadoop-1.0.0

Posted by Graeme Seaton <li...@graemes.com>.
Hi,

I have seen in the media coverage of the Hadoop-1.0.0 release that
"HBase is officially part of the 1.0 release".  My understanding from
the release notes is that it includes changes (append/hsync/hflush, and
security) to support HBase more effectively, but that HBase is still
bundled separately.

Probably being slightly OCD, but which is correct?

Regards,

Graeme

Re: HBase Export

Posted by Doug Meil <do...@explorysmedical.com>.
No problem!  I'll add it to the book because this applies to the CopyTable
utility as well, and you're not the first to ask this question.





On 1/12/12 12:36 PM, "Jahangir Mohammed" <md...@gmail.com> wrote:

>Thanks Doug, missed it.
>
>Thanks,
>Jahangir.
>
>On Thu, Jan 12, 2012 at 10:34 AM, Doug Meil
><do...@explorysmedical.com>wrote:
>
>>
>> setCaching is set in TableInputFormat, and it relies on the following
>> property set in the jobconf...
>>
>>   job.getConfiguration().setInt("hbase.client.scanner.caching", batchSize);
>>



Re: HBase Export

Posted by Jahangir Mohammed <md...@gmail.com>.
Thanks Doug, missed it.

Thanks,
Jahangir.

On Thu, Jan 12, 2012 at 10:34 AM, Doug Meil
<do...@explorysmedical.com>wrote:

>
> setCaching is set in TableInputFormat, and it relies on the following
> property set in the jobconf...
>
>   job.getConfiguration().setInt("hbase.client.scanner.caching", batchSize);

Re: HBase Export

Posted by Doug Meil <do...@explorysmedical.com>.
setCaching is set in TableInputFormat, and it relies on the following
property set in the jobconf...

   job.getConfiguration().setInt("hbase.client.scanner.caching", batchSize);
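
To give a rough sense of why this property matters for export speed and memory, here is a small back-of-the-envelope sketch. It is illustrative arithmetic only and does not call HBase. It assumes that each scanner call returns up to "caching" rows, and the class name, row count, and average row size are hypothetical values chosen for the example.

```java
// Illustrative arithmetic only: this class does not talk to HBase.
// Assumption: each scanner call returns up to `caching` rows.
final class ScannerCachingEstimate {

    // Approximate number of scanner calls needed to read `rows` rows
    // (ceiling division; ignores any final call that detects end of scan).
    static long roundTrips(long rows, int caching) {
        if (rows < 0 || caching <= 0) {
            throw new IllegalArgumentException("rows must be >= 0 and caching > 0");
        }
        return (rows + caching - 1) / caching;
    }

    // Approximate bytes held on the client for one batch of rows.
    static long bytesPerBatch(int caching, long avgRowBytes) {
        return caching * avgRowBytes;
    }

    public static void main(String[] args) {
        long rows = 10_000_000L;   // hypothetical table size
        long avgRowBytes = 1_024L; // hypothetical average row size
        for (int caching : new int[] {1, 100, 1000}) {
            System.out.printf("caching=%d roundTrips=%d bytesPerBatch=%d%n",
                    caching, roundTrips(rows, caching),
                    bytesPerBatch(caching, avgRowBytes));
        }
    }
}
```

Under these assumptions, a larger value cuts the number of round trips roughly in proportion, at the cost of more memory per batch. It also means the client spends longer working through each batch between scanner calls, which is worth keeping in mind when scanner timeouts show up.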






On 1/11/12 11:13 PM, "Jahangir Mohammed" <md...@gmail.com> wrote:

>Hello,
>
>I have a couple of questions about the HBase "export" facility:
>1. I was looking at the Export code:
>http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/Export.html
>I don't see "setCaching" called in the code. Wouldn't that have sped up
>the HBase export? Is this intentional, or is the user expected to change
>this setting in hbase-site.xml according to their memory constraints?
>Am I missing anything here?
>
>2. While running HBase export to back up data, I am also getting scanner
>timeout exceptions. I can increase hbase.regionserver.lease.period, but I am
>not sure whether that's a good idea.
>
>Would be nice to hear some opinions. Any help greatly appreciated.
>
>Thanks,
>Jahangir.