Posted to dev@spark.apache.org by Ted Yu <yu...@gmail.com> on 2016/02/26 17:55:40 UTC

Re: Hbase in spark

HBase has an hbase-spark module which supports bulk load.
This module is to be backported in the upcoming 1.3.0 release.

There is some pending work, such as HBASE-15271.

FYI

On Fri, Feb 26, 2016 at 8:50 AM, Renu Yadav <yr...@gmail.com> wrote:

> Has anybody implemented bulk load into hbase using spark?
>
> I need help to optimize its performance.
>
> Please help.
>
>
> Thanks & Regards,
> Renu Yadav
>

Re: Hbase in spark

Posted by Ted Malaska <te...@cloudera.com>.
Yes, and I have used HBASE-15271 and successfully loaded over 20 billion
records into HBase, even with node failures.

Re: Hbase in spark

Posted by Ted Yu <yu...@gmail.com>.
I know little about your use case.

Do you mean that your data is relatively evenly distributed on the Spark
side but shows skew in the bulk load phase?
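
One way to check this offline is to count how many row keys fall inside each region's key range, since HFiles for a bulk load have to respect region boundaries. The following is only a rough sketch in plain Python, not code from the hbase-spark module: the function names, the sample split points and the sample row keys are all made up for illustration, and it assumes you can obtain the table's sorted region start keys yourself.

```python
from bisect import bisect_right
from collections import Counter
from typing import Iterable, List, Sequence


def region_index(row_key: bytes, start_keys: Sequence[bytes]) -> int:
    """Return the index of the region whose key range contains row_key.

    Assumes start_keys is sorted and its first entry is b"" (the start
    key of the first region). A region's start key is inclusive.
    """
    return bisect_right(start_keys, row_key) - 1


def records_per_region(row_keys: Iterable[bytes],
                       start_keys: Sequence[bytes]) -> List[int]:
    """Count how many row keys land in each region."""
    counts = Counter(region_index(k, start_keys) for k in row_keys)
    return [counts.get(i, 0) for i in range(len(start_keys))]


def skew_ratio(counts: Sequence[int]) -> float:
    """Largest per-region count divided by the mean count.

    1.0 means perfectly even; larger values mean one region gets
    most of the data. Returns 0.0 when there are no records.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    return max(counts) / (total / len(counts))


if __name__ == "__main__":
    # Hypothetical split points and row keys, for illustration only.
    start_keys = [b"", b"g", b"n", b"t"]
    row_keys = [b"user%05d" % i for i in range(1000)] + [b"alpha", b"zeta"]

    counts = records_per_region(row_keys, start_keys)
    print("records per region:", counts)
    print("skew ratio:", skew_ratio(counts))
```

If the ratio is well above 1 even though the Spark partitions looked balanced, the row keys are probably clustering in a few regions, and the table's split points or the row key design would be the first things to look at.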

On Fri, Feb 26, 2016 at 9:02 AM, Renu Yadav <yr...@gmail.com> wrote:

> Hi Ted,
>
> Thanks for the reply. I am already using the hbase-spark module, but the
> problem is that the bulk load shows data skew and takes a long time to
> create the HFiles.