Posted to user@phoenix.apache.org by Adi Meller <ad...@gmail.com> on 2017/03/20 04:55:21 UTC

Csvbulkloadtool

Hello.
I need to move 5-6 large tables (about 2 TB each) from Hive to Phoenix
every day.

I have CDH 5.7 and installed Phoenix 4.7 through a parcel.
I have 4 region servers, each with 94 GB of physical memory and 32 cores.

1. I created CSV files from Hive (by running a CREATE TABLE statement),
created a table with 16 regions through Phoenix, and then bulk loaded the
files using CsvBulkLoadTool. It took 1 day to load 1 TB of data.
Are there any recommendations for making the bulk load faster? How can
I find out what my bottleneck is?

2. What is the best method to load data from Hive tables into Phoenix?

3. I read that Hive-Phoenix integration is included in Phoenix 4.8, but I
cannot find a parcel for CDH other than Phoenix 4.7. Are there any plans to
create parcels of 4.8 and higher for Cloudera?

Thanks in advance,
Adi.

Re: Csvbulkloadtool

Posted by Josh Elser <el...@apache.org>.
On Mon, Mar 20, 2017 at 12:55 AM, Adi Meller <ad...@gmail.com> wrote:
> Hello.
> I need to move 5-6 large tables (about 2 TB each) from Hive to Phoenix
> every day.
>
> I have CDH 5.7 and installed Phoenix 4.7 through a parcel.
> I have 4 region servers, each with 94 GB of physical memory and 32 cores.
>
> 1. I created CSV files from Hive (by running a CREATE TABLE statement),
> created a table with 16 regions through Phoenix, and then bulk loaded the
> files using CsvBulkLoadTool. It took 1 day to load 1 TB of data.
> Are there any recommendations for making the bulk load faster? How can I
> find out what my bottleneck is?

No, we can't tell you why it is slow, because we're not wizards :) Tell
us more about what takes so long. Is it the mappers? The reducers? How
many of each do you have? Share the mapper/reducer logs.
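
For a sense of scale, the figures quoted above can be turned into a rough
throughput number. This is only a back-of-the-envelope sketch: it assumes
"tera" means 10**12 bytes and "1 day" means 24 hours, and the variable and
function names are made up for illustration. It does not identify the
bottleneck; it only shows how far the observed rate is from the daily target.

```python
# Rough throughput arithmetic from the figures in this thread.
# Assumptions (not stated in the thread): 1 TB = 10**12 bytes, 1 day = 24 h.
TB = 10**12
SECONDS_PER_DAY = 24 * 60 * 60
REGION_SERVERS = 4  # from the cluster description above


def throughput_mb_per_s(bytes_loaded, seconds):
    """Average load rate in MB/s (1 MB = 10**6 bytes)."""
    return bytes_loaded / seconds / 10**6


# Observed: 1 TB loaded in 1 day.
observed = throughput_mb_per_s(1 * TB, SECONDS_PER_DAY)

# Daily target: 5-6 tables of about 2 TB each, all within 24 hours.
required_low = throughput_mb_per_s(5 * 2 * TB, SECONDS_PER_DAY)
required_high = throughput_mb_per_s(6 * 2 * TB, SECONDS_PER_DAY)

print(f"observed: {observed:.1f} MB/s total, "
      f"{observed / REGION_SERVERS:.1f} MB/s per region server")
print(f"required: {required_low:.0f}-{required_high:.0f} MB/s total")
print(f"speedup needed: {required_low / observed:.0f}x "
      f"to {required_high / observed:.0f}x")
```

Under these assumptions the observed rate is about 11.6 MB/s across the
whole cluster, and the daily target needs roughly 10x to 12x that. The
mapper and reducer counts and logs are still needed to see where the time
actually goes.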

> 2. What is the best method to load data from Hive tables into Phoenix?

Given your current version constraint, CsvBulkLoadTool is probably your
best option.

> 3. I read that Hive-Phoenix integration is included in Phoenix 4.8, but I
> cannot find a parcel for CDH other than Phoenix 4.7. Are there any plans to
> create parcels of 4.8 and higher for Cloudera?

These types of questions are usually better asked on the vendor's forums.

> Thanks in advance,
> Adi.

- Josh