Posted to user@phoenix.apache.org by Anil <an...@gmail.com> on 2017/12/28 10:33:38 UTC

Phoenix MapReduce

Hi Team,

I was looking at PhoenixOutputFormat and PhoenixRecordWriter, and I
could not see where the connection's autocommit is set to false. Did I
miss something here?
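
For context, the pattern I would expect is something like the following
minimal sketch, where autocommit is off and the connection commits once
per batch. The JDBC URL, table, columns, rows, and batch size are all
placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.Arrays;
import java.util.List;

// Minimal sketch of the batched-commit pattern: autocommit off,
// upserts buffered client-side, one commit per batch. The JDBC URL,
// table, columns, and rows below are placeholders.
public class BatchedUpsertSketch {
    public static void main(String[] args) throws Exception {
        final int batchSize = 1000; // placeholder batch size
        List<String[]> rows = Arrays.asList(
                new String[] {"1", "a"}, new String[] {"2", "b"});
        try (Connection conn =
                 DriverManager.getConnection("jdbc:phoenix:zk-host:2181")) {
            conn.setAutoCommit(false); // mutations buffer until commit()
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPSERT INTO TARGET_TABLE (ID, VAL) VALUES (?, ?)")) {
                long count = 0;
                for (String[] row : rows) {
                    ps.setString(1, row[0]);
                    ps.setString(2, row[1]);
                    ps.executeUpdate();
                    if (++count % batchSize == 0) {
                        conn.commit(); // flush one batch to the server
                    }
                }
            }
            conn.commit(); // flush any remaining mutations
        }
    }
}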

Is there any way to read from a Phoenix table and create HFiles for
bulk import, instead of committing every record (or batch)?
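
For the read side, my understanding is that the Phoenix MapReduce
integration can feed mappers directly from a query. A rough sketch of
the input configuration only, where SOURCE_TABLE, its columns, and
SourceWritable are placeholders for the real schema:

import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;
import org.apache.phoenix.mapreduce.util.PhoenixMapReduceUtil;

// Sketch of the input side only: SOURCE_TABLE, its columns, and
// SourceWritable are placeholders.
public class PhoenixReadSketch {

    // A DBWritable that maps one query row to fields.
    public static class SourceWritable implements DBWritable {
        String id;
        String val;

        @Override
        public void readFields(ResultSet rs) throws SQLException {
            id = rs.getString("ID");
            val = rs.getString("VAL");
        }

        @Override
        public void write(PreparedStatement ps) throws SQLException {
            ps.setString(1, id);
            ps.setString(2, val);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "phoenix-read-sketch");
        job.setJarByClass(PhoenixReadSketch.class);
        // Mappers then receive SourceWritable values built from the query.
        PhoenixMapReduceUtil.setInput(job, SourceWritable.class,
                "SOURCE_TABLE", "SELECT ID, VAL FROM SOURCE_TABLE");
        // ... configure the mapper and the output side before submitting.
    }
}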

I have written a MapReduce job that creates the datasets for my target
table. Loading that data into the target table is taking a long time,
and I want to cut the load time by avoiding per-statement execution and
frequent commits.

Any help would be appreciated.

Thanks,
Anil

Re: Phoenix MapReduce

Posted by Josh Elser <el...@apache.org>.
Hey Anil,

Check out the MultiHfileOutputFormat class.

You can see how AbstractBulkLoadTool invokes it inside the `submitJob` 
method.
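
For reference, here is the overall shape of such a bulk load, sketched
with HBase's single-table HFileOutputFormat2 rather than Phoenix's
MultiHfileOutputFormat; the mapper, table name, column family, and
paths are all placeholders:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Sketch of an HFile bulk load: write HFiles from MapReduce, then hand
// them to the region servers. TARGET_TABLE, column family "0", and the
// paths are placeholders.
public class HFileBulkLoadSketch {

    // Placeholder mapper: turns "rowkey,value" text lines into KeyValues.
    public static class HFileMapper
            extends Mapper<LongWritable, Text, ImmutableBytesWritable, KeyValue> {
        @Override
        protected void map(LongWritable offset, Text line, Context ctx)
                throws IOException, InterruptedException {
            String[] parts = line.toString().split(",", 2);
            byte[] row = Bytes.toBytes(parts[0]);
            KeyValue kv = new KeyValue(row, Bytes.toBytes("0"),
                    Bytes.toBytes("VAL"), Bytes.toBytes(parts[1]));
            ctx.write(new ImmutableBytesWritable(row), kv);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "hfile-bulk-load-sketch");
        job.setJarByClass(HFileBulkLoadSketch.class);
        job.setMapperClass(HFileMapper.class);
        job.setMapOutputKeyClass(ImmutableBytesWritable.class);
        job.setMapOutputValueClass(KeyValue.class);

        FileInputFormat.addInputPath(job, new Path("/tmp/input"));
        Path hfileDir = new Path("/tmp/hfiles"); // staging dir for HFiles
        FileOutputFormat.setOutputPath(job, hfileDir);

        TableName name = TableName.valueOf("TARGET_TABLE");
        try (Connection hbase = ConnectionFactory.createConnection(conf);
             Table table = hbase.getTable(name);
             RegionLocator locator = hbase.getRegionLocator(name);
             Admin admin = hbase.getAdmin()) {
            // Sets the reducer, total-order partitioner, and output format
            // so each reducer writes HFiles for one region.
            HFileOutputFormat2.configureIncrementalLoad(job, table, locator);

            if (!job.waitForCompletion(true)) {
                System.exit(1);
            }
            // Move the finished HFiles into the table; no per-record commits.
            new LoadIncrementalHFiles(conf).doBulkLoad(hfileDir, admin, table, locator);
        }
    }
}

AbstractBulkLoadTool automates roughly this, but across the data table
and its index tables at once, which is why it wires in the multi-table
output format before submitting the job.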
