Posted to user@phoenix.apache.org by Pariksheet Barapatre <pb...@gmail.com> on 2015/01/11 18:34:42 UTC

Phoenix equivalent HFile

Hello All,

New year greetings..!!!

My question is as follows:

How can I create HFiles for a Phoenix salted table using MapReduce?

As I understand it, we can create HFiles by specifying

HFileOutputFormat.configureIncrementalLoad(job, hTable);

What would be the way to compute the salt and generate a Phoenix-equivalent
rowkey and values?


Cheers,
Pari
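[Editor's note: for readers landing here, the general idea behind Phoenix salting is to prepend a single leading byte to the rowkey, computed as a hash of the key bytes modulo the number of salt buckets. The plain-Java sketch below illustrates that shape only; the hash function shown is an assumption for illustration, not Phoenix's actual one, which lives in org.apache.phoenix.schema.SaltingUtil and must be matched exactly for HFiles to be readable by Phoenix.]

```java
import java.nio.charset.StandardCharsets;

public class SaltSketch {
    // Illustrative String.hashCode-style hash over the key bytes.
    // Phoenix's real hash is in SaltingUtil; this one is an assumption.
    static int hash(byte[] value, int offset, int length) {
        int h = 0;
        for (int i = offset; i < offset + length; i++) {
            h = 31 * h + value[i];
        }
        return h;
    }

    // Prepend a salt byte in [0, nBuckets) derived from the rowkey bytes.
    static byte[] saltRowKey(byte[] rowKey, int nBuckets) {
        byte salt = (byte) Math.abs(hash(rowKey, 0, rowKey.length) % nBuckets);
        byte[] salted = new byte[rowKey.length + 1];
        salted[0] = salt;
        System.arraycopy(rowKey, 0, salted, 1, rowKey.length);
        return salted;
    }

    public static void main(String[] args) {
        byte[] key = "row-0001".getBytes(StandardCharsets.UTF_8);
        byte[] salted = saltRowKey(key, 8);
        System.out.println("salt bucket: " + salted[0]);
        System.out.println("salted key length: " + salted.length);
    }
}
```

As the thread below concludes, reusing Phoenix's own code path (the CSV bulk loader) avoids having to reimplement this by hand.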

Re: Phoenix equivalent HFile

Posted by Gabriel Reid <ga...@gmail.com>.
Yes, this can be used at production scale -- that's the intention of the CSV bulk loader. The inserts and rollbacks that you see are purely in-memory operations used to build up the KeyValues for the HFiles.

- Gabriel
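[Editor's note: the pattern Gabriel describes looks roughly like the sketch below, modelled on CsvToKeyValueMapper. It is not runnable standalone -- it assumes Phoenix and HBase on the classpath, a reachable cluster, and a table named MY_TABLE (hypothetical); verify the API names against your Phoenix version.]

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.Iterator;
import java.util.List;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.util.Pair;
import org.apache.phoenix.util.PhoenixRuntime;

public class UncommittedKeyValueSketch {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
        conn.setAutoCommit(false); // keep mutations client-side only

        PreparedStatement stmt = conn.prepareStatement(
                "UPSERT INTO MY_TABLE (ID, NAME) VALUES (?, ?)");
        stmt.setLong(1, 1L);
        stmt.setString(2, "example");
        stmt.execute(); // builds KeyValues in memory; nothing reaches the server

        // Pull the uncommitted KeyValues out of the connection; these carry the
        // salted rowkey and Phoenix-encoded values, ready to write to HFiles.
        Iterator<Pair<byte[], List<KeyValue>>> it =
                PhoenixRuntime.getUncommittedDataIterator(conn);
        while (it.hasNext()) {
            for (KeyValue kv : it.next().getSecond()) {
                // emit kv via the MapReduce context to HFileOutputFormat here
            }
        }
        conn.rollback(); // discard the in-memory mutations before the next row
        conn.close();
    }
}
```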

> On 11 Jan 2015, at 19:39, Pariksheet Barapatre <pb...@gmail.com> wrote:
> 
> Hi Gabriel,
> 
> This is great. Thanks. Can I use same approach for generating HFile at
> production scale. I am bit worried  because for every row, code tries to
> insert row and then rollbacks. (uncommitted row).
> 
> 
> Many Thanks
> Pari
> 
> 
> 
>> On 11 January 2015 at 23:36, Gabriel Reid <ga...@gmail.com> wrote:
>> 
>> The CSV bulk loader in Phoenix actually does this -- it creates HFiles
>> via MapReduce based on CSV input.
>> 
>> You can take a look at the details of how it works in
>> CsvBulkLoadTool.java [1] and CsvToKeyValueMapper.java [2]. There isn't
>> currently a public API for creating Phoenix-compatible HFiles via
>> MapReduce in Phoenix, but there is a set of utility classes in the
>> org.apache.phoenix.mapreduce package for writing to Phoenix directly
>> as the output of a MapReduce program.
>> 
>> - Gabriel
>> 
>> 
>> 1.
>> https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvBulkLoadTool.java
>> 2.
>> https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvToKeyValueMapper.java
>> 
>> On Sun, Jan 11, 2015 at 6:34 PM, Pariksheet Barapatre
>> <pb...@gmail.com> wrote:
>>> Hello All,
>>> 
>>> New year greetings..!!!
>>> 
>>> My question as follow -
>>> 
>>> How to create Phoenix Salted table  equivalent HFile using MapReduce.
>>> 
>>> As per my understanding we can create HFile by specifying
>>> 
>>> HFileOutputFormat.configureIncrementalLoad(job, hTable);
>>> 
>>> What would be the way to create  salt and generate phoenix equivalent
>> rowkey
>>> and values.
>>> 
>>> 
>>> Cheers,
>>> Pari
> 
> 
> 
> -- 
> Cheers,
> Pari

Re: Phoenix equivalent HFile

Posted by Pariksheet Barapatre <pb...@gmail.com>.
Hi Gabriel,

This is great, thanks. Can I use the same approach for generating HFiles at
production scale? I am a bit worried because, for every row, the code tries
to insert the row and then rolls it back (an uncommitted row).


Many Thanks
Pari



On 11 January 2015 at 23:36, Gabriel Reid <ga...@gmail.com> wrote:

> The CSV bulk loader in Phoenix actually does this -- it creates HFiles
> via MapReduce based on CSV input.
>
> You can take a look at the details of how it works in
> CsvBulkLoadTool.java [1] and CsvToKeyValueMapper.java [2]. There isn't
> currently a public API for creating Phoenix-compatible HFiles via
> MapReduce in Phoenix, but there is a set of utility classes in the
> org.apache.phoenix.mapreduce package for writing to Phoenix directly
> as the output of a MapReduce program.
>
> - Gabriel
>
>
> 1.
> https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvBulkLoadTool.java
> 2.
> https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvToKeyValueMapper.java
>
> On Sun, Jan 11, 2015 at 6:34 PM, Pariksheet Barapatre
> <pb...@gmail.com> wrote:
> > Hello All,
> >
> > New year greetings..!!!
> >
> > My question as follow -
> >
> > How to create Phoenix Salted table  equivalent HFile using MapReduce.
> >
> > As per my understanding we can create HFile by specifying
> >
> > HFileOutputFormat.configureIncrementalLoad(job, hTable);
> >
> > What would be the way to create  salt and generate phoenix equivalent
> rowkey
> > and values.
> >
> >
> > Cheers,
> > Pari
>



-- 
Cheers,
Pari

Re: Phoenix equivalent HFile

Posted by Gabriel Reid <ga...@gmail.com>.
The CSV bulk loader in Phoenix actually does this -- it creates HFiles
via MapReduce based on CSV input.

You can take a look at the details of how it works in
CsvBulkLoadTool.java [1] and CsvToKeyValueMapper.java [2]. There isn't
currently a public API for creating Phoenix-compatible HFiles via
MapReduce in Phoenix, but there is a set of utility classes in the
org.apache.phoenix.mapreduce package for writing to Phoenix directly
as the output of a MapReduce program.

- Gabriel


1. https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvBulkLoadTool.java
2. https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvToKeyValueMapper.java
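[Editor's note: the driver-side wiring that CsvBulkLoadTool performs looks roughly like the sketch below, using the Hadoop and HBase APIs of that era. The driver and mapper class names and the table name are hypothetical placeholders; the key point is that configureIncrementalLoad sets the reducer, partitioner, and output format so that HFiles line up with the region boundaries of the pre-split, salted target table.]

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyPhoenixHFileDriver { // hypothetical driver class
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "phoenix-hfile-load");
        job.setJarByClass(MyPhoenixHFileDriver.class);
        // Mapper emits ImmutableBytesWritable (rowkey) -> KeyValue pairs
        job.setMapperClass(MyCsvToKeyValueMapper.class); // hypothetical mapper
        job.setMapOutputKeyClass(ImmutableBytesWritable.class);
        job.setMapOutputValueClass(KeyValue.class);

        // Aligns HFile generation with the target table's region boundaries.
        HTable hTable = new HTable(conf, "MY_TABLE");
        HFileOutputFormat.configureIncrementalLoad(job, hTable);
        FileOutputFormat.setOutputPath(job, new Path("/tmp/hfiles"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```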

On Sun, Jan 11, 2015 at 6:34 PM, Pariksheet Barapatre
<pb...@gmail.com> wrote:
> Hello All,
>
> New year greetings..!!!
>
> My question as follow -
>
> How to create Phoenix Salted table  equivalent HFile using MapReduce.
>
> As per my understanding we can create HFile by specifying
>
> HFileOutputFormat.configureIncrementalLoad(job, hTable);
>
> What would be the way to create  salt and generate phoenix equivalent rowkey
> and values.
>
>
> Cheers,
> Pari

Re: Phoenix equivalent HFile

Posted by Pariksheet Barapatre <pb...@gmail.com>.
Many thanks for the quick pointer. Let me go through the link and get back.

Cheers
Pari

On 11 January 2015 at 23:10, Ted Yu <yu...@gmail.com> wrote:

> Have you looked at https://issues.apache.org/jira/browse/PHOENIX-1454 ?
>
> Cheers
>
> On Sun, Jan 11, 2015 at 9:34 AM, Pariksheet Barapatre <
> pbarapatre@gmail.com>
> wrote:
>
> > Hello All,
> >
> > New year greetings..!!!
> >
> > My question as follow -
> >
> > How to create Phoenix Salted table  equivalent HFile using MapReduce.
> >
> > As per my understanding we can create HFile by specifying
> >
> > HFileOutputFormat.configureIncrementalLoad(job, hTable);
> >
> > What would be the way to create  salt and generate phoenix equivalent
> > rowkey and values.
> >
> >
> > Cheers,
> > Pari
> >
>



-- 
Cheers,
Pari

Re: Phoenix equivalent HFile

Posted by Ted Yu <yu...@gmail.com>.
Have you looked at https://issues.apache.org/jira/browse/PHOENIX-1454 ?

Cheers

On Sun, Jan 11, 2015 at 9:34 AM, Pariksheet Barapatre <pb...@gmail.com>
wrote:

> Hello All,
>
> New year greetings..!!!
>
> My question as follow -
>
> How to create Phoenix Salted table  equivalent HFile using MapReduce.
>
> As per my understanding we can create HFile by specifying
>
> HFileOutputFormat.configureIncrementalLoad(job, hTable);
>
> What would be the way to create  salt and generate phoenix equivalent
> rowkey and values.
>
>
> Cheers,
> Pari
>

Re: Phoenix equivalent HFile

Posted by Pariksheet Barapatre <pb...@gmail.com>.
Thanks Gabriel & Abe. I like the CSV bulk loader approach; it makes more
sense for my use case.

Cheers
Pari

On 12 January 2015 at 03:36, Abe Weinograd <ab...@flonet.com> wrote:

> We do that.  For the most part, it is easy.  Bytes.toBytes works for most
> of the data types if you are using the UNSIGNED_* variety.  If not, we were
> using the API for the Phoenix Data types.  I don't know about salting
> specifically.
>
> Also, i believe that Phoenix now has an OutputFormat in the latest version
> that might help.
>
> Abe
>
> On Sun, Jan 11, 2015 at 12:34 PM, Pariksheet Barapatre <
> pbarapatre@gmail.com> wrote:
>
>> Hello All,
>>
>> New year greetings..!!!
>>
>> My question as follow -
>>
>> How to create Phoenix Salted table  equivalent HFile using MapReduce.
>>
>> As per my understanding we can create HFile by specifying
>>
>> HFileOutputFormat.configureIncrementalLoad(job, hTable);
>>
>> What would be the way to create  salt and generate phoenix equivalent
>> rowkey and values.
>>
>>
>> Cheers,
>> Pari
>>
>
>


-- 
Cheers,
Pari

Re: Phoenix equivalent HFile

Posted by Abe Weinograd <ab...@flonet.com>.
We do that.  For the most part, it is easy.  Bytes.toBytes works for most
of the data types if you are using the UNSIGNED_* variety.  If not, we were
using the API for the Phoenix data types.  I don't know about salting
specifically.

Also, I believe that Phoenix now has an OutputFormat in the latest version
that might help.

Abe
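[Editor's note: the reason Bytes.toBytes lines up with the UNSIGNED_* types is that both serialize integers as plain big-endian bytes, while Phoenix's signed types flip the sign bit so that negative values sort before positive ones under unsigned byte comparison. The plain-Java sketch below illustrates that distinction; it approximates the encodings rather than calling Phoenix's PDataType API, so verify against your Phoenix version before relying on it.]

```java
import java.nio.ByteBuffer;

public class PhoenixIntEncoding {
    // UNSIGNED_INT: plain 4-byte big-endian, the same bytes Bytes.toBytes(int)
    // produces; only valid for non-negative values.
    static byte[] encodeUnsignedInt(int v) {
        if (v < 0) throw new IllegalArgumentException("UNSIGNED_INT must be >= 0");
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // INTEGER: the same 4 bytes but with the sign bit flipped, so that negative
    // values sort before positive ones when compared as unsigned bytes.
    static byte[] encodeSignedInt(int v) {
        return ByteBuffer.allocate(4).putInt(v ^ Integer.MIN_VALUE).array();
    }

    public static void main(String[] args) {
        byte[] neg = encodeSignedInt(-1);
        byte[] pos = encodeSignedInt(1);
        // Unsigned byte-wise comparison preserves numeric order: -1 sorts first.
        System.out.println((neg[0] & 0xFF) + " < " + (pos[0] & 0xFF));
    }
}
```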

On Sun, Jan 11, 2015 at 12:34 PM, Pariksheet Barapatre <pbarapatre@gmail.com
> wrote:

> Hello All,
>
> New year greetings..!!!
>
> My question as follow -
>
> How to create Phoenix Salted table  equivalent HFile using MapReduce.
>
> As per my understanding we can create HFile by specifying
>
> HFileOutputFormat.configureIncrementalLoad(job, hTable);
>
> What would be the way to create  salt and generate phoenix equivalent
> rowkey and values.
>
>
> Cheers,
> Pari
>
