Posted to user@phoenix.apache.org by Gaurav Agarwal <ga...@gmail.com> on 2015/09/01 20:51:05 UTC

Phoenix map reduce

Hello

We are using the Phoenix MapReduce CSV uploader to load data into HBase. I
read in the documentation on the Phoenix site that it only creates HFiles;
no WAL entries are written. Please confirm whether that understanding is
correct.

We have to use HBase replication across clusters for a master-master
scenario. Will replication work in that case, or do we need to use
CopyTable to replicate?

thanks
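
For reference, the uploader in question is Phoenix's CsvBulkLoadTool. A
minimal sketch of launching it programmatically, where the table name and
input path are hypothetical, might look like this:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.util.ToolRunner;
    import org.apache.phoenix.mapreduce.CsvBulkLoadTool;

    public class BulkLoadExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            // Runs a MapReduce job that writes HFiles and hands them to
            // HBase directly, bypassing the WAL entirely.
            int exitCode = ToolRunner.run(conf, new CsvBulkLoadTool(),
                new String[] {
                    "--table", "MY_TABLE",        // hypothetical Phoenix table
                    "--input", "/data/input.csv"  // hypothetical HDFS input path
                });
            System.exit(exitCode);
        }
    }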

Re: Phoenix map reduce

Posted by Krishna <re...@gmail.com>.
Another option is to create HFiles using the CSV bulk loader on one cluster,
transfer them to the backup cluster, and run LoadIncrementalHFiles(...).
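
A rough sketch of that final step with the HBase 1.x client API, where the
table name and the HFile directory (the loader's output, copied over from
the primary cluster) are hypothetical:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.RegionLocator;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

    public class LoadHFilesExample {
        public static void main(String[] args) throws Exception {
            // Configuration pointing at the backup cluster.
            Configuration conf = HBaseConfiguration.create();
            TableName name = TableName.valueOf("MY_TABLE");
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Admin admin = conn.getAdmin();
                 Table table = conn.getTable(name);
                 RegionLocator locator = conn.getRegionLocator(name)) {
                // Moves the copied HFiles into the table's regions directly,
                // without going through the write path (no WAL, no memstore).
                new LoadIncrementalHFiles(conf)
                    .doBulkLoad(new Path("/staging/MY_TABLE"), admin, table, locator);
            }
        }
    }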

On Tue, Sep 1, 2015 at 11:53 AM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> Hi Gaurav,
>
> Bulk load bypasses the WAL, that's correct. It's true for Phoenix, and it's
> true for HBase (outside of Phoenix).
>
> If you have replication activated, you will have to bulk load the data into
> both clusters. Transfer your CSV files to the other side too and bulk load
> from there.
>
> JM
>
> 2015-09-01 14:51 GMT-04:00 Gaurav Agarwal <ga...@gmail.com>:
>
>> Hello
>>
>> We are using the Phoenix MapReduce CSV uploader to load data into HBase. I
>> read in the documentation on the Phoenix site that it only creates HFiles;
>> no WAL entries are written. Please confirm whether that understanding is
>> correct.
>>
>> We have to use HBase replication across clusters for a master-master
>> scenario. Will replication work in that case, or do we need to use
>> CopyTable to replicate?
>>
>> thanks
>>
>
>

Re: Phoenix map reduce

Posted by Gaurav Agarwal <ga...@gmail.com>.
Thanks for the reply.

On Wed, Sep 2, 2015 at 12:48 AM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> CopyTable will start an MR job and do the copy in parallel, which is good.
> But it's still going to do a lot of puts on the destination cluster, which
> will trigger flushes and compactions. If it's easy for you to send your CSV
> file there, I think that will be more efficient, even though CopyTable can
> solve your issue.
>
> JM
>
> 2015-09-01 15:01 GMT-04:00 Gaurav Agarwal <ga...@gmail.com>:
>
>> In this case, is the HBase CopyTable command good to use, or is it better
>> to transfer the CSV file to the other side and bulk load from there? Which
>> one is better in terms of performance?
>>
>> On Wed, Sep 2, 2015 at 12:23 AM, Jean-Marc Spaggiari <
>> jean-marc@spaggiari.org> wrote:
>>
>>> Hi Gaurav,
>>>
>>> Bulk load bypasses the WAL, that's correct. It's true for Phoenix, and
>>> it's true for HBase (outside of Phoenix).
>>>
>>> If you have replication activated, you will have to bulk load the data
>>> into both clusters. Transfer your CSV files to the other side too and
>>> bulk load from there.
>>>
>>> JM
>>>
>>> 2015-09-01 14:51 GMT-04:00 Gaurav Agarwal <ga...@gmail.com>:
>>>
>>>> Hello
>>>>
>>>> We are using the Phoenix MapReduce CSV uploader to load data into HBase.
>>>> I read in the documentation on the Phoenix site that it only creates
>>>> HFiles; no WAL entries are written. Please confirm whether that
>>>> understanding is correct.
>>>>
>>>> We have to use HBase replication across clusters for a master-master
>>>> scenario. Will replication work in that case, or do we need to use
>>>> CopyTable to replicate?
>>>>
>>>> thanks
>>>>
>>>
>>>
>>
>

Re: Phoenix map reduce

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
CopyTable will start an MR job and do the copy in parallel, which is good.
But it's still going to do a lot of puts on the destination cluster, which
will trigger flushes and compactions. If it's easy for you to send your CSV
file there, I think that will be more efficient, even though CopyTable can
solve your issue.

JM
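
For comparison, a sketch of launching CopyTable programmatically against a
peer cluster (CopyTable implements Tool in HBase 1.x; the peer quorum
address and table name below are hypothetical):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.mapreduce.CopyTable;
    import org.apache.hadoop.util.ToolRunner;

    public class CopyTableExample {
        public static void main(String[] args) throws Exception {
            // Every copied row arrives at the peer as a regular Put, so the
            // destination pays the full write path: WAL, memstore flushes,
            // and the compactions that follow.
            int exit = ToolRunner.run(HBaseConfiguration.create(), new CopyTable(),
                new String[] {
                    "--peer.adr=zk-backup:2181:/hbase", // hypothetical peer quorum
                    "MY_TABLE"                          // hypothetical table name
                });
            System.exit(exit);
        }
    }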

2015-09-01 15:01 GMT-04:00 Gaurav Agarwal <ga...@gmail.com>:

> In this case, is the HBase CopyTable command good to use, or is it better
> to transfer the CSV file to the other side and bulk load from there? Which
> one is better in terms of performance?
>
> On Wed, Sep 2, 2015 at 12:23 AM, Jean-Marc Spaggiari <
> jean-marc@spaggiari.org> wrote:
>
>> Hi Gaurav,
>>
>> Bulk load bypasses the WAL, that's correct. It's true for Phoenix, and
>> it's true for HBase (outside of Phoenix).
>>
>> If you have replication activated, you will have to bulk load the data
>> into both clusters. Transfer your CSV files to the other side too and
>> bulk load from there.
>>
>> JM
>>
>> 2015-09-01 14:51 GMT-04:00 Gaurav Agarwal <ga...@gmail.com>:
>>
>>> Hello
>>>
>>> We are using the Phoenix MapReduce CSV uploader to load data into HBase.
>>> I read in the documentation on the Phoenix site that it only creates
>>> HFiles; no WAL entries are written. Please confirm whether that
>>> understanding is correct.
>>>
>>> We have to use HBase replication across clusters for a master-master
>>> scenario. Will replication work in that case, or do we need to use
>>> CopyTable to replicate?
>>>
>>> thanks
>>>
>>
>>
>

Re: Phoenix map reduce

Posted by Gaurav Agarwal <ga...@gmail.com>.
In this case, is the HBase CopyTable command good to use, or is it better to
transfer the CSV file to the other side and bulk load from there? Which one
is better in terms of performance?

On Wed, Sep 2, 2015 at 12:23 AM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> Hi Gaurav,
>
> Bulk load bypasses the WAL, that's correct. It's true for Phoenix, and it's
> true for HBase (outside of Phoenix).
>
> If you have replication activated, you will have to bulk load the data into
> both clusters. Transfer your CSV files to the other side too and bulk load
> from there.
>
> JM
>
> 2015-09-01 14:51 GMT-04:00 Gaurav Agarwal <ga...@gmail.com>:
>
>> Hello
>>
>> We are using the Phoenix MapReduce CSV uploader to load data into HBase. I
>> read in the documentation on the Phoenix site that it only creates HFiles;
>> no WAL entries are written. Please confirm whether that understanding is
>> correct.
>>
>> We have to use HBase replication across clusters for a master-master
>> scenario. Will replication work in that case, or do we need to use
>> CopyTable to replicate?
>>
>> thanks
>>
>
>

Re: Phoenix map reduce

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Gaurav,

Bulk load bypasses the WAL, that's correct. It's true for Phoenix, and it's
true for HBase (outside of Phoenix).

If you have replication activated, you will have to bulk load the data into
both clusters. Transfer your CSV files to the other side too and bulk load
from there.

JM
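
One possible shape for "bulk load into both clusters", assuming the CSV has
already been copied to each cluster's HDFS (for example with distcp): run
the same CsvBulkLoadTool job once per cluster, pointing it at each cluster
via the tool's --zookeeper option. The quorum addresses, table name, and
input path below are hypothetical.

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.util.ToolRunner;
    import org.apache.phoenix.mapreduce.CsvBulkLoadTool;

    public class DualClusterLoad {
        public static void main(String[] args) throws Exception {
            // Hypothetical quorum addresses for the two replicated clusters.
            String[] quorums = { "zk-primary:2181", "zk-backup:2181" };
            for (String quorum : quorums) {
                // Each run writes HFiles and loads them without touching the
                // WAL, which is why replication never sees this data.
                ToolRunner.run(HBaseConfiguration.create(), new CsvBulkLoadTool(),
                    new String[] {
                        "--table", "MY_TABLE",         // hypothetical table
                        "--input", "/data/input.csv",  // hypothetical CSV path
                        "--zookeeper", quorum
                    });
            }
        }
    }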

2015-09-01 14:51 GMT-04:00 Gaurav Agarwal <ga...@gmail.com>:

> Hello
>
> We are using the Phoenix MapReduce CSV uploader to load data into HBase. I
> read in the documentation on the Phoenix site that it only creates HFiles;
> no WAL entries are written. Please confirm whether that understanding is
> correct.
>
> We have to use HBase replication across clusters for a master-master
> scenario. Will replication work in that case, or do we need to use
> CopyTable to replicate?
>
> thanks
>