Posted to user@phoenix.apache.org by Amit Sela <am...@infolinks.com> on 2014/03/04 13:50:01 UTC

Insert data into HBase with Phoenix

Hi all,

I'm using HBase 0.94.12 with Hadoop 1.0.4.

I'm trying to load ~3GB of data into an HBase table using the CSV bulk load
(MapReduce) tool - roughly the invocation sketched below.
It is very slow: the MR job takes about 5x as long as a plain HBase bulk load.

I was wondering if that is the best way to do this. I also wonder whether it
supports constant pre-splitting - meaning that before each bulk load I add new
regions to the table (see the sketch after this paragraph). Another issue I
have with the CSV bulk load is dynamic columns - I tried setting null
(actually "") in the CSV where there is no value, but that defeats one of the
benefits of HBase, which is not storing null values at all.
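To make the pre-splitting part concrete, this is what I mean by adding regions
before a load - a minimal sketch against the HBase 0.94 client API, where the
table name and split key are just placeholders:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PreSplitBeforeLoad {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HBaseAdmin admin = new HBaseAdmin(conf);
            try {
                // Ask HBase to split the table at a key the next bulk load is
                // expected to start from, so new data lands in its own region.
                // Note that split() is asynchronous.
                admin.split(Bytes.toBytes("MY_TABLE"), Bytes.toBytes("20140304"));
            } finally {
                admin.close();
            }
        }
    }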

Do you think using UPSERT in batches would work better? Can it handle 3GB
(uncompressed)? Has anyone done it from an MR context (a Reducer executing the
UPSERT batches)? A rough sketch of what I have in mind follows.
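Something like this is what I have in mind for the Reducer side - a minimal
sketch over plain JDBC; the connection URL, table, columns and commit interval
are assumptions, and driver registration may need an explicit Class.forName
depending on the Phoenix version:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    public class UpsertBatchSketch {
        public static void main(String[] args) throws Exception {
            // ZK quorum, table and columns are placeholders for this example.
            Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181");
            conn.setAutoCommit(false);
            PreparedStatement stmt =
                conn.prepareStatement("UPSERT INTO MY_TABLE (PK, VAL) VALUES (?, ?)");
            try {
                int pending = 0;
                // Stand-in loop for iterating over the Reducer's values.
                for (int i = 0; i < 100000; i++) {
                    stmt.setString(1, "row-" + i);
                    stmt.setString(2, "value-" + i);
                    stmt.executeUpdate();
                    // Phoenix buffers mutations client-side until commit(),
                    // so committing every N rows is effectively the "batch".
                    if (++pending % 1000 == 0) {
                        conn.commit();
                    }
                }
                conn.commit();
            } finally {
                stmt.close();
                conn.close();
            }
        }
    }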

Thanks,
Amit.