Posted to user@hbase.apache.org by Albert Shau <as...@yahoo-inc.com> on 2011/08/24 20:25:39 UTC

Bulk Load question

Hi,

I want to do bulk loads by following http://hbase.apache.org/bulk-loads.html to create HFiles, and then using LoadIncrementalHFiles to load the data into a table.  Suppose the data I'm loading is being written to a new column that hasn't been used, and the rows are a superset of the rows already in the table.  Is it correct to assume that the existing data will not be affected by the load and that reads and writes can continue during the load?  In other words, is a bulk load conceptually the same as doing a bunch of puts all at once through the API, or do I need to think of it differently?

Thanks,
Albert
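For anyone following along, the two-step flow described in the question can be sketched roughly as below. This is a sketch, not a verified recipe: the paths, table name, column family/qualifier, and the version in the jar name are all hypothetical placeholders; adjust them to your install and input format.

```shell
# Step 1: generate HFiles offline with a MapReduce job. The importtsv
# tool shipped with HBase can do this when given a bulk output directory
# (a custom job using HFileOutputFormat works the same way).
hadoop jar hbase-0.90.4.jar importtsv \
  -Dimporttsv.columns=HBASE_ROW_KEY,f1:newcol \
  -Dimporttsv.bulk.output=/user/albert/hfiles \
  mytable /user/albert/input

# Step 2: move the generated HFiles into the regions of the live table.
# This is the LoadIncrementalHFiles step; existing cells in other
# columns are untouched, and reads/writes can proceed during the load.
hadoop jar hbase-0.90.4.jar completebulkload /user/albert/hfiles mytable
```

The `completebulkload` command is a thin CLI wrapper around the LoadIncrementalHFiles class mentioned above, so invoking the class directly from Java achieves the same result.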

Re: Bulk Load question

Posted by Jean-Daniel Cryans <jd...@apache.org>.
That's pretty much it.

J-D
