Posted to user@hbase.apache.org by Anthony Nguyen <an...@gmail.com> on 2015/12/09 16:18:54 UTC

Duplicate HFiles during bulk load in 0.98

Hi all,

Having duplicate HFiles within a region should result in no change to the
data, correct? I ask because I'm seeing duplicate HFiles being created
during a bulk load: they have the same row count, same size, and same
firstKey and lastKey. Is this normal behavior? Will a compaction
eventually take care of this?

Thanks!
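
One way to make the comparison described above systematic is to group files by the four properties mentioned (row count, size, firstKey, lastKey). The following is a minimal Python sketch, not an HBase tool: the dictionary fields, the file names, and the `find_duplicate_groups` helper are all hypothetical, and it assumes the metadata has already been collected for each HFile in the region.

```python
from collections import defaultdict

def find_duplicate_groups(hfiles):
    """Group HFile names whose metadata fingerprints match exactly."""
    groups = defaultdict(list)
    for f in hfiles:
        # Fingerprint built from the four properties compared in the post.
        fingerprint = (f["row_count"], f["size"], f["first_key"], f["last_key"])
        groups[fingerprint].append(f["name"])
    # Keep only fingerprints shared by more than one file.
    return [sorted(names) for names in groups.values() if len(names) > 1]

# Hypothetical metadata for three store files in one region.
hfiles = [
    {"name": "hfile_1", "row_count": 1000, "size": 52431, "first_key": "row0001", "last_key": "row1000"},
    {"name": "hfile_2", "row_count": 1000, "size": 52431, "first_key": "row0001", "last_key": "row1000"},
    {"name": "hfile_3", "row_count": 400, "size": 20117, "first_key": "row0500", "last_key": "row0900"},
]

duplicates = find_duplicate_groups(hfiles)
print(duplicates)
```

Matching metadata only suggests that two files hold the same cells; it does not prove it, so a group reported here is a candidate for closer inspection rather than a confirmed duplicate.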

Re: Duplicate HFiles during bulk load in 0.98

Posted by Ted Yu <yu...@gmail.com>.
Interesting.

Which exact 0.98 release are you using?

Can you inspect the logs to see when the duplicate HFiles were introduced
(during one bulk load run or across multiple bulk load runs)?

bq. Will a compaction eventually take care of this?

I think so.
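
As a rough illustration of why duplicates should not change what a client reads: an HBase cell is identified by its row, column family, qualifier and timestamp (plus a type), so two copies of the same cell coming from two files collapse to one when the files are merged. The sketch below is a simplified Python model, not HBase code. The `Cell` tuple and `merge_store_files` function are hypothetical names, and the model ignores deletes, the type field, and max-versions handling.

```python
from collections import namedtuple
from heapq import merge

# Hypothetical, simplified cell; real HBase keys also carry a type byte.
Cell = namedtuple("Cell", ["row", "family", "qualifier", "timestamp", "value"])

def sort_key(cell):
    # Rows and columns ascending, newest timestamp first.
    return (cell.row, cell.family, cell.qualifier, -cell.timestamp)

def merge_store_files(*files):
    """Merge sorted cell lists, keeping one cell per distinct key."""
    merged = []
    last_key = None
    for cell in merge(*files, key=sort_key):
        key = sort_key(cell)
        if key != last_key:  # skip a repeat of the key just emitted
            merged.append(cell)
            last_key = key
    return merged

hfile_a = sorted([
    Cell("row1", "cf", "q1", 100, "v1"),
    Cell("row1", "cf", "q1", 90, "v0"),
    Cell("row2", "cf", "q1", 100, "v2"),
], key=sort_key)
hfile_b = list(hfile_a)  # an exact duplicate of hfile_a

result = merge_store_files(hfile_a, hfile_b)
print(len(result))
```

In this model the merged output equals the contents of a single copy, which matches the expectation that a compaction rewriting both files would leave one set of cells behind.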

Thanks
