Posted to user@hbase.apache.org by oc tsdb <oc...@gmail.com> on 2014/01/17 13:46:03 UTC

How to remove duplicate data in HBase?

Hi all,

We want to know whether there is any option to dynamically remove duplicate
data in HBase based on column family.

Thanks,
OC

Re: How to remove duplicate data in HBase?

Posted by Michael Segel <mi...@hotmail.com>.
First, you should define what you mean when you say duplicate data.

Depending on your definition, it may already be handled.

On Jan 17, 2014, at 7:39 AM, Ted Yu <yu...@gmail.com> wrote:

> Can you tell us where the duplicate data resides: between column families, or between columns in a single column family?
> 
> Cheers
> 
> On Jan 17, 2014, at 4:46 AM, oc tsdb <oc...@gmail.com> wrote:
> 
>> Hi all,
>> 
>> We want to know whether there is any option to dynamically remove duplicate
>> data in HBase based on column family.
>> 
>> Thanks,
>> OC
> 


Re: How to remove duplicate data in HBase?

Posted by Ted Yu <yu...@gmail.com>.
Can you tell us where the duplicate data resides: between column families, or between columns in a single column family?

Cheers

On Jan 17, 2014, at 4:46 AM, oc tsdb <oc...@gmail.com> wrote:

> Hi all,
> 
> We want to know whether there is any option to dynamically remove duplicate
> data in HBase based on column family.
> 
> Thanks,
> OC