Posted to user@hbase.apache.org by Varun Sharma <va...@pinterest.com> on 2013/06/30 21:03:10 UTC

Issues with delete markers

Hi,

We are having an issue with the way HBase handles deletes. We are looking
to retrieve 300 columns from a row, but the row contains tens of thousands
of delete markers before we reach those 300 columns, something like this:


row  DeleteCol1 Col1  DeleteCol2 Col2 ................... DeleteCol3 Col3

And so on. The issue is that to retrieve those 300 columns, we have to scan
through tens of thousands of deletes; sometimes we get a burst of these
queries, and that effectively DDoSes a region server. We would be fine with
returning only the first 300 columns and stopping once, say, 5K column
delete markers have been encountered.

I wonder whether HBase provides such a construct, or whether we need to
build something on top of a raw scan and handle the delete masking ourselves.
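For what it's worth, the behaviour being asked for can be sketched in plain Java, independent of the real HBase client API (the cell encoding here — a delete marker for column "cN" spelled as "D:cN" — and the limit parameters are purely illustrative):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// A dependency-free model of the desired semantics: walk the raw, sorted
// cell stream of one row, return at most columnLimit live columns, and
// give up once deleteLimit delete markers have been seen.
public class BoundedDeleteScan {

    static boolean isDeleteMarker(String cell) {
        return cell.startsWith("D:");
    }

    static List<String> scan(List<String> cells, int columnLimit, int deleteLimit) {
        List<String> result = new ArrayList<>();
        String pendingDelete = null; // column masked by the last marker seen
        int deletesSeen = 0;
        for (String cell : cells) {
            if (isDeleteMarker(cell)) {
                if (++deletesSeen >= deleteLimit) break; // too many deletes: stop early
                pendingDelete = cell.substring(2);       // column this marker masks
            } else if (cell.equals(pendingDelete)) {
                pendingDelete = null;                    // masked by the marker: skip it
            } else {
                result.add(cell);
                if (result.size() >= columnLimit) break; // got enough columns
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Deletes interleaved with the columns they mask, as in the row layout above.
        System.out.println(scan(Arrays.asList("D:c1", "c1", "c2", "D:c3", "c3", "c4"), 300, 5000));
        // prints [c2, c4]
    }
}
```

The key point of the sketch is the early `break` on the delete-marker count: that is the guard that keeps a delete-heavy row from monopolizing a region server.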

Thanks
Varun

Re: Issues with delete markers

Posted by lars hofhansl <la...@apache.org>.
That is the easy part :)
The hard part is adding this to filters in a backwards-compatible way.

-- Lars


----- Original Message -----
From: Varun Sharma <va...@pinterest.com>
To: user@hbase.apache.org; lars hofhansl <la...@apache.org>
Cc: "dev@hbase.apache.org" <de...@hbase.apache.org>
Sent: Monday, July 1, 2013 8:18 AM
Subject: Re: Issues with delete markers

I mean version tracking with delete markers...


On Mon, Jul 1, 2013 at 8:17 AM, Varun Sharma <va...@pinterest.com> wrote:

> So, yesterday, I implemented this change via a coprocessor which basically
> initiates a scan which is raw, keeps tracking of # of delete markers
> encountered and stops when a configured threshold is met. It instantiates
> its own ScanDeleteTracker to do the masking through delete markers. So raw
> scan, count delete markers/stop if too many encountered and mask them so to
> return sane stuff back to the client.
>
> I guess until now it has been working reasonably. Also, with HBase 8809,
> version tracking etc. should also work with filters now.
>
>
> On Mon, Jul 1, 2013 at 3:58 AM, lars hofhansl <la...@apache.org> wrote:
>
>> That would be quite dramatic change, we cannot pass delete markers to the
>> existing filters without confusing them.
>> We could invent a new method (filterDeleteKV or filterDeleteMarker or
>> something) on filters along with a new "filter type" that implements that
>> method.
>>
>>
>> -- Lars
>>
>>
>> ----- Original Message -----
>> From: Varun Sharma <va...@pinterest.com>
>> To: "dev@hbase.apache.org" <de...@hbase.apache.org>; user@hbase.apache.org
>> Cc:
>> Sent: Sunday, June 30, 2013 1:56 PM
>> Subject: Re: Issues with delete markers
>>
>> Sorry, typo, i meant that for user scans, should we be passing delete
>> markers through.the filters as well ?
>>
>> Varun
>>
>>
>> On Sun, Jun 30, 2013 at 1:03 PM, Varun Sharma <va...@pinterest.com>
>> wrote:
>>
>> > For user scans, i feel we should be passing delete markers through as
>> well.
>> >
>> >
>> > On Sun, Jun 30, 2013 at 12:35 PM, Varun Sharma <varun@pinterest.com
>> >wrote:
>> >
>> >> I tried this a little bit and it seems that filters are not called on
>> >> delete markers. For raw scans returning delete markers, does it make
>> sense
>> >> to do that ?
>> >>
>> >> Varun
>> >>
>> >>
>> >> On Sun, Jun 30, 2013 at 12:03 PM, Varun Sharma <varun@pinterest.com
>> >wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> We are having an issue with the way HBase does handling of deletes. We
>> >>> are looking to retrieve 300 columns in a row but the row has tens of
>> >>> thousands of delete markers in it before we span the 300 columns
>> something
>> >>> like this
>> >>>
>> >>>
>> >>> row  DeleteCol1 Col1  DeleteCol2 Col2 ................... DeleteCol3
>> Col3
>> >>>
>> >>> And so on. Therefore, the issue here, being that to retrieve these 300
>> >>> columns, we need to go through tens of thousands of deletes -
>> sometimes we
>> >>> get a spurt of these queries and that DDoSes a region server. We are
>> okay
>> >>> with saying, only return first 300 columns and stop once you
>> encounter, say
>> >>> 5K column delete markers or something.
>> >>>
>> >>> I wonder if such a construct is provided by HBase or do we need to
>> build
>> >>> something on top of the RAW scan and handle the delete masking there.
>> >>>
>> >>> Thanks
>> >>> Varun
>> >>>
>> >>>
>> >>>
>> >>
>> >
>>
>>
>


Re: Issues with delete markers

Posted by Varun Sharma <va...@pinterest.com>.
I mean version tracking with delete markers...


On Mon, Jul 1, 2013 at 8:17 AM, Varun Sharma <va...@pinterest.com> wrote:

> So, yesterday, I implemented this change via a coprocessor which basically
> initiates a scan which is raw, keeps tracking of # of delete markers
> encountered and stops when a configured threshold is met. It instantiates
> its own ScanDeleteTracker to do the masking through delete markers. So raw
> scan, count delete markers/stop if too many encountered and mask them so to
> return sane stuff back to the client.
>
> I guess until now it has been working reasonably. Also, with HBase 8809,
> version tracking etc. should also work with filters now.
>
>
> On Mon, Jul 1, 2013 at 3:58 AM, lars hofhansl <la...@apache.org> wrote:
>
>> That would be quite dramatic change, we cannot pass delete markers to the
>> existing filters without confusing them.
>> We could invent a new method (filterDeleteKV or filterDeleteMarker or
>> something) on filters along with a new "filter type" that implements that
>> method.
>>
>>
>> -- Lars
>>
>>
>> ----- Original Message -----
>> From: Varun Sharma <va...@pinterest.com>
>> To: "dev@hbase.apache.org" <de...@hbase.apache.org>; user@hbase.apache.org
>> Cc:
>> Sent: Sunday, June 30, 2013 1:56 PM
>> Subject: Re: Issues with delete markers
>>
>> Sorry, typo, i meant that for user scans, should we be passing delete
>> markers through.the filters as well ?
>>
>> Varun
>>
>>
>> On Sun, Jun 30, 2013 at 1:03 PM, Varun Sharma <va...@pinterest.com>
>> wrote:
>>
>> > For user scans, i feel we should be passing delete markers through as
>> well.
>> >
>> >
>> > On Sun, Jun 30, 2013 at 12:35 PM, Varun Sharma <varun@pinterest.com
>> >wrote:
>> >
>> >> I tried this a little bit and it seems that filters are not called on
>> >> delete markers. For raw scans returning delete markers, does it make
>> sense
>> >> to do that ?
>> >>
>> >> Varun
>> >>
>> >>
>> >> On Sun, Jun 30, 2013 at 12:03 PM, Varun Sharma <varun@pinterest.com
>> >wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> We are having an issue with the way HBase does handling of deletes. We
>> >>> are looking to retrieve 300 columns in a row but the row has tens of
>> >>> thousands of delete markers in it before we span the 300 columns
>> something
>> >>> like this
>> >>>
>> >>>
>> >>> row  DeleteCol1 Col1  DeleteCol2 Col2 ................... DeleteCol3
>> Col3
>> >>>
>> >>> And so on. Therefore, the issue here, being that to retrieve these 300
>> >>> columns, we need to go through tens of thousands of deletes -
>> sometimes we
>> >>> get a spurt of these queries and that DDoSes a region server. We are
>> okay
>> >>> with saying, only return first 300 columns and stop once you
>> encounter, say
>> >>> 5K column delete markers or something.
>> >>>
>> >>> I wonder if such a construct is provided by HBase or do we need to
>> build
>> >>> something on top of the RAW scan and handle the delete masking there.
>> >>>
>> >>> Thanks
>> >>> Varun
>> >>>
>> >>>
>> >>>
>> >>
>> >
>>
>>
>


Re: Issues with delete markers

Posted by Varun Sharma <va...@pinterest.com>.
So, yesterday, I implemented this change via a coprocessor: it initiates a
raw scan, keeps track of the number of delete markers encountered, and stops
when a configured threshold is met. It instantiates its own ScanDeleteTracker
to mask the delete markers. In short: do a raw scan, count delete markers,
stop if too many are encountered, and mask them so that sane results go back
to the client.

So far it has been working reasonably well. Also, with HBASE-8809, version
tracking etc. should now work with filters too.
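A rough, dependency-free model of the masking step described above: a column delete marker hides every later version of that column whose timestamp is at or below the marker's. This mimics the gist of what ScanDeleteTracker does; the class and method names here are stand-ins, not HBase types:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of ScanDeleteTracker-style masking within one row: a column
// delete marker hides every version of that column with timestamp <= the
// marker's timestamp.
public class DeleteMaskModel {
    private final Map<String, Long> deleteTs = new HashMap<>();

    // Record a column delete marker; keep the newest marker per column.
    public void addDeleteMarker(String column, long ts) {
        Long prev = deleteTs.get(column);
        if (prev == null || ts > prev) deleteTs.put(column, ts);
    }

    // Would a cell (column, ts) still be visible to the client?
    public boolean isVisible(String column, long ts) {
        Long marker = deleteTs.get(column);
        return marker == null || ts > marker;
    }

    public static void main(String[] args) {
        DeleteMaskModel m = new DeleteMaskModel();
        m.addDeleteMarker("c1", 100L);
        System.out.println(m.isVisible("c1", 90L));  // prints false (masked)
        System.out.println(m.isVisible("c1", 110L)); // prints true (newer than marker)
    }
}
```

A coprocessor along the lines described would feed every delete marker from the raw scan into a tracker like this, and filter the non-delete cells through `isVisible` before returning them.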


On Mon, Jul 1, 2013 at 3:58 AM, lars hofhansl <la...@apache.org> wrote:

> That would be quite dramatic change, we cannot pass delete markers to the
> existing filters without confusing them.
> We could invent a new method (filterDeleteKV or filterDeleteMarker or
> something) on filters along with a new "filter type" that implements that
> method.
>
>
> -- Lars
>
>
> ----- Original Message -----
> From: Varun Sharma <va...@pinterest.com>
> To: "dev@hbase.apache.org" <de...@hbase.apache.org>; user@hbase.apache.org
> Cc:
> Sent: Sunday, June 30, 2013 1:56 PM
> Subject: Re: Issues with delete markers
>
> Sorry, typo, i meant that for user scans, should we be passing delete
> markers through.the filters as well ?
>
> Varun
>
>
> On Sun, Jun 30, 2013 at 1:03 PM, Varun Sharma <va...@pinterest.com> wrote:
>
> > For user scans, i feel we should be passing delete markers through as
> well.
> >
> >
> > On Sun, Jun 30, 2013 at 12:35 PM, Varun Sharma <varun@pinterest.com
> >wrote:
> >
> >> I tried this a little bit and it seems that filters are not called on
> >> delete markers. For raw scans returning delete markers, does it make
> sense
> >> to do that ?
> >>
> >> Varun
> >>
> >>
> >> On Sun, Jun 30, 2013 at 12:03 PM, Varun Sharma <varun@pinterest.com
> >wrote:
> >>
> >>> Hi,
> >>>
> >>> We are having an issue with the way HBase does handling of deletes. We
> >>> are looking to retrieve 300 columns in a row but the row has tens of
> >>> thousands of delete markers in it before we span the 300 columns
> something
> >>> like this
> >>>
> >>>
> >>> row  DeleteCol1 Col1  DeleteCol2 Col2 ................... DeleteCol3
> Col3
> >>>
> >>> And so on. Therefore, the issue here, being that to retrieve these 300
> >>> columns, we need to go through tens of thousands of deletes -
> sometimes we
> >>> get a spurt of these queries and that DDoSes a region server. We are
> okay
> >>> with saying, only return first 300 columns and stop once you
> encounter, say
> >>> 5K column delete markers or something.
> >>>
> >>> I wonder if such a construct is provided by HBase or do we need to
> build
> >>> something on top of the RAW scan and handle the delete masking there.
> >>>
> >>> Thanks
> >>> Varun
> >>>
> >>>
> >>>
> >>
> >
>
>


Re: Issues with delete markers

Posted by lars hofhansl <la...@apache.org>.
That would be quite a dramatic change; we cannot pass delete markers to the existing filters without confusing them.
We could invent a new method (filterDeleteKV or filterDeleteMarker or something) on filters, along with a new "filter type" that implements that method.


-- Lars


----- Original Message -----
From: Varun Sharma <va...@pinterest.com>
To: "dev@hbase.apache.org" <de...@hbase.apache.org>; user@hbase.apache.org
Cc: 
Sent: Sunday, June 30, 2013 1:56 PM
Subject: Re: Issues with delete markers

Sorry, typo, i meant that for user scans, should we be passing delete
markers through.the filters as well ?

Varun


On Sun, Jun 30, 2013 at 1:03 PM, Varun Sharma <va...@pinterest.com> wrote:

> For user scans, i feel we should be passing delete markers through as well.
>
>
> On Sun, Jun 30, 2013 at 12:35 PM, Varun Sharma <va...@pinterest.com>wrote:
>
>> I tried this a little bit and it seems that filters are not called on
>> delete markers. For raw scans returning delete markers, does it make sense
>> to do that ?
>>
>> Varun
>>
>>
>> On Sun, Jun 30, 2013 at 12:03 PM, Varun Sharma <va...@pinterest.com>wrote:
>>
>>> Hi,
>>>
>>> We are having an issue with the way HBase does handling of deletes. We
>>> are looking to retrieve 300 columns in a row but the row has tens of
>>> thousands of delete markers in it before we span the 300 columns something
>>> like this
>>>
>>>
>>> row  DeleteCol1 Col1  DeleteCol2 Col2 ................... DeleteCol3 Col3
>>>
>>> And so on. Therefore, the issue here, being that to retrieve these 300
>>> columns, we need to go through tens of thousands of deletes - sometimes we
>>> get a spurt of these queries and that DDoSes a region server. We are okay
>>> with saying, only return first 300 columns and stop once you encounter, say
>>> 5K column delete markers or something.
>>>
>>> I wonder if such a construct is provided by HBase or do we need to build
>>> something on top of the RAW scan and handle the delete masking there.
>>>
>>> Thanks
>>> Varun
>>>
>>>
>>>
>>
>



Re: Issues with delete markers

Posted by Jesse Yates <je...@gmail.com>.
There is some discussion of this (and similar issues with delete markers)
on HBASE-8809 <https://issues.apache.org/jira/browse/HBASE-8809>.

-Jesse
-------------------
Jesse Yates
@jesse_yates
jyates.github.com


On Sun, Jun 30, 2013 at 1:56 PM, Varun Sharma <va...@pinterest.com> wrote:

> Sorry, typo, i meant that for user scans, should we be passing delete
> markers through.the filters as well ?
>
> Varun
>
>
> On Sun, Jun 30, 2013 at 1:03 PM, Varun Sharma <va...@pinterest.com> wrote:
>
> > For user scans, i feel we should be passing delete markers through as
> well.
> >
> >
> > On Sun, Jun 30, 2013 at 12:35 PM, Varun Sharma <varun@pinterest.com
> >wrote:
> >
> >> I tried this a little bit and it seems that filters are not called on
> >> delete markers. For raw scans returning delete markers, does it make
> sense
> >> to do that ?
> >>
> >> Varun
> >>
> >>
> >> On Sun, Jun 30, 2013 at 12:03 PM, Varun Sharma <varun@pinterest.com
> >wrote:
> >>
> >>> Hi,
> >>>
> >>> We are having an issue with the way HBase does handling of deletes. We
> >>> are looking to retrieve 300 columns in a row but the row has tens of
> >>> thousands of delete markers in it before we span the 300 columns
> something
> >>> like this
> >>>
> >>>
> >>> row  DeleteCol1 Col1  DeleteCol2 Col2 ................... DeleteCol3
> Col3
> >>>
> >>> And so on. Therefore, the issue here, being that to retrieve these 300
> >>> columns, we need to go through tens of thousands of deletes -
> sometimes we
> >>> get a spurt of these queries and that DDoSes a region server. We are
> okay
> >>> with saying, only return first 300 columns and stop once you
> encounter, say
> >>> 5K column delete markers or something.
> >>>
> >>> I wonder if such a construct is provided by HBase or do we need to
> build
> >>> something on top of the RAW scan and handle the delete masking there.
> >>>
> >>> Thanks
> >>> Varun
> >>>
> >>>
> >>>
> >>
> >
>

Re: Issues with delete markers

Posted by Varun Sharma <va...@pinterest.com>.
Sorry, typo. I meant: for user scans, should we be passing delete
markers through the filters as well?

Varun


On Sun, Jun 30, 2013 at 1:03 PM, Varun Sharma <va...@pinterest.com> wrote:

> For user scans, i feel we should be passing delete markers through as well.
>
>
> On Sun, Jun 30, 2013 at 12:35 PM, Varun Sharma <va...@pinterest.com>wrote:
>
>> I tried this a little bit and it seems that filters are not called on
>> delete markers. For raw scans returning delete markers, does it make sense
>> to do that ?
>>
>> Varun
>>
>>
>> On Sun, Jun 30, 2013 at 12:03 PM, Varun Sharma <va...@pinterest.com>wrote:
>>
>>> Hi,
>>>
>>> We are having an issue with the way HBase does handling of deletes. We
>>> are looking to retrieve 300 columns in a row but the row has tens of
>>> thousands of delete markers in it before we span the 300 columns something
>>> like this
>>>
>>>
>>> row  DeleteCol1 Col1  DeleteCol2 Col2 ................... DeleteCol3 Col3
>>>
>>> And so on. Therefore, the issue here, being that to retrieve these 300
>>> columns, we need to go through tens of thousands of deletes - sometimes we
>>> get a spurt of these queries and that DDoSes a region server. We are okay
>>> with saying, only return first 300 columns and stop once you encounter, say
>>> 5K column delete markers or something.
>>>
>>> I wonder if such a construct is provided by HBase or do we need to build
>>> something on top of the RAW scan and handle the delete masking there.
>>>
>>> Thanks
>>> Varun
>>>
>>>
>>>
>>
>


Re: Issues with delete markers

Posted by Varun Sharma <va...@pinterest.com>.
For user scans, I feel we should be passing delete markers through as well.


On Sun, Jun 30, 2013 at 12:35 PM, Varun Sharma <va...@pinterest.com> wrote:

> I tried this a little bit and it seems that filters are not called on
> delete markers. For raw scans returning delete markers, does it make sense
> to do that ?
>
> Varun
>
>
> On Sun, Jun 30, 2013 at 12:03 PM, Varun Sharma <va...@pinterest.com>wrote:
>
>> Hi,
>>
>> We are having an issue with the way HBase does handling of deletes. We
>> are looking to retrieve 300 columns in a row but the row has tens of
>> thousands of delete markers in it before we span the 300 columns something
>> like this
>>
>>
>> row  DeleteCol1 Col1  DeleteCol2 Col2 ................... DeleteCol3 Col3
>>
>> And so on. Therefore, the issue here, being that to retrieve these 300
>> columns, we need to go through tens of thousands of deletes - sometimes we
>> get a spurt of these queries and that DDoSes a region server. We are okay
>> with saying, only return first 300 columns and stop once you encounter, say
>> 5K column delete markers or something.
>>
>> I wonder if such a construct is provided by HBase or do we need to build
>> something on top of the RAW scan and handle the delete masking there.
>>
>> Thanks
>> Varun
>>
>>
>>
>


Re: Issues with delete markers

Posted by Varun Sharma <va...@pinterest.com>.
I tried this a little bit, and it seems that filters are not called on
delete markers. For raw scans that return delete markers, does it make
sense to call filters on them?

Varun


On Sun, Jun 30, 2013 at 12:03 PM, Varun Sharma <va...@pinterest.com> wrote:

> Hi,
>
> We are having an issue with the way HBase does handling of deletes. We are
> looking to retrieve 300 columns in a row but the row has tens of thousands
> of delete markers in it before we span the 300 columns something like this
>
>
> row  DeleteCol1 Col1  DeleteCol2 Col2 ................... DeleteCol3 Col3
>
> And so on. Therefore, the issue here, being that to retrieve these 300
> columns, we need to go through tens of thousands of deletes - sometimes we
> get a spurt of these queries and that DDoSes a region server. We are okay
> with saying, only return first 300 columns and stop once you encounter, say
> 5K column delete markers or something.
>
> I wonder if such a construct is provided by HBase or do we need to build
> something on top of the RAW scan and handle the delete masking there.
>
> Thanks
> Varun
>
>
>
