Posted to user@hbase.apache.org by Manjeet Singh <ma...@gmail.com> on 2017/09/21 18:43:31 UTC

How to improve filter based scan

Hi All

This is an old question, but this is the first time I have felt the pain.
In HBase we have filters; I am using the substring and value filters.
The problem is that it takes around 10 minutes to get the result.
I have found that start and stop keys help, but how do you get the stop
key?

Can anyone tell me how I can get my scan down to 1 or 2 seconds? I have
tried page filter, caching, setmaxresult, etc.

Thanks
Manjeet Singh

Re: How to improve filter based scan

Posted by Ted Yu <yu...@gmail.com>.
Please read the javadoc:

 * <li>1 - means that this byte in provided row key is NOT fixed, i.e. row key's byte at this
 * position can be different from the one in provided row key</li>
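
A minimal sketch of how such a mask could be built for the row key layout
discussed in this thread (Salt_subscriber_application_usecasenumber). It
assumes a one-character salt and a fixed-width, eight-digit subscriber number;
that width is inferred only from the example key #_77777777_facebook_50. The
class FuzzyKeySketch and its methods are hypothetical names, and matches() is a
local stand-in for the rule the javadoc describes, not HBase code.

```java
import java.nio.charset.StandardCharsets;

public class FuzzyKeySketch {
    // Assumed layout: 1 salt char, '_', 8-digit subscriber, '_', application, '_'.
    static final int SALT_LEN = 1;
    static final int SUBSCRIBER_LEN = 8;

    /** Fuzzy row key: '?' placeholders at positions whose value is unknown. */
    static byte[] fuzzyKey(String application) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < SALT_LEN; i++) sb.append('?');
        sb.append('_');
        for (int i = 0; i < SUBSCRIBER_LEN; i++) sb.append('?');
        sb.append('_').append(application).append('_');
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Mask of the same length: 1 = byte is NOT fixed, 0 = byte must match. */
    static byte[] fuzzyMask(String application) {
        byte[] mask = new byte[fuzzyKey(application).length]; // all 0 (fixed)
        for (int i = 0; i < SALT_LEN; i++) mask[i] = 1;
        for (int i = 0; i < SUBSCRIBER_LEN; i++) mask[SALT_LEN + 1 + i] = 1;
        return mask;
    }

    /** Local illustration of the rule: only fixed positions are compared. */
    static boolean matches(byte[] row, byte[] key, byte[] mask) {
        if (row.length < key.length) return false;
        for (int i = 0; i < key.length; i++) {
            if (mask[i] == 0 && row[i] != key[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] key = fuzzyKey("facebook");
        byte[] mask = fuzzyMask("facebook");
        byte[] row = "#_77777777_facebook_50".getBytes(StandardCharsets.UTF_8);
        System.out.println(matches(row, key, mask)); // prints true
    }
}
```

In HBase 1.2 the two arrays would be handed to the filter roughly as
new FuzzyRowFilter(Arrays.asList(new Pair<byte[], byte[]>(key, mask))) and set
on the Scan with setFilter. If the subscriber number is not fixed width, the
application name no longer starts at a fixed offset and this approach does not
apply as written.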

On Thu, Sep 21, 2017 at 12:09 PM, Manjeet Singh <ma...@gmail.com>
wrote:

> With FuzzyRowFilter I have to know the row key, right? But I know only the
> application name and want all subscribers.

Re: How to improve filter based scan

Posted by Manjeet Singh <ma...@gmail.com>.
With FuzzyRowFilter I have to know the row key, right? But I know only the
application name and want all subscribers.

On 22 Sep 2017 12:36 am, "Manjeet Singh" <ma...@gmail.com> wrote:

> The salt is one character, generated from the subscriber number. We are using
> pre-splitting; each salt represents one region.

Re: How to improve filter based scan

Posted by Manjeet Singh <ma...@gmail.com>.
The salt is one character, generated from the subscriber number. We are using
pre-splitting; each salt represents one region.
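
Since every salt maps to one region, one way to get a stop key is to derive it
from a known key prefix: the stop key is the prefix with its last byte
incremented. The sketch below shows only that byte arithmetic. The class
PrefixRangeSketch and the method stopKeyForPrefix are hypothetical names, and
the prefix "#_" is taken from the example key in this thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PrefixRangeSketch {
    /**
     * Returns the smallest key that sorts after every key starting with
     * the given prefix. An empty array means "no upper bound".
     */
    static byte[] stopKeyForPrefix(byte[] prefix) {
        int i = prefix.length - 1;
        // Trailing 0xFF bytes cannot be incremented; drop them.
        while (i >= 0 && prefix[i] == (byte) 0xFF) i--;
        if (i < 0) return new byte[0];
        byte[] stop = Arrays.copyOf(prefix, i + 1);
        stop[i]++;
        return stop;
    }

    public static void main(String[] args) {
        byte[] start = "#_".getBytes(StandardCharsets.UTF_8);
        byte[] stop = stopKeyForPrefix(start);
        // '_' is 0x5F, so the stop key ends in 0x60, which is '`'.
        System.out.println(new String(stop, StandardCharsets.UTF_8)); // prints #`
    }
}
```

With the HBase 1.2 client, such a pair would typically be applied through
Scan.setStartRow and Scan.setStopRow. Running one bounded scan per salt value,
possibly in parallel, keeps each scan inside a single region; whether that
brings the total time down to seconds depends on the data volume per region.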

On 22 Sep 2017 12:33 am, "Ted Yu" <yu...@gmail.com> wrote:

> Have you looked at FuzzyRowFilter (assuming salt is of fixed width) ?

Re: How to improve filter based scan

Posted by Ted Yu <yu...@gmail.com>.
Have you looked at FuzzyRowFilter (assuming salt is of fixed width) ?

On Thu, Sep 21, 2017 at 11:59 AM, Manjeet Singh <ma...@gmail.com>
wrote:

> Hi
>
> We are using HBase version 1.2.
> Our use case is to get all the subscribers who have used application xyz. For
> this, our row key (rk) is as below:
>
> Salt_subscriber_application_usecasenumber
> Example
> #_77777777_facebook_50
>
> Thanks
> Manjeet singh

Re: How to improve filter based scan

Posted by Ted Yu <yu...@gmail.com>.
W.r.t. using more than one Filter: it depends on which filters you use and
the distribution of your data.

Try out a limited set of Filter combinations on a sample of your data.

Cheers

On Thu, Sep 21, 2017 at 12:03 PM, Manjeet Singh <ma...@gmail.com>
wrote:

> One more question: does HBase performance vary based on the filter? And can
> two or more filters improve scan time, or do they increase read time?

Re: How to improve filter based scan

Posted by Manjeet Singh <ma...@gmail.com>.
One more question: does HBase performance vary based on the filter? And can
two or more filters improve scan time, or do they increase read time?

On 22 Sep 2017 12:29 am, "Manjeet Singh" <ma...@gmail.com> wrote:

> Hi
>
> We are using HBase version 1.2.
> Our use case is to get all the subscribers who have used application xyz. For
> this, our row key (rk) is as below:
>
> Salt_subscriber_application_usecasenumber
> Example
> #_77777777_facebook_50
>
> Thanks
> Manjeet singh

Re: How to improve filter based scan

Posted by Manjeet Singh <ma...@gmail.com>.
Hi

We are using HBase version 1.2.
Our use case is to get all the subscribers who have used application xyz. For
this, our row key (rk) is as below:

Salt_subscriber_application_usecasenumber
Example
#_77777777_facebook_50

Thanks
Manjeet singh

On 22 Sep 2017 12:19 am, "Ted Yu" <yu...@gmail.com> wrote:

> bq. how can you get stop key.
>
> This would depend on your use case(s).
> Can you tell us more about your schema ?
>
> BTW which release are you using ?

Re: How to improve filter based scan

Posted by Ted Yu <yu...@gmail.com>.
bq. how can you get stop key.

This would depend on your use case(s).
Can you tell us more about your schema ?

BTW which release are you using ?

On Thu, Sep 21, 2017 at 11:43 AM, Manjeet Singh <ma...@gmail.com>
wrote:

> Hi All
>
> This is an old question, but this is the first time I have felt the pain.
> In HBase we have filters; I am using the substring and value filters.
> The problem is that it takes around 10 minutes to get the result.
> I have found that start and stop keys help, but how do you get the stop
> key?
>
> Can anyone tell me how I can get my scan down to 1 or 2 seconds? I have
> tried page filter, caching, setmaxresult, etc.
>
> Thanks
> Manjeet Singh
>