Posted to user@hbase.apache.org by Jerry Lam <ch...@gmail.com> on 2012/08/20 20:57:56 UTC

Column Value Reference Timestamp Filter

Hi HBase community:

I have a requirement to query a row based on a timestamp stored in the
value of one of its columns. For example:

(rowkeyA, col1) -> (value) at timestamp = t1, where (value) stores t2. The
result should return all columns of rowkeyA at timestamp = t2.

Note that t1 > t2 ALWAYS.

Does this sound like something that can be done using a Filter? If yes, can
it be done using the existing filters in HBase without customization?

Best Regards,

Jerry
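
A minimal sketch of the two-step read described in the question above: first
read the reference column to obtain t2, then collect every column's cell
written at exactly t2. This is a plain in-memory model of one row, not the
HBase client API; the class and method names (ReferenceTimestampLookup,
readReferenceTimestamp, columnsAt, longBytes) are hypothetical, and storing
t2 as an 8-byte long is an assumption. With the real client this would
presumably be two Get requests, the second one restricted with
Get.setTimeStamp(t2).

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** In-memory stand-in for a single row: column -> (timestamp -> value). */
final class ReferenceTimestampLookup {
    private final Map<String, NavigableMap<Long, byte[]>> row = new TreeMap<>();

    /** Stores one version of a column at the given timestamp. */
    void put(String column, long timestamp, byte[] value) {
        row.computeIfAbsent(column, c -> new TreeMap<>()).put(timestamp, value);
    }

    /** Step 1: decode the newest value of the reference column as a long (t2). */
    long readReferenceTimestamp(String refColumn) {
        NavigableMap<Long, byte[]> versions = row.get(refColumn);
        if (versions == null || versions.isEmpty()) {
            throw new IllegalStateException("no value in column " + refColumn);
        }
        return ByteBuffer.wrap(versions.lastEntry().getValue()).getLong();
    }

    /** Step 2: return every column that has a cell at exactly this timestamp. */
    Map<String, byte[]> columnsAt(long timestamp) {
        Map<String, byte[]> out = new TreeMap<>();
        for (Map.Entry<String, NavigableMap<Long, byte[]>> e : row.entrySet()) {
            byte[] value = e.getValue().get(timestamp);
            if (value != null) {
                out.put(e.getKey(), value);
            }
        }
        return out;
    }

    /** Encodes a long as 8 bytes (assumed encoding of the stored t2). */
    static byte[] longBytes(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }
}
```

The two steps cost two round trips in this model, which is the reason the
question asks whether a server-side Filter could do the same in one pass.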

Re: Column Value Reference Timestamp Filter

Posted by Jerry Lam <ch...@gmail.com>.
Hi Alex:

We decided to use setTimeRange and setMaxVersions, and to remove the column
with a reference timestamp (i.e. we don't put this column into HBase
anymore). This gives the behavior we want, but it seems very inefficient
because all versions are processed before setMaxVersions takes effect
(I just posted some new findings in another post).

Best Regards,

Jerry
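
A rough sketch of the semantics this approach relies on: keep only versions
whose timestamp falls in a half-open range [minStamp, maxStamp), then return
at most maxVersions of them, newest first. It is an in-memory model, not
HBase code; TimeRangeSketch and newestInRange are hypothetical names, and
the exact arguments passed to setTimeRange and setMaxVersions in the real
setup are not given in this message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

/** Models "time range plus max versions" over one column's versions. */
final class TimeRangeSketch {
    /**
     * Returns up to maxVersions values with minStamp <= timestamp < maxStamp,
     * ordered from newest to oldest.
     */
    static <V> List<V> newestInRange(NavigableMap<Long, V> versions,
                                     long minStamp, long maxStamp,
                                     int maxVersions) {
        List<V> out = new ArrayList<>();
        // subMap gives the half-open range; descendingMap walks newest first.
        for (V value : versions.subMap(minStamp, true, maxStamp, false)
                               .descendingMap().values()) {
            if (out.size() == maxVersions) {
                break; // stop as soon as enough versions were collected
            }
            out.add(value);
        }
        return out;
    }
}
```

Under these assumptions, asking for the value as of t2 would be
newestInRange(versions, 0, t2 + 1, 1), since the upper bound is exclusive.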

On Mon, Aug 20, 2012 at 4:47 PM, Alex Baranau <al...@gmail.com> wrote:

> Hi,
>
> So, you have a row with key rowKeyA and column col1. It contains two
> values, value1 and value2, at timestamp1 and timestamp2 respectively,
> where timestamp1 is the most recent. And you want to fetch the "most
> recent but one" values in all columns when doing the scan. I.e. you don't
> know timestamp1 or timestamp2 exactly; you just need to fetch the value
> which was placed before the most recent one. Is that correct?
>
> I don't think there's a filter that would allow you to do so
> "out-of-the-box". You should probably be able to write such a filter and
> use scan.setMaxVersions(2). I'm not sure whether KeyValues are fed into
> the filter ordered by their timestamp.
>
> How about returning the 2 most recent values to the client and filtering
> on the client side? Why doesn't this work in your case? (Are the column
> values large, or is it something else?)
>
> Alex Baranau
> ------
> Sematext :: http://blog.sematext.com/ :: Hadoop - HBase - ElasticSearch -
> Solr
>

Re: Column Value Reference Timestamp Filter

Posted by Alex Baranau <al...@gmail.com>.
Hi,

So, you have a row with key rowKeyA and column col1. It contains two
values, value1 and value2, at timestamp1 and timestamp2 respectively, where
timestamp1 is the most recent. And you want to fetch the "most recent but
one" values in all columns when doing the scan. I.e. you don't know
timestamp1 or timestamp2 exactly; you just need to fetch the value which was
placed before the most recent one. Is that correct?

I don't think there's a filter that would allow you to do so
"out-of-the-box". You should probably be able to write such a filter and use
scan.setMaxVersions(2). I'm not sure whether KeyValues are fed into the
filter ordered by their timestamp.

How about returning the 2 most recent values to the client and filtering on
the client side? Why doesn't this work in your case? (Are the column values
large, or is it something else?)

Alex Baranau
------
Sematext :: http://blog.sematext.com/ :: Hadoop - HBase - ElasticSearch -
Solr
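
A minimal sketch of the client-side filtering suggested above, assuming the
client has already fetched up to two versions per column and arranged them as
column -> (timestamp -> value). It keeps, for each column, the version just
before the newest one. SecondNewestVersion and pick are hypothetical names,
and the map layout is an assumption rather than the shape the HBase client
returns.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Picks the "most recent but one" version of every column on the client. */
final class SecondNewestVersion {
    /**
     * For each column with at least two versions, returns the value whose
     * timestamp is the greatest one strictly below the newest timestamp.
     * Columns with a single version are skipped.
     */
    static <V> Map<String, V> pick(Map<String, NavigableMap<Long, V>> row) {
        Map<String, V> out = new TreeMap<>();
        for (Map.Entry<String, NavigableMap<Long, V>> e : row.entrySet()) {
            NavigableMap<Long, V> versions = e.getValue();
            if (versions.size() < 2) {
                continue; // no previous version to return
            }
            Long newest = versions.lastKey();
            out.put(e.getKey(), versions.lowerEntry(newest).getValue());
        }
        return out;
    }
}
```

The cost of this approach is that both versions of every column travel to
the client, which matters mainly when the values are large.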
