Posted to user@hbase.apache.org by Mingtao Zhang <ma...@gmail.com> on 2014/08/16 23:26:57 UTC

Scan result sequence

Hi,

My rowkey is

sessionid|hash(pageurl)|timestamp

When I scan using a prefix filter with a specific sessionid, will the
results come back in sorted order? For example, the sequence I expect is:

session1|hash(a.com)|1
session1|hash(a.com)|2
session1|hash(a.com)|3
session1|hash(b.com)|2.5
session1|hash(b.com)|5
session1|hash(b.com)|6
session1|hash(c.com)|3.5
session1|hash(c.com)|5.5
session1|hash(c.com)|7

Thanks in advance!

Best Regards,
Mingtao

Re: Scan result sequence

Posted by Mingtao Zhang <ma...@gmail.com>.
Great :) Thank you!

Mingtao



Re: Scan result sequence

Posted by Ted Yu <yu...@gmail.com>.
bq. hash(a.com) comes together with the timestamp sequence

That should be the case - assuming your sessionId is of fixed width.
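
To illustrate why width matters, here is a minimal sketch that imitates HBase's
row ordering with plain Python bytes (HBase sorts rows by comparing key bytes
lexicographically, and Python compares bytes the same way). The key layout,
the 4-byte placeholder hash "aaaa", and the helper names prefix_scan and key
are assumptions made for this example, not part of the original schema or of
the HBase API.

```python
import struct

def prefix_scan(sorted_keys, prefix):
    """Hypothetical stand-in for a prefix scan: keep keys that start with prefix."""
    return [k for k in sorted_keys if k.startswith(prefix)]

# Timestamps written as text sort as strings, so "10" lands before "2".
text_keys = sorted(b"session1|aaaa|" + str(t).encode() for t in (1, 2, 10))
text_ts = [int(k.rsplit(b"|", 1)[1]) for k in text_keys]

def key(session, url_hash, ts):
    # Assumed layout: session id, 4-byte hash, 8-byte big-endian timestamp.
    return session + b"|" + url_hash + b"|" + struct.pack(">Q", ts)

# Fixed-width big-endian timestamps sort in numeric order.
fixed_keys = sorted(key(b"session1", b"aaaa", t) for t in (1, 2, 10))
fixed_ts = [struct.unpack(">Q", k[-8:])[0] for k in fixed_keys]

# Variable-width session ids: the prefix "session1" also matches "session10".
mixed = sorted([b"session1|aaaa|1", b"session10|aaaa|1", b"session2|aaaa|1"])
leaked = prefix_scan(mixed, b"session1")
# Including the delimiter narrows the match, provided ids never contain "|".
exact = prefix_scan(mixed, b"session1|")

print(text_ts, fixed_ts, leaked, exact)
```

The same reasoning applies to every component of the key: any field that is
not fixed width (or otherwise unambiguously delimited) can change where a row
sorts and which rows a prefix matches.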

Cheers



Re: Scan result sequence

Posted by Mingtao Zhang <ma...@gmail.com>.
Hi Ted,

I used murmurhash. Actually, I don't care about the order between the
groups of a.com and b.com records. I am 120% happy :) as long as
hash(a.com) comes together with the timestamp sequence. (hash(b.com) could
come either before or after.)

Best Regards,
Mingtao


Re: Scan result sequence

Posted by Ted Yu <yu...@gmail.com>.
How do you generate the hash from pageurl?
The order between hash(a.com) and hash(b.com) may not be what you expect.

BTW See http://hbase.apache.org/book.html#row and
http://hbase.apache.org/book.html#dm.sort
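
As a rough sketch of the point about hash order: rows for one URL stay together
and in timestamp order, but the groups themselves are ordered by hash bytes,
not by URL. This example uses a truncated MD5 from the Python standard library
purely as a stand-in for the real hash function, and the key layout and the
names url_hash and row_key are assumptions for illustration.

```python
import hashlib

def url_hash(url: str) -> bytes:
    # Stand-in hash: first 4 bytes of MD5 (an assumption, not the real hash).
    return hashlib.md5(url.encode()).digest()[:4]

def row_key(session: bytes, url: str, ts: int) -> bytes:
    # Assumed layout: 8-byte session id | 4-byte hash | 8-byte big-endian timestamp.
    return session + b"|" + url_hash(url) + b"|" + ts.to_bytes(8, "big")

events = [("a.com", 1), ("b.com", 5), ("a.com", 3),
          ("c.com", 7), ("b.com", 2), ("a.com", 2)]

# Sorting the raw keys imitates the order a scan would return them in.
keys = sorted(row_key(b"session1", url, ts) for url, ts in events)

# Decode each key back to (url, timestamp) using the fixed offsets above.
by_hash = {url_hash(url): url for url, _ in events}
scanned = [(by_hash[k[9:13]], int.from_bytes(k[-8:], "big")) for k in keys]

for url, ts in scanned:
    print(url, ts)
```

Which URL group comes first depends entirely on the hash values, so the
printed group order should not be relied on.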

Cheers

