Posted to user@hbase.apache.org by 齐忠 <ce...@gmail.com> on 2016/05/16 08:46:42 UTC

hbase rowkey design

I have a very large log (50 TB per day).

My log events look like this:

url,visitid,requesttime

http://www.aaa.com?a=b&c=d&e=f, 1, 1463387380
http://www.aaa.com?a=b&c=d&e=fa, 1, 1463387280
http://www.aaa.com?a=b&c=d&e=fa, 2, 1463387280
http://www.aaa.com?a=b&c=d&e=fab, 2, 1463387280
http://www.aaa.com?a=b&c=d&e=f, 1, 1463387380


When a user enters part of a URL, the system should return the UV
(unique visitors) and PV (page views) of the matching events.

For example:

input: e=f*

output: uv=2, pv=5

input: e=fa

output: uv=2, pv=3
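
The two examples are consistent with a plain substring match on the URL, where a trailing `*` is only a wildcard marker. The following is a minimal in-memory sketch under that assumption, not an HBase design; the function name `uv_pv` is hypothetical, and it counts every log line as one page view, including the repeated first and last lines.

```python
# Sample events from the message above: (url, visitid, requesttime).
events = [
    ("http://www.aaa.com?a=b&c=d&e=f", 1, 1463387380),
    ("http://www.aaa.com?a=b&c=d&e=fa", 1, 1463387280),
    ("http://www.aaa.com?a=b&c=d&e=fa", 2, 1463387280),
    ("http://www.aaa.com?a=b&c=d&e=fab", 2, 1463387280),
    ("http://www.aaa.com?a=b&c=d&e=f", 1, 1463387380),
]


def uv_pv(events, fragment):
    """Return (uv, pv) for events whose URL contains the fragment.

    Assumption: a trailing '*' means "anything may follow", so it is
    dropped and the rest is used as a plain substring.
    """
    needle = fragment.rstrip("*")
    visitors = set()   # distinct visit ids -> UV
    page_views = 0     # matching log lines -> PV
    for url, visit_id, _request_time in events:
        if needle in url:
            page_views += 1
            visitors.add(visit_id)
    return len(visitors), page_views


print(uv_pv(events, "e=f*"))  # (2, 5)
print(uv_pv(events, "e=fa"))  # (2, 3)
```

A full scan like this is only meant to pin down the query semantics; at 50 TB per day the open question is how to avoid scanning everything.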

How should I design the rowkey?

Thanks.

Re: hbase rowkey design

Posted by Heng Chen <he...@gmail.com>.
At my company, we calculate UV/PV offline in a batch job and update the results once a day.
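
A daily batch of this kind could be sketched as follows. This is only an illustration under assumptions: the function name `daily_uv_pv` and the (day, url) grouping are hypothetical, days are taken in UTC, and a real job would run on a distributed framework rather than in one process.

```python
from collections import defaultdict
from datetime import datetime, timezone


def daily_uv_pv(events):
    """Aggregate (url, visitid, requesttime) events into per-day, per-URL (uv, pv)."""
    pv = defaultdict(int)        # (day, url) -> number of log lines
    visitors = defaultdict(set)  # (day, url) -> distinct visit ids
    for url, visit_id, request_time in events:
        # requesttime is assumed to be Unix seconds; the day is computed in UTC.
        day = datetime.fromtimestamp(request_time, tz=timezone.utc).strftime("%Y-%m-%d")
        key = (day, url)
        pv[key] += 1
        visitors[key].add(visit_id)
    return {key: (len(visitors[key]), pv[key]) for key in pv}
```

PV values for several URLs can be added together, but UV values cannot, because the same visitor may appear under more than one URL. A query that spans several URLs therefore needs the visitor sets themselves (or some mergeable summary of them), not only the per-URL counts.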

If you do it online, url + timestamp could be the rowkey.
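
To illustrate how a url + timestamp rowkey would behave, here is a small pure-Python model (no HBase client involved). HBase keeps rows sorted by the bytes of the rowkey, so the sketch uses a sorted list and a range lookup in place of a scan. The separator byte, the 8-byte big-endian timestamp encoding, and the helper names `make_rowkey`, `prefix_stop` and `prefix_scan` are all assumptions made for the example.

```python
import bisect
import struct

SEP = b"\x00"  # assumed separator; must not occur inside the URL bytes


def make_rowkey(url, request_time):
    """Rowkey = URL bytes + separator + 8-byte big-endian timestamp."""
    return url.encode("utf-8") + SEP + struct.pack(">Q", request_time)


def prefix_stop(prefix):
    """Smallest key that sorts after every key starting with prefix.

    Assumes prefix is non-empty and its last byte is not 0xFF.
    """
    return prefix[:-1] + bytes([prefix[-1] + 1])


def prefix_scan(sorted_keys, prefix):
    """Return the keys in [prefix, prefix_stop(prefix)), like a range scan."""
    lo = bisect.bisect_left(sorted_keys, prefix)
    hi = bisect.bisect_left(sorted_keys, prefix_stop(prefix))
    return sorted_keys[lo:hi]


# (url, requesttime) pairs from the sample log; a set models the fact that
# identical rowkeys address the same row.
pairs = [
    ("http://www.aaa.com?a=b&c=d&e=f", 1463387380),
    ("http://www.aaa.com?a=b&c=d&e=fa", 1463387280),
    ("http://www.aaa.com?a=b&c=d&e=fa", 1463387280),
    ("http://www.aaa.com?a=b&c=d&e=fab", 1463387280),
    ("http://www.aaa.com?a=b&c=d&e=f", 1463387380),
]
keys = sorted({make_rowkey(url, ts) for url, ts in pairs})
```

Two things show up in this model. First, the five sample events produce only three distinct keys, because events with the same URL and timestamp share a rowkey; the visit id would have to go into the key or into a column qualifier to keep them apart. Second, a range lookup works when the query is a prefix of the full URL (for example `http://www.aaa.com?a=b&c=d&e=fa`), but a fragment from the middle of the URL such as `e=fa` does not correspond to any contiguous key range.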



2016-05-16 18:13 GMT+08:00 齐忠 <ce...@gmail.com>:

> Yes, like Google Analytics.
>
> 2016-05-16 17:48 GMT+08:00 Heng Chen <he...@gmail.com>:
> > You want to calculate UV/PV online?

Re: hbase rowkey design

Posted by 齐忠 <ce...@gmail.com>.
Yes, like Google Analytics.

2016-05-16 17:48 GMT+08:00 Heng Chen <he...@gmail.com>:
> You want to calculate UV/PV online?



-- 
centerqi@gmail.com|齐忠

Re: hbase rowkey design

Posted by Heng Chen <he...@gmail.com>.
You want to calculate UV/PV online?
