Posted to user@hbase.apache.org by AnilKumar B <ak...@gmail.com> on 2013/06/01 16:36:45 UTC

Re: What is the best hbase table schema for following json data?

Thanks Ted & Michael.


On Fri, May 31, 2013 at 12:39 AM, Michael Segel
<mi...@hotmail.com> wrote:

> But you should be able to write a custom column filter that handles JSON
> records within a cell.
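
A minimal sketch of what such a filter would have to decide for each cell, assuming each cell holds one click serialized as a JSON object. It is plain Python and models only the matching logic; it does not use the HBase filter API, and the names `cell_matches` and `row` are hypothetical.

```python
import json


def cell_matches(cell_value: bytes, wanted: dict) -> bool:
    """Return True if the JSON object in the cell contains every wanted pair."""
    try:
        record = json.loads(cell_value.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return False  # not valid JSON: treat as a non-match
    if not isinstance(record, dict):
        return False
    return all(k in record and record[k] == v for k, v in wanted.items())


# Hypothetical row: column name -> serialized click (assumed layout).
row = {
    "clicks:event1": b'{"key1": "value1", "key2": "value2"}',
    "clicks:event2": b'{"key1": "other", "key3": "value3"}',
}

matching = [col for col, value in row.items()
            if cell_matches(value, {"key1": "value1"})]
print(matching)
```

Note that a check like this still has to read and parse every cell it is given, so it reduces what is returned to the client rather than the amount of data scanned.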
>
> On May 30, 2013, at 11:48 AM, Ted Yu <yu...@gmail.com> wrote:
>
> > bq. Will ColumnPrefixFilter still work in this case?
> >
> > Probably not. Can you group the subset of keys at the beginning of the
> > column (assuming the subset of keys is known and doesn't change) ?
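
The idea of putting the searched key at the beginning of the column name can be sketched as follows. This is a plain-Python model, not HBase client code: it assumes one column per (key, event) pair with a hypothetical qualifier layout "<key>:<event>", and it relies only on the fact that column qualifiers are kept in sorted order, which is what makes prefix matching cheap.

```python
from bisect import bisect_left


def columns_with_prefix(sorted_qualifiers, prefix):
    """Return the qualifiers that start with prefix, using the sort order."""
    start = bisect_left(sorted_qualifiers, prefix)  # first candidate
    result = []
    for qualifier in sorted_qualifiers[start:]:
        if not qualifier.startswith(prefix):
            break  # matches are contiguous, so stop at the first miss
        result.append(qualifier)
    return result


# Hypothetical qualifiers in the "clicks" family: "<key>:<event>".
qualifiers = sorted([
    "key1:event1", "key1:event2", "key2:event1",
    "key3:event2", "key10:event1",
])

hits = columns_with_prefix(qualifiers, "key1:")
print(hits)
```

The trailing ":" in the prefix matters in this layout: searching for "key1" alone would also match "key10:event1".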
> >
> > bq. I am storing each click (a set of key-value pairs) in one cell, say
> > "clicks:event1". Is this OK?
> >
> > This should be okay.
> >
> > On Wed, May 29, 2013 at 11:13 PM, AnilKumar B <ak...@gmail.com>
> > wrote:
> >
> >> Hi Ted,
> >>
> >> @You can utilize MultipleColumnPrefixFilter or ColumnPrefixFilter to speed
> >> up scan.
> >> [Anil] Thanks for the info. But I am storing all the key-value pairs
> >> corresponding to one click in one column. Will ColumnPrefixFilter still
> >> work in this case?
> >>
> >> @How many key / value pairs does each 'click' have ?
> >> [Anil] The number of key-value pairs is not fixed. It can vary from 20 to 200.
> >>
> >> @Among these pairs, are you going to search for a subset of keys ?
> >> [Anil] Yes.
> >>
> >>
> >>
> >> In my schema, I am storing each click (a set of key-value pairs) in one
> >> cell, say "clicks:event1". Is this OK? Or do I need to change the schema
> >> design so that each key-value pair is stored as its own column? What is
> >> the better way to store JSON data?
> >>
> >>
> >> Thanks,
> >> B Anil Kumar.
> >>
> >>
> >> On Thu, May 30, 2013 at 9:42 AM, Ted Yu <yu...@gmail.com> wrote:
> >>
> >>> bq. 1) If I want to search on a key of a click, it will be a full scan
> >>>
> >>> You can utilize MultipleColumnPrefixFilter or ColumnPrefixFilter to speed
> >>> up the scan.
> >>>
> >>> How many key / value pairs does each 'click' have? Among these pairs, are
> >>> you going to search for a subset of keys?
> >>>
> >>> Cheers
> >>>
> >>> On Wed, May 29, 2013 at 8:47 PM, AnilKumar B <ak...@gmail.com>
> >>> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> What is the best HBase table schema for the following JSON data?
> >>>> I need to store the following JSON data in HBase.
> >>>> {"Session": {"Header":
> >>>> {"key1":"value1","key2":"value2","key3":"value3","key4":"value4",....},
> >>>> "clicks": [{"click": {"key1":"value1","key2":"value2",
> >>>> "key3":"value3"....}}, {"click": {"key1":"value1", "key2":"value2",
> >>>> ....}}]}}
> >>>>
> >>>> I have created the schema below, but there seem to be some issues.
> >>>> rowkey -> composite key of session fields
> >>>> ColumnFamily 1 -> "Header", which consists of the following columns:
> >>>> 1) Header:HeaderFields, which stores "{"Header" :
> >>>> {"key1":"value1","key2":"value2","key3":"value3","key4":"value4",....}"
> >>>> in one cell
> >>>> 2) other columns
> >>>>
> >>>> ColumnFamily 2 -> "clicks", where each "click" is one column
> >>>> The problems here are:
> >>>> 1) If I want to search on a key of a click, it will be a full scan. How
> >>>> can I optimize my schema for such a search requirement?
> >>>> 2) If I want to provide a secondary index for keys of clicks, how can I
> >>>> implement it?
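
One common way to approach the secondary-index question is to maintain a separate lookup structure that maps each (key, value) pair found in a click back to the session row keys that contain it. The sketch below models that with an in-memory dictionary in plain Python; in HBase this would typically be a second table kept up to date on every write, which is an assumption here and not something the thread specifies. The names `build_click_index` and `sessions` are hypothetical.

```python
from collections import defaultdict


def build_click_index(sessions):
    """Map each (key, value) pair seen in any click to the session row keys."""
    index = defaultdict(set)
    for row_key, clicks in sessions.items():
        for click in clicks:
            for key, value in click.items():
                index[(key, value)].add(row_key)
    return index


# Hypothetical data: session row key -> list of clicks (each a dict).
sessions = {
    "session-a": [{"key1": "value1", "key2": "value2"}, {"key1": "x"}],
    "session-b": [{"key1": "value1"}],
}

index = build_click_index(sessions)
lookup = sorted(index[("key1", "value1")])
print(lookup)
```

A lookup then becomes a direct read of the index followed by fetches of the matching session rows, at the cost of extra writes and of keeping the index consistent with the main table.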
> >>>>
> >>>> Thanks,
> >>>> B Anil Kumar.
> >>>>
> >>>
> >>
>
>