Posted to user@hbase.apache.org by Mohammad Tariq <do...@gmail.com> on 2013/01/28 12:45:04 UTC

Indexing Hbase Data

Hello list,

I would like some suggestions on indexing HBase data. What would you
prefer? I have never faced such a requirement until now. This is the
first time I need indexing, so I thought of getting some expert
comments and suggestions.

Thank you so much for your precious time.

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com

Re: Indexing Hbase Data

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Mohammad,

I think I understand what you are trying to do, but I'm not really
sure there is a faster way. All the accesses you are doing are direct;
there is no scan. You do one direct access to look into your clauses
table, then another direct access based on what you found there.

For your miles vs. meters example, to handle the distance you "simply"
need another reference table which gives you the conversion rules. You
have keys like "milesmiles" (value=1), "metersmiles"
(value=0.000621371), and "milesmeters" (value=1609.34). The key is the
unit of the value being put (meters) followed by the standard unit
(miles), which gives "metersmiles". So each time someone wants to
insert something, you look up the conversion rate in this table before
inserting it.
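
The lookup described above could be sketched roughly as follows. This is only an illustration: a plain Python dict stands in for the HBase conversion table, and the names CONVERSION_TABLE, conversion_key and to_standard are made up for the example, not part of any HBase API.

```python
# Plain dict standing in for the HBase conversion table (row key -> rate).
# The three entries are the ones quoted in the message above.
CONVERSION_TABLE = {
    "milesmiles": 1.0,
    "metersmiles": 0.000621371,
    "milesmeters": 1609.34,
}

STANDARD_UNIT = "miles"


def conversion_key(entered_unit, standard_unit=STANDARD_UNIT):
    """Build the row key: unit of the incoming value, then the standard unit."""
    return entered_unit + standard_unit


def to_standard(value, entered_unit):
    """Fetch the rate with one direct lookup and convert the value."""
    key = conversion_key(entered_unit)
    if key not in CONVERSION_TABLE:
        raise KeyError(f"no conversion rule for {key!r}")
    return value * CONVERSION_TABLE[key]
```

With a real table, the dict access would presumably become a single get on the row key, so the cost per insert stays at one direct read.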

I just hope you will have hundreds of thousands of conversion types,
because HBase might not be the best fit for only a few entries.

JM

2013/1/28, Mohammad Tariq <do...@gmail.com>:
> Hello Jean,
>
>            Actually it's to read the values faster. The problem goes like
> this :
>
> I have a table that has just 2 columns : 1- Stores some clause.
>                                                          2- Stores all
> possible aliases for the original clause.
>
> These clauses are again 'column names' for another table.
>
> Now, a user can insert a value using any of the aliases or using
> the original name of a clause. If he/she gives the actual name I can put
> the value there directly into the main table which holds the data. And, if
> the user gives an alias instead of the actual clause name, I have to get
> the actual clause name first name from this table and then put the data
> into the main table.
>
> So, this table basically holds only the mappings. But the actual data has
> to stored into some other table. If I am not able to get hold of this
> mapping stuff quickly there would be a lot of overhead while putting the
> data and my puts might eventually fail.
>
> For example :
>
> My main table has a column, say "distance" and another column, say "units".
> Now, a user wants to insert a distance value as "miles" or as "meters".
> Considering "miles" as the standard unit, if a user inserts the distance in
> "miles", I can put it as it is. But if the user tries to insert the
> distance in "meters", I have to first find out that meters actually mean
> miles and I have to put the value into the distance column after converting
> it into miles.
>
> I hope I was able to explain the problem properly ;)
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
>
> On Mon, Jan 28, 2013 at 5:57 PM, Jean-Marc Spaggiari <
> jean-marc@spaggiari.org> wrote:
>
>> Hi Mohammad,
>>
>> I don't really see how you can get faster results than indexing the
>> content as the row key in another table. Access is direct after that.
>>
>> What do you mean with "faster resuls"? To build the index? Or to read
>> through it?
>>
>> JM
>>
>> 2013/1/28, Mohammad Tariq <do...@gmail.com>:
>> > Thank you for the valuable reply sir. Actually I tried that and it
>> > works
>> > fine. But we need faster results. I was thinking of creating an index
>> > and
>> > have it loaded in the memory, at all times. so that fetches are faster.
>> Is
>> > there any OOTB feature available in co-proc?
>> >
>> > Warm Regards,
>> > Tariq
>> > https://mtariq.jux.com/
>> > cloudfront.blogspot.com
>> >
>> >
>> > On Mon, Jan 28, 2013 at 5:35 PM, ramkrishna vasudevan <
>> > ramkrishna.s.vasudevan@gmail.com> wrote:
>> >
>> >> As a POC, just try to load the data into another table that has the
>> >> rowkey
>> >> that has the original row's value.
>> >> Try to scan the index table first and then get the main table row key.
>> >> First this should help, later can make this more better by using
>> >> coprocessors.
>> >>
>> >> Regards
>> >> Ram
>> >>
>> >> On Mon, Jan 28, 2013 at 5:25 PM, Mohammad Tariq <do...@gmail.com>
>> >> wrote:
>> >>
>> >> > Hello Viral,
>> >> >
>> >> >      Thank you so much for the quick response. Intention is to index
>> >> > the
>> >> > values. I'll have a look at ihbase.
>> >> >
>> >> > Warm Regards,
>> >> > Tariq
>> >> > https://mtariq.jux.com/
>> >> > cloudfront.blogspot.com
>> >> >
>> >> >
>> >> > On Mon, Jan 28, 2013 at 5:22 PM, Viral Bajaria <
>> viral.bajaria@gmail.com
>> >> > >wrote:
>> >> >
>> >> > > When you say indexing, are you referring to indexing the column
>> >> > qualifiers
>> >> > > or the values that you are storing in the qualifier ?
>> >> > >
>> >> > > Regarding indexing, I remember someone had recommended this on the
>> >> > mailing
>> >> > > list before: https://github.com/ykulbak/ihbase/wiki but it seems
>> the
>> >> > > development on that is not active anymore.
>> >> > >
>> >> > > -Viral
>> >> > >
>> >> > > On Mon, Jan 28, 2013 at 3:45 AM, Mohammad Tariq
>> >> > > <dontariq@gmail.com
>> >
>> >> > > wrote:
>> >> > >
>> >> > > > Hello list,
>> >> > > >
>> >> > > >          I would like to have some suggestions on Hbase data
>> >> indexing.
>> >> > > What
>> >> > > > would you prefer? I never faced such requirement till now. This
>> >> > > > is
>> >> the
>> >> > > > first time when there is a need of indexing, so thought of
>> >> > > > getting
>> >> some
>> >> > > > expert comments and suggestions.
>> >> > > >
>> >> > > > Thank you so much for your precious time.
>> >> > > >
>> >> > > > Warm Regards,
>> >> > > > Tariq
>> >> > > > https://mtariq.jux.com/
>> >> > > > cloudfront.blogspot.com
>> >> > > >
>> >> > >
>> >> >
>> >>
>> >
>>
>

Re: Indexing Hbase Data

Posted by Mohammad Tariq <do...@gmail.com>.
Hello Jean,

Actually, it's to read the values faster. The problem goes like this:

I have a table that has just 2 columns:
1- Stores some clause.
2- Stores all possible aliases for the original clause.

These clauses are, in turn, 'column names' for another table.

Now, a user can insert a value using any of the aliases or using
the original name of a clause. If he/she gives the actual name, I can put
the value directly into the main table which holds the data. And if
the user gives an alias instead of the actual clause name, I have to get
the actual clause name from this table first and then put the data
into the main table.

So, this table basically holds only the mappings, but the actual data has
to be stored in some other table. If I am not able to get hold of this
mapping quickly, there will be a lot of overhead while putting the
data, and my puts might eventually fail.

For example:

My main table has a column, say "distance", and another column, say "units".
Now, a user wants to insert a distance value as "miles" or as "meters".
Considering "miles" as the standard unit, if a user inserts the distance in
"miles", I can put it as it is. But if the user tries to insert the
distance in "meters", I have to first find out that "meters" maps to the
standard unit "miles", and then put the value into the distance column
after converting it into miles.
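
A rough sketch of the alias lookup described above, assuming the mapping table is inverted so that each alias keys its clause. Dicts stand in for both HBase tables, and the alias values ("dist", "length", "unit", "uom") as well as the function names are hypothetical.

```python
# Hypothetical rows of the mapping table: clause -> all of its aliases.
CLAUSE_ALIASES = {
    "distance": ["dist", "length"],
    "units": ["unit", "uom"],
}


def build_alias_index(clause_aliases):
    """Invert the mapping so every alias (and clause name) keys its clause."""
    index = {}
    for clause, aliases in clause_aliases.items():
        index[clause] = clause  # the original name resolves to itself
        for alias in aliases:
            index[alias] = clause
    return index


ALIAS_INDEX = build_alias_index(CLAUSE_ALIASES)


def put_value(main_table, row_key, name, value):
    """Resolve `name` to its clause, then write into the main table."""
    clause = ALIAS_INDEX[name]  # one direct lookup, no scan
    main_table.setdefault(row_key, {})[clause] = value
    return clause
```

Keying the index by alias means a put needs at most one extra direct read, whichever name the user supplies.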

I hope I was able to explain the problem properly ;)

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com



Re: Indexing Hbase Data

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Mohammad,

I don't really see how you can get faster results than indexing the
content as the row key in another table. Access is direct after that.

What do you mean by "faster results"? To build the index? Or to read
through it?

JM


Re: Indexing Hbase Data

Posted by Mohammad Tariq <do...@gmail.com>.
Thank you for the valuable reply, sir. Actually, I tried that and it works
fine, but we need faster results. I was thinking of creating an index and
keeping it loaded in memory at all times, so that fetches are faster. Is
there any out-of-the-box (OOTB) feature available in coprocessors?
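
One way to picture an index "loaded in memory at all times" is a client-side cache of the whole mapping table, reloaded after a time-to-live. The sketch below is an assumption for illustration, not a coprocessor feature; MappingCache and the load_all callable are hypothetical names, and load_all stands for whatever reads the mapping table.

```python
import time


class MappingCache:
    """Client-side copy of the mapping table, reloaded after `ttl` seconds."""

    def __init__(self, load_all, ttl=60.0, clock=time.monotonic):
        self._load_all = load_all  # hypothetical callable returning {alias: clause}
        self._ttl = ttl
        self._clock = clock
        self._data = {}
        self._loaded_at = None

    def get(self, alias):
        """Return the clause for `alias`, or None if the alias is unknown."""
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._data = dict(self._load_all())  # one full read of the table
            self._loaded_at = now
        return self._data.get(alias)
```

This only makes sense while the mapping table is small enough to hold in memory, and new aliases become visible only after the next reload.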

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com



Re: Indexing Hbase Data

Posted by ramkrishna vasudevan <ra...@gmail.com>.
As a POC, just try to load the data into another table whose row key
is the original row's value.
Scan the index table first to get the main table row key.
This should help to begin with; later you can improve it by using
coprocessors.
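
A minimal sketch of such an index table, under the assumption that each index row key is the indexed value, a separator, and the main table row key. A sorted Python list imitates HBase's lexicographically sorted row keys; SEP, build_index and lookup are illustrative names only.

```python
import bisect

SEP = "\x00"  # assumed separator between the indexed value and the main row key


def build_index(main_table, column):
    """Return sorted index row keys of the form value + SEP + main_row_key."""
    keys = [
        f"{cells[column]}{SEP}{row_key}"
        for row_key, cells in main_table.items()
        if column in cells
    ]
    keys.sort()  # HBase keeps row keys sorted; the sort imitates that here
    return keys


def lookup(index_keys, value):
    """Prefix-scan the index for `value` and return the main table row keys."""
    prefix = f"{value}{SEP}"
    start = bisect.bisect_left(index_keys, prefix)
    matches = []
    for key in index_keys[start:]:
        if not key.startswith(prefix):
            break  # past the last key with this prefix
        matches.append(key[len(prefix):])
    return matches
```

The sketch assumes string values that never contain the separator. Putting the main row key into the index key lets several rows share one value, at the cost of a short prefix scan instead of a single get.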

Regards
Ram


Re: Indexing Hbase Data

Posted by Mohammad Tariq <do...@gmail.com>.
Hello Viral,

Thank you so much for the quick response. The intention is to index the
values. I'll have a look at ihbase.

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com



Re: Indexing Hbase Data

Posted by Viral Bajaria <vi...@gmail.com>.
When you say indexing, are you referring to indexing the column qualifiers
or the values that you are storing in the qualifiers?

Regarding indexing, I remember someone recommended this on the mailing
list before: https://github.com/ykulbak/ihbase/wiki but it seems that
development on it is no longer active.

-Viral
