Posted to user@hbase.apache.org by Vikram Singh Chandel <vi...@gmail.com> on 2014/05/02 22:33:00 UTC
Re: How to implement sorting in HBase scans for a particular column

Hi James,
Thanks a lot for the reply. We will give it a try and let you know how we
progress.
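
For the "top 1000 authors by publication count" request in the original
question quoted below, one way to avoid holding all 8.2 million rows in a
map/list is a bounded min-heap that never keeps more than N entries. The
following is only a minimal sketch under stated assumptions: the Author
class, its fields, and the topN helper are hypothetical names, and the rows
are assumed to arrive one at a time from a scan. It still has to read every
row, so it reduces memory use rather than making the query real time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class TopAuthors {
    // Hypothetical value type: one author row reduced to the fields needed here.
    static final class Author {
        final String id;
        final long publicationCount;

        Author(String id, long publicationCount) {
            this.id = id;
            this.publicationCount = publicationCount;
        }
    }

    // Orders authors by ascending publication count.
    static final Comparator<Author> BY_COUNT =
            Comparator.comparingLong((Author a) -> a.publicationCount);

    // Returns the n authors with the highest publication count, highest first.
    // At most n rows are held in memory at any time.
    static List<Author> topN(Iterable<Author> rows, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Min-heap: the head is the smallest count among the current top n.
        PriorityQueue<Author> heap = new PriorityQueue<>(n, BY_COUNT);
        for (Author row : rows) {
            if (heap.size() < n) {
                heap.add(row);
            } else if (row.publicationCount > heap.peek().publicationCount) {
                heap.poll();   // evict the current smallest
                heap.add(row); // keep the larger row instead
            }
        }
        List<Author> result = new ArrayList<>(heap);
        result.sort(BY_COUNT.reversed());
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample rows, for illustration only.
        List<Author> rows = Arrays.asList(
                new Author("a", 5), new Author("b", 12),
                new Author("c", 40), new Author("d", 7));
        for (Author a : topN(rows, 2)) {
            System.out.println(a.id + " " + a.publicationCount);
        }
    }
}
```

With N = 1000 the heap stays small regardless of table size, and each row
costs at most O(log N) work.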




On Tue, Apr 29, 2014 at 11:25 PM, James Taylor <jt...@salesforce.com> wrote:

> Hi Vikram,
> I see you sent a question to the Phoenix mailing list back in December on
> how to use Phoenix 2.1.2 with Hadoop 2 for HBase 0.94. It looks like you were
> having trouble building Phoenix with the hadoop2 profile. In our 3.0/4.0
> releases we bundle the Phoenix jars pre-built for both hadoop1 and hadoop2,
> so there's nothing you need to do.
>
> Did you have any other issues?
>
> Regarding sorting rows, Apache Phoenix handles this for you when you do an
> ORDER BY:
> CREATE TABLE names(id VARCHAR NOT NULL PRIMARY KEY,
>     name VARCHAR, age INTEGER);
> -- populate the table
> SELECT * FROM names ORDER BY age;
>
> Thanks,
> James
>
>
> On Tue, Apr 29, 2014 at 5:33 AM, Vikram Singh Chandel <
> vikramsinghchandel@gmail.com> wrote:
>
> > Yes, we looked at it, but that was back in November/December 2013, when
> > it had a lot of issues, because of which we decided not to use it. We
> > built our solution design on HBase alone, so we are looking for a better
> > solution.
> >
> > Thanks
> >
> >
> > On Tue, Apr 29, 2014 at 5:46 PM, Ted Yu <yu...@gmail.com> wrote:
> >
> > > Have you looked at Apache Phoenix?
> > >
> > > Cheers
> > >
> > > On Apr 29, 2014, at 2:13 AM, Vikram Singh Chandel <
> > > vikramsinghchandel@gmail.com> wrote:
> > >
> > > > Hi
> > > >
> > > > > We have a requirement in which we have to get the scan result sorted
> > > > > on a particular column.
> > > >
> > > > > e.g. *Get details of authors sorted by their publication count.
> > > > > Limit: 1000*
> > > >
> > > > > *Row key is an MD5 hash of the Author Id*
> > > >
> > > > > Number of records: 8.2 million rows for 3 years of data (sample
> > > > > dataset; the actual data set covers 30 years).
> > > >
> > > > > We are currently looking into implementing a *comparator* and sorting
> > > > > the values, but for this we first have to store all 8.2 million
> > > > > records in a map/list and then sort them. This approach is neither
> > > > > memory efficient nor time efficient.
> > > >
> > > > > Is there any solution via which this kind of request can be fulfilled
> > > > > in real time?
> > > >
> > > >
> > > >
> > > > --
> > > > *Regards*
> > > >
> > > > *VIKRAM SINGH CHANDEL*
> > > >
> > > > > Please do not print this email unless it is absolutely necessary.
> > > > > Reduce. Reuse. Recycle. Save our planet.
> > >
> >
> >
> >
> > --
> > *Regards*
> >
> > *VIKRAM SINGH CHANDEL*
> >
> >
>



-- 
*Regards*

*VIKRAM SINGH CHANDEL*
