Posted to user@hbase.apache.org by Laurent Hatier <la...@gmail.com> on 2011/08/10 19:02:17 UTC

Mongo vs HBase

Hi all,

I would like to know why MongoDB is faster than HBase at selecting items.
Here is my case:
I've inserted 4,000,000 rows into both HBase and MongoDB, and I need to look
up the geolocation for an IP address. I compute a long integer from the IP
and search for it among the 4,000,000 rows.
Selecting the right row takes 5 ms with MongoDB, whereas HBase takes 5
seconds.
I think the reason is the cur.limit(1) method in MongoDB. Is there no
equivalent function in HBase?
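
For readers following along, the "long number from the IP" step can be sketched as below. This is only an illustration of the usual IPv4-to-integer conversion; the function names are made up here and the poster's actual code is not shown in the thread.

```python
import ipaddress


def ip_to_long(ip: str) -> int:
    """Convert a dotted-quad IPv4 address to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(ip))


def ip_to_long_manual(ip: str) -> int:
    """Same conversion done by hand: each octet is one byte, big-endian."""
    a, b, c, d = (int(part) for part in ip.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


if __name__ == "__main__":
    print(ip_to_long("192.168.1.10"))         # 3232235786
    print(ip_to_long_manual("192.168.1.10"))  # 3232235786
```

Both functions give the same value, so either can be used to build the lookup key.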

-- 
Laurent HATIER
Second-year student in the engineering program at EISTI

Re: Mongo vs HBase

Posted by Fuad Efendi <fu...@efendi.ca>.

Sorry for going off topic, but here is an example that illustrates the
fundamental difference:

1. "SELECT COUNT" will take a few hours on MySQL InnoDB in most typical
cases, and _it_is_ implemented.

2. Same with HBase: it requires a full table scan. With MapReduce it might
take less time, though. Or we can query Solr (the Lily way) to get the
number of records, but the result won't be exactly correct.

Of course, we can "transactionally" store the number of records somewhere
and _kill_performance_.
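
A toy sketch of that trade-off, using an in-memory dict rather than MySQL or HBase (the class and method names are hypothetical): every write takes a lock so the stored count stays exact, which makes counting cheap and writes serialized.

```python
import threading


class CountedTable:
    """In-memory stand-in for a table that keeps an exact row count."""

    def __init__(self):
        self._rows = {}
        self._count = 0
        self._lock = threading.Lock()  # serializes every write

    def put(self, key, value):
        with self._lock:
            if key not in self._rows:
                self._count += 1       # counter updated together with the row
            self._rows[key] = value

    def delete(self, key):
        with self._lock:
            if self._rows.pop(key, None) is not None:
                self._count -= 1

    def count_fast(self):
        """O(1): read the maintained counter."""
        return self._count

    def count_by_scan(self):
        """O(n): what a COUNT without a stored counter has to do."""
        return sum(1 for _ in self._rows)
```

The single lock is the point of the illustration: the count is always right, but all writers now queue behind it.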

Another solution is to use fixed-width records (similar to MyISAM), but
the data will be sparse, etc.

Lily provides an HBase-based "Write Ahead Log", an HBase-based "Message
Queue", and an HBase-based "Secondary Index" (a separate library); it also
provides framework support for subscribing to a queue of messages.

-- 
Fuad Efendi
http://www.tokenizer.ca

On 11-08-11 4:13 AM, "Laurent Hatier" <la...@gmail.com> wrote:

>Thanks all.
>
>I've seen that there is no LIMIT in HBase. I mean the equivalent of the
>statement "SELECT ... FROM ... LIMIT 1" (MongoDB has such a method ^^).
>Is it implemented?
>

Re: Mongo vs HBase

Posted by Laurent Hatier <la...@gmail.com>.

Thanks all.

I've seen that there is no LIMIT in HBase. I mean the equivalent of the
statement "SELECT ... FROM ... LIMIT 1" (MongoDB has such a method ^^).
Is it implemented?

2011/8/11 Jason Rutherglen <ja...@gmail.com>

> Laurent,
>
> This could be implemented with Lucene, e.g., HBASE-3529.  Contact me
> offline if you are interested in pursuing that angle.
>
> Cheers.



-- 
Laurent HATIER
Second-year student in the engineering program at EISTI

Re: Mongo vs HBase

Posted by Jason Rutherglen <ja...@gmail.com>.

Laurent,

This could be implemented with Lucene, e.g., HBASE-3529.  Contact me
offline if you are interested in pursuing that angle.

Cheers.


Re: Mongo vs HBase

Posted by Li Pi <li...@cloudera.com>.

There have been a few attempts; some are up to date, others deprecated.

See IHBase, ITHBase, and Lily.

Also see https://issues.apache.org/jira/browse/HBASE-3340

Lily is the only one that is up to date.


On Wed, Aug 10, 2011 at 8:07 PM, Blake Lemoine <ba...@gmail.com> wrote:

> I'm just curious here.  I'm working on a Google Summer of Code project
> currently that utilizes HBase and several times now I've made secondary
> indices based on what I think are standard practices.  Is there any
> principled reason that this process couldn't be automated or is it just
> that no one has implemented it yet?
>

Re: Mongo vs HBase

Posted by Blake Lemoine <ba...@gmail.com>.

I'm just curious here.  I'm working on a Google Summer of Code project
currently that utilizes HBase and several times now I've made secondary
indices based on what I think are standard practices.  Is there any
principled reason that this process couldn't be automated or is it just that
no one has implemented it yet?

On Wed, Aug 10, 2011 at 7:57 PM, Ryan Rawson <ry...@gmail.com> wrote:

> MongoDB does an excellent job at single-node scalability - they use
> mmap and many smart things and really kick ass ... ON A SINGLE NODE.
>
> That single node must have RAID (RAID is going out of fashion, btw),
> and you won't be able to scale without resorting to:
> - replication (complex setup!)
> - sharding
>
> Mongo claims to help on the last item, but it is still a risk point.
>
> For really large data that must span multiple machines, there is no
> "clustered SQL" type solution that isn't (a) borked in various ways
> (Oracle RAC, I'm looking at you) or (b) stupid expensive (Oracle RAC,
> STILL looking at you).
>
> Tools like HBase give you scalability at the cost of features (no
> automated secondary indexing, no query language).
>
> Welcome... to... big... data.
>
> -ryan
>

Re: Mongo vs HBase

Posted by Fuad Efendi <fu...@efendi.ca>.

And I LOVE JavaScript-based single-process (and of course single thread) MapReduce Mongo-way :(


-----Original Message-----
From: Ryan Rawson <ry...@gmail.com>
Date: Thu, 11 Aug 2011 00:57:21 
To: <us...@hbase.apache.org>
Reply-To: user@hbase.apache.org
Subject: Re: Mongo vs HBase

MongoDB does an excellent job at single-node scalability - they use
mmap and many smart things and really kick ass ... ON A SINGLE NODE.

That single node must have RAID (RAID is going out of fashion, btw),
and you won't be able to scale without resorting to:
- replication (complex setup!)
- sharding

Mongo claims to help on the last item, but it is still a risk point.

For really large data that must span multiple machines, there is no
"clustered SQL" type solution that isn't (a) borked in various ways
(Oracle RAC, I'm looking at you) or (b) stupid expensive (Oracle RAC,
STILL looking at you).

Tools like HBase give you scalability at the cost of features (no
automated secondary indexing, no query language).

Welcome... to... big... data.

-ryan


Re: Mongo vs HBase

Posted by Ryan Rawson <ry...@gmail.com>.

MongoDB does an excellent job at single-node scalability - they use
mmap and many smart things and really kick ass ... ON A SINGLE NODE.

That single node must have RAID (RAID is going out of fashion, btw),
and you won't be able to scale without resorting to:
- replication (complex setup!)
- sharding

Mongo claims to help on the last item, but it is still a risk point.

For really large data that must span multiple machines, there is no
"clustered SQL" type solution that isn't (a) borked in various ways
(Oracle RAC, I'm looking at you) or (b) stupid expensive (Oracle RAC,
STILL looking at you).

Tools like HBase give you scalability at the cost of features (no
automated secondary indexing, no query language).

Welcome... to... big... data.

-ryan

On Thu, Aug 11, 2011 at 12:44 AM, Edward Capriolo <ed...@gmail.com> wrote:
> On Wed, Aug 10, 2011 at 4:26 PM, Li Pi <li...@cloudera.com> wrote:
>
>> You'll have to build your own secondary indexes for now.
>
> http://www.xtranormal.com/watch/6995033/mongo-db-is-web-scale
>

Re: Mongo vs HBase

Posted by Edward Capriolo <ed...@gmail.com>.

On Wed, Aug 10, 2011 at 4:26 PM, Li Pi <li...@cloudera.com> wrote:

> You'll have to build your own secondary indexes for now.

http://www.xtranormal.com/watch/6995033/mongo-db-is-web-scale

Re: Mongo vs HBase

Posted by Li Pi <li...@cloudera.com>.

You'll have to build your own secondary indexes for now.
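
As a rough idea of what "build your own" means, here is a minimal in-memory sketch, not HBase client code. The table layout and names (`IndexedTable`, `ip_long`, `location`) are assumptions made for illustration: a second "table" maps the indexed value back to row keys and must be updated on every write.

```python
from collections import defaultdict


class IndexedTable:
    """Data 'table' plus a hand-maintained index 'table' on ip_long."""

    def __init__(self):
        self.rows = {}                 # row key -> {"ip_long", "location"}
        self.by_ip = defaultdict(set)  # ip_long -> set of row keys

    def put(self, row_key, ip_long, location):
        old = self.rows.get(row_key)
        if old is not None:
            self._unindex(old["ip_long"], row_key)  # drop the stale entry
        self.rows[row_key] = {"ip_long": ip_long, "location": location}
        # Second write: in a real cluster this is a separate put to an
        # index table, and keeping both writes consistent is the hard part.
        self.by_ip[ip_long].add(row_key)

    def _unindex(self, ip_long, row_key):
        keys = self.by_ip.get(ip_long)
        if keys is not None:
            keys.discard(row_key)
            if not keys:
                del self.by_ip[ip_long]

    def find_by_ip(self, ip_long):
        """Index lookup: touches only the matching rows."""
        return [self.rows[k] for k in sorted(self.by_ip.get(ip_long, ()))]

    def find_by_ip_scan(self, ip_long):
        """No index: every row has to be examined."""
        return [r for _, r in sorted(self.rows.items())
                if r["ip_long"] == ip_long]
```

Both lookups return the same rows; the difference is only how many rows are read to find them.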

On Wed, Aug 10, 2011 at 1:15 PM, Laurent Hatier <la...@gmail.com>wrote:

> Yes, I have heard of this index, but is it available in HBase 0.90.3?

Re: Mongo vs HBase

Posted by Laurent Hatier <la...@gmail.com>.

Yes, I have heard of this index, but is it available in HBase 0.90.3?

2011/8/10 Chris Tarnas <cf...@email.com>

> Hi Laurent,
>
> Without more details on your schema and how you are finding that number in
> your table it is impossible to fully answer the question. I suspect what you
> are seeing is mongo's native support for secondary indexes. If you were to
> add secondary indexes in HBase then retrieving that row should be on the
> order of 3-30ms. If that is your main query method then you could reorganize
> your table to make that long number your row key, then you would get even
> faster reads.
>
> -chris


-- 
Laurent HATIER
Second-year student in the engineering program at EISTI

Re: Mongo vs HBase

Posted by Chris Tarnas <cf...@email.com>.

Hi Laurent,

Without more details on your schema and how you are finding that number in your table, it is impossible to fully answer the question. I suspect what you are seeing is Mongo's native support for secondary indexes. If you were to add secondary indexes in HBase, then retrieving that row should take on the order of 3-30 ms. If that is your main query method, then you could reorganize your table to make that long number your row key; you would then get even faster reads.
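
To make the row-key idea concrete, here is a small simulation in plain Python, not HBase client code. It assumes, which the thread does not state, that the geolocation data is stored as IP ranges; the range data and names are invented. Each row is keyed by the fixed-width big-endian bytes of the range end, so byte order matches numeric order, and a lookup becomes "seek to the first key >= the IP and read one row", the rough equivalent of a scan with a start row that stops after the first result.

```python
import bisect
import ipaddress


def ip_to_long(ip):
    return int(ipaddress.IPv4Address(ip))


def row_key(n):
    """4-byte big-endian key: lexicographic order equals numeric order."""
    return n.to_bytes(4, "big")


# Hypothetical ranges: (range start, range end, location label).
RANGES = [
    ("10.0.0.0", "10.255.255.255", "net-a"),
    ("172.16.0.0", "172.31.255.255", "net-b"),
    ("192.168.0.0", "192.168.255.255", "net-c"),
]

# The "table", sorted by row key (range END), as a sorted store keeps it.
TABLE = sorted((row_key(ip_to_long(end)), ip_to_long(start), name)
               for start, end, name in RANGES)
KEYS = [key for key, _, _ in TABLE]


def lookup(ip):
    """Seek to the first row whose key >= ip, then read a single row."""
    n = ip_to_long(ip)
    i = bisect.bisect_left(KEYS, row_key(n))
    if i == len(KEYS):
        return None
    _, start, name = TABLE[i]
    return name if start <= n else None  # ip may fall in a gap


def lookup_by_scan(ip):
    """Full scan: what happens when the searched value is not the key."""
    n = ip_to_long(ip)
    for key, start, name in TABLE:
        if start <= n <= int.from_bytes(key, "big"):
            return name
    return None
```

With the value in the row key, the lookup cost grows with the logarithm of the table size instead of linearly, which is the kind of gap between milliseconds and seconds described in the original question.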

-chris
