Posted to user@couchdb.apache.org by Craig Minihan <cr...@ripcordsoftware.com> on 2015/11/12 13:26:28 UTC

MapReduce in CouchDB

All, I've just released the first public version of AvanceDB - an open source high performance MapReduce engine designed to work with CouchDB.

I've used CouchDB in various projects for a few years now and have always been a fan. The only thing I wanted to improve was M/R performance.

I've developed AvanceDB as an open source project on github over the last few months. All documents are stored in RAM and processed by an embedded SpiderMonkey instance and it is pretty fast.

Hopefully this project is of interest to the CouchDB community, you can find a demo here: https://www.youtube.com/watch?v=szpYFrm0Udc and some architecture info here: https://www.youtube.com/watch?v=Au5a9aoX6Lg.

The repo is here: https://github.com/RipcordSoftware/AvanceDB

Cheers,
Craig Minihan

RE: MapReduce in CouchDB

Posted by Craig Minihan <cr...@ripcordsoftware.com>.
Joan and Martin, I don't think AvanceDB would sit well inside CouchDB - I think the technology stacks and product goals are just too different. For example Avance is a resource hog (by design) while Couch is very light (admirably so).

However the core JS piece that Avance uses is MIT licensed and most likely could be integrated in-process into an Erlang VM. I'm not an Erlang guy but AFAIK this would be feasible.

I split libjsapi out into its own project specifically so the FOSS community can get value from it in other projects.

Check out: https://github.com/RipcordSoftware/libjsapi

There are a bunch of examples in the README and in the wiki, all in C++ but they are so simple most folks could pick them up quickly. I made a Mandelbrot example running under GTK+ to show how the SpiderMonkey GC and the GTK memory models could co-exist. This type of approach would be needed to get BEAM and SpiderMonkey to interact harmoniously.

In the most trivial implementation you could pass it JSON and receive JSON back, much like the current JS 1.8.5 query server in Couch. Since this could be done in-process, it should improve basic MR performance significantly.
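
As a rough illustration of the JSON-in/JSON-out idea (this is neither libjsapi's API nor CouchDB's actual query server), here is a minimal Python sketch. The command names `reset`, `add_fun` and `map_doc` follow CouchDB's query server protocol; the `QueryServer` class, the `KNOWN_FUNS` registry and the `by_type` function are hypothetical, and map functions are looked up by name here rather than compiled from JavaScript source.

```python
import json


def by_type(doc):
    # Hypothetical map function: emit one [key, value] row per document
    # that has a "type" field.
    if "type" in doc:
        yield [doc["type"], 1]


# Assumption for this sketch: map functions are registered Python callables.
# A real query server would compile the JavaScript source it is sent.
KNOWN_FUNS = {"by_type": by_type}


class QueryServer:
    """Toy JSON-in/JSON-out map server (illustrative only)."""

    def __init__(self):
        self.funs = []

    def handle(self, line):
        # Each request is one JSON array: [command, arg1, ...].
        command, *args = json.loads(line)
        if command == "reset":
            self.funs = []
            return json.dumps(True)
        if command == "add_fun":
            self.funs.append(KNOWN_FUNS[args[0]])
            return json.dumps(True)
        if command == "map_doc":
            doc = args[0]
            # One list of emitted rows per registered map function.
            return json.dumps([list(fun(doc)) for fun in self.funs])
        raise ValueError("unknown command: %r" % command)


if __name__ == "__main__":
    server = QueryServer()
    for request in ('["reset"]',
                    '["add_fun", "by_type"]',
                    '["map_doc", {"_id": "a", "type": "post"}]'):
        print(server.handle(request))
```

Doing the same exchange in-process would keep the JSON contract but drop the pipe and the process switch between the database and the JS engine.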

I'd gladly help out in any R&D effort on behalf of Couch to get this thing rolling.

Cheers,
Craig

-----Original Message-----
From: martin.broerse@gmail.com [mailto:martin.broerse@gmail.com] On Behalf Of Martin Broerse
Sent: 12 November 2015 21:28
To: user@couchdb.apache.org; Joan Touzet <wo...@apache.org>
Subject: Re: MapReduce in CouchDB

Hi Craig,

I agree with Joan and would not use it in a commercial project under a GPL licence. Next to Apache I also like MIT: https://opensource.org/licenses/MIT

- Martin

On Thu, Nov 12, 2015 at 8:38 PM, Joan Touzet <wo...@apache.org> wrote:

> Hi Craig,
>
> Neat work, however:
>
> ----- Original Message -----
> > From: "Alexander Shorin" <kx...@gmail.com>
> > IIRC we already have a sort of pluggable index engine. If AvanceDB is
> > able to provide its own as a library under a non-GPL license,
>
> This is critical. Would you consider relicensing AvanceDB under the 
> Apache license? As it stands, with you licensing it as GPL, there is 
> no chance of this project/code ever being included with CouchDB 
> proper. Licensing will even make it impossible for us to ship a shim 
> library against which you could link as a NIF (since you didn't pick LGPL either).
>
> Please reconsider your licensing terms.
>
> Thanks,
> Joan Touzet
>

Re: MapReduce in CouchDB

Posted by Martin Broerse <in...@martinbroerse.com>.
Hi Craig,

I agree with Joan and would not use it in a commercial project under a GPL
licence. Next to Apache I also like MIT: https://opensource.org/licenses/MIT

- Martin

On Thu, Nov 12, 2015 at 8:38 PM, Joan Touzet <wo...@apache.org> wrote:

> Hi Craig,
>
> Neat work, however:
>
> ----- Original Message -----
> > From: "Alexander Shorin" <kx...@gmail.com>
> > IIRC we already have a sort of pluggable index engine. If AvanceDB is
> > able to provide its own as a library under a non-GPL license,
>
> This is critical. Would you consider relicensing AvanceDB under the Apache
> license? As it stands, with you licensing it as GPL, there is no chance of
> this project/code ever being included with CouchDB proper. Licensing will
> even make it impossible for us to ship a shim library against which you
> could link as a NIF (since you didn't pick LGPL either).
>
> Please reconsider your licensing terms.
>
> Thanks,
> Joan Touzet
>

Re: MapReduce in CouchDB

Posted by Joan Touzet <wo...@apache.org>.
Hi Craig,

Neat work, however:

----- Original Message -----
> From: "Alexander Shorin" <kx...@gmail.com>
> IIRC we already have a sort of pluggable index engine. If AvanceDB is
> able to provide its own as a library under a non-GPL license,

This is critical. Would you consider relicensing AvanceDB under the Apache
license? As it stands, with you licensing it as GPL, there is no chance of
this project/code ever being included with CouchDB proper. Licensing will
even make it impossible for us to ship a shim library against which you
could link as a NIF (since you didn't pick LGPL either).

Please reconsider your licensing terms.

Thanks,
Joan Touzet

Re: MapReduce in CouchDB

Posted by Alexander Shorin <kx...@gmail.com>.
On Thu, Nov 12, 2015 at 7:40 PM, Andy Wenk <an...@apache.org> wrote:
> I was really amazed how fast the MapReduce implementation is. Cool thing! I
> think you should get in touch with some core developers and tell them more
> ... (Alex Shorin, Bob Newson, Jan Lehnardt and others ... )

IIRC we already have a sort of pluggable index engine. If AvanceDB is
able to provide its own as a library under a non-GPL license, it's possible
to write a NIF around it and try to integrate it with CouchDB. Good idea
for a plugin, btw (;

--
,,,^..^,,,

Re: MapReduce in CouchDB

Posted by Andy Wenk <an...@apache.org>.
Hey Craig,

I was really amazed how fast the MapReduce implementation is. Cool thing! I
think you should get in touch with some core developers and tell them more
... (Alex Shorin, Bob Newson, Jan Lehnardt and others ... )

All the best

Andy

On 12 November 2015 at 14:52, Garren Smith <ga...@apache.org> wrote:

> Nice work Craig. This is very cool.
>
>
> > On 12 Nov 2015, at 3:48 PM, Alexander Shorin <kx...@gmail.com> wrote:
> >
> > Very nice! Congrats on the release!
> > --
> > ,,,^..^,,,
> >
> >
> > On Thu, Nov 12, 2015 at 3:26 PM, Craig Minihan
> > <cr...@ripcordsoftware.com> wrote:
> >> All, I've just released the first public version of AvanceDB - an open source high performance MapReduce engine designed to work with CouchDB.
> >>
> >> I've used CouchDB in various projects for a few years now and have always been a fan. The only thing I wanted to improve was M/R performance.
> >>
> >> I've developed AvanceDB as an open source project on github over the last few months. All documents are stored in RAM and processed by an embedded SpiderMonkey instance and it is pretty fast.
> >>
> >> Hopefully this project is of interest to the CouchDB community, you can find a demo here: https://www.youtube.com/watch?v=szpYFrm0Udc and some architecture info here: https://www.youtube.com/watch?v=Au5a9aoX6Lg.
> >>
> >> The repo is here: https://github.com/RipcordSoftware/AvanceDB
> >>
> >> Cheers,
> >> Craig Minihan
>
> --
> Andy Wenk
> Hamburg - Germany
> RockIt!
>
> GPG fingerprint: C044 8322 9E12 1483 4FEC 9452 B65D 6BE3 9ED3 9588
>
> https://people.apache.org/keys/committer/andywenk.asc
>
>

Re: MapReduce in CouchDB

Posted by Alexander Shorin <kx...@gmail.com>.
On Thu, Nov 12, 2015 at 5:57 PM, Roald de Vries <we...@gmail.com> wrote:
> Cool! For both CouchDB on a ramdisk and AvanceDB:
> - would you recommend using this for caches, for example?

Partially yes (not sure about AvanceDB), as long as you can guarantee that your
cache will fit on the RAM disk (remember, a CouchDB database grows
constantly, and you need room for about twice its size to compact it!)

> - could CouchDB be a competitor to Redis this way?

No (not sure about AvanceDB), but it depends on your use case.


To be honest, CouchDB is too greedy for disk space to keep it in
memory for something serious, unless you have small databases and/or a
lot of memory. So it's only useful for development purposes.
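
To put a number on the compaction point, here is a back-of-the-envelope check in Python. It is only a sketch: the `fits_on_ramdisk` helper and its `headroom` factor of 2 are assumptions taken from the rule of thumb above, not a measured CouchDB figure.

```python
GIB = 2 ** 30


def fits_on_ramdisk(db_bytes, ramdisk_bytes, headroom=2.0):
    """Return True if the database can still be compacted on the RAM disk.

    Assumes compaction temporarily needs about `headroom` times the
    current database size (old file plus the newly written one).
    """
    return db_bytes * headroom <= ramdisk_bytes


if __name__ == "__main__":
    # A 3 GiB database on an 8 GiB RAM disk leaves room to compact...
    print(fits_on_ramdisk(3 * GIB, 8 * GIB))
    # ...but a 5 GiB database on the same RAM disk does not.
    print(fits_on_ramdisk(5 * GIB, 8 * GIB))
```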

--
,,,^..^,,,

Re: MapReduce in CouchDB

Posted by Roald de Vries <we...@gmail.com>.
> On 12 Nov 2015, at 15:48, Alexander Shorin <kx...@gmail.com> wrote:
> 
> On Thu, Nov 12, 2015 at 5:46 PM, Giovanni Lenzi <g....@smileupps.com> wrote:
>> Just wondering: has anyone tested CouchDB running on a ramdisk, instead of
>> normal disks?
> 
> I did. It works, but I wouldn't use such a setup for production or for
> data I care about (:
> For testing and for temporary things it's pretty useful, however.

Cool! For both CouchDB on a ramdisk and AvanceDB:
- would you recommend using this for caches, for example?
- could CouchDB be a competitor to Redis this way?

Cheers, Roald

Re: MapReduce in CouchDB

Posted by Alexander Shorin <kx...@gmail.com>.
On Thu, Nov 12, 2015 at 5:46 PM, Giovanni Lenzi <g....@smileupps.com> wrote:
> Just wondering: has anyone tested CouchDB running on a ramdisk, instead of
> normal disks?

I did. It works, but I wouldn't use such a setup for production or for
data I care about (:
For testing and for temporary things it's pretty useful, however.

--
,,,^..^,,,

Re: MapReduce in CouchDB

Posted by Giovanni Lenzi <g....@smileupps.com>.
Wonderful, that could serve many different purposes!!

Just wondering: has anyone tested CouchDB running on a ramdisk, instead of
normal disks?

--Giovanni

2015-11-12 14:52 GMT+01:00 Garren Smith <ga...@apache.org>:

> Nice work Craig. This is very cool.
>
>
> > On 12 Nov 2015, at 3:48 PM, Alexander Shorin <kx...@gmail.com> wrote:
> >
> > Very nice! Congrats on the release!
> > --
> > ,,,^..^,,,
> >
> >
> > On Thu, Nov 12, 2015 at 3:26 PM, Craig Minihan
> > <cr...@ripcordsoftware.com> wrote:
> >> All, I've just released the first public version of AvanceDB - an open source high performance MapReduce engine designed to work with CouchDB.
> >>
> >> I've used CouchDB in various projects for a few years now and have always been a fan. The only thing I wanted to improve was M/R performance.
> >>
> >> I've developed AvanceDB as an open source project on github over the last few months. All documents are stored in RAM and processed by an embedded SpiderMonkey instance and it is pretty fast.
> >>
> >> Hopefully this project is of interest to the CouchDB community, you can find a demo here: https://www.youtube.com/watch?v=szpYFrm0Udc and some architecture info here: https://www.youtube.com/watch?v=Au5a9aoX6Lg.
> >>
> >> The repo is here: https://github.com/RipcordSoftware/AvanceDB
> >>
> >> Cheers,
> >> Craig Minihan
>
>

Re: MapReduce in CouchDB

Posted by Garren Smith <ga...@apache.org>.
Nice work Craig. This is very cool.


> On 12 Nov 2015, at 3:48 PM, Alexander Shorin <kx...@gmail.com> wrote:
> 
> Very nice! Congrats on the release!
> --
> ,,,^..^,,,
> 
> 
> On Thu, Nov 12, 2015 at 3:26 PM, Craig Minihan
> <cr...@ripcordsoftware.com> wrote:
>> All, I've just released the first public version of AvanceDB - an open source high performance MapReduce engine designed to work with CouchDB.
>> 
>> I've used CouchDB in various projects for a few years now and have always been a fan. The only thing I wanted to improve was M/R performance.
>> 
>> I've developed AvanceDB as an open source project on github over the last few months. All documents are stored in RAM and processed by an embedded SpiderMonkey instance and it is pretty fast.
>> 
>> Hopefully this project is of interest to the CouchDB community, you can find a demo here: https://www.youtube.com/watch?v=szpYFrm0Udc and some architecture info here: https://www.youtube.com/watch?v=Au5a9aoX6Lg.
>> 
>> The repo is here: https://github.com/RipcordSoftware/AvanceDB
>> 
>> Cheers,
>> Craig Minihan


Re: MapReduce in CouchDB

Posted by Alexander Shorin <kx...@gmail.com>.
Very nice! Congrats on the release!
--
,,,^..^,,,


On Thu, Nov 12, 2015 at 3:26 PM, Craig Minihan
<cr...@ripcordsoftware.com> wrote:
> All, I've just released the first public version of AvanceDB - an open source high performance MapReduce engine designed to work with CouchDB.
>
> I've used CouchDB in various projects for a few years now and have always been a fan. The only thing I wanted to improve was M/R performance.
>
> I've developed AvanceDB as an open source project on github over the last few months. All documents are stored in RAM and processed by an embedded SpiderMonkey instance and it is pretty fast.
>
> Hopefully this project is of interest to the CouchDB community, you can find a demo here: https://www.youtube.com/watch?v=szpYFrm0Udc and some architecture info here: https://www.youtube.com/watch?v=Au5a9aoX6Lg.
>
> The repo is here: https://github.com/RipcordSoftware/AvanceDB
>
> Cheers,
> Craig Minihan

Re: MapReduce in CouchDB

Posted by Alexander Gabriel <al...@gmail.com>.
WOW. Dynamic map reduce. I'm blown away