Posted to dev@gora.apache.org by "Poulard, Fabien" <fp...@dictanova.com> on 2012/12/18 20:23:20 UTC

Gora and MongoDB

Hi all,

I'm Fabien; I co-founded a company specializing in opinion mining on the Web.
We use Nutch 2.x for our crawling needs... and therefore Apache Gora as an
abstraction layer between our NoSQL datastore and Nutch results.

We've been using HBase so far, but we'd like to give 10gen's MongoDB a shot.
I've started working on a Gora datastore for MongoDB. I've searched in the
archives and in Jira but did not find anything related to MongoDB. Before
going any further, I'd like to check whether anyone else is working on such
a thing and whether I might get stuck on difficulties I did not
anticipate.

Any hint would help ;)

-- 
*Fabien Poulard*
Associé-Fondateur Dictanova
Tél. 02 51 12 59 68 / 06 65 58 94 77

*Dictanova*
2, rue de la Houssinière - BP 92208
44322 Nantes Cedex 03

Re: Gora and MongoDB

Posted by Alfonso Nishikawa <al...@gmail.com>.
Greetings,

Gora+MongoDB is happy news. Good to know about that feature.
Maybe someone should write a short document listing the classes involved
in adding a new datastore (maybe I will try someday).

As for compiling schemas, in my opinion someone should eventually build
something for Maven (a plugin :) . But for now, anything automated is welcome.

Cheers.

Alfonso Nishikawa
On 28/01/2013 07:37, "Poulard, Fabien" <fp...@dictanova.com> wrote:

Hi folks,

2012/12/18 Lewis John Mcgibbney <le...@gmail.com>

> If you would like to open an issue, please do. If you begin submitting
> patches, I'm sure that the community could and will test the code in an
> attempt to get a MongoDB module for Gora. This would be very much welcomed.
>

I've managed to develop a MongoDB module for Gora. It's far from perfect
but it's good enough as a proof of concept to run Nutch.

Within Dictanova we're using Gradle as our build tool, so I developed my
module with it instead of the Ant build used for Gora. It
was just easier for me. In the process I've adapted a Gradle plugin to
compile Gora schemas, based on a plugin for Avro schemas [1].

[1] https://github.com/iamsteveholmes/avro-gradle-plugin

I'd be glad if you would review my code, as I'm sure there's plenty of room for
improvement. So far the code is in our git repository. What would be the
best way to share it with you? I've created the issue GORA-200...
Moreover, I've implemented some unit tests, but without using the Driver stuff I
saw for the others. I'll have to take a closer look at how it works... I'd
appreciate any pointers.

Cordially,

--
*Fabien Poulard*
Associé-Fondateur Dictanova
Tél. 02 51 12 59 68 / 06 65 58 94 77

*Dictanova*
2, rue de la Houssinière - BP 92208
44322 Nantes Cedex 03

Re: Gora and MongoDB

Posted by "Poulard, Fabien" <fp...@dictanova.com>.
Hi folks,

2012/12/18 Lewis John Mcgibbney <le...@gmail.com>

> If you would like to open an issue, please do. If you begin submitting
> patches, I'm sure that the community could and will test the code in an
> attempt to get a MongoDB module for Gora. This would be very much welcomed.
>

I've managed to develop a MongoDB module for Gora. It's far from perfect
but it's good enough as a proof of concept to run Nutch.

Within Dictanova we're using Gradle as our build tool, so I developed my
module with it instead of the Ant build used for Gora. It
was just easier for me. In the process I've adapted a Gradle plugin to
compile Gora schemas, based on a plugin for Avro schemas [1].

[1] https://github.com/iamsteveholmes/avro-gradle-plugin

I'd be glad if you would review my code, as I'm sure there's plenty of room for
improvement. So far the code is in our git repository. What would be the
best way to share it with you? I've created the issue GORA-200...
Moreover, I've implemented some unit tests, but without using the Driver stuff I
saw for the others. I'll have to take a closer look at how it works... I'd
appreciate any pointers.

Cordially,

-- 
*Fabien Poulard*
Associé-Fondateur Dictanova
Tél. 02 51 12 59 68 / 06 65 58 94 77

*Dictanova*
2, rue de la Houssinière - BP 92208
44322 Nantes Cedex 03

Re: Gora and MongoDB

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Hi,

AFAIK, there is no work currently being undertaken to build a datastore for
10gen's MongoDB.

If you would like to open an issue, please do. If you begin submitting
patches, I'm sure that the community could and will test the code in an
attempt to get a MongoDB module for Gora. This would be very much welcomed.

Please keep us posted as to how you are getting on.

Best

Lewis

On Tue, Dec 18, 2012 at 7:23 PM, Poulard, Fabien <fp...@dictanova.com> wrote:

> Hi all,
>
> I'm Fabien; I co-founded a company specializing in opinion mining on the Web.
> We use Nutch 2.x for our crawling needs... and therefore Apache Gora as an
> abstraction layer between our NoSQL datastore and Nutch results.
>
> We've been using HBase so far, but we'd like to give 10gen's MongoDB a shot.
> I've started working on a Gora datastore for MongoDB. I've searched in the
> archives and in Jira but did not find anything related to MongoDB. Before
> going any further, I'd like to check whether anyone else is working on such
> a thing and whether I might get stuck on difficulties I did not
> anticipate.
>
> Any hint would help ;)
>
> --
> *Fabien Poulard*
> Associé-Fondateur Dictanova
> Tél. 02 51 12 59 68 / 06 65 58 94 77
>
> *Dictanova*
> 2, rue de la Houssinière - BP 92208
> 44322 Nantes Cedex 03
>
-- 
*Lewis*