Posted to user@nutch.apache.org by Chris Hairfield <ch...@latitudegeo.com> on 2013/05/17 21:55:31 UTC

Status of Elasticsearch indexer?

Hello everyone,

I've been eagerly awaiting some of the functionality slated for 2.x, especially around your work integrating with Elasticsearch. If possible, could you give any additional status on pluggable indexing (NUTCH-1568) and the nutch-elasticsearch-indexer (NUTCH-1527)?

It's been a wonderful experience diving into Nutch for the last month and watching you guys do pretty awesome work. Now that I can finally say I no longer feel completely overwhelmed, I'd like to throw in my support for these items. Further, if there is work that still needs to be done, I might like to try helping out myself :)

Thanks!
Chris

Re: Status of Elasticsearch indexer?

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Hi Chris,
Thanks for getting on the list and discussing these aspects of development
:0)
From my perspective there are a number of observations:

BRANCH 2.x
* NUTCH-1568 [0] is ripe for development. My sole justification for not
addressing this is that we wish to push Nutch 2.2 and it is safe to say
that there will not be enough testing to push the code and mark it as
stable!
* NUTCH-1486 [1] is ready for testing (I know this is not elastic search
but I thought I'd throw it in there)
* Ferdy committed NUTCH-1445 [2] which enables you to index 2.x data to
Elasticsearch, but it is not pluggable, so to speak. This will most likely
happen once we shift the 2.x architecture to pluggable in 2.3 development.

TRUNK

* Since Julien committed NUTCH-1047 [3], trunk is pluggable to the tune of
Solr 3.X. However, Sebastian submitted a patch for a CSV indexer [4], and I
wouldn't imagine it would be very hard to get the MongoDB patch ported to
the pluggable architecture either.
* Porting of Elasticsearch from 2.x to pluggable trunk will most likely
happen in the 1.8 development drive.

I think that wraps it up from me. Most likely there is something I've
missed out though!
It would be really great if you were able to chip in on any of the above...
we are always in need of porting stuff... and actually most critically
reviewing the mountain of patches we have in Jira :0)
hth
Lewis

[0] https://issues.apache.org/jira/browse/NUTCH-1568
[1] https://issues.apache.org/jira/browse/NUTCH-1486
[2] https://issues.apache.org/jira/browse/NUTCH-1445
[3] https://issues.apache.org/jira/browse/NUTCH-1047
[4] https://issues.apache.org/jira/browse/NUTCH-1541






-- 
*Lewis*