Posted to user@nutch.apache.org by RONNY <ro...@mputa.com> on 2008/10/22 01:50:22 UTC
Re: Is Nutch Still Active?
Nutch is too young a project to die. The developers are finalizing version 1.0.
Ronny
John Martyniak wrote:
> Hi,
>
> I have been playing around with Nutch for a little while, and I see a
> ton of emails on the mailing lists, but there hasn't been a formal
> build in more than a year.
>
> Are there any plans? Is this project still being worked on?
>
> Any thoughts would be greatly appreciated.
>
> -John
>
>
Re: Is Nutch Still Active?
Posted by Dennis Kubes <ku...@apache.org>.
John Martyniak wrote:
> Dennis,
>
> Thanks for the information.
>
> Can you tell me what the benefit of integrating with SOLR would be? It
> seems to me that the only gap between the two is that Nutch has a
> Spider, and SOLR has incremental index, query warming, etc.
It is really about the size of the data and the type of usage. Nutch is
specifically for web search, while Solr is IMO better for enterprise and
restricted-domain search. Nutch uses MapReduce throughout; Solr
doesn't (although indexes can be created in MR and served by Solr).
Nutch has a crawler; Solr doesn't. Nutch has a distributed search
server; Solr is working towards the same type of distributed search
model. I think the biggest difference in terms of ideology is that web
search is batch oriented (do a crawl, then process, analyze, and index it),
while enterprise search is closer to real-time updates and dynamic changes.
So there are significant differences even though they can work together.
The current integration work is to allow indexes created by Nutch to
be served by Solr. If your domain is creating a full-text search from a
database, or something like radius or location search, I would use Solr.
If you want to create a large WWW or vertical search engine, I would
use Nutch. If you have a large amount of data to crawl and/or process
and still want to integrate with a database, I would use Nutch / Hadoop
to acquire and process the data and Solr to serve it.
>
> And the approximate timing of the next release?
Well, we were going to release 1.0 when Hadoop released 1.0. They were
planning on doing that after version 0.17. But they have continued
along the path to version 0.20, so I don't exactly know when a 1.0
release for Hadoop would be. My guess, although there are no hard and
firm plans, is within the next 1-2 months. Many patches are complete now
and need to be integrated, then left to sit for a month or so to work
out any bugs.
Dennis
>
> -John
>
> On Oct 22, 2008, at 1:29 PM, Dennis Kubes wrote:
>
>> We have been working on major feature upgrades for version 1. That
>> took some time. It includes things like a new scoring framework, a
>> new indexing framework, serving search results in XML and JSON,
>> integration with SOLR and HBase, among others. Not dead, just busy.
>>
>> Dennis
>>
>> John Martyniak wrote:
>>> Ronny,
>>> Thanks for the info.
>>> Do you know what the approximate timing for that is (days, weeks,
>>> months)? And also the feature set?
>>> -John
>>> On Oct 21, 2008, at 7:50 PM, RONNY wrote:
>>>> Nutch is too young a project to die. The developers are finalizing version 1.0.
>>>> Ronny
>>>>
>>>>
>>>> John Martyniak wrote:
>>>>> Hi,
>>>>>
>>>>> I have been playing around with Nutch for a little while, and I see
>>>>> a ton of emails on the mailing lists, but there hasn't been a
>>>>> formal build in more than a year.
>>>>>
>>>>> Are there any plans? Is this project still being worked on?
>>>>>
>>>>> Any thoughts would be greatly appreciated.
>>>>>
>>>>> -John
>>>>>
>>>>>
>>>>
>
Re: Is Nutch Still Active?
Posted by John Martyniak <jo...@beforedawn.com>.
Dennis,
Thanks for the information.
Can you tell me what the benefit of integrating with SOLR would be?
It seems to me that the only gap between the two is that Nutch has a
Spider, and SOLR has incremental index, query warming, etc.
And the approximate timing of the next release?
-John
On Oct 22, 2008, at 1:29 PM, Dennis Kubes wrote:
> We have been working on major feature upgrades for version 1. That
> took some time. It includes things like a new scoring framework, a
> new indexing framework, serving search results in XML and JSON,
> integration with SOLR and HBase, among others. Not dead, just busy.
>
> Dennis
>
> John Martyniak wrote:
>> Ronny,
>> Thanks for the info.
>> Do you know what the approximate timing for that is (days, weeks,
>> months)? And also the feature set?
>> -John
>> On Oct 21, 2008, at 7:50 PM, RONNY wrote:
>>> Nutch is too young a project to die. The developers are finalizing
>>> version 1.0.
>>> Ronny
>>>
>>>
>>> John Martyniak wrote:
>>>> Hi,
>>>>
>>>> I have been playing around with Nutch for a little while, and I
>>>> see a ton of emails on the mailing lists, but there hasn't been a
>>>> formal build in more than a year.
>>>>
>>>> Are there any plans? Is this project still being worked on?
>>>>
>>>> Any thoughts would be greatly appreciated.
>>>>
>>>> -John
>>>>
>>>>
>>>
Re: Is Nutch Still Active?
Posted by Dennis Kubes <ku...@apache.org>.
We have been working on major feature upgrades for version 1. That took
some time. It includes things like a new scoring framework, a new
indexing framework, serving search results in XML and JSON, integration
with SOLR and HBase, among others. Not dead, just busy.
Dennis
John Martyniak wrote:
> Ronny,
>
> Thanks for the info.
>
> Do you know what the approximate timing for that is (days, weeks,
> months)? And also the feature set?
>
> -John
>
> On Oct 21, 2008, at 7:50 PM, RONNY wrote:
>
>> Nutch is too young a project to die. The developers are finalizing version 1.0.
>> Ronny
>>
>>
>> John Martyniak wrote:
>>> Hi,
>>>
>>> I have been playing around with Nutch for a little while, and I see a
>>> ton of emails on the mailing lists, but there hasn't been a formal
>>> build in more than a year.
>>>
>>> Are there any plans? Is this project still being worked on?
>>>
>>> Any thoughts would be greatly appreciated.
>>>
>>> -John
>>>
>>>
>>
>
Re: Is Nutch Still Active?
Posted by John Martyniak <jo...@beforedawn.com>.
Ronny,
Thanks for the info.
Do you know what the approximate timing for that is (days, weeks,
months)? And also the feature set?
-John
On Oct 21, 2008, at 7:50 PM, RONNY wrote:
> Nutch is too young a project to die. The developers are finalizing version 1.0.
> Ronny
>
>
> John Martyniak wrote:
>> Hi,
>>
>> I have been playing around with Nutch for a little while, and I see
>> a ton of emails on the mailing lists, but there hasn't been a
>> formal build in more than a year.
>>
>> Are there any plans? Is this project still being worked on?
>>
>> Any thoughts would be greatly appreciated.
>>
>> -John
>>
>>
>