Posted to announce@apache.org by lewis john mcgibbney <le...@apache.org> on 2012/06/07 18:52:32 UTC

[ANNOUNCE] Apache Nutch 1.5 Released

(apologies for cross posting...)

Good Afternoon Everyone,

The 1.5 release of Nutch is now available. It upgrades several major
components, including Tika 1.1 and Hadoop 1.0.0, improves the LinkRank
and WebGraph elements, and adds a number of new plugins covering
blacklisting, filtering and parsing, to name a few. For a full breakdown
of the 50-odd improvements in this version, please see the list of
changes:

http://www.apache.org/dist/nutch/CHANGES-1.5.txt

The full PMC release statement can be found here:

http://nutch.apache.org/#07+June+2012+-+Apache+Nutch+1.5+Released

Apache Nutch is an open source web-search software project. Stemming
from Apache Lucene, it now builds on Apache Solr, adding web-specifics
such as a crawler, a link-graph database and parsing support handled
by Apache Tika for HTML and an array of other document formats. Nutch
can run on a single machine, but gains a lot of its strength from
running in a Hadoop cluster. The system can be enhanced (e.g. other
document formats can be parsed) using a highly flexible, easily
extensible and thoroughly maintained plugin infrastructure.

Nutch is available in source and binary form (zip and tar.gz) from the following
download page: http://www.apache.org/dyn/closer.cgi/nutch/

In the initial 48 hours, the release may not be available on all mirrors.
When downloading from a mirror site, please remember to verify the downloads
using signatures found on the Apache site:

http://www.apache.org/dist/nutch/KEYS
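
As a rough illustration of the verification step, here is a minimal Python
sketch that imports the KEYS file and checks a detached signature by calling
the gpg command-line tool. It assumes gpg is installed and that the KEYS file,
the archive and its .asc signature have already been downloaded. The file
names in the usage comment are hypothetical and the helper functions are
illustrative only; they are not part of Nutch.

```python
import shutil
import subprocess


def gpg_verify_commands(keys_file, signature_file, archive_file):
    """Build the two gpg invocations: import the keys, then verify."""
    return [
        # Add the release managers' public keys to the local keyring.
        ["gpg", "--import", str(keys_file)],
        # Check the detached signature against the downloaded archive.
        ["gpg", "--verify", str(signature_file), str(archive_file)],
    ]


def verify_download(keys_file, signature_file, archive_file):
    """Return True if both gpg commands exit with status 0."""
    if shutil.which("gpg") is None:
        raise RuntimeError("gpg was not found on PATH")
    for command in gpg_verify_commands(keys_file, signature_file, archive_file):
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            return False
    return True


# Example call (hypothetical file names, adjust to what you downloaded):
# verify_download("KEYS",
#                 "apache-nutch-1.5-bin.tar.gz.asc",
#                 "apache-nutch-1.5-bin.tar.gz")
```

Note that gpg may still warn that the signing key is not trusted; a zero exit
status only means the signature matches one of the imported keys.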

For more information on Apache Nutch, visit the project home page:
http://nutch.apache.org

Thank you very much

Lewis John McGibbney (on behalf of the Apache Nutch community)

RE: [ANNOUNCE] Apache Nutch 1.5 Released

Posted by Markus Jelsma <ma...@openindex.io>.
Great work Lewis, Chris, committers and contributors!
Thanks all!

 
 
-----Original message-----
> From:lewis john mcgibbney <le...@apache.org>
> Sent: Thu 07-Jun-2012 19:01
> To: announce@apache.org; dev@nutch.apache.org; user@nutch.apache.org
> Subject: [ANNOUNCE] Apache Nutch 1.5 Released

Re: [ANNOUNCE] Apache Nutch 1.5 Released

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Agreed, thanks Lewis!

Cheers,
Chris

On Jun 8, 2012, at 1:22 AM, Julien Nioche wrote:

> Thanks Lewis!

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Re: [ANNOUNCE] Apache Nutch 1.5 Released

Posted by Julien Nioche <li...@gmail.com>.
Thanks Lewis!

-- 
*
*Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble