Posted to user@nutch.apache.org by Sebastian Nagel <wa...@googlemail.com.INVALID> on 2018/08/10 11:35:32 UTC
[ANNOUNCE] Apache Nutch 1.15 Release
The Apache Nutch [0] Project Management Committee is pleased to announce
the immediate release of Apache Nutch v1.15. We advise all current users
and developers of the 1.X series to upgrade to this release.
Nutch is a mature, production-ready web crawler. Nutch 1.x enables
fine-grained configuration and relies on Apache Hadoop™ [1] data structures,
which are well suited to batch processing.
As usual in the 1.X series, release artifacts are available as both
source and binary packages, and also in Maven Central [2] as a Maven
dependency. The release is available from our downloads page [3].
This release includes more than 100 bug fixes and improvements. The full
list of changes can be seen in the release report [4]; the most notable
ones are:
NUTCH-1480 Multiple index writer instances with different configurations
  It's now possible to index documents into multiple Solr or Elasticsearch
  instances.
  Please note that this feature changed the way indexers are configured;
  see https://wiki.apache.org/nutch/IndexWriters for more information.
NUTCH-2412 Exchange component for indexing job
  Configurable routing of documents to indexes.
NUTCH-2375 Use the new MapReduce API
NUTCH-2583 Overall upgrade of library dependencies
  which also makes Nutch run and compile on Java 9 and 10.
NUTCH-2549 Multiple fixes and improvements to the protocol-http plugin
NUTCH-2576 A new HTTP protocol implementation based on the okhttp library
  Supports HTTP/2 if used with Java 9 or higher.
NUTCH-1129 A new plugin to extract linked data based on the Any23 project
Thanks to all Nutch contributors who made this release possible,
Sebastian (on behalf of the Nutch PMC)
[0] http://nutch.apache.org/
[1] http://hadoop.apache.org/
[2] http://search.maven.org/#search%7Cgav%7C1%7Cg%3A%22org.apache.nutch%22%20AND%20a%3A%22nutch%22
[3] http://nutch.apache.org/downloads.html
[4] https://s.apache.org/nczS