Posted to user@nutch.apache.org by Lewis John Mcgibbney <le...@gmail.com> on 2013/07/02 18:32:00 UTC
[ANNOUNCE] Apache Nutch v2.2.1 Released
Good Afternoon Everyone,
The Apache Nutch PMC are very pleased to announce the immediate release of
Apache Nutch v2.2.1. We advise all current users and developers of the 2.X
series to upgrade to this release ASAP.
Apache Nutch is an open source web-search software project. Stemming from
Apache Lucene <http://lucene.apache.org/java/>, it now builds on Apache
Solr <http://lucene.apache.org/solr/>, adding web-specifics such as a
crawler, a link-graph database and parsing support handled by Apache
Tika <http://tika.apache.org/> for HTML and an array of other document
formats.
Although this release includes library upgrades to Apache
Hadoop <http://hadoop.apache.org> 1.2.0 and Apache
Tika <http://tika.apache.org> 1.3, it is predominantly a bug fix for
NUTCH-1591 - Incorrect conversion of ByteBuffer to String
<https://issues.apache.org/jira/browse/NUTCH-1591>.
Please see the list of changes
<http://www.apache.org/dist/nutch/2.2.1/2.2.1-CHANGES.txt> for a full
breakdown. As usual in the 2.x series, this release is made available only
as source, but is also available within Maven Central
<http://search.maven.org/>.
The release is available here <http://www.apache.org/dyn/closer.cgi/nutch/>.
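For readers wondering how a ByteBuffer-to-String conversion can go wrong at all, here is a minimal sketch of one common pitfall. It is a general illustration only and is not taken from the NUTCH-1591 patch; the class and method names (ByteBufferToString, naive, decode) are made up for this example. Calling array() on a buffer returns the whole backing array and ignores the buffer's position and limit, whereas decoding a duplicate of the buffer reads only the bytes between them.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ByteBufferToString {

    // Pitfall: array() exposes the entire backing array, so bytes outside
    // the buffer's position..limit window end up in the String.
    static String naive(ByteBuffer buf) {
        return new String(buf.array(), StandardCharsets.UTF_8);
    }

    // Decodes only the bytes between position and limit. Working on a
    // duplicate leaves the caller's position and limit untouched.
    static String decode(ByteBuffer buf) {
        return StandardCharsets.UTF_8.decode(buf.duplicate()).toString();
    }

    public static void main(String[] args) {
        byte[] backing = "xxhelloyy".getBytes(StandardCharsets.UTF_8);
        // The buffer is a window over bytes 2..6 of the backing array.
        ByteBuffer buf = ByteBuffer.wrap(backing, 2, 5);

        System.out.println(naive(buf));  // xxhelloyy
        System.out.println(decode(buf)); // hello
    }
}
```

The charset is passed explicitly in both methods so that the result does not depend on the platform default.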
Have a great day
Best
lewismc
(on behalf of the Apache Nutch community)
--
*Lewis*
Re: [ANNOUNCE] Apache Nutch v2.2.1 Released
Posted by Julien Nioche <li...@gmail.com>.
Great stuff! Thanks Lewis
--
Open Source Solutions for Text Engineering
http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
Re: [ANNOUNCE] Apache Nutch v2.2.1 Released
Posted by glumet <ja...@gmail.com>.
Ok, my fault... if anybody is interested in the correct solution:
You must add the correct version of gora-hbase to your libraries:
gora-hbase-0.3.jar (I had used 0.2.1).
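A NoSuchMethodError like the one in the hadoop.log output usually points at mismatched jar versions on the classpath. The following is a rough diagnostic sketch, not part of Nutch or Gora: the class ClasspathCheck and its helper methods are hypothetical, and only the two Gora class names and the getSchema method name come from the stack trace in this thread. It uses reflection, so it compiles without Gora, and reports which jar each class was loaded from and whether a public no-argument getSchema() is visible.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

public class ClasspathCheck {

    // Loads a class by name without initializing it; returns null if absent.
    static Class<?> load(String className) {
        try {
            return Class.forName(className, false, ClasspathCheck.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    // Reports the jar or directory a class was loaded from.
    static String origin(String className) {
        Class<?> cls = load(className);
        if (cls == null) {
            return "not on classpath";
        }
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return src.getLocation().toString();
    }

    // True if the class exposes a public method with this name and no arguments
    // (declared or inherited).
    static boolean hasNoArgMethod(String className, String methodName) {
        Class<?> cls = load(className);
        if (cls == null) {
            return false;
        }
        try {
            Method m = cls.getMethod(methodName);
            return m != null;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class and method names taken from the stack trace in this thread.
        String persistent = "org.apache.gora.persistency.Persistent";
        String hbaseStore = "org.apache.gora.hbase.store.HBaseStore";

        System.out.println(persistent + " loaded from: " + origin(persistent));
        System.out.println(hbaseStore + " loaded from: " + origin(hbaseStore));
        System.out.println("Persistent.getSchema() visible: "
                + hasNoArgMethod(persistent, "getSchema"));
    }
}
```

Run it with the same classpath as the failing job. If the two classes come from jars with different version numbers in their file names, that is a strong hint the gora-hbase jar needs to be swapped.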
--
View this message in context: http://lucene.472066.n3.nabble.com/ANNOUNCE-Apache-Nutch-v2-2-1-Released-tp4074798p4075575.html
Sent from the Nutch - User mailing list archive at Nabble.com.
Re: [ANNOUNCE] Apache Nutch v2.2.1 Released
Posted by glumet <ja...@gmail.com>.
And this is the output from hadoop.log:
2013-07-04 16:12:05,069 WARN mapred.LocalJobRunner - job_local1522971864_0001
java.lang.Exception: java.lang.NoSuchMethodError: org.apache.gora.persistency.Persistent.getSchema()Lorg/apache/avro/Schema;
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
Caused by: java.lang.NoSuchMethodError: org.apache.gora.persistency.Persistent.getSchema()Lorg/apache/avro/Schema;
at org.apache.gora.hbase.store.HBaseStore.put(HBaseStore.java:177)
at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:638)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:191)
at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:88)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
--
View this message in context: http://lucene.472066.n3.nabble.com/ANNOUNCE-Apache-Nutch-v2-2-1-Released-tp4074798p4075502.html
Re: [ANNOUNCE] Apache Nutch v2.2.1 Released
Posted by glumet <ja...@gmail.com>.
Unfortunately, I have to report that the new version is not working for me.
It already fails while injecting:
InjectorJob: starting at 2013-07-04 15:15:01
InjectorJob: Injecting urlDir: /opt/ir/nutch2/urls
InjectorJob: Using class org.apache.gora.hbase.store.HBaseStore as the Gora storage class.
InjectorJob: java.lang.RuntimeException: job failed: name=[newwebpage]inject /opt/ir/nutch2/urls, jobid=job_local2141339932_0001
at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:54)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:233)
at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:251)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:273)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:282)
--
View this message in context: http://lucene.472066.n3.nabble.com/ANNOUNCE-Apache-Nutch-v2-2-1-Released-tp4074798p4075488.html
RE: [ANNOUNCE] Apache Nutch v2.2.1 Released
Posted by Markus Jelsma <ma...@openindex.io>.
Great news, thanks Lewis!