Posted to user@nutch.apache.org by Lewis John Mcgibbney <le...@gmail.com> on 2012/09/21 17:07:20 UTC

[VOTE] Apache Nutch 2.1 Release Candidate Available

Hi Everyone,

A candidate for Apache Nutch 2.1 is available at:

http://people.apache.org/~lewismc/apache-nutch-2.1

The release candidate consists ONLY of src.zip and src.tar.gz
archives of the sources in:

http://svn.apache.org/repos/asf/nutch/tags/release-2.1/

We release Nutch 2.1 in this fashion due to the inclusion of
Apache Gora and the likelihood that users will regularly recompile
the code to suit dynamic requirements.

Further, a staged Maven repository of the 2.1 jar, sources.jar and
javadoc.jar is available here:

https://repository.apache.org/content/repositories/orgapachenutch-020/

Please vote on releasing this package as Apache Nutch 2.1.
The vote is open for the next 72 hours and passes if a majority of at
least three +1 Nutch PMC votes are cast.

 [ ] +1 Release this package as Apache Nutch 2.1
 [ ] -1 Do not release this package because...
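
As a rough illustration of the passing rule stated above, here is a minimal
Python sketch. It assumes the rule means "at least three binding +1 votes, and
more binding +1 than -1 votes"; the function name and its parameters are
hypothetical and are not part of any Apache tooling.

```python
def vote_passes(binding_plus_ones, binding_minus_ones, minimum_plus_ones=3):
    """Return True if a release vote passes under the assumed rule.

    Assumed rule: at least `minimum_plus_ones` binding +1 votes were cast,
    and binding +1 votes outnumber binding -1 votes.
    """
    has_quorum = binding_plus_ones >= minimum_plus_ones
    has_majority = binding_plus_ones > binding_minus_ones
    return has_quorum and has_majority
```

Only PMC votes are counted as binding here; other +1s, while welcome, would
not be passed to this function.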

Many thanks, and here's to plenty more.

Kind Regards,
Lewis

P.S. Here's my +1.

-- 
Lewis

Re: [VOTE] Apache Nutch 2.1 Release Candidate Available

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
+1 from me:

SIGS check out:

[chipotle:~/tmp/apache-nutch-2.1] mattmann% $HOME/bin/verify_gpg_sigs 
Verifying Signature for file apache-nutch-2.1-src.tar.gz.asc
gpg: Signature made Fri Sep 21 15:59:21 2012 BST using RSA key ID C601BCA7
gpg: Good signature from "Lewis John McGibbney (CODE SIGNING KEY) <le...@apache.org>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: 2A23 D53F 8D27 5CB6 91E1  89C1 F45E 7970 C601 BCA7
Verifying Signature for file apache-nutch-2.1-src.zip.asc
gpg: Signature made Fri Sep 21 15:59:42 2012 BST using RSA key ID C601BCA7
gpg: Good signature from "Lewis John McGibbney (CODE SIGNING KEY) <le...@apache.org>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: 2A23 D53F 8D27 5CB6 91E1  89C1 F45E 7970 C601 BCA7
[chipotle:~/tmp/apache-nutch-2.1] mattmann% 

MD5s check out:

[chipotle:~/tmp/apache-nutch-2.1] mattmann% $HOME/bin/verify_md5_checksums 
md5sum: stat '*.bz2': No such file or directory
apache-nutch-2.1-src.tar.gz: OK
apache-nutch-2.1-src.zip: OK
[chipotle:~/tmp/apache-nutch-2.1] mattmann% 
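
For anyone without a local helper script like the one used above, the MD5
check can be sketched in Python along these lines. This is a minimal sketch,
not the contents of verify_md5_checksums (which are not shown in this thread).
It assumes the checksum file holds the hex digest as its first
whitespace-separated token; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_hex(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(archive, checksum_file):
    """Compare an archive's MD5 with the digest stored in a checksum file.

    Assumption: the first whitespace-separated token in the checksum file
    is the expected hex digest. An empty checksum file never matches.
    """
    tokens = Path(checksum_file).read_text().split()
    if not tokens:
        return False
    return md5_hex(archive) == tokens[0].lower()
```

Calling checksum_matches("apache-nutch-2.1-src.tar.gz",
"apache-nutch-2.1-src.tar.gz.md5") would be one way to use it, assuming the
checksum file carries that name. An MD5 match only guards against corrupted
downloads; the signature check above is what ties the archive to the release
manager's key.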

I built the code using ant runtime and it checked out fine:

...snip

runtime:
    [mkdir] Created dir: /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime
    [mkdir] Created dir: /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/local
    [mkdir] Created dir: /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/deploy
     [copy] Copying 1 file to /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/deploy
     [copy] Copying 1 file to /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/deploy/bin
     [copy] Copying 1 file to /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/local/lib
     [copy] Copying 1 file to /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/local/lib/native
     [copy] Copying 26 files to /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/local/conf
     [copy] Copying 1 file to /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/local/bin
     [copy] Copying 89 files to /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/local/lib
     [copy] Copying 97 files to /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/local/plugins
     [copy] Copied 2 empty directories to 2 empty directories under /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/local/test

BUILD SUCCESSFUL
Total time: 1 minute 24 seconds
[chipotle:~/tmp/apache-nutch-2.1/apache-nutch-2.1] mattmann% 

Looks great and great work!

Cheers,
Chris


On Sep 21, 2012, at 4:07 PM, Lewis John Mcgibbney wrote:

> Hi Everyone,
> 
> A candidate for Apache Nutch 2.1 is available at:
> 
> http://people.apache.org/~lewismc/apache-nutch-2.1
> 
> The release candidate is a src.zip and src.tar.gz ONLY
> archive of the sources in:
> 
> http://svn.apache.org/repos/asf/nutch/tags/release-2.1/
> 
> We release Nutch 2.1 in this fashion due to the inclusion of
> Apache Gora and the likelihood that users will regularly recompile
> the code to suit dynamic requirements.
> 
> Further, a staged Maven repository of the 2.1 jar, sources.jar and
> javadoc.jar is available here:
> 
> https://repository.apache.org/content/repositories/orgapachenutch-020/
> 
> Please vote on releasing this package as Apache Nutch 2.1.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 Nutch PMC votes are cast.
> 
> [ ] +1 Release this package as Apache Nutch 2.1
> [ ] -1 Do not release this package because...
> 
> Many Thanks and heres to plenty more.
> 
> Kind Regards,
> Lewis
> 
> P.S. Here's my +1.
> 
> -- 
> Lewis


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Re: [VOTE] Apache Nutch 2.1 Release Candidate Available

Posted by Sebastian Nagel <wa...@googlemail.com>.
Forgot to say:
I've run the test crawl with HBase 0.90.5

On 10/01/2012 04:34 PM, Julien Nioche wrote:
> Would be good to get thumb-ups from people who've tested crawls on other
> backends (Cassandra, Hbase) before pushing the release.  I can't really
> give a +1 as I've just checked the most obvious things.
> 
> On 1 October 2012 14:30, Lewis John Mcgibbney <le...@gmail.com>wrote:
> 
>> Hi Julien,
>>
>>
>> On Mon, Oct 1, 2012 at 2:18 PM, Julien Nioche <
>> lists.digitalpebble@gmail.com> wrote:
>>
>>>
>>> The sources look good otherwise, it compiles fine but on my machine the
>>> tests fail on TestGoraStorage with
>>>
>>> *[Server@1a467d4]: [Thread[HSQLDB Server @1a467d4,5,main]]:
>>> org.hsqldb.Server@1a467d4.run()/handleConnection(): *
>>>
>>>
>> I'll have a look into this later. I have been using the SQL module more
>> regularly recently and haven't come across this when recompiling and
>> testing but I will update none-the-less.
>>
>> Can I consider this as your +1 or am I jumping the gun?
>>
>> Thank you
>>
>> Lewis
>>
> 
> 
> 


Re: [VOTE] Apache Nutch 2.1 Release Candidate Available

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Hi All,

I am tempted to close this VOTE tomorrow unless there are further
objections...

I'll let this simmer away for > 12 hours.

Thank you very much
Lewis

On Mon, Oct 1, 2012 at 3:34 PM, Julien Nioche <lists.digitalpebble@gmail.com
> wrote:

> Would be good to get thumb-ups from people who've tested crawls on other
> backends (Cassandra, Hbase) before pushing the release.  I can't really
> give a +1 as I've just checked the most obvious things.
>
>
> On 1 October 2012 14:30, Lewis John Mcgibbney <le...@gmail.com>wrote:
>
>> Hi Julien,
>>
>>
>> On Mon, Oct 1, 2012 at 2:18 PM, Julien Nioche <
>> lists.digitalpebble@gmail.com> wrote:
>>
>>>
>>> The sources look good otherwise, it compiles fine but on my machine the
>>> tests fail on TestGoraStorage with
>>>
>>> *[Server@1a467d4]: [Thread[HSQLDB Server @1a467d4,5,main]]:
>>> org.hsqldb.Server@1a467d4.run()/handleConnection(): *
>>>
>>>
>> I'll have a look into this later. I have been using the SQL module more
>> regularly recently and haven't come across this when recompiling and
>> testing but I will update none-the-less.
>>
>> Can I consider this as your +1 or am I jumping the gun?
>>
>> Thank you
>>
>> Lewis
>>
>
>
>
> --
> *
> *Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
> http://twitter.com/digitalpebble
>
>


-- 
*Lewis*

Re: [VOTE] Apache Nutch 2.1 Release Candidate Available

Posted by Julien Nioche <li...@gmail.com>.
Would be good to get a thumbs-up from people who've tested crawls on other
backends (Cassandra, HBase) before pushing the release.  I can't really
give a +1 as I've just checked the most obvious things.

On 1 October 2012 14:30, Lewis John Mcgibbney <le...@gmail.com>wrote:

> Hi Julien,
>
>
> On Mon, Oct 1, 2012 at 2:18 PM, Julien Nioche <
> lists.digitalpebble@gmail.com> wrote:
>
>>
>> The sources look good otherwise, it compiles fine but on my machine the
>> tests fail on TestGoraStorage with
>>
>> *[Server@1a467d4]: [Thread[HSQLDB Server @1a467d4,5,main]]:
>> org.hsqldb.Server@1a467d4.run()/handleConnection(): *
>>
>>
> I'll have a look into this later. I have been using the SQL module more
> regularly recently and haven't come across this when recompiling and
> testing but I will update none-the-less.
>
> Can I consider this as your +1 or am I jumping the gun?
>
> Thank you
>
> Lewis
>



-- 
*
*Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble

Re: [VOTE] Apache Nutch 2.1 Release Candidate Available

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Hi Julien,

On Mon, Oct 1, 2012 at 2:18 PM, Julien Nioche <lists.digitalpebble@gmail.com
> wrote:

>
> The sources look good otherwise, it compiles fine but on my machine the
> tests fail on TestGoraStorage with
>
> *[Server@1a467d4]: [Thread[HSQLDB Server @1a467d4,5,main]]:
> org.hsqldb.Server@1a467d4.run()/handleConnection(): *
>
>
I'll have a look into this later. I have been using the SQL module more
regularly recently and haven't come across this when recompiling and
testing, but I will update nonetheless.

Can I consider this as your +1 or am I jumping the gun?

Thank you

Lewis

Re: [VOTE] Apache Nutch 2.1 Release Candidate Available

Posted by Julien Nioche <li...@gmail.com>.
Ok, thanks. Was trying to get a minimalistic crawl of
http://nutch.apache.org/ with MySQL but no success so far (the URL is not
being fetched). Unfortunately won't have the time to investigate this week.

The sources look good otherwise, it compiles fine but on my machine the
tests fail on TestGoraStorage with

[Server@1a467d4]: [Thread[HSQLDB Server @1a467d4,5,main]]:
org.hsqldb.Server@1a467d4.run()/handleConnection():
java.net.SocketException: Too many open files
    at java.net.PlainSocketImpl.socketAccept(Native Method)
    at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:408)
    at java.net.ServerSocket.implAccept(ServerSocket.java:462)
    at java.net.ServerSocket.accept(ServerSocket.java:430)
    at org.hsqldb.server.Server.run(Unknown Source)
    at org.hsqldb.server.Server.access$000(Unknown Source)
    at org.hsqldb.server.Server$ServerThread.run(Unknown Source)

Probably not a blocker though
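
One possible reading of "Too many open files" is that the test process hit the
per-process file descriptor limit on this machine; that is an assumption, not
something confirmed in this thread. A minimal Python sketch for inspecting that
limit on a Unix-like system follows. The function names are hypothetical; the
resource module is part of the Python standard library on Unix.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, treating RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)
```

Printing describe(soft) and describe(hard) for the pair returned by
open_file_limits() shows what the current shell allows. A low soft limit would
be consistent with the failure being specific to one machine, though a
connection leak in the test itself cannot be ruled out from the trace alone.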

Thanks

Julien


On 1 October 2012 13:46, Lewis John Mcgibbney <le...@gmail.com>wrote:

> Hi Julien,
>
> No gora-sql has been static since 0.1.1-incubating, this was due to a
> licensing issue and although the recent gora -sql * dependencies have been
> pushed to maven central we still use the legacy 0.1.1-incubating artifact
> for the time being.
>
> The time has simply not been available (on my part anyway) to dive head
> into a ASL v2.0 licensed Java/SQL client API to re-write the gora-sql
> module and bring it bang up to date.
>
> Thanks
>
> Lewis
>
>
> On Mon, Oct 1, 2012 at 1:36 PM, Julien Nioche <
> lists.digitalpebble@gmail.com> wrote:
>
>> Shouldn't the dependency for gora-sql point to v 0.2.1?
>>
>>
>> On 21 September 2012 16:07, Lewis John Mcgibbney <
>> lewis.mcgibbney@gmail.com> wrote:
>>
>>> Hi Everyone,
>>>
>>> A candidate for Apache Nutch 2.1 is available at:
>>>
>>> http://people.apache.org/~lewismc/apache-nutch-2.1
>>>
>>> The release candidate is a src.zip and src.tar.gz ONLY
>>> archive of the sources in:
>>>
>>> http://svn.apache.org/repos/asf/nutch/tags/release-2.1/
>>>
>>> We release Nutch 2.1 in this fashion due to the inclusion of
>>> Apache Gora and the likelihood that users will regularly recompile
>>> the code to suit dynamic requirements.
>>>
>>> Further, a staged Maven repository of the 2.1 jar, sources.jar and
>>> javadoc.jar is available here:
>>>
>>> https://repository.apache.org/content/repositories/orgapachenutch-020/
>>>
>>> Please vote on releasing this package as Apache Nutch 2.1.
>>> The vote is open for the next 72 hours and passes if a majority of at
>>> least three +1 Nutch PMC votes are cast.
>>>
>>>  [ ] +1 Release this package as Apache Nutch 2.1
>>>  [ ] -1 Do not release this package because...
>>>
>>> Many Thanks and heres to plenty more.
>>>
>>> Kind Regards,
>>> Lewis
>>>
>>> P.S. Here's my +1.
>>>
>>> --
>>> Lewis
>>>
>>
>>
>>
>> --
>> *
>> *Open Source Solutions for Text Engineering
>>
>> http://digitalpebble.blogspot.com/
>> http://www.digitalpebble.com
>> http://twitter.com/digitalpebble
>>
>>
>
>
> --
> *Lewis*
>
>


-- 
*
*Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble

Re: [VOTE] Apache Nutch 2.1 Release Candidate Available

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Hi Julien,

No, gora-sql has been static since 0.1.1-incubating. This was due to a
licensing issue, and although the recent gora -sql * dependencies have been
pushed to Maven Central, we still use the legacy 0.1.1-incubating artifact
for the time being.

The time has simply not been available (on my part anyway) to dive head
first into an ASL v2.0 licensed Java/SQL client API to rewrite the gora-sql
module and bring it bang up to date.

Thanks

Lewis

On Mon, Oct 1, 2012 at 1:36 PM, Julien Nioche <lists.digitalpebble@gmail.com
> wrote:

> Shouldn't the dependency for gora-sql point to v 0.2.1?
>
>
> On 21 September 2012 16:07, Lewis John Mcgibbney <
> lewis.mcgibbney@gmail.com> wrote:
>
>> Hi Everyone,
>>
>> A candidate for Apache Nutch 2.1 is available at:
>>
>> http://people.apache.org/~lewismc/apache-nutch-2.1
>>
>> The release candidate is a src.zip and src.tar.gz ONLY
>> archive of the sources in:
>>
>> http://svn.apache.org/repos/asf/nutch/tags/release-2.1/
>>
>> We release Nutch 2.1 in this fashion due to the inclusion of
>> Apache Gora and the likelihood that users will regularly recompile
>> the code to suit dynamic requirements.
>>
>> Further, a staged Maven repository of the 2.1 jar, sources.jar and
>> javadoc.jar is available here:
>>
>> https://repository.apache.org/content/repositories/orgapachenutch-020/
>>
>> Please vote on releasing this package as Apache Nutch 2.1.
>> The vote is open for the next 72 hours and passes if a majority of at
>> least three +1 Nutch PMC votes are cast.
>>
>>  [ ] +1 Release this package as Apache Nutch 2.1
>>  [ ] -1 Do not release this package because...
>>
>> Many Thanks and heres to plenty more.
>>
>> Kind Regards,
>> Lewis
>>
>> P.S. Here's my +1.
>>
>> --
>> Lewis
>>
>
>
>
> --
> *
> *Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
> http://twitter.com/digitalpebble
>
>


-- 
*Lewis*

Re: [VOTE] Apache Nutch 2.1 Release Candidate Available

Posted by Julien Nioche <li...@gmail.com>.
Shouldn't the dependency for gora-sql point to v 0.2.1?

On 21 September 2012 16:07, Lewis John Mcgibbney
<le...@gmail.com>wrote:

> Hi Everyone,
>
> A candidate for Apache Nutch 2.1 is available at:
>
> http://people.apache.org/~lewismc/apache-nutch-2.1
>
> The release candidate is a src.zip and src.tar.gz ONLY
> archive of the sources in:
>
> http://svn.apache.org/repos/asf/nutch/tags/release-2.1/
>
> We release Nutch 2.1 in this fashion due to the inclusion of
> Apache Gora and the likelihood that users will regularly recompile
> the code to suit dynamic requirements.
>
> Further, a staged Maven repository of the 2.1 jar, sources.jar and
> javadoc.jar is available here:
>
> https://repository.apache.org/content/repositories/orgapachenutch-020/
>
> Please vote on releasing this package as Apache Nutch 2.1.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 Nutch PMC votes are cast.
>
>  [ ] +1 Release this package as Apache Nutch 2.1
>  [ ] -1 Do not release this package because...
>
> Many Thanks and heres to plenty more.
>
> Kind Regards,
> Lewis
>
> P.S. Here's my +1.
>
> --
> Lewis
>



-- 
*
*Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble

Re: [PING] [VOTE] Apache Nutch 2.1 Release Candidate Available

Posted by Bai Shen <ba...@gmail.com>.
Gotcha.  I wasn't sure if that was the case or not.  Just wanted to make
sure y'all were aware.

On Wed, Oct 3, 2012 at 9:37 AM, Julien Nioche <lists.digitalpebble@gmail.com
> wrote:

> Only the Apache distribution of Hadoop version 1.0.3 is officially
> supported by Nutch. Obviously if we can get it to work on other
> distribution then the better it is but this can't be considered a bug or a
> blocker for the release
>
> On 3 October 2012 14:10, Bai Shen <ba...@gmail.com> wrote:
>
> > I just tried to run it and I'm getting the following bug on CDH4.
> >
> > https://issues.apache.org/jira/browse/NUTCH-1447
> >
> > On Mon, Oct 1, 2012 at 8:17 AM, Lewis John Mcgibbney <
> > lewis.mcgibbney@gmail.com> wrote:
> >
> > > Hi All,
> > >
> > > Anyone else for this VOTE?
> > >
> > > Sorry to be a pest!
> > >
> > > Thanks
> > >
> > > Lewis
> > >
> > > On Fri, Sep 21, 2012 at 4:07 PM, Lewis John Mcgibbney
> > > <le...@gmail.com> wrote:
> > > > Hi Everyone,
> > > >
> > > > A candidate for Apache Nutch 2.1 is available at:
> > > >
> > > > http://people.apache.org/~lewismc/apache-nutch-2.1
> > > >
> > > > The release candidate is a src.zip and src.tar.gz ONLY
> > > > archive of the sources in:
> > > >
> > > > http://svn.apache.org/repos/asf/nutch/tags/release-2.1/
> > > >
> > > > We release Nutch 2.1 in this fashion due to the inclusion of
> > > > Apache Gora and the likelihood that users will regularly recompile
> > > > the code to suit dynamic requirements.
> > > >
> > > > Further, a staged Maven repository of the 2.1 jar, sources.jar and
> > > > javadoc.jar is available here:
> > > >
> > > >
> https://repository.apache.org/content/repositories/orgapachenutch-020/
> > > >
> > > > Please vote on releasing this package as Apache Nutch 2.1.
> > > > The vote is open for the next 72 hours and passes if a majority of at
> > > > least three +1 Nutch PMC votes are cast.
> > > >
> > > >  [ ] +1 Release this package as Apache Nutch 2.1
> > > >  [ ] -1 Do not release this package because...
> > > >
> > > > Many Thanks and heres to plenty more.
> > > >
> > > > Kind Regards,
> > > > Lewis
> > > >
> > > > P.S. Here's my +1.
> > > >
> > > > --
> > > > Lewis
> > >
> > >
> > >
> > > --
> > > Lewis
> > >
> >
>
>
>
> --
> *
> *Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
> http://twitter.com/digitalpebble
>

Re: [PING] [VOTE] Apache Nutch 2.1 Release Candidate Available

Posted by Julien Nioche <li...@gmail.com>.
Only the Apache distribution of Hadoop version 1.0.3 is officially
supported by Nutch. Obviously, if we can get it to work on other
distributions, so much the better, but this can't be considered a bug or a
blocker for the release.

On 3 October 2012 14:10, Bai Shen <ba...@gmail.com> wrote:

> I just tried to run it and I'm getting the following bug on CDH4.
>
> https://issues.apache.org/jira/browse/NUTCH-1447
>
> On Mon, Oct 1, 2012 at 8:17 AM, Lewis John Mcgibbney <
> lewis.mcgibbney@gmail.com> wrote:
>
> > Hi All,
> >
> > Anyone else for this VOTE?
> >
> > Sorry to be a pest!
> >
> > Thanks
> >
> > Lewis
> >
> > On Fri, Sep 21, 2012 at 4:07 PM, Lewis John Mcgibbney
> > <le...@gmail.com> wrote:
> > > Hi Everyone,
> > >
> > > A candidate for Apache Nutch 2.1 is available at:
> > >
> > > http://people.apache.org/~lewismc/apache-nutch-2.1
> > >
> > > The release candidate is a src.zip and src.tar.gz ONLY
> > > archive of the sources in:
> > >
> > > http://svn.apache.org/repos/asf/nutch/tags/release-2.1/
> > >
> > > We release Nutch 2.1 in this fashion due to the inclusion of
> > > Apache Gora and the likelihood that users will regularly recompile
> > > the code to suit dynamic requirements.
> > >
> > > Further, a staged Maven repository of the 2.1 jar, sources.jar and
> > > javadoc.jar is available here:
> > >
> > > https://repository.apache.org/content/repositories/orgapachenutch-020/
> > >
> > > Please vote on releasing this package as Apache Nutch 2.1.
> > > The vote is open for the next 72 hours and passes if a majority of at
> > > least three +1 Nutch PMC votes are cast.
> > >
> > >  [ ] +1 Release this package as Apache Nutch 2.1
> > >  [ ] -1 Do not release this package because...
> > >
> > > Many Thanks and heres to plenty more.
> > >
> > > Kind Regards,
> > > Lewis
> > >
> > > P.S. Here's my +1.
> > >
> > > --
> > > Lewis
> >
> >
> >
> > --
> > Lewis
> >
>



-- 
*
*Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble

Re: [PING] [VOTE] Apache Nutch 2.1 Release Candidate Available

Posted by Bai Shen <ba...@gmail.com>.
I just tried to run it and I'm getting the following bug on CDH4.

https://issues.apache.org/jira/browse/NUTCH-1447

On Mon, Oct 1, 2012 at 8:17 AM, Lewis John Mcgibbney <
lewis.mcgibbney@gmail.com> wrote:

> Hi All,
>
> Anyone else for this VOTE?
>
> Sorry to be a pest!
>
> Thanks
>
> Lewis
>
> On Fri, Sep 21, 2012 at 4:07 PM, Lewis John Mcgibbney
> <le...@gmail.com> wrote:
> > Hi Everyone,
> >
> > A candidate for Apache Nutch 2.1 is available at:
> >
> > http://people.apache.org/~lewismc/apache-nutch-2.1
> >
> > The release candidate is a src.zip and src.tar.gz ONLY
> > archive of the sources in:
> >
> > http://svn.apache.org/repos/asf/nutch/tags/release-2.1/
> >
> > We release Nutch 2.1 in this fashion due to the inclusion of
> > Apache Gora and the likelihood that users will regularly recompile
> > the code to suit dynamic requirements.
> >
> > Further, a staged Maven repository of the 2.1 jar, sources.jar and
> > javadoc.jar is available here:
> >
> > https://repository.apache.org/content/repositories/orgapachenutch-020/
> >
> > Please vote on releasing this package as Apache Nutch 2.1.
> > The vote is open for the next 72 hours and passes if a majority of at
> > least three +1 Nutch PMC votes are cast.
> >
> >  [ ] +1 Release this package as Apache Nutch 2.1
> >  [ ] -1 Do not release this package because...
> >
> > Many Thanks and heres to plenty more.
> >
> > Kind Regards,
> > Lewis
> >
> > P.S. Here's my +1.
> >
> > --
> > Lewis
>
>
>
> --
> Lewis
>

Re: [PING] [VOTE] Apache Nutch 2.1 Release Candidate Available

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Hi,

Further to my +1, I can confirm that I've run medium-sized crawls and
domain-specific focused crawls using the 2.1 tag, in pseudo-distributed mode
on Hadoop 1.0.1 with gora-cassandra 0.2 and Cassandra 1.1.2.

My Cassandra records look great, displaying all column families, column
fields and super column fields for the gora-cassandra mapping
configuration. The keyspace looks great, with many links populated.

Lewis

On Mon, Oct 1, 2012 at 1:17 PM, Lewis John Mcgibbney
<le...@gmail.com> wrote:
> Hi All,
>
> Anyone else for this VOTE?
>
> Sorry to be a pest!
>
> Thanks
>
> Lewis
>
> On Fri, Sep 21, 2012 at 4:07 PM, Lewis John Mcgibbney
> <le...@gmail.com> wrote:
>> Hi Everyone,
>>
>> A candidate for Apache Nutch 2.1 is available at:
>>
>> http://people.apache.org/~lewismc/apache-nutch-2.1
>>
>> The release candidate is a src.zip and src.tar.gz ONLY
>> archive of the sources in:
>>
>> http://svn.apache.org/repos/asf/nutch/tags/release-2.1/
>>
>> We release Nutch 2.1 in this fashion due to the inclusion of
>> Apache Gora and the likelihood that users will regularly recompile
>> the code to suit dynamic requirements.
>>
>> Further, a staged Maven repository of the 2.1 jar, sources.jar and
>> javadoc.jar is available here:
>>
>> https://repository.apache.org/content/repositories/orgapachenutch-020/
>>
>> Please vote on releasing this package as Apache Nutch 2.1.
>> The vote is open for the next 72 hours and passes if a majority of at
>> least three +1 Nutch PMC votes are cast.
>>
>>  [ ] +1 Release this package as Apache Nutch 2.1
>>  [ ] -1 Do not release this package because...
>>
>> Many Thanks and heres to plenty more.
>>
>> Kind Regards,
>> Lewis
>>
>> P.S. Here's my +1.
>>
>> --
>> Lewis
>
>
>
> --
> Lewis



-- 
Lewis

Re: [PING] [VOTE] Apache Nutch 2.1 Release Candidate Available

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Thanks for your VOTE!

Cheers,
Chris

On Oct 4, 2012, at 1:08 AM, <j....@thomsonreuters.com>
 <j....@thomsonreuters.com> wrote:

> A bit late but my two cents. I have done a couple of installs on Ubuntu 12.04 using MySQL for the backend and have noticed a couple of the improvements and no regressions so +1 for releasing from my end.
> 
> -----Original Message-----
> From: Lewis John Mcgibbney [mailto:lewis.mcgibbney@gmail.com] 
> Sent: Monday, October 01, 2012 9:18 PM
> To: dev@nutch.apache.org; user@nutch.apache.org
> Subject: [PING] [VOTE] Apache Nutch 2.1 Release Candidate Available
> 
> Hi All,
> 
> Anyone else for this VOTE?
> 
> Sorry to be a pest!
> 
> Thanks
> 
> Lewis
> 
> On Fri, Sep 21, 2012 at 4:07 PM, Lewis John Mcgibbney <le...@gmail.com> wrote:
>> Hi Everyone,
>> 
>> A candidate for Apache Nutch 2.1 is available at:
>> 
>> http://people.apache.org/~lewismc/apache-nutch-2.1
>> 
>> The release candidate is a src.zip and src.tar.gz ONLY archive of the 
>> sources in:
>> 
>> http://svn.apache.org/repos/asf/nutch/tags/release-2.1/
>> 
>> We release Nutch 2.1 in this fashion due to the inclusion of Apache 
>> Gora and the likelihood that users will regularly recompile the code 
>> to suit dynamic requirements.
>> 
>> Further, a staged Maven repository of the 2.1 jar, sources.jar and 
>> javadoc.jar is available here:
>> 
>> https://repository.apache.org/content/repositories/orgapachenutch-020/
>> 
>> Please vote on releasing this package as Apache Nutch 2.1.
>> The vote is open for the next 72 hours and passes if a majority of at 
>> least three +1 Nutch PMC votes are cast.
>> 
>> [ ] +1 Release this package as Apache Nutch 2.1  [ ] -1 Do not 
>> release this package because...
>> 
>> Many Thanks and heres to plenty more.
>> 
>> Kind Regards,
>> Lewis
>> 
>> P.S. Here's my +1.
>> 
>> --
>> Lewis
> 
> 
> 
> --
> Lewis


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


RE: [PING] [VOTE] Apache Nutch 2.1 Release Candidate Available

Posted by j....@thomsonreuters.com.
A bit late but my two cents. I have done a couple of installs on Ubuntu 12.04 using MySQL for the backend and have noticed a couple of the improvements and no regressions so +1 for releasing from my end.

-----Original Message-----
From: Lewis John Mcgibbney [mailto:lewis.mcgibbney@gmail.com] 
Sent: Monday, October 01, 2012 9:18 PM
To: dev@nutch.apache.org; user@nutch.apache.org
Subject: [PING] [VOTE] Apache Nutch 2.1 Release Candidate Available

Hi All,

Anyone else for this VOTE?

Sorry to be a pest!

Thanks

Lewis

On Fri, Sep 21, 2012 at 4:07 PM, Lewis John Mcgibbney <le...@gmail.com> wrote:
> Hi Everyone,
>
> A candidate for Apache Nutch 2.1 is available at:
>
> http://people.apache.org/~lewismc/apache-nutch-2.1
>
> The release candidate is a src.zip and src.tar.gz ONLY archive of the 
> sources in:
>
> http://svn.apache.org/repos/asf/nutch/tags/release-2.1/
>
> We release Nutch 2.1 in this fashion due to the inclusion of Apache 
> Gora and the likelihood that users will regularly recompile the code 
> to suit dynamic requirements.
>
> Further, a staged Maven repository of the 2.1 jar, sources.jar and 
> javadoc.jar is available here:
>
> https://repository.apache.org/content/repositories/orgapachenutch-020/
>
> Please vote on releasing this package as Apache Nutch 2.1.
> The vote is open for the next 72 hours and passes if a majority of at 
> least three +1 Nutch PMC votes are cast.
>
>  [ ] +1 Release this package as Apache Nutch 2.1  [ ] -1 Do not 
> release this package because...
>
> Many Thanks and heres to plenty more.
>
> Kind Regards,
> Lewis
>
> P.S. Here's my +1.
>
> --
> Lewis



--
Lewis

[PING] [VOTE] Apache Nutch 2.1 Release Candidate Available

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Hi All,

Anyone else for this VOTE?

Sorry to be a pest!

Thanks

Lewis

On Fri, Sep 21, 2012 at 4:07 PM, Lewis John Mcgibbney
<le...@gmail.com> wrote:
> Hi Everyone,
>
> A candidate for Apache Nutch 2.1 is available at:
>
> http://people.apache.org/~lewismc/apache-nutch-2.1
>
> The release candidate is a src.zip and src.tar.gz ONLY
> archive of the sources in:
>
> http://svn.apache.org/repos/asf/nutch/tags/release-2.1/
>
> We release Nutch 2.1 in this fashion due to the inclusion of
> Apache Gora and the likelihood that users will regularly recompile
> the code to suit dynamic requirements.
>
> Further, a staged Maven repository of the 2.1 jar, sources.jar and
> javadoc.jar is available here:
>
> https://repository.apache.org/content/repositories/orgapachenutch-020/
>
> Please vote on releasing this package as Apache Nutch 2.1.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 Nutch PMC votes are cast.
>
>  [ ] +1 Release this package as Apache Nutch 2.1
>  [ ] -1 Do not release this package because...
>
> Many Thanks and heres to plenty more.
>
> Kind Regards,
> Lewis
>
> P.S. Here's my +1.
>
> --
> Lewis



-- 
Lewis

Re: [VOTE] Apache Nutch 2.1 Release Candidate Available

Posted by Sebastian Nagel <wa...@googlemail.com>.
+1

* package looks good
* sample crawl runs like a charm

On 09/21/2012 05:07 PM, Lewis John Mcgibbney wrote:
> Hi Everyone,
> 
> A candidate for Apache Nutch 2.1 is available at:
> 
> http://people.apache.org/~lewismc/apache-nutch-2.1
> 
> The release candidate is a src.zip and src.tar.gz ONLY
> archive of the sources in:
> 
> http://svn.apache.org/repos/asf/nutch/tags/release-2.1/
> 
> We release Nutch 2.1 in this fashion due to the inclusion of
> Apache Gora and the likelihood that users will regularly recompile
> the code to suit dynamic requirements.
> 
> Further, a staged Maven repository of the 2.1 jar, sources.jar and
> javadoc.jar is available here:
> 
> https://repository.apache.org/content/repositories/orgapachenutch-020/
> 
> Please vote on releasing this package as Apache Nutch 2.1.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 Nutch PMC votes are cast.
> 
>  [ ] +1 Release this package as Apache Nutch 2.1
>  [ ] -1 Do not release this package because...
> 
> Many Thanks and heres to plenty more.
> 
> Kind Regards,
> Lewis
> 
> P.S. Here's my +1.
> 


Re: [VOTE] Apache Nutch 2.1 Release Candidate Available

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
+1 from me:

SIGS check out:

[chipotle:~/tmp/apache-nutch-2.1] mattmann% $HOME/bin/verify_gpg_sigs 
Verifying Signature for file apache-nutch-2.1-src.tar.gz.asc
gpg: Signature made Fri Sep 21 15:59:21 2012 BST using RSA key ID C601BCA7
gpg: Good signature from "Lewis John McGibbney (CODE SIGNING KEY) <le...@apache.org>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: 2A23 D53F 8D27 5CB6 91E1  89C1 F45E 7970 C601 BCA7
Verifying Signature for file apache-nutch-2.1-src.zip.asc
gpg: Signature made Fri Sep 21 15:59:42 2012 BST using RSA key ID C601BCA7
gpg: Good signature from "Lewis John McGibbney (CODE SIGNING KEY) <le...@apache.org>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: 2A23 D53F 8D27 5CB6 91E1  89C1 F45E 7970 C601 BCA7
[chipotle:~/tmp/apache-nutch-2.1] mattmann% 

MD5s check out:

[chipotle:~/tmp/apache-nutch-2.1] mattmann% $HOME/bin/verify_md5_checksums 
md5sum: stat '*.bz2': No such file or directory
apache-nutch-2.1-src.tar.gz: OK
apache-nutch-2.1-src.zip: OK
[chipotle:~/tmp/apache-nutch-2.1] mattmann% 
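
[Editor's note: the verify_md5_checksums script used above is a local helper whose contents are not shown in this thread. For readers without such a script, an equivalent check might look roughly like the sketch below. It assumes each artifact has a sibling "<name>.md5" file whose first whitespace-separated token is the hex digest; that layout is an assumption, and some Apache checksum files use a different format. The function names md5_of and verify_md5 are hypothetical.]

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(artifact: Path) -> bool:
    """Compare an artifact against its sibling '<name>.md5' file.

    Assumption: the first whitespace-separated token in the .md5 file
    is the expected hex digest.
    """
    checksum_file = artifact.with_name(artifact.name + ".md5")
    expected = checksum_file.read_text().split()[0].lower()
    return md5_of(artifact) == expected


if __name__ == "__main__":
    # Artifact names taken from the transcript above.
    for name in ("apache-nutch-2.1-src.tar.gz", "apache-nutch-2.1-src.zip"):
        artifact = Path(name)
        if not artifact.exists():
            print(f"{name}: not found")
        else:
            print(f"{name}: {'OK' if verify_md5(artifact) else 'FAILED'}")
```

[Run from the directory holding the downloaded artifacts. MD5 only detects accidental corruption; the GPG signature check above is what ties the artifacts to the release manager's key.]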

I built the code using ant runtime and it checked out fine:

...snip

runtime:
    [mkdir] Created dir: /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime
    [mkdir] Created dir: /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/local
    [mkdir] Created dir: /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/deploy
     [copy] Copying 1 file to /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/deploy
     [copy] Copying 1 file to /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/deploy/bin
     [copy] Copying 1 file to /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/local/lib
     [copy] Copying 1 file to /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/local/lib/native
     [copy] Copying 26 files to /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/local/conf
     [copy] Copying 1 file to /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/local/bin
     [copy] Copying 89 files to /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/local/lib
     [copy] Copying 97 files to /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/local/plugins
     [copy] Copied 2 empty directories to 2 empty directories under /Users/mattmann/tmp/apache-nutch-2.1/apache-nutch-2.1/runtime/local/test

BUILD SUCCESSFUL
Total time: 1 minute 24 seconds
[chipotle:~/tmp/apache-nutch-2.1/apache-nutch-2.1] mattmann% 

Looks great and great work!

Cheers,
Chris


On Sep 21, 2012, at 4:07 PM, Lewis John Mcgibbney wrote:

> Hi Everyone,
> 
> A candidate for Apache Nutch 2.1 is available at:
> 
> http://people.apache.org/~lewismc/apache-nutch-2.1
> 
> The release candidate is a src.zip and src.tar.gz ONLY
> archive of the sources in:
> 
> http://svn.apache.org/repos/asf/nutch/tags/release-2.1/
> 
> We release Nutch 2.1 in this fashion due to the inclusion of
> Apache Gora and the likelihood that users will regularly recompile
> the code to suit dynamic requirements.
> 
> Further, a staged Maven repository of the 2.1 jar, sources.jar and
> javadoc.jar is available here:
> 
> https://repository.apache.org/content/repositories/orgapachenutch-020/
> 
> Please vote on releasing this package as Apache Nutch 2.1.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 Nutch PMC votes are cast.
> 
> [ ] +1 Release this package as Apache Nutch 2.1
> [ ] -1 Do not release this package because...
> 
> Many Thanks and heres to plenty more.
> 
> Kind Regards,
> Lewis
> 
> P.S. Here's my +1.
> 
> -- 
> Lewis


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

