Posted to dev@lucene.apache.org by "Timothy Rodriguez (BLOOMBERG/ 120 PARK)" <tr...@bloomberg.net> on 2016/11/07 23:45:04 UTC

Problems Running ant enwiki

Is anyone else having problems retrieving the Wikipedia test set used for benchmarks? It looks like the resource is no longer available. When I run ant enwiki, I receive the following:

[get] Error opening connection java.io.FileNotFoundException: http://people.apache.org/~gsingers/wikipedia/enwiki-20070527-pages-articles.xml.bz2

I've tried the link directly from a browser and it looks like it's moved.  Is there a mirror someplace?

-Tim

Re: Problems Running ant enwiki

Posted by David Smiley <da...@gmail.com>.
I created https://issues.apache.org/jira/browse/LUCENE-7546, where further
discussion can continue.  FYI, you can now grab the Wikipedia dump at this
URL:
http://home.apache.org/~dsmiley/data/enwiki-20070527-pages-articles.xml.bz2
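
For anyone who wants to fetch the dump outside of ant, here is a minimal Python sketch of a streaming download from the URL above. The helper names (local_name, fetch) and the "temp" destination directory are assumptions made for illustration, not anything taken from the Lucene build.

```python
import os
import shutil
import urllib.request
from urllib.parse import urlparse

# URL taken from the message above.
ENWIKI_URL = "http://home.apache.org/~dsmiley/data/enwiki-20070527-pages-articles.xml.bz2"


def local_name(url):
    """Return the file name component of a URL (hypothetical helper)."""
    return os.path.basename(urlparse(url).path)


def fetch(url, dest_dir="temp"):
    """Stream url into dest_dir and return the local path.

    dest_dir="temp" is an assumption; point it wherever your setup
    expects the dump. The download is skipped if the file already exists.
    """
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, local_name(url))
    if os.path.exists(dest):
        return dest
    part = dest + ".part"
    # Copy in chunks so the multi-gigabyte file is never held in memory.
    # urlopen raises urllib.error.HTTPError if the server answers 404.
    with urllib.request.urlopen(url) as response, open(part, "wb") as out:
        shutil.copyfileobj(response, out)
    os.replace(part, dest)  # expose the file only once it is complete
    return dest
```

Calling fetch(ENWIKI_URL) would start the download; writing to a ".part" file first means an interrupted transfer is not mistaken for a finished one on the next run.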

-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com

Re: Problems Running ant enwiki

Posted by David Smiley <da...@gmail.com>.
(CC'ing Grant)

I remember hearing something about people.apache.org getting migrated to
home.apache.org:
https://mail-archives.apache.org/mod_mbox/openoffice-dev/201511.mbox/%3C007a01d127b0$c5ed36c0$51c7a440$@apache.org%3E
I've found it difficult to find public information on this; what exists is
mostly secondhand, shared by individuals in specific projects (like
OpenOffice here).  In essence, I learned that a lot of content was simply
moved to the new server, but some of the largest files were not copied.
This one was not; I checked.

I have a copy of this file (2.69 GB).  I'll upload it to my account on
home.apache.org.  It's not clear that there is a better place for it, even
though it's large for the intended use of home.apache.org.  I'll also update
lucene/benchmark/build.xml to reference the new URL.
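
The build.xml change amounts to swapping one URL for another. Below is a minimal Python sketch of that swap, assuming the old URL appears literally in the file. The repoint helper and the sample element are hypothetical, and the actual download task in lucene/benchmark/build.xml may be written differently.

```python
# Old URL from the error message, new URL from the follow-up in this thread.
OLD_URL = "http://people.apache.org/~gsingers/wikipedia/enwiki-20070527-pages-articles.xml.bz2"
NEW_URL = "http://home.apache.org/~dsmiley/data/enwiki-20070527-pages-articles.xml.bz2"


def repoint(build_xml_text, old=OLD_URL, new=NEW_URL):
    """Return (updated_text, replacement_count) after swapping old for new."""
    count = build_xml_text.count(old)
    return build_xml_text.replace(old, new), count


# Illustrative input only; not copied from the real build.xml.
sample = '<get src="%s" dest="temp/enwiki.xml.bz2"/>' % OLD_URL
updated, n = repoint(sample)
```

Returning the replacement count makes it easy to notice when nothing matched, for example because the file already points at the new location.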

~ David
