Posted to user@nutch.apache.org by Talat Uyarer <ta...@uyarer.com> on 2015/07/03 16:18:05 UTC

Re: Nutch 2.3 not indexing Solr

Are you sure you are still using Nutch 2.x? Your logs look like Nutch 1.x logs.

Talat
On Jun 25, 2015 23:04, "Geoffry Roberts" <ge...@gmail.com> wrote:

> All,
>
> I am getting the following error when I attempt a crawl.  Can anyone shed a
> little light?
>
> The Command:
> $ bin/crawl -D solr.server.url=http://localhost:8983/solr/ $HOME/nutch/urls/seed.txt /var/crawl 3
>
> The Error:
>
> Injecting seed URLs
> /usr/local/nutch/bin/nutch inject /var/crawl/crawldb /Users/gcr/nutch/urls/seed.txt
> Injector: starting at 2015-06-25 15:57:25
> Injector: crawlDb: /var/crawl/crawldb
> Injector: urlDir: /Users/gcr/nutch/urls/seed.txt
> Injector: Converting injected urls to crawl db entries.
> Injector: java.lang.UnsupportedOperationException: Not implemented by the DistributedFileSystem FileSystem implementation
>     at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:214)
>     at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2365)
>     at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2375)
>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
>     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:167)
>     at org.apache.nutch.crawl.Injector.inject(Injector.java:296)
>     at org.apache.nutch.crawl.Injector.run(Injector.java:379)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.nutch.crawl.Injector.main(Injector.java:369)
>
> My software always runs perfectly in the end. If it is not yet perfect, it
> is not yet the end.
>
> Geoffry Roberts
>

Re: Nutch 2.3 not indexing Solr

Posted by Geoffry Roberts <ge...@gmail.com>.
I double-checked, and I am indeed running Nutch v2.3.  I did some googling,
and it seems others have had a similar problem and issues have been filed.

I have both v1.10 and v2.3 installed and can switch between them.  v1.10
works but v2.3 doesn't.  My interest in 2.3 is Accumulo.


Is anyone else having this problem?
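
With two Nutch installs on one machine, one way to narrow this down is to
ask the JVM which jar each class in the stack trace is loaded from. The
sketch below is only a diagnostic aid, not a fix. The class name WhichJar
and the default class list are made up for illustration, and
org.apache.nutch.crawl.InjectorJob is assumed to be the 2.x counterpart of
the 1.x Injector. This exception is often reported when Hadoop jars of
different versions are mixed on the classpath, but nothing in this thread
confirms that is the cause here.

```java
import java.security.CodeSource;

public class WhichJar {

    /** Describes where the named class was loaded from, without initializing it. */
    static String locate(String className) {
        try {
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK core classes have no code source; report that instead of failing.
            return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Default list is an assumption: the Hadoop classes from the trace
        // plus the injector class of each Nutch generation.
        String[] names = args.length > 0 ? args : new String[] {
            "org.apache.hadoop.fs.FileSystem",
            "org.apache.hadoop.hdfs.DistributedFileSystem",
            "org.apache.nutch.crawl.Injector",
            "org.apache.nutch.crawl.InjectorJob"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Run it with the same classpath that bin/nutch builds. If FileSystem and
DistributedFileSystem resolve to jars from different Hadoop versions, or if
Injector resolves while InjectorJob does not, that points at the install
actually being used.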







-- 
My software always runs perfectly in the end. If it is not yet perfect, it
is not yet the end.

Geoffry Roberts