Posted to user@nutch.apache.org by Marek Bachmann <m....@uni-kassel.de> on 2011/06/15 12:42:08 UTC

index command missing in nutch 1.3?

Perhaps I have missed something important, but I can't find a way to
build an index in Nutch 1.3: the index command no longer seems to exist.
Is there a new way to do this?

I tried to run:

root@hrz-vm180:/home/nutchServer/nutch/runtime/local/bin# ./nutch index crawl/indexes crawl/crawldb/ crawl/linkdb/ crawl/segments/*
Exception in thread "main" java.lang.NoClassDefFoundError: index
Caused by: java.lang.ClassNotFoundException: index
	at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
Could not find the main class: index.  Program will exit.

Running nutch without arguments shows that the index command is indeed
missing from the list of commands:

root@hrz-vm180:/home/nutchServer/nutch/runtime/local/bin# ./nutch
Usage: nutch [-core] COMMAND
where COMMAND is one of:
   crawl             one-step crawler for intranets
   readdb            read / dump crawl db
   convdb            convert crawl db from pre-0.9 format
   mergedb           merge crawldb-s, with optional filtering
   readlinkdb        read / dump link db
   inject            inject new urls into the database
   generate          generate new segments to fetch from crawl db
   freegen           generate new segments to fetch from text files
   fetch             fetch a segment's pages
   parse             parse a segment's pages
   readseg           read / dump segment data
   mergesegs         merge several segments, with optional filtering and slicing
   updatedb          update crawl db from segments after fetching
   invertlinks       create a linkdb from parsed segments
   mergelinkdb       merge linkdb-s, with optional filtering
   solrindex         run the solr indexer on parsed segments and linkdb
   solrdedup         remove duplicates from solr
   solrclean         remove HTTP 301 and 404 documents from solr
   plugin            load a plugin and run one of its classes main()
  or
   CLASSNAME         run the class named CLASSNAME
Most commands print help when invoked w/o parameters.

Expert: -core option is for developers only. It avoids building the job jar,
         instead it simply includes classes compiled with ant compile-core.
         NOTE: this works only for jobs executed in 'local' mode

Re: index command missing in nutch 1.3?

Posted by Markus Jelsma <ma...@openindex.io>.
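The index command was removed in 1.3 together with the Lucene dependencies;
indexing now goes through Solr (see solrindex in your usage output).
From the 1.3 CHANGES.txt: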

* NUTCH-837 Remove search servers and Lucene dependencies (ab) 
https://issues.apache.org/jira/browse/NUTCH-837
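
For reference, a minimal sketch of the Solr-based replacement, using the
solrindex command that appears in the usage output. The Solr URL is an
assumption (a local Solr instance on its default port) and the crawl paths
are the ones from the original command. The argument order (Solr URL,
crawldb, linkdb, segments) is as I recall it from the 1.3 usage message;
running ./nutch solrindex without parameters should print the exact form.

```shell
#!/bin/sh
# Assumed Solr endpoint (hypothetical); adjust to your installation.
SOLR_URL="http://localhost:8983/solr/"
# Crawl directory as used in the original command.
CRAWL_DIR="crawl"

# Assemble the solrindex call. The segments glob is kept literal here
# and only expanded by the shell when the command is actually run.
CMD="./nutch solrindex $SOLR_URL $CRAWL_DIR/crawldb $CRAWL_DIR/linkdb $CRAWL_DIR/segments/*"

# Show what would be executed.
echo "$CMD"

# Run it only if the nutch script is present in the current directory.
if [ -x ./nutch ]; then
    $CMD
fi
```

Note that there is no crawl/indexes argument any more: the index lives in
Solr, not in a local Lucene directory.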


-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350