Posted to user@nutch.apache.org by Marek Bachmann <m....@uni-kassel.de> on 2011/06/15 12:42:08 UTC
index command missing in nutch 1.3?
Perhaps I have missed something important, but I cannot find a way to
build an index in Nutch 1.3: the index command no longer seems to exist.
Is there a new way to do this?
I tried to run:
root@hrz-vm180:/home/nutchServer/nutch/runtime/local/bin# ./nutch index crawl/indexes crawl/crawldb/ crawl/linkdb/ crawl/segments/*
Exception in thread "main" java.lang.NoClassDefFoundError: index
Caused by: java.lang.ClassNotFoundException: index
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
Could not find the main class: index. Program will exit.
And after running nutch without an argument I saw that the index command
is missing.
root@hrz-vm180:/home/nutchServer/nutch/runtime/local/bin# ./nutch
Usage: nutch [-core] COMMAND
where COMMAND is one of:
  crawl             one-step crawler for intranets
  readdb            read / dump crawl db
  convdb            convert crawl db from pre-0.9 format
  mergedb           merge crawldb-s, with optional filtering
  readlinkdb        read / dump link db
  inject            inject new urls into the database
  generate          generate new segments to fetch from crawl db
  freegen           generate new segments to fetch from text files
  fetch             fetch a segment's pages
  parse             parse a segment's pages
  readseg           read / dump segment data
  mergesegs         merge several segments, with optional filtering and slicing
  updatedb          update crawl db from segments after fetching
  invertlinks       create a linkdb from parsed segments
  mergelinkdb       merge linkdb-s, with optional filtering
  solrindex         run the solr indexer on parsed segments and linkdb
  solrdedup         remove duplicates from solr
  solrclean         remove HTTP 301 and 404 documents from solr
  plugin            load a plugin and run one of its classes main()
 or
  CLASSNAME         run the class named CLASSNAME
Most commands print help when invoked w/o parameters.

Expert: -core option is for developers only. It avoids building the job jar,
        instead it simply includes classes compiled with ant compile-core.
        NOTE: this works only for jobs executed in 'local' mode
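
The NoClassDefFoundError above is consistent with the last entry of the usage listing: an argument that is not a known command is run as CLASSNAME, so "index" ends up being handed to Java as a class name. A simplified, hypothetical sketch of that kind of dispatch follows. It is not the actual bin/nutch script; the function name resolve_class is made up for illustration, and only two of the known commands are shown.

```shell
#!/bin/sh
# Hypothetical sketch of command-to-class dispatch, NOT the real bin/nutch.
# resolve_class is an illustrative name; only two known commands are listed.
resolve_class() {
    case "$1" in
        solrindex) echo "org.apache.nutch.indexer.solr.SolrIndexer" ;;
        readdb)    echo "org.apache.nutch.crawl.CrawlDbReader" ;;
        # Fallback: anything unrecognized is treated as CLASSNAME.
        *)         echo "$1" ;;
    esac
}

resolve_class solrindex   # known command, maps to a real class
resolve_class index       # unknown command, passed through unchanged
```

With a fallback like this, "./nutch index ..." asks the JVM to load a class literally named "index", which does not exist, hence the ClassNotFoundException.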
Re: index command missing in nutch 1.3?
Posted by Markus Jelsma <ma...@openindex.io>.
From 1.3 CHANGES.txt:
* NUTCH-837 Remove search servers and Lucene dependencies (ab)
https://issues.apache.org/jira/browse/NUTCH-837
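
In practice this means the Lucene-based index command is gone and the solrindex command from the usage listing is the remaining way to index. A rough sketch of the equivalent call follows, under two assumptions that are not from this thread: a Solr instance is reachable at http://localhost:8983/solr/, and the argument order is <solr url> <crawldb> <linkdb> <segments...>. Running "./nutch solrindex" without parameters should print the exact usage for your version. The script only prints the command (a dry run) instead of executing it; build_solrindex_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# Assumption: Solr URL is a placeholder, not taken from the thread.
SOLR_URL="http://localhost:8983/solr/"
# Crawl directory layout as used in the original message.
CRAWL_DIR="crawl"

# Hypothetical helper: assemble the solrindex invocation as text.
# Assumed argument order: <solr url> <crawldb> <linkdb> <segments...>;
# check it against the help printed by "./nutch solrindex".
build_solrindex_cmd() {
    printf './nutch solrindex %s %s/crawldb %s/linkdb %s/segments/*\n' \
        "$SOLR_URL" "$CRAWL_DIR" "$CRAWL_DIR" "$CRAWL_DIR"
}

# Dry run: print the command so it can be reviewed before running it.
build_solrindex_cmd
```

Once the printed command matches the usage help, it can be run from runtime/local/bin in place of the old index call.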
--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350