Posted to user@nutch.apache.org by Johannes Goslar <jo...@dkd.de> on 2014/09/16 01:34:50 UTC

Running Crawls via REST API

Hello,
is it possible to run Nutch as a kind of stand-alone crawl server that is controlled only via the REST API?
I found the generic tutorial for setting up the Nutch server with Cassandra, as well as this wiki page, https://wiki.apache.org/nutch/NutchRESTAPI, but it leaves me a bit confused about how I can actually start full fetch cycles. I probably need to create a fetch job, but what is the full command, with options, to send via REST?
Could anybody point me to some working examples? I started digging through the Java code, but it seems to offer only generic key-value settings.
Kind Regards
Johannes

Re: Running Crawls via REST API

Posted by Johannes Goslar <jo...@dkd.de>.
If possible, I do not want to write a single line of Java. That is why I am wondering whether it is possible to do everything via REST. But so far it seems like SSH might be the better remote interface.

Kind Regards
Johannes

Re: Running Crawls via REST API

Posted by atawfik <co...@gmail.com>.
If you look at the bin/nutch script, you will notice that each command
supported by Nutch calls a Java class. You can use the same approach in
your own Java code, that is, call the appropriate Java class with the
required parameters.
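
As a rough illustration of that approach, the sketch below invokes the
static main method of a class chosen by name, which is essentially what a
launcher script does after it has picked the class for a command. The names
ToolLauncher, runMain and EchoTool are hypothetical and exist only for this
example. EchoTool is a stand-in for a real tool; the actual Nutch class name
for each command is an assumption you would need to read out of bin/nutch
for your version.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

public class ToolLauncher {

    /**
     * Invokes the public static main(String[]) of the named class inside
     * the current JVM, passing the given arguments through unchanged.
     */
    public static void runMain(String className, String... args) {
        try {
            Class<?> tool = Class.forName(className);
            Method main = tool.getMethod("main", String[].class);
            // Cast to Object so the array is passed as one argument.
            main.invoke(null, (Object) args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not run " + className, e);
        }
    }

    /** Hypothetical stand-in tool: records its arguments in a system property. */
    public static class EchoTool {
        public static void main(String[] args) {
            System.setProperty("echo.args", String.join(" ", args));
        }
    }

    /** Usage: java ToolLauncher <fully.qualified.ClassName> [tool args...] */
    public static void main(String[] argv) {
        if (argv.length == 0) {
            System.err.println("usage: ToolLauncher <class> [args...]");
            return;
        }
        // First argument is the class to run, the rest go to that class.
        runMain(argv[0], Arrays.copyOfRange(argv, 1, argv.length));
    }
}
```

With Nutch and its dependencies on the classpath, the class name taken from
bin/nutch would replace EchoTool. One caveat to check before relying on
this: a tool whose main method ends with System.exit would also terminate
the calling JVM.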

Regards
Ameer


