Posted to user@nutch.apache.org by Amarnatha Reddy <po...@gmail.com> on 2018/06/26 06:42:57 UTC

Apache Nutch, Solr, ZK best practices

Hi All,

I have done a POC integrating Nutch, Solr, and ZooKeeper with mostly default
values, mainly following steps from blogs.
One of my superiors has asked for more detail on the best practices to
implement in the POC. Kindly assist us with the questionnaire below.

1) Can you research the best practice for periodically running Nutch to
index data into Solr, apart from configuring it as a cron job?
2) Check whether there is any OOTB method to run Nutch periodically.
3) Do a comparative study of all the ways to run Nutch and come up with the
best approach.
4) Find out the optimal value for “The number of rounds to run the crawl”
and the best practice around it.


------------------------------

Thanks and Regards,

*Amarnath Polu*