Posted to dev@nutch.apache.org by kaveh minooie <ka...@plutoz.com> on 2013/08/15 21:17:29 UTC

crawl.gen.delay

Is 'crawl.gen.delay' still being used anywhere? I can't find it
anywhere in the source code except here:

package org.apache.nutch.crawl;

public class GeneratorJob extends NutchTool implements Tool {
   public static final String GENERATOR_TOP_N = "generate.topN";
   public static final String GENERATOR_CUR_TIME = "generate.curTime";
   public static final String GENERATOR_DELAY = "crawl.gen.delay";

Also, I think it has the wrong value in the nutch-default.xml file (the
value is in seconds; it should be in days).

Re: crawl.gen.delay

Posted by feng lu <am...@gmail.com>.

Yes, it is used in Nutch 1.x, but never in Nutch 2.x, because Nutch 2.x
never generates a URL that has already been selected.

The correct unit for crawl.gen.delay is milliseconds; you can check the
Nutch 1.x nutch-default.xml. The property description reads:

<property>
  <name>crawl.gen.delay</name>
  <value>604800000</value>
  <description>
   This value, expressed in milliseconds, defines how long we should keep
   the lock on records in CrawlDb that were just selected for fetching.
   If these records are not updated in the meantime, the lock is canceled,
   i.e. they become eligible for selecting.
   Default value of this is 7 days (604800000 ms).
  </description>
</property>
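
As a quick sanity check on the units discussed in this thread, here is a minimal Java sketch. The class name GenDelayUnits and its helper methods are made up for illustration and are not part of Nutch; only java.util.concurrent.TimeUnit from the JDK is used. It shows that 604800000 corresponds to 7 days when read as milliseconds, and to a much longer period when read as seconds.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not Nutch code: checks what the configured
// value 604800000 means under different unit interpretations.
public class GenDelayUnits {

    // The value shown for crawl.gen.delay in the quoted nutch-default.xml.
    static final long CONFIGURED_VALUE = 604800000L;

    // Convert a number of days to milliseconds.
    static long daysToMillis(long days) {
        return TimeUnit.DAYS.toMillis(days);
    }

    // Interpret the raw value as milliseconds and return whole days.
    static long asMillisInDays(long raw) {
        return TimeUnit.MILLISECONDS.toDays(raw);
    }

    // Interpret the raw value as seconds and return whole days.
    static long asSecondsInDays(long raw) {
        return TimeUnit.SECONDS.toDays(raw);
    }

    public static void main(String[] args) {
        System.out.println("7 days in ms:         " + daysToMillis(7));
        System.out.println("value as ms, in days: " + asMillisInDays(CONFIGURED_VALUE));
        System.out.println("value as s, in days:  " + asSecondsInDays(CONFIGURED_VALUE));
    }
}
```

If the code that reads the property multiplies the value by a per-day factor, the configured number would have to be 7 rather than 604800000; whether that is the case depends on the Nutch version and should be checked in the source.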

Maybe it is wrong.

On Fri, Aug 16, 2013 at 3:17 AM, kaveh minooie <ka...@plutoz.com> wrote:

> crawl.gen.delay





-- 
Don't Grow Old, Grow Up... :-)