Posted to dev@manifoldcf.apache.org by "Florian Schmedding (JIRA)" <ji...@apache.org> on 2014/02/12 20:45:19 UTC

[jira] [Comment Edited] (CONNECTORS-850) Maximum interval in dynamic crawling

    [ https://issues.apache.org/jira/browse/CONNECTORS-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13899490#comment-13899490 ] 

Florian Schmedding edited comment on CONNECTORS-850 at 2/12/14 7:44 PM:
------------------------------------------------------------------------

Job settings:
minimum interval: 2 min
maximum interval: 4 min
The job was run a few times before it was set to dynamic crawling. The log below lists the newest entries first.

{noformat}
02-12-2014 18:27:45.475 fetch
02-12-2014 18:23:45.702 document ingest (solr localhost)
02-12-2014 18:23:44.921 fetch
02-12-2014 18:19:44.451 document ingest (solr localhost)
02-12-2014 18:19:43.837 fetch
02-12-2014 18:15:42.929 fetch
02-12-2014 18:11:41.582 document ingest (solr localhost)
02-12-2014 18:11:41.058 fetch
*** document changed
02-12-2014 18:07:40.744 document ingest (solr localhost)
02-12-2014 18:07:40.249 fetch
02-12-2014 18:03:37.546 fetch
02-12-2014 17:59:36.426 fetch
02-12-2014 17:55:34.297 fetch
02-12-2014 17:51:33.431 document ingest (solr localhost)
02-12-2014 17:51:32.973 fetch
*** job changed from scheduled to dynamic crawling
02-12-2014 17:24:24.560 document ingest (solr localhost)
02-12-2014 17:24:24.413 fetch
02-12-2014 17:21:17.042 document ingest (solr localhost)
02-12-2014 17:21:16.919 fetch
02-12-2014 17:18:15.892 document ingest (solr localhost)
{noformat}
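
To make the timing in the log easier to read, here is a minimal sketch that computes the gaps between consecutive fetches after the job was switched to dynamic crawling. The helper names (`fetch_times`, `gaps_in_seconds`) are hypothetical and not part of ManifoldCF, and the timestamps are assumed to be in MM-DD-YYYY order.

```python
from datetime import datetime

# Dynamic-crawling portion of the log above, copied verbatim (newest first).
LOG = """\
02-12-2014 18:27:45.475 fetch
02-12-2014 18:23:45.702 document ingest (solr localhost)
02-12-2014 18:23:44.921 fetch
02-12-2014 18:19:44.451 document ingest (solr localhost)
02-12-2014 18:19:43.837 fetch
02-12-2014 18:15:42.929 fetch
02-12-2014 18:11:41.582 document ingest (solr localhost)
02-12-2014 18:11:41.058 fetch
*** document changed
02-12-2014 18:07:40.744 document ingest (solr localhost)
02-12-2014 18:07:40.249 fetch
02-12-2014 18:03:37.546 fetch
02-12-2014 17:59:36.426 fetch
02-12-2014 17:55:34.297 fetch
02-12-2014 17:51:33.431 document ingest (solr localhost)
02-12-2014 17:51:32.973 fetch
"""


def fetch_times(log):
    """Return the timestamps of all 'fetch' entries, oldest first."""
    times = []
    for line in log.splitlines():
        parts = line.split(" ", 2)  # date, time, activity
        if len(parts) == 3 and parts[2].strip() == "fetch":
            stamp = parts[0] + " " + parts[1]
            # Assumption: MM-DD-YYYY, consistent with the 2014/02/12 posting date.
            times.append(datetime.strptime(stamp, "%m-%d-%Y %H:%M:%S.%f"))
    return sorted(times)


def gaps_in_seconds(times):
    """Return the time between each pair of consecutive timestamps, in seconds."""
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


gaps = gaps_in_seconds(fetch_times(LOG))
for gap in gaps:
    print(f"{gap:.3f} s")
```

By this calculation, all nine gaps in the excerpt fall between 240 and 243 seconds, which matches the configured maximum interval of 4 min rather than the 2 min minimum.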



> Maximum interval in dynamic crawling
> ------------------------------------
>
>                 Key: CONNECTORS-850
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-850
>             Project: ManifoldCF
>          Issue Type: New Feature
>          Components: Framework crawler agent
>    Affects Versions: ManifoldCF 1.4.1
>            Reporter: Florian Schmedding
>            Assignee: Karl Wright
>            Priority: Minor
>              Labels: features
>             Fix For: ManifoldCF 1.5
>
>
> Currently, the dynamic crawling method used for a continuous job extends the reseed and recrawl intervals when no changes are found in a checked document. However, it should be possible to restrict this extension to a maximum value in order to make sure that new documents are discovered within a certain interval.
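
The requested behaviour can be illustrated with a small sketch. This is not ManifoldCF's scheduling code: the function name `next_interval`, the `growth` factor, and the reset to the minimum interval on a change are all assumptions made for illustration. It only shows the idea of extending an interval while clamping it to a configured maximum.

```python
def next_interval(current, min_interval, max_interval, changed, growth=2.0):
    """Return the next recrawl interval in seconds.

    Hypothetical policy, for illustration only:
    - if the document changed, fall back to the minimum interval;
    - otherwise extend the interval by `growth`, but never beyond
      `max_interval` (the cap requested in this issue).
    """
    if changed:
        return min_interval
    extended = current * growth
    # Clamp the extended interval into [min_interval, max_interval].
    return min(max_interval, max(min_interval, extended))


# Example with the settings from the comment above (2 min and 4 min).
interval = 120
for changed in (False, False, True, False):
    interval = next_interval(interval, 120, 240, changed)
    print(interval)
```

Without the `min(max_interval, ...)` clamp, the interval in this sketch would keep growing for as long as no change is detected, which is the situation the issue asks to avoid.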



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)