Posted to dev@manifoldcf.apache.org by "Karl Wright (JIRA)" <ji...@apache.org> on 2019/01/24 23:15:00 UTC
[jira] [Resolved] (CONNECTORS-1573) Web Crawler exclude from index matches too much?
[ https://issues.apache.org/jira/browse/CONNECTORS-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Karl Wright resolved CONNECTORS-1573.
-------------------------------------
Resolution: Not A Problem
> Web Crawler exclude from index matches too much?
> ------------------------------------------------
>
> Key: CONNECTORS-1573
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1573
> Project: ManifoldCF
> Issue Type: Bug
> Components: Web connector
> Affects Versions: ManifoldCF 2.10
> Reporter: Korneel Staelens
> Priority: Major
>
> Hello,
> I'm not sure whether this is a bug or my misinterpretation of the exclusion rules.
> I want to set up a rule so that the crawler does NOT index a parent page, but does index all child pages of that parent.
> I'm setting up this rule:
> Inclusions:
> .*
>
> Exclusions:
> [http://www.website.com/nl/]
> (I've also tried: http://www.website.com/nl/(\s)* )
> No dice. Looking at the logs, I see that the child pages are crawled but not indexed, due to job restriction. Is my rule wrong, or is this a small bug?
>
> Thanks for advice!
>
>
>
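One possible reading of the "Not A Problem" resolution, sketched here under a stated assumption: if exclusion patterns are applied as unanchored substring matches (the message itself does not say so), then a pattern equal to the parent URL also matches every child URL that begins with it. The Python snippet below only illustrates general regular-expression behaviour; the child URL, the helper name is_excluded, and the anchored pattern are hypothetical and do not come from the report or from ManifoldCF.

```python
import re

# URLs modeled on the report; the child URL is a made-up example.
PARENT = "http://www.website.com/nl/"
CHILD = "http://www.website.com/nl/producten/index.html"


def is_excluded(url, pattern):
    """Return True if the pattern matches anywhere in the URL.

    ASSUMPTION: substring ("search") semantics, not a full-string match.
    """
    return re.search(pattern, url) is not None


# The two patterns quoted in the report, plus a hypothetical anchored variant.
unanchored = r"http://www.website.com/nl/"
with_whitespace = r"http://www.website.com/nl/(\s)*"
anchored = r"^http://www\.website\.com/nl/$"

for name, pattern in [("unanchored", unanchored),
                      ("with_whitespace", with_whitespace),
                      ("anchored", anchored)]:
    print(name,
          "parent excluded:", is_excluded(PARENT, pattern),
          "child excluded:", is_excluded(CHILD, pattern))
```

Under that assumption, both patterns from the report exclude the child page as well as the parent, because `(\s)*` can match zero characters. Only the variant anchored with `^` and `$` excludes the parent alone.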