Posted to oak-issues@jackrabbit.apache.org by "Michael Dürig (JIRA)" <ji...@apache.org> on 2015/04/27 09:40:38 UTC

[jira] [Commented] (OAK-2599) Allow excluding certain paths from getting indexed for particular index

    [ https://issues.apache.org/jira/browse/OAK-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513661#comment-14513661 ] 

Michael Dürig commented on OAK-2599:
------------------------------------

Path filtering looks good AFAICS. One thing that could be optimised: pass the filter down the tree, adapting it to the respective branch, instead of keeping it in the context. This way you don't need to re-apply the filter for paths that were already handled at a parent. This is the reason I have the {{o.a.j.o.p.observation.filter.EventFilter#create}} factory methods.
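
To illustrate the idea of adapting a filter per branch, here is a minimal, self-contained sketch. It is not the Oak {{EventFilter}} API and not code from the patch: {{BranchFilter}}, {{child}}, {{excluding}} and the two constant filters are hypothetical names made up for this example, and it assumes absolute exclude paths without a trailing slash. Each node receives a filter already narrowed to its branch, so a subtree that can no longer match any exclude gets a constant filter and is never checked against the path list again.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types exist in Oak.
public final class BranchFilterSketch {
    private BranchFilterSketch() {}

    /** Filter scoped to one node; child() returns the filter for a child node. */
    public interface BranchFilter {
        boolean includes();
        BranchFilter child(String name);
    }

    /** Nothing below this node is excluded, so no further path checks are needed. */
    public static final BranchFilter INCLUDE_ALL = new BranchFilter() {
        @Override public boolean includes() { return true; }
        @Override public BranchFilter child(String name) { return this; }
    };

    /** This node and everything below it is excluded. */
    public static final BranchFilter EXCLUDE_ALL = new BranchFilter() {
        @Override public boolean includes() { return false; }
        @Override public BranchFilter child(String name) { return this; }
    };

    /** Root filter for a list of absolute exclude paths such as "/a/b". */
    public static BranchFilter excluding(List<String> excludedPaths) {
        List<String[]> pending = new ArrayList<>();
        for (String path : excludedPaths) {
            if (path.equals("/")) {
                return EXCLUDE_ALL;
            }
            pending.add(path.substring(1).split("/"));
        }
        return atDepth(pending, 0);
    }

    // 'pending' holds only excludes that still match the current branch
    // and that have at least one segment left at index 'depth'.
    private static BranchFilter atDepth(List<String[]> pending, int depth) {
        if (pending.isEmpty()) {
            return INCLUDE_ALL;
        }
        return new BranchFilter() {
            @Override public boolean includes() { return true; }
            @Override public BranchFilter child(String name) {
                List<String[]> remaining = new ArrayList<>();
                for (String[] segments : pending) {
                    if (segments[depth].equals(name)) {
                        if (segments.length == depth + 1) {
                            return EXCLUDE_ALL; // child is itself an excluded path
                        }
                        remaining.add(segments);
                    }
                }
                return atDepth(remaining, depth + 1);
            }
        };
    }
}
```

With {{excluding(Arrays.asList("/jcr:system/jcr:versionStore"))}}, the filter for {{/content}} is already {{INCLUDE_ALL}}, so nothing below {{/content}} is compared against the exclude list, while the filter for {{/jcr:system/jcr:versionStore}} is {{EXCLUDE_ALL}}.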

Also {{PathFilter#optimise}} could be further optimised by removing entries that are subsumed by other entries (e.g. including {{/a/b, /a}} is the same as including {{/a}}).
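
A rough sketch of the subsumption check, not the actual {{PathFilter}} code: the class and method names below are hypothetical, and paths are assumed to be absolute and normalised (no trailing slash). An entry is dropped when an already kept entry is equal to it or is one of its ancestors.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of Oak.
public final class PathSubsumption {
    private PathSubsumption() {}

    /** True if 'ancestor' equals 'path' or is an ancestor of it. */
    static boolean covers(String ancestor, String path) {
        if (ancestor.equals("/")) {
            return true;
        }
        // Appending "/" prevents "/a" from being treated as an ancestor of "/ab".
        return path.equals(ancestor) || path.startsWith(ancestor + "/");
    }

    /** Returns the paths with every entry removed that another entry already covers. */
    public static List<String> removeSubsumed(Collection<String> paths) {
        List<String> kept = new ArrayList<>();
        // An ancestor is a string prefix of its descendants, so natural
        // ordering visits it first; the TreeSet also removes exact duplicates.
        for (String path : new TreeSet<>(paths)) {
            boolean subsumed = false;
            for (String k : kept) {
                if (covers(k, path)) {
                    subsumed = true;
                    break;
                }
            }
            if (!subsumed) {
                kept.add(path);
            }
        }
        return kept;
    }
}
```

For the example above, {{removeSubsumed(Arrays.asList("/a/b", "/a"))}} yields a list containing only {{/a}}, while a sibling such as {{/ab}} would be kept.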

{{PathFilter#optimise}} is also a duplicate of {{ObservationManagerImpl#optimise}}, and it would be nice if we could keep things DRY. And while nitpicking, {{LuceneIndexEditor#getPath}} should be either private or final.


> Allow excluding certain paths from getting indexed for particular index
> -----------------------------------------------------------------------
>
>                 Key: OAK-2599
>                 URL: https://issues.apache.org/jira/browse/OAK-2599
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: core
>            Reporter: Chetan Mehrotra
>             Fix For: 1.3.0
>
>         Attachments: OAK-2599-1.patch
>
>
> Currently an {{IndexEditor}} gets to index all nodes under the tree where it is defined (post OAK-1980). Because of this, the IndexEditor traverses the whole repo (or the subtree, if configured at a non-root path) to perform a reindex. Depending on the repo size, this process can take quite a bit of time. It would be faster if an IndexEditor could exclude certain paths from traversal.
> Consider an application like Adobe AEM and an index which only indexes dam:Asset, or the default full text index. For a full text index it might make sense to avoid indexing the versionStore. If the index editor skips such paths, a lot of redundant traversal can be avoided.
> Also see http://markmail.org/thread/4cuuicakagi6av4v



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)