Posted to oak-issues@jackrabbit.apache.org by "Davide Giannella (JIRA)" <ji...@apache.org> on 2019/04/09 10:37:12 UTC

[jira] [Updated] (OAK-3380) Property index pruning should happen asynchronously

     [ https://issues.apache.org/jira/browse/OAK-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Davide Giannella updated OAK-3380:
----------------------------------
    Fix Version/s: 1.14.0

> Property index pruning should happen asynchronously
> ---------------------------------------------------
>
>                 Key: OAK-3380
>                 URL: https://issues.apache.org/jira/browse/OAK-3380
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: property-index
>    Affects Versions: 1.3.5
>            Reporter: Vikas Saurabh
>            Priority: Minor
>              Labels: resilience
>             Fix For: 1.12.0, 1.14.0
>
>
> Following up on this relatively old thread \[1], we should prune the property index structure asynchronously. The thread was never concluded; here are a couple of ideas picked from it:
> * Move pruning to an async thread
> * Throttle pruning i.e. prune only once in a while
> ** I'm not sure how that would work though -- an unpruned part would remain as is until another index update happens on that path.
> Once we can move pruning to some async thread (reducing concurrent updates), OAK-2673 + OAK-2929 can take care of add-add conflicts.
> ----
> h6. Why is this an issue despite merge retries taking care of it?
> A couple of cases where concurrent updates hit merge conflicts in our product (Adobe AEM):
> * Some indexes are very volatile (in the sense that the indexed property switches its value very quickly), e.g. Sling job status, AEM workflow status.
> * Multiple threads take care of jobs. Sling maintains a bucketed structure for job storage to reduce conflicts, but inside the index tree the bucket structure at times gets pruned and needs to be re-created on the next job status change.
> Retries do take care of these conflicts a lot of the time, and even when they don't, AEM workflows have their own retry as a workaround. But retrying, IMHO, is just a waste of time -- more importantly in paths over which the application doesn't really have control.
> h6. Would this add to the cost of traversing the index structure?
> Yes, there'd be some leftover paths in the index structure between asynchronous prunes. But I think the cost of such wasted traversals would be offset by the time saved in avoiding the concurrent update conflicts.
> ----
> (cc [~tmueller], [~mreutegg], [~alex.parvulescu], [~chetanm])
> \[1]: http://mail-archives.apache.org/mod_mbox/jackrabbit-oak-dev/201506.mbox/%3CCADicHF66U2Vh-hLrJUNANsYtXfiDj2mT3vKTr4ybknGpzy9MNw@mail.gmail.com%3E
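
The deferred-pruning idea in the description can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `ToyPropertyIndex`, `IndexNode`, and all method names are hypothetical and are not Oak's API, and the tree is a simplified stand-in for the index structure (indexed value at the top, path segments below). Removing an entry only clears a flag and queues the path; a separate `prune()` pass, which in a real system might run on a background thread, drops empty nodes bottom-up.

```python
class IndexNode:
    """One node of a toy index tree (hypothetical, not Oak's API)."""

    def __init__(self):
        self.children = {}
        self.match = False  # True if an indexed item lives exactly at this path


class ToyPropertyIndex:
    """Toy index: value -> path segments, with pruning deferred to prune()."""

    def __init__(self):
        self.root = IndexNode()
        self.prune_candidates = set()  # (value, path) pairs queued for pruning

    @staticmethod
    def _segments(value, path):
        return [value] + [s for s in path.split("/") if s]

    def add(self, value, path):
        node = self.root
        for seg in self._segments(value, path):
            node = node.children.setdefault(seg, IndexNode())
        node.match = True

    def remove(self, value, path):
        """Clear the entry but leave empty ancestors in place (no inline pruning)."""
        node = self.root
        for seg in self._segments(value, path):
            node = node.children.get(seg)
            if node is None:
                return
        node.match = False
        self.prune_candidates.add((value, path))

    def prune(self):
        """Deferred pass: delete empty nodes bottom-up; returns nodes removed."""
        candidates, self.prune_candidates = self.prune_candidates, set()
        removed = 0
        for value, path in candidates:
            segs = self._segments(value, path)
            chain = [self.root]  # chain[i + 1] is the node for segs[i]
            for seg in segs:
                nxt = chain[-1].children.get(seg)
                if nxt is None:
                    break
                chain.append(nxt)
            for depth in range(len(chain) - 1, 0, -1):
                node = chain[depth]
                if node.match or node.children:
                    break  # still in use (e.g. re-added since removal)
                del chain[depth - 1].children[segs[depth - 1]]
                removed += 1
        return removed

    def query(self, value):
        """Return sorted paths indexed under value; walks leftover nodes too."""
        results = []
        start = self.root.children.get(value)
        stack = [(start, "")] if start is not None else []
        while stack:
            node, path = stack.pop()
            if node.match:
                results.append(path or "/")
            for seg, child in node.children.items():
                stack.append((child, path + "/" + seg))
        return sorted(results)


index = ToyPropertyIndex()
index.add("ACTIVE", "/jobs/bucket1/job1")
index.remove("ACTIVE", "/jobs/bucket1/job1")  # bucket structure stays in place
index.add("DONE", "/jobs/bucket1/job1")
index.prune()  # later, off the write path
```

In this model, a path that is re-added before the next `prune()` finds its ancestors still present, so nothing has to be re-created; the trade-off is that `query()` walks leftover empty nodes until the pass runs, which mirrors the traversal cost discussed in the description.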



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)