Posted to dev@sling.apache.org by "Julian Reschke (Jira)" <ji...@apache.org> on 2022/02/02 14:40:00 UTC

[jira] [Comment Edited] (SLING-11113) resource resolver: bloom filter might be out of sync on startup

    [ https://issues.apache.org/jira/browse/SLING-11113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17485847#comment-17485847 ] 

Julian Reschke edited comment on SLING-11113 at 2/2/22, 2:39 PM:
-----------------------------------------------------------------

Not entirely.

The bloom filter is *needed* when the total number of vanity paths exceeds the cache size (which defaults to "unlimited"). If the cache is full, the bloom filter is used to determine whether it makes sense to ask the repo (using a JCR query) for resources that have configured a given path as a vanity path.
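
To illustrate the lookup order described above, here is a minimal sketch. It is not the Sling implementation: the class names, the two-hash filter, and the "cache below its limit means cache is complete" shortcut are all assumptions made for the example.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch; none of these names come from the Sling code base. */
final class VanityPathLookupSketch {

    /** Tiny bloom filter: never a false negative, occasionally a false positive. */
    static final class SimpleBloomFilter {
        private final int size;
        private final BitSet bits;

        SimpleBloomFilter(int size) {
            this.size = size;
            this.bits = new BitSet(size);
        }

        void add(String key) {
            bits.set(index(key, 17));
            bits.set(index(key, 31));
        }

        boolean mightContain(String key) {
            return bits.get(index(key, 17)) && bits.get(index(key, 31));
        }

        private int index(String key, int seed) {
            // floorMod keeps the index non-negative even if the product overflows
            return Math.floorMod(key.hashCode() * seed + seed, size);
        }
    }

    private final Map<String, String> cache = new HashMap<>();
    private final SimpleBloomFilter filter = new SimpleBloomFilter(1 << 16);
    private final int maxCacheEntries;
    private final Function<String, String> repoQuery; // stands in for the JCR query
    int repoQueries = 0;

    VanityPathLookupSketch(int maxCacheEntries, Function<String, String> repoQuery) {
        this.maxCacheEntries = maxCacheEntries;
        this.repoQuery = repoQuery;
    }

    /** The filter sees every vanity path; the cache only takes entries while it has room. */
    void register(String vanityPath, String target) {
        filter.add(vanityPath);
        if (cache.size() < maxCacheEntries) {
            cache.put(vanityPath, target);
        }
    }

    String resolve(String vanityPath) {
        String cached = cache.get(vanityPath);
        if (cached != null) {
            return cached;
        }
        if (cache.size() < maxCacheEntries) {
            return null; // assumption: a cache below its limit holds every vanity path
        }
        if (!filter.mightContain(vanityPath)) {
            return null; // definitely not a vanity path, so skip the repo query
        }
        repoQueries++;
        return repoQuery.apply(vanityPath); // may still return null on a false positive
    }
}
```

The point of the sketch is that the filter only matters once the cache has overflowed; with an unlimited cache the third branch is never reached.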

The proposed fix just removes the startup optimization where we assume the bloom filter is accurate when present on disk. It means that when building the resolver map (walking all resources with vanity paths), we'll have to update the bloom filter as well.
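
As a rough sketch of what that means (hypothetical names, and a deliberately simplified single-hash filter rather than the real one): the filter is always filled during the startup walk, so a path added after the last save of a persisted copy cannot be missed.

```java
import java.util.BitSet;
import java.util.List;

/** Hypothetical sketch; not taken from the Sling code base. */
final class StartupFilterSketch {
    static final int SIZE = 1 << 16;

    // Simplification: one hash function only, to keep the example short.
    static int index(String path) {
        return Math.floorMod(path.hashCode() * 31 + 7, SIZE);
    }

    /** Fills a fresh filter while walking all vanity paths found in the repo. */
    static BitSet rebuild(List<String> vanityPathsInRepo) {
        BitSet filter = new BitSet(SIZE);
        for (String path : vanityPathsInRepo) {
            // the resolver map entry for this path would be created here as well
            filter.set(index(path));
        }
        return filter;
    }

    static boolean mightContain(BitSet filter, String path) {
        return filter.get(index(path));
    }
}
```

A filter built before a vanity path was added (which is what a stale on-disk copy amounts to) can answer "not present" for that path, whereas a filter rebuilt from the current repo content cannot.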


> resource resolver: bloom filter might be out of sync on startup
> ---------------------------------------------------------------
>
>                 Key: SLING-11113
>                 URL: https://issues.apache.org/jira/browse/SLING-11113
>             Project: Sling
>          Issue Type: Bug
>          Components: ResourceResolver
>            Reporter: Julian Reschke
>            Priority: Major
>
> It appears that the bloom filter can be out of sync with the repo on startup.
> Upon startup, when not present, it gets created and updated with all vanity paths found in the repo. If present, it is used as is.
> So for a restart of a node, there's a time window (up to the save interval of 60s plus the downtime) during which the addition of vanity paths will not be reflected in the bloom filter.
> Now the bloom filter is only relevant if the number of vanity paths exceeds the maximum number held in the cache, so this problem might be hard to observe.
> AFAIU, the *intent* of persisting the bloom filter is to avoid the cost of re-filling it on startup. However, we know that *finding* the vanity paths (doing the query, getting the resources and processing the properties) is already costly. It's doubtful that avoiding the cost of updating the filter helps here.
> Proposal: get rid of the persistence of the bloom filter altogether, reducing the complexity of the code significantly.
>  


