Posted to dev@sling.apache.org by "Julian Reschke (Jira)" <ji...@apache.org> on 2022/02/02 13:46:00 UTC

[jira] [Updated] (SLING-11113) resource resolver: bloom filter might be out of sync on startup

     [ https://issues.apache.org/jira/browse/SLING-11113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julian Reschke updated SLING-11113:
-----------------------------------
    Description: 
It appears that the bloom filter can be out of sync with the repo on startup.

Upon startup, when the filter is not present, it gets created and populated with all vanity paths found in the repo. If it is present, it is used as is.

So when a node is restarted, there's a time window (up to the save interval of 60s, plus the downtime) during which vanity paths that were added will not be reflected in the bloom filter.

Now the bloom filter is only relevant if the number of vanity paths exceeds the maximum number, so this problem might be hard to observe.

AFAIU, the *intent* of persisting the bloom filter is to avoid the cost of re-filling it on startup. However, we know that *finding* the vanity paths (doing the query, getting the resources and processing the properties) is already costly. It's dubious that avoiding the cost of updating the filter helps here.

Proposal: get rid of the persistence of the bloom filter altogether, reducing the complexity of the code significantly.
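
To illustrate the failure mode and the proposal, here is a minimal, self-contained sketch. It is not the Sling resource resolver code: the class VanityBloomSketch, its methods, the filter size and the hash function are all hypothetical and chosen only for illustration. It assumes the list of vanity paths is available on startup (which the description says has to be computed anyway).

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;
import java.util.List;

/** Hypothetical illustration only; not the actual Sling implementation. */
class VanityBloomSketch {
    private static final int BITS = 1 << 16; // filter size in bits (arbitrary assumption)
    private static final int HASHES = 3;     // number of probes per path (arbitrary assumption)

    private final BitSet bits;

    /** Creates an empty filter. */
    VanityBloomSketch() {
        this.bits = new BitSet(BITS);
    }

    /** Simulates loading a persisted filter and using it "as is". */
    VanityBloomSketch(byte[] persisted) {
        this.bits = BitSet.valueOf(persisted);
    }

    /** The proposal: always build the filter from the paths found on startup. */
    static VanityBloomSketch rebuild(List<String> vanityPaths) {
        VanityBloomSketch filter = new VanityBloomSketch();
        for (String path : vanityPaths) {
            filter.add(path);
        }
        return filter;
    }

    void add(String path) {
        for (int i = 0; i < HASHES; i++) {
            bits.set(index(path, i));
        }
    }

    /** False means "definitely not added"; true means "possibly added". */
    boolean mightContain(String path) {
        for (int i = 0; i < HASHES; i++) {
            if (!bits.get(index(path, i))) {
                return false;
            }
        }
        return true;
    }

    /** Simulates persisting the filter (for example at a save interval). */
    byte[] snapshot() {
        return bits.toByteArray();
    }

    /** Seeded FNV-1a style hash, folded and reduced to a bit index. */
    private static int index(String path, int seed) {
        int h = 0x811c9dc5 ^ (seed * 0x9e3779b9);
        for (byte b : path.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x01000193;
        }
        h ^= (h >>> 16);
        return Math.floorMod(h, BITS);
    }
}
```

In this sketch, a filter restored from a snapshot taken before a vanity path was added will, barring a false positive, answer "not present" for that path, which is the out-of-sync window described above. A filter produced by rebuild(...) from the current list of paths cannot have that gap, at the cost of re-adding every path on startup.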

 


> resource resolver: bloom filter might be out of sync on startup
> ---------------------------------------------------------------
>
>                 Key: SLING-11113
>                 URL: https://issues.apache.org/jira/browse/SLING-11113
>             Project: Sling
>          Issue Type: Bug
>          Components: ResourceResolver
>            Reporter: Julian Reschke
>            Priority: Major
>
> It appears that the bloom filter can be out of sync with the repo on startup.
> Upon startup, when the filter is not present, it gets created and populated with all vanity paths found in the repo. If it is present, it is used as is.
> So when a node is restarted, there's a time window (up to the save interval of 60s, plus the downtime) during which vanity paths that were added will not be reflected in the bloom filter.
> Now the bloom filter is only relevant if the number of vanity paths exceeds the maximum number, so this problem might be hard to observe.
> AFAIU, the *intent* of persisting the bloom filter is to avoid the cost of re-filling it on startup. However, we know that *finding* the vanity paths (doing the query, getting the resources and processing the properties) is already costly. It's dubious that avoiding the cost of updating the filter helps here.
> Proposal: get rid of the persistence of the bloom filter altogether, reducing the complexity of the code significantly.
>  


