Posted to users@jackrabbit.apache.org by "Uittenbroek, R.M." <r....@rug.nl> on 2018/10/17 10:44:29 UTC

Question about lucene.DocNumberCache and initializeHierarchyCache

Hello,

I hope this is the right list for my question.

We run a CMS on Jackrabbit 2.16.2 and have been using JCR for years. After
startup and initialisation, the first query that is run takes very long
(over 1 minute). From what I can see in the logs, the lucene.DocNumberCache
is consulted during that query and is filled along the way, because it
starts out empty.

We run this query and get the following log output:

2018-10-17 10:48:54,629  INFO [18    ] jcr.JcrSearch - open: xpath query = /jcr:root/webplatform/www.rug.nl//element(*, nt:file)/jcr:content[(@cms:vParentLC = '/education/international-student-blog' and @fs:id != '12b506d7-130c-4643-8d0a-b684890bf946-33.14') and (((not(@cms:publicationStart) and @cms:created < xs:dateTime('2018-10-17T10:48:00.000+02:00')) or @cms:publicationStart < xs:dateTime('2018-10-17T10:48:00.000+02:00')) and (not(@cms:publicationEnd) or @cms:publicationEnd >= xs:dateTime('2018-10-17T10:48:00.000+02:00'))) and (@cms:type = 'blogEntry')]/(@cms:type) order by @jcr:lastModified descending

2018-10-17 10:48:55,575  INFO [18    ] lucene.DocNumberCache - size=7/1000000, #accesses=3149, #hits=3149, #misses=0, cacheRatio=100%
2018-10-17 10:49:05,576  INFO [18    ] lucene.DocNumberCache - size=37575/1000000, #accesses=1746225, #hits=1261440, #misses=484785, cacheRatio=73%
2018-10-17 10:49:15,577  INFO [18    ] lucene.DocNumberCache - size=99950/1000000, #accesses=616478, #hits=0, #misses=616478, cacheRatio=0%
2018-10-17 10:49:25,578  INFO [18    ] lucene.DocNumberCache - size=129659/1000000, #accesses=625557, #hits=0, #misses=625557, cacheRatio=0%
2018-10-17 10:49:35,579  INFO [18    ] lucene.DocNumberCache - size=151664/1000000, #accesses=608407, #hits=3, #misses=608404, cacheRatio=1%
2018-10-17 10:49:45,580  INFO [18    ] lucene.DocNumberCache - size=170923/1000000, #accesses=608628, #hits=0, #misses=608628, cacheRatio=0%
2018-10-17 10:49:55,581  INFO [18    ] lucene.DocNumberCache - size=187612/1000000, #accesses=613109, #hits=0, #misses=613109, cacheRatio=0%
2018-10-17 10:50:05,582  INFO [18    ] lucene.DocNumberCache - size=201976/1000000, #accesses=593435, #hits=0, #misses=593435, cacheRatio=0%
2018-10-17 10:50:15,583  INFO [18    ] lucene.DocNumberCache - size=216065/1000000, #accesses=607567, #hits=0, #misses=607567, cacheRatio=0%
2018-10-17 10:50:20,331  INFO [18    ] jcr.JcrSearch - open: number of nodes found = 107
2018-10-17 10:50:20,331  INFO [18    ] query.Search - Query performed with guest=true

As you can see from the timestamps, this process takes very long: about 86
seconds from the start of the query (10:48:54) to the result (10:50:20).

From the docs at http://jackrabbit.apache.org/jcr/index-readers.html I see:
"In order to speed up lookups by UUID the CachingMultiIndexReader also has a
DocNumberCache. This cache uses a LRU algorithm to keep a limited amount of
UUID to document number mappings." So I assume the DocNumberCache is empty
when the first query is done.
From another doc, I read: "initializeHierarchyCache: With the default
value of true the hierarchy cache is initialized on startup and control is
only given back when the initialization has completed. When set to false
the cache is populated during regular use." We use the default, 'true'.
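
To illustrate the quoted description, a minimal sketch of an LRU cache that
maps UUIDs to document numbers could look like the following. This is not
Jackrabbit's actual DocNumberCache code: the class name, its methods and the
tiny capacity are hypothetical and only show the general technique, using
java.util.LinkedHashMap in access order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch of an LRU cache from node UUID to Lucene document
 * number. NOT the Jackrabbit implementation; names are made up.
 */
class LruDocNumberCacheSketch {
    private final Map<String, Integer> entries;
    private long accesses;
    private long misses;

    LruDocNumberCacheSketch(final int maxSize) {
        // accessOrder=true keeps the least recently used entry first,
        // so removeEldestEntry evicts in LRU order.
        this.entries = new LinkedHashMap<String, Integer>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Integer> eldest) {
                return size() > maxSize;
            }
        };
    }

    /** Returns the cached document number, or null on a cache miss. */
    Integer get(String uuid) {
        accesses++;
        Integer docNumber = entries.get(uuid);
        if (docNumber == null) {
            misses++;
        }
        return docNumber;
    }

    /** Stores a mapping, evicting the least recently used one if full. */
    void put(String uuid, int docNumber) {
        entries.put(uuid, docNumber);
    }

    int size() { return entries.size(); }
    long accesses() { return accesses; }
    long misses() { return misses; }

    public static void main(String[] args) {
        LruDocNumberCacheSketch cache = new LruDocNumberCacheSketch(2);
        cache.get("uuid-a");        // cold cache: miss
        cache.put("uuid-a", 10);
        cache.put("uuid-b", 20);
        cache.get("uuid-a");        // hit, "uuid-a" becomes most recent
        cache.put("uuid-c", 30);    // cache is full: evicts "uuid-b"
        cache.get("uuid-b");        // miss again after eviction
        System.out.println("size=" + cache.size()
                + ", #accesses=" + cache.accesses()
                + ", #misses=" + cache.misses());
    }
}
```

In this sketch a lookup on an empty cache always misses, and an entry only
produces hits after it has been stored by an earlier lookup path, which is
the same cold-cache pattern as the #misses counters in the log above.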

So, I would assume at startup this 'DocNumberCache' would be filled with
all hierarchy information and would be 'ready to go' when we start doing
queries. But this does not seem to be the case.

Am I doing something wrong or missing something, or is there another
parameter to set for this to work? I would really like this DocNumberCache
to be fully ready before JCR becomes 'available'.

Thanks for your help,

Kind regards,

Robbert Uittenbroek