Posted to oak-issues@jackrabbit.apache.org by "Thomas Mueller (JIRA)" <ji...@apache.org> on 2018/12/06 14:29:00 UTC

[jira] [Commented] (OAK-7947) Lazy loading of Lucene index files startup

    [ https://issues.apache.org/jira/browse/OAK-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711507#comment-16711507 ] 

Thomas Mueller commented on OAK-7947:
-------------------------------------

The attached patch solves the issue. It contains various changes; some of them are possibly not needed, and some might be incorrect or problematic. This is work in progress. Still, it would be nice to get some feedback from those who are more familiar with this code, for example [~catholicon] [~teofili] [~chetanm]. Changes I made:

* IndexTracker.getIndexDefinition now constructs the node and returns it if the index isn't in the indices map yet. I don't know why it returned null before; it seems wrong to me.
* LuceneIndexNodeManager always opened the index; I don't know why. SearcherHolder no longer always does that: I basically made SearcherHolder open the index lazily.
* LucenePropertyIndex.acquireIndexNode is called when planning, and that method opens the index files. I don't know why. I created a class LazyLuceneIndexNode that wraps LuceneIndexNode and creates it lazily.
* OakStreamingIndexFile now logs the directory name as well, not just the file name.
* DefaultIndexReader now opens the directory (DirectoryReader.open) lazily, that is, only when getReader is called.
* FulltextIndexPlanner.estimatedEntryCount now only calls getNumDocs when really needed (that is, only if "entryCount" isn't set in the index definition). That should avoid having to open the index if we know the entryCount is high.
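
To illustrate the general idea behind these changes (this is not the code from the patch), here is a minimal sketch in plain Java. The names Lazy, IndexHandle and LazyIndexSketch are hypothetical and do not exist in Oak or Lucene; the estimatedEntryCount method here only mimics the behaviour described above, assuming the configured "entryCount" is passed in as an OptionalLong. The wrapper opens the underlying resource on first use only, and the estimate avoids touching the index when "entryCount" is configured.

```java
import java.util.OptionalLong;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class LazyIndexSketch {

    /** Hypothetical stand-in for an opened index; not an Oak or Lucene type. */
    interface IndexHandle {
        long numDocs();
    }

    /** Runs the (possibly expensive) opener on first get() and caches the result. */
    static final class Lazy<T> implements Supplier<T> {
        private final Supplier<T> opener;
        private T value;
        private boolean opened;

        Lazy(Supplier<T> opener) {
            this.opener = opener;
        }

        @Override
        public synchronized T get() {
            if (!opened) {
                value = opener.get();
                opened = true;
            }
            return value;
        }

        synchronized boolean isOpened() {
            return opened;
        }
    }

    /** Uses the configured entry count if present; only otherwise asks the index. */
    static long estimatedEntryCount(OptionalLong configuredEntryCount,
                                    Supplier<IndexHandle> index) {
        if (configuredEntryCount.isPresent()) {
            return configuredEntryCount.getAsLong();
        }
        return index.get().numDocs();
    }

    public static void main(String[] args) {
        AtomicInteger opens = new AtomicInteger();
        Lazy<IndexHandle> index = new Lazy<IndexHandle>(() -> {
            opens.incrementAndGet(); // counts how often the "index" is opened
            return () -> 42L;        // pretend the opened index has 42 documents
        });

        // entryCount is configured: the index is not opened
        System.out.println(estimatedEntryCount(OptionalLong.of(1000), index)); // 1000
        System.out.println(opens.get());                                       // 0

        // entryCount is not configured: the index is opened once, then reused
        System.out.println(estimatedEntryCount(OptionalLong.empty(), index));  // 42
        System.out.println(estimatedEntryCount(OptionalLong.empty(), index));  // 42
        System.out.println(opens.get());                                       // 1
    }
}
```

The get() method is synchronized so that concurrent callers do not open the resource twice; whether that is the right trade-off for the real classes is an open question.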

> Lazy loading of Lucene index files startup
> ------------------------------------------
>
>                 Key: OAK-7947
>                 URL: https://issues.apache.org/jira/browse/OAK-7947
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene, query
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>            Priority: Major
>         Attachments: OAK-7947.patch
>
>
> Right now, all Lucene index binaries are loaded on startup (I think when the first query is run, to do cost calculation). This is a performance problem if the index files are large and need to be downloaded from the data store.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)