Posted to oak-issues@jackrabbit.apache.org by "Davide Giannella (JIRA)" <ji...@apache.org> on 2017/10/13 09:00:26 UTC

[jira] [Closed] (OAK-6726) Use addDocument instead of updateDocument while reindexing with Lucene

     [ https://issues.apache.org/jira/browse/OAK-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Davide Giannella closed OAK-6726.
---------------------------------

Bulk close for 1.7.9

> Use addDocument instead of updateDocument while reindexing with Lucene
> ----------------------------------------------------------------------
>
>                 Key: OAK-6726
>                 URL: https://issues.apache.org/jira/browse/OAK-6726
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>            Priority: Minor
>             Fix For: 1.8, 1.7.9
>
>
> Currently the DefaultIndexWriter uses [updateDocument|https://lucene.apache.org/core/4_7_1/core/org/apache/lucene/index/IndexWriter.html#updateDocument(org.apache.lucene.index.Term, java.lang.Iterable)] when adding or updating a document in the index. This is fine for incremental indexing, where we cannot be sure whether the index already has that entry. This call first searches for an existing document matching the term, then deletes it and adds the new document.
> However, for the reindex case, where we start from an empty index, we can use [addDocument|https://lucene.apache.org/core/4_7_1/core/org/apache/lucene/index/IndexWriter.html#addDocument(java.lang.Iterable)]. This avoids the extra work of the search.
> In a test where the index had ~70M entries, switching to addDocument reduced the reindexing time by 10 minutes.
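
The choice described above can be sketched roughly as follows. This is a minimal illustration, not the actual Oak change: DocSink, PathAwareWriter and RecordingSink are hypothetical stand-ins, and DocSink only mimics the shape of Lucene's IndexWriter.addDocument and IndexWriter.updateDocument so the sketch runs without Lucene on the classpath. It assumes the caller knows whether it is reindexing from an empty index.

```java
import java.util.ArrayList;
import java.util.List;

public class ReindexWriterSketch {

    /** Hypothetical stand-in for the two IndexWriter calls discussed in the issue. */
    interface DocSink {
        void addDocument(String doc);
        void updateDocument(String path, String doc);
    }

    /** Picks the cheaper call when the index is being rebuilt from scratch. */
    static final class PathAwareWriter {
        private final DocSink sink;
        private final boolean reindex;

        PathAwareWriter(DocSink sink, boolean reindex) {
            this.sink = sink;
            this.reindex = reindex;
        }

        void write(String path, String doc) {
            if (reindex) {
                // Empty index: no earlier entry for this path can exist,
                // so the delete-by-term step is unnecessary.
                sink.addDocument(doc);
            } else {
                // Incremental indexing: an older entry may exist and must be replaced.
                sink.updateDocument(path, doc);
            }
        }
    }

    /** Test double that records which call was made, in order. */
    static final class RecordingSink implements DocSink {
        final List<String> calls = new ArrayList<>();

        @Override
        public void addDocument(String doc) {
            calls.add("add:" + doc);
        }

        @Override
        public void updateDocument(String path, String doc) {
            calls.add("update:" + path + ":" + doc);
        }
    }

    public static void main(String[] args) {
        RecordingSink sink = new RecordingSink();
        new PathAwareWriter(sink, true).write("/content/a", "doc-a");
        new PathAwareWriter(sink, false).write("/content/b", "doc-b");
        System.out.println(sink.calls);
    }
}
```

The sketch only holds under the assumption stated in the issue: addDocument is safe solely because a reindex starts from an empty index. If the same path could be written twice during one run, the add-only branch would leave duplicate entries.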


