Posted to oak-issues@jackrabbit.apache.org by "Thomas Mueller (JIRA)" <ji...@apache.org> on 2017/01/26 09:38:24 UTC
[jira] [Comment Edited] (OAK-5324) Enable property index reindexing via oak-run
[ https://issues.apache.org/jira/browse/OAK-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15837941#comment-15837941 ]
Thomas Mueller edited comment on OAK-5324 at 1/26/17 9:37 AM:
--------------------------------------------------------------
> But I assume this issue is rather about a way to introduce a new index or update an existing one when the system is online, right? In that case, the branch-less mode is off the table.
I see. I wrote a tool that allows managing indexes (creating, changing, reindexing, removing) using a script, for both the regular and the branch-less mode now:
http://svn.apache.org/r1780222
> At least for new indexes we could try to improve the branch handling in the DocumentNodeStore.
If that turns out to be much easier, we could probably make reindexing a special case of creating a new index. For example, re-index into a new hidden child node, ":data_1", ":data_2",..., so that the existing nodes are not changed. The pointer to the latest ":data_x" node would only be changed at the end, maybe in a separate commit. After that, the old, outdated ":data_(n-1)" node could be removed step by step using multiple commits, or in one commit (which can't conflict).
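To make the idea concrete, here is a minimal in-memory sketch of the ":data_x" approach. It is only an illustration: the class GenerationalIndex and its methods are hypothetical, it does not use the Oak API, and the three steps that would be separate commits in a real NodeStore are plain dictionary updates here.

```python
class GenerationalIndex:
    """Toy model of an index with ":data_N" generations (hypothetical, not the Oak API)."""

    def __init__(self):
        self.generations = {}  # ":data_N" -> {property value -> set of paths}
        self.active = None     # pointer to the generation that queries read
        self.counter = 0

    def reindex(self, content):
        """Build a new generation from content (path -> indexed property value)."""
        self.counter += 1
        name = ":data_%d" % self.counter
        data = {}
        for path, value in content.items():
            data.setdefault(value, set()).add(path)
        # Step 1: store the new generation; the active one is not touched.
        self.generations[name] = data
        # Step 2: switch the pointer (a separate commit in the proposal).
        old, self.active = self.active, name
        # Step 3: drop the outdated generation; nothing reads it any more.
        if old is not None:
            del self.generations[old]
        return name

    def lookup(self, value):
        """Return the paths indexed under value in the active generation."""
        if self.active is None:
            return set()
        return set(self.generations[self.active].get(value, ()))
```

Queries only ever follow the pointer, so they keep seeing the old generation until the new one is complete.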
Another option might be to split indexing into multiple commits. For example, use a "fromPath" .. "toPath" range, and only re-index part of the repository at a time.
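A rough sketch of the range idea, again as a toy model rather than Oak code. The functions path_ranges and reindex_in_ranges and the batch_size parameter are made up for illustration, and it assumes paths can simply be ordered lexicographically, which may not be how a real implementation would pick its ranges.

```python
def path_ranges(paths, batch_size):
    """Split the sorted paths into (from_path, to_path) ranges, both inclusive."""
    ordered = sorted(paths)
    return [
        (ordered[i], ordered[min(i + batch_size, len(ordered)) - 1])
        for i in range(0, len(ordered), batch_size)
    ]


def reindex_in_ranges(content, batch_size):
    """Index content (path -> property value) one range at a time.

    Returns the index (value -> set of paths) and the number of "commits".
    """
    index = {}
    commits = 0
    for from_path, to_path in path_ranges(content, batch_size):
        # One "commit": only paths within [from_path, to_path] are indexed.
        for path in sorted(content):
            if from_path <= path <= to_path:
                index.setdefault(content[path], set()).add(path)
        commits += 1
    return index, commits
```

Each range would be a small commit of its own, so no single commit has to hold the whole repository in one branch.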
> Async re-index? Does that disable synchronous index updates while it is re-indexing?
I don't know currently.
was (Author: tmueller):
> But I assume this issue is rather about a way to introduce a new index or update an existing one when the system is online, right? In that case, the branch-less mode is off the table.
I see. I wrote a tool that allows managing indexes (creating, changing, reindexing, removing) using a script, for both the regular and the branch-less mode now:
http://svn.apache.org/r1780222
> At least for new indexes we could try to improve the branch handling in the DocumentNodeStore.
If that turns out to be much easier, we could probably make reindexing a special case of creating a new index. For example, re-index into a new hidden child node, ":data_1", ":data_2",..., so that the existing nodes are not changed. And only change the pointer to the latest ":data_x" node at the very end, in a separate commit.
Another option might be to split indexing into multiple commits. For example, use a "fromPath" .. "toPath" range, and only re-index part of the repository at a time.
> Async re-index? Does that disable synchronous index updates while it is re-indexing?
I don't know currently.
> Enable property index reindexing via oak-run
> --------------------------------------------
>
> Key: OAK-5324
> URL: https://issues.apache.org/jira/browse/OAK-5324
> Project: Jackrabbit Oak
> Issue Type: New Feature
> Components: documentmk, run
> Reporter: Chetan Mehrotra
> Assignee: Thomas Mueller
> Fix For: 1.6, 1.8
>
>
> Currently, introducing a new property index or reindexing an existing property index is problematic on DocumentNodeStore. This happens because doing so results in either
> # Persisted branch - which is slow at times and has issues related to conflict handling
> # Large in-memory branch, which increases heap pressure
> To enable this use case, we should add some tooling in oak-run that uses a different approach to achieve the same result.