Posted to oak-issues@jackrabbit.apache.org by "Chetan Mehrotra (JIRA)" <ji...@apache.org> on 2017/07/17 10:00:03 UTC

[jira] [Updated] (OAK-6081) Indexing tooling via oak-run

     [ https://issues.apache.org/jira/browse/OAK-6081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetan Mehrotra updated OAK-6081:
---------------------------------
    Description: 
To enable better management of indexing-related operations, especially reindexing on large repository setups, we should implement some tooling as part of oak-run.

The tool would support


# For a DocumentNodeStore setup, it would be possible to connect oak-run to a live cluster, and it would take care of indexing -> storing the index on disk -> merging the index -> importing it back at the end. This would ensure that the live setup faces minimal disruption and little additional load
# For a SegmentNodeStore setup, it would be possible to index on a cloned setup and then provide a way to copy the index back

Future Enhancements

# *Resumable traversal* - It should be possible to reindex a large repository with a resumable traversal, so that even if indexing breaks due to some issue it can resume from the last state (OAK-5833)
# *Multithreaded traversal* - Current indexing is single-threaded and hence can take a long time on a large repository. The plan is to support multithreaded indexing, where each thread is assigned a part of the repository tree to index and the indexes are merged at the end
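
As an illustration of the multithreaded traversal idea, here is a minimal Java sketch, not the oak-run implementation. It assumes the repository tree has already been split into partitions of paths, builds a toy "node name -> paths" index per partition on a thread pool, and merges the partial results at the end. All class and method names (PartitionedIndexSketch, indexPartition, indexInParallel) are hypothetical and are not Oak APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: none of these names are Oak APIs.
class PartitionedIndexSketch {

    // Toy "index" for one partition: node name (last path segment) -> paths with that name.
    static Map<String, List<String>> indexPartition(List<String> paths) {
        Map<String, List<String>> partial = new TreeMap<>();
        for (String path : paths) {
            String name = path.substring(path.lastIndexOf('/') + 1);
            partial.computeIfAbsent(name, k -> new ArrayList<>()).add(path);
        }
        return partial;
    }

    // Indexes each partition on its own pool thread, then merges the partial indexes.
    static Map<String, List<String>> indexInParallel(List<List<String>> partitions, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Map<String, List<String>>>> futures = new ArrayList<>();
            for (List<String> partition : partitions) {
                futures.add(pool.submit(() -> indexPartition(partition)));
            }
            // Merge in submission order so the result does not depend on thread timing.
            Map<String, List<String>> merged = new TreeMap<>();
            for (Future<Map<String, List<String>>> future : futures) {
                future.get().forEach((name, paths) ->
                        merged.computeIfAbsent(name, k -> new ArrayList<>()).addAll(paths));
            }
            return merged;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("/content/a", "/content/a/jcr:content"),
                Arrays.asList("/apps/b", "/apps/b/jcr:content"));
        System.out.println(indexInParallel(partitions, 2));
    }
}
```

The merge step runs on the calling thread after all partitions finish, so the partial indexes never need to be shared between worker threads.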

  was:
To enable better management of indexing-related operations, especially reindexing on large repository setups, we should implement some tooling as part of oak-run.

The tool would support

# *Resumable traversal* - It should be possible to reindex a large repository with a resumable traversal, so that even if indexing breaks due to some issue it can resume from the last state (OAK-5833)
# *Multithreaded traversal* - Current indexing is single-threaded and hence can take a long time on a large repository. The plan is to support multithreaded indexing, where each thread is assigned a part of the repository tree to index and the indexes are merged at the end
# For a DocumentNodeStore setup, it would be possible to connect oak-run to a live cluster, and it would take care of indexing -> storing the index on disk -> merging the index -> importing it back at the end. This would ensure that the live setup faces minimal disruption and little additional load
# For a SegmentNodeStore setup, it would be possible to index on a cloned setup and then provide a way to copy the index back




> Indexing tooling via oak-run
> ----------------------------
>
>                 Key: OAK-6081
>                 URL: https://issues.apache.org/jira/browse/OAK-6081
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: indexing, run
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>             Fix For: 1.8, 1.7.4
>
>
> To enable better management of indexing-related operations, especially reindexing on large repository setups, we should implement some tooling as part of oak-run.
> The tool would support
> # For a DocumentNodeStore setup, it would be possible to connect oak-run to a live cluster, and it would take care of indexing -> storing the index on disk -> merging the index -> importing it back at the end. This would ensure that the live setup faces minimal disruption and little additional load
> # For a SegmentNodeStore setup, it would be possible to index on a cloned setup and then provide a way to copy the index back
> Future Enhancements
> # *Resumable traversal* - It should be possible to reindex a large repository with a resumable traversal, so that even if indexing breaks due to some issue it can resume from the last state (OAK-5833)
> # *Multithreaded traversal* - Current indexing is single-threaded and hence can take a long time on a large repository. The plan is to support multithreaded indexing, where each thread is assigned a part of the repository tree to index and the indexes are merged at the end
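
The resumable traversal described in the quoted issue could look roughly like the Java sketch below. It is an illustration under stated assumptions, not the approach taken in OAK-5833: paths are visited in sorted order, the last successfully indexed path is written to a checkpoint file, and a later run skips everything up to and including that path. The names ResumableTraversalSketch and traverse, and the use of a plain file as the checkpoint, are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;
import java.util.function.Consumer;

// Hypothetical sketch: none of these names are Oak APIs.
class ResumableTraversalSketch {

    // Visits paths in sorted order and records each indexed path in the checkpoint file.
    // If the checkpoint already exists, only paths strictly after it are visited.
    static void traverse(NavigableSet<String> paths, Path checkpoint, Consumer<String> indexer)
            throws IOException {
        NavigableSet<String> remaining = paths;
        if (Files.exists(checkpoint)) {
            String last = new String(Files.readAllBytes(checkpoint), StandardCharsets.UTF_8);
            remaining = paths.tailSet(last, false);
        }
        for (String path : remaining) {
            indexer.accept(path); // may throw; the checkpoint then still names the previous path
            Files.write(checkpoint, path.getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        Path checkpoint = Files.createTempFile("traversal", ".checkpoint");
        Files.delete(checkpoint); // start without a checkpoint
        NavigableSet<String> paths = new TreeSet<>(Arrays.asList("/a", "/b", "/c"));
        List<String> indexed = new ArrayList<>();
        try {
            // First run: simulate a failure while indexing "/b".
            traverse(paths, checkpoint, p -> {
                if (p.equals("/b")) {
                    throw new IllegalStateException("simulated failure at " + p);
                }
                indexed.add(p);
            });
        } catch (IllegalStateException e) {
            System.out.println("Interrupted: " + e.getMessage());
        }
        // Second run: resumes after the checkpointed path instead of starting over.
        traverse(paths, checkpoint, indexed::add);
        System.out.println(indexed);
        Files.deleteIfExists(checkpoint);
    }
}
```

Writing the checkpoint after every path keeps the sketch simple; a real tool would more likely checkpoint in batches to limit I/O.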



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)