Posted to oak-issues@jackrabbit.apache.org by "Thomas Mueller (JIRA)" <ji...@apache.org> on 2015/08/13 12:16:45 UTC
[jira] [Comment Edited] (OAK-3213) Improve DocumentStore API

    [ https://issues.apache.org/jira/browse/OAK-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695018#comment-14695018 ] 

Thomas Mueller edited comment on OAK-3213 at 8/13/15 10:16 AM:
---------------------------------------------------------------

After directly discussing this with Julian, we have the following additions:

* Should the class be called "Query" or something else, for example "Selector"? It is used not just to retrieve data, but also to delete it.
* Cache age: there are two aspects: how fresh the set of entries needs to be, and how fresh the data of each individual entry needs to be. It's not clear if we have a case where this matters.
* Julian is OK with using ...maxCacheAge(100)... instead of adding a parameter to find.
* Sparse documents (above 'Define if only the whole document is needed...'). In some cases only the id and the path are needed. But reading the path from the document store is relatively slow. Even though the path can't be calculated from the id alone, it can be calculated from the parent path and the id (currently). This could be done at the caller side (in a utility class) where it is needed.
* Sparse documents: if the returned object is a "special kind of document", then we might have problems with caching (do we cache it as a separate entry? what about the persistent cache?). We would prefer that a method call for the wrong property (one that was not requested) throws an exception, while still allowing existing document objects to be re-used (without copying the data), so wrapping them would be an option.
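
The points above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the proposed API: the classes Selector and SparseDocument, all of their methods, and the millisecond unit for maxCacheAge are hypothetical, and the property names are used purely as sample data. It shows a fluent selector carrying maxCacheAge(100) instead of an extra parameter to find, and a wrapper that re-uses an existing document object (no copy) while throwing on access to a property that was not requested.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: Selector and SparseDocument are NOT part of the Oak API.

/** Describes which documents an operation (find, query, remove) applies to. */
final class Selector {
    private String fromKey;
    private String toKey;
    // Assumption: milliseconds; by default any cached entry is acceptable.
    private int maxCacheAge = Integer.MAX_VALUE;
    // An empty set means "the whole document is needed".
    private final Set<String> properties = new LinkedHashSet<>();

    Selector fromKey(String key) { this.fromKey = key; return this; }
    Selector toKey(String key) { this.toKey = key; return this; }
    Selector maxCacheAge(int millis) { this.maxCacheAge = millis; return this; }
    Selector properties(String... names) {
        properties.addAll(Arrays.asList(names));
        return this;
    }

    String getFromKey() { return fromKey; }
    String getToKey() { return toKey; }
    int getMaxCacheAge() { return maxCacheAge; }
    Set<String> getProperties() { return Collections.unmodifiableSet(properties); }
}

/** Wraps an existing document without copying it; only requested properties are readable. */
final class SparseDocument {
    private final Map<String, Object> delegate;
    private final Set<String> requested;

    SparseDocument(Map<String, Object> delegate, Set<String> requested) {
        this.delegate = delegate;
        this.requested = requested;
    }

    Object get(String property) {
        // Fail loudly instead of silently returning null for data that was never requested.
        if (!requested.isEmpty() && !requested.contains(property)) {
            throw new IllegalStateException("Property was not requested: " + property);
        }
        return delegate.get(property);
    }
}

final class SelectorDemo {
    public static void main(String[] args) {
        Selector selector = new Selector()
                .fromKey("2:/a/").toKey("2:/a0")
                .maxCacheAge(100)
                .properties("_id", "_path");

        // Stands in for a document object that is already in the cache.
        Map<String, Object> cached = new HashMap<>();
        cached.put("_id", "2:/a/b");
        cached.put("_modified", 42L);

        SparseDocument sparse = new SparseDocument(cached, selector.getProperties());
        System.out.println(sparse.get("_id"));
        try {
            sparse.get("_modified");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the wrapper only holds a reference to the cached document, the cache would still contain a single entry per document in this sketch; whether that is enough to address the caching concern (including the persistent cache) is left open here.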


was (Author: tmueller):
After directly discussing this with Julian, we have the following additions:

* Should the class be called "Query" or something else, for example "Selector"? Because it is used not just to retrieve data, but also to delete.
* Cache age: there are two aspects: how fresh the set of entries needs to be, and how fresh the data of each individual entry needs to be. It's not clear if we have a case where this matters.
* Julian is OK to use ...maxCacheAge(100)... instead of adding a parameter to find.
* Sparse documents (above 'Define if only the whole document is needed...'). In some cases only the id and the path are needed. But reading the path from the document store is relatively slow. Even though the path can't be calculated from the id alone, it can be calculated from the parent path and the id (currently). This could be done in the document store (in a utility class) where it is needed.
* Sparse documents: if the returned object is a "special kind of document", then we might have problems with caching (do we cache it as a separate entry?, what about the persistent cache?). We would prefer that method calls to the wrong property throws an exception, but still allow to re-use existing document objects (not copy the data), so wrapping them would be an option.

> Improve DocumentStore API
> -------------------------
>
>                 Key: OAK-3213
>                 URL: https://issues.apache.org/jira/browse/OAK-3213
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core, mongomk
>            Reporter: Thomas Mueller
>
> The current DocumentStore API needs to be improved to support new requirements, for example OAK-3001, to avoid "instanceof XYZDocumentStore" in the DocumentNodeStore implementation, to possibly improve performance, and to make it (more) future-proof.
> * Improved query functionality to support many constraints (currently, DocumentStore.query only supports fromKey, toKey, and startValue).
> * Allow query results to not be ordered by key if not needed at the caller side.
> * Maybe support remove with constraints (for OAK-3001).
> * Define if only the whole document is needed, or just the key, or the key plus some of its properties.
> * Define how old the result can be (is it allowed to return cached documents, how fresh does the result need to be, is it allowed to return some cached and some new documents).
> * In case of version changes in the data model (additional collections, additional indexes), allow working with existing data, possibly without having to upgrade the store (maybe in read-only mode).
> Documentation might need to be improved to cover the data model as well (list of collections, list of indexes, possibility of additional indexes), and expected performance characteristics.
> There are some open questions:
> * Should we backport this change (to the 1.0 and / or 1.2 branch)?
> * Should we keep the current API (DocumentStore.query for example)?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)