You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@jackrabbit.apache.org by "mreutegg (via GitHub)" <gi...@apache.org> on 2023/02/24 14:05:17 UTC

[GitHub] [jackrabbit-oak] mreutegg commented on a diff in pull request #858: OAK-10114 : created new query API with projections to limit memory us…

mreutegg commented on code in PR #858:
URL: https://github.com/apache/jackrabbit-oak/pull/858#discussion_r1117032596


##########
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/mongo/MongoDocumentStore.java:
##########
@@ -722,8 +725,19 @@ public <T extends Document> List<T> query(Collection<T> collection,
                                               String indexedProperty,
                                               long startValue,
                                               int limit) {
-        return queryWithRetry(collection, fromKey, toKey, indexedProperty,
-                startValue, limit, maxQueryTimeMS);
+        return query(collection, fromKey, toKey, indexedProperty, startValue, limit, emptyList());

Review Comment:
   Ah, now I see why an empty projection must return the entire document. I'm still not sure this is a good idea.



##########
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/MongoDocumentStoreTest.java:
##########
@@ -241,6 +242,30 @@ public void queryWithLimit() throws Exception {
         store.dispose();
     }
 
+    @Test
+    public void queryWithProjections() {
+        DocumentStore docStore = openDocumentStore();
+        DocumentNodeStore store = new DocumentMK.Builder()
+                .setDocumentStore(docStore).setAsyncDelay(0).getNodeStore();
+        Revision rev = Revision.newRevision(0);
+        List<UpdateOp> inserts = new ArrayList<UpdateOp>();
+        for (int i = 0; i < DocumentMK.MANY_CHILDREN_THRESHOLD * 2; i++) {
+            DocumentNodeState n = new DocumentNodeState(store, Path.fromString("/node-" + i),
+                    new RevisionVector(rev));
+            inserts.add(n.asOperation(rev));
+        }
+        docStore.create(Collection.NODES, inserts);
+        List<NodeDocument> docs = docStore.query(Collection.NODES,
+                Utils.getKeyLowerLimit(Path.ROOT),  Utils.getKeyUpperLimit(Path.ROOT), null, 0,
+                DocumentMK.MANY_CHILDREN_THRESHOLD, Lists.newArrayList(ID));
+        // since both _id & _modCount are mandatory, so data size should be 2

Review Comment:
   This is not consistent with the contract in DocumentStore where it says only `_id` is mandatory.



##########
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentStore.java:
##########
@@ -486,4 +490,45 @@ default int getNodeNameLimit() {
     default Throttler throttler() {
         return NO_THROTTLING;
     }
+
+    /**
+     * Get a list of documents with only projected fields (as mentioned in projections param)
+     * along with "_id" field and where the key is greater than a start value and
+     * less than an end value <em>and</em> the given "indexed property" is greater
+     * or equals the specified value.
+     * <p>
+     * The indexed property can either be a {@link Long} value, in which case numeric
+     * comparison applies, or a {@link Boolean} value, in which case "false" is mapped
+     * to "0" and "true" is mapped to "1".
+     * <p>
+     * The returned documents are sorted by key and are immutable.
+     *
+     * @param <T> the document type
+     * @param collection the collection
+     * @param fromKey the start value (excluding)
+     * @param toKey the end value (excluding)
+     * @param indexedProperty the name of the indexed property (optional)
+     * @param startValue the minimum value of the indexed property
+     * @param limit the maximum number of entries to return
+     * @param projections {@link List} of projected keys (optional). Keep this empty to fetch all fields on document.

Review Comment:
   This contract is not intuitive to me. Wouldn't it be better in to return documents with `_id` field only when the `projections` list is empty? One can still use the existing query method if all fields are needed.



##########
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/mongo/MongoDocumentStore.java:
##########
@@ -823,6 +839,14 @@ && canUseModifiedTimeIdx(startValue)) {
                 } else {
                     result = dbCollection.find(query);
                 }
+
+                if (!projections.isEmpty()) {
+                    // it is mandatory to fetch MOD_COUNT because of cacheChangesTracker
+                    projections.add(MOD_COUNT);
+                    Bson projection = include(projections);
+                    result.projection(projection);

Review Comment:
   We need to be careful with the projection in `queryInternal()`. Later in this method documents are put into a cache: `nodeCache.putNonConflictingDocs()`. Documents with a projection are incomplete and later reads for complete documents may be served from cache.
   It might be best to simply bypass the cache when there is a projection and not put those documents in the cache.



##########
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentStore.java:
##########
@@ -486,4 +490,45 @@ default int getNodeNameLimit() {
     default Throttler throttler() {
         return NO_THROTTLING;
     }
+
+    /**
+     * Get a list of documents with only projected fields (as mentioned in projections param)
+     * along with "_id" field and where the key is greater than a start value and
+     * less than an end value <em>and</em> the given "indexed property" is greater
+     * or equals the specified value.
+     * <p>
+     * The indexed property can either be a {@link Long} value, in which case numeric
+     * comparison applies, or a {@link Boolean} value, in which case "false" is mapped
+     * to "0" and "true" is mapped to "1".
+     * <p>
+     * The returned documents are sorted by key and are immutable.
+     *
+     * @param <T> the document type
+     * @param collection the collection
+     * @param fromKey the start value (excluding)
+     * @param toKey the end value (excluding)
+     * @param indexedProperty the name of the indexed property (optional)
+     * @param startValue the minimum value of the indexed property
+     * @param limit the maximum number of entries to return
+     * @param projections {@link List} of projected keys (optional). Keep this empty to fetch all fields on document.
+     * @return the list (possibly empty)
+     * @throws DocumentStoreException if the operation failed. E.g. because of
+     *          an I/O error.
+     */
+    @NotNull
+    default <T extends Document> List<T> query(final Collection<T> collection,
+                                               final String fromKey,
+                                               final String toKey,
+                                               final String indexedProperty,
+                                               final long startValue,
+                                               final int limit,
+                                               final List<String> projections) throws DocumentStoreException {
+        final Set<String> projectedSet = newHashSet(projections);
+        projectedSet.add(ID);
+        projectedSet.add(MOD_COUNT);
+
+        final List<T> list = query(collection, fromKey, toKey, indexedProperty, startValue, limit);
+        list.stream().filter(doc -> !doc.isSealed()).forEach(doc -> doc.keySet().retainAll(projectedSet));

Review Comment:
   I would unconditionally create new documents with the projection applied. I'm surprised this works for some DocumentStore implementations. I was expecting all returned documents are sealed. Such documents can be served from cache and it would be dangerous if a caller can modify them.



##########
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/MongoDocumentStoreTest.java:
##########
@@ -241,6 +242,30 @@ public void queryWithLimit() throws Exception {
         store.dispose();
     }
 
+    @Test
+    public void queryWithProjections() {
+        DocumentStore docStore = openDocumentStore();
+        DocumentNodeStore store = new DocumentMK.Builder()
+                .setDocumentStore(docStore).setAsyncDelay(0).getNodeStore();
+        Revision rev = Revision.newRevision(0);
+        List<UpdateOp> inserts = new ArrayList<UpdateOp>();
+        for (int i = 0; i < DocumentMK.MANY_CHILDREN_THRESHOLD * 2; i++) {

Review Comment:
   Why add so many documents? It seems unnecessary to me for this test.



##########
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/MongoDocumentStoreTest.java:
##########
@@ -241,6 +242,30 @@ public void queryWithLimit() throws Exception {
         store.dispose();
     }
 
+    @Test
+    public void queryWithProjections() {
+        DocumentStore docStore = openDocumentStore();
+        DocumentNodeStore store = new DocumentMK.Builder()
+                .setDocumentStore(docStore).setAsyncDelay(0).getNodeStore();
+        Revision rev = Revision.newRevision(0);
+        List<UpdateOp> inserts = new ArrayList<UpdateOp>();
+        for (int i = 0; i < DocumentMK.MANY_CHILDREN_THRESHOLD * 2; i++) {
+            DocumentNodeState n = new DocumentNodeState(store, Path.fromString("/node-" + i),
+                    new RevisionVector(rev));
+            inserts.add(n.asOperation(rev));
+        }
+        docStore.create(Collection.NODES, inserts);
+        List<NodeDocument> docs = docStore.query(Collection.NODES,
+                Utils.getKeyLowerLimit(Path.ROOT),  Utils.getKeyUpperLimit(Path.ROOT), null, 0,
+                DocumentMK.MANY_CHILDREN_THRESHOLD, Lists.newArrayList(ID));
+        // since both _id & _modCount are mandatory, so data size should be 2
+        if (MONGO_DB) {
+            docs.forEach(d -> assertEquals(2 , d.data.size()));
+        }

Review Comment:
   It would be better if behaviour for all implementations was the same, so we don't have to special case here.



##########
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentStore.java:
##########
@@ -486,4 +490,45 @@ default int getNodeNameLimit() {
     default Throttler throttler() {
         return NO_THROTTLING;
     }
+
+    /**
+     * Get a list of documents with only projected fields (as mentioned in projections param)
+     * along with "_id" field and where the key is greater than a start value and
+     * less than an end value <em>and</em> the given "indexed property" is greater
+     * or equals the specified value.
+     * <p>
+     * The indexed property can either be a {@link Long} value, in which case numeric
+     * comparison applies, or a {@link Boolean} value, in which case "false" is mapped
+     * to "0" and "true" is mapped to "1".
+     * <p>
+     * The returned documents are sorted by key and are immutable.
+     *
+     * @param <T> the document type
+     * @param collection the collection
+     * @param fromKey the start value (excluding)
+     * @param toKey the end value (excluding)
+     * @param indexedProperty the name of the indexed property (optional)
+     * @param startValue the minimum value of the indexed property
+     * @param limit the maximum number of entries to return
+     * @param projections {@link List} of projected keys (optional). Keep this empty to fetch all fields on document.
+     * @return the list (possibly empty)
+     * @throws DocumentStoreException if the operation failed. E.g. because of
+     *          an I/O error.
+     */
+    @NotNull
+    default <T extends Document> List<T> query(final Collection<T> collection,
+                                               final String fromKey,
+                                               final String toKey,
+                                               final String indexedProperty,
+                                               final long startValue,
+                                               final int limit,
+                                               final List<String> projections) throws DocumentStoreException {

Review Comment:
   I think it should be singular: `projection`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org