Posted to oak-issues@jackrabbit.apache.org by "Davide Giannella (JIRA)" <ji...@apache.org> on 2015/05/14 12:26:00 UTC

[jira] [Commented] (OAK-2807) Improve getSize performance for "public" content

    [ https://issues.apache.org/jira/browse/OAK-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543465#comment-14543465 ] 

Davide Giannella commented on OAK-2807:
---------------------------------------

I like the idea, but as of now I don't know how to solve it easily and
cleanly. Unfortunately the JCR API doesn't allow a {{count()}} function in
queries, so we have to rely on the Iterator implementation.

First, the index should tell the query engine whether it can serve
the count itself.

For example, Lucene can serve the number of matching nodes
using the [TotalHitCountCollector|http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/TotalHitCountCollector.html].
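
To illustrate why a count-only collector is cheap, here is a minimal plain-Java sketch of the idea. It is not the Lucene API: {{HitCollector}}, {{CountOnlyCollector}} and {{TinyIndex}} are hypothetical stand-ins. The point is only that the collector is called once per hit and increments a counter without materialising any result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical callback, loosely modelled on the collector idea; NOT the Lucene API.
interface HitCollector {
    void collect(int docId);
}

// Counts hits and keeps nothing else, so no result rows are ever built.
final class CountOnlyCollector implements HitCollector {
    private int totalHits;

    @Override
    public void collect(int docId) {
        totalHits++;
    }

    int getTotalHits() {
        return totalHits;
    }
}

// Hypothetical toy index over a list of paths, used only to drive the collector.
final class TinyIndex {
    private final List<String> docs;

    TinyIndex(List<String> docs) {
        this.docs = docs;
    }

    // Calls the collector once for every document matching the query.
    void search(Predicate<String> query, HitCollector collector) {
        for (int i = 0; i < docs.size(); i++) {
            if (query.test(docs.get(i))) {
                collector.collect(i);
            }
        }
    }
}
```

With a toy index over {{/content/a}}, {{/content/b}} and {{/private/c}}, searching for paths under {{/content/}} leaves the collector holding the hit count, and no result list is created along the way.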


So we should work out something along the following lines:

{code}
diff --git a/oak-core/src/main/java/org/apache/jackrabbit/oak/spi/query/QueryIndex.java b/oak-core/src/main/java/org/apache/jackrabbit/oak/spi/query/QueryIndex.java
index f86dbc0..20f7f63 100644
--- a/oak-core/src/main/java/org/apache/jackrabbit/oak/spi/query/QueryIndex.java
+++ b/oak-core/src/main/java/org/apache/jackrabbit/oak/spi/query/QueryIndex.java
@@ -309,6 +309,12 @@ public interface QueryIndex {
         Object getAttribute(String name);
         
         /**
+         * 
+         * @return true if the current index could serve the count of returned nodes
+         */
+        boolean isCountIndex();
+        
+        /**
          * A builder for index plans.
          */
         public class Builder {
@@ -320,6 +326,7 @@ public interface QueryIndex {
             protected boolean isDelayed;
             protected boolean isFulltextIndex;
             protected boolean includesNodeData;
+            protected boolean countIndex;
             protected List<OrderEntry> sortOrder;
             protected NodeState definition;
             protected PropertyRestriction propRestriction;
@@ -386,6 +393,11 @@ public interface QueryIndex {
                return this;
             }
 
+            public Builder setCountIndex(final boolean countIndex) {
+                this.countIndex = countIndex;
+                return this;
+            }
+            
             public IndexPlan build() {
                 
                 return new IndexPlan() {
@@ -417,6 +429,8 @@ public interface QueryIndex {
                     private final Map<String, Object> attributes =
                             Builder.this.attributes;
                     
+                    private final boolean countIndex = Builder.this.countIndex;
+
                     @Override
                     public String toString() {
                         return String.format(
@@ -430,7 +444,8 @@ public interface QueryIndex {
                             + " sortOrder : %s,"
                             + " definition : %s,"
                             + " propertyRestriction : %s,"
-                            + " pathPrefix : %s }",
+                            + " pathPrefix : %s, "
+                            + " countIndex : %s }",
                             costPerExecution,
                             costPerEntry,
                             estimatedEntryCount,
@@ -441,11 +456,17 @@ public interface QueryIndex {
                             sortOrder,
                             definition,
                             propRestriction,
-                            pathPrefix
+                            pathPrefix,
+                            countIndex
                             );
                     }
 
                     @Override
+                    public boolean isCountIndex() {
+                        return countIndex;
+                    }
+                    
+                    @Override
                     public double getCostPerExecution() {
                         return costPerExecution;
                     }

{code}

Then we should work on the Iterator side in Oak, making it aware of the
executed plan so that, where the plan allows it, it can rely on the
index itself for the count.

The iterators ({{RowIterator}} and {{NodeIterator}}) are initialised in
[QueryResultImpl|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/query/QueryResultImpl.java].

We'll probably need to implement an Oak version of the iterators that
takes the executed plan/query into account. Unfortunately the information
about the plan and the index is lost way down the chain in the
QueryImpl. So far I haven't found any clean way to pass this information
along.
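
A rough sketch of what such a plan-aware iterator could look like, assuming the plan and an index-side count could somehow be handed to it. Everything here is hypothetical: {{ExecutedPlan}} stands in for {{IndexPlan}} with the proposed {{isCountIndex()}}, and {{PlanAwareIterator}} and the count supplier are made-up names, not existing Oak classes. Returning -1 follows the JCR {{RangeIterator.getSize()}} convention for an unknown size. An index-side count skips per-node access checks, so it would only be right for the "readable by everyone" case this issue targets.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.LongSupplier;

// Hypothetical stand-in for the executed plan; only the proposed flag matters here.
interface ExecutedPlan {
    boolean isCountIndex();
}

// Hypothetical wrapper: iterates as usual, but answers getSize() from the index
// when the executed plan says the index can serve counts.
final class PlanAwareIterator<T> implements Iterator<T> {
    private final Iterator<T> delegate;
    private final ExecutedPlan plan;
    private final LongSupplier indexCount; // assumed hook into the index, e.g. a hit counter

    PlanAwareIterator(Iterator<T> delegate, ExecutedPlan plan, LongSupplier indexCount) {
        this.delegate = delegate;
        this.plan = plan;
        this.indexCount = indexCount;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public T next() {
        return delegate.next();
    }

    // -1 means "size unknown", as in javax.jcr.RangeIterator#getSize().
    long getSize() {
        return plan.isCountIndex() ? indexCount.getAsLong() : -1;
    }
}
```

The count is fetched lazily through the supplier, so a plan that cannot serve counts never touches the index for it, and callers that never ask for the size pay nothing extra.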





> Improve getSize performance for "public" content
> ------------------------------------------------
>
>                 Key: OAK-2807
>                 URL: https://issues.apache.org/jira/browse/OAK-2807
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: query, security
>    Affects Versions: 1.0.13, 1.2
>            Reporter: Michael Marth
>
> Certain operations in the query engine like getting the size of a result set or facets are expensive to compute due to the fact that ACLs need to be computed on the entire result set. This issue is to discuss an idea how we could improve this:
> There is a very common special case: content (a subtree) that is readable by everyone (anonymous). If we mark an index on that subtree as "readable by everyone" on index creation then we could skip ACL check on the result set or  precompute/cache certain query results.
> In order to avoid information leakage the index would have to be marked "invalid" as soon as one node in that sub-tree is not readable by everyone anymore. (could be checked through a commit hook)
> Maybe this concept could even be generalized later to work with other principals than everyone.
> Just an idea - feel free to poke holes and shoot it down :)


