You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@jackrabbit.apache.org by "Alexander Klimetschek (JIRA)" <ji...@apache.org> on 2013/04/17 20:29:16 UTC
[jira] [Commented] (JCR-3572) Optimization in NodeImpl.getNodes(String) and getNodes(String[])

    [ https://issues.apache.org/jira/browse/JCR-3572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13634287#comment-13634287 ] 

Alexander Klimetschek commented on JCR-3572:
--------------------------------------------

Could you provide a patch?
                
> Optimization in NodeImpl.getNodes(String) and getNodes(String[])
> ----------------------------------------------------------------
>
>                 Key: JCR-3572
>                 URL: https://issues.apache.org/jira/browse/JCR-3572
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>    Affects Versions: 2.2.13, 2.4.3, 2.5.3, 2.6
>            Reporter: Sergiy Shyrkov
>            Priority: Minor
>
> We were doing some refactoring/optimization of our code (we are using Jackrabbit 2.2.x code) and replacing in some places traversal over child nodes (node.getNodes()) with node.getNodes(namePattern) as it suppose to speed up the things by only returning children, matching particular name pattern (in our case those our internationalization nodes, starting with "translation-").
> Profiling the code after we noticed that node.getNodes(namePattern) is actually slower than simple traversal over all children checking their name match.
> We can down to the org.apache.jackrabbit.util.ChildrenCollectorFilter which is used in the NodeImpl.getNodes(String) and getNodes(String[])
> Even if the instance of the filter is created with collectProperties set to false the parent (javax.jcr.util.TraversingItemVisitor.visit(Node)) still reads the properties of the nodes and iterates over them which seems useless.
> Digging deeper it seems there is an optimization which can be done to getNodes(String) and getNodes(String[]) earlier.
> Those methods could call the itemMgr.getChildNodes(), like the getNodes() does, passing additionally namePattern / nameGlobs.
> The ItemManager.ChildNodeEntry() already iterates on ChildNodeEntry instances, which have Name, so the filtering can be done here, i.e. much earlier, than in the LazyItemIterator.prefetchNext().
> The adjusted code for org.apache.jackrabbit.core.ItemManager.getChildNodes(), we've tested on our side looks as follows:
>     synchronized NodeIterator getChildNodes(NodeId parentId)
>             throws ItemNotFoundException, AccessDeniedException, RepositoryException {
>         return getChildNodesInternal(parentId, null, null);
>     }
>     synchronized NodeIterator getChildNodes(NodeId parentId, String namePattern, String[] nameGlobs)
>             throws ItemNotFoundException, AccessDeniedException, RepositoryException {
>         return getChildNodesInternal(parentId, namePattern, nameGlobs);
>     }
>     private NodeIterator getChildNodesInternal(NodeId parentId, String namePattern, String[] nameGlobs)
>             throws ItemNotFoundException, AccessDeniedException, RepositoryException {
>         sanityCheck();
>         ItemData data = getItemData(parentId);
>         if (!data.isNode()) {
>             String msg = "can't list child nodes of property " + parentId;
>             log.debug(msg);
>             throw new RepositoryException(msg);
>         }
>         ArrayList<ItemId> childIds = new ArrayList<ItemId>();
>         Iterator<ChildNodeEntry> iter = ((NodeState) data.getState()).getChildNodeEntries().iterator();
>         while (iter.hasNext()) {
>             ChildNodeEntry entry = iter.next();
>             // delay check for read-access until item is being built
>             // thus avoid duplicate check
>             
>             // check for name
>             if (namePattern == null && nameGlobs == null ||
>                     namePattern != null
>                     && ChildrenCollectorFilter.matches(sessionContext.getJCRName(entry.getName()), namePattern)
>                     || nameGlobs != null
>                     && ChildrenCollectorFilter.matches(sessionContext.getJCRName(entry.getName()), nameGlobs)) {
>                 childIds.add(entry.getId());
>             }
>         }
>         return new LazyItemIterator(sessionContext, childIds, parentId);
>     }
> and in the NodeImpl we call ItemManager.getChildNodes() as follows:
>     public NodeIterator getNodes(String namePattern) throws RepositoryException {
>         return getNodesInternal(namePattern, null);
>     }
>     public NodeIterator getNodes(final String[] nameGlobs)
>             throws RepositoryException {
>         return getNodesInternal(null, nameGlobs);
>     }
>     private NodeIterator getNodesInternal(final String namePattern, final String[] nameGlobs) throws RepositoryException {
>         return perform(new SessionOperation<NodeIterator>() {
>             public NodeIterator perform(SessionContext context)
>                     throws RepositoryException {
>                 try {
>                     return itemMgr.getChildNodes((NodeId) id, namePattern, nameGlobs);
>                 } catch (ItemNotFoundException e) {
>                     throw new RepositoryException(
>                             "Failed to list child nodes of " + NodeImpl.this, e);
>                 } catch (AccessDeniedException e) {
>                     throw new RepositoryException(
>                             "Failed to list child nodes of " + NodeImpl.this, e);
>                 }
>             }
>             public String toString() {
>                 return "node.getNodes()";
>             }
>         });
>     }
> Could you please provide your feedback if the optimization makes sense and looks consistent or if there are any issues with those changes?
> I could provide the patch of suggested changes. I just need to know against which branch of Jackrabbit code it should be done.
> Thank you in advance!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira