Posted to dev@directory.apache.org by "Martin Alderson (JIRA)" <ji...@apache.org> on 2007/08/18 17:52:30 UTC
[jira] Reopened: (DIRSERVER-951) Negated filter on indexed attribute doesn't find entries without attribute

     [ https://issues.apache.org/jira/browse/DIRSERVER-951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martin Alderson reopened DIRSERVER-951:
---------------------------------------


This is still a problem but it's rarer than I thought.

I have created a new test case based on yours (Alex) to show the problem.  It is currently disabled, since it fails.

TEST: http://svn.apache.org/viewvc/directory/apacheds/trunk/server-unit/src/test/java/org/apache/directory/server/DIRSERVER951ITest.java?view=markup&pathrev=567283
LDIF: http://svn.apache.org/viewvc/directory/apacheds/trunk/server-unit/src/test/resources/org/apache/directory/server/DIRSERVER951ITest.ldif?view=markup&pathrev=567280

Your test wasn't provoking the problem for two reasons:
1. The indexed attribute you used was ou.  This attribute type has a subtype (c-ou) and the NormalizationService expands the filter from (!(ou=drama)) to (!(|(2.5.4.11=drama)(2.5.4.11.1=drama))).  The problem will only occur when the negated filter is a leaf node (i.e. not an AND, OR, or another negation).  This means the problem will only affect indexed attributes that do not have a subtype.
2. The scope part of the filter (base DN of ou=actors...) will be preferred over the negation filter by the filter optimizer.  The problem only occurs when the negated filter is selected by the optimizer as the primary filter.

My new test gets round requirement 1 by testing CN instead of OU, and requirement 2 by doing a search on the entire system partition rather than just ou=actors,ou=system.

-

The problem is that the current code assumes that if the attribute is indexed then all entries that do not have this attribute can simply be ignored.  The javadoc for org.apache.directory.server.core.partition.impl.btree.DefaultOptimizer#getNegationScan gives away the current code's intentions:

     * Negation counts are estimated in one of two ways depending on its 
     * composition.  If the sole child of the negation is a leaf and an index
     * exists for the attribute of the leaf then the count on the index is taken
     * as the scan count.  If the child is a branch node then the count of the
     * negation node is set to the total count of entries in the master table.
     * This last resort tactic is used to get a rough estimate because it would 
     * cost too much to get an exact estimate on the count of a negation on a
     * branch node.
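
To make the faulty assumption concrete, here is a minimal, self-contained sketch.  It does not use the ApacheDS API: the class, the method names and the sample entries are all hypothetical, and the "attribute index" is simply modelled as "entries that have a value for the attribute".  It only illustrates why taking negation candidates from an attribute index loses entries that lack the attribute.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NegationScanSketch {

    /** Hypothetical master table: entry name -> value of the indexed attribute (null when absent). */
    static Map<String, String> masterTable() {
        Map<String, String> entries = new LinkedHashMap<>();
        entries.put("entryA", "drama");
        entries.put("entryB", "comedy");
        entries.put("entryC", null); // this entry has no value for the indexed attribute
        return entries;
    }

    /** Candidates come from the attribute index, so entries without the attribute are never seen. */
    static List<String> negateOverAttributeIndex(Map<String, String> entries, String value) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> e : entries.entrySet()) {
            if (e.getValue() == null) {
                continue; // an attribute index holds no record for this entry
            }
            if (!e.getValue().equals(value)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    /** Candidates come from the full set of entries, so entries without the attribute are kept. */
    static List<String> negateOverAllEntries(Map<String, String> entries, String value) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> e : entries.entrySet()) {
            if (!value.equals(e.getValue())) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> entries = masterTable();
        System.out.println(negateOverAttributeIndex(entries, "drama")); // [entryB]
        System.out.println(negateOverAllEntries(entries, "drama"));     // [entryB, entryC]
    }
}
```

In this toy model the first method misses entryC, which matches the behaviour reported for indexed attributes, while the second returns it.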


I think to fix this problem we need to change the bit of org.apache.directory.server.core.partition.impl.btree.ExpressionEnumerator#enumNeg that gets the base enumeration.  We can change:

        // Iterates over entire set of index values
        if ( node.getChild().isLeaf() )
        {
            LeafNode child = ( LeafNode ) node.getChild();
            
            if ( db.hasUserIndexOn( child.getAttribute() ) )
            {
                idx = db.getUserIndex( child.getAttribute() );
                childEnumeration = idx.listIndices();
            }
            else
            {
                childEnumeration = db.getNdnIndex().listIndices();
            }
        }
        // Iterates over the entire set of entries
        else
        {
            idx = db.getNdnIndex();
            childEnumeration = idx.listIndices();
        }

...to:

        childEnumeration = db.getNdnIndex().listIndices();

We should perhaps also rename childEnumeration to baseEnumeration, since we were never actually getting an enumeration for the child filter.

The old code seems to be optimizing based on an incorrect assumption.  I would like someone to agree with me that this is the right move before I go ahead and commit this change though.  Assign the issue to me if you want me to take care of it.

In addition, if this fix is made we should also change org.apache.directory.server.core.partition.impl.btree.DefaultOptimizer#getNegationScan to just return MAX.  A negation scan is always the worst case, since we have to retrieve all entries from the partition and then iterate over them, filtering out those that do not pass the negated test.  In the other filter optimizers where we can't use an index we just return MAX too.  Of course, I think this will probably make it impossible to meet requirement 2, since a scope filter node will then always be preferred over the negation filter unless the filter optimizer is disabled.
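
The consequence for requirement 2 can be sketched as follows.  This is not the DefaultOptimizer code: the class, the pickPrimary helper, the value chosen for MAX and the scope count of 1000 are all assumptions made for illustration.  The sketch only shows that once a negation always reports the worst-case count, any node with a smaller count wins the selection.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ScanCountSketch {

    /** Assumed worst-case scan count; the real constant's value is not shown in this message. */
    static final long MAX = Long.MAX_VALUE;

    /** Proposed behaviour: a negation is always treated as the worst case. */
    static long negationScanCount() {
        return MAX;
    }

    /** Hypothetical selection rule: the node with the lowest scan count becomes the primary filter. */
    static String pickPrimary(Map<String, Long> scanCounts) {
        String best = null;
        long bestCount = 0;
        for (Map.Entry<String, Long> e : scanCounts.entrySet()) {
            if (best == null || e.getValue() < bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("negation", negationScanCount());
        counts.put("scope", 1000L); // made-up count for the scope node
        System.out.println(pickPrimary(counts)); // scope
    }
}
```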


> Negated filter on indexed attribute doesn't find entries without attribute
> --------------------------------------------------------------------------
>
>                 Key: DIRSERVER-951
>                 URL: https://issues.apache.org/jira/browse/DIRSERVER-951
>             Project: Directory ApacheDS
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.5.1
>            Reporter: Martin Alderson
>            Assignee: Alex Karasulu
>            Priority: Blocker
>             Fix For: 1.5.1
>
>
> Searching with filter (!(myAttribute=value)) will not find entries which do not have a myAttribute attribute when that attribute is indexed.  When myAttribute is not indexed the filter works as expected, finding all entries that either do not have the specified value for myAttribute or do not have any values for myAttribute at all.
