You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@directory.apache.org by John Peter <jm...@iki.fi> on 2015/02/04 12:53:16 UTC

[ApacheDS] groupofnames search is slow

Hi,

I'm using apacheds 2.0.0 M_19

ApacheDS has 11 group entries and some of them have 26 000 member
attributes.

I'm using a qucik search via apache directory studio:
(&(member=uid=john,uid=1.2.246.1.1.1111.10.0,ou=users,dc=test,dc=fi)(objectClass=groupOfNames))
Using a stopwatch I can see a 5 second delay before the result is visible.

Some other queries perform instantaneously.
Below is a snipet from the apacheDS log. It shows the search operation took
only 7070312ns or 7ms. After that apacheDS does something and the next log
entry is 5 seconds later.

Any idea's what is happening?

[13:29:19] DEBUG [org.apache.mina.filter.codec.ProtocolCodecFilter] -
Processing a MESSAGE_RECEIVED for session 1
[13:29:19] DEBUG [org.apache.mina.core.filterchain.IoFilterEvent] - Firing
a MESSAGE_RECEIVED event for session 1
[13:29:19] DEBUG [org.apache.directory.server.OPERATION_LOG] - >>
SearchOperation : SearchContext for Dn 'dc=test,dc=fi', filter
:'(&(member=uid=john,uid=1.2.246.1.1.1111.10.0,ou=users,dc=test,dc=fi)(objectClass=groupOfNames))'
[13:29:19] DEBUG [org.apache.directory.CURSOR_LOG] - Creating
ArrayTreeCursor ArrayTreeCursor (absent)
[13:29:19] DEBUG [org.apache.directory.CURSOR_LOG] - Closing
ArrayTreeCursor ArrayTreeCursor (absent)
[13:29:19] DEBUG [org.apache.directory.CURSOR_LOG] - Creating
ArrayTreeCursor ArrayTreeCursor (absent)
[13:29:19] DEBUG [org.apache.directory.CURSOR_LOG] - Closing
ArrayTreeCursor ArrayTreeCursor (absent)
[13:29:19] DEBUG [org.apache.directory.CURSOR_LOG] - Creating SetCursor
SetCursor :
    Index : -1

[13:29:19] DEBUG [org.apache.directory.CURSOR_LOG] - Creating
EntryCursorAdaptor EntryCursorAdaptor

[13:29:19] DEBUG [org.apache.directory.CURSOR_LOG] - Creating
BaseEntryFilteringCursor BaseEntryFilteringCursor, no wrapped
No filter
No prefetched
[13:29:19] DEBUG [org.apache.directory.server.OPERATION_LOG] - <<
SearchOperation successful
[13:29:19] DEBUG [org.apache.directory.server.OPERATION_TIME] - Search
operation took 7070312 ns
[13:29:24] DEBUG [org.apache.directory.CURSOR_LOG] - Closing
BaseEntryFilteringCursor BaseEntryFilteringCursor, wrapped :
    EntryCursorAdaptor
        IndexCursor :
            SetCursor :
                Index : 5
                Size : 5
                IndexEntry[ null, 60c9656a-901c-4470-9fae-4676d425c29e ]
                IndexEntry[ null, 2b350745-0eb2-445e-8d21-aa0219903b49 ]
                IndexEntry[ null, 6f38d205-4252-4797-acb4-e71141af2912 ]
                IndexEntry[ null, 683493e6-f53e-4a19-bcd8-b385f09827d7 ]
                IndexEntry[ null, ec63d090-6d69-4b45-8d0a-499fa8c0d7ae ]
        Evaluator :


Thanks,
John

Re: [ApacheDS] groupofnames search is slow

Posted by Emmanuel Lécharny <el...@gmail.com>.
Le 04/02/15 12:53, John Peter a écrit :
> Hi,
>
> I'm using apacheds 2.0.0 M_19
>
> ApacheDS has 11 group entries and some of them have 26 000 member
> attributes.
>
> I'm using a qucik search via apache directory studio:
> (&(member=uid=john,uid=1.2.246.1.1.1111.10.0,ou=users,dc=test,dc=fi)(objectClass=groupOfNames))
> Using a stopwatch I can see a 5 second delay before the result is visible.
>
> Some other queries perform instantaneously.
> Below is a snipet from the apacheDS log. It shows the search operation took
> only 7070312ns or 7ms. After that apacheDS does something and the next log
> entry is 5 seconds later.
>
> Any idea's what is happening?

First of all, is the 'member' attribute indexed ? If not, please do so,
restart your server, and restart.

Regarding the way ApacheDS works :

o the search request will use the index which has the lowest number of
candidates. In this case, I suppose it's the ObjctClass index that
matches (it should match 5 candidates). That is obviously fast.

o then, it has to check every of those 5 candidates to see if it macthes
the full filter, where teh 'member' attribute is given. This is where it
takes time, as we use a sub-index to store such attribute when there are
more than a number of elements (and with 26 000 'member' value, you bet
this is the case). If the 'member' attribute is not indexed, we will do
a full scan of this sub-index, which will take a lot of time, thus the 5
seconds you get.

I can assume that indexing 'member' would speed-up the test. If it's not
the case, then there is a bug we need to get fixed (and this should not
be a huge thing to do)

Can you try the indexing approach, and if it does not make the server
retrieving the results fast enough, then fill a JIRA ?

Many thanks !



Re: [ApacheDS] groupofnames search is slow

Posted by Emmanuel Lécharny <el...@gmail.com>.
Le 20/02/15 10:17, John Peter a écrit :
> Hi,
>
> Thanks Emmanuel for all the quick responses and fix.
>
> I see the fix version is M20. Any idea when the M20 version will be
> released?
No. Soon, but no date.
>
> I have the member attribute and ObjectClass indexed. I even exported
> everything to ldif file. Then reinstalled apacheDS and addded the index and
> imported all back. I read somewhere that an index would not be used
> otherwise. But is the current ApacheDS doing the indexing just by
> restarting?
yes. if you add an index, and restart the server, it will detect that
the index has not been created, and will create it on the fly.

>
> In my setup the query with or without (objectClass=groupOfNames) takes
> aprox the same time which is around 4 to 5 seconds depending on the uid I'm
> searching for.
>
> (&(member=uid=john,uid=1.2.246.1.1.1111.10.0,ou=users,dc=
> test,dc=fi)(objectClass=groupOfNames))

You don't need to add the (objectClass=groupOfNames) part, as 'member'
is only used by this ObjectClass (unless you have created your own
Objectclass using 'member'). Not using it will not improve the speed a
lot though.

I can provide an intermediate M20 build for you to test it...


Re: [ApacheDS] groupofnames search is slow

Posted by John Peter <jm...@iki.fi>.
Hi,

Thanks Emmanuel for all the quick responses and fix.

I see the fix version is M20. Any idea when the M20 version will be
released?

I have the member attribute and ObjectClass indexed. I even exported
everything to ldif file. Then reinstalled apacheDS and addded the index and
imported all back. I read somewhere that an index would not be used
otherwise. But is the current ApacheDS doing the indexing just by
restarting?

In my setup the query with or without (objectClass=groupOfNames) takes
aprox the same time which is around 4 to 5 seconds depending on the uid I'm
searching for.

(&(member=uid=john,uid=1.2.246.1.1.1111.10.0,ou=users,dc=
test,dc=fi)(objectClass=groupOfNames))
(member=uid=john,uid=1.2.246.1.1.1111.10.0,ou=users,dc=test,dc=fi)

Thanks for all the help.

--
John

On Fri, Feb 6, 2015 at 1:34 AM, Emmanuel Lécharny <el...@gmail.com>
wrote:

> Le 04/02/15 12:53, John Peter a écrit :
> > Hi,
> >
> > I'm using apacheds 2.0.0 M_19
> >
> > ApacheDS has 11 group entries and some of them have 26 000 member
> > attributes.
> >
> > I'm using a qucik search via apache directory studio:
> >
> (&(member=uid=john,uid=1.2.246.1.1.1111.10.0,ou=users,dc=test,dc=fi)(objectClass=groupOfNames))
> > Using a stopwatch I can see a 5 second delay before the result is
> visible.
> >
> > Some other queries perform instantaneously.
> > Below is a snipet from the apacheDS log. It shows the search operation
> took
> > only 7070312ns or 7ms. After that apacheDS does something and the next
> log
> > entry is 5 seconds later.
> >
> > Any idea's what is happening?
>
> Ok, bug fixed : https://issues.apache.org/jira/browse/DIRSERVER-2048
>
>

Re: [ApacheDS] groupofnames search is slow

Posted by Emmanuel Lécharny <el...@gmail.com>.
Le 04/02/15 12:53, John Peter a écrit :
> Hi,
>
> I'm using apacheds 2.0.0 M_19
>
> ApacheDS has 11 group entries and some of them have 26 000 member
> attributes.
>
> I'm using a qucik search via apache directory studio:
> (&(member=uid=john,uid=1.2.246.1.1.1111.10.0,ou=users,dc=test,dc=fi)(objectClass=groupOfNames))
> Using a stopwatch I can see a 5 second delay before the result is visible.
>
> Some other queries perform instantaneously.
> Below is a snipet from the apacheDS log. It shows the search operation took
> only 7070312ns or 7ms. After that apacheDS does something and the next log
> entry is 5 seconds later.
>
> Any idea's what is happening?

Ok, bug fixed : https://issues.apache.org/jira/browse/DIRSERVER-2048


Re: [ApacheDS] groupofnames search is slow

Posted by Emmanuel Lécharny <el...@gmail.com>.
Le 05/02/15 20:03, Emmanuel Lécharny a écrit :
> Le 04/02/15 12:53, John Peter a écrit :
>> Hi,
>>
>> I'm using apacheds 2.0.0 M_19
>>
>> ApacheDS has 11 group entries and some of them have 26 000 member
>> attributes.
>>
>> I'm using a qucik search via apache directory studio:
>> (&(member=uid=john,uid=1.2.246.1.1.1111.10.0,ou=users,dc=test,dc=fi)(objectClass=groupOfNames))
>> Using a stopwatch I can see a 5 second delay before the result is visible.
>>
>> Some other queries perform instantaneously.
>> Below is a snipet from the apacheDS log. It shows the search operation took
>> only 7070312ns or 7ms. After that apacheDS does something and the next log
>> entry is 5 seconds later.
>>
>> Any idea's what is happening?
> Ok, I can reproduce the pb.
>
> If I do a search with
>
> (member=uid=john,uid=1.2.246.1.1.1111.10.0,ou=users,dc=test,dc=fi)
>
> as a filter, it takes only 400ms, and I tested it with 32 000 members in
> the entry.
>
> If I add (objectClass=groupOfNames) it takes 14 seconds.
>
> Two things here :
> - first, you can do the search omitting the (objectClass=groupOfNames),
> because the 'member' attribute is only used by this class, unless you
> have created an ObjectClass that uses the 'member' attribute
> - second, we should be able to speed up the search when the objectClass
> is used in the filter. It seems that we don't evaluate properly the
> number of candidates the (member=XXX) part is returning. This is clearly
> a bug and need to be fixed.
Some more insights :
- when the member attribute is indexed, here is the evaluation for the
filter :

(&:[4](&:[4](member=2.5.4.3=test7438,2.5.4.11=users,2.5.4.11=system:[4])(objectClass=groupofnames:[5]))(#{SUBTREE_SCOPE
(Estimated), 'ou=system', DEREF_ALWAYS}:[15]))

It means the 'member' attribute will be used as the primary index for
the real search, and we have the 4 candidates in the result set (I have
created 5 entries, but one of them does not have the searched member).

If it's not indexed, then we will use the objectClass as the main
seelctor, but the time it will take to process it will only be slightly
longer.

- The real pb is the code that check if a specific member value is
present in the member attribute in the candidates. It does :

    private boolean evaluate( Attribute attribute ) throws LdapException
    {
        /*
         * Cycle through the attribute values testing normalized version
         * obtained from using the ordering or equality matching rule's
         * normalizer.  The test uses the comparator obtained from the
         * appropriate matching rule to perform the check.
         */
        for ( Value<?> value : attribute )
        {
            OMG !!! A loop over 32 000 values :/

If I replace this test by :

    private boolean evaluate( Attribute attribute ) throws LdapException
    {
        if ( attribute.contains( node.getValue() ) )
        {
            return true;  // A shortcut
        }

        /*
         * Cycle through the attribute values testing normalized version
         * obtained from using the ordering or equality matching rule's
         * normalizer.  The test uses the comparator obtained from the
         * appropriate matching rule to perform the check.
         */
        for ( Value<?> value : attribute )
        {

then the search is done in 45ms instead of 422 ms (10 times faster).

Still have to check what could be the impact on the whole server.







Re: [ApacheDS] groupofnames search is slow

Posted by Emmanuel Lécharny <el...@gmail.com>.
Le 04/02/15 12:53, John Peter a écrit :
> Hi,
>
> I'm using apacheds 2.0.0 M_19
>
> ApacheDS has 11 group entries and some of them have 26 000 member
> attributes.
>
> I'm using a qucik search via apache directory studio:
> (&(member=uid=john,uid=1.2.246.1.1.1111.10.0,ou=users,dc=test,dc=fi)(objectClass=groupOfNames))
> Using a stopwatch I can see a 5 second delay before the result is visible.
>
> Some other queries perform instantaneously.
> Below is a snipet from the apacheDS log. It shows the search operation took
> only 7070312ns or 7ms. After that apacheDS does something and the next log
> entry is 5 seconds later.
>
> Any idea's what is happening?

Ok, I can reproduce the pb.

If I do a search with

(member=uid=john,uid=1.2.246.1.1.1111.10.0,ou=users,dc=test,dc=fi)

as a filter, it takes only 400ms, and I tested it with 32 000 members in
the entry.

If I add (objectClass=groupOfNames) it takes 14 seconds.

Two things here :
- first, you can do the search omitting the (objectClass=groupOfNames),
because the 'member' attribute is only used by this class, unless you
have created an ObjectClass that uses the 'member' attribute
- second, we should be able to speed up the search when the objectClass
is used in the filter. It seems that we don't evaluate properly the
number of candidates the (member=XXX) part is returning. This is clearly
a bug and need to be fixed.