Posted to dev@directory.apache.org by Emmanuel Lecharny <el...@apache.org> on 2008/05/29 19:49:14 UTC

Re: [ApacheDS] [Bigbang] Filters are run twice, and other potential improvements

Hi,

I did a debugging session today to analyze where the server was doing 
useless operations. Here is what I found:

1) The ExceptionInterceptor search method has this problematic portion 
of code:

            EntryFilteringCursor cursor = nextInterceptor.search( opContext );

            if ( ! cursor.next() )

The next() call will run the filters, leading to a call to the 
CollectiveAttribute accept() method, which does a lookup. This lookup is 
obviously overkill (see point 2), but the problem is that when the 
NamingEnumerationAdaptor is created in the ServerLdapContext, the 
filters are run a second time.

It generates a double call to all the filters, plus a double lookup on 
the backend, so in the end we have cloned the entry three times: once in 
the original prefetch, and twice in the CollectiveAttribute accept() 
method.

If we can eliminate this duplication of effort, the speedup will be 
enormous (I think we can even be faster than trunk).
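
To illustrate the idea, here is a minimal sketch of a cursor whose 
emptiness probe caches the accepted entry so the filters only run once. 
The types are simplified and made up for this example (PeekingCursor, 
and a Predicate standing in for the filter chain); this is not the real 
ApacheDS API:

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Simplified model, NOT the real ApacheDS API: the cursor applies its
// filters lazily, and the emptiness probe caches the accepted entry so
// the filters never run twice for the same element.
public class PeekingCursor<E> {
    private final Iterator<E> source;
    private final Predicate<E> filter; // stands in for the filter chain
    private E peeked;                  // entry accepted during the probe

    public PeekingCursor(List<E> entries, Predicate<E> filter) {
        this.source = entries.iterator();
        this.filter = filter;
    }

    /** Probe for emptiness; the accepted entry is cached, not discarded. */
    public boolean isEmpty() {
        if (peeked != null) {
            return false;
        }
        while (source.hasNext()) {
            E candidate = source.next();
            if (filter.test(candidate)) { // filters run at most once here
                peeked = candidate;
                return false;
            }
        }
        return true;
    }

    /** Returns the cached entry first, reusing the probe's work. */
    public E next() {
        if (peeked != null) {
            E entry = peeked;
            peeked = null;
            return entry; // no second pass through the filters
        }
        while (source.hasNext()) {
            E candidate = source.next();
            if (filter.test(candidate)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        final int[] filterCalls = { 0 };
        PeekingCursor<String> cursor = new PeekingCursor<>(
                List.of("entry1"),
                e -> { filterCalls[0]++; return true; });
        if (cursor.isEmpty() || !"entry1".equals(cursor.next())) {
            throw new AssertionError("probe or iteration broken");
        }
        if (filterCalls[0] != 1) {
            throw new AssertionError("filter ran " + filterCalls[0] + " times");
        }
        System.out.println("filter invocations: " + filterCalls[0]);
    }
}
```

The point is only the caching pattern: the emptiness check and the first 
iteration share one pass through the filters.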

2) The lookup in the CollectiveAttributeInterceptor's filter is useless. 
It's done in the following method:

    private void addCollectiveAttributes( LdapDN normName, ServerEntry entry,
        String[] retAttrs ) throws Exception

The idea is to get the 'collectiveAttributeSubentries' attribute type, 
which is not requested by default. This is done through a second fetch 
of the whole entry. We could probably just request this attributeType 
(and I think it's already the case, as we get all the ATs from the 
entry, operational attributes included, and strip the unrequested 
attributes from the cloned entry). So avoiding such a lookup should be 
possible.

3) In the DefaultSearchEngine.cursor() method, we call 
JdbmStore.getEntryId( base.toString() ). Here is the code of this method :

    public Long getEntryId( String dn ) throws Exception
    {
        return ndnIdx.forwardLookup( dn );
    }

and the JdbmIndex.forwardLookup code :

    public Long forwardLookup( K attrVal ) throws Exception
    {
        return forward.get( getNormalized( attrVal ) );
    }

The getNormalized() method will look in a cache for a normalized form 
of the DN (but as this is a general index, K can of course be something 
other than an LdapDN). Anyway, if it's a DN, it's already normalized, 
so this is costly and useless. We can check whether the index is the 
ndn index and, in that case, simply pass the String without going 
through the cache.
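
A hedged sketch of what this shortcut could look like. NdnAwareIndex 
and its fields are illustrative names for this example, not the real 
JdbmIndex, and the normalization is a toy stand-in:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch, NOT the real JdbmIndex: when the index is known to
// be the ndn index, its keys are already normalized DNs, so the
// getNormalized() cache round-trip can be skipped entirely.
public class NdnAwareIndex {
    static int normalizeCalls = 0; // instrumentation for the demo

    private final Map<String, Long> forward = new HashMap<>();
    private final boolean isNdnIndex;

    public NdnAwareIndex(boolean isNdnIndex) {
        this.isNdnIndex = isNdnIndex;
    }

    public void add(String key, Long id) {
        forward.put(key, id);
    }

    public Long forwardLookup(String attrVal) {
        // ndn keys are already normalized: bypass the normalization step
        String key = isNdnIndex ? attrVal : getNormalized(attrVal);
        return forward.get(key);
    }

    /** Toy stand-in for the real cache-backed normalization. */
    private String getNormalized(String attrVal) {
        normalizeCalls++;
        return attrVal.trim().toLowerCase();
    }

    public static void main(String[] args) {
        NdnAwareIndex ndn = new NdnAwareIndex(true);
        ndn.add("ou=users,dc=example,dc=com", 42L);
        Long id = ndn.forwardLookup("ou=users,dc=example,dc=com");
        if (id != 42L || normalizeCalls != 0) {
            throw new AssertionError("ndn lookup should skip normalization");
        }

        NdnAwareIndex general = new NdnAwareIndex(false);
        general.add("some value", 7L);
        Long id2 = general.forwardLookup(" Some Value ");
        if (id2 != 7L || normalizeCalls != 1) {
            throw new AssertionError("general lookup should normalize");
        }
        System.out.println("normalizations: ndn=0, general=" + normalizeCalls);
    }
}
```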

4) In the same DefaultSearchEngine.cursor() method, we do a lookup on 
the alias index :

        String aliasedBase = db.getAliasIndex().reverseLookup( baseId );

Aliases could be held in a cache (we won't have thousands of aliases) in 
order to avoid a lookup. (not sure this is a great hack ...)

5) The ChangeLog, Event and Trigger interceptors should be bypassed.

6) The EvaluatorBuilder uses Node attributes as Strings, which leads to 
many accesses to the OidRegistry, to check that those Strings are valid 
AttributeTypes. As the Nodes are built earlier in the process, I would 
suggest we store AttributeTypes instead of Strings in the Nodes, to 
avoid many lookups in the Registries.
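
A minimal sketch of the resolve-once pattern being suggested. The class 
names here are illustrative stand-ins (not the real ApacheDS 
EvaluatorBuilder, OidRegistry, or Node classes):

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch, NOT the real ApacheDS classes: the filter node
// resolves its attribute String to an AttributeType object once, at
// construction time, so evaluation never goes back to the registry.
public class TypedFilterNode {
    /** Minimal stand-in for a schema AttributeType. */
    static final class AttributeType {
        final String oid;
        AttributeType(String oid) { this.oid = oid; }
    }

    /** Minimal stand-in for the OidRegistry / AttributeTypeRegistry. */
    static final class Registry {
        private final Map<String, AttributeType> byName = new HashMap<>();
        int lookups = 0;

        void register(String name, String oid) {
            byName.put(name, new AttributeType(oid));
        }

        AttributeType lookup(String name) {
            lookups++;
            AttributeType at = byName.get(name);
            if (at == null) {
                throw new IllegalArgumentException("unknown attribute: " + name);
            }
            return at;
        }
    }

    private final AttributeType attributeType; // resolved once, up front

    TypedFilterNode(String attribute, Registry registry) {
        this.attributeType = registry.lookup(attribute); // the single lookup
    }

    /** Evaluation uses the resolved object, not the registry. */
    String evaluatedOid() { return attributeType.oid; }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register("cn", "2.5.4.3");
        TypedFilterNode node = new TypedFilterNode("cn", registry);
        for (int i = 0; i < 1000; i++) {
            node.evaluatedOid(); // no registry access in this loop
        }
        if (registry.lookups != 1) {
            throw new AssertionError("expected 1 lookup, got " + registry.lookups);
        }
        System.out.println("registry lookups after 1000 evaluations: "
                + registry.lookups);
    }
}
```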

That's it for now, but there are enough places to work on atm to get 
this branch fast!

 

-- 
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



[ApacheDS] [Bigbang] Filters are run twice, and other potential improvements : some fixes

Posted by Emmanuel Lecharny <el...@apache.org>.
Hi,

I just realized that with the ClonedServerEntry we are now using 
internally, point #2 can be solved in a matter of seconds: we no longer 
need to look up the entry, as we already have the original one with all 
the operational attributes!

So I just commented out the useless lookup, and ran a performance test 
with the new version:

4686 req/s !!!

This is close to what we have in the trunk.

Let's move to the 5 other improvements now :)
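
The fix boils down to keeping a handle on the original entry. Here is a 
minimal sketch of the pattern; only the getOriginalEntry() name comes 
from the actual code, the rest of the classes are illustrative 
stand-ins:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch with illustrative classes: the clone handed down
// the chain exposes only user attributes, but keeps a reference to the
// original entry, so the interceptor can read the operational attribute
// collectiveAttributeSubentries without a second backend fetch.
public class ClonedEntrySketch {
    static class Entry {
        final Map<String, String> attrs = new HashMap<>();
        Entry put(String k, String v) { attrs.put(k, v); return this; }
        String get(String k) { return attrs.get(k); }
    }

    static final class ClonedEntry extends Entry {
        private final Entry original;

        ClonedEntry(Entry original) {
            this.original = original;
            // the clone only exposes user attributes
            original.attrs.forEach((k, v) -> {
                if (!"collectiveAttributeSubentries".equals(k)) {
                    put(k, v);
                }
            });
        }

        Entry getOriginalEntry() { return original; }
    }

    /** The interceptor reads the subentries from the original: no lookup. */
    static String subentriesOf(ClonedEntry entry) {
        return entry.getOriginalEntry().get("collectiveAttributeSubentries");
    }

    public static void main(String[] args) {
        Entry fetched = new Entry()
                .put("cn", "user1")
                .put("collectiveAttributeSubentries", "cn=sub,dc=example,dc=com");
        ClonedEntry clone = new ClonedEntry(fetched);
        if (clone.get("collectiveAttributeSubentries") != null) {
            throw new AssertionError("clone should hide operational attrs");
        }
        if (!"cn=sub,dc=example,dc=com".equals(subentriesOf(clone))) {
            throw new AssertionError("subentries should come from the original");
        }
        System.out.println("subentries read without a second fetch");
    }
}
```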



-- 
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: [ApacheDS] [Bigbang] Filters are run twice, and other potential improvements

Posted by Emmanuel Lecharny <el...@apache.org>.
Alex Karasulu wrote:
>>> 4) In the same DefaultSearchEngine.cursor() method, we do a lookup on the
>>> alias index :
>>>
>>>       String aliasedBase = db.getAliasIndex().reverseLookup( baseId );
>>>
>>> Aliases could be held in a cache (we won't have thousands of aliases) in
>>> order to avoid a lookup. (not sure this is a great hack ...)
>>>
>>>       
> You cannot presume you will have few aliases.  You just never know how many
> people will use them.  Hopefully the answer is none, but this is not the case.
>   

True. Maybe it's not really interesting to have this cache, or simply 
not possible to cover all the cases. Anyway, this is not an important 
improvement.
>
>   
>>> 5) The ChangeLog, Event and Trigger interceptors should be bypassed.
>>>
>>>       
> Bypassed in lookup calls or search?
>   
search. I didn't check lookup yet.
>
>   
>>> 6) The EvaluatorBuilder uses Node attributes as Strings, which leads to
>>> many accesses to the OidRegistry, to check that those Strings are valid
>>> AttributeTypes. As the Nodes are built earlier in the process, I would
>>> suggest we store AttributeTypes instead of Strings in the Nodes, to avoid
>>> many lookups in the Registries.
>>>
>>>       
> Yeah that sounds like a great idea.  Let me know if you already fixed this
> in your working directory and I can back off from making the changes.
>   
I will try to see what kind of changes it will imply.

-- 
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: [ApacheDS] [Bigbang] Filters are run twice, and other potential improvements

Posted by Alex Karasulu <ak...@apache.org>.
On Thu, May 29, 2008 at 4:51 PM, Alex Karasulu <ak...@apache.org> wrote:

> On Thu, May 29, 2008 at 1:49 PM, Emmanuel Lecharny <el...@apache.org>
> wrote:
> Will address rest later.
>
>
>> 3) In the DefaultSearchEngine.cursor() method, we call
>> JdbmStore.getEntryId( base.toString() ). Here is the code of this method :
>>
>>   public Long getEntryId( String dn ) throws Exception
>>   {
>>       return ndnIdx.forwardLookup( dn );
>>   }
>>
>> and the JdbmIndex.forwardLookup code :
>>
>>   public Long forwardLookup( K attrVal ) throws Exception
>>   {
>>       return forward.get( getNormalized( attrVal ) );
>>   }
>>
>> The getNormalized() method will look into a cache for a normalized form of
>> the DN (but as this is a general index, K can be something different from a
>> LdapDN, of course). Anyway, if it's a DN, it's already normalized, so this
>> is costly and useless.
>
>
Ahh so you're saying base.toString() gets the normalized DN from the base DN
- ok then you're right.


> We can check if the index is the ndn index and in this case, simply pass
>> the String without checking in a cache.
>>
>
Yep very true.


>
>> 4) In the same DefaultSearchEngine.cursor() method, we do a lookup on the
>> alias index :
>>
>>       String aliasedBase = db.getAliasIndex().reverseLookup( baseId );
>>
>> Aliases could be held in a cache (we won't have thousands of aliases) in
>> order to avoid a lookup. (not sure this is a great hack ...)
>>
>
You cannot presume you will have few aliases.  You just never know how many
people will use them.  Hopefully the answer is none, but this is not the case.


>
>> 5) The ChangeLog, Event and Trigger interceptors should be bypassed.
>>
>
Bypassed in lookup calls or search?


>> 6) The EvaluatorBuilder uses Node attributes as Strings, which leads to
>> many accesses to the OidRegistry, to check that those Strings are valid
>> AttributeTypes. As the Nodes are built earlier in the process, I would
>> suggest we store AttributeTypes instead of Strings in the Nodes, to avoid
>> many lookups in the Registries.
>>
>
Yeah that sounds like a great idea.  Let me know if you already fixed this
in your working directory and I can back off from making the changes.


>
>> That's it for now, but enough places we can work on atm to get this branch
>> fast !
>>
>>
Excellent analysis - just a few more and we can beat the trunk.

Alex

Re: [ApacheDS] [Bigbang] Filters are run twice, and other potential improvements

Posted by Alex Karasulu <ak...@apache.org>.
On Thu, May 29, 2008 at 1:49 PM, Emmanuel Lecharny <el...@apache.org>
wrote:

> Hi,
>
> I did a debugging session today to analyze where the server was
> doing useless operations. Here is what I found:
>
> 1) The ExceptionInterceptor search method has this portion of code which is
> problematic :
>
>           EntryFilteringCursor cursor = nextInterceptor.search( opContext );
>
>           if ( ! cursor.next() )
>
> The next() call will run the filters, leading to a call to the
> CollectiveAttribute accept() method, which does a lookup. This lookup is
> obviously overkill (see point 2), but the problem is that when the
> NamingEnumerationAdaptor is created in the ServerLdapContext, the filters
> are run a second time.
>

How is this code fragment above problematic in itself outside of the lookup
due to the CollectiveAttributeInterceptor's filter?

>
> It generates a double call to all the filters, plus a double lookup on the
> backend, so in the end we have cloned the entry three times: once in the
> original prefetch, and twice in the CollectiveAttribute accept() method.
>
> If we can kill this duplication of effort, the speedup will be enormous (I
> think we can even be faster than trunk)
>
> 2) The lookup in the CollectiveAttributeInterceptor's filter is useless.
> It's done in the following method :
>   private void addCollectiveAttributes( LdapDN normName, ServerEntry entry,
> String[] retAttrs ) throws Exception


You're 100% right.  I removed it here:

      http://ohchof.notlong.com

Will this fix all the problems you were referring to?


>
> The idea is to get the 'collectiveAttributeSubentries' attribute type,
> which is not requested by default. This is done through a second fetch of
> the whole entry. We could probably just request this attributeType (and
> I think it's already the case, as we get all the ATs from the entry,
> operational attributes included, and strip the unrequested attributes from
> the cloned entry). So avoiding such a lookup should be possible.


Yes I used clonedEntry.getOriginalEntry() to access this info.

Will address rest later.


> 3) In the DefaultSearchEngine.cursor() method, we call
> JdbmStore.getEntryId( base.toString() ). Here is the code of this method :
>
>   public Long getEntryId( String dn ) throws Exception
>   {
>       return ndnIdx.forwardLookup( dn );
>   }
>
> and the JdbmIndex.forwardLookup code :
>
>   public Long forwardLookup( K attrVal ) throws Exception
>   {
>       return forward.get( getNormalized( attrVal ) );
>   }
>
> The getNormalized() method will look into a cache for a normalized form of
> the DN (but as this is a general index, K can be something different from a
> LdapDN, of course). Anyway, if it's a DN, it's already normalized, so this
> is costly and useless. We can check if the index is the ndn index and in
> this case, simply pass the String without checking in a cache.
>
> 4) In the same DefaultSearchEngine.cursor() method, we do a lookup on the
> alias index :
>
>       String aliasedBase = db.getAliasIndex().reverseLookup( baseId );
>
> Aliases could be held in a cache (we won't have thousands of aliases) in
> order to avoid a lookup. (not sure this is a great hack ...)
>
> 5) The ChangeLog, Event and Trigger interceptors should be bypassed.
>
> 6) The EvaluatorBuilder uses Node attributes as Strings, which leads to many
> accesses to the OidRegistry, to check that those Strings are valid
> AttributeTypes. As the Nodes are built earlier in the process, I would
> suggest we store AttributeTypes instead of Strings in the Nodes, to avoid
> many lookups in the Registries.
>
> That's it for now, but enough places we can work on atm to get this branch
> fast !
>
>
>
> --
> --
> cordialement, regards,
> Emmanuel Lécharny
> www.iktek.com
> directory.apache.org
>
>
>