
Posted to dev@directory.apache.org by Emmanuel Lecharny <el...@gmail.com> on 2008/07/14 11:42:18 UTC

DN parsing wiki page

Hi,

I started a page on wiki about the DN parsing process.
http://cwiki.apache.org/confluence/display/DIRxSRVx11/DN+Parsing

The idea is to describe what we should do to parse a DN in the most 
efficient way, but keeping it maintainable at the same time.

I will try to complete it this week.

-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org

Re: DN parsing wiki page

Posted by Emmanuel Lecharny <el...@gmail.com>.

>> Currently, if the OID is not found, then we consider that the DN is
>> incorrect, and we throw an error. This is done during the DN
>> normalization. Is that the correct behavior? (Assuming that we
>> correctly handle referrals and extensible ObjectClass... To be
>> double-checked! Did I forget any other case?)
>
> I think that covers it for DN parsing. Attribute normalization is also 
> needed for filter evaluation, and in that case, servers are supposed 
> to silently ignore invalid/unrecognized filter terms.
Filters are parsed in another part of the code. I _think_ we follow the 
rules about ignoring unknown terms.

I will check the cases where an element of a DN is not known by the 
server because we are using a referral or an extensible OC. Thanks for 
mentioning it, Howard!


Re: DN parsing wiki page

Posted by Howard Chu <hy...@symas.com>.

Emmanuel Lecharny wrote:
>>> e.g. to find the syntax and what normalization rules it uses. And, whether
>>> you take this approach or not, you need a strategy for dealing with
>>> unrecognized attributeTypes. I.e., right now you're using the OID, but what
>>> happens if you're unable to find the OID?

>> Another excellent point.
>>
> Currently, if the OID is not found, then we consider that the DN is
> incorrect, and we throw an error. This is done during the DN
> normalization. Is that the correct behavior? (Assuming that we
> correctly handle referrals and extensible ObjectClass... To be
> double-checked! Did I forget any other case?)

I think that covers it for DN parsing. Attribute normalization is also needed 
for filter evaluation, and in that case, servers are supposed to silently 
ignore invalid/unrecognized filter terms.
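
To illustrate the difference from DN parsing, here is a minimal sketch in Java of three-valued filter evaluation as described in RFC 4511, section 4.5.1.7: an item whose attribute type is not recognized evaluates to Undefined instead of raising an error, and only entries for which the whole filter is TRUE are selected. The class name FilterEval, the single-valued Map used as an entry, and the Set of known types are assumptions made for this example; they are not taken from ApacheDS or OpenLDAP.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

class FilterEval {
    enum Tri { TRUE, FALSE, UNDEFINED }

    /** Equality item against a single-valued entry (a simplification). */
    static Tri equality(String attr, String value,
                        Map<String, String> entry, Set<String> knownTypes) {
        if (!knownTypes.contains(attr)) {
            return Tri.UNDEFINED; // unrecognized type: no error, just Undefined
        }
        return value.equals(entry.get(attr)) ? Tri.TRUE : Tri.FALSE;
    }

    /** AND: FALSE if any term is FALSE, TRUE if all are TRUE, else Undefined. */
    static Tri and(List<Tri> terms) {
        boolean sawUndefined = false;
        for (Tri t : terms) {
            if (t == Tri.FALSE) {
                return Tri.FALSE;
            }
            if (t == Tri.UNDEFINED) {
                sawUndefined = true;
            }
        }
        return sawUndefined ? Tri.UNDEFINED : Tri.TRUE;
    }

    /** OR: TRUE if any term is TRUE, FALSE if all are FALSE, else Undefined. */
    static Tri or(List<Tri> terms) {
        boolean sawUndefined = false;
        for (Tri t : terms) {
            if (t == Tri.TRUE) {
                return Tri.TRUE;
            }
            if (t == Tri.UNDEFINED) {
                sawUndefined = true;
            }
        }
        return sawUndefined ? Tri.UNDEFINED : Tri.FALSE;
    }

    /** NOT: Undefined stays Undefined. */
    static Tri not(Tri t) {
        if (t == Tri.UNDEFINED) {
            return Tri.UNDEFINED;
        }
        return t == Tri.TRUE ? Tri.FALSE : Tri.TRUE;
    }

    /** Only TRUE selects an entry; FALSE and Undefined both skip it. */
    static boolean selects(Tri result) {
        return result == Tri.TRUE;
    }
}
```

With this shape, a filter such as (|(cn=john)(bogusAttr=x)) still matches on the cn term, while (&(cn=john)(bogusAttr=x)) quietly selects nothing.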

>> Thanks for chiming in Howard.  It's always good to have your input!
>>
> Damn yes !

Glad to be able to help.

> More to come on the DN parsing page. I have 4 hours on the train
> tomorrow, perfect time to write doco.
>
> Thanks guys!
>


-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Re: DN parsing wiki page

Posted by Emmanuel Lecharny <el...@gmail.com>.

Hi guys,

thanks for your comments on this very preliminary page. More inline.

Alex Karasulu wrote:
> On Mon, Jul 14, 2008 at 2:51 PM, Howard Chu <hy...@symas.com> wrote:
>
>   
>> Ah, looks like a quite familiar pain. Just a note, in X.500/LDAP docs what
>> you called an ATAV is known as an AVA. Might want to line up the terminology
>> to make the correspondence clearer.
>>
>>     
>
> +1
>   
Will switch to AVA, sure.
>   
>> And may I suggest another optimization to consider - for your AttributeType
>> internal structure, store a reference to the schema element, not a
>> normalized OID.
>>     
>
> I think this is a good idea and something we've debated.  Our problem comes
> from the fact that this LdapDN class and its LdapDNParser are way
> overloaded.  We keep making some things serve more than one use: these
> classes are designed to work in both clients and servers, hence the reason
> for sticking to an OID.  Not all clients are going to want to waste the
> time to perform a search over schema subentry attributes to resolve, parse
> and generate the AT representation.
>   
There is a reason why LdapDN was not using AttributeType (our 
internal 'pointer' to the schema): DNs are parsed during LDAP protocol 
decoding. When this was first done, we didn't have easy access to the 
schema objects. This was debated from the beginning (should we parse the 
DN while decoding the PDUs or not), and I'm not really sure I was right 
to push this idea. Maybe it's better to parse the DN later (but then, 
when? The problem is that we need the DN pretty early, even before 
passing through the normalization Interceptor).

Anyway, the idea here is to put everything on the table, and start 
thinking again with an open mind.

Second point, raised by Alex: having a single DN object shared by client 
and server may be overkill, and a separation would be good to have. 
Currently, the server-specific part is done through a call to a 
Normalize method, which is not the best way to do it, because we also 
have to do the prepString step before, and because the client DN does 
not need this method.
> My mindset on these matters is very RISC+UNIX'ish in nature.  Write
> something that does only one thing and does it well, simply, in a manageable
> and efficient manner.  That's why I think we need to break up this LdapDN
> into ClientDn and ServerDn.  You can add the LDAP part to these names but
> I'm not worried about the naming right now: just that we need to break these
> suckers up.
>   
+1. I will start thinking about it. LdapDN and ServerDN sound good to me.

Another idea we have had would be to write a very fast DN parser which 
assumes that a DN is composed of ASCII chars, with a single AVA per RDN, 
without any special chars like " or +. If we hit a special case, an 
exception is thrown and we fall back to the complex DN parser. This may 
save us a lot of CPU cycles...
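
A minimal sketch of that fast path in Java, under stated assumptions: the names FastDnParser, NotSimpleDnException and parse are invented for this example, the exact set of "special" characters is a guess, and the full parser is passed in as a function because its real API is not described here. The fast path deliberately gives up on anything unusual (non-ASCII, quotes, escapes, '+', a second '=', spaces that would need trimming) instead of trying to handle it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class FastDnParser {
    /** Hypothetical signal that the input is not "simple"; callers fall back. */
    static class NotSimpleDnException extends RuntimeException {
        NotSimpleDnException(String message) {
            super(message);
        }
    }

    // Characters the fast path refuses to handle: quoting, escaping,
    // multi-valued RDNs ('+'), hex values ('#') and legacy separators.
    private static final String SPECIAL = "\"+\\#;<>";

    /** Returns one {type, value} pair per RDN, or throws NotSimpleDnException. */
    static List<String[]> parseSimple(String dn) {
        List<String[]> avas = new ArrayList<>();
        if (dn.isEmpty()) {
            return avas; // the empty DN has no RDN
        }
        int start = 0; // index where the current RDN begins
        int eq = -1;   // index of '=' in the current RDN, -1 if not seen yet
        for (int i = 0; i <= dn.length(); i++) {
            // Treat end of input as a virtual ',' so the last RDN is flushed.
            char c = (i < dn.length()) ? dn.charAt(i) : ',';
            if (c > 127 || SPECIAL.indexOf(c) >= 0) {
                throw new NotSimpleDnException("special char at index " + i);
            }
            if (c == '=') {
                if (eq >= 0) {
                    throw new NotSimpleDnException("second '=' at index " + i);
                }
                eq = i;
            } else if (c == ',') {
                if (eq <= start || eq == i - 1) {
                    throw new NotSimpleDnException("missing type or value before index " + i);
                }
                String type = dn.substring(start, eq);
                String value = dn.substring(eq + 1, i);
                if (hasEdgeSpace(type) || hasEdgeSpace(value)) {
                    throw new NotSimpleDnException("whitespace needs trimming near index " + i);
                }
                avas.add(new String[] { type, value });
                start = i + 1;
                eq = -1;
            }
        }
        return avas;
    }

    /** Tries the fast path first, then the supplied full parser. */
    static List<String[]> parse(String dn, Function<String, List<String[]>> fullParser) {
        try {
            return parseSimple(dn);
        } catch (NotSimpleDnException e) {
            return fullParser.apply(dn);
        }
    }

    private static boolean hasEdgeSpace(String s) {
        return s.charAt(0) == ' ' || s.charAt(s.length() - 1) == ' ';
    }
}
```

Whether this actually saves cycles depends on how often real DNs take the fallback, since a thrown exception is not free; that would need to be measured.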
> We're paying an overhead in complexity and performance because we're
> trying to generalize one implementation to cover two different ways of
> handling LDAP distinguished names.
>   
I'm not sure that we are paying a huge penalty in terms of performance, 
but in terms of complexity and maintainability, it just sucks... Even 
for me ;)
>
>   
>> You're going to need to dereference fields of the schema definition
>> frequently when dealing with it anyway,
>>     
>
>
> Yeah big time.
>   
+1
>
>   
>> e.g. to find the syntax and what normalization rules it uses. And, whether
>> you take this approach or not, you need a strategy for dealing with
>> unrecognized attributeTypes. I.e., right now you're using the OID, but what
>> happens if you're unable to find the OID?
>>
>>     
>
> Another excellent point.
>   
Currently, if the OID is not found, then we consider that the DN is 
incorrect, and we throw an error. This is done during the DN 
normalization. Is that the correct behavior? (Assuming that we 
correctly handle referrals and extensible ObjectClass... To be 
double-checked! Did I forget any other case?)
>
>   
>> Presumably the schema is always resident in memory, and a reference will be
>> the most memory efficient...
>>     
>
>
> +1 for the server dn and its parser.
>
> Thanks for chiming in Howard.  It's always good to have your input!
>   
Damn yes !

More to come on the DN parsing page. I have 4 hours on the train 
tomorrow, perfect time to write doco.

Thanks guys!


Re: DN parsing wiki page

Posted by Alex Karasulu <ak...@apache.org>.

On Mon, Jul 14, 2008 at 2:51 PM, Howard Chu <hy...@symas.com> wrote:

> Alex Karasulu wrote:
>
>> Wow this page is pretty good.  Nice doco!
>>
>
> Ah, looks like a quite familiar pain. Just a note, in X.500/LDAP docs what
> you called an ATAV is known as an AVA. Might want to line up the terminology
> to make the correspondence clearer.
>

+1

>
> And may I suggest another optimization to consider - for your AttributeType
> internal structure, store a reference to the schema element, not a
> normalized OID.

I think this is a good idea and something we've debated.  Our problem comes
from the fact that this LdapDN class and its LdapDNParser are way
overloaded.  We keep making some things serve more than one use: these
classes are designed to work in both clients and servers, hence the reason
for sticking to an OID.  Not all clients are going to want to waste the
time to perform a search over schema subentry attributes to resolve, parse
and generate the AT representation.

My mindset on these matters is very RISC+UNIX'ish in nature.  Write
something that does only one thing and does it well, simply, in a manageable
and efficient manner.  That's why I think we need to break up this LdapDN
into ClientDn and ServerDn.  You can add the LDAP part to these names but
I'm not worried about the naming right now: just that we need to break these
suckers up.

We're paying an overhead in complexity and performance because we're
trying to generalize one implementation to cover two different ways of
handling LDAP distinguished names.
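
As a rough illustration of the split (not the actual ApacheDS design), the sketch below keeps a schema-free ClientDn that only remembers what the user typed, and a ServerDn that wraps it and adds a normalized form. The ClientDn and ServerDn names come from the proposal above; everything else is an assumption for this example, in particular the Map of per-attribute normalizers that stands in for the schema, and the use of lower-cased names instead of OIDs in the normalized string.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Schema-free DN: only what the user typed, already split into {type, value}. */
final class ClientDn {
    private final List<String[]> avas;

    ClientDn(List<String[]> avas) {
        this.avas = Collections.unmodifiableList(new ArrayList<>(avas));
    }

    List<String[]> avas() {
        return avas;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        for (String[] ava : avas) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(ava[0]).append('=').append(ava[1]);
        }
        return sb.toString();
    }
}

/** Schema-aware DN: wraps a ClientDn and adds the normalized form. */
final class ServerDn {
    private final ClientDn userProvided;
    private final String normalized;

    /** The map (lower-cased type name to normalizer) is a stand-in for the schema. */
    ServerDn(ClientDn dn, Map<String, UnaryOperator<String>> normalizers) {
        StringBuilder sb = new StringBuilder();
        for (String[] ava : dn.avas()) {
            String type = ava[0].toLowerCase(Locale.ROOT);
            UnaryOperator<String> normalizer = normalizers.get(type);
            if (normalizer == null) {
                throw new IllegalArgumentException("Unknown attribute type: " + ava[0]);
            }
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(type).append('=').append(normalizer.apply(ava[1]));
        }
        this.userProvided = dn;
        this.normalized = sb.toString();
    }

    ClientDn userProvided() {
        return userProvided;
    }

    String normalized() {
        return normalized;
    }
}
```

The point of the shape is that a client never pays for schema lookups: only code that constructs a ServerDn needs normalizers at all.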

> You're going to need to dereference fields of the schema definition
> frequently when dealing with it anyway,

Yeah big time.

> e.g. to find the syntax and what normalization rules it uses. And, whether
> you take this approach or not, you need a strategy for dealing with
> unrecognized attributeTypes. I.e., right now you're using the OID, but what
> happens if you're unable to find the OID?
>

Another excellent point.

>
> Presumably the schema is always resident in memory, and a reference will be
> the most memory efficient...

+1 for the server dn and its parser.

Thanks for chiming in Howard.  It's always good to have your input!

Alex



-- 
Microsoft gives you Windows, Linux gives you the whole house ...

Re: DN parsing wiki page

Posted by Howard Chu <hy...@symas.com>.

Alex Karasulu wrote:
> Wow this page is pretty good.  Nice doco!

Ah, looks like a quite familiar pain. Just a note, in X.500/LDAP docs what you 
called an ATAV is known as an AVA. Might want to line up the terminology to 
make the correspondence clearer.

And may I suggest another optimization to consider - for your AttributeType 
internal structure, store a reference to the schema element, not a normalized 
OID. You're going to need to dereference fields of the schema definition 
frequently when dealing with it anyway, e.g. to find the syntax and what 
normalization rules it uses. And, whether you take this approach or not, you 
need a strategy for dealing with unrecognized attributeTypes. I.e., right now 
you're using the OID, but what happens if you're unable to find the OID?

Presumably the schema is always resident in memory, and a reference will be 
the most memory efficient...
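
A minimal sketch of this suggestion in Java, with loudly hypothetical names: AttributeTypeSketch, SchemaRegistry and Ava are invented for the example and are far simpler than any real schema implementation. The AVA keeps a direct reference to the schema element, so the syntax and normalizer are one field access away, and the lookup returns an Optional so that the caller has to decide what to do with an unrecognized type. Rejecting the AVA, as done here, is only one possible strategy.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.function.UnaryOperator;

/** Hypothetical, heavily simplified schema element. */
final class AttributeTypeSketch {
    final String oid;
    final String syntaxOid;
    final UnaryOperator<String> normalizer;

    AttributeTypeSketch(String oid, String syntaxOid, UnaryOperator<String> normalizer) {
        this.oid = oid;
        this.syntaxOid = syntaxOid;
        this.normalizer = normalizer;
    }
}

/** Hypothetical in-memory schema, indexed by OID and by lower-cased name. */
final class SchemaRegistry {
    private final Map<String, AttributeTypeSketch> index = new HashMap<>();

    void register(AttributeTypeSketch type, String... names) {
        index.put(type.oid, type);
        for (String name : names) {
            index.put(name.toLowerCase(Locale.ROOT), type);
        }
    }

    /** Empty when the name or OID is not known to this schema. */
    Optional<AttributeTypeSketch> lookup(String nameOrOid) {
        return Optional.ofNullable(index.get(nameOrOid.toLowerCase(Locale.ROOT)));
    }
}

/** An AVA that points at the schema element instead of storing an OID string. */
final class Ava {
    final String upType;            // type as the user typed it
    final String upValue;           // value as the user typed it
    final AttributeTypeSketch type; // shared reference into the schema
    final String normValue;

    private Ava(String upType, String upValue, AttributeTypeSketch type, String normValue) {
        this.upType = upType;
        this.upValue = upValue;
        this.type = type;
        this.normValue = normValue;
    }

    /** Resolves the type once; an unrecognized type is rejected in this sketch. */
    static Ava of(String upType, String upValue, SchemaRegistry schema) {
        AttributeTypeSketch type = schema.lookup(upType).orElseThrow(
            () -> new IllegalArgumentException("Unrecognized attribute type: " + upType));
        return new Ava(upType, upValue, type, type.normalizer.apply(upValue));
    }
}
```

Since every AVA for the same attribute shares one schema object, the per-AVA cost is a single reference rather than a copy of the OID string.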


Re: DN parsing wiki page

Posted by Alex Karasulu <ak...@apache.org>.

Wow this page is pretty good.  Nice doco!

Alex

On Mon, Jul 14, 2008 at 5:42 AM, Emmanuel Lecharny <el...@gmail.com>
wrote:

> Hi,
>
> I started a page on wiki about the DN parsing process.
> http://cwiki.apache.org/confluence/display/DIRxSRVx11/DN+Parsing
>
> The idea is to describe what we should do to parse a DN in the most
> efficient way, but keeping it maintainable at the same time.
>
> I will try to complete it this week.
>
> --
> --
> cordialement, regards,
> Emmanuel Lécharny
> www.iktek.com
> directory.apache.org
>
>
>

