Posted to dev@directory.apache.org by Alex Karasulu <ao...@bellsouth.net> on 2005/09/18 20:40:12 UTC

[ApacheDS] Separating normalization from parsing

Emmanuel, Ersin,


Emmanuel and I had a brief conversation on IRC about the inefficiency 
of sometimes having to parse names a second time just to normalize 
them.  Emmanuel had a simple but good idea: decouple the two 
operations so that Name normalization does not require another parse. 

Incidentally, this solution also solves another problem that Ersin and I 
had discussed, namely the need to isolate normalization so that it does 
not complicate parsing.  Ersin, I just wanted to quickly let you know 
that Emmanuel has some thoughts on this.  And Emmanuel, I wanted to let 
you know that Ersin was thinking about this stuff :-).

Emmanuel's solution involved producing a NameComponent for populating a 
Name rather than a String for the components.  This way the attribute 
type and value are separated into fields within NameComponent objects by 
a populating parser.  The normalization can then occur (if necessary) 
after the parse on the type and/or attribute value fields of the 
NameComponent objects within a DN.  This approach would allow 
normalization to be decoupled from parsing.
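
To make the idea concrete, here is a minimal sketch of what such a 
component might look like.  The class shape, the parse() helper and the 
normalize() signature are assumptions for illustration, not existing 
ApacheDS code, and the parse step deliberately ignores escaping.

```java
import java.util.function.UnaryOperator;

/** Hypothetical holder for one "type=value" pair of a DN. */
final class NameComponent {
    private final String type;
    private final String value;

    NameComponent(String type, String value) {
        this.type = type;
        this.value = value;
    }

    String getType() { return type; }

    String getValue() { return value; }

    /** Parse step: split at the first '=' and store both halves untouched. */
    static NameComponent parse(String raw) {
        int eq = raw.indexOf('=');
        if (eq < 0) {
            throw new IllegalArgumentException("missing '=' in: " + raw);
        }
        return new NameComponent(raw.substring(0, eq), raw.substring(eq + 1));
    }

    /**
     * Normalization step: applied to the already separated fields,
     * so no second parse of the original string is needed.
     */
    NameComponent normalize(UnaryOperator<String> typeRule,
                            UnaryOperator<String> valueRule) {
        return new NameComponent(typeRule.apply(type), valueRule.apply(value));
    }

    @Override
    public String toString() {
        return type + "=" + value;
    }
}
```

The normalization rules are passed in from outside, which is one way to 
keep the parser unaware of them.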

There is, however, a slight problem with this approach, though I think 
we can get around it.  Using a NameComponent instead of a 
String introduces a problem when mapping to the JNDI Name interface.  
Name expects a String for name components as seen by the add() methods, 
and the get() method.  Getting around this is easy.  The internal 
representation for name components can be a NameComponent object within 
LdapName rather than a String.  The add() methods can be overloaded to 
take a NameComponent in addition to a String.  The overloads taking a 
String can generate the NameComponent object before storing it within 
the internal array of NameComponents.  Similarly, the get() method can 
call toString() to return the String representation of the NameComponent 
(which, btw, can/should be cached by NameComponent).  A new 
getNameComponent() method can be added to LdapName for access to the 
individual name components by a normalizer.
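
A rough sketch of that arrangement follows.  It only mirrors the add() 
and get() shapes of javax.naming.Name rather than implementing the full 
interface (the real add() methods also declare InvalidNameException), 
and LdapNameSketch and its nested NameComponent are hypothetical names, 
not the actual classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: stores NameComponents internally, exposes Strings outward. */
final class LdapNameSketch {

    /** Hypothetical component type that caches its String form. */
    static final class NameComponent {
        private final String type;
        private final String value;
        private String cached; // built lazily by toString()

        NameComponent(String type, String value) {
            this.type = type;
            this.value = value;
        }

        String getType() { return type; }

        String getValue() { return value; }

        @Override
        public String toString() {
            if (cached == null) {
                cached = type + "=" + value;
            }
            return cached;
        }
    }

    private final List<NameComponent> components = new ArrayList<>();

    /** Overload taking a String: builds the NameComponent before storing it. */
    LdapNameSketch add(String comp) {
        int eq = comp.indexOf('=');
        if (eq < 0) {
            throw new IllegalArgumentException("missing '=' in: " + comp);
        }
        return add(new NameComponent(comp.substring(0, eq), comp.substring(eq + 1)));
    }

    /** Overload taking an already built NameComponent. */
    LdapNameSketch add(NameComponent comp) {
        components.add(comp);
        return this;
    }

    /** JNDI-style accessor: callers still receive a String. */
    String get(int posn) {
        return components.get(posn).toString();
    }

    /** Extra accessor giving a normalizer the separated fields. */
    NameComponent getNameComponent(int posn) {
        return components.get(posn);
    }

    int size() {
        return components.size();
    }
}
```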

I like Emmanuel's approach very much and think it can lead to some 
serious optimizations within the server.

Thoughts? Comments?

Alex


Re: [ApacheDS] Separating normalization from parsing

Posted by Emmanuel Lecharny <el...@apache.org>.
> Incidentally this solution also solves another problem that Ersin and I 
> had discussed. 

I'm going to open a JIRA issue to collect the IRC conversation we had
and to gather ideas. It would be cool if Ersin or Alex could add their
conversation too, because I'm afraid I missed it ;)

> Emmanuel's solution involved producing a NameComponent for populating a 
> Name rather than a String for the components.  This way the attribute 
> type and value are separated into fields within NameComponent objects by 
> a populating parser.

Basically, a Name is (RDN)* and an RDN is (T=V)*. A T=V is a
NameComponent. Be aware that we are talking about LdapNames, which are
DNs.
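
The nesting described above can be sketched roughly as follows. This is
a deliberately naive illustration: it splits RDNs on ',' and the T=V
pairs of one RDN on '+', and it assumes no escaped or quoted separators,
which a real RFC 2253 parser has to handle. DnStructureSketch and Tv are
made-up names.

```java
import java.util.ArrayList;
import java.util.List;

final class DnStructureSketch {

    /** One T=V pair. */
    static final class Tv {
        final String type;
        final String value;

        Tv(String type, String value) {
            this.type = type;
            this.value = value;
        }
    }

    /** Returns the DN as a list of RDNs, each RDN being a list of T=V pairs. */
    static List<List<Tv>> split(String dn) {
        List<List<Tv>> rdns = new ArrayList<>();
        if (dn.isEmpty()) {
            return rdns; // the empty DN has no RDNs
        }
        for (String rdn : dn.split(",")) {
            List<Tv> pairs = new ArrayList<>();
            for (String tv : rdn.split("\\+")) {
                int eq = tv.indexOf('=');
                if (eq < 0) {
                    throw new IllegalArgumentException("missing '=' in: " + tv);
                }
                pairs.add(new Tv(tv.substring(0, eq), tv.substring(eq + 1)));
            }
            rdns.add(pairs);
        }
        return rdns;
    }
}
```

For example, "cn=John+sn=Doe,ou=users,dc=example" yields three RDNs, the
first of which holds two T=V pairs.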

>   The normalization can then occur (if necessary) 
> after the parse on the type and/or attribute value fields of the 
> NameComponent objects within a DN.  

Well, we can go further: the normalization should occur as soon as a
Type has been parsed. This normalization is very simple:
- lowercasing
- trimming leading and trailing spaces.

The StringTools class needs to be extended to be able to do this
normalization.

But we can't normalize values any further than this one operation:
- trimming leading and trailing spaces.

Those normalizations are really easy to implement.
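
Something along these lines, as a sketch only. The class and method
names are invented here and are not the existing StringTools API. Note
that String.trim() strips every character up to U+0020, not only
spaces, and that it would also strip a trailing space that was escaped
in the original DN, so a real implementation needs more care.

```java
import java.util.Locale;

final class DnNormalizationSketch {

    /** Attribute type: trim leading/trailing whitespace, then lowercase. */
    static String normalizeType(String type) {
        return type.trim().toLowerCase(Locale.ROOT);
    }

    /** Attribute value: trim only; case and inner spaces are left alone. */
    static String normalizeValue(String value) {
        return value.trim();
    }
}
```

Locale.ROOT is used so that lowercasing does not depend on the default
locale of the JVM.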

> This approach would allow 
> normalization to be decoupled from parsing.

I don't know if we need to decouple DN normalization from the parsing.
If we follow RFC 2253, this can be done during the parsing process. 


> There is however a slight problem with this approach.   However I think 
> we might be able to get around it.  Using a NameComponent instead of a 
> String introduces a problem when mapping to the JNDI Name interface.  
> Name expects a String for name components as seen by the add() methods, 
> and the get() method.  Getting around this is easy.  The internal 
> representation for name components can be a NameComponent object within 
> LdapName rather than a String.  The add() methods can be overloaded to 
> take a NameComponent in addition to a String.  The overloads taking a 
> String can generate the NameComponent object before storing it within 
> the internal array of NameComponents.  Similarly, the get() method can 
> call toString() to return the String representation of the NameComponent 
> (btw which can/should be cached by NameComponent).  A new 
> getNameComponent() method can be added to LdapName for access to the 
> individual name components by a normalizer.

I haven't investigated this aspect, but it seems that you have a clear
solution. And yes, the JNDI API must remain fully functional after any
modification we make.

> 
> I like Emmanuel's approach very much and think it can lead to some 
> serious optimizations within the server.
> 
> Thoughts? Comments?

Here you are.

Work has started...