Posted to dev@directory.apache.org by Stefan Seelmann <se...@apache.org> on 2008/06/01 00:11:34 UTC

Re: [Shared] Relaxing the schema parsers

Hi all,

I investigated a bit deeper:


Currently we have two grammars to parse schemas in shared-ldap:

a)
schema.g: schema parser for schema entities; used by syntax checkers,
the schema core and the browser in studio.

b)
openldap.g: parser for OpenLDAP schema files; used by the schema editor
in studio


Here are the changes I would like to do:

1st)
Move the functionality of openldap.g to schema.g because we could reuse
the grammar. The only difference is that an objectClass or attributeType
in an OpenLDAP schema file is prefixed by "objectclass" or
"attributetype"; the remaining grammar is the same as in schema
entities. That would avoid duplicate grammar code.

2nd)
Relax the grammar of schema.g:
- allowing tabs instead of spaces
- allowing more than one space
- allowing missing spaces before or after '(' and ')'
- allowing unordered parameters
- case insensitivity for keywords like attributetype, NAME, DESC, MAY,
MUST, ...
The aim is to be able to parse schema entities in OpenLDAP schema files,
syntax checkers and the schema subsystem, as long as the intention is clear.
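
To make the relaxations in 2nd) concrete, here is a rough Python sketch
(not the ANTLR grammar itself). The names tokenize and parse_relaxed and
the small keyword set are made up for this illustration, and it only
handles single-valued parameters. The idea: any run of whitespace
separates tokens, parentheses need no padding, and keywords match in any
case and in any order.

```python
import re

# Hypothetical keyword subset for illustration; a real grammar knows many more.
KEYWORDS = {"NAME", "DESC", "SUP", "SYNTAX"}

# A token is a parenthesis, a single-quoted string, or a run of other characters.
TOKEN = re.compile(r"\(|\)|'[^']*'|[^\s()']+")


def tokenize(text):
    """Split on any whitespace (spaces, tabs, newlines); parentheses need no padding."""
    return TOKEN.findall(text)


def parse_relaxed(text):
    """Parse '( <oid> KEYWORD value ... )' with keywords in any order and any case."""
    tokens = tokenize(text)
    if len(tokens) < 3 or tokens[0] != "(" or tokens[-1] != ")":
        raise ValueError("expected a parenthesised schema description")
    result = {"oid": tokens[1]}
    body = tokens[2:-1]
    if len(body) % 2 != 0:
        raise ValueError("expected keyword/value pairs")
    for keyword, value in zip(body[0::2], body[1::2]):
        key = keyword.upper()  # case-insensitive keyword match
        if key not in KEYWORDS:
            raise ValueError("unknown keyword: " + keyword)
        result[key] = value.strip("'")
    return result
```

With this, "(1.2.3<TAB>name 'memberURL' sup labeledURI)" and
"( 1.2.3 SUP labeledURI NAME 'memberURL' )" yield the same result.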

3rd)
Add an "isStrict" flag to schema.g which is true by default. If false,
the following additional relaxations are activated:
- allow alphanumeric values for the numeric OID of a schema entity
- allow quoted OIDs, e.g. for NAME, MUST, MAY, etc.
- to be continued...
The aim is to be able to parse invalid schema entities. This feature is
needed by the LDAP browser because it should be able to work with all
kinds of directory servers, even if they have an RFC-invalid schema.
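
A possible shape for the isStrict switch, again only as a Python sketch
under assumptions: read_oid and the two patterns are hypothetical names,
and the relaxed pattern is just one guess at what "alphanumeric" could
cover. Strict mode accepts only numeric OIDs in the RFC 4512 form;
relaxed mode also accepts quoted and alphanumeric values.

```python
import re

# Numeric OID as in RFC 4512: at least two arcs, no leading zeros.
NUMERIC_OID = re.compile(r"(0|[1-9]\d*)(\.(0|[1-9]\d*))+")

# Assumption for this sketch: relaxed mode takes letters, digits, dots and hyphens.
RELAXED_OID = re.compile(r"[A-Za-z0-9][A-Za-z0-9.\-]*")


def read_oid(token, is_strict=True):
    """Return the OID held in `token`, or raise ValueError if it is not acceptable."""
    if is_strict:
        if not NUMERIC_OID.fullmatch(token):
            raise ValueError("not a numeric OID: " + token)
        return token
    # Relaxed mode: tolerate surrounding quotes and alphanumeric identifiers.
    value = token.strip("'")
    if not RELAXED_OID.fullmatch(value):
        raise ValueError("not an OID, even in relaxed mode: " + token)
    return value
```

A caller such as the LDAP browser would pass is_strict=False, while a
syntax checker would keep the strict behaviour.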

4th)
Add additional tests ;-)

Kind Regards,
Stefan



Re: [Shared] Relaxing the schema parsers

Posted by Alex Karasulu <ak...@apache.org>.
+1 from me too - sorry for getting to this late.

On Sat, May 31, 2008 at 8:28 PM, Emmanuel Lecharny <el...@gmail.com>
wrote:

> Hi Stefan,
>
> more inline
>
> On Sun, Jun 1, 2008 at 12:11 AM, Stefan Seelmann <se...@apache.org>
> wrote:
> > Hi all,
> >
> > I investigated a bit deeper:
> >
> >
> > Currently we have two grammars to parse schemas in shared-ldap:
> >
> > a)
> > schema.g: schema parser for schema entities; used by syntax checkers,
> > the schema core and the browser in studio.
> >
> > b)
> > openldap.g: parser for OpenLDAP schema files; used by the schema editor
> > in studio
>
> AFAIR, those grammars are also different because they create different
> kinds of objects (to be double checked). Anyway, this is not a reason to
> not merge those two grammars.
> >
> >
> > Here are the changes I would like to do:
> >
> > 1st)
> > Move the functionality of openldap.g to schema.g because we could reuse
> > the grammar. The only difference is that an objectClass or attributeType
> > in an OpenLDAP schema file is prefixed by "objectclass" or
> > "attributetype"; the remaining grammar is the same as in schema
> > entities. That would avoid duplicate grammar code.
>
> +1
>
> > 2nd)
> > Relax the grammar of schema.g:
> > - allowing tabs instead of spaces
> > - allowing more than one space
> > - allowing missing spaces before or after '(' and ')'
> > - allowing unordered parameters
> > - case insensitivity for keywords like attributetype, NAME, DESC, MAY,
> > MUST, ...
> > The aim is to be able to parse schema entities in OpenLDAP schema files,
> > syntax checkers and the schema subsystem, as long as the intention is
> > clear.
>
> I would like to add some more features, like accepting a name for
> syntaxes. Nothing is more painful than having to use an OID to express
> that an AttributeType is an IA5String!
> >
> > 3rd)
> > Add an "isStrict" flag to schema.g which is true by default.
>
> IMHO, the grammar parser should not be strict by default, but relaxed.
> If you set it to strict, many users will ask 'why is my schema not
> correct ?'
>
> > If false,
> > the following additional relaxations are activated:
> > - allow alphanumeric values for the numeric OID of a schema entity
> > - allow quoted OIDs, e.g. for NAME, MUST, MAY, etc.
> > - to be continued...
> > The aim is to be able to parse invalid schema entities. This feature is
> > needed by the LDAP browser because it should be able to work with all
> > kinds of directory servers, even if they have an RFC-invalid schema.
> >
> > 4th)
> > Add additional tests ;-)
>
> Thanks Stefan. You have my +1.
>
>
>
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
>

Re: [Shared] Relaxing the schema parsers

Posted by Alex Karasulu <ak...@apache.org>.
On Sat, May 31, 2008 at 9:45 PM, Howard Chu <hy...@symas.com> wrote:

> Emmanuel Lecharny wrote:
>
>> I would like to add some more features, like accepting a name for
>> syntaxes. Nothing is more painful than having to use an OID to express
>> that an AttributeType is an IA5String!
>>
>
> I was going to suggest that as well; the OID macros we use in OpenLDAP
> really make life a lot easier.
>

This will be really nice to have and would considerably improve
performance.  I'd love to have it as a feature.

Thanks,
Alex

Re: [Shared] Relaxing the schema parsers

Posted by Howard Chu <hy...@symas.com>.
Stefan Seelmann wrote:
> Hi,
>
> I just want to ask, how we should handle OID macros in OpenLDAP schema
> files. Here is an example:
>
>    objectIdentifier NetscapeRoot 2.16.840.1.113730
>    objectIdentifier NetscapeLDAP NetscapeRoot:3
>    objectIdentifier NetscapeLDAPattributeType NetscapeLDAP:1
>    attributetype ( NetscapeLDAPattributeType:198
>          NAME 'memberURL'
>          DESC 'Identifies ...'
>          SUP labeledURI )
>
> Right now the parser just fails if it finds an objectIdentifier line. I
> see two ways:
>
> 1st)
> The parser substitutes those macros internally and returns attribute
> types and object classes with the resolved OIDs
>
> 2nd)
> Add a new data structure for those Object Identifiers, keep the symbolic
> names within attribute types and object classes and let the caller
> handle the macros.
>
> Any opinions?

Just to give you an idea of how we handle them...

In the subschema subentry that we publish from the rootDSE, we only publish 
fully resolved OIDs. That's the only conformant way to behave there.

In our cn=config tree, we preserve the OID macros as they're provided in the 
input. There's a tree of data structures to record the macro definitions, and 
an extra field in each schema structure to record where they're used.

Hm, with a slight flaw - we allow OID macros for the attribute syntax field 
too, but that's not being preserved in cn=config. I guess this discussion will 
prod me into fixing that. ;)
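
A loose illustration of that split, as a Python sketch. This is not
OpenLDAP's actual data structure; SchemaOid and its fields are
hypothetical names. Each schema element keeps both the fully resolved
OID (for the published subschema subentry) and the form it was written
in (for a cn=config style view).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SchemaOid:
    """An OID as used by one schema element (hypothetical structure)."""

    resolved: str                     # fully resolved numeric OID
    as_written: Optional[str] = None  # original macro form, if a macro was used

    def published(self):
        """Form for the subschema subentry: always the resolved OID."""
        return self.resolved

    def config_form(self):
        """Form for a configuration view: the macro as given in the input, if any."""
        return self.as_written or self.resolved
```

For the memberURL example quoted above, resolved would be
2.16.840.1.113730.3.1.198 and as_written would be
NetscapeLDAPattributeType:198.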

> Howard Chu wrote:
>> Emmanuel Lecharny wrote:
>>
>>> I would like to add some more features, like accepting a name for
>>> syntaxes. Nothing is more painful than having to use an OID to express
>>> that an AttributeType is an IA5String!
>> I was going to suggest that as well; the OID macros we use in OpenLDAP
>> really make life a lot easier.
>>
>
>


-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Re: [Shared] Relaxing the schema parsers

Posted by Stefan Seelmann <se...@apache.org>.
Hi,

I just want to ask, how we should handle OID macros in OpenLDAP schema
files. Here is an example:

  objectIdentifier NetscapeRoot 2.16.840.1.113730
  objectIdentifier NetscapeLDAP NetscapeRoot:3
  objectIdentifier NetscapeLDAPattributeType NetscapeLDAP:1
  attributetype ( NetscapeLDAPattributeType:198
        NAME 'memberURL'
        DESC 'Identifies ...'
        SUP labeledURI )

Right now the parser just fails if it finds an objectIdentifier line. I
see two ways:

1st)
The parser substitutes those macros internally and returns attribute
types and object classes with the resolved OIDs

2nd)
Add a new data structure for those Object Identifiers, keep the symbolic
names within attribute types and object classes and let the caller
handle the macros.

Any opinions?
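
To show what the first option would involve, here is a minimal Python
sketch; read_macros and resolve_oid are names invented for this example,
and it assumes every macro is defined before it is used.

```python
def resolve_oid(value, macros):
    """Expand 'Name' or 'Name:suffix' using `macros`; numeric OIDs pass through."""
    head, sep, tail = value.partition(":")
    if head in macros:
        base = macros[head]
    elif head[:1].isdigit():
        base = head  # already numeric, nothing to substitute
    else:
        raise ValueError("unknown OID macro: " + head)
    return (base + "." + tail) if sep else base


def read_macros(lines):
    """Collect 'objectIdentifier <name> <oid-or-macro>' lines into resolved OIDs."""
    macros = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 3 and parts[0].lower() == "objectidentifier":
            macros[parts[1]] = resolve_oid(parts[2], macros)
    return macros
```

Fed the three objectIdentifier lines above,
resolve_oid("NetscapeLDAPattributeType:198", macros) gives
2.16.840.1.113730.3.1.198. For the second option the parser would
instead return the macros dict and the unresolved names to the caller.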

Kind Regards,
Stefan



Howard Chu wrote:
> Emmanuel Lecharny wrote:
> 
>> I would like to add some more features, like accepting a name for
>> syntaxes. Nothing is more painful than having to use an OID to express
>> that an AttributeType is an IA5String!
> 
> I was going to suggest that as well; the OID macros we use in OpenLDAP
> really make life a lot easier.
> 


Re: [Shared] Relaxing the schema parsers

Posted by Howard Chu <hy...@symas.com>.
Emmanuel Lecharny wrote:

> I would like to add some more features, like accepting a name for
> syntaxes. Nothing is more painful than having to use an OID to express
> that an AttributeType is an IA5String!

I was going to suggest that as well; the OID macros we use in OpenLDAP really 
make life a lot easier.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Re: [Shared] Relaxing the schema parsers

Posted by Emmanuel Lecharny <el...@gmail.com>.
> Or also to accept this name in attribute types like
>
>  ( ... NAME 'mail' ... SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 )
>  ( ... NAME 'mail' ... SYNTAX IA5String )
>
> In the latter case, should the schema parser take care of mapping this
> name to an OID? I guess this is not possible because the parser can't
> access the schema registry, so it must be done inside the server.

Generally speaking, when the syntaxes are already known (which is the
case 99% of the time), we should allow this kind of naming. We have to
create a static mapping between each syntax OID and its associated name,
which can be extended if we have to (either through a configuration file
or through direct access to a server, if possible).

If the mapping is not available for a Syntax, then we produce an error
while parsing.
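
Such a mapping could look roughly like the Python sketch below. The
short names in SYNTAX_OIDS are assumptions made for illustration (only
IA5String comes from this thread; the OIDs are the ones listed in
RFC 4517), and syntax_oid is a hypothetical helper, not an existing API.

```python
# Static name -> OID table; the short names are assumed, the OIDs are from RFC 4517.
SYNTAX_OIDS = {
    "IA5String": "1.3.6.1.4.1.1466.115.121.1.26",
    "DirectoryString": "1.3.6.1.4.1.1466.115.121.1.15",
    "Boolean": "1.3.6.1.4.1.1466.115.121.1.7",
    "Integer": "1.3.6.1.4.1.1466.115.121.1.27",
}


def syntax_oid(value, extra=None):
    """Map a SYNTAX value to its OID; numeric OIDs are returned unchanged.

    `extra` lets a caller extend the static table, e.g. from a
    configuration file or from a server's published syntaxes.
    """
    if value[:1].isdigit():
        return value
    mapping = dict(SYNTAX_OIDS)
    if extra:
        mapping.update(extra)
    try:
        return mapping[value]
    except KeyError:
        # No mapping available: report an error while parsing.
        raise ValueError("no OID known for syntax name: " + value) from None
```

Both "SYNTAX IA5String" and "SYNTAX 1.3.6.1.4.1.1466.115.121.1.26"
would then end up with the same OID.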

This would make life so much easier!



-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Re: [Shared] Relaxing the schema parsers

Posted by Stefan Seelmann <se...@apache.org>.
Emmanuel Lecharny wrote:

> AFAIR, those grammars are also different because they create different
> kinds of objects (to be double checked). Anyway, this is not a reason to
> not merge those two grammars.

Yes, you are right. As a first step, let's convert between these
different objects; perhaps later we could remove one of these classes.

> I would like to add some more features, like accepting a name for
> syntaxes. Nothing is more painful than having to use an OID to express
> that an AttributeType is an IA5String!

Do you mean just to accept the name field in syntax descriptions like

  ( 1.3.6.1.4.1.1466.115.121.1.26 DESC 'IA5 String' )
  ( 1.3.6.1.4.1.1466.115.121.1.26 NAME 'IA5String' DESC 'IA5 String' )

Or also to accept this name in attribute types like

  ( ... NAME 'mail' ... SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 )
  ( ... NAME 'mail' ... SYNTAX IA5String )

In the latter case, should the schema parser take care of mapping this
name to an OID? I guess this is not possible because the parser can't
access the schema registry, so it must be done inside the server.

>> 3rd)
>> Add an "isStrict" flag to schema.g which is true by default.
> 
> IMHO, the grammar parser should not be strict by default, but relaxed.
> If you set it to strict, many users will ask 'why is my schema not
> correct ?'

Ack.

Regards,
Stefan


Re: [Shared] Relaxing the schema parsers

Posted by Emmanuel Lecharny <el...@gmail.com>.
Hi Stefan,

more inline

On Sun, Jun 1, 2008 at 12:11 AM, Stefan Seelmann <se...@apache.org> wrote:
> Hi all,
>
> I investigated a bit deeper:
>
>
> Currently we have two grammars to parse schemas in shared-ldap:
>
> a)
> schema.g: schema parser for schema entities; used by syntax checkers,
> the schema core and the browser in studio.
>
> b)
> openldap.g: parser for OpenLDAP schema files; used by the schema editor
> in studio

AFAIR, those grammars are also different because they create different
kinds of objects (to be double checked). Anyway, this is not a reason to
not merge those two grammars.
>
>
> Here are the changes I would like to do:
>
> 1st)
> Move the functionality of openldap.g to schema.g because we could reuse
> the grammar. The only difference is that an objectClass or attributeType
> in an OpenLDAP schema file is prefixed by "objectclass" or
> "attributetype"; the remaining grammar is the same as in schema
> entities. That would avoid duplicate grammar code.

+1

> 2nd)
> Relax the grammar of schema.g:
> - allowing tabs instead of spaces
> - allowing more than one space
> - allowing missing spaces before or after '(' and ')'
> - allowing unordered parameters
> - case insensitivity for keywords like attributetype, NAME, DESC, MAY,
> MUST, ...
> The aim is to be able to parse schema entities in OpenLDAP schema files,
> syntax checkers and the schema subsystem, as long as the intention is clear.

I would like to add some more features, like accepting a name for
syntaxes. Nothing is more painful than having to use an OID to express
that an AttributeType is an IA5String!
>
> 3rd)
> Add an "isStrict" flag to schema.g which is true by default.

IMHO, the grammar parser should not be strict by default, but relaxed.
If you set it to strict, many users will ask 'why is my schema not
correct ?'

> If false,
> the following additional relaxations are activated:
> - allow alphanumeric values for the numeric OID of a schema entity
> - allow quoted OIDs, e.g. for NAME, MUST, MAY, etc.
> - to be continued...
> The aim is to be able to parse invalid schema entities. This feature is
> needed by the LDAP browser because it should be able to work with all
> kinds of directory servers, even if they have an RFC-invalid schema.
>
> 4th)
> Add additional tests ;-)

Thanks Stefan. You have my +1.



-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com