Posted to dev@directory.apache.org by Stefan Seelmann <se...@apache.org> on 2008/05/27 23:36:00 UTC

[Shared] Relaxing the schema parsers

Hi dev,

We are using the shared-ldap schema parser now in the LDAP Browser
plugin of Studio.

Some LDAP servers, however, have a schema that isn't valid according to
the RFCs (for example, the Netscape successors use "nsFooBar-oid" as the
OID for some attributeTypes and objectClasses). This causes problems
when loading the schema from such servers, as the schema parser is quite
strict.

I would like to add a flag to the schema parser to make the parser
strict or more relaxed, depending on the usage. The default behaviour
should be unchanged, of course; we need a strict parser in the server and
in the schema editor.

Kind Regards,
Stefan

Re: [Shared] Relaxing the schema parsers

Posted by Alex Karasulu <ak...@apache.org>.

+1 from me too - sorry for getting to this late.

On Sat, May 31, 2008 at 8:28 PM, Emmanuel Lecharny <el...@gmail.com>
wrote:

> Hi Stefan,
>
> more inline
>
> On Sun, Jun 1, 2008 at 12:11 AM, Stefan Seelmann <se...@apache.org>
> wrote:
> > Hi all,
> >
> > I investigated a bit deeper:
> >
> >
> > Currently we have two grammars to parse schemas in shared-ldap:
> >
> > a)
> > schema.g: schema parser for schema entities; used by syntax checkers,
> > the schema core and the browser in studio.
> >
> > b)
> > openldap.g: parser for OpenLDAP schema files; used by the schema editor
> > in studio
>
> AFAIR, those grammars are also different because they create different
> kind of object (to be double checked). Anyway, this is not a reason to
> not merge those two grammars.
> >
> >
> > Here are the changes I would like to do:
> >
> > 1st)
> > Move the functionality of openldap.g to schema.g because we could reuse
> > the grammar. The only difference is that an objectClass or attributeType
> > in an OpenLDAP schema file is prefixed by "objectclass" or
> > "attributetype", the remaining grammer is the same as in schema
> > entities. That would avoid duplicate grammar code.
>
> +1
>
> > 2nd)
> > Relax the grammar of schema.g:
> > - allowing tabs instead of spaces,
> > - allowing more than one space
> > - allowing missing spaces before or after '(' and ')'
> > - allowing unordered parameters.
> > - case insentivitity for keywords like attributetype, NAME, DESC, MAY,
> > MUST, ...
> > The aim is to be able to parse schema entities in OpenLDAP schema files,
> > syntax checkers and the schema subsystem, as long as the intention is
> clear.
>
> I would like to add some more features, like accepting a name for
> syntaxes. Nothing is less painfull than to have an OID to express that
> an AttributeType is a IA5String !
> >
> > 3rd)
> > Add an "isStrict" flag to schema.g which is true by default.
>
> IMHO, the grammar parser should not be strict by default, but relaxed.
> If you set it to strict, many users will ask 'why is my schema not
> correct ?'
>
> If false
> > the following additional relaxions are activated:
> > - allow alphanumeric values for the numeric oid of an schema entity
> > - allow quoted oids, e.g. for NAME, MUST, MAY, etc.
> > - to be continued...
> > The aim is to be able to parse invalid schema entities. This feature is
> > needed by the LDAP browser because it should be able to work with all
> > kind of directory servers, even if they have an RFC-invalid schema.
> >
> > 4rd)
> > Adding additional tests ;-)
>
> Thanks Stefan. You have my +1.
>
>
>
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
>

Re: [Shared] Relaxing the schema parsers

Posted by Alex Karasulu <ak...@apache.org>.

On Sat, May 31, 2008 at 9:45 PM, Howard Chu <hy...@symas.com> wrote:

> Emmanuel Lecharny wrote:
>
>  I would like to add some more features, like accepting a name for
>> syntaxes. Nothing is less painfull than to have an OID to express that
>> an AttributeType is a IA5String !
>>
>
> I was going to suggest that as well; the OID macros we use in OpenLDAP
> really make life a lot easier.
>

This will be really nice to have and would considerably improve
performance.  I'd love to have it as a feature.

Thanks,
Alex

Re: [Shared] Relaxing the schema parsers

Posted by Howard Chu <hy...@symas.com>.

Stefan Seelmann wrote:
> Hi,
>
> I just want to ask, how we should handle OID macros in OpenLDAP schema
> files. Here is an example:
>
>    objectIdentifier NetscapeRoot 2.16.840.1.113730
>    objectIdentifier NetscapeLDAP NetscapeRoot:3
>    objectIdentifier NetscapeLDAPattributeType NetscapeLDAP:1
>    attributetype ( NetscapeLDAPattributeType:198
>          NAME 'memberURL'
>          DESC 'Identifies ...'
>          SUP labeledURI )
>
> Right now the parser just fails if it finds an objectIdentifier line. I
> see two ways:
>
> 1st)
> The parser substitutes those macros internally and returns attribute
> types and object classes with the resolved OIDs
>
> 2nd)
> Add a new data structure for those Object Identifiers, keep the symbolic
> names within attribute types and object classes and let the caller
> handle the macros.
>
> Any opinions?

Just to give you an idea of how we handle them...

In the subschema subentry that we publish from the rootDSE, we only publish 
fully resolved OIDs. That's the only conformant way to behave there.

In our cn=config tree, we preserve the OID macros as they're provided in the 
input. There's a tree of data structures to record the macro definitions, and 
an extra field in each schema structure to record where they're used.

Hm, with a slight flaw - we allow OID macros for the attribute syntax field 
too, but that's not being preserved in cn=config. I guess this discussion will 
prod me into fixing that. ;)
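
As a rough illustration of that split (a hypothetical Java sketch, not
OpenLDAP's actual C data structures; the class and method names are made
up for this example), each schema element could carry both the OID as it
was written and its resolved numeric form:

```java
/** Hypothetical value object: an OID as written in the input plus its resolved form. */
final class OidRef {
    private final String asWritten; // e.g. "NetscapeLDAPattributeType:198"
    private final String resolved;  // e.g. "2.16.840.1.113730.3.1.198"

    OidRef(String asWritten, String resolved) {
        this.asWritten = asWritten;
        this.resolved = resolved;
    }

    /** Form to publish in the subschema subentry: always the numeric OID. */
    String forSubschema() {
        return resolved;
    }

    /** Form to keep in a configuration backend: the input as it was provided. */
    String forConfig() {
        return asWritten;
    }
}
```

The point of keeping both is that the published schema stays conformant
while the configuration can still be written back the way the
administrator typed it.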

> Howard Chu wrote:
>> Emmanuel Lecharny wrote:
>>
>>> I would like to add some more features, like accepting a name for
>>> syntaxes. Nothing is less painfull than to have an OID to express that
>>> an AttributeType is a IA5String !
>> I was going to suggest that as well; the OID macros we use in OpenLDAP
>> really make life a lot easier.
>>
>
>

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Re: [Shared] Relaxing the schema parsers

Posted by Stefan Seelmann <se...@apache.org>.

Hi,

I just want to ask how we should handle OID macros in OpenLDAP schema
files. Here is an example:

  objectIdentifier NetscapeRoot 2.16.840.1.113730
  objectIdentifier NetscapeLDAP NetscapeRoot:3
  objectIdentifier NetscapeLDAPattributeType NetscapeLDAP:1
  attributetype ( NetscapeLDAPattributeType:198
        NAME 'memberURL'
        DESC 'Identifies ...'
        SUP labeledURI )

Right now the parser just fails if it finds an objectIdentifier line. I
see two ways:

1st)
The parser substitutes those macros internally and returns attribute
types and object classes with the resolved OIDs

2nd)
Add a new data structure for those Object Identifiers, keep the symbolic
names within attribute types and object classes and let the caller
handle the macros.
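
To make the first option concrete, here is a minimal Java sketch of
internal macro substitution. It is only an illustration:
OidMacroResolver and its methods are hypothetical names, not existing
shared-ldap classes, and it assumes that macro names are case-sensitive
and are defined before they are used.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that resolves OpenLDAP-style OID macros to numeric OIDs. */
final class OidMacroResolver {
    // Macro name -> fully resolved numeric OID, in definition order.
    private final Map<String, String> macros = new LinkedHashMap<>();

    /** Handles one "objectIdentifier <name> <value>" line. */
    void define(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 3 || !parts[0].equalsIgnoreCase("objectIdentifier")) {
            throw new IllegalArgumentException("Not an objectIdentifier line: " + line);
        }
        macros.put(parts[1], resolve(parts[2]));
    }

    /** Resolves "1.2.3", "Macro" or "Macro:suffix" to a numeric OID. */
    String resolve(String value) {
        if (value.matches("\\d+(\\.\\d+)*")) {
            return value; // already numeric, nothing to substitute
        }
        int colon = value.indexOf(':');
        String name = colon < 0 ? value : value.substring(0, colon);
        String base = macros.get(name);
        if (base == null) {
            throw new IllegalArgumentException("Unknown OID macro: " + name);
        }
        return colon < 0 ? base : base + "." + value.substring(colon + 1);
    }
}
```

Feeding it the three objectIdentifier lines from the example and then
resolving NetscapeLDAPattributeType:198 yields
2.16.840.1.113730.3.1.198.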

Any opinions?

Kind Regards,
Stefan

Howard Chu wrote:
> Emmanuel Lecharny wrote:
> 
>> I would like to add some more features, like accepting a name for
>> syntaxes. Nothing is less painfull than to have an OID to express that
>> an AttributeType is a IA5String !
> 
> I was going to suggest that as well; the OID macros we use in OpenLDAP
> really make life a lot easier.
>

Re: [Shared] Relaxing the schema parsers

Posted by Howard Chu <hy...@symas.com>.

Emmanuel Lecharny wrote:

> I would like to add some more features, like accepting a name for
> syntaxes. Nothing is less painfull than to have an OID to express that
> an AttributeType is a IA5String !

I was going to suggest that as well; the OID macros we use in OpenLDAP really 
make life a lot easier.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Re: [Shared] Relaxing the schema parsers

Posted by Emmanuel Lecharny <el...@gmail.com>.

> Or also to accept this name in attribute types like
>
>  ( ... NAME 'mail' ... SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 )
>  ( ... NAME 'mail' ... SYNTAX IA5String )
>
> In the latter case, should the schema parser take care of mapping this
> name to an OID? I guess this is not possible because the parser can't
> access the schema registry, so it must be done inside the server.

Generally speaking, when the syntaxes are already known (which is the
case 99% of the time), we should allow this kind of naming. We have to
create a static mapping between each syntax OID and its associated name,
which can be extended if we have to (either by a configuration file or
by direct access to a server, if possible).

If the mapping is not available for a Syntax, then we produce an error
while parsing.
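
A minimal Java sketch of such a mapping, just to illustrate the idea.
SyntaxNameMapper is a hypothetical name, the two pre-registered entries
are only examples, names are assumed to be matched case-insensitively,
and length bounds such as {256} after the syntax are not handled:

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical static mapping from syntax names to syntax OIDs. */
final class SyntaxNameMapper {
    private final Map<String, String> nameToOid = new HashMap<>();

    SyntaxNameMapper() {
        // Example entries only; a real table would cover all known syntaxes.
        register("IA5String", "1.3.6.1.4.1.1466.115.121.1.26");
        register("DirectoryString", "1.3.6.1.4.1.1466.115.121.1.15");
    }

    /** Extension point, e.g. for entries read from a configuration file. */
    void register(String name, String oid) {
        nameToOid.put(name.toLowerCase(Locale.ROOT), oid);
    }

    /** Returns the OID for a SYNTAX value that is either numeric or a known name. */
    String toOid(String syntax) {
        if (syntax.matches("\\d+(\\.\\d+)+")) {
            return syntax; // already an OID
        }
        String oid = nameToOid.get(syntax.toLowerCase(Locale.ROOT));
        if (oid == null) {
            // No mapping available: report an error while parsing.
            throw new IllegalArgumentException("Unknown syntax name: " + syntax);
        }
        return oid;
    }
}
```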

This would make life so much easier !



-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Re: [Shared] Relaxing the schema parsers

Posted by Stefan Seelmann <se...@apache.org>.

Emmanuel Lecharny wrote:

> AFAIR, those grammars are also different because they create different
> kind of object (to be double checked). Anyway, this is not a reason to
> not merge those two grammars.

Yes, you are right. As a first step, let's convert between these different
objects; perhaps later we could remove one of these classes.

> I would like to add some more features, like accepting a name for
> syntaxes. Nothing is less painfull than to have an OID to express that
> an AttributeType is a IA5String !

Do you mean just to accept the name field in syntax descriptions like

  ( 1.3.6.1.4.1.1466.115.121.1.26 DESC 'IA5 String' )
  ( 1.3.6.1.4.1.1466.115.121.1.26 NAME 'IA5String' DESC 'IA5 String' )

Or also to accept this name in attribute types like

  ( ... NAME 'mail' ... SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 )
  ( ... NAME 'mail' ... SYNTAX IA5String )

In the latter case, should the schema parser take care of mapping this
name to an OID? I guess this is not possible because the parser can't
access the schema registry, so it must be done inside the server.

>> 3rd)
>> Add an "isStrict" flag to schema.g which is true by default.
> 
> IMHO, the grammar parser should not be strict by default, but relaxed.
> If you set it to strict, many users will ask 'why is my schema not
> correct ?'

Ack.

Regards,
Stefan

Re: [Shared] Relaxing the schema parsers

Posted by Emmanuel Lecharny <el...@gmail.com>.

Hi Stefan,

more inline

On Sun, Jun 1, 2008 at 12:11 AM, Stefan Seelmann <se...@apache.org> wrote:
> Hi all,
>
> I investigated a bit deeper:
>
>
> Currently we have two grammars to parse schemas in shared-ldap:
>
> a)
> schema.g: schema parser for schema entities; used by syntax checkers,
> the schema core and the browser in studio.
>
> b)
> openldap.g: parser for OpenLDAP schema files; used by the schema editor
> in studio

AFAIR, those grammars also differ because they create different
kinds of objects (to be double-checked). Anyway, this is no reason
not to merge the two grammars.
>
>
> Here are the changes I would like to do:
>
> 1st)
> Move the functionality of openldap.g to schema.g because we could reuse
> the grammar. The only difference is that an objectClass or attributeType
> in an OpenLDAP schema file is prefixed by "objectclass" or
> "attributetype", the remaining grammer is the same as in schema
> entities. That would avoid duplicate grammar code.

+1

> 2nd)
> Relax the grammar of schema.g:
> - allowing tabs instead of spaces,
> - allowing more than one space
> - allowing missing spaces before or after '(' and ')'
> - allowing unordered parameters.
> - case insentivitity for keywords like attributetype, NAME, DESC, MAY,
> MUST, ...
> The aim is to be able to parse schema entities in OpenLDAP schema files,
> syntax checkers and the schema subsystem, as long as the intention is clear.

I would like to add some more features, like accepting a name for
syntaxes. Nothing is more painful than having to use an OID to express
that an AttributeType is an IA5String!
>
> 3rd)
> Add an "isStrict" flag to schema.g which is true by default.

IMHO, the grammar parser should not be strict by default, but relaxed.
If you set it to strict, many users will ask 'why is my schema not
correct ?'

If false
> the following additional relaxions are activated:
> - allow alphanumeric values for the numeric oid of an schema entity
> - allow quoted oids, e.g. for NAME, MUST, MAY, etc.
> - to be continued...
> The aim is to be able to parse invalid schema entities. This feature is
> needed by the LDAP browser because it should be able to work with all
> kind of directory servers, even if they have an RFC-invalid schema.
>
> 4rd)
> Adding additional tests ;-)

Thanks Stefan. You have my +1.



-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Re: [Shared] Relaxing the schema parsers

Posted by Stefan Seelmann <se...@apache.org>.

Hi all,

I investigated a bit deeper:


Currently we have two grammars to parse schemas in shared-ldap:

a)
schema.g: schema parser for schema entities; used by syntax checkers,
the schema core and the browser in studio.

b)
openldap.g: parser for OpenLDAP schema files; used by the schema editor
in studio


Here are the changes I would like to do:

1st)
Move the functionality of openldap.g to schema.g because we could reuse
the grammar. The only difference is that an objectClass or attributeType
in an OpenLDAP schema file is prefixed by "objectclass" or
"attributetype"; the remaining grammar is the same as in schema
entities. That would avoid duplicate grammar code.
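
To illustrate how small that difference is, here is a hypothetical Java
helper (not part of shared-ldap, and done outside the grammar purely for
demonstration) that reduces an OpenLDAP schema file definition to the
bare description that both formats share:

```java
/** Hypothetical helper: strips the OpenLDAP schema-file keyword from a definition. */
final class OpenLdapPrefix {
    private static final String[] KEYWORDS = {"attributetype", "objectclass"};

    /** Returns the bare "( ... )" description; bare descriptions pass through unchanged. */
    static String stripPrefix(String definition) {
        String s = definition.trim();
        for (String keyword : KEYWORDS) {
            // Case-insensitive comparison of the leading characters.
            if (s.regionMatches(true, 0, keyword, 0, keyword.length())) {
                return s.substring(keyword.length()).trim();
            }
        }
        return s;
    }
}
```

In a merged grammar the prefix would presumably just become an optional
leading token of the attribute type and object class rules.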

2nd)
Relax the grammar of schema.g:
- allowing tabs instead of spaces,
- allowing more than one space
- allowing missing spaces before or after '(' and ')'
- allowing unordered parameters.
- case insensitivity for keywords like attributetype, NAME, DESC, MAY,
MUST, ...
The aim is to be able to parse schema entities in OpenLDAP schema files,
syntax checkers and the schema subsystem, as long as the intention is clear.
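
The whitespace and case points could look roughly like the following
hand-written Java tokenizer. This is a sketch under assumptions, not the
ANTLR grammar itself: RelaxedTokenizer and isKeyword are hypothetical
names, and unordered parameters would have to be handled one level up, in
the parser rules.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical tokenizer that tolerates tabs, repeated spaces and missing spaces at parentheses. */
final class RelaxedTokenizer {

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // any run of spaces, tabs or newlines separates tokens
            } else if (c == '(' || c == ')') {
                tokens.add(String.valueOf(c)); // a token even without surrounding spaces
                i++;
            } else if (c == '\'') {
                int end = input.indexOf('\'', i + 1);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated quote at index " + i);
                }
                tokens.add(input.substring(i, end + 1)); // quoted string, quotes kept
                i = end + 1;
            } else {
                int start = i;
                while (i < input.length()
                        && !Character.isWhitespace(input.charAt(i))
                        && "()'".indexOf(input.charAt(i)) < 0) {
                    i++;
                }
                tokens.add(input.substring(start, i));
            }
        }
        return tokens;
    }

    /**
     * Case-insensitive keyword test, meant to be applied only where a keyword
     * is expected, so that a value such as the attribute type "name" in
     * "SUP name" is not mistaken for the NAME keyword.
     */
    static boolean isKeyword(String token, String keyword) {
        return token.equalsIgnoreCase(keyword);
    }
}
```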

3rd)
Add an "isStrict" flag to schema.g which is true by default. If false,
the following additional relaxations are activated:
- allow alphanumeric values for the numeric OID of a schema entity
- allow quoted OIDs, e.g. for NAME, MUST, MAY, etc.
- to be continued...
The aim is to be able to parse invalid schema entities. This feature is
needed by the LDAP browser because it should be able to work with all
kinds of directory servers, even if they have an RFC-invalid schema.
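
As an illustration of the flag, here is a small Java sketch of an OID
check that switches between the two modes. OidChecker is a hypothetical
class, not the generated parser, and what exactly counts as acceptable in
relaxed mode (a name made of letters, digits and hyphens, optionally
quoted) is an assumption for this example:

```java
/** Hypothetical OID check with a strict and a relaxed mode. */
final class OidChecker {
    // Numeric OID: at least two dot-separated numbers without leading zeros.
    private static final String NUMERIC_OID = "(0|[1-9]\\d*)(\\.(0|[1-9]\\d*))+";
    // Relaxed alternative: a name such as "nsFooBar-oid".
    private static final String NAME_LIKE = "[A-Za-z][A-Za-z0-9-]*";

    private final boolean strict;

    OidChecker(boolean strict) {
        this.strict = strict;
    }

    boolean isAcceptable(String oid) {
        if (oid.matches(NUMERIC_OID)) {
            return true; // valid in both modes
        }
        if (strict) {
            return false;
        }
        // Relaxed mode: drop one pair of surrounding single quotes, if present,
        // then accept either a numeric OID or a name-like value.
        String unquoted = oid.replaceAll("^'(.*)'$", "$1");
        return unquoted.matches(NUMERIC_OID) || unquoted.matches(NAME_LIKE);
    }
}
```

The mode is passed in explicitly here, so the sketch takes no position
on which of the two should be the default.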

4th)
Adding additional tests ;-)

Kind Regards,
Stefan

Re: [Shared] Relaxing the schema parsers

Posted by Stefan Seelmann <se...@apache.org>.

Alex Karasulu schrieb:
> 
> 
> On Wed, May 28, 2008 at 2:24 AM, Stefan Seelmann <seelmann@apache.org
> <ma...@apache.org>> wrote:
> 
>     Alex Karasulu schrieb:
>     >
>     > On Tue, May 27, 2008 at 5:42 PM, Emmanuel Lecharny
>     > <elecharny@apache.org <ma...@apache.org>
>     <mailto:elecharny@apache.org <ma...@apache.org>>> wrote:
>     >
>     >
>     >
>     >     this makes perfect sense to me. We also have to relax the parser
>     >     in many other aspects :
>     >     - allowing tabs instead of spaces,
>     >     - allowing more than one space
>     >     - allowing missing spaces before or after '(' and ')'
>     >     - allowing unordered parameters.
>     >
>     >
>     > Also some case invariance might be a good idea.  The parser seems to
>     > blow up when there's mixed case: i.e. attributetype passes but not
>     > attributeType.
>     Ah, ok. I think there is another issue. We have two grammars and two
>     parsers, one for the OpenLDAP style schema files and one for the the
>     syntax checkers. Some of these relaxions are already present in the one,
>     some in the other grammar. Perhaps we should try to could combine both
>     into one grammar?
> 
> 
> Are you talking about the grammar for the schema entity descriptions?
> 
> Alex

The 1st:
Grammar: openldap.g
Generated Java file: antlrOpenLdapSchemaParser.java
Used by: OpenLdapSchemaParser.java

The 2nd:
Grammar: schema.g
Generated Java files: AntlrSchemaParser.java
Used by: AttributeTypeDescriptionSchemaParser.java,
ObjectClassDescriptionSchemaParser.java,
LdapSyntaxDescriptionSchemaParser.java, etc.

I am not sure which parser is used when.

In my original mail I was talking about the 2nd one. I use that parser
in the LDAP browser to parse the subschema subentry attributes
"attributeTypes", "objectClasses", "ldapSyntaxes", etc.

Kind Regards,
Stefan

Re: [Shared] Relaxing the schema parsers

Posted by Alex Karasulu <ak...@apache.org>.

On Wed, May 28, 2008 at 2:24 AM, Stefan Seelmann <se...@apache.org>
wrote:

> Alex Karasulu schrieb:
> >
> > On Tue, May 27, 2008 at 5:42 PM, Emmanuel Lecharny
> > <elecharny@apache.org <ma...@apache.org>> wrote:
> >
> >
> >
> >     this makes perfect sense to me. We also have to relax the parser
> >     in many other aspects :
> >     - allowing tabs instead of spaces,
> >     - allowing more than one space
> >     - allowing missing spaces before or after '(' and ')'
> >     - allowing unordered parameters.
> >
> >
> > Also some case invariance might be a good idea.  The parser seems to
> > blow up when there's mixed case: i.e. attributetype passes but not
> > attributeType.
> Ah, ok. I think there is another issue. We have two grammars and two
> parsers, one for the OpenLDAP style schema files and one for the the
> syntax checkers. Some of these relaxions are already present in the one,
> some in the other grammar. Perhaps we should try to could combine both
> into one grammar?
>

Are you talking about the grammar for the schema entity descriptions?

Alex

Re: [Shared] Relaxing the schema parsers

Posted by Stefan Seelmann <se...@apache.org>.

Alex Karasulu schrieb:
>
> On Tue, May 27, 2008 at 5:42 PM, Emmanuel Lecharny
> <elecharny@apache.org <ma...@apache.org>> wrote:
>
>
>
>     this makes perfect sense to me. We also have to relax the parser
>     in many other aspects :
>     - allowing tabs instead of spaces,
>     - allowing more than one space
>     - allowing missing spaces before or after '(' and ')'
>     - allowing unordered parameters.
>
>
> Also some case invariance might be a good idea.  The parser seems to
> blow up when there's mixed case: i.e. attributetype passes but not
> attributeType.
Ah, ok. I think there is another issue. We have two grammars and two
parsers, one for the OpenLDAP-style schema files and one for the
syntax checkers. Some of these relaxations are already present in one
grammar, some in the other. Perhaps we should try to combine both
into one grammar?

>  
>
>
>     This will render the grammar a little (lot?) more complex, but
>     this is really important. Many users have problems with the syntax
>     checker right now.
>
>
> Oh yeah but those are things we have to perhaps allow users to toggle
> on and off.  Like being able to relax certain kinds of schema checks
> per syntax, per attributeType etc.  Some low hanging fruit for those
> interested in getting involved in ADS.

Re: [Shared] Relaxing the schema parsers

Posted by Alex Karasulu <ak...@apache.org>.

On Tue, May 27, 2008 at 5:42 PM, Emmanuel Lecharny <el...@apache.org>
wrote:

> Stefan Seelmann wrote:
>
>> Hi dev,
>>
>> We are using the shared-ldap schema parser now in the LDAP Browser
>> plugin of Studio.
>>
>> Some LDAP servers however have a schema that isn't valid according to
>> RFC's (for example the Netscape successors are using "nsFooBar-oid" as
>> OID for some  attributeTypes and objectClasses). This causes problems
>> when loading the schema from such servers as the schema parser is quite
>> strict.
>>
>> I would like to add a flag to the schema parser to make the parser
>> strict or more relaxed, depending on the usage. The default behaviour
>> should be unchanged of course, we need a strict parser in the server and
>> in the schema editor.
>>
>
Yeah sounds good to me too.


>
>>
>>
>>
> Hi Stefan,
>
> this makes perfect sense to me. We also have to relax the parser in many
> other aspects :
> - allowing tabs instead of spaces,
> - allowing more than one space
> - allowing missing spaces before or after '(' and ')'
> - allowing unordered parameters.
>

Some case insensitivity might also be a good idea.  The parser seems to blow
up when there's mixed case: e.g. attributetype passes but attributeType does not.


>
> This will render the grammar a little (lot?) more complex, but this is
> really important. Many users have problems with the syntax checker right
> now.
>

Oh yeah, but those are things we should perhaps allow users to toggle on and
off, like being able to relax certain kinds of schema checks per syntax,
per attributeType, etc.  Some low-hanging fruit for those interested in
getting involved in ADS.


>
> So it's up to you. I don't even think we need a flag for that, or may be a
> 'strict' flag to enforce strict syntax checking, the relaxed one being the
> default, IMO.
>

Re: [Shared] Relaxing the schema parsers

Posted by Emmanuel Lecharny <el...@apache.org>.

Stefan Seelmann wrote:
> Hi dev,
>
> We are using the shared-ldap schema parser now in the LDAP Browser
> plugin of Studio.
>
> Some LDAP servers however have a schema that isn't valid according to
> RFC's (for example the Netscape successors are using "nsFooBar-oid" as
> OID for some  attributeTypes and objectClasses). This causes problems
> when loading the schema from such servers as the schema parser is quite
> strict.
>
> I would like to add a flag to the schema parser to make the parser
> strict or more relaxed, depending on the usage. The default behaviour
> should be unchanged of course, we need a strict parser in the server and
> in the schema editor.
>
>
>   
Hi Stefan,

this makes perfect sense to me. We also have to relax the parser in many 
other aspects :
- allowing tabs instead of spaces,
- allowing more than one space
- allowing missing spaces before or after '(' and ')'
- allowing unordered parameters.

This will render the grammar a little (lot?) more complex, but this is 
really important. Many users have problems with the syntax checker right 
now.

So it's up to you. I don't even think we need a flag for that, or maybe 
a 'strict' flag to enforce strict syntax checking, the relaxed one being 
the default, IMO.

Thanks !

PS: I think I have added a JIRA for that, but JIRA is down (at least 
in Europe).

-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org