Posted to dev@directory.apache.org by Emmanuel Lecharny <el...@gmail.com> on 2007/09/27 11:59:41 UTC

[Filters] New parser

Hi,

Due to some serious reentrancy problems in the current server
implementation, I'm rewriting the filter parser.

Things are going well, but I have a question about the syntax. The RFC
4515 grammar does not allow insignificant spaces inside a filter, but
the current parser accepts them. For instance:
( ou = test ) is valid for our ANTLR parser, even though the grammar
does not allow it.

I can relax the grammar easily, but then we run into questions like:
should ( ou= test) match "test" or only " test"? Or should we require
this filter: ( ou = \20test ) to match " test"?

I would favor a strict parser; otherwise we will have serious problems
with values containing leading or trailing spaces.
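
To make the ambiguity concrete, here is a minimal Java sketch of what a
strict reading could look like. It is not the actual server parser: the
class name StrictEqualityFilterSketch and its parse method are made up
for illustration, and it only handles a simple (attr=value) equality
filter with a short attribute name (no OIDs, no options).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: strict parsing of a simple equality filter "(attr=value)". */
final class StrictEqualityFilterSketch {
    private StrictEqualityFilterSketch() {}

    /** Returns {attribute, decoded value}; throws IllegalArgumentException on any deviation. */
    static String[] parse(String filter) {
        int len = filter.length();
        if (len < 2 || filter.charAt(0) != '(' || filter.charAt(len - 1) != ')') {
            throw new IllegalArgumentException("filter must be enclosed in parentheses");
        }
        int eq = filter.indexOf('=');
        if (eq < 2) { // also covers "no '=' at all" (-1) and an empty attribute
            throw new IllegalArgumentException("missing attribute or '='");
        }
        String attr = filter.substring(1, eq);
        for (int i = 0; i < attr.length(); i++) {
            char c = attr.charAt(i);
            boolean letter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
            boolean digitOrHyphen = (c >= '0' && c <= '9') || c == '-';
            // Simplification: a letter first, then letters, digits or hyphens. No spaces.
            if (!letter && !(i > 0 && digitOrHyphen)) {
                throw new IllegalArgumentException("illegal character in attribute: '" + c + "'");
            }
        }
        return new String[] { attr, decodeValue(filter.substring(eq + 1, len - 1)) };
    }

    /** Keeps every character verbatim (spaces included) and decodes \XX hex escapes. */
    private static String decodeValue(String raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            if (c == '\\') {
                if (i + 2 >= raw.length()) {
                    throw new IllegalArgumentException("truncated escape");
                }
                int hi = hex(raw.charAt(i + 1));
                int lo = hex(raw.charAt(i + 2));
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("bad hex escape");
                }
                out.write(hi * 16 + lo);
                i += 3;
            } else if (c == '(' || c == ')' || c == '*' || c == '\0') {
                // '*' would turn this into a presence or substring filter, out of scope here.
                throw new IllegalArgumentException("character must be escaped: '" + c + "'");
            } else {
                int cp = raw.codePointAt(i);
                byte[] bytes = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                out.write(bytes, 0, bytes.length);
                i += Character.charCount(cp);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    private static int hex(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }
}
```

With this sketch, (ou=test) yields the value "test", while (ou= test)
and (ou=\20test) both yield " test" because the space is simply part of
the value. ( ou = test ) is rejected, since " ou " is not a legal
attribute name.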

Keep in mind that the filter parser will only be used when embedding
the server. For a remote server, filters are already parsed by the
client.


wdyt ?

-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Re: [Filters] New parser

Posted by Emmanuel Lecharny <el...@gmail.com>.
Ok, fine.

I will be strict on the grammar then... yeah, I like that, leather,
handcuffs and whip ;)

Emmanuel/Aka Master !

On 9/27/07, Ersin Er <er...@gmail.com> wrote:
> Strict is definitely fine.
>
> Thanks.


-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Re: [Filters] New parser

Posted by Ersin Er <er...@gmail.com>.
Strict is definitely fine.

Thanks.

-- 
Ersin Er
http://www.ersin-er.name

Re: [Filters] New parser

Posted by Alex Karasulu <ak...@apache.org>.
Hi Emmanuel,

I agree with you about keeping the parser strict according to the RFC.

Alex

Re: [Filters] New parser

Posted by Emmanuel Lecharny <el...@gmail.com>.
Hi Stefan

I have looked at your code, and its purpose is different from what we
need inside the server:
* in the server
 - fast parsing
 - stateless parser
 - fail fast
 - generate an ExprNode tree
 - strict filter
* in the client
 - fast parsing
 - contextful parser
 - auto-corrective parser
 - generate an LdapFilter tree
 - relaxed filter

We also need to be more precise in the way we distinguish nodes:
a substring node and a present node are different, for instance.
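
As a small illustration of that distinction, here is a hedged Java
sketch that classifies the right-hand side of an (attr=...) item. The
names FilterNodeKindSketch, Kind and classify are hypothetical and do
not come from the server or Studio code. It relies on the fact that in
the RFC 4515 string form a literal asterisk inside a value is written
as the escape \2a, so a raw '*' is always a wildcard.

```java
/** Hypothetical sketch: tell presence, substring and equality assertions apart. */
final class FilterNodeKindSketch {
    enum Kind { PRESENCE, SUBSTRING, EQUALITY }

    private FilterNodeKindSketch() {}

    /** 'assertion' is the text after '=' in a filter item, still in its escaped form. */
    static Kind classify(String assertion) {
        if (assertion.equals("*")) {
            return Kind.PRESENCE;   // (cn=*): the attribute merely has to exist
        }
        if (assertion.indexOf('*') >= 0) {
            return Kind.SUBSTRING;  // (cn=j*n): initial, any and final parts
        }
        return Kind.EQUALITY;       // (cn=john), or (cn=a\2ab) with an escaped asterisk
    }
}
```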

For those reasons, I don't think it's reasonable to merge the two
parsers. They are already tailored to their specific usages, so I
think a merge would waste time and cost us performance on the
server.

Btw, your parser is very elegant. I like the decoupling between
tokenization and syntactic parsing. Classic Aho, Sethi & Ullman
compiler technique ;)
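
For readers who have not seen that split before, here is a minimal
Java sketch of the idea. It is not the Studio code: TwoPhaseFilterSketch
and everything in it are made-up names, and the "tree" is just rendered
as a string. The tokenizer only cuts the input into pieces; the parser
alone decides whether the order of the pieces is valid.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a tokenizer pass followed by a separate parser pass. */
final class TwoPhaseFilterSketch {
    enum Type { LPAREN, RPAREN, AND, OR, NOT, ITEM }

    static final class Token {
        final Type type;
        final String text;
        Token(Type type, String text) { this.type = type; this.text = text; }
    }

    private TwoPhaseFilterSketch() {}

    /** Phase 1: split into tokens without judging whether their order makes sense. */
    static List<Token> tokenize(String filter) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < filter.length()) {
            char c = filter.charAt(i);
            // '&', '|' and '!' are operators only directly after '('.
            boolean afterLparen = !tokens.isEmpty()
                    && tokens.get(tokens.size() - 1).type == Type.LPAREN;
            if (c == '(') { tokens.add(new Token(Type.LPAREN, "(")); i++; }
            else if (c == ')') { tokens.add(new Token(Type.RPAREN, ")")); i++; }
            else if (afterLparen && c == '&') { tokens.add(new Token(Type.AND, "&")); i++; }
            else if (afterLparen && c == '|') { tokens.add(new Token(Type.OR, "|")); i++; }
            else if (afterLparen && c == '!') { tokens.add(new Token(Type.NOT, "!")); i++; }
            else {
                int start = i;
                while (i < filter.length() && filter.charAt(i) != '(' && filter.charAt(i) != ')') {
                    i++;
                }
                tokens.add(new Token(Type.ITEM, filter.substring(start, i)));
            }
        }
        return tokens;
    }

    /** Phase 2: check the token order and build the tree, rendered here as a string. */
    static String parse(String filter) {
        List<Token> tokens = tokenize(filter);
        int[] pos = { 0 };
        String tree = parseFilter(tokens, pos);
        if (pos[0] != tokens.size()) {
            throw new IllegalArgumentException("trailing input after filter");
        }
        return tree;
    }

    private static String parseFilter(List<Token> t, int[] pos) {
        expect(t, pos, Type.LPAREN);
        if (pos[0] >= t.size()) {
            throw new IllegalArgumentException("unexpected end of filter");
        }
        Token head = t.get(pos[0]++);
        String node;
        switch (head.type) {
            case AND:
            case OR: {
                List<String> children = new ArrayList<>();
                while (pos[0] < t.size() && t.get(pos[0]).type == Type.LPAREN) {
                    children.add(parseFilter(t, pos));
                }
                if (children.isEmpty()) { // this sketch requires at least one child
                    throw new IllegalArgumentException("operator without operands");
                }
                node = head.type + children.toString();
                break;
            }
            case NOT:
                node = "NOT[" + parseFilter(t, pos) + "]";
                break;
            case ITEM:
                node = head.text;
                break;
            default:
                throw new IllegalArgumentException("unexpected token " + head.text);
        }
        expect(t, pos, Type.RPAREN);
        return node;
    }

    private static void expect(List<Token> t, int[] pos, Type type) {
        if (pos[0] >= t.size() || t.get(pos[0]).type != type) {
            throw new IllegalArgumentException("expected " + type + " at token " + pos[0]);
        }
        pos[0]++;
    }
}
```

One consequence of the split: tokenize("(ou=a") happily returns two
tokens even though the filter is incomplete, which is the kind of
behaviour an editor that parses while the user is typing can build on,
whereas parse("(ou=a") fails immediately.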



On 9/27/07, Stefan Seelmann <se...@apache.org> wrote:
> Hi Emmanuel,
>
> we also have a filter parser in Studio. It is a hand-written parser and
> it is optimized to parse incomplete filters while typing the filter in
> the GUI to give some attribute completions and error markers. See
> http://issues.apache.org/jira/browse/DIRSTUDIO-47.
>
> Perhaps you want to take a look at it, but be aware: it isn't strict ;-)
> http://svn.apache.org/viewvc/directory/studio/trunk/studio-ldapbrowser-core/src/main/java/org/apache/directory/studio/ldapbrowser/core/model/filter/
>
> Regards,
> Stefan


-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Re: [Filters] New parser

Posted by Alex Karasulu <ak...@apache.org>.
On 9/27/07, Emmanuel Lecharny <el...@gmail.com> wrote:
>
> Last but not least, it's around 20 times faster than the ANTLR parser.
>

Now that's good ol' Emmanuel going to town again. I'm just impressed at how
fast you do that, with all those tests as well.

Thanks for your efforts!
Alex

Re: [Filters] New parser

Posted by Emmanuel Lecharny <el...@gmail.com>.
Ok, I have the new parser done. It's a stateless parser, so it can be
used without any synchronization (it's a static parser).
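
To illustrate why a stateless parser needs no locking, here is a small
Java sketch under stated assumptions. It is not the new server parser:
StatelessParserSketch, attributeOf and parseAll are hypothetical names,
and the "parsing" only extracts the attribute name. The point is that
the static method keeps all of its state in local variables, so many
threads can call it at the same time.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical sketch: a static parser whose only state is in local variables. */
final class StatelessParserSketch {
    private StatelessParserSketch() {}

    /** No fields are read or written here, so concurrent calls cannot interfere. */
    static String attributeOf(String filter) {
        int pos = 0;
        if (filter.isEmpty() || filter.charAt(pos) != '(') {
            throw new IllegalArgumentException("expected '('");
        }
        pos++;
        int start = pos;
        while (pos < filter.length() && filter.charAt(pos) != '=') {
            pos++;
        }
        if (pos == start || pos == filter.length()) {
            throw new IllegalArgumentException("expected attribute followed by '='");
        }
        return filter.substring(start, pos); // fail fast above, plain result here
    }

    /** Runs the parser on several threads at once, without any synchronization. */
    static List<String> parseAll(List<String> filters) {
        return filters.parallelStream()
                .map(StatelessParserSketch::attributeOf)
                .collect(Collectors.toList());
    }
}
```

A parser that kept its position or token buffer in instance or static
fields would need either one instance per thread or a lock around each
call.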

I have fixed some bugs in the tests (for instance,
(:=dummyAssertion\\23\\ac) is not valid, but was assumed to be valid in
the previous tests).

Last but not least, it's around 20 times faster than the ANTLR parser.

Btw, we now have more than 5000 tests when running the integration
tests... Not too bad :)


-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Re: [Filters] New parser

Posted by Emmanuel Lecharny <el...@gmail.com>.
Thanks Stefan !!!

I will have a look, sure. I thought you were using an ANTLR parser...


-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Re: [Filters] New parser

Posted by Stefan Seelmann <se...@apache.org>.
Hi Emmanuel,

we also have a filter parser in Studio. It is a hand-written parser and
it is optimized to parse incomplete filters while typing the filter in
the GUI to give some attribute completions and error markers. See
http://issues.apache.org/jira/browse/DIRSTUDIO-47.

Perhaps you want to take a look at it, but be aware: it isn't strict ;-)
http://svn.apache.org/viewvc/directory/studio/trunk/studio-ldapbrowser-core/src/main/java/org/apache/directory/studio/ldapbrowser/core/model/filter/

Regards,
Stefan

