Posted to general@incubator.apache.org by Gaël Lalire <ga...@gmail.com> on 2009/11/15 03:18:51 UTC

[grammar] pure java syntax analyzer (this is not a compiler of compiler)

Hello,

Today, if you want to do syntax analysis, you have to use a compiler-compiler (bison, yacc, JavaCC, ...) that generates source code.
After generation, you need to compile the generated source code, and only then can you parse some input.

I dislike this method because:
- you need to learn a meta-language (the language that describes the grammar)
- grammar reuse is ruled out
- the grammar cannot be dynamic (self-described)

However, I did not find any dynamic grammar analyzer, so I decided to write one.
I split my project into 3 modules:
- API: defines Token, Terminal, Grammar, exceptions, ...; a lexical analyzer has to depend on this module to send terminals to the syntax analyzer
- Impl: the analyzers (LR, LL, SLR, ...) and some calculation utilities.
- SPI: this module provides user-friendly abstract classes. For example, if you create a grammar using this module, the non-terminals and their rules are type-checked (generics),
so you can be sure that there will be no ClassCastException. It also provides an easy way to create arithmetic expressions (you just need to provide the terminals; the helper
will create the rules).
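
To give an idea of what "a grammar defined in plain Java, with generics on the non-terminals" can look like, here is a minimal sketch. It is not the attached code: every class and method name in it (RuntimeGrammarSketch, NonTerminal, rule, ...) is made up for illustration, and the matching strategy is a naive backtracking recursive descent with ordered alternatives, not one of the LR/LL/SLR analyzers of the Impl module.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

// All names here are hypothetical; this is not the API of the posted project.
class RuntimeGrammarSketch {

    /** A terminal symbol, as a lexer would send it to the syntax analyzer. */
    static final class Terminal {
        final String name;
        Terminal(String name) { this.name = name; }
    }

    /** One input token: a terminal plus the text it matched. */
    static final class Token {
        final Terminal type;
        final String text;
        Token(Terminal type, String text) { this.type = type; this.text = text; }
    }

    /** A successful match: the semantic value and the next input position. */
    static final class Match<T> {
        final T value;
        final int next;
        Match(T value, int next) { this.value = value; this.next = next; }
    }

    /** A rule tries to match at a position and returns null on failure. */
    interface Rule<T> {
        Match<T> tryMatch(List<Token> in, int pos);
    }

    /** A non-terminal whose semantic value has the static type T. */
    static final class NonTerminal<T> {
        private final List<Rule<T>> rules = new ArrayList<>();

        /** Adds "this -> t"; the value is computed from the token text. */
        NonTerminal<T> rule(Terminal t, Function<String, T> action) {
            rules.add((in, pos) -> {
                if (pos >= in.size() || in.get(pos).type != t) return null;
                return new Match<T>(action.apply(in.get(pos).text), pos + 1);
            });
            return this;
        }

        /** Adds "this -> left op right"; A and B are checked against the action by the compiler. */
        <A, B> NonTerminal<T> rule(NonTerminal<A> left, Terminal op, NonTerminal<B> right,
                                   BiFunction<A, B, T> action) {
            rules.add((in, pos) -> {
                Match<A> l = left.parse(in, pos);
                if (l == null || l.next >= in.size() || in.get(l.next).type != op) return null;
                Match<B> r = right.parse(in, l.next + 1);
                return r == null ? null : new Match<T>(action.apply(l.value, r.value), r.next);
            });
            return this;
        }

        /** Tries the rules in declaration order; the first one that matches wins. */
        Match<T> parse(List<Token> in, int pos) {
            for (Rule<T> candidate : rules) {
                Match<T> m = candidate.tryMatch(in, pos);
                if (m != null) return m;
            }
            return null;
        }
    }

    static final Terminal NUM = new Terminal("NUM");
    static final Terminal PLUS = new Terminal("PLUS");

    /** Builds "sum -> num PLUS sum | NUM" and "num -> NUM" with plain method calls. */
    static NonTerminal<Integer> buildGrammar() {
        NonTerminal<Integer> num = new NonTerminal<>();
        NonTerminal<Integer> sum = new NonTerminal<>();
        num.rule(NUM, Integer::parseInt);
        sum.rule(num, PLUS, sum, Integer::sum).rule(NUM, Integer::parseInt);
        return sum;
    }

    /** A tiny hand-written lexer for digits, '+', and spaces. */
    static List<Token> lex(String s) {
        List<Token> out = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isDigit(c)) {
                int j = i;
                while (j < s.length() && Character.isDigit(s.charAt(j))) j++;
                out.add(new Token(NUM, s.substring(i, j)));
                i = j;
            } else if (c == '+') {
                out.add(new Token(PLUS, "+"));
                i++;
            } else if (c == ' ') {
                i++;
            } else {
                throw new IllegalArgumentException("unexpected character: " + c);
            }
        }
        return out;
    }

    /** Parses and evaluates the input; rejects input that is not fully consumed. */
    static int evaluate(String input) {
        List<Token> tokens = lex(input);
        Match<Integer> m = buildGrammar().parse(tokens, 0);
        if (m == null || m.next != tokens.size()) {
            throw new IllegalArgumentException("syntax error in: " + input);
        }
        return m.value;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("1 + 2 + 3")); // prints 6
    }
}
```

The point of the typed rule(...) overloads is that an action with the wrong argument types does not compile, so no cast is needed in user code. The grammar itself is ordinary data built at runtime, with no generation step.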

Why donate to Apache?
I hope that I'm not the only one interested in having a runtime grammar tool, and I hope I can build a community around this project (because I'm alone).
This isn't an easy domain and there are many things I do not know about compilation, so a community could bring faster or new implementations.
Also, Apache is well known in universities, which could be interested in this project for practical exercises.

Future tasks:
- LL(*) to create (an abstract LL exists but is untested)
- LR(1+) to create (I need documentation)
- LALR(*) to create (needs documentation too)
- SLR(2+) to create (needs documentation as well)
- Naming issues (bad English, bad word choices, ...)
- Comments
- Tutorial
- Find a way to serialize the grammar's states (the maps of actions on terminal input) and restore them with a simple binding of terminals instead of analyzing the grammar again.
- Create bindings with a lexer (ORO? JDK?)
- More reusable grammar parts (boolean expressions, SQL parser, ...)
- Parsing error management
- Create a BNF-to-SPI converter
- ...
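
The serialization item in the list above could be approached roughly as follows. This is only a sketch under assumptions: the table layout (one map per state, keyed by terminal name, with textbook action strings such as "s1", "r1" and "acc"), the class ActionTableSketch and its methods are all hypothetical, and standard Java object serialization is just one possible storage format. The example table is the SLR table of the toy grammar S -> a.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names; this is not how the posted project stores its states.
class ActionTableSketch {

    /** Runtime terminal object; it is never serialized, only its name is. */
    static final class Terminal {
        final String name;
        Terminal(String name) { this.name = name; }
    }

    /** SLR action table for the toy grammar S -> a (state index -> terminal name -> action). */
    static List<Map<String, String>> exampleTable() {
        List<Map<String, String>> table = new ArrayList<>();
        table.add(Collections.singletonMap("a", "s1"));   // state 0: shift and go to state 1
        table.add(Collections.singletonMap("$", "r1"));   // state 1: reduce by rule 1 (S -> a)
        table.add(Collections.singletonMap("$", "acc"));  // state 2: accept
        return table;
    }

    /** Serializes the table, keyed by terminal name, with standard Java serialization. */
    static byte[] save(List<Map<String, String>> table) {
        ArrayList<HashMap<String, String>> copy = new ArrayList<>();
        for (Map<String, String> row : table) copy.add(new HashMap<>(row));
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(copy);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the table back and binds each terminal name to a live Terminal object. */
    @SuppressWarnings("unchecked")
    static List<Map<Terminal, String>> restore(byte[] data, Map<String, Terminal> bindings) {
        List<Map<String, String>> raw;
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            raw = (List<Map<String, String>>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        List<Map<Terminal, String>> bound = new ArrayList<>();
        for (Map<String, String> row : raw) {
            Map<Terminal, String> boundRow = new HashMap<>();
            for (Map.Entry<String, String> entry : row.entrySet()) {
                Terminal t = bindings.get(entry.getKey());
                if (t == null) {
                    throw new IllegalArgumentException("no binding for terminal " + entry.getKey());
                }
                boundRow.put(t, entry.getValue());
            }
            bound.add(boundRow);
        }
        return bound;
    }

    public static void main(String[] args) {
        Terminal a = new Terminal("a");
        Terminal end = new Terminal("$");
        Map<String, Terminal> bindings = new HashMap<>();
        bindings.put("a", a);
        bindings.put("$", end);
        List<Map<Terminal, String>> table = restore(save(exampleTable()), bindings);
        System.out.println(table.get(0).get(a)); // prints s1
    }
}
```

Storing names instead of terminal objects keeps the saved states independent of any particular lexer. The binding step then only needs a name-to-terminal map, not a new grammar analysis.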

I have attached the source code to this mail.
This is a Maven 2 / Eclipse project [Eclipse is not mandatory, but the .project file is provided].

Now I need a champion (if I understand the mechanism correctly; Apache rules are not simple) to integrate the project.

Best regards,
Gael Lalire


Re: [grammar] pure java syntax analyzer (this is not a compiler of compiler)

Posted by Emmanuel Lécharny <el...@apache.org>.
Gaël Lalire wrote:
> Hello,
>   
Hi !
> Today if you want to do syntax analysis, you have to use a compiler of compiler (bison, yacc, javaCC ...) which generates source code.
> After generation, you need to compile your generated source code and then you can parse some input.
>
> I dislike this method because :
> - you need to learn a meta-language (the language which describe the grammar)
> - reusability of grammar is excluded
> - the grammar cannot be dynamic (self described)
>
> However I did not found any dynamical grammar analyzer so I decided to write it.
> I separate my project in 3 modules :
> - API : define Token, Terminal, Grammar, exceptions, ... ; A lexical analyzer have to depends on this module to send terminals to the syntax analyzer
> - Impl : The analyzers (LR, LL, SLR, ...) and some calculation utilities.
> - SPI : This module provide user friendly abstract classes. For example, if you create a grammar using this module there will be a type checking (generics) on non-terminals
> and its rules, so you will be sure that there will be no ClassCastException. It also provide a easy way to create arithmetic expression (you just need to provide terminals, the helper
> will create the rules).
>   
I checked the code base and there are some interesting (though partial) 
ideas there. I really like the approach, i.e. providing a way to define 
your grammar in Java. It makes me think that at some point, it could be 
cool to use annotations to define the grammar ...
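
One way the annotation idea could look, as a rough sketch only: the @Production annotation, the SumGrammar class and the productions(...) helper below are all invented for illustration and exist neither in the posted code nor in any library. The sketch only collects the declared productions through reflection; feeding them to an analyzer is left out.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical names; neither the annotation nor the classes exist in the posted code.
class AnnotatedGrammarSketch {

    /** Marks a method as the action of the production "lhs -> rhs[0] rhs[1] ...". */
    @Retention(RetentionPolicy.RUNTIME) // must stay visible to reflection at runtime
    @Target(ElementType.METHOD)
    @interface Production {
        String lhs();
        String[] rhs();
    }

    /** A grammar written as an ordinary class: sum -> NUM PLUS sum | NUM. */
    static class SumGrammar {
        @Production(lhs = "sum", rhs = {"NUM", "PLUS", "sum"})
        static Integer add(String num, String plus, Integer rest) {
            return Integer.parseInt(num) + rest;
        }

        @Production(lhs = "sum", rhs = {"NUM"})
        static Integer single(String num) {
            return Integer.parseInt(num);
        }
    }

    /** Collects the productions declared on a grammar class, grouped by left-hand side. */
    static Map<String, List<List<String>>> productions(Class<?> grammar) {
        Map<String, List<List<String>>> bySymbol = new TreeMap<>();
        for (Method m : grammar.getDeclaredMethods()) {
            Production p = m.getAnnotation(Production.class);
            if (p == null) continue; // not a grammar action
            bySymbol.computeIfAbsent(p.lhs(), k -> new ArrayList<>()).add(Arrays.asList(p.rhs()));
        }
        return bySymbol;
    }

    public static void main(String[] args) {
        // The order of the alternatives follows getDeclaredMethods(), which is unspecified.
        System.out.println(productions(SumGrammar.class));
    }
}
```

Compared with building the grammar through method calls, this form loses the compile-time check between a rule and its action. A real implementation would have to validate the method signatures when it reads the annotations.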
> Why donate to apache ?
> I hope that I'm not the only one interested by having a runtime grammar tool, and I hope I could create a community on this project (because I'm alone).
> This isn't an easy domain, there is many things I do not know about compilation, so a community could bring speeder or new implementations.
> Also apache is well-known in university, which could be interested on this project for practical exercises.
>   
As Gurkan stated, the very first step is to fill in a proposal: 
http://incubator.apache.org/guides/proposal.html
> Now I need a champion (If I understand the mechanism : apache rule are not simple) to integrate the project.
>   
Apache rules are simple. It's just that they are formal. This is the key 
to getting a successful project out of incubation: the ASF is not 
SourceForge, and we promote Community above Code: the project must 
remain alive even if you leave it.

Thanks !

-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org


Re: [grammar] pure java syntax analyzer (this is not a compiler of compiler)

Posted by Gurkan Erdogdu <cg...@gmail.com>.
Hi:

I think that writing an Incubator proposal for your project is a good
way to start discussions.

You can look at sample proposals here: http://wiki.apache.org/incubator/

Thanks;

--Gurkan



-- 
Gurkan Erdogdu
http://gurkanerdogdu.blogspot.com