Posted to general@incubator.apache.org by Gaël Lalire <ga...@gmail.com> on 2009/11/15 03:18:51 UTC
[grammar] pure java syntax analyzer (this is not a compiler of compiler)
Hello,
Today, if you want to do syntax analysis, you have to use a compiler-compiler (bison, yacc, JavaCC, ...) that generates source code.
After generation, you need to compile the generated source code before you can parse any input.
I dislike this approach because:
- you need to learn a meta-language (the language that describes the grammar)
- grammars cannot be reused
- the grammar cannot be dynamic (self-described)
However, I did not find any dynamic grammar analyzer, so I decided to write one.
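To make the contrast with generated parsers concrete, here is a minimal sketch of a grammar held as ordinary Java data and interpreted directly at runtime. It is not the attached code: the class RuntimeGrammar and its methods rule and accepts are hypothetical names, and the naive backtracking recognizer only stands in for real LR/LL analyzers (it would loop forever on left-recursive rules).

```java
import java.util.*;

/** Hypothetical sketch: a grammar kept as plain Java data and interpreted at runtime. */
class RuntimeGrammar {
    // Non-terminal name -> alternatives; each alternative is a sequence of symbols.
    private final Map<String, List<List<String>>> rules = new HashMap<>();

    /** Adds one alternative for a non-terminal; any symbol without rules is a terminal. */
    RuntimeGrammar rule(String nonTerminal, String... symbols) {
        rules.computeIfAbsent(nonTerminal, k -> new ArrayList<>()).add(Arrays.asList(symbols));
        return this;
    }

    /** Returns true if the whole token list can be derived from the start symbol. */
    boolean accepts(String start, List<String> tokens) {
        return derive(start, tokens, 0).contains(tokens.size());
    }

    // Returns every position reachable after matching `symbol` starting at `pos`.
    private Set<Integer> derive(String symbol, List<String> tokens, int pos) {
        Set<Integer> ends = new HashSet<>();
        List<List<String>> alternatives = rules.get(symbol);
        if (alternatives == null) { // terminal: must equal the current token
            if (pos < tokens.size() && tokens.get(pos).equals(symbol)) {
                ends.add(pos + 1);
            }
            return ends;
        }
        for (List<String> alternative : alternatives) {
            Set<Integer> positions = new HashSet<>(Collections.singleton(pos));
            for (String s : alternative) {
                Set<Integer> next = new HashSet<>();
                for (int p : positions) {
                    next.addAll(derive(s, tokens, p));
                }
                positions = next;
            }
            ends.addAll(positions);
        }
        return ends;
    }

    public static void main(String[] args) {
        RuntimeGrammar g = new RuntimeGrammar()
            .rule("expr", "term", "+", "expr")
            .rule("expr", "term")
            .rule("term", "num")
            .rule("term", "(", "expr", ")");
        System.out.println(g.accepts("expr", Arrays.asList("num", "+", "(", "num", ")"))); // prints true
        System.out.println(g.accepts("expr", Arrays.asList("num", "+")));                  // prints false
    }
}
```

Because the rules are ordinary objects, a program can add or change them while it is running, where a generated parser would need a regenerate-and-recompile step.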
I have split my project into 3 modules:
- API: defines Token, Terminal, Grammar, exceptions, ... A lexical analyzer has to depend on this module to send terminals to the syntax analyzer.
- Impl: the analyzers (LR, LL, SLR, ...) and some calculation utilities.
- SPI: this module provides user-friendly abstract classes. For example, if you create a grammar using this module, the non-terminals and their rules are type-checked (with generics), so you can be sure there will be no ClassCastException. It also provides an easy way to create arithmetic expressions (you just need to provide the terminals; the helper will create the rules).
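As an illustration of what such an arithmetic helper could produce, the sketch below expands a list of operator terminals into the usual precedence-layered rules. Everything here is an assumption made for the example: the class ArithmeticRulesSketch, the method rulesFor, the generated non-terminal names E0, E1, ... and the textual rule format are not taken from the SPI module.

```java
import java.util.*;

/** Hypothetical helper: derives arithmetic rules from operator terminals. */
class ArithmeticRulesSketch {
    /**
     * Builds left-associative rules, one non-terminal per precedence level.
     * `levels` is ordered from lowest to highest precedence.
     */
    static List<String> rulesFor(String operand, List<List<String>> levels) {
        List<String> rules = new ArrayList<>();
        for (int i = 0; i < levels.size(); i++) {
            String current = "E" + i;
            String tighter = "E" + (i + 1);
            for (String op : levels.get(i)) {
                // Left recursion gives left associativity: a - b - c parses as (a - b) - c.
                rules.add(current + " -> " + current + " " + op + " " + tighter);
            }
            rules.add(current + " -> " + tighter);
        }
        String atom = "E" + levels.size();
        rules.add(atom + " -> " + operand);
        rules.add(atom + " -> ( E0 )");
        return rules;
    }

    public static void main(String[] args) {
        List<List<String>> levels = Arrays.asList(
            Arrays.asList("+", "-"),
            Arrays.asList("*", "/"));
        for (String rule : rulesFor("num", levels)) {
            System.out.println(rule);
        }
    }
}
```

The generated rules are left-recursive, so they suit the LR family of analyzers; an LL analyzer would need them rewritten first.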
Why donate to Apache?
I hope that I'm not the only one interested in having a runtime grammar tool, and I hope I can build a community around this project (because I'm alone).
This isn't an easy domain and there are many things I do not know about compilation, so a community could bring faster or new implementations.
Also, Apache is well known in universities, which could be interested in this project for practical exercises.
Future tasks:
- LL(*) to create (an abstract LL exists but is untested)
- LR(1+) to create (I need documentation)
- LALR(*) to create (documentation needed too)
- SLR(2+) to create (documentation needed as well)
- Naming issues (bad English, badly chosen words, ...)
- Comments
- Tutorial
- Find a way to serialize the grammar's states (the maps from terminal input to actions) and restore them with a simple binding of terminals instead of re-analyzing the grammar.
- Create bindings with a lexer (ORO? JDK?)
- More reusable grammar parts (boolean expressions, SQL parser, ...)
- Parsing error management
- Create a BNF-to-SPI converter
- ...
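For the serialization item above, one possible shape, assuming the computed states boil down to a map from state number to (terminal name -> action) entries: standard Java serialization can store the table and reload it later without analyzing the grammar again. ActionTableStore, save and load are hypothetical names, and the string-encoded actions ("shift 1", "accept") are an assumption for the example, not the project's format.

```java
import java.io.*;
import java.util.HashMap;

/** Hypothetical sketch: saving and restoring a precomputed action table. */
class ActionTableStore {
    /** Serializes the table to bytes with standard Java serialization. */
    static byte[] save(HashMap<Integer, HashMap<String, String>> table) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(table);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Restores a table previously written by save. */
    @SuppressWarnings("unchecked")
    static HashMap<Integer, HashMap<String, String>> load(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (HashMap<Integer, HashMap<String, String>>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        HashMap<Integer, HashMap<String, String>> table = new HashMap<>();
        table.put(0, new HashMap<>());
        table.get(0).put("num", "shift 1");
        table.put(1, new HashMap<>());
        table.get(1).put("$", "accept");

        HashMap<Integer, HashMap<String, String>> restored = load(save(table));
        System.out.println(restored.equals(table)); // prints true
    }
}
```

Keying the inner maps by terminal name rather than by terminal object is what would let a restored table be re-bound to the terminals of a running lexer.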
I attach the source code to this mail.
It is a Maven 2 / Eclipse project (Eclipse is not mandatory, but the .project file is provided).
Now I need a champion to integrate the project (if I understand the mechanism correctly; the Apache rules are not simple).
Best regards,
Gael Lalire
Re: [grammar] pure java syntax analyzer (this is not a compiler of
compiler)
Posted by Emmanuel Lécharny <el...@apache.org>.
Gaël Lalire wrote:
> Hello,
>
Hi!
> Today, if you want to do syntax analysis, you have to use a compiler-compiler (bison, yacc, JavaCC, ...) that generates source code.
> After generation, you need to compile the generated source code before you can parse any input.
>
> I dislike this approach because:
> - you need to learn a meta-language (the language that describes the grammar)
> - grammars cannot be reused
> - the grammar cannot be dynamic (self-described)
>
> However, I did not find any dynamic grammar analyzer, so I decided to write one.
> I have split my project into 3 modules:
> - API: defines Token, Terminal, Grammar, exceptions, ... A lexical analyzer has to depend on this module to send terminals to the syntax analyzer.
> - Impl: the analyzers (LR, LL, SLR, ...) and some calculation utilities.
> - SPI: this module provides user-friendly abstract classes. For example, if you create a grammar using this module, the non-terminals and their rules are type-checked (with generics), so you can be sure there will be no ClassCastException. It also provides an easy way to create arithmetic expressions (you just need to provide the terminals; the helper will create the rules).
>
I checked the code base and there are interesting (though partial) good
ideas there. I really like the approach, i.e. providing a way to define
your grammar in Java. It makes me think that at some point it could be
cool to use annotations to define the grammar...
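As a rough illustration of the annotation idea (a hypothetical sketch, nothing like it is claimed to exist in the submitted code): an annotation named Production, invented here, carries one rule per method, and reflection collects the rules at runtime. The textual rule format and all class names are assumptions.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.*;

/** Hypothetical sketch: grammar rules declared with annotations, read by reflection. */
class AnnotatedGrammarSketch {
    /** Marks a method as the semantic action of one production. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Production {
        String value();
    }

    /** Example grammar class: each annotated method is the action for its rule. */
    static class ExprGrammar {
        @Production("expr -> expr + term")
        int add(int left, int right) { return left + right; }

        @Production("expr -> term")
        int pass(int value) { return value; }
    }

    /** Collects the declared productions, sorted because reflection order is unspecified. */
    static List<String> productionsOf(Class<?> grammarClass) {
        List<String> productions = new ArrayList<>();
        for (Method m : grammarClass.getDeclaredMethods()) {
            Production p = m.getAnnotation(Production.class);
            if (p != null) {
                productions.add(p.value());
            }
        }
        Collections.sort(productions);
        return productions;
    }

    public static void main(String[] args) {
        System.out.println(productionsOf(ExprGrammar.class));
        // prints [expr -> expr + term, expr -> term]
    }
}
```

An analyzer could then build its tables from the collected rules and invoke the annotated method whenever the matching rule is reduced.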
> Why donate to apache ?
> I hope that I'm not the only one interested by having a runtime grammar tool, and I hope I could create a community on this project (because I'm alone).
> This isn't an easy domain, there is many things I do not know about compilation, so a community could bring speeder or new implementations.
> Also apache is well-known in university, which could be interested on this project for practical exercises.
>
As Gurkan stated, the very first step is to fill in a proposal:
http://incubator.apache.org/guides/proposal.html
> Future tasks:
> - LL(*) to create (an abstract LL exists but is untested)
> - LR(1+) to create (I need documentation)
> - LALR(*) to create (documentation needed too)
> - SLR(2+) to create (documentation needed as well)
> - Naming issues (bad English, badly chosen words, ...)
> - Comments
> - Tutorial
> - Find a way to serialize the grammar's states (the maps from terminal input to actions) and restore them with a simple binding of terminals instead of re-analyzing the grammar.
> - Create bindings with a lexer (ORO? JDK?)
> - More reusable grammar parts (boolean expressions, SQL parser, ...)
> - Parsing error management
> - Create a BNF-to-SPI converter
> - ...
>
> I attach the source code to this mail.
> It is a Maven 2 / Eclipse project (Eclipse is not mandatory, but the .project file is provided).
>
> Now I need a champion to integrate the project (if I understand the mechanism correctly; the Apache rules are not simple).
>
Apache rules are simple. It's just that they are formal. This is the key
to getting a successful project out of incubation: the ASF is not
SourceForge, and we promote Community above Code. The project must
remain alive even if you leave it.
Thanks !
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org
Re: [grammar] pure java syntax analyzer (this is not a compiler of
compiler)
Posted by Gurkan Erdogdu <cg...@gmail.com>.
Hi,
I think that writing an incubator proposal document for your project is
a good way to start the discussion.
You can look at sample proposals here: http://wiki.apache.org/incubator/
Thanks,
--Gurkan
--
Gurkan Erdogdu
http://gurkanerdogdu.blogspot.com