Posted to dev@avro.apache.org by Oscar Westra van Holthe - Kind <os...@westravanholthe.nl> on 2022/01/05 22:13:37 UTC

Maintaining the IDL in the 21st century

Hi everyone,

One of the Avro components in Java is the IDL parser. It is generated from
a grammar using JavaCC. This is an old library that enforces Java 1.5
syntax for the IDL class. What's worse, it is not really updated anymore.

For now it works, but it's not suitable for maintaining a high-quality
component. As a result, I've been looking for a replacement.

Some alternatives to JavaCC are:
* JavaCC 21: written by a brilliant mind, but one who was out of
programming for a decade and only returned a few years ago. JavaCC 21 lacks
Maven support, and I cannot find it in Maven Central.
* CookCC: has a YACC-like syntax (quite archaic), and is at version
0.something.
* CUP (a Java-based “Constructor of Useful Parsers”): generates more
powerful LALR parsers, which allows for grammars that are easier to read.
However, it was last released in October 2015.
* ANTLR: actively developed (9 releases since 2017, the last one in November
2021) and thus the most obvious choice, but it requires ANTLR artifacts at
runtime (i.e. a 328 kB extra runtime dependency for the IDL parser).

Going forward, I suggest switching to ANTLR.
My reasons:
1. It's actively maintained
2. There is a Maven plugin to generate the parser during the build
3. I know where to find working examples

Does anyone have a better/easier suggestion?

Or should I continue with ANTLR?


Kind regards,
Oscar


-- 
Oscar Westra van Holthe - Kind <os...@westravanholthe.nl>

Re: Maintaining the IDL in the 21st century

Posted by Oscar Westra van Holthe - Kind <os...@westravanholthe.nl>.
On Fri, 7 Jan 2022 at 10:43, Martin Grigorov <mg...@apache.org> wrote:

> Have you noticed https://github.com/javacc/ ?
> It looks like a fork of the old JavaCC but with updates for Java 8.
> https://search.maven.org/artifact/org.javacc.generator/java/8.0.1/jar and
> https://search.maven.org/artifact/org.javacc.plugin/javacc-maven-plugin/3.0.3/maven-plugin
> were released last April.
>

Thank you for these links. I did find that GitHub project, but due to the
lack of attribution on the various forks I'm hesitant to take any one path.
I had not, however, found that plugin yet. I'll add this as a PR shortly.


Kind regards,
Oscar

-- 
✉️ Oscar Westra van Holthe - Kind <os...@westravanholthe.nl>

Re: Maintaining the IDL in the 21st century

Posted by Martin Grigorov <mg...@apache.org>.
Hi Oscar,

Have you noticed https://github.com/javacc/ ?
It looks like a fork of the old JavaCC but with updates for Java 8.
https://search.maven.org/artifact/org.javacc.generator/java/8.0.1/jar and
https://search.maven.org/artifact/org.javacc.plugin/javacc-maven-plugin/3.0.3/maven-plugin
were released last April.


Re: Maintaining the IDL in the 21st century

Posted by Martin Grigorov <mg...@apache.org>.
On Sun, Jan 9, 2022 at 4:42 PM Oscar Westra van Holthe - Kind <
oscar@westravanholthe.nl> wrote:

> Thank you for the reactions on this! I really appreciate it.
>
> Given the responses, migrating to ANTLR will allow us to tap into more
> knowledge and target multiple languages. ANTLR supports, among others,
> C++, C#, Java, JavaScript, PHP, Python 2 and Python 3.
>

I agree that ANTLR 4 looks like the best tool for this job at the moment!
Looking forward to learning something new from you! :-)


>
> Going forward, I'll be taking a few steps:
>
>    1. Upgrade the JavaCC plugin as per Martin's suggestion, and JavaCC to
>    the most recent version of the plugin's dependency, 7.0.10
>    2. Upgrade the comment handling of the IDL to warn if documentation
>    comments are out of place
>    (this is useful to prepare for less hacky documentation comment
>    handling, but will force compilation errors if documentation comments
>    are used as regular multiline comments)
>    3. Suggest an ANTLR grammar to replace the IDL grammar
>
>
> Kind regards,
> Oscar

Re: Maintaining the IDL in the 21st century

Posted by Oscar Westra van Holthe - Kind <os...@westravanholthe.nl>.
Thank you for the reactions on this! I really appreciate it.

Given the responses, migrating to ANTLR will allow us to tap into more
knowledge and target multiple languages. ANTLR supports, among others, C++,
C#, Java, JavaScript, PHP, Python 2 and Python 3.

Going forward, I'll be taking a few steps:

   1. Upgrade the JavaCC plugin as per Martin's suggestion, and JavaCC to
   the most recent version of the plugin's dependency, 7.0.10
   2. Upgrade the comment handling of the IDL to warn if documentation
   comments are out of place
   (this is useful to prepare for less hacky documentation comment
   handling, but will force compilation errors if documentation comments are
   used as regular multiline comments)
   3. Suggest an ANTLR grammar to replace the IDL grammar


Kind regards,
Oscar


On Thu, 6 Jan 2022 at 18:24, Ryan Skraba <ry...@skraba.com> wrote:

> Hey, this is an excellent analysis for the future of the IDL
> implementation, thanks!
>
> I am really no expert on parser generators, but I've learned
> quite a bit from reading through your PRs.  Switching to ANTLR might
> bring in some more active developers in this area, so I'm all for it!
> Thanks again for the work you've been doing, and for taking on this
> task if you choose to do so.
>
> Would using ANTLR help some of the other SDKs implement the IDL
> specification?  That would be a huge advantage.
>
> All my best, Ryan


-- 

✉️ Oscar Westra van Holthe - Kind <os...@westravanholthe.nl>

Re: Maintaining the IDL in the 21st century

Posted by Ryan Skraba <ry...@skraba.com>.
Hey, this is an excellent analysis for the future of the IDL
implementation, thanks!

I am really no expert on parser generators, but I've learned
quite a bit from reading through your PRs.  Switching to ANTLR might
bring in some more active developers in this area, so I'm all for it!
Thanks again for the work you've been doing, and for taking on this
task if you choose to do so.

Would using ANTLR help some of the other SDKs implement the IDL
specification?  That would be a huge advantage.

All my best, Ryan



On Thu, Jan 6, 2022 at 4:30 AM Raymie Stata <rs...@yahoo.com.invalid> wrote:
>
> I support your recommendation.  I've recently used ANTLR on a use case
> similar to this one and had a great experience.
>
>   Raymie

Re: Maintaining the IDL in the 21st century

Posted by Raymie Stata <rs...@yahoo.com.INVALID>.
I support your recommendation.  I've recently used ANTLR on a use case
similar to this one and had a great experience.

  Raymie
