Posted to dev@thrift.apache.org by Jake Farrell <jf...@apache.org> on 2011/12/13 02:56:25 UTC

Decouple Thrift IDL from generators (THRIFT-1463)

Diwaker 
What issue in your day-to-day development are you trying to address with this ticket? Initially you could remove languages you were not using from the compiler, but that option was dropped in favor of always making every supported generator available. That should have no impact on any test cases or libraries you do not wish to include. Configure also lets you build without a given client library via the --without-<LIB> option, so you can compile and test only the client libraries you plan to use without worrying about the dependencies of libraries you are not going to use. What problem are you running into?

- Jake




On Dec 12, 2011, at 6:39 PM, Diwaker Gupta (Created) (JIRA) wrote:

> Decouple Thrift IDL from generators
> -----------------------------------
> 
>                 Key: THRIFT-1463
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1463
>             Project: Thrift
>          Issue Type: Wish
>          Components: Compiler (General)
>            Reporter: Diwaker Gupta
>            Assignee: Jake Farrell
> 
> 
> While Thrift's broad language support is fantastic, it does impose many constraints on day to day development.
> 
> * The current design of the compiler makes it hard to improve and test the generator/library for a particular language. The codebase is monolithic and hard to navigate.
> * Each language has its own idiosyncrasies. For instance, the Java library requires ant to build, while the Ruby library has other dependencies. Running unit tests is slightly different for each language library. Currently, all of this is duct-taped together (barely) using 'make', which has its own flaws.
> * Adding support for a new language is fairly easy, but rather than making the code more modular, it adds to the current complexity of the codebase. For example, setting up Jenkins jobs to test/verify builds for a new language takes a while to come up to parity with other languages.
> 
> I think Google's Protocol Buffers approach is instructive here. They're trying to strip down the core compiler and decouple the IDL from language-specific stubs. For a rich environment like Thrift, I think this decoupling is crucial to allow for a more maintainable and testable code base going forward. To summarize the idea:
> 
> * the core compiler takes in a Thrift grammar file and generates an intermediate representation: think of an in-memory AST
> * each supported language will be implemented via a plugin that the compiler can load at runtime
> * the plugins take this AST and transform it into source code
> * each plugin can be in its own repository (consuming the compiler via a git submodule, for instance). Plugins can freely choose their own build system, unit test frameworks, etc.
> * a meta repository can contain the compiler, and the 'blessed' (officially supported) languages. This meta repository can include integration tests that will test language interoperability
> 
> I realize, of course, that this will be a big change. But we have to start somewhere.
> 
> 
> 
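To make the proposal quoted above concrete, here is a minimal Python sketch of the compiler/plugin split it describes. Everything in it is an assumption made for illustration: the Program/Struct/Field classes stand in for the in-memory AST, and GENERATORS, register and compile_program are hypothetical names, not part of the Thrift compiler. Real plugins would more likely be separate packages or executables discovered at runtime; an in-process registry just keeps the sketch short.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Field:
    name: str
    type_name: str


@dataclass
class Struct:
    name: str
    fields: List[Field] = field(default_factory=list)


@dataclass
class Program:
    """Hypothetical in-memory AST produced by the core compiler."""
    structs: List[Struct] = field(default_factory=list)


# Registry of generator plugins, keyed by target language (assumed design).
GENERATORS: Dict[str, Callable[[Program], str]] = {}


def register(language: str):
    """Decorator a plugin uses to announce itself to the core."""
    def wrap(func: Callable[[Program], str]) -> Callable[[Program], str]:
        GENERATORS[language] = func
        return func
    return wrap


@register("python")
def generate_python(program: Program) -> str:
    """Example plugin: turn each struct into a plain Python class."""
    lines: List[str] = []
    for struct in program.structs:
        params = ", ".join(["self"] + [fld.name for fld in struct.fields])
        lines.append(f"class {struct.name}:")
        lines.append(f"    def __init__({params}):")
        if not struct.fields:
            lines.append("        pass")
        for fld in struct.fields:
            lines.append(f"        self.{fld.name} = {fld.name}")
        lines.append("")
    return "\n".join(lines)


def compile_program(program: Program, language: str) -> str:
    """Core entry point: hand the AST to whichever plugin was requested."""
    if language not in GENERATORS:
        raise ValueError(f"no generator plugin registered for {language!r}")
    return GENERATORS[language](program)
```

The point of the split is that the core only knows about the AST and the registry; a new language is one more registered function and needs no change to compile_program.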


Re: Decouple Thrift IDL from generators (THRIFT-1463)

Posted by Diwaker Gupta <di...@maginatics.com>.
Thanks for bringing this onto the mailing list, Jake.

First, I should preface by saying that I created the ticket with the goal
of initiating a discussion around Thrift's architecture going forward.
There is no urgent or immediate concern (if anything, the nature of the
ticket implies that changes, if any, are unlikely to be quick and would
require serious thought/discussion).

The TL;DR version is this: I think the current Thrift repo is becoming
increasingly hard to maintain and test, especially when it comes to
consistency of code quality and functionality across languages.

Yes, I can use configure to specify exactly which libraries I want to
build. But consider:

* Say I'm a new Thrift user. I check out the repo and now I just want to
quickly build Thrift and run some tests. Let's also assume that configure
does a good job of choosing sensible defaults (which itself is suspect).
What's the next step? Will 'make' in the top-dir build all the configured
libraries? Will 'make check' (or similar) run all the tests? Last I tried,
that didn't work. Even if it does, as more languages get added, it requires
more and more plumbing to just keep all this together with make.

* Every language has its own ecosystem and way of doing things. Thrift
should embrace them where possible, rather than try to shoe-horn everything
into a single mold -- it leads to suboptimal use of the developers' time.
For instance, Java works great with Maven and Scala with sbt. There's no
clean way to integrate them both into a single top-level project using
'make' or whatever else, though.

* This tight coupling between the IDL and the language libraries also makes
it harder to experiment. Say I wanted to write a new Java library for
Thrift. I'd end up spending a lot of time just trying to integrate this new
library with the various build scripts and such. If I had a compiler and a
clean API
to work with, I could just create a separate Maven project, drop in the
compiler and voila! Conversely, say I wanted to write a new compiler in
Python or something and test it against one of the existing libraries.
Again, the simpler the inter-dependencies, the easier these things become.

* It's unlikely that all the Thrift libraries will have the same level of
code quality or even consistency in implementation. Right now there's no
good mechanism for the devs to focus on a few, popular libraries in the
core distribution, while still allowing other libraries to thrive in the
ecosystem. A quick search in Jira will reveal that even among Java and C++,
there's a host of subtle differences. All these issues will get worse over
time. With the decoupling, each component can independently strengthen its
own tests: the core IDL compiler, the Java library, C++ library, etc. -- each
using the best tool for the job. IMO it is a lot simpler to then put
together an integration harness that can mix and match various
client/server implementations to auto-generate a compatibility matrix.
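
As a rough illustration of the integration harness mentioned in the last point, here is a small Python sketch that pairs every client implementation with every server implementation and records whether a check passed. The names build_matrix and render are hypothetical, and the check callable is an assumption: in a real harness it would start a server in one language, run a client in another, and compare results.

```python
from itertools import product
from typing import Callable, Dict, Sequence, Tuple

Matrix = Dict[Tuple[str, str], bool]


def build_matrix(
    clients: Sequence[str],
    servers: Sequence[str],
    check: Callable[[str, str], bool],
) -> Matrix:
    """Run check(client, server) for every pairing and record pass/fail."""
    matrix: Matrix = {}
    for client, server in product(clients, servers):
        try:
            matrix[(client, server)] = bool(check(client, server))
        except Exception:
            # A crashing pairing counts as incompatible, not as a harness error.
            matrix[(client, server)] = False
    return matrix


def render(matrix: Matrix, clients: Sequence[str], servers: Sequence[str]) -> str:
    """Format the matrix as plain text: one row per client, one column per server."""
    names = list(clients) + list(servers)
    width = max(6, max(len(name) for name in names) + 2)
    rows = [" " * width + "".join(s.ljust(width) for s in servers)]
    for c in clients:
        cells = "".join(
            ("ok" if matrix[(c, s)] else "FAIL").ljust(width) for s in servers
        )
        rows.append(c.ljust(width) + cells)
    return "\n".join(rows)
```

Because the harness only needs a way to launch each client and server, it can live in the meta repository while each library keeps its own build system and unit tests.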

Finally, code has a tendency to linger. The older and larger the codebase,
the harder it gets to streamline and modularize it. What I'm suggesting is
not unlike the principles you would apply to any other large piece of code.

Diwaker



-- 
http://maginatics.com