You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@thrift.apache.org by Kevin Ballard <ke...@rapleaf.com> on 2008/05/27 00:14:09 UTC
Re-architecting ruby libraries
Hi folks. My name is Kevin Ballard and I'm an intern at Rapleaf. Among
other things, I have been tasked with "rubifying" the thrift ruby
libraries. I took a look at them, and they look like they were written
by a Java developer. Rubifying the existing libraries would be very
difficult.
What I want to do is completely rip out the old libraries and re-write
them in the Ruby way. This should clean up the API and make things a
lot easier for ruby developers to understand. The biggest issue here
is, of course, that the API will be changing, and changing in ways
that means that non-ruby developers won't be able to easily draw
parallels between their code and the ruby code. I view this as a
necessary sacrifice. I believe it should be possible to write a
'thrift/old' library that would wrap the new libraries in the old API
for existing code, but I also think that creating a new API now is
very important to do before the old one is so entrenched that we can't
change it.
I also want to fully spec out this new code and package it up as a
gem, so anybody who wants to write a client can simply `gem install
thrift` and be up and running.
Are there any objections, suggestions, or questions here? If not, how
should I go about working? Is there a git repository somewhere? Should
I maybe just create a new git repo on github.com for just the ruby
libraries and work there, where everyone can see, and then submit it
back for inclusion in the main project when it's done? Or are there
other suggestions?
-Kevin Ballard
--
Kevin Ballard
kevin@rapleaf.com
Re: Re-architecting ruby libraries
Posted by Kevin Ballard <ke...@rapleaf.com>.
On May 26, 2008, at 4:30 PM, Bryan Duxbury wrote:
>>> I'm also seeing attempts to mimic Interfaces/virtual classes,
>>> which is
>>> just plain ugly (see TProtocol where every method returns nil).
>> As long as there is a nice way of documenting what methods are
>> expected
>> to be implemented in the concrete subclasses.
> I would be happy with the "abstract" methods raising exceptions when
> they're called. I think their definitions should still live though,
> and perhaps with a little RDoc to actually explain their purposes.
That does have the benefit of showing the method listing in rdoc, but
as long as we bundle at least one concrete implementation of each
"abstract" class, that same goal will be achieved just by documenting
the methods on the concrete implementation. I could go either way on
this one (though I would prefer not to have the dummy methods),
because the most important aspect here is that unimplemented methods
should raise an exception.
-Kevin
--
Kevin Ballard
kevin@rapleaf.com
Re: Re-architecting ruby libraries
Posted by Bryan Duxbury <br...@rapleaf.com>.
On May 26, 2008, at 4:22 PM, David Reiss wrote:
>>
>> The
>> method names are not idiomatic to ruby - I'm seeing things like
>> readStructEnd instead of read_struct_end and isOpen instead of open?.
> I'm fine with changing these, but I am super biased against camelCase.
> What do others think?
I think we should switch to ruby_case. It's not a disastrously large
change in any case.
>> I'm also seeing attempts to mimic Interfaces/virtual classes,
>> which is
>> just plain ugly (see TProtocol where every method returns nil).
> As long as there is a nice way of documenting what methods are
> expected
> to be implemented in the concrete subclasses.
I would be happy with the "abstract" methods raising exceptions when
they're called. I think their definitions should still live though,
and perhaps with a little RDoc to actually explain their purposes.
Re: Re-architecting ruby libraries
Posted by Kevin Ballard <ke...@rapleaf.com>.
On May 26, 2008, at 4:22 PM, David Reiss wrote:
>> In the current libraries, all these classes pollute the global
>> namespace. They should all be encapsulated under a Thrift module
> I am 100% in favor of this.
>
>> given better names (such as Thrift::Protocols::BinaryProtocol instead
>> TBinaryProtocol).
> This I'm not so sure about. We could do this in C++, Java, and Python
> too, but I don't think it is the right call. The cost of the extra
> "T"
> is tiny, and it makes it easier to do imports without bringing in
> names
> like "Protocol" and "Transport".
Do you mean if you try `include Thrift` in your code? I would argue
that such a practice is not appropriate for production code. However,
if this is something people are concerned about, it wouldn't be much
trouble to include a 'thrift/old' library which provides top-level
names like TBinaryProtocol that map to
Thrift::Protocols::BinaryProtocol. I would also point out the
wonderful EventMachine library, which is architected very well and it
uses EventMachine::Protocols::Foo as its namespacing. It works pretty
well.
>> The
>> method names are not idiomatic to ruby - I'm seeing things like
>> readStructEnd instead of read_struct_end and isOpen instead of open?.
> I'm fine with changing these, but I am super biased against camelCase.
> What do others think?
Idiomatic ruby code uses underscores instead of camelCasing. Look at
any system-provided ruby library and you'll see methods like
File.writable_real? instead if File.isWritableReal.
>> I'm also seeing attempts to mimic Interfaces/virtual classes, which
>> is
>> just plain ugly (see TProtocol where every method returns nil).
> As long as there is a nice way of documenting what methods are
> expected
> to be implemented in the concrete subclasses.
How about comments? They're just as good as methods that do nothing,
except they make no attempt at mimicing an interface. Those methods
should be commented anyway for rdoc purposes. This also has the
benefit of raising an exception if someone calls an unimplemented
method instead of simply doing nothing, as the current code would. We
could also provide a sample implementation that defines all the
methods, so users can look at that for what they're expected to do.
>>>> Is there a git repository somewhere?
>>> There is an unofficial repo at <git://thrift-git.thruhere.net/thrift.git
>>> >.
>>> A lot of non-committers have been using it to develop and share
>>> experimental branches. I can give you an account so you can push
>>> branches if you send me your ssh public key. If you prefer
>>> github, it
>>> is mirrored at <http://github.com/dreiss/thrift/tree/master>, but
>>> I've
>>> been encouraging developers to also push their changes to thruhere
>>> so
>>> everything will be in one place.
>>
>> How frequently is that mirror updated?
> Every 5 minutes.
But I assume it's meant to be read-only? I'm thinking maybe I should
fork that, then simply regularly push back to thruhere so, as you
said, it's all together.
-Kevin
--
Kevin Ballard
kevin@rapleaf.com
Re: Re-architecting ruby libraries
Posted by David Reiss <dr...@facebook.com>.
> In the current libraries, all these classes pollute the global
> namespace. They should all be encapsulated under a Thrift module
I am 100% in favor of this.
> given better names (such as Thrift::Protocols::BinaryProtocol instead
> TBinaryProtocol).
This I'm not so sure about. We could do this in C++, Java, and Python
too, but I don't think it is the right call. The cost of the extra "T"
is tiny, and it makes it easier to do imports without bringing in names
like "Protocol" and "Transport".
> The 'thrift/thrift' library defines a bunch of classes
> (again in the global namespace) which should all be broken up into
> separate files and placed under the Thrift module
Sounds good.
> and renamed.
See above.
> The
> method names are not idiomatic to ruby - I'm seeing things like
> readStructEnd instead of read_struct_end and isOpen instead of open?.
I'm fine with changing these, but I am super biased against camelCase.
What do others think?
> I'm also seeing attempts to mimic Interfaces/virtual classes, which is
> just plain ugly (see TProtocol where every method returns nil).
As long as there is a nice way of documenting what methods are expected
to be implemented in the concrete subclasses.
>>> Is there a git repository somewhere?
>> There is an unofficial repo at <git://thrift-git.thruhere.net/thrift.git>.
>> A lot of non-committers have been using it to develop and share
>> experimental branches. I can give you an account so you can push
>> branches if you send me your ssh public key. If you prefer github, it
>> is mirrored at <http://github.com/dreiss/thrift/tree/master>, but I've
>> been encouraging developers to also push their changes to thruhere so
>> everything will be in one place.
>
> How frequently is that mirror updated?
Every 5 minutes.
--David
Re: Re-architecting ruby libraries
Posted by Kevin Ballard <ke...@rapleaf.com>.
On May 26, 2008, at 3:53 PM, David Reiss wrote:
>> The biggest issue here
>> is, of course, that the API will be changing, and changing in ways
>> that means that non-ruby developers won't be able to easily draw
>> parallels between their code and the ruby code. I view this as a
>> necessary sacrifice.
> Why is it necessary to break parallels so completely in order to make
> the API more Rubyish? We've had some changes suggested to the Python
> API to, for example, use named parameters and callables (instead of
> factories), but the basic structure of the library (and the
> interfaces)
> would remain the same.
Looking at the ruby library, it just screams "Java", and as a ruby
programmer it feels very dirty. For example, TBufferedTransport uses
some kind of odd inheritance/delegation pattern that is very clearly
compensating for single-inheritance. In ruby, mixins are a much
cleaner solution. In the current libraries, all these classes pollute
the global namespace. They should all be encapsulated under a Thrift
module and be given better names (such as
Thrift::Protocols::BinaryProtocol instead TBinaryProtocol). The
'thrift/thrift' library defines a bunch of classes (again in the
global namespace) which should all be broken up into separate files
and placed under the Thrift module and renamed. The method names are
not idiomatic to ruby - I'm seeing things like readStructEnd instead
of read_struct_end and isOpen instead of open?. I'm also seeing
attempts to mimic Interfaces/virtual classes, which is just plain ugly
(see TProtocol where every method returns nil). It's stuff like this
that I want to change, and doing all of this will be a fairly large
API change.
>> Are there any objections, suggestions, or questions here?
> I would suggest that you sync up early with Kevin Clark, the de facto
> owner of the current Ruby mapping.
I saw he had a fork on github, but the last commit was 1.5 months ago.
I suppose I'll have to ask him about that.
>> Is there a git repository somewhere?
> There is an unofficial repo at <git://thrift-git.thruhere.net/thrift.git
> >.
> A lot of non-committers have been using it to develop and share
> experimental branches. I can give you an account so you can push
> branches if you send me your ssh public key. If you prefer github, it
> is mirrored at <http://github.com/dreiss/thrift/tree/master>, but I've
> been encouraging developers to also push their changes to thruhere so
> everything will be in one place.
How frequently is that mirror updated?
-Kevin Ballard
--
Kevin Ballard
kevin@rapleaf.com
Re: Re-architecting ruby libraries
Posted by David Reiss <dr...@facebook.com>.
> The biggest issue here
> is, of course, that the API will be changing, and changing in ways
> that means that non-ruby developers won't be able to easily draw
> parallels between their code and the ruby code. I view this as a
> necessary sacrifice.
Why is it necessary to break parallels so completely in order to make
the API more Rubyish? We've had some changes suggested to the Python
API to, for example, use named parameters and callables (instead of
factories), but the basic structure of the library (and the interfaces)
would remain the same.
> Are there any objections, suggestions, or questions here?
I would suggest that you sync up early with Kevin Clark, the de facto
owner of the current Ruby mapping.
> Is there a git repository somewhere?
There is an unofficial repo at <git://thrift-git.thruhere.net/thrift.git>.
A lot of non-committers have been using it to develop and share
experimental branches. I can give you an account so you can push
branches if you send me your ssh public key. If you prefer github, it
is mirrored at <http://github.com/dreiss/thrift/tree/master>, but I've
been encouraging developers to also push their changes to thruhere so
everything will be in one place.
--David
Fwd: Re-architecting ruby libraries
Posted by Kevin Clark <ke...@gmail.com>.
Should have gone to the list...
---------- Forwarded message ----------
From: Kevin Clark <ke...@gmail.com>
Date: Tue, May 27, 2008 at 10:17 AM
Subject: Re: Re-architecting ruby libraries
To: Kevin Ballard <ke...@rapleaf.com>
On Tue, May 27, 2008 at 10:11 AM, Kevin Ballard <ke...@rapleaf.com> wrote:
> ruby_styling or ruby_lib_namespacing? The latter appears to be ahead of the
> former by a few commits.
They're related. I'd look at both.
>> Please _do_ submit tests. Please _do_ submit good rdoc. Please _do_
>> feel free to toss patches onto JIRA improving the state of the Ruby
>> bindings, but I'm unlikely to support a rewrite from scratch. There's
>> too many people running this code to simply ignore them. This _can_ be
>> done in a backwards compatible way, as is being done in the
>> ruby_styling branch.
>
> Ok, I will try to do it this way. But it seems that to not break anything,
> you can't change any API. How are you defining not breaking anything?
Look at the deprecate! method I wrote. It's still in flux, but I think
it's the right way to go about it. We spew a warning message about it
being deprecated, and delegates to the new api. We can provide the old
method names while moving to the new that way. Class renamings/splits
can happen with constant remapping a la:
module Foo
class Bar
end
end
TBar = Foo::Bar
--
Kevin Clark
http://glu.ttono.us
--
Kevin Clark
http://glu.ttono.us
Re: Re-architecting ruby libraries
Posted by Kevin Ballard <ke...@rapleaf.com>.
On May 27, 2008, at 8:30 AM, Kevin Clark wrote:
> Hi Kevin.
>
> As someone who's done large rewrites of the ruby libs a couple times
> (what's there still mostly isn't my code), I've got to strongly push
> that you don't rewrite this from scratch. My github mirror isn't up to
> date, but you should look at the ruby_styling branch here:
>
> http://github.com/kevinclark/thrift/commits/ruby_styling
>
> In it, I'm doing most of what you've described in a backwards
> compatible way. Namespaces are being added in, proper tests are being
> written, the camel case is being pulled out, but all if it is
> happening in a way that won't break existing code.
ruby_styling or ruby_lib_namespacing? The latter appears to be ahead
of the former by a few commits.
> I know David doesn't care about breaking the Ruby bindings, but I do.
> We have lots of infrastructure running on top of it, and while I do
> _desperately_ want this code to feel like a Ruby library (I agree it
> does not), I'm not willing to break things in the process. I think we
> can have the best of both worlds.
>
> Please _do_ submit tests. Please _do_ submit good rdoc. Please _do_
> feel free to toss patches onto JIRA improving the state of the Ruby
> bindings, but I'm unlikely to support a rewrite from scratch. There's
> too many people running this code to simply ignore them. This _can_ be
> done in a backwards compatible way, as is being done in the
> ruby_styling branch.
Ok, I will try to do it this way. But it seems that to not break
anything, you can't change any API. How are you defining not breaking
anything?
-Kevin
--
Kevin Ballard
kevin@rapleaf.com
Re: Re-architecting ruby libraries
Posted by Kevin Clark <ke...@gmail.com>.
Hi Kevin.
As someone who's done large rewrites of the ruby libs a couple times
(what's there still mostly isn't my code), I've got to strongly push
that you don't rewrite this from scratch. My github mirror isn't up to
date, but you should look at the ruby_styling branch here:
http://github.com/kevinclark/thrift/commits/ruby_styling
In it, I'm doing most of what you've described in a backwards
compatible way. Namespaces are being added in, proper tests are being
written, the camel case is being pulled out, but all if it is
happening in a way that won't break existing code.
I know David doesn't care about breaking the Ruby bindings, but I do.
We have lots of infrastructure running on top of it, and while I do
_desperately_ want this code to feel like a Ruby library (I agree it
does not), I'm not willing to break things in the process. I think we
can have the best of both worlds.
Please _do_ submit tests. Please _do_ submit good rdoc. Please _do_
feel free to toss patches onto JIRA improving the state of the Ruby
bindings, but I'm unlikely to support a rewrite from scratch. There's
too many people running this code to simply ignore them. This _can_ be
done in a backwards compatible way, as is being done in the
ruby_styling branch.
I'm just down the block if you have need to talk in person.
--
Kevin Clark
http://glu.ttono.us
Re: Re-architecting ruby libraries
Posted by Kevin Clark <ke...@gmail.com>.
On Tue, May 27, 2008 at 5:17 AM, Ben Taitelbaum <bt...@cs.oberlin.edu> wrote:
> I work in a ruby shop with thrift, and I agree that it would be nice to
> follow more ruby idioms with the thrift libraries, especially with rdoc and
> rspec, and with a gem for installing. The naming conventions haven't been a
> problem for us, and we kind of enjoy the fact that it would be easy for us
> to start developing thrift services in c++ or java since we already know the
> API.
Yeah, I've considered the advantages of a unified-ish api. I think it
might make sense to maintain the T on class names. I don't like it,
but it's nothing if not consistent with the other bindings.
CamelCasing has to go though ;)
> The one HUGE request we have for the rewrite is to do some type-checking, or
> allow for type checking to be turned on/off. Currently when a service method
> returns an incorrect type, or throws an undeclared exception, we get a
> TTransportException with no explanation. Even worse, in some instances when
> a method returns a string when an array was expected, the call just hangs.
This might make sense. Please file a JIRA.
https://issues.apache.org/jira/browse/THRIFT
--
Kevin Clark
http://glu.ttono.us
Re: Re-architecting ruby libraries
Posted by Ben Taitelbaum <bt...@cs.oberlin.edu>.
I work in a ruby shop with thrift, and I agree that it would be nice
to follow more ruby idioms with the thrift libraries, especially with
rdoc and rspec, and with a gem for installing. The naming conventions
haven't been a problem for us, and we kind of enjoy the fact that it
would be easy for us to start developing thrift services in c++ or
java since we already know the API.
The one HUGE request we have for the rewrite is to do some type-
checking, or allow for type checking to be turned on/off. Currently
when a service method returns an incorrect type, or throws an
undeclared exception, we get a TTransportException with no
explanation. Even worse, in some instances when a method returns a
string when an array was expected, the call just hangs.
Thanks,
Ben Taitelbaum
On May 26, 2008, at 6:14 PM, Kevin Ballard wrote:
> Hi folks. My name is Kevin Ballard and I'm an intern at Rapleaf.
> Among other things, I have been tasked with "rubifying" the thrift
> ruby libraries. I took a look at them, and they look like they were
> written by a Java developer. Rubifying the existing libraries would
> be very difficult.
>
> What I want to do is completely rip out the old libraries and re-
> write them in the Ruby way. This should clean up the API and make
> things a lot easier for ruby developers to understand. The biggest
> issue here is, of course, that the API will be changing, and
> changing in ways that means that non-ruby developers won't be able
> to easily draw parallels between their code and the ruby code. I
> view this as a necessary sacrifice. I believe it should be possible
> to write a 'thrift/old' library that would wrap the new libraries in
> the old API for existing code, but I also think that creating a new
> API now is very important to do before the old one is so entrenched
> that we can't change it.
>
> I also want to fully spec out this new code and package it up as a
> gem, so anybody who wants to write a client can simply `gem install
> thrift` and be up and running.
>
> Are there any objections, suggestions, or questions here? If not,
> how should I go about working? Is there a git repository somewhere?
> Should I maybe just create a new git repo on github.com for just the
> ruby libraries and work there, where everyone can see, and then
> submit it back for inclusion in the main project when it's done? Or
> are there other suggestions?
>
> -Kevin Ballard
>
> --
> Kevin Ballard
> kevin@rapleaf.com
>