You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@thrift.apache.org by Kevin Ballard <ke...@rapleaf.com> on 2008/05/27 00:14:09 UTC

Re-architecting ruby libraries

Hi folks. My name is Kevin Ballard and I'm an intern at Rapleaf. Among  
other things, I have been tasked with "rubifying" the thrift ruby  
libraries. I took a look at them, and they look like they were written  
by a Java developer. Rubifying the existing libraries would be very  
difficult.

What I want to do is completely rip out the old libraries and re-write  
them in the Ruby way. This should clean up the API and make things a  
lot easier for ruby developers to understand. The biggest issue here  
is, of course, that the API will be changing, and changing in ways  
that means that non-ruby developers won't be able to easily draw  
parallels between their code and the ruby code. I view this as a  
necessary sacrifice. I believe it should be possible to write a  
'thrift/old' library that would wrap the new libraries in the old API  
for existing code, but I also think that creating a new API now is  
very important to do before the old one is so entrenched that we can't  
change it.

I also want to fully spec out this new code and package it up as a  
gem, so anybody who wants to write a client can simply `gem install  
thrift` and be up and running.

Are there any objections, suggestions, or questions here? If not, how  
should I go about working? Is there a git repository somewhere? Should  
I maybe just create a new git repo on github.com for just the ruby  
libraries and work there, where everyone can see, and then submit it  
back for inclusion in the main project when it's done? Or are there  
other suggestions?

-Kevin Ballard

-- 
Kevin Ballard
kevin@rapleaf.com


Re: Re-architecting ruby libraries

Posted by Kevin Ballard <ke...@rapleaf.com>.
On May 26, 2008, at 4:30 PM, Bryan Duxbury wrote:

>>> I'm also seeing attempts to mimic Interfaces/virtual classes,  
>>> which is
>>> just plain ugly (see TProtocol where every method returns nil).
>> As long as there is a nice way of documenting what methods are  
>> expected
>> to be implemented in the concrete subclasses.
> I would be happy with the "abstract" methods raising exceptions when  
> they're called. I think their definitions should still live though,  
> and perhaps with a little RDoc to actually explain their purposes.

That does have the benefit of showing the method listing in rdoc, but  
as long as we bundle at least one concrete implementation of each  
"abstract" class, that same goal will be achieved just by documenting  
the methods on the concrete implementation. I could go either way on  
this one (though I would prefer not to have the dummy methods),  
because the most important aspect here is that unimplemented methods  
should raise an exception.

-Kevin

-- 
Kevin Ballard
kevin@rapleaf.com




Re: Re-architecting ruby libraries

Posted by Bryan Duxbury <br...@rapleaf.com>.
On May 26, 2008, at 4:22 PM, David Reiss wrote:

>>
>> The
>> method names are not idiomatic to ruby - I'm seeing things like
>> readStructEnd instead of read_struct_end and isOpen instead of open?.
> I'm fine with changing these, but I am super biased against camelCase.
> What do others think?

I think we should switch to ruby_case. It's not a disastrously large  
change in any case.

>> I'm also seeing attempts to mimic Interfaces/virtual classes,  
>> which is
>> just plain ugly (see TProtocol where every method returns nil).
> As long as there is a nice way of documenting what methods are  
> expected
> to be implemented in the concrete subclasses.
I would be happy with the "abstract" methods raising exceptions when  
they're called. I think their definitions should still live though,  
and perhaps with a little RDoc to actually explain their purposes.


Re: Re-architecting ruby libraries

Posted by Kevin Ballard <ke...@rapleaf.com>.
On May 26, 2008, at 4:22 PM, David Reiss wrote:

>> In the current libraries, all these classes pollute the global
>> namespace. They should all be encapsulated under a Thrift module
> I am 100% in favor of this.
>
>> given better names (such as Thrift::Protocols::BinaryProtocol instead
>> TBinaryProtocol).
> This I'm not so sure about.  We could do this in C++, Java, and Python
> too, but I don't think it is the right call.  The cost of the extra  
> "T"
> is tiny, and it makes it easier to do imports without bringing in  
> names
> like "Protocol" and "Transport".

Do you mean if you try `include Thrift` in your code? I would argue  
that such a practice is not appropriate for production code. However,  
if this is something people are concerned about, it wouldn't be much  
trouble to include a 'thrift/old' library which provides top-level  
names like TBinaryProtocol that map to  
Thrift::Protocols::BinaryProtocol. I would also point out the  
wonderful EventMachine library, which is architected very well and it  
uses EventMachine::Protocols::Foo as its namespacing. It works pretty  
well.

>> The
>> method names are not idiomatic to ruby - I'm seeing things like
>> readStructEnd instead of read_struct_end and isOpen instead of open?.
> I'm fine with changing these, but I am super biased against camelCase.
> What do others think?

Idiomatic ruby code uses underscores instead of camelCasing. Look at  
any system-provided ruby library and you'll see methods like  
File.writable_real? instead if File.isWritableReal.

>> I'm also seeing attempts to mimic Interfaces/virtual classes, which  
>> is
>> just plain ugly (see TProtocol where every method returns nil).
> As long as there is a nice way of documenting what methods are  
> expected
> to be implemented in the concrete subclasses.

How about comments? They're just as good as methods that do nothing,  
except they make no attempt at mimicing an interface. Those methods  
should be commented anyway for rdoc purposes. This also has the  
benefit of raising an exception if someone calls an unimplemented  
method instead of simply doing nothing, as the current code would. We  
could also provide a sample implementation that defines all the  
methods, so users can look at that for what they're expected to do.

>>>> Is there a git repository somewhere?
>>> There is an unofficial repo at <git://thrift-git.thruhere.net/thrift.git 
>>> >.
>>> A lot of non-committers have been using it to develop and share
>>> experimental branches.  I can give you an account so you can push
>>> branches if you send me your ssh public key.  If you prefer  
>>> github, it
>>> is mirrored at <http://github.com/dreiss/thrift/tree/master>, but  
>>> I've
>>> been encouraging developers to also push their changes to thruhere  
>>> so
>>> everything will be in one place.
>>
>> How frequently is that mirror updated?
> Every 5 minutes.

But I assume it's meant to be read-only? I'm thinking maybe I should  
fork that, then simply regularly push back to thruhere so, as you  
said, it's all together.

-Kevin

-- 
Kevin Ballard
kevin@rapleaf.com




Re: Re-architecting ruby libraries

Posted by David Reiss <dr...@facebook.com>.
> In the current libraries, all these classes pollute the global
> namespace. They should all be encapsulated under a Thrift module
I am 100% in favor of this.

> given better names (such as Thrift::Protocols::BinaryProtocol instead
> TBinaryProtocol).
This I'm not so sure about.  We could do this in C++, Java, and Python
too, but I don't think it is the right call.  The cost of the extra "T"
is tiny, and it makes it easier to do imports without bringing in names
like "Protocol" and "Transport".

> The 'thrift/thrift' library defines a bunch of classes
> (again in the global namespace) which should all be broken up into
> separate files and placed under the Thrift module
Sounds good.

> and renamed.
See above.

> The
> method names are not idiomatic to ruby - I'm seeing things like
> readStructEnd instead of read_struct_end and isOpen instead of open?.
I'm fine with changing these, but I am super biased against camelCase.
What do others think?

> I'm also seeing attempts to mimic Interfaces/virtual classes, which is
> just plain ugly (see TProtocol where every method returns nil).
As long as there is a nice way of documenting what methods are expected
to be implemented in the concrete subclasses.

>>> Is there a git repository somewhere?
>> There is an unofficial repo at <git://thrift-git.thruhere.net/thrift.git>.
>> A lot of non-committers have been using it to develop and share
>> experimental branches.  I can give you an account so you can push
>> branches if you send me your ssh public key.  If you prefer github, it
>> is mirrored at <http://github.com/dreiss/thrift/tree/master>, but I've
>> been encouraging developers to also push their changes to thruhere so
>> everything will be in one place.
> 
> How frequently is that mirror updated?
Every 5 minutes.

--David

Re: Re-architecting ruby libraries

Posted by Kevin Ballard <ke...@rapleaf.com>.
On May 26, 2008, at 3:53 PM, David Reiss wrote:

>> The biggest issue here
>> is, of course, that the API will be changing, and changing in ways
>> that means that non-ruby developers won't be able to easily draw
>> parallels between their code and the ruby code. I view this as a
>> necessary sacrifice.
> Why is it necessary to break parallels so completely in order to make
> the API more Rubyish?  We've had some changes suggested to the Python
> API to, for example, use named parameters and callables (instead of
> factories), but the basic structure of the library (and the  
> interfaces)
> would remain the same.

Looking at the ruby library, it just screams "Java", and as a ruby  
programmer it feels very dirty. For example, TBufferedTransport uses  
some kind of odd inheritance/delegation pattern that is very clearly  
compensating for single-inheritance. In ruby, mixins are a much  
cleaner solution. In the current libraries, all these classes pollute  
the global namespace. They should all be encapsulated under a Thrift  
module and be given better names (such as  
Thrift::Protocols::BinaryProtocol instead TBinaryProtocol). The  
'thrift/thrift' library defines a bunch of classes (again in the  
global namespace) which should all be broken up into separate files  
and placed under the Thrift module and renamed. The method names are  
not idiomatic to ruby - I'm seeing things like readStructEnd instead  
of read_struct_end and isOpen instead of open?. I'm also seeing  
attempts to mimic Interfaces/virtual classes, which is just plain ugly  
(see TProtocol where every method returns nil). It's stuff like this  
that I want to change, and doing all of this will be a fairly large  
API change.

>> Are there any objections, suggestions, or questions here?
> I would suggest that you sync up early with Kevin Clark, the de facto
> owner of the current Ruby mapping.

I saw he had a fork on github, but the last commit was 1.5 months ago.  
I suppose I'll have to ask him about that.

>> Is there a git repository somewhere?
> There is an unofficial repo at <git://thrift-git.thruhere.net/thrift.git 
> >.
> A lot of non-committers have been using it to develop and share
> experimental branches.  I can give you an account so you can push
> branches if you send me your ssh public key.  If you prefer github, it
> is mirrored at <http://github.com/dreiss/thrift/tree/master>, but I've
> been encouraging developers to also push their changes to thruhere so
> everything will be in one place.

How frequently is that mirror updated?

-Kevin Ballard

-- 
Kevin Ballard
kevin@rapleaf.com




Re: Re-architecting ruby libraries

Posted by David Reiss <dr...@facebook.com>.
> The biggest issue here 
> is, of course, that the API will be changing, and changing in ways 
> that means that non-ruby developers won't be able to easily draw 
> parallels between their code and the ruby code. I view this as a 
> necessary sacrifice.
Why is it necessary to break parallels so completely in order to make
the API more Rubyish?  We've had some changes suggested to the Python
API to, for example, use named parameters and callables (instead of
factories), but the basic structure of the library (and the interfaces)
would remain the same.

> Are there any objections, suggestions, or questions here?
I would suggest that you sync up early with Kevin Clark, the de facto
owner of the current Ruby mapping.

> Is there a git repository somewhere?
There is an unofficial repo at <git://thrift-git.thruhere.net/thrift.git>.
A lot of non-committers have been using it to develop and share
experimental branches.  I can give you an account so you can push
branches if you send me your ssh public key.  If you prefer github, it
is mirrored at <http://github.com/dreiss/thrift/tree/master>, but I've
been encouraging developers to also push their changes to thruhere so
everything will be in one place.

--David

Fwd: Re-architecting ruby libraries

Posted by Kevin Clark <ke...@gmail.com>.
Should have gone to the list...


---------- Forwarded message ----------
From: Kevin Clark <ke...@gmail.com>
Date: Tue, May 27, 2008 at 10:17 AM
Subject: Re: Re-architecting ruby libraries
To: Kevin Ballard <ke...@rapleaf.com>


On Tue, May 27, 2008 at 10:11 AM, Kevin Ballard <ke...@rapleaf.com> wrote:
> ruby_styling or ruby_lib_namespacing? The latter appears to be ahead of the
> former by a few commits.

They're related. I'd look at both.


>> Please _do_ submit tests. Please _do_ submit good rdoc. Please _do_
>> feel free to toss patches onto JIRA improving the state of the Ruby
>> bindings, but I'm unlikely to support a rewrite from scratch. There's
>> too many people running this code to simply ignore them. This _can_ be
>> done in a backwards compatible way, as is being done in the
>> ruby_styling branch.
>
> Ok, I will try to do it this way. But it seems that to not break anything,
> you can't change any API. How are you defining not breaking anything?

Look at the deprecate! method I wrote. It's still in flux, but I think
it's the right way to go about it. We spew a warning message about it
being deprecated, and delegates to the new api. We can provide the old
method names while moving to the new that way. Class renamings/splits
can happen with constant remapping a la:

module Foo
 class Bar
 end
end

TBar = Foo::Bar



--
Kevin Clark
http://glu.ttono.us



-- 
Kevin Clark
http://glu.ttono.us

Re: Re-architecting ruby libraries

Posted by Kevin Ballard <ke...@rapleaf.com>.
On May 27, 2008, at 8:30 AM, Kevin Clark wrote:

> Hi Kevin.
>
> As someone who's done large rewrites of the ruby libs a couple times
> (what's there still mostly isn't my code), I've got to strongly push
> that you don't rewrite this from scratch. My github mirror isn't up to
> date, but you should look at the ruby_styling branch here:
>
> http://github.com/kevinclark/thrift/commits/ruby_styling
>
> In it, I'm doing most of what you've described in a backwards
> compatible way. Namespaces are being added in, proper tests are being
> written, the camel case is being pulled out, but all if it is
> happening in a way that won't break existing code.

ruby_styling or ruby_lib_namespacing? The latter appears to be ahead  
of the former by a few commits.

> I know David doesn't care about breaking the Ruby bindings, but I do.
> We have lots of infrastructure running on top of it, and while I do
> _desperately_ want this code to feel like a Ruby library (I agree it
> does not), I'm not willing to break things in the process. I think we
> can have the best of both worlds.
>
> Please _do_ submit tests. Please _do_ submit good rdoc. Please _do_
> feel free to toss patches onto JIRA improving the state of the Ruby
> bindings, but I'm unlikely to support a rewrite from scratch. There's
> too many people running this code to simply ignore them. This _can_ be
> done in a backwards compatible way, as is being done in the
> ruby_styling branch.

Ok, I will try to do it this way. But it seems that to not break  
anything, you can't change any API. How are you defining not breaking  
anything?

-Kevin

-- 
Kevin Ballard
kevin@rapleaf.com




Re: Re-architecting ruby libraries

Posted by Kevin Clark <ke...@gmail.com>.
Hi Kevin.

As someone who's done large rewrites of the ruby libs a couple times
(what's there still mostly isn't my code), I've got to strongly push
that you don't rewrite this from scratch. My github mirror isn't up to
date, but you should look at the ruby_styling branch here:

http://github.com/kevinclark/thrift/commits/ruby_styling

In it, I'm doing most of what you've described in a backwards
compatible way. Namespaces are being added in, proper tests are being
written, the camel case is being pulled out, but all if it is
happening in a way that won't break existing code.

I know David doesn't care about breaking the Ruby bindings, but I do.
We have lots of infrastructure running on top of it, and while I do
_desperately_ want this code to feel like a Ruby library (I agree it
does not), I'm not willing to break things in the process. I think we
can have the best of both worlds.

Please _do_ submit tests. Please _do_ submit good rdoc. Please _do_
feel free to toss patches onto JIRA improving the state of the Ruby
bindings, but I'm unlikely to support a rewrite from scratch. There's
too many people running this code to simply ignore them. This _can_ be
done in a backwards compatible way, as is being done in the
ruby_styling branch.

I'm just down the block if you have need to talk in person.


-- 
Kevin Clark
http://glu.ttono.us

Re: Re-architecting ruby libraries

Posted by Kevin Clark <ke...@gmail.com>.
On Tue, May 27, 2008 at 5:17 AM, Ben Taitelbaum <bt...@cs.oberlin.edu> wrote:
> I work in a ruby shop with thrift, and I agree that it would be nice to
> follow more ruby idioms with the thrift libraries, especially with rdoc and
> rspec, and with a gem for installing. The naming conventions haven't been a
> problem for us, and we kind of enjoy the fact that it would be easy for us
> to start developing thrift services in c++ or java since we already know the
> API.

Yeah, I've considered the advantages of a unified-ish api. I think it
might make sense to maintain the T on class names. I don't like it,
but it's nothing if not consistent with the other bindings.
CamelCasing has to go though ;)

> The one HUGE request we have for the rewrite is to do some type-checking, or
> allow for type checking to be turned on/off. Currently when a service method
> returns an incorrect type, or throws an undeclared exception, we get a
> TTransportException with no explanation. Even worse, in some instances when
> a method returns a string when an array was expected, the call just hangs.

This might make sense. Please file a JIRA.
https://issues.apache.org/jira/browse/THRIFT

-- 
Kevin Clark
http://glu.ttono.us

Re: Re-architecting ruby libraries

Posted by Ben Taitelbaum <bt...@cs.oberlin.edu>.
I work in a ruby shop with thrift, and I agree that it would be nice  
to follow more ruby idioms with the thrift libraries, especially with  
rdoc and rspec, and with a gem for installing. The naming conventions  
haven't been a problem for us, and we kind of enjoy the fact that it  
would be easy for us to start developing thrift services in c++ or  
java since we already know the API.

The one HUGE request we have for the rewrite is to do some type- 
checking, or allow for type checking to be turned on/off. Currently  
when a service method returns an incorrect type, or throws an  
undeclared exception, we get a TTransportException with no  
explanation. Even worse, in some instances when a method returns a  
string when an array was expected, the call just hangs.

Thanks,
Ben Taitelbaum

On May 26, 2008, at 6:14 PM, Kevin Ballard wrote:

> Hi folks. My name is Kevin Ballard and I'm an intern at Rapleaf.  
> Among other things, I have been tasked with "rubifying" the thrift  
> ruby libraries. I took a look at them, and they look like they were  
> written by a Java developer. Rubifying the existing libraries would  
> be very difficult.
>
> What I want to do is completely rip out the old libraries and re- 
> write them in the Ruby way. This should clean up the API and make  
> things a lot easier for ruby developers to understand. The biggest  
> issue here is, of course, that the API will be changing, and  
> changing in ways that means that non-ruby developers won't be able  
> to easily draw parallels between their code and the ruby code. I  
> view this as a necessary sacrifice. I believe it should be possible  
> to write a 'thrift/old' library that would wrap the new libraries in  
> the old API for existing code, but I also think that creating a new  
> API now is very important to do before the old one is so entrenched  
> that we can't change it.
>
> I also want to fully spec out this new code and package it up as a  
> gem, so anybody who wants to write a client can simply `gem install  
> thrift` and be up and running.
>
> Are there any objections, suggestions, or questions here? If not,  
> how should I go about working? Is there a git repository somewhere?  
> Should I maybe just create a new git repo on github.com for just the  
> ruby libraries and work there, where everyone can see, and then  
> submit it back for inclusion in the main project when it's done? Or  
> are there other suggestions?
>
> -Kevin Ballard
>
> -- 
> Kevin Ballard
> kevin@rapleaf.com
>