You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@thrift.apache.org by Kevin Ballard <ke...@rapleaf.com> on 2008/06/03 03:03:37 UTC

Full specs for ruby libraries

I just finished speccing out the ruby Thrift libraries. It's available  
at

http://github.com/kballard/thrift

Do note that this contains Kevin Clark's work on ruby lib namespacing,  
taken to completion by me (and yes, this is reflected in the specs).

Also, as of this writing there are a couple failing specs, because I  
included specs for a couple open issues on JIRA as well as one fixed  
in Kevin Clark's repo.

Any comments/suggestions?

-Kevin Ballard

-- 
Kevin Ballard
kevin@rapleaf.com




Re: Full specs for ruby libraries

Posted by Kevin Clark <ke...@gmail.com>.
All that sounds fine. I'm happy to commit as soon as the source is
into Apache's svn.

-- 
Kevin Clark
http://glu.ttono.us

Re: Full specs for ruby libraries

Posted by Kevin Ballard <ke...@rapleaf.com>.
On Jun 4, 2008, at 11:22 AM, Kevin Clark wrote:

> On Mon, Jun 2, 2008 at 6:03 PM, Kevin Ballard <ke...@rapleaf.com>  
> wrote:
>> Any comments/suggestions?
>
> First off, great work. This is a major improvement to the libraries.
> Here's my notes from reviewing the commits (through 713371df7f):

Thank you. I've responded below.

> For those playing along, you can view these by going to
> http://github.com/kballard/thrift/commits/COMMIT
>
> 7a1b95c8d886
> returns (lines 50, 31):
> The first item in the array (field name) isn't actually used/ 
> supplied currently
> May want to do the same in the tests for consistency.
> * I'm not really sure why it's returned at all. Legacy?

I put them in the tests because I figure as long as the code ignores  
them, everything is fine, but if someone fixes the libraries to start  
caring, the tests shouldn't break. I assume it's in there (but this is  
just a guess) because it makes inspecting the network traffic by hand  
a lot easier to debug, but since the binary protocol doesn't send the  
names (and that's the only concrete protocol in the ruby libraries),  
it hardly matters.

> c7101e6334f3b
> Consequences of including Thrift in the example group?

AFAIK there should be none. You'll notice all of my specs are  
structured this way, creating a concrete example group and including  
Thrift, simply to make the testing code simpler (as it doesn't have to  
constantly namespace all the thrift classes). Note that in  
665d95c51fe6 I ended up changing the names of all these example groups  
to avoid spec leakage when running multiple specs at once, but the  
same idea is there.

> 6cda1b66c24
> On testing the remapped methods: Could we mock to expect the old
> method called by the new?
> Alternatively, if they're the same method (I don't think they are),
> it's possible method()
> could be used for equality.

No we can't. The way deprecate! works, it actually fetches the new  
method directly with instance_method, binds it to the current object,  
and calls it. I suppose I could test to ensure instance_method() is  
called with the new symbol. I'll look into doing that now.

> 665d95c51f
> What's going on here?

When I started using the concrete ExampleGroup classes, I didn't run  
`rake spec` until I'd written several of them (instead I ran the  
individual specs separately). Running all the specs at once revealed  
that having each ExampleGroup named ThriftSpec was, in fact, re- 
opening the same class over and over (which makes sense), re-including  
Thrift over and over, and actually leaking some information between  
specs. I forget exactly what actually leaked, but `rake spec` was  
broken before that change, and worked fine after that change.

> 1d897acba89
> Range errors on Bignum are ok but on integer aren't?

There's a few reasons I made this change. The biggest one is the code  
here assumes it's writing signed integers (which makes sense, all the  
libraries assume they're dealing with signed integers and return  
signed from read_foo). The range-checking code there was enforcing the  
signed-ness of the integers (although, looking back, I allowed  
unsigned bytes in write_byte, which differs from write_i16 and  
write_i32). This actually broke the binary protocol because it writes  
out an i32 with the high bit set when it writes its version  
(0x80010000), then coerces it back to unsigned when reading (read_i32  
| VERSION_MASK converts the return value of read_i32 to the equivalent  
unsigned value with the same bitfield). My two options were to either  
increase the range bounds of i16/i32 to allow writing unsigned  
integers, or remove the range checking. I opted to remove the range  
checking primarily because, looking at the Java and CPP libraries,  
there's no explicit range checking there, and I figured with that  
precedent I'd rather not slow down the binary protocol by adding extra  
tests to every single integer write. I mean, if the user writes a long  
as an i16, they shouldn't be the least surprised to find it gets  
clipped.

But there was nothing I could do about the Bignum range error  
(Array#pack raises that), so I simply encoded that behavior in the  
tests.

If there is a real desire to re-introduce range checking I can  
certainly do that, with the ranges expanded to include unsigned  
integers. Another option is to introduce read/write_ui* functions, but  
that would make the ruby libs different than all other libraries.

> 800706783f1
> I think this might have been correct. We don't want the server bailing
> out if something goes wrong.
> We should discuss.

The problem was it wasn't catching exceptions. The construct

   rescue Exception => e

is clearly designed to catch all exceptions, except it wasn't. I don't  
know what quirk of Ruby was preventing that from working, but in  
testing it would explicitly catch Exception objects but not e.g.  
StandardError objects. Using

   rescue => e

seems to catch them all. Granted, I didn't test, this could be  
catching all subclasses of StandardError rather than Exception, but  
the only reason a non-StandardError exception should be thrown is if  
the processor or handler triggers a syntax or load error, and I'm  
perfectly happy with the server blowing up on that.

> 210803a358
> I think switching sets to use Set breaks backwards compatability.
> There's also a bug we need to be aware of that allows arrays to be
> passed as sets (because .each doesn't care, and the value of the hash)
> isn't used.
>
> I think breaking backwards compat might be ok here, but we need to  
> discuss
> on the list.

This actually should continue to work with pre-generated code, the  
only observable difference in clients is when they call a method that  
returns a Thrift::SET, it'll return a Set instead of a hash. But I  
don't think that's too big of a change to ask clients to accommodate.  
I will note that passing a Set to a method, and having a hash-style  
set as the default value of a generated struct should actually  
continue to work (384064468d). Also, there's a decent chance that  
client code that handles the return value will still work anyway, for  
example

   client.returnSet("foo").each { |k,v| p k }

That will print the exact same thing regardless of whether a Set or a  
Hash is returned. I realize we can't assume all code (or even most  
code) handles Set return values this way, but my point is there's very  
little to change. Unfortunately there's no way to deprecate the old  
style, because the method doesn't have any way of knowing what it's  
expected to return.

> 384064468d
> Ah, you got it (the set thing). That might be ok then. May want to do
> further testing.

See above. If you have suggestions for any more tests you'd like to  
do, or any genius ideas about preserving backwards compatibility, I'm  
all ears. I just felt strongly enough that using {"a" => true, "b" =>  
true} as a set was too ugly.

> c83c437d7b38
> I'm trying to remember if get_buffer is for the fast binary  
> protocol. I think
> at this point it only uses borrow and consume. I'll double check.

If the fast binary protocol uses it, it needs to be changed.  
#get_buffer was preventing a pretty serious memory optimization  
(42f75ed74ad). Basically, the MemoryBuffer was holding on to every  
piece of data written to it, even though the only way to retrieve data  
once it had been read back out was with #get_buffer (which nobody in  
the ruby libraries used). This meant that if a single MemoryBuffer was  
used to transport 2MB of data, until that MemoryBuffer was finalized  
those 2MB would have effectively leaked. Without #get_buffer the  
MemoryBuffer could toss any data the moment it was read.

> 5f29875a24
> This is a good change. I think we should prefer testing with literals
> over method calls.

That was my thought.

> e27daba8897
> Wait what? A hash as the second arg to raise?

Yep. Thrift::Test::Xception is a generated struct from the  
ThriftTest.thrift file. It inherits from StandardError, but it also  
includes Thrift::Struct. The problem here is that while StandardError  
wants a message as its only arg, Thrift::Struct has other ideas. It re- 
defines the initializer to take a hash that's used to populate its  
fields. When you say

   raise Thrift::Test::Xception, 'error'

What you're really saying is

   raise Thrift::Test::Xception.new('error')

This didn't used to break because Thrift::Struct used the #[] method  
exclusively on its argument. What this meant, though, is that this  
raise was effectively doing the same thing as

   raise Thrift::Test::Exception.new

This bug cropped up when I made a seemingly-innocuous change to the  
Thrift::Struct initializer, in which I added the line

   d.fetch(name.to_s) # d here is the variable used for the hash  
argument

This suddenly caused the specs to blow up on that line in  
raiseException(). The only explanation was d.fetch() was raising an  
exception, which it was, because it was being given 'error' instead of  
a Hash.

In other words, this code was always broken, it was just staying quiet  
until I played with it. The new form of the raise is equivalent to

   raise Thrift::Test::Xception.new(:message => 'error')

and that behaves exactly as desired.

-Kevin Ballard

-- 
Kevin Ballard
kevin@rapleaf.com




Re: Full specs for ruby libraries

Posted by Kevin Clark <ke...@gmail.com>.
On Mon, Jun 2, 2008 at 6:03 PM, Kevin Ballard <ke...@rapleaf.com> wrote:
> Any comments/suggestions?

First off, great work. This is a major improvement to the libraries.
Here's my notes from reviewing the commits (through 713371df7f):

For those playing along, you can view these by going to
http://github.com/kballard/thrift/commits/COMMIT

7a1b95c8d886
returns (lines 50, 31):
The first item in the array (field name) isn't actually used/supplied currently
May want to do the same in the tests for consistency.
* I'm not really sure why it's returned at all. Legacy?

c7101e6334f3b
Consequences of including Thrift in the example group?

6cda1b66c24
On testing the remapped methods: Could we mock to expect the old
method called by the new?
Alternatively, if they're the same method (I don't think they are),
it's possible method()
could be used for equality.

665d95c51f
What's going on here?

1d897acba89
Range errors on Bignum are ok but on integer aren't?

800706783f1
I think this might have been correct. We don't want the server bailing
out if something goes wrong.
We should discuss.

210803a358
I think switching sets to use Set breaks backwards compatability.
There's also a bug we need to be aware of that allows arrays to be
passed as sets (because .each doesn't care, and the value of the hash)
isn't used.

I think breaking backwards compat might be ok here, but we need to discuss
on the list.

384064468d
Ah, you got it (the set thing). That might be ok then. May want to do
further testing.

c83c437d7b38
I'm trying to remember if get_buffer is for the fast binary protocol. I think
at this point it only uses borrow and consume. I'll double check.

5f29875a24
This is a good change. I think we should prefer testing with literals
over method calls.

e27daba8897
Wait what? A hash as the second arg to raise?




-- 
Kevin Clark
http://glu.ttono.us