You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@thrift.apache.org by Kevin Ballard <ke...@rapleaf.com> on 2008/06/03 03:03:37 UTC
Full specs for ruby libraries
I just finished speccing out the ruby Thrift libraries. It's available
at
http://github.com/kballard/thrift
Do note that this contains Kevin Clark's work on ruby lib namespacing,
taken to completion by me (and yes, this is reflected in the specs).
Also, as of this writing there are a couple failing specs, because I
included specs for a couple open issues on JIRA as well as one fixed
in Kevin Clark's repo.
Any comments/suggestions?
-Kevin Ballard
--
Kevin Ballard
kevin@rapleaf.com
Re: Full specs for ruby libraries
Posted by Kevin Clark <ke...@gmail.com>.
All that sounds fine. I'm happy to commit as soon as the source is
into Apache's svn.
--
Kevin Clark
http://glu.ttono.us
Re: Full specs for ruby libraries
Posted by Kevin Ballard <ke...@rapleaf.com>.
On Jun 4, 2008, at 11:22 AM, Kevin Clark wrote:
> On Mon, Jun 2, 2008 at 6:03 PM, Kevin Ballard <ke...@rapleaf.com>
> wrote:
>> Any comments/suggestions?
>
> First off, great work. This is a major improvement to the libraries.
> Here's my notes from reviewing the commits (through 713371df7f):
Thank you. I've responded below.
> For those playing along, you can view these by going to
> http://github.com/kballard/thrift/commits/COMMIT
>
> 7a1b95c8d886
> returns (lines 50, 31):
> The first item in the array (field name) isn't actually used/
> supplied currently
> May want to do the same in the tests for consistency.
> * I'm not really sure why it's returned at all. Legacy?
I put them in the tests because I figure as long as the code ignores
them, everything is fine, but if someone fixes the libraries to start
caring, the tests shouldn't break. I assume it's in there (but this is
just a guess) because it makes inspecting the network traffic by hand
a lot easier to debug, but since the binary protocol doesn't send the
names (and that's the only concrete protocol in the ruby libraries),
it hardly matters.
> c7101e6334f3b
> Consequences of including Thrift in the example group?
AFAIK there should be none. You'll notice all of my specs are
structured this way, creating a concrete example group and including
Thrift, simply to make the testing code simpler (as it doesn't have to
constantly namespace all the thrift classes). Note that in
665d95c51fe6 I ended up changing the names of all these example groups
to avoid spec leakage when running multiple specs at once, but the
same idea is there.
> 6cda1b66c24
> On testing the remapped methods: Could we mock to expect the old
> method called by the new?
> Alternatively, if they're the same method (I don't think they are),
> it's possible method()
> could be used for equality.
No we can't. The way deprecate! works, it actually fetches the new
method directly with instance_method, binds it to the current object,
and calls it. I suppose I could test to ensure instance_method() is
called with the new symbol. I'll look into doing that now.
> 665d95c51f
> What's going on here?
When I started using the concrete ExampleGroup classes, I didn't run
`rake spec` until I'd written several of them (instead I ran the
individual specs separately). Running all the specs at once revealed
that having each ExampleGroup named ThriftSpec was, in fact, re-
opening the same class over and over (which makes sense), re-including
Thrift over and over, and actually leaking some information between
specs. I forget exactly what actually leaked, but `rake spec` was
broken before that change, and worked fine after that change.
> 1d897acba89
> Range errors on Bignum are ok but on integer aren't?
There's a few reasons I made this change. The biggest one is the code
here assumes it's writing signed integers (which makes sense, all the
libraries assume they're dealing with signed integers and return
signed from read_foo). The range-checking code there was enforcing the
signed-ness of the integers (although, looking back, I allowed
unsigned bytes in write_byte, which differs from write_i16 and
write_i32). This actually broke the binary protocol because it writes
out an i32 with the high bit set when it writes its version
(0x80010000), then coerces it back to unsigned when reading (read_i32
| VERSION_MASK converts the return value of read_i32 to the equivalent
unsigned value with the same bitfield). My two options were to either
increase the range bounds of i16/i32 to allow writing unsigned
integers, or remove the range checking. I opted to remove the range
checking primarily because, looking at the Java and CPP libraries,
there's no explicit range checking there, and I figured with that
precedent I'd rather not slow down the binary protocol by adding extra
tests to every single integer write. I mean, if the user writes a long
as an i16, they shouldn't be the least surprised to find it gets
clipped.
But there was nothing I could do about the Bignum range error
(Array#pack raises that), so I simply encoded that behavior in the
tests.
If there is a real desire to re-introduce range checking I can
certainly do that, with the ranges expanded to include unsigned
integers. Another option is to introduce read/write_ui* functions, but
that would make the ruby libs different than all other libraries.
> 800706783f1
> I think this might have been correct. We don't want the server bailing
> out if something goes wrong.
> We should discuss.
The problem was it wasn't catching exceptions. The construct
rescue Exception => e
is clearly designed to catch all exceptions, except it wasn't. I don't
know what quirk of Ruby was preventing that from working, but in
testing it would explicitly catch Exception objects but not e.g.
StandardError objects. Using
rescue => e
seems to catch them all. Granted, I didn't test, this could be
catching all subclasses of StandardError rather than Exception, but
the only reason a non-StandardError exception should be thrown is if
the processor or handler triggers a syntax or load error, and I'm
perfectly happy with the server blowing up on that.
> 210803a358
> I think switching sets to use Set breaks backwards compatability.
> There's also a bug we need to be aware of that allows arrays to be
> passed as sets (because .each doesn't care, and the value of the hash)
> isn't used.
>
> I think breaking backwards compat might be ok here, but we need to
> discuss
> on the list.
This actually should continue to work with pre-generated code, the
only observable difference in clients is when they call a method that
returns a Thrift::SET, it'll return a Set instead of a hash. But I
don't think that's too big of a change to ask clients to accommodate.
I will note that passing a Set to a method, and having a hash-style
set as the default value of a generated struct should actually
continue to work (384064468d). Also, there's a decent chance that
client code that handles the return value will still work anyway, for
example
client.returnSet("foo").each { |k,v| p k }
That will print the exact same thing regardless of whether a Set or a
Hash is returned. I realize we can't assume all code (or even most
code) handles Set return values this way, but my point is there's very
little to change. Unfortunately there's no way to deprecate the old
style, because the method doesn't have any way of knowing what it's
expected to return.
> 384064468d
> Ah, you got it (the set thing). That might be ok then. May want to do
> further testing.
See above. If you have suggestions for any more tests you'd like to
do, or any genius ideas about preserving backwards compatibility, I'm
all ears. I just felt strongly enough that using {"a" => true, "b" =>
true} as a set was too ugly.
> c83c437d7b38
> I'm trying to remember if get_buffer is for the fast binary
> protocol. I think
> at this point it only uses borrow and consume. I'll double check.
If the fast binary protocol uses it, it needs to be changed.
#get_buffer was preventing a pretty serious memory optimization
(42f75ed74ad). Basically, the MemoryBuffer was holding on to every
piece of data written to it, even though the only way to retrieve data
once it had been read back out was with #get_buffer (which nobody in
the ruby libraries used). This meant that if a single MemoryBuffer was
used to transport 2MB of data, until that MemoryBuffer was finalized
those 2MB would have effectively leaked. Without #get_buffer the
MemoryBuffer could toss any data the moment it was read.
> 5f29875a24
> This is a good change. I think we should prefer testing with literals
> over method calls.
That was my thought.
> e27daba8897
> Wait what? A hash as the second arg to raise?
Yep. Thrift::Test::Xception is a generated struct from the
ThriftTest.thrift file. It inherits from StandardError, but it also
includes Thrift::Struct. The problem here is that while StandardError
wants a message as its only arg, Thrift::Struct has other ideas. It re-
defines the initializer to take a hash that's used to populate its
fields. When you say
raise Thrift::Test::Xception, 'error'
What you're really saying is
raise Thrift::Test::Xception.new('error')
This didn't used to break because Thrift::Struct used the #[] method
exclusively on its argument. What this meant, though, is that this
raise was effectively doing the same thing as
raise Thrift::Test::Exception.new
This bug cropped up when I made a seemingly-innocuous change to the
Thrift::Struct initializer, in which I added the line
d.fetch(name.to_s) # d here is the variable used for the hash
argument
This suddenly caused the specs to blow up on that line in
raiseException(). The only explanation was d.fetch() was raising an
exception, which it was, because it was being given 'error' instead of
a Hash.
In other words, this code was always broken, it was just staying quiet
until I played with it. The new form of the raise is equivalent to
raise Thrift::Test::Xception.new(:message => 'error')
and that behaves exactly as desired.
-Kevin Ballard
--
Kevin Ballard
kevin@rapleaf.com
Re: Full specs for ruby libraries
Posted by Kevin Clark <ke...@gmail.com>.
On Mon, Jun 2, 2008 at 6:03 PM, Kevin Ballard <ke...@rapleaf.com> wrote:
> Any comments/suggestions?
First off, great work. This is a major improvement to the libraries.
Here's my notes from reviewing the commits (through 713371df7f):
For those playing along, you can view these by going to
http://github.com/kballard/thrift/commits/COMMIT
7a1b95c8d886
returns (lines 50, 31):
The first item in the array (field name) isn't actually used/supplied currently
May want to do the same in the tests for consistency.
* I'm not really sure why it's returned at all. Legacy?
c7101e6334f3b
Consequences of including Thrift in the example group?
6cda1b66c24
On testing the remapped methods: Could we mock to expect the old
method called by the new?
Alternatively, if they're the same method (I don't think they are),
it's possible method()
could be used for equality.
665d95c51f
What's going on here?
1d897acba89
Range errors on Bignum are ok but on integer aren't?
800706783f1
I think this might have been correct. We don't want the server bailing
out if something goes wrong.
We should discuss.
210803a358
I think switching sets to use Set breaks backwards compatability.
There's also a bug we need to be aware of that allows arrays to be
passed as sets (because .each doesn't care, and the value of the hash)
isn't used.
I think breaking backwards compat might be ok here, but we need to discuss
on the list.
384064468d
Ah, you got it (the set thing). That might be ok then. May want to do
further testing.
c83c437d7b38
I'm trying to remember if get_buffer is for the fast binary protocol. I think
at this point it only uses borrow and consume. I'll double check.
5f29875a24
This is a good change. I think we should prefer testing with literals
over method calls.
e27daba8897
Wait what? A hash as the second arg to raise?
--
Kevin Clark
http://glu.ttono.us