Posted to dev@thrift.apache.org by Allen George <al...@gmail.com> on 2016/11/20 03:55:01 UTC

Question about OPT_IN_REQ_OUT fields

Hi,

I've made a lot of progress on the Rust thrift generation front and
I've been able to get the C++ server and Rust client communicating for
some types. One thing really puzzles me however: T_OPT_IN_REQ_OUT
fields. Looking at the docs and other code generators it looks like
those fields are written out no matter what. Does that mean that I'm
writing a default value (empty string, empty collections, 0 for
numeric types) when they're not set? This seems very puzzling, because
the receiver can't tell the difference between sender-unset values and
values with the standard default. It's unfortunate from my point of
view because a Rust client communicating with an echo server receives
a result that doesn't match what it sent out!

An example:

struct Foo {
  1: string thing;
}

Rust representation:

struct Foo {
  thing: Option<String>,
}

If I don't set "thing" I have the following value to send:

let sent = Foo { thing: None };

With the OPT_IN_REQ_OUT behavior it seems I have to serialize an empty
string over the wire. As a result, when the echo server responds, I
actually get:

Foo { thing: Some("") }

This means that my simple "check that the response is exactly what I
sent out" doesn't quite work and I'll have to modify it for those
cases where I have unset fields. At any rate - just wanted to check
that my understanding of OPT_IN_REQ_OUT was right.
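
A minimal sketch of the mismatch described above, assuming a sender that always writes the field and falls back to the type's default when it is unset. WireFoo, write_always, read and echo are hypothetical names for illustration only. They are a toy model of the wire, not the Thrift library API or the output of any generator:

```rust
// Rust-side struct from the example above.
#[derive(Debug, Clone, PartialEq)]
struct Foo {
    thing: Option<String>,
}

// Toy stand-in for the wire: the field is either present with a value or absent.
// Hypothetical; this is not how Thrift encodes data.
#[derive(Debug, Clone, PartialEq)]
struct WireFoo {
    thing: Option<String>,
}

// "Always write" sender: an unset field goes out as the type's default ("").
fn write_always(foo: &Foo) -> WireFoo {
    WireFoo {
        thing: Some(foo.thing.clone().unwrap_or_default()),
    }
}

// Reader: anything present on the wire becomes Some(..).
fn read(wire: &WireFoo) -> Foo {
    Foo {
        thing: wire.thing.clone(),
    }
}

// Echo round trip: the server reads the request and writes it back
// using the same "always write" rule.
fn echo(sent: &Foo) -> Foo {
    let at_server = read(&write_always(sent));
    read(&write_always(&at_server))
}
```

Under these assumptions, echo(&Foo { thing: None }) comes back as Foo { thing: Some("") }, so a strict equality check between request and response fails.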

Thanks,
Allen

Terminal Musings: http://www.allengeorge.com/
Raft in Java: https://github.com/allengeorge/libraft/
Twitter: https://twitter.com/allenageorge/

Re: Question about OPT_IN_REQ_OUT fields

Posted by Allen George <al...@gmail.com>.
Thanks Jens - I appreciate the detailed response. The funny thing was
that I checked only the C++/Go generators - and stopped there (my
bad!) I'll have to spend some time thinking about what makes the most
sense in Rust.

Cheers,
Allen



On Sun, Nov 20, 2016 at 6:32 AM, Jens Geyer <je...@hotmail.com> wrote:
> Hi Allen,
>
> there are three kinds of requiredness with Thrift:
>
> a) required
> The field must be written and the reader must be able to read it. If it is
> not properly set on write or if the reader does not find it in the data,
> some language implementations (not all) throw an exception.
>
> b) optional
> The field can be written, but may just as well be missing from the data
> stream. Usually the implementations check on write whether the field is null
> or if the matching "isset" flag is set. Which method is used is mostly an
> implementation detail of the particular language. For convenience, some
> implementations use property setters to set the "isset" flag along with the
> value.
>
> c) default
> This is the default (hence the name) that is applied when neither required
> nor optional are specified. These fields are intended to be written always
> to the output stream. But, on the other hand, if the foo member in the Bar
> struct below is null, foo will in many (all?) cases not be written at all:
>
> struct Foo { ... }
> struct Bar {
>     1: Foo foo
> }
>
>
>> With the OPT_IN_REQ_OUT behavior it seems I have to serialize an empty
>> string over the wire. As a result, when the echo server responds, I
>> actually get:
>>
>> Foo { thing: Some("") }
>>
>> This seems very puzzling, because
>> the receiver can't tell the difference between sender-unset values and
>> values with the standard default.
>
> That may just be right from the formal definition, but there's a "but". I
> think the main confusion stems from the fact that None is not a real value,
> but the language's substitute for what would otherwise be null values.
> The formally correct way would be to enforce a set field on write
> or throw otherwise, just like with "required". If we do a quick fact check
> against the current code base, we find this behaviour for unset foo members:
>
> - C++ writes an empty foo
> - Python does not write unset foo
> - NodeJS does not write unset foo
> - Java does not write unset foo
> - Delphi does not write unset foo
> - C_glib expects foo to be set and writes it
> - C# does not write unset foo
>
> So a lot of implementations treat "default" very much like "optional", but
> some don't and follow the formal definition to always write some data. It
> works though, because on read the field is mandatory and the implementation
> provides some default for it, which in a lot of cases will again be "null".
>
> From my perspective, I would recommend doing whatever makes the most sense
> with regard to the Rust language environment, the option that provides the
> least surprise.
>
> Have fun,
> JensG

Re: Question about OPT_IN_REQ_OUT fields

Posted by Jens Geyer <je...@hotmail.com>.
Hi Allen,

there are three kinds of requiredness with Thrift:

a) required
The field must be written and the reader must be able to read it. If it is 
not properly set on write or if the reader does not find it in the data, 
some language implementations (not all) throw an exception.

b) optional
The field can be written, but may just as well be missing from the data
stream. Usually the implementations check on write whether the field is null
or if the matching "isset" flag is set. Which method is used is mostly an
implementation detail of the particular language. For convenience, some
implementations use property setters to set the "isset" flag along with the
value.

c) default
This is the default (hence the name) that is applied when neither required 
nor optional are specified. These fields are intended to be written always 
to the output stream. But, on the other hand, if the foo member in the Bar 
struct below is null, foo will in many (all?) cases not be written at all:

struct Foo { ... }
struct Bar {
    1: Foo foo
}
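
The three kinds above can be sketched as a small decision function in Rust. This is only an illustration of the rules as described here: Requiredness, DefaultPolicy, WriteAction and write_action are hypothetical names, and treating an unset required field as a failure reflects what some (not all) implementations do:

```rust
// The three requiredness kinds described above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Requiredness {
    Required,
    Optional,
    Default, // neither keyword given, i.e. T_OPT_IN_REQ_OUT
}

// How a writer treats an unset "default" field. Implementations differ here,
// so this sketch makes the choice an explicit parameter.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DefaultPolicy {
    SkipUnset,        // behave like "optional"
    WriteTypeDefault, // always write, e.g. an empty string
}

#[derive(Debug, PartialEq)]
enum WriteAction {
    WriteValue,
    WriteTypeDefault,
    Skip,
    Fail, // required field that was never set
}

// Decide what a writer does with one field.
fn write_action(req: Requiredness, is_set: bool, policy: DefaultPolicy) -> WriteAction {
    match (req, is_set) {
        (_, true) => WriteAction::WriteValue,
        (Requiredness::Required, false) => WriteAction::Fail,
        (Requiredness::Optional, false) => WriteAction::Skip,
        (Requiredness::Default, false) => match policy {
            DefaultPolicy::SkipUnset => WriteAction::Skip,
            DefaultPolicy::WriteTypeDefault => WriteAction::WriteTypeDefault,
        },
    }
}
```

Only the last match arm depends on the policy, which is exactly where the ambiguity for "default" fields sits.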


> With the OPT_IN_REQ_OUT behavior it seems I have to serialize an empty
> string over the wire. As a result, when the echo server responds, I
> actually get:
>
> Foo { thing: Some("") }
>
> This seems very puzzling, because
> the receiver can't tell the difference between sender-unset values and
> values with the standard default.

That may just be right from the formal definition, but there's a "but". I
think the main confusion stems from the fact that None is not a real value,
but the language's substitute for what would otherwise be null values.
The formally correct way would be to enforce a set field on write
or throw otherwise, just like with "required". If we do a quick fact check
against the current code base, we find this behaviour for unset foo members:

- C++ writes an empty foo
- Python does not write unset foo
- NodeJS does not write unset foo
- Java does not write unset foo
- Delphi does not write unset foo
- C_glib expects foo to be set and writes it
- C# does not write unset foo

So a lot of implementations treat "default" very much like "optional", but
some don't and follow the formal definition to always write some data. It
works though, because on read the field is not mandatory and the implementation
provides some default for it, which in a lot of cases will again be "null".

From my perspective, I would recommend doing whatever makes the most sense
with regard to the Rust language environment, the option that provides the
least surprise.
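
One possible reading of "least surprise" for Rust, sketched under assumptions: treat an unset "default" field like the Python, Java and C# writers listed above and simply skip it, so that None survives a round trip. Wire, write_skip_unset and read_foo are hypothetical names, and the list of (field id, value) pairs is a toy wire format, not the real Thrift encoding:

```rust
// Rust-side struct from the example; Default gives thing = None.
#[derive(Debug, Clone, PartialEq, Default)]
struct Foo {
    thing: Option<String>,
}

// Toy wire format: a list of (field id, string value) pairs.
type Wire = Vec<(i16, String)>;

// Writer that skips the field entirely when it is unset.
fn write_skip_unset(foo: &Foo) -> Wire {
    let mut out = Wire::new();
    if let Some(thing) = &foo.thing {
        out.push((1, thing.clone()));
    }
    out
}

// Reader: start from "everything unset" and fill in whatever fields arrive.
fn read_foo(wire: &Wire) -> Foo {
    let mut foo = Foo::default();
    for (id, value) in wire {
        match *id {
            1 => foo.thing = Some(value.clone()),
            _ => {} // unknown field ids are ignored in this sketch
        }
    }
    foo
}
```

With this choice Foo { thing: None } and Foo { thing: Some("") } stay distinguishable after a round trip between two such peers. A peer that always writes the field (the C++ behaviour listed above) would still send back an empty string.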

Have fun,
JensG

