Posted to user@avro.apache.org by Abishek Baskaran <ba...@gmail.com> on 2014/02/11 05:34:53 UTC

Avro serialization in Ruby - how to write to a String/buffer instead of file

I am trying to do Avro serialization in Ruby. I have written the JSON
schema, but I would like to have the serialized data as bytes in Ruby instead
of writing to a file.

My code is stuck at something like:

SCHEMA = <<-JSON
{ "type": "record", "name": "User", "fields" : [
  {"name": "name", "type": "string"},
  {"name": "id", "type": "long"},
  {"name": "city", "type": "string"} ] }
JSON

schema = Avro::Schema.parse(SCHEMA)
dw = Avro::IO::DatumWriter.new(schema)
buffer = StringIO.new
encoder = Avro::IO::BinaryEncoder.new(buffer)???

I have the values for name, id and city, and I am wondering how to create the
User object and serialize it to a string / byte buffer.


Regards,

Abishek

Re: Avro serialization in Ruby - how to write to a String/buffer instead of file

Posted by Abishek Baskaran <ba...@gmail.com>.
Thanks Martin, I had almost got there on my own, except for the pro tip,
which is very helpful!
On Feb 11, 2014 10:19 PM, "Martin Kleppmann" <ma...@rapportive.com> wrote:

> Hi Abishek,
>
> You got most of the way there! Next thing you need to do is to use the
> DatumWriter you created:
>
> dw.write({'name' => 'Bob', 'id' => 42, 'city' => 'New York'}, encoder)
> buffer.string # now contains the binary representation of that record
>
> Pro tip: in Ruby >= 1.9, if you do StringIO.new it uses your default
> character encoding, which may be UTF-8 or something like that. However,
> Avro generates binary data, not UTF-8, so that may cause errors down the
> line. Safest if you tell Ruby explicitly that your StringIO is going to
> contain binary data. You can do that as follows:
>
> buffer = StringIO.new("".force_encoding("BINARY"))
>
> We should really make a simpler API for this!
>
> Best,
> Martin

Re: Avro serialization in Ruby - how to write to a String/buffer instead of file

Posted by Martin Kleppmann <ma...@rapportive.com>.
Hi Abishek,

You got most of the way there! Next thing you need to do is to use the
DatumWriter you created:

dw.write({'name' => 'Bob', 'id' => 42, 'city' => 'New York'}, encoder)
buffer.string # now contains the binary representation of that record

Pro tip: in Ruby >= 1.9, if you do StringIO.new it uses your default
character encoding, which may be UTF-8 or something like that. However,
Avro generates binary data, not UTF-8, so that may cause errors down the
line. Safest if you tell Ruby explicitly that your StringIO is going to
contain binary data. You can do that as follows:

buffer = StringIO.new("".force_encoding("BINARY"))
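
The effect of that tip can be checked with the standard library alone. The sketch below is a small illustration, not part of the Avro API: `alt_buffer`, `binary_ok` and `byte_count` are made-up names, and the bytes written are arbitrary sample data.

```ruby
require 'stringio'

# Same idea as the one-liner above; .dup keeps it working when string
# literals are frozen (e.g. under `# frozen_string_literal: true`).
buffer = StringIO.new("".dup.force_encoding("BINARY"))

# String.new with no arguments already returns an empty binary string,
# so this is an equivalent way to get a binary buffer.
alt_buffer = StringIO.new(String.new)

# Write a few raw bytes, including one (0xFF) that is never valid in UTF-8.
buffer.write([0x06, 0x42, 0x6F, 0x62, 0xFF].pack('C*'))

binary_ok  = buffer.string.encoding == Encoding::BINARY
byte_count = buffer.string.bytesize
```

With a binary buffer, `bytesize` and `bytes` describe the data exactly, and no later step will try to interpret it as text in another encoding.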

We should really make a simpler API for this!

Best,
Martin

