Posted to dev@avro.apache.org by "Willem van Bergen (JIRA)" <ji...@apache.org> on 2014/07/02 17:48:25 UTC

[jira] [Commented] (AVRO-1499) Ruby 2+ Writes Invalid avro files using the avro gem

    [ https://issues.apache.org/jira/browse/AVRO-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050073#comment-14050073 ] 

Willem van Bergen commented on AVRO-1499:
-----------------------------------------

W.r.t. monkey patching String: I agree with you in general, but in this case I think it is preferable to doing a `respond_to?` check in the write method.

- It basically backports a method in a way that is completely compatible with Ruby 1.9+, and it only does so if the method is not already defined.
- This makes it a lot easier to drop 1.8 support later: just one file of backports to delete, instead of having to go through the entire source code to find occurrences.
- Performance: only one `respond_to?` check when the library is loaded, instead of a check on every write.
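
To make the comparison concrete, here is a rough sketch of the backport approach. The method being backported is not named in this comment, so `force_encoding` is used purely as an illustration; `write_bytes` and the idea of a dedicated backports file are likewise hypothetical and not taken from the patch.

```ruby
require 'stringio'

# Hypothetical backports file (illustration only, not the actual patch).
# The check runs once, at load time. On Ruby 1.9+ the method already
# exists, so nothing is redefined.
unless ''.respond_to?(:force_encoding)
  class String
    # Ruby 1.8 strings are plain byte sequences, so a no-op is enough here.
    def force_encoding(_encoding)
      self
    end
  end
end

# Hypothetical write method: it can call force_encoding unconditionally,
# without a respond_to? check on every write.
def write_bytes(io, datum)
  io.write(datum.dup.force_encoding('BINARY'))
end
```

Usage would look like `write_bytes(StringIO.new(String.new), "some bytes")`. Dropping 1.8 support later would then amount to deleting the guarded block, assuming all backports live in one place.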

I don't have strong feelings about it, so feel free to ignore this :)


> Ruby 2+ Writes Invalid avro files using the avro gem
> ----------------------------------------------------
>
>                 Key: AVRO-1499
>                 URL: https://issues.apache.org/jira/browse/AVRO-1499
>             Project: Avro
>          Issue Type: Bug
>          Components: ruby
>    Affects Versions: 1.7.5
>            Reporter: Michael Ries
>            Assignee: Martin Kleppmann
>              Labels: ruby
>             Fix For: 1.7.7
>
>         Attachments: AVRO-1499-2.patch, AVRO-1499-3.patch, AVRO-1499.patch
>
>
> The rubygem writes corrupted avro files under ruby 2.0.0 and ruby 2.1.1. It appears to work correctly under jruby-1.7.10 and ruby 1.9.3.
> Here is a reproducible example:
> ```ruby
> require 'avro'
>  
> data = [
>   {"guid"=>"144045de-eb44-dd1b-d9af-6c8b5d41a96e", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119", "name"=>"My Awesome Bank", "created_at"=>1390617818, "updated_at"=>1398180288, "deleted_at"=>nil},
>   {"guid"=>"51e06057-14d2-7527-81fa-b07dba0a263b", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119", "name"=>"Student Loans R' Us", "created_at"=>1386178342, "updated_at"=>1398180286, "deleted_at"=>nil},
>   {"guid"=>"b4d1d99f-4351-d0e7-221c-a3fae08716bc", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119", "name"=>"My Awesome Bank", "created_at"=>1390617026, "updated_at"=>1398180288, "deleted_at"=>nil},
>   {"guid"=>"084638fa-a78d-bbdd-e075-7c9c957a9b46", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119", "name"=>"My Awesome Bank", "created_at"=>1390617138, "updated_at"=>1398180288, "deleted_at"=>nil},
>   {"guid"=>"79287c76-4e8f-0a21-7569-a2bcdc2b2f4d", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119", "name"=>"My Awesome Bank", "created_at"=>1390617135, "updated_at"=>1398180288, "deleted_at"=>nil},
>   {"guid"=>"3bcc26b2-7d3b-6c4d-cb27-4eb1574b3c20", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119", "name"=>"Cayman Islands Bank", "created_at"=>1386902345, "updated_at"=>1398180288, "deleted_at"=>nil},
>   {"guid"=>"75e1e56c-7611-4030-d002-afa2af70e5a1", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119", "name"=>"My Awesome Bank", "created_at"=>1390617427, "updated_at"=>1398180288, "deleted_at"=>nil},
> ]
>  
> member_schema = <<-SCHEMA
> {"namespace": "md.data_logs",
>  "type": "record",
>  "name": "Member",
>  "fields": [
>      {"name": "guid", "type": "string"},
>      {"name": "user_guid", "type": "string"},
>      {"name": "name", "type": ["string","null"]},
>      {"name": "created_at", "type":"long"},
>      {"name": "updated_at", "type":"long"},
>      {"name": "deleted_at", "type":["long","null"]}
>  ]
> }
> SCHEMA
> filepath = "./members.avro"
> File.unlink(filepath) if File.exist?(filepath)
>  
> Avro::DataFile.open(filepath, "w", member_schema) do |dw|
>   data.each do |entry|
>     dw << entry
>   end
> end
>  
>  
> entries = []
> Avro::DataFile.open(filepath, "r") do |reader|
>   reader.each do |entry|
>     entries << entry
>   end
> end
>  
> puts "Here is the data I wrote into the file:"
> data.each{|e| p e }
> print "\n\n\n\n"
>  
> puts "Here is the data I read from the file:"
> entries.each{|e| p e }
> ```
> Under ruby 2+ it fails with the message "undefined method 'unpack' for nil:NilClass (NoMethodError)". I have also tested that the rubygem can correctly read avro files written by the java client, but the java client fails to read files written by the ruby client. So the issue is definitely in how the rubygem writes the binary file.


