Posted to dev@avro.apache.org by "John Karp (JIRA)" <ji...@apache.org> on 2014/05/30 20:49:04 UTC

[jira] [Updated] (AVRO-1470) Perl API boolean type misencoded

     [ https://issues.apache.org/jira/browse/AVRO-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Karp updated AVRO-1470:
----------------------------

    Description: 
The boolean serialization is incorrect, as these new unit tests would indicate:
{noformat}
primitive_ok boolean => 0, "\x0";
primitive_ok boolean => 1, "\x1";
{noformat}
When run they print:
{noformat}
#   Failed test 'primitive boolean encoded correctly'
#   at t/02_bin_encode.t line 40.
#          got: '30'
#     expected: '00'

#   Failed test 'primitive boolean encoded correctly'
#   at t/02_bin_encode.t line 40.
#          got: '31'
#     expected: '01'
{noformat}

Secondly, when evaluating whether a 'boolean' branch should be taken in a union, the check for boolean-ness of the data is being done incorrectly, matching a regular expression against the wrong variable.

  was:
h1. Boolean Serialization
The boolean serialization code in BinaryEncoder.pm is:
{noformat}
$data ? \0x1 : \0x0
{noformat}
The intent is that anything false to Perl, such as 0, '0', '', () and undef, is encoded as zero, and everything else is encoded as one. However, this code doesn't work, as these unit tests indicate:
{noformat}
primitive_ok boolean => 0, "\x0";
primitive_ok boolean => 1, "\x1";
{noformat}
which print:
{noformat}
#   Failed test 'primitive boolean encoded correctly'
#   at t/02_bin_encode.t line 40.
#          got: '30'
#     expected: '00'

#   Failed test 'primitive boolean encoded correctly'
#   at t/02_bin_encode.t line 40.
#          got: '31'
#     expected: '01'
{noformat}
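
A minimal sketch of why the quoted expression produces '30' and '31', assuming the encoder hands a scalar reference to a callback that appends the dereferenced value to a buffer (the helper hex_of and both closures are hypothetical, not taken from BinaryEncoder.pm). In Perl, \0x1 is a reference to the number 1, which stringifies to the character "1" (byte 0x31), not to the byte 0x01. Referencing a one-byte string is one possible correction:

```perl
use strict;
use warnings;

# Hypothetical helper: run an encoder closure against $data and return a
# hex dump of everything it passed to the output callback.
sub hex_of {
    my ($encoder, $data) = @_;
    my $buffer = '';
    $encoder->( $data, sub { $buffer .= ${ $_[0] } } );
    return unpack 'H*', $buffer;
}

# The expression from the ticket: references to the NUMBERS 1 and 0,
# which stringify to the characters "1" and "0".
my $buggy = sub { my ($data, $cb) = @_; $cb->( $data ? \0x1 : \0x0 ) };

# One possible correction (an assumption, not necessarily the attached
# patch): references to one-byte STRINGS.
my $fixed = sub { my ($data, $cb) = @_; $cb->( $data ? \"\x01" : \"\x00" ) };

printf "buggy: %s %s\n", hex_of($buggy, 0), hex_of($buggy, 1);   # buggy: 30 31
printf "fixed: %s %s\n", hex_of($fixed, 0), hex_of($fixed, 1);   # fixed: 00 01
```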

h1. Booleans in Unions
Inconsistent with the above serialization, the code in Schema.pm that determines which union branch to use attempts to check for boolean-ness with:
{noformat}
m{yes|no|y|n|t|f|true|false}i
{noformat}
meaning only those particular strings are considered booleans. However, they will all get encoded as '0' by BinaryEncoder.pm.

I say 'attempts' because it's actually matching this regex against the data type name $type, which in this context will always be 'boolean', instead of the value $data.
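
The effect of matching against the wrong variable can be illustrated with a short sketch. The sub matches_bool and the sample values are hypothetical; only the pattern is taken from the text above. As quoted, the pattern is unanchored, so the string 'boolean' matches it (it contains 'n'), and the check therefore succeeds no matter what $data holds:

```perl
use strict;
use warnings;

# The pattern quoted in the ticket, unanchored as written.
my $bool_re = qr{yes|no|y|n|t|f|true|false}i;

# Hypothetical helper: does $value satisfy the boolean-ness check?
sub matches_bool {
    my ($value) = @_;
    return $value =~ $bool_re ? 1 : 0;
}

my $type = 'boolean';    # the type name is fixed in this context

for my $data ( 'true', 'hello', 42 ) {
    printf "%-5s against \$type: %d, against \$data: %d\n",
        $data, matches_bool($type), matches_bool($data);
}
```

Checking $type reports a match for every sample, while checking $data accepts only 'true' among these three.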

h1. Suggested Fix
Perl has no boolean type, so there's no ideal solution for the inconsistency. But we could keep it simple, and have only the numbers 0 and 1 accepted as boolean values.
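
One way to read this suggestion is a strict check that accepts only the values 0 and 1. The sub is_boolean_value below is a hypothetical sketch under that assumption and is not necessarily what the attached patch does:

```perl
use strict;
use warnings;

# Hypothetical check: accept only a defined value that is exactly
# "0" or "1". \A and \z anchor the whole string, so "10" and "1\n"
# are rejected.
sub is_boolean_value {
    my ($data) = @_;
    return defined $data && $data =~ /\A[01]\z/ ? 1 : 0;
}

for my $candidate ( 0, 1, 2, 'true', '' ) {
    printf "'%s' => %d\n", $candidate, is_boolean_value($candidate);
}
```

With a check like this, the union branch test and the encoder would agree on which values count as booleans, at the cost of rejecting strings such as 'true' and 'false'.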


> Perl API boolean type misencoded
> --------------------------------
>
>                 Key: AVRO-1470
>                 URL: https://issues.apache.org/jira/browse/AVRO-1470
>             Project: Avro
>          Issue Type: Bug
>          Components: perl
>            Reporter: John Karp
>            Assignee: John Karp
>         Attachments: AVRO-1470.patch
>


