Posted to dev@avro.apache.org by "Paul Banks (JIRA)" <ji...@apache.org> on 2015/06/30 15:02:04 UTC

[jira] [Commented] (AVRO-1664) PHP library can't serialise records with optional (union-null) values

    [ https://issues.apache.org/jira/browse/AVRO-1664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608245#comment-14608245 ] 

Paul Banks commented on AVRO-1664:
----------------------------------

There seems to be no interest in this issue from others, but it looks like a deal breaker to me for using Avro with PHP at all. I'd really like to use Avro, but PHP is a primary language in our stack and the originator of most of our data.

I traced the cause down to this line:

https://github.com/apache/avro/blob/trunk/lang/php/lib/avro/schema.php#L446

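For a record type, the incoming array MUST contain a key for every field in the schema, and each value must be valid for that field's type. A missing key fails validation even when the field's type is a union that includes "null".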

Note that Python has different (presumably correct) behaviour: {{datum.get(f.name)}} returns {{None}} if there is no such field, and {{None}} is a "correct" value for "null".

The fix, it seems, would be something like:

{code:php}
// Treat a missing (or explicitly null) key as null so that union-with-null fields validate.
$value = isset($datum[$field->name()]) ? $datum[$field->name()] : null;
if (!self::is_valid_datum($field->type(), $value))
  return false;
{code}

(Untested; I'll review the correct procedure for submitting a patch later.)

Rationale: if the field is not set (or is explicitly {{null}}), assume its value is {{null}}. This is actually the default behaviour in PHP: just removing the {{array_key_exists}} check would be enough to fix the issue, although it would give an 'undefined index' warning.
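
To make the difference concrete, here is a minimal, self-contained sketch of the two validation behaviours. It is not the code of either Avro library: the {{is_valid}} function, its {{strict}} flag and the simplified schema representation are hypothetical and only cover the types used in this issue. With {{strict=True}} it mimics the key-existence check in the PHP library; with {{strict=False}} it mimics the {{datum.get(...)}} lookup described above.

```python
# Simplified stand-in for the schema in this issue (not an Avro library object).
MEMBER = {
    "type": "record",
    "name": "member",
    "fields": [
        {"name": "one", "type": "int"},
        {"name": "two", "type": ["null", "string"]},
    ],
}


def is_valid(schema, datum, strict=False):
    """Hypothetical validator covering only null, int, string, union and record."""
    if isinstance(schema, list):
        # Union: the datum must match at least one branch.
        return any(is_valid(branch, datum, strict) for branch in schema)
    if schema == "null":
        return datum is None
    if schema == "int":
        return isinstance(datum, int) and not isinstance(datum, bool)
    if schema == "string":
        return isinstance(datum, str)
    if isinstance(schema, dict) and schema.get("type") == "record":
        if not isinstance(datum, dict):
            return False
        for field in schema["fields"]:
            # Strict mode models the key-existence check: a missing key fails outright.
            if strict and field["name"] not in datum:
                return False
            # Lenient mode: a missing key is looked up as None, which matches "null".
            if not is_valid(field["type"], datum.get(field["name"]), strict):
                return False
        return True
    return False


print(is_valid(MEMBER, {"one": 1}, strict=True))   # strict: missing "two" is rejected
print(is_valid(MEMBER, {"one": 1}, strict=False))  # lenient: missing "two" counts as null
```

A missing required field such as "one" is still rejected in lenient mode, because {{None}} is not a valid "int"; only fields whose type accepts "null" become optional.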



> PHP library can't serialise records with optional (union-null) values
> ---------------------------------------------------------------------
>
>                 Key: AVRO-1664
>                 URL: https://issues.apache.org/jira/browse/AVRO-1664
>             Project: Avro
>          Issue Type: Bug
>          Components: php
>    Affects Versions: 1.7.7
>         Environment: php 5.5.15 OS X 10.9 
>            Reporter: Paul Banks
>
> The PHP Avro serialiser doesn't appear to support "optional" fields in records.
> Consider the PHP script below:
> {code}
> <?php
> require_once('lib/avro.php');
> $schema_json = <<<_JSON
> {"name":"member",
>  "type":"record",
>  "fields":[{"name":"one", "type":"int"},
>            {"name":"two", "type":["null", "string"]}
>            ]}
> _JSON;
> $schema = AvroSchema::parse($schema_json);
> // Our datum is missing the 'optional' field (i.e. it's null)
> $datum = array("one" => 1);
> $io = new AvroStringIO();
> $writer = new AvroIODatumWriter($schema);
> $encoder = new AvroIOBinaryEncoder($io);
> $writer->write($datum, $encoder);
> $bin = $io->string();
> echo bin2hex($bin) . "\n";
> {code}
> My understanding from the documentation is that this should work and output the encoded binary in hex.
> Instead it throws:
> {code}
> PHP Fatal error:  Uncaught exception 'AvroIOTypeException' with message 'The datum array (
>   'one' => 1,
> ) is not an example of schema {"type":"record","name":"member","fields":[{"name":"one","type":"int"},{"name":"two","type":["null","string"]}]}'
> {code}
> It's possible that this is not a valid usage of Avro and I'm mistaken in my expectations, so I tried the Python library as a comparison. Sure enough, the following script works as expected:
> {code}
> from avro import schema
> from avro import io
> from StringIO import StringIO
> s = schema.parse("""
>     {"name":"member",
>     "type":"record",
>     "fields":[{"name":"one", "type":"int"},
>               {"name":"two", "type":["null", "string"]}
>              ]}""")
> writer = StringIO()
> encoder = io.BinaryEncoder(writer)
> datum_writer = io.DatumWriter(s)
> datum_writer.write({"one": 1}, encoder)
> print writer.getvalue().encode("hex")
> {code}
> which outputs:
> {code}
> $ python avro_test.py
> 0200
> {code}
> As expected.
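
The {{0200}} output can be checked by hand. In Avro's binary encoding, ints and longs are written as zig-zag variable-length integers, and a union is written as the zero-based index of the chosen branch (as a long) followed by the value; a null value itself takes no bytes. The sketch below is a hand-rolled illustration for this one schema, not the Avro library's encoder; {{zigzag_varint}} and {{encode_member}} are hypothetical helper names.

```python
def zigzag_varint(n: int) -> bytes:
    """Encode a signed 64-bit integer as a zig-zag variable-length integer."""
    z = (n << 1) ^ (n >> 63)  # zig-zag: small magnitudes become small unsigned values
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_member(one: int, two=None) -> bytes:
    """Hand-encode the 'member' record: an int, then a ["null", "string"] union."""
    encoded = zigzag_varint(one)
    if two is None:
        # Union branch 0 ("null"); the null value adds no bytes.
        encoded += zigzag_varint(0)
    else:
        # Union branch 1 ("string"): byte length, then the UTF-8 bytes.
        raw = two.encode("utf-8")
        encoded += zigzag_varint(1) + zigzag_varint(len(raw)) + raw
    return encoded


print(encode_member(1).hex())  # 0200
```

The first byte {{02}} is the zig-zag encoding of the int 1, and the second byte {{00}} selects the "null" branch of the union.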



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)