Posted to dev@avro.apache.org by "Paul Banks (JIRA)" <ji...@apache.org> on 2015/06/30 15:02:04 UTC
[jira] [Commented] (AVRO-1664) PHP library can't serialise records with optional (union-null) values
[ https://issues.apache.org/jira/browse/AVRO-1664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608245#comment-14608245 ]
Paul Banks commented on AVRO-1664:
----------------------------------
It seems there is no interest in this issue from others, but to me it is a deal breaker for using Avro with PHP at all. I'd really like to use it, but PHP is a primary language in our stack and the originator of most of our data.
I traced the cause down to this line:
https://github.com/apache/avro/blob/trunk/lang/php/lib/avro/schema.php#L446
For a record type, the incoming array MUST contain a key for every field name, each holding a value that is valid for that field's type.
Note that the Python library behaves differently (and presumably correctly): {{datum.get(f.name)}} returns {{None}} when there is no such field, and {{None}} is a "correct" value for "null".
The fix, it seems, would be something like:
{code:php}
// Treat a missing (or explicitly null) field as null instead of rejecting the datum.
$value = isset($datum[$field->name()]) ? $datum[$field->name()] : null;
if (!self::is_valid_datum($field->type(), $value)) {
    return false;
}
{code}
(Untested; I'll review the correct procedure for submitting a patch later.)
Rationale: if the field is not set (or is explicitly {{null}}), assume its value is {{null}}. This is actually the default behaviour in PHP: just removing the {{array_key_exists}} check would be enough to fix the issue, although reading the missing key would then raise an 'undefined index' notice.
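To make the difference between the two behaviours concrete, here is a minimal sketch in Python. It is a simplified model of record validation, not the code of either Avro library: the names {{is_valid}}, {{validate_strict}}, {{validate_lenient}} and {{FIELDS}} are hypothetical, and only the "null", "int" and "string" types plus unions are modelled.

```python
# Simplified model of record validation; NOT the actual Avro library code.
# A schema type here is either a primitive name ("null", "int", "string")
# or a list of such names representing a union.

def is_valid(type_, value):
    """Return True if value matches the (simplified) schema type."""
    if isinstance(type_, list):  # union: valid if any branch matches
        return any(is_valid(branch, value) for branch in type_)
    if type_ == "null":
        return value is None
    if type_ == "int":
        return isinstance(value, int) and not isinstance(value, bool)
    if type_ == "string":
        return isinstance(value, str)
    return False

def validate_strict(fields, datum):
    """Models the PHP check described above: every field key must be present."""
    return all(name in datum and is_valid(type_, datum[name])
               for name, type_ in fields)

def validate_lenient(fields, datum):
    """Models the proposed fix: a missing key is treated as null."""
    return all(is_valid(type_, datum.get(name)) for name, type_ in fields)

# Same shape as the schema in the report: "two" is an optional string.
FIELDS = [("one", "int"), ("two", ["null", "string"])]

print(validate_strict(FIELDS, {"one": 1}))     # False
print(validate_lenient(FIELDS, {"one": 1}))    # True
print(validate_lenient(FIELDS, {"two": "x"}))  # False: "one" is missing and null is not an int
```

The last call shows that the lenient check still rejects a datum that omits a required field, because null is not a valid value for a non-union type.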
> PHP library can't serialise records with optional (union-null) values
> ---------------------------------------------------------------------
>
> Key: AVRO-1664
> URL: https://issues.apache.org/jira/browse/AVRO-1664
> Project: Avro
> Issue Type: Bug
> Components: php
> Affects Versions: 1.7.7
> Environment: php 5.5.15 OS X 10.9
> Reporter: Paul Banks
>
> The PHP Avro library doesn't appear to support "optional" fields in records when serialising.
> Consider the PHP script below:
> {code}
> <?php
> require_once('lib/avro.php');
> $schema_json = <<<_JSON
> {"name":"member",
> "type":"record",
> "fields":[{"name":"one", "type":"int"},
> {"name":"two", "type":["null", "string"]}
> ]}
> _JSON;
> $schema = AvroSchema::parse($schema_json);
> // Our datum is missing the 'optional' field (i.e. it's null)
> $datum = array("one" => 1);
> $io = new AvroStringIO();
> $writer = new AvroIODatumWriter($schema);
> $encoder = new AvroIOBinaryEncoder($io);
> $writer->write($datum, $encoder);
> $bin = $io->string();
> echo bin2hex($bin) . "\n";
> {code}
> My understanding from the documentation is that this should work and output the encoded binary in hex.
> Instead it throws:
> {code}
> PHP Fatal error: Uncaught exception 'AvroIOTypeException' with message 'The datum array (
> 'one' => 1,
> ) is not an example of schema {"type":"record","name":"member","fields":[{"name":"one","type":"int"},{"name":"two","type":["null","string"]}]}'
> {code}
> It's possible that this is not a valid usage of Avro and I'm mistaken in my expectations, so I tried the python library as a comparison. Sure enough the following script works as expected:
> {code}
> from avro import schema
> from avro import io
> from StringIO import StringIO
> s = schema.parse("""
> {"name":"member",
> "type":"record",
> "fields":[{"name":"one", "type":"int"},
> {"name":"two", "type":["null", "string"]}
> ]}""")
> writer = StringIO()
> encoder = io.BinaryEncoder(writer)
> datum_writer = io.DatumWriter(s)
> datum_writer.write({"one": 1}, encoder)
> print writer.getvalue().encode("hex")
> {code}
> which outputs:
> {code}
> $ python avro_test.py
> 0200
> {code}
> As expected.
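For reference, the {{0200}} in the quoted report is what Avro's binary encoding gives for this datum: the int 1 is zig-zag encoded as the single byte 0x02, and the null branch of the union is written as branch index 0 (byte 0x00) with no payload. The sketch below reproduces that arithmetic by hand; the helper names {{zigzag_varint}} and {{encode_member}} are made up for illustration and are not part of any Avro library.

```python
def zigzag_varint(n):
    """Encode a signed integer the way Avro encodes int/long values:
    zig-zag mapping followed by a base-128 variable-length encoding."""
    z = (n << 1) ^ (n >> 63)  # zig-zag maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ...
    out = bytearray()
    while z & ~0x7F:          # more than 7 bits left: emit a continuation byte
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)

def encode_member(one, two=None):
    """Hand-encode a record with fields one:int and two:["null","string"]."""
    body = zigzag_varint(one)
    if two is None:
        body += zigzag_varint(0)  # union branch 0 = "null", no payload follows
    else:
        raw = two.encode("utf-8")
        # union branch 1 = "string": length prefix, then the UTF-8 bytes
        body += zigzag_varint(1) + zigzag_varint(len(raw)) + raw
    return body

print(encode_member(1).hex())  # 0200
```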