Posted to dev@avro.apache.org by "Bram Biesbrouck (JIRA)" <ji...@apache.org> on 2016/01/18 22:03:39 UTC
[jira] [Commented] (AVRO-457) add tools that read/write xml records
from/to avro data files
[ https://issues.apache.org/jira/browse/AVRO-457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15105832#comment-15105832 ]
Bram Biesbrouck commented on AVRO-457:
--------------------------------------
Please allow me to comment on this after having used Michael's project (https://github.com/mikepigott/xml-to-avro) on the official (and fairly complex) ebucore.xsd schema, version 1.6 (see https://tech.ebu.ch/MetadataEbuCore and https://www.ebu.ch/metadata/schemas/EBUCore/ebucore.zip).
To me, from a developer's point of view, the need for the tool Michael has written is very high: nearly all official ontologies release their versions as XML Schema (XSD) files. Just as with the XJC (and by extension the JAXB) project, it's important to have de facto standard projects to convert them to working memory models. Having a reliable XSD->AVSC converter would be awesome.
I've played around with Michael's code and got it to successfully generate an Avro schema from the ebucore.xsd file. However, I had to make a lot of modifications to the original file, because not all standards are implemented in xml-to-avro (for one, elements with default, empty types crash the converter).
After having tried four solutions:
1) https://github.com/stealthly/xml-avro
2) https://github.com/mikepigott/xml-to-avro
3) https://github.com/nokia/Avro-Schema-Generator
4) https://github.com/FasterXML/jackson-dataformat-avro
I conclude that solution 1 is the best for now, because it works out of the box without modifications and generates a more type-safe schema (than Michael's converter), although for complex schemas like ebucore, double types are introduced (e.g. Double1, Double2, ...).
All this to make a point: I, together with a lot of other developers, truly see the need for an official XSD->AVSC converter, so please consider it. I can help with testing, but I'm no XSD expert.
You might want to contact the folks at https://github.com/stealthly/xml-avro.
bram
> add tools that read/write xml records from/to avro data files
> -------------------------------------------------------------
>
> Key: AVRO-457
> URL: https://issues.apache.org/jira/browse/AVRO-457
> Project: Avro
> Issue Type: New Feature
> Components: java
> Affects Versions: 1.7.8
> Reporter: Doug Cutting
> Labels: gsoc
> Attachments: AVRO-457.patch, AVRO-457.patch, AVRO-457.patch, AVRO-457.patch
>
>
> It might be useful to have command-line tools that can read & write arbitrary XML data from & to Avro data files.