Posted to issues@drill.apache.org by "Stefán Baxter (JIRA)" <ji...@apache.org> on 2015/11/10 15:40:11 UTC
[jira] [Created] (DRILL-4056) Avro deserialization
Stefán Baxter created DRILL-4056:
------------------------------------
Summary: Avro deserialization
Key: DRILL-4056
URL: https://issues.apache.org/jira/browse/DRILL-4056
Project: Apache Drill
Issue Type: Bug
Components: Storage - Other
Affects Versions: 1.3.0
Environment: Ubuntu 15.04 - Oracle Java
Reporter: Stefán Baxter
Fix For: 1.3.0
I have an Avro file whose records have the following shape:
{"field":"some", "classification":{"variant":"Gæst"}}
When I select 10 rows from this file I get:
+---------------------+
| EXPR$0              |
+---------------------+
| Gæst                |
| Voksen              |
| Voksen              |
| Invitation KIF KBH  |
| Invitation KIF KBH  |
| Ordinarie pris KBH  |
| Ordinarie pris KBH  |
| Biljetter 200 krBH  |
| Biljetter 200 krBH  |
| Biljetter 200 krBH  |
+---------------------+
The bug is that field values are deserialized incorrectly: when a row's value is shorter than the previous row's value, the trailing characters of the previous value are retained.
The sql query:
"select s.classification.variant variant from dfs.<some> as s limit 10;"
As a result, "Ordinarie pris" becomes "Ordinarie pris KBH" because the previous row had the longer value "Invitation KIF KBH".
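To make the symptom concrete, here is a minimal Java sketch of one way this kind of output can arise: a byte buffer is reused across rows, and the reader decodes the whole backing array instead of only the valid prefix. This is an assumption about the general class of bug, not a statement about Drill's or Avro's actual code; the names StaleBufferDemo, ReusableText, readBuggy and readCorrect are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not Drill's or Avro's actual code.
class StaleBufferDemo {

    // A reusable buffer that grows but never shrinks. Only the first
    // `length` bytes belong to the current value.
    static final class ReusableText {
        byte[] bytes = new byte[0];
        int length = 0;

        void set(String value) {
            byte[] encoded = value.getBytes(StandardCharsets.UTF_8);
            if (encoded.length > bytes.length) {
                bytes = new byte[encoded.length];
            }
            System.arraycopy(encoded, 0, bytes, 0, encoded.length);
            length = encoded.length;
        }
    }

    // Buggy read: decodes the entire backing array, stale tail included.
    static String readBuggy(ReusableText t) {
        return new String(t.bytes, StandardCharsets.UTF_8);
    }

    // Correct read: decodes only the valid prefix of the buffer.
    static String readCorrect(ReusableText t) {
        return new String(t.bytes, 0, t.length, StandardCharsets.UTF_8);
    }

    // Pushes every row through one shared buffer, as a reader might.
    static List<String> decodeAll(List<String> rows, boolean buggy) {
        ReusableText reused = new ReusableText();
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            reused.set(row);
            out.add(buggy ? readBuggy(reused) : readCorrect(reused));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows =
            Arrays.asList("Voksen", "Invitation KIF KBH", "Ordinarie pris");
        System.out.println(decodeAll(rows, true));   // third value keeps " KBH"
        System.out.println(decodeAll(rows, false));  // values come back unchanged
    }
}
```

With the buggy read, the third value comes back as "Ordinarie pris KBH", matching the output above; honoring the valid length restores "Ordinarie pris".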