Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2019/11/01 20:55:47 UTC

[GitHub] [incubator-iceberg] rdblue commented on issue #601: Fix Parquet with special characters in field names.

rdblue commented on issue #601: Fix Parquet with special characters in field names.
URL: https://github.com/apache/incubator-iceberg/pull/601#issuecomment-548947670
 
 
   @rdsr, please have another look. I added tests to iceberg-data and ended up needing to fix a couple of things:
   
   * `BuildAvroProjection` couldn't project fields with special characters in the name because Avro would reject the projection schema. I added a couple of calls to `makeCompatibleName` to fix this. It should be safe because these cases already failed before whenever a name contained special characters.
   * The generic `DataReader` created iceberg-data records with a schema converted from Avro. When using sanitized names, the records would not use the right field names for `getField`. The fix was to pass in the original Iceberg schema. To make this easy, I added a new visitor.
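
To illustrate the two problems above, here is a minimal, self-contained sketch. It is not the Iceberg code: `sanitize` is a hypothetical stand-in for `makeCompatibleName` (the real escaping rules may differ), and `ToyRecord` is a hypothetical stand-in for an iceberg-data record. The only fact assumed about Avro is that names must match `[A-Za-z_][A-Za-z0-9_]*`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SanitizedNames {
  // Hypothetical sanitizer (an assumption, not Iceberg's makeCompatibleName):
  // any character Avro would reject is replaced by "_x" plus its hex code.
  static String sanitize(String name) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < name.length(); i++) {
      char c = name.charAt(i);
      boolean letter = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
      boolean digit = c >= '0' && c <= '9';
      if (letter || (digit && i > 0)) {
        sb.append(c); // valid in this position of an Avro name
      } else {
        sb.append("_x").append(Integer.toHexString(c).toUpperCase());
      }
    }
    return sb.toString();
  }

  // Hypothetical record that resolves fields by name, like getField would.
  static class ToyRecord {
    private final Map<String, Integer> positions = new HashMap<>();
    private final Object[] values;

    ToyRecord(List<String> fieldNames) {
      for (int i = 0; i < fieldNames.size(); i++) {
        positions.put(fieldNames.get(i), i);
      }
      this.values = new Object[fieldNames.size()];
    }

    void set(int pos, Object value) {
      values[pos] = value;
    }

    Object getField(String name) {
      Integer pos = positions.get(name);
      return pos == null ? null : values[pos];
    }
  }

  public static void main(String[] args) {
    // Names as they appear in the table schema.
    List<String> tableNames = Arrays.asList("id", "user.name");

    // Names as they would appear in a file schema after sanitizing.
    List<String> fileNames = new ArrayList<>();
    for (String name : tableNames) {
      fileNames.add(sanitize(name));
    }

    // Record built from the converted (sanitized) names: lookup by the
    // original name misses.
    ToyRecord fromFileSchema = new ToyRecord(fileNames);
    fromFileSchema.set(1, "ann");
    System.out.println(fromFileSchema.getField("user.name"));

    // Record built from the original names: lookup works.
    ToyRecord fromTableSchema = new ToyRecord(tableNames);
    fromTableSchema.set(1, "ann");
    System.out.println(fromTableSchema.getField("user.name"));
  }
}
```

In this sketch the first lookup returns `null` and the second returns `ann`, which is the same shape as the `getField` problem: the record has to be built from the original schema, not from the one converted back from the sanitized file schema.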
   
   I think the second fix also addresses the case introduced by #207 when the Avro names don't match because they shouldn't be projected. Next, we should be able to fix Avro reads by using a similar pattern to iceberg-data, but one that produces Avro generics.
