You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/08/06 03:13:42 UTC

[GitHub] [iceberg] openinx opened a new issue #1302: Missing Types.UUIDType in SUPPORTED_PRIMITIVES

openinx opened a new issue #1302:
URL: https://github.com/apache/iceberg/issues/1302


   I found that we did not put uuid type in `org.apache.iceberg.data.DataTest`, and after added it in the `SUPPORTED_PRIMITIVES` (using the following patch),  it would break many unit tests.
   
   ```patch
   diff --git a/data/src/test/java/org/apache/iceberg/data/DataTest.java b/data/src/test/java/org/apache/iceberg/data/DataTest.java
   index 5566a753..03f848a9 100644
   --- a/data/src/test/java/org/apache/iceberg/data/DataTest.java
   +++ b/data/src/test/java/org/apache/iceberg/data/DataTest.java
   @@ -56,7 +56,8 @@ public abstract class DataTest {
          required(114, "dec_9_0", Types.DecimalType.of(9, 0)),
          required(115, "dec_11_2", Types.DecimalType.of(11, 2)),
          required(116, "dec_38_10", Types.DecimalType.of(38, 10)), // maximum precision
   -      required(117, "time", Types.TimeType.get())
   +      required(117, "time", Types.TimeType.get()),
   +      required(118, "uuid", Types.UUIDType.get())
      );
    
      @Rule
   ```
   
   The broken unit tests: 
   
   ![image](https://user-images.githubusercontent.com/5028729/89486370-80f88d00-d7d5-11ea-9bc2-f998851265e8.png)
   
   We should add the uuid type in the tests and fix those broken cases.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] JingsongLi commented on issue #1302: Missing Types.UUIDType in SUPPORTED_PRIMITIVES

Posted by GitBox <gi...@apache.org>.
JingsongLi commented on issue #1302:
URL: https://github.com/apache/iceberg/issues/1302#issuecomment-669670967


   Although Spark does not inherit this test, but looks like Spark Avro writer also has bug too.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on issue #1302: Missing Types.UUIDType in SUPPORTED_PRIMITIVES

Posted by GitBox <gi...@apache.org>.
rdblue commented on issue #1302:
URL: https://github.com/apache/iceberg/issues/1302#issuecomment-671451730


   This wouldn't be too difficult to fix. The problem before was that Parquet didn't have a UUID type annotation so there was no way to encode that a fixed was a UUID and correctly infer the schema. Now that this has been fixed, we could start using it. The larger issue now is that no processing engines can create a UUID type.
   
   I'm still leaning toward removing it from the spec because there is no support for it in engines. I think a better way to do what this was attempting to is to add support in processing engines first, then support it in Iceberg if we need to.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] openinx commented on issue #1302: Missing Types.UUIDType in SUPPORTED_PRIMITIVES

Posted by GitBox <gi...@apache.org>.
openinx commented on issue #1302:
URL: https://github.com/apache/iceberg/issues/1302#issuecomment-669668989


   > we don't visit with the accompanying Iceberg schema 
   
   I think we may need to pass both parquet schema and iceberg schema to the visitor so that we could write the UUID value to binary correctly. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] JingsongLi commented on issue #1302: Missing Types.UUIDType in SUPPORTED_PRIMITIVES

Posted by GitBox <gi...@apache.org>.
JingsongLi commented on issue #1302:
URL: https://github.com/apache/iceberg/issues/1302#issuecomment-671685597


   +1 to remove UUID Type now.
   Users can choose to use FIX_BYTES or STRING.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] shardulm94 commented on issue #1302: Missing Types.UUIDType in SUPPORTED_PRIMITIVES

Posted by GitBox <gi...@apache.org>.
shardulm94 commented on issue #1302:
URL: https://github.com/apache/iceberg/issues/1302#issuecomment-671151713


   I recently read @rdblue's comment on Iceberg Slack suggesting future removal of UUID type from the spec. Would be great to hear Ryan's opinion if this is worth fixing or if we should rather remove the type.
   
   >David Stryker Jul 14th at 12:36 PM
   Hi @rdblue  The Iceberg Presto connector doesn't currently support the UUID type.  David Phillips and I were wondering if you thought it was worthwhile to provide that support. (edited) 
   Ryan Blue
   No, we don't support it in other models either. We should remove it from the spec.
   
   https://the-asf.slack.com/archives/CF01LKV9S/p1594755385061000


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] openinx commented on issue #1302: Missing Types.UUIDType in SUPPORTED_PRIMITIVES

Posted by GitBox <gi...@apache.org>.
openinx commented on issue #1302:
URL: https://github.com/apache/iceberg/issues/1302#issuecomment-669675486


   Yes, spark writer also have this bug.  I saw we [commented](https://github.com/apache/iceberg/commit/270bb8f353866b9e4b55b1e9c51504442192598d) the uuid type in spark test case. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] chenjunjiedada commented on issue #1302: Missing Types.UUIDType in SUPPORTED_PRIMITIVES

Posted by GitBox <gi...@apache.org>.
chenjunjiedada commented on issue #1302:
URL: https://github.com/apache/iceberg/issues/1302#issuecomment-669661217


   I met this as well.  This is due to we don't visit with the accompanying Iceberg schema so `GenericParquetWriter` cannot convert the  UUID value before writing to parquet binary. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] electrum commented on issue #1302: Missing Types.UUIDType in SUPPORTED_PRIMITIVES

Posted by GitBox <gi...@apache.org>.
electrum commented on issue #1302:
URL: https://github.com/apache/iceberg/issues/1302#issuecomment-678548226


   Presto does have a first class UUID type: https://prestosql.io/docs/current/language/types.html#uuid
   
   It would be straightforward to add this in the Presto Iceberg connector if we want to support it.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org