Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/07/28 11:08:44 UTC

[GitHub] [iceberg] findepi commented on a change in pull request #1611: DOCS: describe type compatibility between Spark and Iceberg

findepi commented on a change in pull request #1611:
URL: https://github.com/apache/iceberg/pull/1611#discussion_r678201399



##########
File path: site/docs/spark.md
##########
@@ -728,3 +730,82 @@ spark.read.format("iceberg").load("db.table.files").show(truncate = false)
 // Hadoop path table
 spark.read.format("iceberg").load("hdfs://nn:8020/path/to/table#files").show(truncate = false)
 ```
+
+## Type compatibility
+
+Spark and Iceberg support different sets of types. Iceberg converts types automatically, but not for all combinations,
+so you may want to understand how Iceberg converts types before designing the column types of your tables.
+
+### Spark type to Iceberg type on creating table
+
+This type conversion table describes how Spark types are converted to Iceberg types. The conversion applies when creating an Iceberg table via Spark without using the Iceberg core API.
+
+| Spark           | Iceberg                 | Notes |
+|-----------------|-------------------------|-------|
+| boolean         | boolean                 |       |
+| integer         | integer                 |       |
+| short           | integer                 |       |
+| byte            | integer                 |       |
+| long            | long                    |       |
+| float           | float                   |       |
+| double          | double                  |       |
+| date            | date                    |       |
+| timestamp       | timestamp with timezone |       |
+| string          | string                  |       |
+| char            | string                  |       |
+| varchar         | string                  |       |
+| binary          | binary                  |       |
+| decimal         | decimal                 |       |
+| struct          | struct                  |       |
+| array           | list                    |       |
+| map             | map                     |       |
+
+The type conversion is asymmetric: this table doesn't represent the Iceberg types that Spark can "read" from or "write" to.
+The following sections describe which Iceberg types Spark can read and write.
+
+### Iceberg to Spark on reading from Iceberg table
+
+| Iceberg                    | Spark                   | Notes |
+|----------------------------|-------------------------|-------|
+| boolean                    | boolean                 |       |
+| integer                    | integer                 |       |
+| long                       | long                    |       |
+| float                      | float                   |       |
+| double                     | double                  |       |
+| date                       | date                    |       |
+| time                       | <N/A>                   |       |
+| timestamp with timezone    | timestamp               |       |
+| timestamp without timezone | <N/A>                   |       |
+| string                     | string                  |       |
+| uuid                       | string                  |       |

Review comment:
       Followed up at https://github.com/trinodb/trino/issues/6663 and on the mailing list.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


