Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/05/12 02:41:08 UTC

[GitHub] [iceberg] x-malet opened a new issue #2586: Add geometry type to iceberg

x-malet opened a new issue #2586:
URL: https://github.com/apache/iceberg/issues/2586


   Hi everyone,
   
   I was experimenting with Trino and Spark to manipulate geospatial data and store it in an Iceberg table, but the geometry type is not supported yet. Is there a plan to add support for it?
   
   I store the geometries as binary for now. Even though that is fast, it would be easier to store them as a geometry type and not have to convert them for every geospatial operation (intersections, overlaps, etc.).
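
To make the binary workaround concrete, here is a minimal sketch that assumes the geometries are stored as WKB (well-known binary). The thread does not say which encoding is actually used, and the helper names `encode_point_wkb` and `decode_point_wkb` are hypothetical, written only for illustration.

```python
import struct
from typing import Tuple

WKB_POINT = 1  # geometry type code for a 2D Point in WKB


def encode_point_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB (hypothetical helper)."""
    # Layout: 1 byte-order flag (1 = little-endian), uint32 geometry type, two doubles.
    return struct.pack("<BIdd", 1, WKB_POINT, x, y)


def decode_point_wkb(blob: bytes) -> Tuple[float, float]:
    """Decode a WKB point back into (x, y) (hypothetical helper)."""
    # The first byte says how the rest of the payload is laid out.
    fmt = "<" if blob[0] == 1 else ">"
    geom_type, x, y = struct.unpack(fmt + "Idd", blob[1:21])
    if geom_type != WKB_POINT:
        raise ValueError(f"expected a WKB point, got geometry type {geom_type}")
    return x, y


# The value written to a binary column is an opaque 21-byte blob ...
blob = encode_point_wkb(-73.5673, 45.5017)
# ... and it has to be decoded again before any geospatial operation can use it.
x, y = decode_point_wkb(blob)
```

A native geometry type would let the engine skip the explicit decode step in user code; with a plain binary column, every reader has to know the encoding by convention.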
   
   Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick edited a comment on issue #2586: Add geometry type to iceberg

Posted by GitBox <gi...@apache.org>.
kbendick edited a comment on issue #2586:
URL: https://github.com/apache/iceberg/issues/2586#issuecomment-840356830


   I'm not necessarily opposed to the idea, but support for geospatial types seems like something that would need to be added to the various query engines (Trino, Spark, etc.).
   
   If they supported geospatial data structures, or even just had geospatial functions like PostGIS does, then I think it would definitely make sense to add support for them in Iceberg.
   
   However, Iceberg is simply a table format for use by the various big data query engines (Spark, Trino, Flink, Hive, ...). Without first-class support for the geospatial types in those engines, I can't see what you'd be able to do with the stored geometry data types other than fall back to binary or string types for manipulation.
   
   Spark and Trino are responsible for the query processing. Iceberg is more for the table definition as well as the source and the sink (and occasionally things like specialized joins, specialized commands to update the table's metadata, which goes along with being a table definition, etc.).
   
   So I can't see support for geometric shapes being added to Iceberg unless the major query engines that we support add them.
   
   I know there's spark-gis (Spark is very popular, so there are many Spark libraries), but is geospatial data supported outside of the occasional third-party Spark library? For example, does Trino support geometry operations (intersections, overlaps, contains, etc.)?




[GitHub] [iceberg] x-malet commented on issue #2586: Add geometry type to iceberg

Posted by GitBox <gi...@apache.org>.
x-malet commented on issue #2586:
URL: https://github.com/apache/iceberg/issues/2586#issuecomment-840593904


   Hi @kbendick, thanks for your answer. Yes, there is support for geospatial data in Trino ([Trino geospatial functions](https://trino.io/docs/current/functions/geospatial.html)), and Spark now has a major project supported by Apache ([Apache Sedona](http://sedona.apache.org/)). The problem is that every time you want to process a massive dataset stored in binary, you have to convert it from binary to geometry and then process it.
   
   On the other hand, I think that storing a geometry in a binary format makes it more exchangeable between systems and allows more flexibility...
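
Both sides of this trade-off can be sketched in a few lines. The example assumes the binary column holds WKB points; that encoding is not stated in the thread, and `decode_point_wkb` and `points_in_bbox` are hypothetical helpers, not functions from Trino, Sedona, or Iceberg. The decoder accepts either byte order, which is what makes the format portable, while the filter shows the decode step repeated for every row.

```python
import struct
from typing import Iterable, List, Tuple


def decode_point_wkb(blob: bytes) -> Tuple[float, float]:
    """Decode a WKB point written in either byte order (hypothetical helper)."""
    fmt = "<" if blob[0] == 1 else ">"  # byte-order flag: 1 = little-endian, 0 = big-endian
    geom_type, x, y = struct.unpack(fmt + "Idd", blob[1:21])
    if geom_type != 1:  # 1 is the WKB type code for Point
        raise ValueError(f"expected a WKB point, got geometry type {geom_type}")
    return x, y


def points_in_bbox(
    blobs: Iterable[bytes], min_x: float, min_y: float, max_x: float, max_y: float
) -> List[Tuple[float, float]]:
    """Return the points that fall inside the bounding box (hypothetical helper)."""
    hits = []
    for blob in blobs:
        # The binary-to-geometry conversion runs once per row, for every query.
        x, y = decode_point_wkb(blob)
        if min_x <= x <= max_x and min_y <= y <= max_y:
            hits.append((x, y))
    return hits


# Rows as two different writers might have produced them.
rows = [
    struct.pack("<BIdd", 1, 1, 1.0, 1.0),  # little-endian writer
    struct.pack(">BIdd", 0, 1, 5.0, 5.0),  # big-endian writer
    struct.pack("<BIdd", 1, 1, 2.0, 3.0),
]
inside = points_in_bbox(rows, 0.0, 0.0, 3.0, 3.0)
```

In a real engine the decode would happen inside a built-in or library function rather than a Python loop, but the per-row conversion cost is the same concern.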



