Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/07/24 06:33:51 UTC

[GitHub] [iceberg] kbendick opened a new issue, #5349: Implement Spark’s FunctionCatalog for Existing Transform functions

kbendick opened a new issue, #5349:
URL: https://github.com/apache/iceberg/issues/5349

   We need to implement Spark’s `FunctionCatalog` so that we can use the partition transformation functions in queries.
   
   This allows for using the partition transforms on non-partition columns in generated code.
   
   This is necessary in order to write Catalyst rules that pass `bucket`, so that storage partitioned joins (aka bucketed joins) can be implemented.
   
   See also:
   - [FunctionCatalog](https://spark.apache.org/docs/latest//api/java/index.html?org/apache/spark/sql/connector/catalog/FunctionCatalog.html)
   - [ScalarFunction](https://spark.apache.org/docs/3.2.0/api/java/org/apache/spark/sql/connector/catalog/functions/ScalarFunction.html) class, which has a practical description of what is needed for code generation and the benefits of it.
   
   
   The functions we have that are likely highest priority:
   - truncate
   - bucket
   - zorder
   - date transformations
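
   As a rough illustration of the kind of function involved, here is a minimal sketch of the integer `truncate` transform written as static methods with primitive arguments, which is the shape the ScalarFunction Javadoc describes for the code-generation-friendly `invoke` "magic method". The class name `TruncateIntSketch` and the `(width, value)` argument order are assumptions made for this example, not Iceberg's actual implementation, and the Spark interfaces themselves are deliberately left out so the snippet compiles on its own.

```java
// Hypothetical sketch: the class name and argument order are illustrative only.
// It does not implement Spark's ScalarFunction interface; it only shows the
// static, primitive-typed method shape and the truncate arithmetic.
final class TruncateIntSketch {
  private TruncateIntSketch() {}

  // Truncate an int down to the nearest multiple of width.
  // floorMod keeps the remainder non-negative, so negative values round down.
  static int invoke(int width, int value) {
    if (width <= 0) {
      throw new IllegalArgumentException("width must be positive: " + width);
    }
    return value - Math.floorMod(value, width);
  }

  // Same logic for long values.
  static long invoke(int width, long value) {
    if (width <= 0) {
      throw new IllegalArgumentException("width must be positive: " + width);
    }
    return value - Math.floorMod(value, (long) width);
  }

  public static void main(String[] args) {
    System.out.println(invoke(10, 17)); // prints 10
    System.out.println(invoke(10, -1)); // prints -10
  }
}
```

   A real implementation would additionally need to be exposed through a `FunctionCatalog` so that Spark can resolve it by name; that wiring is not shown here.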


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi closed issue #5349: Implement Spark’s FunctionCatalog for Existing Transformations

Posted by GitBox <gi...@apache.org>.
aokolnychyi closed issue #5349: Implement Spark’s FunctionCatalog for Existing Transformations
URL: https://github.com/apache/iceberg/issues/5349




[GitHub] [iceberg] kbendick commented on issue #5349: Implement Spark’s FunctionCatalog for Existing Transformations

Posted by GitBox <gi...@apache.org>.
kbendick commented on issue #5349:
URL: https://github.com/apache/iceberg/issues/5349#issuecomment-1193261706

   This will allow us to make use of Spark’s storage partitioned joins (aka bucket joins, which are one subset of the possible join optimizations on transformed columns): https://issues.apache.org/jira/browse/SPARK-37166




[GitHub] [iceberg] kbendick commented on issue #5349: Implement Spark’s FunctionCatalog for Existing Transformations

Posted by GitBox <gi...@apache.org>.
kbendick commented on issue #5349:
URL: https://github.com/apache/iceberg/issues/5349#issuecomment-1197301844

   This relates to https://github.com/apache/iceberg/issues/430

