Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/01/27 17:26:39 UTC

[GitHub] [iceberg] SinghAsDev opened a new issue #3994: Allow table properties defaults to be configured at catalog level

SinghAsDev opened a new issue #3994:
URL: https://github.com/apache/iceberg/issues/3994


   Requiring every user to set platform-recommended defaults themselves hurts usability, and not all users remember to do so. For example, we would like zstd to be the default Parquet compression codec, but today that depends on every user who creates a table setting it through a table property. We have similar use cases for other properties as well. Maybe table property defaults could be configured through catalog properties instead.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] anuragmantri commented on issue #3994: Allow table properties defaults to be configured at catalog level

Posted by GitBox <gi...@apache.org>.
anuragmantri commented on issue #3994:
URL: https://github.com/apache/iceberg/issues/3994#issuecomment-1023561013


   Great. Thanks for working on this @SinghAsDev.



[GitHub] [iceberg] SinghAsDev commented on issue #3994: Allow table properties defaults to be configured at catalog level

Posted by GitBox <gi...@apache.org>.
SinghAsDev commented on issue #3994:
URL: https://github.com/apache/iceberg/issues/3994#issuecomment-1023559327


   Great, I agree with the `default` and `required` configs. We are working on a PR for this and will tag you both for review.



[GitHub] [iceberg] jackye1995 commented on issue #3994: Allow table properties defaults to be configured at catalog level

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on issue #3994:
URL: https://github.com/apache/iceberg/issues/3994#issuecomment-1023493891


   We actually have a similar use case: EMR clusters would benefit from cluster-level configs that are applied to all CREATE TABLE operations, and it's not hard to achieve.
   
   I would imagine that a Spark cluster admin could do something like the following (I use a CLI example, but this could also go in the Spark default config file):
   
   ```
   spark-sql \
       --conf spark.sql.catalog.glue=org.apache.iceberg.spark.SparkCatalog \
       --conf spark.sql.catalog.glue.warehouse=s3://bucket/glue \
       --conf spark.sql.catalog.glue.catalog-impl=org.apache.iceberg.aws.glue.GlueCatalog \
       --conf spark.sql.catalog.glue.table.create.default.write.parquet.compression-codec=zstd \
       --conf spark.sql.catalog.glue.table.create.default.write.parquet.dict-size-bytes=4194304
   ```
   
   Everything under `table.create.default` is read by the catalog implementation and used as the default when a table is created. If the user provides the same property, the user's value wins.
   
   Another property prefix we could introduce is `table.create.required`: in that case, if the user provides the same property, it has no effect and the system value wins.
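
The precedence described above (catalog defaults, then user-supplied properties, then catalog-required values) can be sketched roughly as follows. This is a minimal illustration in Python, not Iceberg code: the `table.create.default.` and `table.create.required.` prefixes are the ones proposed in this thread, and the helper names `with_prefix_stripped` and `resolve_table_properties` are hypothetical. It also assumes the catalog sees its properties without the `spark.sql.catalog.glue.` part.

```python
from typing import Dict

# Prefixes as proposed in this thread; hypothetical, not a confirmed Iceberg API.
DEFAULT_PREFIX = "table.create.default."
REQUIRED_PREFIX = "table.create.required."


def with_prefix_stripped(catalog_props: Dict[str, str], prefix: str) -> Dict[str, str]:
    """Return the catalog properties that start with `prefix`, with the prefix removed."""
    return {
        key[len(prefix):]: value
        for key, value in catalog_props.items()
        if key.startswith(prefix)
    }


def resolve_table_properties(
    catalog_props: Dict[str, str], user_props: Dict[str, str]
) -> Dict[str, str]:
    """Merge properties for a new table: defaults < user-supplied < required."""
    defaults = with_prefix_stripped(catalog_props, DEFAULT_PREFIX)
    required = with_prefix_stripped(catalog_props, REQUIRED_PREFIX)
    # In a dict merge, later entries overwrite earlier ones with the same key.
    return {**defaults, **user_props, **required}


# Example catalog configuration (values are illustrative only).
catalog_props = {
    "warehouse": "s3://bucket/glue",
    "table.create.default.write.parquet.compression-codec": "zstd",
    "table.create.required.write.object-storage.enabled": "true",
}

# The user overrides a default (allowed) and a required value (ignored).
user_props = {
    "write.parquet.compression-codec": "gzip",
    "write.object-storage.enabled": "false",
}

resolved = resolve_table_properties(catalog_props, user_props)
```

In this sketch the user's `gzip` replaces the `zstd` default, while the required `write.object-storage.enabled=true` is kept regardless of what the user passes. Unrelated catalog properties such as `warehouse` are not copied onto the table.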



[GitHub] [iceberg] anuragmantri commented on issue #3994: Allow table properties defaults to be configured at catalog level

Posted by GitBox <gi...@apache.org>.
anuragmantri commented on issue #3994:
URL: https://github.com/apache/iceberg/issues/3994#issuecomment-1023540702


   +1 to this request. We also have some properties, like `write.object-storage.enabled`, that we want to set as admins and that we don't want users to set or worry about. As @jackye1995 mentioned, we could have three levels:
   - default - users can override
   - required - users cannot override
   - Another category for anything that can be added to defaults. (In Spark we do that for properties like `spark.executor.extraJavaOptions`. Not sure if Iceberg has such properties.)
   
   I can take a look at how this can be done. Any suggestions welcome.

