You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/01/05 15:10:59 UTC

[GitHub] [iceberg] rymurr opened a new pull request #2031: Fix properties for custom catalogs in Flink

rymurr opened a new pull request #2031:
URL: https://github.com/apache/iceberg/pull/2031


   When testing Nessie as a custom catalog I came across Flink (1.11.3) not accepting custom catalog properties or `catalog-impl`. This PR adds `catalog-impl` as a property and a wildcard suffix for custom catalog properties, The result is:
   
   ```
   CREATE CATALOG nessie_catalog WITH (
       'type'='iceberg',
       'catalog-impl'='org.apache.iceberg.nessie.NessieCatalog',
       'custom-catalog.url'='http://localhost:19120/api/v1',
       'custom-catalog.warehouse'='/home/ryan/warehouse');
   ```
   I am wondering if the community thinks this is the best pattern or if a better way is preferred? If this is good I will update the docs accordingly.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jackye1995 commented on pull request #2031: Fix properties for custom catalogs in Flink

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on pull request #2031:
URL: https://github.com/apache/iceberg/pull/2031#issuecomment-759139749


   > I think that we should allow all properties if we don't think that we can enumerate the set of catalog properties that we need.
   
   If we are fine with that, simply use the following can work:
   
   ```
   @Override
   public List<String> supportedProperties() {
       return ImmutableList.of("*");
   }


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jackye1995 commented on pull request #2031: Fix properties for custom catalogs in Flink

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on pull request #2031:
URL: https://github.com/apache/iceberg/pull/2031#issuecomment-757052208


   Sorry I missed the notification for this one. I tested with Flink on EMR which is an older version, let me try with the latest version and reply here.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue merged pull request #2031: Fix properties for custom catalogs in Flink

Posted by GitBox <gi...@apache.org>.
rdblue merged pull request #2031:
URL: https://github.com/apache/iceberg/pull/2031


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #2031: Fix properties for custom catalogs in Flink

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #2031:
URL: https://github.com/apache/iceberg/pull/2031#issuecomment-760469330


   I'm going to go ahead and merge this. I don't see a way around changing the list to `*`. If we want more validation, we can do it in the adapter.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #2031: Fix properties for custom catalogs in Flink

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #2031:
URL: https://github.com/apache/iceberg/pull/2031#issuecomment-758826705


   The problem with using a prefix like `custom-catalog` is that it makes the configuration differ between engines. For Flink, you would need to use `custom-catalog.property`, and for others you'd use just `property`. That will confuse users that use the wrong property for an engine and expect it to work.
   
   I think that we should allow all properties if we don't think that we can enumerate the set of catalog properties that we need.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rymurr commented on pull request #2031: Fix properties for custom catalogs in Flink

Posted by GitBox <gi...@apache.org>.
rymurr commented on pull request #2031:
URL: https://github.com/apache/iceberg/pull/2031#issuecomment-757983330


   > Apache flink will load those catalog factory classes whose **required properties** are matched, and check whether there's only one matched factory ( If more than one or no factory, it will throw exception, means we're failing to load catalog).
   
   Thanks @openinx just a bit of clarification. What do you mean by **required properties**. Is there a way to set properties in `WITH` clauses to have required/not-required fields? If I can pass non-required fields through to the catalog w/o this PR I would be very happy :-)
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] openinx commented on pull request #2031: Fix properties for custom catalogs in Flink

Posted by GitBox <gi...@apache.org>.
openinx commented on pull request #2031:
URL: https://github.com/apache/iceberg/pull/2031#issuecomment-757912157


   Apache flink will load those catalog factory classes whose __required properties__ are matched,  and check whether there's only one matched factory ( If more than one or no factory,  it will throw exception, means we're failing to load catalog). 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jackye1995 commented on pull request #2031: Fix properties for custom catalogs in Flink

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on pull request #2031:
URL: https://github.com/apache/iceberg/pull/2031#issuecomment-757052208


   Sorry I missed the notification for this one. I tested with Flink on EMR which is an older version, let me try with the latest version and reply here.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #2031: Fix properties for custom catalogs in Flink

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #2031:
URL: https://github.com/apache/iceberg/pull/2031#issuecomment-754927740


   I would rather not use the prefix. Do we need to pass all of the properties to Flink or can we tell it to accept anything?
   
   @jackye1995, you may want to make sure that Flink can use the Glue catalog and AWS config properties.
   
   FYI @openinx.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #2031: Fix properties for custom catalogs in Flink

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #2031:
URL: https://github.com/apache/iceberg/pull/2031#issuecomment-756496637


   In that case, I'd go with adding all of the properties in `CatalogProperties`. We can always add a prefix later if we want.
   
   To do that, we probably need `CatalogProperties` to have an array or list of all properties, like `CatalogProperties.all()` so we don't need to remember to go update Flink as we add more.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rymurr commented on pull request #2031: Fix properties for custom catalogs in Flink

Posted by GitBox <gi...@apache.org>.
rymurr commented on pull request #2031:
URL: https://github.com/apache/iceberg/pull/2031#issuecomment-760099383


   I am fine with this approach however I wonder what the flink folks think? I am not sure if that breaks compatibility anywhere?
   
   I've updated the PR with @jackye1995's proposal and the following works with the Nessie catalog. 
   
   ```
   CREATE CATALOG nessie_catalog WITH (
       'type'='iceberg',
       'catalog-impl'='org.apache.iceberg.nessie.NessieCatalog',
       'url'='http://localhost:19120/api/v1',
       'warehouse'='/home/ryan/warehouse');
   CREATE TABLE nessie_catalog.testns.foo (
       id BIGINT COMMENT 'unique id',
       data STRING
    );
   INSERT INTO nessie_catalog.testns.foo VALUES (1, 'a');
   SELECT * FROM nessie_catalog.testns.foo;
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rymurr commented on pull request #2031: Fix properties for custom catalogs in Flink

Posted by GitBox <gi...@apache.org>.
rymurr commented on pull request #2031:
URL: https://github.com/apache/iceberg/pull/2031#issuecomment-756073479


   I am not a Flink expert but my reading of the [javadoc](https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/factories/TableFactory.java#L54) is that we could accept everything but it seems like its not recommended. I really like that Flink will throw an error if its configured incorrectly so I would prefer to keep some control over options.
   
   However, I agree that the prefix isn't awesome.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org