Posted to issues@iceberg.apache.org by "raviranak (via GitHub)" <gi...@apache.org> on 2023/04/14 08:32:03 UTC

[GitHub] [iceberg] raviranak opened a new issue, #7344: Using Iceberg from EKS to access resource in another aws account loads instance role by default

raviranak opened a new issue, #7344:
URL: https://github.com/apache/iceberg/issues/7344

   ### Query engine
   
   Using Iceberg from EKS to access resource in another aws account loads instance role by default
   
   ### Question
   
   Using Iceberg from EKS to access a resource in another AWS account loads the instance role by default.
   Currently Iceberg only has AssumeRoleAwsClientFactory, which we cannot leverage in our current setup: with multiple node groups in EKS, we cannot use it to provide assume-role access to another AWS account. What we want is for Iceberg to load the service role that is resolved via WebIdentityTokenCredentialsProvider. Could you please help with a solution here?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] raviranak commented on issue #7344: Using Iceberg from EKS to access resource in another aws account loads instance role by default

Posted by "raviranak (via GitHub)" <gi...@apache.org>.
raviranak commented on issue #7344:
URL: https://github.com/apache/iceberg/issues/7344#issuecomment-1512392013

   Can you help with how to change the provider so that the Iceberg client uses WebIdentityTokenFileCredentialsProvider?




[GitHub] [iceberg] MarquisC commented on issue #7344: Using Iceberg from EKS to access resource in another aws account loads instance role by default

Posted by "MarquisC (via GitHub)" <gi...@apache.org>.
MarquisC commented on issue #7344:
URL: https://github.com/apache/iceberg/issues/7344#issuecomment-1646560968

   
   Hey @raviranak @stevenzwu, we're seeing something similar on EKS as well via the Iceberg Flink path (wanted to get your thoughts):
   
   - We're using the aws sdk bundle jar [tested against 2.17.257 and 2.20.99]
   
   [The default credential provider](https://github.com/aws/aws-sdk-java-v2/blob/2.17.257/core/auth/src/main/java/software/amazon/awssdk/auth/credentials/DefaultCredentialsProvider.java) _should_, by order of precedence, have attempted the WebIdentity path before falling back to the EC2 instance role, right?
   
   If I kubectl exec in and install the aws-cli within the container, the result of `aws sts get-caller-identity` correctly follows the hierarchy and selects the Kubernetes service account -> IAM role (WebIdentity path).
   
   I couldn't find an easy way in the time that I looked to directly influence/configure the Glue client from the Iceberg lib.
   
   What I ended up doing was just letting the EC2 instance role assume the role it needs via:
   
   ```
   Create Catalog ...
   'catalog-impl'='org.apache.iceberg.aws.glue.GlueCatalog', 'io-impl'='org.apache.iceberg.aws.s3.S3FileIO',
   'client.factory'='org.apache.iceberg.aws.AssumeRoleAwsClientFactory', 'client.assume-role.region'='us-east-1',
   'client.assume-role.arn'='arn:aws:iam::${aws account number}:role/${the role that should have worked from web identity perms}')
   ```
   
   The particular use case where it wasn't working was enabling Flink session clusters on Kubernetes and getting the Flink SQL Gateway to talk to Glue correctly.
   
   The weird part is that our fat-jar Flink jobs (same deps) successfully leverage the WebIdentity path (we allow our jobs to dynamically create the tables and databases in Glue if they don't exist).
   
   Hopefully this is helpful for you, @raviranak. When I get more time I'll keep fiddling and see what I can find (for example, I might be able to step through the SQL Gateway implementation and see what happened).
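
The `aws sts get-caller-identity` check described above can be complemented by confirming that the web-identity inputs are visible inside the container. The following is a minimal sketch, not part of the original comment. It assumes the pod uses IAM Roles for Service Accounts, where EKS injects the `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` environment variables that the SDK's web-identity provider reads. The function name `web_identity_status` is hypothetical.

```python
import os
from pathlib import Path

# Environment variables that EKS injects into pods using IAM Roles for
# Service Accounts; the AWS SDK web-identity providers read them.
WEB_IDENTITY_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def web_identity_status(env=None):
    """Report whether the web-identity inputs are visible to this process."""
    env = os.environ if env is None else env
    # Variables that are unset or empty.
    missing = [name for name in WEB_IDENTITY_VARS if not env.get(name)]
    token_file = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    # The token must exist as a regular file and be readable by this user.
    token_readable = (
        bool(token_file)
        and Path(token_file).is_file()
        and os.access(token_file, os.R_OK)
    )
    return {
        "missing_vars": missing,
        "token_file_readable": token_readable,
        "usable": not missing and token_readable,
    }


if __name__ == "__main__":
    print(web_identity_status())
```

This only inspects the environment of the process that runs it. If it reports `usable: True` while the Glue client still resolves the instance role, the environment of the JVM process itself may be worth comparing.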
   
   
   




[GitHub] [iceberg] stevenzwu commented on issue #7344: Using Iceberg from EKS to access resource in another aws account loads instance role by default

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on issue #7344:
URL: https://github.com/apache/iceberg/issues/7344#issuecomment-1508888117

   @raviranak can you check if it is the same issue as https://github.com/apache/iceberg/issues/6715. Latest 1.2.0 Iceberg release should have included the fix.




[GitHub] [iceberg] raviranak commented on issue #7344: Using Iceberg from EKS to access resource in another aws account loads instance role by default

Posted by "raviranak (via GitHub)" <gi...@apache.org>.
raviranak commented on issue #7344:
URL: https://github.com/apache/iceberg/issues/7344#issuecomment-1511261114

   Hi @stevenzwu 
   
   Here is the Spark session I am using:
   
   ```
   from pyspark.sql import SparkSession
   
   spark = SparkSession.builder \
       .appName("MyApp") \
       .config("spark.sql.hive.metastore.glueCatalog.enabled", "true") \
       .config("spark.sql.catalog.iceberg_catalog.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog") \
       .config("spark.sql.catalog.iceberg_catalog.warehouse", "s3://internal/iceberg/warehouse/") \
       .config("spark.sql.catalog.iceberg_catalog.io-impl", "org.apache.iceberg.aws.s3.S3FileIO") \
       .config("spark.sql.catalog.iceberg_catalog", "org.apache.iceberg.spark.SparkCatalog") \
       .config("spark.sql.catalogImplementation", "hive") \
       .config("spark.hadoop.fs.s3a.aws.credentials.provider", "com.amazonaws.auth.WebIdentityTokenCredentialsProvider") \
       .config("spark.sql.extensions", "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions") \
       .config("spark.jars.packages", "org.apache.hadoop:hadoop-aws:3.3.1,org.apache.spark:spark-avro_2.12:3.2.0,"
                                       "org.apache.hadoop:hadoop-aws:3.3.1,"
                                       "org.apache.iceberg:iceberg-spark-runtime-3.2_2.12:1.2.0") \
       .config("spark.jars", "/home/ray/.ivy2/jars/org.apache.spark_spark-avro_2.12-3.2.0.jar,"
                             "/home/ray/.ivy2/jars/com.amazonaws_aws-java-sdk-bundle-1.11.901.jar,"
                             "/home/ray/.ivy2/jars/org.apache.hadoop_hadoop-aws-3.3.1.jar,"
                             "/home/ray/.ivy2/jars/org.apache.iceberg_iceberg-spark-runtime-3.2_2.12-1.2.0.jar,"
                             "https://internal.s3.amazonaws.com/iceberg/bundle-2.17.131.jar,"
                             "https://internal.s3.amazonaws.com/iceberg/url-connection-client-2.17.131.jar") \
       .config("spark.hadoop.fs.s3a.canned.acl", "BucketOwnerFullControl") \
       .config("spark.hadoop.hive.metastore.glue.catalogid", "123456789") \
       .getOrCreate()
   ```
   
   I am still facing this issue:
   `
   software.amazon.awssdk.services.glue.model.AccessDeniedException: User: arn:aws:sts:::assumed-role/clusteri-07d3180159a814e31 is not authorized to perform: glue:GetTable on resource: 
   `
   
   Can you please help here? It seems the role doesn't resolve to the service role.
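
For comparison, here is a minimal sketch of how the assume-role workaround described elsewhere in this thread for Flink might be expressed as Spark catalog properties. It is not a confirmed fix for the web-identity problem: the original post notes that AssumeRoleAwsClientFactory does not fit that setup. The three `client.*` keys are the ones quoted in the thread. The helper `iceberg_catalog_conf` and the role ARN are hypothetical placeholders. As far as I can tell, the `spark.hadoop.fs.s3a.*` settings in the session above configure Hadoop's S3A filesystem rather than the AWS SDK v2 clients built by GlueCatalog and S3FileIO, which is why the properties here use the catalog prefix.

```python
def iceberg_catalog_conf(catalog_name, properties):
    """Prefix Iceberg catalog properties with spark.sql.catalog.<catalog_name>."""
    prefix = f"spark.sql.catalog.{catalog_name}."
    return {prefix + key: value for key, value in properties.items()}


# Placeholder ARN (assumption): substitute the role in the target account.
ROLE_ARN = "arn:aws:iam::111111111111:role/example-cross-account-role"

assume_role_conf = iceberg_catalog_conf(
    "iceberg_catalog",
    {
        "client.factory": "org.apache.iceberg.aws.AssumeRoleAwsClientFactory",
        "client.assume-role.arn": ROLE_ARN,
        "client.assume-role.region": "us-east-1",
    },
)

# Print the entries in the same .config(...) form used by the session builder.
for key, value in sorted(assume_role_conf.items()):
    print(f'.config("{key}", "{value}")')
```

With this factory the base credentials still come from the default chain, so whichever role that chain resolves (the instance role in this report) would need permission to assume the target role.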




[GitHub] [iceberg] raviranak commented on issue #7344: Using Iceberg from EKS to access resource in another aws account loads instance role by default

Posted by "raviranak (via GitHub)" <gi...@apache.org>.
raviranak commented on issue #7344:
URL: https://github.com/apache/iceberg/issues/7344#issuecomment-1511264185

   Using Spark 3.3.0 and iceberg-spark-runtime-3.2_2.12-1.2.0 (Iceberg 1.2.0).

