Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/05/26 08:38:07 UTC

[GitHub] [druid] EwanValentine opened a new issue #11303: Issues connecting to S3 on EKS

EwanValentine opened a new issue #11303:
URL: https://github.com/apache/druid/issues/11303


   I'm attempting to use S3 deep storage on EKS, but I just get a 403 error. I'm not in a position to use a client secret pair from our AWS account directly, but the nodes within our K8s cluster have service accounts. Attached to my Druid cluster's namespace is a role which has all permissions for a specific bucket. However, when I attempt to load the sample dataset into Druid, I get an AWS 403 error in the logs.
   
   There's a web identity token file set in the environment variables, which the AWS SDKs normally pick up automatically. I'm also explicitly passing in the region.
   
   ### Affected Version
   
   `0.20, 0.21, 0.21.1-rc`
   
   ### Description
   
   Please include as much detailed information about the problem as possible.
   - Cluster size
   Two to three m5.large's
   
   - Configurations in use
   ```
   apiVersion: druid.apache.org/v1alpha1
   kind: Druid
   metadata:
     name: ewanstenant
   spec:
     commonConfigMountPath: /opt/druid/conf/druid/cluster/_common
     serviceAccount: "druid-scaling-spike"
     nodeSelector:
       service: ewanstenant-druid
     tolerations:
       - key: 'dedicated'
         operator: 'Equal'
         value: 'ewanstenant-druid'
         effect: 'NoSchedule'
     securityContext:
       fsGroup: 0
       runAsUser: 0
       runAsGroup: 0
     image: "apache/druid:0.21.1-rc1"
     startScript: /druid.sh
     jvm.options: |-
       -server
       -XX:+UseG1GC
       -Xloggc:gc-%t-%p.log
       -XX:+UseGCLogFileRotation
       -XX:GCLogFileSize=100M
       -XX:NumberOfGCLogFiles=10
       -XX:+HeapDumpOnOutOfMemoryError
       -XX:HeapDumpPath=/druid/data/logs
       -verbose:gc
       -XX:+PrintGCDetails
       -XX:+PrintGCTimeStamps
       -XX:+PrintGCDateStamps
       -XX:+PrintGCApplicationStoppedTime
       -XX:+PrintGCApplicationConcurrentTime
       -XX:+PrintAdaptiveSizePolicy
       -XX:+PrintReferenceGC
       -XX:+PrintFlagsFinal
       -Duser.timezone=UTC
       -Dfile.encoding=UTF-8
       -Djava.io.tmpdir=/druid/data
       -Daws.region=eu-west-1
       -Dorg.jboss.logging.provider=slf4j
       -Dlog4j.shutdownCallbackRegistry=org.apache.druid.common.config.Log4jShutdown
       -Dlog4j.shutdownHookEnabled=true
       -Dcom.sun.management.jmxremote.authenticate=false
       -Dcom.sun.management.jmxremote.ssl=false
     common.runtime.properties: |
       ###############################################
       # service names for coordinator and overlord
       ###############################################
       druid.selectors.indexing.serviceName=druid/overlord
       druid.selectors.coordinator.serviceName=druid/coordinator
       ##################################################
       # Request logging, monitoring, and segment
       ##################################################
       druid.request.logging.type=slf4j
       druid.request.logging.feed=requests
       ##################################################
       # Monitoring ( enable when using prometheus )
       #################################################
       
       ################################################
       # Extensions
       ################################################
       druid.extensions.directory=/opt/druid/extensions
       druid.extensions.loadList=["druid-s3-extensions","postgresql-metadata-storage"]
       ####################################################
       # Enable sql
       ####################################################
       druid.sql.enable=true
   
       druid.storage.type=s3
       druid.storage.bucket=druid-scaling-spike-deepstore
       druid.storage.baseKey=druid/segments
       druid.indexer.logs.directory=data/logs/
       druid.storage.sse.type=s3
       druid.storage.disableAcl=false
   
   
       # druid.storage.type=local
       # druid.storage.storageDirectory=/druid/deepstorage
   
       druid.metadata.storage.type=derby
       druid.metadata.storage.connector.connectURI=jdbc:derby://localhost:1527/druid/data/derbydb/metadata.db;create=true
       druid.metadata.storage.connector.host=localhost
       druid.metadata.storage.connector.port=1527
       druid.metadata.storage.connector.createTables=true
   
       druid.zk.service.host=tiny-cluster-zk-0.tiny-cluster-zk
       druid.zk.paths.base=/druid
       druid.zk.service.compress=false
   
       druid.indexer.logs.type=file
       druid.indexer.logs.directory=/druid/data/indexing-logs
       druid.lookup.enableLookupSyncOnStartup=false
     volumeClaimTemplates:
       - 
         metadata:
           name: deepstorage-volume
         spec:
           accessModes:
             - ReadWriteOnce
           resources:
             requests:
               storage: 50Gi
           storageClassName: gp2
     volumeMounts:
       - mountPath: /druid/data
         name: data-volume
       - mountPath: /druid/deepstorage
         name: deepstorage-volume
     volumes:
       - name: data-volume
         emptyDir: {}
       - name: deepstorage-volume
         hostPath:
           path: /tmp/druid/deepstorage
           type: DirectoryOrCreate
   
     nodes:
       brokers: 
         kind: Deployment
         druid.port: 8080
         nodeType: broker
         nodeConfigMountPath: "/opt/druid/conf/druid/cluster/query/broker"
         env:
           - name: DRUID_XMS
             value: 12000m
           - name: DRUID_XMX
             value: 12000m
           - name: DRUID_MAXDIRECTMEMORYSIZE
             value: 8g
           - name: AWS_REGION
             value: eu-west-1
         replicas: 1
         resources:
           limits:
             cpu: 1
             memory: 8Gi
           requests:
             cpu: 1
             memory: 8Gi
         readinessProbe:
           initialDelaySeconds: 60
           periodSeconds: 10
           failureThreshold: 30
           httpGet:
             path: /druid/broker/v1/readiness
             port: 8080
         runtime.properties: |
            druid.service=druid/broker
            druid.log4j2.sourceCategory=druid/broker
            druid.broker.http.numConnections=5
            # Processing threads and buffers
            druid.processing.buffer.sizeBytes=268435456
            druid.processing.numMergeBuffers=1
            druid.processing.numThreads=4
   
       coordinators:
         druid.port: 8080
         kind: Deployment
         maxSurge: 2
         maxUnavailable: 0
         nodeType: coordinator
         nodeConfigMountPath: "/opt/druid/conf/druid/cluster/master/coordinator-overlord"
         podDisruptionBudgetSpec:
           maxUnavailable: 1
         replicas: 1
         resources:
           limits:
             cpu: 1000m
             memory: 1Gi
           requests:
             cpu: 500m
             memory: 1Gi
         livenessProbe:
           initialDelaySeconds: 60
           periodSeconds: 5
           failureThreshold: 3
           httpGet:
             path: /status/health
             port: 8080
         readinessProbe:
           initialDelaySeconds: 60
           periodSeconds: 5
           failureThreshold: 3
           httpGet:
             path: /status/health
             port: 8080
         env:
           - name: DRUID_XMS
             value: 1g 
           - name: DRUID_XMX
             value: 1g
           - name: AWS_REGION
             value: eu-west-1
         runtime.properties: |
             druid.service=druid/coordinator
             druid.log4j2.sourceCategory=druid/coordinator
             druid.indexer.runner.type=httpRemote
             druid.indexer.queue.startDelay=PT5S
             druid.coordinator.balancer.strategy=cachingCost
             druid.serverview.type=http
             druid.indexer.storage.type=metadata
             druid.coordinator.startDelay=PT10S
             druid.coordinator.period=PT5S
             druid.server.http.numThreads=5000
             druid.coordinator.asOverlord.enabled=true
             druid.coordinator.asOverlord.overlordService=druid/overlord
   
       historical:
         druid.port: 8080
         nodeType: historical
         nodeConfigMountPath: "/opt/druid/conf/druid/cluster/data/historical"
         replicas: 1
         livenessProbe:
           initialDelaySeconds: 1800
           periodSeconds: 5
           failureThreshold: 3
           httpGet:
             path: /status/health
             port: 8080
         readinessProbe:
           httpGet:
             path: /druid/historical/v1/readiness
             port: 8080
           periodSeconds: 10
           failureThreshold: 18
         resources:
           limits:
             cpu: 1000m
             memory: 12Gi
           requests:
             cpu: 1000m
             memory: 12Gi
         env:
           - name: DRUID_XMS
             value: 1500m
           - name: DRUID_XMX
             value: 1500m 
           - name: DRUID_MAXDIRECTMEMORYSIZE
             value: 12g
           - name: AWS_REGION
             value: eu-west-1
         runtime.properties: |
           druid.service=druid/historical
           druid.log4j2.sourceCategory=druid/historical
           # HTTP server threads
           druid.server.http.numThreads=10
           # Processing threads and buffers
           druid.processing.buffer.sizeBytes=536870912
           druid.processing.numMergeBuffers=1
           druid.processing.numThreads=2
           # Segment storage 
           druid.segmentCache.locations=[{\"path\":\"/opt/druid/data/historical/segments\",\"maxSize\": 10737418240}]
           druid.server.maxSize=10737418240
           # Query cache
           druid.historical.cache.useCache=true
           druid.historical.cache.populateCache=true
           druid.cache.type=caffeine
           druid.cache.sizeInBytes=256000000
         volumeClaimTemplates:
           -
             metadata:
               name: historical-volume
             spec:
               accessModes:
                 - ReadWriteOnce
               resources:
                 requests:
                   storage: 50Gi
               storageClassName: gp2
         volumeMounts:
           -
             mountPath: /opt/druid/data/historical
             name: historical-volume
   
       middlemanagers:
         druid.port: 8080
         nodeType: middleManager
         nodeConfigMountPath: /opt/druid/conf/druid/cluster/data/middleManager
         env:
           - name: DRUID_XMX
             value: 4096m
           - name: DRUID_XMS
             value: 4096m
           - name: AWS_REGION
             value: eu-west-1
           - name: AWS_DEFAULT_REGION
             value: eu-west-1
         replicas: 1
         resources:
           limits:
             cpu: 1000m
             memory: 6Gi
           requests:
             cpu: 1000m
             memory: 6Gi
         livenessProbe:
           initialDelaySeconds: 60
           periodSeconds: 5
           failureThreshold: 3
           httpGet:
             path: /status/health
             port: 8080
         readinessProbe:
           initialDelaySeconds: 60
           periodSeconds: 5
           failureThreshold: 3
           httpGet:
             path: /status/health
             port: 8080
         runtime.properties: |
           druid.service=druid/middleManager
           druid.worker.capacity=3
           druid.indexer.task.baseTaskDir=/opt/druid/data/middlemanager/task
           druid.indexer.runner.javaOpts=-server -XX:MaxDirectMemorySize=10240g -Duser.timezone=UTC -Daws.region=eu-west-1 -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/opt/druid/data/tmp -Dlog4j.debug -XX:+UnlockDiagnosticVMOptions -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=50 -XX:GCLogFileSize=10m -XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError -XX:+UseG1GC -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -XX:HeapDumpPath=/opt/druid/data/logs/peon.%t.%p.hprof -Xms10G -Xmx10G
   
           # HTTP server threads
           druid.server.http.numThreads=25
           # Processing threads and buffers on Peons
           druid.indexer.fork.property.druid.processing.numMergeBuffers=2
           druid.indexer.fork.property.druid.processing.buffer.sizeBytes=32000000
           druid.indexer.fork.property.druid.processing.numThreads=2
         volumeClaimTemplates:
           -
             metadata:
               name: middlemanagers-volume
             spec:
               accessModes:
                 - ReadWriteOnce
               resources:
                 requests:
                   storage: 50Gi
               storageClassName: gp2
         volumeMounts:
           -
             mountPath: /opt/druid/data/historical
             name: middlemanagers-volume
   
       routers:
         kind: Deployment
         nodeConfigMountPath: "/opt/druid/conf/druid/cluster/query/router"
         livenessProbe:
           initialDelaySeconds: 60
           periodSeconds: 5
           failureThreshold: 3
           httpGet:
             path: /status/health
             port: 8080
         readinessProbe:
           initialDelaySeconds: 60
           periodSeconds: 5
           failureThreshold: 3
           httpGet:
             path: /status/health
             port: 8080
         druid.port: 8080
         env:
           - name: AWS_REGION
             value: eu-west-1
           - name: AWS_DEFAULT_REGION
             value: eu-west-1
           - name: DRUID_XMX
             value: 1024m
           - name: DRUID_XMS
             value: 1024m
         resources:
           limits:
             cpu: 500m
             memory: 2Gi
           requests:
             cpu: 500m
             memory: 2Gi
         nodeType: router
         podDisruptionBudgetSpec:
           maxUnavailable: 1
         replicas: 1
         runtime.properties: |
             druid.service=druid/router
             druid.log4j2.sourceCategory=druid/router
             # HTTP proxy
             druid.router.http.numConnections=5000
             druid.router.http.readTimeout=PT5M
             druid.router.http.numMaxThreads=1000
             druid.server.http.numThreads=1000
             # Service discovery
             druid.router.defaultBrokerServiceName=druid/broker
             druid.router.coordinatorServiceName=druid/coordinator
             druid.router.managementProxy.enabled=true
         services:
           -
             metadata:
               name: router-%s-service
             spec:
               ports:
                 -
                   name: router-port
                   port: 8080
               type: NodePort
   
   ```
   
   - Steps to reproduce the problem
   - Deploy the above using the latest operator version, to an EKS cluster
   - Expose the router port using kubectl port-forward:
   ```
   $ kubectl port-forward service/router-druid-ewanstenant-routers-service 12345:8080 -n <yourtenant>
   ```
   - Load the sample dataset, using the default settings
   
   - The error message or stack traces encountered. Providing more context, such as nearby log messages or even entire logs, can be helpful.
   ```
   {"ingestionStatsAndErrors":{"taskId":"index_parallel_wikipedia_pedgollm_2021-05-25T23:51:09.811Z","payload":{"ingestionState":"BUILD_SEGMENTS","unparseableEvents":{},"rowStats":{"determinePartitions":{"processed":24433,"processedWithError":0,"thrownAway":0,"unparseable":0},"buildSegments":{"processed":24433,"processedWithError":0,"thrownAway":0,"unparseable":0}},"errorMsg":"java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.IOException: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: DJQGKG8Z57V4R2MP; S3 Extended Request ID: IXmXtwpGLsf1mWTrU7sJLx/cM2Cg72GarKfbsAtpt763Wi62fft6odbo/jmQ2nZOJbS6hro0/QY=), S3 Extended Request ID: IXmXtwpGLsf1mWTrU7sJLx/cM2Cg72GarKfbsAtpt763Wi62fft6odbo/jmQ2nZOJbS6hro0/QY=\n\tat org.apache.druid.indexing.common.task.IndexTask.generateAndPublishSegments(IndexTask.java:938)\n\tat org.apache.druid.indexing.common.task.IndexTask.runTask(IndexTask.java:494)\n\tat org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:152)\n\tat org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexSupervisorTask.runSequential(ParallelIndexSupervisorTask.java:964)\n\tat org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexSupervisorTask.runTask(ParallelIndexSupervisorTask.java:445)\n\tat org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:152)\n\tat org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:451)\n\tat org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:423)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)
   ```
   
   - Any debugging that you have already done
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] danielzurawski commented on issue #11303: Issues connecting to S3 on EKS

Posted by GitBox <gi...@apache.org>.
danielzurawski commented on issue #11303:
URL: https://github.com/apache/druid/issues/11303#issuecomment-921731964


   Hi @josephglanville , @AdheipSingh ,
   
   Thanks for all your efforts to get the AWS extensions working with the WebIdentityTokenProvider and K8s ServiceAccounts.
   We are running the latest 0.22.0 release of Druid and are experiencing the same issue: the Kinesis extension does not correctly use the WebIdentityTokenProvider/AWS_WEB_IDENTITY_TOKEN_FILE and the token file that is present on the Pod.
   
   I am not quite sure how to debug this. My initial suspicion was that the indexing service creates a fork that doesn't have AWS_WEB_IDENTITY_TOKEN_FILE set (we're running the Kinesis indexing task on the coordinator/overlord, not using a remote MiddleManager).
   
   Alternatively, could there be a problem with the default `WebIdentityTokenProvider.create()`, such that it doesn't pick up these environment variables and you have to use the builder manually to provide the path to the token file, etc.?
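
One way to test the fork suspicion above is to check, from inside the JVM in question, whether the two variables shown elsewhere in this thread (`AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE`) are visible and whether the token file can be read. The sketch below is only a hypothetical diagnostic: the class name `IrsaEnvCheck` and its methods are made up for illustration and are not part of Druid or the AWS SDK. It uses nothing beyond the Java standard library.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical diagnostic, not part of Druid or the AWS SDK.
final class IrsaEnvCheck {
    // Variables injected into the pod, as shown in this thread.
    static final List<String> REQUIRED =
        Arrays.asList("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE");

    // Returns the required variables that are unset or blank in the given environment.
    static List<String> missingVariables(Map<String, String> env) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                missing.add(name);
            }
        }
        return missing;
    }

    // True only if the token path is set and points at a readable regular file.
    static boolean tokenFileReadable(Map<String, String> env) {
        String path = env.get("AWS_WEB_IDENTITY_TOKEN_FILE");
        if (path == null || path.trim().isEmpty()) {
            return false;
        }
        Path p = Paths.get(path);
        return Files.isRegularFile(p) && Files.isReadable(p);
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        System.out.println("missing variables: " + missingVariables(env));
        System.out.println("token file readable: " + tokenFileReadable(env));
    }
}
```

Running this both in the parent service's environment and in the environment a forked task would inherit should show whether the variables are lost along the way. It says nothing about which credentials provider the extension actually selects.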




[GitHub] [druid] josephglanville commented on issue #11303: Issues connecting to S3 on EKS

Posted by GitBox <gi...@apache.org>.
josephglanville commented on issue #11303:
URL: https://github.com/apache/druid/issues/11303#issuecomment-915666388


   @AdheipSingh
   
   kube versions:
   ```
   Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"clean", BuildDate:"2021-04-08T16:31:21Z", GoVersion:"go1.16.1", Compiler:"gc", Platform:"darwin/amd64"}
   Server Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.2-eks-0389ca3", GitCommit:"8a4e27b9d88142bbdd21b997b532eb6d493df6d2", GitTreeState:"clean", BuildDate:"2021-07-31T01:34:46Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}
   ```
   
   Service account annotations:
   ```
   Name:                druid
   Namespace:           data
   Labels:              tanka.dev/environment=6bac8592800828ba90b13a675b892ed3add99e3cf09bd1f9
   Annotations:         eks.amazonaws.com/role-arn: arn:aws:iam::<account_id_scrubbed>:role/prod-us-west-2-druid
   Image pull secrets:  <none>
   Mountable secrets:   druid-token-2k8d7
   Tokens:              druid-token-2k8d7
   Events:              <none>
   ```
   
   The service account properly injects the token into the pod:
   ```
         AWS_DEFAULT_REGION:           us-west-2
         AWS_REGION:                   us-west-2
         AWS_ROLE_ARN:                 arn:aws:iam::<account_id_scrubbed>:role/prod-us-west-2-druid
         AWS_WEB_IDENTITY_TOKEN_FILE:  /var/run/secrets/eks.amazonaws.com/serviceaccount/token
   ```






[GitHub] [druid] niravpeak edited a comment on issue #11303: Issues connecting to S3 on EKS

Posted by GitBox <gi...@apache.org>.
niravpeak edited a comment on issue #11303:
URL: https://github.com/apache/druid/issues/11303#issuecomment-1019019260


   Hello team, we still see the same issue with 0.22.1.
   We are using the latest Druid version, 0.22.1, with the Helm chart. As per #10541, the fix is already in place, and the aws-sdk version is compatible with the [supported aws-sdks](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-minimum-sdk.html). However, our tests show that Druid is not using pod-level roles; it always checks node/instance-level permissions.






[GitHub] [druid] josephglanville commented on issue #11303: Issues connecting to S3 on EKS

Posted by GitBox <gi...@apache.org>.
josephglanville commented on issue #11303:
URL: https://github.com/apache/druid/issues/11303#issuecomment-921778682


   I'm pretty sure it's a classpath problem, as it's caused by a ClassNotFound error, but I haven't had time to dig deeper yet.






[GitHub] [druid] didip commented on issue #11303: Issues connecting to S3 on EKS

Posted by GitBox <gi...@apache.org>.
didip commented on issue #11303:
URL: https://github.com/apache/druid/issues/11303#issuecomment-1059782778


   Also, `aws-java-sdk-sts.jar` is missing from `lib/`. That's not normal, right?
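
The missing jar and the ClassNotFound error mentioned earlier in the thread can be checked together by asking the JVM whether an STS class is loadable. The sketch below is a hypothetical diagnostic (the class name `StsClasspathCheck` is made up for illustration). It assumes that `com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient` is a class shipped in `aws-java-sdk-sts`, and that the web identity credentials path in the AWS SDK for Java 1.x needs that module on the classpath; both points are worth verifying against the SDK version Druid bundles.

```java
// Hypothetical diagnostic, not part of Druid or the AWS SDK.
final class StsClasspathCheck {
    // Assumed to be provided by aws-java-sdk-sts (AWS SDK for Java 1.x).
    static final String STS_CLIENT_CLASS =
        "com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient";

    // Returns true if the named class can be located by the given class loader.
    // The class is not initialized, so no AWS code runs as a side effect.
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = StsClasspathCheck.class.getClassLoader();
        System.out.println(
            STS_CLIENT_CLASS + " loadable: " + isLoadable(STS_CLIENT_CLASS, loader));
    }
}
```

This only inspects the class loader it is run with. Druid loads extensions through their own class loaders, so a result from the main classpath may differ from what an extension such as `druid-s3-extensions` sees.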




[GitHub] [druid] cintoSunny commented on issue #11303: Issues connecting to S3 on EKS

Posted by GitBox <gi...@apache.org>.
cintoSunny commented on issue #11303:
URL: https://github.com/apache/druid/issues/11303#issuecomment-1068466344


   Is this resolved? I am getting the same issue. 






[GitHub] [druid] AdheipSingh edited a comment on issue #11303: Issues connecting to S3 on EKS

Posted by GitBox <gi...@apache.org>.
AdheipSingh edited a comment on issue #11303:
URL: https://github.com/apache/druid/issues/11303#issuecomment-915231146


   @EwanValentine do you mind showing your service account on k8s?
   `kubectl get sa <name-of-sa> -n <namespace> -o yaml`
   
   @josephglanville 
   
   > @himadrisingh did you test this feature on EKS and if so did it require any specific Druid configuration to work with IRSA?
   
   No, it does not need any Druid configuration. An annotation needs to be added to the Kubernetes service account object, that's it.
   We haven't tested this recently, but it was tested before the SDK version was updated.




[GitHub] [druid] FarhadF commented on issue #11303: Issues connecting to S3 on EKS

Posted by GitBox <gi...@apache.org>.
FarhadF commented on issue #11303:
URL: https://github.com/apache/druid/issues/11303#issuecomment-990947040


   We have the same issue.




[GitHub] [druid] didip edited a comment on issue #11303: Issues connecting to S3 on EKS

Posted by GitBox <gi...@apache.org>.
didip edited a comment on issue #11303:
URL: https://github.com/apache/druid/issues/11303#issuecomment-1055037028


   It seems like `AWSCredentialsUtils.defaultAWSCredentialsProviderChain` is not even used in `S3InputSource.java` and `S3StorageDruidModule.java`
   
   Also, it looks like talking to Kinesis would work because `AWSCredentialsUtils.defaultAWSCredentialsProviderChain` is used in `KinesisRecordSupplier.java`
   
   The seemingly useful `AWSModule.java: AWSModule().getAWSCredentialsProvider(config)` is only used in one test file: `TestAWSCredentialsProvider.java`




[GitHub] [druid] Chase-H commented on issue #11303: Issues connecting to S3 on EKS

Posted by GitBox <gi...@apache.org>.
Chase-H commented on issue #11303:
URL: https://github.com/apache/druid/issues/11303#issuecomment-1009199294


   Did anyone ever find a solution to this? I am dealing with the exact same problem.




[GitHub] [druid] josephglanville edited a comment on issue #11303: Issues connecting to S3 on EKS

Posted by GitBox <gi...@apache.org>.
josephglanville edited a comment on issue #11303:
URL: https://github.com/apache/druid/issues/11303#issuecomment-915666388


   @AdheipSingh
   
   kube versions:
   ```
   Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"clean", BuildDate:"2021-04-08T16:31:21Z", GoVersion:"go1.16.1", Compiler:"gc", Platform:"darwin/amd64"}
   Server Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.2-eks-0389ca3", GitCommit:"8a4e27b9d88142bbdd21b997b532eb6d493df6d2", GitTreeState:"clean", BuildDate:"2021-07-31T01:34:46Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}
   ```
   
   Service account annotations:
   ```
   Name:                druid
   Namespace:           data
   Labels:              tanka.dev/environment=6bac8592800828ba90b13a675b892ed3add99e3cf09bd1f9
   Annotations:         eks.amazonaws.com/role-arn: arn:aws:iam::<account_id_scrubbed>:role/prod-us-west-2-druid
   Image pull secrets:  <none>
   Mountable secrets:   druid-token-2k8d7
   Tokens:              druid-token-2k8d7
   Events:              <none>
   ```
   
   SA properly injecting token into pod:
   ```
         AWS_DEFAULT_REGION:           us-west-2
         AWS_REGION:                   us-west-2
         AWS_ROLE_ARN:                 arn:aws:iam::<account_id_scrubbed>:role/prod-us-west-2-druid
         AWS_WEB_IDENTITY_TOKEN_FILE:  /var/run/secrets/eks.amazonaws.com/serviceaccount/token
   ```
   
   Worth mentioning we are using IRSA successfully for other services/software so everything else is working AFAICT.
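
As an aside, the AWS SDK's web identity provider reads `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` from the variables listed above. A minimal sketch for sanity-checking them from inside a pod, assuming Python is available there. `check_irsa_env` is a hypothetical helper, not part of Druid or the AWS SDK, and it only confirms that the variables and the token file exist, not that the role can actually be assumed:

```python
import os

# The two variables the web identity flow depends on.
REQUIRED_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def check_irsa_env(env=None):
    """Return a list of human-readable problems; an empty list means the basics look right.

    `env` is any mapping of variable names to values and defaults to os.environ.
    """
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")

    token_path = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    if token_path:
        if not os.path.isfile(token_path):
            problems.append(f"token file {token_path} does not exist")
        elif os.path.getsize(token_path) == 0:
            problems.append(f"token file {token_path} is empty")
    return problems
```

Calling `check_irsa_env()` with no arguments inspects the current process environment. An empty result here, combined with a 403 from Druid, would suggest the problem lies in how the process builds its credentials rather than in the pod setup.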




[GitHub] [druid] AdheipSingh commented on issue #11303: Issues connecting to S3 on EKS

Posted by GitBox <gi...@apache.org>.
AdheipSingh commented on issue #11303:
URL: https://github.com/apache/druid/issues/11303#issuecomment-915288630


   BTW, what version of k8s are you using?




[GitHub] [druid] josephglanville commented on issue #11303: Issues connecting to S3 on EKS

Posted by GitBox <gi...@apache.org>.
josephglanville commented on issue #11303:
URL: https://github.com/apache/druid/issues/11303#issuecomment-913808482


   This still appears to be the case on master/0.22.0. The root cause appears to be that the `WebIdentityTokenCredentialsProvider` can't be initialised because a ClassNotFound exception is raised:
   ```
   Caused by: java.lang.RuntimeException: com.amazonaws.SdkClientException: Unable to load AWS credentials from any provider in the chain: [org.apache.druid.common.aws.ConfigDrivenAwsCredentialsConfigProvider@772ef0c7: Unable to load AWS credentials from druid AWSCredentialsConfig, org.apache.druid.common.aws.LazyFileSessionCredentialsProvider@b4474bb: cannot refresh AWS credentials, EnvironmentVariableCredentialsProvider: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY)), SystemPropertiesCredentialsProvider: Unable to load AWS credentials from Java system properties (aws.accessKeyId and aws.secretKey), WebIdentityTokenCredentialsProvider: To use assume role profiles the aws-java-sdk-sts module must be on the class path., com.amazonaws.auth.profile.ProfileCredentialsProvider@40d95ffc: profile file cannot be null, com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper@c112365: Unauthorized
  (Service: null; Status Code: 401; Error Code: null; Request ID: null; Proxy: null), com.amazonaws.auth.InstanceProfileCredentialsProvider@661ec260: Unauthorized (Service: null; Status Code: 401; Error Code: null; Request ID: null; Proxy: null)]
   ```
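
The chain error above packs one failure reason per provider into a single line, which makes the relevant entry easy to miss. A rough sketch for splitting it into (provider, reason) pairs, assuming the message keeps the `[Provider: reason, Provider@hash: reason, ...]` shape seen in this log. `split_provider_chain` is a hypothetical helper written for this thread, not an SDK or Druid function:

```python
import re

# Start of an entry: a class-like name, optionally followed by "@<hex hash>", then ": ".
# Entries after the first are preceded by ", ".
_ENTRY_START = re.compile(r"(?:^|, )([\w.$]+(?:@[0-9a-f]+)?): ")


def split_provider_chain(message):
    """Split an 'Unable to load AWS credentials from any provider' message.

    Returns a list of (provider, reason) tuples in chain order.
    Raises ValueError if the message has no bracketed provider list.
    """
    body = message[message.index("[") + 1 : message.rindex("]")]
    matches = list(_ENTRY_START.finditer(body))
    pairs = []
    for i, match in enumerate(matches):
        # A reason runs until the next entry starts, or to the end of the list.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(body)
        pairs.append((match.group(1), body[match.end():end].strip()))
    return pairs
```

Applied to the log above, the entry to look for is the one for `WebIdentityTokenCredentialsProvider`, since that is the provider IRSA relies on; the 401s from the container and instance profile providers come later in the chain.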
   
   Curiously, the sts module it's complaining about appears to be present in the s3 extension folder and loaded by Druid during startup:
   
   ```
   ization - Loading extension [druid-s3-extensions], jars: aws-java-sdk-core-1.12.37.jar, aws-java-sdk-sts-1.12.37.jar, commons-codec-1.13.jar, commons-logging-1.1.1.jar, druid-s3-extensions-0.22.0-SNAPSHOT.jar, httpclient-4.5.10.jar, httpcore-4.4.11.jar, ion-java-1.0.2.jar, jackson-dataformat-cbor-2.10.5.jar, jmespath-java-1.12.37.jar, joda-time-2.10.5.jar
   ```
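
One way to double-check which jars on disk actually carry the STS class is to scan a folder of jars directly. The sketch below is a hypothetical helper, not a Druid tool. It assumes the class the SDK needs is `com.amazonaws.services.securitytoken.internal.STSProfileCredentialsService`; that name is an assumption about AWS SDK v1 internals and is worth verifying against the SDK version in use:

```python
import zipfile
from pathlib import Path

# Assumption: the STS class that the SDK v1 web identity provider loads by name.
STS_CLASS = "com.amazonaws.services.securitytoken.internal.STSProfileCredentialsService"


def jars_containing(directory, class_name=STS_CLASS):
    """Return the sorted file names of .jar files under `directory` that contain `class_name`."""
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in Path(directory).rglob("*.jar"):
        try:
            with zipfile.ZipFile(jar) as archive:
                if entry in archive.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            continue  # not a readable archive, skip it
    return sorted(hits)
```

Running it against both the s3 extension folder and Druid's core library folder would show where the class lives. Note that a jar being present on disk does not guarantee it is visible to the classloader that builds the credentials chain; whether that is what happens here is a guess, not something this thread confirms.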
   
   @himadrisingh did you test this feature on EKS and if so did it require any specific Druid configuration to work with IRSA?




[GitHub] [druid] AdheipSingh edited a comment on issue #11303: Issues connecting to S3 on EKS

Posted by GitBox <gi...@apache.org>.
AdheipSingh edited a comment on issue #11303:
URL: https://github.com/apache/druid/issues/11303#issuecomment-915231146


   @EwanValentine do you mind showing your service account on k8s?
   kubectl get sa <name-of-sa> -n <namespace> -o yaml
   
   @josephglanville 
   
   > @himadrisingh did you test this feature on EKS and if so did it require any specific Druid configuration to work with IRSA?
   
   No, it does not need any Druid configuration; an annotation needs to be added to the Kubernetes service account object, that's it. 
   We haven't tested this recently, but before updating the SDK version it was tested. 




[GitHub] [druid] didip edited a comment on issue #11303: Issues connecting to S3 on EKS

Posted by GitBox <gi...@apache.org>.
didip edited a comment on issue #11303:
URL: https://github.com/apache/druid/issues/11303#issuecomment-1055037028


   It seems like `AWSCredentialsUtils.defaultAWSCredentialsProviderChain` is not even used in `S3InputSource.java` and `S3StorageDruidModule.java`
   
   Also, it looks like talking to Kinesis would work because `AWSCredentialsUtils.defaultAWSCredentialsProviderChain` is used in `KinesisRecordSupplier.java`
   
   The seemingly useful `AWSModule.java: AWSModule().getAWSCredentialsProvider(config)` is only used in one file, and that's a test file: `TestAWSCredentialsProvider.java`

