Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2017/06/13 12:47:00 UTC

[jira] [Commented] (SPARK-21077) Cannot access public files over S3 protocol

    [ https://issues.apache.org/jira/browse/SPARK-21077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16047828#comment-16047828 ] 

Sean Owen commented on SPARK-21077:
-----------------------------------

I think this is a Hadoop or AWS SDK issue, not a Spark issue.

> Cannot access public files over S3 protocol
> -------------------------------------------
>
>                 Key: SPARK-21077
>                 URL: https://issues.apache.org/jira/browse/SPARK-21077
>             Project: Spark
>          Issue Type: Bug
>          Components: EC2
>    Affects Versions: 2.1.0
>         Environment: Spark 2.1.0 default installation. No separate Hadoop installation; using the one distributed with Spark.
> Added to $SPARK_HOME/jars:
> hadoop-aws-2.7.3.jar and aws-java-sdk-1.7.4.jar
> Added endpoint configuration in $SPARK_HOME/conf/core-site.xml (I want to access datasets hosted by an organisation on Ceph, which follows the S3 protocol).
> Ubuntu 14.04 x64.
>            Reporter: Ciprian Tomoiaga
>
> I am trying to access a dataset with public (anonymous) credentials via the s3 (or s3a, s3n) protocol.
> It fails with an error saying that no provider in the chain can supply the credentials.
> I asked our sysadmin to add some dummy credentials, and if I set them up (via link or config) then I have access.
> I tried setting this config:
> {code:xml}
> <property>
>   <name>fs.s3a.credentials.provider</name>
>   <value>org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider</value>
> </property>
> {code}
> but it still doesn't work.
> I suggested that it is an aws-sdk-java issue [here|https://github.com/aws/aws-sdk-java/issues/1122#issuecomment-307814540], but they said it is not.
> Any hints on how to use public S3 files from Spark?
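
One thing that may be worth checking, offered as an assumption rather than a confirmed fix: as far as I know, the S3A connector in Hadoop 2.8.0 and later reads its credential provider from fs.s3a.aws.credentials.provider, not fs.s3a.credentials.provider, and the hadoop-aws 2.7.3 jar listed in the environment may not read either name. A minimal core-site.xml sketch under that assumption (the endpoint value is a hypothetical placeholder, not the reporter's actual Ceph gateway):
{code:xml}
<!-- Assumption: hadoop-aws 2.8.0 or later is on the classpath. -->
<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <value>org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider</value>
</property>
<!-- Hypothetical endpoint; replace with the real S3-compatible gateway host. -->
<property>
  <name>fs.s3a.endpoint</name>
  <value>ceph-gateway.example.org</value>
</property>
{code}
If this assumption holds, anonymous access would apply only to s3a:// URLs; the s3 and s3n schemes use different connectors with their own settings.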



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
