Posted to issues@spark.apache.org by "Daniel Carl Jones (Jira)" <ji...@apache.org> on 2022/11/18 09:24:00 UTC

[jira] [Commented] (SPARK-38958) Override S3 Client in Spark Write/Read calls

    [ https://issues.apache.org/jira/browse/SPARK-38958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17635755#comment-17635755 ] 

Daniel Carl Jones commented on SPARK-38958:
-------------------------------------------

I had someone reach out to me with a similar request - static headers on all S3 requests for a given S3A file system.

If static headers per file system were added as a configurable feature, do we have any idea what the configuration might look like? That is, how do we model a list of key-value pairs in a Hadoop configuration? The best option I can see is "getStrings", in which case we would either need to check that the list has an even length (the right number of keys and values), or have each key-value pair be a single string joined by an equals sign.
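
To make the second option concrete (each pair as one string joined by an equals sign), here is a minimal sketch of how such a list might be parsed. It is an illustration only, not an existing S3A feature: the class StaticHeaderConfig, the method parseHeaders and the sample header names are all hypothetical. The String[] argument stands in for what Hadoop's Configuration.getStrings(name) returns for a comma-separated property, so the sketch runs without Hadoop on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Hadoop or S3A.
class StaticHeaderConfig {

    // Parses entries such as "x-amz-request-payer=requester" into an ordered map.
    // `entries` stands in for the String[] that Configuration.getStrings(name)
    // would return for a comma-separated property (null when the property is unset).
    static Map<String, String> parseHeaders(String[] entries) {
        Map<String, String> headers = new LinkedHashMap<>();
        if (entries == null) {
            return headers;
        }
        for (String entry : entries) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            // Split on the first '=' only, so values may themselves contain '='.
            int eq = trimmed.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException(
                        "Expected key=value but got: " + trimmed);
            }
            String key = trimmed.substring(0, eq).trim();
            String value = trimmed.substring(eq + 1).trim();
            headers.put(key, value);
        }
        return headers;
    }

    public static void main(String[] args) {
        // Simulates the comma-split value of a hypothetical per-filesystem property.
        String[] raw = "x-amz-request-payer=requester, x-custom-team=data-eng".split(",");
        System.out.println(parseHeaders(raw));
    }
}
```

With this shape a malformed entry fails on its own, so no even-length check is needed. One limitation to keep in mind: because the list is comma-separated, a header value containing a comma could not be expressed without some escaping scheme.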

Also, any reasons not to have such a configuration or any better way to design it?

> Override S3 Client in Spark Write/Read calls
> --------------------------------------------
>
>                 Key: SPARK-38958
>                 URL: https://issues.apache.org/jira/browse/SPARK-38958
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 3.2.1
>            Reporter: Hershal
>            Priority: Major
>
> Hello,
> I have been working to use Spark to read and write data to S3. Unfortunately, there are a few S3 headers that I need to add to my Spark read/write calls. After much looking, I have not found a way to replace the S3 client that Spark uses to make the read/write calls. I also have not found a configuration that allows me to pass in S3 headers. Here is an example of some common S3 request headers (https://docs.aws.amazon.com/AmazonS3/latest/API/RESTCommonRequestHeaders.html). Does functionality already exist to add S3 headers to Spark read/write calls, or to pass in a custom client that would send these headers on every read/write request? Appreciate the help and feedback.
>  
> Thanks,



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
