Posted to jira@arrow.apache.org by "Weston Pace (Jira)" <ji...@apache.org> on 2022/01/11 20:29:00 UTC

[jira] [Created] (ARROW-15306) [C++] S3FileSystem Should set the content-type header to application/octet-stream if not specified

Weston Pace created ARROW-15306:
-----------------------------------

             Summary: [C++] S3FileSystem Should set the content-type header to application/octet-stream if not specified
                 Key: ARROW-15306
                 URL: https://issues.apache.org/jira/browse/ARROW-15306
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++
            Reporter: Weston Pace


By default, the S3FileSystem does not set the content-type header, which is technically correct, as I don't think a content-type should be specified if one isn't known.

However, the AWS SDK for C++ (aws-sdk-cpp) appears to set the content-type to application/xml whenever it is not specified: https://github.com/aws/aws-sdk-cpp/blob/5378016f845fe85e334ffc30319614e7d4dad41f/aws-cpp-sdk-s3/include/aws/s3/S3Request.h#L41

We could potentially file an issue with the S3 SDK, but I'm not sure that would make any progress (and S3 itself may require that a content-type always be present for some reason).

Since there is no way to avoid specifying a content-type, we should default to application/octet-stream, which is a more accurate "I don't know what this file is" than application/xml.
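
A minimal sketch of the proposed fallback, for illustration only. The names Metadata, ToLower, ResolveContentType and kDefaultContentType are hypothetical and are not part of Arrow or the AWS SDK; the sketch assumes the user-supplied metadata is available as a simple string map and that an empty value counts as "not specified".

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical stand-in for user-supplied object metadata (assumption:
// a plain key/value string map, not Arrow's actual metadata type).
using Metadata = std::map<std::string, std::string>;

// Fallback used when the caller did not specify a content-type.
constexpr const char* kDefaultContentType = "application/octet-stream";

// HTTP header names are case-insensitive, so normalize before comparing.
std::string ToLower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Returns the caller's content-type if one was given (and is non-empty),
// otherwise the application/octet-stream default.
std::string ResolveContentType(const Metadata& metadata) {
  for (const auto& kv : metadata) {
    if (ToLower(kv.first) == "content-type" && !kv.second.empty()) {
      return kv.second;
    }
  }
  return kDefaultContentType;
}
```

The resolved value would then be set explicitly on the upload request, so the SDK's application/xml default never applies. A user-provided content-type always takes precedence over the fallback.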

An incorrect content-type can confuse libraries that try to act on the file automatically based on the content-type.  See https://github.com/apache/arrow/issues/11934



--
This message was sent by Atlassian Jira
(v8.20.1#820001)