Posted to jira@arrow.apache.org by "Weston Pace (Jira)" <ji...@apache.org> on 2022/01/11 20:29:00 UTC
[jira] [Created] (ARROW-15306) [C++] S3FileSystem Should set the content-type header to application/octet-stream if not specified
Weston Pace created ARROW-15306:
-----------------------------------
Summary: [C++] S3FileSystem Should set the content-type header to application/octet-stream if not specified
Key: ARROW-15306
URL: https://issues.apache.org/jira/browse/ARROW-15306
Project: Apache Arrow
Issue Type: Bug
Components: C++
Reporter: Weston Pace
By default, S3FileSystem does not set the content-type header. That is technically correct, since I don't think a content-type should be specified when one isn't known.
However, the AWS SDK for C++ (aws-sdk-cpp) appears to set the content-type to application/xml whenever it is not specified: https://github.com/aws/aws-sdk-cpp/blob/5378016f845fe85e334ffc30319614e7d4dad41f/aws-cpp-sdk-s3/include/aws/s3/S3Request.h#L41
We could potentially file an issue with the S3 SDK, but I'm not sure that would make any progress (and S3 itself may require that a content-type always be present for some reason).
Since there is no way to avoid specifying a content-type, we should default to application/octet-stream, which is a more accurate "I don't know what this file is" than application/xml.
An incorrect content-type can confuse libraries that try to act on a file automatically based on its content-type. See https://github.com/apache/arrow/issues/11934
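The proposed defaulting rule could be sketched roughly as follows. This is only an illustration of the intended behavior, not the actual S3FileSystem change: the Metadata alias and the WithDefaultContentType helper are hypothetical names, and a plain std::map stands in for whatever metadata type the filesystem really uses. It also assumes the key is spelled exactly "Content-Type" (no case-insensitive matching).

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for per-object metadata (not an Arrow type).
using Metadata = std::map<std::string, std::string>;

// Assumed header key and the fallback value proposed in this issue.
const char kContentTypeKey[] = "Content-Type";
const char kDefaultContentType[] = "application/octet-stream";

// Hypothetical helper: returns a copy of `metadata` in which Content-Type
// is set to application/octet-stream only if the caller did not specify one.
Metadata WithDefaultContentType(Metadata metadata) {
  // std::map::emplace leaves an existing entry untouched, so a
  // caller-supplied content-type always takes precedence.
  metadata.emplace(kContentTypeKey, kDefaultContentType);
  return metadata;
}
```

With this rule, an upload with no metadata would be sent as application/octet-stream, while an upload that explicitly sets, say, text/csv would keep that value.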
--
This message was sent by Atlassian Jira
(v8.20.1#820001)