Posted to dev@flume.apache.org by "alex balzer (JIRA)" <ji...@apache.org> on 2017/07/07 22:45:00 UTC
[jira] [Created] (FLUME-3118) S3 urls do not find the correct region.

alex balzer created FLUME-3118:
----------------------------------

             Summary: S3 urls do not find the correct region.
                 Key: FLUME-3118
                 URL: https://issues.apache.org/jira/browse/FLUME-3118
             Project: Flume
          Issue Type: Improvement
          Components: Sinks+Sources
            Reporter: alex balzer


I am trying to write to S3 through the hdfs sink, but I am running into hurdles at every turn. I need to push to S3 without access/secret Amazon keys, authenticating through the underlying instance profile instead. I also need to add the AWS encryption header for AES256. I am trying to use a base path of `s3://something.us-east-2.something/else`, but when I try it I get this error: `<Error><Code>AuthorizationHeaderMalformed</Code><Message>The authorization header is malformed; the region 'us-east-1' is wrong; expecting 'us-east-2'</Message><Region>us-east-2</Region><RequestId>N/A</RequestId><HostId>N/A</HostId></Error>`
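
To make the mismatch explicit, here is a small sketch (Python, standard library only) that pulls the relevant fields out of the error body quoted above. The helper name `summarize_s3_error` is made up for illustration and is not part of Flume, Hadoop, or the AWS SDK. The `<Region>` element is the region S3 expects for the bucket, while the message text shows the request was signed for us-east-1.

```python
import xml.etree.ElementTree as ET

# Error body copied verbatim from the report above.
ERROR_BODY = (
    "<Error><Code>AuthorizationHeaderMalformed</Code>"
    "<Message>The authorization header is malformed; the region 'us-east-1' "
    "is wrong; expecting 'us-east-2'</Message>"
    "<Region>us-east-2</Region><RequestId>N/A</RequestId>"
    "<HostId>N/A</HostId></Error>"
)


def summarize_s3_error(body: str) -> dict:
    """Hypothetical helper: extract the error code, the region S3 expects,
    and the human-readable message from an S3 XML error body."""
    root = ET.fromstring(body)
    return {
        "code": root.findtext("Code"),
        "expected_region": root.findtext("Region"),
        "message": root.findtext("Message"),
    }


summary = summarize_s3_error(ERROR_BODY)
print(summary["code"], summary["expected_region"])
```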

Here is my Flume config:
```
tier1.sources  = source1
tier1.channels = channel1
tier1.sinks = sink1

tier1.sources.source1.type = org.apache.flume.source.kafka.KafkaSource
tier1.sources.source1.zookeeperConnect = localhost:2181
tier1.sources.source1.topic = lynch
# tier1.sources.source1.groupId = flume
tier1.sources.source1.channels = channel1
tier1.sources.source1.interceptors = i1
tier1.sources.source1.interceptors.i1.type = timestamp
tier1.sources.source1.kafka.consumer.timeout.ms = 100

tier1.channels.channel1.type = memory
#tier1.channels.channel1.capacity = 10000
#tier1.channels.channel1.transactionCapacity = 1000

tier1.sinks.sink1.type = hdfs
tier1.sinks.sink1.hdfs.path = s3://something.us-east-2.something/else
tier1.sinks.sink1.hdfs.rollInterval = 5
tier1.sinks.sink1.hdfs.rollSize = 0
tier1.sinks.sink1.hdfs.rollCount = 0
tier1.sinks.sink1.hdfs.fileType = DataStream
tier1.sinks.sink1.channel = channel1
```

Here is the command to run it:
```
bin/flume-ng agent -c . -f kafka-source.conf -n tier1
```

It should not be this difficult to push to S3. Support for s3:// addresses and instance profiles needs to be added. I have tried many permutations to get this to work, and I would really like to see Flume become a friendlier tool in these situations.



