Posted to user@flume.apache.org by Mark Lewandowski <ma...@gmail.com> on 2011/11/17 20:23:27 UTC

Problems writing to S3

Has anyone had success getting flume-0.9.4 to write to S3?  Since I
upgraded, I've been getting 404s from S3 most of the time, though not
always.  Here's the error:

2011-11-17 19:10:59,755 WARN
org.jets3t.service.impl.rest.httpclient.RestS3Service: Response
'/logs%2F2011-10-22%2F0400%2Fpath-web-20111117-190826507%2B0000.32131453853534114.00000039.tmp'
- Unexpected response code 404, expected 200
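
The request path in that warning is URL-encoded, which makes it hard to read. A minimal sketch for decoding it with Python's standard library follows; the variable names are illustrative only, and nothing here is part of Flume or jets3t:

```python
from urllib.parse import unquote

# Request path copied verbatim from the jets3t warning above.
logged_path = (
    "/logs%2F2011-10-22%2F0400%2F"
    "path-web-20111117-190826507%2B0000.32131453853534114.00000039.tmp"
)

# unquote() turns %2F back into "/" and %2B back into "+".
decoded_path = unquote(logged_path).lstrip("/")

# Split into path components to see the directory layout and file name.
parts = decoded_path.split("/")

print(decoded_path)
for part in parts:
    print("  ", part)
```

The decoded form separates the date and hour directories from the .tmp file name, which is easier to compare against what is actually present in the bucket.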

I've followed Eric Lubow's blog post on getting this to work, but I'm
still seeing these errors frequently.

Thanks in advance,

-Mark

Re: Problems writing to S3

Posted by Mark Lewandowski <ma...@gmail.com>.
Eric,

I don't believe that I'm hitting different machines, though as far as I can
tell S3 does not give you that kind of transparency.  I'm not sure where to go
from here.  The whole flow was working perfectly until sometime around early
October, when I upgraded to 0.9.4.  Now so many writes are failing that I'm
starting to wonder whether Flume can still support this; if it can't, I need
to find a different solution.

My current stack is ~20 agents talking E2E to 3 auto-chained collectors.
All of the collectors are trying to write to my S3 bucket.

-Mark

On Thu, Nov 17, 2011 at 12:36 PM, Eric Sammer <es...@cloudera.com> wrote:

> Mark:
>
> I'm not an S3 ninja, but if I remember correctly, this can happen when an
> S3 node falls behind another with respect to a copy of the data (i.e. you
> can see an inconsistent picture with concurrent access). Is it possible
> you're hitting different machines in S3? Does one even get that kind of
> visibility into the system?

Re: Problems writing to S3

Posted by Eric Sammer <es...@cloudera.com>.
Mark:

I'm not an S3 ninja, but if I remember correctly, this can happen when an
S3 node falls behind another with respect to a copy of the data (i.e. you
can see an inconsistent picture with concurrent access). Is it possible
you're hitting different machines in S3? Does one even get that kind of
visibility into the system?
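
If the 404s do come from a replica that has not yet caught up, a client can sometimes work around it by retrying after a short delay. The following is only a sketch of that idea under that assumption; `NotFound`, `read_with_retry`, and the `fetch` callable are hypothetical names, and this is not how Flume or jets3t is known to behave:

```python
import time


class NotFound(Exception):
    """Hypothetical stand-in for a 404 response from the object store."""


def read_with_retry(fetch, key, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call fetch(key), retrying on NotFound with exponential backoff.

    fetch is any callable that returns the object or raises NotFound.
    The last NotFound is re-raised once all attempts are used up.
    """
    for attempt in range(attempts):
        try:
            return fetch(key)
        except NotFound:
            if attempt == attempts - 1:
                raise
            # Wait 0.5s, 1s, 2s, ... before asking again.
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the function easy to exercise without real waiting. Retrying only hides a consistency delay; it does not help if the object was never written.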


-- 
Eric Sammer
twitter: esammer
data: www.cloudera.com