Posted to user@flume.apache.org by Jonathan Meed <jm...@umich.edu> on 2011/11/22 16:45:48 UTC

Flume collectors started crashing regularly all of a sudden

Hi,

I had a flume cluster operating pretty well over the last few weeks. In the
past two days 1 of my 2 flume collectors started erroring every few minutes.
After a restart it would work fine, moving files for only a few minutes
before stopping with the same error. Here's a link to the weblog. Any help
would be greatly appreciated.

http://pastebin.com/fspsNdYC

Jonathan Meed
University of Michigan
College of Engineering Class of 2013
jmeed@umich.edu
917-880-7974

Re: Flume collectors started crashing regularly all of a sudden

Posted by Shuang <sh...@open42.com>.
In my experience with Flume's S3 sink, such 404 errors do not indicate
real problems. I have been using Flume to write to S3 for the last 6 months,
and I see these errors all the time without any data loss. At the beginning
I was worried and posted on this mailing list asking for clarification, but
no definitive conclusion was reached. I now tend to believe the errors are
just incorrectly reported.

Shuang

On Tue, Nov 22, 2011 at 9:45 AM, Mark Lewandowski <
mark.e.lewandowski@gmail.com> wrote:

> I wrote to this list with a similar problem last week.  I started noticing
> the 404s after upgrading to flume 0.9.4.  The weird part is, most of the
> requests to S3 are 404-ing, but not all.  Eric Sammer suggested it might be
> due to data inconsistency on the S3 side, but I'm not sure I believe that.
> I also posted a question about this on the S3 forum, but haven't heard back
> yet.
>
> -Mark

Re: Flume collectors started crashing regularly all of a sudden

Posted by Mark Lewandowski <ma...@gmail.com>.
I wrote to this list with a similar problem last week.  I started noticing
the 404s after upgrading to flume 0.9.4.  The weird part is, most of the
requests to S3 are 404-ing, but not all.  Eric Sammer suggested it might be
due to data inconsistency on the S3 side, but I'm not sure I believe that.
I also posted a question about this on the S3 forum, but haven't heard back
yet.

-Mark

On Tue, Nov 22, 2011 at 8:26 AM, Alexander C.H. Lorenz <
wget.null@googlemail.com> wrote:

> I'm not sure. In your log I see a lot of 404 responses instead of 200
> (which means that some buckets could not be loaded) from an S3 instance
> (org.jets3t.service.impl.rest.httpclient.RestS3Service). All warnings
> concern the same file (0025 at the end), and in the end flume gives up.
> To me it looks like an S3 problem writing the file
> (collectorSink("s3n://hooklogic-data-east/flume/%Y/%m/%d/%H","syslog")).
>
> best,
>  Alex

Re: Flume collectors started crashing regularly all of a sudden

Posted by "Alexander C.H. Lorenz" <wg...@googlemail.com>.
I'm not sure. In your log I see a lot of 404 responses instead of 200
(which means that some buckets could not be loaded) from an S3 instance
(org.jets3t.service.impl.rest.httpclient.RestS3Service). All warnings
concern the same file (0025 at the end), and in the end flume gives up.
To me it looks like an S3 problem writing the file
(collectorSink("s3n://hooklogic-data-east/flume/%Y/%m/%d/%H","syslog")).
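
One way to check whether the warnings really all concern the same file is to
tally them per key. The sketch below is an illustration only: it assumes each
jets3t warning sits on a single line of the real log in the format quoted in
this thread, and the sample lines are made up for the example, not taken from
the pastebin.

```python
import re
from collections import Counter
from urllib.parse import unquote

# Matches the jets3t warning format quoted in this thread, assuming each
# warning occupies one line in the actual log file.
WARNING = re.compile(
    r"Response '([^']+)' - Unexpected response code (\d{3}), expected 200"
)


def count_404s_by_key(lines):
    """Return a Counter mapping decoded S3 key -> number of 404 warnings."""
    counts = Counter()
    for line in lines:
        match = WARNING.search(line)
        if match and match.group(2) == "404":
            counts[unquote(match.group(1))] += 1
    return counts


# Hypothetical sample lines, invented for this example.
sample = [
    "RestS3Service: Response '/flume%2Fa_%24folder%24' - Unexpected response code 404, expected 200",
    "RestS3Service: Response '/flume%2Fa_%24folder%24' - Unexpected response code 404, expected 200",
    "some unrelated log line",
]
print(count_404s_by_key(sample).most_common())
# [('/flume/a_$folder$', 2)]
```

If a single key dominates the tally, that would support the reading that one
file is being retried; a wide spread of keys would point elsewhere.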

best,
 Alex

On Tue, Nov 22, 2011 at 5:14 PM, Jonathan Meed <jm...@umich.edu> wrote:

> Sorry to seem a little dense. So if I understand this correctly, the flume
> collector is having issues connecting to the flume master and is therefore
> erroring? Both the flume collector in question and the master are on the
> same physical machine, and none of the other flume nodes on different
> machines are showing any errors, which is peculiar. Any suggestions on how
> to fix this?


-- 
Alexander Lorenz
http://mapredit.blogspot.com

*Think of the environment: please don't print this email unless you
really need to.*

Re: Flume collectors started crashing regularly all of a sudden

Posted by Jonathan Meed <jm...@umich.edu>.
Sorry to seem a little dense. So if I understand this correctly, the flume
collector is having issues connecting to the flume master and is therefore
erroring? Both the flume collector in question and the master are on the
same physical machine, and none of the other flume nodes on different
machines are showing any errors, which is peculiar. Any suggestions on how
to fix this?


Jonathan Meed
University of Michigan
School of Engineering Class of 2013
jmeed@umich.edu
917-880-7974



On Tue, Nov 22, 2011 at 11:07 AM, Alexander C.H. Lorenz <
wget.null@googlemail.com> wrote:

> That exception will be sent when the master RPC isn't reachable:
>
> https://svn.apache.org/repos/asf/incubator/flume/trunk/flume-core/src/main/java/com/cloudera/flume/handlers/endtoend/CollectorAckListener.java
>
> - Alex

Re: Flume collectors started crashing regularly all of a sudden

Posted by "Alexander C.H. Lorenz" <wg...@googlemail.com>.
That exception will be sent when the master RPC isn't reachable:
https://svn.apache.org/repos/asf/incubator/flume/trunk/flume-core/src/main/java/com/cloudera/flume/handlers/endtoend/CollectorAckListener.java
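
A quick way to test that hypothesis from the collector host is a plain TCP
connection attempt. This is a minimal sketch using only Python's standard
library; the function name is made up for the example, and the host and port
you pass in are assumptions that must come from your own master
configuration. It only shows that something accepts connections on that
port, not that the flume RPC itself is healthy.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `is_reachable("localhost", 35872)` would probe the local
machine; 35872 is used here only as an assumed example value, so substitute
the port your master actually listens on.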

- Alex

On Tue, Nov 22, 2011 at 4:58 PM, Jonathan Meed <jm...@umich.edu> wrote:

> It appears that S3 is working. I can see new flume events getting added in
> my S3 bucket.



Re: Flume collectors started crashing regularly all of a sudden

Posted by Jonathan Meed <jm...@umich.edu>.
It appears that S3 is working. I can see new flume events getting added in
my S3 bucket.

Jonathan Meed
University of Michigan
School of Engineering Class of 2013
jmeed@umich.edu
917-880-7974



On Tue, Nov 22, 2011 at 10:55 AM, Alexander C.H. Lorenz <
wget.null@googlemail.com> wrote:

> Yes, it looks like an issue with your S3 instance. Is it running and
> available?
>
> - alex

Re: Flume collectors started crashing regularly all of a sudden

Posted by "Alexander C.H. Lorenz" <wg...@googlemail.com>.
Yes, it looks like an issue with your S3 instance. Is it running and
available?

- alex

On Tue, Nov 22, 2011 at 4:52 PM, Jonathan Meed <jm...@umich.edu> wrote:

> Hi,
>
> I am actually using a multi-sink for HDFS and S3. Could that be the
> issue? The config I used is below.
>
>
> config [beacon_flume02use, autoCollectorSource,
> [collectorSink("hdfs://107.20.248.101/user/flume/beaconlog","syslog" , 30000),
> collectorSink("s3n://hooklogic-data-east/flume/%Y/%m/%d/%H","syslog") ],
> exec, config, beacon_flume01use, autoCollectorSource,
> [collectorSink("hdfs://ip-address/user/flume/beaconlog","syslog" , 30000),
> collectorSink("s3n://east/flume/%Y/%m/%d/%H","syslog") ]]



Re: Flume collectors started crashing regularly all of a sudden

Posted by Jonathan Meed <jm...@umich.edu>.
Hi,

I am actually using a multi-sink for HDFS and S3. Could that be the issue?
The config I used is below.


config [beacon_flume02use, autoCollectorSource,
[collectorSink("hdfs://107.20.248.101/user/flume/beaconlog","syslog" , 30000),
collectorSink("s3n://hooklogic-data-east/flume/%Y/%m/%d/%H","syslog") ],
exec, config, beacon_flume01use, autoCollectorSource,
[collectorSink("hdfs://ip-address/user/flume/beaconlog","syslog" , 30000),
collectorSink("s3n://east/flume/%Y/%m/%d/%H","syslog") ]]
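
To see which S3 prefix a given event would land under with the
`%Y/%m/%d/%H` escapes in the sink path above, the expansion can be
approximated in Python. This is a rough sketch only: it assumes flume's date
escapes behave like `strftime` for these four fields, and it ignores which
timestamp and time zone flume actually uses. The function name is invented
for the example.

```python
from datetime import datetime

# Sink path copied from the config above.
SINK_PATH = "s3n://hooklogic-data-east/flume/%Y/%m/%d/%H"


def expand_sink_path(template: str, event_time: datetime) -> str:
    """Approximate the date escapes with strftime (assumed equivalent here)."""
    return event_time.strftime(template)


print(expand_sink_path(SINK_PATH, datetime(2011, 11, 20, 3, 15)))
# s3n://hooklogic-data-east/flume/2011/11/20/03
```

Each distinct hour therefore maps to its own prefix, so a collector writing
events from many hours touches many prefixes on the S3 side.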

Thanks
Jonathan Meed
University of Michigan
School of Engineering Class of 2013
jmeed@umich.edu
917-880-7974



On Tue, Nov 22, 2011 at 10:50 AM, Alexander C.H. Lorenz <
wget.null@googlemail.com> wrote:

> Hi,
>
> Does this run on Amazon's S3?
>
> org.jets3t.service.impl.rest.httpclient.RestS3Service: Response
> '/flume%2F2011%2F11%2F20%2F03%2Fsyslog20111121-193628975-0500.2960659853113025.00000025_%24folder%24'
> - Unexpected response code 404, expected 200
>
> Check if the trackers are running.
>
> - alex

Re: Flume collectors started crashing regularly all of a sudden

Posted by "Alexander C.H. Lorenz" <wg...@googlemail.com>.
Hi,

Does this run on Amazon's S3?

org.jets3t.service.impl.rest.httpclient.RestS3Service: Response
'/flume%2F2011%2F11%2F20%2F03%2Fsyslog20111121-193628975-0500.2960659853113025.00000025_%24folder%24'
- Unexpected response code 404, expected 200
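
Decoding the URL-encoded request path from that warning shows which object
was being requested. The sketch below uses only Python's standard library.
The interpretation in the last comment (that Hadoop's s3n filesystem probes
for a `_$folder$` marker object when checking whether a path is a directory,
so a 404 may just mean "no such directory marker") is an assumption about
this log, not something confirmed in the thread.

```python
from urllib.parse import unquote

# Request path copied from the jets3t warning quoted above.
raw_path = (
    "/flume%2F2011%2F11%2F20%2F03%2F"
    "syslog20111121-193628975-0500.2960659853113025.00000025_%24folder%24"
)

key = unquote(raw_path)
print(key)
# /flume/2011/11/20/03/syslog20111121-193628975-0500.2960659853113025.00000025_$folder$

# Assumption: a key ending in "_$folder$" is a directory marker probe,
# not the data file itself.
is_folder_marker = key.endswith("_$folder$")
```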

Check if the trackers are running.

- alex


On Tue, Nov 22, 2011 at 4:45 PM, Jonathan Meed <jm...@umich.edu> wrote:

> Hi,
>
> I had a flume cluster operating pretty well over the last few weeks. In
> the past two days 1 of my 2 flume collectors started erroring every few
> minutes. After a restart it would work fine, moving files for only a few
> minutes before stopping with the same error. Here's a link to the weblog.
> Any help would be greatly appreciated.
>
> http://pastebin.com/fspsNdYC

