Posted to user@flume.apache.org by Mayur Gupta <ma...@gmail.com> on 2014/01/28 20:16:00 UTC

Source Failover and Sink Failover

I have two questions related to failover in Flume.

1) If the source is not accepting events because of a failure condition, is
there a failover mechanism in Flume other than configuring 2 agents to listen to
the events, in which case I will get duplicates?

2) I configured a sink processor for failover with 2 sinks, one logger and
another HDFS. If the HDFS sink starts failing (say HDFS is no longer up),
the events are not processed by the logger sink. Is this expected behavior? Does
failover only work with 2 Avro sinks and not in the manner I have
configured?

I would appreciate it if somebody could point me in the right direction.

-Mayur

Re: Source Failover and Sink Failover

Posted by Jeff Lord <jl...@cloudera.com>.
Mayur,

The HDFS sink is going to keep trying to connect for maxRetries=10.
Are you able to post the complete log, or at least another couple of
minutes of it?
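
One note on that retry behavior: the maxRetries=10, sleepTime=1 SECONDS policy in
the log is the Hadoop IPC client's connect-retry policy rather than a Flume
setting. On the Flume side, the HDFS sink's hdfs.callTimeout bounds how long each
HDFS operation may block before the sink reports a failure that the failover
processor can act on. A minimal sketch against the agent and sink names from
Mayur's configuration; the value is illustrative, not a recommendation:

# Fail an HDFS open/write/flush/close call after 10 seconds instead of
# waiting out the Hadoop client's own connect retries (illustrative value)
ag1.sinks.k2.hdfs.callTimeout = 10000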

-Jeff



On Fri, Feb 7, 2014 at 1:32 AM, Mayur Gupta <ma...@gmail.com> wrote:

> 1) The source is an Avro client. The events are lost. The intent of the
> question is whether there is a way I can provide failover for the very first
> Flume Source without getting duplicate events.
>
> 2) Here is the configuration for the failover sinks, along with the logs.
>
> ag1.sources=s1
> ag1.channels=c1
> ag1.sinks=k1 k2
>
>
> ag1.sources.s1.type=netcat
> ag1.sources.s1.channels=c1
> ag1.sources.s1.bind=0.0.0.0
> ag1.sources.s1.port=12345
>
> ag1.channels.c1.type=memory
>
> ag1.sinkgroups=sg1
> ag1.sinkgroups.sg1.sinks=k1 k2
> ag1.sinkgroups.sg1.processor.type=failover
> ag1.sinkgroups.sg1.processor.priority.k1=10
> ag1.sinkgroups.sg1.processor.priority.k2=20
>
> ag1.sinks.k1.type=logger
> ag1.sinks.k1.channel=c1
>
> ag1.sinks.k2.type=hdfs
> ag1.sinks.k2.hdfs.path=flume/data/%{directory}
> ag1.sinks.k2.hdfs.fileSuffix=.log
> ag1.sinks.k2.hdfs.rollInterval=0
> ag1.sinks.k2.hdfs.rollCount=10
> ag1.sinks.k2.hdfs.rollSize=0
> ag1.sinks.k2.hdfs.inUsePrefix=.
> ag1.sinks.k2.hdfs.inUseSuffix=
> ag1.sinks.k2.hdfs.fileType=DataStream
> ag1.sinks.k2.channel=c1
>
>
> *Logs*
>
>     at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:202)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1243)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1087)
>     ... 15 more
> 14/02/07 15:01:40 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
> 14/02/07 15:01:41 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
> 14/02/07 15:01:42 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
> 14/02/07 15:01:42 WARN hdfs.HDFSEventSink: HDFS IO error
> java.io.IOException: DFSOutputStream is closed
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3879)
>     at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
>     at org.apache.flume.sink.hdfs.HDFSDataStream.sync(HDFSDataStream.java:117)
>     at org.apache.flume.sink.hdfs.BucketWriter$5.call(BucketWriter.java:356)
>     at org.apache.flume.sink.hdfs.BucketWriter$5.call(BucketWriter.java:353)
>     at org.apache.flume.sink.hdfs.BucketWriter$8$1.run(BucketWriter.java:536)
>     at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:160)
>     at org.apache.flume.sink.hdfs.BucketWriter.access$1000(BucketWriter.java:56)
>     at org.apache.flume.sink.hdfs.BucketWriter$8.call(BucketWriter.java:533)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
> 14/02/07 15:01:43 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
> 14/02/07 15:01:44 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
> 14/02/07 15:01:45 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)

Re: Source Failover and Sink Failover

Posted by Mayur Gupta <ma...@gmail.com>.
1) The source is an Avro client. The events are lost. The intent of the
question is whether there is a way I can provide failover for the very first
Flume Source without getting duplicate events.
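
For (1), one approach that avoids running two listening agents is to let the Avro
client itself fail over between collectors: the Flume RPC client supports a
failover mode driven by client properties. A minimal sketch, assuming two
collector agents each running an Avro source on port 41414 (the host names and
the max-attempts value are placeholders, not taken from this thread):

client.type = default_failover
hosts = h1 h2
# Placeholder collector addresses; each must run an Avro source on this port
hosts.h1 = collector1.example.com:41414
hosts.h2 = collector2.example.com:41414
# How many of the listed hosts to try before an append fails (assumed value)
max-attempts = 2

This sends each batch to only one collector at a time, so it avoids the
deliberate duplication of fanning out to two agents; it is still not
exactly-once, since a batch that times out mid-delivery may be retried against
the other host.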

2) Here is the configuration for the failover sinks, along with the logs.

ag1.sources=s1
ag1.channels=c1
ag1.sinks=k1 k2


ag1.sources.s1.type=netcat
ag1.sources.s1.channels=c1
ag1.sources.s1.bind=0.0.0.0
ag1.sources.s1.port=12345

ag1.channels.c1.type=memory

ag1.sinkgroups=sg1
ag1.sinkgroups.sg1.sinks=k1 k2
ag1.sinkgroups.sg1.processor.type=failover
ag1.sinkgroups.sg1.processor.priority.k1=10
ag1.sinkgroups.sg1.processor.priority.k2=20

ag1.sinks.k1.type=logger
ag1.sinks.k1.channel=c1

ag1.sinks.k2.type=hdfs
ag1.sinks.k2.hdfs.path=flume/data/%{directory}
ag1.sinks.k2.hdfs.fileSuffix=.log
ag1.sinks.k2.hdfs.rollInterval=0
ag1.sinks.k2.hdfs.rollCount=10
ag1.sinks.k2.hdfs.rollSize=0
ag1.sinks.k2.hdfs.inUsePrefix=.
ag1.sinks.k2.hdfs.inUseSuffix=
ag1.sinks.k2.hdfs.fileType=DataStream
ag1.sinks.k2.channel=c1
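
One note on this configuration: with the failover processor the larger priority
wins, so k2 (HDFS, priority 20) is the primary and k1 (logger) is the sink the
processor should fall back to. Below is a sketch of additions commonly made to a
setup like this; the maxpenalty, capacity, and hdfs:// URI values are
illustrative assumptions, not taken from this thread:

# Upper bound on how long a failed sink is penalized before it is retried (millis)
ag1.sinkgroups.sg1.processor.maxpenalty = 10000

# Bound the memory channel so a stalled HDFS sink cannot buffer events without limit
ag1.channels.c1.capacity = 10000
ag1.channels.c1.transactionCapacity = 100

# A fully qualified path avoids relying on the client's default filesystem;
# the %{directory} escape still requires that header on every event
ag1.sinks.k2.hdfs.path = hdfs://namenode.example.com:8020/flume/data/%{directory}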


*Logs*

    at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:202)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1243)
    at org.apache.hadoop.ipc.Client.call(Client.java:1087)
    ... 15 more
14/02/07 15:01:40 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/07 15:01:41 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/07 15:01:42 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/07 15:01:42 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: DFSOutputStream is closed
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3879)
    at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
    at org.apache.flume.sink.hdfs.HDFSDataStream.sync(HDFSDataStream.java:117)
    at org.apache.flume.sink.hdfs.BucketWriter$5.call(BucketWriter.java:356)
    at org.apache.flume.sink.hdfs.BucketWriter$5.call(BucketWriter.java:353)
    at org.apache.flume.sink.hdfs.BucketWriter$8$1.run(BucketWriter.java:536)
    at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:160)
    at org.apache.flume.sink.hdfs.BucketWriter.access$1000(BucketWriter.java:56)
    at org.apache.flume.sink.hdfs.BucketWriter$8.call(BucketWriter.java:533)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
14/02/07 15:01:43 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/07 15:01:44 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/07 15:01:45 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)

Re: Source Failover and Sink Failover

Posted by Ashish <pa...@gmail.com>.
On Wed, Jan 29, 2014 at 12:46 AM, Mayur Gupta <ma...@gmail.com> wrote:

> I have two questions related to failover in Flume.
>
> 1) If the source is not accepting events because of a failure condition,
> is there a failover mechanism in Flume other than configuring 2 agents to listen
> to the events, in which case I will get duplicates?
>

What is the source (not Flume Source) of Events? What happens to the Events
if the Flume Source fails?


>
> 2) I configured a sink processor for failover with 2 sinks, one logger and
> another HDFS. If the HDFS sink starts failing (say HDFS is no longer up),
> the events are not processed by the logger sink. Is this expected behavior? Does
> failover only work with 2 Avro sinks and not in the manner I have
> configured?
>

IMHO, this should not be the case. Anything in the logs?


>
> I would appreciate it if somebody could point me in the right direction.
>
> -Mayur
>



-- 
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal