Posted to user@flume.apache.org by Noel Duffy <no...@sli-systems.com> on 2013/02/19 22:32:36 UTC

Architecting Flume for failover

I have a Flume agent that pulls events from RabbitMQ and pushes them into HDFS. So far so good, but now I want to have a second Flume agent on a different host acting as a hot backup for the first agent such that the loss of the first host running Flume would not cause any events to be lost. In the testing I've done I've gotten two Flume agents on separate hosts to read the same events from the RabbitMQ queue, but it's not clear to me how to configure the sinks such that only one of the sinks actually does something and the other does nothing.

From reading the documentation, I supposed that a sinkgroup configured for failover was what I needed, but the documentation examples only cover the case where the sinks in a failover group are all on the same agent on the same host. I've seen messages online which seem to say that sinks in a sinkgroup can be on different hosts, but I can find no clear explanation of how to configure such a sinkgroup. How would sinks on different hosts communicate with one another? Would the sinks in the sinkgroup have to use a JDBC channel? Would the sinks have to be non-terminal sinks, like Avro?

In my testing I set up two agents on different hosts and configured a sinkgroup containing two sinks, both HDFS sinks.  

agent.sinkgroups = sinkgroup1
agent.sinkgroups.sinkgroup1.sinks = hdfsSink-1 hdfsSink-2
agent.sinkgroups.sinkgroup1.processor.priority.hdfsSink-1 = 5
agent.sinkgroups.sinkgroup1.processor.priority.hdfsSink-2 = 10
agent.sinkgroups.sinkgroup1.processor.type=failover

agent.sinks = hdfsSink-1 hdfsSink-2
agent.sinks.hdfsSink-1.type = hdfs
agent.sinks.hdfsSink-1.bind = 10.20.30.81
agent.sinks.hdfsSink-1.channel = fileChannel-1
agent.sinks.hdfsSink-1.hdfs.path = /flume/localbrain-events
agent.sinks.hdfsSink-1.hdfs.filePrefix = lb-events
agent.sinks.hdfsSink-1.hdfs.round = false
agent.sinks.hdfsSink-1.hdfs.rollCount=50
agent.sinks.hdfsSink-1.hdfs.fileType=SequenceFile
agent.sinks.hdfsSink-1.hdfs.writeFormat=Text
agent.sinks.hdfsSink-1.hdfs.codeC = lzo
agent.sinks.hdfsSink-1.hdfs.rollInterval=30
agent.sinks.hdfsSink-1.hdfs.rollSize=0
agent.sinks.hdfsSink-1.hdfs.batchSize=1

agent.sinks.hdfsSink-2.bind = 10.20.30.119
agent.sinks.hdfsSink-2.type = hdfs
agent.sinks.hdfsSink-2.channel = fileChannel-1
agent.sinks.hdfsSink-2.hdfs.path = /flume/localbrain-events
agent.sinks.hdfsSink-2.hdfs.filePrefix = lb-events
agent.sinks.hdfsSink-2.hdfs.round = false
agent.sinks.hdfsSink-2.hdfs.rollCount=50
agent.sinks.hdfsSink-2.hdfs.fileType=SequenceFile
agent.sinks.hdfsSink-2.hdfs.writeFormat=Text
agent.sinks.hdfsSink-2.hdfs.codeC = lzo
agent.sinks.hdfsSink-2.hdfs.rollInterval=30
agent.sinks.hdfsSink-2.hdfs.rollSize=0
agent.sinks.hdfsSink-2.hdfs.batchSize=1

However, this does not achieve the failover I hoped for: hdfsSink-2 on both agents writes the events to HDFS. The agents are not communicating, so binding a sink to an IP address has no effect.



RE: Architecting Flume for failover

Posted by Noel Duffy <no...@sli-systems.com>.
The configuration file in its entirety:

agent.sources = rabbitmq-source1

agent.sinkgroups = sinkgroup1
agent.sinkgroups.sinkgroup1.sinks = hdfsSink-1 hdfsSink-2
agent.sinkgroups.sinkgroup1.processor.priority.hdfsSink-1 = 5
agent.sinkgroups.sinkgroup1.processor.priority.hdfsSink-2 = 10
agent.sinkgroups.sinkgroup1.processor.type=failover

agent.channels = fileChannel-1
agent.channels.fileChannel-1.type = file
agent.channels.fileChannel-1.checkpointDir = /var/flume/checkpoint
agent.channels.fileChannel-1.dataDirs = /var/flume/data

agent.sinks = hdfsSink-1 hdfsSink-2
agent.sinks.hdfsSink-1.type = hdfs
agent.sinks.hdfsSink-1.bind = 10.20.30.81
agent.sinks.hdfsSink-1.channel = fileChannel-1
agent.sinks.hdfsSink-1.hdfs.path = /flume/localbrain-events
agent.sinks.hdfsSink-1.hdfs.filePrefix = lb-events
agent.sinks.hdfsSink-1.hdfs.round = false
agent.sinks.hdfsSink-1.hdfs.rollCount=50
agent.sinks.hdfsSink-1.hdfs.fileType=SequenceFile
agent.sinks.hdfsSink-1.hdfs.writeFormat=Text
agent.sinks.hdfsSink-1.hdfs.codeC = lzo
agent.sinks.hdfsSink-1.hdfs.rollInterval=30
agent.sinks.hdfsSink-1.hdfs.rollSize=0
agent.sinks.hdfsSink-1.hdfs.batchSize=1

agent.sinks.hdfsSink-2.bind = 10.20.30.119
agent.sinks.hdfsSink-2.type = hdfs
agent.sinks.hdfsSink-2.channel = fileChannel-1
agent.sinks.hdfsSink-2.hdfs.path = /flume/localbrain-events
agent.sinks.hdfsSink-2.hdfs.filePrefix = lb-events
agent.sinks.hdfsSink-2.hdfs.round = false
agent.sinks.hdfsSink-2.hdfs.rollCount=50
agent.sinks.hdfsSink-2.hdfs.fileType=SequenceFile
agent.sinks.hdfsSink-2.hdfs.writeFormat=Text
agent.sinks.hdfsSink-2.hdfs.codeC = lzo
agent.sinks.hdfsSink-2.hdfs.rollInterval=30
agent.sinks.hdfsSink-2.hdfs.rollSize=0
agent.sinks.hdfsSink-2.hdfs.batchSize=1

agent.sources.rabbitmq-source1.channels = fileChannel-1
agent.sources.rabbitmq-source1.type = org.apache.flume.source.rabbitmq.RabbitMQSource
agent.sources.rabbitmq-source1.hostname = 10.20.30.28
agent.sources.rabbitmq-source1.exchangename = rtr.topic.logs
agent.sources.rabbitmq-source1.username = ***
agent.sources.rabbitmq-source1.password = ***
agent.sources.rabbitmq-source1.port = 5672
agent.sources.rabbitmq-source1.virtualhost = rtr_prod
agent.sources.rabbitmq-source1.topics=b.*.*.*.*
agent.sources.rabbitmq-source1.interceptors=gzip_intercept
agent.sources.rabbitmq-source1.interceptors.gzip_intercept.type=com.slisystems.flume.FlumeDecompressInterceptor$Builder



Re: Architecting Flume for failover

Posted by Jeff Lord <jl...@cloudera.com>.
Maybe post your entire config?



RE: Architecting Flume for failover

Posted by Noel Duffy <no...@sli-systems.com>.
In my tests I set up two Flume agents connected to two different HDFS clusters. The configuration of both Flume agents is identical. They read events from the same RabbitMQ server. In my test, both agent hosts wrote the event to their respective HDFS servers using hdfsSink-2, but I expected the failover sinkgroup configuration would mean only one host would write the event. In other words, I thought that a failover sinkgroup could be configured to have sinks on different hosts but that only one sink on one host would actually write the event and that the other host would not do anything.

All the examples in the documentation have all sinks in a sinkgroup on a single host. I want to have the sinks on different hosts. I've seen a number of assertions online that this can be done, but so far, I've not seen any examples of how to actually configure it.



Re: Architecting Flume for failover

Posted by Jeff Lord <jl...@cloudera.com>.
Noel,

What test did you perform?
Did you stop sink-2?
Currently you have set a higher priority for sink-2 so it will be the
default sink so long as it is up and running.

-Jeff

http://flume.apache.org/FlumeUserGuide.html#failover-sink-processor
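
For reference, the processor settings that section of the guide describes look like this (the maxpenalty value here is illustrative, not from this thread). The numerically higher priority wins, and maxpenalty caps, in milliseconds, how long a failed sink is sidelined before being retried:

agent.sinkgroups.sinkgroup1.processor.type = failover
agent.sinkgroups.sinkgroup1.processor.priority.hdfsSink-1 = 5
agent.sinkgroups.sinkgroup1.processor.priority.hdfsSink-2 = 10
agent.sinkgroups.sinkgroup1.processor.maxpenalty = 10000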



RE: Architecting Flume for failover

Posted by Noel Duffy <no...@sli-systems.com>.
The IP addresses 10.20.30.81 and 10.20.30.119 are the addresses of the Flume agents. The first agent is on 10.20.30.81, the second on 10.20.30.119. The idea was to have two sinks on different hosts and to configure Flume to fail over to the second host if the first host should disappear. Although the documentation does not say so explicitly, I have read posts online which say that such a configuration is possible. I am running Flume-ng 1.2.0.

It may be that I am approaching this problem in the wrong way. We need to have Flume reading events from RabbitMQ and writing them to HDFS. We want to have two different hosts running Flume so that if one dies for any reason, the other would take over and no events should be lost or delayed. Later we may have more Flume hosts, depending on how well they cope with the expected traffic, but for now two will suffice to prove the concept. A load-balancing sink processor sounds like it might also be a solution, but again, I do not see how to configure this to work across more than one host.




RE: Architecting Flume for failover

Posted by Noel Duffy <no...@sli-systems.com>.
That makes sense. Thanks for the detailed reply. And thanks to Hari, Jeff, and everyone else who took time to answer.



Re: Architecting Flume for failover

Posted by Juhani Connolly <ju...@cyberagent.co.jp>.
Sink groups are a concept local to a single agent. What is happening to you is that both your agents have the same sink group, and each is independently selecting the highest-priority sink.

Agents are independent and communicate with each other via sink-source couplings, such as an Avro sink feeding an Avro source (or Thrift).
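
For illustration, a minimal sketch of such a coupling, assuming a forwarding agent named agentA and a receiving agent named agentB listening on port 4545 (the agent names, address, and port are placeholders, not from this thread):

# On host A: an Avro sink forwards events from the local channel to agent B
agentA.sinks = avroSink-1
agentA.sinks.avroSink-1.type = avro
agentA.sinks.avroSink-1.channel = fileChannel-1
agentA.sinks.avroSink-1.hostname = 10.20.30.119
agentA.sinks.avroSink-1.port = 4545

# On host B: an Avro source receives those events into the local channel
agentB.sources = avroSource-1
agentB.sources.avroSource-1.type = avro
agentB.sources.avroSource-1.channels = fileChannel-1
agentB.sources.avroSource-1.bind = 0.0.0.0
agentB.sources.avroSource-1.port = 4545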

Failover and load balancing in sink groups are intended to be a mechanism that redirects traffic when a particular sink fails. So if, say, you have two HDFS clusters and one fails, you can reroute your data to the secondary one until it recovers.

If you want to fail over agents, what you likely want to do to guarantee immediate delivery is just to set up two agents that both direct the data to your HDFS cluster. Then, when processing the data, filter out duplicates.
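
As a sketch of that arrangement, assuming a Flume release that provides the host interceptor and header escapes in hdfs.filePrefix, both agents could share one cluster and tag their output files so duplicates are easy to reconcile downstream (the namenode name is a placeholder, and the prefix scheme is just one possible convention):

# On both agents: tag each event with the originating agent's host
agent.sources.rabbitmq-source1.interceptors = gzip_intercept host_intercept
agent.sources.rabbitmq-source1.interceptors.host_intercept.type = host

# Same HDFS path on both agents; the file prefix identifies the writer
agent.sinks.hdfsSink-1.hdfs.path = hdfs://namenode/flume/localbrain-events
agent.sinks.hdfsSink-1.hdfs.filePrefix = lb-events-%{host}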

If this isn't workable, more complex methods might involve adding sequence numbers to the data, and another Flume agent with an interceptor that filters out duplicates based on those sequence numbers. This would be rather complex, and isn't something built into Flume.
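
The wiring for such a custom interceptor would follow the same pattern as the FlumeDecompressInterceptor already in this thread's configuration; the class name below is purely hypothetical and would have to be written:

# Hypothetical de-duplicating interceptor on the downstream agent
dedupAgent.sources.avroSource-1.interceptors = dedup_intercept
dedupAgent.sources.avroSource-1.interceptors.dedup_intercept.type = com.example.flume.DedupInterceptor$Builder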



RE: Architecting Flume for failover

Posted by Noel Duffy <no...@sli-systems.com>.
I think we're talking at cross purposes here. First, let me re-state the problem at a high level. Hopefully this will be clearer:

I want to have two or more Flume agents running on different hosts reading events from the same source (RabbitMQ) but only writing the event to the final sink (hdfs) once. Thus, if someone kicks out a cable and one of the Flume hosts dies, the other(s) should take over seamlessly without the loss of any events. I thought that failover and load-balancer sinkgroups were created to solve this kind of problem, but the documentation only covers the case where all the sinks in the sinkgroup are on one host, and this does not give me the redundancy I need.

This is how I tried to solve the problem. I set up two hosts running Flume with the same configuration. Thinking that a failover sinkgroup was the right approach, I created a sinkgroup and put two HDFS sinks in it, hdfsSink-1 and hdfsSink-2. The idea was that hdfsSink-1 would be on Flume host A and hdfsSink-2 would be on Flume host B. Then, when events arrive on both host A and host B, host A would write the event to hdfsSink-2, sending it across the network to Flume host B, while host B would write the same event to hdfsSink-2, which is local to it. Both agents should, in theory, write the event to the same sink on Flume host B. Yes, I know that this would still duplicate the events, but I figured I would worry about that later. However, this has not worked as I expected. Flume host A does not pass anything to Flume host B.

I need to know if the approach I've adopted, sinkgroups and failover, can achieve the kind of multi-host redundancy I want. If they can, are there examples of such configurations that I can see? If they cannot, what kind of configuration would be suitable for what I want to achieve? 



Re: Architecting Flume for failover

Posted by Hari Shreedharan <hs...@cloudera.com>.
Also, as Jeff said, sink-2 has a higher priority (the sink whose priority has the higher absolute value is the one picked).




-- 
Hari Shreedharan




Re: Architecting Flume for failover

Posted by Hari Shreedharan <hs...@cloudera.com>.
No, it does not mean that. To talk to different HDFS clusters you must specify the hdfs.path as hdfs://namenode:port/<path>. You don't need to specify the bind etc. 

Hope this helps.

Hari 

-- 
Hari Shreedharan




RE: Architecting Flume for failover

Posted by Noel Duffy <no...@sli-systems.com>.
Hari Shreedharan [mailto:hshreedharan@cloudera.com]  wrote:

> The "bind" configuration param does not really exist for HDFS Sink (it is only for the IPC sources). 

Does this mean that failover for sinks on different hosts cannot work for HDFS sinks at all? Does it require Avro sinks, which seem to have a hostname parameter?

Re: Architecting Flume for failover

Posted by Hari Shreedharan <hs...@cloudera.com>.
Can you change the hdfs.path to hdfs://10.20.30.81/flume/localbrain-events and hdfs://10.20.30.119/flume/localbrain-events on hdfsSink-1 and hdfsSink-2 respectively (assuming those are your namenodes)? The "bind" configuration param does not really exist for HDFS Sink (it is only for the IPC sources). 
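
Concretely, that change would look like this on the two agents (append the NameNode RPC port, e.g. :8020, if your clusters do not use the default):

agent.sinks.hdfsSink-1.hdfs.path = hdfs://10.20.30.81/flume/localbrain-events
agent.sinks.hdfsSink-2.hdfs.path = hdfs://10.20.30.119/flume/localbrain-events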


Thanks
Hari

-- 
Hari Shreedharan




RE: Architecting Flume for failover

Posted by Noel Duffy <no...@sli-systems.com>.
If I disable the agent.sinks line, both my sinks are disabled and nothing gets written to HDFS. The status page no longer shows me any sinks.




Re: Architecting Flume for failover

Posted by Yogi Nerella <yn...@gmail.com>.
Hi Noel,

Maybe you are specifying both sinkgroups and sinks.

Can you try removing the sinks?
#agent.sinks = hdfsSink-1 hdfsSink-2

Yogi


