Posted to user@flume.apache.org by "samnik60 ." <mo...@gmail.com> on 2016/03/02 19:12:05 UTC

Flume NG High Availability

Hi guys,
I have the following questions about Flume NG high availability:

- Is it possible to run Flume agents in active/active or active/standby
mode? If so, kindly point me to a document to refer to; I am unable to find
one in the Flume documentation.

The use case I am trying to address:

I want to run a Flume agent on one node that takes events from a JMS
queue and persists them to HDFS. If that agent goes down, I want another
Flume agent running on another node to take over this responsibility
(active/standby).

I don't mind having both Flume agents process events from the same JMS
queue, provided each event is taken by only one agent with no race
condition, which I feel is not practical (active/active).


Any information on this is greatly appreciated.

Thanks,
Sam.

Re: Flume NG High Availability

Posted by Gonzalo Herreros <gh...@gmail.com>.
AFAIK that's not possible unless you implement some kind of lock
mechanism yourself.
Normally, if your agent dies it is because something is wrong with the
machine (otherwise Cloudera or Ambari would just restart the agent), so
the files in the spool directory would not be accessible anyway, even if
you had multiple agents there.
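
One possible shape for such a do-it-yourself lock, sketched in Python under
stated assumptions: each agent claims a file by renaming it out of the shared
spool directory into a directory of its own, and only processes files it
managed to claim. This is not a Flume feature; claim_file, demo, and the
directory names are hypothetical, and the sketch assumes both directories sit
on the same POSIX filesystem, where a rename is atomic (network filesystems
may behave differently).

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def claim_file(spool_dir: Path, claimed_dir: Path, name: str) -> Optional[Path]:
    """Try to claim one spooled file for this agent.

    The claim is a rename from the shared spool directory into the agent's
    private directory. If another agent already moved the file, the source
    no longer exists and the claim fails.
    """
    src = spool_dir / name
    dst = claimed_dir / name
    try:
        # Assumption: same filesystem, so the rename is a single atomic step.
        os.rename(src, dst)
    except FileNotFoundError:
        return None  # another agent claimed the file first
    return dst


def demo() -> tuple:
    """Two hypothetical agents race for the same file; only one can win."""
    root = Path(tempfile.mkdtemp())
    spool = root / "spool"
    agent_a = root / "claimed-a"
    agent_b = root / "claimed-b"
    for directory in (spool, agent_a, agent_b):
        directory.mkdir()
    (spool / "events-0001.log").write_text("event 1\n")

    first = claim_file(spool, agent_a, "events-0001.log")
    second = claim_file(spool, agent_b, "events-0001.log")
    return first, second
```

An agent would then point its own spooling source at its private directory.
As noted above, this does not help when the machine holding the spool
directory is itself down.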

On 2 March 2016 at 19:51, samnik60 . <mo...@gmail.com> wrote:

> Thanks Gonzalo,
>
> In my other use case, I want to achieve the same HA I mentioned before
> (active/active or active/standby) using a spool directory source. Going
> by your response, if I use a spool directory as the source I cannot have
> active/active or active/standby HA, since active/active will result in a
> race condition when two sources try to process files from the same directory.
>
> Thanks,
> sam

Re: Flume NG High Availability

Posted by "samnik60 ." <mo...@gmail.com>.
Thanks Gonzalo,

In my other use case, I want to achieve the same HA I mentioned before
(active/active or active/standby) using a spool directory source. Going
by your response, if I use a spool directory as the source I cannot have
active/active or active/standby HA, since active/active will result in a
race condition when two sources try to process files from the same directory.

Thanks,
sam

On Wed, Mar 2, 2016 at 1:17 PM, Gonzalo Herreros <gh...@gmail.com>
wrote:

> JMS queues guarantee that only one of the clients will get each message.
> Unless you build it yourself, Flume doesn't have active/passive. HA is
> achieved by having multiple agents running the same configuration.

Re: Flume NG High Availability

Posted by Gonzalo Herreros <gh...@gmail.com>.
JMS queues guarantee that only one of the clients will get each message.
Unless you build it yourself, Flume doesn't have active/passive. HA is
achieved by having multiple agents running the same configuration.
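
To illustrate the queue semantics described above, here is a small Python
stand-in, not JMS or Flume code: a thread-safe in-process queue plays the
broker and two threads play the agents. run_competing_consumers is a
hypothetical name, and the model ignores broker details such as redelivery
of unacknowledged messages.

```python
import queue
import threading


def run_competing_consumers(messages, n_consumers=2):
    """Drain one shared queue with several consumers.

    Returns a dict mapping consumer id to the list of messages it received.
    Each message is removed from the queue by exactly one consumer.
    """
    shared = queue.Queue()
    for message in messages:
        shared.put(message)

    received = {cid: [] for cid in range(n_consumers)}

    def consume(cid):
        while True:
            try:
                message = shared.get_nowait()
            except queue.Empty:
                return  # queue drained, this consumer stops
            received[cid].append(message)

    threads = [
        threading.Thread(target=consume, args=(cid,))
        for cid in range(n_consumers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return received
```

How the messages are split between the consumers varies from run to run, but
the union of what they received is always the full input with no duplicates,
which is the property that lets two agents with the same configuration read
from one queue.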

On 2 March 2016 at 18:12, samnik60 . <mo...@gmail.com> wrote:

> Hi guys,
> I have the following questions about Flume NG high availability:
>
> - Is it possible to run Flume agents in active/active or active/standby
> mode? If so, kindly point me to a document to refer to; I am unable to find
> one in the Flume documentation.
>
> The use case I am trying to address:
>
> I want to run a Flume agent on one node that takes events from a JMS
> queue and persists them to HDFS. If that agent goes down, I want another
> Flume agent running on another node to take over this responsibility
> (active/standby).
>
> I don't mind having both Flume agents process events from the same JMS
> queue, provided each event is taken by only one agent with no race
> condition, which I feel is not practical (active/active).
>
>
> Any information on this is greatly appreciated.
>
> Thanks,
> Sam.
>
>