Posted to user@flume.apache.org by Majid Alfifi <ma...@gmail.com> on 2014/11/25 19:58:02 UTC

Downstream port-less flume agent

I have a typical Flume pipeline that collects logs from online servers,
aggregates them, and pushes them down to HDFS. The typical configuration is
to open a port on the local cluster so the online Flume agent can send Avro
events to it.

Is it possible to have a Flume agent on the local cluster basically
"pulling" events from the online agent, without the need to open a local
port?

Best Regards,
Majid

Re: Downstream port-less flume agent

Posted by Hari Shreedharan <hs...@cloudera.com>.
Flume communicates via the network - without ports that would just not be possible. I don’t see a way to work around it without giving Flume the ability to use the network.


Thanks,
Hari

On Wed, Dec 3, 2014 at 2:28 AM, Majid Alfifi <ma...@gmail.com>
wrote:

> Thanks Hari. Using Spool Dir, I could have remote flume agents write events
> to a remote dir and run rsync locally to sync a local dir with the remote
> dir and have local flume agent pick up events from the local dir.
> But this way I am breaking the flume pipeline with rsync in the middle. I
> don't know how this will affect  flume features
> like reliability, scalability, etc.
> -Majid

Downstream port-less flume agent

Posted by Majid Alfifi <ma...@gmail.com>.
Thanks, Hari. Using a Spool Dir source, I could have the remote Flume agents
write events to a remote dir, run rsync locally to sync a local dir with the
remote dir, and have the local Flume agent pick up events from the local dir.

But this way I am breaking the Flume pipeline with rsync in the middle. I
don't know how this will affect Flume features like reliability and
scalability.

-Majid
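
One detail of the rsync idea above can be sketched in code. The Flume
Spooling Directory Source expects files to be complete, uniquely named, and
never modified once they appear in the spool dir, so it may be safer to let
rsync write into a separate staging dir and then rename finished files into
the spool dir. The following is only a minimal sketch under assumptions that
are not part of this thread: the function name publish_completed_files and
both directory arguments are hypothetical, rsync is assumed to use its
default hidden temporary file names for in-progress transfers, the source's
fileSuffix is assumed to be left at its default of ".COMPLETED", and both
dirs are assumed to live on the same filesystem so the rename is atomic.

```python
import os
from pathlib import Path


def publish_completed_files(staging_dir, spool_dir, completed_suffix=".COMPLETED"):
    """Move finished files from staging_dir into spool_dir, one rename each.

    Hypothetical helper: returns the names of the files that were moved.
    """
    staging = Path(staging_dir)
    spool = Path(spool_dir)
    moved = []
    for src in sorted(staging.iterdir()):
        # Skip directories and hidden files; rsync writes in-progress
        # transfers under hidden temporary names by default.
        if not src.is_file() or src.name.startswith("."):
            continue
        dest = spool / src.name
        already_done = spool / (src.name + completed_suffix)
        # Never reuse a name the spool dir already holds or has processed.
        if dest.exists() or already_done.exists():
            continue
        # Atomic only when both paths are on the same filesystem (assumed).
        os.replace(src, dest)
        moved.append(dest.name)
    return moved
```

A script like this could run after each rsync pass, for example from cron.
It does nothing for delivery guarantees on the remote side: whether a file
is fully written before rsync copies it, and whether it is copied only once,
still has to be handled separately.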


On Tuesday, December 2, 2014, Hari Shreedharan <hshreedharan@cloudera.com>
wrote:

> Not sure how that would be possible. You could use a Spool Dir Source if
> you want to write the data to files and then read it from there.
>
> Thanks,
> Hari

Re: Downstream port-less flume agent

Posted by Hari Shreedharan <hs...@cloudera.com>.
Not sure how that would be possible. You could use a Spool Dir Source if you want to write the data to files and then read it from there.


Thanks,
Hari

On Tue, Nov 25, 2014 at 11:00 AM, Majid Alfifi <ma...@gmail.com>
wrote:

> I have a typical flume pipeline that collects logs from online servers and
> aggregate them and push them down to HDFS. The typical configuration is to
> open a port on the local cluster so the online flume agent can send Avro
> events to.
> Is it possible to have a flume agent on the local cluster basically
> "pulling" events from the online agent without the need to open a local
> port?
> Best Regards,
> Majid