Posted to user@flume.apache.org by "Cochran, David M (Contractor)" <Da...@bsee.gov> on 2012/08/23 16:08:54 UTC

conf help needed

Let me see if I can explain what I'd like to accomplish, and yes, I'm certain I'm just not grasping something here and making a mess of it all, and mayhaps y'all can point me straight...

Simple.. grab this log file xxx.log from host foo and store it on the sink server as ~/flume/%host/xxx-%date.log

Where the xxx.log and hostname are dynamic so multiple hosts/logs can be templated.  I guess that makes sense... after all, that's what this whole project is about...

My troubles are coming from sorting out the conf:

Here's where I've started... a combination of a bunch of confs I've borrowed bits from.


agent1.channels.ch1.type = memory

agent1.sources =  tailsource-1
agent1.sources.channels = ch1
agent1.sources.tailsource-1.type = exec
agent1.sources.tailsource-1.command = tail -F /root/Desktop/apache-flume-1.3.0-SNAPSHOT/test.log
agent1.sources.tailsource-1.channels = ch1
agent1.sources.tailsource1.bind = 0.0.0.0
agent1.sources.tailsource-1.port = 41414

agent1.sources.tailsource-1.interceptors = hostint
agent1.sources.tailsource-1.interceptors.hostint.type = org.apache.flume.interceptor.HostInterceptor$Builder
agent1.sources.tailsource-1.interceptors.hostint.preserveExisting = true
agent1.sources.tailsource-1.interceptors.hostint.useIP = false
agent1.sources.tailsource-1.interceptors.timestamp.type = timestamp

agent1.sinks.log-sink1.channel = ch1
agent1.sinks.log-sink1.type = FILE_ROLL
agent1.sinks.log-sink1.sink.directory = /root/Desktop/apache-flume-1.3.0-SNAPSHOT/logs/%{host}
agent1.sinks.log-sink1.sink.rollInterval = 86400
agent1.sinks.log-sink1.sink.filePrefix = %{filename}.%Y-%m-%d


Somehow I'm missing something, as the %{host} never makes it through, nor does the log filename. Before adding the host/filename stuff, everything worked, just with filenames like 1345725771524-1, which aren't very intuitive by themselves, and lumping them into an unnamed directory makes them even less so.


This is the resulting logfile...

23 Aug 2012 08:29:49,146 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.SinkRunner$PollingRunner.run:160)  - Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: Failed to open file /root/Desktop/apache-flume-1.3.0-SNAPSHOT/logs/%{host}/1345728588635-1 while delivering event
        at org.apache.flume.sink.RollingFileSink.process(RollingFileSink.java:166)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.FileNotFoundException: /root/Desktop/apache-flume-1.3.0-SNAPSHOT/logs/%{host}/1345728588635-1 (No such file or directory)
        at java.io.FileOutputStream.open(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:194)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:145)
        at org.apache.flume.sink.RollingFileSink.process(RollingFileSink.java:160)
        ... 3 more


Anyone care to knock some of the dummy-dust loose that's accumulated in my head and point me in the right direction to get this sorted?


Thanks,
Dave

RE: conf help needed

Posted by "Cochran, David M (Contractor)" <Da...@bsee.gov>.
Sorry for the long delay... Isaac threw a wrench into things.

Thanks for the replies...  Okay, since what I'm looking for isn't
implemented yet, I can manually set the directories for each host
without a big deal...

How about the output filenames?  Can they be manually set somehow,
perhaps by just setting a prefix and then letting flume append the
numbered structure to it?

i.e.  app1_log_xxxxxxxxxx
      app2_log_xxxxxxxxxx
      app2_out_xxxxxxxxxx

Or would I need to create a directory for each and deal with the
672153171-1 filenames?

Thanks,
Dave

From: Will McQueen [mailto:will@cloudera.com] 
Sent: Thursday, August 23, 2012 11:31 AM
To: user@flume.apache.org
Subject: Re: conf help needed

 

Hi David,

There's currently an open, unassigned ticket for what you're asking:
     FLUME-1295: RollingFileSink needs to be able to construct directory path based on escape sequence

Cheers,
Will

On Thu, Aug 23, 2012 at 9:17 AM, Bhaskar V. Karambelkar
<bh...@gmail.com> wrote:

Looking at the Flume documentation, it looks like dynamic paths are only
supported for the HDFS sink and not the file_roll sink.

You can use a multiplexing channel selector to achieve the same effect
for the file_roll sink, but the possible hosts have to be a
predetermined list, not dynamic.

Re: conf help needed

Posted by Will McQueen <wi...@cloudera.com>.
Hi David,

There's currently an open, unassigned ticket for what you're asking:
     FLUME-1295: RollingFileSink needs to be able to construct directory path based on escape sequence

Cheers,
Will
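
Until FLUME-1295 is resolved, one way around the FileNotFoundException in the original post is to give sink.directory a literal path. What follows is only a minimal sketch of the original conf reworked that way, not a tested configuration. Its assumptions: the interceptor name tsint and the logs/foo directory are made up for illustration, the directory already exists before the agent starts, and the bind/port lines are dropped on the guess that they were left over from a network source (an exec source does not listen on a port). It also declares the channel and sink lists and names both interceptors in the interceptors list, which the original conf did not do.

```properties
# Declare every component the agent owns.
agent1.sources = tailsource-1
agent1.channels = ch1
agent1.sinks = log-sink1

agent1.channels.ch1.type = memory

# exec source: runs the command and turns each output line into an event.
agent1.sources.tailsource-1.type = exec
agent1.sources.tailsource-1.command = tail -F /root/Desktop/apache-flume-1.3.0-SNAPSHOT/test.log
agent1.sources.tailsource-1.channels = ch1

# Both interceptors are listed here; "tsint" is a hypothetical name.
agent1.sources.tailsource-1.interceptors = hostint tsint
agent1.sources.tailsource-1.interceptors.hostint.type = org.apache.flume.interceptor.HostInterceptor$Builder
agent1.sources.tailsource-1.interceptors.hostint.preserveExisting = true
agent1.sources.tailsource-1.interceptors.hostint.useIP = false
agent1.sources.tailsource-1.interceptors.tsint.type = org.apache.flume.interceptor.TimestampInterceptor$Builder

# file_roll sink with a literal directory (assumed to exist already).
# No %{host} or %Y-%m-%d escapes: the sink uses the path exactly as written.
agent1.sinks.log-sink1.type = file_roll
agent1.sinks.log-sink1.channel = ch1
agent1.sinks.log-sink1.sink.directory = /root/Desktop/apache-flume-1.3.0-SNAPSHOT/logs/foo
agent1.sinks.log-sink1.sink.rollInterval = 86400
```

With this layout the files still get the numeric names (such as 1345725771524-1), but they land in a directory whose name identifies the host.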


Re: conf help needed

Posted by "Bhaskar V. Karambelkar" <bh...@gmail.com>.
Looking at the Flume documentation, it looks like dynamic paths are only
supported for the HDFS sink and not the file_roll sink.
You can use a multiplexing channel selector to achieve the same effect for
the file_roll sink, but the possible hosts have to be a
predetermined list, not dynamic.
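
The multiplexing approach could look roughly like the sketch below. It is only an illustration under stated assumptions: a separate collector agent (hypothetically named collector) receives events over an avro source, the sending agents have already stamped each event with a host header (for example via the host interceptor), and the host names foo and bar, the channel and sink names, and the per-host directories are all made up. The directories are assumed to exist before the agent starts.

```properties
collector.sources = avro-in
collector.channels = ch-foo ch-bar ch-other
collector.sinks = sink-foo sink-bar sink-other

# Network source that the per-host agents send to.
collector.sources.avro-in.type = avro
collector.sources.avro-in.bind = 0.0.0.0
collector.sources.avro-in.port = 41414
collector.sources.avro-in.channels = ch-foo ch-bar ch-other

# Route each event by the value of its "host" header.
collector.sources.avro-in.selector.type = multiplexing
collector.sources.avro-in.selector.header = host
collector.sources.avro-in.selector.mapping.foo = ch-foo
collector.sources.avro-in.selector.mapping.bar = ch-bar
# Events from hosts not listed above go here.
collector.sources.avro-in.selector.default = ch-other

collector.channels.ch-foo.type = memory
collector.channels.ch-bar.type = memory
collector.channels.ch-other.type = memory

# One file_roll sink per known host, each with a fixed directory.
collector.sinks.sink-foo.type = file_roll
collector.sinks.sink-foo.channel = ch-foo
collector.sinks.sink-foo.sink.directory = /root/Desktop/apache-flume-1.3.0-SNAPSHOT/logs/foo

collector.sinks.sink-bar.type = file_roll
collector.sinks.sink-bar.channel = ch-bar
collector.sinks.sink-bar.sink.directory = /root/Desktop/apache-flume-1.3.0-SNAPSHOT/logs/bar

collector.sinks.sink-other.type = file_roll
collector.sinks.sink-other.channel = ch-other
collector.sinks.sink-other.sink.directory = /root/Desktop/apache-flume-1.3.0-SNAPSHOT/logs/other
```

The mapping keys have to match the header value exactly, so a short hostname in the mapping will not match a fully qualified name in the header. Adding a host means adding a mapping, a channel, and a sink, which is the "predetermined list" limitation mentioned above.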
