Posted to user@flume.apache.org by Jeff Lord <jl...@cloudera.com> on 2013/05/01 03:35:07 UTC
Re: Getting "Checking file:conf/flume.conf for changes" message in loop
Vikas,
This message is normal and harmless.
2013-04-29 08:26:11,868 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:188)] Checking file:conf/flume.conf for changes
If you change your log level to INFO, it will no longer show up.
Regarding why you do not see the contents of your file in HDFS: one thing
with the exec source and tail is that events are buffered until 20 events
have been written to the cache. One way to work around this is to change
the default batchSize from 20 to 1:
batchSize  20  The max number of lines to read and send to the channel at a time
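Applied to the configuration quoted below, that override is a single property on the exec source (a sketch reusing the agent and source names from this thread):

```properties
# Deliver each tailed line to the channel immediately instead of
# waiting for 20 buffered events (the exec source default).
agent1.sources.tail.batchSize = 1
```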
Alternatively, there was a recent patch that adds a batchTimeout to the
exec source, which lets you flush the cache based on elapsed time. That
fix is available in the latest version of trunk.
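On a build that includes that patch, the timeout would be set alongside the batch size; the exact property name and its millisecond unit here are assumptions based on the later exec source documentation, not something stated in this thread:

```properties
# Assumed property from the trunk patch mentioned above: flush buffered
# events after 3 seconds even if fewer than batchSize lines have arrived.
agent1.sources.tail.batchTimeout = 3000
```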
-Jeff
On Mon, Apr 29, 2013 at 8:31 AM, Vikas Kanth <ka...@yahoo.co.in> wrote:
> Hi,
>
> I am getting the following message in a loop. The source file hasn't been
> moved to the destination.
>
> 2013-04-29 08:24:41,346 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:73)] Component type: CHANNEL, name: Channel-2 started
> 2013-04-29 08:24:41,846 (conf-file-poller-0) [INFO - org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.startAllComponents(DefaultLogicalNodeManager.java:141)] Starting Sink HDFS
> 2013-04-29 08:24:41,847 (conf-file-poller-0) [INFO - org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.startAllComponents(DefaultLogicalNodeManager.java:152)] Starting Source tail
> 2013-04-29 08:24:41,847 (lifecycleSupervisor-1-3) [INFO - org.apache.flume.source.ExecSource.start(ExecSource.java:155)] Exec source starting with command:tail -F /home/vkanth/temp/Sample2.txt
> 2013-04-29 08:24:41,850 (lifecycleSupervisor-1-3) [DEBUG - org.apache.flume.source.ExecSource.start(ExecSource.java:173)] Exec source started
> 2013-04-29 08:24:41,850 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:89)] Monitoried counter group for type: SINK, name: HDFS, registered successfully.
> 2013-04-29 08:24:41,851 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:73)] Component type: SINK, name: HDFS started
> 2013-04-29 08:24:41,852 (SinkRunner-PollingRunner-DefaultSinkProcessor) [DEBUG - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:143)] Polling sink runner starting
> 2013-04-29 08:25:11,855 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:188)] Checking file:conf/flume.conf for changes
> 2013-04-29 08:25:41,861 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:188)] Checking file:conf/flume.conf for changes
> 2013-04-29 08:26:11,868 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:188)] Checking file:conf/flume.conf for changes
> .......
> .......
>
>
> Flume.conf:
> agent1.sources = tail
> agent1.channels = Channel-2
> agent1.sinks = HDFS
>
> agent1.sources.tail.type = exec
> agent1.sources.tail.command = tail -F /home/vikas/temp/Sample2.txt
> agent1.sources.tail.channels = Channel-2
>
> agent1.sinks.HDFS.channel = Channel-2
> agent1.sinks.HDFS.type = hdfs
> agent1.sinks.HDFS.hdfs.path = hdfs://dev-pub01.xyz.abc.com:8020/tmp
> agent1.sinks.HDFS.hdfs.file.fileType = DataStream
>
> agent1.channels.Channel-2.type = memory
> agent1.channels.Channel-2.capacity = 1000
> agent1.channels.Channel-2.transactionCapacity=10
>
> Command:
> bin/flume-ng agent --conf ./conf/ -f conf/flume.conf
> -Dflume.root.logger=DEBUG,console -n agent1
>
> Please let me know if I am missing something.
>
> Thanks,
> Vikas
>
>
Re: Getting "Checking file:conf/flume.conf for changes" message in loop
Posted by Alexander Alten-Lorenz <wg...@gmail.com>.
You may want to look into interceptors:
http://flume.apache.org/releases/content/1.3.0/apidocs/org/apache/flume/interceptor/Interceptor.html
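For the follow-up question about preserving the source file's name in HDFS, one commonly suggested approach is a rough sketch like the following; it assumes the spooling directory source's fileHeader option and the HDFS sink's %{header} path escapes as described in the Flume User Guide, and was not tested against the poster's setup:

```properties
# Spooling directory source stamps each event with the originating
# file name in a header (key "file" by default).
agent1.sources.spool.type = spooldir
agent1.sources.spool.spoolDir = /home/vikas/temp
agent1.sources.spool.fileHeader = true
agent1.sources.spool.fileHeaderKey = file

# The HDFS sink can substitute event headers into its path, so output
# lands under a directory derived from the source file name.
agent1.sinks.HDFS.hdfs.path = hdfs://dev-pub01.xyz.abc.com:8020/tmp/%{file}
```

Note that Flume streams events rather than copying files, so exact file sizes and one-to-one file boundaries are not guaranteed; this only carries the original name into the HDFS layout.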
Regards,
Alex
--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF
Re: Getting "Checking file:conf/flume.conf for changes" message in loop
Posted by Vikas Kanth <ka...@yahoo.co.in>.
Hi Jeff,
Thanks for the reply. Your suggestion worked.
I've got one more question. The files generated in HDFS, using the exec/spooling source, are named in the following format:
-rw-r--r-- 3 vikas 66 2013-05-01 05:43 /tmp/FlumeData.1367412193479
-rw-r--r-- 3 vikas 67 2013-05-01 05:43 /tmp/FlumeData.1367412193480
-rw-r--r-- 3 vikas 61 2013-05-01 05:43 /tmp/FlumeData.1367412193481
-rw-r--r-- 3 vikas 44 2013-05-01 05:43 /tmp/FlumeData.1367412193482
Is there any way I can copy the files (eg. com/org/test/flm/Sample.txt) from the source to HDFS, preserving the same folder structure, size, and name?
If not, what is the alternative?
Thanks,
Vikas