Posted to user@flume.apache.org by mardan Khan <ma...@gmail.com> on 2012/07/22 14:27:55 UTC

Use of Flume for the sensor network data

I am working on large dataset storage and processing. I collect a large
dataset from multiple sensor devices through a .NET application, and the
data is stored on the C drive. I want to process this data on a Hadoop
cluster, which I have already set up. However, I don't know how to
automatically upload the data to the Hadoop cluster from a remote computer;
the machine collecting the data is not part of the cluster.

Could you please let me know whether Flume can help in my case? I mean, can
I use Flume to automatically upload the files into Hadoop? The sensor
devices continuously generate files of about 100 MB each.

I have read that Flume is used only for log data. How can I use it for
other types of data?

Many thanks

Re: Use of Flume for the sensor network data

Posted by mardan Khan <ma...@gmail.com>.
I followed these links:

1)
https://ccp.cloudera.com/display/CDH4B2/CDH4+Installation#CDH4Installation-AddingaDebianRepository

2) https://ccp.cloudera.com/display/CDH4B2/Flume+Installation
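
On SLES the install itself came down to something like this (a sketch,
assuming the CDH4 zypper repository from the first link is already
configured, and that the package names match the CDH4 docs):

# install the Flume NG core package and the agent init scripts
$ sudo zypper install flume-ng flume-ng-agent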




cheers




Re: Use of Flume for the sensor network data

Posted by Will McQueen <wi...@cloudera.com>.
Hi Mardan,

How did you install Flume on SLES? Did you install it straight from
Apache, or from a distro?

Cheers,
Will


Re: Use of Flume for the sensor network data

Posted by mardan Khan <ma...@gmail.com>.
Hi,

I am still getting the missing class error.

I have specified the full path. My Flume installation is in /usr/lib/flume-ng
and my configuration file is /usr/lib/flume-ng/conf/flume.conf.

You mentioned using the -c option with the command; sorry, I don't know
exactly how to use it. I tried, but it gives me the error message.

I am running the following command:

$ /usr/bin/flume-ng agent -n agent -f /usr/lib/flume-ng/conf/flume.conf


Could you please write me the exact command to get around this problem?

The configuration file is exactly the same as the one you posted; I just
changed the HDFS path and the location of the log file.


As a reminder, the error message is:

12/07/24 00:33:51 ERROR channel.ChannelProcessor: Builder class not found.
Exception follows.
java.lang.ClassNotFoundException:
org.apache.flume.interceptor.HostInterceptor$Builder
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at java.lang.Class.forName0(Native Method)
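
One way to check whether this class actually exists in the installed jars
(a sketch, assuming they live under /usr/lib/flume-ng/lib):

# print the names of any Flume jars that contain a HostInterceptor class
$ for j in /usr/lib/flume-ng/lib/*.jar; do unzip -l "$j" | grep -q HostInterceptor && echo "$j"; done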

I am really struggling to run a single command successfully.

Thanks







Re: Use of Flume for the sensor network data

Posted by Mohammad Tariq <do...@gmail.com>.
Hi mardan,

     You need to use the -c option with your command to specify the
directory where your configuration file is kept. Just look at the
other thread of yours.
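
For example, with the paths from your message below, the full command would
look something like this (a sketch; use whichever agent name you actually
want to start, e.g. agent1 for the HDFS one):

$ /usr/bin/flume-ng agent -c /etc/flume-ng/conf \
    -f /etc/flume-ng/conf/flume.conf -n agent1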

Regards,
    Mohammad Tariq


On Mon, Jul 23, 2012 at 10:19 AM, mardan Khan <ma...@gmail.com> wrote:
> Dear Mohammad Tariq,
>
> Many thanks for your valuable information.
>
> For testing purposes, I have installed Flume on a SuSE Linux system. When
> I type the command $ /etc/init.d/flume-ng-agent start, I receive the
> message: Starting Flume NG agent daemon (flume-ng-agent):
>
> I think this means my Flume agent is working properly. I have made the
> following changes to the configuration file according to your example. The
> configuration file is:
>
>
>
> agent.sources = seqGenSrc
> agent.channels = memoryChannel
> agent.sinks = loggerSink
>
> # For each one of the sources, the type is defined
> agent.sources.seqGenSrc.type = seq
>
> # The channel can be defined as follows.
> agent.sources.seqGenSrc.channels = memoryChannel
>
> # Each sink's type must be defined
> agent.sinks.loggerSink.type = logger
>
> #Specify the channel the sink should use
> agent.sinks.loggerSink.channel = memoryChannel
>
> # Each channel's type is defined.
> agent.channels.memoryChannel.type = memory
>
> # Other config values specific to each type of channel(sink or source)
> # can be defined as well
> # In this case, it specifies the capacity of the memory channel
> agent.channels.memoryChannel.capacity = 100
>
>
> agent1.sources = tail
> agent1.channels = MemoryChannel-2
> agent1.sinks = HDFS
>
> agent1.sources.tail.type = exec
> agent1.sources.tail.command = tail -F /var/log/flume-ng/flume-init.log
>
> agent1.sources.tail.channels = MemoryChannel-2
>
> agent1.sources.tail.interceptors = hostint
> agent1.sources.tail.interceptors.hostint.type =
> org.apache.flume.interceptor.HostInterceptor$Builder
> agent1.sources.tail.interceptors.hostint.preserverExisting = true
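> # (note: 'preserverExisting' above is probably a typo for 'preserveExisting',
> # the spelling used in the example this config was copied from)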
>
> agent1.sources.tail.interceptors.hostint.useIP = true
>
> agent1.sinks.HDFS.channel = MemoryChannel-2
> agent1.channels.MemoryChannel-2.type = memory
> agent1.sinks.HDFS.type =hdfs
> agent1.sinks.HDFS.hdfs.path = hdfs://134.83.35.24/user/mardan/
>
> agent1.sinks.HDFS.hdfs.file.Type = DataStream
> agent1.sinks.HDFS.hdfs.writeFormat = Text
>
>
>
> When I type the following command:
>
> $ /usr/bin/flume-ng agent -n agent1 -f /etc/flume-ng/conf/flume.conf
>
>
>
> I get the following warning / error messages:
>
> Warning: No configuration directory set! Use --conf <dir> to override.
> Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS access
> Info: Excluding /usr/lib/hadoop/lib/slf4j-api-1.6.1.jar from classpath
> Info: Excluding /usr/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar from classpath
> Info: Excluding /usr/lib/hadoop-hdfs/lib/slf4j-api-1.6.1.jar from classpath
> Info: Excluding /usr/lib/hadoop-0.20-mapreduce/lib/slf4j-api-1.6.1.jar from
> classpath
> Info: Including HBASE libraries found via (/usr/bin/hbase) for HBASE access
> Info: Excluding /usr/lib/hbase/bin/../lib/slf4j-api-1.6.1.jar from classpath
> Info: Excluding /usr/lib/zookeeper/lib/slf4j-api-1.6.1.jar from classpath
> Info: Excluding /usr/lib/zookeeper/lib/slf4j-log4j12-1.6.1.jar from
> classpath
> Info: Excluding /usr/lib/hadoop/lib/slf4j-api-1.6.1.jar from classpath
> Info: Excluding /usr/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar from classpath
> Info: Excluding /usr/lib/hadoop-hdfs/lib/slf4j-api-1.6.1.jar from classpath
> + exec /usr/java/jdk1.6.0_31/bin/java -Xmx20m -cp
> '/usr/lib/flume-ng/lib/*:/etc/hadoop/conf:/usr/lib/hadoop/lib/activation-1.1.jar:/usr/lib/hadoop/lib/asm-3.2.jar:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop/lib/avro-1.5.4.jar:/usr/lib/hadoop/lib/commons-beanutils-1.7.0.jar:/usr/lib/hadoop/lib/commons-beanutils-core-1.8.0.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/commons-collections-3.2.1.jar:/usr/lib/hadoop/lib/commons-configuration-1.6.jar:/usr/lib/hadoop/lib/commons-digester-1.8.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop/lib/commons-io-2.1.jar:/usr/lib/hadoop/lib/commons-lang-2.5.jar:/usr/lib/hadoop/lib/commons-logging-1.1.1.jar:/usr/lib/hadoop/lib/commons-logging-api-1.1.jar:/usr/lib/hadoop/lib/commons-math-2.1.jar:/usr/lib/hadoop/lib/commons-net-3.1.jar:/usr/lib/hadoop/lib/core-3.1.1.jar:/usr/lib/hadoop/lib/guava-11.0.2.jar:/usr/lib/hadoop/lib/hue-plugins-2.0.0-cdh4.0.1.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.8.8.jar:/usr/lib/hadoop/lib/jackson-jaxrs-1.8.8.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.8.8.jar:/usr/lib/hadoop/lib/jackson-xc-1.8.8.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.23.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.23.jar:/usr/lib/hadoop/lib/jaxb-api-2.2.2.jar:/usr/lib/hadoop/lib/jaxb-impl-2.2.3-1.jar:/usr/lib/hadoop/lib/jersey-core-1.8.jar:/usr/lib/hadoop/lib/jersey-json-1.8.jar:/usr/lib/hadoop/lib/jersey-server-1.8.jar:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/jettison-1.1.jar:/usr/lib/hadoop/lib/jetty-6.1.26.cloudera.
>
> ................................................................................
>
>
> 12/07/23 05:41:29 INFO lifecycle.LifecycleSupervisor: Starting lifecycle
> supervisor 1
> 12/07/23 05:41:29 INFO node.FlumeNode: Flume node starting - agent1
> 12/07/23 05:41:29 INFO nodemanager.DefaultLogicalNodeManager: Node manager
> starting
> 12/07/23 05:41:29 INFO properties.PropertiesFileConfigurationProvider:
> Configuration provider starting
> 12/07/23 05:41:29 INFO lifecycle.LifecycleSupervisor: Starting lifecycle
> supervisor 10
> 12/07/23 05:41:29 INFO properties.PropertiesFileConfigurationProvider:
> Reloading configuration file:/etc/flume-ng/conf/flume.conf
> 12/07/23 05:41:29 INFO conf.FlumeConfiguration: Processing:HDFS
> 12/07/23 05:41:29 INFO conf.FlumeConfiguration: Processing:HDFS
> 12/07/23 05:41:29 INFO conf.FlumeConfiguration: Processing:HDFS
> 12/07/23 05:41:29 INFO conf.FlumeConfiguration: Added sinks: loggerSink
> Agent: agent
> 12/07/23 05:41:29 INFO conf.FlumeConfiguration: Processing:loggerSink
> 12/07/23 05:41:29 INFO conf.FlumeConfiguration: Processing:HDFS
> 12/07/23 05:41:29 INFO conf.FlumeConfiguration: Added sinks: HDFS Agent:
> agent1
> 12/07/23 05:41:29 INFO conf.FlumeConfiguration: Processing:HDFS
> 12/07/23 05:41:29 INFO conf.FlumeConfiguration: Processing:loggerSink
> 12/07/23 05:41:29 INFO conf.FlumeConfiguration: Post-validation flume
> configuration contains configuration  for agents: [agent, agent1]
> 12/07/23 05:41:29 INFO properties.PropertiesFileConfigurationProvider:
> Creating channels
> 12/07/23 05:41:29 INFO properties.PropertiesFileConfigurationProvider:
> created channel MemoryChannel-2
> 12/07/23 05:41:29 ERROR channel.ChannelProcessor: Builder class not found.
> Exception follows.
> java.lang.ClassNotFoundException:
> org.apache.flume.interceptor.HostInterceptor$Builder
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:169)
>     at
> org.apache.flume.channel.ChannelProcessor.configureInterceptors(ChannelProcessor.java:103)
>     at
> org.apache.flume.channel.ChannelProcessor.configure(ChannelProcessor.java:79)
>     at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
>     at
> org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.loadSources(PropertiesFileConfigurationProvider.java:337)
>     at
> org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:222)
>     at
> org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:123)
>     at
> org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
>     at
> org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:202)
>     at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>     at
> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>     at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>     at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
>     at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
>
>
>
> Could you please let me know why it gives me this message about the missing class.
>
> Many thanks
>
>
>
>
>
>
>
>
>
>
>
> On Sun, Jul 22, 2012 at 10:12 PM, Mohammad Tariq <do...@gmail.com> wrote:
>>
>> Hello Mardan,
>>
>>         In order to aggregate data into your Hadoop cluster, you need
>> to set up a Flume agent first. To do that, you have to write a config
>> file with the desired properties. An example file would look somewhat
>> like this:
>>
>> agent1.sources = tail
>> agent1.channels = MemoryChannel-2
>> agent1.sinks = HDFS
>>
>> agent1.sources.tail.type = exec
>> agent1.sources.tail.command = tail -F /var/log/apache2/access.log
>> agent1.sources.tail.channels = MemoryChannel-2
>>
>> agent1.sources.tail.interceptors = hostint
>> agent1.sources.tail.interceptors.hostint.type =
>> org.apache.flume.interceptor.HostInterceptor$Builder
>> agent1.sources.tail.interceptors.hostint.preserveExisting = true
>> agent1.sources.tail.interceptors.hostint.useIP = true
>>
>> agent1.sinks.HDFS.channel = MemoryChannel-2
>> agent1.sinks.HDFS.type = hdfs
>> agent1.sinks.HDFS.hdfs.path = hdfs://localhost:9000/flume/%{host}
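>> # (%{host} in the path above is expanded from the event's 'host' header,
>> # which the host interceptor defined earlier fills in at runtime)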
>> agent1.sinks.HDFS.hdfs.file.Type = DataStream
>> agent1.sinks.HDFS.hdfs.writeFormat = Text
>>
>> agent1.channels.MemoryChannel-2.type = memory
>>
>> You can visit this link as the starting point, if you want -
>> http://cloudfront.blogspot.in/2012/06/how-to-build-and-use-flume-ng.html
>>
>> And, it is quite possible to run Flume-1.x on Windows. Here is a great
>> post by Alex on how to do that -
>> http://mapredit.blogspot.in/2012/07/run-flume-13x-on-windows.html
>>
>> Hope it helps.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>> On Mon, Jul 23, 2012 at 2:17 AM, mardan Khan <ma...@gmail.com> wrote:
>> > Yeah, my cluster is always running, but I don't know how to set up
>> > Flume so that it streams the data directly to Hadoop. I have to install
>> > the Flume agent on a Windows machine. From what I have read, the Flume
>> > 0.9.4 agent can be installed on Windows. Can we install Flume 1.x on a
>> > Windows machine? If anyone has done this, please guide me.
>> >
>> >
>> >
>> > Many thanks
>> >
>> >
>> >
>> > On Sun, Jul 22, 2012 at 7:26 PM, Mohammad Tariq <do...@gmail.com>
>> > wrote:
>> >>
>> >> The NameNode and DataNode must be running if we need to write anything
>> >> to HDFS.
>> >>
>> >> Regards,
>> >>     Mohammad Tariq
>> >>
>> >>
>> >> On Sun, Jul 22, 2012 at 11:41 PM, Henry Larson <ne...@gmail.com>
>> >> wrote:
>> >> > You can have Flume write to HDFS; however, do you have your Hadoop
>> >> > cluster running all the time?
>> >
>> >
>
>

Re: Use of Flume for the sensor network data

Posted by mardan Khan <ma...@gmail.com>.
Dear Mohammad Tariq,

Many thanks for your valuable information.

For testing purposes, I have installed Flume on a SuSE Linux system. When
I typed the command $ /etc/init.d/flume-ng-agent start, I received the
message "Starting Flume NG agent daemon (flume-ng-agent):".

I think this means my Flume agent is working properly. I have made the
following changes to the configuration file according to your example. The
configuration file is as follows:



agent.sources = seqGenSrc
agent.channels = memoryChannel
agent.sinks = loggerSink

# For each one of the sources, the type is defined
agent.sources.seqGenSrc.type = seq

# The channel can be defined as follows.
agent.sources.seqGenSrc.channels = memoryChannel

# Each sink's type must be defined
agent.sinks.loggerSink.type = logger

#Specify the channel the sink should use
agent.sinks.loggerSink.channel = memoryChannel

# Each channel's type is defined.
agent.channels.memoryChannel.type = memory

# Other config values specific to each type of channel(sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
agent.channels.memoryChannel.capacity = 100

agent1.sources = tail
agent1.channels = MemoryChannel-2
agent1.sinks = HDFS

agent1.sources.tail.type = exec
agent1.sources.tail.command = tail -F /var/log/flume-ng/flume-init.log
agent1.sources.tail.channels = MemoryChannel-2

agent1.sources.tail.interceptors = hostint
agent1.sources.tail.interceptors.hostint.type =
org.apache.flume.interceptor.HostInterceptor$Builder
agent1.sources.tail.interceptors.hostint.preserveExisting = true
agent1.sources.tail.interceptors.hostint.useIP = true

agent1.sinks.HDFS.channel = MemoryChannel-2
agent1.channels.MemoryChannel-2.type = memory
agent1.sinks.HDFS.type = hdfs
agent1.sinks.HDFS.hdfs.path = hdfs://134.83.35.24/user/mardan/
agent1.sinks.HDFS.hdfs.fileType = DataStream
agent1.sinks.HDFS.hdfs.writeFormat = Text



When I typed the following command:

$ /usr/bin/flume-ng agent -n agent1 -f /etc/flume-ng/conf/flume.conf



I got the following warning / error messages:

Warning: No configuration directory set! Use --conf <dir> to override.
Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS access
Info: Excluding /usr/lib/hadoop/lib/slf4j-api-1.6.1.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar from classpath
Info: Excluding /usr/lib/hadoop-hdfs/lib/slf4j-api-1.6.1.jar from classpath
Info: Excluding /usr/lib/hadoop-0.20-mapreduce/lib/slf4j-api-1.6.1.jar from
classpath
Info: Including HBASE libraries found via (/usr/bin/hbase) for HBASE access
Info: Excluding /usr/lib/hbase/bin/../lib/slf4j-api-1.6.1.jar from classpath
Info: Excluding /usr/lib/zookeeper/lib/slf4j-api-1.6.1.jar from classpath
Info: Excluding /usr/lib/zookeeper/lib/slf4j-log4j12-1.6.1.jar from
classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-api-1.6.1.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar from classpath
Info: Excluding /usr/lib/hadoop-hdfs/lib/slf4j-api-1.6.1.jar from classpath
+ exec /usr/java/jdk1.6.0_31/bin/java -Xmx20m -cp
'/usr/lib/flume-ng/lib/*:/etc/hadoop/conf:/usr/lib/hadoop/lib/activation-1.1.jar:/usr/lib/hadoop/lib/asm-3.2.jar:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop/lib/avro-1.5.4.jar:/usr/lib/hadoop/lib/commons-beanutils-1.7.0.jar:/usr/lib/hadoop/lib/commons-beanutils-core-1.8.0.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/commons-collections-3.2.1.jar:/usr/lib/hadoop/lib/commons-configuration-1.6.jar:/usr/lib/hadoop/lib/commons-digester-1.8.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop/lib/commons-io-2.1.jar:/usr/lib/hadoop/lib/commons-lang-2.5.jar:/usr/lib/hadoop/lib/commons-logging-1.1.1.jar:/usr/lib/hadoop/lib/commons-logging-api-1.1.jar:/usr/lib/hadoop/lib/commons-math-2.1.jar:/usr/lib/hadoop/lib/commons-net-3.1.jar:/usr/lib/hadoop/lib/core-3.1.1.jar:/usr/lib/hadoop/lib/guava-11.0.2.jar:/usr/lib/hadoop/lib/hue-plugins-2.0.0-cdh4.0.1.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.8.8.jar:/usr/lib/hadoop/lib/jackson-jaxrs-1.8.8.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.8.8.jar:/usr/lib/hadoop/lib/jackson-xc-1.8.8.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.23.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.23.jar:/usr/lib/hadoop/lib/jaxb-api-2.2.2.jar:/usr/lib/hadoop/lib/jaxb-impl-2.2.3-1.jar:/usr/lib/hadoop/lib/jersey-core-1.8.jar:/usr/lib/hadoop/lib/jersey-json-1.8.jar:/usr/lib/hadoop/lib/jersey-server-1.8.jar:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/jettison-1.1.jar:/usr/lib/hadoop/lib/jetty-6.1.26.cloudera.

................................................................................


12/07/23 05:41:29 INFO lifecycle.LifecycleSupervisor: Starting lifecycle
supervisor 1
12/07/23 05:41:29 INFO node.FlumeNode: Flume node starting - agent1
12/07/23 05:41:29 INFO nodemanager.DefaultLogicalNodeManager: Node manager
starting
12/07/23 05:41:29 INFO properties.PropertiesFileConfigurationProvider:
Configuration provider starting
12/07/23 05:41:29 INFO lifecycle.LifecycleSupervisor: Starting lifecycle
supervisor 10
12/07/23 05:41:29 INFO properties.PropertiesFileConfigurationProvider:
Reloading configuration file:/etc/flume-ng/conf/flume.conf
12/07/23 05:41:29 INFO conf.FlumeConfiguration: Processing:HDFS
12/07/23 05:41:29 INFO conf.FlumeConfiguration: Processing:HDFS
12/07/23 05:41:29 INFO conf.FlumeConfiguration: Processing:HDFS
12/07/23 05:41:29 INFO conf.FlumeConfiguration: Added sinks: loggerSink
Agent: agent
12/07/23 05:41:29 INFO conf.FlumeConfiguration: Processing:loggerSink
12/07/23 05:41:29 INFO conf.FlumeConfiguration: Processing:HDFS
12/07/23 05:41:29 INFO conf.FlumeConfiguration: Added sinks: HDFS Agent:
agent1
12/07/23 05:41:29 INFO conf.FlumeConfiguration: Processing:HDFS
12/07/23 05:41:29 INFO conf.FlumeConfiguration: Processing:loggerSink
12/07/23 05:41:29 INFO conf.FlumeConfiguration: Post-validation flume
configuration contains configuration  for agents: [agent, agent1]
12/07/23 05:41:29 INFO properties.PropertiesFileConfigurationProvider:
Creating channels
12/07/23 05:41:29 INFO properties.PropertiesFileConfigurationProvider:
created channel MemoryChannel-2
12/07/23 05:41:29 ERROR channel.ChannelProcessor: Builder class not found.
Exception follows.
java.lang.ClassNotFoundException:
org.apache.flume.interceptor.HostInterceptor$Builder
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:169)
    at
org.apache.flume.channel.ChannelProcessor.configureInterceptors(ChannelProcessor.java:103)
    at
org.apache.flume.channel.ChannelProcessor.configure(ChannelProcessor.java:79)
    at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
    at
org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.loadSources(PropertiesFileConfigurationProvider.java:337)
    at
org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:222)
    at
org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:123)
    at
org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
    at
org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:202)
    at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at
java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
    at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
    at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
    at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
    at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)



Could you please let me know why it gives me this message about a missing class.

Many thanks
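
The "No configuration directory set" warning at the top of the output points
at one likely culprit: the agent was started without -c/--conf. A minimal
sketch of the same command with the configuration directory supplied,
assuming the paths used above:

$ /usr/bin/flume-ng agent -c /etc/flume-ng/conf \
    -f /etc/flume-ng/conf/flume.conf -n agent1

Note that this file defines two agents, agent and agent1, which matches the
"agents: [agent, agent1]" log line; the -n flag selects which one is loaded.
As for the ClassNotFoundException, Flume 1.x builds that ship the host
interceptor also accept a short alias in place of the fully qualified
builder class, e.g.:

agent1.sources.tail.interceptors = hostint
agent1.sources.tail.interceptors.hostint.type = host

Whether that alias is available depends on the installed Flume version.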










On Sun, Jul 22, 2012 at 10:12 PM, Mohammad Tariq <do...@gmail.com> wrote:

> Hello Mardan,
>
>         In order to aggregate data into your Hadoop cluster, you first
> need to set up a Flume agent. To do that, you write a config file with
> the desired properties. An example file would look somewhat like this:
>
> agent1.sources = tail
> agent1.channels = MemoryChannel-2
> agent1.sinks = HDFS
>
> agent1.sources.tail.type = exec
> agent1.sources.tail.command = tail -F /var/log/apache2/access.log
> agent1.sources.tail.channels = MemoryChannel-2
>
> agent1.sources.tail.interceptors = hostint
> agent1.sources.tail.interceptors.hostint.type =
> org.apache.flume.interceptor.HostInterceptor$Builder
> agent1.sources.tail.interceptors.hostint.preserveExisting = true
> agent1.sources.tail.interceptors.hostint.useIP = true
>
> agent1.sinks.HDFS.channel = MemoryChannel-2
> agent1.sinks.HDFS.type = hdfs
> agent1.sinks.HDFS.hdfs.path = hdfs://localhost:9000/flume/%{host}
> agent1.sinks.HDFS.hdfs.fileType = DataStream
> agent1.sinks.HDFS.hdfs.writeFormat = Text
>
> agent1.channels.MemoryChannel-2.type = memory
>
> You can visit this link as a starting point, if you want -
> http://cloudfront.blogspot.in/2012/06/how-to-build-and-use-flume-ng.html
>
> And it is quite possible to run Flume 1.x on Windows. Here is a great
> post by Alex on how to do that -
> http://mapredit.blogspot.in/2012/07/run-flume-13x-on-windows.html
>
> Hope it helps.
>
> Regards,
>     Mohammad Tariq
>
>
> On Mon, Jul 23, 2012 at 2:17 AM, mardan Khan <ma...@gmail.com> wrote:
> > Yeah, my cluster is always running. But I don't know how to set up
> > Flume so that it streams the data directly to Hadoop. I would have to
> > install the Flume agent on a Windows machine. As per my study, the
> > Flume 0.9.4 agent can be installed on a Windows machine. Can we
> > install Flume 1.x on a Windows machine? If anyone has done this,
> > please guide me.
> >
> >
> >
> > Many thanks
> >
> >
> >
> > On Sun, Jul 22, 2012 at 7:26 PM, Mohammad Tariq <do...@gmail.com>
> wrote:
> >>
> >> NameNode and DataNode must be running if we need to write anything
> >> to HDFS.
> >>
> >> Regards,
> >>     Mohammad Tariq
> >>
> >>
> >> On Sun, Jul 22, 2012 at 11:41 PM, Henry Larson <ne...@gmail.com>
> >> wrote:
> >> > You can have Flume write to HDFS; however, do you have your
> >> > Hadoop cluster running all the time?
> >
> >
>

Re: Use of Flume for the sensor network data

Posted by Mohammad Tariq <do...@gmail.com>.
Hello Mardan,

        In order to aggregate data into your Hadoop cluster, you first
need to set up a Flume agent. To do that, you write a config file with
the desired properties. An example file would look somewhat like this:

agent1.sources = tail
agent1.channels = MemoryChannel-2
agent1.sinks = HDFS

agent1.sources.tail.type = exec
agent1.sources.tail.command = tail -F /var/log/apache2/access.log
agent1.sources.tail.channels = MemoryChannel-2

agent1.sources.tail.interceptors = hostint
agent1.sources.tail.interceptors.hostint.type =
org.apache.flume.interceptor.HostInterceptor$Builder
agent1.sources.tail.interceptors.hostint.preserveExisting = true
agent1.sources.tail.interceptors.hostint.useIP = true

agent1.sinks.HDFS.channel = MemoryChannel-2
agent1.sinks.HDFS.type = hdfs
agent1.sinks.HDFS.hdfs.path = hdfs://localhost:9000/flume/%{host}
agent1.sinks.HDFS.hdfs.fileType = DataStream
agent1.sinks.HDFS.hdfs.writeFormat = Text

agent1.channels.MemoryChannel-2.type = memory
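
Once a file like this is saved (say as flume.conf in the agent's conf
directory; both paths below are illustrative), the agent would typically be
started with the flume-ng script, naming the agent defined in the file:

$ flume-ng agent --conf /path/to/conf \
    --conf-file /path/to/conf/flume.conf \
    --name agent1

The %{host} escape in the HDFS path is filled in from the event header set
by the host interceptor configured on the source.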

You can visit this link as a starting point, if you want -
http://cloudfront.blogspot.in/2012/06/how-to-build-and-use-flume-ng.html

And it is quite possible to run Flume 1.x on Windows. Here is a great
post by Alex on how to do that -
http://mapredit.blogspot.in/2012/07/run-flume-13x-on-windows.html

Hope it helps.

Regards,
    Mohammad Tariq


On Mon, Jul 23, 2012 at 2:17 AM, mardan Khan <ma...@gmail.com> wrote:
> Yeah, my cluster is always running. But I don't know how to set up Flume
> so that it streams the data directly to Hadoop. I would have to install the
> Flume agent on a Windows machine. As per my study, the Flume 0.9.4 agent can
> be installed on a Windows machine. Can we install Flume 1.x on a Windows
> machine? If anyone has done this, please guide me.
>
>
>
> Many thanks
>
>
>
> On Sun, Jul 22, 2012 at 7:26 PM, Mohammad Tariq <do...@gmail.com> wrote:
>>
>> NameNode and DataNode must be running if we need to write anything to
>> HDFS.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>> On Sun, Jul 22, 2012 at 11:41 PM, Henry Larson <ne...@gmail.com>
>> wrote:
>> > You can have Flume write to HDFS; however, do you have your Hadoop
>> > cluster running all the time?
>
>

Re: Use of Flume for the sensor network data

Posted by mardan Khan <ma...@gmail.com>.
Yeah, my cluster is always running. But I don't know how to set up Flume so
that it streams the data directly to Hadoop. I would have to install the
Flume agent on a Windows machine. As per my study, the Flume 0.9.4 agent can
be installed on a Windows machine. Can we install Flume 1.x on a Windows
machine?
If anyone has done this, please guide me.



Many thanks


On Sun, Jul 22, 2012 at 7:26 PM, Mohammad Tariq <do...@gmail.com> wrote:

> NameNode and DataNode must be running if we need to write anything to the
> Hdfs.
>
> Regards,
>     Mohammad Tariq
>
>
> On Sun, Jul 22, 2012 at 11:41 PM, Henry Larson <ne...@gmail.com>
> wrote:
> > You can have flume write to HDFS: however, do you have your hadoop
> > cluster running all the time?
>

Re: Use of Flume for the sensor network data

Posted by Mohammad Tariq <do...@gmail.com>.
NameNode and DataNode must be running if we need to write anything to HDFS.
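
As a quick check (a generic sketch, not specific to this setup), the JDK's
jps tool lists the Hadoop JVMs that are currently up; the process IDs below
are illustrative:

$ jps
4215 NameNode
4398 DataNode
4573 SecondaryNameNode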

Regards,
    Mohammad Tariq


On Sun, Jul 22, 2012 at 11:41 PM, Henry Larson <ne...@gmail.com> wrote:
> You can have Flume write to HDFS; however, do you have your Hadoop
> cluster running all the time?

Re: Use of Flume for the sensor network data

Posted by Henry Larson <ne...@gmail.com>.
You can have Flume write to HDFS; however, do you have your Hadoop
cluster running all the time?