Posted to user@flume.apache.org by Subramanyam Satyanarayana <su...@attributor.com> on 2011/11/02 21:55:05 UTC
Flume set up help
Hi,
We are trying to set up Flume in production for data transfer
between hosts. We had one agent node running 5 logical nodes
talking to 5 logical collectors running on 2 boxes (one of them
being a custom HBase sink), using the flow isolation mechanism. What we
noticed is that it runs fine for the first couple of hours, and then data
just stops flowing for unknown reasons. There are no particular symptoms
in the log files or the jstacks.
We need help with debugging this and finding the root cause. We are not
sure which of the following is the bottleneck choking the system:
a) Flow isolation b) Fan-out sinks c) Batching at the collector d) Multi tailing
Here is the original setup info:
===============================================
#Configure the 5 agents
exec config 'uiagent' 'uiflow'
'tail("/usr/local/stow/tomcat/logs/ui/uievent.log")' 'agentDFOSink'
exec config 'newbookagent' 'newbookflow'
'tail("/usr/local/stow/tomcat/logs/newbook/newbookevent.log")'
'agentDFOSink'
exec config 'friendagent' 'friendflow'
'tail("/usr/local/stow/tomcat/logs/friend/friendevent.log")' 'agentDFOSink'
exec config 'systemagent' 'systemflow'
'tail("/usr/local/stow/tomcat/logs/system/systemevent.log")' 'agentDFOSink'
exec config 'feedagent' 'feedflow'
'tail("/usr/local/stow/tomcat/logs/feed/feedevent.log")' 'agentDFOSink'
#Configure the 5 collectors
exec config 'uicollector' 'uiflow' 'autoCollectorSource'
'collector(300000){[collectorSink("hdfs://prod-namenode-001:54310/user/argus/events/ui/inputarchive/%Y/%m/%d/","%{host}-%{tailSrcFile}.log"),collectorSink("file:///data/drivers/nile_uievent/tempinput","%{host}-%{tailSrcFile}.log")]}'
exec config 'newbookcollector' 'newbookflow' 'autoCollectorSource'
'collector(300000){[collectorSink("hdfs://prod-namenode-001:54310/user/argus/events/newbook/inputarchive/%Y/%m/%d/","%{host}-%{tailSrcFile}.log"),collectorSink("file:///data/drivers/nile_newbookevent/tempinput","%{host}-%{tailSrcFile}.log")]}'
exec config 'systemcollector' 'systemflow' 'autoCollectorSource'
'collector(300000){[collectorSink("hdfs://prod-namenode-001:54310/user/argus/events/system/inputarchive/%Y/%m/%d/","%{host}-%{tailSrcFile}.log"),collectorSink("file:///data/drivers/nile_systemevent/tempinput","%{host}-%{tailSrcFile}.log")]}'
exec config 'feedcollector' 'feedflow' 'autoCollectorSource'
'collector(60000){[collectorSink("hdfs://prod-namenode-001:54310/user/argus/events/feed/inputarchive/%Y/%m/%d/","%{host}-%{tailSrcFile}.log"),collectorSink("file:///data/drivers/nile_feedevent/tempinput","%{host}-%{tailSrcFile}.log")]}'
exec config 'friendcollector' 'friendflow' 'autoCollectorSource'
'collector(3){[collectorSink("hdfs://prod-namenode-001:54310/user/argus/events/friend/inputarchive/%Y/%m/%d/","%{host}-%{tailSrcFile}.log"),collectorSink("file:///data/drivers/nile_friendevent/tempinput","%{host}-%{tailSrcFile}.log"),friends2hbase("friends_list","friendslist")]}'
#Mappings
exec spawn 'agent' 'uiagent'
exec spawn 'agent' 'systemagent'
exec spawn 'agent' 'feedagent'
exec spawn 'agent' 'friendagent'
exec spawn 'agent' 'newbookagent'
exec spawn 'collector2' 'uicollector'
exec spawn 'collector2' 'systemcollector'
exec spawn 'collector' 'feedcollector'
exec spawn 'collector' 'friendcollector'
exec spawn 'collector2' 'newbookcollector'
====================================================
P.S.: We also finally broke it down to a bare minimum (as shown below) of
one agent talking to one collector and HDFS, and it still did not hold up
over long hours of data flow.
=====================================
exec config 'fastagent'
'multitail("/usr/local/stow/tomcat/logs/feed/feedevent.log","/usr/local/stow/tomcat/logs/friend/friendevent.log")'
'agentDFOSink("stage-event-001","35853")'
exec config 'collector' 'collectorSource(35853)'
'collector(60000){[collectorSink("hdfs://stage-namenode-001:54310/user/argus/events/backup/%Y/%m/%d/","%{host}-%{tailSrcFile}.log"),collectorSink("file:///data/drivers/nile_allevents/tempinput","%{host}-%{tailSrcFile}.log")]}'
======================================
Thanks!!
~Subbu
Re: Flume set up help
Posted by Subramanyam Satyanarayana <su...@attributor.com>.
Thanks Mingjie for the response. It certainly seems to be the issue. We
are no longer seeing it now that we removed batching on the collector side.
We will try to apply the patch and let you know how that goes.
Thanks!!
~Subbu
On Thu, Nov 3, 2011 at 2:06 PM, Mingjie Lai <mj...@gmail.com> wrote:
> [snip]
--
Thanks!!
~Subbu
Re: Flume set up help
Posted by Mingjie Lai <mj...@gmail.com>.
Subbu,
I do believe what you experienced is still FLUME-798. Hope we can have a
committed patch on trunk soon. If you really want to fix it early, you
can try applying the patch attached to the JIRA.
The default Flume log directory is /var/log/flume/. You can check the
collector's folder for the error messages.
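A quick way to check that directory (a hedged sketch; the `*.log*` file-name pattern is an assumption about your install, so adjust it to match your actual collector log names):

```shell
# Hedged sketch: list recent ERROR/Exception lines from the Flume
# collector logs under the default log directory.
scan_flume_logs() {
  logdir="${1:-/var/log/flume}"
  # Newest matches come last; suppress noise if the directory is empty.
  grep -iE 'error|exception' "$logdir"/*.log* 2>/dev/null | tail -n 50
}
```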
Thanks,
Mingjie
On 11/03/2011 12:55 PM, Subramanyam Satyanarayana wrote:
> [snip]
Re: Flume set up help
Posted by Subramanyam Satyanarayana <su...@attributor.com>.
Hi Alex,
I am using the 0.9.4 version. I do not see any errors in the
master log, although I must admit that I haven't figured out a way to
capture logs. I currently just redirect the console output to a file, and
that seems to have INFO and other logs. I am not sure how to get DEBUG logs
and make sure they go to a file instead of the console. I have created the
"logs" directory under the flume folder, but no files get created there.
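One way to route DEBUG output to a file is through Flume's log4j configuration. Here is a minimal sketch of a `conf/log4j.properties`; the appender name, file path, and rollover sizes are illustrative assumptions, not the stock defaults:

```properties
# Root logger at DEBUG, writing to a rolling file (path is an example)
log4j.rootLogger=DEBUG, flumefile
log4j.appender.flumefile=org.apache.log4j.RollingFileAppender
log4j.appender.flumefile.File=/var/log/flume/flumenode.log
log4j.appender.flumefile.MaxFileSize=10MB
log4j.appender.flumefile.MaxBackupIndex=5
log4j.appender.flumefile.layout=org.apache.log4j.PatternLayout
log4j.appender.flumefile.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c: %m%n
```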
Lastly, please let me know how we can capture dmesg output. The processes
are very much alive, but data stops flowing after a while.
~Subbu
On Thu, Nov 3, 2011 at 12:55 AM, Alexander C.H. Lorenz <
wget.null@googlemail.com> wrote:
> [snip]
--
Thanks!!
~Subbu
Re: Flume set up help
Posted by "Alexander C.H. Lorenz" <wg...@googlemail.com>.
Hi Subbu,
which version of Flume do you use? Does the master node log any errors?
What does dmesg say on the nodes? First I would check all servers for
errors; the dmesg output is usually helpful. Are the processes up and
running, or did they die?
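Those checks can be scripted so you get a consistent snapshot when the flow stalls. A hedged sketch (the output paths and the `Flume` process-name pattern for jps are assumptions; adjust for your hosts):

```shell
# Hedged sketch: snapshot node state for later diagnosis.
capture_diagnostics() {
  outdir="$1"
  mkdir -p "$outdir"
  # Kernel ring buffer: OOM kills and disk errors surface here.
  dmesg > "$outdir/dmesg.txt" 2>/dev/null || true
  # One jstack per Flume JVM still running (jps ships with the JDK).
  for pid in $(jps 2>/dev/null | awk '/Flume/ {print $1}'); do
    jstack "$pid" > "$outdir/jstack-$pid.txt" 2>/dev/null || true
  done
  # Timestamp the capture so snapshots can be correlated.
  date > "$outdir/captured-at.txt"
}
```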
best,
Alex
On Wed, Nov 2, 2011 at 9:55 PM, Subramanyam Satyanarayana <
subbu@attributor.com> wrote:
> [snip]
--
Alexander Lorenz
http://mapredit.blogspot.com
Think of the environment: please don't print this email unless you
really need to.