Posted to user@flume.apache.org by Subramanyam Satyanarayana <su...@attributor.com> on 2011/11/02 21:55:05 UTC

Flume set up help

Hi,
     We are trying to set up Flume in production for data transfer
between hosts. We had an implementation of one agent node running 5
logical nodes talking to 5 logical collectors running on 2 boxes (one of
them being a custom HBase sink), using the flow isolation mechanism. What
we noticed is that it runs fine for the first couple of hours, and then
data just stops flowing for unknown reasons. There are no particular
symptoms in the log files or the jstacks.

We need help with debugging this and finding the root cause. We are not
sure which of the following is the bottleneck choking the system:
a) Flow isolation b) Fan-out sinks c) Batching at the collector d) Multi-tailing
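One way to narrow this down is to strip a suspect out of a single flow at a
time and watch whether the stall disappears. As a sketch (same 0.9.x shell
syntax; node and flow names are taken from the config below), suspects (b)
and (c) can be removed from the ui flow by replacing the batched fan-out
with a single plain sink:

```
exec config 'uicollector' 'uiflow' 'autoCollectorSource' 'collectorSink("hdfs://prod-namenode-001:54310/user/argus/events/ui/inputarchive/%Y/%m/%d/","%{host}-%{tailSrcFile}.log")'
```

If the ui flow then runs stably, re-introduce the fan-out and the
collector() window one at a time to see which change brings the stall back.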

Here is the original set up info
===============================================
#Configure the 5 agents
exec  config 'uiagent' 'uiflow'
'tail("/usr/local/stow/tomcat/logs/ui/uievent.log")' 'agentDFOSink'
exec  config 'newbookagent' 'newbookflow'
'tail("/usr/local/stow/tomcat/logs/newbook/newbookevent.log")'
'agentDFOSink'
exec  config 'friendagent' 'friendflow'
'tail("/usr/local/stow/tomcat/logs/friend/friendevent.log")' 'agentDFOSink'
exec  config 'systemagent' 'systemflow'
'tail("/usr/local/stow/tomcat/logs/system/systemevent.log")' 'agentDFOSink'
exec  config 'feedagent' 'feedflow'
'tail("/usr/local/stow/tomcat/logs/feed/feedevent.log")' 'agentDFOSink'

#Configure the 5 collectors
exec config 'uicollector' 'uiflow' 'autoCollectorSource'
'collector(300000){[collectorSink("hdfs://prod-namenode-001:54310/user/argus/events/ui/inputarchive/%Y/%m/%d/","%{host}-%{tailSrcFile}.log"),collectorSink("file:///data/drivers/nile_uievent/tempinput","%{host}-%{tailSrcFile}.log")]}'
exec config 'newbookcollector' 'newbookflow' 'autoCollectorSource'
'collector(300000){[collectorSink("hdfs://prod-namenode-001:54310/user/argus/events/newbook/inputarchive/%Y/%m/%d/","%{host}-%{tailSrcFile}.log"),collectorSink("file:///data/drivers/nile_newbookevent/tempinput","%{host}-%{tailSrcFile}.log")]}'
exec config 'systemcollector' 'systemflow' 'autoCollectorSource'
'collector(300000){[collectorSink("hdfs://prod-namenode-001:54310/user/argus/events/system/inputarchive/%Y/%m/%d/","%{host}-%{tailSrcFile}.log"),collectorSink("file:///data/drivers/nile_systemevent/tempinput","%{host}-%{tailSrcFile}.log")]}'
exec config 'feedcollector' 'feedflow' 'autoCollectorSource'
'collector(60000){[collectorSink("hdfs://prod-namenode-001:54310/user/argus/events/feed/inputarchive/%Y/%m/%d/","%{host}-%{tailSrcFile}.log"),collectorSink("file:///data/drivers/nile_feedevent/tempinput","%{host}-%{tailSrcFile}.log")]}'
exec config 'friendcollector' 'friendflow' 'autoCollectorSource'
'collector(3){[collectorSink("hdfs://prod-namenode-001:54310/user/argus/events/friend/inputarchive/%Y/%m/%d/","%{host}-%{tailSrcFile}.log"),collectorSink("file:///data/drivers/nile_friendevent/tempinput","%{host}-%{tailSrcFile}.log"),friends2hbase("friends_list","friendslist")]}'

#Mappings
exec spawn 'agent' 'uiagent'
exec spawn 'agent' 'systemagent'
exec spawn 'agent' 'feedagent'
exec spawn 'agent' 'friendagent'
exec spawn 'agent' 'newbookagent'

exec spawn 'collector2' 'uicollector'
exec spawn 'collector2' 'systemcollector'
exec spawn 'collector' 'feedcollector'
exec spawn 'collector' 'friendcollector'
exec spawn 'collector2' 'newbookcollector'
====================================================


P.S.: We also finally pared it down to a bare minimum (as shown below) of
one agent talking to one collector & HDFS, and it still did not sustain
data flow over long periods.

=====================================
exec  config 'fastagent'
'multitail("/usr/local/stow/tomcat/logs/feed/feedevent.log","/usr/local/stow/tomcat/logs/friend/friendevent.log")'
'agentDFOSink("stage-event-001","35853")'

exec config 'collector' 'collectorSource(35853)'
'collector(60000){[collectorSink("hdfs://stage-namenode-001:54310/user/argus/events/backup/%Y/%m/%d/","%{host}-%{tailSrcFile}.log"),collectorSink("file:///data/drivers/nile_allevents/tempinput","%{host}-%{tailSrcFile}.log")]}'

======================================

Thanks!!
~Subbu




Re: Flume set up help

Posted by Subramanyam Satyanarayana <su...@attributor.com>.
Thanks, Mingjie, for the response. That certainly seems to be the issue;
we are not seeing it now that we removed batching on the collector side.
We will try to apply the patch and let you know how it goes.
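For reference, the apply-a-JIRA-patch workflow is roughly the following.
This is a sketch: the actual FLUME-798 patch file name and -p strip level
may differ, so the commands are demonstrated on a throwaway file to keep
them runnable as-is.

```shell
# Demonstrate the patch workflow on a scratch file; with a real JIRA
# patch you would run the same commands from the source tree root,
# e.g.  patch -p0 --dry-run < FLUME-798.patch
set -e
DEMO=/tmp/patch-demo
rm -rf "$DEMO" && mkdir -p "$DEMO" && cd "$DEMO"
printf 'old line\n' > demo.txt
printf 'new line\n' > demo.new
diff -u demo.txt demo.new > demo.patch || true   # diff exits 1 when files differ
patch -p0 --dry-run demo.txt < demo.patch        # verify it applies cleanly
patch -p0 demo.txt < demo.patch                  # actually apply
cat demo.txt
```

The --dry-run pass is the useful habit here: it reports rejects without
touching the tree.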

Thanks!!
~Subbu


On Thu, Nov 3, 2011 at 2:06 PM, Mingjie Lai <mj...@gmail.com> wrote:

> Subbu.
>
> I do believe what you experienced is still flume-798. Hope we can have a
> committed patch to trunk soon. If you really want to fix it early, you can
> try to apply the attached patch at the jira.
>
> The default flume log directory is /var/log/flume/. You can check the
> collector's folder for the error messages.
>
> Thanks,
> Mingjie



Re: Flume set up help

Posted by Mingjie Lai <mj...@gmail.com>.
Subbu,

I do believe what you experienced is still FLUME-798. Hope we can have a
committed patch in trunk soon. If you really want a fix early, you can
try applying the patch attached to the JIRA.

The default flume log directory is /var/log/flume/. You can check the
collector's folder there for the error messages.
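On the collector box, a quick scan of that directory for errors might look
like the following sketch; FLUME_LOG_DIR is an illustrative override, the
0.9.x default being /var/log/flume.

```shell
# Scan flume logs for errors/exceptions; the directory is the 0.9.x
# default, overridable here via an illustrative FLUME_LOG_DIR variable.
LOGDIR="${FLUME_LOG_DIR:-/var/log/flume}"
if [ -d "$LOGDIR" ]; then
  grep -riE 'error|exception' "$LOGDIR" | tail -n 20
else
  echo "no log directory at $LOGDIR"
fi
```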

Thanks,
Mingjie

On 11/03/2011 12:55 PM, Subramanyam Satyanarayana wrote:
> Hi Alex,
>           I am using version 0.9.4. I do not see any errors in the
> master log, although I must admit that I haven't figured out a way to
> capture logs. I currently just redirect the console logs to a file, and
> that seems to have INFO and other logs. I am not sure how to get DEBUG
> logs and make sure they go to a file instead of the console. I have
> created the "logs" directory under the flume folder, but no files get
> created there.
>
> Lastly, please let me know how we capture dmesg output. The
> processes are very much alive, but data stops flowing after a while.
>
> ~Subbu

Re: Flume set up help

Posted by Subramanyam Satyanarayana <su...@attributor.com>.
Hi Alex,
         I am using version 0.9.4. I do not see any errors in the master
log, although I must admit that I haven't figured out a way to capture
logs. I currently just redirect the console logs to a file, and that
seems to have INFO and other logs. I am not sure how to get DEBUG logs
and make sure they go to a file instead of the console. I have created
the "logs" directory under the flume folder, but no files get created
there.

Lastly, please let me know how we capture dmesg output. The processes
are very much alive, but data stops flowing after a while.
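On the DEBUG-to-file question above: Flume 0.9.x logs through log4j, so
one way to get DEBUG logs into a file is a log4j.properties along these
lines. This is a sketch: the appender name and file path are illustrative,
and the file belongs in flume's conf directory.

```
# Illustrative log4j.properties routing DEBUG logs to a rolling file
log4j.rootLogger=DEBUG, ROLLFILE
log4j.appender.ROLLFILE=org.apache.log4j.RollingFileAppender
log4j.appender.ROLLFILE.File=/var/log/flume/flume.log
log4j.appender.ROLLFILE.MaxFileSize=10MB
log4j.appender.ROLLFILE.MaxBackupIndex=5
log4j.appender.ROLLFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.ROLLFILE.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c: %m%n
```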

~Subbu

On Thu, Nov 3, 2011 at 12:55 AM, Alexander C.H. Lorenz <
wget.null@googlemail.com> wrote:

> Hi Subbu,
>
> Which version of Flume do you use? Does the master node show any
> errors? What does dmesg say on the nodes? First I would check all
> servers for errors; mostly the dmesg output will be helpful. Are the
> processes up and running, or did they die?
>
> best,
>  Alex

Re: Flume set up help

Posted by "Alexander C.H. Lorenz" <wg...@googlemail.com>.
Hi Subbu,

Which version of Flume do you use? Does the master node show any errors?
What does dmesg say on the nodes? First I would check all servers for
errors; mostly the dmesg output will be helpful. Are the processes up and
running, or did they die?
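These checks can be captured for later inspection with something like the
following sketch; the output directory is illustrative, and dmesg may
require privileges on some systems.

```shell
# Snapshot process state and kernel messages on a node when the flow
# stalls; /tmp/flume-diag is an illustrative output location.
OUT=/tmp/flume-diag
mkdir -p "$OUT"
ps aux | grep -i "[f]lume" > "$OUT/processes.txt" || true   # is the node alive?
dmesg > "$OUT/dmesg.txt" 2>/dev/null || true                # kernel-level errors (OOM kills etc.)
echo "diagnostics written to $OUT"
```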

best,
 Alex

On Wed, Nov 2, 2011 at 9:55 PM, Subramanyam Satyanarayana <
subbu@attributor.com> wrote:



-- 
Alexander Lorenz
http://mapredit.blogspot.com

Think of the environment: please don't print this email unless you
really need to.