Posted to user@flume.apache.org by Jay Stricks <ja...@wapolabs.com> on 2012/03/19 19:21:37 UTC

Agent Nodes Configured, but Only One Forwarding Data to Collector

I have two logical nodes on my servers, which I initialize using a script
derived from flume-daemon.sh. Since I upgraded to v094_cdh3u2, the second
node no longer starts forwarding its files to a collector after startup.
An excerpt from the init script:

export FLUME_HOME=/usr/local/flume-0.9.4-cdh3u2
export FLUME_LOG_DIR="/var/log/flume"
export FLUME_LOGFILE=flume-flume-node-$HOSTNAME.log
log=$FLUME_LOG_DIR/flume-flume-$HOSTNAME.out

for IN in `cat /etc/flume-node.conf`; do
          counter=0;
          arr=$(echo $IN | tr ";" "\n")
          for x in $arr
          do
              flume_args[$counter]=`echo $x`;
              counter=$(( counter + 1 ))
          done
          flume_host_name=`/bin/hostname`${flume_args[1]}
          nohup ${FLUME_HOME}/bin/flume node -n $flume_host_name > "$log" 2>&1 < /dev/null &
done

Both nodes' processes are running, both show as ACTIVE in the Node Status
table, and both have the correct configuration in the Node Configuration
table.  But files accumulate in the /logged directory for the second node.

The problem goes away when I send a refresh {node_name} command to the master.

Configuration is:

Agent1: node_name1 useast_events syslogUdp(5140) {value("app","ngn") => autoE2EChain }
Collector: collector_name useast_events autoCollectorSource collectorSink(s3://events...)

Agent2: node_name2 useast_accesslogs syslogUdp(5140) {value("app","ngn") => autoE2EChain }
Collector: collector_name useast_accesslogs autoCollectorSource collectorSink(s3://accesslogs...)


After I submit the refresh command, the agent's sink actually changes
from {value("app","ngn") => autoE2EChain } to:

{ value( "app", "ngn" ) => { ackedWriteAhead => { stubbornAppend => {
insistentOpen => < logicalSink( "collector_2_1a_094_events" ) ? <
logicalSink( "collector_1_1c_094_events" ) ? logicalSink(
"collector_1_1b_094_events" ) > > } } } }

I can't figure out what is happening. One possibly relevant detail: I set
the FLUME_LOGFILE environment variable only once (i.e., outside the for
loop).  Sometimes multiple nodes write to the same log file concurrently;
other times a second log file is created with a date extension, and only
the second node writes to it.
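
To rule out the shared log file as a factor, the log names could be derived
per node inside the loop. A minimal sketch, assuming bin/flume reads
FLUME_LOGFILE from the environment at launch (as the init script excerpt
implies). The helper names node_logfile and node_outfile and the sample node
name are made up for illustration. This only separates the logs; it is not
known to fix the forwarding problem.

```shell
#!/bin/sh
# Sketch only: derive a distinct log file and .out file per logical node.
# The naming pattern extends the one in the excerpt; it is an assumption,
# not something Flume requires.

# Name of the log file for one logical node (to be used as FLUME_LOGFILE).
node_logfile() {
    printf 'flume-flume-node-%s.log\n' "$1"
}

# Path of the stdout/stderr capture file for one logical node.
# $1 is the log directory, $2 is the logical node name.
node_outfile() {
    printf '%s/flume-flume-%s.out\n' "$1" "$2"
}

# Inside the for loop, after flume_host_name has been computed:
flume_host_name="host1_events"    # made-up value for illustration
FLUME_LOGFILE=$(node_logfile "$flume_host_name")
export FLUME_LOGFILE
log=$(node_outfile "/var/log/flume" "$flume_host_name")
echo "$FLUME_LOGFILE"
echo "$log"
```

With this in place, each nohup line would redirect to its own "$log" instead
of every node truncating and reusing the same .out file.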

Does anyone have any suggestions for guaranteeing that both nodes are
initialized correctly? I could add a refresh command to the init script,
but I want to make sure that I understand the problem first, since this
wasn't happening before the upgrade.
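
For reference, the refresh workaround in the init script might look roughly
like the sketch below. It assumes the Flume shell accepts "-c <master>" to
connect and "-e <command>" to run a single command, and that "exec refresh
<node>" is the command form; these should be checked against the 0.9.4 user
guide. FLUME_MASTER and refresh_command are hypothetical names that are not
in the original script. The function only prints the command line so it can
be inspected before it is wired in.

```shell
#!/bin/sh
# Hypothetical: FLUME_MASTER is not part of the original init script.
FLUME_MASTER="${FLUME_MASTER:-localhost}"

# Print (rather than run) the refresh command for one logical node.
# The flume shell flags used here are an assumption, not verified
# against the installed version.
refresh_command() {
    printf '%s/bin/flume shell -c %s -e "exec refresh %s"\n' \
        "${FLUME_HOME:-/usr/local/flume-0.9.4-cdh3u2}" "$FLUME_MASTER" "$1"
}

# Example: show the command that would refresh the second agent.
refresh_command "node_name2"
```

The node probably has to be registered with the master before a refresh
takes effect, so a delay or retry after the nohup line may be needed; that
part is not shown here.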

Thanks,

Jay S.