You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flume.apache.org by Michael Luban <mi...@gmail.com> on 2011/08/11 18:25:41 UTC

Heartbeats backing up when bootstrapping embedded flume node

Hello,

I have an issue trying to bootstrap an embedded flume node within a custom
service.
I'm using the same sort of start physical node, configure and spawn logical
node methodology that flumebase employs:

https://github.com/flumebase/flumebase/blob/master/src/main/java/com/odiago/flumebase/flume/EmbeddedFlumeConfig.java

When using the cdh3u1 release and configured to either use an embedded or an
external flume master, I get (upon first time execution) a "Hearbeats are
backing up" log message, followed by what looks to be the decommissioning of
my logical node ("Forcing driver to exit uncleanly")
Subsequent manual configuration of the embedded node through the master with
the same sink (custom plugin) and source (rpcSource(54010)) picks up the
configuration correctly and the embedded node works fine.
I'm trying to eliminate or understand the root cause of the "Forcing driver
to exit uncleanly" message when the logical node is configured and the
spawned into the physical node for the first time.

-----------Log snippet---------------

01:59:02.457 [Heartbeat] INFO  c.e.t.flume.TimeSeriesAggregatorSink -
EventSink builder() : building event sink
01:59:02.457 [Heartbeat] INFO  c.c.flume.agent.LogicalNodeManager - creating
new logical node timeseries-aggregator-host.local.domain
01:59:02.470 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (HELLO,Wed Dec 31
19:00:00 EST 1969)
01:59:02.472 [logicalNode timeseries-aggregator-host.local.domain-32] INFO
 c.c.f.h.thrift.ThriftEventSource - Starting blocking thread pool server on
port 54010...
01:59:02.475 [logicalNode timeseries-aggregator-host.local.domain-32] DEBUG
c.c.f.core.connector.DirectDriver - Starting driver ThriftEventSource |
TimeSeriesAggregatorSink
01:59:02.479 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:02.475 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
01:59:02.480 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode -
Attempt to load config com.cloudera.flume.conf.FlumeConfigData@449afc
01:59:02.480 [Check config] INFO  c.e.t.flume.TimeSeriesAggregatorSink -
EventSink builder() : building event sink
01:59:07.482 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:07.484 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:12.487 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:12.489 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 1 heartbeats
01:59:12.490 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:17.493 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:17.494 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 2 heartbeats
01:59:17.495 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:22.501 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:22.502 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 3 heartbeats
01:59:22.502 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:27.505 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:27.506 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 4 heartbeats
01:59:27.506 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:32.509 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:32.511 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 5 heartbeats
01:59:32.511 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:37.514 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:37.516 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 6 heartbeats
01:59:37.516 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:42.519 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:42.521 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 7 heartbeats
01:59:42.521 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:47.524 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:47.526 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 8 heartbeats
01:59:47.526 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:52.529 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:52.532 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 9 heartbeats
01:59:52.532 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:57.535 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:57.537 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 10 heartbeats
01:59:57.537 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
02:00:02.480 [Check config] ERROR com.cloudera.flume.agent.LogicalNode -
Forcing driver to exit uncleanly
02:00:02.481 [Check config] INFO  com.cloudera.flume.agent.LogicalNode -
Node config successfully set to
com.cloudera.flume.conf.FlumeConfigData@449afc
02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.481 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.481 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.481 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.481 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.483 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.483 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.483 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.483 [Thread-4] DEBUG c.c.flume.reporter.ReportEvent - Ignoring null
string metric message
Aug 11, 2011 2:00:02 AM
com.twitter.common.application.modules.AppLauncherModule$1 uncaughtException
SEVERE: Uncaught exception from Thread[Thread-4,5,main]
java.lang.NullPointerException
    at
com.cloudera.flume.handlers.thrift.ThriftEventSource.getMetrics(ThriftEventSource.java:99)
    at
com.cloudera.flume.core.EventSource$Base.getReports(EventSource.java:186)
    at com.cloudera.flume.agent.LogicalNode.getReports(LogicalNode.java:309)
    at
com.cloudera.flume.reporter.MasterReportPusher$PusherThread.querySrcSinkReports(MasterReportPusher.java:102)
    at
com.cloudera.flume.reporter.MasterReportPusher$PusherThread.sendReports(MasterReportPusher.java:110)
    at
com.cloudera.flume.reporter.MasterReportPusher$PusherThread.run(MasterReportPusher.java:119)
02:00:02.487 [logicalNode timeseries-aggregator-host.local.domain-34] INFO
 c.c.f.h.thrift.ThriftEventSource - Starting blocking thread pool server on
port 54010...
02:00:02.488 [logicalNode timeseries-aggregator-host.local.domain-34] ERROR
o.a.t.server.TSaneThreadPoolServer - Error occurred during listening.
org.apache.thrift.transport.TTransportException: Could not create
ServerSocket on address 0.0.0.0/0.0.0.0:54010.
    at
org.apache.thrift.transport.TSaneServerSocket.bind(TSaneServerSocket.java:110)
~[flume-core-0.9.4-cdh3u1.jar:0.6.1]
    at
org.apache.thrift.transport.TSaneServerSocket.listen(TSaneServerSocket.java:116)
~[flume-core-0.9.4-cdh3u1.jar:0.6.1]
    at
org.apache.thrift.server.TSaneThreadPoolServer.start(TSaneThreadPoolServer.java:162)
~[flume-core-0.9.4-cdh3u1.jar:0.6.1]
    at
com.cloudera.flume.handlers.thrift.ThriftEventSource.open(ThriftEventSource.java:151)
[flume-core-0.9.4-cdh3u1.jar:na]
    at
com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:87)
[flume-core-0.9.4-cdh3u1.jar:na]
02:00:02.488 [logicalNode timeseries-aggregator-host.local.domain-34] DEBUG
c.c.f.core.connector.DirectDriver - Starting driver ThriftEventSource |
TimeSeriesAggregatorSink
02:00:02.539 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11
01:58:57 EDT 2011)
02:00:02.539 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
02:00:02.555 [logicalNode timeseries-aggregator-host.local.domain-32] ERROR
c.c.f.core.connector.DirectDriver - Closing down due to exception during
append calls
java.io.IOException: Waiting for queue element was interrupted! null
    at
com.cloudera.flume.handlers.thrift.ThriftEventSource.next(ThriftEventSource.java:222)
~[flume-core-0.9.4-cdh3u1.jar:na]
    at
com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:105)
~[flume-core-0.9.4-cdh3u1.jar:na]
Caused by: java.lang.InterruptedException: null
    at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
~[na:1.6.0_22]
    at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2094)
~[na:1.6.0_22]
    at
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:423)
~[na:1.6.0_22]
    at
com.cloudera.flume.handlers.thrift.ThriftEventSource.next(ThriftEventSource.java:209)
~[flume-core-0.9.4-cdh3u1.jar:na]
    ... 1 common frames omitted
02:00:02.557 [logicalNode timeseries-aggregator-host.local.domain-32] INFO
 c.c.f.core.connector.DirectDriver - Connector logicalNode
timeseries-aggregator-host.local.domain-32 exited with error: Waiting for
queue element was interrupted! null
java.io.IOException: Waiting for queue element was interrupted! null
    at
com.cloudera.flume.handlers.thrift.ThriftEventSource.next(ThriftEventSource.java:222)
~[flume-core-0.9.4-cdh3u1.jar:na]
    at
com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:105)
~[flume-core-0.9.4-cdh3u1.jar:na]
Caused by: java.lang.InterruptedException: null
    at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
~[na:1.6.0_22]
    at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2094)
~[na:1.6.0_22]
    at
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:423)
~[na:1.6.0_22]
    at
com.cloudera.flume.handlers.thrift.ThriftEventSource.next(ThriftEventSource.java:209)
~[flume-core-0.9.4-cdh3u1.jar:na]
    ... 1 common frames omitted
02:00:02.558 [logicalNode timeseries-aggregator-host.local.domain-32] INFO
 c.c.f.h.thrift.ThriftEventSource - Closed server on port 54010...
02:00:02.558 [logicalNode timeseries-aggregator-host.local.domain-32] INFO
 c.c.f.h.thrift.ThriftEventSource - Queue still has 0 elements ...
02:00:02.558 [logicalNode timeseries-aggregator-host.local.domain-32] ERROR
c.c.f.core.connector.DirectDriver - Exiting driver logicalNode
timeseries-aggregator-host.local.domain-32 in error state ThriftEventSource
| TimeSeriesAggregatorSink because Waiting for queue element was
interrupted! null
02:00:07.540 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11
01:58:57 EDT 2011)
02:00:07.541 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
02:00:12.542 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11
01:58:57 EDT 2011)
02:00:12.542 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
02:00:17.545 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11
01:58:57 EDT 2011)
02:00:17.546 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
02:00:22.549 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11
01:58:57 EDT 2011)
02:00:22.549 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
02:00:27.552 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11
01:58:57 EDT 2011)
02:00:27.553 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
---------------------------

Re: Heartbeats backing up when bootstrapping embedded flume node

Posted by Jonathan Hsieh <jo...@cloudera.com>.
Michael,

Unfortunately, the cdh3u1 version of flume and the 0.9.4 version of flume
had several regressions.  This problem looks very similar to the one
described here:

https://issues.apache.org/jira/browse/flume-706

I've been able to diagnose and fix the problem but it is not yet available
in packaged form.

Jon.

On Thu, Aug 11, 2011 at 9:25 AM, Michael Luban <mi...@gmail.com>wrote:

> Hello,
>
> I have an issue trying to bootstrap an embedded flume node within a custom
> service.
> I'm using the same sort of start physical node, configure and spawn logical
> node methodology that flumebase employs:
>
>
> https://github.com/flumebase/flumebase/blob/master/src/main/java/com/odiago/flumebase/flume/EmbeddedFlumeConfig.java
>
> When using the cdh3u1 release and configured to either use an embedded or
> an external flume master, I get (upon first time execution) a "Hearbeats are
> backing up" log message, followed by what looks to be the decommissioning of
> my logical node ("Forcing driver to exit uncleanly")
> Subsequent manual configuration of the embedded node through the master
> with the same sink (custom plugin) and source (rpcSource(54010)) picks up
> the configuration correctly and the embedded node works fine.
>  I'm trying to eliminate or understand the root cause of the "Forcing
> driver to exit uncleanly" message when the logical node is configured and
> the spawned into the physical node for the first time.
>
> -----------Log snippet---------------
>
> 01:59:02.457 [Heartbeat] INFO  c.e.t.flume.TimeSeriesAggregatorSink -
> EventSink builder() : building event sink
> 01:59:02.457 [Heartbeat] INFO  c.c.flume.agent.LogicalNodeManager -
> creating new logical node timeseries-aggregator-host.local.domain
> 01:59:02.470 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
> Heartbeat from host:host.local.domain
> logicalnode:timeseries-aggregator-host.local.domain (HELLO,Wed Dec 31
> 19:00:00 EST 1969)
> 01:59:02.472 [logicalNode timeseries-aggregator-host.local.domain-32] INFO
>  c.c.f.h.thrift.ThriftEventSource - Starting blocking thread pool server on
> port 54010...
> 01:59:02.475 [logicalNode timeseries-aggregator-host.local.domain-32] DEBUG
> c.c.f.core.connector.DirectDriver - Starting driver ThriftEventSource |
> TimeSeriesAggregatorSink
> 01:59:02.479 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
> acks waiting for master: {}
> 01:59:02.475 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
> another heartbeat
> 01:59:02.480 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode -
> Attempt to load config com.cloudera.flume.conf.FlumeConfigData@449afc
> 01:59:02.480 [Check config] INFO  c.e.t.flume.TimeSeriesAggregatorSink -
> EventSink builder() : building event sink
> 01:59:07.482 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
> Heartbeat from host:host.local.domain
> logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
> 19:00:00 EST 1969)
> 01:59:07.484 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
> acks waiting for master: {}
> 01:59:12.487 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
> Heartbeat from host:host.local.domain
> logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
> 19:00:00 EST 1969)
> 01:59:12.489 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
> are backing up, currently behind by 1 heartbeats
> 01:59:12.490 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
> acks waiting for master: {}
> 01:59:17.493 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
> Heartbeat from host:host.local.domain
> logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
> 19:00:00 EST 1969)
> 01:59:17.494 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
> are backing up, currently behind by 2 heartbeats
> 01:59:17.495 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
> acks waiting for master: {}
> 01:59:22.501 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
> Heartbeat from host:host.local.domain
> logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
> 19:00:00 EST 1969)
> 01:59:22.502 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
> are backing up, currently behind by 3 heartbeats
> 01:59:22.502 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
> acks waiting for master: {}
> 01:59:27.505 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
> Heartbeat from host:host.local.domain
> logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
> 19:00:00 EST 1969)
> 01:59:27.506 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
> are backing up, currently behind by 4 heartbeats
> 01:59:27.506 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
> acks waiting for master: {}
> 01:59:32.509 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
> Heartbeat from host:host.local.domain
> logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
> 19:00:00 EST 1969)
> 01:59:32.511 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
> are backing up, currently behind by 5 heartbeats
> 01:59:32.511 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
> acks waiting for master: {}
> 01:59:37.514 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
> Heartbeat from host:host.local.domain
> logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
> 19:00:00 EST 1969)
> 01:59:37.516 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
> are backing up, currently behind by 6 heartbeats
> 01:59:37.516 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
> acks waiting for master: {}
> 01:59:42.519 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
> Heartbeat from host:host.local.domain
> logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
> 19:00:00 EST 1969)
> 01:59:42.521 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
> are backing up, currently behind by 7 heartbeats
> 01:59:42.521 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
> acks waiting for master: {}
> 01:59:47.524 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
> Heartbeat from host:host.local.domain
> logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
> 19:00:00 EST 1969)
> 01:59:47.526 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
> are backing up, currently behind by 8 heartbeats
> 01:59:47.526 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
> acks waiting for master: {}
> 01:59:52.529 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
> Heartbeat from host:host.local.domain
> logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
> 19:00:00 EST 1969)
> 01:59:52.532 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
> are backing up, currently behind by 9 heartbeats
> 01:59:52.532 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
> acks waiting for master: {}
> 01:59:57.535 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
> Heartbeat from host:host.local.domain
> logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
> 19:00:00 EST 1969)
> 01:59:57.537 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
> are backing up, currently behind by 10 heartbeats
> 01:59:57.537 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
> acks waiting for master: {}
> 02:00:02.480 [Check config] ERROR com.cloudera.flume.agent.LogicalNode -
> Forcing driver to exit uncleanly
> 02:00:02.481 [Check config] INFO  com.cloudera.flume.agent.LogicalNode -
> Node config successfully set to
> com.cloudera.flume.conf.FlumeConfigData@449afc
> 02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
> another heartbeat
> 02:00:02.481 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
> nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
> (Thu Aug 11 01:58:57 EDT 2011).
> 02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
> another heartbeat
> 02:00:02.481 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
> nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
> (Thu Aug 11 01:58:57 EDT 2011).
> 02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
> another heartbeat
> 02:00:02.481 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
> nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
> (Thu Aug 11 01:58:57 EDT 2011).
> 02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
> another heartbeat
> 02:00:02.481 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
> nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
> (Thu Aug 11 01:58:57 EDT 2011).
> 02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
> another heartbeat
> 02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
> nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
> (Thu Aug 11 01:58:57 EDT 2011).
> 02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
> another heartbeat
> 02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
> nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
> (Thu Aug 11 01:58:57 EDT 2011).
> 02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
> another heartbeat
> 02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
> nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
> (Thu Aug 11 01:58:57 EDT 2011).
> 02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
> another heartbeat
> 02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
> nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
> (Thu Aug 11 01:58:57 EDT 2011).
> 02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
> another heartbeat
> 02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
> nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
> (Thu Aug 11 01:58:57 EDT 2011).
> 02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
> another heartbeat
> 02:00:02.483 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
> nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
> (Thu Aug 11 01:58:57 EDT 2011).
> 02:00:02.483 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
> another heartbeat
> 02:00:02.483 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
> nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
> (Thu Aug 11 01:58:57 EDT 2011).
> 02:00:02.483 [Thread-4] DEBUG c.c.flume.reporter.ReportEvent - Ignoring
> null string metric message
> Aug 11, 2011 2:00:02 AM
> com.twitter.common.application.modules.AppLauncherModule$1 uncaughtException
> SEVERE: Uncaught exception from Thread[Thread-4,5,main]
> java.lang.NullPointerException
>     at
> com.cloudera.flume.handlers.thrift.ThriftEventSource.getMetrics(ThriftEventSource.java:99)
>     at
> com.cloudera.flume.core.EventSource$Base.getReports(EventSource.java:186)
>     at
> com.cloudera.flume.agent.LogicalNode.getReports(LogicalNode.java:309)
>     at
> com.cloudera.flume.reporter.MasterReportPusher$PusherThread.querySrcSinkReports(MasterReportPusher.java:102)
>     at
> com.cloudera.flume.reporter.MasterReportPusher$PusherThread.sendReports(MasterReportPusher.java:110)
>     at
> com.cloudera.flume.reporter.MasterReportPusher$PusherThread.run(MasterReportPusher.java:119)
> 02:00:02.487 [logicalNode timeseries-aggregator-host.local.domain-34] INFO
>  c.c.f.h.thrift.ThriftEventSource - Starting blocking thread pool server on
> port 54010...
> 02:00:02.488 [logicalNode timeseries-aggregator-host.local.domain-34] ERROR
> o.a.t.server.TSaneThreadPoolServer - Error occurred during listening.
> org.apache.thrift.transport.TTransportException: Could not create
> ServerSocket on address 0.0.0.0/0.0.0.0:54010.
>     at
> org.apache.thrift.transport.TSaneServerSocket.bind(TSaneServerSocket.java:110)
> ~[flume-core-0.9.4-cdh3u1.jar:0.6.1]
>     at
> org.apache.thrift.transport.TSaneServerSocket.listen(TSaneServerSocket.java:116)
> ~[flume-core-0.9.4-cdh3u1.jar:0.6.1]
>     at
> org.apache.thrift.server.TSaneThreadPoolServer.start(TSaneThreadPoolServer.java:162)
> ~[flume-core-0.9.4-cdh3u1.jar:0.6.1]
>     at
> com.cloudera.flume.handlers.thrift.ThriftEventSource.open(ThriftEventSource.java:151)
> [flume-core-0.9.4-cdh3u1.jar:na]
>     at
> com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:87)
> [flume-core-0.9.4-cdh3u1.jar:na]
> 02:00:02.488 [logicalNode timeseries-aggregator-host.local.domain-34] DEBUG
> c.c.f.core.connector.DirectDriver - Starting driver ThriftEventSource |
> TimeSeriesAggregatorSink
> 02:00:02.539 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
> Heartbeat from host:host.local.domain
> logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11
> 01:58:57 EDT 2011)
> 02:00:02.539 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
> acks waiting for master: {}
> 02:00:02.555 [logicalNode timeseries-aggregator-host.local.domain-32] ERROR
> c.c.f.core.connector.DirectDriver - Closing down due to exception during
> append calls
> java.io.IOException: Waiting for queue element was interrupted! null
>     at
> com.cloudera.flume.handlers.thrift.ThriftEventSource.next(ThriftEventSource.java:222)
> ~[flume-core-0.9.4-cdh3u1.jar:na]
>     at
> com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:105)
> ~[flume-core-0.9.4-cdh3u1.jar:na]
> Caused by: java.lang.InterruptedException: null
>     at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
> ~[na:1.6.0_22]
>     at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2094)
> ~[na:1.6.0_22]
>     at
> java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:423)
> ~[na:1.6.0_22]
>     at
> com.cloudera.flume.handlers.thrift.ThriftEventSource.next(ThriftEventSource.java:209)
> ~[flume-core-0.9.4-cdh3u1.jar:na]
>     ... 1 common frames omitted
> 02:00:02.557 [logicalNode timeseries-aggregator-host.local.domain-32] INFO
>  c.c.f.core.connector.DirectDriver - Connector logicalNode
> timeseries-aggregator-host.local.domain-32 exited with error: Waiting for
> queue element was interrupted! null
> java.io.IOException: Waiting for queue element was interrupted! null
>     at
> com.cloudera.flume.handlers.thrift.ThriftEventSource.next(ThriftEventSource.java:222)
> ~[flume-core-0.9.4-cdh3u1.jar:na]
>     at
> com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:105)
> ~[flume-core-0.9.4-cdh3u1.jar:na]
> Caused by: java.lang.InterruptedException: null
>     at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
> ~[na:1.6.0_22]
>     at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2094)
> ~[na:1.6.0_22]
>     at
> java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:423)
> ~[na:1.6.0_22]
>     at
> com.cloudera.flume.handlers.thrift.ThriftEventSource.next(ThriftEventSource.java:209)
> ~[flume-core-0.9.4-cdh3u1.jar:na]
>     ... 1 common frames omitted
> 02:00:02.558 [logicalNode timeseries-aggregator-host.local.domain-32] INFO
>  c.c.f.h.thrift.ThriftEventSource - Closed server on port 54010...
> 02:00:02.558 [logicalNode timeseries-aggregator-host.local.domain-32] INFO
>  c.c.f.h.thrift.ThriftEventSource - Queue still has 0 elements ...
> 02:00:02.558 [logicalNode timeseries-aggregator-host.local.domain-32] ERROR
> c.c.f.core.connector.DirectDriver - Exiting driver logicalNode
> timeseries-aggregator-host.local.domain-32 in error state ThriftEventSource
> | TimeSeriesAggregatorSink because Waiting for queue element was
> interrupted! null
> 02:00:07.540 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
> Heartbeat from host:host.local.domain
> logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11
> 01:58:57 EDT 2011)
> 02:00:07.541 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
> acks waiting for master: {}
> 02:00:12.542 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
> Heartbeat from host:host.local.domain
> logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11
> 01:58:57 EDT 2011)
> 02:00:12.542 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
> acks waiting for master: {}
> 02:00:17.545 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
> Heartbeat from host:host.local.domain
> logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11
> 01:58:57 EDT 2011)
> 02:00:17.546 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
> acks waiting for master: {}
> 02:00:22.549 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
> Heartbeat from host:host.local.domain
> logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11
> 01:58:57 EDT 2011)
> 02:00:22.549 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
> acks waiting for master: {}
> 02:00:27.552 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
> Heartbeat from host:host.local.domain
> logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11
> 01:58:57 EDT 2011)
> 02:00:27.553 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
> acks waiting for master: {}
> ---------------------------
>



-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com