Posted to dev@flume.apache.org by "Will McQueen (Created) (JIRA)" <ji...@apache.org> on 2012/03/29 03:29:29 UTC
[jira] [Created] (FLUME-1079) Flume agent reconfiguration enters permanent bad state
Flume agent reconfiguration enters permanent bad state
------------------------------------------------------
Key: FLUME-1079
URL: https://issues.apache.org/jira/browse/FLUME-1079
Project: Flume
Issue Type: Bug
Components: Node
Affects Versions: v1.2.0
Reporter: Will McQueen
Fix For: v1.2.0
Steps:
1) Start with this config in a1.properties:
# a = agent
# r = source
# c = channel
# k = sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1
# ===SOURCES===
a1.sources.r1.type = NETCAT
a1.sources.r1.channels = c1
a1.sources.r1.bind = localhost
a1.sources.r1.port = 1473
# ===CHANNELS===
a1.channels.c1.type = MEMORY
# ===SINKS===
a1.sinks.k1.type = NULL
a1.sinks.k1.channel = c1
2) Run the flume node:
bin/flume-ng node --conf conf --conf-file conf/a1.properties --name a1
3) Update the a1.properties file to add a new source on the same port, which causes a port bind exception on r2 because r1 is already using port 1473:
# a = agent
# r = source
# c = channel
# k = sink
a1.sources = r1 r2
a1.channels = c1
a1.sinks = k1
# ===SOURCES===
a1.sources.r1.type = NETCAT
a1.sources.r1.channels = c1
a1.sources.r1.bind = localhost
a1.sources.r1.port = 1473
a1.sources.r2.type = AVRO
a1.sources.r2.channels = c1
a1.sources.r2.bind = localhost
a1.sources.r2.port = 1473
# ===CHANNELS===
a1.channels.c1.type = MEMORY
# ===SINKS===
a1.sinks.k1.type = NULL
a1.sinks.k1.channel = c1
...and updating the properties file to the above config results in the following (after waiting up to 30 seconds for the reconfiguration to be noticed):
2012-03-28 18:11:24,027 (conf-file-poller-0) [ERROR - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:205)] Failed to load configuration data. Exception follows.
java.lang.NullPointerException
at org.apache.flume.source.AvroSource.stop(AvroSource.java:137)
at org.apache.flume.source.EventDrivenSourceRunner.stop(EventDrivenSourceRunner.java:45)
at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:155)
at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:66)
at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
4) Now correct the config by changing r2's port to 1474:
# a = agent
# r = source
# c = channel
# k = sink
a1.sources = r1 r2
a1.channels = c1
a1.sinks = k1
# ===SOURCES===
a1.sources.r1.type = NETCAT
a1.sources.r1.channels = c1
a1.sources.r1.bind = localhost
a1.sources.r1.port = 1473
a1.sources.r2.type = AVRO
a1.sources.r2.channels = c1
a1.sources.r2.bind = localhost
a1.sources.r2.port = 1474
# ===CHANNELS===
a1.channels.c1.type = MEMORY
# ===SINKS===
a1.sinks.k1.type = NULL
a1.sinks.k1.channel = c1
...but this results in an illegal state:
java.lang.IllegalStateException: Unaware of SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@5090d8ea counterGroup:{ name:null counters:{runner.backoffs.consecutive=5, runner.backoffs=5, runner.interruptions=1} } } - can not unsupervise
at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:145)
at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:61)
at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
...which tells me that we've entered a permanent bad state that would require restarting the agent.
5) Start the avro-client. Had there been no errors in the previous steps, we would expect the avro-client to connect to the agent, but the connection is refused:
bin/flume-ng avro-client --cnf --host localhost --port 1474 --filename /home/will/bigdata.txt
2012-03-28 18:27:35,650 (main) [ERROR - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:72)] Unable to open connection to Flume. Exception follows.
org.apache.flume.FlumeException: RPC connection error. Exception follows.
at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:114)
at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:96)
at org.apache.flume.api.NettyAvroRpcClient.access$100(NettyAvroRpcClient.java:50)
at org.apache.flume.api.NettyAvroRpcClient$Builder.build(NettyAvroRpcClient.java:389)
at org.apache.flume.api.RpcClientFactory.getInstance(RpcClientFactory.java:45)
at org.apache.flume.client.avro.AvroCLIClient.run(AvroCLIClient.java:120)
at org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:64)
Caused by: java.io.IOException: Error connecting to localhost/127.0.0.1:1474
at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:250)
at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:199)
at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:148)
at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:116)
at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:107)
... 6 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
2012-03-28 18:27:35,683 (main) [DEBUG - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:77)] Exiting
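To check, independently of the avro-client, whether anything is listening on the two ports from the configs above, a plain TCP probe should be enough. The following is only a minimal sketch: the class name PortProbe, the method isListening and the 1000 ms timeout are made up for illustration and are not part of Flume.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper, not part of Flume: checks whether a TCP port accepts connections. */
final class PortProbe {

    /** Returns true if a TCP connection to host:port is accepted within timeoutMillis. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers java.net.ConnectException ("Connection refused") and timeouts.
            return false;
        }
    }

    public static void main(String[] args) {
        // 1473 is r1 (NETCAT) and 1474 is r2 (AVRO) in the corrected config.
        for (int port : new int[] {1473, 1474}) {
            System.out.println("localhost:" + port + " listening: "
                    + isListening("localhost", port, 1000));
        }
    }
}
```

If the agent is in the state described in step 4, one would expect the probe of 1474 to report false, matching the "Connection refused" seen by the avro-client.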
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FLUME-1079) Flume agent reconfiguration enters permanent bad state
Posted by "Arvind Prabhakar (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240962#comment-13240962 ]
Arvind Prabhakar commented on FLUME-1079:
-----------------------------------------
@Hari - I think this is a serious issue because there is no way to recover from it without restarting the process. A single mistake during reconfiguration can put the agent into this bad state, and fixing it then requires a complete shutdown.
@Will - does this interpretation match what you have observed?
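Separately from fixing the agent, the particular mistake in step 3 (two sources sharing a bind address and port) could be caught before the properties file is saved. The sketch below is only an illustration under stated assumptions: PortConflictCheck and findConflicts are hypothetical names, not Flume APIs, and the check only compares the literal bind and port values of the sources listed under <agent>.sources. It would not notice that "localhost" and "127.0.0.1" overlap, nor that a port is held by another process.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

/** Hypothetical pre-flight check, not part of Flume. */
final class PortConflictCheck {

    /** Returns one message per source whose literal bind:port pair is already used by another source. */
    static List<String> findConflicts(Properties props, String agent) {
        List<String> conflicts = new ArrayList<>();
        Map<String, String> ownerByEndpoint = new HashMap<>();
        String names = props.getProperty(agent + ".sources", "").trim();
        if (names.isEmpty()) {
            return conflicts;
        }
        for (String name : names.split("\\s+")) {
            String prefix = agent + ".sources." + name + ".";
            String port = props.getProperty(prefix + "port");
            if (port == null) {
                continue; // this source does not declare a port
            }
            String bind = props.getProperty(prefix + "bind", "").trim();
            String endpoint = bind + ":" + port.trim();
            String previous = ownerByEndpoint.putIfAbsent(endpoint, name);
            if (previous != null) {
                conflicts.add(name + " and " + previous + " both use " + endpoint);
            }
        }
        return conflicts;
    }

    public static void main(String[] args) throws IOException {
        // The source section of the step 3 config.
        String config = String.join("\n",
                "a1.sources = r1 r2",
                "a1.sources.r1.type = NETCAT",
                "a1.sources.r1.bind = localhost",
                "a1.sources.r1.port = 1473",
                "a1.sources.r2.type = AVRO",
                "a1.sources.r2.bind = localhost",
                "a1.sources.r2.port = 1473");
        Properties props = new Properties();
        props.load(new StringReader(config));
        // Prints: r2 and r1 both use localhost:1473
        findConflicts(props, "a1").forEach(System.out::println);
    }
}
```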
> Flume agent reconfiguration enters permanent bad state
> ------------------------------------------------------
>
> Key: FLUME-1079
> URL: https://issues.apache.org/jira/browse/FLUME-1079
> Project: Flume
> Issue Type: Bug
> Components: Node
> Affects Versions: v1.2.0
> Environment: CentOS 6.2 64-bit
> JDK 1.6.0_26 64-bit
> Reporter: Will McQueen
> Priority: Minor
> Fix For: v1.2.0
>
>
> Using flume trunk, commit ad24cb31bb1b5a0d1ee4b0ec18572a223ed9d397
[jira] [Commented] (FLUME-1079) Flume agent reconfiguration enters permanent bad state
Posted by "Hari Shreedharan (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240904#comment-13240904 ]
Hari Shreedharan commented on FLUME-1079:
-----------------------------------------
Here is an analysis:
When the config changes, AbstractFileConfigurationProvider$FileWatcherRunnable calls doLoad, which in turn calls load, which tries to stop the previous threads.
Since the old components never loaded correctly, the call to stop throws a NullPointerException:
2012-03-28 17:52:36,503 (conf-file-poller-0) [ERROR - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:205)] Failed to load configuration data. Exception follows.
java.lang.NullPointerException
at org.apache.flume.source.AvroSource.stop(AvroSource.java:150)
at org.apache.flume.source.EventDrivenSourceRunner.stop(EventDrivenSourceRunner.java:45)
That causes the load function to exit without ever loading the new configuration.
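The failure mode in this analysis can be reproduced in miniature. The sketch below is not Flume code and not the patch attached to this issue: Component, HalfStartedSource, NoOpSink, ToySupervisor and ReconfigSketch are all hypothetical stand-ins, and the ordering (sinks torn down before sources) is an inference from the line numbers in the two stack traces of the report. It shows how a stop() that assumes a successful start() aborts a teardown loop halfway, how a second attempt then trips over the supervisor's "unaware" check, and one possible defensive alternative that catches failures per component.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a lifecycle component; not the real Flume interface. */
interface Component {
    void start();
    void stop();
}

/** Source whose start() fails, so the field that stop() relies on is never assigned. */
final class HalfStartedSource implements Component {
    private Runnable server; // stays null because start() never gets far enough to set it

    @Override public void start() {
        throw new IllegalStateException("bind failed: port already in use");
    }

    @Override public void stop() {
        server.run(); // NullPointerException: nothing was ever started
    }
}

final class NoOpSink implements Component {
    @Override public void start() { }
    @Override public void stop() { }
}

/** Toy supervisor that refuses to unsupervise a component it does not know about. */
final class ToySupervisor {
    private final Set<Component> supervised = new HashSet<>();

    void supervise(Component c) {
        supervised.add(c);
        try {
            c.start();
        } catch (RuntimeException e) {
            // Assumption of this toy: a failed start leaves the component supervised.
        }
    }

    void unsupervise(Component c) {
        if (!supervised.remove(c)) {
            throw new IllegalStateException("Unaware of " + c.getClass().getSimpleName());
        }
        c.stop();
    }
}

final class ReconfigSketch {
    /** Fragile teardown: the first exception aborts the loop and leaves partial state behind. */
    static void fragileTeardown(ToySupervisor s, List<Component> old) {
        for (Component c : old) {
            s.unsupervise(c);
        }
    }

    /** Defensive teardown: collect each failure and keep going. */
    static List<RuntimeException> defensiveTeardown(ToySupervisor s, List<Component> old) {
        List<RuntimeException> failures = new ArrayList<>();
        for (Component c : old) {
            try {
                s.unsupervise(c);
            } catch (RuntimeException e) {
                failures.add(e);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        ToySupervisor s = new ToySupervisor();
        List<Component> old = Arrays.asList(new NoOpSink(), new HalfStartedSource());
        old.forEach(s::supervise);
        for (String attempt : new String[] {"first", "second"}) {
            try {
                fragileTeardown(s, old);
            } catch (RuntimeException e) {
                System.out.println(attempt + " attempt: " + e.getClass().getSimpleName());
            }
        }
    }
}
```

Running main prints "first attempt: NullPointerException" and then "second attempt: IllegalStateException", the same sequence of exception types as in steps 3 and 4 of the report. With defensiveTeardown the loop would reach every component and return the failures for logging instead.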
[jira] [Commented] (FLUME-1079) Flume agent reconfiguration enters permanent bad state
Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242096#comment-13242096 ]
Hudson commented on FLUME-1079:
-------------------------------
Integrated in flume-trunk #150 (See [https://builds.apache.org/job/flume-trunk/150/])
FLUME-1079. Flume agent reconfiguration enters permanent bad state.
(Hari Shreedharan via Arvind Prabhakar) (Revision 1307278)
Result = SUCCESS
arvind : http://svn.apache.org/viewvc/?view=rev&rev=1307278
Files :
* /incubator/flume/trunk/flume-ng-node/src/main/java/org/apache/flume/node/nodemanager/DefaultLogicalNodeManager.java
> Flume agent reconfiguration enters permanent bad state
> ------------------------------------------------------
>
> Key: FLUME-1079
> URL: https://issues.apache.org/jira/browse/FLUME-1079
> Project: Flume
> Issue Type: Bug
> Components: Node
> Affects Versions: v1.2.0
> Environment: CentOS 6.2 64-bit
> JDK 1.6.0_26 64-bit
> Reporter: Will McQueen
> Assignee: Hari Shreedharan
> Priority: Minor
> Fix For: v1.2.0
>
> Attachments: FLUME-1079-1.patch, FLUME-1079-2.patch
>
>
> Using flume trunk, commit ad24cb31bb1b5a0d1ee4b0ec18572a223ed9d397
> Steps:
> 1) Start with this config in a1.properties:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> 2) Run the flume node:
> bin/flume-ng node --conf conf --conf-file conf/a1.properties --name a1
> 3) Update the a1.properties file to add a new source on the same port, which causes a port bind exception on r2 because r1 is already using port 1473:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1 r2
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> a1.sources.r2.type = AVRO
> a1.sources.r2.channels = c1
> a1.sources.r2.bind = localhost
> a1.sources.r2.port = 1473
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> ...and updating the props file to the above config results in (after waiting a max of 30 secs for the reconfig to be noticed):
> 2012-03-28 18:11:24,027 (conf-file-poller-0) [ERROR - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:205)] Failed to load configuration data. Exception follows.
> java.lang.NullPointerException
> at org.apache.flume.source.AvroSource.stop(AvroSource.java:137)
> at org.apache.flume.source.EventDrivenSourceRunner.stop(EventDrivenSourceRunner.java:45)
> at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:155)
> at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:66)
> at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 4) Now correct the config by changing r2's port to 1474:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1 r2
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> a1.sources.r2.type = AVRO
> a1.sources.r2.channels = c1
> a1.sources.r2.bind = localhost
> a1.sources.r2.port = 1474
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> ...but this results in an illegal state:
> java.lang.IllegalStateException: Unaware of SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@5090d8ea counterGroup:{ name:null counters:{runner.backoffs.consecutive=5, runner.backoffs=5, runner.interruptions=1} } } - can not unsupervise
> at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
> at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:145)
> at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:61)
> at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> ...which tells me that we've entered a permanent bad state that would require restarting the agent.
> 5) Start the avro-client. We expect the avro-client to connect to the agent (had there been no errors in the previous steps), but the connection is refused:
> bin/flume-ng avro-client --cnf --host localhost --port 1474 --filename /home/will/bigdata.txt
> 2012-03-28 18:27:35,650 (main) [ERROR - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:72)] Unable to open connection to Flume. Exception follows.
> org.apache.flume.FlumeException: RPC connection error. Exception follows.
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:114)
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:96)
> at org.apache.flume.api.NettyAvroRpcClient.access$100(NettyAvroRpcClient.java:50)
> at org.apache.flume.api.NettyAvroRpcClient$Builder.build(NettyAvroRpcClient.java:389)
> at org.apache.flume.api.RpcClientFactory.getInstance(RpcClientFactory.java:45)
> at org.apache.flume.client.avro.AvroCLIClient.run(AvroCLIClient.java:120)
> at org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:64)
> Caused by: java.io.IOException: Error connecting to localhost/127.0.0.1:1474
> at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:250)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:199)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:148)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:116)
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:107)
> ... 6 more
> Caused by: java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2012-03-28 18:27:35,683 (main) [DEBUG - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:77)] Exiting
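The port clash introduced in step 3 of the quoted report could, in principle, be spotted before the agent picks up the edited file. The following is a minimal sketch only: PortConflictCheck and findConflicts are hypothetical names that are not part of Flume, and the check assumes that every network source declares its endpoint through the bind and port keys used in the report. It needs Java 8 or later.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical pre-flight check, not part of Flume.
public class PortConflictCheck {

    /** Returns one message per bind:port endpoint claimed by more than one source of the agent. */
    static List<String> findConflicts(String agent, String config) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(config));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        List<String> conflicts = new ArrayList<>();
        String declared = props.getProperty(agent + ".sources", "").trim();
        if (declared.isEmpty()) {
            return conflicts;
        }

        // Sorted map keeps the report order stable: "bind:port" -> sources using it.
        Map<String, List<String>> byEndpoint = new TreeMap<>();
        for (String source : declared.split("\\s+")) {
            String prefix = agent + ".sources." + source + ".";
            String bind = props.getProperty(prefix + "bind");
            String port = props.getProperty(prefix + "port");
            if (bind == null || port == null) {
                continue; // source does not declare a network endpoint
            }
            String endpoint = bind.trim() + ":" + port.trim();
            byEndpoint.computeIfAbsent(endpoint, k -> new ArrayList<>()).add(source);
        }

        for (Map.Entry<String, List<String>> e : byEndpoint.entrySet()) {
            if (e.getValue().size() > 1) {
                conflicts.add(e.getKey() + " is claimed by " + e.getValue());
            }
        }
        return conflicts;
    }

    /** Builds the two-source config from the report with the given port for r2. */
    static String config(String r2Port) {
        return String.join("\n",
            "a1.sources = r1 r2",
            "a1.channels = c1",
            "a1.sinks = k1",
            "# ===SOURCES===",
            "a1.sources.r1.type = NETCAT",
            "a1.sources.r1.bind = localhost",
            "a1.sources.r1.port = 1473",
            "a1.sources.r2.type = AVRO",
            "a1.sources.r2.bind = localhost",
            "a1.sources.r2.port = " + r2Port);
    }

    public static void main(String[] args) {
        // Step 3 config: prints [localhost:1473 is claimed by [r1, r2]]
        System.out.println(findConflicts("a1", config("1473")));
        // Step 4 config: prints []
        System.out.println(findConflicts("a1", config("1474")));
    }
}
```

A check like this only covers clashes inside one file. A port held by another process would still surface as a bind failure at start time, so the agent has to recover from that case regardless.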
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FLUME-1079) Flume agent reconfiguration enters
permanent bad state
Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13241931#comment-13241931 ]
jiraposter@reviews.apache.org commented on FLUME-1079:
------------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4551/#review6552
-----------------------------------------------------------
Changes look good Hari. One comment below:
flume-ng-node/src/main/java/org/apache/flume/node/nodemanager/DefaultLogicalNodeManager.java
<https://reviews.apache.org/r/4551/#comment14229>
We should have the same logic here as well; otherwise, if one component fails to start, the others will not be attempted.
flume-ng-node/src/main/java/org/apache/flume/node/nodemanager/DefaultLogicalNodeManager.java
<https://reviews.apache.org/r/4551/#comment14230>
Same here as well.
- Arvind
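The point raised in the review comments above, that one failing component should not prevent the others from being attempted, can be illustrated with a small standalone sketch. It is not the committed patch and does not use Flume's classes: IsolatedLifecycle, Component, startAll and stopAll are hypothetical names, and the simulated failures only loosely mirror the bind error and the NullPointerException from the report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not Flume code.
public class IsolatedLifecycle {

    /** Stand-in for a source, channel or sink runner. */
    interface Component {
        String name();
        void start();
        void stop();
    }

    /** Starts every component; a failure is recorded and the loop continues. */
    static List<String> startAll(List<Component> components) {
        List<String> failed = new ArrayList<>();
        for (Component c : components) {
            try {
                c.start();
            } catch (RuntimeException e) {
                failed.add(c.name());
            }
        }
        return failed;
    }

    /** Stops every component; a failure is recorded and the loop continues. */
    static List<String> stopAll(List<Component> components) {
        List<String> failed = new ArrayList<>();
        for (Component c : components) {
            try {
                c.stop();
            } catch (RuntimeException e) {
                failed.add(c.name());
            }
        }
        return failed;
    }

    /** Creates a component that logs its transitions, or throws if marked broken. */
    static Component component(final String name, final boolean broken, final List<String> log) {
        return new Component() {
            public String name() { return name; }
            public void start() {
                if (broken) throw new IllegalStateException(name + ": address already in use");
                log.add("started " + name);
            }
            public void stop() {
                if (broken) throw new NullPointerException(name + ": was never started");
                log.add("stopped " + name);
            }
        };
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<Component> components = Arrays.asList(
            component("c1", false, log),
            component("r2", true, log),   // simulated broken source
            component("k1", false, log));

        System.out.println("failed to start: " + startAll(components));
        System.out.println("failed to stop:  " + stopAll(components));
        System.out.println(log);
    }
}
```

With a try/catch per component, the broken r2 is reported while c1 and k1 are still started and stopped, so a later reconfiguration does not find the bookkeeping half-updated.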
On 2012-03-29 07:25:45, Hari Shreedharan wrote:
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4551/
> -----------------------------------------------------------
>
> (Updated 2012-03-29 07:25:45)
>
> Review request for Flume.
>
> Summary
> -------
> Fixing a bug that causes a bad configuration to never allow reconfiguration.
>
> This addresses bug FLUME-1079.
> https://issues.apache.org/jira/browse/FLUME-1079
>
> Diffs
> -----
> flume-ng-node/src/main/java/org/apache/flume/node/nodemanager/DefaultLogicalNodeManager.java 2c0cff6
>
> Diff: https://reviews.apache.org/r/4551/diff
>
> Testing
> -------
> Verified using the conf that produced the error. Works ok now.
>
> Thanks,
> Hari
>
[jira] [Commented] (FLUME-1079) Flume agent reconfiguration enters
permanent bad state
Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13241047#comment-13241047 ]
jiraposter@reviews.apache.org commented on FLUME-1079:
------------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4551/
-----------------------------------------------------------
Review request for Flume.
Summary
-------
Fixing a bug that causes a bad configuration to never allow reconfiguration.
This addresses bug FLUME-1079.
https://issues.apache.org/jira/browse/FLUME-1079
Diffs
-----
flume-ng-node/src/main/java/org/apache/flume/node/nodemanager/DefaultLogicalNodeManager.java 2c0cff6
Diff: https://reviews.apache.org/r/4551/diff
Testing
-------
Verified using the conf that produced the error. Works ok now.
Thanks,
Hari
[jira] [Updated] (FLUME-1079) Flume agent reconfiguration enters
permanent bad state
Posted by "Arvind Prabhakar (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arvind Prabhakar updated FLUME-1079:
------------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
Patch committed. Thanks Hari!
> Flume agent reconfiguration enters permanent bad state
> ------------------------------------------------------
>
> Key: FLUME-1079
> URL: https://issues.apache.org/jira/browse/FLUME-1079
> Project: Flume
> Issue Type: Bug
> Components: Node
> Affects Versions: v1.2.0
> Environment: CentOS 6.2 64-bit
> JDK 1.6.0_26 64-bit
> Reporter: Will McQueen
> Assignee: Hari Shreedharan
> Priority: Minor
> Fix For: v1.2.0
>
> Attachments: FLUME-1079-1.patch, FLUME-1079-2.patch
>
>
> Using flume trunk, commit ad24cb31bb1b5a0d1ee4b0ec18572a223ed9d397
> Steps:
> 1) Start with this config in a1.properties:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> 2) Run the flume node:
> bin/flume-ng node --conf conf --conf-file conf/a1.properties --name a1
> 3) Update the a1.properties file to add a new source at the same port, which would cause a port bind exception on r2 due to r1 already using port 1473:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1 r2
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> a1.sources.r2.type = AVRO
> a1.sources.r2.channels = c1
> a1.sources.r2.bind = localhost
> a1.sources.r2.port = 1473
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> ...and updating the props file to the above config results in (after waiting a max of 30 secs for the reconfig to be noticed):
> 2012-03-28 18:11:24,027 (conf-file-poller-0) [ERROR - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:205)] Failed to load configuration data. Exception follows.
> java.lang.NullPointerException
> at org.apache.flume.source.AvroSource.stop(AvroSource.java:137)
> at org.apache.flume.source.EventDrivenSourceRunner.stop(EventDrivenSourceRunner.java:45)
> at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:155)
> at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:66)
> at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 4) Now correct the config by changing r2's port to 1474:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1 r2
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> a1.sources.r2.type = AVRO
> a1.sources.r2.channels = c1
> a1.sources.r2.bind = localhost
> a1.sources.r2.port = 1474
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> ...but this results in an illegal state:
> java.lang.IllegalStateException: Unaware of SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@5090d8ea counterGroup:{ name:null counters:{runner.backoffs.consecutive=5, runner.backoffs=5, runner.interruptions=1} } } - can not unsupervise
> at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
> at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:145)
> at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:61)
> at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> ...which tells me that we've entered a permanent bad state that would require restarting the agent.
> 5) Start the avro-client. We expect the avro-client to connect to the agent (had there been no errors in the previous steps), but the connection is refused:
> bin/flume-ng avro-client --cnf --host localhost --port 1474 --filename /home/will/bigdata.txt
> 2012-03-28 18:27:35,650 (main) [ERROR - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:72)] Unable to open connection to Flume. Exception follows.
> org.apache.flume.FlumeException: RPC connection error. Exception follows.
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:114)
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:96)
> at org.apache.flume.api.NettyAvroRpcClient.access$100(NettyAvroRpcClient.java:50)
> at org.apache.flume.api.NettyAvroRpcClient$Builder.build(NettyAvroRpcClient.java:389)
> at org.apache.flume.api.RpcClientFactory.getInstance(RpcClientFactory.java:45)
> at org.apache.flume.client.avro.AvroCLIClient.run(AvroCLIClient.java:120)
> at org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:64)
> Caused by: java.io.IOException: Error connecting to localhost/127.0.0.1:1474
> at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:250)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:199)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:148)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:116)
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:107)
> ... 6 more
> Caused by: java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2012-03-28 18:27:35,683 (main) [DEBUG - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:77)] Exiting
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (FLUME-1079) Flume agent reconfiguration enters
permanent bad state
Posted by "Hari Shreedharan (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hari Shreedharan updated FLUME-1079:
------------------------------------
Status: Patch Available (was: Open)
> Flume agent reconfiguration enters permanent bad state
> ------------------------------------------------------
>
> Key: FLUME-1079
> URL: https://issues.apache.org/jira/browse/FLUME-1079
> Project: Flume
> Issue Type: Bug
> Components: Node
> Affects Versions: v1.2.0
> Environment: CentOS 6.2 64-bit
> JDK 1.6.0_26 64-bit
> Reporter: Will McQueen
> Assignee: Hari Shreedharan
> Priority: Minor
> Fix For: v1.2.0
>
> Attachments: FLUME-1079-1.patch
>
>
> Using flume trunk, commit ad24cb31bb1b5a0d1ee4b0ec18572a223ed9d397
> Steps:
> 1) Start with this config in a1.properties:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> 2) Run the flume node:
> bin/flume-ng node --conf conf --conf-file conf/a1.properties --name a1
> 3) Update the a1.properties file to add a new source at the same port, which would cause a port bind exception on r2 due to r1 already using port 1473:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1 r2
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> a1.sources.r2.type = AVRO
> a1.sources.r2.channels = c1
> a1.sources.r2.bind = localhost
> a1.sources.r2.port = 1473
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> ...and updating the props file to the above config results in (after waiting a max of 30 secs for the reconfig to be noticed):
> 2012-03-28 18:11:24,027 (conf-file-poller-0) [ERROR - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:205)] Failed to load configuration data. Exception follows.
> java.lang.NullPointerException
> at org.apache.flume.source.AvroSource.stop(AvroSource.java:137)
> at org.apache.flume.source.EventDrivenSourceRunner.stop(EventDrivenSourceRunner.java:45)
> at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:155)
> at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:66)
> at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 4) Now correct the config by changing r2's port to 1474:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1 r2
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> a1.sources.r2.type = AVRO
> a1.sources.r2.channels = c1
> a1.sources.r2.bind = localhost
> a1.sources.r2.port = 1474
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> ...but this results in an illegal state:
> java.lang.IllegalStateException: Unaware of SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@5090d8ea counterGroup:{ name:null counters:{runner.backoffs.consecutive=5, runner.backoffs=5, runner.interruptions=1} } } - can not unsupervise
> at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
> at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:145)
> at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:61)
> at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> ...which tells me that we've entered a permanent bad state that would require restarting the agent.
> 5) Start the avro-client. We expect the avro-client to connect to the agent (had there been no errors in the previous steps), but the connection is refused:
> bin/flume-ng avro-client --cnf --host localhost --port 1474 --filename /home/will/bigdata.txt
> 2012-03-28 18:27:35,650 (main) [ERROR - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:72)] Unable to open connection to Flume. Exception follows.
> org.apache.flume.FlumeException: RPC connection error. Exception follows.
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:114)
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:96)
> at org.apache.flume.api.NettyAvroRpcClient.access$100(NettyAvroRpcClient.java:50)
> at org.apache.flume.api.NettyAvroRpcClient$Builder.build(NettyAvroRpcClient.java:389)
> at org.apache.flume.api.RpcClientFactory.getInstance(RpcClientFactory.java:45)
> at org.apache.flume.client.avro.AvroCLIClient.run(AvroCLIClient.java:120)
> at org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:64)
> Caused by: java.io.IOException: Error connecting to localhost/127.0.0.1:1474
> at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:250)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:199)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:148)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:116)
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:107)
> ... 6 more
> Caused by: java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2012-03-28 18:27:35,683 (main) [DEBUG - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:77)] Exiting
[jira] [Commented] (FLUME-1079) Flume agent reconfiguration enters
permanent bad state
Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13241945#comment-13241945 ]
jiraposter@reviews.apache.org commented on FLUME-1079:
------------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4551/
-----------------------------------------------------------
(Updated 2012-03-30 00:06:09.676444)
Review request for Flume.
Changes
-------
Adding checks for failed component starts.
Summary
-------
Fixing a bug that causes a bad configuration to never allow reconfiguration.
This addresses bug FLUME-1079.
https://issues.apache.org/jira/browse/FLUME-1079
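The diff itself is not reproduced in this message, so what follows is only a minimal Java sketch of the general idea behind "checks for failed component starts", not the contents of FLUME-1079-1.patch. `Component`, `FakeComponent` and `GuardedManager` are hypothetical names invented for illustration; none of them is a Flume type. The manager records only the components whose start() succeeded, so a component that failed to start is never stopped later and cannot block the next reconfiguration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a lifecycle-aware component (not Flume's interface).
interface Component {
    void start() throws Exception;
    void stop();
}

// Hypothetical test double: optionally fails on start, like a source that cannot bind.
class FakeComponent implements Component {
    private final boolean failOnStart;

    FakeComponent(boolean failOnStart) {
        this.failOnStart = failOnStart;
    }

    public void start() throws Exception {
        if (failOnStart) {
            throw new IOException("simulated bind failure");
        }
    }

    public void stop() {
        // Nothing to release in this sketch.
    }
}

// Hypothetical manager: tracks only the components that actually started.
class GuardedManager {
    private final Map<String, Component> running = new LinkedHashMap<>();

    // Stops whatever is running, starts the new set, and returns the names that failed to start.
    List<String> reconfigure(Map<String, Component> next) {
        for (Component c : running.values()) {
            try {
                c.stop();
            } catch (RuntimeException e) {
                // One failing stop must not abort the rest of the teardown.
            }
        }
        running.clear();

        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, Component> e : next.entrySet()) {
            try {
                e.getValue().start();
                running.put(e.getKey(), e.getValue());
            } catch (Exception ex) {
                failed.add(e.getKey()); // remembered as failed, never registered as running
            }
        }
        return failed;
    }

    boolean isRunning(String name) {
        return running.containsKey(name);
    }
}
```

With this shape, loading a configuration in which r2 cannot start reports r2 as failed while r1 keeps running, and a later corrected configuration loads normally.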
Diffs (updated)
-----
flume-ng-node/src/main/java/org/apache/flume/node/nodemanager/DefaultLogicalNodeManager.java 2c0cff6
Diff: https://reviews.apache.org/r/4551/diff
Testing
-------
Verified using the conf that produced the error. Works ok now.
Thanks,
Hari
[jira] [Updated] (FLUME-1079) Flume agent reconfiguration enters
permanent bad state
Posted by "Will McQueen (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Will McQueen updated FLUME-1079:
--------------------------------
Description:
Using flume trunk, commit ad24cb31bb1b5a0d1ee4b0ec18572a223ed9d397
Steps:
1) Start with this config in a1.properties:
# a = agent
# r = source
# c = channel
# k = sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1
# ===SOURCES===
a1.sources.r1.type = NETCAT
a1.sources.r1.channels = c1
a1.sources.r1.bind = localhost
a1.sources.r1.port = 1473
# ===CHANNELS===
a1.channels.c1.type = MEMORY
# ===SINKS===
a1.sinks.k1.type = NULL
a1.sinks.k1.channel = c1
2) Run the flume node:
bin/flume-ng node --conf conf --conf-file conf/a1.properties --name a1
3) Update the a1.properties file to add a new source at the same port, which would cause a port bind exception on r2 due to r1 already using port 1473:
# a = agent
# r = source
# c = channel
# k = sink
a1.sources = r1 r2
a1.channels = c1
a1.sinks = k1
# ===SOURCES===
a1.sources.r1.type = NETCAT
a1.sources.r1.channels = c1
a1.sources.r1.bind = localhost
a1.sources.r1.port = 1473
a1.sources.r2.type = AVRO
a1.sources.r2.channels = c1
a1.sources.r2.bind = localhost
a1.sources.r2.port = 1473
# ===CHANNELS===
a1.channels.c1.type = MEMORY
# ===SINKS===
a1.sinks.k1.type = NULL
a1.sinks.k1.channel = c1
...and updating the props file to the above config results in (after waiting a max of 30 secs for the reconfig to be noticed):
2012-03-28 18:11:24,027 (conf-file-poller-0) [ERROR - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:205)] Failed to load configuration data. Exception follows.
java.lang.NullPointerException
at org.apache.flume.source.AvroSource.stop(AvroSource.java:137)
at org.apache.flume.source.EventDrivenSourceRunner.stop(EventDrivenSourceRunner.java:45)
at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:155)
at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:66)
at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
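For illustration only, here is a minimal Java sketch of one way a failed bind can later surface as a NullPointerException in stop(). It is an assumption about the general pattern, not a copy of AvroSource: `FragileSource` and `PortConflictDemo` are hypothetical classes, and an ephemeral port stands in for 1473 so that the sketch does not depend on the agent.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical source, NOT Flume's AvroSource: only the shape of the problem.
class FragileSource {
    private ServerSocket server; // stays null when start() fails to bind

    void start(int port) throws IOException {
        // If the constructor throws, the field is never assigned.
        server = new ServerSocket(port, 50, InetAddress.getLoopbackAddress());
    }

    // Dereferences the field unconditionally, so it throws
    // NullPointerException when start() never succeeded.
    void stopUnsafe() throws IOException {
        server.close();
    }

    // Null-safe variant: a source that never started has nothing to release.
    void stopSafe() throws IOException {
        if (server != null) {
            server.close();
            server = null;
        }
    }
}

class PortConflictDemo {
    // Returns a short trace of what happened.
    static String run() {
        StringBuilder log = new StringBuilder();
        // "first" plays the role of r1, which already owns the port.
        try (ServerSocket first = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            FragileSource r2 = new FragileSource();
            try {
                r2.start(first.getLocalPort()); // same port as "first"
            } catch (IOException e) {
                // Typically java.net.BindException: Address already in use.
                log.append("bind-failed;");
            }
            try {
                r2.stopUnsafe();
            } catch (NullPointerException e) {
                log.append("npe-on-stop;");
            }
            r2.stopSafe(); // completes without throwing
            log.append("safe-stop-ok");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return log.toString();
    }
}
```

Calling `PortConflictDemo.run()` returns `bind-failed;npe-on-stop;safe-stop-ok`: the second bind fails, the unguarded stop throws, and the guarded stop does not.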
4) Now correct the config by changing r2's port to 1474:
# a = agent
# r = source
# c = channel
# k = sink
a1.sources = r1 r2
a1.channels = c1
a1.sinks = k1
# ===SOURCES===
a1.sources.r1.type = NETCAT
a1.sources.r1.channels = c1
a1.sources.r1.bind = localhost
a1.sources.r1.port = 1473
a1.sources.r2.type = AVRO
a1.sources.r2.channels = c1
a1.sources.r2.bind = localhost
a1.sources.r2.port = 1474
# ===CHANNELS===
a1.channels.c1.type = MEMORY
# ===SINKS===
a1.sinks.k1.type = NULL
a1.sinks.k1.channel = c1
...but this results in an illegal state:
java.lang.IllegalStateException: Unaware of SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@5090d8ea counterGroup:{ name:null counters:{runner.backoffs.consecutive=5, runner.backoffs=5, runner.interruptions=1} } } - can not unsupervise
at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:145)
at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:61)
at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
...which tells me that we've entered a permanent bad state that would require restarting the agent.
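One plausible reading of the two traces, offered as an assumption rather than a confirmed diagnosis, is that the first reconfiguration aborted partway through teardown, leaving the node manager's view of the running components out of sync with the supervisor's. The Java sketch below models that with hypothetical `TinySupervisor` and `TinyManager` classes (not Flume's LifecycleSupervisor or DefaultLogicalNodeManager). It also assumes that the supervisor forgets a component before calling its stop().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical supervisor that only tracks component names.
class TinySupervisor {
    private final Set<String> supervised = new HashSet<>();

    void supervise(String name) {
        supervised.add(name);
    }

    void unsupervise(String name, boolean stopThrows) {
        // Assumption: the bookkeeping entry is dropped before stop() runs.
        if (!supervised.remove(name)) {
            throw new IllegalStateException("Unaware of " + name + " - can not unsupervise");
        }
        if (stopThrows) {
            throw new NullPointerException("stop() failed for " + name);
        }
    }
}

// Hypothetical manager: keeps its own list of what it believes is running.
class TinyManager {
    private final TinySupervisor supervisor = new TinySupervisor();
    private List<String> current = new ArrayList<>();

    void apply(List<String> next, Set<String> brokenStops) {
        for (String name : current) {
            supervisor.unsupervise(name, brokenStops.contains(name));
        }
        // Reached only when every unsupervise call succeeded.
        for (String name : next) {
            supervisor.supervise(name);
        }
        current = new ArrayList<>(next);
    }
}

class BadStateDemo {
    // Returns the exception type seen on each of three reload attempts.
    static List<String> run() {
        TinyManager manager = new TinyManager();
        List<String> config = Arrays.asList("k1", "r2");
        manager.apply(config, new HashSet<String>()); // initial load succeeds

        Set<String> broken = new HashSet<>(Arrays.asList("r2"));
        List<String> outcomes = new ArrayList<>();
        for (int attempt = 0; attempt < 3; attempt++) {
            try {
                manager.apply(config, broken);
                outcomes.add("ok");
            } catch (RuntimeException e) {
                outcomes.add(e.getClass().getSimpleName());
            }
        }
        return outcomes;
    }
}
```

`BadStateDemo.run()` returns `[NullPointerException, IllegalStateException, IllegalStateException]`. The first reload dies in r2's stop() after k1 has already been forgotten, and every later reload then fails on k1 because the manager still lists it while the supervisor does not. In this model only a restart clears the mismatch.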
5) Start the avro-client. We expect the avro-client to connect to the agent (had there been no errors in the previous steps), but the connection is refused:
bin/flume-ng avro-client --cnf --host localhost --port 1474 --filename /home/will/bigdata.txt
2012-03-28 18:27:35,650 (main) [ERROR - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:72)] Unable to open connection to Flume. Exception follows.
org.apache.flume.FlumeException: RPC connection error. Exception follows.
at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:114)
at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:96)
at org.apache.flume.api.NettyAvroRpcClient.access$100(NettyAvroRpcClient.java:50)
at org.apache.flume.api.NettyAvroRpcClient$Builder.build(NettyAvroRpcClient.java:389)
at org.apache.flume.api.RpcClientFactory.getInstance(RpcClientFactory.java:45)
at org.apache.flume.client.avro.AvroCLIClient.run(AvroCLIClient.java:120)
at org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:64)
Caused by: java.io.IOException: Error connecting to localhost/127.0.0.1:1474
at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:250)
at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:199)
at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:148)
at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:116)
at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:107)
... 6 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
2012-03-28 18:27:35,683 (main) [DEBUG - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:77)] Exiting
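The port conflict that step 3 sets up can be reproduced outside Flume. The following is a minimal sketch, not Flume code: the class and method names are hypothetical, and an OS-assigned free port stands in for 1473. It binds one listening socket and then tries to bind a second one to the same address and port, which the JDK reports as a java.net.BindException.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical demo class; it only illustrates the bind conflict from step 3.
class PortConflictDemo {

    /** Returns true if a second listener on an already-bound port is rejected. */
    static boolean secondBindFails() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket first = new ServerSocket()) {
            first.setReuseAddress(false);
            // Port 0 asks the OS for any free port (stand-in for 1473).
            first.bind(new InetSocketAddress(loopback, 0));
            int port = first.getLocalPort();

            try (ServerSocket second = new ServerSocket()) {
                second.setReuseAddress(false);
                // Same address and port as the first listener.
                second.bind(new InetSocketAddress(loopback, port));
                return false; // bind unexpectedly succeeded
            } catch (BindException expected) {
                return true;  // the port is already in use
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("second bind rejected: " + secondBindFails());
    }
}
```

In the report the two listeners are different source types (NETCAT and AVRO), but the conflict is at the socket level, so the source type should not matter.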
was:
Steps:
1) Start with this config in a1.properties:
# a = agent
# r = source
# c = channel
# k = sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1
# ===SOURCES===
a1.sources.r1.type = NETCAT
a1.sources.r1.channels = c1
a1.sources.r1.bind = localhost
a1.sources.r1.port = 1473
# ===CHANNELS===
a1.channels.c1.type = MEMORY
# ===SINKS===
a1.sinks.k1.type = NULL
a1.sinks.k1.channel = c1
2) Run the flume node:
bin/flume-ng node --conf conf --conf-file conf/a1.properties --name a1
3) Update the a1.properties file to add a new source on the same port, which would cause a port bind exception on r2 because r1 is already using port 1473:
# a = agent
# r = source
# c = channel
# k = sink
a1.sources = r1 r2
a1.channels = c1
a1.sinks = k1
# ===SOURCES===
a1.sources.r1.type = NETCAT
a1.sources.r1.channels = c1
a1.sources.r1.bind = localhost
a1.sources.r1.port = 1473
a1.sources.r2.type = AVRO
a1.sources.r2.channels = c1
a1.sources.r2.bind = localhost
a1.sources.r2.port = 1473
# ===CHANNELS===
a1.channels.c1.type = MEMORY
# ===SINKS===
a1.sinks.k1.type = NULL
a1.sinks.k1.channel = c1
...and updating the props file to the above config results in (after waiting a max of 30 secs for the reconfig to be noticed):
2012-03-28 18:11:24,027 (conf-file-poller-0) [ERROR - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:205)] Failed to load configuration data. Exception follows.
java.lang.NullPointerException
at org.apache.flume.source.AvroSource.stop(AvroSource.java:137)
at org.apache.flume.source.EventDrivenSourceRunner.stop(EventDrivenSourceRunner.java:45)
at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:155)
at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:66)
at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
4) Now correct the config by changing r2's port to 1474:
# a = agent
# r = source
# c = channel
# k = sink
a1.sources = r1 r2
a1.channels = c1
a1.sinks = k1
# ===SOURCES===
a1.sources.r1.type = NETCAT
a1.sources.r1.channels = c1
a1.sources.r1.bind = localhost
a1.sources.r1.port = 1473
a1.sources.r2.type = AVRO
a1.sources.r2.channels = c1
a1.sources.r2.bind = localhost
a1.sources.r2.port = 1474
# ===CHANNELS===
a1.channels.c1.type = MEMORY
# ===SINKS===
a1.sinks.k1.type = NULL
a1.sinks.k1.channel = c1
...but this results in an illegal state:
java.lang.IllegalStateException: Unaware of SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@5090d8ea counterGroup:{ name:null counters:{runner.backoffs.consecutive=5, runner.backoffs=5, runner.interruptions=1} } } - can not unsupervise
at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:145)
at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:61)
at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
...which tells me that we've entered a permanent bad state that would require restarting the agent.
5) Start the avro-client. We would expect the avro-client to connect to the agent (had there been no errors in the previous steps), but the connection is refused:
bin/flume-ng avro-client --cnf --host localhost --port 1474 --filename /home/will/bigdata.txt
2012-03-28 18:27:35,650 (main) [ERROR - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:72)] Unable to open connection to Flume. Exception follows.
org.apache.flume.FlumeException: RPC connection error. Exception follows.
at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:114)
at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:96)
at org.apache.flume.api.NettyAvroRpcClient.access$100(NettyAvroRpcClient.java:50)
at org.apache.flume.api.NettyAvroRpcClient$Builder.build(NettyAvroRpcClient.java:389)
at org.apache.flume.api.RpcClientFactory.getInstance(RpcClientFactory.java:45)
at org.apache.flume.client.avro.AvroCLIClient.run(AvroCLIClient.java:120)
at org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:64)
Caused by: java.io.IOException: Error connecting to localhost/127.0.0.1:1474
at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:250)
at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:199)
at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:148)
at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:116)
at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:107)
... 6 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
2012-03-28 18:27:35,683 (main) [DEBUG - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:77)] Exiting
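One possible reading of the traces in steps 3 and 4 is that a teardown loop was aborted partway through by the NullPointerException, leaving the caller with a stale list of components that the supervisor had already forgotten. The sketch below illustrates that failure pattern only. It is an assumption based on the stack traces, not Flume's actual implementation, and every name in it (StaleSupervisorDemo, Supervisor, Component, tearDown) is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; none of these types are Flume classes.
class StaleSupervisorDemo {

    interface Component {
        void stop();
    }

    /** Tracks components and refuses to unsupervise ones it does not know. */
    static class Supervisor {
        private final Set<Component> supervised = new HashSet<>();

        void supervise(Component c) {
            supervised.add(c);
        }

        void unsupervise(Component c) {
            if (!supervised.remove(c)) {
                throw new IllegalStateException("Unaware of component - can not unsupervise");
            }
            c.stop(); // may throw if the component never started cleanly
        }
    }

    /** Unguarded teardown: the first exception aborts the whole loop. */
    static void tearDown(Supervisor supervisor, List<Component> components) {
        for (Component c : components) {
            supervisor.unsupervise(c);
        }
    }

    /** Runs two reconfiguration attempts and reports how each one ended. */
    static String run() {
        Supervisor supervisor = new Supervisor();
        Component sink = () -> { };
        Component brokenSource = () -> {
            throw new NullPointerException("stop() called on a source that never started");
        };

        // The caller's view of what is running; it is never refreshed after a failure.
        List<Component> current = new ArrayList<>();
        current.add(sink);
        current.add(brokenSource);
        for (Component c : current) {
            supervisor.supervise(c);
        }

        String first = attempt(supervisor, current);  // sink is removed, then the source throws
        String second = attempt(supervisor, current); // sink is no longer known to the supervisor
        return first + "," + second;
    }

    private static String attempt(Supervisor supervisor, List<Component> components) {
        try {
            tearDown(supervisor, components);
            return "ok";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this sketch the first attempt ends in a NullPointerException and every later attempt ends in an IllegalStateException, so nothing short of rebuilding the caller's state (or restarting) recovers. Catching exceptions per component inside the loop, or refreshing the caller's list from the supervisor, would be two ways to avoid that in a design like this one.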
> Flume agent reconfiguration enters permanent bad state
> ------------------------------------------------------
>
> Key: FLUME-1079
> URL: https://issues.apache.org/jira/browse/FLUME-1079
> Project: Flume
> Issue Type: Bug
> Components: Node
> Affects Versions: v1.2.0
> Reporter: Will McQueen
> Fix For: v1.2.0
>
>
> Using flume trunk, commit ad24cb31bb1b5a0d1ee4b0ec18572a223ed9d397
> Steps:
> 1) Start with this config in a1.properties:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> 2) Run the flume node:
> bin/flume-ng node --conf conf --conf-file conf/a1.properties --name a1
> 3) Update the a1.properties file to add a new source on the same port, which would cause a port bind exception on r2 because r1 is already using port 1473:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1 r2
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> a1.sources.r2.type = AVRO
> a1.sources.r2.channels = c1
> a1.sources.r2.bind = localhost
> a1.sources.r2.port = 1473
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> ...and updating the props file to the above config results in (after waiting a max of 30 secs for the reconfig to be noticed):
> 2012-03-28 18:11:24,027 (conf-file-poller-0) [ERROR - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:205)] Failed to load configuration data. Exception follows.
> java.lang.NullPointerException
> at org.apache.flume.source.AvroSource.stop(AvroSource.java:137)
> at org.apache.flume.source.EventDrivenSourceRunner.stop(EventDrivenSourceRunner.java:45)
> at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:155)
> at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:66)
> at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 4) Now correct the config by changing r2's port to 1474:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1 r2
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> a1.sources.r2.type = AVRO
> a1.sources.r2.channels = c1
> a1.sources.r2.bind = localhost
> a1.sources.r2.port = 1474
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> ...but this results in an illegal state:
> java.lang.IllegalStateException: Unaware of SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@5090d8ea counterGroup:{ name:null counters:{runner.backoffs.consecutive=5, runner.backoffs=5, runner.interruptions=1} } } - can not unsupervise
> at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
> at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:145)
> at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:61)
> at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> ...which tells me that we've entered a permanent bad state that would require restarting the agent.
> 5) Start the avro-client. We would expect the avro-client to connect to the agent (had there been no errors in the previous steps), but the connection is refused:
> bin/flume-ng avro-client --cnf --host localhost --port 1474 --filename /home/will/bigdata.txt
> 2012-03-28 18:27:35,650 (main) [ERROR - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:72)] Unable to open connection to Flume. Exception follows.
> org.apache.flume.FlumeException: RPC connection error. Exception follows.
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:114)
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:96)
> at org.apache.flume.api.NettyAvroRpcClient.access$100(NettyAvroRpcClient.java:50)
> at org.apache.flume.api.NettyAvroRpcClient$Builder.build(NettyAvroRpcClient.java:389)
> at org.apache.flume.api.RpcClientFactory.getInstance(RpcClientFactory.java:45)
> at org.apache.flume.client.avro.AvroCLIClient.run(AvroCLIClient.java:120)
> at org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:64)
> Caused by: java.io.IOException: Error connecting to localhost/127.0.0.1:1474
> at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:250)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:199)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:148)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:116)
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:107)
> ... 6 more
> Caused by: java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2012-03-28 18:27:35,683 (main) [DEBUG - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:77)] Exiting
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (FLUME-1079) Flume agent reconfiguration enters
permanent bad state
Posted by "Hari Shreedharan (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hari Shreedharan updated FLUME-1079:
------------------------------------
Attachment: FLUME-1079-1.patch
> Flume agent reconfiguration enters permanent bad state
> ------------------------------------------------------
>
> Key: FLUME-1079
> URL: https://issues.apache.org/jira/browse/FLUME-1079
> Project: Flume
> Issue Type: Bug
> Components: Node
> Affects Versions: v1.2.0
> Environment: CentOS 6.2 64-bit
> JDK 1.6.0_26 64-bit
> Reporter: Will McQueen
> Assignee: Hari Shreedharan
> Priority: Minor
> Fix For: v1.2.0
>
> Attachments: FLUME-1079-1.patch
>
>
[jira] [Updated] (FLUME-1079) Flume agent reconfiguration enters
permanent bad state
Posted by "Will McQueen (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Will McQueen updated FLUME-1079:
--------------------------------
Environment:
CentOS 6.2 64-bit
JDK 1.6.0_26 64-bit
> Flume agent reconfiguration enters permanent bad state
> ------------------------------------------------------
>
> Key: FLUME-1079
> URL: https://issues.apache.org/jira/browse/FLUME-1079
> Project: Flume
> Issue Type: Bug
> Components: Node
> Affects Versions: v1.2.0
> Environment: CentOS 6.2 64-bit
> JDK 1.6.0_26 64-bit
> Reporter: Will McQueen
> Fix For: v1.2.0
>
>
> Using flume trunk, commit ad24cb31bb1b5a0d1ee4b0ec18572a223ed9d397
> Steps:
> 1) Start with this config in a1.properties:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> 2) Run the flume node:
> bin/flume-ng node --conf conf --conf-file conf/a1.properties --name a1
> 3) Update the a1.properties file to add a new source on the same port, which would cause a port bind exception on r2 because r1 is already using port 1473:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1 r2
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> a1.sources.r2.type = AVRO
> a1.sources.r2.channels = c1
> a1.sources.r2.bind = localhost
> a1.sources.r2.port = 1473
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> ...and updating the props file to the above config results in (after waiting a max of 30 secs for the reconfig to be noticed):
> 2012-03-28 18:11:24,027 (conf-file-poller-0) [ERROR - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:205)] Failed to load configuration data. Exception follows.
> java.lang.NullPointerException
> at org.apache.flume.source.AvroSource.stop(AvroSource.java:137)
> at org.apache.flume.source.EventDrivenSourceRunner.stop(EventDrivenSourceRunner.java:45)
> at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:155)
> at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:66)
> at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 4) Now correct the config by changing r2's port to 1474:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1 r2
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> a1.sources.r2.type = AVRO
> a1.sources.r2.channels = c1
> a1.sources.r2.bind = localhost
> a1.sources.r2.port = 1474
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> ...but this results in an illegal state:
> java.lang.IllegalStateException: Unaware of SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@5090d8ea counterGroup:{ name:null counters:{runner.backoffs.consecutive=5, runner.backoffs=5, runner.interruptions=1} } } - can not unsupervise
> at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
> at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:145)
> at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:61)
> at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> ...which tells me that we've entered a permanent bad state that would require restarting the agent.
> 5) Start the avro-client. We expect the avro-client to connect to the agent (had there been no errors in the previous steps), but the connection is refused:
> bin/flume-ng avro-client --cnf --host localhost --port 1474 --filename /home/will/bigdata.txt
> 2012-03-28 18:27:35,650 (main) [ERROR - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:72)] Unable to open connection to Flume. Exception follows.
> org.apache.flume.FlumeException: RPC connection error. Exception follows.
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:114)
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:96)
> at org.apache.flume.api.NettyAvroRpcClient.access$100(NettyAvroRpcClient.java:50)
> at org.apache.flume.api.NettyAvroRpcClient$Builder.build(NettyAvroRpcClient.java:389)
> at org.apache.flume.api.RpcClientFactory.getInstance(RpcClientFactory.java:45)
> at org.apache.flume.client.avro.AvroCLIClient.run(AvroCLIClient.java:120)
> at org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:64)
> Caused by: java.io.IOException: Error connecting to localhost/127.0.0.1:1474
> at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:250)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:199)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:148)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:116)
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:107)
> ... 6 more
> Caused by: java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2012-03-28 18:27:35,683 (main) [DEBUG - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:77)] Exiting
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FLUME-1079) Flume agent reconfiguration enters
permanent bad state
Posted by "Arvind Prabhakar (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240961#comment-13240961 ]
Arvind Prabhakar commented on FLUME-1079:
-----------------------------------------
@Hari - I think this is a serious issue because there is no way to recover from it without restarting the process. A single mistake during reconfiguration can put the agent into this bad state, and fixing it then requires a complete shutdown.
@Will - does this interpretation match what you have observed?
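To make the failure mode concrete, here is a minimal Java sketch of how a failed port bind can turn into a NullPointerException on stop(). It is not Flume code: ListeningSource and BindConflictDemo are hypothetical names, and the idea that stop() dereferenced a field that start() never initialised is only one plausible reading of the trace, not something confirmed in this thread.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical stand-in for a network source; this is NOT Flume's AvroSource.
class ListeningSource {
    private final int port;
    private ServerSocket server; // stays null if start() fails

    ListeningSource(int port) {
        this.port = port;
    }

    void start() throws IOException {
        // Binds and listens on the loopback address; throws if the port is taken.
        server = new ServerSocket(port, 50, InetAddress.getLoopbackAddress());
    }

    int localPort() {
        return server.getLocalPort();
    }

    // Unguarded stop: throws NullPointerException if start() never succeeded.
    void stopUnguarded() throws IOException {
        server.close();
    }

    // Guarded stop: safe to call whether or not start() succeeded.
    void stop() throws IOException {
        if (server != null) {
            server.close();
            server = null;
        }
    }
}

class BindConflictDemo {
    // Returns a short description of what happened to the second source.
    static String run() throws IOException {
        ListeningSource r1 = new ListeningSource(0); // port 0: let the OS pick a free port
        r1.start();
        ListeningSource r2 = new ListeningSource(r1.localPort()); // same port as r1
        StringBuilder result = new StringBuilder();
        try {
            r2.start();
            result.append("bound");
        } catch (IOException e) { // BindException is a subclass of IOException
            result.append("bind-failed");
        }
        try {
            r2.stopUnguarded();
            result.append(",stopped");
        } catch (NullPointerException e) {
            result.append(",npe-on-stop");
        }
        r2.stop(); // the guarded variant does not throw
        r1.stop();
        return result.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(run());
    }
}
```

The guarded stop() is the defensive variant: it can be called during teardown even when the component never came up.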
> Flume agent reconfiguration enters permanent bad state
> ------------------------------------------------------
>
> Key: FLUME-1079
> URL: https://issues.apache.org/jira/browse/FLUME-1079
> Project: Flume
> Issue Type: Bug
> Components: Node
> Affects Versions: v1.2.0
> Environment: CentOS 6.2 64-bit
> JDK 1.6.0_26 64-bit
> Reporter: Will McQueen
> Priority: Minor
> Fix For: v1.2.0
>
>
> Using flume trunk, commit ad24cb31bb1b5a0d1ee4b0ec18572a223ed9d397
> Steps:
> 1) Start with this config in a1.properties:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> 2) Run the flume node:
> bin/flume-ng node --conf conf --conf-file conf/a1.properties --name a1
> 3) Update the a1.properties file to add a new source at the same port, which would cause a port bind exception on r2 due to r1 already using port 1473:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1 r2
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> a1.sources.r2.type = AVRO
> a1.sources.r2.channels = c1
> a1.sources.r2.bind = localhost
> a1.sources.r2.port = 1473
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> ...and updating the props file to the above config results in (after waiting a max of 30 secs for the reconfig to be noticed):
> 2012-03-28 18:11:24,027 (conf-file-poller-0) [ERROR - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:205)] Failed to load configuration data. Exception follows.
> java.lang.NullPointerException
> at org.apache.flume.source.AvroSource.stop(AvroSource.java:137)
> at org.apache.flume.source.EventDrivenSourceRunner.stop(EventDrivenSourceRunner.java:45)
> at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:155)
> at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:66)
> at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 4) Now correct the config by changing r2's port to 1474:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1 r2
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> a1.sources.r2.type = AVRO
> a1.sources.r2.channels = c1
> a1.sources.r2.bind = localhost
> a1.sources.r2.port = 1474
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> ...but this results in an illegal state:
> java.lang.IllegalStateException: Unaware of SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@5090d8ea counterGroup:{ name:null counters:{runner.backoffs.consecutive=5, runner.backoffs=5, runner.interruptions=1} } } - can not unsupervise
> at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
> at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:145)
> at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:61)
> at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> ...which tells me that we've entered a permanent bad state that would require restarting the agent.
> 5) Start the avro-client. We expect the avro-client to connect to the agent (had there been no errors in the previous steps), but the connection is refused:
> bin/flume-ng avro-client --cnf --host localhost --port 1474 --filename /home/will/bigdata.txt
> 2012-03-28 18:27:35,650 (main) [ERROR - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:72)] Unable to open connection to Flume. Exception follows.
> org.apache.flume.FlumeException: RPC connection error. Exception follows.
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:114)
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:96)
> at org.apache.flume.api.NettyAvroRpcClient.access$100(NettyAvroRpcClient.java:50)
> at org.apache.flume.api.NettyAvroRpcClient$Builder.build(NettyAvroRpcClient.java:389)
> at org.apache.flume.api.RpcClientFactory.getInstance(RpcClientFactory.java:45)
> at org.apache.flume.client.avro.AvroCLIClient.run(AvroCLIClient.java:120)
> at org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:64)
> Caused by: java.io.IOException: Error connecting to localhost/127.0.0.1:1474
> at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:250)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:199)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:148)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:116)
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:107)
> ... 6 more
> Caused by: java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2012-03-28 18:27:35,683 (main) [DEBUG - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:77)] Exiting
[jira] [Commented] (FLUME-1079) Flume agent reconfiguration enters
permanent bad state
Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242007#comment-13242007 ]
jiraposter@reviews.apache.org commented on FLUME-1079:
------------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4551/#review6558
-----------------------------------------------------------
Ship it!
lgtm
- Prasad
On 2012-03-30 00:06:09, Hari Shreedharan wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/4551/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2012-03-30 00:06:09)
bq.
bq.
bq. Review request for Flume.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. Fixing a bug that causes a bad configuration to never allow reconfiguration.
bq.
bq.
bq. This addresses bug FLUME-1079.
bq. https://issues.apache.org/jira/browse/FLUME-1079
bq.
bq.
bq. Diffs
bq. -----
bq.
bq. flume-ng-node/src/main/java/org/apache/flume/node/nodemanager/DefaultLogicalNodeManager.java 2c0cff6
bq.
bq. Diff: https://reviews.apache.org/r/4551/diff
bq.
bq.
bq. Testing
bq. -------
bq.
bq. Verified using the conf that produced the error. Works ok now.
bq.
bq.
bq. Thanks,
bq.
bq. Hari
bq.
bq.
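The patch itself is not reproduced in this thread, so the following is only a sketch of one way a node manager could tolerate a component whose stop() throws during reconfiguration. Stoppable and TolerantNodeManager are hypothetical names, not Flume classes, and this should not be read as what the change to DefaultLogicalNodeManager actually does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical component interface; Flume's real lifecycle API is not reproduced here.
interface Stoppable {
    void stop();
}

// Hypothetical manager that keeps tearing down even if one component misbehaves.
class TolerantNodeManager {
    private final Map<String, Stoppable> running = new LinkedHashMap<>();

    void register(String name, Stoppable component) {
        running.put(name, component);
    }

    int runningCount() {
        return running.size();
    }

    // Stops every registered component in registration order.
    // Returns the names of components whose stop() threw, so the caller can log them.
    List<String> stopAll() {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, Stoppable> entry : running.entrySet()) {
            try {
                entry.getValue().stop();
            } catch (RuntimeException e) {
                // One bad component must not abort the rest of the teardown.
                failed.add(entry.getKey());
            }
        }
        // Forget the old components either way, so a later reconfiguration
        // starts from a clean slate instead of half-torn-down state.
        running.clear();
        return failed;
    }
}
```

The design choice here is to collect failures rather than propagate the first one: an exception escaping halfway through teardown is what leaves a manager's bookkeeping out of step with what is actually running.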
> Flume agent reconfiguration enters permanent bad state
> ------------------------------------------------------
>
> Key: FLUME-1079
> URL: https://issues.apache.org/jira/browse/FLUME-1079
> Project: Flume
> Issue Type: Bug
> Components: Node
> Affects Versions: v1.2.0
> Environment: CentOS 6.2 64-bit
> JDK 1.6.0_26 64-bit
> Reporter: Will McQueen
> Assignee: Hari Shreedharan
> Priority: Minor
> Fix For: v1.2.0
>
> Attachments: FLUME-1079-1.patch
>
>
> Using flume trunk, commit ad24cb31bb1b5a0d1ee4b0ec18572a223ed9d397
[jira] [Assigned] (FLUME-1079) Flume agent reconfiguration enters
permanent bad state
Posted by "Hari Shreedharan (Assigned) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hari Shreedharan reassigned FLUME-1079:
---------------------------------------
Assignee: Hari Shreedharan
> Flume agent reconfiguration enters permanent bad state
> ------------------------------------------------------
>
> Key: FLUME-1079
> URL: https://issues.apache.org/jira/browse/FLUME-1079
> Project: Flume
> Issue Type: Bug
> Components: Node
> Affects Versions: v1.2.0
> Environment: CentOS 6.2 64-bit
> JDK 1.6.0_26 64-bit
> Reporter: Will McQueen
> Assignee: Hari Shreedharan
> Priority: Minor
> Fix For: v1.2.0
>
> Attachments: FLUME-1079-1.patch
>
>
> Using flume trunk, commit ad24cb31bb1b5a0d1ee4b0ec18572a223ed9d397
> Steps:
> 1) Start with this config in a1.properties:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> 2) Run the flume node:
> bin/flume-ng node --conf conf --conf-file conf/a1.properties --name a1
> 3) Update the a1.properties file to add a new source at the same port, which would cause a port bind exception on r2 due to r1 already using port 1473:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1 r2
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> a1.sources.r2.type = AVRO
> a1.sources.r2.channels = c1
> a1.sources.r2.bind = localhost
> a1.sources.r2.port = 1473
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> ...and updating the props file to the above config results in (after waiting a max of 30 secs for the reconfig to be noticed):
> 2012-03-28 18:11:24,027 (conf-file-poller-0) [ERROR - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:205)] Failed to load configuration data. Exception follows.
> java.lang.NullPointerException
> at org.apache.flume.source.AvroSource.stop(AvroSource.java:137)
> at org.apache.flume.source.EventDrivenSourceRunner.stop(EventDrivenSourceRunner.java:45)
> at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:155)
> at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:66)
> at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 4) Now correct the config by changing r2's port to 1474:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1 r2
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> a1.sources.r2.type = AVRO
> a1.sources.r2.channels = c1
> a1.sources.r2.bind = localhost
> a1.sources.r2.port = 1474
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> ...but this results in an illegal state:
> java.lang.IllegalStateException: Unaware of SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@5090d8ea counterGroup:{ name:null counters:{runner.backoffs.consecutive=5, runner.backoffs=5, runner.interruptions=1} } } - can not unsupervise
> at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
> at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:145)
> at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:61)
> at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> ...which tells me that we've entered a permanent bad state that would require restarting the agent.
> 5) Start the avro-client. Had there been no errors in the previous steps, we would expect the avro-client to connect to the agent, but the connection is refused:
> bin/flume-ng avro-client --cnf --host localhost --port 1474 --filename /home/will/bigdata.txt
> 2012-03-28 18:27:35,650 (main) [ERROR - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:72)] Unable to open connection to Flume. Exception follows.
> org.apache.flume.FlumeException: RPC connection error. Exception follows.
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:114)
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:96)
> at org.apache.flume.api.NettyAvroRpcClient.access$100(NettyAvroRpcClient.java:50)
> at org.apache.flume.api.NettyAvroRpcClient$Builder.build(NettyAvroRpcClient.java:389)
> at org.apache.flume.api.RpcClientFactory.getInstance(RpcClientFactory.java:45)
> at org.apache.flume.client.avro.AvroCLIClient.run(AvroCLIClient.java:120)
> at org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:64)
> Caused by: java.io.IOException: Error connecting to localhost/127.0.0.1:1474
> at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:250)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:199)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:148)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:116)
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:107)
> ... 6 more
> Caused by: java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2012-03-28 18:27:35,683 (main) [DEBUG - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:77)] Exiting
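One possible reading of the two reconfiguration failures in the report (an assumption, not something the traces confirm): the first reload unsupervised the sink, then failed while stopping the source, so the next reload tried to unsupervise a sink the supervisor no longer tracked. A minimal Java sketch of that sequence follows. Supervisor, the component names and the stopFails flag are hypothetical and are not Flume's actual classes.

```java
import java.util.HashSet;
import java.util.Set;

final class StaleSupervisorSketch {

    /** Hypothetical supervisor that refuses to drop components it does not track. */
    static final class Supervisor {
        private final Set<String> tracked = new HashSet<>();

        void supervise(String name) {
            tracked.add(name);
        }

        void unsupervise(String name, boolean stopFails) {
            if (!tracked.contains(name)) {
                throw new IllegalStateException(
                    "Unaware of " + name + " - can not unsupervise");
            }
            tracked.remove(name);
            if (stopFails) {
                // Stands in for a component whose stop() throws.
                throw new NullPointerException(name + " failed while stopping");
            }
        }
    }

    /** Simulates two reloads and returns the outcome of each. */
    static String[] simulateTwoReloads() {
        Supervisor supervisor = new Supervisor();
        supervisor.supervise("k1");
        supervisor.supervise("r2");
        String[] outcomes = new String[2];
        for (int reload = 0; reload < 2; reload++) {
            try {
                supervisor.unsupervise("k1", false); // sink is dropped first
                supervisor.unsupervise("r2", true);  // source throws on stop
                outcomes[reload] = "ok";
            } catch (RuntimeException e) {
                // The reload is abandoned here, leaving the caller's view stale.
                outcomes[reload] = e.getClass().getSimpleName();
            }
        }
        return outcomes;
    }

    public static void main(String[] args) {
        for (String outcome : simulateTwoReloads()) {
            System.out.println(outcome);
        }
    }
}
```

In this sketch the second reload can never succeed, because the caller keeps asking the supervisor to drop a component it already forgot. That matches the "permanent bad state" symptom, under the assumption stated above.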
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (FLUME-1079) Flume agent
reconfiguration enters permanent bad state
Posted by "Hari Shreedharan (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13241020#comment-13241020 ]
Hari Shreedharan edited comment on FLUME-1079 at 3/29/12 6:57 AM:
------------------------------------------------------------------
OK - I will add a try-catch around the call that stops each component. This ensures that even if one throws an exception, we can still proceed to the next.
Arvind - Yes, this is what happens.
was (Author: hshreedharan):
Ok - I will add a try catch around the call to stop the components. This will make sure that even if one throws an exception, we can still proceed to the next.
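The idea in the comment above can be sketched roughly as follows. This is not the attached patch: Stoppable and stopAll are hypothetical names used only for illustration, and the real change lives in DefaultLogicalNodeManager.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SafeStopSketch {

    /** Hypothetical stand-in for a component with a stop() lifecycle method. */
    interface Stoppable {
        void stop();
    }

    /**
     * Stops every component, catching failures one at a time so that a
     * single bad component cannot prevent the rest from being stopped.
     * Returns the exceptions that were caught, in order.
     */
    static List<RuntimeException> stopAll(List<? extends Stoppable> components) {
        List<RuntimeException> failures = new ArrayList<>();
        for (Stoppable component : components) {
            try {
                component.stop();
            } catch (RuntimeException e) {
                // Record and move on; real code would presumably log this.
                failures.add(e);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> stopped = new ArrayList<>();
        List<Stoppable> components = Arrays.<Stoppable>asList(
            () -> stopped.add("r1"),
            () -> { throw new NullPointerException("r2 never started"); },
            () -> stopped.add("k1"));
        List<RuntimeException> failures = stopAll(components);
        System.out.println("stopped=" + stopped + " failures=" + failures.size());
    }
}
```

Collecting the failures instead of rethrowing the first one is a design choice of this sketch. It lets the caller finish tearing down the old configuration before deciding how to report the errors.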
> Flume agent reconfiguration enters permanent bad state
> ------------------------------------------------------
>
> Key: FLUME-1079
> URL: https://issues.apache.org/jira/browse/FLUME-1079
> Project: Flume
> Issue Type: Bug
> Components: Node
> Affects Versions: v1.2.0
> Environment: CentOS 6.2 64-bit
> JDK 1.6.0_26 64-bit
> Reporter: Will McQueen
> Priority: Minor
> Fix For: v1.2.0
>
>
> Using flume trunk, commit ad24cb31bb1b5a0d1ee4b0ec18572a223ed9d397
> Steps:
> 1) Start with this config in a1.properties:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> 2) Run the flume node:
> bin/flume-ng node --conf conf --conf-file conf/a1.properties --name a1
> 3) Update the a1.properties file to add a new source on the same port, which causes a port bind exception on r2 because r1 is already using port 1473:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1 r2
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> a1.sources.r2.type = AVRO
> a1.sources.r2.channels = c1
> a1.sources.r2.bind = localhost
> a1.sources.r2.port = 1473
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> ...and updating the properties file to the above config results in the following (after waiting up to 30 seconds for the reconfiguration to be noticed):
> 2012-03-28 18:11:24,027 (conf-file-poller-0) [ERROR - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:205)] Failed to load configuration data. Exception follows.
> java.lang.NullPointerException
> at org.apache.flume.source.AvroSource.stop(AvroSource.java:137)
> at org.apache.flume.source.EventDrivenSourceRunner.stop(EventDrivenSourceRunner.java:45)
> at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:155)
> at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:66)
> at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 4) Now correct the config by changing r2's port to 1474:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1 r2
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> a1.sources.r2.type = AVRO
> a1.sources.r2.channels = c1
> a1.sources.r2.bind = localhost
> a1.sources.r2.port = 1474
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> ...but this results in an illegal state:
> java.lang.IllegalStateException: Unaware of SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@5090d8ea counterGroup:{ name:null counters:{runner.backoffs.consecutive=5, runner.backoffs=5, runner.interruptions=1} } } - can not unsupervise
> at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
> at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:145)
> at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:61)
> at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> ...which tells me that we've entered a permanent bad state that would require restarting the agent.
> 5) Start the avro-client. Had there been no errors in the previous steps, we would expect the avro-client to connect to the agent, but the connection is refused:
> bin/flume-ng avro-client --cnf --host localhost --port 1474 --filename /home/will/bigdata.txt
> 2012-03-28 18:27:35,650 (main) [ERROR - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:72)] Unable to open connection to Flume. Exception follows.
> org.apache.flume.FlumeException: RPC connection error. Exception follows.
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:114)
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:96)
> at org.apache.flume.api.NettyAvroRpcClient.access$100(NettyAvroRpcClient.java:50)
> at org.apache.flume.api.NettyAvroRpcClient$Builder.build(NettyAvroRpcClient.java:389)
> at org.apache.flume.api.RpcClientFactory.getInstance(RpcClientFactory.java:45)
> at org.apache.flume.client.avro.AvroCLIClient.run(AvroCLIClient.java:120)
> at org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:64)
> Caused by: java.io.IOException: Error connecting to localhost/127.0.0.1:1474
> at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:250)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:199)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:148)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:116)
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:107)
> ... 6 more
> Caused by: java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2012-03-28 18:27:35,683 (main) [DEBUG - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:77)] Exiting
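The NullPointerException in step 3 is raised from AvroSource.stop(). A plausible cause, stated here only as an assumption, is that stop() uses a field that start() never assigned because the port bind failed. A minimal Java sketch of a stop() that tolerates a failed start() follows. Source and FakeServer are hypothetical and do not reflect the actual AvroSource code.

```java
final class NullSafeStopSketch {

    /** Hypothetical resource standing in for a network server. */
    static final class FakeServer {
        boolean closed;

        void close() {
            closed = true;
        }
    }

    /** Hypothetical source whose stop() is safe to call after a failed start(). */
    static final class Source {
        private FakeServer server; // stays null if start() fails before assignment

        void start(boolean portAvailable) {
            if (!portAvailable) {
                // Stands in for a bind failure on an occupied port.
                throw new IllegalStateException("port already in use");
            }
            server = new FakeServer();
        }

        /** Returns true if something was actually shut down. */
        boolean stop() {
            if (server == null) {
                return false; // nothing was started, so nothing to stop
            }
            server.close();
            server = null;
            return true;
        }
    }

    public static void main(String[] args) {
        Source source = new Source();
        try {
            source.start(false);
        } catch (IllegalStateException e) {
            System.out.println("start failed: " + e.getMessage());
        }
        System.out.println("stop did work: " + source.stop());
    }
}
```

A guard like this would complement, not replace, catching exceptions around each component's stop call in the caller.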
[jira] [Commented] (FLUME-1079) Flume agent reconfiguration enters
permanent bad state
Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242013#comment-13242013 ]
jiraposter@reviews.apache.org commented on FLUME-1079:
------------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4551/#review6560
-----------------------------------------------------------
Ship it!
+1. Please attach patch to the Jira.
- Arvind
On 2012-03-30 00:06:09, Hari Shreedharan wrote:
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4551/
> -----------------------------------------------------------
>
> (Updated 2012-03-30 00:06:09)
>
> Review request for Flume.
>
> Summary
> -------
> Fixing a bug that causes a bad configuration to never allow reconfiguration.
>
> This addresses bug FLUME-1079.
> https://issues.apache.org/jira/browse/FLUME-1079
>
> Diffs
> -----
> flume-ng-node/src/main/java/org/apache/flume/node/nodemanager/DefaultLogicalNodeManager.java 2c0cff6
>
> Diff: https://reviews.apache.org/r/4551/diff
>
> Testing
> -------
> Verified using the conf that produced the error. Works ok now.
>
> Thanks,
> Hari

> Flume agent reconfiguration enters permanent bad state
> ------------------------------------------------------
>
> Key: FLUME-1079
> URL: https://issues.apache.org/jira/browse/FLUME-1079
> Project: Flume
> Issue Type: Bug
> Components: Node
> Affects Versions: v1.2.0
> Environment: CentOS 6.2 64-bit
> JDK 1.6.0_26 64-bit
> Reporter: Will McQueen
> Assignee: Hari Shreedharan
> Priority: Minor
> Fix For: v1.2.0
>
> Attachments: FLUME-1079-1.patch
>
>
> Using flume trunk, commit ad24cb31bb1b5a0d1ee4b0ec18572a223ed9d397
[jira] [Commented] (FLUME-1079) Flume agent reconfiguration enters
permanent bad state
Posted by "Hari Shreedharan (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13241020#comment-13241020 ]
Hari Shreedharan commented on FLUME-1079:
-----------------------------------------
OK - I will add a try-catch around the call that stops each component. This ensures that even if one throws an exception, we can still proceed to the next.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (FLUME-1079) Flume agent reconfiguration enters
permanent bad state
Posted by "Hari Shreedharan (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hari Shreedharan updated FLUME-1079:
------------------------------------
Attachment: FLUME-1079-2.patch
> Flume agent reconfiguration enters permanent bad state
> ------------------------------------------------------
>
> Key: FLUME-1079
> URL: https://issues.apache.org/jira/browse/FLUME-1079
> Project: Flume
> Issue Type: Bug
> Components: Node
> Affects Versions: v1.2.0
> Environment: CentOS 6.2 64-bit
> JDK 1.6.0_26 64-bit
> Reporter: Will McQueen
> Assignee: Hari Shreedharan
> Priority: Minor
> Fix For: v1.2.0
>
> Attachments: FLUME-1079-1.patch, FLUME-1079-2.patch
>
>
> Using flume trunk, commit ad24cb31bb1b5a0d1ee4b0ec18572a223ed9d397
> Steps:
> 1) Start with this config in a1.properties:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> 2) Run the flume node:
> bin/flume-ng node --conf conf --conf-file conf/a1.properties --name a1
> 3) Update the a1.properties file to add a new source at the same port, which would cause a port bind exception on r2 due to r1 already using port 1473:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1 r2
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> a1.sources.r2.type = AVRO
> a1.sources.r2.channels = c1
> a1.sources.r2.bind = localhost
> a1.sources.r2.port = 1473
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> ...and updating the props file to the above config results in (after waiting a max of 30 secs for the reconfig to be noticed):
> 2012-03-28 18:11:24,027 (conf-file-poller-0) [ERROR - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:205)] Failed to load configuration data. Exception follows.
> java.lang.NullPointerException
> at org.apache.flume.source.AvroSource.stop(AvroSource.java:137)
> at org.apache.flume.source.EventDrivenSourceRunner.stop(EventDrivenSourceRunner.java:45)
> at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:155)
> at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:66)
> at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 4) Now correct the config by changing r2's port to 1474:
> # a = agent
> # r = source
> # c = channel
> # k = sink
> a1.sources = r1 r2
> a1.channels = c1
> a1.sinks = k1
> # ===SOURCES===
> a1.sources.r1.type = NETCAT
> a1.sources.r1.channels = c1
> a1.sources.r1.bind = localhost
> a1.sources.r1.port = 1473
> a1.sources.r2.type = AVRO
> a1.sources.r2.channels = c1
> a1.sources.r2.bind = localhost
> a1.sources.r2.port = 1474
> # ===CHANNELS===
> a1.channels.c1.type = MEMORY
> # ===SINKS===
> a1.sinks.k1.type = NULL
> a1.sinks.k1.channel = c1
> ...but this results in an illegal state:
> java.lang.IllegalStateException: Unaware of SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@5090d8ea counterGroup:{ name:null counters:{runner.backoffs.consecutive=5, runner.backoffs=5, runner.interruptions=1} } } - can not unsupervise
> at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
> at org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise(LifecycleSupervisor.java:145)
> at org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:61)
> at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:217)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:124)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:203)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> ...which tells me that we've entered a permanent bad state that would require restarting the agent.
> 5) Start the avro-client. Had the previous steps succeeded, we would expect the avro-client to connect to the agent, but the connection is refused:
> bin/flume-ng avro-client --cnf --host localhost --port 1474 --filename /home/will/bigdata.txt
> 2012-03-28 18:27:35,650 (main) [ERROR - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:72)] Unable to open connection to Flume. Exception follows.
> org.apache.flume.FlumeException: RPC connection error. Exception follows.
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:114)
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:96)
> at org.apache.flume.api.NettyAvroRpcClient.access$100(NettyAvroRpcClient.java:50)
> at org.apache.flume.api.NettyAvroRpcClient$Builder.build(NettyAvroRpcClient.java:389)
> at org.apache.flume.api.RpcClientFactory.getInstance(RpcClientFactory.java:45)
> at org.apache.flume.client.avro.AvroCLIClient.run(AvroCLIClient.java:120)
> at org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:64)
> Caused by: java.io.IOException: Error connecting to localhost/127.0.0.1:1474
> at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:250)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:199)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:148)
> at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:116)
> at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:107)
> ... 6 more
> Caused by: java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2012-03-28 18:27:35,683 (main) [DEBUG - org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:77)] Exiting
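For reference, the bind conflict that step 3 provokes can be reproduced outside Flume. The following is only a sketch using plain java.net.ServerSocket; the class and method names (PortConflictDemo, secondBindFails) are made up for illustration and nothing here is Flume code. It assumes Java 7 or later (try-with-resources) and that the loopback interface is available. It lets the OS pick a free port instead of hard-coding 1473.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical demo class, not part of Flume.
class PortConflictDemo {

    /** Returns true if a second listener on an already-used loopback port is rejected. */
    static boolean secondBindFails() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port; this plays the role of r1.
        try (ServerSocket first = new ServerSocket(0, 50, loopback)) {
            int port = first.getLocalPort();
            // Second listener on the same address and port; this plays the role of r2.
            try (ServerSocket second = new ServerSocket(port, 50, loopback)) {
                return false; // the bind unexpectedly succeeded
            } catch (BindException e) {
                return true;  // address already in use
            }
        } catch (IOException e) {
            throw new IllegalStateException("could not open or close a listener", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second bind rejected: " + secondBindFails());
    }
}
```

The point of the sketch is only that the second bind fails while the first listener is still open, which is the condition the misconfigured r2 runs into.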
[jira] [Updated] (FLUME-1079) Flume agent reconfiguration enters
permanent bad state
Posted by "Hari Shreedharan (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hari Shreedharan updated FLUME-1079:
------------------------------------
Priority: Minor (was: Major)
It is not a major issue. The problem happens only when one of the configs prevents a component from starting (due to an error such as a port bind failure) and a reconfiguration then occurs.
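A minimal sketch of that failure mode and one way to guard against it, under stated assumptions: the NullPointerException in AvroSource.stop in the quoted trace is consistent with stop() touching state that start() never initialised, but that is an inference from the trace, not a description of the attached patches. GuardedListener and its methods are hypothetical names; this is not Flume's AvroSource.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical component, not Flume's AvroSource.
class GuardedListener {
    // Remains null if start() never succeeded (for example, the port was taken).
    private ServerSocket server;

    /** Tries to listen on the given loopback port; returns false if the bind fails. */
    boolean start(int port) {
        try {
            server = new ServerSocket(port, 50, InetAddress.getLoopbackAddress());
            return true;
        } catch (IOException e) {
            server = null;
            return false;
        }
    }

    /** Safe to call even when start() failed or was never called. */
    void stop() {
        if (server == null) {
            return; // nothing was started, so there is nothing to tear down
        }
        try {
            server.close();
        } catch (IOException e) {
            // Best effort: the listener is being discarded anyway.
        } finally {
            server = null;
        }
    }

    boolean isRunning() {
        return server != null;
    }

    /** Bound port, or -1 when not running. */
    int port() {
        return server == null ? -1 : server.getLocalPort();
    }

    public static void main(String[] args) {
        GuardedListener first = new GuardedListener();
        GuardedListener second = new GuardedListener();
        if (first.start(0)) {
            boolean started = second.start(first.port()); // same port as first
            second.stop(); // must not throw even though start() failed
            System.out.println("second started: " + started);
        }
        first.stop();
    }
}
```

With a guard like this, a reconfiguration that stops every component would not trip over one that never started. Whether the supervisor's own bookkeeping also needs to tolerate that case is a separate question that this sketch does not address.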
> Flume agent reconfiguration enters permanent bad state
> ------------------------------------------------------
>
> Key: FLUME-1079
> URL: https://issues.apache.org/jira/browse/FLUME-1079
> Project: Flume
> Issue Type: Bug
> Components: Node
> Affects Versions: v1.2.0
> Environment: CentOS 6.2 64-bit
> JDK 1.6.0_26 64-bit
> Reporter: Will McQueen
> Priority: Minor
> Fix For: v1.2.0
>
>
> Using flume trunk, commit ad24cb31bb1b5a0d1ee4b0ec18572a223ed9d397