Posted to dev@flume.apache.org by "jeff (Created) (JIRA)" <ji...@apache.org> on 2012/02/01 08:02:58 UTC

[jira] [Created] (FLUME-948) [Agent Reliability] In BEChain mode, if the primary collector is off but the secondary collector is on, the agent's events are sent to the null sink instead of the secondary collector

[Agent Reliability] In BEChain mode, if the primary collector is off but the secondary collector is on, the agent's events are sent to the null sink instead of the secondary collector
----------------------------------------------------------------------------------------------------------------------------------------------------------------

                 Key: FLUME-948
                 URL: https://issues.apache.org/jira/browse/FLUME-948
             Project: Flume
          Issue Type: Bug
          Components: Sinks+Sources
    Affects Versions: v0.9.3
            Reporter: jeff
            Priority: Critical
             Fix For: v0.9.5


Here is my Flume config:
agent:   :  rpcSource(333)| agentDFOChain("192.168.130.15:17876","192.168.130.14:17876") 
collector:  collector(17876)|myCustomplugin 

Here is my test case:

1. Use an rpcClient to send one event to the agent every minute.
2. Shut down the primary collector and the secondary collector.
3. Wait 10 min, then start the secondary collector.


I expected that events received by the agent after the secondary collector had recovered would be sent to the secondary collector. In practice, the events are simply discarded, because they are sent to the null sink in BEChain.
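
The expected behaviour described above can be sketched as a toy model. This is only an illustration under stated assumptions, not Flume's actual BackOffFailOverSink logic: the names `Collector`, `FailoverChain` and `deliver` are hypothetical, and the addresses are taken from the config above.

```python
class Collector:
    """Hypothetical stand-in for a collector endpoint (not a Flume class)."""

    def __init__(self, address, up=False):
        self.address = address
        self.up = up
        self.received = []

    def append(self, event):
        # A collector that is down rejects the event, like a failed socket open.
        if not self.up:
            raise ConnectionError(f"No route to host: {self.address}")
        self.received.append(event)


class FailoverChain:
    """Try each collector in order; report failure only if all are down."""

    def __init__(self, collectors):
        self.collectors = collectors

    def deliver(self, event):
        for collector in self.collectors:
            try:
                collector.append(event)
                return collector.address
            except ConnectionError:
                continue  # fall through to the next collector
        # Nothing reachable: the caller should keep the event, not discard it.
        return None


primary = Collector("192.168.130.15:17876")
secondary = Collector("192.168.130.14:17876")
chain = FailoverChain([primary, secondary])

first = chain.deliver("event-1")   # step 2: both collectors are down
secondary.up = True                # step 3: the secondary comes back
second = chain.deliver("event-2")  # expected to reach the secondary
```

In this model an event sent after the secondary recovers is delivered to the secondary, which is the behaviour the report expects from the agent.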

Here is my log:
[2012-01-20 10:19:00,098] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.handlers.debug.StubbornAppendSink 76] Append failed java.net.SocketException: No route to host
[2012-01-20 10:19:00,098] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.handlers.thrift.ThriftEventSink 89] ThriftEventSink on port 17876 closed
[2012-01-20 10:19:01,066] [WARN ] [Thread-2] [com.cloudera.flume.agent.MultiMasterRPC 198] Could not connect to any master nodes (tried 1: [192.168.130.13:17872])
[2012-01-20 10:19:01,067] [INFO ] [Heartbeat] [com.cloudera.flume.agent.MultiMasterRPC 194] MasterRPC called while disconnected.
[2012-01-20 10:19:03,102] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host

[2012-01-20 10:19:04,070] [WARN ] [Heartbeat] [com.cloudera.flume.agent.MultiMasterRPC 198] Could not connect to any master nodes (tried 1: [192.168.130.13:17872])
[2012-01-20 10:19:06,070] [INFO ] [Thread-2] [com.cloudera.flume.agent.MultiMasterRPC 194] MasterRPC called while disconnected.
[2012-01-20 10:19:06,106] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.14:17876 : java.net.NoRouteToHostException: No route to host

[2012-01-20 10:20:39,830] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:21:47,997] [INFO ] [Thread-2] [com.cloudera.flume.agent.ThriftMasterRPC 78] Connected to master at 192.168.130.13:17872

[2012-01-20 10:22:44,874] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:23:35,951] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:24:40,049] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:25:44,139] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:27:39,987] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:29:39,914] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:31:43,054] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
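
One way to read the log above is to count the failed open attempts per collector address. The snippet below is a small sketch of that; the helper name `open_failures_by_host` is made up for illustration, and the sample holds only four lines copied from the log (the full log can be pasted in instead). In the log as posted, the secondary address 192.168.130.14:17876 appears in a single open attempt (10:19:06), while every later attempt targets 192.168.130.15:17876. This is an observation about the posted excerpt only, not a diagnosis.

```python
import re
from collections import Counter

# Four lines copied from the log above; replace with the full log if needed.
LOG = """\
[2012-01-20 10:19:03,102] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:19:06,106] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.14:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:20:39,830] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:22:44,874] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
"""

# Capture the host:port that follows the BackOffFailOverSink failure message.
PATTERN = re.compile(r"Failed to open thrift event sink at (\S+)")


def open_failures_by_host(log_text):
    """Return a Counter mapping collector address to failed open attempts."""
    return Counter(PATTERN.findall(log_text))


counts = open_failures_by_host(LOG)
for host, attempts in sorted(counts.items()):
    print(host, attempts)
```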

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Reopened] (FLUME-948) [Agent Reliability] In BEChain mode, if the primary collector is off but the secondary collector is on, the agent's events are sent to the null sink instead of the secondary collector

Posted by "Alexander Lorenz-Alten (Reopened) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/FLUME-948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Lorenz-Alten reopened FLUME-948:
------------------------------------------

    
> [Agent Reliability] In BEChain mode, if the primary collector is off but the secondary collector is on, the agent's events are sent to the null sink instead of the secondary collector
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLUME-948
>                 URL: https://issues.apache.org/jira/browse/FLUME-948
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v0.9.3
>            Reporter: jeff
>            Priority: Critical
>             Fix For: v0.9.5
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Here is my Flume config:
> agent:   :  rpcSource(333)| agentDFOChain("192.168.130.15:17876","192.168.130.14:17876") 
> collector:  collector(17876)|myCustomplugin 
> Here is my test case:
> 1. Use an rpcClient to send one event to the agent every minute.
> 2. Shut down the primary collector and the secondary collector.
> 3. Wait about 1.5 h, then start the secondary collector.
> I expected that events received by the agent after the secondary collector had recovered would be sent to the secondary collector. In practice, the events are simply discarded, because they are sent to the null sink in BEChain.
> Here is my log:
> [2012-01-20 10:19:00,098] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.handlers.debug.StubbornAppendSink 76] Append failed java.net.SocketException: No route to host
> [2012-01-20 10:19:00,098] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.handlers.thrift.ThriftEventSink 89] ThriftEventSink on port 17876 closed
> [2012-01-20 10:19:01,066] [WARN ] [Thread-2] [com.cloudera.flume.agent.MultiMasterRPC 198] Could not connect to any master nodes (tried 1: [192.168.130.13:17872])
> [2012-01-20 10:19:01,067] [INFO ] [Heartbeat] [com.cloudera.flume.agent.MultiMasterRPC 194] MasterRPC called while disconnected.
> [2012-01-20 10:19:03,102] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
> [2012-01-20 10:19:04,070] [WARN ] [Heartbeat] [com.cloudera.flume.agent.MultiMasterRPC 198] Could not connect to any master nodes (tried 1: [192.168.130.13:17872])
> [2012-01-20 10:19:06,070] [INFO ] [Thread-2] [com.cloudera.flume.agent.MultiMasterRPC 194] MasterRPC called while disconnected.
> [2012-01-20 10:19:06,106] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.14:17876 : java.net.NoRouteToHostException: No route to host
> [2012-01-20 10:20:39,830] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
> [2012-01-20 10:21:47,997] [INFO ] [Thread-2] [com.cloudera.flume.agent.ThriftMasterRPC 78] Connected to master at 192.168.130.13:17872
> [2012-01-20 10:22:44,874] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
> [2012-01-20 10:23:35,951] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
> [2012-01-20 10:24:40,049] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
> [2012-01-20 10:25:44,139] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
> [2012-01-20 10:27:39,987] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
> [2012-01-20 10:29:39,914] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
> [2012-01-20 10:31:43,054] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host


[jira] [Commented] (FLUME-948) [Agent Reliability] In BEChain mode, if the primary collector is off but the secondary collector is on, the agent's events are sent to the null sink instead of the secondary collector

Posted by "jeff (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FLUME-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197637#comment-13197637 ] 

jeff commented on FLUME-948:
----------------------------

I want to make a patch for this bug, but how do I create a review for the patch?

                

[jira] [Resolved] (FLUME-948) [Agent Reliability] In BEChain mode, if the primary collector is off but the secondary collector is on, the agent's events are sent to the null sink instead of the secondary collector

Posted by "Alexander Lorenz-Alten (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/FLUME-948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Lorenz-Alten resolved FLUME-948.
------------------------------------------

    Resolution: Won't Fix
    

[jira] [Updated] (FLUME-948) [Agent Reliability] In BEChain mode, if the primary collector is off but the secondary collector is on, the agent's events are sent to the null sink instead of the secondary collector

Posted by "jeff (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/FLUME-948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jeff updated FLUME-948:
-----------------------

    Description: 
Here is my Flume config:
agent:   :  rpcSource(333)| agentDFOChain("192.168.130.15:17876","192.168.130.14:17876") 
collector:  collector(17876)|myCustomplugin 

Here is my test case:

1. Use an rpcClient to send one event to the agent every minute.
2. Shut down the primary collector and the secondary collector.
3. Wait about 1.5 h, then start the secondary collector.


I expected that events received by the agent after the secondary collector had recovered would be sent to the secondary collector. In practice, the events are simply discarded, because they are sent to the null sink in BEChain.

Here is my log:
[2012-01-20 10:19:00,098] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.handlers.debug.StubbornAppendSink 76] Append failed java.net.SocketException: No route to host
[2012-01-20 10:19:00,098] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.handlers.thrift.ThriftEventSink 89] ThriftEventSink on port 17876 closed
[2012-01-20 10:19:01,066] [WARN ] [Thread-2] [com.cloudera.flume.agent.MultiMasterRPC 198] Could not connect to any master nodes (tried 1: [192.168.130.13:17872])
[2012-01-20 10:19:01,067] [INFO ] [Heartbeat] [com.cloudera.flume.agent.MultiMasterRPC 194] MasterRPC called while disconnected.
[2012-01-20 10:19:03,102] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host

[2012-01-20 10:19:04,070] [WARN ] [Heartbeat] [com.cloudera.flume.agent.MultiMasterRPC 198] Could not connect to any master nodes (tried 1: [192.168.130.13:17872])
[2012-01-20 10:19:06,070] [INFO ] [Thread-2] [com.cloudera.flume.agent.MultiMasterRPC 194] MasterRPC called while disconnected.
[2012-01-20 10:19:06,106] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.14:17876 : java.net.NoRouteToHostException: No route to host

[2012-01-20 10:20:39,830] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:21:47,997] [INFO ] [Thread-2] [com.cloudera.flume.agent.ThriftMasterRPC 78] Connected to master at 192.168.130.13:17872

[2012-01-20 10:22:44,874] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:23:35,951] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:24:40,049] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:25:44,139] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:27:39,987] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:29:39,914] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:31:43,054] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host

  was:
Here is my Flume config:
agent:   :  rpcSource(333)| agentDFOChain("192.168.130.15:17876","192.168.130.14:17876") 
collector:  collector(17876)|myCustomplugin 

Here is my test case:

1. Use an rpcClient to send one event to the agent every minute.
2. Shut down the primary collector and the secondary collector.
3. Wait 10 min, then start the secondary collector.


I expected that events received by the agent after the secondary collector had recovered would be sent to the secondary collector. In practice, the events are simply discarded, because they are sent to the null sink in BEChain.

Here is my log:
[2012-01-20 10:19:00,098] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.handlers.debug.StubbornAppendSink 76] Append failed java.net.SocketException: No route to host
[2012-01-20 10:19:00,098] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.handlers.thrift.ThriftEventSink 89] ThriftEventSink on port 17876 closed
[2012-01-20 10:19:01,066] [WARN ] [Thread-2] [com.cloudera.flume.agent.MultiMasterRPC 198] Could not connect to any master nodes (tried 1: [192.168.130.13:17872])
[2012-01-20 10:19:01,067] [INFO ] [Heartbeat] [com.cloudera.flume.agent.MultiMasterRPC 194] MasterRPC called while disconnected.
[2012-01-20 10:19:03,102] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host

[2012-01-20 10:19:04,070] [WARN ] [Heartbeat] [com.cloudera.flume.agent.MultiMasterRPC 198] Could not connect to any master nodes (tried 1: [192.168.130.13:17872])
[2012-01-20 10:19:06,070] [INFO ] [Thread-2] [com.cloudera.flume.agent.MultiMasterRPC 194] MasterRPC called while disconnected.
[2012-01-20 10:19:06,106] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.14:17876 : java.net.NoRouteToHostException: No route to host

[2012-01-20 10:20:39,830] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:21:47,997] [INFO ] [Thread-2] [com.cloudera.flume.agent.ThriftMasterRPC 78] Connected to master at 192.168.130.13:17872

[2012-01-20 10:22:44,874] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:23:35,951] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:24:40,049] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:25:44,139] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:27:39,987] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:29:39,914] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host
[2012-01-20 10:31:43,054] [INFO ] [logicalNode ESC01_agent-24] [com.cloudera.flume.core.BackOffFailOverSink 143] Failed to open thrift event sink at 192.168.130.15:17876 : java.net.NoRouteToHostException: No route to host

    


        

[jira] [Commented] (FLUME-948) [Agent Reliability] In BEChain mode,if primary collector is off,but the secondary collector is on,agent's event send to null sink, instead of seconday collector

Posted by "jeff (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FLUME-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197634#comment-13197634 ] 

jeff commented on FLUME-948:
----------------------------

I've looked at the code in the Cloudera 0.9.3 release. The backoff retry condition is updated incorrectly in BackOffFailOverSink.java.
When the primary collector is down, events are handled by a BackOffFailOverSink whose primary sink is the secondary collector and whose secondary sink is the null sink.
In this case, events are never sent to that primary sink (the secondary collector), because its backoff policy never reaches the retry condition: the backoff time is updated incorrectly in the code.
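
To make the retry condition concrete, here is a minimal sketch of a failover sink with a capped backoff. It is not the code from BackOffFailOverSink.java, and it does not claim to show the actual defect there; the names Sink, Backoff, FailoverSink, isRetryOk, failed and succeeded are all hypothetical. It only illustrates the behaviour expected above: once the retry deadline has passed, the primary sink must be tried again, and the deadline must be updated only when the primary actually fails, not on every event that is diverted to the backup.

```java
import java.util.ArrayList;
import java.util.List;

public class BackoffFailoverSketch {

    /** Hypothetical minimal sink; stands in for a collector sink or a null sink. */
    public interface Sink {
        void append(String event);
    }

    /** Capped exponential backoff driven by explicit timestamps (all names are assumptions). */
    public static final class Backoff {
        private final long initialMs;
        private final long maxMs;
        private long sleepMs;
        private long retryAtMs = 0; // earliest time at which the primary may be retried

        public Backoff(long initialMs, long maxMs) {
            this.initialMs = initialMs;
            this.maxMs = maxMs;
            this.sleepMs = initialMs;
        }

        /** Called only when an attempt on the primary fails: set the deadline, then grow the delay. */
        public void failed(long nowMs) {
            retryAtMs = nowMs + sleepMs;
            sleepMs = Math.min(sleepMs * 2, maxMs); // the cap keeps the deadline reachable
        }

        /** Called when the primary accepts an event: reset to the initial delay. */
        public void succeeded() {
            sleepMs = initialMs;
            retryAtMs = 0;
        }

        public boolean isRetryOk(long nowMs) {
            return nowMs >= retryAtMs;
        }
    }

    /** Tries the primary when the backoff allows it, otherwise diverts to the backup. */
    public static final class FailoverSink {
        private final Sink primary;
        private final Sink backup;
        private final Backoff backoff;

        public FailoverSink(Sink primary, Sink backup, Backoff backoff) {
            this.primary = primary;
            this.backup = backup;
            this.backoff = backoff;
        }

        public void append(String event, long nowMs) {
            if (backoff.isRetryOk(nowMs)) {
                try {
                    primary.append(event);
                    backoff.succeeded();
                    return;
                } catch (RuntimeException e) {
                    backoff.failed(nowMs); // deadline moves only on a real failure
                }
            }
            // Diverting to the backup must NOT touch the backoff state.
            backup.append(event);
        }
    }

    public static void main(String[] args) {
        List<String> delivered = new ArrayList<>();
        List<String> dropped = new ArrayList<>();
        boolean[] collectorUp = {false};
        Sink collector = e -> {
            if (!collectorUp[0]) throw new IllegalStateException("collector down");
            delivered.add(e);
        };
        FailoverSink sink = new FailoverSink(collector, dropped::add, new Backoff(1_000, 60_000));
        sink.append("e1", 0);     // primary fails, retry allowed from t=1000
        sink.append("e2", 500);   // still backing off, goes to the backup
        collectorUp[0] = true;    // collector recovers
        sink.append("e3", 1_000); // deadline reached, primary is retried
        System.out.println("delivered=" + delivered + " dropped=" + dropped);
    }
}
```

In this sketch the backup list plays the role of the null sink: e1 and e2 end up there, while e3 reaches the recovered collector. If the deadline were pushed forward on every diverted event, an agent receiving one event per minute could keep postponing the retry indefinitely, which would match the symptom reported here.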
                
