Posted to user@cassandra.apache.org by Jonathan Baynes <Jo...@tradeweb.com> on 2017/09/29 13:43:32 UTC

Issue with New Production Cluster

Hi Community,

I have a 6-node ring covering 2 DCs. The ring isn't being used yet; we are just in the connectivity and testing phase, so the boxes are NOT under any load.

I tried to connect with cqlsh this afternoon and got this back:

cqlsh xx.xxx.xxx.xx -u cassandra -p cassandra
Connection error: ('Unable to connect to any servers', {'xx.xxx.xxx.xx': OperationTimedOut('errors=Timed out creating connection (5 seconds), last_host=None',)})

Status of the service is "Running"
Nodetool status is UP for all nodes.
Tested telnet on all ports and they are all up and connecting.
Checked the PID is up for cassandra (it is)

Any idea how I can debug the cause of this further?


Thanks
J



Jonathan Baynes
DBA
Tradeweb Europe Limited
Moor Place  *  1 Fore Street Avenue  *  London EC2Y 9DT
P +44 (0)20 77760988  *  F +44 (0)20 7776 3201  *  M +44 (0)xxxx xxxxxx
Jonathan.Baynes@tradeweb.com

-----------------------------------------
A leading marketplace<http://www.tradeweb.com/About-Us/Awards/> for electronic fixed income, derivatives and ETF trading


________________________________________________________________________

This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy it. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden. Tradeweb reserves the right to monitor all e-mail communications through its networks. If you do not wish to receive marketing emails about our products / services, please let us know by contacting us, either by email at contactus@tradeweb.com or by writing to us at the registered office of Tradeweb in the UK, which is: Tradeweb Europe Limited (company number 3912826), 1 Fore Street Avenue London EC2Y 9DT. To see our privacy policy, visit our website @ www.tradeweb.com.

Re: Issue with New Production Cluster

Posted by Michael Shuler <mi...@pbandjelly.org>.
"Connection reset by peer" almost certainly points to a network issue. The same error is reported here:

https://github.com/netty/netty/issues/5993

mtr - ping/trace tool to find possible flaky switch/router
tcpdump and/or wireshark - tools to gather and observe network packets

-- 
Michael

On 09/29/2017 10:38 AM, Thakrar, Jayesh wrote:
> Another thing you can do to test for network issues (as a non-sysadmin user)
> is to use the following commands:
> 
>  
> 
> "netstat -i"
> 
> This will show all the network interfaces (including any bonded
> interfaces) and the in/out traffic (as rx/tx) and any errors in them.
> 
> Note that these are cumulative counters, so you need to do a number of
> samplings to ensure that the errors, if any, are not old ones.
> 
>  
> 
> "netstat -s"
> 
> This shows more extended network-level stats; the info varies by the
> OS flavor and version.
> 
> But many of the stats are kind of self-explanatory.
> 
>  
> 
> *From: *Jeff Jirsa <jj...@gmail.com>
> *Date: *Friday, September 29, 2017 at 10:32 AM
> *To: *cassandra <us...@cassandra.apache.org>
> *Subject: *Re: Issue with New Production Cluster
> 
>  
> 
> I don't know what logging is available driver side, I'd probably be
> writing a shell script to ping all three servers to see if I can prove
> there's a network problem outside of cassandra first.
> 
>  
> 
> On Fri, Sep 29, 2017 at 8:19 AM, Jonathan Baynes
> <Jonathan.Baynes@tradeweb.com <ma...@tradeweb.com>> wrote:
> 
>     Thank you Jeff, you are a credit to this community.. very helpful
> 
>      
> 
>     Last question: is there any logging I can turn on (especially with
>     the cassandra driver 2.5.0) that would give more insight into
>     what’s going on, e.g. whether I’m getting any failed connections
>     at the cassandra layer?
> 
>      
> 
>     Thanks
> 
>     J
> 
>      
> 
>     *From:*Jeff Jirsa [mailto:jjirsa@gmail.com <ma...@gmail.com>]
>     *Sent:* 29 September 2017 16:07
> 
> 
>     *To:* cassandra
>     *Subject:* Re: Issue with New Production Cluster
> 
>      
> 
>     The failure detector is seeing updates every 2.1-2.5  seconds, which
>     it will ignore because it's over the 2 second default failure
>     detector interval.
> 
>      
> 
>     If there's no load, there's no reason it shouldn't be seeing it far
>     more frequently.
> 
>      
> 
>     If you're not seeing any signs of GC pauses (like GCInspector log
>     lines), then it seems like you've got a network interface flapping -
>     maybe STP is triggering on a port, or a bad cable/interface, or
>     something along those lines. 
> 
>      
> 
>      
> 
>      
> 
>      
> 
>      
> 
>     On Fri, Sep 29, 2017 at 7:32 AM, Jonathan Baynes
>     <Jonathan.Baynes@tradeweb.com <ma...@tradeweb.com>>
>     wrote:
> 
>     Hi Jeff,
> 
>      
> 
>     This is version 3.0.11. Being run on Oracle Red Hat Linux.
> 
>      
> 
>     If I retry immediately it fails; if I leave it for 20 minutes, as I
>     have just now, and retry, it works. (?!?!)
> 
>     I've checked all the logs (system and debug) and in the logs I have
>     this:
> 
>      
> 
>     DEBUG [GossipStage:1] 2017-09-29 15:28:04,390 FailureDetector.java:456 - Ignoring interval time of 2102011879 for /10.172.181.62
>     INFO  [SharedPool-Worker-1] 2017-09-29 15:28:07,382 Message.java:615 - Unexpected exception during request; channel = [id: 0x963c334d, L:/10.172.117.61:9042 - R:/10.172.117.21:51853]
>     io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
>         at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
>     DEBUG [GossipStage:1] 2017-09-29 15:28:07,389 FailureDetector.java:456 - Ignoring interval time of 2085147992 for /10.172.117.63
>     INFO  [SharedPool-Worker-2] 2017-09-29 15:28:07,400 Message.java:615 - Unexpected exception during request; channel = [id: 0xa7633234, L:/10.172.117.61:9042 ! R:/10.172.117.21:51818]
>     io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
>         at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
>     INFO  [SharedPool-Worker-1] 2017-09-29 15:28:07,418 Message.java:615 - Unexpected exception during request; channel = [id: 0x8c685c2c, L:/10.172.117.61:9042 ! R:/10.172.117.21:51838]
>     io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
>         at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
>     INFO  [HANDSHAKE-/10.172.181.62] 2017-09-29 15:28:10,140 OutboundTcpConnection.java:523 - Handshaking version with /10.172.181.62
>     DEBUG [GossipStage:1] 2017-09-29 15:28:12,288 FailureDetector.java:456 - Ignoring interval time of 2153356364 for /10.172.117.63
>     DEBUG [GossipStage:1] 2017-09-29 15:28:15,289 FailureDetector.java:456 - Ignoring interval time of 2000150143 for /10.172.181.62
>     DEBUG [GossipStage:1] 2017-09-29 15:28:16,305 FailureDetector.java:456 - Ignoring interval time of 2790916098 for /10.172.117.62
>     DEBUG [GossipStage:1] 2017-09-29 15:28:17,289 FailureDetector.java:456 - Ignoring interval time of 2000209617 for /10.172.181.62
>     DEBUG [GossipStage:1] 2017-09-29 15:28:18,391 FailureDetector.java:456 - Ignoring interval time of 2085345347 for /10.172.181.63
>     DEBUG [GossipStage:1] 2017-09-29 15:28:19,391 FailureDetector.java:456 - Ignoring interval time of 2102054326 for /10.172.181.62
>     DEBUG [GossipStage:1] 2017-09-29 15:28:20,391 FailureDetector.java:456 - Ignoring interval time of 2000754004 for /10.172.181.61
>     DEBUG [GossipStage:1] 2017-09-29 15:28:20,391 FailureDetector.java:456 - Ignoring interval time of 2000787648 for /10.172.181.63
>     DEBUG [GossipStage:1] 2017-09-29 15:28:23,391 FailureDetector.java:456 - Ignoring interval time of 2254752613 for /10.172.117.63
>     DEBUG [GossipStage:1] 2017-09-29 15:28:23,392 FailureDetector.java:456 - Ignoring interval time of 2086206045 for /10.172.181.62
>     DEBUG [GossipStage:1] 2017-09-29 15:28:23,392 FailureDetector.java:456 - Ignoring interval time of 2086239333 for /10.172.181.63
>     DEBUG [GossipStage:1] 2017-09-29 15:28:26,392 FailureDetector.java:456 - Ignoring interval time of 2000751862 for /10.172.181.62
>     DEBUG [GossipStage:1] 2017-09-29 15:28:26,516 FailureDetector.java:456 - Ignoring interval time of 2124952635 for /10.172.181.63
>     DEBUG [GossipStage:1] 2017-09-29 15:28:29,392 FailureDetector.java:456 - Ignoring interval time of 2254658974 for /10.172.117.63
>     DEBUG [GossipStage:1] 2017-09-29 15:28:30,306 FailureDetector.java:456 - Ignoring interval time of 2000326847 for /10.172.181.63
>     DEBUG [GossipStage:1] 2017-09-29 15:28:32,393 FailureDetector.java:456 - Ignoring interval time of 2086445998 for /10.172.181.62
>     DEBUG [GossipStage:1] 2017-09-29 15:28:32,393 FailureDetector.java:456 - Ignoring interval time of 2086481727 for /10.172.181.63
>     DEBUG [GossipStage:1] 2017-09-29 15:28:35,290 FailureDetector.java:456 - Ignoring interval time of 2363746015 for /10.172.117.62
>     DEBUG [GossipStage:1] 2017-09-29 15:28:35,290 FailureDetector.java:456 - Ignoring interval time of 2151813348 for /10.172.117.63
>     DEBUG [GossipStage:1] 2017-09-29 15:28:37,393 FailureDetector.java:456 - Ignoring interval time of 2103432914 for /10.172.181.62
>     DEBUG [GossipStage:1] 2017-09-29 15:28:40,394 FailureDetector.java:456 - Ignoring interval time of 2255155834 for /10.172.117.63
>     DEBUG [GossipStage:1] 2017-09-29 15:28:41,394 FailureDetector.java:456 - Ignoring interval time of 2086966679 for /10.172.117.62
>     DEBUG [GossipStage:1] 2017-09-29 15:28:41,394 FailureDetector.java:456 - Ignoring interval time of 2087037617 for /10.172.181.63
>     INFO  [HANDSHAKE-/10.172.181.61] 2017-09-29 15:28:47,132 OutboundTcpConnection.java:523 - Handshaking version with /10.172.181.61
> 
>      
> 
>      
> 
>     The handshake at 15:28 is about the same time as I retried and it
>     worked. So is it unable to handshake with the nodes??
> 
>      
> 
>     Any suggestions would be great, I have to go to the Network team
>     with this, but I’m not sure where to start…
> 
>      
> 
>     *From:*Jeff Jirsa [mailto:jjirsa@gmail.com <ma...@gmail.com>]
>     *Sent:* 29 September 2017 14:54
>     *To:* cassandra
>     *Subject:* Re: Issue with New Production Cluster
> 
>      
> 
>     What version?
> 
>     If you retry immediately, does it reconnect?
> 
>     Anything in the logs?
> 
>      
> 
>     What you describe is atypical - timeouts on queries can (and will)
>     happen occasionally under load, but timeout on connect is atypical.
>     Any sign of networking issues/slowness/dns problems/etc?
> 
>      
> 
>      
> 
> 
>  
> 


Re: Issue with New Production Cluster

Posted by "Thakrar, Jayesh" <jt...@conversantmedia.com>.
Another thing you can do to test for network issues (as a non-sysadmin user) is to use the following commands:

"netstat -i"
This will show all the network interfaces (including any bonded interfaces) and the in/out traffic (as rx/tx) and any errors in them.
Note that these are cumulative counters, so you need to do a number of samplings to ensure that the errors, if any, are not old ones.
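The repeated sampling described above can be sketched roughly as follows. This assumes a Linux host where /proc/net/dev is readable (as far as I know, the same kernel counters that netstat -i reports on Linux); the function names are made up for illustration.

```python
import time


def parse_proc_net_dev(text):
    """Parse /proc/net/dev text into {interface: error/drop counters}."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue
        # Receive columns come first (0-7), transmit columns follow (8-15).
        counters[name.strip()] = {
            "rx_errs": int(fields[2]),
            "rx_drop": int(fields[3]),
            "tx_errs": int(fields[10]),
            "tx_drop": int(fields[11]),
        }
    return counters


def counter_deltas(before, after):
    """Return how much each counter grew between two samples."""
    deltas = {}
    for iface, now in after.items():
        prev = before.get(iface)
        if prev is None:
            continue  # interface appeared between samples; nothing to compare
        deltas[iface] = {key: now[key] - prev[key] for key in now}
    return deltas


def sample_twice(interval_s=10.0, path="/proc/net/dev"):
    """Read the counters twice, interval_s apart, and return the deltas."""
    with open(path) as f:
        before = parse_proc_net_dev(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_proc_net_dev(f.read())
    return counter_deltas(before, after)
```

Calling sample_twice() a few times while reproducing the connection timeout should show whether the error or drop counters are still growing, rather than being left over from some earlier event.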

"netstat -s"
This shows more extended network-level stats; the info varies by the OS flavor and version.
But many of the stats are kind of self-explanatory.

From: Jeff Jirsa <jj...@gmail.com>
Date: Friday, September 29, 2017 at 10:32 AM
To: cassandra <us...@cassandra.apache.org>
Subject: Re: Issue with New Production Cluster

I don't know what logging is available on the driver side. I'd probably write a shell script to ping all three servers first, to see if I can prove there's a network problem outside of Cassandra.
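
On the driver-logging question, one possibility is sketched here. The thread only says "driver 2.5.0", so this assumes the client is the DataStax Python driver, which logs through the standard logging module under the "cassandra" logger namespace. The function name is made up for illustration:

```python
import logging


def enable_driver_debug_logging(stream=None):
    """Send DEBUG output from the 'cassandra' logger tree to a stream.

    Assumption: the client is the DataStax Python driver, whose modules
    log under names such as 'cassandra.cluster' and 'cassandra.pool'.
    """
    handler = logging.StreamHandler(stream)   # stream=None means stderr
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    log = logging.getLogger("cassandra")
    log.setLevel(logging.DEBUG)
    log.addHandler(handler)
    return log
```

Calling this before creating the driver's Cluster object should surface connection attempts and failures as they happen. For a different driver (Java, C#, and so on) the mechanism will differ. cqlsh also has a --debug option that may be worth trying first.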

On Fri, Sep 29, 2017 at 8:19 AM, Jonathan Baynes <Jo...@tradeweb.com> wrote:
Thank you Jeff, you are a credit to this community. Very helpful.

Last question: is there any logging I can turn on (especially with the Cassandra driver 2.5.0) that would give more insight into what's going on, for example whether I'm getting any failed connections at the Cassandra layer?

Thanks
J

From: Jeff Jirsa [mailto:jjirsa@gmail.com]
Sent: 29 September 2017 16:07
To: cassandra
Subject: Re: Issue with New Production Cluster

The failure detector is seeing updates every 2.1-2.5 seconds, which it will ignore because that is over the default 2-second failure detector interval.

If there's no load, there's no reason it shouldn't be seeing it far more frequently.

If you're not seeing any signs of GC pauses (like GCInspector log lines), then it seems like you've got a network interface flapping - maybe STP is triggering on a port, or a bad cable/interface, or something along those lines.
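
To make the numbers concrete: the interval values in the FailureDetector DEBUG lines quoted further down this thread look like nanoseconds (an assumption based on their magnitude), so 2102011879 is about 2.1 seconds. A rough Python sketch that pulls those values out of debug.log lines and compares them with the 2 second threshold mentioned above. The regex and function names are illustrative and are not part of Cassandra:

```python
import re
from collections import defaultdict

THRESHOLD_SECONDS = 2.0   # the default interval limit described above

# Tolerates extra junk between the number and the peer address.
PATTERN = re.compile(r"Ignoring interval time of (\d+).*? for /([\d.]+)")


def ignored_intervals(lines):
    """Return {peer_ip: [interval_in_seconds, ...]} for ignored intervals."""
    by_peer = defaultdict(list)
    for line in lines:
        match = PATTERN.search(line)
        if match:
            nanos = int(match.group(1))     # assumed to be nanoseconds
            by_peer[match.group(2)].append(nanos / 1e9)
    return dict(by_peer)


sample = [
    "DEBUG [GossipStage:1] 2017-09-29 15:28:04,390 FailureDetector.java:456 - "
    "Ignoring interval time of 2102011879 for /10.172.181.62",
    "DEBUG [GossipStage:1] 2017-09-29 15:28:15,289 FailureDetector.java:456 - "
    "Ignoring interval time of 2000150143 for /10.172.181.62",
    "DEBUG [GossipStage:1] 2017-09-29 15:28:16,305 FailureDetector.java:456 - "
    "Ignoring interval time of 2790916098 for /10.172.117.62",
]

for peer, seconds in sorted(ignored_intervals(sample).items()):
    worst = max(seconds)
    print(f"{peer}: {len(seconds)} ignored, worst {worst:.3f}s "
          f"({worst - THRESHOLD_SECONDS:.3f}s over the limit)")
```

Run against a full debug.log, a per-peer summary like this shows whether the late heartbeats cluster on particular nodes or on one data centre, which is useful detail to hand to a network team.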





On Fri, Sep 29, 2017 at 7:32 AM, Jonathan Baynes <Jo...@tradeweb.com> wrote:
Hi Jeff,

This is version 3.0.11, running on Oracle Red Hat Linux.

If I retry immediately it fails; if I leave it for 20 minutes, as I have just now, and retry, it works. (?!?!)
I've checked all the logs (system and debug), and in the logs I have this:

DEBUG [GossipStage:1] 2017-09-29 15:28:04,390 FailureDetector.java:456 - Ignoring interval time of 2102011879 for /10.172.181.62
INFO  [SharedPool-Worker-1] 2017-09-29 15:28:07,382 Message.java:615 - Unexpected exception during request; channel = [id: 0x963c334d, L:/10.172.117.61:9042 - R:/10.172.117.21:51853]
io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
    at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
DEBUG [GossipStage:1] 2017-09-29 15:28:07,389 FailureDetector.java:456 - Ignoring interval time of 2085147992 for /10.172.117.63
INFO  [SharedPool-Worker-2] 2017-09-29 15:28:07,400 Message.java:615 - Unexpected exception during request; channel = [id: 0xa7633234, L:/10.172.117.61:9042 ! R:/10.172.117.21:51818]
io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
    at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
INFO  [SharedPool-Worker-1] 2017-09-29 15:28:07,418 Message.java:615 - Unexpected exception during request; channel = [id: 0x8c685c2c, L:/10.172.117.61:9042 ! R:/10.172.117.21:51838]
io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
    at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
INFO  [HANDSHAKE-/10.172.181.62] 2017-09-29 15:28:10,140 OutboundTcpConnection.java:523 - Handshaking version with /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:12,288 FailureDetector.java:456 - Ignoring interval time of 2153356364 for /10.172.117.63
DEBUG [GossipStage:1] 2017-09-29 15:28:15,289 FailureDetector.java:456 - Ignoring interval time of 2000150143 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:16,305 FailureDetector.java:456 - Ignoring interval time of 2790916098 for /10.172.117.62
DEBUG [GossipStage:1] 2017-09-29 15:28:17,289 FailureDetector.java:456 - Ignoring interval time of 2000209617 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:18,391 FailureDetector.java:456 - Ignoring interval time of 2085345347 for /10.172.181.63
DEBUG [GossipStage:1] 2017-09-29 15:28:19,391 FailureDetector.java:456 - Ignoring interval time of 2102054326 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:20,391 FailureDetector.java:456 - Ignoring interval time of 2000754004 for /10.172.181.61
DEBUG [GossipStage:1] 2017-09-29 15:28:20,391 FailureDetector.java:456 - Ignoring interval time of 2000787648 for /10.172.181.63
DEBUG [GossipStage:1] 2017-09-29 15:28:23,391 FailureDetector.java:456 - Ignoring interval time of 2254752613 for /10.172.117.63
DEBUG [GossipStage:1] 2017-09-29 15:28:23,392 FailureDetector.java:456 - Ignoring interval time of 2086206045 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:23,392 FailureDetector.java:456 - Ignoring interval time of 2086239333 for /10.172.181.63
DEBUG [GossipStage:1] 2017-09-29 15:28:26,392 FailureDetector.java:456 - Ignoring interval time of 2000751862 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:26,516 FailureDetector.java:456 - Ignoring interval time of 2124952635 for /10.172.181.63
DEBUG [GossipStage:1] 2017-09-29 15:28:29,392 FailureDetector.java:456 - Ignoring interval time of 2254658974 for /10.172.117.63
DEBUG [GossipStage:1] 2017-09-29 15:28:30,306 FailureDetector.java:456 - Ignoring interval time of 2000326847 for /10.172.181.63
DEBUG [GossipStage:1] 2017-09-29 15:28:32,393 FailureDetector.java:456 - Ignoring interval time of 2086445998 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:32,393 FailureDetector.java:456 - Ignoring interval time of 2086481727 for /10.172.181.63
DEBUG [GossipStage:1] 2017-09-29 15:28:35,290 FailureDetector.java:456 - Ignoring interval time of 2363746015 for /10.172.117.62
DEBUG [GossipStage:1] 2017-09-29 15:28:35,290 FailureDetector.java:456 - Ignoring interval time of 2151813348 for /10.172.117.63
DEBUG [GossipStage:1] 2017-09-29 15:28:37,393 FailureDetector.java:456 - Ignoring interval time of 2103432914 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:40,394 FailureDetector.java:456 - Ignoring interval time of 2255155834 for /10.172.117.63
DEBUG [GossipStage:1] 2017-09-29 15:28:41,394 FailureDetector.java:456 - Ignoring interval time of 2086966679 for /10.172.117.62
DEBUG [GossipStage:1] 2017-09-29 15:28:41,394 FailureDetector.java:456 - Ignoring interval time of 2087037617 for /10.172.181.63
INFO  [HANDSHAKE-/10.172.181.61] 2017-09-29 15:28:47,132 OutboundTcpConnection.java:523 - Handshaking version with /10.172.181.61


The handshake at 15:28 is at about the same time as I just tried it and it worked. So is it unable to handshake with the nodes??

Any suggestions would be great, I have to go to the Network team with this, but I’m not sure where to start…

From: Jeff Jirsa [mailto:jjirsa@gmail.com]
Sent: 29 September 2017 14:54
To: cassandra
Subject: Re: Issue with New Production Cluster

What version?
If you retry immediately, does it reconnect?
Anything in the logs?

What you describe is atypical - timeouts on queries can (and will) happen occasionally under load, but timeout on connect is atypical. Any sign of networking issues/slowness/dns problems/etc?


On Fri, Sep 29, 2017 at 6:43 AM, Jonathan Baynes <Jo...@tradeweb.com> wrote:
Hi Community,

I have a 6 node ring, covering 2 DC’s, the ring isn’t being used yet and we are just in the connectivity and testing phase. So the boxes are NOT under any load.

I’ve gone to connect to CQLSH this afternoon and I’ve had this returned:

cqlsh xx.xxx.xxx.xx -u cassandra -p cassandra
Connection error: ('Unable to connect to any servers', {'xx.xxx.xxx.xx': OperationTimedOut('errors=Timed out creating connection (5 seconds), last_host=None',)})

Status of the service is “Running”
Nodetool status is UP for all nodes.
Tested telnet on all ports and they are all up and connecting.
Checked the PID is up for cassandra (it is)

Any idea how I can debug the cause of this further?


Thanks
J



Jonathan Baynes
DBA
Tradeweb Europe Limited
Moor Place  •  1 Fore Street Avenue  •  London EC2Y 9DT
P +44 (0)20 77760988  •  F +44 (0)20 7776 3201  •  M +44 (0)xxxx xxxxxx
Jonathan.Baynes@tradeweb.com

A leading marketplace for electronic fixed income, derivatives and ETF trading




Re: Issue with New Production Cluster

Posted by Jeff Jirsa <jj...@gmail.com>.
I don't know what logging is available on the driver side. I'd probably write
a shell script to ping all three servers first, to see if I can prove there's a
network problem outside of Cassandra.
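
A minimal sketch of that kind of check, written in Python rather than shell, and timing a TCP connect instead of an ICMP ping so it can run as an unprivileged user. The function names are hypothetical, and the port 9042 default (the native transport port that appears in the log excerpt in this thread) is a placeholder to adapt:

```python
import socket
import time


def tcp_connect_time(host, port, timeout=5.0):
    """Return the seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:                      # refused, unreachable, or timed out
        return None


def probe(hosts, port=9042, rounds=10, pause=1.0):
    """Probe every host once per round; return {host: [seconds_or_None, ...]}."""
    results = {host: [] for host in hosts}
    for i in range(rounds):
        for host in hosts:
            results[host].append(tcp_connect_time(host, port))
        if i < rounds - 1:
            time.sleep(pause)
    return results


def summarize(results):
    """Return {host: (failed_attempts, slowest_successful_connect_or_None)}."""
    summary = {}
    for host, samples in results.items():
        ok = [s for s in samples if s is not None]
        summary[host] = (len(samples) - len(ok), max(ok) if ok else None)
    return summary
```

Something like summarize(probe(["node1", "node2", "node3"])) with the real node addresses, left running while a cqlsh connect times out, would show whether plain TCP connects fail or stall in the same window.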

On Fri, Sep 29, 2017 at 8:19 AM, Jonathan Baynes <
Jonathan.Baynes@tradeweb.com> wrote:

> Thank you Jeff, you are a credit to this community. Very helpful.
>
>
>
> Last question: is there any logging I can turn on (especially with the
> Cassandra driver 2.5.0) that would give more insight into what's going on,
> for example whether I'm getting any failed connections at the Cassandra
> layer?
>
>
>
> Thanks
>
> J
>
>
>
> *From:* Jeff Jirsa [mailto:jjirsa@gmail.com]
> *Sent:* 29 September 2017 16:07
>
> *To:* cassandra
> *Subject:* Re: Issue with New Production Cluster
>
>
>
> The failure detector is seeing updates every 2.1-2.5 seconds, which it
> will ignore because that is over the default 2-second failure detector
> interval.
>
>
>
> If there's no load, there's no reason it shouldn't be seeing it far more
> frequently.
>
>
>
> If you're not seeing any signs of GC pauses (like GCInspector log lines),
> then it seems like you've got a network interface flapping - maybe STP is
> triggering on a port, or a bad cable/interface, or something along those
> lines.
>
>
>
>
>
>
>
>
>
>
>
> On Fri, Sep 29, 2017 at 7:32 AM, Jonathan Baynes <
> Jonathan.Baynes@tradeweb.com> wrote:
>
> Hi Jeff,
>
>
>
> This is version 3.0.11, running on Oracle Red Hat Linux.
>
>
>
> If I retry immediately it fails; if I leave it for 20 minutes, as I have
> just now, and retry, it works. (?!?!)
>
> I've checked all the logs (system and debug), and in the logs I have this:
>
>
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:04,390 FailureDetector.java:456 -
> Ignoring interval time of 2102011879 <(210)%20201-1879> for /10.172.181.62
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__10.172.181.62&d=DwMFaQ&c=sA0VaJZJFLZREu2pbPeqjXHJ-Wd9NNzgHW3gpUOLSSk&r=CNKccIKIKCVbYTu1VxR8dIOP6NLpf4fYjidpNm-diQ8&m=LHuC2PiAE_qEnK8nzWN-xEQVrqsbc2SCkDvN8UV7FvA&s=F0Xvn8VrY2v_GJioteQGl2Xl9NrlI4T532IaGBwQL-U&e=>
>
> INFO  [SharedPool-Worker-1] 2017-09-29 15:28:07,382 Message.java:615 -
> Unexpected exception during request; channel = [id: 0x963c334d, L:/
> 10.172.117.61:9042
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__10.172.117.61-3A9042&d=DwMFaQ&c=sA0VaJZJFLZREu2pbPeqjXHJ-Wd9NNzgHW3gpUOLSSk&r=CNKccIKIKCVbYTu1VxR8dIOP6NLpf4fYjidpNm-diQ8&m=LHuC2PiAE_qEnK8nzWN-xEQVrqsbc2SCkDvN8UV7FvA&s=N1yysCzQbMJfPonp7tTVzQ5V6P_NGmCRCIGYb-JJS0w&e=>
> - R:/10.172.117.21:51853
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__10.172.117.21-3A51853&d=DwMFaQ&c=sA0VaJZJFLZREu2pbPeqjXHJ-Wd9NNzgHW3gpUOLSSk&r=CNKccIKIKCVbYTu1VxR8dIOP6NLpf4fYjidpNm-diQ8&m=LHuC2PiAE_qEnK8nzWN-xEQVrqsbc2SCkDvN8UV7FvA&s=NwQs5pv53MTu7RXO4tUrDSVZAtsEdXTsObmf9GWbNKE&e=>
> ]
>
> io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)()
> failed: Connection reset by peerat io.netty.channel.unix.
> FileDescriptor.readAddress(...)(Unknown Source)
> ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:07,389 FailureDetector.java:456 -
> Ignoring interval time of 2085147992 <(208)%20514-7992> for /10.172.117.63
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__10.172.117.63&d=DwMFaQ&c=sA0VaJZJFLZREu2pbPeqjXHJ-Wd9NNzgHW3gpUOLSSk&r=CNKccIKIKCVbYTu1VxR8dIOP6NLpf4fYjidpNm-diQ8&m=LHuC2PiAE_qEnK8nzWN-xEQVrqsbc2SCkDvN8UV7FvA&s=kcNQ4izrbn-35U5vWnv5cBKG6XfN3nmgkKozyHEfaWA&e=>
>
> INFO  [SharedPool-Worker-2] 2017-09-29 15:28:07,400 Message.java:615 -
> Unexpected exception during request; channel = [id: 0xa7633234, L:/
> 10.172.117.61:9042
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__10.172.117.61-3A9042&d=DwMFaQ&c=sA0VaJZJFLZREu2pbPeqjXHJ-Wd9NNzgHW3gpUOLSSk&r=CNKccIKIKCVbYTu1VxR8dIOP6NLpf4fYjidpNm-diQ8&m=LHuC2PiAE_qEnK8nzWN-xEQVrqsbc2SCkDvN8UV7FvA&s=N1yysCzQbMJfPonp7tTVzQ5V6P_NGmCRCIGYb-JJS0w&e=>
> ! R:/10.172.117.21:51818
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__10.172.117.21-3A51818&d=DwMFaQ&c=sA0VaJZJFLZREu2pbPeqjXHJ-Wd9NNzgHW3gpUOLSSk&r=CNKccIKIKCVbYTu1VxR8dIOP6NLpf4fYjidpNm-diQ8&m=LHuC2PiAE_qEnK8nzWN-xEQVrqsbc2SCkDvN8UV7FvA&s=Zd17m05nbPXla0hA2GWhNKHRRo-N_nFqOfrN2_c6meY&e=>
> ]
>
> io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)()
> failed: Connection reset by peerat io.netty.channel.unix.
> FileDescriptor.readAddress(...)(Unknown Source)
> ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
>
> INFO  [SharedPool-Worker-1] 2017-09-29 15:28:07,418 Message.java:615 -
> Unexpected exception during request; channel = [id: 0x8c685c2c, L:/
> 10.172.117.61:9042
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__10.172.117.61-3A9042&d=DwMFaQ&c=sA0VaJZJFLZREu2pbPeqjXHJ-Wd9NNzgHW3gpUOLSSk&r=CNKccIKIKCVbYTu1VxR8dIOP6NLpf4fYjidpNm-diQ8&m=LHuC2PiAE_qEnK8nzWN-xEQVrqsbc2SCkDvN8UV7FvA&s=N1yysCzQbMJfPonp7tTVzQ5V6P_NGmCRCIGYb-JJS0w&e=>
> ! R:/10.172.117.21:51838
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__10.172.117.21-3A51838&d=DwMFaQ&c=sA0VaJZJFLZREu2pbPeqjXHJ-Wd9NNzgHW3gpUOLSSk&r=CNKccIKIKCVbYTu1VxR8dIOP6NLpf4fYjidpNm-diQ8&m=LHuC2PiAE_qEnK8nzWN-xEQVrqsbc2SCkDvN8UV7FvA&s=P-JajFY4WPK7GbEnPsK2RAoRmzPCztQnE8RT5atNnhY&e=>
> ]
>
> io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)()
> failed: Connection reset by peerat io.netty.channel.unix.
> FileDescriptor.readAddress(...)(Unknown Source)
> ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
>
> INFO  [HANDSHAKE-/10.172.181.62
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__10.172.181.62&d=DwMFaQ&c=sA0VaJZJFLZREu2pbPeqjXHJ-Wd9NNzgHW3gpUOLSSk&r=CNKccIKIKCVbYTu1VxR8dIOP6NLpf4fYjidpNm-diQ8&m=LHuC2PiAE_qEnK8nzWN-xEQVrqsbc2SCkDvN8UV7FvA&s=F0Xvn8VrY2v_GJioteQGl2Xl9NrlI4T532IaGBwQL-U&e=>]
> 2017-09-29 15:28:10,140 OutboundTcpConnection.java:523 - Handshaking
> version with /10.172.181.62
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__10.172.181.62&d=DwMFaQ&c=sA0VaJZJFLZREu2pbPeqjXHJ-Wd9NNzgHW3gpUOLSSk&r=CNKccIKIKCVbYTu1VxR8dIOP6NLpf4fYjidpNm-diQ8&m=LHuC2PiAE_qEnK8nzWN-xEQVrqsbc2SCkDvN8UV7FvA&s=F0Xvn8VrY2v_GJioteQGl2Xl9NrlI4T532IaGBwQL-U&e=>
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:12,288 FailureDetector.java:456 -
> Ignoring interval time of 2153356364 for /10.172.117.63
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:15,289 FailureDetector.java:456 -
> Ignoring interval time of 2000150143 for /10.172.181.62
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:16,305 FailureDetector.java:456 -
> Ignoring interval time of 2790916098 for /10.172.117.62
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:17,289 FailureDetector.java:456 -
> Ignoring interval time of 2000209617 for /10.172.181.62
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:18,391 FailureDetector.java:456 -
> Ignoring interval time of 2085345347 for /10.172.181.63
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:19,391 FailureDetector.java:456 -
> Ignoring interval time of 2102054326 for /10.172.181.62
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:20,391 FailureDetector.java:456 -
> Ignoring interval time of 2000754004 for /10.172.181.61
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:20,391 FailureDetector.java:456 -
> Ignoring interval time of 2000787648 for /10.172.181.63
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:23,391 FailureDetector.java:456 -
> Ignoring interval time of 2254752613 for /10.172.117.63
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:23,392 FailureDetector.java:456 -
> Ignoring interval time of 2086206045 for /10.172.181.62
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:23,392 FailureDetector.java:456 -
> Ignoring interval time of 2086239333 for /10.172.181.63
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:26,392 FailureDetector.java:456 -
> Ignoring interval time of 2000751862 for /10.172.181.62
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:26,516 FailureDetector.java:456 -
> Ignoring interval time of 2124952635 for /10.172.181.63
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:29,392 FailureDetector.java:456 -
> Ignoring interval time of 2254658974 for /10.172.117.63
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:30,306 FailureDetector.java:456 -
> Ignoring interval time of 2000326847 for /10.172.181.63
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:32,393 FailureDetector.java:456 -
> Ignoring interval time of 2086445998 for /10.172.181.62
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:32,393 FailureDetector.java:456 -
> Ignoring interval time of 2086481727 for /10.172.181.63
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:35,290 FailureDetector.java:456 -
> Ignoring interval time of 2363746015 for /10.172.117.62
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:35,290 FailureDetector.java:456 -
> Ignoring interval time of 2151813348 for /10.172.117.63
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:37,393 FailureDetector.java:456 -
> Ignoring interval time of 2103432914 for /10.172.181.62
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:40,394 FailureDetector.java:456 -
> Ignoring interval time of 2255155834 for /10.172.117.63
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:41,394 FailureDetector.java:456 -
> Ignoring interval time of 2086966679 for /10.172.117.62
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:41,394 FailureDetector.java:456 -
> Ignoring interval time of 2087037617 for /10.172.181.63
>
> INFO  [HANDSHAKE-/10.172.181.61] 2017-09-29 15:28:47,132
> OutboundTcpConnection.java:523 - Handshaking version with /10.172.181.61
>
>
>
>
>
> The handshake at 15:28 is about the same time as I just tried it and it
> worked. So is it unable to handshake with the nodes?
>
>
>
> Any suggestions would be great, I have to go to the Network team with
> this, but I’m not sure where to start…
>
>
>
> *From:* Jeff Jirsa [mailto:jjirsa@gmail.com]
> *Sent:* 29 September 2017 14:54
> *To:* cassandra
> *Subject:* Re: Issue with New Production Cluster
>
>
>
> What version?
>
> If you retry immediately, does it reconnect?
>
> Anything in the logs?
>
>
>
> What you describe is atypical - timeouts on queries can (and will) happen
> occasionally under load, but timeout on connect is atypical. Any sign of
> networking issues/slowness/dns problems/etc?
>
>
>
>
>
> On Fri, Sep 29, 2017 at 6:43 AM, Jonathan Baynes <
> Jonathan.Baynes@tradeweb.com> wrote:
>
> Hi Community,
>
>
>
> I have a 6 node ring, covering 2 DC’s, the ring isn’t being used yet and
> we are just in the connectivity and testing phase. So the boxes are NOT
> under any load.
>
>
>
> I’ve gone to connect to CQLSH this afternoon and I’ve had this returned:
>
>
>
> cqlsh xx.xxx.xxx.xx -u cassandra -p cassandra
>
> Connection error: ('Unable to connect to any servers', { xx.xxx.xxx.xx’:
> OperationTimedOut('errors=Timed out creating connection (5 seconds),
> last_host=None',)})
>
>
>
> Status of the service is “Running”
>
> Nodetool status is UP for all nodes.
>
> Tested telnet on all ports and they are all up and connecting.
>
> Checked the PID is up for cassandra (it is)
>
>
>
> Any idea how I can debug the cause of this further?
>
>
>
>
>
> Thanks
>
> J
>

RE: Issue with New Production Cluster

Posted by Jonathan Baynes <Jo...@tradeweb.com>.
Thank you Jeff, you are a credit to this community. Very helpful.

Last question: is there any logging I can turn on (especially with the Cassandra driver 2.5.0) that would give more insight into what's going on, for example whether I'm getting any failed connections at the Cassandra layer?
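Separately from any driver or server logging, one way to see whether connections fail at the network layer is to time plain TCP connects to the native transport port from the client box. The following is only a rough sketch, not something from the Cassandra tooling: the function names probe and probe_many are made up for illustration, port 9042 is taken from the log lines in this thread, and the 5 second timeout simply mirrors the cqlsh error message.

```python
import socket
import time


def probe(host, port, timeout=5.0):
    """Attempt one TCP connect; return (ok, seconds_elapsed, error_text)."""
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        return False, time.monotonic() - start, str(exc)


def probe_many(host, port, attempts=5, pause=1.0, timeout=5.0):
    """Run several probes in a row, pausing between them (hypothetical helper)."""
    results = []
    for i in range(attempts):
        results.append(probe(host, port, timeout))
        if i + 1 < attempts:
            time.sleep(pause)
    return results
```

Calling probe_many with a node address and 9042 every few minutes and keeping the results with a timestamp would show whether the failures cluster in time, which is the sort of evidence a network team can work with.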

Thanks
J

From: Jeff Jirsa [mailto:jjirsa@gmail.com]
Sent: 29 September 2017 16:07
To: cassandra
Subject: Re: Issue with New Production Cluster

The failure detector is seeing updates every 2.1-2.5 seconds, which it will ignore because it's over the 2-second default failure detector interval.

If there's no load, there's no reason it shouldn't be seeing it far more frequently.

If you're not seeing any signs of GC pauses (like GCInspector log lines), then it seems like you've got a network interface flapping - maybe STP is triggering on a port, or a bad cable/interface, or something along those lines.
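To make the point above concrete, the "Ignoring interval time" values in the debug log can be converted to seconds and grouped per peer. This is a minimal sketch, assuming the values are nanoseconds (which matches the 2.1-2.5 second reading above); the function name summarize and the sample list are illustrative only, with the sample lines copied from the log in this thread.

```python
import re
from collections import defaultdict

# Matches the FailureDetector debug message; ".*?" skips any mail-client
# debris that may sit between the number and " for /".
LINE_RE = re.compile(r"Ignoring interval time of (\d+).*? for /([\d.]+)")


def summarize(lines):
    """Return {peer_ip: (count, min_seconds, max_seconds)} for ignored intervals."""
    per_peer = defaultdict(list)
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            # Assumption: the logged interval is in nanoseconds.
            per_peer[m.group(2)].append(int(m.group(1)) / 1e9)
    return {peer: (len(v), min(v), max(v)) for peer, v in per_peer.items()}


SAMPLE = [
    "DEBUG [GossipStage:1] 2017-09-29 15:28:12,288 FailureDetector.java:456 - Ignoring interval time of 2153356364 for /10.172.117.63",
    "DEBUG [GossipStage:1] 2017-09-29 15:28:16,305 FailureDetector.java:456 - Ignoring interval time of 2790916098 for /10.172.117.62",
    "DEBUG [GossipStage:1] 2017-09-29 15:28:23,391 FailureDetector.java:456 - Ignoring interval time of 2254752613 for /10.172.117.63",
]

for peer, (count, lo, hi) in sorted(summarize(SAMPLE).items()):
    print(f"{peer}: {count} ignored, {lo:.3f}s to {hi:.3f}s")
```

Run over the whole debug.log, a summary like this would show whether the slow intervals are confined to particular peers or one DC, or spread across every node.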





On Fri, Sep 29, 2017 at 7:32 AM, Jonathan Baynes <Jo...@tradeweb.com> wrote:
Hi Jeff,

This is version 3.0.11, running on Oracle Red Hat Linux.

If I retry immediately, it fails. If I leave it for 20 minutes, as I have just now, and retry, it works (?!?!).
I've checked all the logs (system and debug), and in them I have this:

DEBUG [GossipStage:1] 2017-09-29 15:28:04,390 FailureDetector.java:456 - Ignoring interval time of 2102011879 for /10.172.181.62
INFO  [SharedPool-Worker-1] 2017-09-29 15:28:07,382 Message.java:615 - Unexpected exception during request; channel = [id: 0x963c334d, L:/10.172.117.61:9042 - R:/10.172.117.21:51853]
io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
    at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
DEBUG [GossipStage:1] 2017-09-29 15:28:07,389 FailureDetector.java:456 - Ignoring interval time of 2085147992 for /10.172.117.63
INFO  [SharedPool-Worker-2] 2017-09-29 15:28:07,400 Message.java:615 - Unexpected exception during request; channel = [id: 0xa7633234, L:/10.172.117.61:9042 ! R:/10.172.117.21:51818]
io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
    at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
INFO  [SharedPool-Worker-1] 2017-09-29 15:28:07,418 Message.java:615 - Unexpected exception during request; channel = [id: 0x8c685c2c, L:/10.172.117.61:9042 ! R:/10.172.117.21:51838]
io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
    at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
INFO  [HANDSHAKE-/10.172.181.62] 2017-09-29 15:28:10,140 OutboundTcpConnection.java:523 - Handshaking version with /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:12,288 FailureDetector.java:456 - Ignoring interval time of 2153356364 for /10.172.117.63
DEBUG [GossipStage:1] 2017-09-29 15:28:15,289 FailureDetector.java:456 - Ignoring interval time of 2000150143 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:16,305 FailureDetector.java:456 - Ignoring interval time of 2790916098 for /10.172.117.62
DEBUG [GossipStage:1] 2017-09-29 15:28:17,289 FailureDetector.java:456 - Ignoring interval time of 2000209617 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:18,391 FailureDetector.java:456 - Ignoring interval time of 2085345347 for /10.172.181.63
DEBUG [GossipStage:1] 2017-09-29 15:28:19,391 FailureDetector.java:456 - Ignoring interval time of 2102054326 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:20,391 FailureDetector.java:456 - Ignoring interval time of 2000754004 for /10.172.181.61
DEBUG [GossipStage:1] 2017-09-29 15:28:20,391 FailureDetector.java:456 - Ignoring interval time of 2000787648 for /10.172.181.63
DEBUG [GossipStage:1] 2017-09-29 15:28:23,391 FailureDetector.java:456 - Ignoring interval time of 2254752613 for /10.172.117.63
DEBUG [GossipStage:1] 2017-09-29 15:28:23,392 FailureDetector.java:456 - Ignoring interval time of 2086206045 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:23,392 FailureDetector.java:456 - Ignoring interval time of 2086239333 for /10.172.181.63
DEBUG [GossipStage:1] 2017-09-29 15:28:26,392 FailureDetector.java:456 - Ignoring interval time of 2000751862 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:26,516 FailureDetector.java:456 - Ignoring interval time of 2124952635 for /10.172.181.63
DEBUG [GossipStage:1] 2017-09-29 15:28:29,392 FailureDetector.java:456 - Ignoring interval time of 2254658974 for /10.172.117.63
DEBUG [GossipStage:1] 2017-09-29 15:28:30,306 FailureDetector.java:456 - Ignoring interval time of 2000326847 for /10.172.181.63
DEBUG [GossipStage:1] 2017-09-29 15:28:32,393 FailureDetector.java:456 - Ignoring interval time of 2086445998 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:32,393 FailureDetector.java:456 - Ignoring interval time of 2086481727 for /10.172.181.63
DEBUG [GossipStage:1] 2017-09-29 15:28:35,290 FailureDetector.java:456 - Ignoring interval time of 2363746015 for /10.172.117.62
DEBUG [GossipStage:1] 2017-09-29 15:28:35,290 FailureDetector.java:456 - Ignoring interval time of 2151813348 for /10.172.117.63
DEBUG [GossipStage:1] 2017-09-29 15:28:37,393 FailureDetector.java:456 - Ignoring interval time of 2103432914 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:40,394 FailureDetector.java:456 - Ignoring interval time of 2255155834 for /10.172.117.63
DEBUG [GossipStage:1] 2017-09-29 15:28:41,394 FailureDetector.java:456 - Ignoring interval time of 2086966679 for /10.172.117.62
DEBUG [GossipStage:1] 2017-09-29 15:28:41,394 FailureDetector.java:456 - Ignoring interval time of 2087037617 for /10.172.181.63
INFO  [HANDSHAKE-/10.172.181.61<https://urldefense.proofpoint.com/v2/url?u=http-3A__10.172.181.61&d=DwMFaQ&c=sA0VaJZJFLZREu2pbPeqjXHJ-Wd9NNzgHW3gpUOLSSk&r=CNKccIKIKCVbYTu1VxR8dIOP6NLpf4fYjidpNm-diQ8&m=LHuC2PiAE_qEnK8nzWN-xEQVrqsbc2SCkDvN8UV7FvA&s=-lefWC2Mo3ksHhmGfeD7BgS4uynKiolxK6_cLxaySOM&e=>] 2017-09-29 15:28:47,132 OutboundTcpConnection.java:523 - Handshaking version with /10.172.181.61<https://urldefense.proofpoint.com/v2/url?u=http-3A__10.172.181.61&d=DwMFaQ&c=sA0VaJZJFLZREu2pbPeqjXHJ-Wd9NNzgHW3gpUOLSSk&r=CNKccIKIKCVbYTu1VxR8dIOP6NLpf4fYjidpNm-diQ8&m=LHuC2PiAE_qEnK8nzWN-xEQVrqsbc2SCkDvN8UV7FvA&s=-lefWC2Mo3ksHhmGfeD7BgS4uynKiolxK6_cLxaySOM&e=>


The handshake at 15:28 is about the same time that I just tried it and it worked. So is it unable to handshake with the nodes?

Any suggestions would be great, I have to go to the Network team with this, but I’m not sure where to start…

From: Jeff Jirsa [mailto:jjirsa@gmail.com]
Sent: 29 September 2017 14:54
To: cassandra
Subject: Re: Issue with New Production Cluster

What version?
If you retry immediately, does it reconnect?
Anything in the logs?

What you describe is atypical - timeouts on queries can (and will) happen occasionally under load, but timeout on connect is atypical. Any sign of networking issues/slowness/dns problems/etc?


On Fri, Sep 29, 2017 at 6:43 AM, Jonathan Baynes <Jo...@tradeweb.com> wrote:
Hi Community,

I have a 6 node ring, covering 2 DC’s, the ring isn’t being used yet and we are just in the connectivity and testing phase. So the boxes are NOT under any load.

I’ve gone to connect to CQLSH this afternoon and I’ve had this returned:

cqlsh xx.xxx.xxx.xx -u cassandra -p cassandra
Connection error: ('Unable to connect to any servers', {'xx.xxx.xxx.xx': OperationTimedOut('errors=Timed out creating connection (5 seconds), last_host=None',)})

Status of the service is “Running”
Nodetool status is UP for all nodes.
Tested telnet on all ports and they are all up and connecting.
Checked the PID is up for cassandra (it is)

Any idea how I can debug the cause of this further?


Thanks
J




Re: Issue with New Production Cluster

Posted by Jeff Jirsa <jj...@gmail.com>.
The failure detector is seeing updates every 2.1-2.5 seconds, which it
will ignore because it's over the 2 second default failure detector
interval.

If there's no load, there's no reason it shouldn't be seeing it far more
frequently.

If you're not seeing any signs of GC pauses (like GCInspector log lines),
then it seems like you've got a network interface flapping - maybe STP is
triggering on a port, or a bad cable/interface, or something along those
lines.
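
To make the intervals above concrete: the numbers in the "Ignoring interval
time of ..." lines read as nanoseconds, so 2102011879 is roughly 2.1 seconds.
A minimal Python sketch, assuming the debug log uses exactly the line format
quoted in this thread, that groups the ignored intervals by peer. The function
name ignored_intervals and the SAMPLE list are illustrative only and are not
part of Cassandra:

```python
import re
from collections import defaultdict

# Matches the FailureDetector debug line as quoted in this thread.
LINE_RE = re.compile(r"Ignoring interval time of (\d+) for /([\d.]+)")


def ignored_intervals(lines):
    """Return {peer_ip: [interval_in_seconds, ...]} for ignored gossip intervals."""
    by_peer = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            nanos, peer = match.groups()
            by_peer[peer].append(int(nanos) / 1e9)  # nanoseconds -> seconds
    return dict(by_peer)


# Three lines copied from the log in this thread, used as sample input.
SAMPLE = [
    "DEBUG [GossipStage:1] 2017-09-29 15:28:04,390 FailureDetector.java:456 - "
    "Ignoring interval time of 2102011879 for /10.172.181.62",
    "DEBUG [GossipStage:1] 2017-09-29 15:28:16,305 FailureDetector.java:456 - "
    "Ignoring interval time of 2790916098 for /10.172.117.62",
    "DEBUG [GossipStage:1] 2017-09-29 15:28:19,391 FailureDetector.java:456 - "
    "Ignoring interval time of 2102054326 for /10.172.181.62",
]

for peer, secs in sorted(ignored_intervals(SAMPLE).items()):
    print(f"{peer}: n={len(secs)} min={min(secs):.3f}s max={max(secs):.3f}s")
```

Passing an open debug.log file object instead of SAMPLE should work the same
way, since the function only iterates over lines. A peer that dominates the
output may be a reasonable place to start looking at the network path.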





On Fri, Sep 29, 2017 at 7:32 AM, Jonathan Baynes <
Jonathan.Baynes@tradeweb.com> wrote:

> Hi Jeff,
>
>
>
> This is version 3.0.11. Being run on Oracle Red Hat Linux.
>
>
>
> If I retry immediately it fails; if I leave it for 20 minutes, as I have
> just now, and retry, it works. (?!?!)
>
> I've checked all the logs (system and debug), and in the logs I have this:
>
>
>
> DEBUG [GossipStage:1] 2017-09-29 15:28:04,390 FailureDetector.java:456 - Ignoring interval time of 2102011879 for /10.172.181.62
> INFO  [SharedPool-Worker-1] 2017-09-29 15:28:07,382 Message.java:615 - Unexpected exception during request; channel = [id: 0x963c334d, L:/10.172.117.61:9042 - R:/10.172.117.21:51853]
> io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
>     at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
> DEBUG [GossipStage:1] 2017-09-29 15:28:07,389 FailureDetector.java:456 - Ignoring interval time of 2085147992 for /10.172.117.63
> INFO  [SharedPool-Worker-2] 2017-09-29 15:28:07,400 Message.java:615 - Unexpected exception during request; channel = [id: 0xa7633234, L:/10.172.117.61:9042 ! R:/10.172.117.21:51818]
> io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
>     at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
> INFO  [SharedPool-Worker-1] 2017-09-29 15:28:07,418 Message.java:615 - Unexpected exception during request; channel = [id: 0x8c685c2c, L:/10.172.117.61:9042 ! R:/10.172.117.21:51838]
> io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
>     at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
> INFO  [HANDSHAKE-/10.172.181.62] 2017-09-29 15:28:10,140 OutboundTcpConnection.java:523 - Handshaking version with /10.172.181.62
> DEBUG [GossipStage:1] 2017-09-29 15:28:12,288 FailureDetector.java:456 - Ignoring interval time of 2153356364 for /10.172.117.63
> DEBUG [GossipStage:1] 2017-09-29 15:28:15,289 FailureDetector.java:456 - Ignoring interval time of 2000150143 for /10.172.181.62
> DEBUG [GossipStage:1] 2017-09-29 15:28:16,305 FailureDetector.java:456 - Ignoring interval time of 2790916098 for /10.172.117.62
> DEBUG [GossipStage:1] 2017-09-29 15:28:17,289 FailureDetector.java:456 - Ignoring interval time of 2000209617 for /10.172.181.62
> DEBUG [GossipStage:1] 2017-09-29 15:28:18,391 FailureDetector.java:456 - Ignoring interval time of 2085345347 for /10.172.181.63
> DEBUG [GossipStage:1] 2017-09-29 15:28:19,391 FailureDetector.java:456 - Ignoring interval time of 2102054326 for /10.172.181.62
> DEBUG [GossipStage:1] 2017-09-29 15:28:20,391 FailureDetector.java:456 - Ignoring interval time of 2000754004 for /10.172.181.61
> DEBUG [GossipStage:1] 2017-09-29 15:28:20,391 FailureDetector.java:456 - Ignoring interval time of 2000787648 for /10.172.181.63
> DEBUG [GossipStage:1] 2017-09-29 15:28:23,391 FailureDetector.java:456 - Ignoring interval time of 2254752613 for /10.172.117.63
> DEBUG [GossipStage:1] 2017-09-29 15:28:23,392 FailureDetector.java:456 - Ignoring interval time of 2086206045 for /10.172.181.62
> DEBUG [GossipStage:1] 2017-09-29 15:28:23,392 FailureDetector.java:456 - Ignoring interval time of 2086239333 for /10.172.181.63
> DEBUG [GossipStage:1] 2017-09-29 15:28:26,392 FailureDetector.java:456 - Ignoring interval time of 2000751862 for /10.172.181.62
> DEBUG [GossipStage:1] 2017-09-29 15:28:26,516 FailureDetector.java:456 - Ignoring interval time of 2124952635 for /10.172.181.63
> DEBUG [GossipStage:1] 2017-09-29 15:28:29,392 FailureDetector.java:456 - Ignoring interval time of 2254658974 for /10.172.117.63
> DEBUG [GossipStage:1] 2017-09-29 15:28:30,306 FailureDetector.java:456 - Ignoring interval time of 2000326847 for /10.172.181.63
> DEBUG [GossipStage:1] 2017-09-29 15:28:32,393 FailureDetector.java:456 - Ignoring interval time of 2086445998 for /10.172.181.62
> DEBUG [GossipStage:1] 2017-09-29 15:28:32,393 FailureDetector.java:456 - Ignoring interval time of 2086481727 for /10.172.181.63
> DEBUG [GossipStage:1] 2017-09-29 15:28:35,290 FailureDetector.java:456 - Ignoring interval time of 2363746015 for /10.172.117.62
> DEBUG [GossipStage:1] 2017-09-29 15:28:35,290 FailureDetector.java:456 - Ignoring interval time of 2151813348 for /10.172.117.63
> DEBUG [GossipStage:1] 2017-09-29 15:28:37,393 FailureDetector.java:456 - Ignoring interval time of 2103432914 for /10.172.181.62
> DEBUG [GossipStage:1] 2017-09-29 15:28:40,394 FailureDetector.java:456 - Ignoring interval time of 2255155834 for /10.172.117.63
> DEBUG [GossipStage:1] 2017-09-29 15:28:41,394 FailureDetector.java:456 - Ignoring interval time of 2086966679 for /10.172.117.62
> DEBUG [GossipStage:1] 2017-09-29 15:28:41,394 FailureDetector.java:456 - Ignoring interval time of 2087037617 for /10.172.181.63
> INFO  [HANDSHAKE-/10.172.181.61] 2017-09-29 15:28:47,132 OutboundTcpConnection.java:523 - Handshaking version with /10.172.181.61
>
>
>
>
>
> The handshake at 15:28 is about the same time that I just tried it and it
> worked. So is it unable to handshake with the nodes?
>
>
>
> Any suggestions would be great, I have to go to the Network team with
> this, but I’m not sure where to start…
>
>
>
> *From:* Jeff Jirsa [mailto:jjirsa@gmail.com]
> *Sent:* 29 September 2017 14:54
> *To:* cassandra
> *Subject:* Re: Issue with New Production Cluster
>
>
>
> What version?
>
> If you retry immediately, does it reconnect?
>
> Anything in the logs?
>
>
>
> What you describe is atypical - timeouts on queries can (and will) happen
> occasionally under load, but timeout on connect is atypical. Any sign of
> networking issues/slowness/dns problems/etc?
>
>
>
>
>
> On Fri, Sep 29, 2017 at 6:43 AM, Jonathan Baynes <
> Jonathan.Baynes@tradeweb.com> wrote:
>
> Hi Community,
>
>
>
> I have a 6 node ring, covering 2 DC’s, the ring isn’t being used yet and
> we are just in the connectivity and testing phase. So the boxes are NOT
> under any load.
>
>
>
> I’ve gone to connect to CQLSH this afternoon and I’ve had this returned:
>
>
>
> cqlsh xx.xxx.xxx.xx -u cassandra -p cassandra
>
> Connection error: ('Unable to connect to any servers', {'xx.xxx.xxx.xx':
> OperationTimedOut('errors=Timed out creating connection (5 seconds),
> last_host=None',)})
>
>
>
> Status of the service is “Running”
>
> Nodetool status is UP for all nodes.
>
> Tested telnet on all ports and they are all up and connecting.
>
> Checked the PID is up for cassandra (it is)
>
>
>
> Any idea how I can debug the cause of this further?
>
>
>
>
>
> Thanks
>
> J
>
>
>
>
>
>
>
>

RE: Issue with New Production Cluster

Posted by Jonathan Baynes <Jo...@tradeweb.com>.
Hi Jeff,

This is version 3.0.11. Being run on Oracle Red Hat Linux.

If I retry immediately it fails; if I leave it for 20 minutes, as I have just now, and retry, it works. (?!?!)
I've checked all the logs (system and debug), and in the logs I have this:

DEBUG [GossipStage:1] 2017-09-29 15:28:04,390 FailureDetector.java:456 - Ignoring interval time of 2102011879 for /10.172.181.62
INFO  [SharedPool-Worker-1] 2017-09-29 15:28:07,382 Message.java:615 - Unexpected exception during request; channel = [id: 0x963c334d, L:/10.172.117.61:9042 - R:/10.172.117.21:51853]
io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
    at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
DEBUG [GossipStage:1] 2017-09-29 15:28:07,389 FailureDetector.java:456 - Ignoring interval time of 2085147992 for /10.172.117.63
INFO  [SharedPool-Worker-2] 2017-09-29 15:28:07,400 Message.java:615 - Unexpected exception during request; channel = [id: 0xa7633234, L:/10.172.117.61:9042 ! R:/10.172.117.21:51818]
io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
    at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
INFO  [SharedPool-Worker-1] 2017-09-29 15:28:07,418 Message.java:615 - Unexpected exception during request; channel = [id: 0x8c685c2c, L:/10.172.117.61:9042 ! R:/10.172.117.21:51838]
io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
    at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
INFO  [HANDSHAKE-/10.172.181.62] 2017-09-29 15:28:10,140 OutboundTcpConnection.java:523 - Handshaking version with /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:12,288 FailureDetector.java:456 - Ignoring interval time of 2153356364 for /10.172.117.63
DEBUG [GossipStage:1] 2017-09-29 15:28:15,289 FailureDetector.java:456 - Ignoring interval time of 2000150143 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:16,305 FailureDetector.java:456 - Ignoring interval time of 2790916098 for /10.172.117.62
DEBUG [GossipStage:1] 2017-09-29 15:28:17,289 FailureDetector.java:456 - Ignoring interval time of 2000209617 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:18,391 FailureDetector.java:456 - Ignoring interval time of 2085345347 for /10.172.181.63
DEBUG [GossipStage:1] 2017-09-29 15:28:19,391 FailureDetector.java:456 - Ignoring interval time of 2102054326 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:20,391 FailureDetector.java:456 - Ignoring interval time of 2000754004 for /10.172.181.61
DEBUG [GossipStage:1] 2017-09-29 15:28:20,391 FailureDetector.java:456 - Ignoring interval time of 2000787648 for /10.172.181.63
DEBUG [GossipStage:1] 2017-09-29 15:28:23,391 FailureDetector.java:456 - Ignoring interval time of 2254752613 for /10.172.117.63
DEBUG [GossipStage:1] 2017-09-29 15:28:23,392 FailureDetector.java:456 - Ignoring interval time of 2086206045 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:23,392 FailureDetector.java:456 - Ignoring interval time of 2086239333 for /10.172.181.63
DEBUG [GossipStage:1] 2017-09-29 15:28:26,392 FailureDetector.java:456 - Ignoring interval time of 2000751862 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:26,516 FailureDetector.java:456 - Ignoring interval time of 2124952635 for /10.172.181.63
DEBUG [GossipStage:1] 2017-09-29 15:28:29,392 FailureDetector.java:456 - Ignoring interval time of 2254658974 for /10.172.117.63
DEBUG [GossipStage:1] 2017-09-29 15:28:30,306 FailureDetector.java:456 - Ignoring interval time of 2000326847 for /10.172.181.63
DEBUG [GossipStage:1] 2017-09-29 15:28:32,393 FailureDetector.java:456 - Ignoring interval time of 2086445998 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:32,393 FailureDetector.java:456 - Ignoring interval time of 2086481727 for /10.172.181.63
DEBUG [GossipStage:1] 2017-09-29 15:28:35,290 FailureDetector.java:456 - Ignoring interval time of 2363746015 for /10.172.117.62
DEBUG [GossipStage:1] 2017-09-29 15:28:35,290 FailureDetector.java:456 - Ignoring interval time of 2151813348 for /10.172.117.63
DEBUG [GossipStage:1] 2017-09-29 15:28:37,393 FailureDetector.java:456 - Ignoring interval time of 2103432914 for /10.172.181.62
DEBUG [GossipStage:1] 2017-09-29 15:28:40,394 FailureDetector.java:456 - Ignoring interval time of 2255155834 for /10.172.117.63
DEBUG [GossipStage:1] 2017-09-29 15:28:41,394 FailureDetector.java:456 - Ignoring interval time of 2086966679 for /10.172.117.62
DEBUG [GossipStage:1] 2017-09-29 15:28:41,394 FailureDetector.java:456 - Ignoring interval time of 2087037617 for /10.172.181.63
INFO  [HANDSHAKE-/10.172.181.61] 2017-09-29 15:28:47,132 OutboundTcpConnection.java:523 - Handshaking version with /10.172.181.61


The handshake at 15:28 is about the same time that I just tried it and it worked. So is it unable to handshake with the nodes?

Any suggestions would be great, I have to go to the Network team with this, but I’m not sure where to start…

From: Jeff Jirsa [mailto:jjirsa@gmail.com]
Sent: 29 September 2017 14:54
To: cassandra
Subject: Re: Issue with New Production Cluster

What version?
If you retry immediately, does it reconnect?
Anything in the logs?

What you describe is atypical - timeouts on queries can (and will) happen occasionally under load, but timeout on connect is atypical. Any sign of networking issues/slowness/dns problems/etc?


On Fri, Sep 29, 2017 at 6:43 AM, Jonathan Baynes <Jo...@tradeweb.com> wrote:
Hi Community,

I have a 6 node ring, covering 2 DC’s, the ring isn’t being used yet and we are just in the connectivity and testing phase. So the boxes are NOT under any load.

I’ve gone to connect to CQLSH this afternoon and I’ve had this returned:

cqlsh xx.xxx.xxx.xx -u cassandra -p cassandra
Connection error: ('Unable to connect to any servers', {'xx.xxx.xxx.xx': OperationTimedOut('errors=Timed out creating connection (5 seconds), last_host=None',)})

Status of the service is “Running”
Nodetool status is UP for all nodes.
Tested telnet on all ports and they are all up and connecting.
Checked the PID is up for cassandra (it is)

Any idea how I can debug the cause of this further?


Thanks
J




Re: Issue with New Production Cluster

Posted by Jeff Jirsa <jj...@gmail.com>.
What version?
If you retry immediately, does it reconnect?
Anything in the logs?

What you describe is atypical: timeouts on queries can (and will) happen
occasionally under load, but a timeout on connect is unusual. Any sign of
networking issues, slowness, DNS problems, etc.?
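
One way to separate slow DNS from a slow TCP connection is to time the two steps independently from the machine where cqlsh runs. The following is only a minimal sketch, not part of the original reply: the helper name `time_connect` is hypothetical, and it assumes the node listens on the default native transport port 9042 and reuses the 5 second limit shown in the cqlsh error.

```python
import socket
import time


def time_connect(host, port=9042, timeout=5.0):
    """Return (dns_seconds, connect_seconds) for one TCP connection attempt.

    Hypothetical helper for illustration. Port 9042 is assumed to be the
    native transport port; change it if the cluster is configured otherwise.
    """
    # Resolve first so that a slow lookup shows up separately from a slow
    # TCP handshake. For a literal IP address this step should be near zero.
    start = time.monotonic()
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    dns_seconds = time.monotonic() - start

    # Time only the TCP connect; socket.timeout is raised if it takes
    # longer than `timeout` seconds.
    start = time.monotonic()
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        sock.connect(addr)
    connect_seconds = time.monotonic() - start

    return dns_seconds, connect_seconds
```

Calling `time_connect("xx.xxx.xxx.xx")` a few times in a row would show whether either step is consistently slow or only intermittently so. If both are fast, the delay is more likely in the CQL handshake or authentication than in the network path, and the Cassandra logs are the next place to look.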


On Fri, Sep 29, 2017 at 6:43 AM, Jonathan Baynes <Jonathan.Baynes@tradeweb.com> wrote:

> Hi Community,
>
> I have a 6 node ring, covering 2 DC’s, the ring isn’t being used yet and
> we are just in the connectivity and testing phase. So the boxes are NOT
> under any load.
>
> I’ve gone to connect to CQLSH this afternoon and I’ve had this returned:
>
> cqlsh xx.xxx.xxx.xx -u cassandra -p cassandra
>
> Connection error: ('Unable to connect to any servers', { xx.xxx.xxx.xx':
> OperationTimedOut('errors=Timed out creating connection (5 seconds),
> last_host=None',)})
>
> Status of the service is “Running”
> Nodetool status is UP for all nodes.
> Tested telnet on all ports and they are all up and connecting.
> Checked the PID is up for cassandra (it is)
>
> Any idea how I can debug the cause of this further?
>
> Thanks
> J