You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@cassandra.apache.org by Jean Carlo <je...@gmail.com> on 2016/01/29 10:51:56 UTC

Cassandra's log is full of mesages reset by peers even without traffic

Hello guys,

I have a cluster cassandra 2.1.12 with 6 nodes. All the logs of my nodes
are having this messages marked as INFO

INFO  [SharedPool-Worker-1] 2016-01-29 10:40:57,745 Message.java:532 -
Unexpected exception during request; channel = [id: 0xff15eb8c, /
172.16.162.4:9042]
java.io.IOException: Error while read(...): Connection reset by peer
    at io.netty.channel.epoll.Native.readAddress(Native Method)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at
io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.doReadBytes(EpollSocketChannel.java:675)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at
io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:714)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at
io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at
io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]

This happens either the cluster is stressed or not. Btw it is not
production. The ip marked there (172.16.162.4) belongs to a node of the
cluster, this is not the only node that appears, acctually we are having
all the node's ip having that reset by peer problem.

Our cluster is having more reads than writes. like 50 reads per second.

Any one got the same problem?


Best regards

Jean Carlo

"The best way to predict the future is to invent it" Alan Kay

Re: Cassandra's log is full of mesages reset by peers even without traffic

Posted by Anuj Wadehra <an...@yahoo.co.in>.

Hi Jean,
As mentioned in the DataStax link, your TCP connections will be marked dead after 300+75*9 =975 seconds. Make sure that your firewall idle timeout is more than 975 seconds. Otherwise firewall will drop connections and you may face issues.You can also try setting all three values same as mentioned in the link to see whether the problem gets resolved after doing that.
ThanksAnuj

Sent from Yahoo Mail on Android 
 
  On Mon, 1 Feb, 2016 at 9:18 pm, Jean Carlo<je...@gmail.com> wrote:   Hello Annuj,,

I checked my settings and this what I got.

root@node001[SPH][BENCH][PnS3]:~$ sysctl -A | grep net.ipv4 | grep net.ipv4.tcp_keepalive_probes
net.ipv4.tcp_keepalive_probes = 9
root@node001[SPH][BENCH][PnS3]:~$ sysctl -A | grep net.ipv4 | grep net.ipv4.tcp_keepalive_intvl 
net.ipv4.tcp_keepalive_intvl = 75
root@node001[SPH][BENCH][PnS3]:~$ sysctl -A | grep net.ipv4 | grep net.ipv4.tcp_keepalive_time 
net.ipv4.tcp_keepalive_time = 300

The tcp_keepalive_time is quite high in comparation to that written on the doc

https://docs.datastax.com/en/cassandra/2.1/cassandra/troubleshooting/trblshootIdleFirewall.html




Do you think that is ok?  

Best regards

Jean Carlo
"The best way to predict the future is to invent it" Alan Kay

On Fri, Jan 29, 2016 at 11:02 AM, Anuj Wadehra <an...@yahoo.co.in> wrote:

Hi Jean,
Please make sure that your Firewall is not dropping TCP connections which are in use. Tcp keep alive on all nodes must be less than the firewall setting. Please refer to https://docs.datastax.com/en/cassandra/2.0/cassandra/troubleshooting/trblshootIdleFirewall.html for details on TCP settings.

ThanksAnuj

Sent from Yahoo Mail on Android 
 
 On Fri, 29 Jan, 2016 at 3:21 pm, Jean Carlo<je...@gmail.com> wrote:   Hello guys, 

I have a cluster cassandra 2.1.12 with 6 nodes. All the logs of my nodes are having this messages marked as INFO

INFO  [SharedPool-Worker-1] 2016-01-29 10:40:57,745 Message.java:532 - Unexpected exception during request; channel = [id: 0xff15eb8c, /172.16.162.4:9042]
java.io.IOException: Error while read(...): Connection reset by peer
    at io.netty.channel.epoll.Native.readAddress(Native Method) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.doReadBytes(EpollSocketChannel.java:675) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:714) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]

This happens either the cluster is stressed or not. Btw it is not production. The ip marked there (172.16.162.4) belongs to a node of the cluster, this is not the only node that appears, acctually we are having all the node's ip having that reset by peer problem.

Our cluster is having more reads than writes. like 50 reads per second. 

Any one got the same problem?


Best regards

Jean Carlo
"The best way to predict the future is to invent it" Alan Kay

Re: Cassandra's log is full of mesages reset by peers even without traffic

Posted by Jean Carlo <je...@gmail.com>.

Hello Annuj,,

I checked my settings and this what I got.
root@node001[SPH][BENCH][PnS3]:~$ sysctl -A | grep net.ipv4 | grep
net.ipv4.tcp_keepalive_probes
net.ipv4.tcp_keepalive_probes = 9
root@node001[SPH][BENCH][PnS3]:~$ sysctl -A | grep net.ipv4 | grep
net.ipv4.tcp_keepalive_intvl
net.ipv4.tcp_keepalive_intvl = 75
root@node001[SPH][BENCH][PnS3]:~$ sysctl -A | grep net.ipv4 | grep
net.ipv4.tcp_keepalive_time
net.ipv4.tcp_keepalive_time = 300The tcp_keepalive_time is quite high in
comparation to that written on the doc
https://docs.datastax.com/en/cassandra/2.1/cassandra/troubleshooting/trblshootIdleFirewall.html
Do you think that is ok?
Best regards

Jean Carlo

"The best way to predict the future is to invent it" Alan Kay

On Fri, Jan 29, 2016 at 11:02 AM, Anuj Wadehra <an...@yahoo.co.in>
wrote:

> Hi Jean,
>
> Please make sure that your Firewall is not dropping TCP connections which
> are in use. Tcp keep alive on all nodes must be less than the firewall
> setting. Please refer to
>
> https://docs.datastax.com/en/cassandra/2.0/cassandra/troubleshooting/trblshootIdleFirewall.html for
> details on TCP settings.
>
>
> Thanks
> Anuj
>
> Sent from Yahoo Mail on Android
> <https://overview.mail.yahoo.com/mobile/?.src=Android>
>
> On Fri, 29 Jan, 2016 at 3:21 pm, Jean Carlo
> <je...@gmail.com> wrote:
> Hello guys,
>
> I have a cluster cassandra 2.1.12 with 6 nodes. All the logs of my nodes
> are having this messages marked as INFO
>
> INFO  [SharedPool-Worker-1] 2016-01-29 10:40:57,745 Message.java:532 -
> Unexpected exception during request; channel = [id: 0xff15eb8c, /
> 172.16.162.4:9042]
> java.io.IOException: Error while read(...): Connection reset by peer
>     at io.netty.channel.epoll.Native.readAddress(Native Method)
> ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at
> io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.doReadBytes(EpollSocketChannel.java:675)
> ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at
> io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:714)
> ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at
> io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326)
> ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264)
> ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
> ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
> ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
>
> This happens either the cluster is stressed or not. Btw it is not
> production. The ip marked there (172.16.162.4) belongs to a node of the
> cluster, this is not the only node that appears, acctually we are having
> all the node's ip having that reset by peer problem.
>
> Our cluster is having more reads than writes. like 50 reads per second.
>
> Any one got the same problem?
>
>
> Best regards
>
> Jean Carlo
>
> "The best way to predict the future is to invent it" Alan Kay
>
>

Re: Cassandra's log is full of mesages reset by peers even without traffic

Posted by Jean Carlo <je...@gmail.com>.

Hi anuj,

Thx for your replay, acctually I paste part of the result of the command
gre done over one log and I can see only the ip of the machine local


grep   "Unexpected exception during request"
/var/opt/hosting/log/cassandra/system.log

INFO  [SharedPool-Worker-1] 2016-01-29 10:40:47,744 Message.java:532 -
Unexpected exception during request; channel = [id: 0x6ebe93cb, /
172.16.162.4:9042]
INFO  [SharedPool-Worker-1] 2016-01-29 10:40:57,745 Message.java:532 -
Unexpected exception during request; channel = [id: 0xff15eb8c, /
172.16.162.4:9042]
INFO  [SharedPool-Worker-1] 2016-01-29 10:54:33,721 Message.java:532 -
Unexpected exception during request; channel = [id: 0xc42cc7ff, /
172.16.162.2:11436 :> /172.16.162.5:9042]
INFO  [SharedPool-Worker-2] 2016-01-29 10:45:47,761 Message.java:532 -
Unexpected exception during request; channel = [id: 0x603349e4, /
172.16.162.4:9042]
INFO  [SharedPool-Worker-3] 2016-01-29 10:48:17,766 Message.java:532 -
Unexpected exception during request; channel = [id: 0x5bed4eae, /
172.16.162.4:9042]
INFO  [SharedPool-Worker-2] 2016-01-29 10:48:27,767 Message.java:532 -
Unexpected exception during request; channel = [id: 0x6136756b, /
172.16.162.4:9042]
INFO  [SharedPool-Worker-4] 2016-01-29 10:48:27,767 Message.java:532 -
Unexpected exception during request; channel = [id: 0x17c83eb8, /
172.16.162.4:9042]
INFO  [SharedPool-Worker-4] 2016-01-29 10:52:07,778 Message.java:532 -
Unexpected exception during request; channel = [id: 0x1a78b589, /
172.16.162.4:9042]
INFO  [SharedPool-Worker-2] 2016-01-29 10:52:17,779 Message.java:532 -
Unexpected exception during request; channel = [id: 0x017117b3, /
172.16.162.4:9042]
INFO  [SharedPool-Worker-2] 2016-01-29 11:01:37,813 Message.java:532 -
Unexpected exception during request; channel = [id: 0x3efbd844, /
172.16.162.4:9042]

Then I don't know if the firewall has something to do on that case, becasue
it is a local connection over native protocol

Best regards

Jean Carlo

"The best way to predict the future is to invent it" Alan Kay

On Fri, Jan 29, 2016 at 11:02 AM, Anuj Wadehra <an...@yahoo.co.in>
wrote:

> Hi Jean,
>
> Please make sure that your Firewall is not dropping TCP connections which
> are in use. Tcp keep alive on all nodes must be less than the firewall
> setting. Please refer to
>
> https://docs.datastax.com/en/cassandra/2.0/cassandra/troubleshooting/trblshootIdleFirewall.html for
> details on TCP settings.
>
>
> Thanks
> Anuj
>
> Sent from Yahoo Mail on Android
> <https://overview.mail.yahoo.com/mobile/?.src=Android>
>
> On Fri, 29 Jan, 2016 at 3:21 pm, Jean Carlo
> <je...@gmail.com> wrote:
> Hello guys,
>
> I have a cluster cassandra 2.1.12 with 6 nodes. All the logs of my nodes
> are having this messages marked as INFO
>
> INFO  [SharedPool-Worker-1] 2016-01-29 10:40:57,745 Message.java:532 -
> Unexpected exception during request; channel = [id: 0xff15eb8c, /
> 172.16.162.4:9042]
> java.io.IOException: Error while read(...): Connection reset by peer
>     at io.netty.channel.epoll.Native.readAddress(Native Method)
> ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at
> io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.doReadBytes(EpollSocketChannel.java:675)
> ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at
> io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:714)
> ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at
> io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326)
> ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264)
> ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
> ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
> ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
>
> This happens either the cluster is stressed or not. Btw it is not
> production. The ip marked there (172.16.162.4) belongs to a node of the
> cluster, this is not the only node that appears, acctually we are having
> all the node's ip having that reset by peer problem.
>
> Our cluster is having more reads than writes. like 50 reads per second.
>
> Any one got the same problem?
>
>
> Best regards
>
> Jean Carlo
>
> "The best way to predict the future is to invent it" Alan Kay
>
>

Re: Cassandra's log is full of mesages reset by peers even without traffic

Posted by Anuj Wadehra <an...@yahoo.co.in>.

Hi Jean,
Please make sure that your Firewall is not dropping TCP connections which are in use. Tcp keep alive on all nodes must be less than the firewall setting. Please refer to https://docs.datastax.com/en/cassandra/2.0/cassandra/troubleshooting/trblshootIdleFirewall.html for details on TCP settings.

ThanksAnuj

Sent from Yahoo Mail on Android 
 
  On Fri, 29 Jan, 2016 at 3:21 pm, Jean Carlo<je...@gmail.com> wrote:   Hello guys, 

I have a cluster cassandra 2.1.12 with 6 nodes. All the logs of my nodes are having this messages marked as INFO

INFO  [SharedPool-Worker-1] 2016-01-29 10:40:57,745 Message.java:532 - Unexpected exception during request; channel = [id: 0xff15eb8c, /172.16.162.4:9042]
java.io.IOException: Error while read(...): Connection reset by peer
    at io.netty.channel.epoll.Native.readAddress(Native Method) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.doReadBytes(EpollSocketChannel.java:675) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:714) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]

This happens either the cluster is stressed or not. Btw it is not production. The ip marked there (172.16.162.4) belongs to a node of the cluster, this is not the only node that appears, acctually we are having all the node's ip having that reset by peer problem.

Our cluster is having more reads than writes. like 50 reads per second. 

Any one got the same problem?


Best regards

Jean Carlo
"The best way to predict the future is to invent it" Alan Kay