You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Thomas Steinmaurer (JIRA)" <ji...@apache.org> on 2018/09/05 13:23:00 UTC
[jira] [Comment Edited] (CASSANDRA-14691) Cassandra 2.1 backport -
The JVM should exit if jmx fails to bind
[ https://issues.apache.org/jira/browse/CASSANDRA-14691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16604394#comment-16604394 ]
Thomas Steinmaurer edited comment on CASSANDRA-14691 at 9/5/18 1:22 PM:
------------------------------------------------------------------------
As discussed with [~KurtG] in Cassandra user list, opened a clone of CASSANDRA-11540 for a 2.1 backport as the change seems to be simple + this all could be a source for corruption.
{noformat}
ERROR [main] 2018-08-05 21:10:24,046 CassandraDaemon.java:120 - Error starting local jmx server:
java.rmi.server.ExportException: Port already in use: 7199; nested exception is:
java.net.BindException: Address already in use (Bind failed)
…
But then continuing with stuff like opening system and even user tables:
INFO [main] 2018-08-05 21:10:24,060 CacheService.java:110 - Initializing key cache with capacity of 100 MBs.
INFO [main] 2018-08-05 21:10:24,067 CacheService.java:132 - Initializing row cache with capacity of 0 MBs
INFO [main] 2018-08-05 21:10:24,073 CacheService.java:149 - Initializing counter cache with capacity of 50 MBs
INFO [main] 2018-08-05 21:10:24,074 CacheService.java:160 - Scheduling counter cache save to every 7200 seconds (going to save all keys).
INFO [main] 2018-08-05 21:10:24,161 ColumnFamilyStore.java:365 - Initializing system.sstable_activity
INFO [SSTableBatchOpen:2] 2018-08-05 21:10:24,692 SSTableReader.java:475 - Opening /var/opt/xxx-managed/cassandra/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-165 (2023 bytes)
INFO [SSTableBatchOpen:3] 2018-08-05 21:10:24,692 SSTableReader.java:475 - Opening /var/opt/xxx-managed/cassandra/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-167 (2336 bytes)
INFO [SSTableBatchOpen:1] 2018-08-05 21:10:24,692 SSTableReader.java:475 - Opening /var/opt/xxx-managed/cassandra/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-166 (2686 bytes)
INFO [main] 2018-08-05 21:10:24,755 ColumnFamilyStore.java:365 - Initializing system.hints
INFO [SSTableBatchOpen:1] 2018-08-05 21:10:24,758 SSTableReader.java:475 - Opening /var/opt/xxx-managed/cassandra/system/hints-2666e20573ef38b390fefecf96e8f0c7/system-hints-ka-377 (46210621 bytes)
INFO [main] 2018-08-05 21:10:24,766 ColumnFamilyStore.java:365 - Initializing system.compaction_history
INFO [SSTableBatchOpen:1] 2018-08-05 21:10:24,768 SSTableReader.java:475 - Opening /var/opt/xxx-managed/cassandra/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/system-compaction_history-ka-129 (91269 bytes)
…
Replaying commit logs:
…
INFO [main] 2018-08-05 21:10:25,896 CommitLogReplayer.java:267 - Replaying /var/opt/xxx-managed/cassandra/commitlog/CommitLog-4-1533133668366.log
INFO [main] 2018-08-05 21:10:25,896 CommitLogReplayer.java:270 - Replaying /var/opt/xxx-managed/cassandra/commitlog/CommitLog-4-1533133668366.log (CL version 4, messaging version 8)
…
Even writing memtables already (below just pasted system tables, but also user tables):
…
INFO [MemtableFlushWriter:4] 2018-08-05 21:11:52,524 Memtable.java:347 - Writing Memtable-size_estimates@1941663179(2.655MiB serialized bytes, 325710 ops, 2%/0% of on/off-heap limit)
INFO [MemtableFlushWriter:3] 2018-08-05 21:11:52,552 Memtable.java:347 - Writing Memtable-peer_events@1474667699(0.199KiB serialized bytes, 4 ops, 0%/0% of on/off-heap limit)
…
Until it comes to a point where it can’t bind ports like the storage port 7000:
ERROR [main] 2018-08-05 21:11:54,350 CassandraDaemon.java:395 - Fatal configuration error
org.apache.cassandra.exceptions.ConfigurationException: /XXX:7000 is in use by another process. Change listen_address:storage_port in cassandra.yaml to values that do not conflict with other services
at org.apache.cassandra.net.MessagingService.getServerSockets(MessagingService.java:495) ~[apache-cassandra-2.1.18.jar:2.1.18]
…
Until Cassandra stops:
…
INFO [StorageServiceShutdownHook] 2018-08-05 21:11:54,361 Gossiper.java:1454 - Announcing shutdown
…
So, we have around 2 minutes where Cassandra is mangling with existing data, although it shouldn’t.
Sounds like a potential candidate for data corruption, right? E.g. later on we then see things like (still while being in progress to shutdown?):
WARN [SharedPool-Worker-1] 2018-08-05 21:11:58,181 AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-1,5,main]: {}
java.lang.RuntimeException: java.io.FileNotFoundException: /var/opt/xxx-managed/cassandra/xxx/xxx-fdc68b70950611e8ad7179f2d5bfa3cf/xxx-xxx-ka-15-Data.db (No such file or directory)
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:52) ~[apache-cassandra-2.1.18.jar:2.1.18]
at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.createPooledReader(CompressedPoolingSegmentedFile.java:95) ~[apache-cassandra-2.1.18.jar:2.1.18]
at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:62) ~[apache-cassandra-2.1.18.jar:2.1.18]
…
{noformat}
was (Author: tsteinmaurer):
Could be a source for corruption.
{noformat}
ERROR [main] 2018-08-05 21:10:24,046 CassandraDaemon.java:120 - Error starting local jmx server:
java.rmi.server.ExportException: Port already in use: 7199; nested exception is:
java.net.BindException: Address already in use (Bind failed)
…
But then continuing with stuff like opening system and even user tables:
INFO [main] 2018-08-05 21:10:24,060 CacheService.java:110 - Initializing key cache with capacity of 100 MBs.
INFO [main] 2018-08-05 21:10:24,067 CacheService.java:132 - Initializing row cache with capacity of 0 MBs
INFO [main] 2018-08-05 21:10:24,073 CacheService.java:149 - Initializing counter cache with capacity of 50 MBs
INFO [main] 2018-08-05 21:10:24,074 CacheService.java:160 - Scheduling counter cache save to every 7200 seconds (going to save all keys).
INFO [main] 2018-08-05 21:10:24,161 ColumnFamilyStore.java:365 - Initializing system.sstable_activity
INFO [SSTableBatchOpen:2] 2018-08-05 21:10:24,692 SSTableReader.java:475 - Opening /var/opt/xxx-managed/cassandra/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-165 (2023 bytes)
INFO [SSTableBatchOpen:3] 2018-08-05 21:10:24,692 SSTableReader.java:475 - Opening /var/opt/xxx-managed/cassandra/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-167 (2336 bytes)
INFO [SSTableBatchOpen:1] 2018-08-05 21:10:24,692 SSTableReader.java:475 - Opening /var/opt/xxx-managed/cassandra/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-166 (2686 bytes)
INFO [main] 2018-08-05 21:10:24,755 ColumnFamilyStore.java:365 - Initializing system.hints
INFO [SSTableBatchOpen:1] 2018-08-05 21:10:24,758 SSTableReader.java:475 - Opening /var/opt/xxx-managed/cassandra/system/hints-2666e20573ef38b390fefecf96e8f0c7/system-hints-ka-377 (46210621 bytes)
INFO [main] 2018-08-05 21:10:24,766 ColumnFamilyStore.java:365 - Initializing system.compaction_history
INFO [SSTableBatchOpen:1] 2018-08-05 21:10:24,768 SSTableReader.java:475 - Opening /var/opt/xxx-managed/cassandra/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/system-compaction_history-ka-129 (91269 bytes)
…
Replaying commit logs:
…
INFO [main] 2018-08-05 21:10:25,896 CommitLogReplayer.java:267 - Replaying /var/opt/xxx-managed/cassandra/commitlog/CommitLog-4-1533133668366.log
INFO [main] 2018-08-05 21:10:25,896 CommitLogReplayer.java:270 - Replaying /var/opt/xxx-managed/cassandra/commitlog/CommitLog-4-1533133668366.log (CL version 4, messaging version 8)
…
Even writing memtables already (below just pasted system tables, but also user tables):
…
INFO [MemtableFlushWriter:4] 2018-08-05 21:11:52,524 Memtable.java:347 - Writing Memtable-size_estimates@1941663179(2.655MiB serialized bytes, 325710 ops, 2%/0% of on/off-heap limit)
INFO [MemtableFlushWriter:3] 2018-08-05 21:11:52,552 Memtable.java:347 - Writing Memtable-peer_events@1474667699(0.199KiB serialized bytes, 4 ops, 0%/0% of on/off-heap limit)
…
Until it comes to a point where it can’t bind ports like the storage port 7000:
ERROR [main] 2018-08-05 21:11:54,350 CassandraDaemon.java:395 - Fatal configuration error
org.apache.cassandra.exceptions.ConfigurationException: /XXX:7000 is in use by another process. Change listen_address:storage_port in cassandra.yaml to values that do not conflict with other services
at org.apache.cassandra.net.MessagingService.getServerSockets(MessagingService.java:495) ~[apache-cassandra-2.1.18.jar:2.1.18]
…
Until Cassandra stops:
…
INFO [StorageServiceShutdownHook] 2018-08-05 21:11:54,361 Gossiper.java:1454 - Announcing shutdown
…
So, we have around 2 minutes where Cassandra is mangling with existing data, although it shouldn’t.
Sounds like a potential candidate for data corruption, right? E.g. later on we then see things like (still while being in progress to shutdown?):
WARN [SharedPool-Worker-1] 2018-08-05 21:11:58,181 AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-1,5,main]: {}
java.lang.RuntimeException: java.io.FileNotFoundException: /var/opt/xxx-managed/cassandra/xxx/xxx-fdc68b70950611e8ad7179f2d5bfa3cf/xxx-xxx-ka-15-Data.db (No such file or directory)
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:52) ~[apache-cassandra-2.1.18.jar:2.1.18]
at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.createPooledReader(CompressedPoolingSegmentedFile.java:95) ~[apache-cassandra-2.1.18.jar:2.1.18]
at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:62) ~[apache-cassandra-2.1.18.jar:2.1.18]
…
{noformat}
> Cassandra 2.1 backport - The JVM should exit if jmx fails to bind
> -----------------------------------------------------------------
>
> Key: CASSANDRA-14691
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14691
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Thomas Steinmaurer
> Assignee: Alex Petrov
> Priority: Major
> Labels: lhf
> Fix For: 2.1.x
>
>
> If you are already running a cassandra instance, but for some reason try to start another one, this happens:
> {noformat}
> INFO 20:57:09 JNA mlockall successful
> WARN 20:57:09 JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info.
> ERROR 20:57:10 Error starting local jmx server:
> java.rmi.server.ExportException: Port already in use: 7199; nested exception is:
> java.net.BindException: Address already in use
> at sun.rmi.transport.tcp.TCPTransport.listen(TCPTransport.java:340) ~[na:1.7.0_76]
> at sun.rmi.transport.tcp.TCPTransport.exportObject(TCPTransport.java:248) ~[na:1.7.0_76]
> at sun.rmi.transport.tcp.TCPEndpoint.exportObject(TCPEndpoint.java:411) ~[na:1.7.0_76]
> at sun.rmi.transport.LiveRef.exportObject(LiveRef.java:147) ~[na:1.7.0_76]
> at sun.rmi.server.UnicastServerRef.exportObject(UnicastServerRef.java:207) ~[na:1.7.0_76]
> at sun.rmi.registry.RegistryImpl.setup(RegistryImpl.java:122) ~[na:1.7.0_76]
> at sun.rmi.registry.RegistryImpl.<init>(RegistryImpl.java:98) ~[na:1.7.0_76]
> at java.rmi.registry.LocateRegistry.createRegistry(LocateRegistry.java:239) ~[na:1.7.0_76]
> at org.apache.cassandra.service.CassandraDaemon.maybeInitJmx(CassandraDaemon.java:100) [main/:na]
> at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:222) [main/:na]
> at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:564) [main/:na]
> at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:653) [main/:na]
> Caused by: java.net.BindException: Address already in use
> at java.net.PlainSocketImpl.socketBind(Native Method) ~[na:1.7.0_76]
> at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:376) ~[na:1.7.0_76]
> at java.net.ServerSocket.bind(ServerSocket.java:376) ~[na:1.7.0_76]
> at java.net.ServerSocket.<init>(ServerSocket.java:237) ~[na:1.7.0_76]
> at javax.net.DefaultServerSocketFactory.createServerSocket(ServerSocketFactory.java:231) ~[na:1.7.0_76]
> at org.apache.cassandra.utils.RMIServerSocketFactoryImpl.createServerSocket(RMIServerSocketFactoryImpl.java:13) ~[main/:na]
> at sun.rmi.transport.tcp.TCPEndpoint.newServerSocket(TCPEndpoint.java:666) ~[na:1.7.0_76]
> at sun.rmi.transport.tcp.TCPTransport.listen(TCPTransport.java:329) ~[na:1.7.0_76]
> ... 11 common frames omitted
> {noformat}
> However the startup continues, and ends up replaying commitlogs, which is probably not a good thing.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org