Posted to commits@cassandra.apache.org by "Joseph Lynch (JIRA)" <ji...@apache.org> on 2019/05/28 23:15:00 UTC
[jira] [Commented] (CASSANDRA-14978) Cassandra going down with
"java.lang.OutOfMemoryError: Map failed" and "LEAK DETECTED"
[ https://issues.apache.org/jira/browse/CASSANDRA-14978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16850230#comment-16850230 ]
Joseph Lynch commented on CASSANDRA-14978:
------------------------------------------
[~yakir.g] were you able to find a resolution? If not, and this is still happening, can you post a dominator tree, heap dump, or heap histogram ([https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/tooldescr014.html#BABJIIHH])?
Also, by any chance, were you running repair?
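For anyone collecting the requested histogram, a minimal sketch follows. It assumes the JDK's jmap tool is on the PATH and is run as the user that owns the Cassandra process. The function names capture_histo and top_histo_rows are made up for illustration and are not part of any Cassandra tooling.

```shell
# Capture a class histogram from a running JVM (assumes JDK jmap is available).
# Note: the ":live" option forces a full GC before objects are counted.
capture_histo() {
    pid="$1"; out="$2"
    jmap -histo:live "$pid" > "$out"
}

# Print the first N data rows of a saved histogram as "bytes instances class".
# Data rows look like "   1:   12345   6789012  [B"; header and total
# lines are skipped because their first field is not "<number>:".
top_histo_rows() {
    histo_file="$1"; n="${2:-20}"
    awk '$1 ~ /^[0-9]+:$/ { print $3, $2, $4 }' "$histo_file" | head -n "$n"
}
```

Typical use would be capture_histo <pid> /tmp/histo.txt followed by top_histo_rows /tmp/histo.txt 20, with the trimmed output attached to the ticket. The /tmp path is only an example.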
> Cassandra going down with "java.lang.OutOfMemoryError: Map failed" and "LEAK DETECTED"
> --------------------------------------------------------------------------------------
>
> Key: CASSANDRA-14978
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14978
> Project: Cassandra
> Issue Type: Bug
> Reporter: Yakir Gibraltar
> Priority: Normal
>
> Cassandra version: 3.11.3
> OS: CentOS Linux release 7.4.1708 (Core)
> Kernel: 3.10.0-957.1.3.el7.x86_64
> JDK: jdk1.8.0_131
> Heap: same errors with 16GB / 32GB / 64GB.
> *We are seeing these errors in production:*
> *java.io.IOException: Map failed:*
> {code:java}
> ERROR [CompactionExecutor:5017] 2019-01-14 00:02:04,763 CassandraDaemon.java:228 - Exception in thread Thread[CompactionExecutor:5017,1,main]
> org.apache.cassandra.io.FSReadError: java.io.IOException: Map failed
> at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:157) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.io.util.MmappedRegions$State.add(MmappedRegions.java:310) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.io.util.MmappedRegions$State.access$400(MmappedRegions.java:246) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.io.util.MmappedRegions.updateState(MmappedRegions.java:181) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:73) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:61) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.io.util.MmappedRegions.map(MmappedRegions.java:104) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.io.util.FileHandle$Builder.complete(FileHandle.java:362) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.io.sstable.format.big.BigTableWriter.openEarly(BigTableWriter.java:290) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.io.sstable.SSTableRewriter.maybeReopenEarly(SSTableRewriter.java:179) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:134) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.realAppend(DefaultCompactionWriter.java:65) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:142) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:201) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:274) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_131]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_131]
> at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [apache-cassandra-3.11.3.jar:3.11.3]
> at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]
> Caused by: java.io.IOException: Map failed
> at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:940) ~[na:1.8.0_131]
> at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:153) ~[apache-cassandra-3.11.3.jar:3.11.3]
> ... 23 common frames omitted
> Caused by: java.lang.OutOfMemoryError: Map failed
> at sun.nio.ch.FileChannelImpl.map0(Native Method) ~[na:1.8.0_131]
> at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:937) ~[na:1.8.0_131]
> ... 24 common frames omitted
> {code}
> *LEAK DETECTED:*
> {code:java}
> ERROR [Reference-Reaper:1] 2019-01-14 00:03:46,469 Ref.java:224 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@6a4ef142) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1651696741:Memory@[6b91a27c5290..6b91a27de290) was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2019-01-14 00:03:46,520 Ref.java:224 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@6c458f8a) to class org.apache.cassandra.io.util.FileHandle$Cleanup@1179238225:/var/lib/cassandra/data/disk1/sessions_rawdata/sessions_v2_2019_01_13-19be8e90037011e9a45847402874bbd7/mc-1209-big-Index.db was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2019-01-14 00:03:46,520 Ref.java:224 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@5b90823b) to class org.apache.cassandra.io.util.MmappedRegions$Tidier@783549664:/var/lib/cassandra/data/disk1/sessions_rawdata/sessions_v2_2019_01_13-19be8e90037011e9a45847402874bbd7/mc-1209-big-Data.db was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2019-01-14 00:03:46,520 Ref.java:224 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@6ecdf763) to class org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$Tidy@1710583516:[Memory@[0..3e24), Memory@[0..45e88)] was not released before the reference was garbage collected{code}
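When many LEAK DETECTED lines are logged, it can help to see which resource classes they refer to. The following is a minimal sketch that assumes the log line format shown above. The function name leak_summary is hypothetical, and the log path passed to it depends on the installation.

```shell
# Count LEAK DETECTED entries per leaked resource class in a Cassandra log.
# Prints "count class" lines, most frequent first.
leak_summary() {
    log_file="$1"
    # Keep only the class name between " to class " and the following "@".
    sed -n 's/.*LEAK DETECTED.* to class \([^@]*\)@.*/\1/p' "$log_file" |
        sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}
```

For example, leak_summary /var/log/cassandra/system.log, where that path is an assumption about a package-style install.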
>
> *Limits of Cassandra process:*
> {code:java}
> [root@cass063 ~ ]# cat /proc/`ps -ef | grep CassandraDaemon | grep -v grep | awk '{print $2}'`/limits
> Limit                     Soft Limit           Hard Limit           Units
> Max cpu time              unlimited            unlimited            seconds
> Max file size             unlimited            unlimited            bytes
> Max data size             unlimited            unlimited            bytes
> Max stack size            8388608              unlimited            bytes
> Max core file size        0                    unlimited            bytes
> Max resident set          unlimited            unlimited            bytes
> Max processes             32768                32768                processes
> Max open files            100000               100000               files
> Max locked memory         unlimited            unlimited            bytes
> Max address space         unlimited            unlimited            bytes
> Max file locks            unlimited            unlimited            locks
> Max pending signals       766985               766985               signals
> Max msgqueue size         819200               819200               bytes
> Max nice priority         0                    0
> Max realtime priority     0                    0
> Max realtime timeout      unlimited            unlimited            us{code}
>
> *max_map_count parameter on OS:*
> {code:java}
> [root@cass063 ~]# sysctl vm.max_map_count
> vm.max_map_count = 1073741824
> {code}
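To see how close a process is to this limit, the number of lines in its maps file can be compared with the sysctl value. This is a minimal sketch. The function name map_headroom is hypothetical, and both inputs are passed as file paths so the same function works on saved copies.

```shell
# Report how many mappings a process holds relative to a mapping limit.
# $1: maps file (for example /proc/<pid>/maps)
# $2: file holding the limit (for example /proc/sys/vm/max_map_count)
map_headroom() {
    maps_file="$1"; limit_file="$2"
    used=$(wc -l < "$maps_file" | tr -d ' ')   # one line per mapping
    limit=$(cat "$limit_file")
    echo "used=$used limit=$limit remaining=$((limit - used))"
}
```

It would be called as map_headroom /proc/<pid>/maps /proc/sys/vm/max_map_count, with <pid> replaced by the Cassandra process ID.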
>
> *cassandra.yaml:*
> {code:java}
> cluster_name: 'Cass Cluster'
> num_tokens: 256
> hinted_handoff_enabled: false
> max_hint_window_in_ms: 10800000
> hinted_handoff_throttle_in_kb: 1024
> max_hints_delivery_threads: 2
> hints_directory: /var/lib/cassandra/hints
> hints_flush_period_in_ms: 10000
> max_hints_file_size_in_mb: 128
> batchlog_replay_throttle_in_kb: 1024
> authenticator: AllowAllAuthenticator
> authorizer: AllowAllAuthorizer
> role_manager: CassandraRoleManager
> roles_validity_in_ms: 2000
> permissions_validity_in_ms: 2000
> credentials_validity_in_ms: 2000
> partitioner: org.apache.cassandra.dht.Murmur3Partitioner
> data_file_directories:
>     - /var/lib/cassandra/data/disk1
> commitlog_directory: /var/lib/cassandra/data/disk1/commitlog
> cdc_enabled: false
> disk_failure_policy: stop
> commit_failure_policy: stop
> prepared_statements_cache_size_mb:
> thrift_prepared_statements_cache_size_mb:
> key_cache_size_in_mb: 0
> key_cache_save_period: 3600
> row_cache_size_in_mb: 0
> row_cache_save_period: 0
> counter_cache_size_in_mb:
> counter_cache_save_period: 7200
> saved_caches_directory: /var/lib/cassandra/data/disk1/saved_caches
> commitlog_sync: periodic
> commitlog_sync_period_in_ms: 10000
> commitlog_segment_size_in_mb: 32
> seed_provider:
>     - class_name: org.apache.cassandra.locator.SimpleSeedProvider
>       parameters:
>           - seeds: "10.110.30.1,10.110.30.2,10.110.30.3"
> concurrent_reads: 48
> concurrent_writes: 96
> concurrent_counter_writes: 32
> concurrent_materialized_view_writes: 32
> file_cache_size_in_mb: 10240
> memtable_offheap_space_in_mb: 10240
> memtable_cleanup_threshold: 0.1
> memtable_allocation_type: offheap_buffers
> commitlog_total_space_in_mb: 8192
> memtable_flush_writers: 8
> index_summary_capacity_in_mb:
> index_summary_resize_interval_in_minutes: 60
> trickle_fsync: true
> trickle_fsync_interval_in_kb: 10240
> storage_port: 7000
> ssl_storage_port: 7001
> listen_address: 10.106.62.34
> start_native_transport: true
> native_transport_port: 9042
> start_rpc: false
> rpc_address: 0.0.0.0
> rpc_port: 9160
> broadcast_rpc_address: 10.106.62.34
> rpc_keepalive: true
> rpc_server_type: hsha
> rpc_max_threads: 128
> thrift_framed_transport_size_in_mb: 15
> incremental_backups: false
> snapshot_before_compaction: false
> auto_snapshot: true
> column_index_size_in_kb: 64
> column_index_cache_size_in_kb: 2
> concurrent_compactors: 32
> compaction_throughput_mb_per_sec: 500
> sstable_preemptive_open_interval_in_mb: 50
> stream_throughput_outbound_megabits_per_sec: 0
> read_request_timeout_in_ms: 10000
> range_request_timeout_in_ms: 10000
> write_request_timeout_in_ms: 60000
> counter_write_request_timeout_in_ms: 10000
> cas_contention_timeout_in_ms: 1000
> truncate_request_timeout_in_ms: 60000
> request_timeout_in_ms: 10000
> slow_query_log_timeout_in_ms: 500
> cross_node_timeout: false
> phi_convict_threshold: 12
> endpoint_snitch: GossipingPropertyFileSnitch
> dynamic_snitch_update_interval_in_ms: 100
> dynamic_snitch_reset_interval_in_ms: 600000
> dynamic_snitch_badness_threshold: 0.5
> request_scheduler: org.apache.cassandra.scheduler.NoScheduler
> server_encryption_options:
>     internode_encryption: none
>     keystore: conf/.keystore
>     keystore_password: cassandra
>     truststore: conf/.truststore
>     truststore_password: cassandra
> client_encryption_options:
>     enabled: false
>     optional: false
>     keystore: conf/.keystore
>     keystore_password: cassandra
> internode_compression: dc
> inter_dc_tcp_nodelay: false
> tracetype_query_ttl: 86400
> tracetype_repair_ttl: 604800
> enable_user_defined_functions: false
> enable_scripted_user_defined_functions: false
> enable_materialized_views: true
> windows_timer_interval: 1
> transparent_data_encryption_options:
>     enabled: false
>     chunk_length_kb: 64
>     cipher: AES/CBC/PKCS5Padding
>     key_alias: testing:1
>     key_provider:
>       - class_name: org.apache.cassandra.security.JKSKeyProvider
>         parameters:
>           - keystore: conf/.keystore
>             keystore_password: cassandra
>             store_type: JCEKS
>             key_password: cassandra
> tombstone_warn_threshold: 1000
> tombstone_failure_threshold: 100000
> batch_size_warn_threshold_in_kb: 5
> batch_size_fail_threshold_in_kb: 50
> unlogged_batch_across_partitions_warn_threshold: 10
> compaction_large_partition_warning_threshold_mb: 10
> gc_warn_threshold_in_ms: 1000
> back_pressure_enabled: false
> back_pressure_strategy:
>     - class_name: org.apache.cassandra.net.RateBasedBackPressure
>       parameters:
>         - high_ratio: 0.90
>           factor: 5
>           flow: FAST{code}
>
> *The Cassandra process holds a large number of memory mappings (more than 200K):*
> {code:java}
> [root@cass063 ~]# wc -l /proc/`ps -ef | grep CassandraDaemon | grep -v grep | awk '{print $2}'`/maps
> 239587 /proc/202664/maps{code}
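On the figures in this report, 239587 mappings is far below the vm.max_map_count of 1073741824, so a breakdown of which files hold the mappings may be more informative than the raw count. A minimal sketch, assuming the standard /proc/<pid>/maps layout with the backing path in the sixth field. The function name top_mapped_files is hypothetical.

```shell
# List the files with the most mappings in a /proc/<pid>/maps style file.
# Field 6 is the backing path. Anonymous mappings have no sixth field and
# are skipped; pseudo-entries such as [heap] are counted as-is. Paths that
# contain spaces would be cut at the first space.
top_mapped_files() {
    maps_file="$1"; n="${2:-10}"
    awk 'NF >= 6 { print $6 }' "$maps_file" |
        sort | uniq -c | sort -rn | head -n "$n" | awk '{ print $1, $2 }'
}
```

For example, top_mapped_files /proc/202664/maps 20 prints the twenty paths with the most mappings as "count path" lines.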
> I got the same error with heap sizes of 16GB / 32GB / 64GB.
>