Posted to user@cassandra.apache.org by Shalom Sagges <sh...@gmail.com> on 2020/12/11 16:50:31 UTC

Upgrading to 3.11.8 Caused Map Failures

Hi All,

I upgraded Cassandra from v3.11.4 to v3.11.8.
The upgrade itself went smoothly. However, a few hours later a node crashed
with an OOM error, and a few hours after that another one crashed.

It looks like they crashed from excessive GC activity (CMS). The logs show
"Map failed" errors on CompactionExecutor:

ERROR [CompactionExecutor:744] 2020-12-11 03:25:42,169 JVMStabilityInspector.java:94 - OutOfMemory error letting the JVM handle the error:
ERROR [CompactionExecutor:744] 2020-12-11 03:25:37,765 CassandraDaemon.java:235 - Exception in thread Thread[CompactionExecutor:744,1,main]
org.apache.cassandra.io.FSReadError: java.io.IOException: Map failed
        at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:157)
        at org.apache.cassandra.io.util.MmappedRegions$State.add(MmappedRegions.java:310)
        at org.apache.cassandra.io.util.MmappedRegions$State.access$400(MmappedRegions.java:246)
        at org.apache.cassandra.io.util.MmappedRegions.updateState(MmappedRegions.java:170)
        at org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:73)
...
...
Caused by: java.io.IOException: Map failed
        at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:940)
        at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:153)
        ... 23 common frames omitted
Caused by: java.lang.OutOfMemoryError: Map failed
        at sun.nio.ch.FileChannelImpl.map0(Native Method)
        at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:937)
        ... 24 common frames omitted


*[CompactionExecutor:744] did the following before the crash:*
INFO  [CompactionExecutor:744] 2020-12-11 03:00:29,985 NoSpamLogger.java:91 - Maximum memory usage reached (536870912), cannot allocate chunk of 1048576
WARN  [CompactionExecutor:744] 2020-12-11 03:10:57,437 BigTableWriter.java:211 - Writing large partition XXXX (108.963MiB)....
WARN  [CompactionExecutor:744] 2020-12-11 03:10:57,437 BigTableWriter.java:211 - Writing large partition YYYY (151.155MiB)
WARN  [CompactionExecutor:744] 2020-12-11 03:11:16,445 BigTableWriter.java:211 - Writing large partition ZZZZ (253.149MiB)
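
For anyone decoding the NoSpamLogger line: the two figures are raw byte counts. A minimal Python sketch of the conversion follows. The helper name `to_mib` is made up for illustration, and the numbers are copied from the log line.

```python
def to_mib(num_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 * 1024)


# Values copied from the "Maximum memory usage reached" log line.
limit_bytes = 536870912
chunk_bytes = 1048576

print(f"limit: {to_mib(limit_bytes):.0f} MiB")  # limit: 512 MiB
print(f"chunk: {to_mib(chunk_bytes):.0f} MiB")  # chunk: 1 MiB
```

So whichever pool logged this was capped at 512 MiB and could not hand out another 1 MiB chunk. In 3.11 this message most likely comes from the buffer pool sized by file_cache_size_in_mb, but treat that mapping as an assumption and check it against your version.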


*Some more info:*
The *max_map_count* is set to 1048575, so all is well there.
Hugepages are enabled by default (I know I should disable them), but I
don't think it can cause this behaviour.
This never happened on v3.11.4, only on v3.11.8.
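
One way to verify the max_map_count point is to compare the limit with the number of mappings the Cassandra process actually holds. The sketch below assumes Linux procfs (`/proc/sys/vm/max_map_count` and `/proc/<pid>/maps`). The function names are hypothetical, and the pid has to be supplied by whoever runs it.

```python
from pathlib import Path


def max_map_count() -> int:
    """Read the per-process mapping limit (the vm.max_map_count sysctl)."""
    return int(Path("/proc/sys/vm/max_map_count").read_text().strip())


def count_mappings(maps_text: str) -> int:
    """Count mappings in the text of /proc/<pid>/maps (one mapping per line)."""
    return sum(1 for line in maps_text.splitlines() if line.strip())


def mappings_for_pid(pid: int) -> int:
    """Count the current memory mappings of a running process (Linux only)."""
    return count_mappings(Path(f"/proc/{pid}/maps").read_text())
```

For example, `mappings_for_pid(cassandra_pid) / max_map_count()` gives the fraction of the limit in use, where `cassandra_pid` is a placeholder for the JVM's process id. A value close to 1.0 shortly before a "Map failed" error would point at the mapping limit rather than at heap or buffer sizing.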


I'd really appreciate your help on this one.
Thanks!

Re: Upgrading to 3.11.8 Caused Map Failures

Posted by Shalom Sagges <sh...@gmail.com>.
You are right, Yakir.
How did I miss that? It was a misconfiguration on my end.

Thanks a lot!

On Sat, Dec 12, 2020 at 9:28 PM Yakir Gibraltar <ya...@gmail.com> wrote:

> See also:
> https://support.datastax.com/hc/en-us/articles/360027838911
>
>
> On Sat, Dec 12, 2020 at 9:11 PM Yakir Gibraltar <ya...@gmail.com> wrote:
>
>> Hi Shalom,
>> See bug: https://issues.apache.org/jira/browse/CASSANDRA-14978
>> Try to disable mmap:
>> disk_access_mode=standard
>> or
>> disk_access_mode=mmap_index_only
>> Yakir Gibraltar.
>>
>
>
> --
> *Best regards,*
> *Yakir Gibraltar*
>

Re: Upgrading to 3.11.8 Caused Map Failures

Posted by Yakir Gibraltar <ya...@gmail.com>.
See also:
https://support.datastax.com/hc/en-us/articles/360027838911


On Sat, Dec 12, 2020 at 9:11 PM Yakir Gibraltar <ya...@gmail.com> wrote:

> Hi Shalom,
> See bug: https://issues.apache.org/jira/browse/CASSANDRA-14978
> Try to disable mmap:
> disk_access_mode=standard
> or
> disk_access_mode=mmap_index_only
> Yakir Gibraltar.
>


-- 
*Best regards,*
*Yakir Gibraltar*

Re: Upgrading to 3.11.8 Caused Map Failures

Posted by Yakir Gibraltar <ya...@gmail.com>.
Hi Shalom,
See bug: https://issues.apache.org/jira/browse/CASSANDRA-14978
Try to disable mmap:
disk_access_mode=standard
or
disk_access_mode=mmap_index_only
Yakir Gibraltar.
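
Note that in cassandra.yaml the setting is written in YAML form (`disk_access_mode: mmap_index_only`), not with `=`. To see what a node is currently configured with, a rough Python sketch like the one below can scan the file text. The function name and the sample text are invented for illustration, and the list of modes is my understanding of the values 3.x accepts, so treat it as an assumption.

```python
import re
from typing import Optional

# Modes believed to be accepted by Cassandra 3.x (assumption, verify for your version).
KNOWN_MODES = {"auto", "mmap", "mmap_index_only", "standard"}


def find_disk_access_mode(yaml_text: str) -> Optional[str]:
    """Return the uncommented top-level disk_access_mode value, or None if unset."""
    for line in yaml_text.splitlines():
        # Only match lines that start with the key, so "# disk_access_mode: ..." is skipped.
        match = re.match(r"^disk_access_mode\s*:\s*([A-Za-z_]+)", line)
        if match:
            return match.group(1)
    return None


# Hypothetical excerpt of a cassandra.yaml file.
sample = """\
# disk_access_mode: auto
disk_access_mode: mmap_index_only
"""

mode = find_disk_access_mode(sample)
print(mode, mode in KNOWN_MODES)
```

If the function returns None, the key is not set in the file and the node runs with whatever default its version applies.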

Re: Upgrading to 3.11.8 Caused Map Failures

Posted by Shalom Sagges <sh...@gmail.com>.
Forgot to mention that there were also LEAK DETECTED errors:

ERROR [Reference-Reaper] 2020-12-11 03:25:42,172 Ref.java:229 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@451030de) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1272432140:Memory@[7f6237800000..7f623aa00000) was not released before the reference was garbage collected
ERROR [Reference-Reaper] 2020-12-11 03:25:42,172 Ref.java:229 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@4fe85bae) to class org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$Tidy@183159863:[Memory@[0..f060), Memory@[0..10e6c0)] was not released before the reference was garbage collected
ERROR [Reference-Reaper] 2020-12-11 03:25:42,173 Ref.java:229 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@4eb88b74) to class org.apache.cassandra.io.util.MmappedRegions$Tidier@992658185:/data_path/md-1105027-big-Data.db was not released before the reference was garbage collected
ERROR [Reference-Reaper] 2020-12-11 03:25:42,176 Ref.java:229 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@3692dae9) to class org.apache.cassandra.io.util.FileHandle$Cleanup@1791308664:/data_path/md-1105027-big-Index.db was not released before the reference was garbage collected


