Posted to issues@ignite.apache.org by "Dmitriy Govorukhin (JIRA)" <ji...@apache.org> on 2018/07/22 09:50:00 UTC

[jira] [Assigned] (IGNITE-9049) Missed SWITCH_SEGMENT_RECORD at the end of WAL file but space enough

     [ https://issues.apache.org/jira/browse/IGNITE-9049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy Govorukhin reassigned IGNITE-9049:
------------------------------------------

    Assignee: Dmitriy Govorukhin

> Missed SWITCH_SEGMENT_RECORD at the end of WAL file but space enough 
> ---------------------------------------------------------------------
>
>                 Key: IGNITE-9049
>                 URL: https://issues.apache.org/jira/browse/IGNITE-9049
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Dmitriy Govorukhin
>            Assignee: Dmitriy Govorukhin
>            Priority: Major
>             Fix For: 2.7
>
>
> Several threads may call addRecord just as the free space in the segment runs out (a rollOver to the next WAL segment is needed), and none of them writes SWITCH_SEGMENT_RECORD. As a result, the end of the file contains garbage. When we iterate over this segment, the iterator stops as soon as it tries to read the next record and stumbles on the garbage at the end of the file, so the log is not read completely. Any operation that requires the iterator may break (crash recovery, delta rebalance, etc.).
> Example:
> File size: 1024 bytes
> Current tail position: 768 (free space: 256)
> 1. Thread-1 calls addRecord (size 128) -> tail is updated to 896.
> 2. Thread-2 calls addRecord (size 128) -> tail is updated to 1024 (no free space left).
> Neither thread has written any data yet; each has only reserved a position for its write (SegmentedRingByteBuffer.offer).
> 3. Thread-3 calls addRecord (size 128) -> not enough space -> rollOver and CAS of the stop flag to TRUE.
> 4. Thread-1 and Thread-2 try to write their data and cannot.
> FileWriteHandle.addRecord
> {code}
> if (buf == null || (stop.get() && rec.type() != SWITCH_SEGMENT_RECORD))
>     return null; // Can not write to this segment, need to switch to the next one.
> {code}
> Thread-3 cannot write SWITCH_SEGMENT_RECORD because there is not enough space.
> Thread-1 and Thread-2 cannot write their data because the stop flag is TRUE.
> We are left with garbage from position 768 to 1024.
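
The sequence in the example can be replayed with a small, single-threaded Java sketch. It is only an illustrative model under stated assumptions: the class WalSegmentSketch, its methods reserve and write, and the 1-byte size used for the switch record are hypothetical and are not Ignite's actual code. Only the refusal check in write mirrors the FileWriteHandle.addRecord fragment quoted above.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical, simplified model of one WAL segment (not Ignite's real classes). */
class WalSegmentSketch {
    static final int FILE_SIZE = 1024;
    // Assumed size of the switch record; the real size is not stated in the issue.
    static final int SWITCH_RECORD_SIZE = 1;

    final AtomicInteger tail;
    final AtomicBoolean stop = new AtomicBoolean(false);
    int writtenBytes = 0;

    WalSegmentSketch(int initialTail) {
        tail = new AtomicInteger(initialTail);
    }

    /** Reserves space by advancing the tail; returns the offset, or -1 if there is no room. */
    int reserve(int size) {
        while (true) {
            int cur = tail.get();
            if (cur + size > FILE_SIZE)
                return -1;
            if (tail.compareAndSet(cur, cur + size))
                return cur;
        }
    }

    /** Refuses the write when nothing was reserved, or when stop is set and this is not a switch record. */
    boolean write(int offset, int size, boolean isSwitchRecord) {
        if (offset < 0 || (stop.get() && !isSwitchRecord))
            return false;
        writtenBytes += size;
        return true;
    }

    /** Replays steps 1-4 of the example and returns the number of reserved-but-unwritten bytes. */
    static int garbageBytes() {
        int initialTail = 768;
        WalSegmentSketch seg = new WalSegmentSketch(initialTail);

        int off1 = seg.reserve(128);             // Thread-1: tail 768 -> 896
        int off2 = seg.reserve(128);             // Thread-2: tail 896 -> 1024
        int off3 = seg.reserve(128);             // Thread-3: no space, returns -1
        if (off3 < 0)
            seg.stop.compareAndSet(false, true); // Thread-3 starts rollOver

        seg.write(off1, 128, false);             // refused: stop is TRUE
        seg.write(off2, 128, false);             // refused: stop is TRUE

        // Thread-3 cannot place the switch record either: the tail is already at the end.
        int switchOff = seg.reserve(SWITCH_RECORD_SIZE);
        seg.write(switchOff, SWITCH_RECORD_SIZE, true);

        return seg.tail.get() - initialTail - seg.writtenBytes;
    }

    public static void main(String[] args) {
        System.out.println("Reserved but unwritten bytes: " + garbageBytes());
    }
}
```

In this model the tail reaches 1024 while nothing is written after position 768, which matches the 256 bytes of garbage described in the issue.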



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)