Posted to issues@geode.apache.org by "Mario Salazar de Torres (Jira)" <ji...@apache.org> on 2020/09/24 23:54:00 UTC
[jira] [Comment Edited] (GEODE-8535) Coredump while putting an entry to a LocalRegion

    [ https://issues.apache.org/jira/browse/GEODE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201785#comment-17201785 ] 

Mario Salazar de Torres edited comment on GEODE-8535 at 9/24/20, 11:53 PM:
---------------------------------------------------------------------------

My hypothesis is that this problem is caused by a time precision mismatch. The supporting evidence is in the coredump.log file and is as follows:
 * The entry causing the crash is the one with key *entry-505993*, as can be seen in notifications-no-massif.log:34119
 * Previous mentions of this key are in notifications-no-massif.log:24867-24872:

{code:java}
[debug 2020/09/24 21:47:40.779570 CEST DESKTOP-3SQUK3P:746832 140626765563648] Entered entry expiry task handler for tombstone of key [entry-505993]: 513315836275097ns,513315826409197ns,10ms,-134100ns
[debug 2020/09/24 21:47:40.779623 CEST DESKTOP-3SQUK3P:746832 140626765563648] Resetting expiry task 134100ns later for key [entry-505993]
[debug 2020/09/24 21:47:40.779661 CEST DESKTOP-3SQUK3P:746832 140626765563648] Entered entry expiry task handler for tombstone of key [entry-505993]: 513315836390697ns,513315826409197ns,10ms,-18500ns
[debug 2020/09/24 21:47:40.779667 CEST DESKTOP-3SQUK3P:746832 140626765563648] Resetting expiry task 18500ns later for key [entry-505993]
[debug 2020/09/24 21:47:40.779676 CEST DESKTOP-3SQUK3P:746832 140626765563648] Entered entry expiry task handler for tombstone of key [entry-505993]: 513315836408997ns,513315826409197ns,10ms,-200ns
[debug 2020/09/24 21:47:40.779681 CEST DESKTOP-3SQUK3P:746832 140626765563648] Resetting expiry task 200ns later for key [entry-505993]{code}
 * As can be seen, the expiry task handler is woken up 3 times. After the last wake-up, when only 200ns remain before the expiry task should execute, there is no sign of the task being woken up again.
 * ExpiryTaskManager::resetTask uses an ACE_Time_Value variable, whose finest resolution is microseconds.

*Therefore* my guess is that, because the remaining 200ns is below the microsecond resolution, the delay is presumably truncated to zero when reset is called, so the task is considered done and the handler is destroyed.
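
The arithmetic behind this hypothesis can be sketched as follows. This is only an illustration, not the geode-native code: it assumes that the first two values in each "Entered entry expiry task handler" log line are the current time and the tombstone creation time, and it uses std::chrono's duration_cast as a stand-in for the conversion to a microsecond-resolution ACE_Time_Value. The function names remaining and to_timer_resolution are hypothetical.

```cpp
#include <chrono>

using std::chrono::duration_cast;
using std::chrono::microseconds;
using std::chrono::milliseconds;
using std::chrono::nanoseconds;

// Time left before the tombstone expires, computed from the values of one
// log line (assumed field order: current time, creation time, timeout).
constexpr nanoseconds remaining(nanoseconds now, nanoseconds created,
                                milliseconds timeout) {
  return timeout - (now - created);
}

// Hypothetical stand-in for storing the delay in a microsecond-resolution
// time value: duration_cast truncates toward zero.
constexpr microseconds to_timer_resolution(nanoseconds delay) {
  return duration_cast<microseconds>(delay);
}

// Values copied from the log lines for key entry-505993.
constexpr nanoseconds kCreated{513315826409197LL};
constexpr milliseconds kTimeout{10};
constexpr nanoseconds kWakeup1{513315836275097LL};
constexpr nanoseconds kWakeup2{513315836390697LL};
constexpr nanoseconds kWakeup3{513315836408997LL};

// The remaining times match the "Resetting expiry task ... later" lines.
static_assert(remaining(kWakeup1, kCreated, kTimeout) == nanoseconds{134100}, "");
static_assert(remaining(kWakeup2, kCreated, kTimeout) == nanoseconds{18500}, "");
static_assert(remaining(kWakeup3, kCreated, kTimeout) == nanoseconds{200}, "");

// At microsecond resolution the last reset delay collapses to zero.
static_assert(to_timer_resolution(nanoseconds{134100}) == microseconds{134}, "");
static_assert(to_timer_resolution(nanoseconds{18500}) == microseconds{18}, "");
static_assert(to_timer_resolution(nanoseconds{200}) == microseconds{0}, "");
```

Under these assumptions, the three wake-ups give reset delays of 134µs, 18µs and 0µs, so only the last reset ends up with a zero delay, which is consistent with the handler never being woken up again.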


> Coredump while putting an entry to a LocalRegion
> ------------------------------------------------
>
>                 Key: GEODE-8535
>                 URL: https://issues.apache.org/jira/browse/GEODE-8535
>             Project: Geode
>          Issue Type: Bug
>          Components: native client
>    Affects Versions: 1.13.0
>            Reporter: Mario Salazar de Torres
>            Priority: Major
>         Attachments: coredump.log, notifications-no-massif.log
>
>
> The scenario is the following:
> *GIVEN* concurrency-checks-enabled=true (the default) for the region in which the put operation is happening.
> *GIVEN* tombstone-timeout=10ms
> *WHEN* a heavy load (hundreds per second) of LOCAL_CREATE and LOCAL_DESTROY notifications is received by the client for the same region and consecutive keys, as the example below shows:
> {code:java}
> t_0: LOCAL_CREATE for key entry-1
> t_1: LOCAL_DESTROY for key entry-1
> t_2: LOCAL_CREATE for key entry-2
> t_3: LOCAL_DESTROY for key entry-2
> ·
> ·
> ·
> t_(2*(n-1)): LOCAL_CREATE for key entry-n
> t_(2*n-1): LOCAL_DESTROY for key entry-n{code}
> *THEN* the application crashes in many different places; in the case reported here, it crashes when trying to access the virtual destructor pointer of the ExpiryHandlerTask, which turns out to be nullptr.
>  
> The segmentation fault report is attached as *coredump.log*, and the geode-native debug log is attached as *notifications-no-massif.log*.
>  


