Posted to dev@qpid.apache.org by "Jesse Hulsizer (Jira)" <ji...@apache.org> on 2021/09/13 23:56:03 UTC

[jira] [Created] (PROTON-2432) Proton crashes because of a concurrency failure in collector->pool

Jesse Hulsizer created PROTON-2432:
--------------------------------------

             Summary: Proton crashes because of a concurrency failure in collector->pool
                 Key: PROTON-2432
                 URL: https://issues.apache.org/jira/browse/PROTON-2432
             Project: Qpid Proton
          Issue Type: Bug
          Components: proton-c
    Affects Versions: proton-c-0.32.0
         Environment: RHEL 7 
            Reporter: Jesse Hulsizer


While running our application tests, the application crashes with many different backtraces similar to this one:
{noformat}
#0  0x0000000000000000 in ?? ()
#1  0x00007fc777579198 in pn_class_incref () from /usr/lib64/libqpid-proton.so.11
#2  0x00007fc777587d8a in pn_collector_put () from /usr/lib64/libqpid-proton.so.11
#3  0x00007fc7775887ea in ?? () from /usr/lib64/libqpid-proton.so.11
#4  0x00007fc777588c7b in pn_transport_pending () from /usr/lib64/libqpid-proton.so.11
#5  0x00007fc777588d9e in pn_transport_pop () from /usr/lib64/libqpid-proton.so.11
#6  0x00007fc777599298 in ?? () from /usr/lib64/libqpid-proton.so.11
#7  0x00007fc77759a784 in ?? () from /usr/lib64/libqpid-proton.so.11
#8  0x00007fc7773236f0 in proton::container::impl::thread() () from /usr/lib64/libqpid-proton-cpp.so.12
#9  0x00007fc7760b2470 in ?? () from /usr/lib64/libstdc++.so.6
#10 0x00007fc776309aa1 in start_thread () from /lib64/libpthread.so.0
#11 0x00007fc7758b6bdd in clone () from /lib64/libc.so.6{noformat}
Using gdb to probe one of the backtraces shows that the collector->pool size is -1 (seen here as 18446744073709551615):
{noformat}
(gdb) p *collector
$1 = {pool = 0x7fa7182de180, head = 0x7fa7182de250, tail = 0x7fa7182b8b90, prev = 0x7fa7182ea010, freed = false}

(gdb) p collector->pool
$2 = (pn_list_t *) 0x7fa7182de180
(gdb) p *collector->pool
$3 = {clazz = 0x7fa74eb7c000, capacity = 16, size = 18446744073709551615, elements = 0x7fa7182de1b0}{noformat}
We instrumented the Proton code with print statements, which show that two threads were accessing the collector->pool data structure at the same time:

{noformat} 
 7b070700: pn_list_pop index 0 list->0x7fec401e0b70 value->0x7fec3c728a10
 4ffff700:pn_list_add index 1 size 2list->0x7fec401e0b70 value->0x7fec402095b0
 7b070700: pn_list_pop size 1 list->0x7fec401e0b70
 4ffff700: pn_list_pop size 1 list->0x7fec401e0b70
 7b070700: pn_list_pop index 0 list->0x7fec401e0b70 value->0x7fec3c728a10
 4ffff700: pn_list_pop index 0 list->0x7fec401e0b70 value->0x7fec3c728a10{noformat}

The hex number on the far left is the thread id. As the last four lines show, both threads see a size of 1 and then both pop index 0 from collector->pool simultaneously, returning the same value. Two decrements from a size of 1 produce the size of -1 seen above.


