You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@qpid.apache.org by Pavel Moravec <pm...@redhat.com> on 2015/03/02 10:56:07 UTC

Review Request 31619: [C++ broker] segfault when processing QMF method during periodic processing

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31619/
-----------------------------------------------------------

Review request for qpid, Andrew Stitcher, Gordon Sim, and Kenneth Giusti.


Bugs: QPID-6397
    https://issues.apache.org/jira/browse/QPID-6397


Repository: qpid


Description
-------

The race condition leading to the coredump with backtraces in the JIRA when:

- traces are enabled (at least for qpid::management::ManagementAgent::debugSnapshot scope)
- periodic processing has just started, is in debugSnapshot, and iterates either over managementObjects (dumpMap) or over newManagementObjects (dumpVector)
- a QMF request is being processed via handleMethodRequest and moveNewObjects moves newly seen objects from newManagementObjects to managementObjects

Here the problem is, access to neither managementObjects or newManagementObjects is guarded by locks within debugSnapshot, causing the iteration over the map / vector fails when updating it concurrently.

I spotted segfaults in either dumpMap or in dumpVector, both needs to be fixed.

Proposed is a trivial patch in adding both locks. An option (in case the locking is time consuming) is conditional locks:

    Index: cpp/src/qpid/management/ManagementAgent.cpp
    ===================================================================
    --- cpp/src/qpid/management/ManagementAgent.cpp	(revision 1660046)
    +++ cpp/src/qpid/management/ManagementAgent.cpp	(working copy)
    @@ -2710,11 +2710,17 @@
                  << summarizeVector("new objects ", newManagementObjects)
                  << pendingDeletedObjs.size() << " pending deletes"
                  << summarizeAgents());
    -
    -    QPID_LOG_IF(trace, managementObjects.size(),
    -                title << ": objects" << dumpMap(managementObjects));
    -    QPID_LOG_IF(trace, newManagementObjects.size(),
    -                title << ": new objects" << dumpVector(newManagementObjects));
    +    bool print_traces;
    +    QPID_LOG_TEST(trace, print_traces);
    +    if (print_traces)
    +    {
    +        sys::Mutex::ScopedLock objLock (objectLock);
    +        sys::Mutex::ScopedLock lock(addLock);
    +        QPID_LOG_IF(trace, managementObjects.size(),
    +                    title << ": objects" << dumpMap(managementObjects));
    +        QPID_LOG_IF(trace, newManagementObjects.size(),
    +                    title << ": new objects" << dumpVector(newManagementObjects));
    +    }
     }

This might work well only if std::count_if on either map or vector (called in summarizeVector and summarizeMap) is thread-safe operation.


Diffs
-----

  trunk/qpid/cpp/src/qpid/management/ManagementAgent.cpp 1660046 

Diff: https://reviews.apache.org/r/31619/diff/


Testing
-------

The first patch successfully tested against the reproducer for 3 days, the 2nd proposed hasnt been tested.


Thanks,

Pavel Moravec


Re: Review Request 31619: [C++ broker] segfault when processing QMF method during periodic processing

Posted by Pavel Moravec <pm...@redhat.com>.

> On March 2, 2015, 4:40 p.m., Kenneth Giusti wrote:
> > trunk/qpid/cpp/src/qpid/management/ManagementAgent.cpp, line 2706
> > <https://reviews.apache.org/r/31619/diff/1/?file=882045#file882045line2706>
> >
> >     I think you have to include the userLock in this as well (stuff like packages and remoteAgents are being touched).
> >     
> >     There's a specific locking order that needs to be maintained when taking multiple locks:
> >     
> >     1) first take userLock,
> >     2) then take addLock
> >     3) finally take objectLock
> >     
> >     (see ManagementAgent.h).
> >     
> >     Taking objectLock first, then addlock (as done in this patch) can cause a deadlock.

Good point with userLock due to remoteAgents.

To publish updated version of the patch proposal soon.


- Pavel


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31619/#review74743
-----------------------------------------------------------


On March 2, 2015, 9:56 a.m., Pavel Moravec wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31619/
> -----------------------------------------------------------
> 
> (Updated March 2, 2015, 9:56 a.m.)
> 
> 
> Review request for qpid, Andrew Stitcher, Gordon Sim, and Kenneth Giusti.
> 
> 
> Bugs: QPID-6397
>     https://issues.apache.org/jira/browse/QPID-6397
> 
> 
> Repository: qpid
> 
> 
> Description
> -------
> 
> The race condition leading to the coredump with backtraces in the JIRA when:
> 
> - traces are enabled (at least for qpid::management::ManagementAgent::debugSnapshot scope)
> - periodic processing has just started, is in debugSnapshot, and iterates either over managementObjects (dumpMap) or over newManagementObjects (dumpVector)
> - a QMF request is being processed via handleMethodRequest and moveNewObjects moves newly seen objects from newManagementObjects to managementObjects
> 
> Here the problem is, access to neither managementObjects or newManagementObjects is guarded by locks within debugSnapshot, causing the iteration over the map / vector fails when updating it concurrently.
> 
> I spotted segfaults in either dumpMap or in dumpVector, both needs to be fixed.
> 
> Proposed is a trivial patch in adding both locks. An option (in case the locking is time consuming) is conditional locks:
> 
>     Index: cpp/src/qpid/management/ManagementAgent.cpp
>     ===================================================================
>     --- cpp/src/qpid/management/ManagementAgent.cpp	(revision 1660046)
>     +++ cpp/src/qpid/management/ManagementAgent.cpp	(working copy)
>     @@ -2710,11 +2710,17 @@
>                   << summarizeVector("new objects ", newManagementObjects)
>                   << pendingDeletedObjs.size() << " pending deletes"
>                   << summarizeAgents());
>     -
>     -    QPID_LOG_IF(trace, managementObjects.size(),
>     -                title << ": objects" << dumpMap(managementObjects));
>     -    QPID_LOG_IF(trace, newManagementObjects.size(),
>     -                title << ": new objects" << dumpVector(newManagementObjects));
>     +    bool print_traces;
>     +    QPID_LOG_TEST(trace, print_traces);
>     +    if (print_traces)
>     +    {
>     +        sys::Mutex::ScopedLock objLock (objectLock);
>     +        sys::Mutex::ScopedLock lock(addLock);
>     +        QPID_LOG_IF(trace, managementObjects.size(),
>     +                    title << ": objects" << dumpMap(managementObjects));
>     +        QPID_LOG_IF(trace, newManagementObjects.size(),
>     +                    title << ": new objects" << dumpVector(newManagementObjects));
>     +    }
>      }
> 
> This might work well only if std::count_if on either map or vector (called in summarizeVector and summarizeMap) is thread-safe operation.
> 
> 
> Diffs
> -----
> 
>   trunk/qpid/cpp/src/qpid/management/ManagementAgent.cpp 1660046 
> 
> Diff: https://reviews.apache.org/r/31619/diff/
> 
> 
> Testing
> -------
> 
> The first patch successfully tested against the reproducer for 3 days, the 2nd proposed hasnt been tested.
> 
> 
> Thanks,
> 
> Pavel Moravec
> 
>


Re: Review Request 31619: [C++ broker] segfault when processing QMF method during periodic processing

Posted by Kenneth Giusti <kg...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31619/#review74743
-----------------------------------------------------------



trunk/qpid/cpp/src/qpid/management/ManagementAgent.cpp
<https://reviews.apache.org/r/31619/#comment121438>

    I think you have to include the userLock in this as well (stuff like packages and remoteAgents are being touched).
    
    There's a specific locking order that needs to be maintained when taking multiple locks:
    
    1) first take userLock,
    2) then take addLock
    3) finally take objectLock
    
    (see ManagementAgent.h).
    
    Taking objectLock first, then addlock (as done in this patch) can cause a deadlock.


- Kenneth Giusti


On March 2, 2015, 9:56 a.m., Pavel Moravec wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31619/
> -----------------------------------------------------------
> 
> (Updated March 2, 2015, 9:56 a.m.)
> 
> 
> Review request for qpid, Andrew Stitcher, Gordon Sim, and Kenneth Giusti.
> 
> 
> Bugs: QPID-6397
>     https://issues.apache.org/jira/browse/QPID-6397
> 
> 
> Repository: qpid
> 
> 
> Description
> -------
> 
> The race condition leading to the coredump with backtraces in the JIRA when:
> 
> - traces are enabled (at least for qpid::management::ManagementAgent::debugSnapshot scope)
> - periodic processing has just started, is in debugSnapshot, and iterates either over managementObjects (dumpMap) or over newManagementObjects (dumpVector)
> - a QMF request is being processed via handleMethodRequest and moveNewObjects moves newly seen objects from newManagementObjects to managementObjects
> 
> Here the problem is, access to neither managementObjects or newManagementObjects is guarded by locks within debugSnapshot, causing the iteration over the map / vector fails when updating it concurrently.
> 
> I spotted segfaults in either dumpMap or in dumpVector, both needs to be fixed.
> 
> Proposed is a trivial patch in adding both locks. An option (in case the locking is time consuming) is conditional locks:
> 
>     Index: cpp/src/qpid/management/ManagementAgent.cpp
>     ===================================================================
>     --- cpp/src/qpid/management/ManagementAgent.cpp	(revision 1660046)
>     +++ cpp/src/qpid/management/ManagementAgent.cpp	(working copy)
>     @@ -2710,11 +2710,17 @@
>                   << summarizeVector("new objects ", newManagementObjects)
>                   << pendingDeletedObjs.size() << " pending deletes"
>                   << summarizeAgents());
>     -
>     -    QPID_LOG_IF(trace, managementObjects.size(),
>     -                title << ": objects" << dumpMap(managementObjects));
>     -    QPID_LOG_IF(trace, newManagementObjects.size(),
>     -                title << ": new objects" << dumpVector(newManagementObjects));
>     +    bool print_traces;
>     +    QPID_LOG_TEST(trace, print_traces);
>     +    if (print_traces)
>     +    {
>     +        sys::Mutex::ScopedLock objLock (objectLock);
>     +        sys::Mutex::ScopedLock lock(addLock);
>     +        QPID_LOG_IF(trace, managementObjects.size(),
>     +                    title << ": objects" << dumpMap(managementObjects));
>     +        QPID_LOG_IF(trace, newManagementObjects.size(),
>     +                    title << ": new objects" << dumpVector(newManagementObjects));
>     +    }
>      }
> 
> This might work well only if std::count_if on either map or vector (called in summarizeVector and summarizeMap) is thread-safe operation.
> 
> 
> Diffs
> -----
> 
>   trunk/qpid/cpp/src/qpid/management/ManagementAgent.cpp 1660046 
> 
> Diff: https://reviews.apache.org/r/31619/diff/
> 
> 
> Testing
> -------
> 
> The first patch successfully tested against the reproducer for 3 days, the 2nd proposed hasnt been tested.
> 
> 
> Thanks,
> 
> Pavel Moravec
> 
>


Re: Review Request 31619: [C++ broker] segfault when processing QMF method during periodic processing

Posted by Kenneth Giusti <kg...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31619/#review74944
-----------------------------------------------------------

Ship it!


Ship It!

- Kenneth Giusti


On March 3, 2015, 12:38 p.m., Pavel Moravec wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31619/
> -----------------------------------------------------------
> 
> (Updated March 3, 2015, 12:38 p.m.)
> 
> 
> Review request for qpid, Andrew Stitcher, Gordon Sim, and Kenneth Giusti.
> 
> 
> Bugs: QPID-6397
>     https://issues.apache.org/jira/browse/QPID-6397
> 
> 
> Repository: qpid
> 
> 
> Description
> -------
> 
> The race condition leading to the coredump with backtraces in the JIRA when:
> 
> - traces are enabled (at least for qpid::management::ManagementAgent::debugSnapshot scope)
> - periodic processing has just started, is in debugSnapshot, and iterates either over managementObjects (dumpMap) or over newManagementObjects (dumpVector)
> - a QMF request is being processed via handleMethodRequest and moveNewObjects moves newly seen objects from newManagementObjects to managementObjects
> 
> Here the problem is, access to neither managementObjects or newManagementObjects is guarded by locks within debugSnapshot, causing the iteration over the map / vector fails when updating it concurrently.
> 
> I spotted segfaults in either dumpMap or in dumpVector, both needs to be fixed.
> 
> Proposed is a trivial patch in adding both locks. An option (in case the locking is time consuming) is conditional locks:
> 
>     Index: cpp/src/qpid/management/ManagementAgent.cpp
>     ===================================================================
>     --- cpp/src/qpid/management/ManagementAgent.cpp	(revision 1660046)
>     +++ cpp/src/qpid/management/ManagementAgent.cpp	(working copy)
>     @@ -2710,11 +2710,17 @@
>                   << summarizeVector("new objects ", newManagementObjects)
>                   << pendingDeletedObjs.size() << " pending deletes"
>                   << summarizeAgents());
>     -
>     -    QPID_LOG_IF(trace, managementObjects.size(),
>     -                title << ": objects" << dumpMap(managementObjects));
>     -    QPID_LOG_IF(trace, newManagementObjects.size(),
>     -                title << ": new objects" << dumpVector(newManagementObjects));
>     +    bool print_traces;
>     +    QPID_LOG_TEST(trace, print_traces);
>     +    if (print_traces)
>     +    {
>     +        sys::Mutex::ScopedLock objLock (objectLock);
>     +        sys::Mutex::ScopedLock lock(addLock);
>     +        QPID_LOG_IF(trace, managementObjects.size(),
>     +                    title << ": objects" << dumpMap(managementObjects));
>     +        QPID_LOG_IF(trace, newManagementObjects.size(),
>     +                    title << ": new objects" << dumpVector(newManagementObjects));
>     +    }
>      }
> 
> This might work well only if std::count_if on either map or vector (called in summarizeVector and summarizeMap) is thread-safe operation.
> 
> 
> Diffs
> -----
> 
>   trunk/qpid/cpp/src/qpid/management/ManagementAgent.cpp 1660046 
> 
> Diff: https://reviews.apache.org/r/31619/diff/
> 
> 
> Testing
> -------
> 
> The first patch successfully tested against the reproducer for 3 days, the 2nd proposed hasnt been tested.
> 
> 
> Thanks,
> 
> Pavel Moravec
> 
>


Re: Review Request 31619: [C++ broker] segfault when processing QMF method during periodic processing

Posted by Pavel Moravec <pm...@redhat.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31619/
-----------------------------------------------------------

(Updated March 3, 2015, 12:38 p.m.)


Review request for qpid, Andrew Stitcher, Gordon Sim, and Kenneth Giusti.


Changes
-------

Put the addLock and objectLock to proper ordering.

Acquired userLock in periodicProcessing before calling debugSnapshot.


Bugs: QPID-6397
    https://issues.apache.org/jira/browse/QPID-6397


Repository: qpid


Description
-------

The race condition leading to the coredump with backtraces in the JIRA when:

- traces are enabled (at least for qpid::management::ManagementAgent::debugSnapshot scope)
- periodic processing has just started, is in debugSnapshot, and iterates either over managementObjects (dumpMap) or over newManagementObjects (dumpVector)
- a QMF request is being processed via handleMethodRequest and moveNewObjects moves newly seen objects from newManagementObjects to managementObjects

Here the problem is, access to neither managementObjects or newManagementObjects is guarded by locks within debugSnapshot, causing the iteration over the map / vector fails when updating it concurrently.

I spotted segfaults in either dumpMap or in dumpVector, both needs to be fixed.

Proposed is a trivial patch in adding both locks. An option (in case the locking is time consuming) is conditional locks:

    Index: cpp/src/qpid/management/ManagementAgent.cpp
    ===================================================================
    --- cpp/src/qpid/management/ManagementAgent.cpp	(revision 1660046)
    +++ cpp/src/qpid/management/ManagementAgent.cpp	(working copy)
    @@ -2710,11 +2710,17 @@
                  << summarizeVector("new objects ", newManagementObjects)
                  << pendingDeletedObjs.size() << " pending deletes"
                  << summarizeAgents());
    -
    -    QPID_LOG_IF(trace, managementObjects.size(),
    -                title << ": objects" << dumpMap(managementObjects));
    -    QPID_LOG_IF(trace, newManagementObjects.size(),
    -                title << ": new objects" << dumpVector(newManagementObjects));
    +    bool print_traces;
    +    QPID_LOG_TEST(trace, print_traces);
    +    if (print_traces)
    +    {
    +        sys::Mutex::ScopedLock objLock (objectLock);
    +        sys::Mutex::ScopedLock lock(addLock);
    +        QPID_LOG_IF(trace, managementObjects.size(),
    +                    title << ": objects" << dumpMap(managementObjects));
    +        QPID_LOG_IF(trace, newManagementObjects.size(),
    +                    title << ": new objects" << dumpVector(newManagementObjects));
    +    }
     }

This might work well only if std::count_if on either map or vector (called in summarizeVector and summarizeMap) is thread-safe operation.


Diffs (updated)
-----

  trunk/qpid/cpp/src/qpid/management/ManagementAgent.cpp 1660046 

Diff: https://reviews.apache.org/r/31619/diff/


Testing
-------

The first patch successfully tested against the reproducer for 3 days, the 2nd proposed hasnt been tested.


Thanks,

Pavel Moravec