You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@jena.apache.org by "Stephen Allen (JIRA)" <ji...@apache.org> on 2011/08/15 23:57:27 UTC

[jira] [Created] (JENA-100) QueryIteratorBase concurrency issues

QueryIteratorBase concurrency issues
------------------------------------

                 Key: JENA-100
                 URL: https://issues.apache.org/jira/browse/JENA-100
             Project: Jena
          Issue Type: Bug
          Components: ARQ
            Reporter: Stephen Allen
         Attachments: JENA-100-r1157891.patch

QueryIteratorBase appears to have some concurrency bugs relating to cancelling a query:

1) The cancel() and cancelAllowContinue() methods did not have large enough synchronized blocks (they also used "this" as the lock, which is not recommended)
2) The order of setting the cancellation flags and the notifying subclasses via the requestCancel() method was incorrect
3) The visibility and happens-before relationships were incorrect for the requestingCancel and abortIterator variables

The cancelAllowContinue() feature adds a lot of complexity in terms of visibility and ordering.  Unfortunately it is hard to write test cases for these types of concurrency issues, so the existing tests did not uncover the issues.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JENA-100) QueryIteratorBase concurrency issues

Posted by "Simon Helsen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085376#comment-13085376 ] 

Simon Helsen commented on JENA-100:
-----------------------------------

thanks Stephen. In https://issues.apache.org/jira/browse/JENA-93 I noticed that cancelAllowContinue sometimes throws an exception which it shouldn't. There may be a relation to what you found here. Once your patch makes it into a build, I'll test if it resolves the problem I observed.

re: complexity. Yes, I know because in the patch I original submitted for this feature, I had to introduce a multi-staged cancellation mechanism to gracefully allow the drain in a multi-threaded environment. It went something like this: when a cancel() is invoked, notify the iterator stack the cancel has been requested, but do not cancel just yet (!) Instead, all iterators have to be able to continue until the first execution of next() finishes. It is not sufficient to have hasNext() return false immediately because in a multi-threaded world, you may end up in a situation where a call to hasNext succeeded, but next() was not yet executed (and it usually calls hasNext under the hood again). So, after the first next() was retrieved, the cancel becomes "active" (stage 2) and hasNext() will start returning false.

This 2-staged cancellation was applied to ARQ in our released product and works very reliably. I hope cancelAllowContinue() can be implemented in a similar manner. It would hinder us from upgrading to a later ARQ/TDB

> QueryIteratorBase concurrency issues
> ------------------------------------
>
>                 Key: JENA-100
>                 URL: https://issues.apache.org/jira/browse/JENA-100
>             Project: Jena
>          Issue Type: Bug
>          Components: ARQ
>            Reporter: Stephen Allen
>         Attachments: JENA-100-r1157891.patch
>
>
> QueryIteratorBase appears to have some concurrency bugs relating to cancelling a query:
> 1) The cancel() and cancelAllowContinue() methods did not have large enough synchronized blocks (they also used "this" as the lock, which is not recommended)
> 2) The order of setting the cancellation flags and the notifying subclasses via the requestCancel() method was incorrect
> 3) The visibility and happens-before relationships were incorrect for the requestingCancel and abortIterator variables
> The cancelAllowContinue() feature adds a lot of complexity in terms of visibility and ordering.  Unfortunately it is hard to write test cases for these types of concurrency issues, so the existing tests did not uncover the issues.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (JENA-100) QueryIteratorBase concurrency issues

Posted by "Andy Seaborne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Seaborne resolved JENA-100.
--------------------------------

    Resolution: Fixed
      Assignee: Andy Seaborne

> QueryIteratorBase concurrency issues
> ------------------------------------
>
>                 Key: JENA-100
>                 URL: https://issues.apache.org/jira/browse/JENA-100
>             Project: Jena
>          Issue Type: Bug
>          Components: ARQ
>            Reporter: Stephen Allen
>            Assignee: Andy Seaborne
>         Attachments: JENA-100-r1157891.patch
>
>
> QueryIteratorBase appears to have some concurrency bugs relating to cancelling a query:
> 1) The cancel() and cancelAllowContinue() methods did not have large enough synchronized blocks (they also used "this" as the lock, which is not recommended)
> 2) The order of setting the cancellation flags and the notifying subclasses via the requestCancel() method was incorrect
> 3) The visibility and happens-before relationships were incorrect for the requestingCancel and abortIterator variables
> The cancelAllowContinue() feature adds a lot of complexity in terms of visibility and ordering.  Unfortunately it is hard to write test cases for these types of concurrency issues, so the existing tests did not uncover the issues.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Closed] (JENA-100) QueryIteratorBase concurrency issues

Posted by "Andy Seaborne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Seaborne closed JENA-100.
------------------------------


> QueryIteratorBase concurrency issues
> ------------------------------------
>
>                 Key: JENA-100
>                 URL: https://issues.apache.org/jira/browse/JENA-100
>             Project: Jena
>          Issue Type: Bug
>          Components: ARQ
>            Reporter: Stephen Allen
>            Assignee: Andy Seaborne
>         Attachments: JENA-100-r1157891.patch
>
>
> QueryIteratorBase appears to have some concurrency bugs relating to cancelling a query:
> 1) The cancel() and cancelAllowContinue() methods did not have large enough synchronized blocks (they also used "this" as the lock, which is not recommended)
> 2) The order of setting the cancellation flags and the notifying subclasses via the requestCancel() method was incorrect
> 3) The visibility and happens-before relationships were incorrect for the requestingCancel and abortIterator variables
> The cancelAllowContinue() feature adds a lot of complexity in terms of visibility and ordering.  Unfortunately it is hard to write test cases for these types of concurrency issues, so the existing tests did not uncover the issues.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (JENA-100) QueryIteratorBase concurrency issues

Posted by "Stephen Allen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Allen updated JENA-100:
-------------------------------

    Attachment: JENA-100-r1157891.patch

> QueryIteratorBase concurrency issues
> ------------------------------------
>
>                 Key: JENA-100
>                 URL: https://issues.apache.org/jira/browse/JENA-100
>             Project: Jena
>          Issue Type: Bug
>          Components: ARQ
>            Reporter: Stephen Allen
>         Attachments: JENA-100-r1157891.patch
>
>
> QueryIteratorBase appears to have some concurrency bugs relating to cancelling a query:
> 1) The cancel() and cancelAllowContinue() methods did not have large enough synchronized blocks (they also used "this" as the lock, which is not recommended)
> 2) The order of setting the cancellation flags and the notifying subclasses via the requestCancel() method was incorrect
> 3) The visibility and happens-before relationships were incorrect for the requestingCancel and abortIterator variables
> The cancelAllowContinue() feature adds a lot of complexity in terms of visibility and ordering.  Unfortunately it is hard to write test cases for these types of concurrency issues, so the existing tests did not uncover the issues.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JENA-100) QueryIteratorBase concurrency issues

Posted by "Stephen Allen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085373#comment-13085373 ] 

Stephen Allen commented on JENA-100:
------------------------------------

One of the potential issues caused by 3) is in the nextBinding() method on line 127:

It could be that a regular cancel was requested as we were midway through the method (after checking the abortIterator, but before checking requestCancel). This would cause the iterator to close and appear to terminate normally (but without all of the results) instead of throwing a QueryCancelledException.

> QueryIteratorBase concurrency issues
> ------------------------------------
>
>                 Key: JENA-100
>                 URL: https://issues.apache.org/jira/browse/JENA-100
>             Project: Jena
>          Issue Type: Bug
>          Components: ARQ
>            Reporter: Stephen Allen
>         Attachments: JENA-100-r1157891.patch
>
>
> QueryIteratorBase appears to have some concurrency bugs relating to cancelling a query:
> 1) The cancel() and cancelAllowContinue() methods did not have large enough synchronized blocks (they also used "this" as the lock, which is not recommended)
> 2) The order of setting the cancellation flags and the notifying subclasses via the requestCancel() method was incorrect
> 3) The visibility and happens-before relationships were incorrect for the requestingCancel and abortIterator variables
> The cancelAllowContinue() feature adds a lot of complexity in terms of visibility and ordering.  Unfortunately it is hard to write test cases for these types of concurrency issues, so the existing tests did not uncover the issues.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JENA-100) QueryIteratorBase concurrency issues

Posted by "Andy Seaborne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085618#comment-13085618 ] 

Andy Seaborne commented on JENA-100:
------------------------------------

Patch applied and a build done, thanks 

Simon - please provide test cases for your use case.


> QueryIteratorBase concurrency issues
> ------------------------------------
>
>                 Key: JENA-100
>                 URL: https://issues.apache.org/jira/browse/JENA-100
>             Project: Jena
>          Issue Type: Bug
>          Components: ARQ
>            Reporter: Stephen Allen
>         Attachments: JENA-100-r1157891.patch
>
>
> QueryIteratorBase appears to have some concurrency bugs relating to cancelling a query:
> 1) The cancel() and cancelAllowContinue() methods did not have large enough synchronized blocks (they also used "this" as the lock, which is not recommended)
> 2) The order of setting the cancellation flags and the notifying subclasses via the requestCancel() method was incorrect
> 3) The visibility and happens-before relationships were incorrect for the requestingCancel and abortIterator variables
> The cancelAllowContinue() feature adds a lot of complexity in terms of visibility and ordering.  Unfortunately it is hard to write test cases for these types of concurrency issues, so the existing tests did not uncover the issues.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira