Posted to dev@giraph.apache.org by "Gustavo Salazar Torres (JIRA)" <ji...@apache.org> on 2013/01/22 14:38:13 UTC

[jira] [Commented] (GIRAPH-462) Multithreading breaks out-of-core graph

    [ https://issues.apache.org/jira/browse/GIRAPH-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559625#comment-13559625 ] 

Gustavo Salazar Torres commented on GIRAPH-462:
-----------------------------------------------

What if we used a publish/subscribe model instead of this pull model? Rather than calling getPartition() directly, workers would send subscribe events to another object, let's call it PartitionCoordinator, and expect a publish event back from it when the partition is available.
Workers would have to block until they receive the publish event.
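
A minimal sketch of the idea, under stated assumptions: PartitionCoordinator is the hypothetical object named above and is not an existing Giraph class, the type parameter P stands in for whatever partition type the store holds, and the subscribe() and publish() method names are illustrative only. Workers subscribe and block on the returned future; a single loader publishes once the partition is in memory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch only; none of these names exist in Giraph.
final class PartitionCoordinator<P> {
  // Futures of workers waiting for each partition id.
  private final Map<Integer, List<CompletableFuture<P>>> subscribers =
      new HashMap<>();

  /** Subscribe event: a worker registers interest and gets a future to block on. */
  synchronized CompletableFuture<P> subscribe(int partitionId) {
    CompletableFuture<P> future = new CompletableFuture<>();
    subscribers.computeIfAbsent(partitionId, k -> new ArrayList<>()).add(future);
    return future;
  }

  /**
   * Publish event: called by the one component that loads partitions, once
   * the partition is in memory. Returns the number of workers notified.
   */
  int publish(int partitionId, P partition) {
    List<CompletableFuture<P>> waiting;
    synchronized (this) {
      waiting = subscribers.remove(partitionId);
    }
    if (waiting == null) {
      return 0; // nobody subscribed to this partition
    }
    // Complete outside the lock so woken workers never contend on it.
    for (CompletableFuture<P> future : waiting) {
      future.complete(partition);
    }
    return waiting.size();
  }
}
```

A compute thread would call coordinator.subscribe(id).join() to block until the publish event arrives. This sketch does not cover a publish that happens before the matching subscribe, nor the decision of which partition to offload; both would need to be handled by whatever drives the loading.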
                
> Multithreading breaks out-of-core graph
> ---------------------------------------
>
>                 Key: GIRAPH-462
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-462
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Alessandro Presta
>            Priority: Critical
>
> [~cmartella] pointed out this issue: when multithreaded computation is used together with the out-of-core graph, we run into a race condition. The compute threads share the same DiskBackedPartitionStore, whose getPartition() method is not meant to be thread-safe. When two threads request two out-of-core partitions concurrently, they both try to load their partition into the same slot.
> As a result, we can lose the reference to one of the two partitions (which will then not be written back to disk), and we can hit a NullPointerException when both threads try to offload the currently loaded partition to disk.
> I ran this test to confirm the issue:
> https://gist.github.com/4429628
> All tests pass except the one that uses both out-of-core graph and multiple compute threads.
> The error is the following:
> https://gist.github.com/4429650

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira