Posted to dev@giraph.apache.org by "Avery Ching (JIRA)" <ji...@apache.org> on 2013/06/20 21:00:22 UTC

[jira] [Updated] (GIRAPH-694) Setting configuration in GiraphConfiguration causes non thread safe copies

     [ https://issues.apache.org/jira/browse/GIRAPH-694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Avery Ching updated GIRAPH-694:
-------------------------------

    Attachment: GIRAPH-694.patch
    
> Setting configuration in GiraphConfiguration causes non thread safe copies
> --------------------------------------------------------------------------
>
>                 Key: GIRAPH-694
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-694
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Avery Ching
>            Assignee: Avery Ching
>         Attachments: GIRAPH-694.patch
>
>
> When running multithreaded loading, I found a strange problem: all threads would block on one thread that appeared to be reading an infinitely large map.
> Two successive dumps of the thread everyone was waiting on show it stuck in the same HashMap copy:
> "load-17" prio=10 tid=0x00007f2bac138800 nid=0x6a8e runnable [0x0000000047d7a000]
>    java.lang.Thread.State: RUNNABLE
>    at java.util.HashMap.hash(HashMap.java:351)
>    at java.util.HashMap.putForCreate(HashMap.java:512)
>    at java.util.HashMap.putAllForCreate(HashMap.java:534)
>    at java.util.HashMap.<init>(HashMap.java:320)
>    at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:291)
>    - locked <0x00007f2f9be162c8> (a org.apache.hadoop.mapred.JobConf)
>    at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:402)
>    at com.facebook.hiveio.input.HiveApiInputFormat.createRecordReader(HiveApiInputFormat.java:246)
>    at org.apache.giraph.hive.input.edge.HiveEdgeInputFormat.createEdgeReader(HiveEdgeInputFormat.java:86)
>    at com.facebook.digraph.affinitypropagation.io.hive.ReverseEdgeDuplicatorHiveInputFormat.createEdgeReader(ReverseEdgeDuplicatorHiveInputFormat.java:32)
>    at org.apache.giraph.io.internal.WrappedEdgeInputFormat.createEdgeReader(WrappedEdgeInputFormat.java:71)
>    at org.apache.giraph.worker.EdgeInputSplitsCallable.readInputSplit(EdgeInputSplitsCallable.java:123)
>    at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:267)
>    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:211)
>    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
>    at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>    at java.lang.Thread.run(Thread.java:722)
> "load-17" prio=10 tid=0x00007f2bac138800 nid=0x6a8e runnable [0x0000000047d7a000]
>    java.lang.Thread.State: RUNNABLE
>    at java.util.HashMap.putAllForCreate(HashMap.java:533)
>    at java.util.HashMap.<init>(HashMap.java:320)
>    at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:291)
>    - locked <0x00007f2f9be162c8> (a org.apache.hadoop.mapred.JobConf)
>    at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:402)
>    at com.facebook.hiveio.input.HiveApiInputFormat.createRecordReader(HiveApiInputFormat.java:246)
>    at org.apache.giraph.hive.input.edge.HiveEdgeInputFormat.createEdgeReader(HiveEdgeInputFormat.java:86)
>    at com.facebook.digraph.affinitypropagation.io.hive.ReverseEdgeDuplicatorHiveInputFormat.createEdgeReader(ReverseEdgeDuplicatorHiveInputFormat.java:32)
>    at org.apache.giraph.io.internal.WrappedEdgeInputFormat.createEdgeReader(WrappedEdgeInputFormat.java:71)
>    at org.apache.giraph.worker.EdgeInputSplitsCallable.readInputSplit(EdgeInputSplitsCallable.java:123)
>    at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:267)
>    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:211)
>    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
>    at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>    at java.lang.Thread.run(Thread.java:722)
> This appears to have been caused by an unsafe Configuration#set() running while another thread copied the configuration.  Configuration#set() is not thread-safe because of the HashMap updatingResource that it modifies.  This may or may not be present in other versions of Hadoop.
> The solution is simple.  We synchronize on the GiraphConfiguration when setting values, so that the copy becomes thread-safe.
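
The idea behind the fix can be sketched as follows. This is a minimal illustration only, not the contents of GIRAPH-694.patch: the class SynchronizedConfSketch and its methods are hypothetical stand-ins for GiraphConfiguration, and the sketch assumes the copy constructor holds the source object's monitor while copying, which the "locked ... (a org.apache.hadoop.mapred.JobConf)" line in the trace suggests. If writers take that same monitor, a copy can no longer observe the map in the middle of an update.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a configuration class; not Hadoop or Giraph code. */
public class SynchronizedConfSketch {
  /** A plain HashMap is unsafe to modify while another thread iterates over it. */
  private final Map<String, String> properties = new HashMap<>();

  public SynchronizedConfSketch() { }

  /** Copy constructor: holds the source's monitor for the whole copy (assumption). */
  public SynchronizedConfSketch(SynchronizedConfSketch other) {
    synchronized (other) {
      properties.putAll(other.properties);
    }
  }

  /** Writers take the same monitor, so a copy never sees a half-updated map. */
  public synchronized void set(String name, String value) {
    properties.put(name, value);
  }

  public synchronized String get(String name) {
    return properties.get(name);
  }

  public synchronized int size() {
    return properties.size();
  }

  public static void main(String[] args) throws InterruptedException {
    final SynchronizedConfSketch shared = new SynchronizedConfSketch();

    // One thread keeps setting values, as a loader thread might.
    Thread writer = new Thread(() -> {
      for (int i = 0; i < 10000; i++) {
        shared.set("key" + i, "value" + i);
      }
    });
    writer.start();

    // Meanwhile, this thread copies the shared configuration repeatedly.
    for (int i = 0; i < 100; i++) {
      new SynchronizedConfSketch(shared);
    }

    writer.join();
    System.out.println("final copy size: " + new SynchronizedConfSketch(shared).size());
  }
}
```

Removing the synchronized keyword from set() in this sketch would reintroduce the kind of race described above: the copy would iterate over the map while the writer is resizing it.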

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira