Posted to dev@giraph.apache.org by "Avery Ching (JIRA)" <ji...@apache.org> on 2012/10/16 03:49:04 UTC
[jira] [Updated] (GIRAPH-374) Multithreading in input split loading and compute
[ https://issues.apache.org/jira/browse/GIRAPH-374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Avery Ching updated GIRAPH-374:
-------------------------------
Description:
Cleaned up the WorkerClient hierarchy
- WorkerClientRequestProcessor is a per-thread request cache (input split loading / compute)
- With RPC gone, got rid of the ugly WorkerClientServer and NettyWorkerClientServer
- SendPartitionCache
Made GraphState immutable for multi-threading
Added multithreading for loading the input splits
Added multithreading for compute
Added thread-level debugging as an option
Added additional testing on the number of vertices and edges
Optimized HashWorkerPartitioner to use CopyOnWriteArrayList instead of a synchronized list (this was a bottleneck)
Added a multithreaded TestPageRank test case
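The HashWorkerPartitioner change is an instance of a standard read-mostly concurrency pattern: CopyOnWriteArrayList makes every read a lock-free iteration over an immutable snapshot, at the cost of copying the backing array on each write, which pays off when many compute threads traverse the partition list far more often than it changes. A minimal sketch of that trade-off (the string "partitions" and the countOwned helper are illustrative stand-ins, not Giraph's actual HashWorkerPartitioner code):

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

public class CopyOnWriteSketch {
    // Reads take no lock and see a consistent snapshot; writes copy the
    // backing array, which is acceptable because the list changes rarely
    // compared to how often compute threads read it.
    static final List<String> partitions = new CopyOnWriteArrayList<>();

    static int countOwned(String workerPrefix) {
        int owned = 0;
        for (String p : partitions) {   // lock-free snapshot iteration
            if (p.startsWith(workerPrefix)) {
                owned++;
            }
        }
        return owned;
    }

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 4; i++) {
            partitions.add("worker0-partition" + i);
        }
        // Concurrent readers never block each other or a writer.
        CountDownLatch done = new CountDownLatch(2);
        Runnable reader = () -> { countOwned("worker0"); done.countDown(); };
        new Thread(reader).start();
        new Thread(reader).start();
        done.await();
        System.out.println(countOwned("worker0")); // prints 4
    }
}
```

A plain synchronized list would instead serialize every one of those reads behind a single lock, which is exactly the bottleneck described above.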
I ran the PageRankBenchmark on 20 workers with 10M vertices and 1B edges. All supersteps take about the same time, so I just compared superstep 0 from every test. Compute performance gains are quite nice (even a little faster than before with one thread). Actual gains will depend heavily on the number of cores you have and the available parallelism of the application.
{code}
# threads   compute time (secs)   total time (secs)
Trunk
 1          89                    97.543
Multithreading
 1          86.70094              92.477
 2          50.41521              57.850
 4          38.07716              50.246
 8          38.63188              45.940
16          22.999943             48.607
24          23.649189             45.112
32          21.412325             44.201
{code}
We also saw similar gains in input split loading on an internal app. Future work could further improve the scalability of multithreading.
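Multithreaded input split loading boils down to the familiar thread-pool fan-out pattern: hand each split to a fixed-size ExecutorService and sum the per-split results. A generic sketch under that assumption, not Giraph's actual loader (loadSplit and the thread/split counts are hypothetical; real Giraph reads splits through its VertexInputFormat):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SplitLoaderSketch {
    // Hypothetical stand-in for reading one input split; pretend each
    // split holds 1000 vertices.
    static long loadSplit(int splitId) {
        return 1000L;
    }

    // Fan the splits out over a fixed-size pool and sum per-split counts.
    static long loadAll(int numThreads, int numSplits) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < numSplits; i++) {
                final int splitId = i;
                futures.add(pool.submit(() -> loadSplit(splitId)));
            }
            long totalVertices = 0;
            for (Future<Long> f : futures) {
                totalVertices += f.get(); // blocks until that split is done
            }
            return totalVertices;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(loadAll(4, 8)); // prints 8000
    }
}
```

With splits independent of each other, loading scales with the pool size until disk or network I/O becomes the limit, which matches the "actual gains depend on your cores and parallelism" caveat above.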
> Multithreading in input split loading and compute
> -------------------------------------------------
>
> Key: GIRAPH-374
> URL: https://issues.apache.org/jira/browse/GIRAPH-374
> Project: Giraph
> Issue Type: Improvement
> Reporter: Avery Ching
> Assignee: Avery Ching
>
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira