You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hama.apache.org by Apache Wiki <wi...@apache.org> on 2012/02/24 16:53:13 UTC
[Hama Wiki] Update of "SSSP" by thomasjungblut

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hama Wiki" for change notification.

The "SSSP" page has been changed by thomasjungblut:
http://wiki.apache.org/hama/SSSP?action=diff&rev1=8&rev2=9

  == Single Source Shortest Paths ==
  
-  * The SSSP algorithm described in the Google Pregel paper was used.
+  * The SSSP (abbr. for Single Source Shortest Paths) algorithm described in the Google Pregel paper was used.
   * Introduces IO usage, partitioning based on hashing of vertextID, and collective communication.
  
- == Implementation ==
- 
- TODO: describe internal algorithm shows how it can be implemented using Hama BSP 
  
  == Usage ==
  
- TODO: 
+ {{{
+ bin/hama jar ../hama-0.4.0-examples.jar sssp <start vertex> <input path> <output path> [number of tasks]
+ }}}
  
- Have fun! If you are facing problems, feel free to ask questions on the official mailing list.
+ You need to provide a start vertex name from where the computation should start calculating the shortest paths, scroll down how to provide an input file for it.
  
+ == Submit your own graph ==
+ 
+ You can transform your graph as a adjacency list to fit into the input which Hama is going to parse and calculate the SSSP.
+ 
+ The file that Hama can successfully parse is a TextFile that has the following layout:
+ 
+ {{{
+ Berlin	Frankfurt:20	Munich:50
+ Frankfurt	Berlin:20	Munich:10
+ Munich
+ }}}
+ 
+ This piece of text will adjacent Berlin to Frankfurt (with edge weight of 20) and Munich (with edge weight of 10). Munich is a dangling node, it has no outlinks.
+ As you can see a vertex is always on the leftmost side (we call it the key-site), and the outlinks (to which other vertex it is connected to) are seperated by tabs (\t) as the following elements.
+ SSSP needs edge weights, you must provide them by separating the name of the vertex with a colon ":". The weight must be an integer.
+ 
+ Make sure that every vertex's outlink can somewhere be found in the file as a key-site. Otherwise it will result in weird NullPointerExceptions.
+ 
+ Now you need to transform the text file using:
+ {{{
+ bin/hama jar ../hama-0.4.0-examples.jar sssp-text2seq /tmp/input.txt /tmp/out/
+ }}}
+ 
+ Then you can run sssp on it with:
+ 
+ {{{
+ bin/hama jar ../hama-0.4.0-examples.jar sssp Berlin /tmp/out /tmp/sssp-output
+ }}}
+ 
+ Note that based on what you have configured, the paths may be in HDFS or on local disk.
+ 
+ == Output ==
+ 
+ After the job ran you can see a small snapshot of what the algorithm calculated, for the textfile above you should see:
+ 
+ {{{
+ 12/02/24 16:47:48 INFO bsp.BSPJobClient: Current supersteps number: 5
+ 12/02/24 16:47:48 INFO bsp.BSPJobClient: The total number of supersteps: 5
+ Berlin | 0
+ Munich | 30
+ Frankfurt | 20
+ Job Finished in 4.018 seconds
+ }}}
+ 
+ On the left side you see your vertex name and on the right the cost which is needed to get to that vertex.
+ In the output sequence file you should get a org.apache.hadoop.io.Text (KEY) and org.apache.hadoop.io.IntWritable (VALUE) pair which is exactly the output from above.
+