You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@giraph.apache.org by Jan van der Lugt <ja...@gmail.com> on 2014/05/05 23:04:07 UTC

Re: clustering coefficient (counting triangles) in giraph.

In superstep 1 you should definitely store the edges in a collection that
allows fast lookups (like a HashSet) and use that to your neighborhood
checks. Otherwise you will lose a huge amount of performance on high-degree
vertices.


On Wed, Mar 19, 2014 at 7:21 AM, Suijian Zhou <su...@gmail.com>wrote:

> Thank you Kaushik, I will have a try with it.
>
> Best Regards,
> Suijian
>
>
>
> 2014-03-18 10:16 GMT-05:00 Kaushik Patnaik <ka...@gmail.com>:
>
> Copy pasting my code here ... hope thats okay. You can place this in the
>> giraph examples folder where other examples are present.
>>
>> You will also need to create an additional "giraph io format" of the form
>> "JsonDoubleDoubleFloatDoubleVertexInputFormat.java". It takes minor edits
>> to create such a new io format from an already existing io format.
>>
>> package org.apache.giraph.examples;
>>
>> import org.apache.giraph.graph.BasicComputation;
>> import org.apache.giraph.conf.LongConfOption;
>> import org.apache.giraph.edge.Edge;
>> import org.apache.giraph.graph.Vertex;
>> import org.apache.hadoop.io.DoubleWritable;
>> import org.apache.hadoop.io.FloatWritable;
>> import org.apache.hadoop.io.LongWritable;
>> import org.apache.log4j.Logger;
>>
>> import java.io.IOException;
>>
>> @Algorithm(
>>     name = "Traingle Counting in Giraph",
>>     description = "Return the total number of triangle and also
>> enumerates traingles per vertex"
>> )
>> public class TriangleCounting extends BasicComputation<
>>     DoubleWritable, DoubleWritable, FloatWritable, DoubleWritable> {
>>
>>   /** Class logger */
>>   private static final Logger LOG =
>> Logger.getLogger(TriangleCounting.class);
>>
>>   @Override
>>   public void compute(
>>       Vertex<DoubleWritable, DoubleWritable, FloatWritable> vertex,
>>       Iterable<DoubleWritable> messages) throws IOException {
>>
>>   /** First Superstep releases messages to vertexIds() whose value is
>> greater than its value. Both VertexId and Message are double **/
>>   if (getSuperstep() == 0) {
>>     for (Edge<DoubleWritable, FloatWritable> edge: vertex.getEdges()) {
>>       if (edge.getTargetVertexId().compareTo(vertex.getId()) == 1) {
>>         sendMessage(edge.getTargetVertexId(), vertex.getId());
>>         if (LOG.isDebugEnabled()) {
>>           LOG.debug("Vertex " + vertex.getId() + " sent message " +
>> vertex.getId() + " to vertex " + edge.getTargetVertexId());
>>         }
>>         System.out.println("Vertex " + vertex.getId() + " sent message "
>> + vertex.getId() + " to vertex " + edge.getTargetVertexId());
>>       }
>>     }
>>   }
>>
>>   /** Second superstep releases messages to message.get() <
>> vertex.getId() < targetVertexId() **/
>>   if (getSuperstep() == 1) {
>>     for (DoubleWritable message: messages) {
>>         for (Edge<DoubleWritable, FloatWritable> edge: vertex.getEdges())
>> {
>>           if (edge.getTargetVertexId().compareTo(vertex.getId()) +
>> vertex.getId().compareTo(message) == 2) {
>>             sendMessage(edge.getTargetVertexId(), message);
>>             if (LOG.isDebugEnabled()) {
>>               LOG.debug("Vertex " + vertex.getId() + " sent message " +
>> message + " to vertex " + edge.getTargetVertexId());
>>             }
>>             System.out.println("Vertex " + vertex.getId() + " sent
>> message " + message + " to vertex " + edge.getTargetVertexId());
>>           }
>>         }
>>     }
>>   }
>>   /** Sends messages to all its neighbours, the messages it receives **/
>>   if (getSuperstep() == 2) {
>>     for (DoubleWritable message: messages) {
>>         sendMessageToAllEdges(vertex, message);
>>     }
>>   }
>>
>>   if (getSuperstep() == 3) {
>>       double Value = 0.0;
>>       for (DoubleWritable message: messages) {
>>           if (vertex.getId().equals(message)) {
>>               Value += 1.0;
>>               System.out.println("Vertex " + vertex.getId() + " received
>> message " + message);
>>           }
>>       }
>>       vertex.setValue(new DoubleWritable(Value));
>>   }
>>
>>   vertex.voteToHalt();
>>   }
>> }
>>
>>
>> On Mon, Mar 17, 2014 at 6:10 PM, Suijian Zhou <su...@gmail.com>wrote:
>>
>>> Hi, Paven and Kaushik,
>>>   Great! Yes, this is what I need. In the meantime, could you share your
>>> implementation with me? Thanks a lot!
>>>
>>>   Best Regards,
>>>   Suijian
>>>
>>>
>>>
>>> 2014-03-17 14:38 GMT-05:00 Pavan Kumar A <pa...@outlook.com>:
>>>
>>>  If what you need is
>>>> http://en.wikipedia.org/wiki/Clustering_coefficient#Local_clustering_coefficient
>>>> then I implemented it in Giraph, will submit a patch soon
>>>>
>>>> ------------------------------
>>>> Date: Mon, 17 Mar 2014 15:33:07 -0400
>>>> Subject: Re: clustering coefficient (counting triangles) in giraph.
>>>> From: kaushikpatnaik@gmail.com
>>>> To: user@giraph.apache.org
>>>>
>>>>
>>>> Check out this paper on implementing triangle counting in a BSP model
>>>> by Prof David Bader from Georgia Tech.
>>>>
>>>> http://www.cc.gatech.edu/~bader/papers/GraphBSPonXMT-MTAAP2013.pdf
>>>>
>>>> I implemented a similar version in Apache Giraph, and it worked pretty
>>>> well. You have to "switch on" the write to disk option though, as in the
>>>> second and third cycle of the algorithm you have a massive message build up.
>>>>
>>>>
>>>> On Mon, Mar 17, 2014 at 3:17 PM, Suijian Zhou <su...@gmail.com>wrote:
>>>>
>>>> Hi, Experts,
>>>>   Does anybody know if there are examples of implementation in giraph
>>>> for clustering coefficient (counting triangles)? Thanks!
>>>>
>>>>   Best Regards,
>>>>   Suijian
>>>>
>>>>
>>>>
>>>
>>
>