You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@giraph.apache.org by Jan van der Lugt <ja...@gmail.com> on 2014/05/05 23:04:07 UTC
Re: clustering coefficient (counting triangles) in giraph.
In superstep 1 you should definitely store the edges in a collection that
allows fast lookups (like a HashSet) and use that to your neighborhood
checks. Otherwise you will lose a huge amount of performance on high-degree
vertices.
On Wed, Mar 19, 2014 at 7:21 AM, Suijian Zhou <su...@gmail.com>wrote:
> Thank you Kaushik, I will have a try with it.
>
> Best Regards,
> Suijian
>
>
>
> 2014-03-18 10:16 GMT-05:00 Kaushik Patnaik <ka...@gmail.com>:
>
> Copy pasting my code here ... hope thats okay. You can place this in the
>> giraph examples folder where other examples are present.
>>
>> You will also need to create an additional "giraph io format" of the form
>> "JsonDoubleDoubleFloatDoubleVertexInputFormat.java". It takes minor edits
>> to create such a new io format from an already existing io format.
>>
>> package org.apache.giraph.examples;
>>
>> import org.apache.giraph.graph.BasicComputation;
>> import org.apache.giraph.conf.LongConfOption;
>> import org.apache.giraph.edge.Edge;
>> import org.apache.giraph.graph.Vertex;
>> import org.apache.hadoop.io.DoubleWritable;
>> import org.apache.hadoop.io.FloatWritable;
>> import org.apache.hadoop.io.LongWritable;
>> import org.apache.log4j.Logger;
>>
>> import java.io.IOException;
>>
>> @Algorithm(
>> name = "Traingle Counting in Giraph",
>> description = "Return the total number of triangle and also
>> enumerates traingles per vertex"
>> )
>> public class TriangleCounting extends BasicComputation<
>> DoubleWritable, DoubleWritable, FloatWritable, DoubleWritable> {
>>
>> /** Class logger */
>> private static final Logger LOG =
>> Logger.getLogger(TriangleCounting.class);
>>
>> @Override
>> public void compute(
>> Vertex<DoubleWritable, DoubleWritable, FloatWritable> vertex,
>> Iterable<DoubleWritable> messages) throws IOException {
>>
>> /** First Superstep releases messages to vertexIds() whose value is
>> greater than its value. Both VertexId and Message are double **/
>> if (getSuperstep() == 0) {
>> for (Edge<DoubleWritable, FloatWritable> edge: vertex.getEdges()) {
>> if (edge.getTargetVertexId().compareTo(vertex.getId()) == 1) {
>> sendMessage(edge.getTargetVertexId(), vertex.getId());
>> if (LOG.isDebugEnabled()) {
>> LOG.debug("Vertex " + vertex.getId() + " sent message " +
>> vertex.getId() + " to vertex " + edge.getTargetVertexId());
>> }
>> System.out.println("Vertex " + vertex.getId() + " sent message "
>> + vertex.getId() + " to vertex " + edge.getTargetVertexId());
>> }
>> }
>> }
>>
>> /** Second superstep releases messages to message.get() <
>> vertex.getId() < targetVertexId() **/
>> if (getSuperstep() == 1) {
>> for (DoubleWritable message: messages) {
>> for (Edge<DoubleWritable, FloatWritable> edge: vertex.getEdges())
>> {
>> if (edge.getTargetVertexId().compareTo(vertex.getId()) +
>> vertex.getId().compareTo(message) == 2) {
>> sendMessage(edge.getTargetVertexId(), message);
>> if (LOG.isDebugEnabled()) {
>> LOG.debug("Vertex " + vertex.getId() + " sent message " +
>> message + " to vertex " + edge.getTargetVertexId());
>> }
>> System.out.println("Vertex " + vertex.getId() + " sent
>> message " + message + " to vertex " + edge.getTargetVertexId());
>> }
>> }
>> }
>> }
>> /** Sends messages to all its neighbours, the messages it receives **/
>> if (getSuperstep() == 2) {
>> for (DoubleWritable message: messages) {
>> sendMessageToAllEdges(vertex, message);
>> }
>> }
>>
>> if (getSuperstep() == 3) {
>> double Value = 0.0;
>> for (DoubleWritable message: messages) {
>> if (vertex.getId().equals(message)) {
>> Value += 1.0;
>> System.out.println("Vertex " + vertex.getId() + " received
>> message " + message);
>> }
>> }
>> vertex.setValue(new DoubleWritable(Value));
>> }
>>
>> vertex.voteToHalt();
>> }
>> }
>>
>>
>> On Mon, Mar 17, 2014 at 6:10 PM, Suijian Zhou <su...@gmail.com>wrote:
>>
>>> Hi, Paven and Kaushik,
>>> Great! Yes, this is what I need. In the meantime, could you share your
>>> implementation with me? Thanks a lot!
>>>
>>> Best Regards,
>>> Suijian
>>>
>>>
>>>
>>> 2014-03-17 14:38 GMT-05:00 Pavan Kumar A <pa...@outlook.com>:
>>>
>>> If what you need is
>>>> http://en.wikipedia.org/wiki/Clustering_coefficient#Local_clustering_coefficient
>>>> then I implemented it in Giraph, will submit a patch soon
>>>>
>>>> ------------------------------
>>>> Date: Mon, 17 Mar 2014 15:33:07 -0400
>>>> Subject: Re: clustering coefficient (counting triangles) in giraph.
>>>> From: kaushikpatnaik@gmail.com
>>>> To: user@giraph.apache.org
>>>>
>>>>
>>>> Check out this paper on implementing triangle counting in a BSP model
>>>> by Prof David Bader from Georgia Tech.
>>>>
>>>> http://www.cc.gatech.edu/~bader/papers/GraphBSPonXMT-MTAAP2013.pdf
>>>>
>>>> I implemented a similar version in Apache Giraph, and it worked pretty
>>>> well. You have to "switch on" the write to disk option though, as in the
>>>> second and third cycle of the algorithm you have a massive message build up.
>>>>
>>>>
>>>> On Mon, Mar 17, 2014 at 3:17 PM, Suijian Zhou <su...@gmail.com>wrote:
>>>>
>>>> Hi, Experts,
>>>> Does anybody know if there are examples of implementation in giraph
>>>> for clustering coefficient (counting triangles)? Thanks!
>>>>
>>>> Best Regards,
>>>> Suijian
>>>>
>>>>
>>>>
>>>
>>
>