Posted to user@giraph.apache.org by "Sardeshmukh, Vivek" <vi...@uiowa.edu> on 2014/07/21 23:05:59 UTC

Setting variable value in Compute class and using it in the next superstep

Hi, all--


In my algorithm, I need to set a flag if certain conditions hold (locally at a vertex v). If this flag is set, the vertex should execute some other block of code *only once* and then do nothing until some other condition holds.


My question is: can I declare a flag variable in the class where I override the compute function? I defined the flag as a public variable and set it once the conditions are met, but it seems the value is not "carried" over to the next superstep.

I dug a little in this mailing list and found this:

https://www.mail-archive.com/user@giraph.apache.org/msg01266.html


This post also suggests (along with what I described above) to have a field in the vertex value itself. For that I need to change the vertex input format and also create my own custom vertex class. Is it really necessary?


By the way, I am using Giraph 1.1.0 compiled against Hadoop 1.0.3. I was able to run SimpleShortestPathComputation successfully.


Here are more technical details of my algorithm: I am trying to implement the Delta-stepping shortest path algorithm ( http://dl.acm.org/citation.cfm?id=740136 or http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.46.2200 ). It is mentioned in the Pregel paper. A vertex "relaxes" its light edges if it belongs to the bucket with the minimum index (of course, aggregators!). Once a vertex is done relaxing light edges, it relaxes its heavy edges once (this is where I need a flag). A vertex may be "re-inserted" into a newer bucket and may then have to execute all the steps described here again.
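
For readers who have not met delta-stepping before, the bucket and light/heavy bookkeeping mentioned above can be sketched in plain Java. This only illustrates the usual definitions (bucket i covers tentative distances in [i * delta, (i + 1) * delta), and an edge is light when its weight is at most delta). It is not Giraph code, and the class name, method names, and the value of DELTA are made up for the example.

```java
/** Illustration of delta-stepping bookkeeping; all names and DELTA are hypothetical. */
final class DeltaSteppingSketch {
    /** Bucket width (the "delta" parameter); picked arbitrarily for this example. */
    static final double DELTA = 5.0;

    /**
     * Bucket i holds vertices whose tentative distance lies in
     * [i * DELTA, (i + 1) * DELTA). An unreached vertex (infinite distance)
     * maps to Long.MAX_VALUE because of how Java casts infinity to long.
     */
    static long bucketIndex(double tentativeDistance) {
        return (long) Math.floor(tentativeDistance / DELTA);
    }

    /** Light edges have weight at most DELTA. */
    static boolean isLight(double edgeWeight) {
        return edgeWeight <= DELTA;
    }

    /** Heavy edges have weight above DELTA. */
    static boolean isHeavy(double edgeWeight) {
        return edgeWeight > DELTA;
    }
}
```

In a Giraph job, the smallest non-empty bucket index would presumably be shared through an aggregator, as the message above hints. That part is left out of the sketch.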


Thanks.


Sincerely,

Vivek
A beginner in Giraph (and Java too!)


RE: Setting variable value in Compute class and using it in the next superstep

Posted by "Sardeshmukh, Vivek" <vi...@uiowa.edu>.
Thank you Tom for your prompt reply.


If that is the case then I might be doing something wrong. I'll take a closer look with debug enabled and keep you posted.


Thank you again.


Vivek

________________________________
From: Schweiger, Tom <th...@ebay.com>
Sent: Monday, July 21, 2014 4:37 PM
To: user@giraph.apache.org
Subject: RE: Setting variable value in Compute class and using it in the next superstep

And in answer to:

> This post also suggests (along with what I described above) to have a field in the vertex value itself. For that I need to change the vertex input format and also create my own custom vertex class. Is it really necessary?

No, you don't need a custom vertex class or vertex input format. You can create/initialize the value at the beginning of the first superstep.


RE: Setting variable value in Compute class and using it in the next superstep

Posted by "Sardeshmukh, Vivek" <vi...@uiowa.edu>.
Sorry, my bad! Sorry for spamming your inbox!

Adding the "public" keyword in front of "class DeltaVertexWritable" did the trick! I should have tested this before sending out the previous email.
Thank you.


Vivek
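
The fix above makes sense if the value class is created reflectively by code living in another package: a class that is not public cannot be reached from there, and the failure surfaces as an "Illegal access" error. The sketch below reproduces the same kind of failure in a single file with plain Java. It is an analogy, not Giraph's code: ReflectiveFactory, ReachableValue, and UnreachableValue are hypothetical names, and because a one-file demo cannot span packages, a private constructor stands in for the missing "public" on the class.

```java
import java.lang.reflect.InvocationTargetException;

/** Stand-in for a framework factory that builds values reflectively (hypothetical). */
final class ReflectiveFactory {
    static <T> T newInstance(Class<T> type) {
        try {
            // Requires a no-argument constructor that this class is allowed to call.
            return type.getDeclaredConstructor().newInstance();
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(
                "newInstance: Illegal access " + type.getName(), e);
        } catch (NoSuchMethodException | InstantiationException
                | InvocationTargetException e) {
            throw new IllegalStateException(
                "newInstance: Couldn't instantiate " + type.getName(), e);
        }
    }
}

/** Reachable: the no-argument constructor is accessible to the factory. */
class ReachableValue {
    boolean flag;

    public ReachableValue() { }
}

/** Not reachable: the private constructor blocks instantiation from other classes. */
class UnreachableValue {
    private UnreachableValue() { }
}
```

Calling ReflectiveFactory.newInstance(UnreachableValue.class) throws an IllegalStateException whose message begins with "newInstance: Illegal access", while the ReachableValue call succeeds. For a real vertex value class the practical rule of thumb is a public class with a public no-argument constructor.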
________________________________
From: Sardeshmukh, Vivek <vi...@uiowa.edu>
Sent: Wednesday, July 23, 2014 12:10 PM
To: user@giraph.apache.org
Subject: RE: Setting variable value in Compute class and using it in the next superstep


Hello again,


As Tom and Matthew suggested, I wrote my own custom vertex value class and input format class. I followed Matthew's example to create my own custom vertex class, but now I'm getting the following error while running the program:


java.lang.IllegalStateException: newInstance: Illegal access org.apache.giraph.examples.DeltaVertexWritable
    at org.apache.giraph.utils.ReflectionUtils.newInstance(ReflectionUtils.java:84)
    at org.apache.giraph.utils.WritableUtils.createWritable(WritableUtils.java:68)
    at org.apache.giraph.factories.DefaultVertexValueFactory.newInstance(DefaultVertexValueFactory.java:48)
    at org.apache.giraph.conf.ImmutableClassesGiraphConfiguration.createVertexValue(ImmutableClassesGiraphConfiguration.java:729)
    at org.apache.giraph.utils.VertexIterator.resetEmptyVertex(VertexIterator.java:69)
    at org.apache.giraph.utils.VertexIterator.<init>(VertexIterator.java:60)
    at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:108)
    at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:466)
    at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.flush(NettyWorkerClientRequestProcessor.java:412)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:241)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
    at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)





Here is my DeltaVertexWritable class - https://gist.github.com/sar-vivek/df09cca17cc3f6b5ac60


I tried digging a bit but without success (in the first place, I didn't even understand the error message!).



Thank you.

Vivek
________________________________
From: Sardeshmukh, Vivek <vi...@uiowa.edu>
Sent: Monday, July 21, 2014 6:06 PM
To: user@giraph.apache.org
Subject: RE: Setting variable value in Compute class and using it in the next superstep


Thank you Matthew. Now writing a custom vertex class and input format seems doable! Thank you.



--
Vivek
________________________________
From: Matthew Saltz <sa...@gmail.com>
Sent: Monday, July 21, 2014 5:50 PM
To: user@giraph.apache.org
Subject: Re: Setting variable value in Compute class and using it in the next superstep

Yeah, that's true. Sorry I forgot that part. Luckily, it isn't too tricky either, depending on the input format of your graph. Here's another example <https://gist.github.com/saltzm/ab7172c57dec927061be> to get you started, for a very simple input format for edges with no values. I basically took the code straight from here <http://giraph.apache.org/apidocs/org/apache/giraph/io/formats/LongLongNullTextInputFormat.html> and modified it where I needed to so that it returns the InputFormat that I needed for my code.

You'll probably be better off digging through some of the already implemented InputFormat classes that come with Giraph to do something similar, since I'm guessing your input files will be different from mine. Take a look at the subclasses of TextVertexInputFormat <http://giraph.apache.org/apidocs/org/apache/giraph/io/formats/TextVertexInputFormat.html>, since they deal with a lot of common input format styles, and see if you can modify their code to work with your custom vertex data format. Now, the example I gave you is also easy because I just use the default constructor of the class, but if you need to load additional data from the file into your vertex data and the default constructor isn't appropriate, you may have to do some extra parsing and legwork for that.

Best of luck,
Matthew



On Tue, Jul 22, 2014 at 12:28 AM, Sardeshmukh, Vivek <vi...@uiowa.edu> wrote:

Thank you Matthew for the example link. It is helpful. I'll give it a shot.


If I have a custom vertex class, isn't it necessary to change the VertexInputFormat class too? This class "loads" the data into the vertex, and if the vertex has a custom value field then it doesn't know how to load the input. Am I right?


Vivek
________________________________
From: Schweiger, Tom <th...@ebay.com>
Sent: Monday, July 21, 2014 5:16 PM
To: user@giraph.apache.org
Subject: RE: Setting variable value in Compute class and using it in the next superstep


For more than one flag, a custom class is necessary (unless you're able to, say, toggle the sign bit to get double usage out of a value).
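
The sign-bit idea could look roughly like the sketch below. It assumes the stored number is never negative (true for shortest-path distances with non-negative weights) and is never NaN. SignBitFlag and its methods are hypothetical helpers, not part of Giraph or Hadoop.

```java
/** Packs one boolean flag into the sign bit of a non-negative double (hypothetical). */
final class SignBitFlag {
    /** Returns the value with its sign bit set when flag is true, cleared otherwise. */
    static double pack(double nonNegativeValue, boolean flag) {
        return Math.copySign(nonNegativeValue, flag ? -1.0 : 1.0);
    }

    /** Recovers the original non-negative value. */
    static double value(double packed) {
        return Math.abs(packed);
    }

    /** Reads the flag straight from the sign bit, so it also works for -0.0. */
    static boolean flag(double packed) {
        return (Double.doubleToRawLongBits(packed) & Long.MIN_VALUE) != 0L;
    }
}
```

With a DoubleWritable vertex value, the packed number would be stored with set() and read back with get(), so no custom class or input format is needed. The price is that every place that reads the distance has to remember to unpack it.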

I've started a private thread with Vivek to get a better understanding of what he was trying to solve.

And you are also correct that there isn't much to writing a custom vertex class.  The key is making sure you read and write in the same order.  Likewise, extending a vertex reader can be quite simple.
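
To make the "same order" point concrete, here is a minimal sketch of a value class with a distance and two flags. It uses only java.io so that it runs on its own. In a Giraph job the class would be public and would declare "implements org.apache.hadoop.io.Writable", whose write and readFields methods have the signatures shown. FlaggedDistance, its field names, and the roundTrip helper are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical vertex value: a tentative distance plus two per-vertex flags. */
class FlaggedDistance {
    double distance = Double.POSITIVE_INFINITY;
    boolean lightDone;
    boolean heavyDone;

    /** A no-argument constructor lets a framework create instances reflectively. */
    public FlaggedDistance() { }

    /** Field order here defines the wire format. */
    public void write(DataOutput out) throws IOException {
        out.writeDouble(distance);
        out.writeBoolean(lightDone);
        out.writeBoolean(heavyDone);
    }

    /** Must read the fields in exactly the order write() emitted them. */
    public void readFields(DataInput in) throws IOException {
        distance = in.readDouble();
        lightDone = in.readBoolean();
        heavyDone = in.readBoolean();
    }

    /** Serializes a value and reads it back into a fresh object. */
    static FlaggedDistance roundTrip(FlaggedDistance original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            FlaggedDistance copy = new FlaggedDistance();
            copy.readFields(new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If readFields read the two booleans before the double, the copy would come back with scrambled contents or fail outright, which is the kind of bug the advice above is warning about.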

________________________________
From: Matthew Saltz [saltzm@gmail.com]
Sent: Monday, July 21, 2014 3:09 PM
To: user@giraph.apache.org
Subject: Re: Setting variable value in Compute class and using it in the next superstep

Tom,

If it's necessary to store more than one flag though, for example, won't a custom class be necessary? I'm a beginner too, so I apologize if I'm incorrect about that. Just to clarify, to keep persistent data for a vertex from one superstep to the next, it is necessary to encapsulate it in the type used for the 'V', right? In other words, if Vivek tries to use a normal member variable for the Computation class, it won't work, will it?

Also, just to point out, there actually isn't too much involved with writing your own custom vertex class. Here's a quick example <https://gist.github.com/saltzm/692fba1d3aade035ce9c> to get you started. Within your compute() method you can access the data in this class by doing

SampleVertexData d = vertex.getValue();

and then using d.setFlag(true) or boolean currentFlag = d.getFlag() for example.  And your computation class is now something like

public class MyComputation extends BasicComputation<IdType, SampleVertexData, EdgeType, MessageType> {
    @Override
    public void compute(Vertex<IdType, SampleVertexData, EdgeType> vertex, Iterable<MessageType> messages) {.....}

    ...

}

As a warning, for this class I'm using Hadoop 0.20.203 and I'm also a beginner, so take everything I say with a grain of salt, and Tom please correct me if I'm wrong.
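
The difference between a member variable of the computation class and a field of the vertex value can be shown with a toy model in plain Java. This is not Giraph: it simply assumes that the framework does not keep one computation object per vertex alive across supersteps (modelled here as a fresh object per superstep), which would match the behaviour described at the start of this thread. All class names are hypothetical.

```java
/** Toy per-vertex value; the "framework" keeps this object between supersteps. */
class ToyVertexValue {
    boolean heavyRelaxed;
}

/** Toy computation; in this model a new instance is created for every superstep. */
class ToyComputation {
    boolean fieldFlag; // instance field: gone once this object is discarded

    /** runs[0] counts the field-guarded block, runs[1] the value-guarded block. */
    void compute(ToyVertexValue value, int[] runs) {
        if (!fieldFlag) {
            fieldFlag = true;
            runs[0]++; // meant to run once, but the guard is reset every superstep
        }
        if (!value.heavyRelaxed) {
            value.heavyRelaxed = true;
            runs[1]++; // runs once: the guard lives in the vertex value
        }
    }
}

final class ToySupersteps {
    /** Runs one vertex for the given number of supersteps and returns both counters. */
    static int[] run(int supersteps) {
        ToyVertexValue value = new ToyVertexValue();
        int[] runs = new int[2];
        for (int s = 0; s < supersteps; s++) {
            // Assumption of this toy model: a fresh computation object each superstep.
            ToyComputation computation = new ToyComputation();
            computation.compute(value, runs);
        }
        return runs;
    }
}
```

Over three supersteps the field-guarded block runs three times and the value-guarded block runs once. Even where a computation object does live longer, it would likely be shared by many vertices, so a field on it cannot act as a per-vertex flag.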

Best of luck,
Matthew



Re: Setting variable value in Compute class and using it in the next superstep

Posted by Matthew Saltz <sa...@gmail.com>.
Yeah, that's true. Sorry I forgot that part. Luckily, it isn't too tricky
either, depending on the input format of your graph. Here's another example
<https://gist.github.com/saltzm/ab7172c57dec927061be> to get you started,
for a very simple input format for edges with no values. I basically took
the code straight from here
<http://giraph.apache.org/apidocs/org/apache/giraph/io/formats/LongLongNullTextInputFormat.html>
and modified it where I needed to so that it returns the InputFormat that I needed for
my code. You'll probably be better off digging through some of the already
implemented InputFormat classes that come with Giraph to do something
similar, since I'm guessing your input files will be different than mine.
Take a look at the subclasses of TextVertexInputFormat
<http://giraph.apache.org/apidocs/org/apache/giraph/io/formats/TextVertexInputFormat.html>,
since they deal with a lot of common input format styles, and see if you
can modify their code to work with your custom vertex data format. Now, the
example I give you is also easy because I just use the default constructor
of the class, but if you need to load additional data from the file into
your vertex data and the default constructor isn't appropriate, you may
have to do some extra parsing and legwork for that.

Best of luck,
Matthew




RE: Setting variable value in Compute class and using it in the next superstep

Posted by "Sardeshmukh, Vivek" <vi...@uiowa.edu>.
Thank you Matthew for the example link. It is helpful. I'll give it a shot.


If I have a custom vertex class, isn't it necessary to change the VertexInputFormat class too? This class "loads" the data into the vertex, so if the vertex has a custom value field, it doesn't know how to load the input. Am I right?


Vivek



RE: Setting variable value in Compute class and using it in the next superstep

Posted by "Schweiger, Tom" <th...@ebay.com>.
For more than one flag, a custom class is necessary (unless you're able to, say, toggle the sign bit to get double usage out of a value).
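
To make the sign-bit idea concrete: if the stored value is a distance that is never negative, its sign can double as a one-bit flag. The sketch below is plain Java with hypothetical helper names (SignFlag, withFlag, hasFlag, distance). It assumes the real value is a non-negative double and not NaN.

```java
class SignFlag {
    // Returns the same magnitude with the sign bit set (flag on) or cleared (flag off).
    static double withFlag(double stored, boolean flag) {
        return Math.copySign(stored, flag ? -1.0 : 1.0);
    }

    // Reads the flag from the raw sign bit, so it also works when the distance is 0.0.
    static boolean hasFlag(double stored) {
        return (Double.doubleToRawLongBits(stored) >>> 63) == 1L;
    }

    // Recovers the actual non-negative distance.
    static double distance(double stored) {
        return Math.abs(stored);
    }

    public static void main(String[] args) {
        double stored = withFlag(4.5, true);
        System.out.println(hasFlag(stored) + " " + distance(stored));
    }
}
```

This keeps a single DoubleWritable-style value, at the cost of every reader having to remember to strip the sign before using the distance.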

I've started a private thread with Vivek to get a better understanding of what he was trying to solve.

And you are also correct that there isn't much to writing a custom vertex class.  The key is making sure you read and write in the same order.  Likewise, extending a vertex reader can be quite simple.
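
A small sketch of the "read and write in the same order" point. Hadoop's Writable interface declares write(DataOutput) and readFields(DataInput); the class below mirrors those two methods without importing Hadoop so that it can be run on its own. The class name SketchVertexData and its fields are hypothetical and are not the contents of the gist mentioned in this thread. In a Giraph job the class would additionally declare that it implements org.apache.hadoop.io.Writable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

class SketchVertexData {
    private double distance;
    private boolean flag;

    // No-arg constructor so an empty instance can be created before readFields is called.
    SketchVertexData() { }

    SketchVertexData(double distance, boolean flag) {
        this.distance = distance;
        this.flag = flag;
    }

    double getDistance() { return distance; }
    boolean getFlag() { return flag; }
    void setFlag(boolean flag) { this.flag = flag; }

    // Fields are written as: distance, then flag.
    public void write(DataOutput out) throws IOException {
        out.writeDouble(distance);
        out.writeBoolean(flag);
    }

    // Fields must be read back in exactly the same order: distance, then flag.
    public void readFields(DataInput in) throws IOException {
        distance = in.readDouble();
        flag = in.readBoolean();
    }

    // Serializes to a byte array and deserializes into a fresh instance.
    static SketchVertexData roundTrip(SketchVertexData original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            SketchVertexData copy = new SketchVertexData();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SketchVertexData copy = roundTrip(new SketchVertexData(3.5, true));
        System.out.println(copy.getDistance() + " " + copy.getFlag());
    }
}
```

Swapping the two lines in readFields would still compile, but the bytes would be interpreted as the wrong fields, which is why the ordering matters.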




Re: Setting variable value in Compute class and using it in the next superstep

Posted by Matthew Saltz <sa...@gmail.com>.
Tom,

If it's necessary to store more than one flag though, for example, won't a
custom class be necessary? I'm a beginner too, so I apologize if I'm
incorrect about that. Just to clarify, to keep persistent data for a
vertex from one superstep to the next, it is necessary to encapsulate it in
the type used for the 'V', right? In other words, if Vivek tries to use a
normal member variable for the Computation class, it won't work, will it?

Also, just to point out, there actually isn't too much involved with
writing your own custom vertex class. Here's a quick example
<https://gist.github.com/saltzm/692fba1d3aade035ce9c> to get you started.
Within your compute() method you can access the data in this class by doing

SampleVertexData d = vertex.getValue();

and then using d.setFlag(true) or boolean currentFlag = d.getFlag() for
example.  And your computation class is now something like

public class MyComputation extends BasicComputation<IdType, SampleVertexData, EdgeType, MessageType> {
    @Override
    public void compute(Vertex<IdType, SampleVertexData, EdgeType> vertex, Iterable<MessageType> messages) {.....}

    ...

}

As a warning, for this class I'm using Hadoop 0.20.203 and I'm also a
beginner, so take everything I say with a grain of salt, and Tom please
correct me if I'm wrong.
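
The difference between a member variable of the computation class and a field stored in the vertex value can be shown with a toy simulation. This is not Giraph code: ToyVertex, ToyComputation and ToyRunner are hypothetical names, and the loop assumes (as this thread suggests) that the framework does not keep member fields of the computation object per vertex across supersteps, which is modeled here by creating a fresh object for each superstep.

```java
class ToyVertex {
    final long id;
    boolean flag; // lives with the vertex, so it survives between supersteps

    ToyVertex(long id) {
        this.id = id;
    }
}

class ToyComputation {
    boolean memberFlag; // lives with the computation object only

    void compute(ToyVertex vertex, long superstep) {
        if (superstep == 0) {
            memberFlag = true;  // set on this object only
            vertex.flag = true; // set on the vertex itself
        }
    }
}

class ToyRunner {
    // Runs two supersteps over one vertex and returns
    // {memberFlag, vertex.flag} as observed at the end of superstep 1.
    static boolean[] run() {
        ToyVertex v = new ToyVertex(1L);
        boolean[] seen = new boolean[2];
        for (long superstep = 0; superstep < 2; superstep++) {
            ToyComputation c = new ToyComputation(); // assumption: fresh object per superstep
            c.compute(v, superstep);
            seen[0] = c.memberFlag;
            seen[1] = v.flag;
        }
        return seen;
    }

    public static void main(String[] args) {
        boolean[] seen = run();
        System.out.println("memberFlag=" + seen[0] + ", vertex.flag=" + seen[1]);
    }
}
```

In this toy model the member flag is lost by superstep 1 while the flag on the vertex is still set, which matches the behaviour Vivek described.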

Best of luck,
Matthew



RE: Setting variable value in Compute class and using it in the next superstep

Posted by "Schweiger, Tom" <th...@ebay.com>.
In answer to:

> This post also suggests (along with what I described above) to have a field in the vertex value itself. For that I need to change the vertex input format and also create my own custom vertex class. Is it really necessary?

No, you don't need a custom vertex class or vertex input format. You can create/initialize the value at the beginning of the first superstep.
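
A minimal sketch of initializing the value in the first superstep and then using a flag to run a block only once. It is a plain-Java toy, not Giraph code: MiniState, MiniVertex and MiniCompute are hypothetical names, and it assumes the loaded vertex starts with no value. In Giraph the superstep number is available inside compute() through getSuperstep().

```java
class MiniState {
    double distance = Double.MAX_VALUE; // placeholder for "not reached yet"
    boolean heavyDone = false;          // set once the one-time block has run
}

class MiniVertex {
    MiniState value; // assumed empty until the first superstep
}

class MiniCompute {
    static int heavyRelaxations = 0; // counts how often the "once only" block ran

    static void compute(MiniVertex vertex, long superstep) {
        if (superstep == 0) {
            // Create the value here instead of in the input format.
            vertex.value = new MiniState();
        }
        if (!vertex.value.heavyDone) {
            heavyRelaxations++; // placeholder for the work that must happen once
            vertex.value.heavyDone = true;
        }
    }

    // Runs the given number of supersteps on one vertex and reports
    // how many times the one-time block executed.
    static int runSupersteps(int count) {
        heavyRelaxations = 0;
        MiniVertex v = new MiniVertex();
        for (long s = 0; s < count; s++) {
            compute(v, s);
        }
        return heavyRelaxations;
    }

    public static void main(String[] args) {
        System.out.println("one-time block ran " + runSupersteps(3) + " time(s)");
    }
}
```

Resetting heavyDone to false when a vertex is "re-inserted" into a newer bucket would let the same block run again for that vertex.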



RE: Setting variable value in Compute class and using it in the next superstep

Posted by "Schweiger, Tom" <th...@ebay.com>.
I suggest you store this information as a property of the vertex.  If it's just a flag, you could use a BooleanWritable.  That would be the V in <I, V, E>.



