Posted to user@giraph.apache.org by Eli Reisman <ap...@gmail.com> on 2012/11/19 22:33:54 UTC

Re: Getting SimpleTriangleClosingVertex to run

There have been some interesting changes in Giraph memory usage over the last
few weeks that I have not had a chance to try out myself yet, but that
might interest you and might really help for running applications like
triangle closing and having them scale to interesting input graph sizes.
Worth playing with, anyway.
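
For a rough sense of why triangle closing is hard to scale, here is a
minimal plain-Java sketch of the message pattern. It is not the Giraph
example itself. It assumes the usual two-step scheme: first every vertex
sends its whole neighbor list to each neighbor, then each vertex counts the
ids it received that are not already its neighbors. The class and method
names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class TriangleClosingSketch {
  /** Step 1: every vertex sends each of its neighbor ids to all of its neighbors. */
  static Map<Integer, List<Integer>> broadcast(Map<Integer, List<Integer>> adjacency) {
    Map<Integer, List<Integer>> inbox = new HashMap<>();
    for (List<Integer> neighbors : adjacency.values()) {
      for (int target : neighbors) {
        inbox.computeIfAbsent(target, k -> new ArrayList<>()).addAll(neighbors);
      }
    }
    return inbox;
  }

  /** Step 2: count received ids that are neither the vertex itself nor an existing neighbor. */
  static Map<Integer, Integer> suggestions(int self, List<Integer> neighbors,
                                           List<Integer> received) {
    Map<Integer, Integer> counts = new TreeMap<>();
    for (int id : received) {
      if (id != self && !neighbors.contains(id)) {
        counts.merge(id, 1, Integer::sum);
      }
    }
    return counts;
  }

  /** Total number of ids delivered in step 1: the sum of degree squared over all vertices. */
  static long messageCount(Map<Integer, List<Integer>> inbox) {
    long total = 0;
    for (List<Integer> messages : inbox.values()) {
      total += messages.size();
    }
    return total;
  }

  /** Tiny undirected sample graph: vertices 1 and 2 are each connected to 3 and 4. */
  static Map<Integer, List<Integer>> sampleGraph() {
    Map<Integer, List<Integer>> adjacency = new HashMap<>();
    adjacency.put(1, Arrays.asList(3, 4));
    adjacency.put(2, Arrays.asList(3, 4));
    adjacency.put(3, Arrays.asList(1, 2));
    adjacency.put(4, Arrays.asList(1, 2));
    return adjacency;
  }

  public static void main(String[] args) {
    Map<Integer, List<Integer>> adjacency = sampleGraph();
    Map<Integer, List<Integer>> inbox = broadcast(adjacency);
    System.out.println("ids delivered: " + messageCount(inbox));
    System.out.println("suggestions for 3: "
        + suggestions(3, adjacency.get(3), inbox.get(3)));
  }
}
```

Even this four-vertex graph delivers 16 ids in one step, and the total grows
with the square of each vertex's degree, which is where the memory pressure
comes from.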

Hope all is going well,

Eli

On Sun, Oct 21, 2012 at 12:57 PM, Vernon Thommeret <sy...@gmail.com> wrote:

> Hey Eli,
>
> Thanks for the suggestions. I've been playing with this nights and
> weekends, which is why there's been such a delay :). I should have
> more time in a couple weeks and will dig back in and report back.
>
> Vernon
>
> On Mon, Oct 8, 2012 at 9:00 PM, Eli Reisman <ap...@gmail.com>
> wrote:
> > Brief follow-up:
> >
> > GIRAPH-314, which is not rebased or committed yet, is another part of this
> > puzzle, where I attempt to combine the messages and allow a primitive (hacky)
> > ability to amortize the supersteps where vertices message each other, to keep
> > the volume of messages down per superstep. It's a blatant trade of time for
> > space, and probably a desperate cry for help too. I will update it ASAP so
> > you can play with it. I had pretty promising results, but that was when I had
> > a cluster to play with ;)
> >
> > As a first step, I'd try Maja's recipe for spill-to-disk during messaging.
> > Her advice is in the GIRAPH-314, GIRAPH-322, and GIRAPH-328 threads.
> >
> >
> > On Mon, Oct 8, 2012 at 3:55 PM, Eli Reisman <ap...@gmail.com>
> > wrote:
> >>
> >> I have had some trouble scaling it too; that is an issue I've been working
> >> at from several angles for a few months now. The main problem is the
> >> explosion of messaging that occurs.
> >>
> >> It might be worth trying the spill-to-disk features. There was a thread
> >> in the JIRA (I think for GIRAPH-328 or 322, maybe a bit earlier, I can
> >> check...) where Maja explained that the spill also halts computation when
> >> messages build up, so that we never quite overrun our memory reserves
> >> during the computation/message stages. This trades time for space, but it
> >> is something I have been meaning to experiment with, as in many situations
> >> it's a trade well worth making. I will be experimenting with this option
> >> myself soon; it's on my "short list" of Giraph stuff-to-do!
> >>
> >> I am also independently working on some ways to deduplicate broadcast
> >> messages, such as those used in triangle closing, so that in-memory runs of
> >> this algorithm are possible at interesting scales. That idea has undergone
> >> some "evolution" and is still underway (it's the aforementioned GIRAPH-322),
> >> so more to follow there when my schoolwork lets up... ;)
> >>
> >> Eli
> >>
> >>
> >>
> >> On Sun, Oct 7, 2012 at 12:11 PM, Vernon Thommeret <sy...@gmail.com>
> >> wrote:
> >>>
> >>> Thanks. I ended up getting it working. Having some issues scaling it,
> >>> but working on it.
> >>>
> >>> On Mon, Sep 24, 2012 at 1:17 PM, Eli Reisman <apache.mailbox@gmail.com>
> >>> wrote:
> >>> > The I/O format types have to be compatible. Since
> >>> > IdWithValueVertexOutputFormat does not specify the types it takes, it
> >>> > just attempts to output them using the Writable interface. I use it to
> >>> > output data from the SimpleTriangleClosingVertex. I also had to write an
> >>> > InputFormat to accept IntWritable ids and IntWritable out-edge
> >>> > destinations. Otherwise, it should work.
> >>> >
> >>> >
> >>> >
> >>> > On Mon, Sep 24, 2012 at 12:06 AM, Avery Ching <ac...@apache.org>
> >>> > wrote:
> >>> >>
> >>> >> I don't think the types are compatible.
> >>> >>
> >>> >> public class SimpleTriangleClosingVertex extends EdgeListVertex<
> >>> >>   IntWritable, SimpleTriangleClosingVertex.IntArrayListWritable,
> >>> >>   NullWritable, IntWritable>
> >>> >>
> >>> >> You'll need to use an input format and output format that fit these
> >>> >> types. Otherwise the issue is likely to be serialization/deserialization
> >>> >> here.
> >>> >>
> >>> >>
> >>> >> On 9/23/12 10:44 PM, Vernon Thommeret wrote:
> >>> >>>
> >>> >>> I'm trying to get the SimpleTriangleClosingVertex to run, but I'm
> >>> >>> getting this error:
> >>> >>>
> >>> >>> java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException: IPC server unable to read call parameters: null
> >>> >>>         at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionRequest(BasicRPCCommunications.java:923)
> >>> >>>         at org.apache.giraph.graph.BspServiceWorker.loadVertices(BspServiceWorker.java:327)
> >>> >>>         at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:604)
> >>> >>>         at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:377)
> >>> >>>         at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:578)
> >>> >>>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
> >>> >>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
> >>> >>>         at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
> >>> >>>         at java.security.AccessController.doPrivileged(Native Method)
> >>> >>>         at javax.security.auth.Subject.doAs(Subject.java:396)
> >>> >>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
> >>> >>>         at org.apache.hadoop.mapred.Child.main(Child.java:264)
> >>> >>> Caused by: org.apache.hadoop.ipc.RemoteException: IPC server
> >>> >>>
> >>> >>> This is the diff that causes the issue:
> >>> >>>
> >>> >>> @@ -33,7 +33,7 @@ import org.apache.hadoop.fs.Path;
> >>> >>>   import org.apache.hadoop.io.IntWritable;
> >>> >>>
> >>> >>>   import org.apache.giraph.graph.GiraphJob;
> >>> >>> -import org.apache.giraph.graph.IntIntNullIntVertex;
> >>> >>> +import org.apache.giraph.examples.SimpleTriangleClosingVertex;
> >>> >>>   import org.apache.giraph.io.IntIntNullIntTextInputFormat;
> >>> >>>   import org.apache.giraph.io.AdjacencyListTextVertexOutputFormat;
> >>> >>>
> >>> >>> @@ -44,16 +44,12 @@ import org.apache.log4j.Logger;
> >>> >>>   /**
> >>> >>>    * Simple function to return the in degree for each vertex.
> >>> >>>    */
> >>> >>> -public class SharedConnectionsVertex extends IntIntNullIntVertex implements Tool {
> >>> >>> +public class SharedConnections implements Tool {
> >>> >>>
> >>> >>>     private Configuration conf;
> >>> >>>     private static final Logger LOG =
> >>> >>>         Logger.getLogger(SharedConnections.class);
> >>> >>>
> >>> >>> -  public void compute(Iterable<IntWritable> messages) {
> >>> >>> -    voteToHalt();
> >>> >>> -  }
> >>> >>> -
> >>> >>>     @Override
> >>> >>>     public final int run(final String[] args) throws Exception {
> >>> >>>       Options options = new Options();
> >>> >>> @@ -71,7 +67,7 @@ public class SharedConnections extends IntIntNullIntVertex implements Tool {
> >>> >>>
> >>> >>>       GiraphJob job = new GiraphJob(getConf(), getClass().getName());
> >>> >>>
> >>> >>> -    job.setVertexClass(SharedConnections.class);
> >>> >>> +    job.setVertexClass(SimpleTriangleClosingVertex.class);
> >>> >>>       job.setVertexInputFormatClass(IntIntNullIntTextInputFormat.class);
> >>> >>>       job.setVertexOutputFormatClass(AdjacencyListTextVertexOutputFormat.class);
> >>> >>>       job.setWorkerConfiguration(10, 10, 100.0f);
> >>> >>>
> >>> >>> --
> >>> >>>
> >>> >>> I.e. I have a dummy job that just outputs the vertices, which works,
> >>> >>> but trying to switch the vertex class doesn't seem to work. I'm
> >>> >>> running the latest version of Giraph (rev 1388628). Should this work
> >>> >>> or should I try something different?
> >>> >>>
> >>> >>> Thanks!
> >>> >>> Vernon
> >>> >>
> >>> >>
> >>> >
> >>
> >>
> >
>
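
The type mismatch Avery points out earlier in the thread can be illustrated
without Giraph at all. The sketch below is only an illustration under stated
assumptions: Vertex, VertexInputFormat, and the Writable classes are empty,
hypothetical stand-ins rather than the real Giraph or Hadoop classes, the
IntIntNullInt format is assumed to supply an IntWritable vertex value as its
name suggests, and the reflection check is not something Giraph is claimed
to perform.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Arrays;

final class VertexTypeCheckSketch {
  // Empty stand-ins for the Writable types named in the thread.
  static final class IntWritable {}
  static final class NullWritable {}
  static final class IntArrayListWritable {}

  // Hypothetical base classes, parameterized as
  // <vertex id, vertex value, edge value, message value>.
  abstract static class Vertex<I, V, E, M> {}
  abstract static class VertexInputFormat<I, V, E, M> {}

  // Mirrors the type parameters quoted for SimpleTriangleClosingVertex.
  static final class TriangleClosingVertex
      extends Vertex<IntWritable, IntArrayListWritable, NullWritable, IntWritable> {}

  // Assumption: an "IntIntNullInt" format supplies an IntWritable vertex value.
  static final class IntIntNullIntInputFormat
      extends VertexInputFormat<IntWritable, IntWritable, NullWritable, IntWritable> {}

  // Hypothetical custom format whose vertex value type matches the vertex class.
  static final class MatchingInputFormat
      extends VertexInputFormat<IntWritable, IntArrayListWritable, NullWritable, IntWritable> {}

  /** Reads the four type arguments a class passes to its generic superclass. */
  static Type[] typeArguments(Class<?> cls) {
    ParameterizedType parent = (ParameterizedType) cls.getGenericSuperclass();
    return parent.getActualTypeArguments();
  }

  /** True only if the vertex and the format agree on all four type arguments. */
  static boolean compatible(Class<?> vertexClass, Class<?> formatClass) {
    return Arrays.equals(typeArguments(vertexClass), typeArguments(formatClass));
  }

  public static void main(String[] args) {
    // The vertex value types differ (IntArrayListWritable vs IntWritable).
    System.out.println(
        compatible(TriangleClosingVertex.class, IntIntNullIntInputFormat.class)); // false
    System.out.println(
        compatible(TriangleClosingVertex.class, MatchingInputFormat.class)); // true
  }
}
```

In this sketch the only difference is the second type argument, the vertex
value, which is consistent with the advice to pair the vertex with input and
output formats written for its own types.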