Posted to user@giraph.apache.org by Panagiotis Eustratiadis <ep...@gmail.com> on 2014/07/29 12:14:05 UTC
Generating unique vertex id's for addVertexRequest
Hello everyone,
I'm looking for a way to generate unique IDs (of type long) for
addVertexRequest. For example, a very silly implementation that works for
graphs with fewer than 100 vertices would look like this:
public void compute(Iterable<NullWritable> messages) {
    ...
    // Derive the new vertex's ID from this vertex's own ID.
    long generatedId = generateId(getId().get());
    addVertexRequest(new LongWritable(generatedId), new DoubleWritable(0));
    ...
}

// Naive placeholder: shifts the seed by a fixed offset.
private long generateId(long seed) {
    return seed + 100;
}
But as I said, this is just silly. How can I modify generateId so that
the new vertex ID is guaranteed to be unique regardless of the graph size?
Panagiotis Eustratiadis.
Re: Generating unique vertex id's for addVertexRequest
Posted by Christian Krause <me...@ckrause.org>.
Hi Panagiotis,
You could use Java UUIDs, but a UUID is 128 bits wide, so you would need two
longs to hold one. Otherwise you could define your own application-specific
logic to generate unique IDs.
Cheers,
Christian
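Both suggestions above can be sketched in a few lines. This is only an illustration under stated assumptions, not Giraph API: the class VertexIds, the 23/40 bit split, and the notion of a "creator" index (for example one number per worker or partition, each keeping its own counter) are all hypothetical. It also assumes the IDs already present in the input graph do not fall into a range that a creator can produce.

```java
import java.util.UUID;

// Hypothetical helper, not part of Giraph.
final class VertexIds {
    // Assumed layout: sign bit clear, 23 bits of creator index, 40 bits of counter.
    static final int COUNTER_BITS = 40;
    static final long MAX_COUNTER = (1L << COUNTER_BITS) - 1;
    static final long MAX_CREATOR = (1L << 23) - 1;

    // Application-specific scheme: two creators can never produce the same ID,
    // because the creator index occupies its own bits.
    static long pack(long creator, long counter) {
        if (creator < 0 || creator > MAX_CREATOR) {
            throw new IllegalArgumentException("creator out of range: " + creator);
        }
        if (counter < 0 || counter > MAX_COUNTER) {
            throw new IllegalArgumentException("counter out of range: " + counter);
        }
        return (creator << COUNTER_BITS) | counter;
    }

    // A UUID is 128 bits, so it only fits into a pair of longs.
    static long[] uuidHalves(UUID uuid) {
        return new long[] {
            uuid.getMostSignificantBits(),
            uuid.getLeastSignificantBits()
        };
    }
}
```

With this layout each creator can hand out 2^40 IDs before its counter overflows. Keeping only one half of a UUID would bring back a (small) chance of collision, which is why the UUID route really needs a two-long vertex ID type.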
Re: Generating unique vertex id's for addVertexRequest
Posted by Panagiotis Eustratiadis <ep...@gmail.com>.
Thanks a lot for your time, both useful answers =)
RE: Generating unique vertex id's for addVertexRequest
Posted by "Schweiger, Tom" <th...@ebay.com>.
With any generated ID, such as a hash, there is always the possibility of a collision (different source ids producing the same generated id). However, because you are using a long, the hash space is quite large: assuming the hash is uniformly distributed, a collision won't become likely until you have around 4 billion vertexes. If your graph has, say, 100 million vertexes, you can be 99.97% sure there are no collisions. Put another way, you would have to generate about 3700 graphs, each with 100 million vertexes, before you could expect one with a single collision.
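Estimates of this kind can be checked with the usual birthday approximation, p ≈ 1 - exp(-n(n-1) / (2 * 2^bits)). The sketch below is only a rough model: it assumes the generated ids are uniformly distributed over the full 64-bit space, and the class and method names are made up for illustration.

```java
// Hypothetical helper for estimating hash-collision odds.
final class CollisionOdds {
    // Probability that at least two of n uniformly random ids collide
    // in a space of 2^bits values (birthday approximation).
    static double collisionProbability(double n, int bits) {
        double space = Math.pow(2.0, bits);
        // -expm1(-x) equals 1 - e^(-x) and stays accurate for tiny x.
        return -Math.expm1(-n * (n - 1) / (2.0 * space));
    }

    public static void main(String[] args) {
        double[] sizes = {1e7, 1e8, 4294967296.0};
        for (double n : sizes) {
            System.out.printf("n = %.0f -> p = %.3e%n", n, collisionProbability(n, 64));
        }
    }
}
```

Under this model, 100 million ids give a collision probability of about 0.027%, 10 million give about 0.00027%, and the probability reaches roughly 39% at 2^32 (about 4.3 billion) ids.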
Your other options are:
* Manage your ids, using a cross-reference table, so that you guarantee a one-to-one relationship between the id and the long.
* Change the classes you are using in Giraph to use Text instead of Long for the vertex ids.
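The cross-reference table from the first option might look like the sketch below. It is a single-process illustration only: the IdTable class is hypothetical, it is not thread-safe, and in a distributed job the mapping would more likely be built in a preprocessing step before the graph is loaded.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical cross-reference table from an external key to a long id.
final class IdTable {
    private final Map<String, Long> idsByKey = new HashMap<>();
    private long nextId = 0;

    // Returns the id already assigned to key, or assigns the next free one.
    // Ids are handed out sequentially, so two keys never share an id.
    long idFor(String key) {
        Long existing = idsByKey.get(key);
        if (existing != null) {
            return existing;
        }
        long id = nextId++;
        idsByKey.put(key, id);
        return id;
    }

    int size() {
        return idsByKey.size();
    }
}
```

Because ids are assigned sequentially instead of being hashed, the mapping is one-to-one by construction, at the cost of having to store and share the table.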