You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@giraph.apache.org by Panagiotis Eustratiadis <ep...@gmail.com> on 2014/07/29 12:14:05 UTC

Generating unique vertex id's for addVertexRequest

Hello everyone,

I'm looking for a way to generate unique id's (of type Long) for the
addVertexRequest. For example, a very silly implementation that works for
graphs with less than 100 vertices would look like this:

public void compute(Iterable<NullWritable> messages) {
...
    long generatedId = generateId(long getId().get());
    addVertexRequest(new LongWritable(generatedId), new DoubleWritable(0));
...
}

private long generateId(long seed) {
    return seed + 100;
}

But as I said, this is just silly. How can I modify the generateId so that
I know the vertex id is unique regardless of the graph size?

Panagiotis Eustratiadis.

Re: Generating unique vertex id's for addVertexRequest

Posted by Christian Krause <me...@ckrause.org>.
Hi Panagiotis,

you could use Java UUIDs. But you would need two longs for this. Otherwise
you could also define your own application specific logic to generate
universal IDs.

Cheers,
Christian


2014-07-29 12:14 GMT+02:00 Panagiotis Eustratiadis <ep...@gmail.com>:

> Hello everyone,
>
> I'm looking for a way to generate unique id's (of type Long) for the
> addVertexRequest. For example, a very silly implementation that works for
> graphs with less than 100 vertices would look like this:
>
> public void compute(Iterable<NullWritable> messages) {
> ...
>     long generatedId = generateId(long getId().get());
>     addVertexRequest(new LongWritable(generatedId), new DoubleWritable(0));
> ...
> }
>
> private long generateId(long seed) {
>     return seed + 100;
> }
>
> But as I said, this is just silly. How can I modify the generateId so that
> I know the vertex id is unique regardless of the graph size?
>
> Panagiotis Eustratiadis.
>

Re: Generating unique vertex id's for addVertexRequest

Posted by Panagiotis Eustratiadis <ep...@gmail.com>.
Thanks a lot for your time, both useful answers =)


2014-07-29 19:34 GMT+03:00 Schweiger, Tom <th...@ebay.com>:

>
> With any generated ID like a hash, there will always be the possibility of
> a collision (different ids creating the same generated id).  However,
> because you are using a long, the size of the hash space is quite large.  a
> collision won't become likely until you have around 4 billion vertexes.  If
> your graph has, say, 10 million vertexes, you can be 99.97% sure there are
> no collisions.  Put another way, you would have to generate  3700 graphs.
> each with 10 million vertexes, before you got one with a single collision.
>
> Your other options are:
>
> * Manage your ids, using a cross-reference table, so that you guarantee a
> one-to-one relationship between the id and the long.
>
> * Change the classes you are using in Giraph to use Text instead of Long
> for the vertex ids.
>
>
>  ------------------------------
> *From:* Panagiotis Eustratiadis [ep.pan.dit@gmail.com]
> *Sent:* Tuesday, July 29, 2014 3:14 AM
> *To:* user@giraph.apache.org
> *Subject:* Generating unique vertex id's for addVertexRequest
>
>     Hello everyone,
>
>  I'm looking for a way to generate unique id's (of type Long) for the
> addVertexRequest. For example, a very silly implementation that works for
> graphs with less than 100 vertices would look like this:
>
>  public void compute(Iterable<NullWritable> messages) {
>  ...
>      long generatedId = generateId(long getId().get());
>      addVertexRequest(new LongWritable(generatedId), new
> DoubleWritable(0));
> ...
> }
>
>  private long generateId(long seed) {
>      return seed + 100;
> }
>
>  But as I said, this is just silly. How can I modify the generateId so
> that I know the vertex id is unique regardless of the graph size?
>
>  Panagiotis Eustratiadis.
>

RE: Generating unique vertex id's for addVertexRequest

Posted by "Schweiger, Tom" <th...@ebay.com>.
With any generated ID like a hash, there will always be the possibility of a collision (different ids creating the same generated id).  However, because you are using a long, the size of the hash space is quite large.  a collision won't become likely until you have around 4 billion vertexes.  If your graph has, say, 10 million vertexes, you can be 99.97% sure there are no collisions.  Put another way, you would have to generate  3700 graphs. each with 10 million vertexes, before you got one with a single collision.

Your other options are:

* Manage your ids, using a cross-reference table, so that you guarantee a one-to-one relationship between the id and the long.

* Change the classes you are using in Giraph to use Text instead of Long for the vertex ids.


________________________________
From: Panagiotis Eustratiadis [ep.pan.dit@gmail.com]
Sent: Tuesday, July 29, 2014 3:14 AM
To: user@giraph.apache.org
Subject: Generating unique vertex id's for addVertexRequest

Hello everyone,

I'm looking for a way to generate unique id's (of type Long) for the addVertexRequest. For example, a very silly implementation that works for graphs with less than 100 vertices would look like this:

public void compute(Iterable<NullWritable> messages) {
...
    long generatedId = generateId(long getId().get());
    addVertexRequest(new LongWritable(generatedId), new DoubleWritable(0));
...
}

private long generateId(long seed) {
    return seed + 100;
}

But as I said, this is just silly. How can I modify the generateId so that I know the vertex id is unique regardless of the graph size?

Panagiotis Eustratiadis.