You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@giraph.apache.org by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org> on 2012/05/17 23:09:13 UTC

[jira] [Created] (GIRAPH-191) Random Walk with Restart

Gianmarco De Francisci Morales created GIRAPH-191:
-----------------------------------------------------

             Summary: Random Walk with Restart
                 Key: GIRAPH-191
                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
             Project: Giraph
          Issue Type: New Feature
            Reporter: Gianmarco De Francisci Morales


Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.

{code}
if ( myID == sourceID )
      DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
else
      DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
{code}

It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
More or less along these lines:
http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Re: [jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by Eli Reisman <in...@gmail.com>.
Yeah its too bad there isn't an easier way to do this, if you want just
bare metal performance then per-vertex internal state regarding out-edges
could easily be an array of primitives, during the calculation steps you
really only need writables for outgoing messages on the wire, but the whole
framework is really interwoven with them. Several people here were talking
about this same problem. I guess the need for writables during the calc
stages varies depending on your algorithm and the nature of each vertex's
interaction with non local vertices. Ideally we could deal with writables
only in the InputSplit and final output stages and use Netty to send
whatever compact data structures we want to populate the application with
but that might be a few JIRA's away.



On Sat, Aug 11, 2012 at 11:59 AM, Alessandro Presta <al...@fb.com>wrote:

> Hi Gianmarco,
>
> I sympathize with that. However, I can't think of a way to avoid it:
> Vertex needs to be defined in terms of Writables, we depend on that for
> serialization (which enables communication and fault tolerance).
> Was it any different in the 0.1 API? The only thing I can think of is that
> now a vertex with weighted edges needs to return Edge objects in its
> iterator.
>
> That said, if you're writing your own vertex with optimized edge storage,
> I guess you could add iterators over primitiave types, and use those in
> your algorithm. However, for example, you would still have to wrap values
> in Writables for message passing.
> But yeah, if this is for a vertex that gets shipped with Giraph, we
> probably don't want to do that.
>
> Please let me know if I understood the issue and any ideas you might have
> to improve it.
>
> Alessandro
>
> On 8/10/12 6:37 PM, "Gianmarco De Francisci Morales" <gd...@apache.org>
> wrote:
>
> >Hi Alessandro,
> >
> >My main concern is that I need a lot of wrapping objects if I want to keep
> >data as primitive values in the vertex implementation.
> >Right now I have some wrapper iterables/iterators that do that, but I
> >think
> >all this temporary object creation kind of kills the memory efficiency of
> >the vertex implementation.
> >
> >I am rebasing the patch once again after the aggregator changes (and
> >trying
> >to understand how they work now).
> >As soon as I finish and Jira comes up again I will post an updated patch.
> >
> >Cheers,
> >--
> >Gianmarco
> >
> >
> >
> >On Thu, Aug 9, 2012 at 4:00 PM, Alessandro Presta (JIRA)
> ><ji...@apache.org>wrote:
> >
> >>
> >>     [
> >>
> >>https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira
> .
> >>plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431815#c
> >>omment-13431815]
> >>
> >> Alessandro Presta commented on GIRAPH-191:
> >> ------------------------------------------
> >>
> >> Gianmarco: can you elaborate more on the difficulties you're having with
> >> vertex implementation?
> >>
> >> > Random Walks on Graphs
> >> > ----------------------
> >> >
> >> >                 Key: GIRAPH-191
> >> >                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
> >> >             Project: Giraph
> >> >          Issue Type: New Feature
> >> >          Components: examples
> >> >    Affects Versions: 0.2.0
> >> >            Reporter: Gianmarco De Francisci Morales
> >> >            Assignee: Gianmarco De Francisci Morales
> >> >         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch,
> >> GIRAPH-191.patch, PIG-191.1.patch
> >> >
> >> >
> >> > Implementing RWR on Giraph should be a very simple modification of the
> >> SimplePageRankVertex code.
> >> > {code}
> >> > if ( myID == sourceID )
> >> >       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f *
> >> sum);
> >> > else
> >> >       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> >> > {code}
> >> > It would be nice to make it as configurable as possible by using
> >> parametric damping factors, preference vectors, strongly preferential,
> >> etc...
> >> > More or less along these lines:
> >> >
> >>
> >>
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html
> >>
> >> --
> >> This message is automatically generated by JIRA.
> >> If you think it was sent incorrectly, please contact your JIRA
> >> administrators:
> >>
> https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
> >> For more information on JIRA, see:
> >>http://www.atlassian.com/software/jira
> >>
> >>
>
>

Re: [jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by Alessandro Presta <al...@fb.com>.
Hi Gianmarco,

I sympathize with that. However, I can't think of a way to avoid it:
Vertex needs to be defined in terms of Writables, we depend on that for
serialization (which enables communication and fault tolerance).
Was it any different in the 0.1 API? The only thing I can think of is that
now a vertex with weighted edges needs to return Edge objects in its
iterator.

That said, if you're writing your own vertex with optimized edge storage,
I guess you could add iterators over primitive types, and use those in
your algorithm. However, for example, you would still have to wrap values
in Writables for message passing.
But yeah, if this is for a vertex that gets shipped with Giraph, we
probably don't want to do that.

Please let me know if I understood the issue and any ideas you might have
to improve it.

Alessandro

On 8/10/12 6:37 PM, "Gianmarco De Francisci Morales" <gd...@apache.org>
wrote:

>Hi Alessandro,
>
>My main concern is that I need a lot of wrapping objects if I want to keep
>data as primitive values in the vertex implementation.
>Right now I have some wrapper iterables/iterators that do that, but I
>think
>all this temporary object creation kind of kills the memory efficiency of
>the vertex implementation.
>
>I am rebasing the patch once again after the aggregator changes (and
>trying
>to understand how they work now).
>As soon as I finish and Jira comes up again I will post an updated patch.
>
>Cheers,
>--
>Gianmarco
>
>
>
>On Thu, Aug 9, 2012 at 4:00 PM, Alessandro Presta (JIRA)
><ji...@apache.org>wrote:
>
>>
>>     [
>> 
>>https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.
>>plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431815#c
>>omment-13431815]
>>
>> Alessandro Presta commented on GIRAPH-191:
>> ------------------------------------------
>>
>> Gianmarco: can you elaborate more on the difficulties you're having with
>> vertex implementation?
>>
>> > Random Walks on Graphs
>> > ----------------------
>> >
>> >                 Key: GIRAPH-191
>> >                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>> >             Project: Giraph
>> >          Issue Type: New Feature
>> >          Components: examples
>> >    Affects Versions: 0.2.0
>> >            Reporter: Gianmarco De Francisci Morales
>> >            Assignee: Gianmarco De Francisci Morales
>> >         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch,
>> GIRAPH-191.patch, PIG-191.1.patch
>> >
>> >
>> > Implementing RWR on Giraph should be a very simple modification of the
>> SimplePageRankVertex code.
>> > {code}
>> > if ( myID == sourceID )
>> >       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f *
>> sum);
>> > else
>> >       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
>> > {code}
>> > It would be nice to make it as configurable as possible by using
>> parametric damping factors, preference vectors, strongly preferential,
>> etc...
>> > More or less along these lines:
>> >
>> 
>>http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html
>>
>> --
>> This message is automatically generated by JIRA.
>> If you think it was sent incorrectly, please contact your JIRA
>> administrators:
>> https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
>> For more information on JIRA, see:
>>http://www.atlassian.com/software/jira
>>
>>


Re: [jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by Gianmarco De Francisci Morales <gd...@apache.org>.
Hi Alessandro,

My main concern is that I need a lot of wrapping objects if I want to keep
data as primitive values in the vertex implementation.
Right now I have some wrapper iterables/iterators that do that, but I think
all this temporary object creation kind of kills the memory efficiency of
the vertex implementation.

I am rebasing the patch once again after the aggregator changes (and trying
to understand how they work now).
As soon as I finish and Jira comes up again I will post an updated patch.

Cheers,
--
Gianmarco



On Thu, Aug 9, 2012 at 4:00 PM, Alessandro Presta (JIRA) <ji...@apache.org>wrote:

>
>     [
> https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431815#comment-13431815]
>
> Alessandro Presta commented on GIRAPH-191:
> ------------------------------------------
>
> Gianmarco: can you elaborate more on the difficulties you're having with
> vertex implementation?
>
> > Random Walks on Graphs
> > ----------------------
> >
> >                 Key: GIRAPH-191
> >                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
> >             Project: Giraph
> >          Issue Type: New Feature
> >          Components: examples
> >    Affects Versions: 0.2.0
> >            Reporter: Gianmarco De Francisci Morales
> >            Assignee: Gianmarco De Francisci Morales
> >         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch,
> GIRAPH-191.patch, PIG-191.1.patch
> >
> >
> > Implementing RWR on Giraph should be a very simple modification of the
> SimplePageRankVertex code.
> > {code}
> > if ( myID == sourceID )
> >       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f *
> sum);
> > else
> >       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> > {code}
> > It would be nice to make it as configurable as possible by using
> parametric damping factors, preference vectors, strongly preferential,
> etc...
> > More or less along these lines:
> >
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA
> administrators:
> https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
>
>

[jira] [Updated] (GIRAPH-191) Random Walks on Graphs

Posted by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gianmarco De Francisci Morales updated GIRAPH-191:
--------------------------------------------------

    Attachment: GIRAPH-191.3.patch

As I said on the dev list, the main concern is creating a lot of objects just for wrapping primitives.
Anyway, here is a more complete patch, rebased and with workings tests.

On a side note, pleasing checkstyle makes me go nuts.
We may want to develop some automated formatting profiles for the most common IDEs to encourage people to contribute.
                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.3.patch, GIRAPH-191.patch, PIG-191.1.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452255#comment-13452255 ] 

Gianmarco De Francisci Morales commented on GIRAPH-191:
-------------------------------------------------------

Thanks Eli, that would be great!
                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.3.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by "Avery Ching (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452554#comment-13452554 ] 

Avery Ching commented on GIRAPH-191:
------------------------------------

Thanks Eli, I got backed up this weekend.  Thanks Gianmarco for the work!
                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.3.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (GIRAPH-191) Random Walks on Graphs

Posted by "Sebastian Schelter (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Schelter updated GIRAPH-191:
--------------------------------------

    Attachment: GIRAPH-191-1.patch
    
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (GIRAPH-191) Random Walks on Graphs

Posted by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gianmarco De Francisci Morales updated GIRAPH-191:
--------------------------------------------------

    Attachment: GIRAPH-191.3.patch
    
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.3.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by "Alessandro Presta (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431815#comment-13431815 ] 

Alessandro Presta commented on GIRAPH-191:
------------------------------------------

Gianmarco: can you elaborate more on the difficulties you're having with vertex implementation?
                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.patch, PIG-191.1.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (GIRAPH-191) Random Walks on Graphs

Posted by "Sebastian Schelter (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Schelter updated GIRAPH-191:
--------------------------------------

          Component/s: examples
    Affects Version/s: 0.2.0
              Summary: Random Walks on Graphs  (was: Random Walk with Restart)
    
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by "Eugene Koontz (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433852#comment-13433852 ] 

Eugene Koontz commented on GIRAPH-191:
--------------------------------------

Good idea about the automated formatting profiles - perhaps it could be incorporated into the "mvn idea:idea" target for use by IntelliJ users (like myself).
                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.3.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440167#comment-13440167 ] 

Gianmarco De Francisci Morales commented on GIRAPH-191:
-------------------------------------------------------

Forgot to mention, patch is available in RB now.
https://reviews.apache.org/r/6653/
                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.3.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452592#comment-13452592 ] 

Hudson commented on GIRAPH-191:
-------------------------------

Integrated in Giraph-trunk-Commit #192 (See [https://builds.apache.org/job/Giraph-trunk-Commit/192/])
    GIRAPH-191: Random Walks On Graphs (Gianmarco De Francisci Morales via ereisman) (Revision 1383115)

     Result = SUCCESS
ereisman : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1383115
Files : 
* /giraph/trunk/CHANGELOG
* /giraph/trunk/checkstyle.xml
* /giraph/trunk/src/main/java/org/apache/giraph/examples/DoubleSumCombiner.java
* /giraph/trunk/src/main/java/org/apache/giraph/examples/LongDoubleFloatDoubleTextInputFormat.java
* /giraph/trunk/src/main/java/org/apache/giraph/examples/NormalizingLongDoubleFloatDoubleTextInputFormat.java
* /giraph/trunk/src/main/java/org/apache/giraph/examples/RandomWalkVertex.java
* /giraph/trunk/src/main/java/org/apache/giraph/examples/RandomWalkWithRestartVertex.java
* /giraph/trunk/src/main/java/org/apache/giraph/examples/RandomWalkWorkerContext.java
* /giraph/trunk/src/main/java/org/apache/giraph/examples/VertexWithDoubleValueFloatEdgeTextOutputFormat.java
* /giraph/trunk/src/main/java/org/apache/giraph/graph/LongDoubleFloatDoubleEdgeListVertex.java
* /giraph/trunk/src/main/java/org/apache/giraph/graph/LongDoubleNullDoubleVertex.java
* /giraph/trunk/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java
* /giraph/trunk/src/main/java/org/apache/giraph/utils/MathUtils.java
* /giraph/trunk/src/main/java/org/apache/giraph/utils/UnmodifiableDoubleArrayIterator.java
* /giraph/trunk/src/main/java/org/apache/giraph/utils/UnmodifiableLongArrayIterator.java
* /giraph/trunk/src/main/java/org/apache/giraph/utils/UnmodifiableLongFloatEdgeArrayIterable.java
* /giraph/trunk/src/main/java/org/apache/giraph/utils/UnmodifiableLongNullEdgeArrayIterable.java
* /giraph/trunk/src/test/java/org/apache/giraph/examples/RandomWalkWithRestartVertexTest.java

                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.3.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by "Eugene Koontz (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433850#comment-13433850 ] 

Eugene Koontz commented on GIRAPH-191:
--------------------------------------

Hi Gianmarco, I have the same issues with RB. I've asked the infra folks to kindly make it possible to submit Git patches:

https://issues.apache.org/jira/browse/INFRA-5128

I hope to hear from them soon.


                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.3.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279026#comment-13279026 ] 

Gianmarco De Francisci Morales commented on GIRAPH-191:
-------------------------------------------------------

I tested the patch and it looks like it's working.
I had to change the config parameter to add .getName() to SOURCE_VERTEX in RandomWalkWithRestartVertex.java:35
{code}
    /** Configuration parameter for the source vertex */
    static final String SOURCE_VERTEX =
            RandomWalkWithRestartVertex.class.getName() + ".sourceVertex";
{code}

Otherwise the String reads as "class blah..."

I compared some toy output with a reference implementation and it looks good!

I think the next step would be to support weighted graphs.
The graph can be made stochastic on the fly, at loading time.
Thoughts?

I will try to hack some code.

                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (GIRAPH-191) Random Walks on Graphs

Posted by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gianmarco De Francisci Morales updated GIRAPH-191:
--------------------------------------------------

    Attachment:     (was: GIRAPH-191.3.patch)
    
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by "Sebastian Schelter (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279089#comment-13279089 ] 

Sebastian Schelter commented on GIRAPH-191:
-------------------------------------------

I tested PageRank in the test with 200M vertices on 6 machines without
problems.
Am 18.05.2012 19:59 schrieb "Gianmarco De Francisci Morales (JIRA)" <


                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (GIRAPH-191) Random Walks on Graphs

Posted by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gianmarco De Francisci Morales updated GIRAPH-191:
--------------------------------------------------

    Attachment: GIRAPH-191.2.patch

Attaching a work in progress, so we can synch.

Added weighted graph support.
Tested it on a small graph against a reference implementation.
I didn't manage to test it properly with unit testing. I was unable to run the tests, not sure whether it's munge, mvn or something else's fault.
                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by "Avery Ching (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449289#comment-13449289 ] 

Avery Ching commented on GIRAPH-191:
------------------------------------

Gianmarco, I'll try and take a look today or tomorrow.  Thanks for the patch!
                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.3.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by "Eli Reisman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452174#comment-13452174 ] 

Eli Reisman commented on GIRAPH-191:
------------------------------------

I can try to take a look at this too if Avery is busy, sorry about the delay I have been moving...

                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.3.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (GIRAPH-191) Random Walk with Restart

Posted by "Sebastian Schelter (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Schelter updated GIRAPH-191:
--------------------------------------

    Attachment: GIRAPH-191.patch

A first draft for the code. It contains an abstract RandomWalkVertex which PageRank, RWR and others can extend.
                
> Random Walk with Restart
> ------------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (GIRAPH-191) Random Walks on Graphs

Posted by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gianmarco De Francisci Morales updated GIRAPH-191:
--------------------------------------------------

    Attachment:     (was: PIG-191.1.patch)
    
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (GIRAPH-191) Random Walk with Restart

Posted by "Paolo Castagna (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278712#comment-13278712 ] 

Paolo Castagna commented on GIRAPH-191:
---------------------------------------

It would be good if we could also use this or provide a PageRank implementation which deals with dangling nodes/vertexes properly.

Dangling vertexes are vertexes with no edges.

SinglePageRankVertex has:

{code}
    if (getSuperstep() < MAX_SUPERSTEPS) {
      long edges = getNumOutEdges();
      sendMsgToAllEdges(
          new DoubleWritable(getVertexValue().get() / edges));
    } else {
      voteToHalt();
    }
{code}

This does not work when getNumOutEdges() returns 0.

Some suggest to divide the PageRank scores of dangling vertexes evenly among all other vertex (it's yet another sort of random jump to propagate PageRank scores to all nodes). This can be implemented in Giraph as a separate superstep using a SumAggregator.

Discussion on the giraph-user mailing list with further comments and references is here:

 - http://mail-archives.apache.org/mod_mbox/incubator-giraph-user/201205.mbox/%3C4FB509F4.4040407@googlemail.com%3E
                
> Random Walk with Restart
> ------------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by "Eli Reisman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452469#comment-13452469 ] 

Eli Reisman commented on GIRAPH-191:
------------------------------------

+1, committed. Thanks again.
                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.3.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by "Sebastian Schelter (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279092#comment-13279092 ] 

Sebastian Schelter commented on GIRAPH-191:
-------------------------------------------

I meant the code in the patch, sry.


                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Assigned] (GIRAPH-191) Random Walks on Graphs

Posted by "Jakob Homan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jakob Homan reassigned GIRAPH-191:
----------------------------------

    Assignee: Gianmarco De Francisci Morales
    
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.patch, PIG-191.1.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433656#comment-13433656 ] 

Gianmarco De Francisci Morales commented on GIRAPH-191:
-------------------------------------------------------

Tried to upload the patch on RB but even with some text mangling to hide the git/svn differences reviews.apache.org just bounces it back to me with a nice 500 error.
No clue how to solve it, sorry.
                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.3.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448756#comment-13448756 ] 

Gianmarco De Francisci Morales commented on GIRAPH-191:
-------------------------------------------------------

Could someone review the patch before it gets too stale, please?
                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.3.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434015#comment-13434015 ] 

Gianmarco De Francisci Morales commented on GIRAPH-191:
-------------------------------------------------------

Yes, it works the same for Eclipse with the checkstyle plugin.
However, there is no automated way of just pressing a button and getting things straight.
Instead one has to manually fix every single issue.
                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.3.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (GIRAPH-191) Random Walks on Graphs

Posted by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gianmarco De Francisci Morales updated GIRAPH-191:
--------------------------------------------------

    Attachment: PIG-191.1.patch

Added handling of dangling nodes.
Added the option of specifying the sources for the RWR via a file.
Tests still need some fixing.

I think the patch is at a point where it would need some expert eyes.
                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.patch, PIG-191.1.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by "Avery Ching (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433969#comment-13433969 ] 

Avery Ching commented on GIRAPH-191:
------------------------------------

FYI, for IntelliJ users, you can simply make sure that you load our checkstyle file.  Then you can highlight the checkstyle errors.
                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.3.patch, GIRAPH-191.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (GIRAPH-191) Random Walks on Graphs

Posted by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431754#comment-13431754 ] 

Gianmarco De Francisci Morales commented on GIRAPH-191:
-------------------------------------------------------

Hi Jakob,
I am trying to get this patch back in sync with trunk but the changes to the API make it more difficult to implement your own memory structures for the vertex.
I need two layers of wrapping over the primitive types I am using (to save memory).
I will fix the checkstyle problems and the tests.
                
> Random Walks on Graphs
> ----------------------
>
>                 Key: GIRAPH-191
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-191
>             Project: Giraph
>          Issue Type: New Feature
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191-1.patch, GIRAPH-191.2.patch, GIRAPH-191.patch, PIG-191.1.patch
>
>
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex code.
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors, preference vectors, strongly preferential, etc...
> More or less along these lines:
> http://law.dsi.unimi.it/software/docs/it/unimi/dsi/law/rank/PageRank.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira