Posted to user@giraph.apache.org by Josh Patterson <jo...@cloudera.com> on 2011/09/27 20:56:54 UTC

List of Algos implemented on Giraph

Is there a list of known algorithms that have been implemented on the
Giraph framework?

JP

-- 
Twitter: @jpatanooga
Solution Architect @ Cloudera
hadoop: http://www.cloudera.com

Re: List of Algos implemented on Giraph

Posted by Josh Patterson <jo...@cloudera.com>.
At this point I'm more interested in seeing how people are
implementing various algorithms on different parallel frameworks like

- Giraph
- Spark
- GraphLab
- Ciel

etc., as opposed to seeing anything "production ready". It's more about
taking notes on which frameworks work better for which techniques, and
being able to illustrate some sort of heuristic for which algorithm
should be used with which framework. There are lots of small tradeoffs
depending on what is used and how. Just trying to map that all out.

An academic exercise, to be sure. =)

JP

-- 
Twitter: @jpatanooga
Solution Architect @ Cloudera
hadoop: http://www.cloudera.com

Re: List of Algos implemented on Giraph

Posted by Avery Ching <ac...@apache.org>.
Thanks for your help, Aapo.  Definitely feel free to open/tackle issues on
the JIRA as you find them.  =)

Avery



Re: List of Algos implemented on Giraph

Posted by Aapo Kyrola <ak...@cs.cmu.edu>.
Glad to hear about the progress. I am probably going to work on my algo
soon and try to run it with very big graphs. I plan to do some profiling and
experimental modifications to Giraph code as well to get it to scale. (Don't
worry, I am not committing anything :)). Will keep you updated.


Aapo Kyrola
Ph.D. student, http://www.cs.cmu.edu/~akyrola


Re: List of Algos implemented on Giraph

Posted by Jake Mannix <ja...@gmail.com>.
On Tue, Sep 27, 2011 at 1:29 PM, Aapo Kyrola <ak...@cs.cmu.edu> wrote:

>
>
> The code is still in draft stage, but I attached it.
>

Cool, thanks.  The best place to attach it is usually a JIRA ticket that
describes what the code does.


> it is actually quite optimized code with regards to data serialization etc.
>
> It is just that Giraph currently takes a lot of memory (I guess it is the
> RPC), which makes it difficult to run algos like this that cannot use a
> combiner.
>

  https://issues.apache.org/jira/browse/GIRAPH-28

is tracking some work toward reducing the memory footprint, but it requires some work on

  https://issues.apache.org/jira/browse/GIRAPH-36

before it'll work as well as it can.  Similarly, the RPC (including memory
overhead) is being improved in explorations in

  https://issues.apache.org/jira/browse/GIRAPH-12

and

  https://issues.apache.org/jira/browse/GIRAPH-37

Watch those spaces for upcoming improvements!

  -jake



Re: List of Algos implemented on Giraph

Posted by Aapo Kyrola <ak...@cs.cmu.edu>.

The code is still in draft stage, but I attached it.

It is actually quite optimized code with regard to data serialization, etc.

It is just that Giraph currently takes a lot of memory (I guess it is the RPC), which makes it difficult
to run algos like this that cannot use a combiner.
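
To illustrate the combiner point above, here is a minimal sketch in plain
Python (not Giraph's API; the function names and the sample messages are made
up for illustration). It assumes a combiner is an associative, commutative
fold, such as taking the minimum for shortest paths, applied to messages bound
for the same vertex. Without one, every message has to be buffered until the
target vertex reads it, which is the situation described here for belief
propagation.

```python
from collections import defaultdict


def buffer_without_combiner(messages):
    """Keep every message until its target vertex reads it."""
    inbox = defaultdict(list)
    for target, value in messages:
        inbox[target].append(value)
    return dict(inbox)


def buffer_with_min_combiner(messages):
    """Fold messages for the same target into a single value on arrival."""
    inbox = {}
    for target, value in messages:
        # Hypothetical combiner: keep only the smallest value seen so far.
        inbox[target] = min(value, inbox.get(target, value))
    return inbox


# Made-up (target vertex, value) messages for one superstep.
messages = [("v1", 7.0), ("v1", 3.0), ("v2", 5.0), ("v1", 4.0)]

plain = buffer_without_combiner(messages)
combined = buffer_with_min_combiner(messages)

print(sum(len(values) for values in plain.values()), "values buffered without a combiner")
print(len(combined), "values buffered with a min combiner")
```

With the min combiner the buffer holds one value per target vertex rather than
one per message; an algorithm that needs each sender's message separately
cannot take that shortcut.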

Aapo

Aapo Kyrola
Ph.D. student, http://www.cs.cmu.edu/~akyrola


Re: List of Algos implemented on Giraph

Posted by Jake Mannix <ja...@gmail.com>.
On Tue, Sep 27, 2011 at 12:31 PM, Aapo Kyrola <ak...@cs.cmu.edu> wrote:

>
> I have written a very simple Belief Propagation algorithm for binary
> variables.
>
> Not ready for prime-time either, though :).
>

That's awesome!   Where's the code? :)

It shouldn't be construed that "not ready for prime-time" is *bad*, in case
that's what it looked like I was saying.

The more examples the better, so we can see where the bottlenecks are and
move them toward a productionized stage!

  -jake



Re: List of Algos implemented on Giraph

Posted by Aapo Kyrola <ak...@cs.cmu.edu>.
I have written a very simple Belief Propagation algorithm for binary
variables.

Not ready for prime-time either, though :).

Aapo

Aapo Kyrola
Ph.D. student, http://www.cs.cmu.edu/~akyrola


Re: List of Algos implemented on Giraph

Posted by Jake Mannix <ja...@gmail.com>.
Not really.  It's really, really early, and they're in the "examples" stage;
nothing is really productionized.  There are things like PageRank and finding
shortest paths, but nothing is really ready for prime time yet.
