Posted to dev@giraph.apache.org by Claudio Martella <cl...@gmail.com> on 2011/11/21 21:43:18 UTC

Apache Giraph talk @ FOSDEM

Hi devs,

FOSDEM has announced a devroom completely dedicated to Graph Processing:

https://lists.fosdem.org/pipermail/fosdem/2011-November/001344.html

I'm going to submit a talk proposal there. Here's the draft; feedback is welcome :)

Title: "Apache Giraph: distributed graph processing in the cloud."

Abstract: Web and online social graphs have been rapidly growing in
size and scale during the past decade. In 2008, Google estimated that
the number of web pages reached over a trillion. Online social
networking and email sites, including Yahoo!, Google, Microsoft,
Facebook, LinkedIn, and Twitter, have hundreds of millions of users
and are expected to grow much more in the future. Processing these
graphs plays a big role in delivering relevant and personalized information to
users, such as results from a search engine or news in an online
social networking site.

The Apache Giraph (http://incubator.apache.org/giraph) project is a
fault-tolerant in-memory distributed graph processing system which runs
on top of a standard Hadoop cluster and is capable of running any
standard Bulk Synchronous Parallel (BSP) operation over any large
generic data set which can be represented as a graph. Apache Giraph is
a loose implementation of Google Pregel.
Giraph entered the ASF Incubator in July 2011, where it has enlisted
the aid of committers from Yahoo!, Facebook, LinkedIn, and Twitter.

The talk will present why running MapReduce jobs for graph processing
can be a problem, introducing the reason why Google designed Pregel
in the first place. Then the BSP model will be presented, focusing on how
it can be used to implement a distributed graph processing engine.
The last part of the talk will be dedicated to Apache Giraph, with a
description of the programming model (i.e. the API, some typical
examples such as PageRank and Single Source Shortest Path) along with
a technical overview of how the architecture of Giraph works and how
it leverages the Hadoop infrastructure.
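
To make the BSP idea concrete, here is a minimal single-machine sketch in
Python of Single Source Shortest Path written in a vertex-centric style.
It is only an illustration and is not the Giraph API: the function name
bsp_shortest_paths and the adjacency-dict input format are assumptions
made up for this example, and edge weights are assumed to be non-negative.

```python
import math
from collections import defaultdict


def bsp_shortest_paths(edges, source):
    """Single-source shortest paths in a vertex-centric BSP style.

    edges: dict mapping vertex -> list of (neighbor, weight) pairs.
    Returns a dict mapping every vertex to its distance from source.
    Assumes non-negative edge weights (hypothetical helper, not Giraph code).
    """
    # Collect every vertex, including ones that only appear as targets.
    vertices = set(edges)
    vertices.add(source)
    for targets in edges.values():
        vertices.update(v for v, _ in targets)

    distance = {v: math.inf for v in vertices}
    # Messages to be delivered at the start of the next superstep.
    inbox = {source: [0]}

    while inbox:  # no messages in flight means every vertex has halted
        outbox = defaultdict(list)
        # One superstep: each vertex that received messages runs compute.
        for vertex, messages in inbox.items():
            best = min(messages)
            if best < distance[vertex]:
                distance[vertex] = best
                for neighbor, weight in edges.get(vertex, []):
                    outbox[neighbor].append(best + weight)
        # Barrier: messages sent in this superstep become visible only now.
        inbox = dict(outbox)
    return distance
```

Each pass of the while loop plays the role of one superstep: vertices only
read messages sent in the previous pass, and the computation stops once no
vertex has anything left to send.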


Best,
Claudio

-- 
   Claudio Martella
   claudio.martella@gmail.com

Re: Apache Giraph talk @ FOSDEM

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Very cool!

One thing I would add to the draft, Claudio, is a statement
that Giraph is currently incubating, with appropriate links back to the
Incubator website and policies.

+1.

Cheers,
Chris



++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Re: Apache Giraph talk @ FOSDEM

Posted by Claudio Martella <cl...@gmail.com>.
Thanks!

Yes, I also thought it was cool of them to mention that. I'll definitely
use some of your slides. If I can't download the native files,
I'll ask you.

I'll wait for your feedback :)

On Mon, Nov 21, 2011 at 10:42 PM, Avery Ching <ac...@apache.org> wrote:
> Thanks for volunteering Claudio!  It's very nice that Apache Giraph was
> mentioned in the invite, even though it's a relatively new open-source
> project.  I'll read your draft and send you feedback privately sometime
> today.  Also, if you need slides, please feel free to use anything I've ever
> posted (http://www.slideshare.net/averyching).  I think you can download
> them natively.  Also, if you need feedback for your slides, let me know.
>
> Avery
>



-- 
   Claudio Martella
   claudio.martella@gmail.com

Re: Apache Giraph talk @ FOSDEM

Posted by Avery Ching <ac...@apache.org>.
Thanks for volunteering Claudio!  It's very nice that Apache Giraph was 
mentioned in the invite, even though it's a relatively new open-source 
project.  I'll read your draft and send you feedback privately sometime 
today.  Also, if you need slides, please feel free to use anything I've 
ever posted (http://www.slideshare.net/averyching).  I think you can 
download them natively.  Also, if you need feedback for your slides, let 
me know.

Avery
