Posted to user@giraph.apache.org by Sebastian Schelter <ss...@apache.org> on 2012/05/12 11:58:42 UTC

[Announcement] Giraph talk in Berlin on May 29th

Hi,

I will give a talk titled "Large Scale Graph Processing with Apache
Giraph" in Berlin on May 29th. Details are available at:

https://www.xing.com/events/gameduell-tech-talk-on-the-topic-large-scale-graph-processing-with-apache-giraph-1092275

Best,
Sebastian

Re: [Announcement] Giraph talk in Berlin on May 29th

Posted by Gianmarco De Francisci Morales <gd...@apache.org>.
Sorry for the delay in answering.
I think the MR version would need approval and screening to get out in the
wild.
The Giraph version probably doesn't and can easily be open-sourced, but I
need to check anyway.

Cheers,
--
Gianmarco




On Tue, May 29, 2012 at 1:26 PM, Paolo Castagna <
castagna.lists@googlemail.com> wrote:

> Gianmarco De Francisci Morales wrote:
> > I have some toy code (not really well tested) that implements b-matching
> > (that is matching with integer capacities on the nodes).
> > It's a simple greedy method, along the lines of the one described here
> > www.vldb.org/pvldb/vol4/p460-morales.pdf
> >
> > I can share it if you are interested.
>
> Hi Gianmarco,
> examples of algorithms implemented in Apache Giraph are quite useful for
> people learning how to use a Pregel-like system such as Giraph.
>
> It would be nice to add a bipartite matching algorithm (as described by the
> Pregel paper). Is your implementation open source and/or ASL?
>
> Bipartite matching is my next exercise with Giraph. :-)
>
> Paolo
>
> > Cheers,
> > --
> > Gianmarco
>
>

Re: [Announcement] Giraph talk in Berlin on May 29th

Posted by Paolo Castagna <ca...@googlemail.com>.
Gianmarco De Francisci Morales wrote:
> I have some toy code (not really well tested) that implements b-matching
> (that is matching with integer capacities on the nodes).
> It's a simple greedy method, along the lines of the one described here
> www.vldb.org/pvldb/vol4/p460-morales.pdf
> 
> I can share it if you are interested.

Hi Gianmarco,
examples of algorithms implemented in Apache Giraph are quite useful for people
learning how to use a Pregel-like system such as Giraph.

It would be nice to add a bipartite matching algorithm (as described by the
Pregel paper). Is your implementation open source and/or ASL?

Bipartite matching is my next exercise with Giraph. :-)

Paolo

> Cheers,
> --
> Gianmarco


Re: [Announcement] Giraph talk in Berlin on May 29th

Posted by Gianmarco De Francisci Morales <gd...@apache.org>.
Hi,


> It would be good to present users a couple of non-trivial examples and one
> or two 'real' use cases where Apache Giraph is used for processing large
> graphs.
> Apache Giraph comes with two examples: all shortest paths from a single
> source and PageRank. Google's Pregel paper describes 'bipartite matching' and
> 'semi-clustering'. Is anyone working on implementing these in Giraph?
> Or, what if in the shortest paths example you actually want to know the
> path?
>
>
I have some toy code (not really well tested) that implements b-matching
(that is matching with integer capacities on the nodes).
It's a simple greedy method, along the lines of the one described here
www.vldb.org/pvldb/vol4/p460-morales.pdf

I can share it if you are interested.
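
To make the greedy idea concrete, here is a minimal sequential sketch in
Python. It is not the toy code mentioned above and not the distributed
method from the paper; the function name, the edge format and the
tie-breaking rule are assumptions made purely for illustration.

```python
def greedy_b_matching(edges, capacity):
    """Greedy b-matching sketch (hypothetical helper, not Giraph code).

    edges:    iterable of (u, v, weight) tuples
    capacity: dict mapping each node to its integer capacity b(node)
    Returns the list of selected edges.
    """
    remaining = dict(capacity)  # copy, so the caller's dict is untouched
    matching = []
    # Heaviest edges first; ties broken by node ids for determinism.
    for u, v, w in sorted(edges, key=lambda e: (-e[2], e[0], e[1])):
        # Keep the edge only if both endpoints still have spare capacity.
        if remaining.get(u, 0) > 0 and remaining.get(v, 0) > 0:
            matching.append((u, v, w))
            remaining[u] -= 1
            remaining[v] -= 1
    return matching


edges = [("a", "x", 5), ("a", "y", 4), ("b", "x", 3), ("b", "y", 1)]
# With capacity 1 everywhere this degenerates to ordinary greedy matching.
print(greedy_b_matching(edges, {"a": 1, "b": 1, "x": 1, "y": 1}))
# Raising the capacity of "a" to 2 lets it take both of its edges.
print(greedy_b_matching(edges, {"a": 2, "b": 1, "x": 1, "y": 1}))
```

A vertex-centric version would have to replace the global sort with
rounds of proposals and acceptances exchanged as messages between
neighbours, which is where most of the real work lies.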

Cheers,
--
Gianmarco



> It would be great to have examples on more advanced features: custom
> partitioning functions, aggregators, ...
>
> Personally, I'd like to see a side-by-side comparison of Google's Pregel as
> described in their paper and the Giraph implementation (I am particularly
> interested in where they diverge and why).
>
> Another question (or thing I am not so sure about) is about 'capacity
> planning' (sort of...). Given a dataset and an algorithm implemented in
> Giraph, how do you determine how many workers would be needed (in order to
> fit all your graph and messages for each superstep in RAM)?
>
> Last but not least, it seems to me that PageRank is what you use to
> 'benchmark' Giraph, is that the case? If so, sharing a common dataset for
> others to use would be a first step to allow people to compare the
> performance of different software running the very same algorithm, over
> the same data and the same hardware infrastructure.
>
> Paolo
>
> Sebastian Schelter wrote:
> > Hi,
> >
> > I will give a talk titled "Large Scale Graph Processing with Apache
> > Giraph" in Berlin on May 29th. Details are available at:
> >
> >
> https://www.xing.com/events/gameduell-tech-talk-on-the-topic-large-scale-graph-processing-with-apache-giraph-1092275
> >
> > Best,
> > Sebastian
>
>

Re: [Announcement] Giraph talk in Berlin on May 29th

Posted by Paolo Castagna <ca...@googlemail.com>.
user@mahout.apache.org

Hi,
by the way, about talks/presentations, here are the Apache Giraph
talks/presentations I found:

“Giraph: Large-scale graph processing on Hadoop”, Avery Ching
Hadoop Summit 2011 - Santa Clara, California - June 2011
http://www.slideshare.net/averyching/20110628giraph-hadoop-summit
http://www.youtube.com/watch?v=l4nQjAG6fac

“Apache Giraph: Distributed Graph Processing in the Cloud”, Claudio Martella
FOSDEM 2012 - Brussels, Belgium - February 2012
http://prezi.com/9ake_klzwrga/apache-giraph-distributed-graph-processing-in-the-cloud/
http://blog.acaro.org/entry/giraph-talk-for-graphdevroom-fosdem-2012
http://www.youtube.com/watch?v=3ZrqPEIPRe4
http://www.youtube.com/watch?v=BmRaejKGeDM

“Introducing Apache Giraph for Large Scale Graph Processing”, Sebastian Schelter
Apache Hadoop Get Together - Berlin, Germany - April 2012
http://ssc.io/introducing-apache-giraph-for-large-scale-graph-processing/
http://www.slideshare.net/sscdotopen/introducing-apache-giraph-for-large-scale-graph-processing
http://vimeo.com/40737998

You could put the links on the Apache Giraph wiki.

First of all, thank you for sharing them. May I add a few comments or
suggestions for future presentations? (Please don't take this as criticism.)

It would be good to present users a couple of non-trivial examples and one or
two 'real' use cases where Apache Giraph is used for processing large graphs.
Apache Giraph comes with two examples: all shortest paths from a single source
and PageRank. Google's Pregel paper describes 'bipartite matching' and
'semi-clustering'. Is anyone working on implementing these in Giraph?
Or, what if in the shortest paths example you actually want to know the path?
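
To illustrate the last question: one possible approach (a sketch, not
what the bundled Giraph example does) is to let every message carry the
sender's id next to the distance, so that each vertex can remember its
predecessor. The plain-Python simulation of supersteps below is an
assumption-laden toy: the function names are made up, every vertex must
appear as a key of the graph dict, and edge weights are assumed to be
non-negative.

```python
import math


def sssp_with_paths(graph, source):
    """Superstep-style single-source shortest paths that also records
    the predecessor of every vertex (hypothetical helper).

    graph: dict mapping vertex -> list of (neighbour, edge_weight)
    """
    dist = {v: math.inf for v in graph}
    pred = {v: None for v in graph}
    # Superstep 0: only the source has a pending "message".
    inbox = {source: [(0.0, None)]}
    while inbox:  # stop when no messages are in flight
        outbox = {}
        for v, messages in inbox.items():
            best, sender = min(messages, key=lambda m: m[0])
            if best < dist[v]:
                # Improvement found: store it and notify the neighbours.
                dist[v], pred[v] = best, sender
                for nbr, w in graph[v]:
                    outbox.setdefault(nbr, []).append((best + w, v))
        inbox = outbox
    return dist, pred


def path_to(pred, source, target):
    """Walk the predecessor links back from target to source."""
    path = [target]
    while path[-1] != source:
        prev = pred[path[-1]]
        if prev is None:  # target is not reachable from source
            return None
        path.append(prev)
    return path[::-1]


graph = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 2)],
    "c": [("d", 1)],
    "d": [],
    "e": [],
}
dist, pred = sssp_with_paths(graph, "a")
print(dist["d"], path_to(pred, "a", "d"))  # 4.0 ['a', 'b', 'c', 'd']
print(path_to(pred, "a", "e"))             # None
```

The cost is one extra vertex id per message and per vertex value; the
path itself is only reconstructed after the computation has finished.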

It would be great to have examples on more advanced features: custom
partitioning functions, aggregators, ...

Personally, I'd like to see a side-by-side comparison of Google's Pregel as
described in their paper and the Giraph implementation (I am particularly
interested in where they diverge and why).

Another question (or thing I am not so sure about) is about 'capacity planning'
(sort of...). Given a dataset and an algorithm implemented in Giraph, how do you
determine how many workers would be needed (in order to fit all your graph and
messages for each superstep in RAM)?
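
To show the kind of arithmetic I have in mind, here is a naive
back-of-envelope sketch. Everything in it is an assumption: the function
is hypothetical, and the per-vertex, per-edge and per-message byte costs
are made-up placeholders that would have to be measured for a real job.
It also ignores JVM and framework overhead as well as partition skew.

```python
import math


def estimate_workers(num_vertices, num_edges, messages_per_superstep,
                     bytes_per_vertex, bytes_per_edge, bytes_per_message,
                     usable_heap_bytes_per_worker):
    """Naive lower bound on the number of workers needed to hold the
    graph plus one superstep's messages in memory (hypothetical helper)."""
    total_bytes = (num_vertices * bytes_per_vertex
                   + num_edges * bytes_per_edge
                   + messages_per_superstep * bytes_per_message)
    return max(1, math.ceil(total_bytes / usable_heap_bytes_per_worker))


# Illustrative numbers only: 100M vertices, 1B edges, and one message per
# edge per superstep (as in PageRank), with 8 GiB of usable heap per worker.
workers = estimate_workers(
    num_vertices=100_000_000,
    num_edges=1_000_000_000,
    messages_per_superstep=1_000_000_000,
    bytes_per_vertex=100,   # assumed, not measured
    bytes_per_edge=50,      # assumed, not measured
    bytes_per_message=50,   # assumed, not measured
    usable_heap_bytes_per_worker=8 * 1024**3,
)
print(workers)  # 13
```

What I am missing is a way to get realistic values for the byte costs
without running the job first.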

Last but not least, it seems to me that PageRank is what you use to 'benchmark'
Giraph, is that the case? If so, sharing a common dataset for others to use
would be a first step to allow people to compare the performance of different
software running the very same algorithm, over the same data and the same
hardware infrastructure.

Paolo

Sebastian Schelter wrote:
> Hi,
> 
> I will give a talk titled "Large Scale Graph Processing with Apache
> Giraph" in Berlin on May 29th. Details are available at:
> 
> https://www.xing.com/events/gameduell-tech-talk-on-the-topic-large-scale-graph-processing-with-apache-giraph-1092275
> 
> Best,
> Sebastian


Re: [Announcement] Giraph talk in Berlin on May 29th

Posted by Sebastian Schelter <ss...@googlemail.com>.
I think the talk will be filmed. I don't plan to attend Hadoop Summit.


On 13.05.2012 01:40, Mohit Anchlia wrote:
> Is there going to be video available too? Are you also planning to do the
> same talk during Hadoop Summit?
> 
> 
> On Sat, May 12, 2012 at 11:06 AM, Sebastian Schelter <ss...@apache.org> wrote:
> 
>> I will publish the slides shortly after. There will be a 2-day workshop
>> after Berlin Buzzwords called "Parallel Processing Beyond MapReduce"
>> which also covers Giraph (and will be attended by a couple of committers).
>>
>> Best,
>> Sebastian
>>
>> On 12.05.2012 19:43, Ted Dunning wrote:
>>> Wish I could be there.
>>>
>>> Can you send slides when they are available?
>>>
>>> On Sat, May 12, 2012 at 2:58 AM, Sebastian Schelter <ss...@apache.org>
>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I will give a talk titled "Large Scale Graph Processing with Apache
>>>> Giraph" in Berlin on May 29th. Details are available at:
>>>>
>>>>
>>>>
>> https://www.xing.com/events/gameduell-tech-talk-on-the-topic-large-scale-graph-processing-with-apache-giraph-1092275
>>>>
>>>> Best,
>>>> Sebastian
>>>>
>>>
>>
>>
> 


Re: [Announcement] Giraph talk in Berlin on May 29th

Posted by Mohit Anchlia <mo...@gmail.com>.
Is there going to be video available too? Are you also planning to do the
same talk during Hadoop Summit?


On Sat, May 12, 2012 at 11:06 AM, Sebastian Schelter <ss...@apache.org> wrote:

> I will publish the slides shortly after. There will be a 2-day workshop
> after Berlin Buzzwords called "Parallel Processing Beyond MapReduce"
> which also covers Giraph (and will be attended by a couple of committers).
>
> Best,
> Sebastian
>
> On 12.05.2012 19:43, Ted Dunning wrote:
> > Wish I could be there.
> >
> > Can you send slides when they are available?
> >
> > On Sat, May 12, 2012 at 2:58 AM, Sebastian Schelter <ss...@apache.org>
> wrote:
> >
> >> Hi,
> >>
> >> I will give a talk titled "Large Scale Graph Processing with Apache
> >> Giraph" in Berlin on May 29th. Details are available at:
> >>
> >>
> >>
> https://www.xing.com/events/gameduell-tech-talk-on-the-topic-large-scale-graph-processing-with-apache-giraph-1092275
> >>
> >> Best,
> >> Sebastian
> >>
> >
>
>

Re: [Announcement] Giraph talk in Berlin on May 29th

Posted by Sebastian Schelter <ss...@apache.org>.
I will publish the slides shortly after. There will be a 2-day workshop
after Berlin Buzzwords called "Parallel Processing Beyond MapReduce"
which also covers Giraph (and will be attended by a couple of committers).

Best,
Sebastian

On 12.05.2012 19:43, Ted Dunning wrote:
> Wish I could be there.
> 
> Can you send slides when they are available?
> 
> On Sat, May 12, 2012 at 2:58 AM, Sebastian Schelter <ss...@apache.org> wrote:
> 
>> Hi,
>>
>> I will give a talk titled "Large Scale Graph Processing with Apache
>> Giraph" in Berlin on May 29th. Details are available at:
>>
>>
>> https://www.xing.com/events/gameduell-tech-talk-on-the-topic-large-scale-graph-processing-with-apache-giraph-1092275
>>
>> Best,
>> Sebastian
>>
> 


Re: [Announcement] Giraph talk in Berlin on May 29th

Posted by Ted Dunning <te...@gmail.com>.
Wish I could be there.

Can you send slides when they are available?

On Sat, May 12, 2012 at 2:58 AM, Sebastian Schelter <ss...@apache.org> wrote:

> Hi,
>
> I will give a talk titled "Large Scale Graph Processing with Apache
> Giraph" in Berlin on May 29th. Details are available at:
>
>
> https://www.xing.com/events/gameduell-tech-talk-on-the-topic-large-scale-graph-processing-with-apache-giraph-1092275
>
> Best,
> Sebastian
>

Re: [Announcement] Giraph talk in Berlin on May 29th

Posted by Sebastian Schelter <ss...@apache.org>.
Warming up your audience :)

On 12.05.2012 22:01, Jakob Homan wrote:
> Stealing my thunder? :)
> 
> On Sat, May 12, 2012 at 7:36 AM, Avery Ching <ac...@apache.org> wrote:
>> Nice!
>>
>> Avery
>>
>>
>> On 5/12/12 2:58 AM, Sebastian Schelter wrote:
>>>
>>> Hi,
>>>
>>> I will give a talk titled "Large Scale Graph Processing with Apache
>>> Giraph" in Berlin on May 29th. Details are available at:
>>>
>>>
>>> https://www.xing.com/events/gameduell-tech-talk-on-the-topic-large-scale-graph-processing-with-apache-giraph-1092275
>>>
>>> Best,
>>> Sebastian
>>
>>


Re: [Announcement] Giraph talk in Berlin on May 29th

Posted by Jakob Homan <jg...@gmail.com>.
Stealing my thunder? :)

On Sat, May 12, 2012 at 7:36 AM, Avery Ching <ac...@apache.org> wrote:
> Nice!
>
> Avery
>
>
> On 5/12/12 2:58 AM, Sebastian Schelter wrote:
>>
>> Hi,
>>
>> I will give a talk titled "Large Scale Graph Processing with Apache
>> Giraph" in Berlin on May 29th. Details are available at:
>>
>>
>> https://www.xing.com/events/gameduell-tech-talk-on-the-topic-large-scale-graph-processing-with-apache-giraph-1092275
>>
>> Best,
>> Sebastian
>
>

Re: [Announcement] Giraph talk in Berlin on May 29th

Posted by Avery Ching <ac...@apache.org>.
Nice!

Avery

On 5/12/12 2:58 AM, Sebastian Schelter wrote:
> Hi,
>
> I will give a talk titled "Large Scale Graph Processing with Apache
> Giraph" in Berlin on May 29th. Details are available at:
>
> https://www.xing.com/events/gameduell-tech-talk-on-the-topic-large-scale-graph-processing-with-apache-giraph-1092275
>
> Best,
> Sebastian