Posted to user@giraph.apache.org by Gustavo Enrique Salazar Torres <gs...@ime.usp.br> on 2012/12/11 14:58:01 UTC
Breadth-first search
Hi:
I implemented a graph algorithm to recommend content to our users. Although
it is working (the implementation uses Mahout), it is very inefficient because
I have to run many iterations in order to perform a breadth-first search on
my graph.
I would like to use Giraph for that task. I would like to know if it is
production ready. I'm running jobs on Amazon EMR.
Thanks in advance.
Gustavo
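[Editorial sketch, not part of the original message.] The "many iterations"
cost mentioned above comes from the level-by-level nature of breadth-first
search: each level of the graph needs one round of message passing. The
following is a minimal, single-process Python sketch of that vertex-centric
pattern. It does not use the Giraph or Mahout APIs; the function name
bfs_supersteps and the example graph are hypothetical and exist only for
illustration.

```python
from collections import defaultdict


def bfs_supersteps(adjacency, source):
    """Level-synchronous BFS in a vertex-centric, message-passing style.

    adjacency: dict mapping each vertex to a list of out-neighbors.
    Returns (distance, supersteps), where distance maps every reachable
    vertex to its hop count from source.
    """
    distance = {source: 0}
    # Superstep 0: the source sends distance 1 to each of its neighbors.
    inbox = {neighbor: [1] for neighbor in adjacency.get(source, [])}
    supersteps = 1
    while inbox:  # stop when no messages are in flight
        outbox = defaultdict(list)
        for vertex, proposals in inbox.items():
            if vertex in distance:
                continue  # already reached at an earlier level
            distance[vertex] = min(proposals)
            # Propagate the next hop count to all out-neighbors.
            for neighbor in adjacency.get(vertex, []):
                outbox[neighbor].append(distance[vertex] + 1)
        inbox = outbox
        supersteps += 1
    return distance, supersteps


# Hypothetical example graph (directed edges).
example_graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
distances, rounds = bfs_supersteps(example_graph, "a")
print(distances, rounds)
```

In this sketch the number of rounds grows with the depth of the search, not
with the number of vertices. In a MapReduce-style implementation each such
round is typically a separate job, which is one plausible reason the
iterations feel expensive.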
Re: Breadth-first search
Posted by Jan van der Lugt <ja...@gmail.com>.
Hi Gustavo,
If your graph fits in memory, you might be interested in Green-Marl, a
language tailored for graph processing:
https://github.com/stanford-ppl/Green-Marl
You can compile your Green-Marl program to an extremely fast C++ program,
but also to a Giraph program when your graph no longer fits in memory.
- Jan
On Tue, Dec 11, 2012 at 8:33 PM, Gustavo Enrique Salazar Torres <
gsalazar@ime.usp.br> wrote:
> Hi Avery:
>
> Regarding resources, I guess I won't need that much: our graph has only
> 60,000 nodes. I believe one c1.xlarge EC2 machine can handle this, and we
> can scale if needed.
>
> Thank you very much.
> Gustavo
Re: Breadth-first search
Posted by Gustavo Enrique Salazar Torres <gs...@ime.usp.br>.
Hi Avery:
Regarding resources, I guess I won't need that much: our graph has only
60,000 nodes. I believe one c1.xlarge EC2 machine can handle this, and we
can scale if needed.
Thank you very much.
Gustavo
On Tue, Dec 11, 2012 at 4:40 PM, Avery Ching <ac...@apache.org> wrote:
> We are running several Giraph applications in production using our version
> of Hadoop (Corona) at Facebook. The part you have to be careful about is
> ensuring you have enough resources for your job to run. But otherwise, we
> are able to run at FB-scale (i.e. 1 billion+ nodes, many more edges).
>
> Avery
Re: Breadth-first search
Posted by Alexandros Daglis <al...@epfl.ch>.
Dear Avery,
Regarding this decision about resource allocation, do you have a
methodology or a rule of thumb that helps you decide which setting is
expected to perform well?
For example, with a given input (number of graph vertices), can you
estimate what number of workers and how much memory per worker would be
optimal? Or the other way around: given a pool of resources (cores &
memory), what's a reasonable graph size?
That insight would be really interesting.
Thanks,
Alexandros
On 11 December 2012 19:40, Avery Ching <ac...@apache.org> wrote:
> We are running several Giraph applications in production using our version
> of Hadoop (Corona) at Facebook. The part you have to be careful about is
> ensuring you have enough resources for your job to run. But otherwise, we
> are able to run at FB-scale (i.e. 1 billion+ nodes, many more edges).
>
> Avery
Re: Breadth-first search
Posted by Avery Ching <ac...@apache.org>.
We are running several Giraph applications in production using our
version of Hadoop (Corona) at Facebook. The part you have to be careful
about is ensuring you have enough resources for your job to run. But
otherwise, we are able to run at FB-scale (i.e. 1 billion+ nodes, many
more edges).
Avery
On 12/11/12 5:58 AM, Gustavo Enrique Salazar Torres wrote:
> Hi:
>
> I implemented a graph algorithm to recommend content to our users.
> Although it is working (the implementation uses Mahout), it is very
> inefficient because I have to run many iterations in order to perform
> a breadth-first search on my graph.
> I would like to use Giraph for that task. I would like to know if it
> is production ready. I'm running jobs on Amazon EMR.
>
> Thanks in advance.
> Gustavo