Posted to dev@giraph.apache.org by Paolo Castagna <ca...@googlemail.com> on 2012/04/05 06:30:32 UTC

Giraph as Whirr service, see WHIRR-530

Hi,
seen this?

  WHIRR-530 - Add Giraph as a service
  https://issues.apache.org/jira/browse/WHIRR-530

This could be quite useful for users who want to give Giraph a spin on cloud
infrastructure, just for testing or to run a few small experiments.
My experience with Whirr on small clusters of 10-20 nodes has been quite positive.
Less so for larger clusters, but that is more a problem/limit of the cloud
provider than of Whirr itself, I think.

Whirr makes it extremely easy and pleasant to deploy stuff on demand.

... and Whirr already supports YARN:
https://issues.apache.org/jira/browse/WHIRR-391

Are any Giraph developers/users here also Whirr users?

Paolo

Re: Giraph as Whirr service, see WHIRR-530

Posted by Paolo Castagna <ca...@googlemail.com>.
Thank you all for your comments.

There seems to be some interest, and certainly agreement that this is
only for testing or temporary use, and about the limits of cloud
infrastructure for things such as Hadoop, ZooKeeper and Giraph.

I also agree that, given Whirr can already spin up Hadoop clusters,
users can run Giraph that way.

The Whirr option might become more interesting in relation to YARN
and perhaps unit/integration testing (although I am not sure who,
if anyone, is willing to put a credit card behind that). Fortunately,
Giraph tests run reasonably well and quickly locally.

Anyway, I'll keep an eye on WHIRR-530 and, as I learn more about
Giraph and Whirr, help with it if I can. Personally, I am more
interested in Giraph on YARN than in Giraph in its current shape.
Or, in other words, in the future of Giraph rather than in the
past (i.e. backward compatibility/legacy), although I am aware
you have that in mind as well, and it seems to me there are
already Giraph users, so...

Thanks,
Paolo


Re: Giraph as Whirr service, see WHIRR-530

Posted by Dan McClary <da...@northwestern.edu>.
Having used Whirr several times on EC2, I find it a fine way to spin up
a temporary 'developers' cluster.  ZooKeeper is the most likely source of
difficulty on VMs with limited I/O (i.e., it's very chatty and doesn't
tolerate the highly variable latency that smaller instance types provide).
The HBase community seems to be very aware of this; there are likely some
tips and tricks to be gleaned from reading their mailing lists.

-Dan

On Wed, Apr 4, 2012 at 11:08 PM, Brian Femiano <bf...@gmail.com> wrote:

> I've used it on EC2 clusters launched by Whirr. Simply copy the fat jar
> to your client machine and it will be distributed normally as an M/R
> dependency.
>
> It works very well.
>
> The only limitation I could potentially find (without much proof) was that
> on VMs with limited I/O, the RPC message overhead between workers could be
> an issue. I never tried it on VMs with less than 'High' I/O, so take that
> with a grain of salt.



-- 
Daniel McClary, Ph.D.
Visiting Scholar
*Amaral Lab and Department of Chemical and Biological Engineering,
Northwestern University*
Bioinformatics Specialist II
*Howard Hughes Medical Institute*
Email: dan.mcclary@northwestern.edu
Phone: (847) 491-1234
Web: http://amaral-lab.org/people/mcclary/
Mailing address:
2145 Sheridan Rd, Room E-136
Northwestern University
Evanston, IL 60208

Re: Giraph as Whirr service, see WHIRR-530

Posted by Brian Femiano <bf...@gmail.com>.
I've used it on EC2 clusters launched by Whirr. Simply copy the fat jar
to your client machine and it will be distributed normally as an M/R
dependency.

It works very well.

The only limitation I could potentially find (without much proof) was that
on VMs with limited I/O, the RPC message overhead between workers could be
an issue. I never tried it on VMs with less than 'High' I/O, so take that
with a grain of salt.

On Thu, Apr 5, 2012 at 12:51 AM, Jakob Homan <jg...@gmail.com> wrote:

> This is interesting.  Whirr can already spin up Hadoop MR clusters,
> which can then run the Giraph jobs.  Once Giraph is bootstrapped onto
> YARN, this will make more sense as a Whirr service.

Re: Giraph as Whirr service, see WHIRR-530

Posted by Jakob Homan <jg...@gmail.com>.
This is interesting.  Whirr can already spin up Hadoop MR clusters,
which can then run the Giraph jobs.  Once Giraph is bootstrapped onto
YARN, this will make more sense as a Whirr service.
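
For anyone who wants to try this, a minimal sketch of a Whirr recipe for a
small Hadoop MR cluster on EC2 might look like the following. The file name,
cluster name and node count are assumptions for illustration, not taken from
this thread; the credentials are read from environment variables.

```properties
# hadoop.properties (hypothetical file name)

# Name used to identify the cluster; any label works (assumed value).
whirr.cluster-name=giraph-test

# One master plus five workers (node count is an assumption, chosen small
# for a temporary test cluster).
whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,5 hadoop-datanode+hadoop-tasktracker

# Cloud provider and credentials, taken from the environment.
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}

# SSH key pair Whirr uses to log in to the launched nodes.
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
whirr.public-key-file=${whirr.private-key-file}.pub
```

With a file like this, "whirr launch-cluster --config hadoop.properties"
should bring the cluster up and "whirr destroy-cluster --config
hadoop.properties" should tear it down again. Giraph jobs would then be
submitted to it like any other MapReduce job.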

On Wed, Apr 4, 2012 at 9:43 PM, Avery Ching <ac...@apache.org> wrote:
> I don't use Whirr...I haven't heard it mentioned on this forum yet.  Anyone?
>
> Avery

Re: Giraph as Whirr service, see WHIRR-530

Posted by Avery Ching <ac...@apache.org>.
I don't use Whirr...I haven't heard it mentioned on this forum yet.  Anyone?

Avery

On 4/4/12 9:30 PM, Paolo Castagna wrote:
> Hi,
> seen this?
>
>    WHIRR-530 - Add Giraph as a service
>    https://issues.apache.org/jira/browse/WHIRR-530
>
> This could be quite useful for users who want to give Giraph a spin on cloud
> infrastructure, just for testing or to run a few small experiments.
> My experience with Whirr on small clusters of 10-20 nodes has been quite positive.
> Less so for larger clusters, but that is more a problem/limit of the cloud
> provider than of Whirr itself, I think.
>
> Whirr makes it extremely easy and pleasant to deploy stuff on demand.
>
> ... and Whirr already supports YARN:
> https://issues.apache.org/jira/browse/WHIRR-391
>
> Are any Giraph developers/users here also Whirr users?
>
> Paolo