Posted to user@mesos.apache.org by Traiano Welcome <tr...@gmail.com> on 2017/07/26 08:12:02 UTC

How to deploy Hadoop on Mesos

Hi

Would anyone know of some reliable guides to deploying Apache Hadoop on
top of the Mesos scheduler?

Thanks,
Traiano

Re: How to deploy Hadoop on Mesos

Posted by Traiano Welcome <tr...@gmail.com>.
Hadoop definitely seems to be on the list of frameworks for mesos:

http://mesos.apache.org/documentation/latest/frameworks/

Has anyone recently tested getting it to work?

On Thu, Jul 27, 2017 at 5:39 PM, Stephen Gran <st...@piksel.com>
wrote:

> Hi,
>
> On 27/07/17 13:54, Traiano Welcome wrote:
> > Hi Stephen
> >
> >
> > On Thu, Jul 27, 2017 at 12:19 PM, Stephen Gran <st...@piksel.com>
> wrote:
> >     Both spark and flink integrate natively with mesos, so no need for an
> >     intermediate yarn layer.  For batch work, we're looking at the aurora
> >     project for job scheduling.
> >
> >
> >
> > I haven't looked at Aurora before - would you consider it a drop in
> > replacement for hadoop for distributed batch workloads?
>
> It's definitely not a drop in replacement - they have very different
> APIs and capabilities.  What aurora gives us is a DSL to build the DAG
> of an execution, and with a little work, some primitives to run those
> executions.  So, the functionality ends up being similar for 'just
> batch', but the language, the bindings, etc are all very different.
>
> Cheers,
> --
> Stephen Gran
> Senior Technical Architect
>
> picture the possibilities | piksel.com
>

Re: How to deploy Hadoop on Mesos

Posted by Stephen Gran <st...@piksel.com>.
Hi,

On 27/07/17 13:54, Traiano Welcome wrote:
> Hi Stephen
> 
> 
> On Thu, Jul 27, 2017 at 12:19 PM, Stephen Gran <st...@piksel.com> wrote:
>     Both spark and flink integrate natively with mesos, so no need for an
>     intermediate yarn layer.  For batch work, we're looking at the aurora
>     project for job scheduling.
> 
> 
> 
> I haven't looked at Aurora before - would you consider it a drop in
> replacement for hadoop for distributed batch workloads?

It's definitely not a drop-in replacement - they have very different
APIs and capabilities.  What aurora gives us is a DSL to build the DAG
of an execution, and with a little work, some primitives to run those
executions.  So, the functionality ends up being similar for 'just
batch', but the language, the bindings, etc. are all very different.

Cheers,
-- 
Stephen Gran
Senior Technical Architect

picture the possibilities | piksel.com

Re: How to deploy Hadoop on Mesos

Posted by Traiano Welcome <tr...@gmail.com>.
Hi Stephen


On Thu, Jul 27, 2017 at 12:19 PM, Stephen Gran <st...@piksel.com>
wrote:

> Hi,
>
> So typically people run two sorts of workloads on hadoop -
> ad-hoc/scheduled batch work, and stream workloads (spark, flink, etc.).
>
>

We'll definitely be using hadoop for batch workloads, and we will
integrate Spark with mesos for streaming workloads.




> Both spark and flink integrate natively with mesos, so no need for an
> intermediate yarn layer.  For batch work, we're looking at the aurora
> project for job scheduling.
>
>

I haven't looked at Aurora before - would you consider it a drop in
replacement for hadoop for distributed batch workloads?



> hadoop brings some interesting things, but I've not found integration
> with mesos to ever be pain-free, so we're moving to other tools instead
> of continuing down the path of trying to get hadoop working with mesos.
>
>

Understandable :-) I think I might take your advice here. Even if a one-time
integration of hadoop and mesos were successful, keeping that integration
functional over time, through rapid updates and code changes in two
unrelated project codebases, would be a nightmare.


> Good luck!
>
> On 27/07/17 08:50, Traiano Welcome wrote:
> > Hi Stephen
> >
> >
> > On Wed, Jul 26, 2017 at 5:18 PM, Stephen Gran <stephen.gran@piksel.com>
> > wrote:
> >
> >     Hi,
> >
> >     It is having discussions about whether to stop, as it's having
> >     trouble getting enough contributors.
> >
> >     I guess I'd ask what you need to run on hadoop, why you're looking at
> >     mesos, and then see what else is in that space.
> >
> >
> >
> > I don't know what we'd need to run on hadoop at this point - it's open
> > ended, and for our developers to decide. However, should this make a
> > difference?
> >
> > We have mesos in place as a resource scheduler for a number of
> > frameworks and would like to resource manage it using the same
> > semantics, tools and mechanisms mesos provides.
> >
> > I've looked at two books so far that show how this is done, so it seems
> > this way of managing hadoop is in use in places (ref: "Apache Mesos
> > Essentials", "Mastering Mesos"), however these books are probably out of
> > date because the procedure they describe for integrating mesos and
> > hadoop is broken.
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >     Cheers,
> >
> >     On 26/07/17 14:13, Brandon Gulla wrote:
> >     > Have you looked into Apache Myriad?
> >     >
> >     > http://myriad.apache.org/
> >     >
> >     > On Wed, Jul 26, 2017 at 4:12 AM, Traiano Welcome <traiano@gmail.com>
> >     > wrote:
> >     >
> >     >     Hi
> >     >
> >     >     Would anyone know of some reliable guides to deploying  apache
> >     >     hadoop on top of the mesos scheduler?
> >     >
> >     >     Thanks,
> >     >     Traiano
> >     >
> >     >
> >     >
> >     >
> >     > --
> >     > Brandon
> >
> >     --
> >     Stephen Gran
> >     Senior Technical Architect
> >
> >     picture the possibilities | piksel.com <http://piksel.com>
> >
> >
>
> --
> Stephen Gran
> Senior Technical Architect
>
> picture the possibilities | piksel.com
>

Re: How to deploy Hadoop on Mesos

Posted by Stephen Gran <st...@piksel.com>.
Hi,

So typically people run two sorts of workloads on hadoop -
ad-hoc/scheduled batch work, and stream workloads (spark, flink, etc.).

Both spark and flink integrate natively with mesos, so no need for an
intermediate yarn layer.  For batch work, we're looking at the aurora
project for job scheduling.
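
As a rough illustration of the native Spark integration mentioned above (a sketch, not something taken from this thread): Spark's `spark-submit` accepts a `mesos://host:port` master URL, so a job can target the Mesos master directly with no YARN layer in between. The helper name `spark_submit_on_mesos`, the host `mesos-master.example.com`, port 5050, the executor tarball URL and `job.py` are all placeholders assumed for the example.

```python
import shlex


def spark_submit_on_mesos(app, master_host, master_port=5050,
                          executor_uri=None, extra_conf=None):
    """Build a spark-submit argument list that targets a Mesos master.

    Hypothetical helper for illustration only; it just assembles the
    command line and does not launch anything.
    """
    argv = ["spark-submit",
            "--master", f"mesos://{master_host}:{master_port}"]

    conf = dict(extra_conf or {})
    if executor_uri:
        # spark.executor.uri tells Mesos agents where to fetch the Spark
        # distribution from (placeholder URL in the example below).
        conf["spark.executor.uri"] = executor_uri

    # Emit settings in sorted order so the command is reproducible.
    for key in sorted(conf):
        argv += ["--conf", f"{key}={conf[key]}"]

    argv.append(app)
    return argv


if __name__ == "__main__":
    cmd = spark_submit_on_mesos(
        "job.py",
        "mesos-master.example.com",
        executor_uri="http://repo.example.com/spark.tgz",
    )
    print(shlex.join(cmd))
```

The resulting command still has to be run on a machine that has Spark installed and can reach the Mesos master; which settings are actually needed depends on the cluster.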

hadoop brings some interesting things, but I've not found integration
with mesos to ever be pain-free, so we're moving to other tools instead
of continuing down the path of trying to get hadoop working with mesos.

Good luck!

On 27/07/17 08:50, Traiano Welcome wrote:
> Hi Stephen
> 
> 
> On Wed, Jul 26, 2017 at 5:18 PM, Stephen Gran <stephen.gran@piksel.com> wrote:
> 
>     Hi,
> 
>     It is having discussions about whether to stop, as it's having trouble
>     getting enough contributors.
> 
>     I guess I'd ask what you need to run on hadoop, why you're looking at
>     mesos, and then see what else is in that space.
> 
> 
> 
> I don't know what we'd need to run on hadoop at this point - it's open
> ended, and for our developers to decide. However, should this make a
> difference?
> 
> We have mesos in place as a resource scheduler for a number of
> frameworks and would like to resource manage it using the same
> semantics, tools and mechanisms mesos provides.
> 
> I've looked at two books so far that show how this is done, so it seems
> this way of managing hadoop is in use in places (ref: "Apache Mesos
> Essentials", "Mastering Mesos"), however these books are probably out of
> date because the procedure they describe for integrating mesos and
> hadoop is broken.
> 
> 
> 
> 
> 
> 
> 
> 
>  
> 
>     Cheers,
> 
>     On 26/07/17 14:13, Brandon Gulla wrote:
>     > Have you looked into Apache Myriad?
>     >
>     > http://myriad.apache.org/
>     >
>     > On Wed, Jul 26, 2017 at 4:12 AM, Traiano Welcome <traiano@gmail.com> wrote:
>     >
>     >     Hi
>     >
>     >     Would anyone know of some reliable guides to deploying  apache
>     >     hadoop on top of the mesos scheduler?
>     >
>     >     Thanks,
>     >     Traiano
>     >
>     >
>     >
>     >
>     > --
>     > Brandon
> 
>     --
>     Stephen Gran
>     Senior Technical Architect
> 
>     picture the possibilities | piksel.com <http://piksel.com>
> 
> 

-- 
Stephen Gran
Senior Technical Architect

picture the possibilities | piksel.com

Re: How to deploy Hadoop on Mesos

Posted by Traiano Welcome <tr...@gmail.com>.
Hi Stephen


On Wed, Jul 26, 2017 at 5:18 PM, Stephen Gran <st...@piksel.com>
wrote:

> Hi,
>
> It is having discussions about whether to stop, as it's having trouble
> getting enough contributors.
>
> I guess I'd ask what you need to run on hadoop, why you're looking at
> mesos, and then see what else is in that space.
>
>

I don't know what we'd need to run on hadoop at this point - it's
open-ended, and for our developers to decide. However, should this make a
difference?

We have mesos in place as a resource scheduler for a number of frameworks
and would like to manage hadoop's resources using the same semantics, tools
and mechanisms mesos provides.

I've looked at two books so far that show how this is done, so it seems
this way of managing hadoop is in use in places (ref: "Apache Mesos
Essentials", "Mastering Mesos"). However, these books are probably out of
date, because the procedure they describe for integrating mesos and hadoop
is broken.


> Cheers,
>
> On 26/07/17 14:13, Brandon Gulla wrote:
> > Have you looked into Apache Myriad?
> >
> > http://myriad.apache.org/
> >
> > On Wed, Jul 26, 2017 at 4:12 AM, Traiano Welcome <traiano@gmail.com>
> > wrote:
> >
> >     Hi
> >
> >     Would anyone know of some reliable guides to deploying  apache
> >     hadoop on top of the mesos scheduler?
> >
> >     Thanks,
> >     Traiano
> >
> >
> >
> >
> > --
> > Brandon
>
> --
> Stephen Gran
> Senior Technical Architect
>
> picture the possibilities | piksel.com
>

Re: How to deploy Hadoop on Mesos

Posted by Stephen Gran <st...@piksel.com>.
Hi,

Myriad is having discussions about whether to stop, as it's having trouble
getting enough contributors.

I guess I'd ask what you need to run on hadoop, why you're looking at
mesos, and then see what else is in that space.

Cheers,

On 26/07/17 14:13, Brandon Gulla wrote:
> Have you looked into Apache Myriad? 
> 
> http://myriad.apache.org/
> 
> On Wed, Jul 26, 2017 at 4:12 AM, Traiano Welcome <traiano@gmail.com> wrote:
> 
>     Hi
> 
>     Would anyone know of some reliable guides to deploying  apache
>     hadoop on top of the mesos scheduler?
> 
>     Thanks,
>     Traiano
> 
> 
> 
> 
> -- 
> Brandon

-- 
Stephen Gran
Senior Technical Architect

picture the possibilities | piksel.com

Re: How to deploy Hadoop on Mesos

Posted by Traiano Welcome <tr...@gmail.com>.
Hi Brandon

On Wed, Jul 26, 2017 at 5:13 PM, Brandon Gulla <gu...@gmail.com>
wrote:

> Have you looked into Apache Myriad?
>
> http://myriad.apache.org/
>


I took a brief look and thought "more flaky, half-cooked stuff that doesn't
work in production and will cause a system engineer a world of pain to get
working reliably." ... So no, avoid like the plague :-)




>
>
> On Wed, Jul 26, 2017 at 4:12 AM, Traiano Welcome <tr...@gmail.com>
> wrote:
>
>> Hi
>>
>> Would anyone know of some reliable guides to deploying  apache hadoop on
>> top of the mesos scheduler?
>>
>> Thanks,
>> Traiano
>>
>
>
>
> --
> Brandon
>

Re: How to deploy Hadoop on Mesos

Posted by Brandon Gulla <gu...@gmail.com>.
Have you looked into Apache Myriad?

http://myriad.apache.org/

On Wed, Jul 26, 2017 at 4:12 AM, Traiano Welcome <tr...@gmail.com> wrote:

> Hi
>
> Would anyone know of some reliable guides to deploying  apache hadoop on
> top of the mesos scheduler?
>
> Thanks,
> Traiano
>



-- 
Brandon