Posted to user@mesos.apache.org by Alex <al...@gmail.com> on 2015/01/29 17:08:19 UTC

Hadoop on Mesos

Hi guys,

I'm a Hadoop and Mesos n00b, so please be gentle. I'm trying to set up a
Mesos cluster, and my ultimate goal is to introduce Mesos in my
organization by showing off its ability to run multiple Hadoop
clusters, plus other stuff, on the same resources. I'd like to be able
to do this with an HA configuration as close as possible to something we
would run in production.

I've successfully set up a Mesos cluster with 3 masters and 4 slaves,
but I'm having trouble getting Hadoop jobs to run on top of it. I'm
using Mesos 0.21.1 and Hadoop CDH 5.3.0. Initially I tried to follow the
Mesosphere tutorial[1], but it looks like it is very outdated and I
didn't get very far. Then I tried following the instructions in the
github repo[2], but they're also less than ideal.

I've managed to get a Hadoop jobtracker running on one of the masters, I
can submit jobs to it and they eventually finish. The strange thing is
that they take a really long time to start the reduce task, so much so
that the first few times I thought it wasn't working at all. Here's part
of the output for a simple wordcount example:

15/01/29 16:37:58 INFO mapred.JobClient:  map 0% reduce 0%
15/01/29 16:39:23 INFO mapred.JobClient:  map 25% reduce 0%
15/01/29 16:39:31 INFO mapred.JobClient:  map 50% reduce 0%
15/01/29 16:39:34 INFO mapred.JobClient:  map 75% reduce 0%
15/01/29 16:39:37 INFO mapred.JobClient:  map 100% reduce 0%
15/01/29 16:56:25 INFO mapred.JobClient:  map 100% reduce 100%
15/01/29 16:56:29 INFO mapred.JobClient: Job complete: job_201501291533_0004

Mesos started 3 task trackers which ran the map tasks pretty fast, but
then it looks like it was stuck for quite a while before launching a
fourth task tracker to run the reduce task. Is this normal, or is there
something wrong here?

More questions: my configuration file looks a lot like the example in
the github repo, but that's listed as being representative of a
pseudo-distributed configuration. What should it look like for a real
distributed setup? How can I go about running multiple Hadoop clusters?
Currently, all three masters have the same configuration file, so they
all create a different framework. How should things be set up for a
high-availability Hadoop framework that can survive the failure of a
Master? What do I need to do to run multiple versions of Hadoop on the
same Mesos cluster?

I'd really appreciate any pointers to documentation or tutorials I may have
missed. Even better would be examples of Puppet configurations to set
something like this up, but I guess that's probably unlikely.

Thanks a lot in advance,
Alex


[1] https://mesosphere.com/docs/tutorials/run-hadoop-on-mesos/
[2] https://github.com/mesos/hadoop

Re: Hadoop on Mesos

Posted by Adam Bordelon <ad...@mesosphere.io>.
Note that you could also launch the framework scheduler/JT itself via
Marathon, and then it would run as a Mesos task on one of the slaves,
automatically restarting (potentially elsewhere) if it dies. Then you could
use something like mesos-dns <https://github.com/mesosphere/mesos-dns> to
set the mapred.job.tracker property to "marathon.hadoop.jobtracker:port" or
whatever name it generates.
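
To sketch that suggestion, a minimal Marathon app definition for the JobTracker might look like the following. The app id, resource sizes, tarball URL and command are placeholders, not a tested configuration:

```json
{
  "id": "hadoop-jobtracker",
  "cmd": "cd hadoop-2.5.0-cdh5.3.0 && ./bin/hadoop jobtracker",
  "cpus": 1.0,
  "mem": 1024,
  "instances": 1,
  "uris": ["hdfs://namenode:9000/tmp/hadoop-2.5.0-cdh5.3.0-mesos.tar.gz"]
}
```

Marathon keeps "instances" at 1 and relaunches the task (possibly on a different slave) if it dies; clients would then reach the JT through whatever stable name mesos-dns assigns to the app.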

For YARN/MR2 workloads, you might also want to check out Myriad
<https://github.com/mesos/myriad>

On Fri, Jan 30, 2015 at 5:17 AM, Alex <al...@gmail.com> wrote:

>  Hi Tom,
>
> Thanks a lot for your reply, it's very helpful.
>
> On 01/29/2015 05:54 PM, Tom Arnfeld wrote:
>
>  Hi Alex,
>
>  Great to hear you're hoping to use Hadoop on Mesos. We've been running
> it for a good 6 months and it's been awesome.
>
>  I'll answer the simpler question first, running multiple job trackers
> should be just fine.. even multiple JTs with HA enabled (we do this). The
> mesos scheduler for Hadoop will ship all configuration options needed for
> each TaskTracker within mesos, so there's nothing you need to have
> specifically configured on each slave..
>
>  # Slow slot allocations
>
>  If you only have a few slaves, not many resources and a large amount of
> resources per slot, you might end up with a pretty small slot allocation
> (e.g 5 mappers and 1 reducer). Because of the nature of Hadoop, slots are
> static for each TaskTracker and the framework does a *best effort* to
> figure out what balance of map/reduce slots to launch on the cluster.
>
>  Because of this, the current stable version of the framework has a few
> issues when running on small clusters, especially when you don't configure
> min/max slot capacity for each JobTracker. Few links below
>
>  - https://github.com/mesos/hadoop/issues/32
> - https://github.com/mesos/hadoop/issues/31
> - https://github.com/mesos/hadoop/issues/28
> - https://github.com/mesos/hadoop/issues/26
>
>  Having said that, we've been working on a solution to this problem which
> enables Hadoop to launch different types of slots over the lifetime of a
> single job, meaning you could start with 5 maps and 1 reduce, and then end
> with 0 maps and 6 reduce. It's not perfect, but it's a decent optimisation
> if you still need to use Hadoop.
>
>  - https://github.com/mesos/hadoop/pull/33
>
> You may also want to look into how large your executor URI is (the one
> containing the hadoop source that gets downloaded for each task tracker)
> and how long that takes to download.. it might be that the task trackers
> are taking a while to bootstrap.
>
>
> Do you have any idea of when your pull request will be merged? It looks
> pretty interesting, even if we're just playing around at this point. Is
> your hadoop-mesos-0.0.9.jar available for download somewhere, or do I have
> to build it myself? In the meantime, I'm adding more slaves to see if this
> makes the problem go away, at least for demos.
>
>
>  # HA Hadoop JTs
>
>  The framework currently does not support a full HA setup, however that's
> not a huge issue. The JT will automatically restart jobs where they left
> off on it's own when a failover occurs, but for the time being all the
> track trackers will be killed and new ones spawned. Depending on your
> setup, this could be a fairly negligible time.
>
>
> I'm not sure I understand. I know task trackers will get restarted, that's
> not what I'm worried about. The issue I see is with the JT: it's started on
> one master only. If that master goes down, the framework goes down. I was
> kind of hoping to be able to do something like this:
>
> <property>
>   <name>mapred.job.tracker</name>
>   <value>zk://mesos01:2181,mesos02:2181,mesos03:2181/hadoop530</value>
> </property>
>
> Perhaps this doesn't actually work as I would expect. It doesn't look like
> there's been any progress on issue #28, unfortunately...
>
>
>  # Multiple versions of hadoop on the cluster
>
>  This is totally fine, each JT configuration can be given it's own hadoop
> tar.gz file with the right version in it, and they will all happily share
> the Mesos cluster.
>
>  I guess you have to have multiple startup scripts for this, and also
> multiple versions of Hadoop on the masters. Any pointers of how you've set
> this up would be much appreciated.
>
> Cheers,
> Alex
>

Re: Hadoop on Mesos

Posted by Alex <al...@gmail.com>.
Hi Tom,

Thanks a lot for your reply, it's very helpful.

On 01/29/2015 05:54 PM, Tom Arnfeld wrote:
> Hi Alex,
>
> Great to hear you're hoping to use Hadoop on Mesos. We've been running
> it for a good 6 months and it's been awesome.
>
> I'll answer the simpler question first, running multiple job trackers
> should be just fine.. even multiple JTs with HA enabled (we do this).
> The mesos scheduler for Hadoop will ship all configuration options
> needed for each TaskTracker within mesos, so there's nothing you need
> to have specifically configured on each slave..
>
> # Slow slot allocations
>
> If you only have a few slaves, not many resources and a large amount
> of resources per slot, you might end up with a pretty small slot
> allocation (e.g 5 mappers and 1 reducer). Because of the nature of
> Hadoop, slots are static for each TaskTracker and the framework does a
> /best effort/ to figure out what balance of map/reduce slots to launch
> on the cluster.
>
> Because of this, the current stable version of the framework has a few
> issues when running on small clusters, especially when you don't
> configure min/max slot capacity for each JobTracker. Few links below
>
> - https://github.com/mesos/hadoop/issues/32
> - https://github.com/mesos/hadoop/issues/31
> - https://github.com/mesos/hadoop/issues/28
> - https://github.com/mesos/hadoop/issues/26
>
> Having said that, we've been working on a solution to this problem
> which enables Hadoop to launch different types of slots over the
> lifetime of a single job, meaning you could start with 5 maps and 1
> reduce, and then end with 0 maps and 6 reduce. It's not perfect, but
> it's a decent optimisation if you still need to use Hadoop.
>
> - https://github.com/mesos/hadoop/pull/33
>
> You may also want to look into how large your executor URI is (the one
> containing the hadoop source that gets downloaded for each task
> tracker) and how long that takes to download.. it might be that the
> task trackers are taking a while to bootstrap.

Do you have any idea when your pull request will be merged? It looks
pretty interesting, even if we're just playing around at this point. Is
your hadoop-mesos-0.0.9.jar available for download somewhere, or do I
have to build it myself? In the meantime, I'm adding more slaves to see
if this makes the problem go away, at least for demos.

>
> # HA Hadoop JTs
>
> The framework currently does not support a full HA setup, however
> that's not a huge issue. The JT will automatically restart jobs where
> they left off on it's own when a failover occurs, but for the time
> being all the track trackers will be killed and new ones spawned.
> Depending on your setup, this could be a fairly negligible time.

I'm not sure I understand. I know task trackers will get restarted,
that's not what I'm worried about. The issue I see is with the JT: it's
started on one master only. If that master goes down, the framework goes
down. I was kind of hoping to be able to do something like this:

<property>
  <name>mapred.job.tracker</name>
  <value>zk://mesos01:2181,mesos02:2181,mesos03:2181/hadoop530</value>
</property>

Perhaps this doesn't actually work as I would expect. It doesn't look
like there's been any progress on issue #28, unfortunately...

>
> # Multiple versions of hadoop on the cluster
>
> This is totally fine, each JT configuration can be given it's own
> hadoop tar.gz file with the right version in it, and they will all
> happily share the Mesos cluster.
>
I guess you have to have multiple startup scripts for this, and also
multiple versions of Hadoop on the masters. Any pointers on how you've
set this up would be much appreciated.

Cheers,
Alex

Re: Hadoop on Mesos

Posted by Tom Arnfeld <to...@duedil.com>.
Hi Alex,

Great to hear you're hoping to use Hadoop on Mesos. We've been running it for a good six months and it's been awesome.

I'll answer the simpler question first: running multiple job trackers should be just fine, even multiple JTs with HA enabled (we do this). The Mesos scheduler for Hadoop will ship all the configuration options needed for each TaskTracker within Mesos, so there's nothing you need to have specifically configured on each slave.

# Slow slot allocations

If you only have a few slaves, not many resources and a large amount of resources per slot, you might end up with a pretty small slot allocation (e.g. 5 mappers and 1 reducer). Because of the nature of Hadoop, slots are static for each TaskTracker, and the framework makes a best effort to figure out what balance of map/reduce slots to launch on the cluster.

Because of this, the current stable version of the framework has a few issues when running on small clusters, especially when you don't configure min/max slot capacity for each JobTracker. A few links below:

- https://github.com/mesos/hadoop/issues/32
- https://github.com/mesos/hadoop/issues/31
- https://github.com/mesos/hadoop/issues/28
- https://github.com/mesos/hadoop/issues/26
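
For what it's worth, that min/max slot capacity is set in the JobTracker's mapred-site.xml. The property names below are taken from the mesos/hadoop README, but the values are purely illustrative; check both against the version you build:

```xml
<!-- Sketch only: floor the slot counts so a small cluster still gets reducers.
     Property names per the mesos/hadoop README; values are illustrative. -->
<property>
  <name>mapred.mesos.total.map.slots.minimum</name>
  <value>4</value>
</property>
<property>
  <name>mapred.mesos.total.reduce.slots.minimum</name>
  <value>2</value>
</property>
<property>
  <name>mapred.mesos.slot.cpus</name>
  <value>1</value>
</property>
<property>
  <name>mapred.mesos.slot.mem</name>
  <value>1024</value>
</property>
```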




Having said that, we've been working on a solution to this problem which enables Hadoop to launch different types of slots over the lifetime of a single job, meaning you could start with 5 map and 1 reduce slot, and end with 0 map and 6 reduce slots. It's not perfect, but it's a decent optimisation if you still need to use Hadoop.

- https://github.com/mesos/hadoop/pull/33

You may also want to look into how large your executor URI is (the one containing the Hadoop source that gets downloaded for each task tracker) and how long that takes to download; it might be that the task trackers are taking a while to bootstrap.
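
The executor URI is also just a mapred-site.xml property; hosting the tarball on HDFS inside the cluster usually keeps the per-TaskTracker download short. The path below is a placeholder, not a recommended location:

```xml
<property>
  <name>mapred.mesos.executor.uri</name>
  <value>hdfs://namenode:9000/tmp/hadoop-2.5.0-cdh5.3.0-mesos.tar.gz</value>
</property>
```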




# HA Hadoop JTs

The framework currently does not support a full HA setup; however, that's not a huge issue. The JT will automatically restart jobs where they left off on its own when a failover occurs, but for the time being all the task trackers will be killed and new ones spawned. Depending on your setup, this could be a fairly negligible amount of time.

# Multiple versions of hadoop on the cluster

This is totally fine: each JT configuration can be given its own Hadoop tar.gz file with the right version in it, and they will all happily share the Mesos cluster.
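
As a rough sketch, a second, differently-versioned cluster's mapred-site.xml could override just a few properties. The framework name, executor tarball and JT address below are placeholders, and mapred.mesos.framework.name may not exist in every release of the framework, so verify against your copy of the README:

```xml
<!-- Hypothetical fragment for a second JobTracker running an older Hadoop. -->
<property>
  <name>mapred.mesos.framework.name</name>
  <value>hadoop-cdh4</value>
</property>
<property>
  <name>mapred.mesos.executor.uri</name>
  <value>hdfs://namenode:9000/tmp/hadoop-2.0.0-mr1-cdh4.tar.gz</value>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>jobtracker-host:9002</value>
</property>
```

Each such configuration, started with its own JobTracker process, registers as a separate framework and shares the same Mesos slaves.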




I hope this makes sense! Ping me on IRC (tarnfeld) if you run into anything funky on that branch for flexi trackers.

Tom.

--
Tom Arnfeld
Developer // DueDil

On Thu, Jan 29, 2015 at 4:09 PM, Alex <al...@gmail.com> wrote:

> Hi guys,
> I'm a Hadoop and Mesos n00b, so please be gentle. I'm trying to set up a
> Mesos cluster, and my ultimate goal is to introduce Mesos in my
> organization by showing off it's ability to run multiple Hadoop
> clusters, plus other stuff, on the same resources. I'd like to be able
> to do this with a HA configuration as close as possible to something we
> would run in production.
> I've successfully set up a Mesos cluster with 3 masters and 4 slaves,
> but I'm having trouble getting Hadoop jobs to run on top of it. I'm
> using Mesos 0.21.1 and Hadoop CDH 5.3.0. Initially I tried to follow the
> Mesosphere tutorial[1], but it looks like it is very outdated and I
> didn't get very far. Then I tried following the instructions in the
> github repo[2], but they're also less than ideal.
> I've managed to get a Hadoop jobtracker running on one of the masters, I
> can submit jobs to it and they eventually finish. The strange thing is
> that they take a really long time to start the reduce task, so much so
> that the first few times I thought it wasn't working at all. Here's part
> of the output for a simple wordcount example:
> 15/01/29 16:37:58 INFO mapred.JobClient:  map 0% reduce 0%
> 15/01/29 16:39:23 INFO mapred.JobClient:  map 25% reduce 0%
> 15/01/29 16:39:31 INFO mapred.JobClient:  map 50% reduce 0%
> 15/01/29 16:39:34 INFO mapred.JobClient:  map 75% reduce 0%
> 15/01/29 16:39:37 INFO mapred.JobClient:  map 100% reduce 0%
> 15/01/29 16:56:25 INFO mapred.JobClient:  map 100% reduce 100%
> 15/01/29 16:56:29 INFO mapred.JobClient: Job complete: job_201501291533_0004
> Mesos started 3 task trackers which ran the map tasks pretty fast, but
> then it looks like it was stuck for quite a while before launching a
> fourth task tracker to run the reduce task. Is this normal, or is there
> something wrong here?
> More questions: my configuration file looks a lot like the example in
> the github repo, but that's listed as being representative of a
> pseudo-distributed configuration. What should it look like for a real
> distributed setup? How can I go about running multiple Hadoop clusters?
> Currently, all three masters have the same configuration file, so they
> all create a different framework. How should things be set up for a
> high-availability Hadoop framework that can survive the failure of a
> Master? What do I need to do to run multiple versions of Hadoop on the
> same Mesos cluster?
> I'd really appreciate any hints of documentation or tutorials I may have
> missed. Even better would be examples of Puppet configurations to set
> something like this up, but I guess that's probably unlikely.
> Thanks a lot in advance,
> Alex
> [1] https://mesosphere.com/docs/tutorials/run-hadoop-on-mesos/
> [2] https://github.com/mesos/hadoop