Posted to user@mesos.apache.org by th...@artorg.unibe.ch on 2017/06/22 14:13:02 UTC

Agent Working Directory Best Practices

Hi,

We have a couple of server nodes mainly used for computational tasks in
our Mesos cluster. These servers have beefy CPUs, GPUs etc. but only
limited SSD space. We also have a 40GbE network and a decently fast
file server.

My question is simple but I didn't find an answer anywhere: What are the
best practices for the working directory on mesos-agent nodes? Should
we keep the working directory local or is it reasonable to use an
NFS-mounted folder? We implemented both and they seem to work fine, but I
would rather follow "best practices".

Thanks and cheers

Tom

Re: Agent Working Directory Best Practices

Posted by Vinod Kone <vi...@apache.org>.

This is great information. Thanks for sharing Steven!

On Tue, Jun 27, 2017 at 7:05 AM, Steven Schlansker <
sschlansker@opentable.com> wrote:

>
> > On Jun 25, 2017, at 11:24 PM, Benjamin Mahler <bm...@apache.org> wrote:
> >
> > As a data point, as far as I'm aware, most users are using a local work
> > directory, not an NFS mounted one. Would love to hear from anyone on the
> > list if they are doing this, and if there are any subtleties that should be
> > documented.
>
> We don't run NFS in particular but we did originally use a SAN -- two
> observations:
>
> NFS (historically, maybe it's better now, but doubtful...) has really bad
> failure modes. Network failures can cause serious hangs both in user-space
> and kernel-space. Such hangs can be impossible to clear without rebooting
> the machine, and in some edge cases can even make it difficult or
> impossible to reboot the machine via normal means.
>
> Network attached drives (our SAN) are less reliable, slower, and more
> complex (read: more failure modes) than local disk. It's also a really big
> single point of failure. So far our only true cluster outages have been due
> to failure of the SAN, since it took down all nodes at once -- once we
> removed the SAN, future failures had islands of availability and any
> properly written application could continue running (obviously without
> network resources) through the incident.
>
> Maybe this isn't a huge deal for your use case, which might differ from
> ours. For us, it was enough of a problem that we now purchase local SSD
> scratch space for every node just so that we have some storage we can
> depend on a bit more than network attached storage.

Re: Agent Working Directory Best Practices

Posted by Steven Schlansker <ss...@opentable.com>.

> On Jun 26, 2017, at 5:30 PM, James Peach <jo...@gmail.com> wrote:
> 
> 
>> On Jun 26, 2017, at 4:05 PM, Steven Schlansker <ss...@opentable.com> wrote:
>> 
>> 
>>> On Jun 25, 2017, at 11:24 PM, Benjamin Mahler <bm...@apache.org> wrote:
>>> 
>>> As a data point, as far as I'm aware, most users are using a local work directory, not an NFS mounted one. Would love to hear from anyone on the list if they are doing this, and if there are any subtleties that should be documented.
>> 
>> We don't run NFS in particular but we did originally use a SAN -- two observations:
>> 
>> NFS (historically, maybe it's better now, but doubtful...) has really bad failure modes.
>> Network failures can cause serious hangs both in user-space and kernel-space.  Such
>> hangs can be impossible to clear without rebooting the machine, and in some edge cases
>> can even make it difficult or impossible to reboot the machine via normal means.
> 
> You need to make sure to mount with the "intr" option.
> 
> https://speakerdeck.com/gnb/130-lca2008-nfs-tuning-secrets-d7

That's not without some caveats.  nfs(5):

The intr / nointr mount option is deprecated after kernel 2.6.25. Only SIGKILL can interrupt a pending NFS operation on these kernels, and if specified, this mount option is ignored to provide backwards compatibility with older kernels.
Using the intr option is preferred to using the soft option because it is significantly less likely to result in data corruption.

...

NB: A so-called "soft" timeout can cause silent data corruption in certain cases. As such, use the soft option only when client responsiveness is more important than data integrity. Using NFS over TCP or increasing the value of the retrans option may mitigate some of the risks of using the soft option.


So, 'intr' is deprecated / removed on any reasonable kernel, and 'soft' has silent data corruption issues.
Typical Linux, having a broken implementation that then points you to instead use a deprecated / removed implementation :)

I'm sure there's a way to get NFS working great.  Just pointing out that you'll need an expert to take ownership of it!
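
As a rough illustration of the two options discussed above, here is a minimal Python sketch that scans /proc/mounts-style text and flags NFS mounts that use 'soft' or 'intr'. The function name, the sample line, and the mount point are made up for illustration; this is not part of Mesos or nfs-utils, and the warnings only paraphrase the nfs(5) excerpt quoted above.

```python
# Sketch only: flag the NFS mount options discussed in this thread.
# nfs_option_warnings and the sample data are hypothetical.

def nfs_option_warnings(mounts_text):
    """Return {mount_point: [notes]} for NFS entries in /proc/mounts-style text."""
    warnings = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field order: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = set(fields[3].split(","))
        notes = []
        if "soft" in options:
            notes.append("soft: nfs(5) warns of possible silent data corruption")
        if "intr" in options:
            notes.append("intr: deprecated, ignored after kernel 2.6.25")
        warnings[fields[1]] = notes
    return warnings


# Hypothetical mount table entry for an NFS-backed directory.
sample = "fileserver:/export /var/lib/mesos nfs4 rw,soft,intr,timeo=600 0 0\n"
for mount_point, notes in nfs_option_warnings(sample).items():
    print(mount_point, notes)
```

On a real Linux node the same function could be fed the contents of /proc/mounts or /etc/fstab. A kernel that ignores 'intr' may not list it in /proc/mounts at all, so checking /etc/fstab is probably the more useful of the two for that option.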


Re: Agent Working Directory Best Practices

Posted by James Peach <jo...@gmail.com>.

> On Jun 26, 2017, at 4:05 PM, Steven Schlansker <ss...@opentable.com> wrote:
> 
> 
>> On Jun 25, 2017, at 11:24 PM, Benjamin Mahler <bm...@apache.org> wrote:
>> 
>> As a data point, as far as I'm aware, most users are using a local work directory, not an NFS mounted one. Would love to hear from anyone on the list if they are doing this, and if there are any subtleties that should be documented.
> 
> We don't run NFS in particular but we did originally use a SAN -- two observations:
> 
> NFS (historically, maybe it's better now, but doubtful...) has really bad failure modes.
> Network failures can cause serious hangs both in user-space and kernel-space.  Such
> hangs can be impossible to clear without rebooting the machine, and in some edge cases
> can even make it difficult or impossible to reboot the machine via normal means.

You need to make sure to mount with the "intr" option.

https://speakerdeck.com/gnb/130-lca2008-nfs-tuning-secrets-d7


Re: Agent Working Directory Best Practices

Posted by Steven Schlansker <ss...@opentable.com>.

> On Jun 25, 2017, at 11:24 PM, Benjamin Mahler <bm...@apache.org> wrote:
> 
> As a data point, as far as I'm aware, most users are using a local work directory, not an NFS mounted one. Would love to hear from anyone on the list if they are doing this, and if there are any subtleties that should be documented.

We don't run NFS in particular but we did originally use a SAN -- two observations:

NFS (historically, maybe it's better now, but doubtful...) has really bad failure modes.
Network failures can cause serious hangs both in user-space and kernel-space.  Such
hangs can be impossible to clear without rebooting the machine, and in some edge cases
can even make it difficult or impossible to reboot the machine via normal means.

Network attached drives (our SAN) are less reliable, slower, and more complex
(read: more failure modes) than local disk.  It's also a really big single point
of failure.  So far our only true cluster outages have been due to failure of
the SAN, since it took down all nodes at once -- once we removed the SAN, future
failures had islands of availability and any properly written application
could continue running (obviously without network resources) through the incident.

Maybe this isn't a huge deal for your use case, which might differ from ours.
For us, it was enough of a problem that we now purchase local SSD scratch space
for every node just so that we have some storage we can depend on a bit more
than network attached storage.
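
One way to keep a hung network mount from blocking a monitoring script is to probe the path from a helper thread and give up after a timeout. The following is a minimal Python sketch of that idea, not something taken from Mesos; the function name and the 5 second default are assumptions. It does not clear the hang: a stat stuck in the kernel stays stuck, the caller just stops waiting for it.

```python
import os
import tempfile
import threading


def path_responds(path, timeout_s=5.0):
    """Return True if os.stat(path) succeeds within timeout_s seconds.

    Returns False both when stat fails (e.g. missing path) and when it
    does not finish in time (e.g. an unreachable network mount).
    """
    result = {}

    def probe():
        try:
            os.stat(path)
            result["ok"] = True
        except OSError:
            result["ok"] = False

    # Daemon thread: if stat hangs on a dead mount, the thread stays stuck
    # but does not keep the interpreter from exiting.
    worker = threading.Thread(target=probe, daemon=True)
    worker.start()
    worker.join(timeout_s)
    return result.get("ok", False)


# Probe a directory that is expected to be local and healthy.
print(path_responds(tempfile.gettempdir()))
```

A health check built this way could be pointed at the scratch directory on each node, so that a storage outage shows up as a failed probe rather than a stuck checker.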


Re: Agent Working Directory Best Practices

Posted by Benjamin Mahler <bm...@apache.org>.

As a data point, as far as I'm aware, most users are using a local work
directory, not an NFS mounted one. Would love to hear from anyone on the
list if they are doing this, and if there are any subtleties that should be
documented.
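
For anyone unsure which kind of storage a given work directory actually sits on, here is a minimal Python sketch that looks up the filesystem type by longest-prefix match against /proc/mounts-style text. The function name, the sample mount table, and the /var/lib/mesos path are hypothetical; the sketch expects an absolute path and ignores the escaping /proc/mounts applies to mount points containing spaces.

```python
import posixpath


def fs_type_for(path, mounts_text):
    """Return the fs type of the mount containing the absolute path, or None."""
    path = posixpath.normpath(path)
    best_point, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        # Compare with trailing slashes so /var/lib/mesos does not
        # match /var/lib/mesosx.
        prefix = mount_point.rstrip("/") + "/"
        # ">=" lets a later mount on the same point win, since it shadows
        # the earlier one.
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_point):
            best_point, best_type = mount_point, fs_type
    return best_type


# Hypothetical mount table: local root plus an NFS-backed directory.
sample = (
    "/dev/sda1 / ext4 rw 0 0\n"
    "fileserver:/export /var/lib/mesos nfs4 rw,hard 0 0\n"
)
print(fs_type_for("/var/lib/mesos/slaves", sample))
print(fs_type_for("/home/tom", sample))
```

On a Linux agent, the second argument could be the contents of /proc/mounts and the first the directory passed to the agent's --work_dir flag; a result of "nfs" or "nfs4" would indicate a network-mounted work directory.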
