Posted to dev@mesos.apache.org by Joseph Wu <jo...@mesosphere.io> on 2015/10/23 23:33:14 UTC

[Breaking bug fix] Binary in state endpoints

Hello,

The state endpoints, on master and agent, currently serialize two binary
data fields in the ExecutorInfo and TaskInfo objects.  These fields are set
by frameworks, and Mesos does not inspect their values.

The data fields can be found in the state JSON blobs:
/master/state -> frameworks[*].executors[*].data
/slave/state ->
frameworks[*].(executors|completed_executors)[*].(tasks|queued_tasks|completed_tasks)[*].data

*Problem:*
The state endpoints are JSON-ified in a non-standard way (i.e. not via our
normal Protobuf-to-JSON methods).  When we serialize the binary "data"
fields, the bytes are dumped into a string as is.  The resulting JSON may
not be valid if the binary data includes arbitrary bytes (i.e. not valid
Unicode).  Most JSON parsers will fail on the state endpoints in this case.
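
To make the failure concrete, here is a minimal Python sketch. It is not the
Mesos serialization code; naive_state_json and parses are hypothetical helpers
that only imitate "paste the bytes between quotes" and then try to parse the
result.

```python
import json


def naive_state_json(data: bytes) -> bytes:
    """Build a JSON document by pasting raw bytes between quotes, unescaped."""
    return b'{"data": "' + data + b'"}'


def parses(document: bytes) -> bool:
    """Return True if the bytes decode as UTF-8 and then parse as JSON."""
    try:
        json.loads(document.decode("utf-8"))
        return True
    except (UnicodeDecodeError, json.JSONDecodeError):
        return False


# Plain ASCII text survives the naive approach.
text_ok = parses(naive_state_json(b"hello"))
# Bytes that are not valid UTF-8 break the document.
binary_ok = parses(naive_state_json(b"\xff\xfe\x00\x01"))
# So does an unescaped double quote inside the payload.
quote_ok = parses(naive_state_json(b'say "hi"'))
```

In this sketch text_ok is True, while binary_ok and quote_ok are both False.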

*Proposed solution (and breaking change):*
Simple -- remove the "data" fields from the state endpoints.  (And only
from the state endpoints.  The ExecutorInfo and TaskInfo objects will not
change.)
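
As a rough illustration of the effect on the output (not the actual change,
which would be made in the Mesos C++ code), the Python sketch below drops
"data" from a parsed state document. The key names come from the paths listed
above; strip_data and the sample document are hypothetical, and for simplicity
the function removes "data" from every executor and task those paths reach.

```python
import copy

# Key names taken from the paths listed in this message.
EXECUTOR_KEYS = ("executors", "completed_executors")
TASK_KEYS = ("tasks", "queued_tasks", "completed_tasks")


def strip_data(state: dict) -> dict:
    """Return a copy of a state-like dict with the "data" fields removed."""
    result = copy.deepcopy(state)
    for framework in result.get("frameworks", []):
        for executor_key in EXECUTOR_KEYS:
            for executor in framework.get(executor_key, []):
                executor.pop("data", None)
                for task_key in TASK_KEYS:
                    for task in executor.get(task_key, []):
                        task.pop("data", None)
    return result


# Hypothetical, heavily trimmed state document.
sample = {
    "frameworks": [
        {
            "executors": [
                {
                    "id": "e1",
                    "data": "raw",
                    "tasks": [{"id": "t1", "data": "raw"}],
                }
            ]
        }
    ]
}
cleaned = strip_data(sample)
```

The input is left untouched; only the returned copy loses the "data" keys.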

*Question:*
We believe that frameworks/tools do not rely on retrieving the "data"
fields from the state endpoints.

Is there any framework/tool that retrieves the "data" field from the state
endpoints?
And if so, is it critical to how the framework/tool works?

More details here: https://issues.apache.org/jira/browse/MESOS-3771

Thanks,
~Joseph

Re: [Breaking bug fix] Binary in state endpoints

Posted by David Greenberg <ds...@gmail.com>.
In that case, I rescind my objection. With memory use as the problem and
labels as the alternative, that works fine. Thanks!

On Mon, Nov 2, 2015 at 7:03 PM Benjamin Mahler <be...@gmail.com>
wrote:

> Sorry for the confusion. The motivation to remove 'data' is memory
> scalability (the ability to express binary fields is orthogonal and is not
> the reason to remove 'data').
>
> We can get into a really bad state in large clusters if frameworks are
> putting non-trivial amounts of 'data' in TaskInfos and ExecutorInfos. If
> it's too large for the master to hold in memory, the master will
> continually OOM and it becomes impossible to right your cluster. See
> https://issues.apache.org/jira/browse/MESOS-1746 for some history of
> stripping binary data, starting with TaskStatus.
>
> Labels were introduced to aid tooling. Can you use labels instead? I realize
> they are not in ExecutorInfo yet.

Re: [Breaking bug fix] Binary in state endpoints

Posted by Benjamin Mahler <be...@gmail.com>.
Sorry for the confusion. The motivation to remove 'data' is memory
scalability (the ability to express binary fields is orthogonal and is not
the reason to remove 'data').

We can get into a really bad state in large clusters if frameworks are
putting non-trivial amounts of 'data' in TaskInfos and ExecutorInfos. If
it's too large for the master to hold in memory, the master will
continually OOM and it becomes impossible to right your cluster. See
https://issues.apache.org/jira/browse/MESOS-1746 for some history of
stripping binary data, starting with TaskStatus.
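
A back-of-the-envelope Python sketch of the scaling concern. The task count
and per-task size are made-up numbers for illustration, not measurements from
any cluster, and data_footprint_bytes is a hypothetical helper.

```python
def data_footprint_bytes(num_tasks: int, data_bytes_per_task: int) -> int:
    """Lower bound on the memory needed just to hold every task's 'data'."""
    return num_tasks * data_bytes_per_task


# Hypothetical cluster: 100,000 tasks, each carrying 1 MiB of 'data'.
footprint = data_footprint_bytes(100_000, 1024 * 1024)
footprint_gib = footprint / 1024**3
print(f"{footprint_gib:.1f} GiB")  # prints "97.7 GiB"
```

Under those assumed numbers, the 'data' payloads alone would need close to
100 GiB before counting anything else the master keeps in memory.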

Labels were introduced to aid tooling. Can you use labels instead? I realize
they are not in ExecutorInfo yet.
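
A minimal Python sketch of how a tool might read labels from a task entry.
It assumes labels show up in the state JSON as a list of objects with "key"
and "value" fields; that shape, the labels_to_dict helper, and the sample
task are assumptions for illustration, so check them against real output.

```python
def labels_to_dict(task: dict) -> dict:
    """Collapse a task's list of key/value labels into a plain dict."""
    return {
        label["key"]: label.get("value", "")
        for label in task.get("labels", [])
    }


# Hypothetical task entry, trimmed to the fields used here.
task = {
    "id": "t1",
    "labels": [
        {"key": "owner", "value": "platform-team"},
        {"key": "tier", "value": "batch"},
    ],
}
labels = labels_to_dict(task)
```

Because labels are plain strings, they serialize cleanly as JSON, which is
what makes them easier for tooling to consume than an opaque binary field.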

On Mon, Nov 2, 2015 at 6:23 PM, David Greenberg <ds...@gmail.com>
wrote:

> Why not base64-encode the field? We use that field in our frameworks, and
> some of our platform tools would benefit from being able to read that data.
> Base64 seems like a compromise that adds minimal complexity. It also
> removes the potential for parse errors, doesn't prevent future
> applications from using the data stored there (as specialized frameworks
> use that field), and doesn't incur a message size overhead in the (I
> presume) majority of frameworks not using that field.

Re: [Breaking bug fix] Binary in state endpoints

Posted by David Greenberg <ds...@gmail.com>.
Why not base64-encode the field? We use that field in our frameworks, and
some of our platform tools would benefit from being able to read that data.
Base64 seems like a compromise that adds minimal complexity. It also
removes the potential for parse errors, doesn't prevent future
applications from using the data stored there (as specialized frameworks
use that field), and doesn't incur a message size overhead in the (I
presume) majority of frameworks not using that field.
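
A minimal Python sketch of the idea, using only the standard library.
encode_data and decode_data are hypothetical helper names, and this is not a
proposal for how the endpoint code itself would be written.

```python
import base64
import json


def encode_data(data: bytes) -> str:
    """Encode arbitrary bytes as an ASCII string that is safe inside JSON."""
    return base64.b64encode(data).decode("ascii")


def decode_data(field: str) -> bytes:
    """Recover the original bytes from the base64 string."""
    return base64.b64decode(field)


raw = b"\xff\xfe\x00\x01 binary payload"
document = json.dumps({"data": encode_data(raw)})
round_tripped = decode_data(json.loads(document)["data"])
```

The document parses with any JSON parser and round_tripped equals raw.
Base64 emits four characters for every three input bytes, so only frameworks
that actually set the field pay the roughly one-third size increase.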

On Mon, Nov 2, 2015 at 4:28 PM Guangya Liu <gy...@gmail.com> wrote:

> +1 to removing the field directly. One comment: the upgrade document
> may need to be updated.
>
> From my understanding, the data is binary, and I have not seen much
> demand for retrieving binary data.
>
> Thanks!

Re: [Breaking bug fix] Binary in state endpoints

Posted by Guangya Liu <gy...@gmail.com>.
+1 to removing the field directly. One comment: the upgrade document
may need to be updated.

From my understanding, the data is binary, and I have not seen much
demand for retrieving binary data.

Thanks!
