Posted to user@mesos.apache.org by Greg Mann <gr...@mesosphere.io> on 2018/10/18 22:21:11 UTC

Proposal: Adding health check definitions to master state output

Hi all,
In addition to the health check API change proposal that I recently sent
out, we're considering adding a task's health check definition (when
present) to the 'Task' protobuf message so that it appears in the master's
'/state' endpoint response, as well as the v1 GET_STATE response and the
TASK_ADDED event. This will allow operators to detect the presence and
configuration of health checks on tasks via the operator API, which they
are currently unable to do:

message Task {
  . . .

  optional HealthCheck health_check = 15;

  . . .
}
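
As an illustration of what this would enable, here is a minimal sketch of how an operator might list tasks that define a health check, given an already-parsed '/state' response. It assumes the new field is rendered in the JSON under the key "health_check" on each task object and that tasks are nested under "frameworks"; the helper name and the sample data are hypothetical, and the final JSON shape may differ.

```python
import json


def tasks_with_health_checks(state):
    """Return (task_id, health_check) pairs for tasks that define a health check.

    Assumes tasks live under state["frameworks"][i]["tasks"] and that the
    proposed field appears under the key "health_check" (an assumption).
    """
    found = []
    for framework in state.get("frameworks", []):
        for task in framework.get("tasks", []):
            check = task.get("health_check")
            if check is not None:
                found.append((task.get("id"), check))
    return found


# Hypothetical sample of a parsed '/state' response, for illustration only.
sample_state = json.loads("""
{
  "frameworks": [
    {
      "tasks": [
        {"id": "web-1", "health_check": {"type": "HTTP", "interval_seconds": 10.0}},
        {"id": "batch-7"}
      ]
    }
  ]
}
""")

for task_id, check in tasks_with_health_checks(sample_state):
    print(task_id, check["type"])
```

Tasks without a health check simply lack the key, which matches the 'optional' field in the message above.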

I wanted to check in with the community regarding this change, since for
very large clusters it could have a non-negligible impact on the size of
the master's state output.
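
For anyone who wants a rough feel for the size impact on their own cluster, the following back-of-the-envelope sketch multiplies the JSON-encoded size of one representative health check definition by a task count. The helper name, the sample definition, and the task count are all hypothetical; real overhead depends on the actual definitions in use and on how the master renders them.

```python
import json


def estimated_extra_bytes(health_check, task_count):
    """Rough estimate of extra JSON bytes if every task carried this health check."""
    # Compact encoding of the key plus its value, as it might appear inside a task object.
    fragment = json.dumps({"health_check": health_check}, separators=(",", ":"))
    # Drop the two enclosing braces, since the fragment is merged into an existing object.
    per_task = len(fragment.encode("utf-8")) - 2
    return per_task * task_count


# Hypothetical HTTP health check definition, for illustration only.
example_check = {
    "type": "HTTP",
    "http": {"port": 8080, "path": "/health"},
    "interval_seconds": 10.0,
    "timeout_seconds": 5.0,
}

print(estimated_extra_bytes(example_check, 100000))
```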

It's worth mentioning that I believe the original intention of the 'Task'
message was to contain most information contained in 'TaskInfo', except for
those fields which could grow very large, like the 'data' field.

Please reply if you foresee this change having a negative impact on your
deployments, or if you have any other thoughts/concerns!

Thanks,
Greg

Re: Proposal: Adding health check definitions to master state output

Posted by Benjamin Mahler <bm...@apache.org>.

> It's worth mentioning that I believe the original intention of the 'Task'
> message was to contain most information contained in 'TaskInfo', except for
> those fields which could grow very large, like the 'data' field.

+1 all task / executor metadata should be exposed IMO. I look at the 'data'
field as a payload delivered to the executor / task rather than it being
part of the metadata. Based on this, if one wanted to have custom metadata
that gets exposed, labels would be used instead.

On Thu, Oct 18, 2018 at 3:21 PM Greg Mann <gr...@mesosphere.io> wrote:

> Hi all,
> In addition to the health check API change proposal that I recently sent
> out, we're considering adding a task's health check definition (when
> present) to the 'Task' protobuf message so that it appears in the master's
> '/state' endpoint response, as well as the v1 GET_STATE response and the
> TASK_ADDED event. This will allow operators to detect the presence and
> configuration of health checks on tasks via the operator API, which they
> are currently unable to do:
>
> message Task {
>   . . .
>
>   optional HealthCheck health_check = 15;
>
>   . . .
> }
>
> I wanted to check in with the community regarding this change, since for
> very large clusters it could have a non-negligible impact on the size of
> the master's state output.
>
> It's worth mentioning that I believe the original intention of the 'Task'
> message was to contain most information contained in 'TaskInfo', except for
> those fields which could grow very large, like the 'data' field.
>
> Please reply if you foresee this change having a negative impact on your
> deployments, or if you have any other thoughts/concerns!
>
> Thanks,
> Greg
>
