Posted to user@mesos.apache.org by Neil Conway <ne...@gmail.com> on 2016/12/09 15:08:57 UTC

Duplicate task IDs

Folks,

The master stores a cache of metadata about recently completed tasks;
for example, this information can be accessed via the "/tasks" HTTP
endpoint or the "GET_TASKS" call in the new Operator API.

The master currently stores this metadata using a list; this means
that duplicate task IDs are permitted. We're considering [1] changing
this to use a hashmap instead. Using a hashmap would mean that
duplicate task IDs would be discarded: if two completed tasks have the
same task ID, only the metadata for the most recently completed task
would be retained by the master.
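
As an illustration only (this is not the master's code), the difference
between the two storage strategies can be sketched in Python. The task IDs
and state names below are made-up sample data, and a plain list and dict
stand in for the master's actual containers.

```python
# Hypothetical sample data: (task_id, state) pairs in completion order.
completed = [
    ("task-1", "TASK_FINISHED"),
    ("task-2", "TASK_FAILED"),
    ("task-1", "TASK_KILLED"),  # same ID reused by a later task
]

# List-like storage keeps every record, duplicates included.
as_list = list(completed)

# Map-like storage keyed by task ID: a later record overwrites the
# earlier one, so only the most recently completed task survives.
as_map = {}
for task_id, state in completed:
    as_map[task_id] = state

print(len(as_list))  # three records, including both "task-1" entries
print(as_map)        # two records, "task-1" maps to its latest state
```

With the list, both "task-1" records remain visible; with the map, the
earlier "task-1" record is silently replaced.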

If this behavior change would cause problems for your framework or
other software that relies on Mesos, please let me know.

(Note that if you do have two completed tasks with the same ID, you'd
need an unambiguous way to tell them apart. As a recommendation, I
would strongly encourage framework authors to never reuse task IDs.)
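
One possible way for a framework to avoid reusing task IDs is to append a
random UUID to a human-readable prefix. This is a minimal sketch, not a
Mesos API: the helper name new_task_id and the "prefix.uuid" layout are
assumptions made for illustration.

```python
import uuid


def new_task_id(prefix: str) -> str:
    """Return a task ID that is unique with overwhelming probability.

    Hypothetical helper: the prefix keeps IDs readable, and the random
    UUID suffix makes accidental reuse extremely unlikely.
    """
    return f"{prefix}.{uuid.uuid4()}"


print(new_task_id("nightly-backup"))
```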

Neil

[1] https://reviews.apache.org/r/54179/

Re: Duplicate task IDs

Posted by Neil Conway <ne...@gmail.com>.

On Mon, Dec 12, 2016 at 1:32 PM, Joris Van Remoortere
<jo...@mesosphere.io> wrote:
> It sounds like using a multi_hashmap for now allows you to clean up the
> code and avoid some bugs, without changing the existing behavior.

Because we want cache-like behavior (bounded size + LRU replacement),
this would require adding a new data structure, BoundedMultiHashMap
(https://reviews.apache.org/r/54178/). That seems like overkill to me,
for now.
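
To give a rough idea of what such a structure involves, here is a Python
sketch of a bounded multimap. It is not the code under review: the class
name and methods are hypothetical, and it evicts the oldest inserted value
first as a simple stand-in for the replacement policy mentioned above.

```python
from collections import deque


class BoundedMultiMap:
    """Hypothetical multimap holding at most `capacity` values in total."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._order = deque()  # one key per stored value, oldest first
        self._values = {}      # key -> deque of values, oldest first

    def put(self, key, value) -> None:
        # When full, drop the oldest stored value (whatever its key).
        if len(self._order) == self.capacity:
            oldest_key = self._order.popleft()
            bucket = self._values[oldest_key]
            bucket.popleft()
            if not bucket:
                del self._values[oldest_key]
        self._order.append(key)
        self._values.setdefault(key, deque()).append(value)

    def get(self, key) -> list:
        # All retained values for the key, oldest first.
        return list(self._values.get(key, ()))

    def __len__(self) -> int:
        return len(self._order)


cache = BoundedMultiMap(capacity=2)
cache.put("task-1", "first run")
cache.put("task-1", "second run")  # duplicate ID is kept, not overwritten
cache.put("task-2", "only run")    # evicts the oldest value ("first run")
```

Even this small version needs two coordinated containers, which hints at
why a dedicated data structure would have to be added.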

> It would also be unfortunate if we said we were disallowing duplicate task
> IDs but only catch some of the manifestations.

Definitely unfortunate, but I don't see an alternative, as long as we
continue to allow frameworks to freely choose their own task IDs.

Neil

Re: Duplicate task IDs

Posted by Joris Van Remoortere <jo...@mesosphere.io>.

It sounds like using a multi_hashmap for now allows you to clean up the
code and avoid some bugs, without changing the existing behavior.

I agree that we would want a deprecation period if we changed the behavior.
It would also be unfortunate if we said we were disallowing duplicate task
IDs but only catch some of the manifestations.

—
*Joris Van Remoortere*
Mesosphere

On Mon, Dec 12, 2016 at 7:56 AM, Neil Conway <ne...@gmail.com> wrote:

> Hi Joris,
>
> Fair point: I didn't deliberately set out to change the behavior for
> duplicate task IDs. Rather, it was a consequence of switching from
> boost::circular_buffer to using a hashmap for managing completed
> tasks. Using a hashmap has a few minor advantages [1], but we can
> certainly continue using circular_buffer (or a multi-hashmap) if we
> want to keep the current behavior.
>
> I think we have the following options:
>
> (1) Keep the current behavior: reusing task IDs is discouraged but
> supported.
>
> (2) Per Alex's suggestion, we can say that frameworks are no longer
> allowed to reuse task IDs. Because the master only keeps a
> limited-size cache of completed tasks (which is not preserved across
> master restart or failover), we wouldn't be able to reject all
> situations in which frameworks attempt to reuse task IDs.
>
> If we pursue #2, we might need a deprecation period or master
> capability to give framework authors some time to migrate.
>
> For the moment, I'll avoid changing the behavior for duplicate task
> IDs; I've opened https://issues.apache.org/jira/browse/MESOS-6779 to
> track this issue. If you have an opinion on this change, please
> weigh in, either on this thread or on JIRA.
>
> Neil
>
> [1] Specifically, making the management of completed and unreachable
> tasks more symmetric and avoiding some bugs/UBI in
> boost::circular_buffer. O(1) lookup of completed tasks might be useful
> in the future but isn't used right now.
>
> On Fri, Dec 9, 2016 at 2:13 PM, Joris Van Remoortere
> <jo...@mesosphere.io> wrote:
> > Hey Neil,
> >
> > I concur that using duplicate task IDs is bad practice and asking for
> > trouble.
> >
> > Could you please clarify *why* you want to use a hashmap? Is your goal to
> > remove duplicate task IDs or is this just a side-effect and you have a
> > different reason (e.g. performance) for using a hashmap?
> >
> > I'm wondering why a multi-hashmap is not sufficient. This would be clear if
> > you were explicitly *trying* to get rid of duplicates of course :-)
> >
> > Thanks,
> > Joris
> >
> > —
> > *Joris Van Remoortere*
> > Mesosphere
> >
> > On Fri, Dec 9, 2016 at 7:08 AM, Neil Conway <ne...@gmail.com> wrote:
> >
> >> Folks,
> >>
> >> The master stores a cache of metadata about recently completed tasks;
> >> for example, this information can be accessed via the "/tasks" HTTP
> >> endpoint or the "GET_TASKS" call in the new Operator API.
> >>
> >> The master currently stores this metadata using a list; this means
> >> that duplicate task IDs are permitted. We're considering [1] changing
> >> this to use a hashmap instead. Using a hashmap would mean that
> >> duplicate task IDs would be discarded: if two completed tasks have the
> >> same task ID, only the metadata for the most recently completed task
> >> would be retained by the master.
> >>
> >> If this behavior change would cause problems for your framework or
> >> other software that relies on Mesos, please let me know.
> >>
> >> (Note that if you do have two completed tasks with the same ID, you'd
> >> need an unambiguous way to tell them apart. As a recommendation, I
> >> would strongly encourage framework authors to never reuse task IDs.)
> >>
> >> Neil
> >>
> >> [1] https://reviews.apache.org/r/54179/
> >>
>

Re: Duplicate task IDs

Posted by Neil Conway <ne...@gmail.com>.

Hi Joris,

Fair point: I didn't deliberately set out to change the behavior for
duplicate task IDs. Rather, it was a consequence of switching from
boost::circular_buffer to using a hashmap for managing completed
tasks. Using a hashmap has a few minor advantages [1], but we can
certainly continue using circular_buffer (or a multi-hashmap) if we
want to keep the current behavior.

I think we have the following options:

(1) Keep the current behavior: reusing task IDs is discouraged but supported.

(2) Per Alex's suggestion, we can say that frameworks are no longer
allowed to reuse task IDs. Because the master only keeps a
limited-size cache of completed tasks (which is not preserved across
master restart or failover), we wouldn't be able to reject all
situations in which frameworks attempt to reuse task IDs.

If we pursue #2, we might need a deprecation period or master
capability to give framework authors some time to migrate.
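
The limitation described in option (2) can be sketched as follows. This is
a simplified Python model, not the master's implementation: the class
CompletedTaskCache, its methods, and the capacity of 2 are all assumptions
chosen to keep the example small.

```python
from collections import OrderedDict


class CompletedTaskCache:
    """Hypothetical bounded cache of completed task IDs, oldest evicted first."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._ids = OrderedDict()

    def record(self, task_id: str) -> None:
        # Remember the ID as most recent, then evict the oldest if over capacity.
        self._ids[task_id] = True
        self._ids.move_to_end(task_id)
        if len(self._ids) > self.capacity:
            self._ids.popitem(last=False)

    def is_known_duplicate(self, task_id: str) -> bool:
        # Only IDs still in the cache can be recognized as reused.
        return task_id in self._ids


cache = CompletedTaskCache(capacity=2)
for task_id in ["t1", "t2", "t3"]:
    cache.record(task_id)

print(cache.is_known_duplicate("t3"))  # still cached, reuse is detectable
print(cache.is_known_duplicate("t1"))  # evicted, reuse would go unnoticed
```

Once "t1" has been evicted (or the cache is lost on master restart or
failover), a framework reusing that ID can no longer be rejected.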

For the moment, I'll avoid changing the behavior for duplicate task
IDs; I've opened https://issues.apache.org/jira/browse/MESOS-6779 to
track this issue. If you have an opinion on this change, please
weigh in, either on this thread or on JIRA.

Neil

[1] Specifically, making the management of completed and unreachable
tasks more symmetric and avoiding some bugs/UBI in
boost::circular_buffer. O(1) lookup of completed tasks might be useful
in the future but isn't used right now.

On Fri, Dec 9, 2016 at 2:13 PM, Joris Van Remoortere
<jo...@mesosphere.io> wrote:
> Hey Neil,
>
> I concur that using duplicate task IDs is bad practice and asking for
> trouble.
>
> Could you please clarify *why* you want to use a hashmap? Is your goal to
> remove duplicate task IDs or is this just a side-effect and you have a
> different reason (e.g. performance) for using a hashmap?
>
> I'm wondering why a multi-hashmap is not sufficient. This would be clear if
> you were explicitly *trying* to get rid of duplicates of course :-)
>
> Thanks,
> Joris
>
> —
> *Joris Van Remoortere*
> Mesosphere
>
> On Fri, Dec 9, 2016 at 7:08 AM, Neil Conway <ne...@gmail.com> wrote:
>
>> Folks,
>>
>> The master stores a cache of metadata about recently completed tasks;
>> for example, this information can be accessed via the "/tasks" HTTP
>> endpoint or the "GET_TASKS" call in the new Operator API.
>>
>> The master currently stores this metadata using a list; this means
>> that duplicate task IDs are permitted. We're considering [1] changing
>> this to use a hashmap instead. Using a hashmap would mean that
>> duplicate task IDs would be discarded: if two completed tasks have the
>> same task ID, only the metadata for the most recently completed task
>> would be retained by the master.
>>
>> If this behavior change would cause problems for your framework or
>> other software that relies on Mesos, please let me know.
>>
>> (Note that if you do have two completed tasks with the same ID, you'd
>> need an unambiguous way to tell them apart. As a recommendation, I
>> would strongly encourage framework authors to never reuse task IDs.)
>>
>> Neil
>>
>> [1] https://reviews.apache.org/r/54179/
>>

Re: Duplicate task IDs

Posted by Alex Rukletsov <al...@mesosphere.io>.

I'm fine with prohibiting non-unique IDs, but why do you plan to keep the
most recent in case of a conflict? I'd expect any duplicate (that we can
detect) to be rejected / killed / banned / unchurched.

On 9 Dec 2016 8:13 pm, "Joris Van Remoortere" <jo...@mesosphere.io> wrote:

> Hey Neil,
>
> I concur that using duplicate task IDs is bad practice and asking for
> trouble.
>
> Could you please clarify *why* you want to use a hashmap? Is your goal to
> remove duplicate task IDs or is this just a side-effect and you have a
> different reason (e.g. performance) for using a hashmap?
>
> I'm wondering why a multi-hashmap is not sufficient. This would be clear if
> you were explicitly *trying* to get rid of duplicates of course :-)
>
> Thanks,
> Joris
>
> —
> *Joris Van Remoortere*
> Mesosphere
>
> On Fri, Dec 9, 2016 at 7:08 AM, Neil Conway <ne...@gmail.com> wrote:
>
> > Folks,
> >
> > The master stores a cache of metadata about recently completed tasks;
> > for example, this information can be accessed via the "/tasks" HTTP
> > endpoint or the "GET_TASKS" call in the new Operator API.
> >
> > The master currently stores this metadata using a list; this means
> > that duplicate task IDs are permitted. We're considering [1] changing
> > this to use a hashmap instead. Using a hashmap would mean that
> > duplicate task IDs would be discarded: if two completed tasks have the
> > same task ID, only the metadata for the most recently completed task
> > would be retained by the master.
> >
> > If this behavior change would cause problems for your framework or
> > other software that relies on Mesos, please let me know.
> >
> > (Note that if you do have two completed tasks with the same ID, you'd
> > need an unambiguous way to tell them apart. As a recommendation, I
> > would strongly encourage framework authors to never reuse task IDs.)
> >
> > Neil
> >
> > [1] https://reviews.apache.org/r/54179/
> >
>

Re: Duplicate task IDs

Posted by Joris Van Remoortere <jo...@mesosphere.io>.

Hey Neil,

I concur that using duplicate task IDs is bad practice and asking for
trouble.

Could you please clarify *why* you want to use a hashmap? Is your goal to
remove duplicate task IDs or is this just a side-effect and you have a
different reason (e.g. performance) for using a hashmap?

I'm wondering why a multi-hashmap is not sufficient. This would be clear if
you were explicitly *trying* to get rid of duplicates of course :-)

Thanks,
Joris

—
*Joris Van Remoortere*
Mesosphere

On Fri, Dec 9, 2016 at 7:08 AM, Neil Conway <ne...@gmail.com> wrote:

> Folks,
>
> The master stores a cache of metadata about recently completed tasks;
> for example, this information can be accessed via the "/tasks" HTTP
> endpoint or the "GET_TASKS" call in the new Operator API.
>
> The master currently stores this metadata using a list; this means
> that duplicate task IDs are permitted. We're considering [1] changing
> this to use a hashmap instead. Using a hashmap would mean that
> duplicate task IDs would be discarded: if two completed tasks have the
> same task ID, only the metadata for the most recently completed task
> would be retained by the master.
>
> If this behavior change would cause problems for your framework or
> other software that relies on Mesos, please let me know.
>
> (Note that if you do have two completed tasks with the same ID, you'd
> need an unambiguous way to tell them apart. As a recommendation, I
> would strongly encourage framework authors to never reuse task IDs.)
>
> Neil
>
> [1] https://reviews.apache.org/r/54179/
>
