You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@mesos.apache.org by Neil Conway <ne...@gmail.com> on 2016/07/01 17:44:51 UTC

Overloading and function names

Consider the following function signatures from master.cpp:

Nothing Master::removeSlave(const Registry::Slave& slave);

void Master::removeSlave(Slave* slave, const string& message,
Option<Counter> reason);

or these from sorter/drf/sorter.hpp:

void update(const SlaveID& slaveId, const Resources& resources);

void update(
      const std::string& name,
      const SlaveID& slaveId,
      const Resources& oldAllocation,
      const Resources& newAllocation);

void update(const std::string& name);

void update(const std::string& name, double weight);

These function names use overloading but the different variants have
*completely* different semantics. For example, the variants of
update() do the following:

(1) Set the weight associated with a role.
(2) Change the total pool of resources in the sorter.
(3) Update the fair-share associated with a sorter client.
(4) Change both the total and allocated resources in the sorter.

(For fun, the descriptions and function names are in different orders. :) )

I'd like to propose that we avoid naming functions in this style: if
two functions do fundamentally different things or should be invoked
in very different circumstances, we should avoid giving them the same
name. We should use overloading when two variants of a function do
basically the same thing but differ slightly in the parameters they
accept.

Comments welcome.

Neil

Re: Overloading and function names

Posted by Michael Park <mp...@apache.org>.

Neil, thanks for bringing this up! Agree with you in principle and
sentiment.
I gave it some thought myself since BenM mentioned providing concrete
suggestions.

Ben, I think Neil brought up `removeSlave` as an example that he's not ok
with. (Neil, correct me if I'm wrong)
As far as I understand it, the issue with the 2 overloads of
`removeSlave` that he mentioned is that one
removes the agent from the registry, whereas the other one removes it from
the in-memory map.
While these are conceptually similar (they do remove the slave in some
way), they do in fact do very different things.

I think an example that Neil is ok with is something like `std::max`.
Specifically these overloads (not sure about the ones with comparators):
  * `const T& max(const T&lhs, const T&rhs);`
  * `T max(initializer_list<T> list);`

In regards to `update`, the `void update(const std::string& name);` he
mentioned is a private function.
Since we already have a `calculateShare` this probably could be named
`updateShare`.

---

Some more general thoughts / ideas:

(1) I agree that we don't always want to include all of the parameter
names, but it seems to beneficial in some cases.

     It's a bit difficult to accurately characterize the situations, but
one aspect to consider is the likelihood that
     the argument will be the same name as the parameter name. If the
argument name will basically always be
     the same as the parameter name, we end up with redundancy.

     I don't think `map.count(key)` is a good example of this, since the
`key` portion is basically never named `key`.
     But perhaps call-sites like `map.count(pid)` and `map.count(name)` are
indicative enough.

     The example I want to point out is `void removeFramework(Framework*
framework);`.
     Every variable of type `Framework*` we have in the codebase is named
`framework`, and one function that returns
     an instance of `Framework*` is called `getFramework`. This means
basically every call-site of `removeFramework`
     is either `removeFramework(framework)` or
`removeFramework(getFramework(...));`

(2) I wonder if it would help if we were to "namespace" these functions.

     In the case of `removeSlave` for example, perhaps it would help if we
could write something like:

    ```
    master->registry.remove_agent(agent);
    master->in_memory.remove_agent(agent, ...);
    ```

    or for the `update` example, to have something like:

    ```
    sorter->clients.update(name, weight);
    sorter->clients.calculateShare(name);

    sorter->total_resources.add(agent, resources);
    sorter->total_resources.update(agent, resources);

    sorter->allocations.allocated(name, agent_id, resources);
    sorter->allocations.update(name, agent_id, old_allocation,
new_allocation);
    ```

    Example code:

    ```
    class Master {
      struct {
        void remove_agent(/* ... */) const {
          std::cout << self.x << std::endl;  // can still access private
members.
        }

        Master& self;
      } registry{*this};

      struct InMemory {
        void remove_agent(/* ... */) const;

        Master& self;
      } in_memory{*this};

      private:
      int x;
    };

    void Master::InMemory::remove_agent(...) const { ... }  // can still
define out-of-line.
    ```

Just a few thoughts.

MPark

On 3 July 2016 at 21:10, Benjamin Mahler <bm...@apache.org> wrote:

> To clarify, are you ok with the removeSlave example? It seems to fit your
> criteria.
>
> Usually with this kind of email we need concrete suggestions for
> improvement. First though I wanted to improve some issues around naming and
> the descriptions you've used in the update example to avoid conflating
> things further (e.g. there are only 3 signatures, the sorter doesn't know
> about "roles", the descriptions are not precisely matching the behavior for
> (2) and (4), we can improve argument naming). So we have:
>
> void update(const std::string& client, double newWeight);
> void update(const SlaveID& slaveId, const Resources& newTotal);
> void update(
>       const std::string& client,
>       const SlaveID& slaveId,
>       const Resources& oldAllocation,
>       const Resources& newAllocation);
>
> (1) Update a client's weight.
> (2) Update the total resources for a slave.
> (3) Update a client's allocation on a slave.
>
> With this I don't think it's as bad as you've described in terms of being
> able to intuit behavior:
>
> sorter->update(client, weight);
> sorter->update(slaveId, newTotal);
> sorter->update(client, slaveId, oldAllocation, newAllocation);
>
> That being said, if there are more helpful function names let's make some
> suggestions! The obvious alternative here seems to be verbose names that
> repeat the arguments?
>
> sorter->update_client_weight(client, weight);
> sorter->update_slave_total(slave, total);
> sorter->update_client_allocation(client, slaveId, oldAllocation,
> newAllocation);
>
> We tend to avoid this pattern as well, because it leads to redundancy. For
> example, we would generally prefer:
>
> map.count(key)
> map.contains(key)
>
> and not:
>
> map.countKey(key)
> map.containsKey(key)
>
> Another thought is in the direction of exposing Sorter::Client /
> Sorter::Slave to make this more straightforward. Something like:
>
> sorter->clients.at(client).set_weight(weight);
> sorter->slaves.at(slaveId).set_total(resources);
> sorter->slaves.at(slaveId).update_allocation(client, old, new);
>
> I don't know if you had specific suggestions in mind but would love to hear
> them!
>
> On Fri, Jul 1, 2016 at 10:44 AM, Neil Conway <ne...@gmail.com>
> wrote:
>
> > Consider the following function signatures from master.cpp:
> >
> > Nothing Master::removeSlave(const Registry::Slave& slave);
> >
> > void Master::removeSlave(Slave* slave, const string& message,
> > Option<Counter> reason);
> >
> > or these from sorter/drf/sorter.hpp:
> >
> > void update(const SlaveID& slaveId, const Resources& resources);
> >
> > void update(
> >       const std::string& name,
> >       const SlaveID& slaveId,
> >       const Resources& oldAllocation,
> >       const Resources& newAllocation);
> >
> > void update(const std::string& name);
> >
> > void update(const std::string& name, double weight);
> >
> > These function names use overloading but the different variants have
> > *completely* different semantics. For example, the variants of
> > update() do the following:
> >
> > (1) Set the weight associated with a role.
> > (2) Change the total pool of resources in the sorter.
> > (3) Update the fair-share associated with a sorter client.
> > (4) Change both the total and allocated resources in the sorter.
> >
> > (For fun, the descriptions and function names are in different orders.
> :) )
> >
> > I'd like to propose that we avoid naming functions in this style: if
> > two functions do fundamentally different things or should be invoked
> > in very different circumstances, we should avoid giving them the same
> > name. We should use overloading when two variants of a function do
> > basically the same thing but differ slightly in the parameters they
> > accept.
> >
> > Comments welcome.
> >
> > Neil
> >
>

Re: Overloading and function names

Posted by Benjamin Mahler <bm...@apache.org>.

The longer names do seem to help for 'update', sounds good! I also agree
about the general rule you're advocating for.

What are your concrete thoughts for removeSlave? Michael's comment was
inaccurate in that the intention of both is to remove the slave from the
registry. Initially, we only had removeSlave(Slave*), and as I introduced
the recovery code, we unfortunately could not call removeSlave(Slave*)
because we didn't have a Slave* (out of technical debt [1] [2]). Therefore
we added an overload in which we passed a registry::Slave. I can see how
this might be confusing but the documentation seems to be clarify it:

  // Remove the slave from the registrar. Called when the slave
  // does not re-register in time after a master failover.
  Nothing removeSlave(const Registry::Slave& slave);

  // Remove the slave from the registrar and from the master's state.
  //
  // TODO(bmahler): 'reason' is optional until MESOS-2317 is resolved.
  void removeSlave(
      Slave* slave,
      const std::string& message,
      Option<process::metrics::Counter> reason = None());

It seems inaccurate to say that these have completely different semantics.
The intention was for these to have the same semantics of removing a slave
from the registry (and all that it entails). I'd sooner try to consolidate
these functions (ideally, a recovered slave could be a Slave*, or we can
just take the SlaveID, etc) than try to come up with a different naming
scheme. Thoughts?

[1] Originally I filed https://issues.apache.org/jira/browse/MESOS-2423 for
the discrepancy.
[2] After Neil and I discussed, we decided to file the more generally
described https://issues.apache.org/jira/browse/MESOS-4048 (I forgot that I
had already filed MESOS-2423 at the time).

On Mon, Jul 4, 2016 at 2:31 AM, Neil Conway <ne...@gmail.com> wrote:

> On Sun, Jul 3, 2016 at 9:10 PM, Benjamin Mahler <bm...@apache.org>
> wrote:
> > To clarify, are you ok with the removeSlave example? It seems to fit your
> > criteria.
>
> I think `removeSlave` is poorly named, for similar reasons -- I just
> talked about `update` in my email for brevity.
>
> > Usually with this kind of email we need concrete suggestions for
> > improvement.
>
> My email included the following, which I think is pretty concrete:
>
> ***
> I'd like to propose that we avoid naming functions in this style: if
> two functions do fundamentally different things or should be invoked
> in very different circumstances, we should avoid giving them the same
> name. We should use overloading when two variants of a function do
> basically the same thing but differ slightly in the parameters they
> accept.
> ***
>
> We can certainly debate the specifics of whether/how to rename
> particular functions, but I think the bigger question is whether we
> want to endorse using overloading to differentiate between functions
> that do fundamentally different things.
>
> > With this I don't think it's as bad as you've described in terms of being
> > able to intuit behavior:
> >
> > sorter->update(client, weight);
> > sorter->update(slaveId, newTotal);
> > sorter->update(client, slaveId, oldAllocation, newAllocation);
> >
> > That being said, if there are more helpful function names let's make some
> > suggestions! The obvious alternative here seems to be verbose names that
> > repeat the arguments?
> >
> > sorter->update_client_weight(client, weight);
> > sorter->update_slave_total(slave, total);
> > sorter->update_client_allocation(client, slaveId, oldAllocation,
> > newAllocation);
> >
> > We tend to avoid this pattern as well, because it leads to redundancy.
>
> I would happily accept a little more redundancy for these examples,
> because I think it improves the clarity of the code. We tend to favor
> clarity and redundancy over brevity in several other situations (e.g.,
> variable names must be entire words, using `load` and `store` for
> atomics rather than operator overloading).
>
> For the specific examples above, I think the longer names (e.g.,
> `update_client_weight`) are an improvement. There's also a compromise
> (adding back the fourth variant of `update` which is a private
> function):
>
> sorter->updateWeight(client, weight);
> sorter->updateTotal(slaveId, total);
> sorter->updateAllocation(client, slaveId, oldAllocation, newAllocation);
> sorter->updateShare(client);
>
> Neil
>

Re: Overloading and function names

Posted by Neil Conway <ne...@gmail.com>.

On Sun, Jul 3, 2016 at 9:10 PM, Benjamin Mahler <bm...@apache.org> wrote:
> To clarify, are you ok with the removeSlave example? It seems to fit your
> criteria.

I think `removeSlave` is poorly named, for similar reasons -- I just
talked about `update` in my email for brevity.

> Usually with this kind of email we need concrete suggestions for
> improvement.

My email included the following, which I think is pretty concrete:

***
I'd like to propose that we avoid naming functions in this style: if
two functions do fundamentally different things or should be invoked
in very different circumstances, we should avoid giving them the same
name. We should use overloading when two variants of a function do
basically the same thing but differ slightly in the parameters they
accept.
***

We can certainly debate the specifics of whether/how to rename
particular functions, but I think the bigger question is whether we
want to endorse using overloading to differentiate between functions
that do fundamentally different things.

> With this I don't think it's as bad as you've described in terms of being
> able to intuit behavior:
>
> sorter->update(client, weight);
> sorter->update(slaveId, newTotal);
> sorter->update(client, slaveId, oldAllocation, newAllocation);
>
> That being said, if there are more helpful function names let's make some
> suggestions! The obvious alternative here seems to be verbose names that
> repeat the arguments?
>
> sorter->update_client_weight(client, weight);
> sorter->update_slave_total(slave, total);
> sorter->update_client_allocation(client, slaveId, oldAllocation,
> newAllocation);
>
> We tend to avoid this pattern as well, because it leads to redundancy.

I would happily accept a little more redundancy for these examples,
because I think it improves the clarity of the code. We tend to favor
clarity and redundancy over brevity in several other situations (e.g.,
variable names must be entire words, using `load` and `store` for
atomics rather than operator overloading).

For the specific examples above, I think the longer names (e.g.,
`update_client_weight`) are an improvement. There's also a compromise
(adding back the fourth variant of `update` which is a private
function):

sorter->updateWeight(client, weight);
sorter->updateTotal(slaveId, total);
sorter->updateAllocation(client, slaveId, oldAllocation, newAllocation);
sorter->updateShare(client);

Neil

Re: Overloading and function names

Posted by Benjamin Mahler <bm...@apache.org>.

To clarify, are you ok with the removeSlave example? It seems to fit your
criteria.

Usually with this kind of email we need concrete suggestions for
improvement. First though I wanted to improve some issues around naming and
the descriptions you've used in the update example to avoid conflating
things further (e.g. there are only 3 signatures, the sorter doesn't know
about "roles", the descriptions are not precisely matching the behavior for
(2) and (4), we can improve argument naming). So we have:

void update(const std::string& client, double newWeight);
void update(const SlaveID& slaveId, const Resources& newTotal);
void update(
      const std::string& client,
      const SlaveID& slaveId,
      const Resources& oldAllocation,
      const Resources& newAllocation);

(1) Update a client's weight.
(2) Update the total resources for a slave.
(3) Update a client's allocation on a slave.

With this I don't think it's as bad as you've described in terms of being
able to intuit behavior:

sorter->update(client, weight);
sorter->update(slaveId, newTotal);
sorter->update(client, slaveId, oldAllocation, newAllocation);

That being said, if there are more helpful function names let's make some
suggestions! The obvious alternative here seems to be verbose names that
repeat the arguments?

sorter->update_client_weight(client, weight);
sorter->update_slave_total(slave, total);
sorter->update_client_allocation(client, slaveId, oldAllocation,
newAllocation);

We tend to avoid this pattern as well, because it leads to redundancy. For
example, we would generally prefer:

map.count(key)
map.contains(key)

and not:

map.countKey(key)
map.containsKey(key)

Another thought is in the direction of exposing Sorter::Client /
Sorter::Slave to make this more straightforward. Something like:

sorter->clients.at(client).set_weight(weight);
sorter->slaves.at(slaveId).set_total(resources);
sorter->slaves.at(slaveId).update_allocation(client, old, new);

I don't know if you had specific suggestions in mind but would love to hear
them!

On Fri, Jul 1, 2016 at 10:44 AM, Neil Conway <ne...@gmail.com> wrote:

> Consider the following function signatures from master.cpp:
>
> Nothing Master::removeSlave(const Registry::Slave& slave);
>
> void Master::removeSlave(Slave* slave, const string& message,
> Option<Counter> reason);
>
> or these from sorter/drf/sorter.hpp:
>
> void update(const SlaveID& slaveId, const Resources& resources);
>
> void update(
>       const std::string& name,
>       const SlaveID& slaveId,
>       const Resources& oldAllocation,
>       const Resources& newAllocation);
>
> void update(const std::string& name);
>
> void update(const std::string& name, double weight);
>
> These function names use overloading but the different variants have
> *completely* different semantics. For example, the variants of
> update() do the following:
>
> (1) Set the weight associated with a role.
> (2) Change the total pool of resources in the sorter.
> (3) Update the fair-share associated with a sorter client.
> (4) Change both the total and allocated resources in the sorter.
>
> (For fun, the descriptions and function names are in different orders. :) )
>
> I'd like to propose that we avoid naming functions in this style: if
> two functions do fundamentally different things or should be invoked
> in very different circumstances, we should avoid giving them the same
> name. We should use overloading when two variants of a function do
> basically the same thing but differ slightly in the parameters they
> accept.
>
> Comments welcome.
>
> Neil
>

Re: Overloading and function names

Posted by Alex Rukletsov <al...@mesosphere.com>.

Very good suggestion, Neil. Let's make sure we mention in out styleguide.

On Fri, Jul 1, 2016 at 7:44 PM, Neil Conway <ne...@gmail.com> wrote:

> Consider the following function signatures from master.cpp:
>
> Nothing Master::removeSlave(const Registry::Slave& slave);
>
> void Master::removeSlave(Slave* slave, const string& message,
> Option<Counter> reason);
>
> or these from sorter/drf/sorter.hpp:
>
> void update(const SlaveID& slaveId, const Resources& resources);
>
> void update(
>       const std::string& name,
>       const SlaveID& slaveId,
>       const Resources& oldAllocation,
>       const Resources& newAllocation);
>
> void update(const std::string& name);
>
> void update(const std::string& name, double weight);
>
> These function names use overloading but the different variants have
> *completely* different semantics. For example, the variants of
> update() do the following:
>
> (1) Set the weight associated with a role.
> (2) Change the total pool of resources in the sorter.
> (3) Update the fair-share associated with a sorter client.
> (4) Change both the total and allocated resources in the sorter.
>
> (For fun, the descriptions and function names are in different orders. :) )
>
> I'd like to propose that we avoid naming functions in this style: if
> two functions do fundamentally different things or should be invoked
> in very different circumstances, we should avoid giving them the same
> name. We should use overloading when two variants of a function do
> basically the same thing but differ slightly in the parameters they
> accept.
>
> Comments welcome.
>
> Neil
>