You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@mesos.apache.org by "Michael Park (JIRA)" <ji...@apache.org> on 2015/04/17 23:14:00 UTC

[jira] [Updated] (MESOS-2629) Update style guide to disallow capture by reference of temporaries

     [ https://issues.apache.org/jira/browse/MESOS-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Park updated MESOS-2629:
--------------------------------
    Description: 
We modify the style guide to disallow constant references to temporaries as a whole. This means disallowing both *Case 1* and *Case 2* below.

**Background**:
- *Case 1*: Constant references to simple expression temporaries do extend the lifetime of the temporary till end of function scope:
  - a) Temporary returned by function:
  ```c++
  // See full example below.
  T f(const char* s) { return T(s); }

  {
    const T& good = f("Ok");
    // use of good is ok.
  }
  ```
  - b) Temporary constructed as simple expression:
  ```c++
  // See full example below.
  {
    const T& good = T("Ok");
    // use of good is ok.
  }
  ```
- *Case 2*: Constant references to expressions that result in a reference to a temporary do not extend the lifetime of the temporary:
  - a) Temporary returned by function:
  ```c++
  // See full example below.
  T f(const char* s) { return T(s); }

  {
    const T& bad = f("Bad!").Member();
    // use of bad is invalid.
  }
  ```
  - b) Temporary constructed as simple expression:
  ```c++
  // See full example below.
  {
    const T& bad = T("Bad!").Member();
    // use of bad is invalid.
  }
  ```
- *Mesos Case*:
  - In Mesos we use Future<T> a lot. Many of our functions return Futures by value:
  ```c++
  class Socket {
    Future<Socket> accept();
    Future<size_t> recv(char* data, size_t size);
    ...
  }
  ```
  - Sometimes we capture these Futures:
  ```c++
  {
    const Future<Socket>& accepted = socket.accept(); // Valid c++, propose we disallow.
  }
  ```
  - Sometimes we chain these Futures:
  ```c++
  {
    socket.accept().then(lambda::bind(_accepted)); // Temporary will be valid during 'then' expression evaluation.
  }
  ```
  - Sometimes we do both:
  ```c++
  {
    const Future<Socket>& accepted = socket.accept().then(lambda::bind(_accepted)); // Dangerous! 'accepted' lifetime will not be valid till end of scope. Disallow!
  }
  ```
**Reasoning**:
- Although *Case 1* is ok, and considered a feature (http://herbsutter.com/2008/01/01/gotw-88-a-candidate-for-the-most-important-const/), *Case 2* is extremely dangerous and leads to hard to track bugs.
- If we explicitly allow *Case 1*, but disallow *Case 2*, then my worry is that someone coming along to maintain the code later on may accidentally turn *Case 1* into *Case 2*, without recognizing the severity of this mistake. For example:
```c++
// Original code:
const T& val = T();
std::cout << val << std::endl;
// New code:
const T& val = T().removeWhiteSpace();
std::cout << val << std::endl; // val could be corrupted since the destructor has been invoked and T's memory freed.
```
- If we disallow both cases: it will be easier to catch these mistakes early on in code reviews (and avoid these painful bugs), at the same cost of introducing a new style guide rule.

**Performance Implications**:
- BenH suggests c++ developers are commonly taught to capture by constant reference to hint to the compiler that the copy can be elided.
- Modern compilers use a Data Flow Graph to make optimizations such as
  - **In-place-construction**: leveraged by RVO and NRVO to construct the object in place on the stack. Similar to "*Placement new*": http://en.wikipedia.org/wiki/Placement_syntax
  - **RVO** (Return Value Optimization): http://en.wikipedia.org/wiki/Return_value_optimization
  - **NRVO** (Named Return Value Optimization): https://msdn.microsoft.com/en-us/library/ms364057%28v=vs.80%29.aspx
- Since modern compilers perform these optimizations, we no longer need to 'hint' to the compiler that the copies can be elided.

**Action Items**:
- Discussion on dev-list with community.
- Remove all constant references to temporaries in code base.
- Modify style guide with examples of bad behavior, and reasoning for the rule.

**Example program**:
```c++
#include <stdio.h>

class T {
public:
  T(const char* str) : Str(str) {
    printf("+ T(%s)\n", Str);
  }
  ~T() {
    printf("- T(%s)\n", Str);
  }
  const T& Member() const
  {
    return *this;
  }
private:
  const char* Str;
};

T f(const char* s) { return T(s); }

int main() {
  const T& good = T("Ok");
  const T& good_f = f("Ok function");

  const T& bad = T("Bad!").Member();
  const T& bad_f = T("Bad function!").Member();

  printf("End of function scope...\n");
}
```
Output:
```
+ T(Ok)
+ T(Ok function)
+ T(Bad!)
- T(Bad!)
+ T(Bad function!)
- T(Bad function!)
End of function scope...
- T(Ok function)
- T(Ok)
```

  was:Update the style guide according to the epic.


> Update style guide to disallow capture by reference of temporaries
> ------------------------------------------------------------------
>
>                 Key: MESOS-2629
>                 URL: https://issues.apache.org/jira/browse/MESOS-2629
>             Project: Mesos
>          Issue Type: Task
>            Reporter: Joris Van Remoortere
>            Assignee: Joris Van Remoortere
>
> We modify the style guide to disallow constant references to temporaries as a whole. This means disallowing both *Case 1* and *Case 2* below.
> **Background**:
> - *Case 1*: Constant references to simple expression temporaries do extend the lifetime of the temporary till end of function scope:
>   - a) Temporary returned by function:
>   ```c++
>   // See full example below.
>   T f(const char* s) { return T(s); }
>   {
>     const T& good = f("Ok");
>     // use of good is ok.
>   }
>   ```
>   - b) Temporary constructed as simple expression:
>   ```c++
>   // See full example below.
>   {
>     const T& good = T("Ok");
>     // use of good is ok.
>   }
>   ```
> - *Case 2*: Constant references to expressions that result in a reference to a temporary do not extend the lifetime of the temporary:
>   - a) Temporary returned by function:
>   ```c++
>   // See full example below.
>   T f(const char* s) { return T(s); }
>   {
>     const T& bad = f("Bad!").Member();
>     // use of bad is invalid.
>   }
>   ```
>   - b) Temporary constructed as simple expression:
>   ```c++
>   // See full example below.
>   {
>     const T& bad = T("Bad!").Member();
>     // use of bad is invalid.
>   }
>   ```
> - *Mesos Case*:
>   - In Mesos we use Future<T> a lot. Many of our functions return Futures by value:
>   ```c++
>   class Socket {
>     Future<Socket> accept();
>     Future<size_t> recv(char* data, size_t size);
>     ...
>   }
>   ```
>   - Sometimes we capture these Futures:
>   ```c++
>   {
>     const Future<Socket>& accepted = socket.accept(); // Valid c++, propose we disallow.
>   }
>   ```
>   - Sometimes we chain these Futures:
>   ```c++
>   {
>     socket.accept().then(lambda::bind(_accepted)); // Temporary will be valid during 'then' expression evaluation.
>   }
>   ```
>   - Sometimes we do both:
>   ```c++
>   {
>     const Future<Socket>& accepted = socket.accept().then(lambda::bind(_accepted)); // Dangerous! 'accepted' lifetime will not be valid till end of scope. Disallow!
>   }
>   ```
> **Reasoning**:
> - Although *Case 1* is ok, and considered a feature (http://herbsutter.com/2008/01/01/gotw-88-a-candidate-for-the-most-important-const/), *Case 2* is extremely dangerous and leads to hard to track bugs.
> - If we explicitly allow *Case 1*, but disallow *Case 2*, then my worry is that someone coming along to maintain the code later on may accidentally turn *Case 1* into *Case 2*, without recognizing the severity of this mistake. For example:
> ```c++
> // Original code:
> const T& val = T();
> std::cout << val << std::endl;
> // New code:
> const T& val = T().removeWhiteSpace();
> std::cout << val << std::endl; // val could be corrupted since the destructor has been invoked and T's memory freed.
> ```
> - If we disallow both cases: it will be easier to catch these mistakes early on in code reviews (and avoid these painful bugs), at the same cost of introducing a new style guide rule.
> **Performance Implications**:
> - BenH suggests c++ developers are commonly taught to capture by constant reference to hint to the compiler that the copy can be elided.
> - Modern compilers use a Data Flow Graph to make optimizations such as
>   - **In-place-construction**: leveraged by RVO and NRVO to construct the object in place on the stack. Similar to "*Placement new*": http://en.wikipedia.org/wiki/Placement_syntax
>   - **RVO** (Return Value Optimization): http://en.wikipedia.org/wiki/Return_value_optimization
>   - **NRVO** (Named Return Value Optimization): https://msdn.microsoft.com/en-us/library/ms364057%28v=vs.80%29.aspx
> - Since modern compilers perform these optimizations, we no longer need to 'hint' to the compiler that the copies can be elided.
> **Action Items**:
> - Discussion on dev-list with community.
> - Remove all constant references to temporaries in code base.
> - Modify style guide with examples of bad behavior, and reasoning for the rule.
> **Example program**:
> ```c++
> #include <stdio.h>
> class T {
> public:
>   T(const char* str) : Str(str) {
>     printf("+ T(%s)\n", Str);
>   }
>   ~T() {
>     printf("- T(%s)\n", Str);
>   }
>   const T& Member() const
>   {
>     return *this;
>   }
> private:
>   const char* Str;
> };
> T f(const char* s) { return T(s); }
> int main() {
>   const T& good = T("Ok");
>   const T& good_f = f("Ok function");
>   const T& bad = T("Bad!").Member();
>   const T& bad_f = T("Bad function!").Member();
>   printf("End of function scope...\n");
> }
> ```
> Output:
> ```
> + T(Ok)
> + T(Ok function)
> + T(Bad!)
> - T(Bad!)
> + T(Bad function!)
> - T(Bad function!)
> End of function scope...
> - T(Ok function)
> - T(Ok)
> ```



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)