You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Benjamin Mahler <bm...@apache.org> on 2019/01/15 18:22:20 UTC

Re: Review Request 69082: Correctly propagated `close` failures in some instances.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69082/#review212026
-----------------------------------------------------------



Should we be surfacing a close EINTR as an error or let that be silent?
I think these errors need some message pre-fixing? E.g.

```
Failed to close '3': Bad file number
```

As it stands the error messages will only show "Bad file number" and it won't be clear if the write or close produced this? It's also an issue here between open/write/flush but would be great to resolve this now.


3rdparty/stout/include/stout/os/write.hpp
Line 129 (original), 129 (patched)
<https://reviews.apache.org/r/69082/#comment297616>

    Looks to me like s/result/write would read better?
    
    ```
      Try<Nothing> write = os::write(fd.get(), message);
      
      ...
      
      Try<Nothing> close = os::close(fd.get());
      
      // Propagate `close` errors if `write` succeeded.
      if (close.isError() && !write.isError()) {
        return close;
      }
      
      return write;
    ```
    
    Ditto for the others



3rdparty/stout/include/stout/os/write.hpp
Line 140 (original), 140 (patched)
<https://reviews.apache.org/r/69082/#comment297615>

    newline above this to be consistent with the rest of the formatting in this function? ditto for the others


- Benjamin Mahler


On Jan. 15, 2019, 12:24 a.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69082/
> -----------------------------------------------------------
> 
> (Updated Jan. 15, 2019, 12:24 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Chun-Hung Hsiao, and Alex Clemmer.
> 
> 
> Bugs: MESOS-9331
>     https://issues.apache.org/jira/browse/MESOS-9331
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> When e.g., writing to disk, errors from write might only manifest at
> ::close time. We update some instances to correctly propagate these
> errors instead of dropping them silently. We only propagate the
> ::close error if the write operation succeeded, otherwise we just
> propagate the error from the write operation.
> 
> 
> Diffs
> -----
> 
>   3rdparty/stout/include/stout/os/posix/mktemp.hpp 63b3d1a7720d07f877fa1d4eb7f32a548916637a 
>   3rdparty/stout/include/stout/os/write.hpp f7538f94f5a953a7a90a05bc1d2f138b6c17f814 
>   3rdparty/stout/include/stout/protobuf.hpp eb4adef56f1701e3c101284e05e4e6c66eef9180 
> 
> 
> Diff: https://reviews.apache.org/r/69082/diff/2/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 69082: Correctly propagated `close` failures in some instances.

Posted by Benjamin Mahler <bm...@apache.org>.

> On Jan. 15, 2019, 6:22 p.m., Benjamin Mahler wrote:
> > Should we be surfacing a close EINTR as an error or let that be silent?
> > I think these errors need some message pre-fixing? E.g.
> > 
> > ```
> > Failed to close '3': Bad file number
> > ```
> > 
> > As it stands the error messages will only show "Bad file number" and it won't be clear if the write or close produced this? It's also an issue here between open/write/flush but would be great to resolve this now.
> 
> Benjamin Bannier wrote:
>     If I read the POSIX spec for `close` correctly, if `close` fails with `EINTR` the passed file descriptor is left in an unspecified state so it e.g., would not be safe to assume that pending data was successfully flushed. Am I missing something?

There's quite a rabbit hole of reading around this, but with respect to the POSIX close spec:

> If close() is interrupted by a signal that is to be caught, it shall
> return -1 with errno set to [EINTR] and the state of fildes is unspecified.
> If an I/O error occurred while reading from or writing to the file system
> during close(), it may return -1 with errno set to [EIO]; if this error is
> returned, the state of fildes is unspecified.

My interpretation so far from reading more about this is that the "state of fildes is unspecified" is *only* referring to whether it's closed or open. Howerver, Linus says:

> The error return just tells you that soem error happened on the file: for
> example, in the case of EINTR, the close() may not have flushed all the
> pending data synchronously.

https://lkml.org/lkml/2005/9/10/129

But, even if close returns successfully we cannot assume pending data was flushed (although some of our code in question here uses an explicit flush before closing as suggested), from linux's close man page:

> A successful close does not guarantee that the data has been
> successfully saved to disk, as the kernel uses the buffer cache to
> defer writes.  Typically, filesystems do not flush buffers when a
> file is closed.  If you need to be sure that the data is physically
> stored on the underlying disk, use fsync(2).  (It will depend on the
> disk hardware at this point.)

Based on this, EINTR (which is more like EINPROGRESS in terms of semantics, see https://ewontfix.com/4/) seems to provide the same guarantee as return code of 0:

* `0`: filedes is closed, previous `write()`s are in kernel buffer cache but data might not make it to disk
* `EINTR`: filedes is closed, close call was interrupted by a signal, previous `write()`s are in kernel buffer cache but data might not make it to disk

In the case of an `fsync()` before close, it seems to mean:

* `0`: filedes is closed, nothing in kernel buffer cache, data is already fsynced
* `EINTR`: filedes is closed, nothing in kernel buffer cache, data is already fsynced, close call was interrupted by a signal (not sure if EINTR is possible with an `fsync()` beforehand..)

Looking through the linux source, it calls through these:
https://github.com/torvalds/linux/blob/d7393226d15add056285c8fc86723d54d7e0c77d/fs/open.c#L1152-L1169
https://github.com/torvalds/linux/blob/903b77c631673eeec9e9114e9524171cdf9a2646/fs/file.c#L617-L641
https://github.com/torvalds/linux/blob/ad1d69735878a6bf797705b5d2a20316d35e1113/fs/open.c#L1126-L1150
and the interruptible part appears to be the flush here:
https://github.com/torvalds/linux/blob/ad1d69735878a6bf797705b5d2a20316d35e1113/fs/open.c#L1140

Trying to figure out what this flush is, I see this: https://lwn.net/Articles/576478/

> In fact, it is difficult to even return EINTR from close() on Linux, according to Christoph Hellwig. If the driver or filesystem's release() method returns an error, it is explicitly ignored. The only path that would allow a driver to return EINTR is if it provides a flush() method that does so. Hellwig plans to post a patch that would enforce a no-EINTR policy on that path as well.

> If EINTR can never be returned, there is no real reason to map it to EINPROGRESS in glibc. But, since glibc may be used on an older kernel that can return EINTR in some rare situations, mapping it to something probably makes sense. That could be EINPROGRESS or, perhaps better still, just zero for success, as suggested by Rich Felker. There really isn't much the application programmer can do if close() returns an error.

I'm not sure under which circumstances flushing is occurring on close (not sure which drivers provide the flush method), but this makes it sound rare / driver dependent, and it sounds like the plan would make EINTR impossible.

So, I guess surfacing EINTR up to the caller sounds like the simplest thing to do, and we'll have to see whether this actually occurs in practice.


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69082/#review212026
-----------------------------------------------------------


On Jan. 16, 2019, 1:06 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69082/
> -----------------------------------------------------------
> 
> (Updated Jan. 16, 2019, 1:06 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Chun-Hung Hsiao, and Alex Clemmer.
> 
> 
> Bugs: MESOS-9331
>     https://issues.apache.org/jira/browse/MESOS-9331
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> When e.g., writing to disk, errors from write might only manifest at
> ::close time. We update some instances to correctly propagate these
> errors instead of dropping them silently. We only propagate the
> ::close error if the write operation succeeded, otherwise we just
> propagate the error from the write operation.
> 
> 
> Diffs
> -----
> 
>   3rdparty/stout/include/stout/os/posix/mktemp.hpp 63b3d1a7720d07f877fa1d4eb7f32a548916637a 
>   3rdparty/stout/include/stout/os/write.hpp f7538f94f5a953a7a90a05bc1d2f138b6c17f814 
>   3rdparty/stout/include/stout/protobuf.hpp eb4adef56f1701e3c101284e05e4e6c66eef9180 
> 
> 
> Diff: https://reviews.apache.org/r/69082/diff/3/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 69082: Correctly propagated `close` failures in some instances.

Posted by Benjamin Bannier <be...@mesosphere.io>.

> On Jan. 15, 2019, 7:22 p.m., Benjamin Mahler wrote:
> > Should we be surfacing a close EINTR as an error or let that be silent?
> > I think these errors need some message pre-fixing? E.g.
> > 
> > ```
> > Failed to close '3': Bad file number
> > ```
> > 
> > As it stands the error messages will only show "Bad file number" and it won't be clear if the write or close produced this? It's also an issue here between open/write/flush but would be great to resolve this now.

If I read the POSIX spec for `close` correctly, if `close` fails with `EINTR` the passed file descriptor is left in an unspecified state so it e.g., would not be safe to assume that pending data was successfully flushed. Am I missing something?


> On Jan. 15, 2019, 7:22 p.m., Benjamin Mahler wrote:
> > 3rdparty/stout/include/stout/os/write.hpp
> > Line 129 (original), 129 (patched)
> > <https://reviews.apache.org/r/69082/diff/2/?file=2114349#file2114349line129>
> >
> >     Looks to me like s/result/write would read better?
> >     
> >     ```
> >       Try<Nothing> write = os::write(fd.get(), message);
> >       
> >       ...
> >       
> >       Try<Nothing> close = os::close(fd.get());
> >       
> >       // Propagate `close` errors if `write` succeeded.
> >       if (close.isError() && !write.isError()) {
> >         return close;
> >       }
> >       
> >       return write;
> >     ```
> >     
> >     Ditto for the others

Good idea. Like you showed in your example this requires some name disambiguation, but ultimately reads better.


> On Jan. 15, 2019, 7:22 p.m., Benjamin Mahler wrote:
> > 3rdparty/stout/include/stout/os/write.hpp
> > Line 140 (original), 140 (patched)
> > <https://reviews.apache.org/r/69082/diff/2/?file=2114349#file2114349line140>
> >
> >     newline above this to be consistent with the rest of the formatting in this function? ditto for the others

Went with freestanding calls to `close`.


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69082/#review212026
-----------------------------------------------------------


On Jan. 16, 2019, 2:06 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69082/
> -----------------------------------------------------------
> 
> (Updated Jan. 16, 2019, 2:06 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Chun-Hung Hsiao, and Alex Clemmer.
> 
> 
> Bugs: MESOS-9331
>     https://issues.apache.org/jira/browse/MESOS-9331
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> When e.g., writing to disk, errors from write might only manifest at
> ::close time. We update some instances to correctly propagate these
> errors instead of dropping them silently. We only propagate the
> ::close error if the write operation succeeded, otherwise we just
> propagate the error from the write operation.
> 
> 
> Diffs
> -----
> 
>   3rdparty/stout/include/stout/os/posix/mktemp.hpp 63b3d1a7720d07f877fa1d4eb7f32a548916637a 
>   3rdparty/stout/include/stout/os/write.hpp f7538f94f5a953a7a90a05bc1d2f138b6c17f814 
>   3rdparty/stout/include/stout/protobuf.hpp eb4adef56f1701e3c101284e05e4e6c66eef9180 
> 
> 
> Diff: https://reviews.apache.org/r/69082/diff/3/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>