Posted to dev@thrift.apache.org by "Tim Wilson-Brown (JIRA)" <ji...@apache.org> on 2010/04/01 12:40:27 UTC

[jira] Created: (THRIFT-748) C++ TSocket default linger setting breaks forked parent process

C++ TSocket default linger setting breaks forked parent process
---------------------------------------------------------------

                 Key: THRIFT-748
                 URL: https://issues.apache.org/jira/browse/THRIFT-748
             Project: Thrift
          Issue Type: Bug
          Components: Library (C++)
    Affects Versions: 0.2, 0.3
         Environment: Cygwin 1.7.1 on Windows XP SP3, Thrift 0.2.0 & r760184 & Trunk
            Reporter: Tim Wilson-Brown
            Priority: Minor


If a Thrift C++ Client opens a TSocket, then calls fork(), the child process can terminate the parent process's connection by deleting its copy of the parent TSocket.

In particular,
TSocket->close() calls shutdown(socket_,SHUT_RDWR) before close(socket_)


Discussion:

This behaviour is inconsistent, as it is:
  * unlike the Unix socket close() semantics - close() only affects the process that calls it, and the socket is shut down only when all copies of it are closed
  * unlike the Python and Java code, which appears to use only close()
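The difference between the two calls can be observed without Thrift. The following is a minimal sketch, assuming a connected AF_UNIX socketpair() stands in for the TCP connection; peer_survives_child_close is a hypothetical helper name, not part of Thrift.

```cpp
#include <cassert>
#include <cerrno>
#include <stdexcept>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper for illustration. Forks a child that disposes of its
// inherited copy of a connected socket, then reports whether the peer still
// sees the connection as open.
bool peer_survives_child_close(bool use_shutdown) {
  int sv[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
    throw std::runtime_error("socketpair failed");

  pid_t pid = fork();
  if (pid < 0) {
    close(sv[0]);
    close(sv[1]);
    throw std::runtime_error("fork failed");
  }
  if (pid == 0) {
    // Child: get rid of both inherited descriptors.
    close(sv[1]);
    if (use_shutdown) shutdown(sv[0], SHUT_RDWR);  // acts on the shared socket
    close(sv[0]);                                  // acts on this process only
    _exit(0);
  }

  int status = 0;
  waitpid(pid, &status, 0);
  assert(WIFEXITED(status));

  // Parent: a non-blocking read on the peer end returns 0 (EOF) if the
  // connection was shut down, or fails with EAGAIN if it is open but idle.
  char byte;
  ssize_t n = recv(sv[1], &byte, 1, MSG_DONTWAIT);
  bool open_but_idle = (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK));

  close(sv[0]);
  close(sv[1]);
  return open_but_idle;
}
```

Calling peer_survives_child_close(false) exercises the close()-only path, and peer_survives_child_close(true) exercises the shutdown()-then-close() path described above.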

This design choice makes it really difficult to program a Thrift client that forks other clients in C++, as the first process to call TSocket->close() terminates all copies of the connection. The child process is unable to clean up its copy of the parent's connection - this is a particular issue when using shared_ptr because the child process cannot even exit().

However, the design choice also prevents deadlock/slowdown issues where a forked process holds open a copy of the parent's Thrift connections.


Options:

  * The most functional resolution would be to implement TSocket->setShutdownOnClose() that allows Thrift users to set their preference for shutdown on socket close or delete. However, this change may also need to be made to other language libraries.

  * Removing shutdown() from TSocket->close() could break programs that expect TSockets not to stay open if children are still running.
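A rough sketch of what the first option could look like, written against a bare file descriptor. FdSocket is a hypothetical class for illustration and is not Thrift's TSocket; only the setShutdownOnClose() name comes from the proposal above, and the default of true is an assumption chosen to match the current shutdown-before-close behaviour.

```cpp
#include <cassert>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical wrapper, not Thrift's TSocket: owns a descriptor and lets the
// caller choose whether close() also shuts the connection down.
class FdSocket {
 public:
  explicit FdSocket(int fd) : fd_(fd), shutdownOnClose_(true) {}
  ~FdSocket() { close(); }
  FdSocket(const FdSocket&) = delete;
  FdSocket& operator=(const FdSocket&) = delete;

  // true: shutdown() then close(), ending the connection for every copy.
  // false: close() only, releasing just this process's descriptor.
  void setShutdownOnClose(bool on) { shutdownOnClose_ = on; }
  bool isOpen() const { return fd_ >= 0; }

  void close() {
    if (fd_ < 0) return;
    if (shutdownOnClose_) ::shutdown(fd_, SHUT_RDWR);
    ::close(fd_);
    fd_ = -1;
  }

 private:
  int fd_;
  bool shutdownOnClose_;
};
```

A forked child would call setShutdownOnClose(false) on its inherited copy before destroying it, leaving the parent's connection usable.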


TODO:
  * Confirm issue on Linux - see attached test code
  * Decide how to resolve issue
  * Create Patch - see attached TSocket.h & TSocket.cpp from Thrift 0.2.0 (I don't know how to generate patches but I'm happy to try and work it out)


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-748) C++ TSocket default linger setting breaks forked parent process

Posted by "Tim Wilson-Brown (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Wilson-Brown updated THRIFT-748:
------------------------------------

    Description: 
If a Thrift C++ Client opens a TSocket, writes some data, then calls fork(), the child process can terminate the parent process's connection by deleting its copy of the parent TSocket.

In particular,
the default setting of lingerOn_ = 1 causes an RST to be sent by close(socket_) in TSocket->close()

Discussion:

This behaviour is identical to the behaviour of unix sockets when SO_LINGER is set (implementations vary).
However, SO_LINGER is off by default for sockets, not on, so TSocket's behaviour is unexpected.
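The socket default can be checked directly. A minimal sketch using plain POSIX calls rather than TSocket; fresh_socket_lingers is a hypothetical helper name.

```cpp
#include <cassert>
#include <netinet/in.h>
#include <stdexcept>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: creates a fresh TCP socket, reads back its SO_LINGER
// setting, and reports whether lingering is switched on.
bool fresh_socket_lingers() {
  int fd = socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0) throw std::runtime_error("socket failed");

  struct linger lg;
  socklen_t len = sizeof(lg);
  int rc = getsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, &len);
  close(fd);
  if (rc != 0) throw std::runtime_error("getsockopt failed");

  // l_onoff is zero when lingering is off; some platforms report a
  // nonzero value other than 1 when it is on.
  return lg.l_onoff != 0;
}
```

On a platform with the usual default, fresh_socket_lingers() returns false, whereas TSocket starts with lingerOn_ = 1.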

This design choice makes it really difficult to program a Thrift client that forks other clients in C++, as the first process to call TSocket->close() terminates all copies of the connection. Every process has to call TSocket->setLinger(0,0) or TSocket->setLinger(1,timeout) before deleting the TSocket, closing the TSocket, or exiting. (This workaround only succeeds with the suggested fix in [#THRIFT-747] ).

However, the design choice also prevents deadlock/slowdown issues where a forked process holds open a copy of the parent's Thrift connections. It also makes close non-blocking, which is ideal in a destructor.

The design choice may also be an attempt to implement the block-until-sent-then-close behaviour described in http://blog.netherlabs.nl/articles/2009/01/18/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable
However, the default linger interval of 0 turns the linger setting into a hard reset.
In the absence of linger, the kernel can usually send small Thrift messages by itself.


Options:

  * Change the default lingerOn to 0 - rely on the kernel to resend a limited number of times
  * Change the default lingerVal to > 0
    - a large value like INT_MAX would match the default connection, send, and recv 'no timeout' behaviour
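The options map onto the three commonly described SO_LINGER modes. The sketch below only builds and classifies the struct linger values; it does not touch Thrift, and make_linger, classify, and CloseMode are hypothetical names. The mode descriptions follow the usual socket semantics, which, as noted above, vary between implementations.

```cpp
#include <cassert>
#include <climits>
#include <sys/socket.h>

// What close() is commonly expected to do for a given SO_LINGER value.
enum class CloseMode {
  Background,     // linger off: close() returns, kernel keeps sending
  HardReset,      // linger on, interval 0: unsent data dropped, RST sent
  BlockingLinger  // linger on, interval > 0: close() waits up to the interval
};

// Builds an SO_LINGER value from an (on, seconds) pair, in the same argument
// order as the setLinger(on, value) calls quoted in this report.
struct linger make_linger(bool on, int seconds) {
  assert(seconds >= 0);
  struct linger lg;
  lg.l_onoff = on ? 1 : 0;
  lg.l_linger = seconds;  // field width is platform-dependent
  return lg;
}

CloseMode classify(const struct linger& lg) {
  if (lg.l_onoff == 0) return CloseMode::Background;
  return lg.l_linger == 0 ? CloseMode::HardReset : CloseMode::BlockingLinger;
}
```

Here make_linger(true, 0) corresponds to the current TSocket default, make_linger(false, 0) to the first option, and make_linger(true, INT_MAX) to the second.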

TODO:
  * Confirm issue on Linux - see attached test code
  * Decide if a change to the defaults is needed
  * Document workaround after resolution of [#THRIFT-747] - call TSocket->setLinger(0,0) or (1,timeout) if forking



Added notes about article at http://blog.netherlabs.nl/articles/2009/01/18/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable describing reliable TCP communication

> C++ TSocket default linger setting breaks forked parent process
> ---------------------------------------------------------------
>
>                 Key: THRIFT-748
>                 URL: https://issues.apache.org/jira/browse/THRIFT-748
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (C++)
>    Affects Versions: 0.2, 0.3
>         Environment: Cygwin 1.7.1 on Windows XP SP3, Thrift 0.2.0 & r760184 & Trunk
>            Reporter: Tim Wilson-Brown
>            Priority: Trivial
>         Attachments: thrift_linger_example.cpp
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>



[jira] Updated: (THRIFT-748) C++ TSocket default linger setting breaks forked parent process

Posted by "Tim Wilson-Brown (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Wilson-Brown updated THRIFT-748:
------------------------------------

    Description: 
If a Thrift C++ Client opens a TSocket, writes some data, then calls fork(), the child process can terminate the parent process's connection by deleting its copy of the parent TSocket.

In particular,
the default setting of lingerOn_ = 1 causes an RST to be sent by close(socket_) in TSocket->close()

Discussion:

This design choice makes it really difficult to program a Thrift client that forks other clients in C++, as the first process to call TSocket->close() terminates all copies of the connection. Every process has to call TSocket->setLinger(0,0) before deleting the TSocket, closing the TSocket, or exiting. (This workaround only succeeds with the suggested fix in [#THRIFT-747] ).

However, the design choice also prevents deadlock/slowdown issues where a forked process holds open a copy of the parent's Thrift connections. It also makes close non-blocking, which is ideal in a destructor.


Options:

Do we want to change the default? What is linger useful for?

TODO:
  * Confirm issue on Linux - see attached test code
  * Decide if a code change is needed
  * Document workaround after resolution of [#THRIFT-747] - call TSocket->setLinger(0,0) if forking



       Priority: Trivial  (was: Minor)

Edited clone of [#THRIFT-747] for linger issue




[jira] Updated: (THRIFT-748) C++ TSocket default linger setting breaks forked parent process

Posted by "Tim Wilson-Brown (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Wilson-Brown updated THRIFT-748:
------------------------------------

    Attachment: thrift_linger_example.cpp

Linger Issue Test Code




[jira] Updated: (THRIFT-748) C++ TSocket default linger setting breaks forked parent process

Posted by "Tim Wilson-Brown (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Wilson-Brown updated THRIFT-748:
------------------------------------

    Description: 
If a Thrift C++ Client opens a TSocket, writes some data, then calls fork(), the child process can terminate the parent process's connection by deleting its copy of the parent TSocket.

In particular,
the default setting of lingerOn_ = 1 causes an RST to be sent by close(socket_) in TSocket->close()

Discussion:

This behaviour is identical to the behaviour of unix sockets when SO_LINGER is set (implementations vary).
However, SO_LINGER is off by default for sockets, not on, so TSocket's behaviour is unexpected.

This design choice makes it really difficult to program a Thrift client that forks other clients in C++, as the first process to call TSocket->close() terminates all copies of the connection. Every process has to call TSocket->setLinger(0,0) before deleting the TSocket, closing the TSocket, or exiting. (This workaround only succeeds with the suggested fix in [#THRIFT-747] ).

However, the design choice also prevents deadlock/slowdown issues where a forked process holds open a copy of the parent's Thrift connections. It also makes close non-blocking, which is ideal in a destructor.


Options:

Do we want to change the default? What is linger useful for?

TODO:
  * Confirm issue on Linux - see attached test code
  * Decide if a code change is needed
  * Document workaround after resolution of [#THRIFT-747] - call TSocket->setLinger(0,0) if forking



Updated description with standard unix socket behaviour

