You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@thrift.apache.org by abadcafe <gi...@git.apache.org> on 2015/04/03 16:28:41 UTC

[GitHub] thrift pull request: THRIFT-3080: fix connection leak of C++ Nonbl...

GitHub user abadcafe opened a pull request:

    https://github.com/apache/thrift/pull/422

    THRIFT-3080: fix connection leak of C++ Nonblocking Server while huge nu...

    ...mber connections are accepted and unix socket stream fd is busy.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ops-baidu/thrift THRIFT-3080

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/thrift/pull/422.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #422
    
----
commit c5d6b9e6fb3920433914140bc94e3158ec5eecfc
Author: abadcafe <fw...@live.com>
Date:   2015-04-03T14:23:04Z

    THRIFT-3080: fix connection leak of C++ Nonblocking Server while huge number connections are accepted and unix socket stream fd is busy.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] thrift pull request: THRIFT-3080: fix connection leak of C++ Nonbl...

Posted by abadcafe <gi...@git.apache.org>.
Github user abadcafe commented on a diff in the pull request:

    https://github.com/apache/thrift/pull/422#discussion_r28510532
  
    --- Diff: lib/cpp/src/thrift/server/TNonblockingServer.cpp ---
    @@ -1393,9 +1397,42 @@ bool TNonblockingIOThread::notify(TNonblockingServer::TConnection* conn) {
         return false;
       }
     
    -  const int kSize = sizeof(conn);
    -  if (send(fd, const_cast_sockopt(&conn), kSize, 0) != kSize) {
    -    return false;
    +  fd_set wfds, efds;
    +  int ret = -1;
    +  int kSize = sizeof(conn);
    +  const char * pos = (const char *)const_cast_sockopt(&conn);
    +
    +  while (kSize > 0) {
    +    FD_ZERO(&wfds);
    +    FD_ZERO(&efds);
    +    FD_SET(fd, &wfds);
    +    FD_SET(fd, &efds);
    +    ret = select(fd + 1, NULL, &wfds, &efds, NULL);
    --- End diff --
    
    fd + 1 is select()'s convention which is the highest-numbered file descriptor in any of the three sets plus 1. There is only one fd here, so just plus 1 is safe.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] thrift pull request: THRIFT-3080: fix connection leak of C++ Nonbl...

Posted by abadcafe <gi...@git.apache.org>.
Github user abadcafe commented on a diff in the pull request:

    https://github.com/apache/thrift/pull/422#discussion_r28532945
  
    --- Diff: lib/cpp/src/thrift/server/TNonblockingServer.cpp ---
    @@ -1393,9 +1397,42 @@ bool TNonblockingIOThread::notify(TNonblockingServer::TConnection* conn) {
         return false;
       }
     
    -  const int kSize = sizeof(conn);
    -  if (send(fd, const_cast_sockopt(&conn), kSize, 0) != kSize) {
    -    return false;
    +  fd_set wfds, efds;
    +  int ret = -1;
    +  int kSize = sizeof(conn);
    +  const char * pos = (const char *)const_cast_sockopt(&conn);
    +
    +  while (kSize > 0) {
    +    FD_ZERO(&wfds);
    +    FD_ZERO(&efds);
    +    FD_SET(fd, &wfds);
    +    FD_SET(fd, &efds);
    +    ret = select(fd + 1, NULL, &wfds, &efds, NULL);
    --- End diff --
    
    In early windows(<= XP), there is no poll() function. The implement of THRIFT_POLL in windows XP was simulated by select(). The problem is the fake poll() lacks POLL_HUP/POLL_ERR state, I am afraid it was not function completed?
    
    On the other side, select() is common enough on every platform supported by thrift.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] thrift pull request: THRIFT-3080: fix connection leak of C++ Nonbl...

Posted by jeking3 <gi...@git.apache.org>.
Github user jeking3 commented on a diff in the pull request:

    https://github.com/apache/thrift/pull/422#discussion_r28515512
  
    --- Diff: lib/cpp/src/thrift/server/TNonblockingServer.cpp ---
    @@ -1393,9 +1397,42 @@ bool TNonblockingIOThread::notify(TNonblockingServer::TConnection* conn) {
         return false;
       }
     
    -  const int kSize = sizeof(conn);
    -  if (send(fd, const_cast_sockopt(&conn), kSize, 0) != kSize) {
    -    return false;
    +  fd_set wfds, efds;
    +  int ret = -1;
    +  int kSize = sizeof(conn);
    +  const char * pos = (const char *)const_cast_sockopt(&conn);
    +
    +  while (kSize > 0) {
    +    FD_ZERO(&wfds);
    +    FD_ZERO(&efds);
    +    FD_SET(fd, &wfds);
    +    FD_SET(fd, &efds);
    +    ret = select(fd + 1, NULL, &wfds, &efds, NULL);
    --- End diff --
    
    Fair enough, I read the first page of the manual and saw nfds and made an assumption!
    
    Should we be using THRIFT_POLL here similar to how TSocket does so that it is [more] compatible cross-platform?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] thrift issue #422: THRIFT-3080: fix connection leak of C++ Nonblocking Serve...

Posted by lyhuzi <gi...@git.apache.org>.
Github user lyhuzi commented on the issue:

    https://github.com/apache/thrift/pull/422
  
    为什么不用ul_swriteo_ms_ex2


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] thrift pull request: THRIFT-3080: fix connection leak of C++ Nonbl...

Posted by jeking3 <gi...@git.apache.org>.
Github user jeking3 commented on a diff in the pull request:

    https://github.com/apache/thrift/pull/422#discussion_r27765152
  
    --- Diff: lib/cpp/src/thrift/server/TNonblockingServer.cpp ---
    @@ -1393,9 +1394,39 @@ bool TNonblockingIOThread::notify(TNonblockingServer::TConnection* conn) {
         return false;
       }
     
    -  const int kSize = sizeof(conn);
    -  if (send(fd, const_cast_sockopt(&conn), kSize, 0) != kSize) {
    -    return false;
    +  int ret = -1;
    +  struct pollfd pfd = {fd, POLLOUT, 0};
    +  int kSize = sizeof(conn);
    +  const char * pos = (const char *)const_cast_sockopt(&conn);
    +
    +  while (kSize > 0) {
    +    pfd.revents = 0;
    +    ret = poll(&pfd, 1, -1);
    --- End diff --
    
    Should this be using the THRIFT_POLL and other THRIFT_ macros for portability?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] thrift pull request: THRIFT-3080: fix connection leak of C++ Nonbl...

Posted by abadcafe <gi...@git.apache.org>.
Github user abadcafe closed the pull request at:

    https://github.com/apache/thrift/pull/422


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] thrift pull request: THRIFT-3080: fix connection leak of C++ Nonbl...

Posted by abadcafe <gi...@git.apache.org>.
Github user abadcafe commented on a diff in the pull request:

    https://github.com/apache/thrift/pull/422#discussion_r27766767
  
    --- Diff: lib/cpp/src/thrift/server/TNonblockingServer.cpp ---
    @@ -1393,9 +1394,39 @@ bool TNonblockingIOThread::notify(TNonblockingServer::TConnection* conn) {
         return false;
       }
     
    -  const int kSize = sizeof(conn);
    -  if (send(fd, const_cast_sockopt(&conn), kSize, 0) != kSize) {
    -    return false;
    +  int ret = -1;
    +  struct pollfd pfd = {fd, POLLOUT, 0};
    +  int kSize = sizeof(conn);
    +  const char * pos = (const char *)const_cast_sockopt(&conn);
    +
    +  while (kSize > 0) {
    +    pfd.revents = 0;
    +    ret = poll(&pfd, 1, -1);
    --- End diff --
    
    Sure.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] thrift issue #422: THRIFT-3080: fix connection leak of C++ Nonblocking Serve...

Posted by lyhuzi <gi...@git.apache.org>.
Github user lyhuzi commented on the issue:

    https://github.com/apache/thrift/pull/422
  
    why not use ul_swriteo_ms_ex2


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] thrift pull request: THRIFT-3080: fix connection leak of C++ Nonbl...

Posted by jeking3 <gi...@git.apache.org>.
Github user jeking3 commented on a diff in the pull request:

    https://github.com/apache/thrift/pull/422#discussion_r28375162
  
    --- Diff: lib/cpp/src/thrift/server/TNonblockingServer.cpp ---
    @@ -1393,9 +1397,42 @@ bool TNonblockingIOThread::notify(TNonblockingServer::TConnection* conn) {
         return false;
       }
     
    -  const int kSize = sizeof(conn);
    -  if (send(fd, const_cast_sockopt(&conn), kSize, 0) != kSize) {
    -    return false;
    +  fd_set wfds, efds;
    +  int ret = -1;
    +  int kSize = sizeof(conn);
    +  const char * pos = (const char *)const_cast_sockopt(&conn);
    +
    +  while (kSize > 0) {
    +    FD_ZERO(&wfds);
    +    FD_ZERO(&efds);
    +    FD_SET(fd, &wfds);
    +    FD_SET(fd, &efds);
    +    ret = select(fd + 1, NULL, &wfds, &efds, NULL);
    --- End diff --
    
    Seems like a dangerous assumption that the file descriptor you are interested is +1 from the obtained file descriptor.  Did you mean to use `getNotificationRecvFD()` instead of `fd + 1` here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] thrift pull request: THRIFT-3080: fix connection leak of C++ Nonbl...

Posted by jeking3 <gi...@git.apache.org>.
Github user jeking3 commented on a diff in the pull request:

    https://github.com/apache/thrift/pull/422#discussion_r27768849
  
    --- Diff: lib/cpp/src/thrift/server/TNonblockingServer.cpp ---
    @@ -1393,9 +1394,39 @@ bool TNonblockingIOThread::notify(TNonblockingServer::TConnection* conn) {
         return false;
       }
     
    -  const int kSize = sizeof(conn);
    -  if (send(fd, const_cast_sockopt(&conn), kSize, 0) != kSize) {
    -    return false;
    +  int ret = -1;
    +  struct pollfd pfd = {fd, POLLOUT, 0};
    +  int kSize = sizeof(conn);
    +  const char * pos = (const char *)const_cast_sockopt(&conn);
    +
    +  while (kSize > 0) {
    +    pfd.revents = 0;
    +    ret = poll(&pfd, 1, -1);
    --- End diff --
    
    TServerSocket has a macro based implementation that is portable for reference.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] thrift issue #422: THRIFT-3080: fix connection leak of C++ Nonblocking Serve...

Posted by xiaosuo <gi...@git.apache.org>.
Github user xiaosuo commented on the issue:

    https://github.com/apache/thrift/pull/422
  
    This patch may introduce a more serious problem in theory.
    
    No standard defines writing to a socket is an atomic operation, so if writing is interleaved by other threads, the reading end will end up with an invalid connection pointer, and the following operation on connection will cause segment fault.
    
    In order to fix the above issue, we must turn to pipe(2), which guarantees that writing data less than PIPE_BUF is atomic . And pipes are more efficient than sockets, because no skb is required to save small pointers.
    
    BTW, if we don't set the writing end to nonblocking mode, we don't need to use select to emulate it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---