Posted to dev@thrift.apache.org by "Charley Kamolpornwijit (JIRA)" <ji...@apache.org> on 2016/08/31 00:08:20 UTC

[jira] [Updated] (THRIFT-3912) TNonblockingServer crashes when file descriptor numbers > FD_SETSIZE

     [ https://issues.apache.org/jira/browse/THRIFT-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Charley Kamolpornwijit updated THRIFT-3912:
-------------------------------------------
    Description: 
We experienced crashes in TNonblockingServer when running it in a system with a large number of open file descriptors. The stack trace output was similar to the following:
{quote}
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f536e0f9c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#0  0x00007f536e0f9c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f536e0fd028 in __GI_abort () at abort.c:89
#2  0x00007f536e1362a4 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7f536e242113 "*** %s ***: %s terminated\n") at ../sysdeps/posix/libc_fatal.c:175
#3  0x00007f536e1cdbbc in __GI___fortify_fail (msg=<optimized out>, msg@entry=0x7f536e2420aa "buffer overflow detected") at fortify_fail.c:38
#4  0x00007f536e1cca90 in __GI___chk_fail () at chk_fail.c:28
#5  0x00007f536e1cdb07 in __fdelt_chk (d=d@entry=1024) at fdelt_chk.c:25
#6  0x0000000000a8a745 in apache::thrift::server::TNonblockingIOThread::notify (this=<optimized out>, conn=0x7f536d432780) at src/thrift/server/TNonblockingServer.cpp:1408
{quote}

Investigating the problem, we found that it is caused by TNonblockingServer calling {{FD_SET()}} with a file descriptor number that is greater than or equal to the {{FD_SETSIZE}} constant (frame #5 above shows {{d=1024}}). We also found that the patch in THRIFT-3080 introduced this problem: {{select()}}, in contrast to {{poll()}}, has such a limit, according to https://daniel.haxx.se/docs/poll-vs-select.html .
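
To illustrate the limit, here is a minimal sketch of a range check placed in front of {{FD_SET()}}. It is not the Thrift code and not a proposed patch; the helper names {{fitsInFdSet}} and {{safeFdSet}} are hypothetical and exist only for this example. An {{fd_set}} only has room for descriptors in the range 0 to {{FD_SETSIZE}} - 1, so the check rejects anything outside that range instead of letting glibc abort.

```cpp
#include <sys/select.h>  // fd_set, FD_SET, FD_ZERO, FD_ISSET, FD_SETSIZE
#include <cassert>

// Hypothetical helper: true when fd can legally be stored in an fd_set.
inline bool fitsInFdSet(int fd) {
  return fd >= 0 && fd < FD_SETSIZE;
}

// Hypothetical helper: adds fd to the set only when it is in range.
// Returns false for an out-of-range fd instead of calling FD_SET(),
// which would write outside the bitmap (or abort under _FORTIFY_SOURCE).
inline bool safeFdSet(int fd, fd_set* set) {
  assert(set != nullptr);
  if (!fitsInFdSet(fd)) {
    return false;
  }
  FD_SET(fd, set);
  return true;
}
```

A caller that gets {{false}} back still cannot wait on that descriptor with {{select()}}, so a check like this only turns the abort into a recoverable error; it does not remove the limit.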

For now, we have reverted the patch in THRIFT-3080 (https://github.com/apache/thrift/commit/b5ebcd199c1b603cea652847bfc9177c60fb8e28) to make it work in our environment. However, in the long term, I believe it is important that TNonblockingServer uses {{poll()}} instead of {{select()}} whenever {{poll()}} is available on the build system, to avoid this problem.
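
As a rough sketch of what waiting on a single descriptor with {{poll()}} could look like (an illustration under assumptions, not the actual Thrift implementation): {{poll()}} takes an array of {{pollfd}} structures rather than a fixed-size bitmap, so the numeric value of the descriptor is not bounded by {{FD_SETSIZE}}. The names {{waitFor}} and {{demoPipeWakeup}} are hypothetical.

```cpp
#include <poll.h>    // poll, pollfd, POLLIN
#include <unistd.h>  // pipe, write, close (used by the demo below)

// Hypothetical helper: waits up to timeoutMs for `events` on fd.
// Returns 1 when poll() reports any event on fd (including hang-up or
// error conditions), 0 on timeout, and -1 if poll() itself fails
// (errno is then set by poll()).
inline int waitFor(int fd, short events, int timeoutMs) {
  struct pollfd pfd;
  pfd.fd = fd;
  pfd.events = events;
  pfd.revents = 0;
  int rc = poll(&pfd, 1, timeoutMs);
  return rc > 0 ? 1 : rc;
}

// Usage sketch: the read end of a pipe is idle until a byte is written.
inline bool demoPipeWakeup() {
  int fds[2];
  if (pipe(fds) != 0) {
    return false;
  }
  bool idleBefore = waitFor(fds[0], POLLIN, 0) == 0;
  bool wrote = write(fds[1], "x", 1) == 1;
  bool readyAfter = waitFor(fds[0], POLLIN, 1000) == 1;
  close(fds[0]);
  close(fds[1]);
  return idleBefore && wrote && readyAfter;
}
```

A build that has to support platforms without {{poll()}} would still need a {{select()}} fallback there, ideally with a range check on the descriptor.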

Thank you.

  was:
We experienced crashes in TNonblockingServer when running it in a system with a large number of open file descriptors. The stack trace output was similar to the following:
{quote}
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f536e0f9c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#0  0x00007f536e0f9c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f536e0fd028 in __GI_abort () at abort.c:89
#2  0x00007f536e1362a4 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7f536e242113 "*** %s ***: %s terminated\n") at ../sysdeps/posix/libc_fatal.c:175
#3  0x00007f536e1cdbbc in __GI___fortify_fail (msg=<optimized out>, msg@entry=0x7f536e2420aa "buffer overflow detected") at fortify_fail.c:38
#4  0x00007f536e1cca90 in __GI___chk_fail () at chk_fail.c:28
#5  0x00007f536e1cdb07 in __fdelt_chk (d=d@entry=1024) at fdelt_chk.c:25
#6  0x0000000000a8a745 in apache::thrift::server::TNonblockingIOThread::notify (this=<optimized out>, conn=0x7f536d432780) at src/thrift/server/TNonblockingServer.cpp:1408
{quote}

Investigating the problem, we found that it is caused by TNonblockingServer calling {{FD_SET()}} with a file descriptor number that is greater than or equal to the {{FD_SETSIZE}} constant (frame #5 above shows {{d=1024}}). We also found that the patch in THRIFT-3080 introduced this problem: {{select()}}, in contrast to {{poll()}}, has such a limit.

For now, we have reverted the patch in THRIFT-3080 (https://github.com/apache/thrift/commit/b5ebcd199c1b603cea652847bfc9177c60fb8e28) to make it work in our environment. However, in the long term, I believe it is important that TNonblockingServer uses {{poll()}} instead of {{select()}} whenever {{poll()}} is available on the build system, to avoid this problem.

Thank you.


> TNonblockingServer crashes when file descriptor numbers > FD_SETSIZE
> --------------------------------------------------------------------
>
>                 Key: THRIFT-3912
>                 URL: https://issues.apache.org/jira/browse/THRIFT-3912
>             Project: Thrift
>          Issue Type: Bug
>          Components: C++ - Library
>    Affects Versions: 0.9.3
>         Environment: Ubuntu 14.04 LTS
>            Reporter: Charley Kamolpornwijit
>
> We experienced crashes in TNonblockingServer when running it in a system with a large number of open file descriptors. The stack trace output was similar to the following:
> {quote}
> Program terminated with signal SIGABRT, Aborted.
> #0  0x00007f536e0f9c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #0  0x00007f536e0f9c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #1  0x00007f536e0fd028 in __GI_abort () at abort.c:89
> #2  0x00007f536e1362a4 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7f536e242113 "*** %s ***: %s terminated\n") at ../sysdeps/posix/libc_fatal.c:175
> #3  0x00007f536e1cdbbc in __GI___fortify_fail (msg=<optimized out>, msg@entry=0x7f536e2420aa "buffer overflow detected") at fortify_fail.c:38
> #4  0x00007f536e1cca90 in __GI___chk_fail () at chk_fail.c:28
> #5  0x00007f536e1cdb07 in __fdelt_chk (d=d@entry=1024) at fdelt_chk.c:25
> #6  0x0000000000a8a745 in apache::thrift::server::TNonblockingIOThread::notify (this=<optimized out>, conn=0x7f536d432780) at src/thrift/server/TNonblockingServer.cpp:1408
> {quote}
> Investigating the problem, we found that it is caused by TNonblockingServer calling {{FD_SET()}} with a file descriptor number that is greater than or equal to the {{FD_SETSIZE}} constant (frame #5 above shows {{d=1024}}). We also found that the patch in THRIFT-3080 introduced this problem: {{select()}}, in contrast to {{poll()}}, has such a limit, according to https://daniel.haxx.se/docs/poll-vs-select.html .
> For now, we have reverted the patch in THRIFT-3080 (https://github.com/apache/thrift/commit/b5ebcd199c1b603cea652847bfc9177c60fb8e28) to make it work in our environment. However, in the long term, I believe it is important that TNonblockingServer uses {{poll()}} instead of {{select()}} whenever {{poll()}} is available on the build system, to avoid this problem.
> Thank you.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)