Posted to issues@trafficserver.apache.org by "Leif Hedstrom (JIRA)" <ji...@apache.org> on 2013/05/04 03:13:14 UTC
[jira] [Updated] (TS-1487) the ordering of plugin_init and
init_HttpProxyServer cause crashed TS to core endlessly
[ https://issues.apache.org/jira/browse/TS-1487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Leif Hedstrom updated TS-1487:
------------------------------
Labels: A (was: )
> the ordering of plugin_init and init_HttpProxyServer cause crashed TS to core endlessly
> ---------------------------------------------------------------------------------------
>
> Key: TS-1487
> URL: https://issues.apache.org/jira/browse/TS-1487
> Project: Traffic Server
> Issue Type: Bug
> Components: Core
> Affects Versions: 3.2.0
> Environment: Linux RHEL6.2
> Reporter: Aidan McGurn
> Assignee: Alan M. Carroll
> Priority: Critical
> Labels: A
> Fix For: 3.3.5
>
> Attachments: INTD-529-RespawnCrash.patch, INTD-529-RespawnCrash.patch
>
>
> We've had a serious issue where TS, after a crash, respawns and cores continuously when it tries to restart under load. I traced the issue to the SNMP research library (a third-party lib). It uses select(), and under load the file descriptor numbers spike after the crash because all the sockets get opened at once. This causes a buffer overflow in the select() calls (which their library is full of), because the fd passed to FD_SET is much bigger than the FD_SETSIZE of 1024. This was very hard to track down, as the stack was corrupted and gdb was therefore useless. Tracing why this happened on 3.2.0 and not 3.0.2, I found that the sequence
> around plugin_init has changed: on 3.0.2 the sequence was in effect 1. plugin_init and then 2. init_HttpProxyServer, whereas on 3.2.0 this has been reversed for no reason I can find. To get our system to work in this crash case, I've patched ATS to flip them around as in 3.0.2.
> I'll attach the patch we propose to use to get around this.
> Is this a bug waiting to happen in other systems, or was there a reason to change this sequence?
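
The FD_SET overflow described in the report can be illustrated with a minimal sketch. This is not code from ATS or from the SNMP library; safe_fd_set is a hypothetical helper name used only for illustration. It assumes a POSIX system where fd_set is a fixed-size bitmap of FD_SETSIZE bits, so calling FD_SET with a descriptor at or above FD_SETSIZE writes outside the structure, which would be consistent with the stack corruption mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper (not part of ATS or any SNMP library):
 * add fd to the set only if fd_set can actually represent it.
 * fd_set holds FD_SETSIZE bits, so FD_SET() with a larger
 * descriptor number would write past the end of the structure. */
static bool safe_fd_set(int fd, fd_set *set)
{
    assert(set != NULL);
    if (fd < 0 || fd >= FD_SETSIZE) {
        return false; /* out of range: refuse rather than corrupt memory */
    }
    FD_SET(fd, set);
    return true;
}
```

A caller that gets false back could close the descriptor or fall back to an interface without this limit, such as poll(). Initializing the plugin before the proxy opens its sockets, as the reporter's patch does, instead keeps the library's descriptor numbers low in the first place.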