Posted to dev@trafficserver.apache.org by Jan-Frode Myklebust <ja...@tanso.net> on 2012/06/18 12:37:11 UTC

ATS 3.0.5 segfaulting

ATS 3.0.5 on RHEL6 seems to segfault reliably when I press reload on the
same page a few times in my browser. This is the build I'm using:

	https://admin.fedoraproject.org/updates/FEDORA-EPEL-2012-6110/trafficserver-3.0.5-1.el6

Would be great if someone else could confirm this problem.

The error I see is:

	Jun 18 12:29:40 dibs kernel: [ET_NET 2][18795]: segfault at 8a ip 0000000000664c8d sp 00002adea1f06ca0 error 6 in traffic_server[400000+359000]
	Jun 18 12:29:40 dibs traffic_manager[18671]: {139952995461088} ERROR: [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 11: Segmentation fault
	Jun 18 12:29:40 dibs traffic_manager[18671]: {139952995461088} ERROR:  (last system error 2: No such file or directory)
	Jun 18 12:29:40 dibs traffic_manager[18671]: {139952995461088} ERROR: [Alarms::signalAlarm] Server Process was reset
	Jun 18 12:29:40 dibs traffic_manager[18671]: {139952995461088} ERROR:  (last system error 2: No such file or directory)
	Jun 18 12:29:43 dibs traffic_server[18817]: NOTE: --- Server Starting ---
	Jun 18 12:29:43 dibs traffic_server[18817]: NOTE: Server Version: Apache Traffic Server - traffic_server - 3.0.5 - (build # 51812 on Jun 18 2012 at 12:26:10)
	Jun 18 12:29:43 dibs traffic_server[18817]: {47792412368192} STATUS: opened /var/log/trafficserver/diags.log
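
For readers decoding the kernel line above, here is a minimal sketch. It assumes the usual x86 page-fault error-code bits (bit 0 = protection violation rather than a missing page, bit 1 = write, bit 2 = user mode) and that the kernel prints the error code in hex. The regex and the function name decode_segfault are illustrative only, not part of any ATS tooling.

```python
import re

# Matches the x86 Linux "segfault at ..." kernel log format quoted above.
SEGV_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Split a kernel segfault line into its fields and error-code bits."""
    m = SEGV_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    err = int(m.group("err"), 16)
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    return {
        "fault_addr": int(m.group("addr"), 16),
        "ip": ip,
        "object": m.group("obj"),
        # Offset of the faulting instruction inside the mapped object.
        "offset_in_mapping": ip - base,
        "protection_violation": bool(err & 1),  # False: page was not mapped
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }


line = ("Jun 18 12:29:40 dibs kernel: [ET_NET 2][18795]: segfault at 8a "
        "ip 0000000000664c8d sp 00002adea1f06ca0 error 6 "
        "in traffic_server[400000+359000]")
info = decode_segfault(line)
print(info)
```

Under those assumptions the line describes a user-mode write to an unmapped page at address 0x8a, which would be consistent with a write through a null or near-null pointer.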



  -jf

Re: ATS 3.0.5 segfaulting

Posted by Jan-Frode Myklebust <ja...@tanso.net>.
On Tue, Jun 19, 2012 at 06:41:46AM -0600, Leif Hedstrom wrote:
> 
> Seems to be the same stack trace Kendo reported last week.
> Is this a regression from 3.0.5?

I have also seen some crashes just after starting ATS with earlier
releases, but then it didn't restart automatically:

	http://mail-archives.apache.org/mod_mbox/trafficserver-users/201106.mbox/%3C20110615202317.GA24213@oc1046828364.ibm.com%3E

I think that problem was also present in 3.0.4. No idea whether it was
the same bug or not.

> Does it happen with trunk or 3.1.4?

Will try to reproduce on something newer..


  -jf

Re: ATS 3.0.5 segfaulting

Posted by Jan-Frode Myklebust <ja...@tanso.net>.
On Tue, Jun 19, 2012 at 03:28:44PM -0700, Brian Geffon wrote:
> Can you provide any more information about your setup? About the
> content being cached, and so on... Thanks!

It's a trivial personal server, with only static content proxied to
Apache httpd on port 8080. The only slightly unusual thing is that I have
native IPv6, configured in ATS with "CONFIG
proxy.config.http.server_other_ports STRING 80:X6".

I can trigger the crash by opening http://blag.tanso.net/ in my browser
and holding down the reload key combination to generate traffic.
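
A scripted stand-in for the held-down reload key might look like the sketch below. It is only a guess at what matters: it assumes the trigger is many near-simultaneous requests for the same URL, and that sending "Cache-Control: max-age=0" (as many browsers do on a plain reload) is close enough. The names fetch_once and hammer, the request count, and the worker count are all made up for illustration. Only point it at a server you run yourself.

```python
import http.client
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Connect to the URL directly, ignoring any *_proxy environment variables.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch_once(url, timeout=5.0):
    """Fetch url once with a reload-style header; return HTTP status or None."""
    req = urllib.request.Request(url, headers={"Cache-Control": "max-age=0"})
    try:
        with _OPENER.open(req, timeout=timeout) as resp:
            resp.read()
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, just not with a success status.
        return exc.code
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, reset or timed out, e.g. while the server restarts.
        return None


def hammer(url, requests=200, workers=8):
    """Issue `requests` fetches of url from `workers` threads; return statuses."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_once, [url] * requests))
```

Calling hammer() with the proxy's URL and counting the None entries in the result gives a rough idea of when the server went away.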


  -jf

Re: ATS 3.0.5 segfaulting

Posted by Brian Geffon <br...@gmail.com>.
Can you provide any more information about your setup? About the
content being cached, and so on... Thanks!

Brian

On Tue, Jun 19, 2012 at 2:27 PM, Jan-Frode Myklebust <ja...@tanso.net> wrote:
> On Tue, Jun 19, 2012 at 06:41:46AM -0600, Leif Hedstrom wrote:
>>
>> Does it happen with trunk or 3.1.4?
>>
>
> I now tested the 3.2.0 release candidate, and can't trigger the problem
> with that version.
>
>
>  -jf

Re: ATS 3.0.5 segfaulting

Posted by Jan-Frode Myklebust <ja...@tanso.net>.
On Tue, Jun 19, 2012 at 06:41:46AM -0600, Leif Hedstrom wrote:
> 
> Does it happen with trunk or 3.1.4?
> 

I now tested the 3.2.0 release candidate, and can't trigger the problem
with that version.


  -jf

Re: ATS 3.0.5 segfaulting

Posted by Leif Hedstrom <zw...@apache.org>.
On Jun 19, 2012, at 2:51 AM, Jan-Frode Myklebust <ja...@tanso.net> wrote:

> On Tue, Jun 19, 2012 at 10:41:54AM +0200, Jan-Frode Myklebust wrote:
>> On Mon, Jun 18, 2012 at 09:39:37AM -0700, Brian Geffon wrote:
>>> Can you provide a stack trace?
>> 
>> Maybe, if you could give me some instructions for how.. I tried getting
>> it to dump core, but unsuccessfully...
> 
> Got it*
> 

Seems to be the same stack trace Kendo reported last week. Is this a regression from 3.0.5? Does it happen with trunk or 3.1.4?

-- Leif 


> 
> ==========================================================================
> Core was generated by `/usr/bin/traffic_server -M -A,7:X,8:X6'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x0000000000664c8d in CacheVC::openWriteStartDone (this=0x2b322c1aa2f0, 
>    event=2, e=0x317ad50) at CacheWrite.cc:1574
> 1574      od->reading_vec = 0;
> Missing separate debuginfos, use: debuginfo-install expat-2.0.1-9.1.el6.x86_64 glibc-2.12-1.47.el6_2.12.x86_64 keyutils-libs-1.4-3.el6.x86_64 krb5-libs-1.9-22.el6_2.1.x86_64 libattr-2.4.44-7.el6.x86_64 libcap-2.16-5.5.el6.x86_64 libcom_err-1.41.12-11.el6.x86_64 libgcc-4.4.6-3.el6.x86_64 libselinux-2.0.94-5.2.el6.x86_64 libstdc++-4.4.6-3.el6.x86_64 openssl-1.0.0-20.el6_2.5.x86_64 pcre-7.8-3.1.el6.x86_64 tcl-8.5.7-6.el6.x86_64 xz-libs-4.999.9-0.3.beta.20091007git.el6.x86_64 zlib-1.2.3-27.el6.x86_64
> (gdb) bt
> #0  0x0000000000664c8d in CacheVC::openWriteStartDone (this=0x2b322c1aa2f0, 
>    event=2, e=0x317ad50) at CacheWrite.cc:1574
> #1  0x00000000006ab0f4 in handleEvent (this=0x2b321ec84010, e=0x317ad50, 
>    calling_code=2) at I_Continuation.h:146
> #2  EThread::process_event (this=0x2b321ec84010, e=0x317ad50, calling_code=2)
>    at UnixEThread.cc:140
> #3  0x00000000006abbf3 in EThread::execute (this=0x2b321ec84010)
>    at UnixEThread.cc:217
> #4  0x00000000004c0878 in main (argc=<value optimized out>, 
>    argv=<value optimized out>) at Main.cc:1918
> (gdb) 
> ==========================================================================
> 
> 
> [*] note to self to enable core dumps for ATS:
> 
>    sysctl -w fs.suid_dumpable=1
>    sysctl -w kernel.core_pattern=/tmp/core.%e.%p
>    records.config: CONFIG proxy.config.stack_dump_enabled INT 0
>    /etc/profile: ulimit -c unlimited >/dev/null 2>&1
>    /etc/sysconfig/init: DAEMON_COREFILE_LIMIT='unlimited'
> 
> 
>  -jf

Re: ATS 3.0.5 segfaulting

Posted by Leif Hedstrom <zw...@apache.org>.
On 6/20/12 5:46 PM, Leif Hedstrom wrote:
> On 6/19/12 2:51 AM, Jan-Frode Myklebust wrote:
>> Core was generated by `/usr/bin/traffic_server -M -A,7:X,8:X6'.
>
> Jan-Frode, did you file a Jira ticket for this? If not, can you please do 
> so, target it for v3.0.6, so we can get this resolved. Plenty of people 
> seeing this problem I think.


Actually, I believe it's already filed in 
https://issues.apache.org/jira/browse/TS-1276


-- Leif


Re: ATS 3.0.5 segfaulting

Posted by Leif Hedstrom <zw...@apache.org>.
On 6/19/12 2:51 AM, Jan-Frode Myklebust wrote:
> Core was generated by `/usr/bin/traffic_server -M -A,7:X,8:X6'.

Jan-Frode, did you file a Jira ticket for this? If not, can you please do 
so, target it for v3.0.6, so we can get this resolved. Plenty of people 
seeing this problem I think.

Thanks!

-- Leif



Re: ATS 3.0.5 segfaulting

Posted by Jan-Frode Myklebust <ja...@tanso.net>.
On Tue, Jun 19, 2012 at 10:41:54AM +0200, Jan-Frode Myklebust wrote:
> On Mon, Jun 18, 2012 at 09:39:37AM -0700, Brian Geffon wrote:
> > Can you provide a stack trace?
> 
> Maybe, if you could give me some instructions for how.. I tried getting
> it to dump core, but unsuccessfully...

Got it*


==========================================================================
Core was generated by `/usr/bin/traffic_server -M -A,7:X,8:X6'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000664c8d in CacheVC::openWriteStartDone (this=0x2b322c1aa2f0, 
    event=2, e=0x317ad50) at CacheWrite.cc:1574
1574	  od->reading_vec = 0;
Missing separate debuginfos, use: debuginfo-install expat-2.0.1-9.1.el6.x86_64 glibc-2.12-1.47.el6_2.12.x86_64 keyutils-libs-1.4-3.el6.x86_64 krb5-libs-1.9-22.el6_2.1.x86_64 libattr-2.4.44-7.el6.x86_64 libcap-2.16-5.5.el6.x86_64 libcom_err-1.41.12-11.el6.x86_64 libgcc-4.4.6-3.el6.x86_64 libselinux-2.0.94-5.2.el6.x86_64 libstdc++-4.4.6-3.el6.x86_64 openssl-1.0.0-20.el6_2.5.x86_64 pcre-7.8-3.1.el6.x86_64 tcl-8.5.7-6.el6.x86_64 xz-libs-4.999.9-0.3.beta.20091007git.el6.x86_64 zlib-1.2.3-27.el6.x86_64
(gdb) bt
#0  0x0000000000664c8d in CacheVC::openWriteStartDone (this=0x2b322c1aa2f0, 
    event=2, e=0x317ad50) at CacheWrite.cc:1574
#1  0x00000000006ab0f4 in handleEvent (this=0x2b321ec84010, e=0x317ad50, 
    calling_code=2) at I_Continuation.h:146
#2  EThread::process_event (this=0x2b321ec84010, e=0x317ad50, calling_code=2)
    at UnixEThread.cc:140
#3  0x00000000006abbf3 in EThread::execute (this=0x2b321ec84010)
    at UnixEThread.cc:217
#4  0x00000000004c0878 in main (argc=<value optimized out>, 
    argv=<value optimized out>) at Main.cc:1918
(gdb) 
==========================================================================


[*] note to self to enable core dumps for ATS:

	sysctl -w fs.suid_dumpable=1
	sysctl -w kernel.core_pattern=/tmp/core.%e.%p
	records.config: CONFIG proxy.config.stack_dump_enabled INT 0
	/etc/profile: ulimit -c unlimited >/dev/null 2>&1
	/etc/sysconfig/init: DAEMON_COREFILE_LIMIT='unlimited'
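
To check whether the settings in the note above have taken effect, something like this sketch could be used. It assumes a Linux /proc layout. The function names are made up for illustration. Note that it reports the core-size limit of the process running the script, which is not necessarily the limit the traffic_server daemon was started with.

```python
import resource
from pathlib import Path


def read_setting(path):
    """Return the stripped contents of a /proc/sys file, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def core_dump_status():
    """Collect the settings that the note above changes."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    return {
        # Corresponds to "ulimit -c unlimited" for the current process.
        "core_soft_unlimited": soft == resource.RLIM_INFINITY,
        "core_hard_unlimited": hard == resource.RLIM_INFINITY,
        # Corresponds to the two sysctl -w lines.
        "suid_dumpable": read_setting("/proc/sys/fs/suid_dumpable"),
        "core_pattern": read_setting("/proc/sys/kernel/core_pattern"),
    }


if __name__ == "__main__":
    for key, value in core_dump_status().items():
        print(f"{key}: {value}")
```

For the daemon itself, the "Max core file size" row in /proc/<pid>/limits of the running traffic_server process shows the limit that actually applies.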


  -jf

Re: ATS 3.0.5 segfaulting

Posted by Jan-Frode Myklebust <ja...@tanso.net>.
On Mon, Jun 18, 2012 at 09:39:37AM -0700, Brian Geffon wrote:
> Can you provide a stack trace?

Maybe, if you could give me some instructions for how.. I tried getting
it to dump core, but unsuccessfully...


  -jf

Re: ATS 3.0.5 segfaulting

Posted by Brian Geffon <br...@gmail.com>.
Can you provide a stack trace?

Brian



On Jun 18, 2012, at 3:37 AM, Jan-Frode Myklebust <ja...@tanso.net> wrote:

> ATS 3.0.5 on RHEL6 seems to segfault reliably when I press reload on the
> same page a few times in my browser. This is the build I'm using:
>
>    https://admin.fedoraproject.org/updates/FEDORA-EPEL-2012-6110/trafficserver-3.0.5-1.el6
>
> Would be great if someone else could confirm this problem.
>
> The error I see is:
>
>    Jun 18 12:29:40 dibs kernel: [ET_NET 2][18795]: segfault at 8a ip 0000000000664c8d sp 00002adea1f06ca0 error 6 in traffic_server[400000+359000]
>    Jun 18 12:29:40 dibs traffic_manager[18671]: {139952995461088} ERROR: [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 11: Segmentation fault
>    Jun 18 12:29:40 dibs traffic_manager[18671]: {139952995461088} ERROR:  (last system error 2: No such file or directory)
>    Jun 18 12:29:40 dibs traffic_manager[18671]: {139952995461088} ERROR: [Alarms::signalAlarm] Server Process was reset
>    Jun 18 12:29:40 dibs traffic_manager[18671]: {139952995461088} ERROR:  (last system error 2: No such file or directory)
>    Jun 18 12:29:43 dibs traffic_server[18817]: NOTE: --- Server Starting ---
>    Jun 18 12:29:43 dibs traffic_server[18817]: NOTE: Server Version: Apache Traffic Server - traffic_server - 3.0.5 - (build # 51812 on Jun 18 2012 at 12:26:10)
>    Jun 18 12:29:43 dibs traffic_server[18817]: {47792412368192} STATUS: opened /var/log/trafficserver/diags.log
>
>
>
>  -jf