You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@trafficserver.apache.org by "David Carlin (JIRA)" <ji...@apache.org> on 2013/10/03 16:38:43 UTC

[jira] [Commented] (TS-1411) Seg faulting during logging

    [ https://issues.apache.org/jira/browse/TS-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785239#comment-13785239 ] 

David Carlin commented on TS-1411:
----------------------------------

I'm still not able to reproduce this in a lab environment, but since upgrading from 3.2.0 to 3.3.5 the frequency of the crashes (10-20 times a day across our entire footprint) hasn't changed, but its generating core files now.   I know its the same issue because the stack trace in traffic.out is identical.    I'm putting these three backtraces here because I'm going to have to turn off saving cores; they keep filling the disks.  I changed the title of the bug because it doesn't seem related to log rotation, I don't see the crashes occurring right after log rotation anymore, its more random.

{noformat}
#0  LogAccess::marshal_mem (dest=0x2ab400312348 "",
    source=0x2ab4a42f876c <Address 0x2ab4a42f876c out of bounds>,
    actual_len=1997, padded_len=2000) at LogAccess.cc:805
#1  0x000000000058a440 in LogAccessHttp::marshal_client_req_url_canon (
    this=<value optimized out>, buf=<value optimized out>)
    at LogAccessHttp.cc:304
#2  0x000000000059b422 in LogFieldList::marshal (this=<value optimized out>,
    lad=0x2ab36f07da90, buf=0x2ab400312310 "") at LogField.cc:517
#3  0x00000000005a5ce5 in LogObject::log (this=0x1ced4b0, lad=0x2ab36f07da90,
    text_entry=0x0) at LogObject.cc:609
#4  0x000000000058d586 in log (lad=0x2ab36f07da90) at LogObject.h:394
#5  Log::access (lad=0x2ab36f07da90) at Log.cc:1110
#6  0x00000000005234c8 in HttpSM::update_stats (this=0x2ab6a9e30450)
    at HttpSM.cc:6484
#7  0x0000000000530e10 in HttpSM::kill_this (this=0x2ab6a9e30450)
    at HttpSM.cc:6445
#8  0x0000000000531198 in HttpSM::main_handler (this=0x2ab6a9e30450,
    event=2301, data=0x2ab6a9e32010) at HttpSM.cc:2516
#9  0x000000000056b3f7 in handleEvent (this=0x2ab6a9e32010,
    event=<value optimized out>, data=<value optimized out>)
    at ../../iocore/eventsystem/I_Continuation.h:146
#10 HttpTunnel::main_handler (this=0x2ab6a9e32010,
    event=<value optimized out>, data=<value optimized out>)
    at HttpTunnel.cc:1506
#11 0x0000000000680f81 in handleEvent (event=<value optimized out>, nh=
    0x2ab36d86abf0, vc=0x2ab4bc259fb0)
    at ../../iocore/eventsystem/I_Continuation.h:146
#12 write_signal_and_update (event=<value optimized out>, nh=0x2ab36d86abf0,
    vc=0x2ab4bc259fb0) at UnixNetVConnection.cc:153
#13 write_signal_done (event=<value optimized out>, nh=0x2ab36d86abf0,
    vc=0x2ab4bc259fb0) at UnixNetVConnection.cc:180
#14 0x0000000000683db7 in write_to_net_io (nh=0x2ab36d86abf0,
    vc=0x2ab4bc259fb0, thread=<value optimized out>)
    at UnixNetVConnection.cc:479
#15 0x000000000067b5e3 in NetHandler::mainNetEvent (this=0x2ab36d86abf0,
    event=<value optimized out>, e=<value optimized out>) at UnixNet.cc:392
#16 0x00000000006a2f2f in handleEvent (this=0x2ab36d867010, e=0x17e97b0,
    calling_code=5) at I_Continuation.h:146
#17 EThread::process_event (this=0x2ab36d867010, e=0x17e97b0, calling_code=5)
    at UnixEThread.cc:141
#18 0x00000000006a3913 in EThread::execute (this=0x2ab36d867010)
    at UnixEThread.cc:265
#19 0x00000000006a1dca in spawn_thread_internal (a=0x17197c0) at Thread.cc:88
#20 0x00000033c68077f1 in start_thread () from /lib64/libpthread.so.0
#21 0x00000033c64e5ccd in clone () from /lib64/libc.so.6
{noformat}

{noformat}
#0  LogAccess::marshal_mem (
    dest=0x2b0e4c364328 "http://mrs-yhomerun.mrs.yimg.com/api/res/1.2/zGX2mOzEsuwoaJh4uUD.3g--/YXBwaWQ9eWhvbWVydW47cT04NTtzbT0xO3c9MjUwO2g9MTc4/http://media.zenfs.com/en_us/Sports/ap/201310011329485558058",
    source=0x2b0e4c38ca63 <Address 0x2b0e4c38ca63 out of bounds>,
    actual_len=1940, padded_len=1944) at LogAccess.cc:805
#1  0x000000000058a440 in LogAccessHttp::marshal_client_req_url_canon (
    this=<value optimized out>, buf=<value optimized out>)
    at LogAccessHttp.cc:304
#2  0x000000000059b422 in LogFieldList::marshal (this=<value optimized out>,
    lad=0x2b0ddb2b1a90, buf=0x2b0e4c3642f0 "") at LogField.cc:517
#3  0x00000000005a5ce5 in LogObject::log (this=0x2573310, lad=0x2b0ddb2b1a90,
    text_entry=0x0) at LogObject.cc:609
#4  0x000000000058d586 in log (lad=0x2b0ddb2b1a90) at LogObject.h:394
#5  Log::access (lad=0x2b0ddb2b1a90) at Log.cc:1110
#6  0x00000000005234c8 in HttpSM::update_stats (this=0x2b13c7f9f1f0)
    at HttpSM.cc:6484
#7  0x0000000000530e10 in HttpSM::kill_this (this=0x2b13c7f9f1f0)
    at HttpSM.cc:6445
#8  0x0000000000531198 in HttpSM::main_handler (this=0x2b13c7f9f1f0,
    event=2301, data=0x2b13c7fa0db0) at HttpSM.cc:2516
#9  0x000000000056b3f7 in handleEvent (this=0x2b13c7fa0db0,
    event=<value optimized out>, data=<value optimized out>)
    at ../../iocore/eventsystem/I_Continuation.h:146
#10 HttpTunnel::main_handler (this=0x2b13c7fa0db0,
    event=<value optimized out>, data=<value optimized out>)
    at HttpTunnel.cc:1506
#11 0x0000000000680f81 in handleEvent (event=<value optimized out>, nh=
    0x2b0dd9a9ebf0, vc=0x2b0f38892c30)
    at ../../iocore/eventsystem/I_Continuation.h:146
#12 write_signal_and_update (event=<value optimized out>, nh=0x2b0dd9a9ebf0,
    vc=0x2b0f38892c30) at UnixNetVConnection.cc:153
#13 write_signal_done (event=<value optimized out>, nh=0x2b0dd9a9ebf0,
    vc=0x2b0f38892c30) at UnixNetVConnection.cc:180
#14 0x0000000000683db7 in write_to_net_io (nh=0x2b0dd9a9ebf0,
    vc=0x2b0f38892c30, thread=<value optimized out>)
    at UnixNetVConnection.cc:479
#15 0x000000000067b5e3 in NetHandler::mainNetEvent (this=0x2b0dd9a9ebf0,
    event=<value optimized out>, e=<value optimized out>) at UnixNet.cc:392
#16 0x00000000006a2f2f in handleEvent (this=0x2b0dd9a9b010, e=0x206fab0,
    calling_code=5) at I_Continuation.h:146
#17 EThread::process_event (this=0x2b0dd9a9b010, e=0x206fab0, calling_code=5)
    at UnixEThread.cc:141
#18 0x00000000006a3913 in EThread::execute (this=0x2b0dd9a9b010)
    at UnixEThread.cc:265
#19 0x00000000006a1dca in spawn_thread_internal (a=0x1f9f050) at Thread.cc:88
#20 0x00000033c68077f1 in start_thread () from /lib64/libpthread.so.0
#21 0x00000033c64e5ccd in clone () from /lib64/libc.so.6
{noformat}

{noformat}
#0  LogAccess::marshal_mem (
    dest=0x2b52283fa888 "5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)", source=0x2b5228420b37 <Address 0x2b5228420b37 out of bounds>,
    actual_len=1940, padded_len=1944) at LogAccess.cc:805
#1  0x000000000058a440 in LogAccessHttp::marshal_client_req_url_canon (
    this=<value optimized out>, buf=<value optimized out>)
    at LogAccessHttp.cc:304
#2  0x000000000059b422 in LogFieldList::marshal (this=<value optimized out>,
    lad=0x2b5161816a90, buf=0x2b52283fa850 "") at LogField.cc:517
#3  0x00000000005a5ce5 in LogObject::log (this=0x21a9310, lad=0x2b5161816a90,
    text_entry=0x0) at LogObject.cc:609
#4  0x000000000058d586 in log (lad=0x2b5161816a90) at LogObject.h:394
#5  Log::access (lad=0x2b5161816a90) at Log.cc:1110
#6  0x00000000005234c8 in HttpSM::update_stats (this=0x2b525bbc5930)
    at HttpSM.cc:6484
#7  0x0000000000530e10 in HttpSM::kill_this (this=0x2b525bbc5930)
    at HttpSM.cc:6445
#8  0x0000000000531198 in HttpSM::main_handler (this=0x2b525bbc5930,
    event=2301, data=0x2b525bbc74f0) at HttpSM.cc:2516
#9  0x000000000056b3f7 in handleEvent (this=0x2b525bbc74f0,
    event=<value optimized out>, data=<value optimized out>)
    at ../../iocore/eventsystem/I_Continuation.h:146
#10 HttpTunnel::main_handler (this=0x2b525bbc74f0,
    event=<value optimized out>, data=<value optimized out>)
    at HttpTunnel.cc:1506
#11 0x0000000000680f81 in handleEvent (event=<value optimized out>, nh=
    0x2b5160003bf0, vc=0x2b51c0302330)
    at ../../iocore/eventsystem/I_Continuation.h:146
#12 write_signal_and_update (event=<value optimized out>, nh=0x2b5160003bf0,
    vc=0x2b51c0302330) at UnixNetVConnection.cc:153
#13 write_signal_done (event=<value optimized out>, nh=0x2b5160003bf0,
    vc=0x2b51c0302330) at UnixNetVConnection.cc:180
#14 0x0000000000683db7 in write_to_net_io (nh=0x2b5160003bf0,
    vc=0x2b51c0302330, thread=<value optimized out>)
    at UnixNetVConnection.cc:479
#15 0x000000000067b5e3 in NetHandler::mainNetEvent (this=0x2b5160003bf0,
    event=<value optimized out>, e=<value optimized out>) at UnixNet.cc:392
#16 0x00000000006a2f2f in handleEvent (this=0x2b5160000010, e=0x1ca5330,
    calling_code=5) at I_Continuation.h:146
#17 EThread::process_event (this=0x2b5160000010, e=0x1ca5330, calling_code=5)
    at UnixEThread.cc:141
#18 0x00000000006a3913 in EThread::execute (this=0x2b5160000010)
    at UnixEThread.cc:265
#19 0x00000000006a1dca in spawn_thread_internal (a=0x1bce6a0) at Thread.cc:88
#20 0x00000033c68077f1 in start_thread () from /lib64/libpthread.so.0
#21 0x00000033c64e5ccd in clone () from /lib64/libc.so.6
{noformat}

> Seg faulting during logging
> ---------------------------
>
>                 Key: TS-1411
>                 URL: https://issues.apache.org/jira/browse/TS-1411
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Logging
>    Affects Versions: 3.2.0
>         Environment: RHEL 6.2 x86_64
>            Reporter: David Carlin
>            Assignee: Yunkai Zhang
>            Priority: Critical
>             Fix For: 4.2.0
>
>         Attachments: Log rotation segaults.txt, TS-1411 backtraces.txt
>
>
> I've been experiencing some segfaults during log rotation.  The sequence of events is this.. log rotation occurs, then I get hundreds of dropping log buffer error msgs, then the segfault.
> This started occurring when I lengthened the default log format to include the unmapped URL and the user agent string:
> %<cqtq> %<ttms> %<chi> %<crc>/%<pssc> %<psql> %<cqhm> %<cquc> %<caun> %<phr>/%<pqsn> %<psct> %<xid> %<cquuc> \"%<{User-Agent}cqh>\"
> In terms of frequency, we have a number of boxes and I probably see one of these crashed per day since the above change.  Logs are rotated every 2 hours.
> I've had other log related segfaults, reported in TS-1330 - these new ones seem to have a different cause.
> [Aug 14 21:07:20.002] Server {0x2ae3a8887700} STATUS: The rolled logfile, /home/y/logs/trafficserver/error.log_l30.ycs.a4e.yahoo.com.20120814.17h59m50s-20120814.20h00m00s.old, was auto-deleted; 3148252 bytes were reclaimed.
> [Aug 14 21:07:42.859] Server {0x2ae3a8887700} STATUS: The rolled logfile, /home/y/logs/trafficserver/squid.blog_l30.ycs.a4e.yahoo.com.20120814.18h00m00s-20120814.20h00m00s.old, was auto-deleted; 14735520048 bytes were reclaimed.
> [Aug 14 21:07:42.865] Server {0x2ae3a8887700} WARNING: Dropping log buffer, can't keep up.
> [Aug 14 21:07:42.865] Server {0x2ae3a8887700} WARNING: Dropping log buffer, can't keep up.
> [Aug 14 21:07:42.865] Server {0x2ae3a8887700} WARNING: Dropping log buffer, can't keep up.
> [Aug 14 21:07:42.865] Server {0x2ae3a8887700} WARNING: Dropping log buffer, can't keep up.
> [Aug 14 21:07:42.865] Server {0x2ae3a8887700} WARNING: Dropping log buffer, can't keep up.
> [...]
> [Aug 14 21:07:42.876] Server {0x2ae3a8887700} WARNING: Dropping log buffer, can't keep up.
> [Aug 14 21:07:42.876] Server {0x2ae3a8887700} WARNING: Dropping log buffer, can't keep up.
> [Aug 14 21:07:42.876] Server {0x2ae3a8887700} WARNING: Dropping log buffer, can't keep up.
> [Aug 14 21:07:42.876] Server {0x2ae3a8887700} WARNING: Dropping log buffer, can't keep up.
> NOTE: Traffic Server received Sig 11: Segmentation fault
> /home/y/bin/traffic_server - STACK TRACE: 
> /lib64/libpthread.so.0[0x383f00f500]
> /home/y/bin/traffic_server(_ZN9LogAccess11marshal_memEPcPKcii+0x48)[0x58a118]
> /home/y/bin/traffic_server(_ZN13LogAccessHttp28marshal_client_req_url_canonEPc+0x20)[0x58c3f0]
> /home/y/bin/traffic_server(_ZN12LogFieldList7marshalEP9LogAccessPc+0x32)[0x59d5a2]
> /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x399)[0x5a7ed9]
> /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
> /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
> /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
> /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
> /home/y/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0xde)[0x56c3ee]
> /home/y/bin/traffic_server[0x673871]
> /home/y/bin/traffic_server(_Z15write_to_net_ioP10NetHandlerP18UnixNetVConnectionP7EThread+0x847)[0x6756e7]
> /home/y/bin/traffic_server(_ZN10NetHandler12mainNetEventEiP5Event+0x286)[0x66e076]
> /home/y/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0xb4)[0x696ce4]
> /home/y/bin/traffic_server(_ZN7EThread7executeEv+0x4c3)[0x697673]
> /home/y/bin/traffic_server[0x695cb2]
> /lib64/libpthread.so.0[0x383f007851]



--
This message was sent by Atlassian JIRA
(v6.1#6144)