Posted to users@trafficserver.apache.org by Leif Hedstrom <zw...@apache.org> on 2011/06/04 22:28:35 UTC

Debugging ATS crashers with gdb

Hi all,

this question pops up all the time: how to get stack traces etc. 
from a crashing ATS installation. So, here are some quick tips.

First, it's important to have a properly built ATS version for debugging 
to be efficient. The "best" option is to build ATS with something like this:

#! /bin/sh
#
# Created by configure

"./configure" \
"--enable-static-libts" \
"--enable-debug" \
"$@"


Next, Linux can be finicky when it comes to generating core files, so 
I'd recommend the following sysctl configs:

kernel.core_uses_pid = 1
kernel.core_pattern = /tmp/core

This tells the kernel to dump the core file into /tmp, with a name like 
core.12345. Also, you might need to increase the resource limits on the 
size of the core file that can be generated. This can be done with e.g. 
"ulimit -c unlimited", or adding something like this to 
/etc/security/limits.conf:

root		 -	 core		 unlimited
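
Before waiting for the next crash, it can help to confirm what the settings above actually are on the box. The following is a minimal read-only sketch, assuming a Linux system with /proc mounted; the function name show_core_settings is made up for illustration and is not part of ATS.

```shell
#!/bin/sh
# Hypothetical helper: print the core-dump related settings discussed above.
# Read-only, no root needed. To change the kernel settings you would run,
# as root, something like:
#   sysctl -w kernel.core_uses_pid=1
#   sysctl -w kernel.core_pattern=/tmp/core
show_core_settings() {
    # Soft limit on core file size for this shell ("unlimited" or a number).
    echo "ulimit -c: $(ulimit -c)"
    # Kernel-side settings; these files only exist on Linux.
    for f in /proc/sys/kernel/core_pattern /proc/sys/kernel/core_uses_pid; do
        if [ -r "$f" ]; then
            echo "$f: $(cat "$f")"
        else
            echo "$f: not available"
        fi
    done
}

show_core_settings
```

Note that the ulimit shown is the one for your current shell, which is not necessarily what a traffic_server started from an init script ends up with.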


We might still not be able to get a core file with all this said and 
done, so in that case, you might have to attach gdb directly to the 
running traffic_server process. You can do that with e.g.

% sudo gdb /usr/local/bin/traffic_server 12345


Where 12345 is the pid of the running traffic_server process. Once you 
have entered the command above, make sure to type the "cont" command to 
have gdb continue executing traffic_server. If you did get a 
core file (let's hope), you would instead run

% sudo gdb /usr/local/bin/traffic_server /tmp/core.12345


Now with either a core file, or a crasher inside gdb directly, you need 
to submit useful information to us. The best starting point is to get 
some stack traces, so inside gdb type:

(gdb) set pagination 0
(gdb) bt
...
(gdb) thread apply all bt
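
If you would rather not type these interactively, the same commands can be handed to gdb on the command line and the output captured in a file that is easy to attach to a bug. This is only a sketch: the helper name dump_traces and the output path are assumptions, and the GDB variable exists purely so the gdb binary can be swapped out.

```shell
#!/bin/sh
# Hypothetical helper: dump_traces BINARY CORE_OR_PID OUTFILE
# Runs gdb non-interactively (-batch) and executes each -ex command in order,
# writing everything gdb prints to OUTFILE.
dump_traces() {
    bin=$1
    target=$2
    out=$3
    ${GDB:-gdb} -batch \
        -ex "set pagination off" \
        -ex "bt" \
        -ex "thread apply all bt" \
        "$bin" "$target" > "$out" 2>&1
}

# Example (paths as in the mail above; run with sudo if needed):
#   dump_traces /usr/local/bin/traffic_server /tmp/core.12345 /tmp/ats-traces.txt
```

With a pid instead of a core file, gdb attaches, prints the traces and detaches again when it exits, so the process is only paused briefly.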



This is now useful information, particularly if you compiled ATS as 
described above. Attach this to a bug in Jira, together with information 
about your environment. Useful information includes:

    * Version of ATS (this is absolutely necessary to do any debugging).
      More importantly, you should run the latest released version
      (development releases preferably).
    * Linux version / platform
    * Any configuration changes you have made
    * Any information about what causes the crash, e.g. a particular
      request that triggers it, or anything else that would help
      reproduce the problem.
    * Anything else you can possibly think would be useful to help debug
      the problem, including promises of beer.


I'm sure I've missed plenty of details here, so feel free to add to this.

Thanks,

-- leif


Re: Debugging ATS crashers with gdb

Posted by Leif Hedstrom <zw...@apache.org>.
On 06/04/2011 02:28 PM, Leif Hedstrom wrote:
> Hi all,
>
> these questions pops up all the time, how to get a stack traces etc. 
> from a crashing ATS installation. So, here are some quick tips .

One more tip for debugging with gdb, this one suggested by John:

     (gdb) handle SIGPIPE nopass nostop noprint


You can also put this in your .gdbinit. For some reason, even though we 
explicitly ignore SIGPIPE, while running under gdb a write() can 
still raise SIGPIPE (under normal operation it would fail with 
EPIPE instead). Therefore, it's important you do this, to 
avoid a "false positive" in calls to write().
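
For the .gdbinit route, something like the following would add the line once and leave the file alone on later runs. It is a sketch only; the function name add_sigpipe_rule is invented here, and it assumes the per-user ~/.gdbinit is the file you want to change.

```shell
#!/bin/sh
# Hypothetical helper: add_sigpipe_rule [FILE]
# Appends the SIGPIPE rule to FILE (default: ~/.gdbinit) unless an identical
# line is already there.
add_sigpipe_rule() {
    file=${1:-$HOME/.gdbinit}
    rule='handle SIGPIPE nopass nostop noprint'
    # grep -x matches whole lines only, -F treats the rule as a fixed string.
    grep -qxF "$rule" "$file" 2>/dev/null || echo "$rule" >> "$file"
}

# Example:
#   add_sigpipe_rule            # edits ~/.gdbinit
#   add_sigpipe_rule ./gdbinit  # edits a file of your choice
```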

-- leif


Re: Debugging ATS crashers with gdb

Posted by Leif Hedstrom <zw...@apache.org>.
On 06/04/2011 02:28 PM, Leif Hedstrom wrote:
> Hi all,
>
> these questions pops up all the time, how to get a stack traces etc. 
> from a crashing ATS installation. So, here are some quick tips .
>

One more thing: You might have to set this in records.config as well:

     CONFIG proxy.config.core_limit INT -1


You can also set it to a specific size; -1 uses the hard limit. The other 
thing this setting does is call prctl() to set PR_SET_DUMPABLE to 1 (which I 
think should be set anyway, but I'm not sure whether it makes a difference on 
different Linux distros).
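
To see which core limit a running process really ended up with, whatever ulimit, limits.conf and records.config say, you can ask the kernel directly. A small sketch, assuming Linux (it reads /proc/PID/limits); the function name show_core_limit is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: show_core_limit PID
# Prints the "Max core file size" row (soft and hard limit) that the kernel
# applies to the given process. Linux only.
show_core_limit() {
    grep 'Max core file size' "/proc/$1/limits"
}

# Demonstrated here on the current shell; for ATS you would pass the pid of
# traffic_server instead (you may need root to read another user's process).
show_core_limit $$
```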

-- leif


Re: Debugging ATS crashers with gdb

Posted by Leif Hedstrom <zw...@apache.org>.
I captured some of this from the discussions in

     https://cwiki.apache.org/confluence/display/TS/Filing+useful+bug+reports


Please add / edit this as appropriate.

-- leif


Re: Debugging ATS crashers with gdb

Posted by Leif Hedstrom <zw...@apache.org>.
On 06/04/2011 05:07 PM, Igor Galić wrote:
>
> This is *bad* it will be inherited down to *all* processes started
> by root, which are bound to be many.
> traffic_server runs as nobody (per default) so add that user too:

Sure, valid point. If you are concerned about this, use the first option 
of doing ulimit manually.

Fwiw, I don't think it's as bad as you say though. I always do this 
(because I do want to see core files, and I dump them in a directory I 
monitor). I rarely see a core file, and then typically from ATS or HTTPD :).

> your_ats_user		 -	 core		 unlimited

Pretty sure this doesn't work. Since you have to start traffic_server as 
"root", root's limits are the ones that will be used.

-- Leif

Re: Debugging ATS crashers with gdb

Posted by Igor Galić <i....@brainsware.org>.

----- Original Message -----
> 
> 
> ----- Original Message -----
> > Hi all,
> >
> > these questions pops up all the time, how to get a stack traces
> > etc.
> > from a crashing ATS installation. So, here are some quick tips .
> >
> > First, it's important to have a properly built ATS version for
> > debugging to be efficient. The "best" option is to build ATS with
> > something like this:
> >
> > #! /bin/sh
> > #
> > # Created by configure
> >
> > "./configure" \
> > "--enable-static-libts" \
> > "--enable-debug" \
> > "$@"
> >
> > Next, Linux can be finicky when it comes to generating core
> > files, so I'd recommend the following sysctl configs:
> >
> > kernel.core_uses_pid = 1
> > kernel.core_pattern = /tmp/core
> >
> > This tells the kernel to dump the
> > core file into /tmp, with a name like core.12345. Also, you might
> 
> For more info see core(5)
> http://www.freebsd.org/cgi/man.cgi?core
> http://www.kernel.org/doc/man-pages/online/pages/man5/core.5.html


You should really just ignore everything I said in my last email
past this point. It was terribly misinformed.

i

-- 
Igor Galić

Tel: +43 (0) 664 886 22 883
Mail: i.galic@brainsware.org
URL: http://brainsware.org/

Re: Debugging ATS crashers with gdb

Posted by Igor Galić <i....@brainsware.org>.

----- Original Message -----
> Hi all,
> 
> these questions pops up all the time, how to get a stack traces etc.
> from a crashing ATS installation. So, here are some quick tips .
> 
> First, it's important to have a properly built ATS version for
> debugging to be efficient. The "best" option is to build ATS with
> something like this:
> 
> #! /bin/sh
> #
> # Created by configure
> 
> "./configure" \
> "--enable-static-libts" \
> "--enable-debug" \
> "$@"
>
> Next, Linux can be finicky when it comes to generating core
> files, so I'd recommend the following sysctl configs:
>
> kernel.core_uses_pid = 1
> kernel.core_pattern = /tmp/core
>
> This tells the kernel to dump the
> core file into /tmp, with a name like core.12345. Also, you might

For more info see core(5)
http://www.freebsd.org/cgi/man.cgi?core
http://www.kernel.org/doc/man-pages/online/pages/man5/core.5.html


> need to increase the resource limits on the size of the core file
> that can be generated. This can be done with e.g. "ulimit -c
> unlimited", or adding something like this to
> /etc/security/limits.conf:
> 
> root		 -	 core		 unlimited

This is *bad*: it will be inherited by *all* processes started
by root, of which there are bound to be many.
traffic_server runs as nobody (by default), so add that user too:

your_ats_user		 -	 core		 unlimited

> We might still not be able to get a core
> file with all this said and done, so in that case, you might have to
> attach gdb directly to the running traffic_server process. You can
> do that with e.g.
> 
> % sudo gdb /usr/local/bin/traffic_server 12345
>
> Where 12345 is the pid of the running traffic_server process. Once
> you have entered the command above, make sure to type the "cont"
> command to have gdb continue executing traffic_server. Assuming you
> instead got a core file (lets hope), then instead you would run
> 
> % sudo gdb /usr/local/bin/traffic_server /tmp/core.12345
>
> Now with either a core file, or a crasher inside gdb directly, you
> need to submit useful information to us. The best starting point is
> to get some stack traces, so inside gdb type:
> 
> (gdb) set pagination 0
> (gdb) bt
> ...
> (gdb) thread apply all bt

See also gcore, which writes a core file from the running process.
There's also the poor man's profiler:

http://poormansprofiler.org/

I mirror it (with slight modifications) here:
http://blag.esotericsystems.at/igor/hacks/pmprofiler


> This is now useful information, particularly if you compiled ATS like
> mentioned above. Attach this to a bug in Jira, together with
> information from you environment. Useful information includes:
> 
> 
>     • Version of ATS (this is absolutely necessary to do any
>     debugging). More importantly, you should run the latest released
>     version (development releases preferably).
>     • Linux version / platform
>     • Any configuration changes you have made
>     • Any information about what causes the crash, i.e. if you know
>     of a particular request that triggers it, or anything else that
>     would help reproducing the problem, please include it.
>     • Anything else you can possibly think would be useful to help
>     debug the problem, including promises of beer.
> 
> I'm sure I've missed plenty of details here, so feel free to add to
> this.
> 
> Thanks,
> 
> -- leif


i

-- 
Igor Galić

Tel: +43 (0) 664 886 22 883
Mail: i.galic@brainsware.org
URL: http://brainsware.org/
