Posted to users@trafficserver.apache.org by Pranav Desai <pr...@gmail.com> on 2010/09/21 03:20:27 UTC

Re: errors and shutdown message in 2.1.2 under load (TS-441)

On Thu, Sep 16, 2010 at 3:30 PM, Pranav Desai <pr...@gmail.com> wrote:
> On Thu, Sep 16, 2010 at 2:03 PM, Leif Hedstrom <zw...@apache.org> wrote:
>> On 09/16/2010 02:45 PM, Pranav Desai wrote:
>>
>> On Thu, Sep 16, 2010 at 12:33 PM, Leif Hedstrom <zw...@apache.org> wrote:
>>
>>  On 09/16/2010 01:26 PM, Pranav Desai wrote:
>>
>> Hi!
>>
>> I am running a load test with some video files to see . I am using
>> curl-loader to generate the load. I have modified it to add a random
>> number to the URLs before sending so I can test with a single URL and
>> still stress the cache. The webserver is a lighttpd server with
>> rewrite rules to translate the random strings back to a common URL.
>> The URL is essentially a 15MB video file. I can provide more details
>> on the setup if needed.
>>
>> Ok, I've created https://issues.apache.org/jira/browse/TS-441   with this
>> information. If you can find a core file (or, run traffic_server under gdb),
>> and get a stack trace, that would be very helpful. Also, when it crashes,
>> you might get a stack trace in /var/log/messages and/or one of the log files
>> in the .../var/log/trafficserver  directory.
>>
>>
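
[Editor's sketch] The load-generation trick quoted above (a random number added to each URL, with origin-side rewrite rules mapping it back to one file) can be illustrated roughly as follows. The host name, path layout, and the "r<number>" token format are assumptions made for illustration only; the thread does not say how curl-loader or the lighttpd rules were actually set up.

```python
import random
import re

# Hypothetical origin URL; the real test URL is not given in the thread.
BASE_URL = "http://origin.example.com/videos/sample.flv"

# Matches the hypothetical random path segment, e.g. "/r123456/".
_TOKEN_SEGMENT = re.compile(r"/r\d+/")


def randomized_url(base_url, rng):
    """Insert a random token as an extra path segment.

    Each distinct token yields a distinct cache key on the proxy,
    even though every request maps to the same origin object.
    """
    prefix, _, filename = base_url.rpartition("/")
    token = rng.randrange(10**9)
    return f"{prefix}/r{token}/{filename}"


def canonical_url(url):
    """Strip the random segment again, as an origin rewrite rule would."""
    return _TOKEN_SEGMENT.sub("/", url, count=1)


if __name__ == "__main__":
    rng = random.Random()
    for _ in range(3):
        url = randomized_url(BASE_URL, rng)
        print(url, "->", canonical_url(url))
```

Because every request misses the cache and is written as a new object, a single 15MB file is enough to keep the cache write path busy.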

I got the stack trace. I have updated the bug with the trace, but here it is.

FATAL: HTTP.cc:1526: failed assert `!"unknown m_polarity"`
./bin/traffic_server - STACK TRACE:
./bin/traffic_server(ink_fatal+0x86)[0x6f2056]
./bin/traffic_server(_ink_assert+0x81)[0x6f0d61]
./bin/traffic_server(_ZN11HTTPHdrImpl9unmarshalEl+0x35)[0x5ea385]
./bin/traffic_server(_ZN7HdrHeap9unmarshalEiiPP14HdrHeapObjImplP11RefCountObj+0x146)[0x5df926]
./bin/traffic_server(_ZN8HTTPInfo9unmarshalEPciP11RefCountObj+0xc5)[0x5ea205]
./bin/traffic_server(_ZN7CacheVC14handleReadDoneEiP5Event+0x750)[0x6648a0]
./bin/traffic_server(_ZN19AIOCallbackInternal11io_completeEiPv+0x26)[0x66ce26]
./bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x22f)[0x6e7c0f]
./bin/traffic_server(_ZN7EThread7executeEv+0x1aa)[0x6e810a]
./bin/traffic_server[0x6e76da]
/lib64/libpthread.so.0[0x7f4a565e52e7]
/lib64/libc.so.6(clone+0x6d)[0x32a18ce3bd]
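
[Editor's sketch] The frames above follow the "binary(symbol+offset)[address]" layout that glibc's backtrace_symbols() produces. A small parser like the one below can split them into fields before the mangled C++ names are passed to a demangler such as c++filt (for example, _ZN7EThread7executeEv demangles to EThread::execute()). The Frame type and function names here are made up for illustration and are not part of Traffic Server.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    binary: str
    symbol: Optional[str]   # mangled name, or None if the frame has no symbol
    offset: Optional[int]   # offset into the symbol, or None
    address: int            # absolute return address


# binary, optional "(symbol+0xOFFSET)", then "[0xADDRESS]"
_FRAME_RE = re.compile(
    r"^(?P<binary>[^(\[\s]+)"
    r"(?:\((?P<symbol>[^+)]*)(?:\+(?P<offset>0x[0-9a-fA-F]+))?\))?"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frame(line):
    """Parse one backtrace line; return None for non-frame lines."""
    m = _FRAME_RE.match(line.strip())
    if m is None:
        return None
    offset = m.group("offset")
    return Frame(
        binary=m.group("binary"),
        symbol=m.group("symbol") or None,
        offset=int(offset, 16) if offset else None,
        address=int(m.group("address"), 16),
    )


if __name__ == "__main__":
    sample = [
        "./bin/traffic_server - STACK TRACE:",
        "./bin/traffic_server(_ZN7EThread7executeEv+0x1aa)[0x6e810a]",
        "./bin/traffic_server[0x6e76da]",
    ]
    for line in sample:
        print(parse_frame(line))
```

Reading the trace bottom-up, the event thread completes an AIO read, CacheVC::handleReadDone unmarshals the stored HTTP headers, and the assert fires inside HTTPHdrImpl::unmarshal, so the failure happens while reading an object back from the cache.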


I will try to dig deeper, but if you have any ideas or suggestions, I can
try those out.

Thanks
-- Pranav

Re: errors and shutdown message in 2.1.2 under load (TS-441)

Posted by Pranav Desai <pr...@gmail.com>.
On Wed, Sep 22, 2010 at 3:44 PM, Leif Hedstrom <zw...@apache.org> wrote:
>  On 09/22/2010 04:36 PM, Pranav Desai wrote:
>>
>> On Wed, Sep 22, 2010 at 2:55 PM, Leif Hedstrom<zw...@apache.org>  wrote:
>>>
>>>  On 09/22/2010 03:37 PM, Pranav Desai wrote:
>>>>
>>>> Sep 21 18:51:48 c1b14 traffic_manager[29617]: NOTE:
>>>> RLIMIT_NOFILE(7):cur(10000),max(10000)
>>>> Sep 21 18:51:48 c1b14 traffic_manager[29617]: {140542945421104}
>>>> STATUS: opened /usr/local/atssvn/var/log/trafficserver/manager.log
>>>> Sep 21 18:51:49 c1b14 traffic_manager[29617]: {140542945421104} FATAL:
>>>> [ClusterCom::ClusterCom] path + filename too large
>>>
>>> This is already fixed on "trunk", and will be on v2.1.3.
>>>
>> I was on the trunk. Anyway I will wait for 2.1.3 and try it out again.
>
> yeah, but I fixed it earlier today :). "svn up" and see if it fixes it (it
> really should).
>

Oh cool! I haven't done that since yesterday. Will try it soon.


> -- leif
>
>

Re: errors and shutdown message in 2.1.2 under load (TS-441)

Posted by Leif Hedstrom <zw...@apache.org>.
  On 09/22/2010 04:36 PM, Pranav Desai wrote:
> On Wed, Sep 22, 2010 at 2:55 PM, Leif Hedstrom<zw...@apache.org>  wrote:
>>   On 09/22/2010 03:37 PM, Pranav Desai wrote:
>>> Sep 21 18:51:48 c1b14 traffic_manager[29617]: NOTE:
>>> RLIMIT_NOFILE(7):cur(10000),max(10000)
>>> Sep 21 18:51:48 c1b14 traffic_manager[29617]: {140542945421104}
>>> STATUS: opened /usr/local/atssvn/var/log/trafficserver/manager.log
>>> Sep 21 18:51:49 c1b14 traffic_manager[29617]: {140542945421104} FATAL:
>>> [ClusterCom::ClusterCom] path + filename too large
>> This is already fixed on "trunk", and will be on v2.1.3.
>>
> I was on the trunk. Anyway I will wait for 2.1.3 and try it out again.

yeah, but I fixed it earlier today :). "svn up" and see if it fixes it 
(it really should).

-- leif


Re: errors and shutdown message in 2.1.2 under load (TS-441)

Posted by Pranav Desai <pr...@gmail.com>.
On Wed, Sep 22, 2010 at 2:55 PM, Leif Hedstrom <zw...@apache.org> wrote:
>  On 09/22/2010 03:37 PM, Pranav Desai wrote:
>>
>> Sep 21 18:51:48 c1b14 traffic_manager[29617]: NOTE:
>> RLIMIT_NOFILE(7):cur(10000),max(10000)
>> Sep 21 18:51:48 c1b14 traffic_manager[29617]: {140542945421104}
>> STATUS: opened /usr/local/atssvn/var/log/trafficserver/manager.log
>> Sep 21 18:51:49 c1b14 traffic_manager[29617]: {140542945421104} FATAL:
>> [ClusterCom::ClusterCom] path + filename too large
>
> This is already fixed on "trunk", and will be on v2.1.3.
>

I was on the trunk. Anyway I will wait for 2.1.3 and try it out again.

-- Pranav


> -- leif
>
>> Sep 21 18:51:49 c1b14 traffic_manager[29617]: {140542945421104} FATAL:
>>  (last system error 2: No such file or directory)
>> Sep 21 18:51:49 c1b14 traffic_cop[29615]: cop received child status
>> signal [29617 256]
>> Sep 21 18:51:49 c1b14 traffic_cop[29615]: traffic_manager not running,
>> making sure traffic_server is dead
>
>

Re: errors and shutdown message in 2.1.2 under load (TS-441)

Posted by Leif Hedstrom <zw...@apache.org>.
  On 09/22/2010 03:37 PM, Pranav Desai wrote:
> Sep 21 18:51:48 c1b14 traffic_manager[29617]: NOTE:
> RLIMIT_NOFILE(7):cur(10000),max(10000)
> Sep 21 18:51:48 c1b14 traffic_manager[29617]: {140542945421104}
> STATUS: opened /usr/local/atssvn/var/log/trafficserver/manager.log
> Sep 21 18:51:49 c1b14 traffic_manager[29617]: {140542945421104} FATAL:
> [ClusterCom::ClusterCom] path + filename too large

This is already fixed on "trunk", and will be on v2.1.3.

-- leif

> Sep 21 18:51:49 c1b14 traffic_manager[29617]: {140542945421104} FATAL:
>   (last system error 2: No such file or directory)
> Sep 21 18:51:49 c1b14 traffic_cop[29615]: cop received child status
> signal [29617 256]
> Sep 21 18:51:49 c1b14 traffic_cop[29615]: traffic_manager not running,
> making sure traffic_server is dead


Re: errors and shutdown message in 2.1.2 under load (TS-441)

Posted by Pranav Desai <pr...@gmail.com>.
On Tue, Sep 21, 2010 at 9:33 AM, John Plevyak <jp...@apache.org> wrote:
>
>
> OK, I pulled 2.1.2, and it crashes almost immediately on the same
> test that the SVN version runs overnight on.
>
> I would say that 2.1.2 has a bug which has been fixed.
>
> Please try SVN and we'll see about expediting the next version.
>

With the latest SVN I am not seeing the same problem. However, there
are some problems with autoconf, which I mentioned in an earlier
mail. I disabled all tproxy-related code and got it to run.

Also, traffic_manager seems to be having some trouble starting up.
Below is the error message, so I am running with traffic_server alone.
Not sure if it's something I am doing.

For now, traffic_server alone should work for me for testing until
2.1.3 is out. The stats won't be there, but I guess I have system stats
that I can monitor.

Will wait for the 2.1.3 release.

Thanks for all your help.

-- Pranav

/var/log/messages for traffic_manager
-------------------------------------------
Sep 21 18:51:48 c1b14 traffic_cop[29615]: --- Cop Starting [Version:
Apache Traffic Server - traffic_cop - 2.1.3-unstable - (build # 82118
on Sep 21 2010 at 18:09:22)] ---
Sep 21 18:51:48 c1b14 traffic_cop[29615]: traffic_manager not running,
making sure traffic_server is dead
Sep 21 18:51:48 c1b14 traffic_cop[29615]: spawning traffic_manager
Sep 21 18:51:48 c1b14 traffic_manager[29617]: NOTE: --- Manager Starting ---
Sep 21 18:51:48 c1b14 traffic_manager[29617]: NOTE: Manager Version:
Apache Traffic Server - traffic_manager - 2.1.3-unstable - (build #
82118 on Sep 21 2010 at 18:13:08)
Sep 21 18:51:48 c1b14 traffic_manager[29617]: NOTE:
RLIMIT_NOFILE(7):cur(10000),max(10000)
Sep 21 18:51:48 c1b14 traffic_manager[29617]: {140542945421104}
STATUS: opened /usr/local/atssvn/var/log/trafficserver/manager.log
Sep 21 18:51:49 c1b14 traffic_manager[29617]: {140542945421104} FATAL:
[ClusterCom::ClusterCom] path + filename too large
Sep 21 18:51:49 c1b14 traffic_manager[29617]: {140542945421104} FATAL:
 (last system error 2: No such file or directory)
Sep 21 18:51:49 c1b14 traffic_cop[29615]: cop received child status
signal [29617 256]
Sep 21 18:51:49 c1b14 traffic_cop[29615]: traffic_manager not running,
making sure traffic_server is dead

Re: errors and shutdown message in 2.1.2 under load (TS-441)

Posted by Pranav Desai <pr...@gmail.com>.
On Tue, Sep 21, 2010 at 9:33 AM, John Plevyak <jp...@apache.org> wrote:
>
>
> OK, I pulled 2.1.2, and it crashes almost immediately on the same
> test that the SVN version runs overnight on.
>
> I would say that 2.1.2 has a bug which has been fixed.
>
> Please try SVN and we'll see about expediting the next version.
>

The configure step is giving me some problems.

svn checkout http://svn.apache.org/repos/asf/trafficserver/traffic/trunk trafficserver
cd trafficserver/
autoreconf -i --force
Putting files in AC_CONFIG_AUX_DIR, `build/aux'.
configure.ac:145: error: possibly undefined macro: AC_MSG_RESULT
      If this token and others are legitimate, please use m4_pattern_allow.
      See the Autoconf documentation.
configure.ac:214: error: possibly undefined macro: AS_IF
configure.ac:828: error: possibly undefined macro: AC_MSG_FAILURE
configure.ac:1041: error: possibly undefined macro: AS_CASE
autoreconf: /usr/bin/autoconf failed with exit status: 1

The configure script was still created, so I went ahead and ran it. It failed with:
checking for sys/capability.h... yes
./configure: line 28167: syntax error near unexpected token `"$enable_tproxy",'
./configure: line 28167: `    AS_CASE("$enable_tproxy",'

The only difference in configure.ac between trunk and 2.1.2 is related
to tproxy. So I removed that and it goes further, but the make fails
in UnixConnection.cc, which is probably caused by my changes, since it
fails at ATS_USE_TPROXY.

Let me know if I missed something.

Thanks
-- Pranav


Re: errors and shutdown message in 2.1.2 under load (TS-441)

Posted by John Plevyak <jp...@apache.org>.

OK, I pulled 2.1.2, and it crashes almost immediately on the same
test that the SVN version runs overnight on.

I would say that 2.1.2 has a bug which has been fixed.

Please try SVN and we'll see about expediting the next version.

john




Re: errors and shutdown message in 2.1.2 under load (TS-441)

Posted by John Plevyak <jp...@apache.org>.
On 9/20/2010 9:45 PM, Pranav Desai wrote:
> 
> This is with 2.1.2. Maybe I will try the latest SVN.

If you could, that would be useful, as the disk layout changed a bit when
support for drives with a 4096-byte sector size was added.

> Also
> note that I am testing it with a file based cache. Do you think there
> could be something there ?

Your file size is relatively large, so if your number of clients is large
and your file based cache is small you could be running into wrap-around issues...
where a slow client can't write the whole object before the disk wraps.

How big of a file cache are you running?  How many clients?  I am using raw
disk and at 10 clients (debug build) I am doing between 3 and 5 ops/sec, or
between 45 and 75 MB/sec, with a total latency of around 3 seconds but probably
spiking to more like 5+... which means I would require a cache size of at least
1GB, probably a lot more to be safe (because of random latency spikes).

The code *should* recognize wrap-around, but because it is typically not a problem
in adequately sized production systems it might not be reported.
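
[Editor's sketch] The sizing argument above can be put as rough arithmetic: the data written while the slowest object is still in flight must fit in the cache before the write position wraps. The function below is one reading of that reasoning, not Traffic Server's actual logic; the safety factor of 3 is an assumed margin chosen only to show how the "at least 1GB" figure could arise.

```python
MB = 10**6  # decimal megabytes, matching the MB/sec figures in the message


def min_cache_bytes(object_bytes, ops_per_sec, worst_latency_sec, safety_factor=1):
    """Bytes written to the cache while the slowest transfer is in flight.

    object_bytes * ops_per_sec is the aggregate write rate; multiplying by
    the worst-case latency gives the span the cache must hold so that a
    slow writer is not overtaken by wrap-around.
    """
    throughput = object_bytes * ops_per_sec
    return throughput * worst_latency_sec * safety_factor


if __name__ == "__main__":
    # 15MB objects at 3 to 5 ops/sec gives the 45 to 75 MB/sec quoted above.
    low = 15 * MB * 3
    high = 15 * MB * 5
    print(low // MB, high // MB)

    # Peak rate with a 5 second latency spike, no margin.
    print(min_cache_bytes(15 * MB, 5, 5) // MB, "MB")

    # Same case with an assumed 3x margin for random latency spikes.
    print(min_cache_bytes(15 * MB, 5, 5, safety_factor=3) // MB, "MB")
```

With more clients or slower clients, the in-flight window grows proportionally, which is why a small file-based cache is more exposed to this than a production-sized raw disk.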



Re: errors and shutdown message in 2.1.2 under load (TS-441)

Posted by Pranav Desai <pr...@gmail.com>.
On Mon, Sep 20, 2010 at 9:34 PM, John Plevyak <jp...@apache.org> wrote:
>
> All of these are cache corruption issues.  I can't seem to reproduce them locally.
>
> Are you running the latest SVN ?  There was a bug in the SVN version of ATS
> a little while ago for a couple days before it got fixed.
>

This is with 2.1.2. Maybe I will try the latest SVN.

> Did you clear the cache before running (start with traffic_server -K)?  Changes
> in the cache format are supposed to be tracked by versioning of the database, but
> there might have been a change which wasn't accompanied by a version bump.
>

I have tried it with -K -k, but got the same result. In fact, I clean out all
the files, logs, and cache.db before every run. Could it possibly be
hard drive issues? I am thinking of trying another machine. Also
note that I am testing it with a file-based cache. Do you think there
could be something there?


> If you can provide access to a gdb session with the crash I might be able to figure
> out what is going on...
>

I was short on time so couldn't debug it any further ... hopefully I
can get it done tomorrow ...

-- Pranav


Re: errors and shutdown message in 2.1.2 under load (TS-441)

Posted by John Plevyak <jp...@apache.org>.
All of these are cache corruption issues.  I can't seem to reproduce them locally.

Are you running the latest SVN ?  There was a bug in the SVN version of ATS
a little while ago for a couple days before it got fixed.

Did you clear the cache before running (start with traffic_server -K)?  Changes
in the cache format are supposed to be tracked by versioning of the database, but
there might have been a change which wasn't accompanied by a version bump.

If you can provide access to a gdb session with the crash, I might be able to figure
out what is going on...

john

