Posted to users@trafficserver.apache.org by Steve Cole <co...@itconsul.com> on 2011/06/13 17:58:40 UTC

Tproxy issues

I have ATS working on raw devices now (12 x 15K RPM drives), with more or 
less stock settings for threads and memory.

I turned on the L4 redirect firehose and shot 800 req/s at it, and within about 
a minute, ATS started reporting errors connecting to *all* sites it was asked 
for and for all intents and purposes, "locked up."

Didn't really hear from users as they're used to the Internet being wonky. :)

Anyway, thing is... at first I thought this was ATS scalability settings but 
then I set it up to test with a single browser again and have discovered that 
I can successfully get ATS to exhibit the same behaviour with just a single 
computer and browser, and simply browsing!  It just takes a bit longer.

This is with ATS 3.0.0 beta, FWIW.

So the question is... is this a tproxy thing where the computer has a set 
number of connections that fills up?  Doesn't seem to be, netstat doesn't 
report more than about 4 connections from my browser at once.
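A tally of sockets by state can make the "does a connection table fill up?" question easier to answer than eyeballing raw netstat output. What follows is only a minimal sketch in Python, assuming the usual Linux `netstat -tan` column layout (proto, recv-q, send-q, local, foreign, state). The names `count_tcp_states` and `live_counts` are made up for illustration and are not part of ATS or netstat. The optional `ip` filter matches either the local or the foreign address, on the assumption that in a transparent setup the client's address may show up on either side.

```python
import subprocess
from collections import Counter


def count_tcp_states(netstat_output, ip=None):
    """Count TCP sockets per state in `netstat -tan` style text.

    If `ip` is given, only sockets whose local or foreign host equals
    that address are counted.
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows: proto recv-q send-q local foreign state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        # Strip the port: everything after the last ':' is the port.
        local_host = fields[3].rsplit(":", 1)[0]
        foreign_host = fields[4].rsplit(":", 1)[0]
        if ip is not None and ip not in (local_host, foreign_host):
            continue
        counts[fields[5]] += 1
    return counts


def live_counts(ip=None):
    """Run netstat on this machine and tally its output (hypothetical helper)."""
    output = subprocess.run(
        ["netstat", "-tan"], capture_output=True, text=True, check=True
    ).stdout
    return count_tcp_states(output, ip)
```

Calling `live_counts()` on the proxy box every few seconds while reproducing the hang would show whether any one state (for example TIME_WAIT or SYN_SENT) keeps growing, even when the per-browser ESTABLISHED count stays small.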

And if it's ATS... where to look?  I think it may be, by the way... because 
ATS is what stops connecting to external sites.

Lastly, the load average on the machine seems to sit between 2.5 and 4.5 the 
whole time that ATS is running, regardless of traffic.  I thought this might be 
a poll/epoll issue, but the config log tells me that epoll is being used (and 
the overall CPU time seems to bear this out).

I believe I may have bitten off more than I can chew here and may have to 
return to Squid.  Which is sad, because ATS has some obvious advantages.  But, 
ATS documentation/examples/experience are still quite lacking at this point.

Re: Tproxy issues

Posted by "Alan M. Carroll" <am...@network-geographics.com>.
I'm back and have cleared off enough fires to look at this. I have my test 
system up and running again with 3.0.0 and I can't reproduce the problem, even 
if I have 5 tabs playing Youtube videos simultaneously. I know someone who has 
this in production, with hundreds of clients moving ~50-100G of video a day. 
So I suspect a configuration issue.

Can you provide any more details on your configuration (e.g., how much is 
being cached) and how many videos you need to hit before you get the problem?

>> It does not appear to, no.  That is to say if I explicitly set it in the
>> browser and do not use tproxy mode, I was unable to reproduce the issue.  That
>> doesn't mean it isn't happening... it may be that it scales better or
>> something, I don't really know.  I did try to hit it really hard by opening up
>> many many tabs of pages including Youtube videos.
>>
>> BTW, Youtube seems to be able to quickly reproduce the issue (when tproxy is
>> enabled), probably because there are just so many thumbnail images etc.


Re: Tproxy issues

Posted by Leif Hedstrom <zw...@apache.org>.
On 06/13/2011 01:10 PM, Steve Cole wrote:
> On June 13, 2011 02:05:39 PM Leif Hedstrom wrote:
>> Since you can reproduce it easily, can you see if tproxy vs forward
>> proxy makes any difference? Ie does it hang in either case?
> It does not appear to, no.  That is to say if I explicitly set it in the
> browser and do not use tproxy mode, I was unable to reproduce the issue.  That
> doesn't mean it isn't happening... it may be that it scales better or
> something, I don't really know.  I did try to hit it really hard by opening up
> many many tabs of pages including Youtube videos.
>
> BTW, Youtube seems to be able to quickly reproduce the issue (when tproxy is
> enabled), probably because there are just so many thumbnail images etc.

Ok, the reason I'm asking is because tproxy is a fairly new addition, 
and I only know of one customer using it. Alan M. Carroll is the "lead" 
on that project, but he's been busy eating bangers and mash and haggis 
in the UK for a few weeks.

Did you file a bug on this issue btw? The more information you can 
provide there (under what condition it reproduces etc.), the better. I'm 
personally not familiar with the tproxy additions that amc made a while 
back either (so I'm kinda useless).

-- leif


Re: Tproxy issues

Posted by Steve Cole <co...@itconsul.com>.
On June 13, 2011 02:05:39 PM Leif Hedstrom wrote:
> Since you can reproduce it easily, can you see if tproxy vs forward
> proxy makes any difference? Ie does it hang in either case?

It does not appear to, no.  That is to say if I explicitly set it in the 
browser and do not use tproxy mode, I was unable to reproduce the issue.  That 
doesn't mean it isn't happening... it may be that it scales better or 
something, I don't really know.  I did try to hit it really hard by opening up 
many many tabs of pages including Youtube videos.

BTW, Youtube seems to be able to quickly reproduce the issue (when tproxy is 
enabled), probably because there are just so many thumbnail images etc.

Re: Tproxy issues

Posted by Leif Hedstrom <zw...@apache.org>.
Since you can reproduce it easily, can you see if tproxy vs forward 
proxy makes any difference? I.e., does it hang in either case?

-- leif
