Posted to dev@jmeter.apache.org by Philippe Mouawad <ph...@gmail.com> on 2014/03/21 11:14:43 UTC

Re: Bug 56119 - dealing with idle connection timeout and dropped connection

Hello,
Regarding this item, I think we should make retry behaviour (or stale
check) reflect what browsers do.

It seems browsers by default do some retries, but I have not yet been able
to find a reference for this behaviour.

Regards
Philippe


On Wed, Feb 26, 2014 at 2:59 AM, sebb <se...@gmail.com> wrote:

> On 23 February 2014 04:38, sebb <se...@gmail.com> wrote:
> > On 19 February 2014 21:51, sebb <se...@gmail.com> wrote:
> >> On 19 February 2014 21:03, Philippe Mouawad <ph...@gmail.com> wrote:
> >>> On Wed, Feb 19, 2014 at 9:58 PM, sebb <se...@gmail.com> wrote:
> >>>
> >>>> On 19 February 2014 18:49, sebb <se...@gmail.com> wrote:
> >>>> > It looks as though the problem reported in Bug 56119 was due to the
> >>>> > server dropping connections that have been idle too long.
> >>>> >
> >>>> > There may also be servers that only allow a connection to be reused
> >>>> > a certain number of times (this does not seem to have been the case
> >>>> > here).
> >>>> >
> >>>> > This email is to discuss what JMeter could perhaps do to make it
> >>>> > easier to test such servers.
> >>>> >
> >>>> > The two existing work-rounds are:
> >>>> > - disable Keep-Alive
> >>>> > - enable staleCheck
> >>>> >
> >>>> > Neither is ideal; the first is awkward to use, and staleCheck can
> >>>> > generate unnecessary additional traffic (which is why it was
> >>>> > disabled in 2.11).
> >>>> >
> >>>> > I can think of two possible approaches:
> >>>> >
> >>>> > 1) Proactively shut connections. This would be easy for servers that
> >>>> > limit reuse.
> >>>> > Just count reuses and turn off keep-alive when a specified limit is
> >>>> > reached.
> >>>> > Not so easy for idle timeouts; one cannot retroactively disable
> >>>> > keep-alive.
> >>>> >
> >>>> > 2) Deal with the disconnects when they occur.
> >>>> > The code needs to distinguish which errors are retriable, and may
> >>>> > need to distinguish at what point the failure occurs. For example,
> >>>> > even a POST ought to be retriable if JMeter is unable to send any
> >>>> > data on the connection.
> >>>> >
> >>>> > We also need to consider how one might report retries.
> >>>> > I think the tester needs to be able to find out if additional
> >>>> > requests have been made by JMeter.
> >>>>
> >>>> Further testing against the ASF servers shows that HC 4.2.x does
> >>>> handle idle timeouts without needing to use the staleCheck option.
> >>>> This relies on the server sending a header of the form:
> >>>>
> >>>> Keep-Alive: timeout=5, max=100
> >>>>
> >>>> In this case, the connection is automatically recreated if necessary
> >>>> when the next sampler runs.
> >>>> If the server fails to send the header, then the connection may be
> >>>> dropped unexpectedly (which is what was happening with Bug 56119).
> >>>> So another approach might be to allow an optional keep-alive timeout
> >>>> in case the server does not provide one.
> >>>>
> >>>> Or we could take the view that there is nothing to fix in JMeter.
> >>>> The Keep-Alive header is there for a reason: it tells the client when
> >>>> it is safe to reuse the connection.
> >>>> If the server fails to send it, then it is broken, and so the failed
> >>>> samples are to be expected.
> >>>>
> >>>
> >>> I think we need to do something, at least for servers like Amazon S3,
> >>> which close connections after a number of uses.
> >>> Did you check whether this kind of server sends a Keep-Alive header?
> >>
> >> I just tested again with jmeter.a.o.
> >> It returns headers of the form:
> >>
> >> Keep-Alive: timeout=5, max=100
> >> Connection: Keep-Alive
> >> ...
> >> Keep-Alive: timeout=5, max=99
> >> Connection: Keep-Alive
> >> ...
> >> etc
> >> ...
> >> Keep-Alive: timeout=5, max=1
> >> Connection: Keep-Alive
> >> ...
> >> Connection: close
> >>
> >> So the HC connection manager does not need to keep track of the
> >> remaining re-use count; the server disconnects at the end of the last
> >> request.
> >> Nice and simple.
> >>
> >> I assume S3 does the same as jmeter.a.o if it is well-behaved.
> >
> > I asked on the HC dev list, and found out that the Keep-Alive header
> > is optional.
> > So servers that don't send it are not misconfigured - though of course
> > it helps the client if they send the header.
> >
> > Also, the HC default retry processing does handle the disconnect case,
> > i.e. where the server disconnects just as the client starts to send the
> > next request.
> > Setting retry count to 1 allows the request to be retried.
> >
> > JMeter previously used to enable retries, but we found this had
> > adverse effects, as it could generate extra requests.
> > It turns out that the default retry handler always retries failed
> > idempotent requests (e.g. GET), even if they were successfully sent.
> > This is perhaps what caused the additional traffic.
> >
> > So I think we need an amended version of the retry handler which only
> > retries failed sends, if the retry count is 0.
> > We can use the HC version for retry counts > 0.
>
> Unfortunately this does not work, as the disconnect is only detected
> once the request has been sent.
>
> However, it is possible to specify an idle timeout to deal with a
> missing Keep-Alive header.
> I've tried that and it works, but we need some decisions on how to
> implement it.
> I'll send a separate e-mail for that.
>
> It may also be possible to code a conditional stale connection check
> which is not applied every single time.
> e.g. it could be done after N requests or when the connection has been
> idle for T seconds.
> I'll look into this a bit more and report back in another mail.
>
> >>> Anyway, on my side I think what was changed in 2.11 should not be
> >>> reverted, because with correctly configured servers you don't get these
> >>> errors. I ran 3 test campaigns on different servers with 2.11 and never
> >>> saw this kind of issue.
> >>
> >> Agreed, no need to revert.
> >>
> >>> But maybe we should document it better somewhere.
> >>
> >> Yes, the error and likely cause should be documented.
> >> Probably easiest to start as a Wiki page.
> >>
>



-- 
Cordialement.
Philippe Mouawad.
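
The idea quoted above, of honouring the server's Keep-Alive timeout and
falling back to an optional client-side value when the header is missing,
could be sketched roughly as follows. This is a minimal illustration in plain
Java, not JMeter or HttpClient code: the class name, the method name and the
fallbackMillis parameter are assumptions made for the example.

```java
import java.util.Locale;

/** Sketch only: derive a connection idle limit from a Keep-Alive header value. */
final class KeepAliveTimeout {

    /**
     * headerValue is the raw Keep-Alive response header value, for example
     * "timeout=5, max=100", or null if the server did not send the header.
     * fallbackMillis is a hypothetical user-configured default.
     * Returns the idle limit in milliseconds.
     */
    static long idleLimitMillis(String headerValue, long fallbackMillis) {
        if (headerValue == null) {
            return fallbackMillis; // no header: use the configured fallback
        }
        for (String part : headerValue.split(",")) {
            String[] kv = part.trim().split("=", 2);
            if (kv.length == 2 && kv[0].trim().toLowerCase(Locale.ROOT).equals("timeout")) {
                try {
                    // The header expresses the timeout in seconds
                    return Long.parseLong(kv[1].trim()) * 1000L;
                } catch (NumberFormatException e) {
                    return fallbackMillis; // malformed value: use the fallback
                }
            }
        }
        return fallbackMillis; // header present but without a timeout parameter
    }

    public static void main(String[] args) {
        System.out.println(idleLimitMillis("timeout=5, max=100", 2000L));
        System.out.println(idleLimitMillis(null, 2000L));
    }
}
```

A connection that has been idle for longer than the returned value would be
closed and recreated before the next sampler uses it, rather than reused.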

Re: Bug 56119 - dealing with idle connection timeout and dropped connection

Posted by Philippe Mouawad <ph...@gmail.com>.
Hello,
As discussed, I wrote this:

   - https://wiki.apache.org/jmeter/JMeterSocketClosed

Your review is welcome.

Regards

Philippe



On Fri, Mar 21, 2014 at 12:48 PM, sebb <se...@gmail.com> wrote:

> On 21 March 2014 10:14, Philippe Mouawad <ph...@gmail.com> wrote:
> > Hello,
> > Regarding this item, I think we should make retry behaviour (or stale
> > check) reflect what browsers do.
>
> Maybe, but one of the reasons we changed the code was that JMeter was
> making additional unexpected requests, putting more load on the
> server.
> That was mainly due to the unconditional stale checking.
>
> I think we need to consider collecting statistics on retries - perhaps
> an extra field in the sample result?
> Or perhaps a sub-sample?
>
> And the behaviour should ideally be configurable.
> In this case a property would be OK as I think the behaviour does not
> need to be host-specific.
>
> > It seems browsers by default do some retries, but I have not yet been
> > able to find a reference for this behaviour.
>
> Yes, we need to find out what sort of errors cause the browser to
> retry, and how many times.
> However, note that browsers may use asynchronous I/O which would allow
> them to detect connection failure much earlier.
> With synchronous I/O it's not generally possible to tell if the connection
> dropped before the request was completely sent.
> So one cannot determine if a non-idempotent request should be auto-retried.
> [That's where the conditional stale check is needed]
>



-- 
Cordialement.
Philippe Mouawad.

Re: Bug 56119 - dealing with idle connection timeout and dropped connection

Posted by sebb <se...@gmail.com>.
On 21 March 2014 10:14, Philippe Mouawad <ph...@gmail.com> wrote:
> Hello,
> Regarding this item, I think we should make retry behaviour (or stale
> check) reflect what browsers do.

Maybe, but one of the reasons we changed the code was that JMeter was
making additional unexpected requests, putting more load on the
server.
That was mainly due to the unconditional stale checking.

I think we need to consider collecting statistics on retries - perhaps
an extra field in the sample result?
Or perhaps a sub-sample?

And the behaviour should ideally be configurable.
In this case a property would be OK as I think the behaviour does not
need to be host-specific.
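
To make the two points above concrete (a retry limit read from a property,
and a retry counter that could later be reported with the sample), here is a
rough sketch in plain Java. It is illustrative only: the class, the property
name "http.retry.count" and the requestFullySent flag are all hypothetical,
and as noted further down, with synchronous I/O it is generally hard to know
whether the request was fully sent.

```java
import java.util.Properties;

/** Sketch only: a configurable retry decision that also counts retries. */
final class RetryPolicy {
    private final int maxRetries;
    private int retriesPerformed; // could be reported alongside the sample result

    RetryPolicy(Properties props) {
        // "http.retry.count" is a hypothetical property name used for illustration
        this.maxRetries = Integer.parseInt(props.getProperty("http.retry.count", "0"));
    }

    /**
     * attempt is the number of attempts already made (1 after the first failure).
     * A request is retried only if the limit allows it and it is either
     * idempotent (e.g. GET) or known not to have been fully sent.
     */
    boolean shouldRetry(int attempt, boolean idempotent, boolean requestFullySent) {
        if (attempt > maxRetries) {
            return false;
        }
        boolean safeToRetry = idempotent || !requestFullySent;
        if (safeToRetry) {
            retriesPerformed++;
        }
        return safeToRetry;
    }

    int retriesPerformed() {
        return retriesPerformed;
    }
}
```

With the default of 0 nothing is ever retried, so no extra requests reach the
server unless the tester opts in, and the counter makes any retries visible.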

> It seems browsers by default do some retries, but I have not yet been able
> to find a reference for this behaviour.

Yes, we need to find out what sort of errors cause the browser to
retry, and how many times.
However, note that browsers may use asynchronous I/O which would allow
them to detect connection failure much earlier.
With synchronous I/O it's not generally possible to tell if the connection
dropped before the request was completely sent.
So one cannot determine if a non-idempotent request should be auto-retried.
[That's where the conditional stale check is needed]
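
A conditional stale check of the kind mentioned above (run only after N
requests, or after the connection has been idle for T seconds) might look
something like the following. This is a sketch in plain Java with made-up
names; it only decides when a check is due and does not perform the check
itself.

```java
/** Sketch only: decide when a stale-connection check is worth doing. */
final class ConditionalStaleCheck {
    private final int maxRequestsBetweenChecks; // the "N" above
    private final long maxIdleMillis;           // the "T" above, in milliseconds
    private int requestsSinceCheck;
    private long lastUsedMillis;

    ConditionalStaleCheck(int maxRequestsBetweenChecks, long maxIdleMillis, long nowMillis) {
        this.maxRequestsBetweenChecks = maxRequestsBetweenChecks;
        this.maxIdleMillis = maxIdleMillis;
        this.lastUsedMillis = nowMillis;
    }

    /** Call before reusing a connection; true means a stale check should run first. */
    boolean shouldCheck(long nowMillis) {
        boolean idleTooLong = nowMillis - lastUsedMillis >= maxIdleMillis;
        boolean tooManyRequests = requestsSinceCheck >= maxRequestsBetweenChecks;
        return idleTooLong || tooManyRequests;
    }

    /** Call after each request that used the connection. */
    void recordRequest(long nowMillis) {
        requestsSinceCheck++;
        lastUsedMillis = nowMillis;
    }

    /** Call after a stale check has been performed. */
    void recordCheck() {
        requestsSinceCheck = 0;
    }
}
```

Compared with the unconditional check, this keeps the extra traffic down to
the cases where a dropped connection is actually likely.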
