You are viewing a plain text version of this content. The canonical link for it is here.
Posted to modperl@perl.apache.org by Sharon Pattison <sc...@gmail.com> on 2010/09/20 17:52:50 UTC

Strange process behavior

Hi,


We are experiencing a strange problem on our web servers for our site, and
are not sure if the problem is mod_perl related, but thought we would turn
here for help in case someone else has experienced this issue.
Unfortunately, we are not sure exactly when the problem started, so we can’t
point to any particular change that might have prompted it.



We are using Apache 1.3.42 with mod_perl 1.31 and Perl 5.8.3 on Solaris 9.
What is happening is that a particular Apache child process starts having an
issue where the wrong content is being returned for a request.  The content
returned is always from the correct script, but it has the wrong data.  For
example, if we request a product detail page for a particular product, we
might get a different product, or we might get a message that the product
does not exist, when it does.



After adding a lot of logging to our logs, we discovered that the Apache
request object will have the correct information, but the CGI object never
seems to get the global values reset, so is always has the query_string of
the request where things started to go wrong.  The process will continue to
get and respond to requests, but the CGI object’s query_string never
changes, so the user either gets no information or the wrong information.



For example, if we had a request for mysite.com/product?sku=12345, and the
process that served this request starts to have the problem, then we see
that for every request that process gets, logging $q->query_string gives us
‘sku=12345’.  It seems like the initialize_globals() CGI method is never
being executed in the process cleanup.



There doesn’t seem to be any particular request that triggers this
behavior.  We have checked the server load when we discover the issue is
occurring and everything seems normal.  The problem process does not seem to
be using any more memory or CPU than any of the other processes, and the
number of processes is fine.



Has anyone experienced anything like this?  Any ideas how we can track down
the problem?



Thank you in advance for any help you can give us.



Sharon

Re: Strange process behavior

Posted by Perrin Harkins <pe...@elem.com>.
[ please reply-all to keep it on the list ]

On Mon, Sep 20, 2010 at 4:07 PM, Sharon Pattison <sc...@gmail.com> wrote:
> We would still like to track down the cause of the problem, if we can.  The
> process does indeed seem to become broken at a particular point and the CGI
> object has an incorrect query_string for each request until the process
> dies.

That sounds like it may be a problem with the scoping of the variables
you load data into from CGI rather than CGI itself.  Hard to say until
you get a way to reproduce it.

> If you have any other suggestions for finding the root of the problem, we
> would love to hear your ideas.

If you can write a test to determine if this is happening, you can log
the process ID and then use that to look back through your access_log
(if you log process ID there) and see what the sequence was that broke
the process.  Try to write a test that runs at some point in the
request and checks the URL against the values you see from CGI.pm.

- Perrin

Re: Strange process behavior

Posted by Perrin Harkins <pe...@elem.com>.
Hi,

> After adding a lot of logging to our logs, we discovered that the Apache
> request object will have the correct information, but the CGI object never
> seems to get the global values reset, so is always has the query_string of
> the request where things started to go wrong.  The process will continue to
> get and respond to requests, but the CGI object’s query_string never
> changes, so the user either gets no information or the wrong information.

This can happen when you do an internal redirect, which does not
trigger a cleanup.

Some options:
- Use something else instead of CGI.pm.  Apache::Request is one possibility.
- Put in a call to initialize_globals() in an early phase, like
FixupHandler, so you know it has been run.

If the process seems to become broken and always serve the wrong pages
after a certain point, you may have something else wrong, like a
variable scoping issue.

- Perrin

Re: Strange process behavior

Posted by Perrin Harkins <pe...@elem.com>.
On Tue, Sep 21, 2010 at 12:09 PM, Sharon Pattison <sc...@gmail.com> wrote:
> We added some logging to several of our scripts to log when $r->args and
> $q->query_string do not match.  What we find for the bad process is that $r
> always has the correct data for the new request, but the value of
> $q->query_string never changes once the problem starts.

One quick thing to check: make sure you have the very latest version
of CGI.pm installed!  This often fixes mysterious CGI.pm problems.

> Here's a shortened
> example of what we are logging:

I'd suggest you log enough that you can see all the requests sent to a
process before it broke.  Then try duplicating those with an LWP
script and see if you can reproduce the breakage.

- Perrin

Re: Strange process behavior

Posted by Sharon Pattison <sc...@gmail.com>.
Hi,

I think I sent my last reply just to Tuomo, so am sending again to the
list......

Thanks for the suggestion.  I checked our error logs and did not find that
message anywhere.

We added some logging to several of our scripts to log when $r->args and
$q->query_string do not match.  What we find for the bad process is that $r
always has the correct data for the new request, but the value of
$q->query_string never changes once the problem starts.  Here's a shortened
example of what we are logging:

[Sat Sep 18 01:14:06 2010] [error]  INFO LOGGER: Mismatched query strings: r
= item_code=WW&item_no=293845&view=details, q = sku=828752&slide=2||PID:
2309
[Sat Sep 18 01:14:10 2010] [error]  INFO LOGGER: Mismatched query strings: r
= sp=102686, q = sku=828752&slide=2||PID: 2309
[Sat Sep 18 01:14:12 2010] [error]  INFO LOGGER: Mismatched query strings: r
= , q = sku=828752&slide=2||PID: 2309
[Sat Sep 18 01:14:20 2010] [error]  INFO LOGGER: Mismatched query strings: r
= N=1108281+5401+10006, q = sku=828752&slide=2||PID: 2309
[Sat Sep 18 01:18:53 2010] [error]  INFO LOGGER: Mismatched query strings: r
= item_no=312108, q = sku=828752&slide=2||PID: 2309

This will continue until the process with PID 2309 dies, and it happens with
any script that is requested.

We are doing some internal redirects as Perrin mentioned, but we are not
redirecting every time we see this happen, so I can't say that redirects are
causing the problem.  At the same time this is happening, all other child
processes are behaving normally.

Sharon

On Tue, Sep 21, 2010 at 8:28 AM, Tuomo Salo <Tu...@cybercom.com> wrote:

> Are you getting "variable will not stay shared" messages in your error_log?
>
> If you are, you might be using a lexical variable outside of the function
> it was defined in. When this happens, you'll get a closure, which is a
> useful tool sometimes, but for us regular people it means that old data
> will unexpectedly stay in the variables from one request to the other.
>
> The fix is to make sure that every variable (that is not meant to be
> global)
> is declared as "my" inside any sub where it is used.
>
>
> http://modperlbook.org/html/22-2-1-Value-of-x-will-not-stay-shared-at-line-5.html
>
>  -Tuomo
>
> On Mon, Sep 20, 2010 at 11:52:50AM -0400, Sharon Pattison wrote:
> > Hi,
> >
> >
> > We are experiencing a strange problem on our web servers for our site,
> and
> > are not sure if the problem is mod_perl related, but thought we would
> turn
> > here for help in case someone else has experienced this issue.
> > Unfortunately, we are not sure exactly when the problem started, so we
> can’t
> > point to any particular change that might have prompted it.
> >
> >
> >
> > We are using Apache 1.3.42 with mod_perl 1.31 and Perl 5.8.3 on Solaris
> 9.
> > What is happening is that a particular Apache child process starts having
> an
> > issue where the wrong content is being returned for a request.  The
> content
> > returned is always from the correct script, but it has the wrong data.
>  For
> > example, if we request a product detail page for a particular product, we
> > might get a different product, or we might get a message that the product
> > does not exist, when it does.
> >
> >
> >
> > After adding a lot of logging to our logs, we discovered that the Apache
> > request object will have the correct information, but the CGI object
> never
> > seems to get the global values reset, so is always has the query_string
> of
> > the request where things started to go wrong.  The process will continue
> to
> > get and respond to requests, but the CGI object’s query_string never
> > changes, so the user either gets no information or the wrong information.
> >
> >
> >
> > For example, if we had a request for mysite.com/product?sku=12345, and
> the
> > process that served this request starts to have the problem, then we see
> > that for every request that process gets, logging $q->query_string gives
> us
> > ‘sku=12345’.  It seems like the initialize_globals() CGI method is never
> > being executed in the process cleanup.
> >
> >
> >
> > There doesn’t seem to be any particular request that triggers this
> > behavior.  We have checked the server load when we discover the issue is
> > occurring and everything seems normal.  The problem process does not seem
> to
> > be using any more memory or CPU than any of the other processes, and the
> > number of processes is fine.
> >
> >
> >
> > Has anyone experienced anything like this?  Any ideas how we can track
> down
> > the problem?
> >
> >
> >
> > Thank you in advance for any help you can give us.
> >
> >
> >
> > Sharon
>

Re: Strange process behavior

Posted by Tuomo Salo <Tu...@cybercom.com>.
Are you getting "variable will not stay shared" messages in your error_log?

If you are, you might be using a lexical variable outside of the function
it was defined in. When this happens, you'll get a closure, which is a 
useful tool sometimes, but for us regular people it means that old data
will unexpectedly stay in the variables from one request to the other.

The fix is to make sure that every variable (that is not meant to be global)
is declared as "my" inside any sub where it is used.

http://modperlbook.org/html/22-2-1-Value-of-x-will-not-stay-shared-at-line-5.html

  -Tuomo

On Mon, Sep 20, 2010 at 11:52:50AM -0400, Sharon Pattison wrote:
> Hi,
> 
> 
> We are experiencing a strange problem on our web servers for our site, and
> are not sure if the problem is mod_perl related, but thought we would turn
> here for help in case someone else has experienced this issue.
> Unfortunately, we are not sure exactly when the problem started, so we can’t
> point to any particular change that might have prompted it.
> 
> 
> 
> We are using Apache 1.3.42 with mod_perl 1.31 and Perl 5.8.3 on Solaris 9.
> What is happening is that a particular Apache child process starts having an
> issue where the wrong content is being returned for a request.  The content
> returned is always from the correct script, but it has the wrong data.  For
> example, if we request a product detail page for a particular product, we
> might get a different product, or we might get a message that the product
> does not exist, when it does.
> 
> 
> 
> After adding a lot of logging to our logs, we discovered that the Apache
> request object will have the correct information, but the CGI object never
> seems to get the global values reset, so is always has the query_string of
> the request where things started to go wrong.  The process will continue to
> get and respond to requests, but the CGI object’s query_string never
> changes, so the user either gets no information or the wrong information.
> 
> 
> 
> For example, if we had a request for mysite.com/product?sku=12345, and the
> process that served this request starts to have the problem, then we see
> that for every request that process gets, logging $q->query_string gives us
> ‘sku=12345’.  It seems like the initialize_globals() CGI method is never
> being executed in the process cleanup.
> 
> 
> 
> There doesn’t seem to be any particular request that triggers this
> behavior.  We have checked the server load when we discover the issue is
> occurring and everything seems normal.  The problem process does not seem to
> be using any more memory or CPU than any of the other processes, and the
> number of processes is fine.
> 
> 
> 
> Has anyone experienced anything like this?  Any ideas how we can track down
> the problem?
> 
> 
> 
> Thank you in advance for any help you can give us.
> 
> 
> 
> Sharon