Posted to bugs@httpd.apache.org by bu...@apache.org on 2018/01/25 12:35:03 UTC

[Bug 62044] New: shared memory segments are not found in global list, but appear to exist in kernel.

https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

            Bug ID: 62044
           Summary: shared memory segments are not found in global list,
                    but appear to exist in kernel.
           Product: Apache httpd-2
           Version: 2.4.29
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: critical
          Priority: P2
         Component: mod_proxy_balancer
          Assignee: bugs@httpd.apache.org
          Reporter: mark@blackmans.org
  Target Milestone: ---

With a large number of vhosts (> 1000) and proxy balancer configurations
(> 1000), we very frequently see Apache exit at startup with a configuration
error such as:

[Wed Jan 10 16:28:45.853599 2018] [slotmem_shm:error] [pid 29764:tid
140038537377536] (17)File exists: AH02611: create:
apr_shm_create(/apache24/logs/slotmem-shm-p71143bd8_balancer1.shm) failed

[Wed Jan 10 16:28:45.853641 2018] [:emerg] [pid 29764:tid 140038537377536]
AH00020: Configuration Failed, exiting 

Turning on trace5-level logging, we see entries like the following for a single
balancer worker (I filtered on the balancer SHM name):

[Thu Jan 25 03:48:08.397926 2018] [slotmem_shm:debug] [pid 13310:tid
140455729428224] mod_slotmem_shm.c(364): AH02602: create didn't find
/apache24/logs/slotmem-shm-pe1b232bb_balancer1.shm in global list
[Thu Jan 25 03:48:08.397932 2018] [slotmem_shm:debug] [pid 13310:tid
140455729428224] mod_slotmem_shm.c(374): AH02300: create
/apache24/logs/slotmem-shm-pe1b232bb_balancer1.shm: 1176/2
[Thu Jan 25 03:48:08.398076 2018] [slotmem_shm:debug] [pid 13310:tid
140455729428224] mod_slotmem_shm.c(417): AH02611: create:
apr_shm_create(/apache24/logs/slotmem-shm-pe1b232bb_balancer1.shm) succeeded
[Thu Jan 25 03:48:58.529349 2018] [slotmem_shm:debug] [pid 45813:tid
139795075143424] mod_slotmem_shm.c(364): AH02602: create didn't find
/apache24/logs/slotmem-shm-pe1b232bb_balancer1.shm in global list
[Thu Jan 25 03:48:58.529357 2018] [slotmem_shm:debug] [pid 45813:tid
139795075143424] mod_slotmem_shm.c(374): AH02300: create
/apache24/logs/slotmem-shm-pe1b232bb_balancer1.shm: 1176/2
[Thu Jan 25 03:49:01.835207 2018] [slotmem_shm:debug] [pid 46229:tid
139795075143424] mod_slotmem_shm.c(496): AH02301: attach looking for
/apache24/logs/slotmem-shm-pe1b232bb_balancer1.shm
[Thu Jan 25 03:49:01.835222 2018] [slotmem_shm:debug] [pid 46625:tid
139795075143424] mod_slotmem_shm.c(496): AH02301: attach looking for
/apache24/logs/slotmem-shm-pe1b232bb_balancer1.shm
[Thu Jan 25 03:49:01.835230 2018] [slotmem_shm:debug] [pid 46229:tid
139795075143424] mod_slotmem_shm.c(509): AH02302: attach found
/apache24/logs/slotmem-shm-pe1b232bb_balancer1.shm: 1176/2
[Thu Jan 25 03:49:01.835254 2018] [slotmem_shm:debug] [pid 46625:tid
139795075143424] mod_slotmem_shm.c(509): AH02302: attach found
/apache24/logs/slotmem-shm-pe1b232bb_balancer1.shm: 1176/2
[Thu Jan 25 03:49:01.886171 2018] [slotmem_shm:debug] [pid 47011:tid
139795075143424] mod_slotmem_shm.c(496): AH02301: attach looking for
/apache24/logs/slotmem-shm-pe1b232bb_balancer1.shm
[Thu Jan 25 03:49:01.886284 2018] [slotmem_shm:debug] [pid 47011:tid
139795075143424] mod_slotmem_shm.c(509): AH02302: attach found
/apache24/logs/slotmem-shm-pe1b232bb_balancer1.shm: 1176/2
[Thu Jan 25 03:49:01.899288 2018] [slotmem_shm:debug] [pid 47281:tid
139795075143424] mod_slotmem_shm.c(496): AH02301: attach looking for
/apache24/logs/slotmem-shm-pe1b232bb_balancer1.shm
[Thu Jan 25 03:49:01.899321 2018] [slotmem_shm:debug] [pid 47281:tid
139795075143424] mod_slotmem_shm.c(509): AH02302: attach found
/apache24/logs/slotmem-shm-pe1b232bb_balancer1.shm: 1176/2
[Thu Jan 25 03:53:03.516455 2018] [slotmem_shm:debug] [pid 45813:tid
139795075143424] mod_slotmem_shm.c(364): AH02602: create didn't find
/apache24/logs/slotmem-shm-pe1b232bb_balancer1.shm in global list
[Thu Jan 25 03:53:03.516462 2018] [slotmem_shm:debug] [pid 45813:tid
139795075143424] mod_slotmem_shm.c(374): AH02300: create
/apache24/logs/slotmem-shm-pe1b232bb_balancer1.shm: 1176/2
[Thu Jan 25 03:53:03.516499 2018] [slotmem_shm:error] [pid 45813:tid
139795075143424] (17)File exists: AH02611: create:
apr_shm_create(/apache24/logs/slotmem-shm-pe1b232bb_balancer1.shm) failed



In other words, in the space of five minutes, the balancer SHM was not found in
the global list (03:48:08), was successfully created, was then found several
times, went missing at 03:53:03, and then could not be created, which triggered
an Apache exit (not shown here).

Rather confusingly, the choice of DefaultRuntimeDir has an impact on how
frequently this happens.
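
The sequence described above can be pulled out of a trace-level error log mechanically. What follows is only a minimal sketch in Python, assuming the log keeps the record layout shown in this report (including records wrapped across lines, as in this mail). The helpers `unwrap`, `classify`, `timeline` and `creators` are hypothetical names, not part of httpd.

```python
import re
from collections import defaultdict

# A record starts with a bracketed timestamp such as
# "[Thu Jan 25 03:48:08.397926 2018]"; wrapped continuation lines do not.
STARTS_RECORD = re.compile(r"^\[\w{3} \w{3} [ \d]\d \d{2}:\d{2}:\d{2}")
RECORD = re.compile(
    r"^\[\w{3} \w{3} [ \d]\d (?P<time>\d{2}:\d{2}:\d{2})\.\d+ \d{4}\] "
    r"\[slotmem_shm:\w+\] \[pid (?P<pid>\d+):tid \d+\] (?P<rest>.*)$"
)
SHM_PATH = re.compile(r"/[^\s()]+\.shm")


def unwrap(text):
    """Join wrapped continuation lines back onto the record they belong to."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if STARTS_RECORD.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def classify(rest):
    """Map the message text of a slotmem_shm record to a short event label."""
    if "didn't find" in rest:
        return "create: not in list"
    if "apr_shm_create" in rest:
        return "create: failed" if rest.endswith("failed") else "create: succeeded"
    if "attach found" in rest:
        return "attach: found"
    if "attach looking" in rest:
        return "attach: lookup"
    return "create: attempt"


def timeline(text):
    """Return (time, pid, event, shm_path) tuples for slotmem_shm records."""
    events = []
    for record in unwrap(text):
        match = RECORD.match(record)
        path = SHM_PATH.search(record)
        if match and path:
            events.append((match["time"], int(match["pid"]),
                           classify(match["rest"]), path.group()))
    return events


def creators(events):
    """Group the PIDs that went through the create path, per SHM path."""
    pids = defaultdict(set)
    for _time, pid, event, path in events:
        if event.startswith("create"):
            pids[path].add(pid)
    return dict(pids)
```

More than one PID per path in the result of `creators()` would suggest that several parent processes ran the create path for the same segment.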

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #5 from Ruediger Pluem <rp...@apache.org> ---
(In reply to mark from comment #0)
> With a large number of vhosts ( > 1000 ) and proxy balancer configurations (
> > 1000), we are seeing Apache exit at start up time with a configuration
> error (very frequently) with an error like. 
> 
> [Wed Jan 10 16:28:45.853599 2018] [slotmem_shm:error] [pid 29764:tid
> 140038537377536] (17)File exists: AH02611: create:
> apr_shm_create(/apache24/logs/slotmem-shm-p71143bd8_balancer1.shm) failed
> 
> [Wed Jan 10 16:28:45.853641 2018] [:emerg] [pid 29764:tid 140038537377536]
> AH00020: Configuration Failed, exiting 
> 
> turning on trace5 level logs we see things like the following for a single
> balancer worker (I filtered on the balance SHM name)
> 
> [Thu Jan 25 03:48:08.397926 2018] [slotmem_shm:debug] [pid 13310:tid
> 140455729428224] mod_slotmem_shm.c(364): AH02602: create didn't find
> /apache24/logs/slotmem-shm-pe1b232bb_balancer1.shm in global list


> [Thu Jan 25 03:48:58.529349 2018] [slotmem_shm:debug] [pid 45813:tid
> 139795075143424] mod_slotmem_shm.c(364): AH02602: create didn't find
> /apache24/logs/slotmem-shm-pe1b232bb_balancer1.shm in global list

Hm. The above two lines are weird. mod_proxy_balancer only creates the shm
segments in the post_config phase, where there is still only one httpd process,
but I see two different PIDs in the above log messages. Did you do a graceful
restart between 03:48:08 and 03:48:58?



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #25 from mark@blackmans.org ---
Perhaps, but we're seeing the error in the balancer SHMs as well as the worker
SHMs, even though the balancer SHM already uses conf->id as a distinguisher.

https://github.com/apache/httpd/blob/2.4.29/modules/proxy/mod_proxy_balancer.c#L814

For this balancer (not worker), even with Jim's change, we saw the following.
Summarizing first:

12:26:41  - attach found and attached to slotmem-shm-p701d8bbe_0
12:33:51  - SIGHUP
12:34:54  - create (not attach) fails to find slotmem-shm-p701d8bbe_0
12:34:54  - create fails to create because the SHM key/segment is still in the
            kernel
12:38:46  - create (under a new PID) fails to find slotmem-shm-p701d8bbe_0 but
            successfully creates it, presumably because all attached processes
            had finally detached.

Why didn't the generation change? It was zero before and after the HUP.

[Wed Jan 31 12:26:41.463136 2018] [slotmem_shm:debug] [pid 1322:tid
139715805775616] mod_slotmem_shm.c(463): AH02301: attach looking for
/var/run/http/apache24/tmp/slotmem-shm-p701d8bbe_0.shm
[Wed Jan 31 12:26:41.463169 2018] [slotmem_shm:debug] [pid 1322:tid
139715805775616] mod_slotmem_shm.c(476): AH02302: attach found
/var/run/http/apache24/tmp/slotmem-shm-p701d8bbe_0.shm: 992/6
[Wed Jan 31 12:33:51.761487 2018] [mpm_event:notice] [pid 65265:tid
139715805775616] AH00494: SIGHUP received.  Attempting to restart
[Wed Jan 31 12:34:54.471933 2018] [slotmem_shm:debug] [pid 20672:tid
139965041129216] mod_slotmem_shm.c(331): AH02602: create didn't find
/var/run/http/apache24/tmp/slotmem-shm-p701d8bbe_0.shm in global list
[Wed Jan 31 12:34:54.471939 2018] [slotmem_shm:debug] [pid 20672:tid
139965041129216] mod_slotmem_shm.c(341): AH02300: create
/var/run/http/apache24/tmp/slotmem-shm-p701d8bbe_0.shm: 992/6
[Wed Jan 31 12:34:54.471970 2018] [slotmem_shm:error] [pid 20672:tid
139965041129216] (17)File exists: AH02611: create:
apr_shm_create(/var/run/http/apache24/tmp/slotmem-shm-p701d8bbe_0.shm) failed
[Wed Jan 31 12:38:46.746713 2018] [slotmem_shm:debug] [pid 31117:tid
140506605512448] mod_slotmem_shm.c(331): AH02602: create didn't find
/var/run/http/apache24/tmp/slotmem-shm-p701d8bbe_0.shm in global list
[Wed Jan 31 12:38:46.746719 2018] [slotmem_shm:debug] [pid 31117:tid
140506605512448] mod_slotmem_shm.c(341): AH02300: create
/var/run/http/apache24/tmp/slotmem-shm-p701d8bbe_0.shm: 992/6
[Wed Jan 31 12:38:46.746893 2018] [slotmem_shm:debug] [pid 31117:tid
140506605512448] mod_slotmem_shm.c(384): AH02611: create:
apr_shm_create(/var/run/http/apache24/tmp/slotmem-shm-p701d8bbe_0.shm)
succeeded
[Wed Jan 31 12:38:49.922030 2018] [mpm_event:notice] [pid 31117:tid
140506605512448] AH00489: Apache/2.4.29 (Unix) OpenSSL/1.0.2n mod_fcgid/2.3.9
mod_auth_kerb/5.4 mod_qos/11.43 mod_jk/1.2.42 configured -- resuming



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #15 from mark@blackmans.org ---
We were a bit keen for a fix for this morning (29 Jan), so we went with Jim's
patch in trunk as it looked very conservative (extending tested behaviours to
Unix from Windows). I didn't see your patch at that point.

http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/slotmem/mod_slotmem_shm.c?r1=1822341&r2=1822340&pathrev=1822341&view=patch

and we're now rolling that out across the pre-production environments today, 29
Jan.

I can't really comment on the relative merits of either approach, so can you
give me a recommendation? Is this later patch either more robust or more
comprehensive than Jim's?  If you're making a strong recommendation, we will
see about pushing that version out to the pre-production environments as an
exceptional change, in advance of the next scheduled roll-out.



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #30 from mark@blackmans.org ---
(In reply to Yann Ylavic from comment #26)
> (In reply to mark from comment #25)
> > 
> > [Wed Jan 31 12:26:41.463136 2018] [slotmem_shm:debug] [pid 1322:tid
> > 139715805775616] mod_slotmem_shm.c(463): AH02301: attach looking for
> > /var/run/http/apache24/tmp/slotmem-shm-p701d8bbe_0.shm
> > [Wed Jan 31 12:26:41.463169 2018] [slotmem_shm:debug] [pid 1322:tid
> > 139715805775616] mod_slotmem_shm.c(476): AH02302: attach found
> ^ This is a child process attaching the SHMs created by the parent process.
> 
> > /var/run/http/apache24/tmp/slotmem-shm-p701d8bbe_0.shm: 992/6
> > [Wed Jan 31 12:33:51.761487 2018] [mpm_event:notice] [pid 65265:tid
> > 139715805775616] AH00494: SIGHUP received.  Attempting to restart
> ^ This is the parent process asked to restart (non graceful).
> 
> > [Wed Jan 31 12:34:54.471933 2018] [slotmem_shm:debug] [pid 20672:tid
> > 139965041129216] mod_slotmem_shm.c(331): AH02602: create didn't find
> > /var/run/http/apache24/tmp/slotmem-shm-p701d8bbe_0.shm in global list
> > [Wed Jan 31 12:34:54.471939 2018] [slotmem_shm:debug] [pid 20672:tid
> > 139965041129216] mod_slotmem_shm.c(341): AH02300: create
> > /var/run/http/apache24/tmp/slotmem-shm-p701d8bbe_0.shm: 992/6
> > [Wed Jan 31 12:34:54.471970 2018] [slotmem_shm:error] [pid 20672:tid
> > 139965041129216] (17)File exists: AH02611: create:
> ^ This is *another* parent process(not the same pid), ditto for the
> following messages (stripped here).
> 
> How so? One minute for a non-graceful restart looks huge too.
> Do you have multiple instances of httpd running (and using the same log
> file)?
> Could you monitor the processes here?

We have multiple configurations running, but each with their own log files. We
have both Apache 2.2 and Apache 2.4 configurations running side by side, but
completely isolated in terms of configuration, log and run directories.  Each
of our configuration files tends to have around 200k lines including comments
and blank lines, and we use a lot of 3rd party modules, so they're big
configurations.



Re: [Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by Jim Jagielski <ji...@jaguNET.com>.
FWIW, the id is supposed to be somewhat unique if the config DOES
change, hence the use of the line number as part of the hash...
In other words, if the config file itself is changed, we want to
create a new id because we have no idea how to match up
the "old" config in shm and the "new" config that was just
reloaded, so we assume that the new config is the new
default and thus deserves/requires a new ID.
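
To illustrate the trade-off, here is a minimal sketch in Python. It does not reproduce httpd's actual id computation: `hashed_id` is a hypothetical stand-in that hashes a balancer name together with its definition line number using CRC32, and the two configuration snippets are invented for the example.

```python
import zlib


def line_of(config_lines, needle):
    """Return the 1-based number of the first line containing needle."""
    for number, line in enumerate(config_lines, start=1):
        if needle in line:
            return number
    raise ValueError(f"{needle!r} not found")


def hashed_id(name, line_number):
    """Hypothetical id: CRC32 over the name and its definition line number."""
    return "p%08x" % zlib.crc32(f"{name}:{line_number}".encode("utf-8"))


before = [
    "<VirtualHost *:80>",
    '  <Proxy "balancer://app1">',
    "  </Proxy>",
    "</VirtualHost>",
]
# An unrelated virtual host is added ahead of the existing one, which
# shifts the balancer definition down by two lines.
after = ["<VirtualHost *:80>", "</VirtualHost>"] + before

old_id = hashed_id("balancer://app1", line_of(before, "balancer://app1"))
new_id = hashed_id("balancer://app1", line_of(after, "balancer://app1"))
print(old_id, new_id, old_id != new_id)
```

In this sketch the balancer block itself is untouched, yet its id changes because its line number moved. That is the intended behaviour when the config really changed, and the cost when only the position changed.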

> On Feb 1, 2018, at 11:27 AM, Yann Ylavic <yl...@gmail.com> wrote:
> 
> <balancer_id-no_line_number.patch>


Re: [Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by Mark Blackman <ma...@exonetric.com>.

> On 2 Feb 2018, at 14:19, Jim Jagielski <ji...@jaguNET.com> wrote:
> 
> To be honest, I don't think we ever envisioned an actual environ
> where the config files change every hour and the server gracefully
> restarted... I think our working assumptions have been that actual
> config file changes are "rare", hence the number of modules that
> allow for "on-the-fly" reconfiguration which avoid the need for
> restarts.
> 
> So this is a nice "edge case"
> 

Think mass hosting of 10000+ reverse-proxy front-ends all in the same Apache instance, with self-service updates to configs as well as a staging environment. The 24-hour cycle is like this:

1am: Full stop (SIGTERM) and start of Apache with all configurations, primarily to permit log file rotation.

Then, on the hour, any requested configuration changes are made live by auto-generating a giant 200k+ line configuration and then sending a HUP (not a USR1) signal, to keep the same parent but get a bunch of fresh children. As these are mostly reverse proxies, we generate thousands of balancer and balancermember directives per configuration.

In the background, once a minute, a monitoring process checks for responses and forcibly restarts Apache (SIGTERM then SIGKILL if necessary) if it doesn’t respond.

Finally, bear in mind that line number changes can occur merely because a new virtualhost was added ahead of a given virtualhost, so some kind of tracking UUID for a virtualhost, based on its non-line-number properties, is probably useful.

- Mark
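
As a rough idea of what such a tracking id could look like, here is a minimal Python sketch using a name-based UUID (version 5). The choice of properties (server name, port, balancer name) and the namespace string are assumptions for illustration only; nothing here reflects what httpd does today.

```python
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
VHOST_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "vhosts.example.invalid")


def vhost_tracking_id(server_name, port, balancer_name):
    """Derive a stable id from properties that do not move with the file layout."""
    key = f"{server_name.lower()}:{port}|{balancer_name}"
    return uuid.uuid5(VHOST_NAMESPACE, key)
```

The same inputs always give the same UUID, wherever the virtualhost sits in the generated file, and changing any of the chosen properties gives a different one.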



Re: [Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by Jim Jagielski <ji...@jaguNET.com>.
To be honest, I don't think we ever envisioned an actual environ
where the config files change every hour and the server gracefully
restarted... I think our working assumptions have been that actual
config file changes are "rare", hence the number of modules that
allow for "on-the-fly" reconfiguration which avoid the need for
restarts.

So this is a nice "edge case"

> On Feb 1, 2018, at 11:49 AM, Mark Blackman <ma...@exonetric.com> wrote:
> 
> 
> 
>> On 1 Feb 2018, at 16:27, Yann Ylavic <yl...@gmail.com> wrote:
>> 
>>> On Thu, Feb 1, 2018 at 5:15 PM, Yann Ylavic <yl...@gmail.com> wrote:
>>> On Thu, Feb 1, 2018 at 4:32 PM, Mark Blackman <ma...@exonetric.com> wrote:
>>> 
>>>> SHM clean-up is the key here and any patch that doesn’t contribute to
>>>> that has no immediate value for me.
>>> 
>>> What you may want to try is remove "s->defn_line_number" from the id there:
>>> https://github.com/apache/httpd/blob/trunk/modules/proxy/mod_proxy_balancer.c#L787
>>> If your configuration file changes often, that contributes to changing
>>> the name of the SHM...
>> 
>> FWIW, here is (attached) the patch I'm thinking about.
>> <balancer_id-no_line_number.patch>
> 
> Thanks, the configuration changes once an hour or so. Typically, we have about 1000 active shared memory segments (yes, they are SHMs) attached to the httpd processes.
> 
> For now, we’ll just have to implement a SHM clean-up in the start/stop wrappers until we can address the root cause or find a cleaner mitigation, which your patch might help with.
> 
> - Mark


Re: [Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by Mark Blackman <ma...@exonetric.com>.

> On 1 Feb 2018, at 16:27, Yann Ylavic <yl...@gmail.com> wrote:
> 
>> On Thu, Feb 1, 2018 at 5:15 PM, Yann Ylavic <yl...@gmail.com> wrote:
>> On Thu, Feb 1, 2018 at 4:32 PM, Mark Blackman <ma...@exonetric.com> wrote:
>> 
>>> SHM clean-up is the key here and any patch that doesn’t contribute to
>>> that has no immediate value for me.
>> 
>> What you may want to try is remove "s->defn_line_number" from the id there:
>> https://github.com/apache/httpd/blob/trunk/modules/proxy/mod_proxy_balancer.c#L787
>> If your configuration file changes often, that contributes to changing
>> the name of the SHM...
> 
> FWIW, here is (attached) the patch I'm thinking about.
> <balancer_id-no_line_number.patch>

Thanks, the configuration changes once an hour or so. Typically, we have about 1000 active shared memory segments (yes, they are SHMs) attached to the httpd processes.

For now, we’ll just have to implement a SHM clean-up in the start/stop wrappers until we can address the root cause or find a cleaner mitigation, which your patch might help with.

- Mark
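
A clean-up step of that kind could start from the output of "ipcs -m". The following is a minimal Python sketch, assuming the column layout printed by util-linux ipcs (key, shmid, owner, perms, bytes, nattch, status). `unattached_segments` and `removal_commands` are hypothetical helper names, and the sketch only builds the "ipcrm -m" command lines rather than running them.

```python
def unattached_segments(ipcs_output, owner):
    """Return the shmids owned by `owner` that have no attached processes."""
    shmids = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Data rows start with a hex key; header and separator rows do not.
        if len(fields) < 6 or not fields[0].startswith("0x"):
            continue
        _key, shmid, seg_owner, _perms, _size, nattch = fields[:6]
        if seg_owner == owner and nattch == "0":
            shmids.append(int(shmid))
    return shmids


def removal_commands(shmids):
    """Build one `ipcrm -m <shmid>` argument list per segment."""
    return [["ipcrm", "-m", str(shmid)] for shmid in shmids]
```

This filter matches every unattached segment of that owner, not only httpd's, so it is probably only reasonable to run from the wrapper while the instance is stopped and when the account is dedicated to it.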

Re: [Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by Yann Ylavic <yl...@gmail.com>.
On Thu, Feb 1, 2018 at 5:15 PM, Yann Ylavic <yl...@gmail.com> wrote:
> On Thu, Feb 1, 2018 at 4:32 PM, Mark Blackman <ma...@exonetric.com> wrote:
>
>> SHM clean-up is the key here and any patch that doesn’t contribute to
>> that has no immediate value for me.
>
> What you may want to try is remove "s->defn_line_number" from the id there:
>  https://github.com/apache/httpd/blob/trunk/modules/proxy/mod_proxy_balancer.c#L787
> If your configuration file changes often, that contributes to changing
> the name of the SHM...

FWIW, here is (attached) the patch I'm thinking about.

Re: [Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by Yann Ylavic <yl...@gmail.com>.
On Thu, Feb 1, 2018 at 4:32 PM, Mark Blackman <ma...@exonetric.com> wrote:
>> On 1 Feb 2018, at 12:36, Yann Ylavic <yl...@gmail.com> wrote:
>>
>> Hi Mark,
>>
>> On Thu, Feb 1, 2018 at 10:29 AM, Mark Blackman <ma...@exonetric.com>
>> wrote:
>>>
>>>
>>> Just to confirm, you expect that patch to handle SHM clean-up
>>> even in the “nasty error” case?
>>
>> Not really, no patch can avoid a crash for a crashing code :/ The
>> "stop_signals-PR61558.patch" patch avoids a known httpd crash in
>> some circumstances, but...
>
> Well, I just mean, if sig_coredump gets called, will the patch result
> in the normal SHM clean-up routines getting called, where they would
> have not been called before?

No, unfortunately nothing fancy in there, keep in mind that it's a
root process faulting so I don't think much should be done...

> SHM clean-up is the key here and any patch that doesn’t contribute to
> that has no immediate value for me.

What you may want to try is remove "s->defn_line_number" from the id there:
 https://github.com/apache/httpd/blob/trunk/modules/proxy/mod_proxy_balancer.c#L787
If your configuration file changes often, that contributes to changing
the name of the SHM...

>
>>
>>> I suspect that nasty error is triggered by the Weblogic plugin
>>> based on the adjacency in the logs, but the tracing doesn’t
>>> reveal any details, so an strace will probably be required to get
>>> more detail.
>
> Tracing has confirmed this really is a segmentation fault despite the
> lack of host-level messages and that reading a 3rd party module (but
> not Weblogic) is the last thing that happens before the segmentation
> fault and that pattern is fairly consistent. Now we need to ensure
> coredumps are generated.
>
> Finally, there are no orphaned child httpd processes with a PPID of
> 1.  Just thousands and thousands of SHM segments with no processes
> attached to them.

Which brings us back to why attach and/or create fail if nothing is
attached to them.
These are SHMs (per "ipcs -m"), right? Not semaphores ("ipcs -s")?

"thousands and thousands" is kind of exponential, even for thousands
of vhosts. Do the names of the SHMs change on each startup?
(Besides the generation number if you use that patch; I hardly
think that the processes would crash arbitrarily at generation
[0..1000]...)
If so, does it relate to configuration changes?

We are not talking about fixing the root issue here :/


Regards,
Yann.

Re: [Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by Mark Blackman <ma...@exonetric.com>.
> On 1 Feb 2018, at 12:36, Yann Ylavic <yl...@gmail.com> wrote:
> 
> Hi Mark,
> 
> On Thu, Feb 1, 2018 at 10:29 AM, Mark Blackman <ma...@exonetric.com> wrote:
>> 
>> 
>> Just to confirm, you expect that patch to handle SHM clean-up even in
>> the “nasty error” case?
> 
> Not really, no patch can avoid a crash for a crashing code :/
> The "stop_signals-PR61558.patch" patch avoids a known httpd crash in
> some circumstances, but...

Well, I just mean, if sig_coredump gets called, will the patch result in the normal SHM clean-up routines getting called, where they would have not been called before?  SHM clean-up is the key here and any patch that doesn’t contribute to that has no immediate value for me.

> 
>> I suspect that nasty error is triggered by
>> the Weblogic plugin based on the adjacency in the logs, but the
>> tracing doesn’t reveal any details, so an strace will probably be
>> required to get more detail.

Tracing has confirmed this really is a segmentation fault despite the lack of host-level messages and that reading a 3rd party module (but not Weblogic) is the last thing that happens before the segmentation fault and that pattern is fairly consistent. Now we need to ensure coredumps are generated.

Finally, there are no orphaned child httpd processes with a PPID of 1.  Just thousands and thousands of SHM segments with no processes attached to them.

Regards,
Mark

Re: [Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by Yann Ylavic <yl...@gmail.com>.
Hi Mark,

On Thu, Feb 1, 2018 at 10:29 AM, Mark Blackman <ma...@exonetric.com> wrote:
> Thanks, for now, we will treat the “nasty error” as a separate
> question to resolve and hope that clean-up patch deals with the
> immediate issue.

OK, that patch can be discussed on bz if it doesn't turn too technical.
Technical details, (long) discussions and debugging are not very friendly
for future visitors to bz who encounter the same issue and want to get to
the solution...

>
> I had originally treated that “nasty error” as a reference to the
> “file exists” error. However, based on your feedback and reviewing
> the logs, I would conclude that “nasty error” is the trigger, as you
> suggest, and the lack of SHM clean-up and consequent collisions are
> collateral damage.

That's what I feel, but I wouldn't stake my life on it either :)

>
> Just to confirm, you expect that patch to handle SHM clean-up even in
> the “nasty error” case?

Not really, no patch can avoid a crash for a crashing code :/
The "stop_signals-PR61558.patch" patch avoids a known httpd crash in
some circumstances, but...

> I suspect that nasty error is triggered by
> the Weblogic plugin based on the adjacency in the logs, but the
> tracing doesn’t reveal any details, so an strace will probably be
> required to get more detail.

... if the crash is not related, that won't help.

I'm missing something in your scenario though.

In the original/non-patched code and still with the "generation
number" patch (aka "Jim's"), there is always an attempt to attach the
SHM first, and only if that fails is a new one created.
It means that even if the parent process crashes without cleaning up
the SHM on the system, whether or not some children are still alive
when a new httpd instance is started, it should be able to attach the
SHM (create would fail, but not attach).
Btw, things would probably turn bad sooner or later because
synchronization assumptions are off (old and new children wouldn't
share the same mutex, which is not reused/attached on startup; global
mutexes leak in the system for that scenario more than SHMs).
So why do both attach and create fail in your case?
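
For readers not familiar with the pattern, the attach-first-then-create logic can be sketched as below. This is only an analogy in Python: it uses POSIX shared memory through `multiprocessing.shared_memory` as a stand-in, not APR's apr_shm_create() or SysV segments, and the helper names `attach_or_create` and `create_only` are hypothetical.

```python
from multiprocessing import shared_memory


def attach_or_create(name, size):
    """Attach to an existing segment; create it only if attaching fails."""
    try:
        return shared_memory.SharedMemory(name=name), "attached"
    except FileNotFoundError:
        segment = shared_memory.SharedMemory(name=name, create=True, size=size)
        return segment, "created"


def create_only(name, size):
    """Create without attaching first; raises FileExistsError if the name
    is still present on the system."""
    return shared_memory.SharedMemory(name=name, create=True, size=size)
```

With attach-first, a leftover segment is reused rather than tripping the "File exists" error. With create-only, a leftover segment makes startup fail.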

With my proposed patch (r1822509), since I removed attach (bullet 4/
in the commit message), your scenario is "expected" to fail when the
second httpd instance starts (while old children are still alive).
I'm not sure I should fix this (re-introduce the attach code) because
as I said this is a screwy scenario with regard to the global mutex,
it's not supposed to work like this.
The only sane thing to do here (IMHO, and more a note to other httpd
devs) would be to kill children whenever the parent process dies
underneath them, be it with a startup script (there shouldn't be any
orphaned child process, at least when httpd starts), or natively in
the MPM which could detect this situation (that's another story
though, and it probably should be opt-in because it depends on how
httpd is started/monitored externally, and how much the user wants the
service to continue as much as possible...).

So the faster/simpler solution *for you* might be to create/modify
your (re)startup script such that it kills orphaned children, if any,
as a precaution...
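
The detection part of such a script could look roughly like this. It is a minimal Python sketch, assuming the process list comes from "ps -eo pid,ppid,user,args" and that each instance is started with an explicit "-f /path/to/httpd.conf". `orphaned_children` is a hypothetical helper, and it only reports PIDs; what to do with them is left to the wrapper.

```python
def orphaned_children(ps_output, conf_path):
    """Return PIDs of non-root httpd processes of one instance whose parent
    is PID 1, i.e. children that outlived their parent process."""
    orphans = []
    for line in ps_output.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 4 or not fields[0].isdigit():
            continue  # header or malformed row
        pid, ppid, user, args = fields
        tokens = args.split()
        is_httpd = "httpd" in tokens[0]
        # Match the instance by the token following "-f".
        same_instance = any(
            flag == "-f" and value == conf_path
            for flag, value in zip(tokens, tokens[1:])
        )
        if ppid == "1" and user != "root" and is_httpd and same_instance:
            orphans.append(int(pid))
    return orphans
```

The non-root check keeps a legitimately running parent (started as root, and itself often a child of PID 1) out of the result. On systems where httpd is not started as root this filter would need to be adapted.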

>
> Bugzilla was slightly easier to get log data into as I cannot use
> work email for these conversations.

There is no strong statement/rule on bz vs dev@, if it's more
convenient for you to continue there this is a good reason ;)
I wouldn't go as far in the discussion as I did here, though (sorry if
it was too long btw).


Regards,
Yann.

Re: [Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by Mark Blackman <ma...@exonetric.com>.
On 31 Jan 2018, at 22:41, Yann Ylavic <yl...@gmail.com> wrote:
> 
> Hi Mark,
> 
> let's continue this debugging on dev@ if you don't mind..
> 
>> On Wed, Jan 31, 2018 at 10:15 PM,  <bu...@apache.org> wrote:
>> https://bz.apache.org/bugzilla/show_bug.cgi?id=62044
>> 
>> --- Comment #32 from mark@blackmans.org ---
>> so sig_coredump is being triggered by an unknown signal, multiple times a day.
>> It's not a segfault, nothing in /var/log/messages. That results in a bunch of
>> undeleted shared memory segments and probably some that will no longer be in
>> the global list, but still present in the kernel.
> 
> In 2.4.29, i.e. without patch [1], sig_coredump might be triggered by
> any signal received by httpd during a restart, and the signal handler
> crashes itself (double fault) so the process is forcibly SIGKILLed
> (presumably, no trace in /var/log/messages...).
> This was reported and discussed in [2], and seems to correspond quite
> well to what you observe in your tests.
> 
> Moreover, if the parent process crashes nothing will delete the
> IPC-SysV SHMs (hence the leak in the system), while child processes
> may remain attached, which prevents a new parent process from
> starting (until the children stop or are forcibly killed)...
> 
> When this happens, you should see non-root processes attached to PPID
> 1 (e.g. with "ps -ef"), "-f /path/to/httpd.conf" in the command line
> might help distinguish the different httpd instances to monitor
> processes.
> 
> If this is the case, you probably should try patch [1].
> If not, I can't explain why in httpd logs a process with a different
> PID appears after the SIGHUP, it must have been started
> (automatically?) after the previous one crashed.
> Here the generation number can't help, a new process always starts at
> generation #0.
> 
> Regards,
> Yann.
> 
> [1] https://svn.apache.org/repos/asf/httpd/httpd/patches/2.4.x/stop_signals-PR61558.patch
> [2] https://bz.apache.org/bugzilla/show_bug.cgi?id=61558

Thanks, for now, we will treat the “nasty error” as a separate question to resolve and hope that clean-up patch deals with the immediate issue.

I had originally treated that “nasty error” as a reference to the “file exists” error.  However, based on your feedback and reviewing the logs, I would conclude that “nasty error” is the trigger, as you suggest, and the lack of SHM clean-up and consequent collisions are collateral damage.

Just to confirm, you expect that patch to handle SHM clean-up even in the “nasty error” case?  I suspect that nasty error is triggered by the Weblogic plugin based on the adjacency in the logs, but the tracing doesn’t reveal any details, so an strace will probably be required to get more detail.

Bugzilla was slightly easier to get log data into as I cannot use work email for these conversations.

Cheers,
Mark




Re: [Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by Yann Ylavic <yl...@gmail.com>.
Hi Mark,

let's continue this debugging on dev@ if you don't mind..

On Wed, Jan 31, 2018 at 10:15 PM,  <bu...@apache.org> wrote:
> https://bz.apache.org/bugzilla/show_bug.cgi?id=62044
>
> --- Comment #32 from mark@blackmans.org ---
> so sig_coredump is being triggered by an unknown signal, multiple times a day.
> It's not a segfault, nothing in /var/log/messages. That results in a bunch of
> undeleted shared memory segments and probably some that will no longer be in
> the global list, but still present in the kernel.

In 2.4.29, i.e. without patch [1], sig_coredump might be triggered by
any signal received by httpd during a restart, and the signal handler
crashes itself (double fault) so the process is forcibly SIGKILLed
(presumably, no trace in /var/log/messages...).
This was reported and discussed in [2], and seems to correspond quite
well to what you observe in your tests.

Moreover, if the parent process crashes nothing will delete the
IPC-SysV SHMs (hence the leak in the system), while child processes
may remain attached, which prevents a new parent process from
starting (until the children stop or are forcibly killed)...

When this happens, you should see non-root processes attached to PPID
1 (e.g. with "ps -ef"). The "-f /path/to/httpd.conf" in the command line
might help distinguish the different httpd instances when monitoring
processes.

If this is the case, you probably should try patch [1].
If not, I can't explain why a process with a different PID appears in
the httpd logs after the SIGHUP; it must have been started
(automatically?) after the previous one crashed.
The generation number can't help here, since a new process always
starts at generation #0.
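
The orphan check described above (non-root httpd processes reparented to
PID 1) can be sketched as a small filter. This is only an illustration: the
function name orphaned_httpd is made up here, and it assumes the binary shows
up as "httpd" in the command line (adjust the pattern for other names).

```shell
# Sketch: print the PID of every non-root httpd process whose parent is PID 1.
# Reads the output of `ps -eo pid,ppid,user,args` on stdin.
# Assumption: the executable name contains "httpd".
orphaned_httpd() {
    awk 'NR > 1 && $2 == 1 && $3 != "root" && $4 ~ /httpd/ { print $1 }'
}

# Usage:
#   ps -eo pid,ppid,user,args | orphaned_httpd
```

A healthy parent also has PPID 1 but normally runs as root, which is why the
filter skips root-owned entries.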

Regards,
Yann.

[1] https://svn.apache.org/repos/asf/httpd/httpd/patches/2.4.x/stop_signals-PR61558.patch
[2] https://bz.apache.org/bugzilla/show_bug.cgi?id=61558

[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #32 from mark@blackmans.org ---
so sig_coredump is being triggered by an unknown signal, multiple times a day. 
It's not a segfault, nothing in /var/log/messages. That results in a bunch of
undeleted shared memory segments and probably some that will no longer be in
the global list, but still present in the kernel.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #1 from mark@blackmans.org ---
I believe the error arises here

https://github.com/apache/httpd/blob/2.4.29/modules/slotmem/mod_slotmem_shm.c#L408

I assume the 'file exists' error refers to the SHM key rather than the
placeholder file in the filesystem.

However, there is a defensive removal of the key *before* the create, which
makes this error very mysterious, it should be nearly impossible to fail here I
think.

apr_shm_remove(fname, gpool);
rv = apr_shm_create(&shm, size, fname, gpool);

Is there any possibility there is some latency between the removal being
effective and the create starting? Or could the remove fail silently?



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

Graham Leggett <mi...@sharp.fm> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #35 from Graham Leggett <mi...@sharp.fm> ---
Backported to v2.4.30.



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #27 from Yann Ylavic <yl...@gmail.com> ---
(In reply to mark from comment #7)
> 
> " AH00060: seg fault or similar nasty error detected in the parent process"
> but I cannot tell what it's referring to.

The parent process crashed leaving children orphaned (hence attached to SHMs).

You possibly need this patch too:
https://svn.apache.org/repos/asf/httpd/httpd/patches/2.4.x/stop_signals-PR61558.patch
It was merged for upcoming 2.4.30 already (r1820794).
See Bug 61558.



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #33 from Yann Ylavic <yl...@gmail.com> ---
Mark, I followed up on dev@ since debugging is not really suitable for Bugzilla.
Thanks.



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #16 from Yann Ylavic <yl...@gmail.com> ---
(In reply to mark from comment #15)
> Is this later patch either more robust or more
> comprehensive than Jim's?  If you're making a strong recommendation, we will
> see about pushing that version out to the pre-production environments as an
> exceptional change, in advance of the next scheduled roll-out.
I can't make a recommendation given your time constraints. What I can tell is
that while the Windows approach does avoid the (re)start failures, it
does not preserve the state of the balancers across restarts (including
graceful ones).
So things like load distribution, error states, ... are reset/lost, as if it
were the first startup.

This is not the right fix for httpd, but it may be enough for your use case...



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #8 from Eric Covener <co...@gmail.com> ---
(In reply to mark from comment #7)
> give that man a teddy bear. pid 13310 was born at 03:40:12 with a SIGHUP at
> 03:47:58 and then permanently exiting at 03:48:11.
> 
> [Thu Jan 25 03:40:22.300797 2018] [mpm_event:notice] [pid 13310:tid
> 140455729428224] AH00489: Apache/2.4.29 (Unix) OpenSSL/1.0.2n
> mod_fcgid/2.3.9 mod_auth_kerb/5.4 mod_qos/11.43 mod_jk/1.2.42 configured --
> resuming normal operations
> [Thu Jan 25 03:40:22.300851 2018] [core:notice] [pid 13310:tid
> 140455729428224] AH00094: Command line: '/apache24/bin/httpd -f
> /apache24/conf/dynamic/apache24/httpd.conf -D XXXXX'
> [Thu Jan 25 03:47:58.097848 2018] [mpm_event:notice] [pid 13310:tid
> 140455729428224] AH00494: SIGHUP received.  Attempting to restart
> [Thu Jan 25 03:48:11.467544 2018] [core:notice] [pid 13310:tid
> 140455729428224] AH00060: seg fault or similar nasty error detected in the
> parent process
> 
> so, the diagnosis probably remains roughly the same, some SHM keys are not
> getting removed or not removed quickly enough and are still in place the
> next time the same configuration starts up.

If this is the case maybe we could bake the generation name into the filename.



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #10 from mark@blackmans.org ---
baked here:

https://github.com/apache/httpd/blob/2.4.29/modules/proxy/mod_proxy_balancer.c#L787

        id = apr_psprintf(pconf, "%s.%s.%d.%s.%s.%u.%s",
                          (s->server_scheme ? s->server_scheme : "????"),
                          (s->server_hostname ? s->server_hostname : "???"),
                          (int)s->port,
                          (s->server_admin ? s->server_admin : "??"),
                          (s->defn_name ? s->defn_name : "?"),
                          s->defn_line_number,
                          (s->error_fname ? s->error_fname :
                           DEFAULT_ERRORLOG));

        conf->id = apr_psprintf(pconf, "p%x",
                                ap_proxy_hashfunc(id, PROXY_HASHFUNC_DEFAULT));



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

Dasharath Masirkar <ma...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|FIXED                       |---
             Status|RESOLVED                    |REOPENED

--- Comment #37 from Dasharath Masirkar <ma...@gmail.com> ---
(In reply to Graham Leggett from comment #35)
> Backported to v2.4.30.

I have tested with httpd-2.4.46 and observe that, with the configuration in use
(700 vhosts and 200 proxy balancers), the issue is still reproducible.

 [Fri Aug 21 15:52:42.560514 2020] [suexec:notice] [pid 20064:tid
139931803388032] AH01232: suEXEC mechanism enabled (wrapper:
/opt/community/httpd-2.4.46/bin/suexec)
[Fri Aug 21 15:52:42.573796 2020] [lbmethod_heartbeat:notice] [pid 20065:tid
139931803388032] AH02282: No slotmem from mod_heartmonitor
[Fri Aug 21 15:52:42.575575 2020] [example_hooks:notice] [pid 20065:tid
139931803388032] x_pre_mpm()
[Fri Aug 21 15:52:42.576076 2020] [mpm_worker:notice] [pid 20065:tid
139931803388032] AH00292: Apache/2.4.46 (Unix) configured -- resuming normal
operations
[Fri Aug 21 15:52:42.576093 2020] [core:notice] [pid 20065:tid 139931803388032]
AH00094: Command line: '/opt/community/httpd-2.4.46/bin/httpd'
[Fri Aug 21 15:52:51.586007 2020] [example_hooks:notice] [pid 20065:tid
139931803388032] x_monitor()
[Fri Aug 21 15:53:01.597034 2020] [example_hooks:notice] [pid 20065:tid
139931803388032] x_monitor()
[Fri Aug 21 15:53:03.070731 2020] [example_hooks:notice] [pid 20074:tid
139931691882240] x_create_connection()
[Fri Aug 21 15:53:03.070866 2020] [example_hooks:notice] [pid 20074:tid
139931691882240] x_create_request()
[Fri Aug 21 15:53:03.071039 2020] [example_hooks:notice] [pid 20074:tid
139931691882240] x_create_request()
[Fri Aug 21 15:53:03.071332 2020] [example_hooks:notice] [pid 20074:tid
139931691882240] x_create_request()
[Fri Aug 21 15:53:03.071467 2020] [example_hooks:notice] [pid 20074:tid
139931691882240] x_create_request()
[Fri Aug 21 15:53:03.071518 2020] [example_hooks:notice] [pid 20074:tid
139931691882240] x_create_request()
[Fri Aug 21 15:53:03.071720 2020] [optional_hook_import:error] [pid 20074:tid
139931691882240] AH01866: Optional hook test said: GET / HTTP/1.1
[Fri Aug 21 15:53:03.071723 2020] [optional_fn_export:error] [pid 20074:tid
139931691882240] AH01871: Optional function test said: GET / HTTP/1.1
[Fri Aug 21 15:53:03.071736 2020] [example_hooks:notice] [pid 20074:tid
139931691882240] x_create_request()
[Fri Aug 21 15:53:11.608523 2020] [example_hooks:notice] [pid 20065:tid
139931803388032] x_monitor()
[Fri Aug 21 15:53:20.777638 2020] [mpm_worker:notice] [pid 20065:tid
139931803388032] AH00295: caught SIGTERM, shutting down
[Fri Aug 21 15:55:24.939515 2020] [suexec:notice] [pid 20497:tid
140677344643200] AH01232: suEXEC mechanism enabled (wrapper:
/opt/community/httpd-2.4.46/bin/suexec)
[Fri Aug 21 15:55:25.444413 2020] [slotmem_shm:error] [pid 20498:tid
140677344643200] (28)No space left on device: AH02611: create:
apr_shm_create(/opt/community/httpd-2.4.46/logs/slotmem-shm-p3fb225f0_rec2_avise_api_0.shm)
failed
[Fri Aug 21 15:55:25.444457 2020] [:emerg] [pid 20498:tid 140677344643200]
AH00020: Configuration Failed, exiting



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #29 from Yann Ylavic <yl...@gmail.com> ---
> (In reply to Yann Ylavic from comment #24)
> > Created attachment 35710 [details]
> > Unique balancer id per vhost
> 
> Committed to trunk in r1822800.
Reverted, all was there already (sname vs name).



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #22 from mark@blackmans.org ---
Anyway, in the absence of other ideas, we're going to revert to the more
conservative patch, even at the cost of cross-generation persistence, at

http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/slotmem/mod_slotmem_shm.c?r1=1822341&r2=1822340&pathrev=1822341&view=patch

for now.



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #2 from mark@blackmans.org ---
looking at the code for apr_shm_remove at 

https://github.com/apache/apr/blob/1.6.1/shmem/unix/shm.c#L436

I am reminded that

    /* Indicate that the segment is to be destroyed as soon
     * as all processes have detached. This also disallows any
     * new attachments to the segment. */
    if (shmctl(shmid, IPC_RMID, NULL) == -1) {
        goto shm_remove_failed;
    }

So, while the remove can succeed, although I note the return status isn't
tested here, the key will hang around until the last process detaches, so the
defensive measure isn't effective.

So back to the original question: why does Apache think this slot isn't already
in the global list?
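
To see whether segments are in that "marked for destruction but still
attached" state, the `ipcs -m` output can be filtered. A minimal sketch,
assuming the Linux util-linux column layout (key shmid owner perms bytes
nattch status), where such segments carry the status "dest"; the function
name pending_destroy is made up for illustration.

```shell
# Sketch: print "shmid nattch" for segments marked for destruction ("dest")
# that still have attached processes. Reads `ipcs -m` output on stdin.
# Assumption: util-linux column order, with status in column 7.
pending_destroy() {
    awk '$1 ~ /^0x/ && $7 == "dest" && $6 > 0 { print $2, $6 }'
}

# Usage:
#   ipcs -m | pending_destroy
```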



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #31 from mark@blackmans.org ---
(In reply to Yann Ylavic from comment #27)
> (In reply to mark from comment #7)
> > 
> > " AH00060: seg fault or similar nasty error detected in the parent process"
> > but I cannot tell what it's referring to.
> 
> The parent process crashed leaving children orphaned (hence attached to
> SHMs).
> 
> You possibly need this patch too:
> https://svn.apache.org/repos/asf/httpd/httpd/patches/2.4.x/stop_signals-PR61558.patch
> It was merged for upcoming 2.4.30 already (r1820794).
> See Bug 61558.

I can't see evidence of a crash beyond that message. Could it be referring to
the exit triggered by the "file exists" problem?

i.e. HUP is received, SHMs are marked as deleted but processes are still
attached so they are still present for the HUP restart and that triggers the
"crash" exit and thus other SHMs fail to get deleted?



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #9 from Jim Jagielski <ji...@apache.org> ---
... or possibly re-used?? I'll need to look. It's been awhile since I've
reviewed that chunk of code.



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #38 from Dasharath Masirkar <ma...@gmail.com> ---
We can use following steps to reproduce the issue.

step1: Create the configuration files using a script like the following (I have
tried this on Linux).

~~~
#!/bin/bash

### create 700 vhost ###########
for i in {1..700..1}
  do

echo "<VirtualHost *:80>" >>vhost.conf
echo "ServerName app$i.example.com">>vhost.conf
echo "ProxyPass /myapp$i balancer://lb$i/myapp$i">>vhost.conf
echo "ProxyPassReverse /myapp$i balancer://lb$i/myapp$i">>vhost.conf
echo "</VirtualHost>">>vhost.conf

  done

echo "700 vhost created in vhost.conf file"

## create 200 proxy balancer #########
for i in {1..200..1}
  do

echo "<Proxy balancer://lb$i>" >>balancer.conf
echo "BalancerMember http://backend$i.example.com:808$i" >>balancer.conf
echo "BalancerMember http://backend$i.example.com:818$i" >>balancer.conf
echo "Proxyset lbmethod=byrequests" >>balancer.conf
echo "</Proxy>" >>balancer.conf

  done
echo "200 balancer created in balancer.conf file"
~~~

step2: Copy vhost.conf and balancer.conf to Apache httpd configuration
directory or add Include to httpd.conf file.

step3: start the httpd

Observation:
 The httpd will fail at startup with the following error:

[Fri Aug 21 17:53:07.825499 2020] [proxy_balancer:emerg] [pid 581:tid
140381045221504] (28)No space left on device: AH01185: worker slotmem_create
failed
[Fri Aug 21 17:53:07.825510 2020] [:emerg] [pid 581:tid 140381045221504]
AH00020: Configuration Failed, exiting
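
For context, shmget(2) documents ENOSPC ("No space left on device") as either
all shared memory IDs being taken (the SHMMNI limit) or the system-wide
SHMALL limit being exceeded. Assuming the slotmem segments here are SysV
SHMs, a quick way to compare usage against the limit is sketched below; the
function name count_segments is made up for illustration.

```shell
# Sketch: count the SysV shared memory segments listed in `ipcs -m` output
# read from stdin (data rows start with a hex key such as 0x0101a2b3).
count_segments() {
    awk '$1 ~ /^0x/ { n++ } END { print n + 0 }'
}

# Usage on Linux:
#   echo "in use: $(ipcs -m | count_segments), limit: $(cat /proc/sys/kernel/shmmni)"
```

If the count is close to the limit when the failure occurs, the error is more
likely a matter of the number of segments than of their size.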



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35698|0                           |1
        is obsolete|                            |

--- Comment #14 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35702
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35702&action=edit
slotmem SHMs reuse (2.4.x)

This patch does:
1/ use a constant file name for all systems (no generation suffix),
2/ maintain the list of the created SHMs *across restarts*,
3/ not unlink the files on (graceful) restart anymore (not needed),
4/ not attach in slotmem_create() anymore (not needed),
5/ add type/sizes consistency check for persisted slots on restoration,
6/ unlink the files only on stop/exit or before creating them (crash
remainder).

Mark, could you please try it?

I think we could avoid 6/ if we remove the file just after the SHM is created.
This would work for systems with "unlink semantics" (i.e. unlink allowed while
some descriptors are opened even if it really happens when the last one is
closed, since we don't need to re-open them now), but not for others so I kept
the code generic to start with...



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #19 from mark@blackmans.org ---
We were able to rebuild and deploy Yann's patch for the pre-production
environments and we're not yet seeing slotmem_shm "File Exists" errors.
However, we are seeing a lot of orphaned shared segments (i.e. zero attached
processes) as though cleanup is not happening appropriately or is getting
bypassed.
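
Orphaned segments of this kind (zero attached processes) can be listed with a
filter over `ipcs -m`. A minimal sketch, assuming the Linux util-linux column
layout (key shmid owner perms bytes nattch status); the function name
unattached_segments and the owner passed to it are illustrative only.

```shell
# Sketch: print the shmid of every segment owned by the given user that has
# no attached processes. Reads `ipcs -m` output on stdin.
# Assumption: owner is column 3 and nattch is column 6.
unattached_segments() {
    awk -v owner="$1" '$1 ~ /^0x/ && $3 == owner && $6 == 0 { print $2 }'
}

# Usage (replace "root" with the user that created the segments):
#   ipcs -m | unattached_segments root
# A leftover segment can then be removed by hand with `ipcrm -m SHMID`,
# but only once no httpd instance still needs it.
```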



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #11 from Jim Jagielski <ji...@apache.org> ---
Yeah, it looks like adding in the generation to conf->id will create a unique
name. But I need to see how it affects persistence.



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35702|0                           |1
        is obsolete|                            |

--- Comment #34 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35723
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35723&action=edit
Reuse SHMs names on restart or stop/start (2.4.x)

This is the full patch proposed to be backported to 2.4.next.

It should reuse the SHMs names as much as possible on restart or stop/start,
which should address the increasing number of IPCs on the system if/when the
parent process crashes.

Please note that it won't reuse SHMs if by some means child processes from an
old httpd instance (whose parent process crashed) are still alive, this is not
something desirable.

Could you test it with your large configuration?



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35710|0                           |1
        is obsolete|                            |



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #17 from Yann Ylavic <yl...@gmail.com> ---
In any case, if you go with the "Windows" approach for your production, we are
still interested in your testing of attachment 35702 for the future ;)



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #18 from mark@blackmans.org ---
Thanks for the perspective. We were seeing Apache instances fail and not
restart due to the orphaned segments, requiring manual intervention to resolve,
hence our urgency.

However, I see your point now: this "Windows" fix loses too much state to be
the right long-term fix, and we make extensive use of the proxy balancer
feature, so I will see about an exceptional change to test this more
comprehensive change in our pre-production environments.



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #26 from Yann Ylavic <yl...@gmail.com> ---
(In reply to mark from comment #25)
> 
> [Wed Jan 31 12:26:41.463136 2018] [slotmem_shm:debug] [pid 1322:tid
> 139715805775616] mod_slotmem_shm.c(463): AH02301: attach looking for
> /var/run/http/apache24/tmp/slotmem-shm-p701d8bbe_0.shm
> [Wed Jan 31 12:26:41.463169 2018] [slotmem_shm:debug] [pid 1322:tid
> 139715805775616] mod_slotmem_shm.c(476): AH02302: attach found
^ This is a child process attaching the SHMs created by the parent process.

> /var/run/http/apache24/tmp/slotmem-shm-p701d8bbe_0.shm: 992/6
> [Wed Jan 31 12:33:51.761487 2018] [mpm_event:notice] [pid 65265:tid
> 139715805775616] AH00494: SIGHUP received.  Attempting to restart
^ This is the parent process asked to restart (non graceful).

> [Wed Jan 31 12:34:54.471933 2018] [slotmem_shm:debug] [pid 20672:tid
> 139965041129216] mod_slotmem_shm.c(331): AH02602: create didn't find
> /var/run/http/apache24/tmp/slotmem-shm-p701d8bbe_0.shm in global list
> [Wed Jan 31 12:34:54.471939 2018] [slotmem_shm:debug] [pid 20672:tid
> 139965041129216] mod_slotmem_shm.c(341): AH02300: create
> /var/run/http/apache24/tmp/slotmem-shm-p701d8bbe_0.shm: 992/6
> [Wed Jan 31 12:34:54.471970 2018] [slotmem_shm:error] [pid 20672:tid
> 139965041129216] (17)File exists: AH02611: create:
^ This is *another* parent process (not the same pid), ditto for the following
messages (stripped here).

How so? One minute for a non-graceful restart looks huge too.
Do you have multiple instances of httpd running (and using the same log file)?
Could you monitor the processes here?
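
One way to spot multiple parent processes sharing a log is to pull out the
AH00094 ("Command line") entries, which the parent logs when it resumes
normal operations. A minimal sketch, assuming the default error log format
with one entry per line (the excerpts in this thread are wrapped by email);
the function name parent_starts is made up for illustration.

```shell
# Sketch: print "timestamp pid" for each AH00094 entry read from stdin.
# Assumption: default ErrorLogFormat, e.g.
#   [Thu Jan 25 03:40:22.300851 2018] [core:notice] [pid 13310:tid ...] AH00094: ...
parent_starts() {
    sed -n 's/^\[\([^]]*\)] \[core:notice] \[pid \([0-9]*\):tid [0-9]*] AH00094:.*/\1 \2/p'
}

# Usage:
#   parent_starts < /path/to/error_log
# Distinct PIDs only:
#   parent_starts < /path/to/error_log | awk '{ print $NF }' | sort -u
```

Entries with different PIDs around the time of a SIGHUP would suggest that a
new parent was started rather than the old one restarting in place.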



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #36 from mark@blackmans.org ---
(In reply to Yann Ylavic from comment #34)
> Created attachment 35723 [details]
> Reuse SHMs names on restart or stop/start (2.4.x)
> 
> This is the full patch proposed to be backported to 2.4.next.
> 
> It should reuse the SHMs names as much as possible on restart or stop/start,
> which should address the increasing number of IPCs on the system if/when the
> parent process crashes.
> 
> Please note that it won't reuse SHMs if by some means child processes from
> an old httpd instance (whose parent process crashed) are still alive, this
> is not something desirable.
> 
> Could you test it with your large configuration?

Thanks, we will aim to test it in our next scheduled update, early March.



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #23 from mark@blackmans.org ---
That more conservative patch doesn't seem to have helped either.

[Wed Jan 31 08:44:12.677361 2018] [proxy:debug] [pid 58615:tid 140446564935424]
proxy_util.c(1225): AH02337: copying shm[2] (0x7fbc398b07d8) for
balancer://balancer3
[Wed Jan 31 08:44:12.677429 2018] [slotmem_shm:debug] [pid 58615:tid
140446564935424] mod_slotmem_shm.c(331): AH02602: create didn't find
/var/run/http/apache24/tmp/slotmem-shm-p5dfa5b80_balancer3_0.shm in global list
[Wed Jan 31 08:44:12.677469 2018] [slotmem_shm:debug] [pid 58615:tid
140446564935424] mod_slotmem_shm.c(341): AH02300: create
/var/run/http/apache24/tmp/slotmem-shm-p5dfa5b80_balancer3_0.shm: 1176/2
[Wed Jan 31 08:44:12.677585 2018] [slotmem_shm:error] [pid 58615:tid
140446564935424] (17)File exists: AH02611: create:
apr_shm_create(/var/run/http/apache24/tmp/slotmem-shm-p5dfa5b80_balancer3_0.shm)
failed
[Wed Jan 31 08:44:12.677677 2018] [:emerg] [pid 58615:tid 140446564935424]
AH00020: Configuration Failed, exiting

We keep bumping into previously created keys. I wonder if our balancer naming
isn't distinctive enough, literally each vhost gets balancer1, balancer2,
balancer3. So those names appear hundreds or thousands of times per
configuration, but always inside a virtualhost container.

Any ideas?



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #4 from mark@blackmans.org ---
Thanks for looking, the apr_shm_remove does an apr_file_remove as the final
step, so I would be surprised if another one helps



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #6 from mark@blackmans.org ---
Probably. We do a lot of restarts: active ones to bring in managed changes to
the configuration (but only hourly) and reactive ones when Apache stops
responding. I will examine and get back to you.

My feeling after reading the code is that an old process still hasn't detached
from the SHM segment, so the SHM key hangs around even though the placeholder
file does get deleted. When the next Apache process comes along, presumably
without a filled-in global list, it attempts to re-create a SHM key that the
last process still hasn't quite released.
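
The "(17)File exists" part of that theory can be reproduced outside httpd. What
follows is a minimal sketch, assuming SysV shared memory on Linux; it is not
the APR or mod_slotmem_shm code, and second_create_errno() is a hypothetical
helper written for illustration. It creates a segment for a key derived with
ftok(), then tries to create the same key exclusively again before the first
segment has been removed.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not httpd or APR code.
 * Returns the errno of a second exclusive create on the same key,
 * 0 if that second create unexpectedly succeeded,
 * or -1 if the setup (ftok or the first create) failed.
 */
int second_create_errno(const char *path, int proj_id)
{
    /* Derive a SysV key from an existing file, as file-based SHM does. */
    key_t key = ftok(path, proj_id);
    if (key == (key_t)-1)
        return -1;

    /* First exclusive create: stands in for the "old" process. */
    int first = shmget(key, 4096, IPC_CREAT | IPC_EXCL | 0600);
    if (first == -1)
        return -1;

    /* Second exclusive create while the key is still registered. */
    int result = 0;
    int second = shmget(key, 4096, IPC_CREAT | IPC_EXCL | 0600);
    if (second == -1)
        result = errno;                 /* save before any other call */
    else
        shmctl(second, IPC_RMID, NULL); /* defensive cleanup */

    /* Remove the first segment so the key is free again. */
    shmctl(first, IPC_RMID, NULL);
    return result;
}
```

Calling second_create_errno("/tmp", 'A') should return EEXIST, which is errno 17
on Linux, the same code shown in the AH02611 error above. Whether the key in
our case lingers because of a late detach is still only my guess; the sketch
shows only what happens once a key is still present.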



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #12 from Jim Jagielski <ji...@apache.org> ---
Upon review, it appears that in slotmem_filenames() there is code that will
automagically add generational data to the SHM filename... this is done by
default for Win and OS/2.

Are you able to test any fixes?



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |FixedInTrunk

--- Comment #28 from Yann Ylavic <yl...@gmail.com> ---
(In reply to Yann Ylavic from comment #14)
> Created attachment 35702 [details]
> slotmem SHMs reuse (2.4.x)

Committed to trunk in r1822509.

(In reply to Yann Ylavic from comment #24)
> Created attachment 35710 [details]
> Unique balancer id per vhost

Committed to trunk in r1822800.



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #13 from mark@blackmans.org ---
Yes, I can test fixes.



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

cbarbara@okta.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |cbarbara@okta.com



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #39 from Dasharath Masirkar <ma...@gmail.com> ---
- This issue can be worked around by configuring multiple instances of httpd to
handle such a large number of vhosts and proxy balancers instead of putting
them all in a single instance. I have tested it and it works for me.



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #3 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35698
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35698&action=edit
Also remove SHM file if any

Does this help?



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #21 from mark@blackmans.org ---
In the patched file, line 396 was updated to use "gpool", I believe. Should
line 395 have been updated as well?

    393     {
    394         if (fbased) {
    395             apr_shm_remove(fname, pool);
    396             rv = apr_shm_create(&shm, size, fname, gpool);
    397         }



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #40 from Yann Ylavic <yl...@gmail.com> ---
(In reply to Dasharath Masirkar from comment #37)
> 
> I have tested with httpd-2.4.46 and observe that With configurations in use
> (700 vhosts and 200 proxy balancer) issue still reproducible.
[]
> [Fri Aug 21 15:55:25.444413 2020] [slotmem_shm:error] [pid 20498:tid
> 140677344643200] (28)No space left on device: AH02611: create:
> apr_shm_create(/opt/community/httpd-2.4.46/logs/slotmem-shm-
> p3fb225f0_rec2_avise_api_0.shm) failed

This does not look like the same error as the original one ("No space left on
device" here versus "(17)File exists" in first comment).

If you are running out of SysV IPC shared memory segments, you should consider
increasing the limit on the system, or compiling/linking httpd with an APR
library that uses POSIX shared memory (APR compiled with --enable-posix-shm).



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #7 from mark@blackmans.org ---
Give that man a teddy bear. PID 13310 was born at 03:40:12, received a SIGHUP
at 03:47:58, and then exited permanently at 03:48:11.

[Thu Jan 25 03:40:22.300797 2018] [mpm_event:notice] [pid 13310:tid
140455729428224] AH00489: Apache/2.4.29 (Unix) OpenSSL/1.0.2n mod_fcgid/2.3.9
mod_auth_kerb/5.4 mod_qos/11.43 mod_jk/1.2.42 configured -- resuming normal
operations
[Thu Jan 25 03:40:22.300851 2018] [core:notice] [pid 13310:tid 140455729428224]
AH00094: Command line: '/apache24/bin/httpd -f
/apache24/conf/dynamic/apache24/httpd.conf -D XXXXX'
[Thu Jan 25 03:47:58.097848 2018] [mpm_event:notice] [pid 13310:tid
140455729428224] AH00494: SIGHUP received.  Attempting to restart
[Thu Jan 25 03:48:11.467544 2018] [core:notice] [pid 13310:tid 140455729428224]
AH00060: seg fault or similar nasty error detected in the parent process

So the diagnosis probably remains roughly the same: some SHM keys are not
getting removed, or are not removed quickly enough, and are still in place the
next time the same configuration starts up.

I can't yet find any trace of the seg fault it suggests, though. We do see
this line a lot:

"AH00060: seg fault or similar nasty error detected in the parent process"

but I cannot tell what it's referring to.



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #24 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35710
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35710&action=edit
Unique balancer id per vhost

It does seem that if balancer:// names are not unique, the slotmem is reused
across vhosts.

Does this patch help?
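
To illustrate the idea only: the sketch below is not the attached patch and is
not how httpd actually derives its slotmem names. slot_name() and fnv1a() are
hypothetical helpers, the FNV-1a hash is an arbitrary choice, and the output
format merely imitates the file names seen in the logs. It shows how mixing a
per-vhost string into the id keeps two vhosts that both define "balancer1"
from mapping to the same name.

```c
#include <stdint.h>
#include <stdio.h>

/* 32-bit FNV-1a over a NUL-terminated string, continuing from state h. */
static uint32_t fnv1a(uint32_t h, const char *s)
{
    for (; *s != '\0'; ++s) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/*
 * Hypothetical helper: builds a name from both the vhost and the balancer,
 * so identical balancer names in different vhosts yield different ids.
 * Returns the value of snprintf().
 */
int slot_name(char *out, size_t outlen, const char *vhost, const char *balancer)
{
    uint32_t h = 2166136261u;   /* FNV-1a offset basis */
    h = fnv1a(h, vhost);
    h = fnv1a(h, "\n");         /* separator between the two parts */
    h = fnv1a(h, balancer);
    return snprintf(out, outlen, "slotmem-shm-p%08x_%s.shm",
                    (unsigned)h, balancer);
}
```

With this, slot_name() for ("a.example.com:443", "balancer1") and for
("b.example.com:443", "balancer1") produces two different names, while the
same vhost and balancer always produce the same one. The vhost strings here
are made-up examples.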



[Bug 62044] shared memory segments are not found in global list, but appear to exist in kernel.

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62044

--- Comment #20 from mark@blackmans.org ---
Sorry, I am wrong, we are still seeing the "file exists" error in our logs.

[Tue Jan 30 09:07:05.575349 2018] [slotmem_shm:debug] [pid 3716:tid
139969799624448] mod_slotmem_shm.c(380): AH02602: create didn't find
/var/run/http/apache24/tmp/slotmem-shm-p7a67b429_balancer1.shm in global list
[Tue Jan 30 09:07:05.575357 2018] [slotmem_shm:debug] [pid 3716:tid
139969799624448] mod_slotmem_shm.c(390): AH02300: create
/var/run/http/apache24/tmp/slotmem-shm-p7a67b429_balancer1.shm: 1176/2
[Tue Jan 30 09:07:05.575398 2018] [slotmem_shm:error] [pid 3716:tid
139969799624448] (17)File exists: AH02611: create:
apr_shm_create(/var/run/http/apache24/tmp/slotmem-shm-p7a67b429_balancer1.shm)
failed
[Tue Jan 30 09:07:05.575442 2018] [:emerg] [pid 3716:tid 139969799624448]
AH00020: Configuration Failed, exiting
