Posted to modules-dev@httpd.apache.org by "Harold J. Ship" <Ha...@giant-steps-networks.com> on 2008/08/12 20:50:00 UTC

Global Data

I'm in the middle of porting an application from IIS/Windows to an Apache 2.2 module. In the application, there is a lot of global data. The data contains both:

- application configuration that is read on startup and on receiving a certain HTTP request to reload
- per-request data that is shared between certain requests with dependencies.

Other important information:
- The config data is very large, many MB
- The data structures are built with a lot of pointers to structs to pointers ...
- We are using the worker MPM.

My question is, how can we be sure that the data is stored only once on the machine, and accessible by any request that needs it?

For instance, if we store it in the server pool, and we have multiple processes, how can the request data be shared?
If we have to reload the configuration data, will each process need to maintain its own copy?
If we use shared memory, we will have to change a lot of code which today allocates data on the heap.

One idea for the request data: is there a way to direct a request to be handled by a specific process?

Harold Ship
Team Leader, Giant Steps Networks
04-678-3440 extension 106
harold@giant-steps-networks.com



Re: Global Data

Posted by Sorin Manolache <so...@gmail.com>.
On Tue, Aug 12, 2008 at 20:50, Harold J. Ship
<Ha...@giant-steps-networks.com> wrote:
> I'm in the middle of porting an application from IIS/Windows to Apache 2.2 module. In the application, there is a lot of global data. The data contains both:
>
> - application configuration that is read on startup

These data are read when there are no child processes. You can use the
conf pool for that.

>  and on receiving a certain HTTP request to reload

This has no easy solution. Because the request is handled in a child
process, I can't see any way other than shared memory.
Maybe other, more experienced readers can help you.
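
To make the shared-memory point concrete, here is a minimal POSIX sketch (not Apache code, and not anything from the original application). It assumes a hypothetical shared_state struct holding only a generation counter, mapped with mmap() before the fork. A child that "handles the reload request" bumps the counter; the parent sees the change in the shared mapping but not in an ordinary global.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS on strict compilers */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical state that every process must see (illustration only). */
typedef struct shared_state {
    int generation;   /* bumped each time the config is "reloaded" */
} shared_state;

/* Ordinary global: each process gets its own copy after fork(). */
static int g_private_generation = 0;

/* Map the shared area in the parent, before any child is forked. */
static shared_state *create_shared_state(void)
{
    void *p = mmap(NULL, sizeof(shared_state), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    shared_state *s = (shared_state *)p;
    s->generation = 0;
    return s;
}

/* Fork a child that plays the role of the process handling the reload
 * request. It updates both the shared and the private counter.
 * Returns 0 if the child ran and exited cleanly, -1 otherwise. */
static int reload_in_child(shared_state *s)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        s->generation++;          /* visible to every process */
        g_private_generation++;   /* visible to this child only */
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

After reload_in_child() returns, the parent reads generation == 1 from the mapping while g_private_generation is still 0. Two caveats for the real application: only data that lives inside the mapped region is shared, so pointer-linked structures built with malloc() would have to be allocated from that region instead, and concurrent writers need a cross-process lock. In a module, APR's apr_shm and apr_global_mutex families are the usual starting point for both.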

Maybe you could force a graceful restart of Apache from the handler:
all threads and processes stop once they finish the requests they are
handling at the moment of the restart, and then Apache starts again.
It depends on how often you expect this to happen.

> - per-request data that is shared between certain requests with dependencies.

I don't understand. Where is this per-request data stored? Does it
arrive as arguments of a request (query string/headers/POSTed body),
or is it stored in the config or in a database that is indexed by some
sort of request ID (the request URL, for example)?

If it arrives with the request and you need to store it for later
requests, then we're in the "no easy solution"-case above.

If it's in the config files or a database, then we're in the simple
case: you store it in the conf pool.

> Other important information:
> - The config data is very large, many MB
> - The data structures are built with a lot of pointers to structs to pointers ...
> - We are using the worker MPM.
>
> My question is, how can we be sure that the data is stored only once on the machine, and accessible by any request that needs it?

The conf pool.

>
> For instance, if we store it in the server pool, and we have multiple processes, how can the request data be shared?

> If we have to reload the configuration data, will each process need to maintain its own copy?

No. Conf reloading works like this:
- all threads/processes exit gracefully (more or less)
- the only remaining apache process loads the new config and spawns
the children/threads. They all inherit the new conf.
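
The inheritance described here can be illustrated outside Apache with plain fork(). This is only a sketch: config_node, load_config() and reload_and_spawn() are hypothetical names, and the "config" is just a small linked list built with malloc() in the parent. The point is that children forked afterwards can walk the parent's heap pointers unchanged, so no shared-memory API is needed for data that is read-only after startup.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical pointer-linked config entry (illustration only). */
typedef struct config_node {
    int value;
    struct config_node *next;
} config_node;

/* Built by the parent on the ordinary heap, before any child exists. */
static config_node *g_config = NULL;

/* Build a list of `count` nodes; stands in for parsing the real config. */
static config_node *load_config(int count)
{
    config_node *head = NULL;
    for (int i = 0; i < count; i++) {
        config_node *n = malloc(sizeof(*n));
        assert(n != NULL);
        n->value = i;
        n->next = head;
        head = n;
    }
    return head;
}

static void free_config(config_node *n)
{
    while (n != NULL) {
        config_node *next = n->next;
        free(n);
        n = next;
    }
}

static int count_nodes(const config_node *n)
{
    int c = 0;
    for (; n != NULL; n = n->next)
        c++;
    return c;
}

/* Fork one child and return how many nodes that child could walk.
 * The count travels back in the exit status, so keep it below 256. */
static int nodes_seen_by_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(count_nodes(g_config));   /* child: inherited pointers work */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Model of a reload: the parent replaces the data while no child is
 * running, then spawns a fresh child that inherits the new version. */
static int reload_and_spawn(int count)
{
    free_config(g_config);
    g_config = load_config(count);
    return nodes_seen_by_child();
}
```

Calling reload_and_spawn(3) and then reload_and_spawn(5) returns 3 and then 5: each new child sees whatever the parent held when it forked. The reverse does not hold. If a child changes its copy, neither the parent nor its siblings see the change, which is why a reload triggered from inside a request handler is the hard case.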

> If we use shared memory, we will have to change a lot of code which today allocates data on the heap.
>
> One idea for the request data: is there a way to direct a request to be handled by a specific process?

I am not aware of any such mechanism but I have not looked for that in
the apache sources, so I cannot tell if such a mechanism could exist.

Sorin

RE: Global Data

Posted by "Harold J. Ship" <Ha...@giant-steps-networks.com>.
>> I'm in the middle of porting an application from IIS/Windows to Apache 2.2 module.
>> In the application, there is a lot of global data. The data contains both:
>>
>> - application configuration that is read on startup and on receiving a
>> certain HTTP request to reload
>> - per-request data that is shared between certain requests with dependencies.
>>
>> Other important information:
>> - The config data is very large, many MB
>> - The data structures are built with a lot of pointers to structs to pointers ...
>> - We are using the worker MPM.
>>

> I believe this is sort of an ugly hack, but if you're not expecting too much traffic to your
> application, you may limit the number of child processes of the worker MPM to just 1.
> That way, you can safely use global data in your module, since all threads of that
> process will share the same memory space.

This is, in fact, what we are doing. We are using the Worker MPM forced
to 1 process with 128 threads. We may change the number of threads as
needed.

Re: Global Data

Posted by César Leonardo Blum Silveira <ce...@gmail.com>.
On Tue, Aug 12, 2008 at 3:50 PM, Harold J. Ship
<Ha...@giant-steps-networks.com> wrote:
> I'm in the middle of porting an application from IIS/Windows to Apache 2.2 module. In the application, there is a lot of global data. The data contains both:
>
> - application configuration that is read on startup and on receiving a certain HTTP request to reload
> - per-request data that is shared between certain requests with dependencies.
>
> Other important information:
> - The config data is very large, many MB
> - The data structures are built with a lot of pointers to structs to pointers ...
> - We are using the worker MPM.
>

I believe this is sort of an ugly hack, but if you're not expecting
too much traffic to your application, you may limit the number of
child processes of the worker MPM to just 1. That way, you can safely
use global data in your module, since all threads of that process will
share the same memory space.
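
For reference, limiting the worker MPM to one child process might look roughly like the fragment below. It is a sketch for Apache 2.2, not a tested configuration; the thread count of 128 is only an example value and should be sized to the expected load.

```apache
<IfModule mpm_worker_module>
    # Hard limits first: one child process, up to 128 threads in it.
    ServerLimit           1
    ThreadLimit         128

    StartServers          1
    ThreadsPerChild     128
    # Total simultaneous requests = ServerLimit x ThreadsPerChild.
    MaxClients          128

    # 0 means the child is never recycled, so in-process global data
    # is not lost after a fixed number of requests.
    MaxRequestsPerChild   0
</IfModule>
```

ThreadLimit has to be raised because its default for the worker MPM is lower than 128, and ThreadsPerChild cannot exceed it. Note that even with one process, the threads still need their own locking around any global data that can change at run time.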

> My question is, how can we be sure that the data is stored only once on the machine, and accessible by any request that needs it?
>
> For instance, if we store it in the server pool, and we have multiple processes, how can the request data be shared?
> If we have to reload the configuration data, will each process need to maintain its own copy?
> If we use shared memory, we will have to change a lot of code which today allocates data on the heap.
>
> One idea for the request data: is there a way to direct a request to be handled by a specific process?
>
> Harold Ship
> Team Leader, Giant Steps Networks
> 04-678-3440 extension 106
> harold@giant-steps-networks.com
>
>
>



-- 
César L. B. Silveira
http://www.cesarbs.org/blog