Posted to dev@apr.apache.org by Justin Erenkrantz <je...@ebuilt.com> on 2001/07/16 08:43:26 UTC

Design of pools and threads

On Sun, Jul 15, 2001 at 07:20:28PM -0700, rbb@covalent.net wrote:
> Pools are by their very nature hierarchical.  That is why the relationship
> is there, and why it needs to remain there.  You can't just rely on the
> thread to clean itself up.  Pools (and SMS's) are used for a LOT more than
> just memory allocation in the server.  They make sure that descriptors are
> closed correctly.  In some places, they are used to make sure that
> anything the child opened that isn't returned to the OS automatically is
> returned when the child goes away.
> 
> If you divorce the thread pool from the pool for the child, you WILL break
> many assumptions in the server.

Here is what I understand the situation to be.  Please correct me if 
any of my statements are incorrect.

- A parent may not terminate any child threads forcibly (via a system
  call or other mechanism).  The best mechanism we can hope for (at this
  time - again no real progress has been made on alternatives) is to
  signal the thread in some way and let it honor the request by
  terminating gracefully.
- For various safety/robustness reasons, we have decided not to
  implement a pthread_cancel()-like mechanism.  There are three reasons
  for this.  First, we may arbitrarily cancel the thread even if it is
  in the middle of an important transaction.  (It is true that we could
  set cancellation points in the APR-using programs to prevent
  important transactions from being cancelled.)  Second, certain
  operations (acquiring mutexes) are non-cancellable.  Third, when
  issuing a pthread_cancel-like call, certain OSes may leak the
  resources of a thread.  With all of these factors in consideration,
  Dean, Manoj, and you (and others) decided not to implement a
  pthread_cancel-like mechanism in APR.  I do think that this was the
  correct decision.
- Since APR has no way of forcing a thread to exit, this agreement lives 
  entirely outside of the bounds of APR.  What I mean by this is that 
  the APR library cannot provide this thread-exit mechanism 
  transparently to APR-using programs via cleanups.  Only the thread 
  itself may choose to shut down.  If the thread so wishes, it may 
  consult any mechanisms it desires (such as shared memory, sockets, 
  pipes, etc.).  But, we have no clean way for a parent to kill a child 
  thread that does not wish to exit.
- As an example of what I mean, please look at the threaded MPM.  Each 
  worker thread is in control of when it may exit, and the mechanism 
  for determining when to exit lies outside of APR (via the 
  process-local workers_may_exit variable and the POD).
- You have suggested that a thread pool has a relationship to its parent
  (the pool of the process that created the thread).  I believe this
  assumption is invalid IF we are to attempt an independent per-thread 
  SMS.  And, I believe that a per-process memory model does not work 
  with a threaded httpd architecture.
- Why do we want an independent per-thread SMS?  Because it removes
  the requirement for locking.  A thread cannot be reentrant upon
  itself (executing in two places at once); it has a single flow of
  control.  Memory allocation without the need for locking should 
  be a good thing.  In order to scale well with threads, I believe httpd 
  requires an independent per-thread SMS (with a thread-local free list).
- Does merely having a per-thread SMS (and its associated thread-local 
  free list) necessitate a break from its parent SMS (i.e. independent)?
  Yes, I believe so.  If we were to assume that the per-thread SMS had 
  a parent/child relationship to the per-process SMS, then we must now 
  have locking.  By design, an SMS must allocate memory from its parent.
  Since the parent SMS would now be reentrant (as the parent, it has 
  spawned multiple threads that are executing in parallel all asking it 
  for memory), the parent SMS must acquire a lock before allocating 
  memory.  This lock is the very thing we are trying to avoid in the 
  first place.  Also, when a parent/child relationship exists between 
  two SMSes, the parent may clean up the child SMS when the parent is
  destroyed.
- However, for reasons described above, it is impossible for the parent
  to clean up its child threads forcibly.  The only thing the parent 
  can do is wait for the thread to exit on its own (via
  apr_thread_join).  The parent may only make hints (such as 
  workers_may_exit in threaded MPM) to the thread that it should exit.
- So, would having an independent per-thread SMS break any assumptions
  in APR or httpd?  I don't think so.  Why?  I believe that the thread
  is a logical breaking-off point for memory allocation.  If we
  assume a thread creates an SMS when it begins, all subsequent
  processing in that thread will use this SMS.  When this independent
  per-thread SMS is destroyed (as the thread is exiting - remember we
  have assumed that the thread must exit voluntarily), the memory is
  reclaimed.  All descriptors or sockets opened during that thread's
  life are now returned to the OS.
- Furthermore, we know that the process may not exit before its child
  threads exit.  For the entire lifetime of the thread, any 
  process-local variables in use by the threads must remain valid.  
  This is inherent in the threading model of all the operating systems 
  we encounter.  A malicious parent could indeed reset its SMS 
  *before* allowing all of its children to exit.  But, let's not worry
  about that (or any error conditions - we're screwed anyway).
- Again, let's consider the case of the threaded MPM.  The parent of 
  the threads (i.e. child process) will only destroy the process-level 
  SMS when it calls clean_child_exit().  Any per-process open 
  descriptors are now closed as the process terminates.  This call 
  occurs after the call to apr_thread_join() on all started threads.  
  No more child threads are present when this code is called.  Again, 
  this code is correct.
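
The arrangement described in the list above can be sketched roughly as
follows.  This is only an illustration using POSIX threads and C11
atomics, not APR or threaded MPM code.  The names thread_arena,
arena_alloc, worker, run_workers and MAX_WORKERS are made up for the
example, and workers_may_exit merely borrows the name of the MPM
variable.  Each worker owns a private arena (a stand-in for an
independent per-thread SMS), allocates from it without any lock, and
frees it itself once it notices the parent's hint.  The parent can only
set the hint and join.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

#define MAX_WORKERS 16

/* Hypothetical stand-in for a per-thread SMS: a bump allocator that is
 * owned by exactly one thread, so it never takes a lock. */
typedef struct {
    char  *base;
    size_t size;
    size_t used;
} thread_arena;

/* The parent's hint, in the spirit of workers_may_exit. */
static atomic_int workers_may_exit;

static void *arena_alloc(thread_arena *a, size_t n)
{
    assert(a->used <= a->size);
    if (n > a->size)
        return NULL;
    n = (n + 15u) & ~(size_t)15u;          /* round up to 16 bytes */
    if (n > a->size - a->used)
        return NULL;
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

static void *worker(void *arg)
{
    long *requests = arg;
    thread_arena arena = { malloc(4096), 4096, 0 };

    if (arena.base == NULL)
        return NULL;
    do {
        arena.used = 0;                       /* reset between "requests" */
        char *buf = arena_alloc(&arena, 128); /* no mutex involved */
        if (buf != NULL)
            buf[0] = 'x';
        ++*requests;
    } while (!atomic_load(&workers_may_exit));

    free(arena.base);   /* the thread tears down its own arena on exit */
    return NULL;
}

/* Starts up to nthreads workers, hints that they should exit, and waits
 * for them.  Returns the number of threads joined.  requests[] must have
 * room for nthreads counters. */
int run_workers(int nthreads, long *requests)
{
    pthread_t tids[MAX_WORKERS];
    int started = 0, joined = 0;

    if (nthreads > MAX_WORKERS)
        nthreads = MAX_WORKERS;
    atomic_store(&workers_may_exit, 0);
    for (int i = 0; i < nthreads; i++) {
        requests[i] = 0;
        if (pthread_create(&tids[i], NULL, worker, &requests[i]) != 0)
            break;
        started++;
    }
    atomic_store(&workers_may_exit, 1);   /* a hint only, nothing is forced */
    for (int i = 0; i < started; i++)
        if (pthread_join(tids[i], NULL) == 0)
            joined++;
    return joined;   /* only now is per-process state safe to destroy */
}
```

A caller would do something like "long n[4]; run_workers(4, n);" and
only destroy per-process resources after run_workers() returns, which
mirrors the ordering of apr_thread_join() and clean_child_exit()
described above.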

I hope I now have made my position and rationale clearer.  Again, I may
have misinterpreted code or design or made blatantly wrong statements.
I heartily welcome any feedback or corrections.  I am human - 
therefore, I am usually wrong.

Respectfully yours,
Justin Erenkrantz
jerenkrantz@ebuilt.com

Re: Design of pools and threads

Posted by Justin Erenkrantz <je...@ebuilt.com>.

On Mon, Jul 16, 2001 at 10:58:17AM -0700, Brian Pane wrote:
>   * The parent pool should control the cleanup of its children
>     to ensure timely resource release (important for memory, but
>     even more important for file descriptors).  In the case of
>     an httpd with a high request volume, this cleanup needs to
>     happen for a request and all its subrequests right after
>     the response is sent; otherwise we

Correct except for the case of a parent/child relationship across 
threads, as the parent can't clean up the children in this case.

FWIW, I'm not talking about the SMS used for a request.  It could be 
cleaned up (i.e. reset) after the request is completed (exactly like 
it is now - no code changes).  It'd free the resources - just like now.
We're simply divorcing its ancestor (the per-thread SMS) from the 
per-process SMS.  The children of this per-thread SMS still have the 
same manner of operation as before.  -- justin

Re: Design of pools and threads

Posted by Brian Pane <bp...@pacbell.net>.

Justin Erenkrantz wrote:
[...]

>- Why do we want an independent per-thread SMS?  Because it now removes 
>  the requirement for locking.  A thread may not be reentrant upon
>  itself - executing in two places at once.  It's a thread with one flow
>  of control.  Memory allocation without the need for locking should 
>  be a good thing.  In order to scale well with threads, I believe httpd 
>  requires independent (thread-local free list) per-thread SMS.
>- Does merely having a per-thread SMS (and its associated thread-local 
>  free list) necessitate a break from its parent SMS (i.e. independent)?
>  Yes, I believe so.  If we were to assume that the per-thread SMS had 
>  a parent/child relationship to the per-process SMS, then we must now 
>  have locking.  By design, a SMS must allocate memory from its parent.
>  Since the parent SMS would now be reentrant (as the parent, it has 
>  spawned multiple threads that are executing in parallel all asking it 
>  for memory), the parent SMS must acquire a lock before allocating 
>  memory.  This lock is the very thing we are trying to avoid in the 
>  first place.  Also, when a parent/child relationship exists between 
>  two SMS, the parent may clean up the child's SMS when it is destroyed.
>- However, for reasons described above, it is impossible for the parent
>  to clean up its child threads forcibly.  The only thing the parent 
>  can do is wait for the thread to exit on its own (via
>  apr_thread_join).  The parent may only make hints (such as 
>  workers_may_exit in threaded MPM) to the thread that it should exit.
>- So, would having an independent per-thread SMS break any assumptions
>  in APR or httpd?  I don't think so.  Why?  I believe that the thread
>  is a logical breaking-off point for memory allocation.  If we were to
>  assume a thread were to create a SMS when it begins, all subsequent
>  processing in that thread will use this SMS.  When this independent
>  per-thread SMS is destroyed (as the thread is exiting - remember we
>  have assumed that the thread must exit voluntarily), the memory is
>  reclaimed.  All descriptors or sockets opened during that thread's
>  life are now returned to the OS.
>
I really think we need a hybrid model that splits the two roles
of a parent in the SMS design.  One of my messages in the archives
talks about this in more detail, but the basic idea is:
  * The parent pool should control the cleanup of its children
    to ensure timely resource release (important for memory, but
    even more important for file descriptors).  In the case of
    an httpd with a high request volume, this cleanup needs to
    happen for a request and all its subrequests right after
    the response is sent; otherwise we
  * But the parent doesn't need to be where the children go
    to get blocks when they need more memory; instead, they
    should call directly to the most efficient source of memory
    that's suitable for the application (e.g., a per-thread SMS).
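
To make the split between the two roles concrete, here is a rough
sketch in C.  It is not SMS or pool code from APR.  Every name in it
(mem_source, source_get, source_put, pool, pool_create, pool_alloc,
pool_cleanup_register, pool_destroy, fake_close) is hypothetical, and
the "memory source" just wraps malloc and counts outstanding blocks
where a real one might keep a per-thread free list.  The point is only
that the parent pointer decides cleanup order while a separate source
pointer decides where blocks come from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical memory source, e.g. one per thread. */
typedef struct { long outstanding; } mem_source;

static void *source_get(mem_source *s, size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        s->outstanding++;
    return p;
}

static void source_put(mem_source *s, void *p)
{
    assert(s->outstanding > 0);
    s->outstanding--;
    free(p);
}

/* Header in front of every block so the pool can find it again. */
typedef union block { union block *next; max_align_t align; } block;

typedef struct cleanup {
    void (*fn)(void *);
    void *data;
    struct cleanup *next;
} cleanup;

typedef struct pool {
    struct pool *parent, *children, *sibling; /* role 1: cleanup order */
    mem_source  *source;                      /* role 2: where blocks come from */
    block       *blocks;
    cleanup     *cleanups;
} pool;

pool *pool_create(pool *parent, mem_source *source)
{
    pool *p = source_get(source, sizeof *p);
    if (p == NULL)
        return NULL;
    p->parent = parent;
    p->children = NULL;
    p->sibling = NULL;
    p->source = source;
    p->blocks = NULL;
    p->cleanups = NULL;
    if (parent != NULL) {               /* link for cleanup purposes only */
        p->sibling = parent->children;
        parent->children = p;
    }
    return p;
}

void *pool_alloc(pool *p, size_t n)
{
    if (n > SIZE_MAX - sizeof(block))
        return NULL;
    block *b = source_get(p->source, sizeof *b + n); /* parent not consulted */
    if (b == NULL)
        return NULL;
    b->next = p->blocks;
    p->blocks = b;
    return b + 1;
}

int pool_cleanup_register(pool *p, void (*fn)(void *), void *data)
{
    cleanup *c = pool_alloc(p, sizeof *c);
    if (c == NULL)
        return -1;
    c->fn = fn;
    c->data = data;
    c->next = p->cleanups;
    p->cleanups = c;
    return 0;
}

void pool_destroy(pool *p)
{
    while (p->children != NULL)         /* each child unlinks itself */
        pool_destroy(p->children);
    for (cleanup *c = p->cleanups; c != NULL; c = c->next)
        c->fn(c->data);                 /* most recently registered first */
    if (p->parent != NULL) {
        pool **pp = &p->parent->children;
        while (*pp != p)
            pp = &(*pp)->sibling;
        *pp = p->sibling;
    }
    mem_source *s = p->source;
    while (p->blocks != NULL) {
        block *b = p->blocks;
        p->blocks = b->next;
        source_put(s, b);
    }
    source_put(s, p);
}

/* Example cleanup: pretend to close a descriptor by bumping a counter. */
void fake_close(void *data) { ++*(int *)data; }
```

With this shape, a request pool can be created as a child of a
per-process pool while drawing its blocks from a different source, and
destroying the parent still runs the child's cleanups and hands its
blocks back to the source they came from.  The sketch deliberately
ignores which thread is allowed to call pool_destroy().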

--Brian