You are viewing a plain text version of this content. The canonical link for it is here.
Posted to c-dev@axis.apache.org by Samisa Abeysinghe <sa...@gmail.com> on 2006/11/04 10:23:04 UTC
Re: [Axis2]thoughts re development towards 1.0

Hi Chris,
    Thank you for the detailed review, this really helps - appreciate it 
very much.
    I do agree with your concerns on memory leaks and the difficulty of 
dealing with that in  the absence of a pool.
    And improvements you have suggested makes quite sense.
    However I need more time to understand some of the stuff - I am 
traveling this week, so it will take some time.
    May be others can have look in the mean time :)

Thanks,
Samisa...
Chris Darroch wrote:
> Hi --
>
>    I've been working fairly intensively with Axis2/C over the last month
> or two in order to build some applications, and I thought I might share
> two major concerns that have arisen for me.  I apologize if any of these
> points have been discussed before!
>
>    Before I explain in detail, if I could just put in a request for
> someone to review the latest patches I've attached to AXIS2C-380
> (https://issues.apache.org/jira/browse/AXIS2C-380) regarding a number
> of mod_axis2 cleanups and fixups.  I've tried to fix and clean up
> the configuration directives, add axis2_error_init(), move
> axiom_xml_reader_init(), and enable valid SOAP faults from application
> services and also when services and operations are not found.  Thanks
> in advance!
>
>    In brief, my two concerns are that (A) Axis2/C doesn't use the APR
> libraries (in particular, it doesn't use APR memory pools), and (B)
> that it relies on extensive use of indirect function calls through
> tables of function pointers instead of making simple function calls.
> I'll walk through my thinking on both issues below.
>
>
> A) There's a lot of code duplication between the Axis2/C util library
> and APR.  That's too bad, because it means that bug fixes to APR
> won't be automatically inherited by Axis2/C.  From a presentation
> I attended at last month's ApacheCon U.S. I learned that the original
> plan was to provide an abstraction layer above a portability library
> like APR, and that that got delayed by time constraints.  I don't
> know myself if such an extra layer is really valuable or not, but I
> would have thought that it was better to at least use APR than to
> reproduce portions of it.
>
>    At any rate, I'm not suggesting that much can be done about this
> in the short term, at least for a 1.0 release.  However, I am
> concerned in particular about the absence of APR memory pools and
> the use instead of AXIS2_MALLOC() and AXIS2_FREE().
>
>    My concern comes about because I'm hoping to run mod_axis2 in
> a production system.  That means that I'm worried about gradual
> memory leaks that occur when memory is allocated but not deallocated
> appropriately.  Given the size of the Axis2/C codebase, I think it's
> a reasonable assumption that there are a number of places where
> this occurs -- and that's not a major criticism, since this is one
> of the classic hard things about C programming.
>
>    APR memory pools make life a lot easier.  When writing an
> Apache httpd module, there's basically two kinds of pools one wants
> to consider using -- either a pool that lasts for the lifetime of an
> httpd child process, or a pool that lasts for the lifetime of an
> HTTP request.  (I'm leaving aside for now the threading issues that
> can arise when dealing with process-lifetime pools when using a
> threaded MPM.)
>
>    The attached patchset is a first hack at making mod_axis2 use
> these two kinds of pools.  It works, sort of, but after a few
> requests the child processes segfault.  (It doesn't include
> patches to rampart or guththila, both of which seems to contain
> a few calls to AXIS2_REALLOC which would need to be rewritten.)
> The segfaults always seems to happen at the following level of
> the calling stack (for me, anyway):
>
> axis2_engine_receive()
>   AXIS2_MSG_RECV_RECEIVE()
>     axis2_raw_xml_in_out_msg_recv_receive_sync()
>       axis2_core_utils_create_out_msg_ctx()
>         AXIS2_MSG_CTX_SET_SVC_CTX()
>           axis2_msg_ctx_set_svc_ctx()
>             axis2_msg_ctx_set_svc()
>
>    I don't have time to pursue this further right now, but I think
> it would be really valuable before the 1.0 release to ensure that
> mod_axis2 worked using some kind of similar technique.
>
>    It seems to me that what might be the logical next steps would be:
>
> 1) Add to the env structure a second allocator, named something like
>    resident_allocator or config_allocator or something.
> 2) Analyze which data structures in Axis2/C were effectively
>    "configuration" data and needed to stay in memory for the lifetime
>    of a process.  This should include any such data that needs to
>    be modified or deleted over the lifetime of the server process.
> 3) For each such structure, ensure that it is created, modified,
>    and deleted using only the env->resident_allocator.  This could
>    be accomplished a variety of ways (a private sub-allocator,
>    for example).  The main concern is thread-safety; if multiple
>    threads might be managing this structure at the same time,
>    a thread mutex will need to be used as well.
>
>    At this point, everything else should be data which can be
> cleaned up automatically when the httpd request->pool is cleared
> at the end of each HTTP request.  Then you could even go through
> the code and remove all the calls to AXIS2_FREE() if you really
> felt like it!  The key advantage is that it means that each
> httpd server process is guaranteed to only increase in size if
> the "configuration" data grows, and this should presumably be by
> only a trivial amount.
>
>
> B) I mentioned to a few people I met at a presentation on Axis2/C
> at ApacheCon my concern about the all the function pointers in
> all of the "ops" structures.  The heavy use of all this indirection
> means that the compiler and linker can't help detect when you're
> about to call a function but your function pointer is NULL.
> Making simple, normal function calls means that you can rely on
> the compiler and linker to give you a lot of helpful errors about
> such things before you reach runtime.
>
>    Now, again I understand that this isn't going to be something
> that can change for a 1.0 release.  However, maybe something can
> be done that would help a bit.  Let me point to an example;
> unfortunately I don't have the exact details to hand any more, but
> I think it still illustrates my concern.
>
>    When I was working through the problems I described in AXIS2C-326
> and AXIS2C-322 (https://issues.apache.org/jira/browse/AXIS2C-326 and
> http://issues.apache.org/jira/browse/AXIS2C-322) I came across
> a situation where a valid WSDL file, fed into woden, caused a
> segfault.  It turned out to be because the woden code was expecting
> an attribute where the WSDL spec says you don't absolutely need one.
> I no longer have all the details at hand but it doesn't really matter
> for the purposes of discussion.
>
>    Ideally, obviously, woden should accept the WSDL and proceed.
> But, if it really isn't going to work without the attribute -- if
> the WSDL is just nonsensical, or whatever -- it should at least
> print a nice message like "expecting attribute 'foo' at line 123
> of file 'bar.wsdl'".
>
>    To do that, the first thing it needs to do is not crash.  There
> were two segfault problems I had to work through.  The first is
> a relatively simple coding error that happens to appear in a bunch
> of places in woden; see my nodes in AXIS2C-384
> (https://issues.apache.org/jira/browse/AXIS2C-384) about that.
>
>    The code where I recall the problem turned up looked like this
> (I might be off by a few lines but that doesn't matter to the larger
> point):
>
> int_msg_ref = woden_wsdl10_interface_msg_ref_to_interface_msg_ref_element(
>                   int_msg_ref, env);
> intf_msg_qname = WODEN_WSDL10_INTERFACE_MSG_REF_ELEMENT_GET_QNAME(
>                      int_msg_ref, env);
>
>    The WODEN_WSDL10_INTERFACE_MSG_REF_ELEMENT_GET_QNAME macro is
> defined like this (with some whitespace for clarity):
>
> #define WODEN_WSDL10_INTERFACE_MSG_REF_ELEMENT_GET_QNAME(
>             interface_msg_ref_element, env)
>   (((woden_wsdl10_interface_msg_ref_element_t *) interface_msg_ref_element)->
>             ops->get_qname(interface_msg_ref_element, env))
>
>    The AXIS2C-322/384 issue meant that int_msg_ref was unexpectedly
> NULL and so the macro would cause a segfault.  It seems to me like
> this sort of thing could happen quite a bit, given the way the
> code is structured as a whole, and that's the crux of my concern here.
>
>    At ApacheCon, someone pointed out that the Axis2/C code does
> a lot of parameter checking, but that's not going to help here.
> Suppose that woden_wsdl10_interface_msg_ref_get_qname(), which is
> what the macro calls, performed a nice argument check:
>
> AXIS2_PARAM_CHECK(env->error, interface_msg_ref, AXIS2_FAILURE);
>
> that still won't help, because the macro is still going to fail
> if either its interface_msg_ref_element argument is NULL, or if
> the ops structure pointer is NULL, or if the get_qname function
> pointer is NULL.
>
>    In a more traditional C program, you might have something like:
>
> int_msg_ref = woden_wsdl10_interface_msg_ref_to_interface_msg_ref_element(
>                   int_msg_ref, env);
> intf_msg_qname = woden_wsdl10_interface_msg_ref_get_qname(
>                      int_msg_ref, env);
>
> and if the get_qname() function did an AXIS2_ENV_CHECK() and an
> AXIS2_PARAM_CHECK() on its two arguments, you'd be nicely protected
> against segfaults.  The compiler and linker would caution you if
> were calling a function that wasn't defined.
>
>     Plus, if AXIS2_PARAM_CHECK() did an AXIS2_LOG_ERROR()
> then you'd even get an error message showing exactly which file
> and line of code encountered a NULL function argument.
>
>    With the current indirect function calling scheme, you're at
> the mercy of the various "constructor" functions; if they fail to
> fill in function pointers then you're in trouble.  You're also
> going to fail if you pass a NULL to a macro.  And in all these
> cases there's nothing in the logs to help someone.
>
>    What I might propose is a compromise, since clearly the function
> pointer technique is used extensively and can't just be replaced.
> Here are my proposals:
>
> 1) AXIS2_ENV_CHECK() and AXIS2_PARAM_CHECK() should call
> AXIS2_LOG_ERROR() in the case where the argument is NULL.  Because
> these are macros they'll expand so that AXIS2_LOG_SI gives file
> and line information on the place where the NULL argument was
> encountered.
>
> 2) All the macros that perform indirect function calling through
> the many ops structures should be expanded to first call
> AXIS2_ENV_CHECK() and AXIS2_PARAM_CHECK() on their key env
> and "interface" structure pointer arguments.  It might be good
> to also check for a NULL ops structure pointer and a NULL function
> pointer in the ops structure.
>
>    This would mean that whenever you called one of these many
> macros you'd be automatically protected against failure due to
> NULLs, and you'd get a helpful error message too.
>
>
>    So ... a long email, but hopefully not too unpleasant a read.
> I'm going to have to largely ignore email next week while I work
> toward a deadline, but please do reply to the list or to me
> in person with comments, criticisms, flames, etc.  Thanks for
> listening!
>
> Chris.
>
>   


---------------------------------------------------------------------
To unsubscribe, e-mail: axis-c-dev-unsubscribe@ws.apache.org
For additional commands, e-mail: axis-c-dev-help@ws.apache.org