You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Kean Johnston <ke...@gmail.com> on 2013/12/12 01:00:57 UTC

Do pools lead to bad programming?

Hi all,

So I've been spending a fair bit of time inside Apache recently and I've 
seen a pattern. Consider the following code (from mod_proxy_fcgi.c):

     apr_uri_t *uri = apr_palloc(r->pool, sizeof(*uri));

     ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO(01076)
                   "url: %s proxyname: %s proxyport: %d",
                  url, proxyname, proxyport);

     if (strncasecmp(url, "fcgi:", 5) != 0) {
         ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO(01077) 
"declining URL %s", url);
         return DECLINED;
     }

That allocation of uri can be moved down (further than the code shown 
above) until right before it is used. I've seen this in a number of places 
and it "feels" like it is considered OK because the pool is relatively 
short lived and in most cases I've seen so far the allocation is pretty 
small. But as a concept, this strikes me as bad. If that code was using a 
traditional malloc/free the above would be a memory leak. I am aware that 
pools provide protection against such leaks, but nonetheless, that is 
wasted memory allocated and although quick, also a waste of time.

Am I being too obsessive? If not, would you like patches to correct these 
as I find them, and if so, should I open a bug about this or just post 
patches here (they are all likely to be a simple move of 1 or 2 lines)?

Sincerely,
Kean

Re: Do pools lead to bad programming?

Posted by Nick Kew <ni...@webthing.com>.
On 12 Dec 2013, at 00:00, Kean Johnston wrote:

> Hi all,
> 
> So I've been spending a fair bit of time inside Apache recently and I've seen a pattern. Consider the following code (from mod_proxy_fcgi.c):

That's just minor sloppiness.  Fixing it is a minor improvement,
and if you have the time and inclination, go right ahead!

> Am I being too obsessive?

Nope.  If you want to submit patches (bugzilla is probably better than this list)
they have the advantage of being easy to review, so with luck they won't
just languish there!

Not sure I'm keen on your thread title.  If pools lead to sloppiness,
what about garbage collection?  Let alone scripting languages!

-- 
Nick Kew


Re: Do pools lead to bad programming?

Posted by Ben Reser <be...@reser.org>.
On 12/11/13 4:00 PM, Kean Johnston wrote:
> Am I being too obsessive? If not, would you like patches to correct these as I
> find them, and if so, should I open a bug about this or just post patches here
> (they are all likely to be a simple move of 1 or 2 lines)?

There are two ways this sort of thing can happen.

The person who wrote the original code didn't feel like breaking the
declaration of the variable and the allocation into two lines.  Which is a mistake.

Alternatively, the allocation could have always been necessary and the only
difference was the ordering when it happened.  I.E. there may have not been any
returns before the use of the variable.  This would be an error on the part of
the person changing the code.

Looking at this specific case, the variable was added in r357444 as part of the
original template of mod_proxy_fcgi.  So it seems like an error on the original
developers part.

It certainly seems worthwhile to clean these up when they are found in my opinion.

I wouldn't say that pools lead to bad programming.  It's mostly that pools
limit the consequences of these sorts of mistakes.  The mistakes are going to
happen regardless.

Re: Do pools lead to bad programming?

Posted by Daniel Lescohier <da...@cbsi.com>.
The format string is a multiplier of length.  4k repeated %Y elements in
the 'a' variable's format string is 8kbytes in the format string, but the
result would take 16kbytes.  Then you have things like %B: the full month
name according to the current locale.  You cannot really predict the length
without actually doing the strftime.

I figured 256 bytes is more than reasonable.  Normal time strings are < 32
bytes; double that, and quadruple that again (e.g. if there are utf-8 chars
in the string), and you get 256 bytes.  If people need more, one can use in
LogFormat %{formatstring1}t%{formatstring2}t.


On Fri, Dec 13, 2013 at 10:14 AM, Yann Ylavic <yl...@gmail.com> wrote:

> On Fri, Dec 13, 2013 at 5:06 AM, Daniel Lescohier <
> daniel.lescohier@cbsi.com> wrote:
>
>> Here is my draft replacement:
>>
>> static const char *log_request_time_custom(request_rec *r, char *a,
>>                                            apr_time_exp_t *xt)
>> {
>>     static const apr_size_t buf_len = 256;
>>     apr_size_t tstr_len;
>>     char *tstr = apr_palloc(r->pool, buf_len);
>>
>> apr_strftime(tstr, &tstr_len, buf_len, a, xt);
>>     return tstr_len ? tstr : "-";
>> }
>>
>
> Shouldn't [MAX_STRING_LEN > buf_len > strlen(a)] for apr_strftime() not to
> truncate the result (for any resonnable 'a')?
>
>

Re: Do pools lead to bad programming?

Posted by Yann Ylavic <yl...@gmail.com>.
On Fri, Dec 13, 2013 at 5:06 AM, Daniel Lescohier <daniel.lescohier@cbsi.com
> wrote:

> Here is my draft replacement:
>
> static const char *log_request_time_custom(request_rec *r, char *a,
>                                            apr_time_exp_t *xt)
> {
>     static const apr_size_t buf_len = 256;
>     apr_size_t tstr_len;
>     char *tstr = apr_palloc(r->pool, buf_len);
>
> 
> apr_strftime(tstr, &tstr_len, buf_len, a, xt);
>     return tstr_len ? tstr : "-";
> }
>

Shouldn't [MAX_STRING_LEN > buf_len > strlen(a)] for apr_strftime() not
to truncate the result (for any resonnable 'a')?

Re: Do pools lead to bad programming?

Posted by Yann Ylavic <yl...@gmail.com>.
On Fri, Dec 13, 2013 at 5:06 AM, Daniel Lescohier <daniel.lescohier@cbsi.com
> wrote:

>     char server_portstr[sizeof(apr_port_t)*241/100+3]; /* log10(256) is
> 2.408 */
>

Nice :)

Re: Do pools lead to bad programming?

Posted by Daniel Lescohier <da...@cbsi.com>.
We can also save stack space by changing:
    char server_portstr[32];
to:
    char server_portstr[6];
Or if we want to future-proof against the small possibility of a new TCP
standard that has larger port numbers and negative port numbers:
    char server_portstr[sizeof(apr_port_t)*241/100+3]; /* log10(256) is
2.408 */

As part of my time caching work, I'm fixing a memory buffer issue in
mod_log_config.c.  Here is the current code:

static const char *log_request_time_custom(request_rec *r, char *a,
                                           apr_time_exp_t *xt)
{
    apr_size_t retcode;
    char tstr[MAX_STRING_LEN];
    apr_strftime(tstr, &retcode, sizeof(tstr), a, xt);
    return apr_pstrdup(r->pool, tstr);
}

This function needs the string on the heap, because the string is returned
to the caller.  By first using stack memory, one has to add a memory copy
before returning.  Also, this expands the stack by 8KiB, and stack might be
a limited resource on some OSes.  The final problem is that the retcode
isn't checked: if the retcode is 0, the content of the buffer is undefined,
and shouldn't be used; in theory, even 8KiB might not be a big enough
buffer.

Here is my draft replacement:

static const char *log_request_time_custom(request_rec *r, char *a,
                                           apr_time_exp_t *xt)
{
    static const apr_size_t buf_len = 256;
    apr_size_t tstr_len;
    char *tstr = apr_palloc(r->pool, buf_len);
    apr_strftime(tstr, &tstr_len, buf_len, a, xt);
    return tstr_len ? tstr : "-";
}



On Thu, Dec 12, 2013 at 7:37 AM, Yann Ylavic <yl...@gmail.com> wrote:

> On Thu, Dec 12, 2013 at 1:54 AM, Kean Johnston <ke...@gmail.com>wrote:
>
>> I'd love to see these things fixed, because they add up. If you post them
>>> here they are likely to be reviewed very quickly, as they'll no doubt be
>>> simple to review.
>>>
>> Cool. Here's a patch for the case I just mentioned. It also eliminates an
>> un-needed automatic (yes, I obsess about stack frames too). diff against
>> trunk.
>>
>
> Note that in this particular cases (ie. "uri = apr_palloc(p,
> sizeof(*uri))" used in proxy_(ajp|fcgi|http|scgi|wstunnel)_handler), the
> "local" uri would better be declared on the stack (ie. apr_uri_t uri; and
> &uri used accordingly), so to avoid the allocation completely.
>
> Regards,
> Yann.
>
>

Re: Do pools lead to bad programming?

Posted by Yann Ylavic <yl...@gmail.com>.
On Thu, Dec 12, 2013 at 1:54 AM, Kean Johnston <ke...@gmail.com>wrote:

> I'd love to see these things fixed, because they add up. If you post them
>> here they are likely to be reviewed very quickly, as they'll no doubt be
>> simple to review.
>>
> Cool. Here's a patch for the case I just mentioned. It also eliminates an
> un-needed automatic (yes, I obsess about stack frames too). diff against
> trunk.
>

Note that in this particular cases (ie. "uri = apr_palloc(p,
sizeof(*uri))" used in proxy_(ajp|fcgi|http|scgi|wstunnel)_handler), the
"local" uri would better be declared on the stack (ie. apr_uri_t uri; and
&uri used accordingly), so to avoid the allocation completely.

Regards,
Yann.

Re: Do pools lead to bad programming?

Posted by Christophe JAILLET <ch...@wanadoo.fr>.
Le 12/12/2013 01:54, Kean Johnston a écrit :
>> I'd love to see these things fixed, because they add up. If you post 
>> them here they are likely to be reviewed very quickly, as they'll no 
>> doubt be simple to review.
> Cool. Here's a patch for the case I just mentioned. It also eliminates 
> an un-needed automatic (yes, I obsess about stack frames too). diff 
> against trunk.
>
> Kean

If interested in stack frame, you can have a look at 
http://mail-archives.apache.org/mod_mbox/httpd-dev/201306.mbox/%3Ckp1fo8$v6b$1@ger.gmane.org%3E

I've updated the files yesterday with latest trunk.
If you wish, I can send the script used to generate them.

Best regards,
CJ


Re: Do pools lead to bad programming?

Posted by Kean Johnston <ke...@gmail.com>.
> I'd love to see these things fixed, because they add up. If you post them here they are likely to be reviewed very quickly, as they'll no doubt be simple to review.
Cool. Here's a patch for the case I just mentioned. It also eliminates an 
un-needed automatic (yes, I obsess about stack frames too). diff against trunk.

Kean

Re: Do pools lead to bad programming?

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Dec 11, 2013, at 7:15 PM, Graham Leggett <mi...@sharp.fm> wrote:

> 
> Obviously allocating too early and then throwing away the results of the allocation is a waste as you've pointed out, and should ideally be smoked out and fixed.
> 

Agreed.


Re: Do pools lead to bad programming?

Posted by Christophe JAILLET <ch...@wanadoo.fr>.
Le 12/12/2013 01:15, Graham Leggett a écrit :
> Obviously allocating too early and then throwing away the results of 
> the allocation is a waste as you've pointed out, and should ideally be 
> smoked out and fixed.
> I'd love to see these things fixed, because they add up. If you post them here they are likely to be reviewed very quickly, as they'll no doubt be simple to review.
>
> Regards,
> Graham
> --

+1 as well

I've gone quickly thrue code source and this kind of construction does 
not seem to be so common.

Best regards,
CJ

Re: Do pools lead to bad programming?

Posted by Graham Leggett <mi...@sharp.fm>.
On 12 Dec 2013, at 2:00 AM, Kean Johnston <ke...@gmail.com> wrote:

> So I've been spending a fair bit of time inside Apache recently and I've seen a pattern. Consider the following code (from mod_proxy_fcgi.c):
> 
>    apr_uri_t *uri = apr_palloc(r->pool, sizeof(*uri));
> 
>    ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO(01076)
>                  "url: %s proxyname: %s proxyport: %d",
>                 url, proxyname, proxyport);
> 
>    if (strncasecmp(url, "fcgi:", 5) != 0) {
>        ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO(01077) "declining URL %s", url);
>        return DECLINED;
>    }
> 
> That allocation of uri can be moved down (further than the code shown above) until right before it is used. I've seen this in a number of places and it "feels" like it is considered OK because the pool is relatively short lived and in most cases I've seen so far the allocation is pretty small. But as a concept, this strikes me as bad. If that code was using a traditional malloc/free the above would be a memory leak. I am aware that pools provide protection against such leaks, but nonetheless, that is wasted memory allocated and although quick, also a waste of time.

The idea behind pools is that you allocate an arbitrary and unpredictable set of memory, and then free the whole set at some future point in time. This means you don't need to keep track of each and every escape path out of a system and hope you've cleaned up everything, you allocate at will confident that it will all be freed.

Obviously allocating too early and then throwing away the results of the allocation is a waste as you've pointed out, and should ideally be smoked out and fixed.

> Am I being too obsessive? If not, would you like patches to correct these as I find them, and if so, should I open a bug about this or just post patches here (they are all likely to be a simple move of 1 or 2 lines)?

I'd love to see these things fixed, because they add up. If you post them here they are likely to be reviewed very quickly, as they'll no doubt be simple to review.

Regards,
Graham
--