You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Rainer Jung <ra...@kippdata.de> on 2013/04/16 23:49:20 UTC

Re: svn commit: r1461869 - /httpd/httpd/trunk/server/util.c

On 27.03.2013 22:57, jailletc36@apache.org wrote:
> Author: jailletc36
> Date: Wed Mar 27 21:57:44 2013
> New Revision: 1461869
> 
> URL: http://svn.apache.org/r1461869
> Log:
> Be more clever when allocating memory for log item to be escaped.
> This should save about 70-100 bytes in the request pool with the default config.
> 
> Modified:
>     httpd/httpd/trunk/server/util.c
> 
> Modified: httpd/httpd/trunk/server/util.c
> URL: http://svn.apache.org/viewvc/httpd/httpd/trunk/server/util.c?rev=1461869&r1=1461868&r2=1461869&view=diff
> ==============================================================================
> --- httpd/httpd/trunk/server/util.c (original)
> +++ httpd/httpd/trunk/server/util.c Wed Mar 27 21:57:44 2013
> @@ -1850,12 +1850,26 @@ AP_DECLARE(char *) ap_escape_logitem(apr
>      char *ret;
>      unsigned char *d;
>      const unsigned char *s;
> +    int length = 0;
>  
>      if (!str) {
>          return NULL;
>      }
>  
> -    ret = apr_palloc(p, 4 * strlen(str) + 1); /* Be safe */
> +    /* First, compute the space needed for the escaped string.
> +     * This could be tweaked a bit for '\b', '\n'... These characters
> +     * should not be common, so do not bother. */
> +    s = (const unsigned char *)str;
> +    for (; *s; ++s) {
> +        if (TEST_CHAR(*s, T_ESCAPE_LOGITEM)) {
> +            length += 4;        /* for '\\' + c2x() */
> +        }
> +        else {
> +            length++;
> +        }
> +    }
> +    
> +    ret = apr_palloc(p, length + 1);
>      d = (unsigned char *)ret;

Here we treat CPU against memory. AFAIK this function is in the hot code
path, because it is used for many of the fields used in the default
access log format and also in many custom ones. All other uses seem to
be either trace log messages, some (few) error log messages and
mod_status output. For those cases I think neither the CPU nor the
Memory differences are important.

In the access log case it seems the typical fields which are run through
ap_escape_logitem() should rarely contain characters to escape. So an
alternative strategy would be to simply copy the string if we don't find
a character to escape. The second check for each char is not necessary then.

A quick draft patch is at

http://people.apache.org/~rjung/patches/ap_escape_logitem_enhanced.patch

It compiles, but I haven't really tested it. You get the idea though.

In the case of no escapes, it should be quite a bit faster. In the case
of escapes it shouldn't make a noticeable difference in any direction.

Regards,

Rainer

Re: svn commit: r1461869 - /httpd/httpd/trunk/server/util.c

Posted by André Malo <nd...@perlig.de>.
* Rainer Jung wrote:

> On 17.04.2013 11:38, Plüm, Rüdiger, Vodafone Group wrote:
> >> -----Original Message-----
> >> From: André Malo [mailto:nd@perlig.de]
> >> Sent: Mittwoch, 17. April 2013 11:04
> >> To: dev@httpd.apache.org
> >> Subject: Re: svn commit: r1461869 - /httpd/httpd/trunk/server/util.c
> >>
> >> * Rainer Jung wrote:
> >>
> >>
> >> http://people.apache.org/~rjung/patches/ap_escape_logitem_enhanced.pat
> >>ch
> >>
> >> +    ret = apr_palloc(p, length + 3 * escapes);
> >>
> >> It should be still 4 * escapes ('\xHH').
> >
> > I don't think so. If you have the binary data of \xHH it is ok to have
> > one byte for the original data plus 3 bytes for the escapes and that is
> > length + 3 * escapes. The storage for the original data is in length.
> > But I admit that I got confused by that as well :-)

Orrrr. right.

>
> So probably we should use a better comment as the one in the patch
> currently:
>
> /* 3 * escapes for the overhead in '\\' + c2x() */
>
> ;)

good idea ;)

nd
-- 
"Solides und umfangreiches Buch"
                                          -- aus einer Rezension

<http://pub.perlig.de/books.html#apache2>

Re: svn commit: r1461869 - /httpd/httpd/trunk/server/util.c

Posted by Rainer Jung <ra...@kippdata.de>.
On 17.04.2013 11:38, Plüm, Rüdiger, Vodafone Group wrote:
> 
> 
>> -----Original Message-----
>> From: André Malo [mailto:nd@perlig.de]
>> Sent: Mittwoch, 17. April 2013 11:04
>> To: dev@httpd.apache.org
>> Subject: Re: svn commit: r1461869 - /httpd/httpd/trunk/server/util.c
>>
>> * Rainer Jung wrote:
>>
>>>
>> http://people.apache.org/~rjung/patches/ap_escape_logitem_enhanced.patch
>>
>> +    ret = apr_palloc(p, length + 3 * escapes);
>>
>> It should be still 4 * escapes ('\xHH').
> 
> I don't think so. If you have the binary data of \xHH it is ok to have one byte for the original data
> plus 3 bytes for the escapes and that is length + 3 * escapes. The storage for the original data is in
> length. But I admit that I got confused by that as well :-)

So probably we should use a better comment as the one in the patch
currently:

/* 3 * escapes for the overhead in '\\' + c2x() */

;)

Rainer


RE: svn commit: r1461869 - /httpd/httpd/trunk/server/util.c

Posted by Plüm, Rüdiger, Vodafone Group <ru...@vodafone.com>.

> -----Original Message-----
> From: André Malo [mailto:nd@perlig.de]
> Sent: Mittwoch, 17. April 2013 11:04
> To: dev@httpd.apache.org
> Subject: Re: svn commit: r1461869 - /httpd/httpd/trunk/server/util.c
> 
> * Rainer Jung wrote:
> 
> >
> http://people.apache.org/~rjung/patches/ap_escape_logitem_enhanced.patch
> 
> +    ret = apr_palloc(p, length + 3 * escapes);
> 
> It should be still 4 * escapes ('\xHH').

I don't think so. If you have the binary data of \xHH it is ok to have one byte for the original data
plus 3 bytes for the escapes and that is length + 3 * escapes. The storage for the original data is in
length. But I admit that I got confused by that as well :-)

Regards

Rüdiger


Re: svn commit: r1461869 - /httpd/httpd/trunk/server/util.c

Posted by Rainer Jung <ra...@kippdata.de>.
On 17.04.2013 11:07, André Malo wrote:
> * André Malo wrote:
> 
>> * Rainer Jung wrote:
>>> A quick draft patch is at
>>>
>>> http://people.apache.org/~rjung/patches/ap_escape_logitem_enhanced.patc
>>> h
>>
>> +    ret = apr_palloc(p, length + 3 * escapes);
>>
>> It should be still 4 * escapes ('\xHH').

Hmmm, isn't the 1 = 4-3 already counted in length? Here we only add the
escaping overhead of 3 bytes per escape.

> Also, escapes should be apr_size_t as well, I think.

Probably.

Thanks,

Rainer


Re: svn commit: r1461869 - /httpd/httpd/trunk/server/util.c

Posted by André Malo <nd...@perlig.de>.
* André Malo wrote:

> * Rainer Jung wrote:
> > A quick draft patch is at
> >
> > http://people.apache.org/~rjung/patches/ap_escape_logitem_enhanced.patc
> >h
>
> +    ret = apr_palloc(p, length + 3 * escapes);
>
> It should be still 4 * escapes ('\xHH').

Also, escapes should be apr_size_t as well, I think.

nd
-- 
s  s^saaaaaoaaaoaaaaooooaaoaaaomaaaa  a  alataa  aaoat  a  a
a maoaa a laoata  a  oia a o  a m a  o  alaoooat aaool aaoaa
matooololaaatoto  aaa o a  o ms;s;\s;s;g;y;s;:;s;y#mailto: #
 \51/\134\137| http://www.perlig.de #;print;# > nd@perlig.de

Re: svn commit: r1461869 - /httpd/httpd/trunk/server/util.c

Posted by André Malo <nd...@perlig.de>.
* Rainer Jung wrote:

> On 27.03.2013 22:57, jailletc36@apache.org wrote:
> > Author: jailletc36
> > Date: Wed Mar 27 21:57:44 2013
> > New Revision: 1461869
> >
> > URL: http://svn.apache.org/r1461869
> > Log:
> > Be more clever when allocating memory for log item to be escaped.
> > This should save about 70-100 bytes in the request pool with the
> > default config.
> >
> > Modified:
> >     httpd/httpd/trunk/server/util.c
> >
> > Modified: httpd/httpd/trunk/server/util.c
> > URL:
> > http://svn.apache.org/viewvc/httpd/httpd/trunk/server/util.c?rev=146186
> >9&r1=1461868&r2=1461869&view=diff
> > =======================================================================
> >======= --- httpd/httpd/trunk/server/util.c (original)
> > +++ httpd/httpd/trunk/server/util.c Wed Mar 27 21:57:44 2013
> > @@ -1850,12 +1850,26 @@ AP_DECLARE(char *) ap_escape_logitem(apr
> >      char *ret;
> >      unsigned char *d;
> >      const unsigned char *s;
> > +    int length = 0;
> >
> >      if (!str) {
> >          return NULL;
> >      }
> >
> > -    ret = apr_palloc(p, 4 * strlen(str) + 1); /* Be safe */
> > +    /* First, compute the space needed for the escaped string.
> > +     * This could be tweaked a bit for '\b', '\n'... These characters
> > +     * should not be common, so do not bother. */
> > +    s = (const unsigned char *)str;
> > +    for (; *s; ++s) {
> > +        if (TEST_CHAR(*s, T_ESCAPE_LOGITEM)) {
> > +            length += 4;        /* for '\\' + c2x() */
> > +        }
> > +        else {
> > +            length++;
> > +        }
> > +    }
> > +
> > +    ret = apr_palloc(p, length + 1);
> >      d = (unsigned char *)ret;
>
> Here we treat CPU against memory. AFAIK this function is in the hot code
> path, because it is used for many of the fields used in the default
> access log format and also in many custom ones. All other uses seem to
> be either trace log messages, some (few) error log messages and
> mod_status output. For those cases I think neither the CPU nor the
> Memory differences are important.
>
> In the access log case it seems the typical fields which are run through
> ap_escape_logitem() should rarely contain characters to escape. So an
> alternative strategy would be to simply copy the string if we don't find
> a character to escape. The second check for each char is not necessary
> then.
>
> A quick draft patch is at
>
> http://people.apache.org/~rjung/patches/ap_escape_logitem_enhanced.patch

+    ret = apr_palloc(p, length + 3 * escapes);

It should be still 4 * escapes ('\xHH').

nd
-- 
Das, was ich nicht kenne, spielt stückzahlmäßig *keine* Rolle.

                                   -- Helmut Schellong in dclc