Posted to dev@httpd.apache.org by Kent Fitch <k....@adfa.edu.au> on 2002/10/11 02:32:52 UTC

regexp processing and pcre - apache modules restricted to the POSIX interface and passing a string?

I'm finding the Apache2 bucket API a very easy and powerful
way to develop filters - congratulations to the designers.

One of the things I'm doing is searching the contents of
buckets for regular expression matches.  Because bucket data
is not a null-terminated string, I must copy the bytes out
and null-terminate them before passing them to the ap_regexec
interface, which then has to count the bytes in the string
again before passing it on to the pcre routines - a lot of
unnecessary CPU and memory use!
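
For illustration, the copy-and-terminate workaround looks
roughly like this.  It is only a sketch: plain POSIX regexec()
stands in for ap_regexec, and the helper name
match_bucket_data is made up for this example.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: match a compiled regex against `len` bytes that are
 * NOT null-terminated, as bucket data typically is.  POSIX regexec() stands
 * in for ap_regexec here (an assumption made for this sketch).
 * Returns 0 on a match, REG_NOMATCH on no match, REG_ESPACE if malloc fails. */
static int match_bucket_data(const regex_t *re, const char *data, size_t len)
{
    char *copy = malloc(len + 1);        /* extra allocation on every call */
    int rc;

    if (copy == NULL)
        return REG_ESPACE;
    memcpy(copy, data, len);             /* extra pass over the data */
    copy[len] = '\0';                    /* terminator the interface requires */
    rc = regexec(re, copy, 0, NULL, 0);  /* callee must rediscover the length */
    free(copy);
    return rc;
}
```

The caller already knows the length, yet it is thrown away at
the interface boundary and has to be recomputed further down.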

I can understand the desire to have a POSIX compliant
interface, but is there a possibility of providing an
interface to regexec that accepts a buffer and a length
rather than a string?  Such a change is likely to be useful 
and a considerable performance boon to other filter writers
wanting to process buckets using regexp.  The current regexp
interface seems to be at odds with the buffer and length
nature (rather than the null-terminated string approach) of
much of the new Apache API and of pcre itself.
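
As a sketch of what a buffer-and-length entry point might look
like: the name regexec_n and its signature are invented here
and are not an existing Apache or POSIX API.  Where the C
library offers the non-standard REG_STARTEND flag, the range
can be passed through pmatch[0] without copying; otherwise the
sketch falls back to copy-and-terminate.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer-and-length variant of regexec().  The name and
 * signature are assumptions for illustration only.
 * The caller must pass nmatch >= 1 so pmatch[0] can carry the range.
 * Returns what regexec() returns: 0 on a match, REG_NOMATCH otherwise. */
static int regexec_n(const regex_t *re, const char *buf, size_t len,
                     size_t nmatch, regmatch_t pmatch[], int eflags)
{
#ifdef REG_STARTEND
    /* Non-standard extension: search only buf[rm_so .. rm_eo), so no
     * terminator is needed and no copy is made. */
    pmatch[0].rm_so = 0;
    pmatch[0].rm_eo = (regoff_t)len;
    return regexec(re, buf, nmatch, pmatch, eflags | REG_STARTEND);
#else
    /* Strictly POSIX fallback: copy the bytes and terminate them. */
    char *copy = malloc(len + 1);
    int rc;

    if (copy == NULL)
        return REG_ESPACE;
    memcpy(copy, buf, len);
    copy[len] = '\0';
    rc = regexec(re, copy, nmatch, pmatch, eflags);
    free(copy);
    return rc;
#endif
}
```

A filter could then call regexec_n(&re, data, len, 1, m, 0)
directly on the bytes it read from a bucket.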

I've coded the changes to util.c, pcreposix.c etc. for my own
use, and they are not extensive, but by doing so I've created
problems for myself in maintaining compatibility with future
versions of Apache.

Or have I missed something - i.e., should I be calling the
"raw" pcre interfaces directly?

Regards,

Kent Fitch
AustLit project