Posted to dev@httpd.apache.org by "William A. Rowe, Jr." <wr...@rowe-clan.net> on 2005/07/19 21:55:26 UTC
Pondering strings in Apache 3.x
Greg and a few others voiced interest in moving from null-term
strings to counted strings for a future version of Apache.
This was too broad a scope change to make it into 2.0, of course,
and was dropped on the floor for the time being.
I'm wondering today; what metadata interests us in an ap_string_t
prefix header? I have a hunch that an unsigned short (up to 65535) is enough
to map most data we want to handle in one chunk; brigades are
better for handling large sets of data. Of course we could push
that to an int, or size_t, but there would be a small memory
penalty. It might be overcome by cpu-specific optimized int
or size_t handling behavior, since the assembly code wouldn't
need to truncate short values.
Perhaps, both bytes allocated/used, in order to play optimized
games with string allocation. Perhaps, a refcount? (This
doesn't play well with pool allocations, obviously.)
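A rough sketch of what such a prefix header might look like, with both the used and the allocated byte counts stored as 16-bit values. Everything here is an assumption for illustration: the field names, the helper functions ap_string_make and ap_string_append, and the use of malloc in place of a pool allocator are not part of any existing httpd or APR API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout; not an existing httpd or APR type. */
typedef struct {
    uint16_t len;    /* bytes currently in use */
    uint16_t alloc;  /* bytes allocated for data[] */
    char data[];     /* payload, NOT null-terminated */
} ap_string_t;

/* Build a counted string with room for `alloc` bytes and copy `len`
 * bytes from src. malloc stands in for a pool allocator here. */
ap_string_t *ap_string_make(const char *src, size_t len, size_t alloc)
{
    ap_string_t *s;

    if (len > alloc || alloc > UINT16_MAX) {
        return NULL;             /* does not fit a 16-bit count */
    }
    s = malloc(sizeof(*s) + alloc);
    if (s == NULL) {
        return NULL;
    }
    s->len = (uint16_t)len;
    s->alloc = (uint16_t)alloc;
    if (len > 0) {
        memcpy(s->data, src, len);
    }
    return s;
}

/* Append in place when the slack allows it.
 * Returns 0 on success, -1 if the bytes would not fit. */
int ap_string_append(ap_string_t *s, const char *src, size_t len)
{
    assert(s != NULL);
    if (len > (size_t)(s->alloc - s->len)) {
        return -1;
    }
    if (len > 0) {
        memcpy(s->data + s->len, src, len);
    }
    s->len = (uint16_t)(s->len + len);
    return 0;
}
```

Widening both counts to size_t would lift the 65535-byte ceiling at the cost of a larger header per string, which is the trade-off weighed above.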
But the byte count clearly isn't enough. I'm thinking of:

  encoding: is this data URI escaped or un-escaped?

  tainted: is it raw? or has it been untainted with
           context-specific validity checks?

  charset: is this native? (e.g. EBCDIC). utf-8?
           opaque or otherwise a specific set?
What else interests us within an 'ap_string_t' header, that
would help eliminate bugs within httpd? A random trailing
short following the string, in a 'string debug' mode, to
detect buffer overflows? Something similar to detect
underflows?
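The 'string debug' idea could be sketched roughly as follows: a guard value is written immediately before and after the payload and compared later. The type ap_dbg_string_t and its helpers are hypothetical, the guard is passed in by the caller rather than drawn from a random source, and malloc again stands in for a pool.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define AP_DBG_GUARD_SIZE sizeof(uint16_t)

/* Hypothetical debug-mode string: bytes[] holds
 * [guard][len bytes of data][guard]. */
typedef struct {
    uint16_t len;
    uint16_t guard;          /* reference copy of the guard value */
    unsigned char bytes[];
} ap_dbg_string_t;

ap_dbg_string_t *ap_dbg_string_make(uint16_t len, uint16_t guard)
{
    ap_dbg_string_t *s = malloc(sizeof(*s) + len + 2 * AP_DBG_GUARD_SIZE);
    if (s == NULL) {
        return NULL;
    }
    s->len = len;
    s->guard = guard;
    memcpy(s->bytes, &guard, AP_DBG_GUARD_SIZE);             /* front */
    memset(s->bytes + AP_DBG_GUARD_SIZE, 0, len);            /* data  */
    memcpy(s->bytes + AP_DBG_GUARD_SIZE + len, &guard,
           AP_DBG_GUARD_SIZE);                               /* rear  */
    return s;
}

/* Pointer to the first payload byte. */
unsigned char *ap_dbg_string_data(ap_dbg_string_t *s)
{
    assert(s != NULL);
    return s->bytes + AP_DBG_GUARD_SIZE;
}

/* Returns 0 if both guards are intact, -1 if the front guard was
 * overwritten (underflow), 1 if the rear guard was (overflow). */
int ap_dbg_string_check(const ap_dbg_string_t *s)
{
    assert(s != NULL);
    if (memcmp(s->bytes, &s->guard, AP_DBG_GUARD_SIZE) != 0) {
        return -1;
    }
    if (memcmp(s->bytes + AP_DBG_GUARD_SIZE + s->len, &s->guard,
               AP_DBG_GUARD_SIZE) != 0) {
        return 1;
    }
    return 0;
}
```

A check like this only catches writes that land on the guard bytes, and only at the moment the check runs; it would not notice a write that skips past the guard.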
Open to all ideas.
Bill
Re: Pondering strings in Apache 3.x
Posted by André Malo <nd...@perlig.de>.
* Brian Pane wrote:
> And although I like the performance benefits of the pool memory
> allocators, I remember how tricky it was to debug some of the
> pool and bucket lifetime problems that we encountered during
> the development of 2.0 (especially in filters). All things considered,
> I don't think I'd mind the overhead of a garbage collection thread.
The pool problems should be solved now... (mostly)
> Thus I can't help but wonder: Would 3.0 be a good time to consider
> trying a Java-based httpd?
If you ask me: Nope. Try Tomcat instead ;)
What we need for 3.0 is just a clean design and definitions of what is core
(not much, imo) and what is not core. This was started for 2.0 but never
finished. Furthermore, standardized exception handling would be nice (like
svn's).
The "core" could provide several convenience data types like ap_string_t.
I would, btw, just store the length of the string in such a type. Other
properties (url-encoding state, ...) imo belong to a different layer, like
a bucket or just a wrapper type.
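The layering suggested here might look roughly like this: a core string that knows only its length, and a wrapper type one layer up that carries the URL-encoding state. The struct layouts and both comparison helpers are hypothetical and only illustrate the split.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Core type: a length and a pointer, nothing else (hypothetical). */
typedef struct {
    size_t len;
    const char *data;    /* not null-terminated */
} ap_string_t;

/* Wrapper from a higher layer; it alone knows the encoding state. */
typedef struct {
    ap_string_t str;
    int url_encoded;     /* 1 if str still holds %XX escapes */
} ap_uri_string_t;

/* Core helper: byte-wise equality, unaware of any encoding.
 * Returns 1 if equal, 0 otherwise. */
int ap_string_equal(const ap_string_t *a, const ap_string_t *b)
{
    assert(a != NULL && b != NULL);
    return a->len == b->len
        && (a->len == 0 || memcmp(a->data, b->data, a->len) == 0);
}

/* Higher-layer helper: refuses to compare strings that are in
 * different encoding states. Returns 1 if equal, 0 if different,
 * -1 if the two are not comparable. */
int ap_uri_string_equal(const ap_uri_string_t *a, const ap_uri_string_t *b)
{
    assert(a != NULL && b != NULL);
    if (a->url_encoded != b->url_encoded) {
        return -1;
    }
    return ap_string_equal(&a->str, &b->str);
}
```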
nd
--
"Gates's behavior had proven to me that I did not need to count on him and
his two companions" -- Karl May, "Winnetou III"
Something new in the West: <http://pub.perlig.de/books.html#apache2>
Re: Pondering strings in Apache 3.x
Posted by Brian Pane <br...@apache.org>.
On Jul 19, 2005, at 12:55 PM, William A. Rowe, Jr. wrote:
> Greg and a few others voiced interest in moving from null-term
> strings to counted strings for a future version of Apache.
> This was too broad a scope change to make it into 2.0, of course,
> and was dropped on the floor for the time being.
>
> I'm wondering today; what metadata interests us in an ap_string_t
> prefix header? I have a hunch that an unsigned short (up to 65535) is enough
> to map most data we want to handle in one chunk; brigades are
> better for handling large sets of data. Of course we could push
> that to an int, or size_t, but there would be a small memory
> penalty. It might be overcome by cpu-specific optimized int
> or size_t handling behavior, since the assembly code wouldn't
> need to truncate short values.
>
> Perhaps, both bytes allocated/used, in order to play optimized
> games with string allocation. Perhaps, a refcount? (This
> doesn't play well with pool allocations, obviously.)
>
> But the byte count clearly isn't enough. I'm thinking of;
>
> encoding; is this data URI escaped or un-escaped?
>
> tainted; is it raw? or has it been untainted with
> context-specific validity checks?
>
> charset; is this native? (e.g. EBCDIC). utf-8?
> opaque or otherwise a specific set?
>
> What else interests us within an 'ap_string_t' header, that
> would help eliminate bugs within httpd? A random trailing
> short following the string, in a 'string debug' mode, to
> detect buffer overflows? Something similar to detect
> underflows?
>
> Open to all ideas.
This may be a bit more radical than you were hoping for, but...
I like the idea of using a reference-counted, non-null-terminated
string type for 3.x.
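A minimal sketch of a reference-counted, non-null-terminated string, assuming heap allocation instead of pools. The type ap_rcstring_t and its three functions are hypothetical, and the counter is a plain size_t, so this version is not safe to share across threads without atomics or locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted counted string. */
typedef struct {
    size_t refcount;   /* number of live owners */
    size_t len;        /* bytes in data[] */
    char data[];       /* payload, NOT null-terminated */
} ap_rcstring_t;

/* Create a string owned by one reference. */
ap_rcstring_t *ap_rcstring_make(const char *src, size_t len)
{
    ap_rcstring_t *s = malloc(sizeof(*s) + len);
    if (s == NULL) {
        return NULL;
    }
    s->refcount = 1;
    s->len = len;
    if (len > 0) {
        memcpy(s->data, src, len);
    }
    return s;
}

/* Take an additional reference. */
ap_rcstring_t *ap_rcstring_retain(ap_rcstring_t *s)
{
    assert(s != NULL);
    s->refcount++;
    return s;
}

/* Drop one reference; frees the string when the last one goes.
 * Returns 1 if the string was freed, 0 otherwise. */
int ap_rcstring_release(ap_rcstring_t *s)
{
    assert(s != NULL && s->refcount > 0);
    if (--s->refcount == 0) {
        free(s);
        return 1;
    }
    return 0;
}
```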
More generally, it would be great to have overflow detection
on all arrays.
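Overflow detection on arrays could be as simple as routing every access through accessors that know the element count. The sketch below assumes a hypothetical ap_checked_array_t of ints; it is not related to APR's existing array type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounds-checked view over an int array. */
typedef struct {
    size_t nelts;   /* number of valid elements */
    int *elts;      /* backing storage, owned by the caller */
} ap_checked_array_t;

/* Read element i into *out. Returns 0 on success, -1 if i is out of range. */
int ap_checked_array_get(const ap_checked_array_t *a, size_t i, int *out)
{
    assert(a != NULL && out != NULL);
    if (i >= a->nelts) {
        return -1;
    }
    *out = a->elts[i];
    return 0;
}

/* Write value to element i. Returns 0 on success, -1 if i is out of range. */
int ap_checked_array_set(ap_checked_array_t *a, size_t i, int value)
{
    assert(a != NULL);
    if (i >= a->nelts) {
        return -1;
    }
    a->elts[i] = value;
    return 0;
}
```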
And although I like the performance benefits of the pool memory
allocators, I remember how tricky it was to debug some of the
pool and bucket lifetime problems that we encountered during
the development of 2.0 (especially in filters). All things considered,
I don't think I'd mind the overhead of a garbage collection thread.
Thus I can't help but wonder: Would 3.0 be a good time to consider
trying a Java-based httpd?
Brian
Re: Pondering strings in Apache 3.x
Posted by Jeff White <jl...@earthlink.net>.
From: "William A. Rowe, Jr."
>
> What else interests us within an
> 'ap_string_t' header, that would help
> eliminate bugs within httpd? A random
> trailing short following the string, in a
> 'string debug' mode, to detect buffer
> overflows? Something similar to detect
> underflows?
>
> Open to all ideas.
>
What are the newer C / C++ compiler
standard Safer C Library Functions?
Jeff