Posted to dev@httpd.apache.org by Dan Jacobowitz <dr...@false.org> on 1998/08/07 23:49:27 UTC

[finrod@EWOX.ORG: YA Apache DoS attack]

I'm sure you've probably all seen this as soon as I did, but:

is this still an issue with 1.3?

Dan

Re: [finrod@EWOX.ORG: YA Apache DoS attack]

Posted by Lars Eilebrecht <La...@unix-ag.org>.

According to Alexei Kosut:

>  Still, we might want to publicize the security@apache.org email alias?

+1


ciao...
-- 
Lars Eilebrecht                            - "We had joy, we had fun,
sfx@unix-ag.org                          - we had sessions on the SUN."
http://www.home.unix-ag.org/sfx/

Re: [IDEA] using tsearch() -- was: [finrod@EWOX.ORG: YA Apache DoS attack]

Posted by Dmitry Khrustalev <di...@bog.msu.su>.

On Thu, 13 Aug 1998, Dean Gaudet wrote:

> 
> On Thu, 13 Aug 1998, Dean Gaudet wrote:
> 
> > One option for speeding up the core and proxy, which search for fixed-name
> > such as "Content-Length" would be to "intern" the names.
> 
> BTW, we can use a perfect hash to do the intern operation as well -- so
> we'd actually have a flat file that lists all the internable names, and
> then a tool that generates a few .h files and a .c file with the perfect
> hash function in it.  Standard compiler trick :) 

Well, I played with perfect hash and structure like this:

typedef struct {
    pool *pool;
    const char *strings[HEADER_MAXVAL];
    table *more;
} headers;

with api like this:

#define ap_header_get(h, n) ((h)->strings[(n)])
#define ap_header_set(h, n, v) ((h)->strings[(n)] = (v))
#define ap_header_merge(h, n, v) ((((h)->strings[(n)]) == NULL ) ? \
		((h)->strings[(n)] = (v)) : \
		((h)->strings[(n)] = ap_pstrcat((h)->pool, (h)->strings[(n)], \
		", ", (v), NULL)))
#define ap_header_unset(h, n) ((h)->strings[(n)] = NULL)
API_EXPORT(const char *) ap_header_getstr(const headers *, const char *);
API_EXPORT(void) ap_header_setstr(headers *, const char *name, const char *val);
API_EXPORT(void) ap_header_mergestr(headers *, const char *, const char *);
API_EXPORT(void) ap_header_unsetstr(headers *, const char *key);

Known headers are referenced by numeric constant, unknown go into more
table. Set-Cookie trick in util_script.c is unsupported.

	-Dima
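
A minimal sketch of how a name could be mapped to one of the numeric
constants used by the macros above. Everything here is an assumption made
for illustration: the six header names, the HDR_* constants, the 11-slot
table and the hash (name length plus lowercased last character, modulo 11)
were picked by hand so that these six names do not collide. A generator
tool of the kind Dean describes would emit its own table and function. The
pool and the "more" table from the original structure are left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strcasecmp(): POSIX */

/* Hypothetical numeric constants for the known headers. */
enum {
    HDR_CONTENT_LENGTH, HDR_CONTENT_TYPE, HDR_HOST,
    HDR_DATE, HDR_VIA, HDR_USER_AGENT,
    HEADER_MAXVAL
};
#define HDR_UNKNOWN (-1)

static const char *const header_names[HEADER_MAXVAL] = {
    "Content-Length", "Content-Type", "Host", "Date", "Via", "User-Agent"
};

/* Hash slot -> header number. A generator tool would write this table;
 * here it was worked out by hand for the six names above. */
static const int slot_to_header[11] = {
    HDR_UNKNOWN, HDR_VIA, HDR_UNKNOWN, HDR_CONTENT_TYPE, HDR_UNKNOWN,
    HDR_USER_AGENT, HDR_DATE, HDR_UNKNOWN, HDR_CONTENT_LENGTH,
    HDR_UNKNOWN, HDR_HOST
};

/* Map a header name to its constant, or HDR_UNKNOWN.
 * One hash computation plus at most one strcasecmp(). */
int header_number(const char *name)
{
    size_t len = strlen(name);
    if (len == 0)
        return HDR_UNKNOWN;
    unsigned last = (unsigned)tolower((unsigned char)name[len - 1]);
    int n = slot_to_header[((unsigned)len + last) % 11u];
    /* Unknown names can land in a used slot, so confirm the match. */
    if (n != HDR_UNKNOWN && strcasecmp(name, header_names[n]) == 0)
        return n;
    return HDR_UNKNOWN;
}

/* Cut-down version of the structure above (no pool, no "more" table). */
typedef struct {
    const char *strings[HEADER_MAXVAL];
} headers;

/* String-keyed lookup for known names; a fuller version would fall
 * back to the "more" table when header_number() returns HDR_UNKNOWN. */
const char *header_getstr(const headers *h, const char *name)
{
    assert(h != NULL);
    int n = header_number(name);
    return n == HDR_UNKNOWN ? NULL : h->strings[n];
}
```

The final strcasecmp() is what keeps an unknown name such as
"X-My-Phone-Number", which happens to hash to the "Host" slot, from being
mistaken for a known header.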

Re: [IDEA] using tsearch() -- was: [finrod@EWOX.ORG: YA Apache DoS attack]

Posted by Dean Gaudet <dg...@arctic.org>.

On Thu, 13 Aug 1998, Dean Gaudet wrote:

> One option for speeding up the core and proxy, which search for fixed-name
> such as "Content-Length" would be to "intern" the names.

BTW, we can use a perfect hash to do the intern operation as well -- so
we'd actually have a flat file that lists all the internable names, and
then a tool that generates a few .h files and a .c file with the perfect
hash function in it.  Standard compiler trick :) 

Dean

Re: [IDEA] using tsearch() -- was: [finrod@EWOX.ORG: YA Apache DoS attack]

Posted by Dean Gaudet <dg...@arctic.org>.

On Thu, 13 Aug 1998, Martin Kraemer wrote:

> I *hope* that this could indeed lead to a faster way of handling the
> header lines in both the core and modules, most notably the proxy
> module.

But when I put a hash table in, which is usually less expensive than a
tree, it didn't make the core faster at all.  Some of the tables are just
so small it doesn't matter.  So I think you'd have to do something
adaptive to really see benefits.

Have you profiled the proxy to determine if tables are really the problem? 
Usually what I see in the core is that strcasecmp is a high running
routine, not so much that tables are the problem... just that comparing
strings in the tables is a pain. 

One option for speeding up the core and proxy, which search for fixed-name
headers such as "Content-Length", would be to "intern" the names.  So while
we're building the headers, when we notice that we've read "Content-Length"
we instead use intern_content_length... which is defined:

const char intern_content_length[] = "Content-Length"; 

Then we have ap_table_get_intern(r->headers_in, intern_content_length)
which compares the key pointers rather than the strings they point to.

Dean
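
A minimal sketch of the interning idea in Dean's message, assuming a toy
fixed-size table. The names intern_name(), mini_table, mini_table_add() and
mini_table_get_intern() are hypothetical and stand in for whatever the real
table code would provide; only intern_content_length comes from the message
itself.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp(): POSIX */

/* One canonical copy of each well-known header name. */
const char intern_content_length[] = "Content-Length";
const char intern_content_type[]   = "Content-Type";

static const char *const interned_names[] = {
    intern_content_length, intern_content_type
};

/* Hypothetical helper: return the canonical pointer for a known name,
 * or the argument itself for anything else. The strcasecmp() cost is
 * paid once here, when the header is parsed. */
const char *intern_name(const char *name)
{
    size_t count = sizeof interned_names / sizeof interned_names[0];
    for (size_t i = 0; i < count; i++)
        if (strcasecmp(name, interned_names[i]) == 0)
            return interned_names[i];
    return name;
}

/* Toy stand-in for a header table. */
enum { MINI_TABLE_MAX = 16 };
typedef struct { const char *key, *val; } mini_entry;
typedef struct { mini_entry elts[MINI_TABLE_MAX]; size_t nelts; } mini_table;

void mini_table_add(mini_table *t, const char *name, const char *val)
{
    assert(t->nelts < MINI_TABLE_MAX);
    t->elts[t->nelts].key = intern_name(name);   /* store canonical key */
    t->elts[t->nelts].val = val;
    t->nelts++;
}

/* Lookup by interned key: one pointer comparison per entry, no
 * string comparison at all. */
const char *mini_table_get_intern(const mini_table *t, const char *interned_key)
{
    for (size_t i = 0; i < t->nelts; i++)
        if (t->elts[i].key == interned_key)
            return t->elts[i].val;
    return NULL;
}
```

Names that are not in the interned list keep whatever pointer the parser
handed in, so a pointer lookup will never find them; they would still need
an ordinary string-comparing lookup.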

[IDEA] using tsearch() -- was: [finrod@EWOX.ORG: YA Apache DoS attack]

Posted by Martin Kraemer <Ma...@mch.sni.de>.

On Sat, Aug 08, 1998 at 06:55:19PM -0700, Roy T. Fielding wrote:
> The best solution would be to do what is planned for 2.0.  That is,
> replace all parsing and access to header values with a tokenized
> hash table and linked lists for values.

One thing I've been thinking about while changing the proxy to use
tables instead of arrays for header lines (to be committed *RSN* ;-)
was changing from the current array implementation to a binary tree
based solution.

What I dislike about today's method is the repeated linear search
(which is just a waste of time) embedded in the ap_table_get() call.
This function is called in many places to check for the existence
and/or value of certain headers.

I chose a binary tree over a hash list (using hsearch()) because with a
limited (but possibly open, think of "X-My-Phone-Number: 555-234-5678")
set of distinct headers it still keeps the search path short. Also, I
did not choose the bsearch() function because that would require
always maintaining a sorted array, which leads to many wasted cycles when
the header list grows dynamically.

Before I elaborate on this topic, however, I have a question about the
*order* of headers: is header order (for headers with different names)
important in any situation? I know that, among headers with the same
name, order does of course matter, as in
  Via: 1.0 myhost
  Via: 1.1 hishost
These can be merged to
  Via: 1.0 myhost, 1.1 hishost
but never the other way around. But how is the situation with, say,
  Date: Sat, 08 Aug 1998 00:04:21 +0100
  Content-Type: text/html
Can the order of these be arbitrarily modified?  RFC2068 says:
   The order in which header fields with differing field names are
   received is not significant. However, it is "good practice" to send
   general-header fields first, followed by request-header or response-
   header fields, and ending with the entity-header fields.

Based on this statement, my idea was like this:

1) modify ap_table_set*(), ap_table_merge*(), ap_table_add*() etc. to
   use the tsearch() function to set/modify value nodes with a
   structure similar to this:

   typedef struct {
     const char *name;  /* name token of this header line, e.g., "Content-Type" */
     const char *key;   /* lower cased repr. of name, like: "content-type" */
     array_header value;/* usually, contains one string only. For
			 * non-concatenable headers like "Set-Cookie",
			 * this will contain the various "Set-Cookie" lines
			 * separately */
     enum { tn_is_general_header,
	    tn_is_request_header,
	    tn_is_entity_header,
	    tn_is_unknown_header
	  } header_prio; /* Optional: categorize the header type */
   } table_node;

*  The "name" field is only kept for beautifying the reconstructed
   header when outputting it.

*  The "key" field is used when searching the tree for an already
   entered table_node. It is already lowercased, "boosting the
   performance" of the strcmp()s required to look up an element
   compared to the more expensive strcasecmp()s.

*  The "value" field is a compromise between the necessity to have a
   linear list (sorted by order of appearance) of values, and the
   possibility to collapse multiple headers into one.

*  For "header_prio", see below

2) When the decision about using either ap_table_set() or
   ap_table_merge() is made in the caller (as is done today for
   "set-cookie" [[Hint to myself: not in mod_proxy]]), only
   ap_table_add() would have to create additional array elements.
   ap_table_set() and ap_table_merge() would only work on the first
   array element.
   [[Question: what does ap_table_get() return today for multiple
   set-cookie headers? The first value only?]]

3) At output time, the table would simply be traversed in-order by
   the twalk() function (instead of table_do()), outputting each header
   line (or multiple separate lines if present). Optionally, the
   RFC2068 desire...
	``However, it is "good practice" to send general-header fields
	first, followed by request-header or response-header fields,
	and ending with the entity-header fields.''
   could be realized by adding a priority code to the structure and
   doing several twalk() passes.

I *hope* that this could indeed lead to a faster way of handling the
header lines in both the core and modules, most notably the proxy
module.

    Martin
-- 
| S I E M E N S |  <Ma...@mch.sni.de>  |      Siemens Nixdorf
| ------------- |   Voice: +49-89-636-46021     |  Informationssysteme AG
| N I X D O R F |   FAX:   +49-89-636-44994     |   81730 Munich, Germany
~~~~~~~~~~~~~~~~My opinions only, of course; pgp key available on request
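
A rough sketch of steps 1 and 3 of Martin's proposal using the POSIX
tsearch(), tfind() and twalk() calls. It is simplified under several
assumptions: each node holds a single value string instead of an
array_header, header_prio is dropped, memory comes from malloc() rather
than a pool, and nothing is ever freed. The names hdr_node, hdr_set(),
hdr_get() and hdr_emit_all() are made up for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <search.h>   /* tsearch(), tfind(), twalk(): POSIX */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the proposed table_node. */
typedef struct {
    char *name;    /* name as first received, kept for output */
    char *key;     /* lowercased name, used for lookups */
    char *value;
} hdr_node;

/* Tree ordering: plain strcmp() on the pre-lowercased keys. */
static int cmp_node(const void *a, const void *b)
{
    return strcmp(((const hdr_node *)a)->key, ((const hdr_node *)b)->key);
}

/* Heap copy of s, optionally lowercased. Caller owns the result. */
static char *copy_str(const char *s, int lower)
{
    size_t n = strlen(s) + 1;
    char *d = malloc(n);
    assert(d != NULL);
    for (size_t i = 0; i < n; i++)
        d[i] = lower ? (char)tolower((unsigned char)s[i]) : s[i];
    return d;
}

/* Insert a header, or replace the value if the key already exists. */
void hdr_set(void **root, const char *name, const char *value)
{
    hdr_node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->name = copy_str(name, 0);
    n->key = copy_str(name, 1);
    n->value = copy_str(value, 0);

    hdr_node **slot = tsearch(n, root, cmp_node);
    assert(slot != NULL);
    if (*slot != n) {                /* key was already in the tree */
        free((*slot)->value);
        (*slot)->value = n->value;   /* keep old node, adopt new value */
        free(n->name);
        free(n->key);
        free(n);
    }
}

/* Case-insensitive lookup: lowercase once, then strcmp() down the tree. */
const char *hdr_get(void *const *root, const char *name)
{
    hdr_node probe = { NULL, NULL, NULL };
    probe.key = copy_str(name, 1);
    hdr_node **slot = tfind(&probe, root, cmp_node);
    free(probe.key);
    return slot != NULL ? (*slot)->value : NULL;
}

static size_t emitted;   /* twalk() has no user-data argument */

static void emit(const void *nodep, VISIT which, int depth)
{
    (void)depth;
    if (which == postorder || which == leaf) {   /* in-order visit */
        const hdr_node *n = *(const hdr_node *const *)nodep;
        printf("%s: %s\n", n->name, n->value);
        emitted++;
    }
}

/* Print every header in key order; returns the number printed. */
size_t hdr_emit_all(const void *root)
{
    emitted = 0;
    twalk(root, emit);
    return emitted;
}
```

Note that the in-order walk emits headers sorted by lowercased key, not in
the order they arrived, which is why the question about header order above
matters for this design.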

Re: [finrod@EWOX.ORG: YA Apache DoS attack]

Posted by Rodent of Unusual Size <Ke...@Golux.Com>.

Feh.  pfree() is a good idea.  Unless we do some serious magic, though,
a pfree() implementation is going to require reference counts and
handles to be really robust.  Or possibly prefixing each allocation
with a word indicating its size -- I haven't looked to see if we
already do that.

For this specific pathology... is it permissible for the server to
*not* merge field values if the existing concatenation already
includes them?  That is, don't merge "Header: foo" if the
accumulated value of Header already includes "foo"?  Roy?

That won't affect repeating fields with differing values, but
it's something I've been wanting for a while -- but just never
checked on the allowability.

#ken	P-)}
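
A small sketch of the check Ken's question implies: before merging, test
whether the accumulated comma-separated value already contains the new
element. value_list_contains() is a hypothetical helper, not an existing
Apache function. It assumes a case-sensitive comparison and does not
handle quoted strings that contain commas. Whether a server may skip the
merge at all is the open question in the message; this only shows the test.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if `item` equals one of the comma-separated elements of
 * `list`, ignoring spaces and tabs around each element; 0 otherwise. */
int value_list_contains(const char *list, const char *item)
{
    assert(list != NULL && item != NULL);
    size_t item_len = strlen(item);
    const char *p = list;

    while (*p != '\0') {
        while (*p == ' ' || *p == '\t')      /* skip leading whitespace */
            p++;
        const char *end = strchr(p, ',');    /* end of this element */
        if (end == NULL)
            end = p + strlen(p);
        const char *last = end;              /* trim trailing whitespace */
        while (last > p && (last[-1] == ' ' || last[-1] == '\t'))
            last--;
        if ((size_t)(last - p) == item_len && memcmp(p, item, item_len) == 0)
            return 1;
        p = (*end == ',') ? end + 1 : end;   /* move to the next element */
    }
    return 0;
}
```

A merge routine could call this first and leave the stored value untouched
when it returns 1. For a request that repeats one identical value, the
stored string would then stay at a single copy instead of growing with
every header line.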

Ken Coar                    <http://Web.Golux.Com/coar/>
Apache Group member         <http://www.apache.org/>
"Apache Server for Dummies" <http://WWW.Dummies.Com/>

Re: [finrod@EWOX.ORG: YA Apache DoS attack]

Posted by Dirk-Willem van Gulik <di...@jrc.it>.

On Fri, 7 Aug 1998, Alexei Kosut wrote:

> On Fri, 7 Aug 1998, Dan Jacobowitz wrote:
> 
> > I'm sure you've probably all seen this as soon as I did, but:
> > 
> > is this still an issue with 1.3?
> 
> I haven't tested it, but probably. (if it's in fact an issue at all). It
> has to do with the pool stuff, and demonstrates why a pfree() might
> be a good idea. IMO, this is probably what happens:

Hmm, just tried it; just send alternating headers, say Agent1: Test,
Agent2: Test, Agent3: Test, Agent1: Test, Agent2: Test. I guess we really
need to free();

Not good.

Dw.

Re: [finrod@EWOX.ORG: YA Apache DoS attack]

Posted by Alexei Kosut <ak...@leland.Stanford.EDU>.

On Fri, 7 Aug 1998, Dan Jacobowitz wrote:

> I'm sure you've probably all seen this as soon as I did, but:
> 
> is this still an issue with 1.3?

I haven't tested it, but probably. (if it's in fact an issue at all). It
has to do with the pool stuff, and demonstrates why a pfree() might
be a good idea. IMO, this is probably what happens:

1. First "User-agent: sioux" comes in. Apache creates an entry in the
headers_in table for "User-agent", puts a copy of "sioux" in it.

2. Next "User-agent: sioux" comes in. Apache (in ap_table_mergen) creates
a new string containing the contents of the previous entry ("sioux"), a
comma, and the new entry ("sioux"). It keeps the old string around in
memory, doing nothing.

3. Repeat step 2. For 10,000 headers, we have 10,000 + 9,999 + 9,998 + ...
+ 2 + 1 = 50,005,000 copies of the string in memory (which, for a
seven-byte string 'sioux, ', matches up with the 392 meg usage the report
indicates), even though the value actually linked from the "User-agent"
entry holds only 10,000 copies (70k).

Seems like a problem to me, although I find the guy's attitude annoying:
He can take the time to write up scripts, polish them with copyright
notices and things, do tests to figure out where the problem might be, but
can't spend two minutes typing things into a web page?

Still, we might want to publicize the security@apache.org email alias?

-- Alexei Kosut <ak...@stanford.edu> <http://www.stanford.edu/~akosut/>
   Stanford University, Class of 2001 * Apache <http://www.apache.org> *
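
The arithmetic in Alexei's step 3 can be checked with a short calculation.
This is only a model of the behaviour he describes, assuming every merge
allocates a fresh string (old value, ", ", new value, terminating NUL) and
nothing is released until the pool goes away. It ignores pool block
overhead and alignment, and both function names are made up.

```c
#include <assert.h>
#include <limits.h>

/* Bytes held by a never-freeing pool after n copies of a header value of
 * val_len bytes have been merged one at a time, joined by ", ". Every
 * intermediate string stays allocated. */
unsigned long long pool_bytes_after_merges(unsigned long long n,
                                           unsigned long long val_len)
{
    unsigned long long total = 0;   /* all strings ever allocated */
    unsigned long long cur = 0;     /* strlen of the current merged value */

    for (unsigned long long k = 1; k <= n; k++) {
        cur = (k == 1) ? val_len : cur + 2 + val_len;   /* append ", value" */
        assert(total <= ULLONG_MAX - (cur + 1));        /* overflow guard */
        total += cur + 1;                               /* +1 for the NUL */
    }
    return total;
}

/* Bytes needed for the final merged value alone. */
unsigned long long live_bytes_after_merges(unsigned long long n,
                                           unsigned long long val_len)
{
    return n == 0 ? 0 : n * val_len + 2 * (n - 1) + 1;
}
```

For n = 10,000 and val_len = 5 ("sioux") the first function gives
350,025,000 bytes and the second 69,999 bytes. That is the same order of
magnitude as the reported usage and agrees with the 70k figure for the
string that is actually still referenced.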

Re: [finrod@EWOX.ORG: YA Apache DoS attack]

Posted by Marc Slemko <ma...@worldgate.com>.

On Fri, 7 Aug 1998, Dan Jacobowitz wrote:

> I'm sure you've probably all seen this as soon as I did, but:

"oh, I really would have let them know but I'm too lazy to spend two
seconds to fill out a little web form, so too bad"

Childish luser.