Posted to docs@httpd.apache.org by Mi...@telekurs.de on 2002/09/06 17:22:09 UTC

Documentation: mod_actions and security (long ...)

Hi all,

I have a problem that touches both Apache documentation and
security, so I hope this is the right place to discuss it.
I would like to ask what I should do, and how the Apache
documentation (about handlers as well as about security) might
cover the issue I am going to describe in the future.

Let's start with
     http://httpd.apache.org/docs/handler.html
which is a very short description of how to embed CGI scripts
as handlers into the Apache request processing.
It gives no hint that there might be a security issue,
but you will see that it can easily become one.

All I learned in this article was that
 a) I can have my own Perl CGI script being used as handler and
    thus be invoked for each and every request to the class of
    documents I am going to configure,
 b) all it takes would be some <FilesMatch> and AddHandler inside
    a .htaccess file, and
 c) my script will then be served PATH_TRANSLATED so that it can
    know which file has actually been requested.
What I didn't learn there is that another environment variable,
PATH_INFO, is set as well, telling me the relative URI of the
request after some translations (such as the directory default
having been evaluated; I don't know whether mod_rewrite and
similar modules have already done their work by then).
I only found this out by installing a simple printenv.pl script
as a handler and looking at what options I had.

So what I did was write a CGI script that implements a variant
of mod_deflate (or mod_gzip, as we are in Apache 1.3 territory).
It performs content negotiation on the "Accept-Encoding: gzip"
HTTP header and, which is the main difference from the existing
modules, creates and maintains its own cache of compressed
content.
So instead of making the user manually create *.gz variants of
his documents and then use mod_negotiation to conditionally
serve them (which can easily lead to serving outdated variants,
just because you forgot to update the compressed version after
changing the original file), my cache does this by itself: it
checks the last-modification date of both the original and the
compressed cache file, creating and refreshing the cache
automatically.
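
The freshness check described above can be sketched roughly as
follows. This is only an illustration in Python (the real script is
Perl); the function name, the cache layout and the use of the file's
base name as cache key are assumptions made for the example, not the
script's actual design.

```python
import gzip
import os
import shutil


def cached_gzip_path(original, cache_dir):
    """Return the path of an up-to-date gzip copy of `original`.

    The cache file is (re)built when it is missing or older than the
    original. Hypothetical simplification: the cache key is the base
    name only, so equally named files in different directories would
    collide.
    """
    cache_file = os.path.join(cache_dir, os.path.basename(original) + ".gz")
    stale = (not os.path.exists(cache_file)
             or os.path.getmtime(cache_file) < os.path.getmtime(original))
    if stale:
        os.makedirs(cache_dir, exist_ok=True)
        tmp = cache_file + ".tmp"
        # Write to a temporary file first so a half-written cache
        # entry is never served to a concurrent request.
        with open(original, "rb") as src, gzip.open(tmp, "wb") as dst:
            shutil.copyfileobj(src, dst)
        os.replace(tmp, cache_file)
    return cache_file
```

A handler would call this with the requested file and then send the
returned file with "Content-Encoding: gzip" if the client accepts it.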
This of course restricts the usage to static files, but it
would be usable for a large percentage of websites. All you need
is Apache, CGI, .htaccess, AddHandler and Perl, plus some
compression tool, which may be UNIX "gzip" or some CPAN module;
both would work.

The whole thing was working perfectly on my standard webspace
(where I didn't manage to convince the provider to upgrade to
something more state-of-the-art than Apache 1.3.12 ... and no
mod_gzip of course ...). I have some very large documents that
compress nicely (sometimes by a factor of 10 and more), which
will speed up transmission a lot (not to mention traffic savings).
I improved the thing more and more, made it write its own
compression log file, and wrote a statistics script to calculate
the traffic savings and the frequency of compressed requests ...
After some weeks of successful operation, I decided to make this
thing Open Source and offer a download version of it. I wrote
a lot of documentation about it, made the script configurable
via Apache environment variables, and even included an 'online
configuration feature' (it displays sensitive path name
information about the server, so it can of course be deactivated
by configuration, and my installation manual explicitly suggests
disabling it after successful installation).

This happened three months ago. I was feeling safe: whoever
requested a file would first have to make a normal HTTP access,
so Apache would evaluate any type of access control before my
handler was invoked. The world looked bright.

But yesterday someone told me that my handler is a perfect tunnel
through each and every HTTP security concept that Apache ever had.

If you guess the script URL and access the script directly from
a browser, but _append_ the relative URL of some document within
this web space, then Apache treats this exactly like the
activation of the handler via the "Action" interface, i.e.
PATH_INFO _and_ PATH_TRANSLATED are set. Therefore, the file's
content would probably be compressed - and definitely be sent to
the client.

The only difference is that in this case HTTP access control is
done for the URL of the script, not for the URL of the document!
So with this script, you can now read the content of files that
reside within some /cgi-bin directory created by ScriptAlias, and
you can now read files that are protected by server
authentication. It is just a catastrophe.

At this point I started to check whether there is anything I could
use to definitely find out if my script has been invoked as a
handler or if it was simply executed directly with some additional
URL 'tail'.
I need to identify some difference between
     "GET /directory/file.htm"
and  "GET /cgi-bin/script.pl/directory/file.htm"
or else the latter one of these requests will be an open door for
intruders.
But looking at the environment variables set in both cases only
adds to the confusion; in particular, the results seem to differ
between Apache installations.
So the outcome might even depend on things I am not aware of,
like changes in the handler API between Apache versions.
Or maybe this is even configurable somehow, but I don't know where.
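
The comparison described above amounts to taking a snapshot of the
environment in each kind of invocation and diffing the two. A small
sketch of such a diff, in Python for brevity (the real script is
Perl); the function name is hypothetical, and the snapshots would
have to come from a printenv-style dump on the server in question:

```python
def diff_env(handler_env, direct_env):
    """Return {name: (handler_value, direct_value)} for every variable
    whose value differs between the two snapshots. A variable that is
    missing from one snapshot shows up with None on that side."""
    names = set(handler_env) | set(direct_env)
    return {
        name: (handler_env.get(name), direct_env.get(name))
        for name in sorted(names)
        if handler_env.get(name) != direct_env.get(name)
    }
```

Running this on dumps from several Apache installations would show
which differences, if any, are stable enough to rely on.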

So I am asking for your help now.

The first thing I would like to know is how a CGI script can
reliably tell whether it has been activated via the "Action"
interface rather than being invoked directly via a URL.
I need this information to be able to publish a "safe" update
of my script. I could well live with an algorithm that depends
on the Apache version being used; of course it would be nice to
ask Apache which version is running, but I am afraid this
information isn't available through the environment.
I have already found out that host-based access control like
     order deny,allow
     deny  from all
     allow from my.own.server.ip
would block direct requests to the script but still let the handler
do its normal work. This would of course disable the configuration
interface of the script as well, but so be it - I would be able
to move the two functions into separate scripts.
The main problem is that the IP might change (if the provider decides
to move the web space to another server and adapt the DNS mapping),
and in that case each and every page access would lead to an HTTP 403,
totally blocking the site.
It would also make the script harder to handle for the person who
installs it - I don't like a procedure where the user has to get it
right or else security is broken. This is no solution.
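
A check inside the script itself, rather than in .htaccess, might be
worth testing as an alternative. The sketch below rests on an
assumption that is not verified for the installations discussed
here: that Apache runs an "Action" handler through an internal
redirect and therefore adds REDIRECT_* variables (such as
REDIRECT_URL) to the CGI environment, while a direct request to the
script URL gets none. It is written in Python for brevity (the real
script is Perl), and the function name is hypothetical.

```python
import os


def invoked_via_internal_redirect(environ=None):
    """Heuristic only: True if the environment carries REDIRECT_*
    variables, which is ASSUMED to indicate an internal redirect
    (e.g. through "Action") rather than a direct request."""
    env = os.environ if environ is None else environ
    return "REDIRECT_STATUS" in env or "REDIRECT_URL" in env
```

A script using this would refuse to serve the file named by
PATH_TRANSLATED whenever the function returns False. Whether the
variables are set consistently would have to be confirmed on each
Apache version before relying on it.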

The second thing points more toward the future of the Apache
documentation. Now that all this has happened, let's at least make
the best of it so that others won't make the same mistake.
Could there be some documentation about the security aspects
one has to consider when writing a CGI script to be used as an
Apache handler? Maybe there are even more pitfalls on my
road to publishing a secure CGI script ...
Currently, the handler description makes it all seem so easy: "Hey,
write your CGI script, evaluate PATH_TRANSLATED and there you go!".
This was the euphoria that led to my script, and now I wonder
whether I should have written it at all ...

One last question: Is there any way for an "Action" handler to know
the MIME type it has been invoked for? I didn't find any environment
variable that tells me this - and after all, it is a MIME type
that a handler is assigned to, and more than one MIME type may be
handled by the same handler.
In my case the compression handler would like to be invoked
for more than one MIME type (like HTML, XHTML, JavaScript, CSS, ...)
but in each case send the correct "Content-Type:" HTTP header ...
As far as I currently know, I have to use several copies of
my handler, each individually customized (because my provider
doesn't support the use of SetEnv in .htaccess files ...).
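
One workaround that would avoid several copies is to derive the
content type from the extension of the file named in
PATH_TRANSLATED. A minimal sketch in Python (the real script is
Perl); the table and the function name are hypothetical, and the
table would have to mirror the server's own MIME configuration to
give the same answers Apache would:

```python
import os

# Hypothetical mapping; it must be kept in step with the server's
# own extension-to-MIME-type configuration.
CONTENT_TYPES = {
    ".html": "text/html",
    ".htm": "text/html",
    ".xhtml": "application/xhtml+xml",
    ".js": "text/javascript",
    ".css": "text/css",
}


def content_type_for(path_translated, default="application/octet-stream"):
    """Pick a Content-Type from the file extension (case-insensitive),
    falling back to `default` for unknown extensions."""
    ext = os.path.splitext(path_translated)[1].lower()
    return CONTENT_TYPES.get(ext, default)
```

This only duplicates information the server already has, so it is a
stopgap and not a substitute for the handler being told the type.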

Maybe all of my problems would dissolve into nothing if the "Action"
interface set a couple of additional environment variables ...

Regards, Michael

P.S.: Not really meant to advertise my broken script :-(, but
      if anyone wants to read the whole story, algorithm, Perl
      source code and documentation, you can find it all at
          http://www.schroepl.net/projekte/gzip_cnc/,
      including my current research into a secure method to
      make the thing usable again. The pages are going to change
      incrementally over the next few days ...
