Posted to dev@httpd.apache.org by "Ronald F. Guilmette" <rf...@monkeys.com> on 2002/05/31 04:45:42 UTC

Need a new feature: Listing of CGI-enabled directories.


Greetings all,

I am directing this message to the developer's list because I strongly
suspect that it may require some new development.

I am working with many major web hosting companies to try to put a lid
on the FormMail spam problem.  In that regard, it would be most helpful
if I could get Apache, which already has code to parse and analyze Apache
configuration files, to simply spit out a list of all of the CGI-enabled
directories that are specified in a given httpd.conf file to, say, stdout.
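
Concretely, I have in mind something along these lines (the option name
and the sample output below are purely hypothetical, which is of course
the whole point of this request):

   # hypothetical invocation; no such option exists today
   httpd -f /usr/local/apache/conf/httpd.conf --list-cgi-dirs
   /usr/local/apache/cgi-bin
   /home/vhost1/htdocs/cgi-bin
   /home/vhost2/htdocs/cgi-bin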

Background:

As many of you may already know, there is this script called FormMail.pl
that was written by a fellow named Matt Wright.  It is a free CGI script
used to process simple "contact us" type HTML form data, turning it into
a mail message.

Unfortunately, most versions of this script are subject to trivial
hijacking by spammers to send spam, and spammer exploitation of these
scripts, which are installed at thousands of locations all over the
world, is now rampant.  (If you have ever gotten a spam that contained
the phrase "Below is the result of your feedback form" then you have
been spammed via a hijacked FormMail script.)

Most sites and web hosting companies _do_ want to eliminate these
exploitable FormMail scripts from their web servers, but when you are a big
web hosting company with thousands of virtually-hosted web sites, the task
of just finding the bad scripts in your filesystem isn't terribly easy.  In
fact it can get downright convoluted.

The first step in finding all such scripts, however, may often be the most
difficult one.  That first step consists of simply gathering into one big
list all of the CGI-enabled directories on the local web server.  Once such
a list has been compiled, other standard UNIX tools such as `find' and
`file' and `grep' can be set to work, plowing through all of the files in
those (CGI) directories and finding all of the bad FormMail scripts.

This latter part of the process is something that I (and/or others) can
easily write a shell or Perl script to perform.  It is the initial part
of the process... finding and listing all CGI-enabled directories... that
remains problematic.
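
To illustrate, the latter part could be roughly as simple as the following
sketch (assuming the directory list ends up in a file called cgi-dirs.txt,
one directory per line; the telltale string is the one the stock script
emits):

   # hunt for likely FormMail installs under each CGI-enabled directory
   while read dir
   do
       find "$dir" -type f \( -name 'formmail*' -o -name 'FormMail*' \) \
           -exec grep -l 'Below is the result of your feedback form' {} \;
   done < cgi-dirs.txt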

Now in theory, I _could_ write my own parser/analyzer for httpd.conf files,
and then use that to fish out the list of CGI-enabled directories, but like
all good programmers, I'm lazier than a one-legged dog, and I don't have
any desire to re-invent the wheel here.  Apache itself clearly already
contains something that knows perfectly well how to parse and analyze
httpd.conf files, so I really would just like to leverage that and get
Apache's own config file parser/analyzer to spit out the list of CGI-enabled
directories that I need.  That seems to be the shortest route to success,
but I am not fundamentally an Apache hacker, so I thought that I would
just take a shot in the dark, wander in here and beg for some help, and
hope that someone who _is_ an experienced Apache hacker would take pity
on me and volunteer to help with this very noble and worthy cause.
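
(For the record, the quick-and-dirty version of "writing my own" would be
something along the lines of

   # quick and dirty, and woefully incomplete
   grep -E 'ScriptAlias|ExecCGI|cgi-script' /usr/local/apache/conf/httpd.conf

with your own path to httpd.conf substituted as appropriate, but that just
doesn't cut it: it reports directives without their enclosing <Directory>
or <VirtualHost> context, and it misses Include'd configuration files
entirely.)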

Well, I can dream can't I? :-)

But seriously, is there already a way to do what I need to do with the
Apache server?  Looking at the command line options on the man page for
httpd, there doesn't seem to be an option to request httpd to just parse
the config file, list the CGI directories, and then exit.  But that's
exactly what I need.

So, any volunteers?

It really is a worthy cause.  If you can help, please do.


Regards,
Ron Guilmette


P.S.  Note that FormMail is far from the only dangerous CGI script that
a big web hosting company might find that some of its less-clued end lusers
may have uploaded and installed into CGI-enabled directories.  And the
overall comprehensive search process that I'm trying to develop could
be (and arguably should be) applied to searching for other dangerous CGI
scripts too.

I just mention that fact in order to underscore the point that the kind
of new Apache feature I'm hoping for here would have general applicability
to a larger set of problems than just the FormMail problem.  Listing the
CGI-enabled directories is a generic facility that Apache really ought to
provide.

Re: Need a new feature: Listing of CGI-enabled directories.

Posted by Zac Stevens <zt...@cryptocracy.com>.
Hi Ronald,


Let me preface this by noting that I agree with your goals, however I
believe that you may not have considered the nature of the problem in
sufficient depth.

On Thu, May 30, 2002 at 07:45:42PM -0700, Ronald F. Guilmette wrote:
> Most sites and web hosting companies _do_ want to eliminate these
> exploitable FormMail scripts from their web servers, but when you are a big
> web hosting company with thousands of virtually-hosted web sites, the task
> of just finding the bad scripts in your filesystem isn't terribly easy.  In
> fact it can get downright convoluted.
> 
> The first step in finding all such scripts, however, may often be the most
> difficult one.  That first step consists of simply gathering into one big
> list all of the CGI-enabled directories on the local web server.  Once such
> a list has been compiled, other standard UNIX tools such as `find' and
> `file' and `grep' can be set to work, plowing through all of the files in
> those (CGI) directories and finding all of the bad FormMail scripts.

How are you defining "bad FormMail scripts" here?  In the simplest case,
you could just run 'find' across the filesystem containing the web content
looking for "formmail.cgi" or "formmail.pl" and checking those found
against a list of known good/bad versions.  This doesn't require any
specific knowledge of the Apache configuration in use, and is an entirely
viable approach even on filesystems of several hundred gigabytes.
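
For example, something along these lines (only a sketch; the content root
is a placeholder, and the checksums would be compared against whatever list
of known versions you maintain):

   # locate candidate scripts and fingerprint them for comparison against
   # a list of known good/bad versions
   find /var/www -type f \( -name 'formmail.pl' -o -name 'formmail.cgi' \
       -o -name 'FormMail.pl' -o -name 'FormMail.cgi' \) -exec md5sum {} \;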

A more thorough check would involve testing all executable ASCII files,
perhaps excluding .html/.php and variants thereof.
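
That broader sweep might look roughly like this (again just a sketch, with
an assumed content root of /var/www):

   # every file with the owner-execute bit set that file(1) calls a script,
   # skipping the obvious static content
   find /var/www -type f -perm -100 \
       ! -name '*.html' ! -name '*.htm' ! -name '*.php' \
       -exec file {} \; | grep -i 'script'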

> But seriously, is there already a way to do what I need to do with the
> Apache server?  Looking at the command line options on the man page for
> httpd, there doesn't seem to be an option to request httpd to just parse
> the config file, list the CGI directories, and then exit.  But that's
> exactly what I need.

It isn't possible in the generic case.  Apache allows many aspects of the
configuration to be modified by the inclusion of ".htaccess" files beneath
the DocumentRoot of the web server.  Unless Apache has been configured not
to allow the ExecCGI option to be overridden, you would need to parse both
httpd.conf AND every .htaccess file on the filesystem.  Apache itself does
not do this at startup - it is done only as required.
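
As a rough first pass you could at least locate the .htaccess files that
touch the relevant directives, e.g. (content root assumed):

   # per-directory overrides that may enable CGI
   find /var/www -name .htaccess \
       -exec grep -E -l -i 'ExecCGI|cgi-script' {} \;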

> P.S.  Note that FormMail is far from the only dangerous CGI script that
> a big web hosting company might find that some of its less-clued end lusers
> may have uploaded and installed into CGI-enabled directories.  And the
> overall comprehensive search process that I'm trying to develop could
> be (and arguably should be) applied to searching for other dangerous CGI
> scripts too.

You also can't assume that only files in designated CGI directories are
going to be problematic.  There's a long history of people using all sorts
of redirection and other techniques to access/execute things that they
shouldn't be able to.  This does, admittedly, involve far more effort (and
good luck) than the straightforward formmail abuse you're concerned about.

It should be apparent at this point that what you're looking at here is a
reduction of the abuse of formmail and the like, as it is almost impossible
to stamp out entirely.

In this particular example, the best solution is for the web host to
replace the sendmail binary with something which performs more rigorous
checking of its input and, ideally, tags outbound messages with identifying
information.  This is the solution implemented at a former employer - the
sendmail replacement checks that the From: address is valid and sane, and
writes a Received: header containing information that can be used to
identify the VirtualHost responsible for generating the message.  This also
removes the ability for poorly-written PHP scripts and the like to be
abused in a similar manner.

Although the script was developed in-house (and, unfortunately, cannot be
released) I believe there are open-source alternatives available to
accomplish the same ends.
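
The general shape of such a wrapper is simple enough, though.  A minimal
sketch (purely illustrative, not the in-house script mentioned above; the
paths are assumptions, and the From: checking described earlier is omitted
for brevity):

   #!/bin/sh
   # wrapper installed in place of /usr/lib/sendmail
   # log enough context to identify the calling vhost (the web server's
   # working directory usually gives it away), then hand off to the
   # real sendmail binary
   REAL=/usr/lib/sendmail.real
   logger -t webmail "uid=`id -u` cwd=`pwd` args=$*"
   exec "$REAL" "$@"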

I hope these thoughts provide you with a different perspective to consider,
and I wish you well in your efforts to reduce the amount of spam on the
Internet.

Cheers,


Zac


Re: Need a new feature: Listing of CGI-enabled directories.

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 09:45 PM 5/30/2002, you wrote:

>I am directing this message to the developer's list because I strongly
>suspect that it may require some new development.
>...
>The first step in finding all such scripts, however, may often be the most
>difficult one.  That first step consists of simply gathering into one big
>list all of the CGI-enabled directories on the local web server.  Once such
>a list has been compiled, other standard UNIX tools such as `find' and
>`file' and `grep' can be set to work, plowing through all of the files in
>those (CGI) directories and finding all of the bad FormMail scripts.

I've been thinking along this line for a different reason [not just limited
to the cgi enabled bit.]  However, I have no time to implement it at the
moment, so I'll share my thoughts instead.  Many, many users have trouble
figuring out

   1. why their scripts don't run
   2. how the +/- options syntax was really parsed
   3. how a given request will be handled

What about an htls command [httpd ls file listing utility] that would
actually use the core Apache code to run directories and files through
the Location, Directory and Files tests?

It doesn't have to be blindingly fast [it isn't the daemon itself, it's an
administrator's tool] and it can be rather simple.

One complexity is the difference introduced by <Location> and
<VirtualHost> directives.  My original thought was to list the files, but
those listings are only complete in the context of a host.

If we take the logic in mod_autoindex one step further, and set up
whatever hooks through mod_alias to 'query' the possible matches
under a given pattern, we should be able to make this reasonably
complete.  Perhaps the same sort of hook would work for mod_rewrite,
although the logic to "work those backwards" is probably as complex
as the rewrite code itself.  Recursion is one complication, depending
on how the patterns interact.

Anyway, I'm not suggesting that htls would ever be a web app!!!
This would be a local tool, similar to mod_info, only providing a list
of resources and the flags that affect them.

One might invoke the utility with:

   htls -r www.apache.org/dist/

and using whatever abbreviations, you might end up with something like
IN(D)EXES
(I)NCLUDES INC(N)OEXEC
(S)YM_LINKS OPT_SYM_(O)WNER
(E)XECCGI
(M)ULTI 128

D--SO-M  www.apache.org/dist/project1/
D--SO-M  www.apache.org/dist/project1/file.html
D----EM  www.apache.org/dist/project1/search.pl
D--SO-M  www.apache.org/dist/project2/
...
You could further extend the concept to include the optional listing of
the handler toggled by such a request, etc.

BTW - you might also want to consider simply reviewing your log files.
It would make this really trivial if you could add the handler invoked (or
a flag to indicate that a given filter was invoked) to the logs, so you
could look specifically for requests that trigger mod_cgi.  It might be a
straighter line to identifying the risks you are concerned with.
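
Even without touching the log format, a first pass over the existing access
logs would catch much of the current abuse, e.g. (assuming a common-format
access_log in the usual place):

   # POSTs to anything that smells like formmail
   grep -i 'POST /.*formmail' /usr/local/apache/logs/access_log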