Posted to apache-bugdb@apache.org by Stewart Honsberger <bl...@softhome.net> on 2001/10/17 22:04:20 UTC

general/8568: Web crawlers are able to gain access to directory listings of forbidden directories.

>Number:         8568
>Category:       general
>Synopsis:       Web crawlers are able to gain access to directory listings of forbidden directories.
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    apache
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   apache
>Arrival-Date:   Wed Oct 17 13:10:00 PDT 2001
>Closed-Date:
>Last-Modified:
>Originator:     blackdeath@softhome.net
>Release:        1.3.20
>Organization:
apache
>Environment:
Linux kernel series 2.4.0 - 2.4.7, Apache 1.3.12 through 1.3.22, GCC compiler 2.95-2 through 3.1.
>Description:
A certain web crawler is able to obtain directory listings of directories that are otherwise password-protected. One such directory is configured as follows:

        <Directory "/usr/local/httpd/htdocs/main/cisco">
                AllowOverride None
                AuthType Basic
                AuthName "Cisco Content"
                AuthUserFile /etc/httpd/users
                require user blackdeath cisco
        </Directory>
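
For reference, a hardened variant of the same block would look roughly like this (a sketch only, not what is currently deployed); the idea is that mod_autoindex can never generate a listing for this tree even if the authentication check were somehow bypassed:

        <Directory "/usr/local/httpd/htdocs/main/cisco">
                # Sketch only. "Options -Indexes" disables automatic
                # directory listings for this tree entirely.
                Options -Indexes
                AllowOverride None
                AuthType Basic
                AuthName "Cisco Content"
                AuthUserFile /etc/httpd/users
                require user blackdeath cisco
        </Directory>
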
For several weeks, this particular web crawler has been requesting specific files from directories as many as three levels deep, receiving a 401 error each time.

A further aspect of this problem is that the same web crawler is able to discern URLs on my server (personal home pages as well as URIs under the main branch of my website) that a) have never been advertised, b) no longer exist, c) were never linked to, and, most glaring of all, d) existed for exactly two minutes on my live web server. There was a second-level subdirectory called "oldsite" which I created approximately one week ago by untarring it from a stored archive, and deleted literally two minutes after extracting it.

The severity of this particular aspect worries me even more than the access to forbidden content. How is it possible for a web crawler to monitor what directories I create without ever making a request to my web server?
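
One way to check whether the crawler really is making requests for these paths is to log the Referer and User-Agent of every hit using the standard combined log format (sketch below; the CustomLog path is illustrative). If a directory name shows up in the crawler's index but never appears in this log, no request for it ever reached the server:

        # Sketch only; the log file path is illustrative.
        LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
        CustomLog /usr/local/httpd/logs/access_log combined
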

My Apache config file, specifics about the aforementioned web crawler, and other sensitive details are available to Apache team members on request.
>How-To-Repeat:
Unknown.
>Fix:
Unknown.
>Release-Note:
>Audit-Trail:
>Unformatted:
 [In order for any reply to be added to the PR database, you need]
 [to include <ap...@Apache.Org> in the Cc line and make sure the]
 [subject line starts with the report component and number, with ]
 [or without any 'Re:' prefixes (such as "general/1098:" or      ]
 ["Re: general/1098:").  If the subject doesn't match this       ]
 [pattern, your message will be misfiled and ignored.  The       ]
 ["apbugs" address is not added to the Cc line of messages from  ]
 [the database automatically because of the potential for mail   ]
 [loops.  If you do not include this Cc, your reply may be ig-   ]
 [nored unless you are responding to an explicit request from a  ]
 [developer.  Reply only with text; DO NOT SEND ATTACHMENTS!     ]