You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Marc Slemko <ma...@worldgate.com> on 1998/11/01 08:57:42 UTC

Re: [PATCH] RE: Canonical filename overhaul

On Mon, 19 Oct 1998, Ken Parzygnat wrote:

> Ok, here's the patch for you to play with.  This patch is what
> was previously discussed with one addition.  I heard through
> the grapevine (credit Marc) that one problem was related to 
> use of alias names (the loveable "blah~1" type names).  Therefore,
> I had another piece of code up my sleeve to address another problem,
> but I'm putting it on the table now because it will solve this problem.

This doesn't want to apply for me; the util_win32.c chunks fail
completely.  

Could you either see if that may be something on your end or just send the
whole util_win32.c file?

A few notes:

Get rid of the ap_assert()s.  There is no need for them and it is bad 
coding practice to use them for "normal" error conditions.  Yea, I 
know, they are a carryover from the existing code but it is broken.

Are you sure you want to return "" if the arg is too long for
the buffer in ap_os_systemcase_filename?  

Other than that, the concept looks reasonable and if it actually works
all the better...  I can't say I fully understand all the issues on 
Win32 since it is such a pain, but...  I just wish there were some 
documented way to compare file paths on Win32.  I'm sure there are
more magic cases that we don't know about.




RE: [PATCH] RE: Canonical filename overhaul

Posted by Ken Parzygnat <kp...@raleigh.ibm.com>.
> > 
> > BTW, wouldn't it make more sense to return NULL if szFile is NULL?
> 
> Probably.  I've updated this too.  I was originally thinking that we
> needed to return some string.  This sort of falls under your next
> comment.
> 
After looking at this a little more, there are places in the core code that
don't check for a NULL return from ap_os_canonical_filename.  Therefore,
it probably is safer to return an empty string for now.

> > 
> > Double-BTW, there isn't a defined way for ap_os_canonical_filename to
> > return an error. Do we want one?
> > 
> I'm back and forth on this.  In one respect it would be nice to have error
> codes to indicate things like NULL parameters and filenames that are way
> too long.  But on the other hand, I think that ap_os_canonical_filename
> should make the best attempt it can at returning something since we don't
> want to make too many assumptions about the input.  I'm really not sold
> either way.
If we decide to do this, this should probably go in on its own patch since
we will have to change core code to look for error codes. 

So, anyone against committing this patch?  Any +1's?


- - - - - - - - - - - - - - - - - -
Ken Parzygnat
email: kparz@raleigh.ibm.com 
 

RE: [PATCH] RE: Canonical filename overhaul

Posted by Ken Parzygnat <kp...@raleigh.ibm.com>.
> > 
> > > Ok, here's the patch for you to play with.  This patch is what
> > > was previously discussed with one addition.  I heard through
> > > the grapevine (credit Marc) that one problem was related to
> > > use of alias names (the loveable "blah~1" type names).  Therefore,
> > > I had another piece of code up my sleeve to address another problem,
> > > but I'm putting it on the table now because it will solve this problem.
> > 
> > This doesn't want to apply for me; the util_win32.c chunks fail
> > completely.
> > 
> > Could you either see if that may be something on your end or just send the
> > whole util_win32.c file?
> > 
> > A few notes:
> > 
> > Get rid of the ap_assert()s.  There is no need for them and it is bad
> > coding practice to use them for "normal" error conditions.  Yea, I
> > know, they are a carryover from the existing code but it is broken.
> 
> There's only one mentioned in the patch that I can see (on the + side),
> and that's:
> 
> +    ap_assert(szFile != NULL);
> +    if (szFile == NULL || strlen(szFile) == 0)
> +        return ap_pstrdup(pPool, "");
> 
> which is either correct, if we don't think you should pass a NULL to
> ap_os_canonical_filename, in which case the following if is wrong, or
> incorrect if the following if is correct.

Yep.  I was thinking that ap_assert was only in affect on debug
builds.  I've changed this in the last repost.

> 
> BTW, wouldn't it make more sense to return NULL if szFile is NULL?

Probably.  I've updated this too.  I was originally thinking that we
needed to return some string.  This sort of falls under your next
comment.

> 
> Double-BTW, there isn't a defined way for ap_os_canonical_filename to
> return an error. Do we want one?
> 
I'm back and forth on this.  In one respect it would be nice to have error
codes to indicate things like NULL parameters and filenames that are way
too long.  But on the other hand, I think that ap_os_canonical_filename
should make the best attempt it can at returning something since we don't
want to make too many assumptions about the input.  I'm really not sold
either way.

Thanks for your comments Ben!
- - - - - - - - - - - - - - - - - -
Ken Parzygnat
email: kparz@raleigh.ibm.com 


Re: [PATCH] RE: Canonical filename overhaul

Posted by Ben Laurie <be...@algroup.co.uk>.
Marc Slemko wrote:
> 
> On Mon, 19 Oct 1998, Ken Parzygnat wrote:
> 
> > Ok, here's the patch for you to play with.  This patch is what
> > was previously discussed with one addition.  I heard through
> > the grapevine (credit Marc) that one problem was related to
> > use of alias names (the loveable "blah~1" type names).  Therefore,
> > I had another piece of code up my sleeve to address another problem,
> > but I'm putting it on the table now because it will solve this problem.
> 
> This doesn't want to apply for me; the util_win32.c chunks fail
> completely.
> 
> Could you either see if that may be something on your end or just send the
> whole util_win32.c file?
> 
> A few notes:
> 
> Get rid of the ap_assert()s.  There is no need for them and it is bad
> coding practice to use them for "normal" error conditions.  Yea, I
> know, they are a carryover from the existing code but it is broken.

There's only one mentioned in the patch that I can see (on the + side),
and that's:

+    ap_assert(szFile != NULL);
+    if (szFile == NULL || strlen(szFile) == 0)
+        return ap_pstrdup(pPool, "");

which is either correct, if we don't think you should pass a NULL to
ap_os_canonical_filename, in which case the following if is wrong, or
incorrect if the following if is correct.

BTW, wouldn't it make more sense to return NULL if szFile is NULL?

Double-BTW, there isn't a defined way for ap_os_canonical_filename to
return an error. Do we want one?

Cheers,

Ben.

-- 
Ben Laurie            |Phone: +44 (181) 735 0686| Apache Group member
Freelance Consultant  |Fax:   +44 (181) 735 0689|http://www.apache.org/
and Technical Director|Email: ben@algroup.co.uk |
A.L. Digital Ltd,     |Apache-SSL author     http://www.apache-ssl.org/
London, England.      |"Apache: TDG" http://www.ora.com/catalog/apache/

RE: [PATCH] RE: Canonical filename overhaul

Posted by Ken Parzygnat <kp...@raleigh.ibm.com>.
> > Ok, here's the patch for you to play with.  This patch is what
> > was previously discussed with one addition.  I heard through
> > the grapevine (credit Marc) that one problem was related to 
> > use of alias names (the loveable "blah~1" type names).  Therefore,
> > I had another piece of code up my sleeve to address another problem,
> > but I'm putting it on the table now because it will solve this problem.
> 
> This doesn't want to apply for me; the util_win32.c chunks fail
> completely.  
> 
> Could you either see if that may be something on your end or just send the
> whole util_win32.c file?

I think if you tell patch to ignore whitespace, it will work.  In my version
of patch the option is --ignore-whitespace.  However, since I'm just about
replacing the entire file, I'll post the whole file here.  If anything, it'll
be easier to read here.

> 
> A few notes:
> 
> Get rid of the ap_assert()s.  There is no need for them and it is bad 
> coding practice to use them for "normal" error conditions.  Yea, I 
> know, they are a carryover from the existing code but it is broken.

OK.  The code below has them removed.  I was coding under the
false assumption that asserts only had affect on debug builds.
I didn't remove them from sections of code I did not modify, 
do you want those blown away too?

> 
> Are you sure you want to return "" if the arg is too long for
> the buffer in ap_os_systemcase_filename?  
> 
Well, if the name is too long for the buffer, the name is over
8K, so I'm betting it's bogus.  Maybe returning NULL is a 
better idea?  This sort of falls under Ben's comment about
whether or not we should return error codes.

> second, what about trailing spaces?  In some cases, opening "file " will
> give you "file".  I don't think we deal with it now, but that doesn't mean
> we shouldn't. 
Ah-ha! Good catch.  I've updated the code to remove trailing spaces
from the filename.  It is reflected below. Thanks.

Here's the code, with modifications noted by you and Ben.
I also made a change to the header where the two new routines are
defined.  I don't want to break other OS's that have 
HAVE_CANONICAL_FILENAME defined (e.g. OS/2).
I've included the patch files for httpd.h and http_request.c, and
I copied in util_win32.c wholesale.

Thanks for taking time to look at this.

--- .\include\httpd.h.orig	Wed Oct 28 20:12:18 1998
+++ .\include\httpd.h	Mon Nov 02 16:01:27 1998
@@ -998,8 +998,17 @@

 #ifndef HAVE_CANONICAL_FILENAME
 #define ap_os_canonical_filename(p,f)  (f)
+#define ap_os_case_canonical_filename(p,f)  (f)
+#define ap_os_systemcase_filename(p,f)  (f)
 #else
 API_EXPORT(char *) ap_os_canonical_filename(pool *p, const char *file);
+#ifdef WIN32
+API_EXPORT(char *) ap_os_case_canonical_filename(pool *pPool, const char *szFile);
+API_EXPORT(char *) ap_os_systemcase_filename(pool *pPool, const char *szFile);
+#else
+#define ap_os_case_canonical_filename(p,f) ap_os_canonical_filename(p,f)
+#define ap_os_systemcase_filename(p,f) ap_os_canonical_filename(p,f)
+#endif
 #endif

 #ifdef _OSD_POSIX
--- .\main\http_request.c.orig	Tue Oct 20 19:12:26 1998
+++ .\main\http_request.c	Mon Nov 02 14:30:23 1998
@@ -359,16 +359,19 @@
         return OK;
     }

-    r->filename   = ap_os_canonical_filename(r->pool, r->filename);
-    test_filename = ap_pstrdup(r->pool, r->filename);
-
-    ap_no2slash(test_filename);
-    num_dirs = ap_count_dirs(test_filename);
+    r->filename   = ap_os_case_canonical_filename(r->pool, r->filename);

     res = get_path_info(r);
     if (res != OK) {
         return res;
     }
+
+    r->filename   = ap_os_canonical_filename(r->pool, r->filename);
+
+    test_filename = ap_pstrdup(r->pool, r->filename);
+
+    ap_no2slash(test_filename);
+    num_dirs = ap_count_dirs(test_filename);

     if ((res = check_safe_file(r))) {
         return res;



============== UTIL_WIN32.C =====================
#include <windows.h>
#include <sys/stat.h>
#include <stdarg.h>

#include "httpd.h"
#include "http_log.h"

/* Returns TRUE if the input string is a string
 * of one or more '.' characters.
 */
static BOOL OnlyDots(char *pString)
{
    char *c;

    if (*pString == '\0')
        return FALSE;

    for (c = pString;*c;c++)
        if (*c != '.')
            return FALSE;

    return TRUE;
}

/* Accepts as input a pathname, and tries to match it to an
 * existing path and return the pathname in the case that
 * is present on the existing path.  This routine also
 * converts alias names to long names.
 */
API_EXPORT(char *) ap_os_systemcase_filename(pool *pPool,
                                             const char *szFile)
{
    char buf[HUGE_STRING_LEN];
    char *pInputName;
    char *p, *q;
    BOOL bDone = FALSE;
    BOOL bFileExists = TRUE;
    HANDLE hFind;
    WIN32_FIND_DATA wfd;

    if (!szFile || strlen(szFile) >= sizeof(buf))
        return NULL;

    if (strlen(szFile) == 0)
        return ap_pstrdup(pPool, "");

    buf[0] = '\0';
    pInputName = ap_pstrdup(pPool, szFile);

    /* First convert all slashes to \ so Win32 calls work OK */
    for (p = pInputName; *p; p++) {
        if (*p == '/')
            *p = '\\';
    }

    p = pInputName;
    /* If there is drive information, copy it over. */
    if (pInputName[1] == ':') {
        buf[0] = tolower(*p++);
        buf[1] = *p++;
        buf[2] = '\0';

        /* If all we have is a drive letter, then we are done */
        if (strlen(pInputName) == 2)
            bDone = TRUE;
    }

    q = p;
    if (*p == '\\') {
        p++;
        if (*p == '\\')  /* Possible UNC name */
        {
            p++;
            /* Get past the machine name.  FindFirstFile */
            /* will not find a machine name only */
            p = strchr(p, '\\');
            if (p)
            {
                p++;
                /* Get past the share name.  FindFirstFile */
                /* will not find a \\machine\share name only */
                p = strchr(p, '\\');
                if (p) {
                    strncat(buf,q,p-q);
                    q = p;
                    p++;
                }
            }

            if (!p)
                p = q;
        }
    }

    p = strchr(p, '\\');

    while (!bDone) {
        if (p)
            *p = '\0';

        if (strchr(q, '*') || strchr(q, '?'))
            bFileExists = FALSE;

        /* If the path exists so far, call FindFirstFile
         * again.  However, if this portion of the path contains
         * only '.' charaters, skip the call to FindFirstFile
         * since it will convert '.' and '..' to actual names.
         * Note: in the call to OnlyDots, we may have to skip
         *       a leading slash.
         */
        if (bFileExists && !OnlyDots((*q == '.' ? q : q+1))) {
            hFind = FindFirstFile(pInputName, &wfd);

            if (hFind == INVALID_HANDLE_VALUE) {
                bFileExists = FALSE;
            }
            else {
                FindClose(hFind);

                if (*q == '\\')
                    strcat(buf,"\\");
                strcat(buf, wfd.cFileName);
            }
        }

        if (!bFileExists || OnlyDots((*q == '.' ? q : q+1))) {
            strcat(buf, q);
        }

        if (p) {
            q = p;
            *p++ = '\\';
            p = strchr(p, '\\');
        }
        else {
            bDone = TRUE;
        }
    }

    /* First convert all slashes to / so server code handles it ok */
    for (p = buf; *p; p++) {
        if (*p == '\\')
            *p = '/';
    }

    return ap_pstrdup(pPool, buf);
}


/*  Perform canonicalization with the exception that the
 *  input case is preserved.
 */
API_EXPORT(char *) ap_os_case_canonical_filename(pool *pPool,
                                                 const char *szFile)
{
    char *pNewStr;
    char *s;
    char *p;
    char *q;

    if (szFile == NULL)
        return NULL;

    if (strlen(szFile) == 0)
        return ap_pstrdup(pPool, "");

    pNewStr = ap_pstrdup(pPool, szFile);

    /*  Change all '\' characters to '/' characters.
     *  While doing this, remove any trailing '.'.
     *  Also, blow away any directories with 3 or
     *  more '.'
     */
    for (p = pNewStr,s = pNewStr; *s; s++,p++) {
        if (*s == '\\' || *s == '/') {

            q = p;
            while (p > pNewStr && *(p-1) == '.')
                p--;

            if (p == pNewStr && q-p <= 2 && *p == '.')
                p = q;
            else if (p > pNewStr && p < q && *(p-1) == '/') {
                if (q-p > 2)
                    p--;
                else
                    p = q;
            }

            *p = '/';
        }
        else {
            *p = *s;
        }
    }
    *p = '\0';

    /*  Blow away any final trailing '.' since on Win32
     *  foo.bat == foo.bat. == foo.bat... etc.
     *  Also blow away any trailing spaces since
     *  "filename" == "filename "
     */
    q = p;
    while (p > pNewStr && (*(p-1) == '.' || *(p-1) == ' '))
        p--;
    if ((p > pNewStr) ||
        (p == pNewStr && q-p > 2))
        *p = '\0';


    /*  One more security issue to deal with.  Win32 allows
     *  you to create long filenames.  However, alias filenames
     *  are always created so that the filename will
     *  conform to 8.3 rules.  According to the Microsoft
     *  Developer's network CD (1/98)
     *  "Automatically generated aliases are composed of the
     *   first six characters of the filename plus ~n
     *   (where n is a number) and the first three characters
     *   after the last period."
     *  Here, we attempt to detect and decode these names.
     */
    p = strchr(pNewStr, '~');
    if (p != NULL) {
        char *pConvertedName, *pQstr, *pPstr;
        char buf[HUGE_STRING_LEN];
        /* We potentially have a short name.  Call
         * ap_os_systemcase_filename to examine the filesystem
         * and possibly extract the long name.
         */
        pConvertedName = ap_os_systemcase_filename(pPool, pNewStr);

        /* Since we want to preserve the incoming case as much
         * as we can, compare for differences in the string and
         * only substitute in the path names that changed.
         */
        if (stricmp(pNewStr, pConvertedName)) {
            buf[0] = '\0';

            q = pQstr = pConvertedName;
            p = pPstr = pNewStr;
            do {
                q = strchr(q,'/');
                p = strchr(p,'/');

                if (p != NULL) {
                    *q = '\0';
                    *p = '\0';
                }

                if (stricmp(pQstr, pPstr))
                    strcat(buf, pQstr);   /* Converted name */
                else
                    strcat(buf, pPstr);   /* Original name  */


                if (p != NULL) {
                    pQstr = q;
                    pPstr = p;
                    *q++ = '/';
                    *p++ = '/';
                }

            } while (p != NULL);

            pNewStr = ap_pstrdup(pPool, buf);
        }
    }


    return pNewStr;
}

/*  Perform complete canonicalization.
 */
API_EXPORT(char *) ap_os_canonical_filename(pool *pPool, const char *szFile)
{
    char *pNewName;
    pNewName = ap_os_case_canonical_filename(pPool, szFile);
    strlwr(pNewName);
    return pNewName;
}

/* Win95 doesn't like trailing /s. NT and Unix don't mind. This works
 * around the problem.
 * Errr... except if it is UNC and we are referring to the root of
 * the UNC, we MUST have a trailing \ and we can't use /s. Jeez.
 * Not sure if this refers to all UNCs or just roots,
 * but I'm going to fix it for all cases for now. (Ben)
 */

#undef stat
API_EXPORT(int) os_stat(const char *szPath, struct stat *pStat)
{
    int n;

    if (strlen(szPath) == 0) {
        return -1;
    }

    if (szPath[0] == '/' && szPath[1] == '/') {
	char buf[_MAX_PATH];
	char *s;
	int nSlashes = 0;

	ap_assert(strlen(szPath) < _MAX_PATH);
	strcpy(buf, szPath);
	for (s = buf; *s; ++s) {
	    if (*s == '/') {
		*s = '\\';
		++nSlashes;
	    }
	}
	/* then we need to add one more to get \\machine\share\ */
	if (nSlashes == 3) {
	    *s++ = '\\';
	}
	*s = '\0';
	return stat(buf, pStat);
    }

    /*
     * Below removes the trailing /, however, do not remove
     * it in the case of 'x:/' or stat will fail
     */
    n = strlen(szPath);
    if ((szPath[n - 1] == '\\' || szPath[n - 1] == '/') &&
        !(n == 3 && szPath[1] == ':')) {
        char buf[_MAX_PATH];

        ap_assert(n < _MAX_PATH);
        strcpy(buf, szPath);
        buf[n - 1] = '\0';

        return stat(buf, pStat);
    }
    return stat(szPath, pStat);
}

/* Fix two really crap problems with Win32 spawn[lv]e*:
 *
 *  1. Win32 doesn't deal with spaces in argv.
 *  2. Win95 doesn't like / in cmdname.
 */

#undef _spawnv
API_EXPORT(int) os_spawnv(int mode, const char *cmdname,
			  const char *const *argv)
{
    int n;
    char **aszArgs;
    const char *szArg;
    char *szCmd;
    char *s;

    szCmd = _alloca(strlen(cmdname)+1);
    strcpy(szCmd, cmdname);
    for (s = szCmd; *s; ++s) {
        if (*s == '/') {
            *s = '\\';
	}
    }

    for (n = 0; argv[n]; ++n)
        ;

    aszArgs = _alloca((n + 1) * sizeof(const char *));

    for (n = 0; szArg = argv[n]; ++n) {
        if (strchr(szArg, ' ')) {
            int l = strlen(szArg);

            aszArgs[n] = _alloca(l + 2 + 1);
            aszArgs[n][0] = '"';
            strcpy(&aszArgs[n][1], szArg);
            aszArgs[n][l + 1] = '"';
            aszArgs[n][l + 2] = '\0';
        }
        else {
            aszArgs[n] = (char *)szArg;
        }
    }

    aszArgs[n] = NULL;

    return _spawnv(mode, szCmd, aszArgs);
}

#undef _spawnve
API_EXPORT(int) os_spawnve(int mode, const char *cmdname,
			   const char *const *argv, const char *const *envp)
{
    int n;
    char **aszArgs;
    const char *szArg;
    char *szCmd;
    char *s;

    szCmd = _alloca(strlen(cmdname)+1);
    strcpy(szCmd, cmdname);
    for (s = szCmd; *s; ++s) {
        if (*s == '/') {
            *s = '\\';
	}
    }

    for (n = 0; argv[n]; ++n)
        ;

    aszArgs = _alloca((n + 1)*sizeof(const char *));

    for (n = 0; szArg = argv[n]; ++n){
        if (strchr(szArg, ' ')) {
            int l = strlen(szArg);

            aszArgs[n] = _alloca(l + 2 + 1);
            aszArgs[n][0] = '"';
            strcpy(&aszArgs[n][1], szArg);
            aszArgs[n][l + 1] = '"';
            aszArgs[n][l + 2] = '\0';
        }
        else {
            aszArgs[n] = (char *)szArg;
        }
    }

    aszArgs[n] = NULL;

    return _spawnve(mode, szCmd, aszArgs, envp);
}

API_EXPORT(int) os_spawnle(int mode, const char *cmdname, ...)
{
    int n;
    va_list vlist;
    char **aszArgs;
    const char *szArg;
    const char *const *aszEnv;
    char *szCmd;
    char *s;

    szCmd = _alloca(strlen(cmdname)+1);
    strcpy(szCmd, cmdname);
    for (s = szCmd; *s; ++s) {
        if (*s == '/') {
            *s = '\\';
	}
    }

    va_start(vlist, cmdname);
    for (n = 0; va_arg(vlist, const char *); ++n)
        ;
    va_end(vlist);

    aszArgs = _alloca((n + 1) * sizeof(const char *));

    va_start(vlist, cmdname);
    for (n = 0; szArg = va_arg(vlist, const char *); ++n) {
        if (strchr(szArg, ' ')) {
            int l = strlen(szArg);

            aszArgs[n] = _alloca(l + 2 + 1);
            aszArgs[n][0] = '"';
            strcpy(&aszArgs[n][1], szArg);
            aszArgs[n][l + 1] = '"';
            aszArgs[n][l + 2] = '\0';
        }
        else {
            aszArgs[n] = (char *)szArg;
        }
    }

    aszArgs[n] = NULL;

    aszEnv = va_arg(vlist, const char *const *);
    va_end(vlist);

    return _spawnve(mode, szCmd, aszArgs, aszEnv);
}


- - - - - - - - - - - - - - - - - -
Ken Parzygnat
email: kparz@raleigh.ibm.com 






--------------------------------------------------------------------------------
Users of the Apache webserver are hereby granted a non-exclusive, irrevocable,
world-wide, royalty-free, non-transferable license to use, execute, prepare
derivative works of, and distribute (internally and externally, and including
derivative works) the code accompanying this license as part of, and
integrated into the Apache webserver.  This code is provided "AS IS" WITHOUT
WARRANTY OF ANY KIND EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
THE IMPLIED WARRANTY OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
AND ANY WARRANTY OF NON-INFRINGEMENT.  THE ENTIRE RISK ARISING OUT OF THE USE
OR PERFORMANCE OF THIS CODE REMAINS WITH USERS OF THE APACHE WEBSERVER.  The
owner of this code represents and warrants that it is legally entitled to
grant the above license.

Re: [PATCH] RE: Canonical filename overhaul

Posted by Marc Slemko <ma...@worldgate.com>.
A few more comments:

first, looking over the test cases you posted, I like this new code even
more.  Thanks, this is the source of many bug reports and should be the
source of many more if people knew what to expect.

second, what about trailing spaces?  In some cases, opening "file " will
give you "file".  I don't think we deal with it now, but that doesn't mean
we shouldn't.