You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "Lukas Euler (JIRA)" <ji...@apache.org> on 2017/11/18 18:28:00 UTC

[jira] [Created] (IO-556) Unexpected behavior of FileNameUtils.normalize may lead to limited path traversal vulnerabilies

Lukas Euler created IO-556:
------------------------------

             Summary: Unexpected behavior of FileNameUtils.normalize may lead to limited path traversal vulnerabilies
                 Key: IO-556
                 URL: https://issues.apache.org/jira/browse/IO-556
             Project: Commons IO
          Issue Type: Bug
          Components: Utilities
    Affects Versions: 2.6, 2.5, 2.4, 2.3, 2.2
         Environment: all
            Reporter: Lukas Euler


I sent this report in an Email to security@apache.org on 2017-10-16. I did not receive any kind of response yet (2017-11-18 as of writing). I am now posting it publicly, to open the issue up for discussion, and hopefully initiate a fix.

This report is not about a vulnerability in {{commons-io}} per se, but an unexpected behavior that has a high chance of introducing a path traversal vulnerability when using {{FileNameUtils.normalize}} to sanitize user input. The traversal is limited to a single out-of-bounds-stepping "/../" segment.


h5. Reproduction

{Code}
FileNameUtils.normalize("//../foo");        // returns "//../foo" or "\\\\..\\foo", based on java.io.File.separatorChar
FileNameUtils.normalize("\\\\..\\foo");        // returns "//../foo" or "\\\\..\\foo", based on java.io.File.separatorChar
{Code}

h5. Possible impact (example)

Consider a web-application that uses {{FileNameUtils.normalize}} to sanitize a user-supplied file name string, and then appends the sanitized value to a configured upload directory to store the uploaded content in:

{Code}
String fileName = "//../foo";            // actually user-supplied (e.g. from multipart-POST request)
fileName = FileNameUtils.normalize(fileName);    // still holds the same value ("//../foo")   
           
if (fileName != null) {
    File newFile = new File("/base/uploads", fileName);    // java.io.File treats double forward slashes as singles
                                // newFile now points to "/base/uploads//../foo"
    newFile = newFile.getCanonicalFile();            // newFile now points to "/base/foo", which should be inaccessible

    // Write contents to newFile...
} else {
    // Assume malicious activity, handle error
}
{Code}


h5. Relevant code locations

* {{org.apache.commons.io.FilenameUtils#getPrefixLength}} : everything between a leading "//" and the next "/" is treated as a UNC server name, and ignored in all further validation logic of {{org.apache.commons.io.FilenameUtils#doNormalize}} .


h5. Discussion

One might argue that the given example is a misuse of the {{FileNameUtils.normalize}} method, and that everyone using it should expect absolute paths, full UNC paths, etc. to be returned by the method.
Furthermore, the vulnerability can only occur due to the lax behavior of {{java.io.File}} .

On the other hand, I believe that the JavaDoc of {{FileNameUtils.normalize}} suggests to most readers, that this method is exactly what is needed to sanitize file names:
{noformat}
//-----------------------------------------------------------------------
    /**
     * Normalizes a path, removing double and single dot path steps,
     * and removing any final directory separator.
     * <p>
     * This method normalizes a path to a standard format.
     * The input may contain separators in either Unix or Windows format.
     * The output will contain separators in the format of the system.
     * <p>
     * A trailing slash will be removed.
     * A double slash will be merged to a single slash (but UNC names are handled).
     * A single dot path segment will be removed.
     * A double dot will cause that path segment and the one before to be removed.
     * If the double dot has no parent path segment to work with, {@code null}
     * is returned.
     * <p>
     * The output will be the same on both Unix and Windows except
     * for the separator character.
     * <pre>
     * /foo//               --&gt;   /foo
     * /foo/./              --&gt;   /foo
     * /foo/../bar          --&gt;   /bar
     * /foo/../bar/         --&gt;   /bar
     * /foo/../bar/../baz   --&gt;   /baz
     * //foo//./bar         --&gt;   /foo/bar
     * /../                 --&gt;   null
     * ../foo               --&gt;   null
     * foo/bar/..           --&gt;   foo
     * foo/../../bar        --&gt;   null
     * foo/../bar           --&gt;   bar
     * //server/foo/../bar  --&gt;   //server/bar
     * //server/../bar      --&gt;   null
     * C:\foo\..\bar        --&gt;   C:\bar
     * C:\..\bar            --&gt;   null
     * ~/foo/../bar/        --&gt;   ~/bar
     * ~/../bar             --&gt;   null
     * </pre>
     * (Note the file separator returned will be correct for Windows/Unix)
     *
     * @param filename  the filename to normalize, null returns null
     * @return the normalized filename, or null if invalid. Null bytes inside string will be removed
     */
{noformat}

I have done a quick survey of the usages of the method in public GitHub repositories. I have found numerous projects that suffer from the limited path traversal vulnerability that is described here because of this very issue. This includes Webservers, Web-Frameworks, Archive-Extraction-Software, and others.

Preventing path traversal attacks is not trivial, and many people turn to libraries like {{commons-io}} to avoid making mistakes when implementing parsing logic on their own. They trust that {{FileNameUtils}} will provide them with the most complete, and suitable sanitation logic for file names.
In addition, ".." is not a valid UNC host name according to https://msdn.microsoft.com/de-de/library/gg465305.aspx , so prohibiting it shouldn't result in any major problems.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)