Posted to dev@xalan.apache.org by "Joshua Maurice (JIRA)" <ji...@apache.org> on 2018/03/23 22:49:00 UTC

[jira] [Created] (XALANJ-2613) TransformerIdentityImpl doesn't properly handle file URIs with percent-encoded Unicode characters

Joshua Maurice created XALANJ-2613:
--------------------------------------

             Summary: TransformerIdentityImpl doesn't properly handle file URIs with percent-encoded Unicode characters
                 Key: XALANJ-2613
                 URL: https://issues.apache.org/jira/browse/XALANJ-2613
             Project: XalanJ2
          Issue Type: Bug
      Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects.  Anybody can view the issue.)
          Components: transformation
    Affects Versions: 2.7.2
         Environment: I tested on the following system:

$ cat /etc/centos-release
CentOS Linux release 7.4.1708 (Core)
$ uname -a
Linux jjmdeskvm.informatica.com 3.10.0-693.17.1.el7.x86_64 #1 SMP Thu Jan 25 20:13:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ env | grep -E '^LANG'
LANG=en_US.UTF-8
$ env | grep -E '^LC'
$
            Reporter: Joshua Maurice
            Assignee: Steven J. Hathaway
             Fix For: The Latest Development Code
         Attachments: Repro.java, runtest.sh

When using Xalan's javax.xml.transform.Transformer with a javax.xml.transform.stream.StreamResult constructed from a java.io.File whose path contains non-ASCII Unicode characters, the Transformer creates the output file at the wrong file path.

I have attached a small repro: a short Java file and a short bash script that compiles and runs the test and prints a few relevant environment details.

The cause of the bug is this:

When a StreamResult object is constructed from a File object, it saves a string representation of the URI created from that File. This URI string is properly formatted, which means that the individual path elements of the URI are percent-encoded. The Xalan TransformerImpl class calls getSystemId on the StreamResult to get this URI string, strips off the leading "file://" prefix, and uses the remainder to create a FileOutputStream object. However, the remainder is still percent-encoded, so it is not suitable for passing directly to FileOutputStream. Instead, the code must use a URI utility to parse the URI string and undo the percent-encoding, which yields a string that is suitable for creating a FileOutputStream object.
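
The difference between stripping the prefix and actually decoding the URI can be sketched roughly as follows. This is only an illustration, not the Xalan source: the class and method names (SystemIdToPath, naivePath, decodedFile) are hypothetical, and the hard-coded system ID is an assumed example of a percent-encoded file URI, since the exact string returned by getSystemId may differ between JAXP implementations.

```java
import java.io.File;
import java.net.URI;

final class SystemIdToPath {

    // Mimics the behavior described in the report: drop the "file://"
    // prefix and use whatever is left as a file path. Percent-escapes
    // such as %C3%A9 are left untouched.
    static String naivePath(String systemId) {
        String prefix = "file://";
        return systemId.startsWith(prefix)
                ? systemId.substring(prefix.length())
                : systemId;
    }

    // One possible fix (an assumption, not the committed patch): let
    // java.net.URI parse the string and java.io.File decode the path.
    // File(URI) throws IllegalArgumentException for non-"file" URIs.
    static File decodedFile(String systemId) {
        return new File(URI.create(systemId));
    }

    public static void main(String[] args) {
        // Assumed example: a file named "\u00e9t\u00e9.xml" under /tmp.
        String systemId = "file:///tmp/%C3%A9t%C3%A9.xml";

        // Still contains the literal escapes, so a FileOutputStream
        // opened on this string would create a differently named file.
        System.out.println("naive:   " + naivePath(systemId));

        // The escapes are decoded as UTF-8 back into the original name.
        System.out.println("decoded: " + decodedFile(systemId).getName());
    }
}
```

With the naive variant, the output file name would literally contain the characters "%C3%A9", which is the wrong-path symptom described above.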

When the file path contains only ASCII characters that need no escaping, percent-encoding leaves the path unchanged, which means that the code works for such paths. However, as soon as any non-ASCII Unicode character is part of the file path, the code breaks by writing to the wrong file path.
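
As a rough illustration of when percent-encoding changes a path, the sketch below builds file URIs with java.net.URI. The class and method names (PercentEncodingDemo, toFileUri) and the sample paths are hypothetical. Note that a few ASCII characters, such as a space, are escaped as well.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class PercentEncodingDemo {

    // Builds a "file" URI from an absolute path and returns its
    // ASCII-only form, in which non-ASCII characters are percent-encoded
    // as UTF-8 octets.
    static String toFileUri(String absolutePath) {
        try {
            return new URI("file", null, absolutePath, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Plain ASCII letters and digits pass through unchanged.
        System.out.println(toFileUri("/tmp/report.xml"));
        // Non-ASCII characters become percent-escapes.
        System.out.println(toFileUri("/tmp/\u00e9t\u00e9.xml"));
        // A space is ASCII, but it is escaped too.
        System.out.println(toFileUri("/tmp/my report.xml"));
    }
}
```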

Because the write to the wrong file path may silently succeed, this bug may have security implications.


