Posted to users@tomcat.apache.org by Edward Toro <ed...@RocketSoftware.com> on 2004/03/10 21:37:41 UTC
Tomcat 5.0.19 international filenames inaccessible
Short version:
Does Tomcat 5 no longer serve files with international characters in their filenames?
Long version:
Environment: Tomcat 5.0.19 on WinXP Pro
I have a file located in: <tomcat-home>/<webapps>/MyWebApp/. The filename contains international characters: 0x305f 0x3079 0x304f (a.k.a. E3-81-9F E3-81-B9 E3-81-8F in UTF-8).
When I navigate to the directory via http://<server>:8080/<webappname>/ I get a directory listing of the files in that directory. I can access every file on that list except those that contain international characters.
When I click on a filename that contains international characters, I'm sent to http://<server>:8080/<webappname>%E3%81%9F%E3%81%B9%E3%81%8F.xml. This is the correct result of putting the filename through a URLEncoder with the UTF-8 character set, which is what I assume the server is doing behind the scenes. But the file doesn't appear; I get a 404 error.
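As a quick check of the encoding claim above, here is a minimal sketch that runs the three characters through java.net.URLEncoder with UTF-8. The class and method names (EncodeCheck, encodeName) are made up for illustration, and whether Tomcat builds its directory-listing links this way is an assumption, not something confirmed here. Note that URLEncoder does form encoding (a space becomes "+"), which makes no difference for this filename.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class EncodeCheck {
    // Percent-encode a file name as UTF-8 (hypothetical helper for illustration).
    static String encodeName(String name) {
        try {
            return URLEncoder.encode(name, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // U+305F U+3079 U+304F, the three characters from the filename
        String name = "\u305f\u3079\u304f";
        // prints %E3%81%9F%E3%81%B9%E3%81%8F.xml
        System.out.println(encodeName(name) + ".xml");
    }
}
```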
So I made some Java testing code:
try {
    URL url = new URL("http://<server>:8080/<webapp>/%E3%81%9F%E3%81%B9%E3%81%8F.xml");
    HttpURLConnection conn = (HttpURLConnection)url.openConnection();
    // checking the headers
    String header;
    String key;
    int i = 0;
    while ((header = conn.getHeaderField(i)) != null) {
        key = conn.getHeaderFieldKey(i);
        System.out.println(key + " = " + header);
        i++;
    }
    // checking the content
    InputStream is = url.openConnection().getInputStream();
    InputStreamReader isr = new InputStreamReader(is);
    int chr;
    while ((chr = isr.read()) != -1) {
        System.out.print((char)chr);
    }
    System.out.println("success");
} catch (Throwable t) { t.printStackTrace(); }
The headers I get back are:
HTTP/1.1 404 /<webapp>/%E3%81%9F%E3%81%B9%E3%81%8F.scene.xml
Content-Type = text/html;charset=ISO-8859-1
Content-Language = en-US
Content-Length = 1091
Date = Wed, 10 Mar 2004 18:02:01 GMT
Server = Apache-Coyote/1.1
No help there because I get those same headers when I try to access a file that doesn't exist at all:
HTTP/1.1 404 /<webapp>/inexistent.xml
Content-Type = text/html;charset=ISO-8859-1
Content-Language = en-US
Content-Length = 1040
Date = Wed, 10 Mar 2004 18:03:22 GMT
Server = Apache-Coyote/1.1
When I try to access the input stream to read for content, I get a FileNotFoundException.
I'm pretty confident that this problem does not exist in Tomcat 4.
I'm also pretty confident that this problem is not related to the characters being 3-byte UTF-8. I've tested using 2-byte UTF-8 (D0-9F, D1-80) and the result is the same.
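For reference, a small sketch of how the same escaped bytes decode under two different charsets, using the 2-byte test sequence above. That a charset mismatch on the server side is behind the 404 is only an assumption here, not something established in this thread; the class and method names (DecodeCompare, decode) are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class DecodeCompare {
    // Decode a percent-escaped string with the given charset (hypothetical helper).
    static String decode(String escaped, String charset) {
        try {
            return URLDecoder.decode(escaped, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String escaped = "%D0%9F%D1%80"; // the 2-byte UTF-8 test bytes D0-9F, D1-80

        String asUtf8 = decode(escaped, "UTF-8");        // two characters: U+041F U+0440
        String asLatin1 = decode(escaped, "ISO-8859-1"); // four characters, one per byte

        System.out.println("UTF-8 length:      " + asUtf8.length());
        System.out.println("ISO-8859-1 length: " + asLatin1.length());
        System.out.println("same name?         " + asUtf8.equals(asLatin1));
    }
}
```

If the two decodings disagree, a lookup by the decoded name would miss the file on disk regardless of whether the characters are 2-byte or 3-byte in UTF-8.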
Is this a bug?
-Ed Toro
---------------------------------------------------------------------
To unsubscribe, e-mail: tomcat-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: tomcat-user-help@jakarta.apache.org
RE: Tomcat 5.0.19 international filenames inaccessible
Posted by Yansheng Lin <ya...@silvacom.com>.
Hi, I got the same error when I tried to view
"http://localhost:8080/j2e/jsp/%E5%AE%9B%E5%90%8D.jsp". I've been
researching this for a few days now whenever I've had some free time. The
only approach that seems to have worked for others is setting
D:\java\bin\java.exe -Dfile.encoding=UTF-8 ...
in your catalina config file. But I don't like the solution myself :_(. Not
entirely sure why either :).
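To see whether the -Dfile.encoding setting actually reached the JVM, something like the following sketch can be run with the same java command line. The class name CharsetCheck and its describe() method are made up for illustration; it only reports what the JVM sees and says nothing about how Tomcat itself decodes request URLs.

```java
import java.nio.charset.Charset;

class CharsetCheck {
    // Report the file.encoding system property and the JVM's default charset.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", default charset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // Run as: java -Dfile.encoding=UTF-8 CharsetCheck
        System.out.println(describe());
    }
}
```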
Let us know if you get it working some other way.
Thanks!
-Yan