Posted to dev@ant.apache.org by Antoine Levy-Lambert <an...@gmx.de> on 2005/12/07 09:36:19 UTC

URI encoding ...

I have checked in some code lately related to URI encoding and decoding.
This was prompted by two bug reports which point to the same underlying
problem.
One bug report was about a jar file failing to load properly when its
manifest has a Class-Path attribute pointing to another jar file in a
subdirectory and the main jar file is in a directory whose name contains
spaces.

The other bug report was about entities not being found when <xslt/>
processes an XML document located in a non-ASCII path and referring to
entities in the same directory.

Now I am testing manifestclasspath. I tested it with änt (a-umlaut nt,
or &#228;), and this works. With a directory bearing a Hebrew name (iom
= &#1501;&#1493;&#1497;), however, it does not work.

I can send a tar file with my test material if anyone is interested. I
am wondering whether our encoding/decoding routines are wrong, whether
this is a problem specific to languages written right to left, or
whether it could even be some kind of JDK bug ...

Cheers,

Antoine

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ant.apache.org
For additional commands, e-mail: dev-help@ant.apache.org


Re: URI encoding ...

Posted by Antoine Levy-Lambert <an...@gmx.de>.
Jesse Glick wrote:

> Antoine Levy-Lambert wrote:
>
>> I have checked in some code lately related to URI encoding and decoding.
>
>
> BTW should use java.net.URI and File.toURI and new File(URI) when
> running on JDK 1.4+. I have an old outstanding bug for this, needs to
> be reexamined:
>
> http://issues.apache.org/bugzilla/show_bug.cgi?id=8031

Will look at this. By the way, methods to encode/decode URIs are
available in JDK 1.5; for instance URLEncoder.encode(s, "UTF-8") does
roughly the same as org.apache.tools.ant.launch.Locator#encodeUri.
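
One caveat when comparing the JDK routines: URLEncoder produces form
encoding (application/x-www-form-urlencoded), which is not identical to
URI path escaping. The sketch below only contrasts the two JDK calls; the
class and method names are made up for illustration, and whether
Locator#encodeUri matches either output is not checked here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

class EncoderComparison {
    /** Form-style encoding: spaces become '+', '/' becomes %2F. */
    static String formEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    /** URI path escaping: spaces become %20, '/' is kept as a separator. */
    static String pathEncode(String s) {
        try {
            // The multi-argument constructor quotes illegal characters;
            // toASCIIString() also escapes non-ASCII characters as UTF-8.
            return new URI(null, null, s, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String path = "some other/lib.jar";
        System.out.println(formEncode(path)); // some+other%2Flib.jar
        System.out.println(pathEncode(path)); // some%20other/lib.jar
    }
}
```

For a Class-Path entry or a file: URL, the second form is the one that
keeps the path separators intact.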

>
>> One bug report was about a jar file failing to load properly when its
>> manifest has a Class-Path attribute pointing to another jar file in a
>> subdirectory and the main jar file is in a directory whose name contains
>> spaces.
>
>
> Note that a relative path in Class-Path is really a URI and must be
> encoded. E.g.
>
> Class-Path: some%20other%20lib.jar

Yes; the % escaping of the class path is now done in the new
manifestclasspath task.
AntClassloader

>
>> The other bug report was about entities not being found when <xslt/>
>> processes an XML document located in a non-ASCII path and referring to
>> entities in the same directory.
>
>
> May be completely unrelated, but I recently found a JDK bug sounding
> very similar to this:
>
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6341770

This bug is fixed in the head revision of javax.xml.parsers.SAXParser:
http://cvs.apache.org/viewcvs.cgi/xml-commons/java/external/src/javax/xml/parsers/SAXParser.java?rev=1.7&view=log
We just have to light a candle, hoping that the new version of SAXParser
will be in the next JDK 1.5 releases shipped by Sun.

>
>> Now I am testing manifestclasspath. I tested it with änt (a-umlaut nt,
>> or &#228;), and this works. With a directory bearing a Hebrew name (iom
>> = &#1501;&#1493;&#1497;), however, it does not work.
>
>
> Just a guess: maybe you forgot to encode in UTF-8 before encoding to
> %xx octets? That could cause problems for non-ISO-8859-1 text.

No, the routines in Locator now do this UTF-8 encoding. I do not know
yet what is wrong. I am going to test this on Solaris to see whether
this problem could be a Windows-JVM bug.

>
> -J.
>
Cheers,

Antoine



Re: URI encoding ...

Posted by Jesse Glick <je...@sun.com>.
Antoine Levy-Lambert wrote:
> I have checked in some code lately related to URI encoding and decoding.

BTW should use java.net.URI and File.toURI and new File(URI) when 
running on JDK 1.4+. I have an old outstanding bug for this, needs to be 
reexamined:

http://issues.apache.org/bugzilla/show_bug.cgi?id=8031
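
A minimal sketch of the round trip suggested above, using only
File.toURI() and new File(URI) so that the JDK does the escaping and
unescaping. The class name, method name, and sample paths are
illustrative assumptions, not code from Ant.

```java
import java.io.File;
import java.net.URI;

class FileUriRoundTrip {
    /** Converts a file to a file: URI and back again. */
    static File roundTrip(File file) {
        URI uri = file.toURI(); // absolute URI, spaces escaped as %20
        return new File(uri);   // escapes are decoded again
    }

    public static void main(String[] args) {
        File original =
            new File("dir with spaces", "some other lib.jar").getAbsoluteFile();
        // The raw path shows the escaping applied by toURI().
        System.out.println(original.toURI().getRawPath());
        // The round trip should give back the same absolute path.
        System.out.println(roundTrip(original).equals(original));
    }
}
```

Note that new File(URI) only accepts absolute, hierarchical file: URIs,
so relative entries still need separate handling.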

> One bug report was about a jar file failing to load properly when its
> manifest has a Class-Path attribute pointing to another jar file in a
> subdirectory and the main jar file is in a directory whose name contains
> spaces.

Note that a relative path in Class-Path is really a URI and must be 
encoded. E.g.

Class-Path: some%20other%20lib.jar
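
One way such an entry could be produced with java.net.URI is to
relativize the library's URI against the directory holding the main jar.
This is only a sketch under that assumption; the names are hypothetical
and this is not necessarily how the manifestclasspath task does it.

```java
import java.io.File;
import java.net.URI;

class ManifestEntry {
    /**
     * Returns the %-escaped relative URI of lib as seen from jarDir.
     * URI.relativize never produces ".." segments, so lib has to sit
     * below jarDir; otherwise the absolute URI comes back unchanged.
     */
    static String relativeEntry(File jarDir, File lib) {
        URI base = jarDir.toURI();
        return base.relativize(lib.toURI()).toASCIIString();
    }

    public static void main(String[] args) {
        File jarDir = new File("dist dir").getAbsoluteFile();
        File lib = new File(jarDir, "lib/some other lib.jar");
        // lib/some%20other%20lib.jar
        System.out.println("Class-Path: " + relativeEntry(jarDir, lib));
    }
}
```

The spaces in the parent directory name drop out of the result because
both URIs share that prefix; only the relative part ends up escaped.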

> The other bug report was about entities not being found when <xslt/>
> processes an XML document located in a non-ASCII path and referring to
> entities in the same directory.

May be completely unrelated, but I recently found a JDK bug sounding 
very similar to this:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6341770

> Now I am testing manifestclasspath. I tested it with änt (a-umlaut nt,
> or &#228;), and this works. With a directory bearing a Hebrew name (iom
> = &#1501;&#1493;&#1497;), however, it does not work.

Just a guess: maybe you forgot to encode in UTF-8 before encoding to %xx 
octets? That could cause problems for non-ISO-8859-1 text.
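
To illustrate the guess: the sketch below percent-encodes the bytes of a
string in a chosen charset. It is a simplified stand-in, not Ant's
Locator code; the class name is made up, and it escapes everything
outside the RFC 3986 unreserved set, including '/'.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class PercentEncoder {
    /** Encodes text to bytes in the given charset, then to %XX octets. */
    static String encode(String text, Charset charset) {
        StringBuilder out = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            int octet = b & 0xFF;
            boolean unreserved =
                (octet >= 'a' && octet <= 'z')
                || (octet >= 'A' && octet <= 'Z')
                || (octet >= '0' && octet <= '9')
                || "-._~".indexOf(octet) >= 0;
            if (unreserved) {
                out.append((char) octet);
            } else {
                out.append('%').append(String.format("%02X", octet));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String umlaut = "\u00e4nt";            // a-umlaut followed by "nt"
        String hebrew = "\u05d9\u05d5\u05dd";  // yod, vav, final mem
        System.out.println(encode(umlaut, StandardCharsets.UTF_8));      // %C3%A4nt
        System.out.println(encode(umlaut, StandardCharsets.ISO_8859_1)); // %E4nt
        System.out.println(encode(hebrew, StandardCharsets.UTF_8));      // %D7%99%D7%95%D7%9D
        System.out.println(encode(hebrew, StandardCharsets.ISO_8859_1)); // %3F%3F%3F
    }
}
```

With ISO-8859-1 the umlaut still maps to a single byte, while the Hebrew
letters are unmappable and collapse to '?', which would fit a case where
the umlaut directory works and the Hebrew one does not.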

-J.

-- 
jesse.glick@sun.com  x22801  netbeans.org  ant.apache.org

