Posted to notifications@ant.apache.org by "Robin Fernandes (JIRA)" <ji...@apache.org> on 2009/04/08 20:16:13 UTC

[jira] Created: (IVY-1060) ApacheURLLister.retrieveListing() fails if the encoding of the URL list is different from the default encoding

ApacheURLLister.retrieveListing() fails if the encoding of the URL list is different from the default encoding
--------------------------------------------------------------------------------------------------------------

                 Key: IVY-1060
                 URL: https://issues.apache.org/jira/browse/IVY-1060
             Project: Ivy
          Issue Type: Bug
          Components: Core
    Affects Versions: 2.0, 2.1.0, trunk
         Environment: Observed on z/OS
            Reporter: Robin Fernandes


ApacheURLLister.retrieveListing() assumes that the list of URLs is encoded in the same encoding as the system's default encoding.

The problematic code is:
{code}
BufferedReader r = new BufferedReader(new InputStreamReader(URLHandlerRegistry.getDefault().openStream(url)));
String htmlText = FileUtil.readEntirely(r);
{code}

FileUtil.readEntirely() converts the content of the BufferedReader r to a String. Because no encoding is specified in the InputStreamReader constructor, the default encoding is used. If the default encoding does not match the actual encoding of the data read from url, htmlText ends up as a garbage String and the URL pattern matcher fails.

This causes an issue on z/OS, where the default encoding is EBCDIC (e.g. IBM-1047) but the data containing the list of URLs is typically retrieved from the network as ASCII (ISO-8859-1).
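
To illustrate the mismatch outside of Ivy, here is a minimal sketch that decodes the same ASCII bytes once as ISO-8859-1 and once as IBM1047. The class name, the decode() helper and the sample listing fragment are made up for this illustration and are not Ivy code. IBM1047 is not guaranteed to be available on every JVM, so the sketch checks for it first.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Ivy.
class EncodingMismatchDemo {

    /** Decodes raw bytes with the given charset, much as an InputStreamReader would. */
    public static String decode(byte[] data, Charset charset) {
        return new String(data, charset);
    }

    public static void main(String[] args) {
        // A made-up fragment of a directory listing, as ASCII bytes off the network.
        byte[] wire = "<a href=\"ivy-2.0.0/\">ivy-2.0.0/</a>"
                .getBytes(StandardCharsets.ISO_8859_1);

        // Decoding with the encoding the bytes were written in keeps the markup intact.
        String right = decode(wire, StandardCharsets.ISO_8859_1);
        System.out.println("ISO-8859-1 finds href: " + right.contains("href"));

        // Decoding the same bytes as EBCDIC maps them to unrelated characters,
        // so a pattern looking for href has nothing to match.
        if (Charset.isSupported("IBM1047")) {
            String wrong = decode(wire, Charset.forName("IBM1047"));
            System.out.println("IBM1047 finds href: " + wrong.contains("href"));
        } else {
            System.out.println("IBM1047 is not available on this JVM");
        }
    }
}
```

On a JVM whose default encoding is IBM1047, the one-argument InputStreamReader constructor behaves like the second decode() call.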

A workaround could be to specify the system property -Dfile.encoding=ISO-8859-1 on the command line, but this is a bit of a big hammer. In particular, it is not suitable when Ivy is used within an application where we don't want to assume all input is ISO-8859-1.
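
A narrower alternative to the global system property is to pass the encoding to the InputStreamReader that reads the listing. The sketch below only shows that general technique and is not necessarily what the attached patch does. The class name and the readAll() helper are hypothetical, and how the right charset is determined for a given URL is left open here.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Ivy.
class ExplicitCharsetReader {

    /** Reads the whole stream as text, decoding with the caller-supplied charset. */
    public static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        // The two-argument constructor ignores the JVM default encoding.
        try (BufferedReader r = new BufferedReader(new InputStreamReader(in, charset))) {
            char[] buf = new char[4096];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the stream returned by the URL handler (made-up content).
        byte[] wire = "<a href=\"ivy-2.0.0/\">ivy-2.0.0/</a>"
                .getBytes(StandardCharsets.ISO_8859_1);
        String html = readAll(new ByteArrayInputStream(wire), StandardCharsets.ISO_8859_1);
        System.out.println(html);
    }
}
```

Because the charset is an argument, only this one reader is affected and the rest of the application keeps its default encoding.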


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (IVY-1060) ApacheURLLister.retrieveListing() fails if the encoding of the URL list is different from the default encoding

Posted by "Robin Fernandes (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IVY-1060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robin Fernandes updated IVY-1060:
---------------------------------

    Environment: 
OS: z/OS 1.9

java version "1.6.0"
Java(TM) SE Runtime Environment (build pmz3160sr3-20081108_01(SR3))

  was:Observed on z/OS


[jira] Updated: (IVY-1060) ApacheURLLister.retrieveListing() fails if the encoding of the URL list is different from the default encoding

Posted by "Robin Fernandes (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IVY-1060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robin Fernandes updated IVY-1060:
---------------------------------

    Attachment: patch.ivy.URLListerEncoding.diff

I'm attaching a patch which resolves the issue I'm seeing on z/OS and passes the Ivy JUnit tests on Windows, but has not otherwise been tested.
