Posted to issues@maven.apache.org by "Aaron Digulla (JIRA)" <ji...@codehaus.org> on 2011/05/23 15:22:22 UTC

[jira] Created: (DOXIA-431) Doxia creates illegal URLs from local paths

Doxia creates illegal URLs from local paths
-------------------------------------------

                 Key: DOXIA-431
                 URL: http://jira.codehaus.org/browse/DOXIA-431
             Project: Maven Doxia
          Issue Type: Bug
          Components: Core
    Affects Versions: 1.2
            Reporter: Aaron Digulla


If a local resource contains characters which are illegal in a URL, Doxia creates illegal code or crashes.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://jira.codehaus.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (DOXIA-431) Doxia creates illegal URLs from local paths

Posted by "Lukas Theussl (JIRA)" <ji...@codehaus.org>.
    [ http://jira.codehaus.org/browse/DOXIA-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=268412#action_268412 ] 

Lukas Theussl commented on DOXIA-431:
-------------------------------------

I generally agree with your comments. The sanitize methods in URIPathDescriptor (I guess that's what you are referring to) were necessary to work around some backward-compatibility issues I encountered when rewriting the deprecated PathDescriptor class. Note also the [comment in the relativizeLink|http://maven.apache.org/doxia/doxia-sitetools/doxia-decoration-model/xref/org/apache/maven/doxia/site/decoration/inheritance/DefaultDecorationModelInheritanceAssembler.html#375] method of DefaultDecorationModelInheritanceAssembler.

One thing I can point out is the javadoc in the Sink API for [figureGraphics|http://maven.apache.org/doxia/doxia/doxia-sink-api/apidocs/org/apache/maven/doxia/sink/Sink.html#figureGraphics(java.lang.String, org.apache.maven.doxia.sink.SinkEventAttributes)], which states that the image src parameter has to be a valid URL before it is emitted into the Sink. That is consistent with your remark that data should be validated on the input side, i.e. by the Parser.

Otherwise, I think a concrete test example would help me to work on this, as I still don't know where your figure is referenced from.


[jira] Commented: (DOXIA-431) Doxia creates illegal URLs from local paths

Posted by "Aaron Digulla (JIRA)" <ji...@codehaus.org>.
    [ http://jira.codehaus.org/browse/DOXIA-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=268177#action_268177 ] 

Aaron Digulla commented on DOXIA-431:
-------------------------------------

In an external project, there are image files whose names contain spaces. Instead of replacing the spaces with %20 or calling {{java.net.URLEncoder.encode()}}, Doxia passes the raw string to {{java.net.URI.create("images/The ExTeX Project.png")}}, which fails.

I tried a fix but couldn't get it to work in a couple of hours. The problem is that you use a lot of Strings when you should be using URLs (or at least a URL-like type). Without such a type, it's impossible to know when a URL must be encoded/decoded.

Example stacktrace:

{code}
Caused by: java.lang.IllegalArgumentException
        at java.net.URI.create(URI.java:842)
        at org.apache.maven.doxia.site.decoration.inheritance.URIPathDescriptor.<init>(URIPathDescriptor.java:69)
        at org.apache.maven.doxia.site.decoration.inheritance.DefaultDecorationModelInheritanceAssembler.rebaseLink(DefaultDecorationModelInheritanceAssembler.java:361)
        at org.apache.maven.doxia.site.decoration.inheritance.DefaultDecorationModelInheritanceAssembler.rebaseBannerPaths(DefaultDecorationModelInheritanceAssembler.java:162)
        at org.apache.maven.doxia.site.decoration.inheritance.DefaultDecorationModelInheritanceAssembler.assembleModelInheritance(DefaultDecorationModelInheritanceAssembler.java:61)
        at org.apache.maven.doxia.tools.DefaultSiteTool.getDecorationModel(DefaultSiteTool.java:1221)
        at org.apache.maven.doxia.tools.DefaultSiteTool.getDecorationModel(DefaultSiteTool.java:458)
        at org.apache.maven.plugins.site.AbstractSiteRenderingMojo.createSiteRenderingContext(AbstractSiteRenderingMojo.java:285)
        at org.apache.maven.plugins.site.SiteMojo.renderLocale(SiteMojo.java:140)
        at org.apache.maven.plugins.site.SiteMojo.execute(SiteMojo.java:124)
        at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:107)
        ... 20 more
Caused by: java.net.URISyntaxException: Illegal character in path at index 10: images/The ExTeX Project.png
        at java.net.URI$Parser.fail(URI.java:2809)
        at java.net.URI$Parser.checkChars(URI.java:2982)
        at java.net.URI$Parser.parseHierarchical(URI.java:3066)
        at java.net.URI$Parser.parse(URI.java:3024)
        at java.net.URI.<init>(URI.java:578)
        at java.net.URI.create(URI.java:840)
        ... 30 more
{code}
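
A minimal sketch of the failure and of one possible way to encode such a path, assuming the raw value is a relative path with '/' separators. The class and method names ({{EncodePathDemo}}, {{encodeRelativePath}}) are hypothetical and not part of Doxia; only {{java.net.URI}} is used. Note that {{java.net.URLEncoder}} targets form encoding and turns a space into '+', so the sketch relies on the multi-argument {{URI}} constructor, which percent-encodes illegal path characters.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class EncodePathDemo {

    /** Hypothetical helper (not Doxia API): percent-encodes a raw relative path. */
    static URI encodeRelativePath(String rawPath) {
        try {
            // The multi-argument constructor quotes characters that are illegal in a path.
            return new URI(null, null, rawPath, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot encode path: " + rawPath, e);
        }
    }

    public static void main(String[] args) {
        String raw = "images/The ExTeX Project.png";

        try {
            URI.create(raw); // same call as in the stack trace above
        } catch (IllegalArgumentException e) {
            System.out.println("URI.create rejected the raw path: " + e.getMessage());
        }

        URI encoded = encodeRelativePath(raw);
        System.out.println(encoded.toASCIIString()); // images/The%20ExTeX%20Project.png
        System.out.println(encoded.getPath());       // images/The ExTeX Project.png
    }
}
```

The encoded form is what belongs in generated HTML; {{getPath()}} returns the decoded form needed to locate the file on disk.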





[jira] Commented: (DOXIA-431) Doxia creates illegal URLs from local paths

Posted by "Aaron Digulla (JIRA)" <ji...@codehaus.org>.
    [ http://jira.codehaus.org/browse/DOXIA-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=268284#action_268284 ] 

Aaron Digulla commented on DOXIA-431:
-------------------------------------

> Doxia 1.2 is not used...

I had problems with the site plugin 3.0-beta-3, so I tried beta-4-SNAPSHOT.

> Is this a regression?

Probably. My guess is that recent code changes unveiled a whole set of errors.

> where does this image come from...

The image comes from the folder {{src/site/resources/images/}}

I'm not the maintainer of the project, so I have no idea how Doxia includes the image. All I have is the error message and the filename. I can't see any reference to the image in site.xml, so it must be included from somewhere else, probably the skin.

> Is it documented somewhere that image/file references...?

A URL can contain only some characters. See http://www.blooberry.com/indexdot/html/topics/urlencoding.htm for a pretty good explanation.

File names on Unix can contain any character except "/" (slash) and the NUL byte.

So if you accept Unix file names anywhere in Doxia, you must escape them as soon as they are converted to URLs and you must unescape them when they are converted back to file names.

My suggestion is a new type that can represent both forms, with accessor methods that return either an OS-specific path or an RFC-compliant URL. Replace String with this type as early as you can, so that there are no gaps in the chain where the encoding state is unknown.
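
A rough sketch of what such a type could look like. {{ResourceRef}} and all of its methods are hypothetical names used for illustration only, not existing Doxia classes; the sketch assumes relative references with '/' separators and uses only {{java.net.URI}} and {{java.io.File}}.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical value type: one resource reference, readable as a path or as an encoded URL. */
final class ResourceRef {

    private final URI uri; // always stored in encoded form

    private ResourceRef(URI uri) {
        this.uri = uri;
    }

    /** Creates a reference from a raw, unescaped relative path such as a file name. */
    static ResourceRef fromPath(String rawPath) {
        try {
            // The multi-argument constructor percent-encodes illegal path characters.
            return new ResourceRef(new URI(null, null, rawPath, null));
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot represent as URI: " + rawPath, e);
        }
    }

    /** Creates a reference from an already encoded URI reference; illegal input is rejected. */
    static ResourceRef fromEncoded(String encoded) {
        return new ResourceRef(URI.create(encoded));
    }

    /** Decoded path with '/' separators. */
    String toPath() {
        return uri.getPath();
    }

    /** Decoded path resolved against a base directory, for file system access. */
    File toFile(File baseDir) {
        return new File(baseDir, uri.getPath());
    }

    /** Encoded form, suitable for href and src attributes. */
    String toUrlString() {
        return uri.toASCIIString();
    }

    public static void main(String[] args) {
        ResourceRef ref = ResourceRef.fromPath("images/The ExTeX Project.png");
        System.out.println(ref.toUrlString()); // images/The%20ExTeX%20Project.png
        System.out.println(ref.toPath());      // images/The ExTeX Project.png
    }
}
```

Because callers can only obtain a string through {{toPath()}} or {{toUrlString()}}, the encoding state is explicit at every use. One known limitation of this sketch: a relative path whose first segment contains a colon would be parsed as a scheme and needs separate handling.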




[jira] Commented: (DOXIA-431) Doxia creates illegal URLs from local paths

Posted by "Lukas Theussl (JIRA)" <ji...@codehaus.org>.
    [ http://jira.codehaus.org/browse/DOXIA-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=268215#action_268215 ] 

Lukas Theussl commented on DOXIA-431:
-------------------------------------

Can you be more specific: where does this image come from, an apt/xdoc source file or site.xml? Doxia 1.2 is not used yet in any site release, so I assume you are using a snapshot? Is this a regression then? Is it documented somewhere that image/file references may contain spaces (just for my education; I don't think, e.g., that the apt reference is sufficiently precise in many respects)?


[jira] Commented: (DOXIA-431) Doxia creates illegal URLs from local paths

Posted by "Lukas Theussl (JIRA)" <ji...@codehaus.org>.
    [ http://jira.codehaus.org/browse/DOXIA-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=268294#action_268294 ] 

Lukas Theussl commented on DOXIA-431:
-------------------------------------

Yeah, I'm somewhat familiar with url encoding :)

I meant whether it is documented somewhere within maven/doxia whether file references have to be URLs, e.g. whether you are allowed to write the following in an apt source file

{noformat}
[The ExTeX Project.png] Figure caption
{noformat}

or in a site.xml

{noformat}
<logo name=.. href=.. img="The ExTeX Project.png" />
{noformat}

I guess the answer is yes, but I'm just wondering if there is anything in the docs.


[jira] Commented: (DOXIA-431) Doxia creates illegal URLs from local paths

Posted by "Aaron Digulla (JIRA)" <ji...@codehaus.org>.
    [ http://jira.codehaus.org/browse/DOXIA-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=268325#action_268325 ] 

Aaron Digulla commented on DOXIA-431:
-------------------------------------

I have no idea. But from my experience, I'd say that those URLs should already be encoded. I mean "Image[1].png" is a valid Unix filename. If you want to use that as a caption, you need escaping.

So maybe the solution is to reject strings which contain invalid characters close to the input side.

But I saw that you have sanitize methods in some URL helper class in Doxia. That led me to think that you want to do it there, and I don't believe this will work. Data must be sanitized and validated at the outer interface, not deep in the code.
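
A minimal sketch of such a check at the input side, assuming references are expected to arrive already encoded. {{LinkInputValidator}} and {{requireValidReference}} are hypothetical names, not Doxia API; the check itself only uses {{java.net.URI}} and {{java.net.URISyntaxException}}.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class LinkInputValidator {

    /**
     * Hypothetical boundary check: returns the parsed reference, or fails with a
     * message that names the source and the position of the offending character.
     */
    static URI requireValidReference(String value, String source) {
        try {
            return new URI(value);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(
                source + ": '" + value + "' is not a valid URI reference ("
                    + e.getReason() + " at index " + e.getIndex()
                    + "); percent-encode it, for example %20 for a space", e);
        }
    }

    public static void main(String[] args) {
        // An encoded reference passes and can be decoded later when the file is needed.
        URI ok = requireValidReference("images/The%20ExTeX%20Project.png", "site.xml");
        System.out.println(ok.getPath()); // images/The ExTeX Project.png

        // A raw file name with spaces is rejected before it reaches deeper code.
        try {
            requireValidReference("images/The ExTeX Project.png", "site.xml");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing here gives the user the file and the character position, instead of a bare {{IllegalArgumentException}} from deep inside the inheritance assembler.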
