Posted to dev@forrest.apache.org by "David Crossley (JIRA)" <ji...@apache.org> on 2006/08/25 09:48:22 UTC

[jira] Closed: (FOR-448) Faulty treatment of a-Elements in html-pipeline

     [ http://issues.apache.org/jira/browse/FOR-448?page=all ]

David Crossley closed FOR-448.
------------------------------

    Fix Version/s: 0.8-dev
       Resolution: Fixed

Thanks for your help, Jim. I fixed it in a different way. The HTML 4 specification does say that @name and @href are okay as simultaneous attributes. Also, I don't know why this template was removing other attributes such as @title and @target, which some people want to use. So I simply used "xsl:copy-of" to copy all the attributes.
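For illustration only, here is a minimal sketch of what a template using "xsl:copy-of" in this way might look like. It is not the actual content of html-to-document.xsl: the match pattern, the output element name, and the absence of namespace handling are all assumptions made for the example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical template, assuming the input <a> elements are in
       no namespace and the output format also uses an <a> element. -->
  <xsl:template match="a">
    <a>
      <!-- Copy every attribute of the source element unchanged
           (name, href, title, target, ...). This must come before
           any child content is written to the result element. -->
      <xsl:copy-of select="@*"/>
      <!-- Process the children so the text inside the anchor is kept. -->
      <xsl:apply-templates/>
    </a>
  </xsl:template>

</xsl:stylesheet>
```

With a template of this shape, an element carrying both @name and @href is passed through as a single element, and no @id is generated from @name.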

As you suggested, I removed the automated generation of @id attributes from @name attributes. The HTML 4 spec indicates that this can lead to invalid IDs. Doing some research into the history of html-to-document.xsl, I see that this has been there since the beginning. I have no idea why the original author thought that it was necessary. We can add it back if people think it is necessary.

> Faulty treatment of a-Elements in html-pipeline
> -----------------------------------------------
>
>                 Key: FOR-448
>                 URL: http://issues.apache.org/jira/browse/FOR-448
>             Project: Forrest
>          Issue Type: Bug
>          Components: Core operations
>    Affects Versions: 0.7, 0.8-dev
>         Environment: Windows XP SP2
>            Reporter: Ferdinand Soethe
>             Fix For: 0.8-dev
>
>         Attachments: anchorerrortestfiles.zip, html-to-document.xml.diff
>
>
> After noticing that anchor elements in HTML files got lost in the Forrest default pipeline, I did some tests with a sample document (before and after are included) and found that named anchors either get completely lost or are messed up pretty badly. Even text within them is sometimes lost.
> The line numbers refer to the original and translated files.
> Original   Translated   Looks   Function
>   line        line
> ------------------------------------------
>    16          157        ok      gone
>    <a> element is completely lost
>
>    22          162        bad     ok
>    there are now 2 <a> elements
>    <a name="anchor2"></a>Anchor 2<a href="#anchor1">Anchor 2</a>
>    and unfortunately twice the text!
>
>    29          166        ok      gone
>    <a> element is completely lost
>
>    35          171        bad     gone
>    <a> element and text within it is completely lost!
>
>    42          176        ok      gone
>    <a> element is completely lost
>
>    49          181        ok      gone
>    <a> element is completely lost

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Re: [jira] Closed: (FOR-448) Faulty treatment of a-Elements in html-pipeline

Posted by Jim Dixon <jd...@dixons.org>.
On Fri, 25 Aug 2006, David Crossley (JIRA) wrote:

>      [ http://issues.apache.org/jira/browse/FOR-448?page=all ]
>
> David Crossley closed FOR-448.
> ------------------------------
>
>     Fix Version/s: 0.8-dev
>        Resolution: Fixed
>
> Thanks for your help, Jim. I fixed it in a different way. The HTML 4
> specification does say that @name and @href are okay as simultaneous
> attributes. Also, I don't know why this template was removing other
> attributes such as @title and @target, which some people want to use. So
> I simply used "xsl:copy-of" to copy all the attributes.

I did a spot-check using a randomly selected attribute and found to my
surprise that it actually got through, so this was the correct solution.

The bad template that was used in html-to-document.xsl also appears in

  forrest/main/webapp/resources/stylesheets/any-to-document.xsl
  forrest/etc/test-whitespace/html2document.xsl

and so the same fix needs to be applied to both.  This is now FOR-923.

--
Jim Dixon  jdd@dixons.org   tel +44 117 982 0786  mobile +44 797 373 7881