Posted to dev@forrest.apache.org by Ross Gardler <rg...@apache.org> on 2005/06/29 19:04:08 UTC
Converting HTML to XDoc
I've been struggling for a couple of days with this. I wonder if someone
can help.
I need to convert an HTML document to XDoc (or XHTML2). I'm using the
html2document.xsl in our SVN as a starting point but am thinking that it
may be a dead end.
The problem is that the html2document.xsl stylesheet assumes that the
HTML document has been authored in a structured way, that is, <h2>
always follows <h1>, <h3> always follows <h2>, and so on, with no
heading levels skipped. Unfortunately that is not the case in many of
the documents I have to work with.
Does anyone know of a stylesheet (or other means) that will do the job?
I know I can do it with a custom generator, but I thought I'd ask if
there is another solution first.
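To make the requirement concrete, here is a minimal sketch in Python of the kind of nesting a non-XSLT approach would need to do. It is not taken from Forrest or from html2document.xsl. The names SectionNester and html_to_sections are hypothetical, and the output (nested <section> elements, each with a <title>, under a <body>) is only an assumed approximation of XDoc. It keeps a stack of open sections so that a skipped level, such as an <h3> directly after an <h1>, still attaches to the nearest open parent.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

HEADINGS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}


class SectionNester(HTMLParser):
    """Turn flat <h1>..<h6> and <p> markup into nested <section> elements."""

    def __init__(self):
        super().__init__()
        self.body = ET.Element("body")
        # Stack of (heading level, element); level 0 is the <body> root.
        self.stack = [(0, self.body)]
        self.target = None  # element currently collecting character data

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            level = HEADINGS[tag]
            # Close every open section at the same or a deeper level.
            while self.stack[-1][0] >= level:
                self.stack.pop()
            # Attach to whatever is still open, even if levels were skipped.
            section = ET.SubElement(self.stack[-1][1], "section")
            self.stack.append((level, section))
            self.target = ET.SubElement(section, "title")
        elif tag == "p":
            self.target = ET.SubElement(self.stack[-1][1], "p")

    def handle_endtag(self, tag):
        if tag in HEADINGS or tag == "p":
            self.target = None

    def handle_data(self, data):
        # Text outside headings and paragraphs is dropped in this sketch.
        if self.target is not None:
            self.target.text = (self.target.text or "") + data


def html_to_sections(html):
    """Parse an HTML string and return a <body> element of nested sections."""
    parser = SectionNester()
    parser.feed(html)
    parser.close()
    return parser.body


if __name__ == "__main__":
    sample = "<h1>Intro</h1><p>Text</p><h3>Detail</h3><h2>Usage</h2>"
    print(ET.tostring(html_to_sections(sample), encoding="unicode"))
```

In the sample, "Detail" (an <h3> with no <h2> before it) and "Usage" both end up as direct subsections of "Intro". Inline markup, lists, tables and text outside <p> are ignored here, so a real converter would need considerably more than this.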
Ross