Posted to users@cocoon.apache.org by anish sheth <an...@hotmail.com> on 2000/04/28 19:26:13 UTC

HTML to XHTML

I know this is a vast question, but I am hoping to get help
from you experts.

What is involved in converting HTML to XML?

Thanks in advance.


Re: HTML to XHTML

Posted by "K.C. Jones" <kj...@phoenix-pop.com>.
Beware of relying too heavily on Tidy for HTML->XHTML
conversion.  It does the simple things well, such as
attribute case conversion, quoting attribute values, and
turning "<br>" into "<br/>", but don't be deceived.  In my
limited testing I found some gaping holes in the conversion.

In particular:
- It won't handle JavaScript code correctly
- It brutalizes overlapped tags
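
For anyone unsure what "overlapped" means here: markup such as
<b><i>text</b></i>, where an element is closed before the one
opened inside it.  Below is a rough sketch of how such spots could
be located in the source HTML.  It is not part of Tidy; the names
OverlapFinder and find_overlaps are made up for illustration, and
the void-element list is not exhaustive.  It will also flag legal
HTML that merely omits optional end tags (for example an unclosed
<li>), so treat its output as a list of places to look at, not as
a list of errors.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML 4 (not exhaustive).
VOID = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}


class OverlapFinder(HTMLParser):
    """Records closing tags that do not match the innermost open element."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open elements, innermost last
        self.overlaps = []   # (still_open_tag, closed_tag, line_number)

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID or tag not in self.stack:
            return  # stray or irrelevant closing tag; ignored in this sketch
        if self.stack[-1] != tag:
            self.overlaps.append((self.stack[-1], tag, self.getpos()[0]))
        # Remove the most recently opened element with this name.
        index = len(self.stack) - 1 - self.stack[::-1].index(tag)
        del self.stack[index]


def find_overlaps(html):
    finder = OverlapFinder()
    finder.feed(html)
    finder.close()
    return finder.overlaps


if __name__ == "__main__":
    for still_open, closed, line in find_overlaps("<b><i>text</b></i>"):
        print(f"line {line}: </{closed}> closed while <{still_open}> was still open")
```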

The script handling could be done as a second pass filter.
But the overlapped tags would have to be fixed before Tidy
mangles them.
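
One plausible shape for such a second-pass script filter is
sketched below.  It assumes the trouble is script bodies
containing raw "<" and "&", which are not well-formed XML, and it
wraps each inline script body in a CDATA section.  The "//"
prefixes are the common trick for keeping the CDATA markers
harmless to browsers that read the page as HTML.  The function
name wrap_scripts_in_cdata is hypothetical, the regex approach is
deliberately naive, and a script body that itself contains "]]>"
would still break.

```python
import re

# Matches an opening <script ...> tag (but not a self-closed <script ... />),
# its body, and the closing tag.
SCRIPT_RE = re.compile(
    r"(<script\b[^>]*(?<!/)>)(.*?)(</script\s*>)",
    re.IGNORECASE | re.DOTALL,
)


def wrap_scripts_in_cdata(markup):
    """Wrap each non-empty inline script body in a commented CDATA section."""

    def replace(match):
        open_tag, body, close_tag = match.groups()
        # Leave external scripts (empty body) and already-wrapped bodies alone.
        if not body.strip() or "<![CDATA[" in body:
            return match.group(0)
        return f"{open_tag}\n//<![CDATA[\n{body.strip()}\n//]]>\n{close_tag}"

    return SCRIPT_RE.sub(replace, markup)


if __name__ == "__main__":
    page = '<div><script type="text/javascript">if (a < b && c) { go(); }</script></div>'
    print(wrap_scripts_in_cdata(page))
```

Running the filter twice should leave the output unchanged, since
bodies that already contain a CDATA marker are skipped.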

I found those problems quickly based on spot testing with
some HTML I had lying around.  I don't know if anyone has
done any comprehensive testing of the conversion.  Does
anyone know of any FAQs or other documentation of Tidy's
XHTML conversion tool?

Just in case you haven't seen this, consult:
http://www.w3.org/TR/xhtml1/#diffs
when evaluating your XHTML conversion needs.
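
Since the first requirement listed there is that documents be
well-formed XML, a cheap spot test is to run whatever the
converter produces through an XML parser.  A minimal sketch using
Python's standard library follows; the helper name
well_formedness_error is made up.  Note that it checks
well-formedness only, not validity against the XHTML DTD, and
that named entities such as &nbsp; are likely to be reported as
undefined because no DTD is loaded.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(markup):
    """Return None if markup parses as XML, otherwise the parser's message."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    samples = [
        "<p>line one<br/>line two</p>",   # empty element closed
        "<p>line one<br>line two</p>",    # HTML-style <br>
        "<p><b><i>text</b></i></p>",      # overlapped tags
        "<p><input checked/></p>",        # minimized attribute
    ]
    for sample in samples:
        print(sample, "->", well_formedness_error(sample) or "well-formed")
```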

Cheers,
KC Jones

Re: HTML to XHTML

Posted by Anupam Bagchi <ab...@taz.hyperreal.org>.
I think HTML-Tidy should be all you need. You can find it easily by
searching the W3C site.

- Anupam Bagchi


RE: HTML to XHTML

Posted by Mike Wagner <mw...@icanonline.net>.
To convert HTML to XHTML, try Tidy:
http://www.w3.org/People/Raggett/tidy/

> -----Original Message-----
> From: Paul Russell [mailto:paulr@itcsajs5.cpc.uea.ac.uk]On Behalf Of
> Paul Russell
> Sent: Saturday, April 29, 2000 2:09 PM
> To: cocoon-users@xml.apache.org
> Subject: Re: HTML to XHTML
> 
> 
> On Fri, Apr 28, 2000 at 12:26:13PM -0500, anish sheth wrote:
> > What is involved in converting HTML to XML?
> 
> This really depends what you mean by 'convert'. HTML is a
> *presentation* markup language whereas XML is an *information*
> markup language. If you simply mean converting HTML to XHTML,
> which is an XML implementation of HTML, then it's not too bad,
> and you could probably write a shortish perl script to do it
> for you. If you mean moving over to the XML way of thinking
> then that's a much larger task since you're not simply talking
> about a change of implementation, you're talking about a
> totally new methodology (mark up the information, and then
> work out how you want it to look and write a stylesheet to
> achieve that goal).
> 
> All in all, a toughie.
> 
> -- 
> Paul Russell                               <pa...@luminas.co.uk>
> Technical Director,                   http://www.luminas.co.uk
> Luminas Ltd.

Re: HTML to XHTML

Posted by Donald Ball <ba...@webslingerZ.com>.
On Sat, 29 Apr 2000, Paul Russell wrote:

> On Fri, Apr 28, 2000 at 12:26:13PM -0500, anish sheth wrote:
> > What is involved in converting HTML to XML?
> 
> This really depends what you mean by 'convert'. HTML is a
> *presentation* markup language whereas XML is an *information*
> markup language. If you simply mean converting HTML to XHTML,
> which is an XML implementation of HTML, then it's not too bad,
> and you could probably write a shortish perl script to do it
> for you. If you mean moving over to the XML way of thinking

tidy converts HTML to XHTML.

- donald


Re: HTML to XHTML

Posted by Paul Russell <pa...@luminas.co.uk>.
On Fri, Apr 28, 2000 at 12:26:13PM -0500, anish sheth wrote:
> What is involved in converting HTML to XML?

This really depends what you mean by 'convert'. HTML is a
*presentation* markup language whereas XML is an *information*
markup language. If you simply mean converting HTML to XHTML,
which is an XML implementation of HTML, then it's not too bad,
and you could probably write a shortish perl script to do it
for you. If you mean moving over to the XML way of thinking
then that's a much larger task since you're not simply talking
about a change of implementation, you're talking about a
totally new methodology (mark up the information, and then
work out how you want it to look and write a stylesheet to
achieve that goal).
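
To give a feel for the "shortish script" route, here is a rough
sketch in Python rather than Perl.  It covers only the mechanical
rewrites: lower-casing element and attribute names, quoting and
expanding attribute values, closing empty elements, and escaping
text.  The names XhtmlWriter and to_xhtml_fragment are made up,
the void-element list is not exhaustive, and comments, doctypes
and processing instructions are simply dropped.  It does nothing
about badly nested markup, which is where the real work lies.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Elements that never take a closing tag in HTML 4 (not exhaustive).
VOID = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}


class XhtmlWriter(HTMLParser):
    """Re-serializes parsed HTML using XML syntax rules."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser has already lower-cased tag and attribute names.
        parts = [tag]
        for name, value in attrs:
            # Minimized attributes (e.g. CHECKED) arrive with value None;
            # XHTML writes them out as checked="checked".
            parts.append(f"{name}={quoteattr(name if value is None else value)}")
        closer = " />" if tag in VOID else ">"
        self.out.append("<" + " ".join(parts) + closer)

    def handle_endtag(self, tag):
        if tag not in VOID:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data))


def to_xhtml_fragment(html):
    writer = XhtmlWriter()
    writer.feed(html)
    writer.close()
    return "".join(writer.out)


if __name__ == "__main__":
    print(to_xhtml_fragment("<P ALIGN=center>Tom & Jerry<BR><INPUT TYPE=checkbox CHECKED></P>"))
```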

All in all, a toughie.

-- 
Paul Russell                               <pa...@luminas.co.uk>
Technical Director,                   http://www.luminas.co.uk
Luminas Ltd.