Posted to regexp-user@jakarta.apache.org by Robert Sösemann <rs...@gmx.de> on 2004/11/23 10:17:04 UTC
Regexp instead of XSLT
I need to process HTML that is not well-formed and that tools like Tidy *cannot*
make well-formed, so I decided to apply some regexps to the task.
My input is HTML with some extra tags that I need to extract, e.g.:
...
<table border="0" cellspacing="0" cellpadding="0" width="100%">
<tr>
<my-contenttype name="foo">
<td>
<table border="0" cellspacing="0" cellpadding="0" width="100%">
<tr>
<my-attribute name="bar">
<td class="headline_01">
FooBar
</td>
</my-attribute>
</tr>
</table>
</td>
</my-contenttype>
</tr>
<table>
...
I need to extract every opening and closing my-* tag, and also all text
between the my-attribute tags.
For the input above, the result of the regexp should be:
<my-contenttype name="foo">
<my-attribute name="bar">
FooBar
</my-attribute>
</my-contenttype>
Can anybody help?
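One possible approach to the extraction described above, as a rough sketch only. It assumes java.util.regex from the JDK rather than Jakarta Regexp, and it assumes that my-attribute elements are not nested inside each other. The class name MyTagExtractor and the method extract are made up for illustration. The idea is to tokenize the input into my-* tags, other tags, and text runs, then keep the my-* tags plus any text seen while inside a my-attribute element.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative, not from any library.
class MyTagExtractor {

    // Three alternatives, tried in order at each position:
    //   1. an opening or closing my-* tag (group 1 = "/" or "", group 2 = tag name)
    //   2. any other tag, which is skipped
    //   3. a run of text between tags (group 3)
    private static final Pattern TOKEN = Pattern.compile(
            "<(/?)(my-[A-Za-z0-9-]+)[^>]*>|<[^>]*>|([^<]+)");

    static List<String> extract(String html) {
        List<String> out = new ArrayList<>();
        boolean inAttribute = false; // assumption: my-attribute is never nested
        Matcher m = TOKEN.matcher(html);
        while (m.find()) {
            if (m.group(2) != null) {
                // A my-* tag: keep it verbatim.
                out.add(m.group());
                if (m.group(2).equals("my-attribute")) {
                    // Opening tag has an empty group 1, closing tag has "/".
                    inAttribute = m.group(1).isEmpty();
                }
            } else if (m.group(3) != null && inAttribute) {
                // Text inside my-attribute: keep it unless it is only whitespace.
                String text = m.group(3).trim();
                if (!text.isEmpty()) {
                    out.add(text);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String html = "<tr>\n<my-contenttype name=\"foo\">\n<td>\n"
                + "<my-attribute name=\"bar\">\n<td class=\"headline_01\">\n"
                + "FooBar\n</td>\n</my-attribute>\n</td>\n</my-contenttype>\n</tr>\n";
        for (String line : extract(html)) {
            System.out.println(line);
        }
    }
}
```

Because the pattern never requires tags to be balanced, it should tolerate input that is not well-formed. A stray "<" inside text would still confuse the "any other tag" alternative, so this is a starting point and not a robust HTML parser.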