Posted to j-users@xalan.apache.org by Peter Hollas <pe...@gmail.com> on 2006/11/29 12:50:23 UTC
XHTML link tag stripping
Hi everyone,
Could someone please provide an example stylesheet that strips <a> link
elements out of a source XHTML document whilst retaining the remaining text
from within the body? Preferably the output should have normalised
whitespace, with a single space separating each extracted piece of text, e.g.
Source:
<html>
<head>
<title>Not wanted</title>
</head>
<body>
<a>Not wanted</a>
<div class="1">This text is wanted <a href="#">Not wanted</a> and so is
this</div>
<p>Wanted</p>
</body>
</html>
Output:
<htmltext>This text is wanted and so is this Wanted</htmltext>
I'm sure that the solution is incredibly simple, but after days of trying I
keep hitting a brick wall.
Many thanks, Peter.
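
For reference, here is a minimal sketch of one possible approach; it is not taken from the thread. It drives an XSLT 1.0 stylesheet through the standard JAXP API (javax.xml.transform), which Xalan-J implements. The class name LinkStripper and the method stripLinks are hypothetical names chosen for illustration. The stylesheet assumes the elements are in no namespace, as in the sample above; a real XHTML document in the http://www.w3.org/1999/xhtml namespace would need a prefix bound to that namespace in the match and select expressions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, for illustration only.
class LinkStripper {

    // Assumption: source elements are in no namespace, as in the sample.
    // Selects every text node under <body> that has no <a> ancestor,
    // joins the pieces with a space, then collapses whitespace runs.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/'>"
      + "<xsl:variable name='pieces'>"
      + "<xsl:for-each select='/html/body//text()[not(ancestor::a)]'>"
      + "<xsl:value-of select='.'/>"
      + "<xsl:text> </xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:variable>"
      + "<htmltext><xsl:value-of select='normalize-space($pieces)'/></htmltext>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to an XHTML string and returns the serialised result.
    static String stripLinks(String xhtml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xhtml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so that callers need not declare a checked exception.
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String source =
            "<html><head><title>Not wanted</title></head><body>\n"
          + "<a>Not wanted</a>\n"
          + "<div class=\"1\">This text is wanted <a href=\"#\">Not wanted</a>"
          + " and so is\nthis</div>\n"
          + "<p>Wanted</p>\n"
          + "</body></html>";
        System.out.println(stripLinks(source));
    }
}
```

The predicate not(ancestor::a) drops any text that sits inside a link at any depth, and normalize-space() trims the ends and collapses each run of whitespace to a single space, so for the sample source this should produce the htmltext line shown under "Output" above. The same stylesheet string could equally be saved to a file and run with the Xalan command-line tool.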