Posted to user@velocity.apache.org by "Brian W. Young" <by...@decadesystems.com> on 2003/08/13 19:31:55 UTC

other options besides jtidy

I've read the earlier discussions here about whitespace and formatting
issues.  One suggestion was to pipe output through JTidy.  I'd consider
giving this a shot, but I'm a bit concerned about the JTidy project: it
doesn't appear to be actively maintained, making it a poor complement to
Velocity, which appears to be very well maintained.

Thoughts on this, or suggestions of another Tidy-style library to use?  I
use Velocity to generate HTML once; the result is written to disk and may
be edited by users afterwards, so I'd really like to pass my output
through a formatter.  Your thoughts on best practices for a robust
solution would be appreciated.  My users may also modify the templates,
and I don't want to require "strict" markup from them either, i.e. close
every <br> tag or else things are busted.  So I think that rules out
trying to use an XML-style API.

Thanks,
Brian


Re: other options besides jtidy

Posted by "Brian W. Young" <by...@decadesystems.com>.
Well, I'm all for giving JTidy a try; in testing it seems to work okay.

One question: what is the best way to pipe the output through Tidy when
merging your templates?  Tidy takes an InputStream and an OutputStream.
(My HTML output is always a file on disk.)  Velocity's template.merge()
takes a Writer (a BufferedWriter in my case).  It is almost like I'd want
a gender changer (for lack of a better term) to convert the
BufferedWriter passed into template.merge() into an InputStream that
could be passed into Tidy.

The only option I can think of is a two-stage process of writing
Velocity's output to a string in memory or to a temp file, and I don't
like either.  I'd like to just hook the streams up together and let it
go.  Any ideas?
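
One way the streams might be hooked up directly, sketched with nothing but
java.io: a PipedOutputStream wrapped in a Writer on one side and the
matching PipedInputStream on the other, with the writing side on its own
thread.  The class name WriterToStreamPipe and the Producer and Consumer
interfaces are hypothetical stand-ins for the template merge and the Tidy
call; they are not part of Velocity or JTidy, and UTF-8 is an assumed
encoding.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: joins code that writes to a Writer with code that reads an InputStream. */
final class WriterToStreamPipe {

    /** Stand-in for the producing side, e.g. something like template.merge(context, writer). */
    interface Producer {
        void writeTo(Writer out) throws IOException;
    }

    /** Stand-in for the consuming side, e.g. a Tidy-style parse(InputStream, OutputStream). */
    interface Consumer {
        void process(InputStream in, OutputStream out) throws IOException;
    }

    static void pipe(Producer producer, Consumer consumer, OutputStream sink)
            throws IOException, InterruptedException {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream pipeOut = new PipedOutputStream(in);
        final IOException[] failure = new IOException[1];

        // The producer needs its own thread: the pipe has a small fixed
        // buffer, so a writer blocks until the reader drains it.
        Thread writerThread = new Thread(() -> {
            // Closing the Writer closes the pipe, which the reader sees as end of stream.
            try (Writer w = new OutputStreamWriter(pipeOut, StandardCharsets.UTF_8)) {
                producer.writeTo(w);
            } catch (IOException e) {
                failure[0] = e;
            }
        });
        writerThread.start();

        try {
            // The consumer reads from the pipe on the calling thread.
            consumer.process(in, sink);
        } finally {
            in.close();
        }
        writerThread.join();
        if (failure[0] != null) {
            throw failure[0];
        }
    }
}
```

With this shape, the producer would wrap the merge call and the consumer
would wrap the Tidy call, with sink being a FileOutputStream for the file
on disk.  The cost is one extra thread per merge; nothing is buffered
beyond the pipe's internal buffer.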

Thanks,
Brian

Serge Knystautas wrote:

> JTidy is not very *actively* maintained, but it's been around for some
> time, and activity comes and goes like any project.  We use it in a
> few places that want cleansed HTML.  Our content management system
> offers a Tidy error report on a page or site, but in the end we
> usually end up just restricting what users can edit.
>
> If you really just want extra whitespace removed, I would think you
> could just do this....
>
> String original = <what I started with>;
> StringBuffer result = new StringBuffer(original.length());
> StringTokenizer tokenizer = new StringTokenizer(original, " \r\t\n", false);
> while (tokenizer.hasMoreTokens()) {
>     result.append(tokenizer.nextToken()).append(' ');
> }
>
> This collapses every run of whitespace characters to a single space.
> It would mangle anything inside a <pre> tag, but that's a problem with
> whatever whitespace removal approach you take.
>




---------------------------------------------------------------------
To unsubscribe, e-mail: velocity-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: velocity-user-help@jakarta.apache.org


Re: other options besides jtidy

Posted by Dave Newton <da...@solaraccess.com>.
On Wed, 2003-08-13 at 13:41, Serge Knystautas wrote:
> [...JTidy...]

It has worked well for me too.

> This collapses every run of whitespace characters to a single space.
> It would mangle anything inside a <pre> tag, but that's a problem with
> whatever whitespace removal approach you take.

Maybe combine it with an HTML parser and skip PRE tags, and maybe
comments, too. Shame to lose indentation info if somebody is going to
edit it someday! 
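
A rough sketch of the "skip PRE tags and comments" idea, with the caveat
that it swaps in a regular expression for the HTML parser suggested above,
so it only copes with simple, non-nested <pre> elements and comments.  The
class name PreAwareWhitespace and its method names are made up for
illustration.  Outside the protected regions it still collapses
indentation, as the tokenizer loop does.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: collapses whitespace except inside <pre> elements and comments. */
final class PreAwareWhitespace {

    // Matches a whole <pre>...</pre> element or an HTML comment.
    // This is a regex approximation, not a real HTML parser.
    private static final Pattern PROTECTED = Pattern.compile(
            "<pre\\b[^>]*>.*?</pre\\s*>|<!--.*?-->",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static String collapse(String html) {
        StringBuilder result = new StringBuilder(html.length());
        Matcher m = PROTECTED.matcher(html);
        int last = 0;
        while (m.find()) {
            // Text before the protected region gets its whitespace collapsed.
            result.append(squeeze(html.substring(last, m.start())));
            // The protected region itself is copied through untouched.
            result.append(m.group());
            last = m.end();
        }
        // Whatever follows the last protected region.
        result.append(squeeze(html.substring(last)));
        return result.toString();
    }

    // Replaces each run of whitespace with a single space.
    private static String squeeze(String text) {
        return text.replaceAll("\\s+", " ");
    }
}
```

Calling PreAwareWhitespace.collapse(html) on the merged output leaves the
contents of any <pre> block and any comment byte-for-byte intact.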

Too bad about having to process real-world HTML, but if it's just yours
and well-formed (XHTML?) then that's not too bad...

Dave



Re: other options besides jtidy

Posted by Serge Knystautas <se...@lokitech.com>.
JTidy is not very *actively* maintained, but it's been around for some
time, and activity comes and goes like any project.  We use it in a few
places that want cleansed HTML.  Our content management system offers a
Tidy error report on a page or site, but in the end we usually end up
just restricting what users can edit.

If you really just want extra whitespace removed, I would think you 
could just do this....

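// needs: import java.util.StringTokenizer;
String original = <what I started with>;
StringBuffer result = new StringBuffer(original.length());
// Split on spaces, CRs, tabs and newlines; the delimiters are not returned.
StringTokenizer tokenizer = new StringTokenizer(original, " \r\t\n", false);
while (tokenizer.hasMoreTokens()) {
    // Re-join the tokens with one space each (this leaves a trailing space).
    result.append(tokenizer.nextToken()).append(' ');
}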

This collapses every run of whitespace characters to a single space.
It would mangle anything inside a <pre> tag, but that's a problem with
whatever whitespace removal approach you take.

-- 
Serge Knystautas
President
Lokitech >> software . strategy . design >> http://www.lokitech.com
p. 301.656.5501
e. sergek@lokitech.com

Brian W. Young wrote:
> I've read the earlier discussions here about whitespace and formatting
> issues.  One suggestion was to pipe output through JTidy.  I'd consider
> giving this a shot, but I'm a bit concerned about the JTidy project: it
> doesn't appear to be actively maintained, making it a poor complement to
> Velocity, which appears to be very well maintained.
>
> Thoughts on this, or suggestions of another Tidy-style library to use?  I
> use Velocity to generate HTML once; the result is written to disk and may
> be edited by users afterwards, so I'd really like to pass my output
> through a formatter.  Your thoughts on best practices for a robust
> solution would be appreciated.  My users may also modify the templates,
> and I don't want to require "strict" markup from them either, i.e. close
> every <br> tag or else things are busted.  So I think that rules out
> trying to use an XML-style API.

