Posted to user@velocity.apache.org by Will Glass-Husain <wg...@forio.com> on 2005/11/01 01:20:37 UTC

Re: off-topic: HTML filtering

Just a quick followup (final post).

HtmlParser worked great.  It didn't do exactly what I wanted out of the box
(allow some HTML tags but block others), but with a small amount of code I
had it.  Use the Lexer to iterate through all nodes.  Pass through only a
select list of tags (filtering out any invalid or JavaScript-related
attributes).  Escape everything else.  Very nice.
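
For readers who want to see the shape of the approach, here is a minimal sketch of the pass-through-or-escape idea.  It is not the routine mentioned in this post, and it does not use HtmlParser's Lexer: a deliberately strict regular expression stands in for the tokenizer so the example runs with the JDK alone.  The class name HtmlWhitelistFilter, the tag/attribute whitelist, and the "http, https or mailto only" URL rule are all assumptions made for illustration.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a regex stands in for a real HTML lexer.
final class HtmlWhitelistFilter {
    // Hypothetical whitelist: tag name -> attributes permitted on that tag.
    private static final Map<String, Set<String>> ALLOWED = Map.of(
        "b", Set.<String>of(),
        "i", Set.<String>of(),
        "img", Set.of("src", "alt"),
        "a", Set.of("href"));

    // One open or close tag whose attributes are all name="value" pairs.
    // Anything looser (unquoted values, stray '<') never matches and is escaped.
    private static final Pattern TAG = Pattern.compile(
        "<(/?)([A-Za-z][A-Za-z0-9]*)((?:\\s+[A-Za-z-]+=\"[^\"<>]*\")*)\\s*(/?)>");
    private static final Pattern ATTR = Pattern.compile("([A-Za-z-]+)=\"([^\"<>]*)\"");
    // Assumed URL rule: only these schemes are accepted for href/src.
    private static final Pattern SAFE_URL = Pattern.compile("(?i)(?:https?://|mailto:)");

    static String filter(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = TAG.matcher(input);
        int pos = 0;
        while (m.find()) {
            out.append(escape(input.substring(pos, m.start()))); // text before the tag
            String rebuilt = rebuild(m);
            // Whitelisted tags are re-emitted in cleaned form; all others are escaped.
            out.append(rebuilt != null ? rebuilt : escape(m.group()));
            pos = m.end();
        }
        return out.append(escape(input.substring(pos))).toString();
    }

    // Returns the cleaned tag, or null if the tag is not on the whitelist.
    private static String rebuild(Matcher m) {
        String name = m.group(2).toLowerCase(Locale.ROOT);
        Set<String> allowedAttrs = ALLOWED.get(name);
        if (allowedAttrs == null) return null;
        if (!m.group(1).isEmpty()) return "</" + name + ">"; // close tag: drop attributes
        StringBuilder tag = new StringBuilder("<").append(name);
        Matcher a = ATTR.matcher(m.group(3));
        while (a.find()) {
            String attr = a.group(1).toLowerCase(Locale.ROOT);
            String value = a.group(2);
            boolean isUrl = attr.equals("href") || attr.equals("src");
            // Keep an attribute only if it is whitelisted and, for URLs, uses a safe scheme.
            if (allowedAttrs.contains(attr) && (!isUrl || SAFE_URL.matcher(value).lookingAt())) {
                tag.append(' ').append(attr).append("=\"").append(value).append('"');
            }
        }
        return tag.append('>').toString();
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        System.out.println(filter("<b>hi</b> <script>alert(1)</script>"));
    }
}
```

The sketch rebuilds each permitted tag from scratch rather than editing the original text, so an attribute that is not explicitly approved (onclick, style, a javascript: href) simply never reaches the output.  A real tokenizer such as the Lexer handles far more malformed input than this regex does; here, anything the pattern does not recognize is escaped rather than passed through.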

This seems a fairly generic idea.  If anyone is interested in such code, let
me know.  I'll go see if the HtmlParser project is interested in my routine.

WILL

----- Original Message ----- 
From: "Will Glass-Husain" <wg...@forio.com>
To: "Velocity Users List" <ve...@jakarta.apache.org>
Sent: Sunday, October 30, 2005 5:41 AM
Subject: Re: off-topic: HTML filtering


> Thanks for the suggestions.
>
> I was debating using [b] etc, but realized the only good reason to do so
> was laziness in not coding a decent filter.  Why invent a new syntax when
> HTML is pretty darn good?
>
> I'll look into htmlparser.
>
> WILL
>
> ----- Original Message ----- 
> From: "Robert Koberg" <ro...@koberg.com>
> To: "Velocity Users List" <ve...@jakarta.apache.org>
> Sent: Saturday, October 29, 2005 3:58 AM
> Subject: Re: off-topic: HTML filtering
>
>
>> Will Glass-Husain wrote:
>>> Hi,
>>>
>>> This is a little off-topic, but I'm struggling a bit to find something -
>>> I thought one of my fellow Velocity users might have a tip.
>>>
>>> I want to allow users to enter comments on a site with HTML formatting
>>> tags but prevent any javascript hyperlinks, or other potential
>>> cross-site scripting issues.  Specifically, I need a Java library that will
>>> parse text, allowing HTML formatting (<b>, <i>, <img>, and
>>> non-Javascript <a>) but escaping everything else.
>>>
>>> Does anyone know a good source?  There's a very nice PHP library--
>>> lib_filter ( http://code.iamcal.com/php/lib_filter/ )-- that does
>>> exactly this.  The issues are subtle enough that I'd like to re-use
>>> rather than make my own if possible.  (and to make things more
>>> difficult, the license can't be GPL).
>>
>> Do you need to allow them to enter HTML tags? Instead could they enter
>> something like [b]foo[/b]? (you could have a button that does it for them
>> on the selected text). This way you can escape *all* html tags using
>> org.apache.commons.lang.StringEscapeUtils. Then after escaping you can
>> convert your set of well known markup to HTML and save that.
>>
>> best,
>> -Rob
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: velocity-user-unsubscribe@jakarta.apache.org
>> For additional commands, e-mail: velocity-user-help@jakarta.apache.org
>>
>
>
>




Re: off-topic: HTML filtering

Posted by Ahmed Mohombe <am...@yahoo.com>.
> HtmlParser worked great.  It didn't do exactly what I want (allow some HTML
> but block others).  But I wrote a small amount of code and I had it.    Use
> the Lexer, iterate through all nodes.  Pass through just a select list of
> tags (filtering out any invalid or javascript related attributes).  Escape
> everything else.  Very nice.
> 
> This seems a fairly generic idea.  If anyone is interested in such code, let
> me know.  I'll go see if the HtmlParser project is interested in my routine.
They should be :).
Or at least the users of HTMLParser would be, so it could at least go in the 'examples'/wiki section :).

Ahmed.

