Posted to users@tomcat.apache.org by "Karr, David" <Da...@wamu.net> on 2003/05/16 17:09:06 UTC

RE: HTML parsing for auto-forms processing

I would try using JTidy or NekoHTML.  I've used the former, but not the
latter.  These two, and perhaps another, appear to be used in HttpUnit,
which just works without my having to worry about it.
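
For the field-discovery step in the quoted message below, here is a
minimal sketch.  It uses the JDK's own javax.swing.text.html parser only
so that it runs without extra jars; that parser stops at HTML 3.2, so
JTidy or NekoHTML is likely the sturdier choice once you need a real DOM.
The class name FormFieldScanner and the name-to-type map are illustrative
assumptions, not part of any library.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: maps each named form field to its kind
// ("text", "checkbox", ..., "select", "textarea") in document order.
final class FormFieldScanner {

    static Map<String, String> scan(String html) {
        final Map<String, String> fields = new LinkedHashMap<>();

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                record(tag, attrs);   // empty elements such as INPUT arrive here
            }

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                record(tag, attrs);   // SELECT and TEXTAREA arrive here
            }

            private void record(HTML.Tag tag, MutableAttributeSet attrs) {
                Object name = attrs.getAttribute(HTML.Attribute.NAME);
                if (name == null) {
                    return;           // unnamed fields are never posted
                }
                if (tag == HTML.Tag.INPUT) {
                    Object type = attrs.getAttribute(HTML.Attribute.TYPE);
                    // INPUT without a TYPE attribute defaults to a text field
                    fields.put(name.toString(),
                            type == null ? "text" : type.toString().toLowerCase());
                } else if (tag == HTML.Tag.SELECT) {
                    fields.put(name.toString(), "select");
                } else if (tag == HTML.Tag.TEXTAREA) {
                    fields.put(name.toString(), "textarea");
                }
            }
        };

        try {
            // true = ignore any charset declared inside the document
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return fields;
    }
}
```

Calling FormFieldScanner.scan(uploadedHtml) once at upload time would give
the parameter names to look for when the form is later posted back.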

> -----Original Message-----
> From: David Wall [mailto:d.wall@computer.org]
> 
> I have a requirement that users be able to upload HTML forms (simple
> HTML documents that contain one FORM element with any number of INPUT,
> SELECT and TEXTAREA type fields) such that my JSP can then display a
> page, complete with a header logo, intro text, then the user-supplied
> form, followed by footer information.
> 
> I want to convert the embedded FORM such that it posts back to my
> JSP/servlet, and then I can process the fields received, set these
> values as the "value=" option of INPUT fields or turn on the "CHECKED"
> option for checkboxes and the "SELECTED" option for SELECT fields.  My
> program has to be able to do this without knowing anything about the
> form itself, since that was uploaded by a user.
> 
> Most user-defined FORMS will not be pure XHTML, so I cannot count on
> using a regular XML parser to create a DOM from the uploaded form.
> 
> I've looked into the javax.swing.text.html.HTMLEditorKit (based on a
> sample of how to extract all A HREFs), but it seems to be lacking,
> perhaps because it's only up to HTML 3.2, not 4.0 and beyond.  For
> one, when I output the parsed HTML, it's close, but it's definitely
> old style (like <OPTION> tags are not closed with </OPTION> tags even
> though the input has the ending tag).
> 
> Does anybody know about a better "HTML parser" toolkit I can use?
> Will the ECS project do it for me?
> It needs to parse the original HTML FORM uploaded so that I can extract
> all INPUT type fields, know their name and type, and then when I
> display the form, I'll put my own FORM tags in so that it posts back
> to my servlet, and then I'll retrieve the known input fields using the
> param names I discovered when the file was originally parsed, then
> I'll modify the HTML/DOM so that I can set the value="" parameters
> based on the newly posted data the user entered, and redisplay the
> updated form by outputting the HTML/DOM back into HTML for the JSP to
> display.
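
The repopulation step described above could look roughly like this once a
tidying parser has produced an org.w3c.dom.Document (JTidy's parseDOM
returns one).  This is a sketch under assumptions: FormFiller is a
hypothetical name, element and attribute names are taken to be lowercase,
and params has the shape of ServletRequest.getParameterMap().

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: copies posted parameters back into a parsed form.
final class FormFiller {

    static void fill(Document doc, Map<String, String[]> params) {
        NodeList inputs = doc.getElementsByTagName("input");
        for (int i = 0; i < inputs.getLength(); i++) {
            Element input = (Element) inputs.item(i);
            List<String> posted = posted(params, input.getAttribute("name"));
            String type = input.getAttribute("type").toLowerCase();
            if (type.equals("checkbox") || type.equals("radio")) {
                // browsers post "on" for a checked box that has no value attribute
                String value = input.hasAttribute("value") ? input.getAttribute("value") : "on";
                toggle(input, "checked", posted.contains(value));
            } else if (!posted.isEmpty()) {
                input.setAttribute("value", posted.get(0));
            }
        }

        NodeList selects = doc.getElementsByTagName("select");
        for (int i = 0; i < selects.getLength(); i++) {
            Element select = (Element) selects.item(i);
            List<String> posted = posted(params, select.getAttribute("name"));
            NodeList options = select.getElementsByTagName("option");
            for (int j = 0; j < options.getLength(); j++) {
                Element option = (Element) options.item(j);
                // an OPTION without a value attribute posts its text content
                String value = option.hasAttribute("value")
                        ? option.getAttribute("value")
                        : option.getTextContent().trim();
                toggle(option, "selected", posted.contains(value));
            }
        }

        NodeList areas = doc.getElementsByTagName("textarea");
        for (int i = 0; i < areas.getLength(); i++) {
            Element area = (Element) areas.item(i);
            List<String> posted = posted(params, area.getAttribute("name"));
            if (!posted.isEmpty()) {
                area.setTextContent(posted.get(0));
            }
        }
    }

    // Posted values for one field name, or an empty list if none were sent.
    private static List<String> posted(Map<String, String[]> params, String name) {
        String[] values = params.get(name);
        return values == null ? Collections.<String>emptyList() : Arrays.asList(values);
    }

    // Sets or clears a boolean attribute such as checked or selected.
    private static void toggle(Element element, String attr, boolean on) {
        if (on) {
            element.setAttribute(attr, attr);
        } else {
            element.removeAttribute(attr);
        }
    }
}
```

The document can then be written back out for the JSP with whatever
serializer the chosen parser provides.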
