Posted to j-users@xerces.apache.org by gordy perkins <go...@yahoo.com> on 2001/03/08 19:07:26 UTC
Whoops! HTMLDocumentImpl != implemented
Found out why HTMLDocumentImpl doesn't work - check
the close() method:
    public void close()
    {
        // ! NOT IMPLEMENTED, REQUIRES PARSER !
        if ( _writer != null )
        {
            _writer = null;
        }
    }
Does anyone know a simple parser for HTML - all I need
is a reliable way to grab forms, some text and/or
title from a WWW page and display on text terminals.
---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org
Re: Whoops! HTMLDocumentImpl != implemented
Posted by Chad Loder <cl...@acm.org>.
Yes, I was going to suggest that exact thing. JTidy
is really cool because HTML parsing requires some
heuristics, and the heuristics in JTidy (and tidy)
are quite refined and easily tweakable.
c
At 11:58 AM 3/8/2001 -0800, you wrote:
>We have found JTidy very useful for parsing HTML:
>http://sourceforge.net/projects/jtidy/
Re: Whoops! HTMLDocumentImpl != implemented
Posted by Mark Diekhans <ma...@lutris.com>.
We have found JTidy very useful for parsing HTML:
http://sourceforge.net/projects/jtidy/
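For the use case in the original question (forms, some text, and the title), here is a minimal sketch of the extraction step. Note that it does not use JTidy: it swaps in the HTML parser bundled with the JDK (javax.swing.text.html), so it runs without an extra download. The class name PageSummary and its accessor methods are made up for illustration, and the sketch assumes the page is already available as a Reader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: collects the title, form actions and visible text
// of a page using the JDK's callback-based HTML parser (not JTidy).
final class PageSummary extends HTMLEditorKit.ParserCallback {
    private final StringBuilder title = new StringBuilder();
    private final List<String> formActions = new ArrayList<>();
    private final List<String> text = new ArrayList<>();
    private boolean inTitle;

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        if (tag == HTML.Tag.TITLE) {
            inTitle = true;
        } else if (tag == HTML.Tag.FORM) {
            // Record where each form submits to; empty string if no action is given.
            Object action = attrs.getAttribute(HTML.Attribute.ACTION);
            formActions.add(action == null ? "" : action.toString());
        }
    }

    @Override
    public void handleEndTag(HTML.Tag tag, int pos) {
        if (tag == HTML.Tag.TITLE) {
            inTitle = false;
        }
    }

    @Override
    public void handleText(char[] data, int pos) {
        if (inTitle) {
            title.append(data);
        } else {
            text.add(new String(data));
        }
    }

    String getTitle() { return title.toString().trim(); }
    List<String> getFormActions() { return formActions; }
    List<String> getText() { return text; }

    // Parses the HTML from the reader and returns the collected summary.
    static PageSummary parse(Reader in) {
        PageSummary summary = new PageSummary();
        try {
            // 'true' tells the parser to ignore charset declarations in the page.
            new ParserDelegator().parse(in, summary, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return summary;
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Demo</title></head>"
                + "<body><form action=\"/search\"><input name=\"q\"></form>"
                + "<p>Hello, terminal.</p></body></html>";
        PageSummary s = parse(new StringReader(html));
        System.out.println("Title: " + s.getTitle());
        System.out.println("Forms: " + s.getFormActions());
        System.out.println("Text:  " + s.getText());
    }
}
```

The JDK parser is fairly forgiving but has far fewer cleanup heuristics than JTidy, so for badly broken pages JTidy is probably still the better choice.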
RE: Whoops! HTMLDocumentImpl != implemented
Posted by Brad O'Hearne <ca...@megapathdsl.net>.
Gordy,
If you find one, let me know -- I have been keeping my eye out for one as
well.
BradO