Posted to j-users@xerces.apache.org by gordy perkins <go...@yahoo.com> on 2001/03/08 19:07:26 UTC

Whoops! HTMLDocumentImpl != implemented

Found out why HTMLDocumentImpl doesn't work - check
the close() method:


    public void close()
    {
        // ! NOT IMPLEMENTED, REQUIRES PARSER !
        if ( _writer != null )
        {
            _writer = null;
        }
    }

Does anyone know of a simple parser for HTML? All I need
is a reliable way to grab forms, some text, and/or the
title from a WWW page and display them on text terminals.
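
For the narrow task described above (title, forms, and some text for a
text terminal), one possible starting point is the HTML parser that ships
with the JDK's Swing classes, javax.swing.text.html.parser.ParserDelegator.
This is a swapped-in technique, not the Xerces HTMLDocumentImpl, and it is
only a minimal sketch. The class name PageSummary and its fields are
hypothetical, and the sketch assumes the page is close enough to HTML 3.2
for the bundled DTD to cope with it.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: collects the title, form actions, input names,
// and visible text from a page using the JDK's callback-based parser.
public class PageSummary extends HTMLEditorKit.ParserCallback {
    public final StringBuilder title = new StringBuilder();
    public final StringBuilder text = new StringBuilder();
    public final List<String> forms = new ArrayList<>();
    public final List<String> inputs = new ArrayList<>();
    private boolean inTitle;

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        if (tag == HTML.Tag.TITLE) {
            inTitle = true;
        } else if (tag == HTML.Tag.FORM) {
            // Record the form's action attribute (empty string if absent).
            Object action = attrs.getAttribute(HTML.Attribute.ACTION);
            forms.add(action == null ? "" : action.toString());
        }
    }

    @Override
    public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        // <input> has no end tag, so it arrives here rather than in handleStartTag.
        if (tag == HTML.Tag.INPUT) {
            Object name = attrs.getAttribute(HTML.Attribute.NAME);
            if (name != null) {
                inputs.add(name.toString());
            }
        }
    }

    @Override
    public void handleEndTag(HTML.Tag tag, int pos) {
        if (tag == HTML.Tag.TITLE) {
            inTitle = false;
        }
    }

    @Override
    public void handleText(char[] data, int pos) {
        if (inTitle) {
            title.append(data);
        } else {
            // One chunk of text per line is enough for a text terminal.
            text.append(data).append('\n');
        }
    }

    public static PageSummary parse(Reader html) throws IOException {
        PageSummary summary = new PageSummary();
        // true = ignore charset declarations inside the page.
        new ParserDelegator().parse(html, summary, true);
        return summary;
    }

    public static void main(String[] args) throws IOException {
        String page = "<html><head><title>Demo</title></head><body>"
                + "<p>Hello world</p>"
                + "<form action=\"/search\"><input name=\"q\"></form>"
                + "</body></html>";
        PageSummary s = parse(new StringReader(page));
        System.out.println("Title:  " + s.title.toString().trim());
        System.out.println("Forms:  " + s.forms);
        System.out.println("Inputs: " + s.inputs);
        System.out.print(s.text);
    }
}
```

The callback style avoids building a DOM at all, which may be enough when
the goal is only to print a few pieces of a page. For badly broken markup
a more forgiving, heuristic-driven parser is likely to do better.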


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org


Re: Whoops! HTMLDocumentImpl != implemented

Posted by Chad Loder <cl...@acm.org>.
Yes, I was going to suggest that exact thing. JTidy
is really cool because HTML parsing requires some
heuristics, and the heuristics in JTidy (and tidy)
are quite refined and easily tweakable.

         c

At 11:58 AM 3/8/2001 -0800, you wrote:

>We have found JTidy very useful for parsing HTML:
>http://sourceforge.net/projects/jtidy/


Re: Whoops! HTMLDocumentImpl != implemented

Posted by Mark Diekhans <ma...@lutris.com>.
We have found JTidy very useful for parsing HTML:
http://sourceforge.net/projects/jtidy/



RE: Whoops! HTMLDocumentImpl != implemented

Posted by Brad O'Hearne <ca...@megapathdsl.net>.
Gordy,

If you find one, let me know -- I have been keeping my eye out for one as
well.

BradO
