Posted to dev@xalan.apache.org by Laurent Tardif <la...@inrialpes.fr> on 2001/03/08 17:38:02 UTC

Improvement of Xerces/Xalan performance

Insert the attached file in your project,
run your program with: java -Xbootclasspath/p:..\Classes;%CLASSPATH%
and save some time :)

With this small modification, I've significantly decreased the parsing
time of XML files.
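
A claim like this is easiest to check with a small timing harness run once
without and once with the prepended class directory. The following is only
a sketch: the class name ParseTimer, the synthetic document and the run
counts are assumptions made for illustration, not part of the attached
file. It uses the standard JAXP SAX API, so it measures whichever parser
the JVM resolves. Note also that -Xbootclasspath/p was removed in JDK 9, so
the command above applies to older JVMs only.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not the file attached to the post.
final class ParseTimer {

    /** Counts start tags so we can confirm the whole document was parsed. */
    static final class ElementCounter extends DefaultHandler {
        int elements;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            elements++;
        }
    }

    /** Builds a synthetic document with the given number of item children. */
    static String sampleXml(int items) {
        StringBuilder sb = new StringBuilder("<root>");
        for (int i = 0; i < items; i++) {
            sb.append("<item id=\"").append(i).append("\">text</item>");
        }
        return sb.append("</root>").toString();
    }

    /** Parses the document once and returns the number of elements seen. */
    static int parseOnce(byte[] xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            ElementCounter counter = new ElementCounter();
            parser.parse(new ByteArrayInputStream(xml), counter);
            return counter.elements;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    /** Returns the average wall-clock time per parse, in milliseconds. */
    static double averageMillis(byte[] xml, int runs) {
        // A few untimed runs first, so class loading and JIT warm-up
        // do not dominate the measurement.
        for (int i = 0; i < 5; i++) {
            parseOnce(xml);
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            parseOnce(xml);
        }
        return (System.nanoTime() - start) / 1e6 / runs;
    }

    public static void main(String[] args) {
        byte[] xml = sampleXml(10_000).getBytes(StandardCharsets.UTF_8);
        System.out.println("elements: " + parseOnce(xml));
        System.out.printf("average parse time: %.2f ms%n", averageMillis(xml, 20));
    }
}
```

Comparing the printed average for the two runs gives a rough before/after
figure; a real measurement would use your own documents rather than the
synthetic one.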

Also, I have a question:
does anybody use JET to compile class files?
I have a problem at run time with streams (they are closed prematurely).

Laurent
-- 
*************************************************************************
   ,-~~-.___.         Laurent Tardif 
  / |  '     \            Tel    : (+33) 4 76 61 53 83
 (  )         0	      //  Mail   : laurent.tardif@inrialpes.fr
  \_/-, ,----'	     //	  Web    : http://www.inrialpes.fr/opera
     ====           //	  Bureau : B 206
    /  \-'~;    /~~~(O)   add    : Projet OPERA
   /  __/~|   /       |		   Inria Rhone-alpes, 655 avenue de l'europe
 =(  _____| (_________|	  	   38330 Montbonnot St Martin
*************************************************************************

Re: Improvement of Xerces/Xalan performance

Posted by Arnaud Le Hors <le...@us.ibm.com>.
Arnaud Le Hors wrote:
> 
> Laurent Tardif wrote:
> >
> > insert the attached file in your project
> > run your program with : java -Xbootclasspath/p:..\Classes;%CLASSPATH%
> > and win some time :)
> >
> > With this little modification, i've significantly decreased the parsing
> > time of xml file.
> 
> Err... I read:
> 
> /**
>  * The Attributes class maps Manifest attribute names to associated string
>  * values. Attribute names are case-insensitive and restricted to the ASCII
>  * characters in the set [0-9a-zA-Z_-].
> 
> The restrictions on names just don't work for XML. Don't be surprised if
> you have problems...

Err... Never mind that. I thought this was being used to store XML
attributes... 
-- 
Arnaud  Le Hors - IBM Cupertino, XML Strategy Group

---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org


Re: Improvement of Xerces/Xalan performance

Posted by Arnaud Le Hors <le...@us.ibm.com>.
Laurent Tardif wrote:
> 
> insert the attached file in your project
> run your program with : java -Xbootclasspath/p:..\Classes;%CLASSPATH%
> and win some time :)
> 
> With this little modification, i've significantly decreased the parsing
> time of xml file.

Err... I read:

/**
 * The Attributes class maps Manifest attribute names to associated string
 * values. Attribute names are case-insensitive and restricted to the ASCII
 * characters in the set [0-9a-zA-Z_-].

The restrictions on names just don't work for XML. Don't be surprised if
you have problems...
-- 
Arnaud  Le Hors - IBM Cupertino, XML Strategy Group
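
The restriction described in the quoted Javadoc can be seen directly with a
few lines of Java. This is a sketch that assumes the Javadoc is that of
java.util.jar.Attributes; the class name ManifestNameCheck and the sample
names are made up for illustration. It only shows why such a class would be
a poor fit for XML attribute names, not what the attached file actually
does with it.

```java
import java.util.jar.Attributes;

// Hypothetical demo class; the sample names are illustrative only.
final class ManifestNameCheck {

    /** Returns true if java.util.jar.Attributes.Name accepts the given name. */
    static boolean isAccepted(String name) {
        try {
            new Attributes.Name(name);
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown when the name contains a character outside
            // [0-9a-zA-Z_-] or is otherwise invalid (for example, too long).
            return false;
        }
    }

    public static void main(String[] args) {
        String[] candidates = {"Content-Type", "id", "xml:lang", "xlink:href"};
        for (String candidate : candidates) {
            System.out.println(candidate + " -> "
                    + (isAccepted(candidate) ? "accepted" : "rejected"));
        }

        // Lookups ignore case, whereas XML attribute names are case-sensitive.
        Attributes attrs = new Attributes();
        attrs.putValue("Id", "42");
        System.out.println("lookup of 'ID': " + attrs.getValue("ID"));
    }
}
```

Qualified XML names such as xml:lang contain a colon, which falls outside
the permitted set, and "Id" and "ID" are distinct attributes in XML but the
same key here.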

Re: Whoops! HTMLDocumentImpl != implemented

Posted by Chad Loder <cl...@acm.org>.
Yes, I was going to suggest that exact thing. JTidy
is really cool because HTML parsing requires some
heuristics, and the heuristics in JTidy (and tidy)
are quite refined and easily tweakable.

         c

At 11:58 AM 3/8/2001 -0800, you wrote:

>We have found JTidy very useful for parsing HTML:
>http://sourceforge.net/projects/jtidy/
>
>gordy perkins <go...@yahoo.com> writes:
> > Found out why HTMLDocumentImpl doesn't work - check
> > the close() method:
> >
> >
> >     public void close()
> >     {
> >         // ! NOT IMPLEMENTED, REQUIRES PARSER !
> >         if ( _writer != null )
> >         {
> >             _writer = null;
> >         }
> >     }
> >
> > Does anyone know a simple parser for HTML - all I need
> > is a reliable way to grab forms, some text and/or
> > title from a WWW page and display on text terminals.

Re: Whoops! HTMLDocumentImpl != implemented

Posted by Mark Diekhans <ma...@lutris.com>.
We have found JTidy very useful for parsing HTML:
http://sourceforge.net/projects/jtidy/

gordy perkins <go...@yahoo.com> writes:
> Found out why HTMLDocumentImpl doesn't work - check
> the close() method:
> 
> 
>     public void close()
>     {
>         // ! NOT IMPLEMENTED, REQUIRES PARSER !
>         if ( _writer != null )
>         {
>             _writer = null;
>         }
>     }
> 
> Does anyone know a simple parser for HTML - all I need
> is a reliable way to grab forms, some text and/or
> title from a WWW page and display on text terminals. 

RE: Whoops! HTMLDocumentImpl != implemented

Posted by Brad O'Hearne <ca...@megapathdsl.net>.
Gordy,

If you find one, let me know -- I have been keeping my eye out for one as
well.

BradO

-----Original Message-----
From: gordy perkins [mailto:gop3000@yahoo.com]
Sent: Thursday, March 08, 2001 10:07 AM
To: xerces-j-user@xml.apache.org
Subject: Whoops! HTMLDocumentImpl != implemented


Found out why HTMLDocumentImpl doesn't work - check
the close() method:


    public void close()
    {
        // ! NOT IMPLEMENTED, REQUIRES PARSER !
        if ( _writer != null )
        {
            _writer = null;
        }
    }

Does anyone know a simple parser for HTML - all I need
is a reliable way to grab forms, some text and/or
title from a WWW page and display on text terminals.


Whoops! HTMLDocumentImpl != implemented

Posted by gordy perkins <go...@yahoo.com>.
Found out why HTMLDocumentImpl doesn't work - check
the close() method:


    public void close()
    {
        // ! NOT IMPLEMENTED, REQUIRES PARSER !
        if ( _writer != null )
        {
            _writer = null;
        }
    }

Does anyone know a simple parser for HTML - all I need
is a reliable way to grab forms, some text and/or
title from a WWW page and display on text terminals. 
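
For a need this small (title and forms only), one option that requires no
extra library is the lenient HTML parser shipped with the JDK in
javax.swing.text.html.parser. This is a different technique from the JTidy
route suggested in the replies, and the sketch below is only an
illustration: the class name PageSummary and the sample page are
assumptions, and the JDK parser targets HTML 3.2, so results on modern
pages may vary.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical callback that collects the page title and form actions.
final class PageSummary extends HTMLEditorKit.ParserCallback {

    final StringBuilder title = new StringBuilder();
    final List<String> formActions = new ArrayList<>();
    private boolean inTitle;

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        if (tag == HTML.Tag.TITLE) {
            inTitle = true;
        } else if (tag == HTML.Tag.FORM) {
            // Record the form's action attribute, or "" if it has none.
            Object action = attrs.getAttribute(HTML.Attribute.ACTION);
            formActions.add(action == null ? "" : action.toString());
        }
    }

    @Override
    public void handleEndTag(HTML.Tag tag, int pos) {
        if (tag == HTML.Tag.TITLE) {
            inTitle = false;
        }
    }

    @Override
    public void handleText(char[] data, int pos) {
        // Only text that appears inside the title element is kept.
        if (inTitle) {
            title.append(data);
        }
    }

    /** Parses the given HTML text and returns the collected summary. */
    static PageSummary parse(String html) {
        PageSummary summary = new PageSummary();
        try {
            new ParserDelegator().parse(new StringReader(html), summary, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return summary;
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Example page</title></head>"
                + "<body><form action=\"/search\" method=\"get\">"
                + "<input type=\"text\" name=\"q\"></form></body></html>";
        PageSummary summary = parse(html);
        System.out.println("title: " + summary.title.toString().trim());
        System.out.println("form actions: " + summary.formActions);
    }
}
```

To read from a live page, the StringReader would be replaced by a Reader
over the page's input stream; printing the collected fields is then enough
for a text terminal.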
