
Posted to j-dev@xerces.apache.org by Guoliang Cao <ca...@ispsoft.com> on 2001/01/16 22:29:44 UTC

Cache xml schema file to improve performance.

Hi,

I'm using the code below to cache XML schema files and improve validation
performance.  The results so far look good (see the output below).
Besides the performance gain, it lets me completely decouple
schemaLocation from the physical location of the schema files.
Is there anybody interested in this?  If you have some good experiences
already, please share with me. Thanks.

Guoliang

--- Output ---

Time for 10 validation with default EntityResolver: 3395
Time for 10 validation with MyResolver: 3205

Time for 100 validation with default EntityResolver: 20660
Time for 100 validation with MyResolver: 15102

Time for 200 validation with default EntityResolver: 40318
Time for 200 validation with MyResolver: 27410


--- Validation.java ---
...
parser = ...;
parser.setEntityResolver(new MyResolver());
parser.parse();
...

---  MyResolver.java ---

import java.io.*;
import java.net.*;
import java.util.Hashtable;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.apache.xerces.parsers.*;

public class MyResolver implements EntityResolver {
    public String PathToSchema="";
    public Hashtable Htable = new Hashtable();

    public MyResolver(){
    }

    public MyResolver(String path){
        if (path == null) return;
        PathToSchema = path;
        if (PathToSchema.length() > 0 && PathToSchema.charAt(PathToSchema.length() - 1) != '/'){
            PathToSchema = PathToSchema.concat("/");
        }
    }

    public InputSource resolveEntity (String publicId, String systemId)
    throws FileNotFoundException, MalformedURLException, IOException
    {
        String xml_str = (String)(Htable.get(systemId));
        if (xml_str == null){
            String url = PathToSchema+systemId;
            char chArr[] = new char[1000000];
            InputStreamReader isReader = new InputStreamReader((new URL(url)).openStream());
            int len = isReader.read(chArr);
            isReader.close();
            String str = (new String(chArr)).substring(0,len);
            InputSource is = new InputSource(new StringReader(str));
            Htable.put(systemId,str);
            return is;
        } else {
            return new InputSource(new StringReader(xml_str));
        }
    }

}

HTML DOM

Posted by Shiva <Si...@informix.com>.

Hi,
From the Xerces docs, I see that there is also an HTML DOM
implementation in xerces.jar.
Are there any samples that show how to use it?

rgds,
Shiva

Re: Xerces 2

Posted by Elena Litani <hl...@jtcsv.com>.

Hi, Kevin,

> Where can I get the Xerces 2 code?

  set CVSROOT=:pserver:anoncvs@xml.apache.org:/home/cvspublic
  cvs login        (password: anoncvs)
  cvs checkout -d x2 -r xerces_j_2 xml-xerces/java 

Visit Andy Clark's web site for more information:
http://www.apache.org/~andyc/

Any help with development is greatly appreciated. You can start by
posting patches to this mailing list.

Good Luck,
elena

Xerces 2

Posted by Kevin Steppe <ks...@pacbell.net>.

Where can I get the Xerces 2 code?

What should I do to get involved in its development?

Thanks,
Kevin

Re: Cache xml schema file to improve performance.

Posted by Andy Clark <an...@apache.org>.

Guoliang Cao wrote:
> Is there anybody interested in this?  If you have some good experiences
> already, please share with me. Thanks.

There are some problems with the following lines.

>             char chArr[] = new char[1000000];
>             InputStreamReader isReader = new InputStreamReader((new URL(url)).openStream());
>             int len = isReader.read(chArr);
>             isReader.close();
>             String str = (new String(chArr)).substring(0,len);
>             InputSource is = new InputSource(new StringReader(str));
>             Htable.put(systemId,str);

1) You should use a byte array instead of a char array so
   that you are able to handle files in various encodings.
2) Make the byte array size based on the Content-Length
   header and then store the array in a hashtable. Then, 
   when you need to read it, just wrap a ByteArrayInputStream
   around it.
3) Be careful about block reads from an input stream because
   there is no guarantee that you will read the entire amount
   in a single call. You have to loop until the value that is
   returned by InputStream#read(byte[]) is -1.
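
The three points above could be applied roughly as in the sketch below.
It is not code from the original messages: the class name CachingResolver
and the helper readFully are made up for illustration. It also departs
from point 2 in one respect: instead of sizing the array from the
Content-Length header (which a server may not send), it collects the
bytes in a ByteArrayOutputStream, so the cached array still ends up
exactly as long as the file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.Hashtable;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical rewrite of MyResolver that caches raw bytes instead of a String.
class CachingResolver implements EntityResolver {
    private final String pathToSchema;
    private final Hashtable<String, byte[]> cache = new Hashtable<String, byte[]>();

    CachingResolver(String path) {
        String p = (path == null) ? "" : path;
        if (p.length() > 0 && !p.endsWith("/")) {
            p = p + "/";
        }
        pathToSchema = p;
    }

    // Loops until read() returns -1, because a single call may deliver
    // only part of the data.
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    public InputSource resolveEntity(String publicId, String systemId)
            throws IOException {
        if (systemId == null) {
            return null; // let the parser fall back to its default handling
        }
        byte[] data = cache.get(systemId);
        if (data == null) {
            InputStream in = new URL(pathToSchema + systemId).openStream();
            try {
                data = readFully(in);
            } finally {
                in.close();
            }
            cache.put(systemId, data);
        }
        // Handing the parser a byte stream lets it detect the encoding itself.
        return new InputSource(new ByteArrayInputStream(data));
    }
}
```

It would be installed the same way as in the original post, for example
parser.setEntityResolver(new CachingResolver()). Because the cache holds
bytes rather than decoded characters, the parser can still honor the
encoding declared in each schema file.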

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org