Posted to j-dev@xerces.apache.org by David Halsted <ha...@tcimet.net> on 2000/10/28 04:10:44 UTC

Stopping parse

I am writing a servlet that will be asked (among other things) to read through XML files of arbitrary size.  These files may be rather large, as they may represent output from corporate databases.  I'd like to enable system administrators to limit the number of records returned by such parsing operations.  I'm using an XMLReader and what I'd really like to do is count the number of times a certain element occurs and stop the parsing process entirely when it reaches the permitted maximum.  The API docs say I should throw an exception if I want to stop the parsing process, but I'm afraid I haven't been able to figure out how to throw an exception that'll actually stop the parser.  Does anybody have a bit of code that does this?

Thanks,
Dave Halsted

Re: Stopping parse

Posted by Maitreyee Pasad <pa...@gst.com>.
David:

Have you implemented a ContentHandler?

If yes, maintain a counter in the ContentHandler that counts the number
of records read. You can do this by incrementing the
counter every time the startElement method is called for the first
element in every record.

Then in the startElement method you can throw a SAXException() if the
counter value exceeds your MAX value.

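private int count = 0;  // member of the ContentHandler class

public void startElement(String namespaceURI, String localName,
                         String qName, Attributes atts)
    throws SAXException {

    // Count only the element that opens each record.
    if (qName.equals("Myname")) {
        count++;
    }

    // Throwing from a callback aborts the parse; the exception
    // propagates out of XMLReader.parse() to the caller.
    if (count > MAX) {
        throw new SAXException("Max records exceeded");
    }
}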
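
For reference, here is a fuller sketch of the same idea that also shows the calling side, which is where the exception has to be caught. It is only an illustration under some assumptions: the element name "record", the class names StopParseDemo, CountingHandler and LimitReachedException, and the parseWithLimit helper are all hypothetical, and the parser is obtained through JAXP rather than by naming a Xerces class directly. A dedicated SAXException subclass lets the caller distinguish a deliberate stop from a genuine parse error. Unlike the snippet above, this version refuses the record that would exceed the limit instead of counting it first.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class StopParseDemo {

    /** Hypothetical marker exception: signals a deliberate stop, not a parse error. */
    static class LimitReachedException extends SAXException {
        LimitReachedException(String message) {
            super(message);
        }
    }

    /** Counts "record" elements (an assumed name) and aborts once the limit is hit. */
    static class CountingHandler extends DefaultHandler {
        private final int max;
        private int count = 0;

        CountingHandler(int max) {
            this.max = max;
        }

        int getCount() {
            return count;
        }

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts)
                throws SAXException {
            if (qName.equals("record")) {
                // Refuse the record that would exceed the limit.
                if (count >= max) {
                    throw new LimitReachedException("Max records exceeded");
                }
                count++;
            }
        }
    }

    /** Parses xml and returns how many records were accepted before stopping. */
    static int parseWithLimit(String xml, int max) throws Exception {
        XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        CountingHandler handler = new CountingHandler(max);
        reader.setContentHandler(handler);
        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Swallow only our own stop signal; rethrow real errors.
            if (!(e instanceof LimitReachedException)) {
                throw e;
            }
        }
        return handler.getCount();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<db><record/><record/><record/><record/></db>";
        System.out.println(parseWithLimit(xml, 2));
    }
}
```

The try/catch around parse() is the part that is easy to miss: the handler can only request the stop, and the caller decides what a stop means.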

Hope that helps.

Maitreyee

David Halsted wrote:

> I am writing a servlet that will be asked (among other things) to read
> through XML files of arbitrary size.  These files may be rather large,
> as they may represent output from corporate databases.  I'd like to
> enable system administrators to limit the number of records returned
> by such parsing operations.  I'm using an XMLReader and what I'd
> really like to do is count the number of times a certain element
> occurs and stop the parsing process entirely when it reaches the
> permitted maximum.  The API docs say I should throw an exception if I
> want to stop the parsing process, but I'm afraid I haven't been able
> to figure out how to throw an exception that'll actually stop the
> parser.  Does anybody have a bit of code that does this?
>
> Thanks,
> Dave Halsted

Re: Stopping parse

Posted by Andy Clark <an...@apache.org>.
> David Halsted wrote:
> I am writing a servlet that will be asked (among other things) 
> to read through XML files of arbitrary size.  These files may be 
> rather large, as they may represent output from corporate databases.  
> I'd like to enable system administrators to limit the number of 
> records returned by such parsing operations.  I'm using an XMLReader 
> and what I'd really like to do is count the number of times a 
> certain element occurs and stop the parsing process entirely when 
> it reaches the [...]

I would imagine that the hardest part is not stopping the parser
but skipping to the end of the document in the stream without
closing the socket connection. This is a perpetual problem with
people reading XML documents via sockets and I haven't seen a
good solution to the problem, yet.

If you are using the same connection/stream, then you have to
invent some protocol to not only handle delineating the end of
valid, well-formed documents in the stream but also to skip to
the end of invalid documents. There is nothing in an XML parser
that helps with this problem.
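
To make the protocol idea concrete, one possible approach (an assumption for illustration, not a tested or recommended solution) is to have the sender prefix each document with its length in bytes. The reader then hands the parser a bounded view of the stream, and if the parse is abandoned it discards the rest of that frame before reading the next one. This only works when you control both ends of the connection. Everything named here (FramedStreamDemo, BoundedInputStream, frame, nextFrame, drain, demo) is hypothetical, and readAllBytes requires Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class FramedStreamDemo {

    /** Exposes at most `remaining` bytes of a shared stream and never closes it. */
    static class BoundedInputStream extends InputStream {
        private final InputStream in;
        private long remaining;

        BoundedInputStream(InputStream in, long length) {
            this.in = in;
            this.remaining = length;
        }

        @Override
        public int read() throws IOException {
            if (remaining <= 0) {
                return -1;
            }
            int b = in.read();
            if (b >= 0) {
                remaining--;
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            if (remaining <= 0) {
                return -1;
            }
            int n = in.read(buf, off, (int) Math.min(len, remaining));
            if (n > 0) {
                remaining -= n;
            }
            return n;
        }

        /** Discards the unread part of this frame so the next frame starts cleanly. */
        void drain() throws IOException {
            byte[] scratch = new byte[4096];
            while (read(scratch, 0, scratch.length) > 0) {
                // keep discarding
            }
        }

        @Override
        public void close() {
            // Intentionally empty: the underlying connection stays open.
        }
    }

    /** Sender side: a 4-byte length followed by the document bytes. */
    static byte[] frame(String doc) throws IOException {
        byte[] body = doc.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(body.length);
        data.write(body);
        data.flush();
        return out.toByteArray();
    }

    /** Receiver side: reads the length prefix and returns a view of one document. */
    static BoundedInputStream nextFrame(DataInputStream in) throws IOException {
        return new BoundedInputStream(in, in.readInt());
    }

    /** Abandons the first document part-way through, then reads the second intact. */
    static String demo() throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        wire.write(frame("<db><record/><record/></db>"));
        wire.write(frame("<db><record id='2'/></db>"));
        DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));

        BoundedInputStream first = nextFrame(in);
        first.read(new byte[5], 0, 5);  // stands in for a parse that was aborted
        first.drain();                  // skip to the end of the first document

        BoundedInputStream second = nextFrame(in);
        return new String(second.readAllBytes(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

In real use the bounded stream would be wrapped in an InputSource and passed to the parser. The empty close() matters because some parsers close their input when they finish or fail.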

This is an area where it would be nice to have a FAQ item with
a solution. If you find a good solution and want to write up
how you did it, we could include it in the documentation for
the benefit of all Xerces users.

-- 
Andy Clark * IBM, JTC - Silicon Valley * andyc@apache.org