Posted to j-users@xerces.apache.org by pe...@student.luc.ac.be on 2004/12/16 18:41:34 UTC
Socket timeout when parsing a big XML file (DOM/SAX)
Hello,
I wanted to parse an XML file of about 30 MB, but I get the following error:
Exception in thread "main" java.net.ConnectException: Connection timed out: connect
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(Unknown Source)
...
at sun.net.www.http.HttpClient.openServer(Unknown Source)
...
at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
at org.apache.xerces.impl.XMLEntityManager.startEntity(Unknown Source)
...
It happens with both SAX and DOM:
// SAX, using the Xerces org.apache.xerces.parsers.SAXParser class
SAXParser parser = new SAXParser();
parser.setContentHandler(this);
parser.parse(file);
// DOM, using JAXP
documentBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document document = documentBuilder.parse(file);
I also tried increasing the memory available to the virtual machine, but
that didn't help. It seems to be related to the size of the input XML
file, since everything works perfectly with small files.
Does anyone have an idea what is wrong and how it can be fixed?
And why are sockets and the HTTP protocol used at all to load an XML file from my
hard disk?
Thanks a lot for your time,
--
Peter
---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org
Re: Socket timeout when parsing a big XML file (DOM/SAX)
Posted by ra...@freddiemac.com.
This could very well be a file size issue.
HTTP servers can be configured to limit the upload size or even the
request parameter size. What server are you using? Apache has a couple of
configurable parameters:
http://httpd.apache.org/docs-2.0/mod/core.html#limitrequestfieldsize
http://httpd.apache.org/docs-2.0/mod/core.html#limitxmlrequestbody
http://httpd.apache.org/docs-2.0/mod/core.html#limitrequestbody
The default for LimitXMLRequestBody is 1000000 bytes (just under 1 MB).
Limits on the request body or request fields are not part of the HTTP protocol, so
different server implementations may handle the limits differently, or not
enforce them at all.
Hope that helps.
-ramin
Re: Socket timeout when parsing a big XML file (DOM/SAX)
Posted by Joseph Kesselman <ke...@us.ibm.com>.
Does your document reference a DTD or Schema located on the web, perhaps?
______________________________________
Joe Kesselman, IBM Next-Generation Web Technologies: XML, XSL and more.
"The world changed profoundly and unpredictably the day Tim Berners Lee
got bitten by a radioactive spider." -- Rafe Culpin, in r.m.filk
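
Following up on the question above about a DTD or Schema located on the web: the stack trace shows XMLEntityManager.startEntity leading into the HTTP client, which is what one would expect if the parser is fetching an external entity such as a DTD named in the DOCTYPE. The sketch below is one possible way to check this and to parse without network access, assuming a JAXP DocumentBuilder. The class name OfflineDtdDemo, the method parseWithoutNetwork, and the URL http://example.invalid/root.dtd are made up for illustration and do not come from the original post.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original thread.
public class OfflineDtdDemo {
    // System IDs the parser asked to load; filled in by the resolver below.
    static final List<String> requested = new ArrayList<>();

    static Document parseWithoutNetwork(InputSource in) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Record every external entity request and answer it with an empty
        // document, so the parser never opens a socket to fetch it.
        builder.setEntityResolver((publicId, systemId) -> {
            requested.add(systemId);
            return new InputSource(new StringReader(""));
        });
        return builder.parse(in);
    }

    public static void main(String[] args) throws Exception {
        // Small stand-in document; the DTD URL is an assumption for the demo.
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE root SYSTEM \"http://example.invalid/root.dtd\">\n"
                + "<root><item/></root>";
        Document doc = parseWithoutNetwork(new InputSource(new StringReader(xml)));
        System.out.println("root element: " + doc.getDocumentElement().getNodeName());
        System.out.println("external entities requested: " + requested);
    }
}
```

If the list of requested system IDs contains an http:// URL, that is the resource the parser was trying to reach. Replacing the DTD with an empty one also drops any entity declarations and default attribute values it defines, so this only suits documents that do not depend on them; otherwise the resolver could return a local copy of the DTD instead. With Xerces, setting the feature http://apache.org/xml/features/nonvalidating/load-external-dtd to false on a non-validating parser is another option.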