Posted to j-users@xerces.apache.org by Mark Lundquist <lu...@gmail.com> on 2008/08/27 20:25:13 UTC

java.net.ConnectException in spite of EntityResolver

Hi, this is my first post here.  I'm looking for some help with a
problem.  I've been working on some code that invokes Xerces, and I
discovered that this code was not supplying an EntityResolver, even
though there is a catalog-based EntityResolver readily available for it
to use.  This code has been in use for a long time (years), but we
suddenly (a couple of weeks ago) started seeing ConnectException thrown
from the default entity resolver... and due to a coding error by the guy
who originally wrote this code, the exception was getting swallowed,
giving rise to a nasty and hard-to-track-down bug!

I fixed this code, i.e. fixed the broken try{}/catch{} block and also
supplied the catalog-based EntityResolver to the Xerces parser instance.

I know that's working, because (a) we get the ConnectException a lot
less often, and (b) the whole thing is about 3x faster.

But... I still sometimes get "java.net.ConnectException: Connection
refused" from www.w3.org, even though I'm using the catalog-based
EntityResolver, and I don't understand why.  Like I said, it happens a
lot less often now, which makes me think it was getting thrown from two
different places, and using the EntityResolver took care of only one of
them.

The exception stack trace and DOCTYPE are shown below:


Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
    at java.net.Socket.connect(Socket.java:520)
    at java.net.Socket.connect(Socket.java:470)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:157)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:387)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:522)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:231)
    at sun.net.www.http.HttpClient.New(HttpClient.java:304)
    at sun.net.www.http.HttpClient.New(HttpClient.java:321)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:813)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:765)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:690)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:934)
    at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
    at org.apache.xerces.impl.XMLEntityManager.startEntity(Unknown Source)
    at org.apache.xerces.impl.XMLEntityManager.startEntity(Unknown Source)
    at org.apache.xerces.impl.XMLDTDScannerImpl.startPE(Unknown Source)

My DOCTYPE looks like this:

<!DOCTYPE xhtml-fragment
[
<!ENTITY % HTMLlat1 PUBLIC
   "-//W3C//ENTITIES Latin 1 for XHTML//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
%HTMLlat1;
<!ENTITY % HTMLspecial PUBLIC
    "-//W3C//ENTITIES Special for XHTML//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent">
  %HTMLspecial;
]>

I looked at the Xerces source code and found that my EntityResolver is
invoked by XMLEntityManager.resolveEntity(), but not by
XMLEntityManager.startEntity() which is what is being called by
XMLDTDScannerImpl.startPE(), which is what's trying to make the network
connection... for whatever that's worth.

Any idea what's going on here, and how I can make Xerces stop trying to
go out on the 'net altogether?  I thought using the catalog resolver
would take care of it all, but...???
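
One way to see which identifiers actually reach the resolver, and to make
sure a lookup miss never turns into a network fetch, is a resolver along
the lines of the sketch below. It is only an illustration: OfflineResolver,
its in-memory map, and the one-line entity file registered in main() are
hypothetical stand-ins for the real catalog-based resolver and its local
files. The one fact it relies on is that returning null from
resolveEntity() tells the parser to open the system identifier itself.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

/** Hypothetical stand-in for a catalog resolver: records misses, never defers to the network. */
final class OfflineResolver implements EntityResolver {
    // Hypothetical local copies keyed by public identifier; a real catalog maps to files.
    private final Map<String, String> localCopies = new HashMap<>();
    // Every lookup that had no local copy, as "publicId -> systemId".
    final List<String> misses = new ArrayList<>();

    static final String SAMPLE =
          "<!DOCTYPE xhtml-fragment [\n"
        + "<!ENTITY % HTMLlat1 PUBLIC \"-//W3C//ENTITIES Latin 1 for XHTML//EN\"\n"
        + "  \"http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent\">\n"
        + "%HTMLlat1;\n"
        + "<!ENTITY % HTMLspecial PUBLIC \"-//W3C//ENTITIES Special for XHTML//EN\"\n"
        + "  \"http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent\">\n"
        + "%HTMLspecial;\n"
        + "]>\n"
        + "<xhtml-fragment>a&nbsp;b</xhtml-fragment>";

    void register(String publicId, String content) {
        localCopies.put(publicId, content);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String content = localCopies.get(publicId);
        if (content == null) {
            // Returning null here would let the parser open systemId itself.
            misses.add(publicId + " -> " + systemId);
            content = ""; // substitute an empty entity instead
        }
        InputSource source = new InputSource(new StringReader(content));
        source.setPublicId(publicId);
        source.setSystemId(systemId);
        return source;
    }

    /** Parses xml with the given resolver; checked exceptions are wrapped, not swallowed. */
    static Document parse(String xml, EntityResolver resolver) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setEntityResolver(resolver);
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        OfflineResolver resolver = new OfflineResolver();
        // Hypothetical one-line replacement for the real xhtml-lat1.ent file.
        resolver.register("-//W3C//ENTITIES Latin 1 for XHTML//EN", "<!ENTITY nbsp \"&#160;\">");
        Document doc = parse(SAMPLE, resolver);
        System.out.println("root element: " + doc.getDocumentElement().getNodeName());
        System.out.println("unresolved lookups: " + resolver.misses);
    }
}
```

Anything that shows up in the misses list is an identifier the catalog
does not cover, which would be one explanation for an occasional
connection attempt.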

Also, does anyone have an inside story on why we're suddenly seeing
all these "connection refused" errors from www.w3.org?  I mean, it's not
like it happens every time, but it seems to happen with some regularity...
and it all started a couple of weeks ago.


---------------------------------------------------------------------
To unsubscribe, e-mail: j-users-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-users-help@xerces.apache.org


Re: java.net.ConnectException in spite of EntityResolver

Posted by Frédéric Filliat <fe...@gmail.com>.
Hello Mark,

I had the same problem.
You have to disable the feature
"http://apache.org/xml/features/nonvalidating/load-external-dtd".

Example:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setFeature(
    "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);

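A slightly fuller sketch of the same idea follows. Note that setFeature()
throws ParserConfigurationException, so it needs a try/catch or a throws
clause. Since the stack trace in the question comes from a parameter-entity
reference (startPE), the sketch also switches off the SAX features
external-parameter-entities and external-general-entities; that part is an
assumption about which switch matters in this case, not something that was
confirmed in this thread. The class name NoExternalEntities is made up for
the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical helper: a DOM factory configured not to load external DTD pieces. */
final class NoExternalEntities {

    static final String SAMPLE =
          "<!DOCTYPE xhtml-fragment [\n"
        + "<!ENTITY % HTMLlat1 PUBLIC \"-//W3C//ENTITIES Latin 1 for XHTML//EN\"\n"
        + "  \"http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent\">\n"
        + "%HTMLlat1;\n"
        + "]>\n"
        + "<xhtml-fragment>hello</xhtml-fragment>";

    static DocumentBuilderFactory newOfflineFactory() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        try {
            // The feature from the reply above: skip the external DTD subset when not validating.
            factory.setFeature(
                "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
            // Assumption: these two also matter for references such as %HTMLlat1;.
            factory.setFeature(
                "http://xml.org/sax/features/external-parameter-entities", false);
            factory.setFeature(
                "http://xml.org/sax/features/external-general-entities", false);
        } catch (ParserConfigurationException e) {
            // Surface the problem instead of swallowing it.
            throw new IllegalStateException("parser does not support a required feature", e);
        }
        return factory;
    }

    static Document parse(String xml) {
        try {
            return newOfflineFactory().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println("root element: " + doc.getDocumentElement().getNodeName());
    }
}
```

The trade-off is that entities declared only in the skipped files (for
example &nbsp;) can no longer be expanded, so this fits documents that do
not use them; otherwise a resolver with local copies is the better route.
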
Enjoy !

Fred


