Posted to user@lenya.apache.org by Peter Shipley <ps...@nomensa.com> on 2005/01/27 20:18:10 UTC

Page exception errors generated whilst mirroring a live site

I am using a utility called httrack to mirror a site and have noticed a 
couple of pages are showing me cocoon error pages.

The utility is only running about 2 concurrent connections against the 
live version of a site in tomcat. It's random as to which pages suffer 
the error, and all pages work normally when visited in a browser.

The error being:

Original Exception: javax.xml.transform.TransformerException: java.lang.NullPointerException
	at org.apache.xalan.xsltc.trax.TransformerHandlerImpl.endDocument(TransformerHandlerImpl.java:240)
	at org.apache.cocoon.xml.AbstractXMLPipe.endDocument(AbstractXMLPipe.java:56)
	at org.apache.cocoon.transformation.TraxTransformer.endDocument(TraxTransformer.java:549)
	at org.apache.cocoon.sitemap.ContentAggregator.generate(ContentAggregator.java:134)


The full error can be found at http://pshipley.homeip.net/stacktrace-20040127.html


My only guess at this point is that the speed or number of requests 
triggers a problem at some critical point. Since the number of pages 
and the request rate are trivial in this case, that worries me a little. 
Whilst surfing the live site in a browser there are no errors.

Any help on this is greatly appreciated.

Peter


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@lenya.apache.org
For additional commands, e-mail: user-help@lenya.apache.org


Re: Page exception errors generated whilst mirroring a live site

Posted by Peter Shipley <ps...@nomensa.com>.
> that's my thought too. when you say external program, what do you mean? 
> did you write your own exporter?

A program called 'httrack'; all it does is spider the site and download all 
pages and assets.

It was supposed to be a quick and easy way of getting around some current 
problems associated with the lenya export utility, and a stop gap before trying 
to use something like the cocoon cli.


> if you figured out more clues as to when this happens, it will be very 
> helpful to track down the problem.


I will ensure everyone is kept up-to-date ;)



Re: Page exception errors generated whilst mirroring a live site

Posted by "Gregor J. Rothfuss" <gr...@apache.org>.
Peter Shipley wrote:
>> are you doing reads & writes to the sitetree at the same time?
> 
> Can say with certainty I am the only user at the moment and it would be 
> reading only.
> 
> Of course, I would hope this shouldn't represent a problem as mirroring 
> may occur as people update the site. The external program will not 
> attempt to write to the sitetree at all.

that's my thought too. when you say external program, what do you mean? 
did you write your own exporter?

>> maybe set the logging in the pertaining classes to log.error() for the 
>> time being?
> 
> 
> I will give it a go .. just not sure what I am looking for.

if you figured out more clues as to when this happens, it will be very 
helpful to track down the problem.


-- 
Gregor J. Rothfuss
COO, Wyona       Content Management Solutions    http://wyona.com
Apache Lenya                              http://lenya.apache.org
gregor.rothfuss@wyona.com                       gregor@apache.org



Re: Page exception errors generated whilst mirroring a live site

Posted by Peter Shipley <ps...@nomensa.com>.
> are you doing reads & writes to the sitetree at the same time?
I can say with certainty that I am the only user at the moment, and it would be 
reading only.

Of course, I would hope this shouldn't represent a problem as mirroring may 
occur as people update the site. The external program will not attempt to write 
to the sitetree at all.

> maybe set the logging in the pertaining classes to log.error() for the 
> time being?

I will give it a go .. just not sure what I am looking for.

Cheers

Peter



Re: Page exception errors generated whilst mirroring a live site

Posted by "Gregor J. Rothfuss" <gr...@apache.org>.
Peter Shipley wrote:

> I have observed that on different machines running the export program 
> some produce more errors than others, cutting it down to one thread on 
> one particular machine will usually work fine (on another it won't).
> 
> Are there any grounds where this call might fail - perhaps access 
> control is denying access to sitetree.xml ?? (clutching .. straws).

are you doing reads & writes to the sitetree at the same time?

> I have noticed a few AccessControl Exceptions as well.
> 
> The problem is when I turn on debugging the error is not reproduced. I 
> can only assume because the delay involved in writing out the debug 
> information slows down the crawl of the site so that a test which might 
> take 50 seconds to crawl now takes 30mins.

ouch!

> Any ideas how to approach solving this sort of problem is greatly 
> appreciated.

maybe set the logging in the pertaining classes to log.error() for the 
time being?
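
The point of error-level logging here is that it only fires on the failure 
path, so it should not slow the crawl the way full debug output does. A 
minimal sketch of the idea, using java.util.logging as a stand-in (Lenya's own 
logger setup is not shown in this thread, and LookupLogging and logFailures 
are hypothetical names for illustration only):

```java
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

public class LookupLogging {
    private static final Logger LOG =
            Logger.getLogger(LookupLogging.class.getName());

    /**
     * Runs the given lookup. Nothing is logged on success; on failure the
     * exception is logged at error (SEVERE) level together with the name of
     * the current thread, and then rethrown unchanged.
     */
    public static <T> T logFailures(String what, Callable<T> lookup)
            throws Exception {
        try {
            return lookup.call();
        } catch (Exception e) {
            LOG.log(Level.SEVERE,
                    "Lookup failed: " + what
                            + " [thread=" + Thread.currentThread().getName() + "]",
                    e);
            throw e;
        }
    }
}
```

Wrapping only the suspect call keeps the normal path as fast as before, which 
matters if the failure is timing-dependent.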

-- 
Gregor J. Rothfuss
COO, Wyona       Content Management Solutions    http://wyona.com
Apache Lenya                              http://lenya.apache.org
gregor.rothfuss@wyona.com                       gregor@apache.org



Re: Page exception errors generated whilst mirroring a live site

Posted by Peter Shipley <ps...@nomensa.com>.
Gregor J. Rothfuss wrote:

> looks like DefaultSiteTree.findNode() is failing for some reason:
> 
>             NodeList nodes = node.getChildNodes();
> 
>             for (int i = 0; i < nodes.getLength(); i++) {
> 
> 
> i wonder why the getLength fails.. can you file a bug with this 
> information?
> 

I am not sure it's not a bug I have introduced in some of my changes.

I have observed that on different machines running the export program some 
produce more errors than others; cutting it down to one thread on one particular 
machine will usually work fine (on another it won't).

Are there any grounds where this call might fail - perhaps access control is 
denying access to sitetree.xml?? (clutching at straws).

I have noticed a few AccessControl Exceptions as well.

The problem is when I turn on debugging the error is not reproduced. I can only 
assume because the delay involved in writing out the debug information slows 
down the crawl of the site so that a test which might take 50 seconds to crawl 
now takes 30mins.

Any ideas on how to approach solving this sort of problem are greatly appreciated.
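
One way to approach a timing-dependent failure like this, offered as a sketch 
rather than a tested recipe for Lenya, is to reproduce it outside the crawler 
by calling the suspect code from several threads at once and counting how 
often it throws. StressHarness and countFailures are hypothetical names; the 
task passed in would be whatever call is under suspicion.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class StressHarness {
    /**
     * Runs the task 'iterations' times on each of 'threads' worker threads
     * and returns how many of those calls threw an exception.
     */
    public static int countFailures(int threads, int iterations, Callable<?> task)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        AtomicInteger failures = new AtomicInteger();

        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    // Hold every worker here so they all start together.
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < iterations; i++) {
                    try {
                        task.call();
                    } catch (Exception e) {
                        failures.incrementAndGet();
                    }
                }
            });
        }

        start.countDown();   // release all workers at once
        pool.shutdown();     // accept no further tasks
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return failures.get();
    }
}
```

If the failure count is zero with one thread and non-zero with two or more, 
that would point towards a concurrency problem rather than access control.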



Re: Page exception errors generated whilst mirroring a live site

Posted by "Gregor J. Rothfuss" <gr...@apache.org>.
Peter Shipley wrote:
> 
> I am using a utility called httrack to mirror a site and have noticed a 
> couple of pages are showing me cocoon error pages.
> 
> The utility is only running about 2 concurrent connections against the 
> live version of a site in tomcat. Its random as to which pages suffer 
> the error and all pages work normally when you visit the page.
> 
> The error being:
> 
> Original Exception: javax.xml.transform.TransformerException: 
> java.lang.NullPointerException

cause, from your log:

java.lang.NullPointerException
	at org.apache.xerces.dom.ParentNode.nodeListGetLength(Unknown Source)
	at org.apache.xerces.dom.ParentNode.getLength(Unknown Source)
	at org.apache.lenya.cms.publication.DefaultSiteTree.findNode(DefaultSiteTree.java:182)
	at org.apache.lenya.cms.publication.DefaultSiteTree.getNodeInternal(DefaultSiteTree.java:445)
	at org.apache.lenya.cms.publication.DefaultSiteTree.getNode(DefaultSiteTree.java:457)
	at org.apache.lenya.cms.publication.DefaultDocument.exists(DefaultDocument.java:319)
	at org.apache.lenya.cms.cocoon.transformation.LinkRewritingTransformer.startElement(LinkRewritingTransformer.java:228)

looks like DefaultSiteTree.findNode() is failing for some reason:

             NodeList nodes = node.getChildNodes();

             for (int i = 0; i < nodes.getLength(); i++) {


i wonder why the getLength fails.. can you file a bug with this information?
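
One possible explanation, stated as an assumption and not a confirmed 
diagnosis: the Xerces DOM implementation is documented as not thread-safe, so 
two requests walking the same shared Document at the same moment could fail 
inside getLength() even when neither of them writes. A minimal sketch of 
serializing access to a shared tree follows. GuardedSiteTree, findChild and 
the "id" attribute are hypothetical names for illustration; this is not 
Lenya's actual DefaultSiteTree code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class GuardedSiteTree {
    private final Document document;
    private final Object lock = new Object();

    public GuardedSiteTree(Document document) {
        this.document = document;
    }

    /** Parses an XML string into a guarded tree (convenience for the sketch). */
    public static GuardedSiteTree parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return new GuardedSiteTree(doc);
    }

    /**
     * Returns the direct child element of the root whose "id" attribute
     * equals the given id, or null if there is none. The whole traversal
     * runs under one lock so only one thread touches the DOM at a time.
     */
    public Element findChild(String id) {
        synchronized (lock) {
            NodeList nodes = document.getDocumentElement().getChildNodes();
            for (int i = 0; i < nodes.getLength(); i++) {
                Node node = nodes.item(i);
                if (node.getNodeType() == Node.ELEMENT_NODE
                        && id.equals(((Element) node).getAttribute("id"))) {
                    return (Element) node;
                }
            }
            return null;
        }
    }
}
```

A single lock is the bluntest option; it only helps if every reader and 
writer of the shared Document goes through it.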

-- 
Gregor J. Rothfuss
COO, Wyona       Content Management Solutions    http://wyona.com
Apache Lenya                              http://lenya.apache.org
gregor.rothfuss@wyona.com                       gregor@apache.org
