Posted to user@nutch.apache.org by Stjepan Marjanovic <m_...@yahoo.com> on 2007/04/04 16:06:52 UTC

Nutch - incorrect JavaScript url

Hi,
I am running Nutch against a web site on my localhost. The application
contains JavaScript files that build URLs dynamically, and Nutch does not
extract them correctly.

My question is: what should I configure so that Nutch recognizes these URLs
and crawls the site completely?

Below is part of the log file that Nutch generates:

fetching https://www.localhost/script/ShockwaveFlash.ShockwaveFlash.
fetching https://www.localhost/script/
fetching https://www.localhost/script/webtv/2.6
fetching https://www.localhost/script/_level0/_root
fetching https://www.localhost/script/betslip.aspx
fetching https://www.localhost/shared/script/+s_c2fe(c.substring(o+1,e))+
fetching https://www.localhost/shared/script/)<0)||oc.indexOf(
fetching https://www.localhost/shared/script/+m).indexOf(
fetching https://www.localhost/shared/script/c.indexOf(\
fetching https://www.localhost/shared/script/);else{if(s.ismac&&s.u.indexOf(
fetching https://www.localhost/registration.aspx#
fetch of https://www.localhost/shared/script/)<0)||oc.indexOf( failed with: java.lang.IllegalArgumentException: Invalid uri 'https://www.localhost/shared/script/)<0)||oc.indexOf(': escaped absolute path not valid
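To illustrate why the last entry fails, here is a minimal sketch that checks two of the logged strings with the JDK's java.net.URI. This is an assumption made for illustration only: the exception above comes from the HTTP client that Nutch uses, not from java.net.URI, so the exact rules and messages may differ. The class name UriCheck and the method isValidUri are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriCheck {
    // Hypothetical helper: returns true if java.net.URI accepts the string as-is.
    // java.net.URI is a stand-in here, not the parser that Nutch itself uses.
    static boolean isValidUri(String candidate) {
        try {
            new URI(candidate);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Both strings are copied from the log excerpt above.
        String[] candidates = {
            "https://www.localhost/script/betslip.aspx",
            "https://www.localhost/shared/script/)<0)||oc.indexOf("
        };
        for (String c : candidates) {
            System.out.println((isValidUri(c) ? "ok      " : "invalid ") + c);
        }
    }
}
```

The second string contains characters such as '<' and '|' that are not legal in a URI path, which suggests it is a fragment of JavaScript source picked up as a link rather than a real URL.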


Thanks.

Stjepan




 