Posted to user@nutch.apache.org by Stjepan Marjanovic <m_...@yahoo.com> on 2007/04/04 16:06:52 UTC
Nutch - incorrect JavaScript url
Hi,
I ran Nutch against a web site on my localhost. The application has
JavaScript files that build URLs dynamically.
My question is: what should I configure so that Nutch recognizes these URLs
and crawls the site completely?
Below is part of the log file that Nutch generates:
fetching https://www.localhost/script/ShockwaveFlash.ShockwaveFlash.
fetching https://www.localhost/script/
fetching https://www.localhost/script/webtv/2.6
fetching https://www.localhost/script/_level0/_root
fetching https://www.localhost/script/betslip.aspx
fetching https://www.localhost/shared/script/+s_c2fe(c.substring(o+1,e))+
fetching https://www.localhost/shared/script/)<0)||oc.indexOf(
fetching https://www.localhost/shared/script/+m).indexOf(
fetching https://www.localhost/shared/script/c.indexOf(\
fetching https://www.localhost/shared/script/);else{if(s.ismac&&s.u.indexOf(
fetching https://www.localhost/registration.aspx#
fetch of https://www.localhost/shared/script/)<0)||oc.indexOf( failed with: java.lang.IllegalArgumentException: Invalid uri 'https://www.localhost/shared/script/)<0)||oc.indexOf(': escaped absolute path not valid
Thanks.
Stjepan