Posted to user@nutch.apache.org by Andy Morris <an...@woodward.edu> on 2006/02/06 14:52:49 UTC
Asp pages again
okay, what version of nutch crawls asp pages the best?
I can't seem to get a good crawl of my site.
andy
Re: Asp pages again
Posted by Stefan Groschupf <sg...@media-style.com>.
I guess this is more a question of configuration than of the
version.
In any case I suggest using the latest nightly build, since - well -
this is an active open source project. :-)
Carefully check your URL regex filters, and also check what content
type your web server returns. There is a known issue in 0.7 with wrongly
returned content types. I'm not sure whether these issues are already fixed:
http://issues.apache.org/jira/browse/nutch-133
http://issues.apache.org/jira/browse/nutch-135
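
For anyone who wants to try the two checks outside of Nutch, here is a minimal Python sketch. It is not Nutch code. It assumes the URL filter file holds one rule per line, each starting with + (accept) or - (reject), that the first matching rule wins, and that a URL matching no rule is dropped. The sample rules, the example.com host and both function names are made up for illustration, and Python's regex engine is not identical to Java's, so treat the result as a rough check only. Note that a rule like -[?*!@=] (which I believe ships in the default filter files) rejects every URL with a query string, a common reason why dynamic .asp pages get skipped.

```python
import re
import urllib.request

# Hypothetical rules in the style of a Nutch URL filter file.
SAMPLE_RULES = [
    r"-\.(gif|jpg|png|css|js)$",              # skip images and static assets
    r"-[?*!@=]",                              # skip URLs that look like queries
    r"+^http://([a-z0-9]*\.)*example\.com/",  # accept the site being crawled
    r"-.",                                    # reject everything else
]


def is_accepted(rules, url):
    """Emulate a first-match-wins +/- regex filter (assumed semantics)."""
    for rule in rules:
        rule = rule.strip()
        if not rule or rule.startswith("#"):
            continue  # ignore blank lines and comments
        sign, pattern = rule[0], rule[1:]
        if sign not in "+-":
            raise ValueError("rule must start with + or -: %r" % rule)
        if re.search(pattern, url):
            return sign == "+"
    return False  # no rule matched, so the URL is dropped


def fetch_content_type(url):
    """Return the Content-Type header the server sends for url (HEAD request)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get("Content-Type")
```

Call is_accepted(SAMPLE_RULES, url) with a few of your own .asp URLs and your real rules to see which rule drops them, and call fetch_content_type(url) against your server to see whether the pages come back as text/html or as something the parser will not handle.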
On 06.02.2006 at 14:52, Andy Morris wrote:
> okay, what version of nutch crawls asp pages the best?
> I can't seem to get a good crawl of my site.
> andy
>
---------------------------------------------------------------
company: http://www.media-style.com
forum: http://www.text-mining.org
blog: http://www.find23.net