Posted to user@nutch.apache.org by Paul M Lieberman <pa...@alum.mit.edu> on 2006/08/12 21:38:02 UTC
crawl w/o store
Y'all -
I need to do an intranet crawl in order to get a list of all URLs
fetched. I do NOT want to store the data for this crawl. I understand
there is a configuration option to do just this. Which file do I change
(conf/nutch-site.xml?), and what do I need to add to it?
I'm running nutch 0.72.
- Paul M Lieberman
Re: crawl w/o store
Posted by Dennis Kubes <nu...@dragonflymc.com>.
Add the property below to conf/nutch-site.xml. Values set there take
precedence over the defaults in nutch-default.xml. This is for Nutch 0.8;
I am not sure whether it is the same for 0.72.
<property>
  <name>fetcher.store.content</name>
  <value>false</value>
  <description>If true, fetcher will store content.</description>
</property>
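
For context, here is a minimal sketch of how the whole conf/nutch-site.xml might look with this override in place. It assumes the Nutch 0.8 layout, where the root element is <configuration>. Older releases may use a different root element, so compare it against the nutch-default.xml shipped with your version before copying it.

```xml
<?xml version="1.0"?>
<!-- conf/nutch-site.xml: site-specific overrides (sketch, assumes Nutch 0.8 layout) -->
<configuration>
  <!-- Overrides the default from nutch-default.xml so fetched content is not stored -->
  <property>
    <name>fetcher.store.content</name>
    <value>false</value>
    <description>If true, fetcher will store content.</description>
  </property>
</configuration>
```

Only properties that differ from the defaults need to appear in this file; everything else continues to come from nutch-default.xml.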
Dennis