Posted to user@nutch.apache.org by Edward Quick <ed...@hotmail.com> on 2005/09/19 13:58:33 UTC

type: searches

Hi,

I really want to get the type: searches working but so far I haven't been 
able to get back any results. Does my plugin configuration below look 
alright?

<property>
<name>plugin.includes</name>
<value>protocol-httpclient|urlfilter-regex|parse-(text|html|js|msword|pdf|rss|ext)|index-(basic|more)|query-(basic|more|site|url)</value>
<description>Regular expression naming plugin directory names to
include.  Any plugin not matching this expression is excluded.  By
default Nutch includes crawling just HTML and plain text via HTTP,
and basic indexing and search plugins.
</description>
</property>


Thanks for any help.

Ed.



Re: type: searches

Posted by Johannes Söllner <jo...@gmx.net>.
Hi, mine looks like this, and it works:

<property>
  <name>plugin.includes</name>
  <value>protocol-httpclient|urlfilter-regex|parse-(text|html|js|pdf|msword|rss)|index-basic|query-(basic|site|url)</value>
  <description>Regular expression naming plugin directory names to
  include.  Any plugin not matching this expression is excluded.  By
  default Nutch includes crawling just HTML and plain text via HTTP,
  and basic indexing and search plugins.
  </description>
</property>
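
Since the description says that any plugin not matching the expression is excluded, a quick way to compare the two values in this thread is to test a few plugin directory names against each one. The following is only a rough sketch in Python, not Nutch code: it assumes the whole directory name has to match the expression, and the helper name included() and the list of candidate names are made up for illustration.

```python
import re

# The two plugin.includes values quoted in this thread.
ED = ("protocol-httpclient|urlfilter-regex"
      "|parse-(text|html|js|msword|pdf|rss|ext)"
      "|index-(basic|more)|query-(basic|more|site|url)")
JOHANNES = ("protocol-httpclient|urlfilter-regex"
            "|parse-(text|html|js|pdf|msword|rss)"
            "|index-basic|query-(basic|site|url)")

# Illustrative plugin directory names to check (chosen for this sketch).
CANDIDATES = ["index-basic", "index-more", "query-basic",
              "query-more", "query-site", "query-url", "parse-ext"]


def included(pattern, names):
    """Return the names that the expression matches in full.

    Assumption: a plugin is included only if its entire directory
    name matches, so fullmatch() is used rather than search().
    """
    rx = re.compile(pattern)
    return [name for name in names if rx.fullmatch(name)]


if __name__ == "__main__":
    for label, pattern in (("Ed", ED), ("Johannes", JOHANNES)):
        print(label, "includes:", included(pattern, CANDIDATES))
```

Under that assumption, the two values differ only in index-more, query-more and parse-ext, which the first value matches and the second does not.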

To me your config looks good ;-)

Johannes

> --- Original Message ---
> From: "Edward Quick" <ed...@hotmail.com>
> To: nutch-user@lucene.apache.org
> Subject: type: searches
> Date: Mon, 19 Sep 2005 11:58:33 +0000
> 
> Hi,
> 
> I really want to get the type: searches working but so far I haven't been 
> able to get back any results. Does my plugin configuration below look 
> alright?
> 
> <property>
> <name>plugin.includes</name>
> <value>protocol-httpclient|urlfilter-regex|parse-(text|html|js|msword|pdf|rss|ext)|index-(basic|more)|query-(basic|more|site|url)</value>
> <description>Regular expression naming plugin directory names to
> include.  Any plugin not matching this expression is excluded.  By
> default Nutch includes crawling just HTML and plain text via HTTP,
> and basic indexing and search plugins.
> </description>
> </property>
> 
> 
> Thanks for any help.
> 
> Ed.
> 
> 
