Posted to agent@nutch.apache.org by Filipe Antunes <fa...@tecnica.cc> on 2009/04/08 16:09:53 UTC
Subcollection plugin not working
I'm using Nutch 1.0.
My subcollections.xml looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<subcollections>
  <subcollection>
    <name>sub1</name>
    <id>sub1</id>
    <whitelist>
      http://www.apache.org/
    </whitelist>
    <blacklist />
  </subcollection>
  <subcollection>
    <name>sub2</name>
    <id>sub2</id>
    <whitelist>
      http://www.mysql.com/
    </whitelist>
    <blacklist />
  </subcollection>
  <subcollection>
    <name>sub3</name>
    <id>sub3</id>
    <whitelist>
      http://www.redhat.com/
    </whitelist>
    <blacklist />
  </subcollection>
</subcollections>
After indexing, and after making sure that the subcollection plugin was
activated in nutch-site.xml, I checked the index with Luke.
The subcollection field was populated as expected, with sub1, sub2 and sub3.
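For context, a minimal sketch of what activating the plugin in nutch-site.xml typically looks like. The actual file was not posted, so this is an assumption: it takes the plugin id to be "subcollection" and bases the rest of the list on what I believe are the stock Nutch 1.0 defaults.

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>plugin.includes</name>
    <!-- Assumed value: the stock Nutch 1.0 plugin list with "subcollection"
         appended. The poster's real list may differ. -->
    <value>protocol-http|urlfilter-regex|parse-(text|html|js)|index-(basic|anchor)|query-(basic|site|url)|response-(json|xml)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)|subcollection</value>
  </property>
</configuration>
```

The same plugin.includes value has to be visible to whatever runs the search (the command line or the web application), not only to the indexing job, since plugins are loaded from this list in both places.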
The problem is that any search on a subcollection returns zero results
in Luke.
The command line gives the same result:
./bin/nutch org.apache.nutch.searcher.NutchBean "subcollection:sub1 apache"
Total hits: 0
After performing a normal search and following the explain link in the
search results, the subcollection content is correct too, but any search
using subcollection:sub1 text returns no results.
Could this be a bug?