Posted to user@nutch.apache.org by Mark Jones <ma...@quovadx.com> on 2006/08/15 00:28:59 UTC
searching on multiple subcollections
Hi,
I noticed that the "subcollection" field holds multiple space-separated
subcollection values for documents that belong to more than one
subcollection. I read the sources and tried several query syntaxes.
Is there a way to search for membership in multiple subcollections?
An example site:
site.foo.com/index.html
site.foo.com/faq.html
site.foo.com/contact.html
site.foo.com/bar/bar1.html
site.foo.com/foo/foo1.html
A subcollections.xml:
<subcollections>
  <subcollection>
    <name>foosite</name>
    <id>foosite</id>
    <whitelist>http://site.foo.com</whitelist>
    <blacklist />
  </subcollection>
  <subcollection>
    <name>foobar</name>
    <id>foobar</id>
    <whitelist>http://site.foo.com/bar</whitelist>
    <blacklist />
  </subcollection>
</subcollections>
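The configuration above can be read as a mapping from subcollection id to a whitelist URL prefix. Here is a minimal Python sketch of how a URL could end up with a space-separated list of ids, assuming plain prefix matching against each whitelist and an empty blacklist. This is an illustration of the assumed behavior, not Nutch's actual implementation; `SUBCOLLECTIONS` and `subcollection_field` are hypothetical names.

```python
# Hypothetical model of the subcollections.xml above: id -> whitelist prefix.
# Assumption: a URL belongs to a subcollection when it starts with the prefix.
SUBCOLLECTIONS = {
    "foosite": "http://site.foo.com",
    "foobar": "http://site.foo.com/bar",
}


def subcollection_field(url):
    """Return the space-separated ids whose whitelist prefix matches url."""
    return " ".join(
        sub_id
        for sub_id, prefix in SUBCOLLECTIONS.items()
        if url.startswith(prefix)
    )


# A page under /bar matches both whitelists; a page under /foo matches one.
print(subcollection_field("http://site.foo.com/bar/bar1.html"))
print(subcollection_field("http://site.foo.com/foo/foo1.html"))
```

Because the foosite whitelist is a prefix of the foobar whitelist, every foobar document is also a foosite document under this assumption.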
subcollection field for site.foo.com/bar/bar1.html
"foosite foobar"
subcollection field for site.foo.com/foo/foo1.html
"foosite"
I would like to search for documents in subcollections
foosite AND foobar
foosite OR foobar
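To make the intended semantics concrete, here is a small Python sketch of the two membership tests, applied to the space-separated field value. It only describes the desired result; it is not Nutch query syntax, and the helper names `in_all` and `in_any` are hypothetical.

```python
def in_all(field_value, wanted):
    """AND: the document is in every one of the wanted subcollections."""
    return set(wanted) <= set(field_value.split())


def in_any(field_value, wanted):
    """OR: the document is in at least one of the wanted subcollections."""
    return bool(set(wanted) & set(field_value.split()))


wanted = ["foosite", "foobar"]

# bar1.html carries both ids, foo1.html carries only "foosite".
print(in_all("foosite foobar", wanted), in_any("foosite foobar", wanted))
print(in_all("foosite", wanted), in_any("foosite", wanted))
```

With the example site, the AND query should return only bar1.html, while the OR query should return every page.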
Thanks,
Mark
Mark Jones
Sr. Systems Integration Specialist
Mark.Jones@quovadx.com