Posted to solr-user@lucene.apache.org by Fred Zimmerman <zi...@gmail.com> on 2011/11/02 15:52:37 UTC

limiting searches to particular sources

I want to be able to limit some searches to particular sources, e.g. "wiki
only", "crawled only", etc.  So I think I need to create a source field in
the schema.xml.  However, the native data for these sources does not
contain source info (e.g. "crawled").  So I want to use (I think)
<copyField> to add a string to each data set as I import it, e.g.
"website-X-crawl".  So my question is, how do I insert a string value into
a blank field?
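
For reference, a source field of the kind described above might be declared
in schema.xml roughly as follows. This is only a sketch: the field name
"source" is an assumption taken from the question, and "string" is the
untokenized field type shipped in the example schema, so the stored value
has to match the filter value exactly.

```xml
<!-- Hypothetical field holding the origin of each document, e.g. "wiki"
     or "website-X-crawl". An untokenized string type keeps the value intact
     so it can be matched exactly in a filter. -->
<field name="source" type="string" indexed="true" stored="true"/>
```

With such a field populated, a search could be restricted to one source by
adding a filter query to the request, for example fq=source:wiki.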

Re: limiting searches to particular sources

Posted by Markus Jelsma <ma...@openindex.io>.
Nutch already indexes the site and host fields. If that is not enough, you can
use its subcollection plugin to write values for URL patterns.

On Wednesday 02 November 2011 15:52:37 Fred Zimmerman wrote:
> I want to be able to limit some searches to particular sources, e.g. "wiki
> only", "crawled only", etc.  So I think I need to create a source field in
> the schema.xml.  However, the native data for these sources does not
> contain source info (e.g. "crawled").  So I want to use (I think)
> <copyField> to add a string to each data set as I import it, e.g.
> "website-X-crawl".  So my question is, how do I insert a string value into
> a blank field?

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350

Re: limiting searches to particular sources

Posted by Chris Hostetter <ho...@fucit.org>.
: Yes -- how do I specify the field as a constant in DIH?

https://wiki.apache.org/solr/DataImportHandlerFaq#How_would_I_insert_a_static_value_into_a_field_.3F
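
The approach that FAQ entry describes is DIH's TemplateTransformer with a
template that contains only literal text. A minimal sketch of what that could
look like in the DIH configuration is below; the entity name, the query, and
the "source" column are hypothetical placeholders, and the dataSource
attributes are left out because they depend on where the data comes from.

```xml
<dataConfig>
  <!-- dataSource attributes (driver, url, ...) omitted: they depend on the setup -->
  <dataSource type="JdbcDataSource"/>
  <document>
    <!-- Hypothetical entity: every row it imports gets source="wiki" -->
    <entity name="wiki"
            query="SELECT id, title, body FROM wiki_pages"
            transformer="TemplateTransformer">
      <!-- A template with no ${...} variables yields a constant value -->
      <field column="source" template="wiki"/>
    </entity>
  </document>
</dataConfig>
```

Each data set would get its own entity (or its own config) with a different
template string, e.g. "website-X-crawl" for a crawled site.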



-Hoss

Re: limiting searches to particular sources

Posted by Fred Zimmerman <zi...@gmail.com>.
Yes -- how do I specify the field as a constant in DIH?

On Fri, Nov 4, 2011 at 11:17 AM, Erick Erickson <er...@gmail.com> wrote:

> How are you crawling your info? Somewhere you have to inject the
> source into the document, <copyField> won't do the trick because
> there's no source available....
>
> If you're crawling the data by yourself, you can just add the source
> to the document.
>
> If you're using DIH, you can specify the field as a constant. Or you
> could implement a custom Transformer that inserted it for you.
>
> Best
> Erick
>
> On Wed, Nov 2, 2011 at 10:52 AM, Fred Zimmerman <zi...@gmail.com>
> wrote:
> > I want to be able to limit some searches to particular sources, e.g. "wiki
> > only", "crawled only", etc.  So I think I need to create a source field in
> > the schema.xml.  However, the native data for these sources does not
> > contain source info (e.g. "crawled").  So I want to use (I think)
> > <copyField> to add a string to each data set as I import it, e.g.
> > "website-X-crawl".  So my question is, how do I insert a string value into
> > a blank field?
> >
>

Re: limiting searches to particular sources

Posted by Erick Erickson <er...@gmail.com>.
How are you crawling your info? Somewhere you have to inject the
source into the document; <copyField> won't do the trick because
there's no existing field to copy the source from.

If you're crawling the data yourself, you can just add the source
to the document.
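
As an illustration of adding the source yourself, a document posted to Solr's
XML update handler could carry the value as one more field. This is a sketch
only: the "id" and "title" fields and their values are made up, and "source"
is assumed to be defined in schema.xml.

```xml
<add>
  <doc>
    <!-- Hypothetical fields produced by the crawler -->
    <field name="id">http://example.com/page-1</field>
    <field name="title">Example page</field>
    <!-- Constant added by the indexing code for everything from this crawl -->
    <field name="source">website-X-crawl</field>
  </doc>
</add>
```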

If you're using DIH, you can specify the field as a constant. Or you
could implement a custom Transformer that inserts it for you.

Best
Erick

On Wed, Nov 2, 2011 at 10:52 AM, Fred Zimmerman <zi...@gmail.com> wrote:
> I want to be able to limit some searches to particular sources, e.g. "wiki
> only", "crawled only", etc.  So I think I need to create a source field in
> the schema.xml.  However, the native data for these sources does not
> contain source info (e.g. "crawled").  So I want to use (I think)
> <copyField> to add a string to each data set as I import it, e.g.
> "website-X-crawl".  So my question is, how do I insert a string value into
> a blank field?
>