Posted to solr-user@lucene.apache.org by Damien Dudognon <da...@jfg-networks.net> on 2010/08/16 10:06:09 UTC

Analyser depending on field's value

Hi all,

I want to use a different stopword list depending on a field's value. For example, if type == 1, stopwords1.txt should be used to index the "text" field; otherwise stopwords2.txt should be used.

I thought of several solutions, but none of them really satisfies me:
1) use one Solr instance per type, and therefore a distinct index per type;
2) use one field per type, each with its own analysis rules (e.g. a field "text_1" for type "1", which uses "stopwords1.txt", and a field "text_2" for the other types, which uses "stopwords2.txt", ...)
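
For illustration, option 2 could look roughly like the schema fragment below. This is only a sketch: the names "text_1" and "text_2" and the two stopword files come from the description above, while the tokenizer is an assumption (the schema sample further down elides that part of the analyzer chain), and query-time analysis is not shown.

```xml
<!-- Sketch only: these two fieldType elements would go inside <types>. -->
<fieldType name="text_1" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Assumption: tokenizer chosen for illustration; the original chain is elided. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Stopword list for documents with type == 1. -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords1.txt" enablePositionIncrements="true"/>
  </analyzer>
</fieldType>
<fieldType name="text_2" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Stopword list for all other types. -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords2.txt" enablePositionIncrements="true"/>
  </analyzer>
</fieldType>

<!-- Sketch only: these two field elements would go inside <fields>.
     Neither is required, since each document fills only one of them. -->
<field name="text_1" type="text_1" indexed="true" stored="false" omitNorms="false"/>
<field name="text_2" type="text_2" indexed="true" stored="false" omitNorms="false"/>
```

The drawback is the one already mentioned: the number of field types and fields grows with the number of types, and queries have to target every per-type field.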

I am sure that there is a better solution to my problem.

If anyone has a suitable solution to suggest ... :-)

Thanks,
Damien

----------------------------------------
A sample of my schema.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<schema name="myschema" version="1.2">

  <types>
    <fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
    <fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
      </analyzer>

    ...

    </fieldType>
 </types>

 <fields>
   <!-- common fields -->
   <field name="id" type="long" indexed="true" stored="true" required="true" /> 
   <field name="title" type="text" indexed="true" stored="false" omitNorms="false" required="true"/>
   <field name="text" type="text" indexed="true" stored="false" omitNorms="false" required="true"/>
   <field name="type" type="int" indexed="true" stored="true" omitNorms="true" required="true"/>
 </fields>

 <uniqueKey>id</uniqueKey>

 <defaultSearchField>text</defaultSearchField>

 <solrQueryParser defaultOperator="OR"/>

</schema>


Re: Analyser depending on field's value

Posted by Damien Dudognon <da...@jfg-networks.net>.
Thank you for your reply.
I'll apply your patch and try this new feature to see if it meets my needs.

If I understand correctly, your solution is to have one field per type and to select the field to use depending on the value of another field.
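
To make that reading concrete, here is a rough sketch of what the indexed documents might look like if the per-type routing were done by the indexing client. It does not attempt to show how the SOLR-1536 patch itself works. The field names "text_1" and "text_2" are taken from option 2 in my first message, the field values are made up, and the sketch assumes those two fields replace the required "text" field of my current schema.

```xml
<!-- Hypothetical update message in Solr's XML update format. -->
<add>
  <doc>
    <field name="id">1</field>
    <field name="title">First document</field>
    <field name="type">1</field>
    <!-- type == 1, so the body goes to the field analyzed with stopwords1.txt. -->
    <field name="text_1">Body of the first document</field>
  </doc>
  <doc>
    <field name="id">2</field>
    <field name="title">Second document</field>
    <field name="type">2</field>
    <!-- Any other type, so the body goes to the field analyzed with stopwords2.txt. -->
    <field name="text_2">Body of the second document</field>
  </doc>
</add>
```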

Ideally, I would apply different preprocessing to my data depending on the value of a field, and store the result in a single field. I'll also take a look at "StopFilter" to see if I can implement a custom filter with an additional parameter (the fields to consider).

Best regards,
Damien

Re: Analyser depending on field's value

Posted by Andrzej Bialecki <ab...@getopt.org>.
On 2010-08-16 10:06, Damien Dudognon wrote:
> Hi all,
>
> I want to use a specific stopword list depending on a field's value. For example, if type == 1 then I use stopwords1.txt to index "text" field, else I use stopwords2.txt.
>
> I thought of several solutions but no one really satisfied me:
> 1) use one Solr instance by type, and therefore a distinct index by type;
> 2) use as many fields as types with specific rules for each field (e.g. a field "text_1" for the type "1" which uses "stopwords1.txt", "text_2" for other types which uses "stopwords2.txt", ...)
>
> I am sure that there is a better solution to my problem.
>
> If anyone have a suitable solution to suggest to me ... :-)

Perhaps the solution described here:

https://issues.apache.org/jira/browse/SOLR-1536

Take a look at the example that uses token types to put text into 
different fields, which can then be analyzed differently.

-- 
Best regards,
Andrzej Bialecki     <><