Posted to solr-user@lucene.apache.org by Brian Lamb <br...@journalexperts.com> on 2011/05/25 19:01:33 UTC

Similarity per field

Hi all,

I sent a mail about this topic a week ago, but now that I have a better
understanding of how the similarity class works, I wanted to start a new
thread with more information about what I'm doing, what I want to do, and
how I can make it work correctly.

I have written a similarity class that I would like applied to a specific
field.

This is how I am defining the fieldType:

<fieldType name="edgengram_cust" class="solr.TextField"
           positionIncrementGap="1000">
   <analyzer>
     <tokenizer class="solr.LowerCaseTokenizerFactory" />
     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
             maxGramSize="1" side="front" />
   </analyzer>
   <similarity class="my.package.similarity.MySimilarity"/>
</fieldType>

And then I assign a specific field to that fieldType:

<field name="myfield" multiValued="true" type="edgengram_cust"
       indexed="true" stored="true" required="false" omitNorms="true" />

Then I restarted Solr and did a full-import. However, the changes I have
made do not appear to be taking hold. For simplicity, right now I just have
the idf function returning 1. When I search on this field with
debugQuery=on, the idf should be 1, but it is still computed the way it
normally is.

To try and nail down where the problem occurs, I commented out the
similarity class definition in the fieldType and added it globally to the
schema file:

<similarity class="my.package.similarity.MySimilarity"/>

Then I restarted Solr and did a full-import. This time, the idf scores were
all 1. So it seems to me the problem is not with my similarity class but with
trying to apply it to a specific fieldType.
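
For anyone following along, the two placements compared above would sit
roughly as follows in schema.xml. This is only a sketch: it assumes the older
layout with <types> and <fields> wrapper sections, the schema name is a
placeholder, and everything unrelated to the similarity is left out. Only one
of the two <similarity> elements was enabled at a time in the tests described
here.

```xml
<schema name="example"> <!-- placeholder name; other attributes omitted -->
  <types>
    <fieldType name="edgengram_cust" class="solr.TextField"
               positionIncrementGap="1000">
      <analyzer>
        <tokenizer class="solr.LowerCaseTokenizerFactory" />
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
                maxGramSize="1" side="front" />
      </analyzer>
      <!-- Placement 1: per-fieldType similarity
           (the case that is not taking effect) -->
      <similarity class="my.package.similarity.MySimilarity"/>
    </fieldType>
  </types>

  <fields>
    <field name="myfield" multiValued="true" type="edgengram_cust"
           indexed="true" stored="true" required="false" omitNorms="true" />
  </fields>

  <!-- Placement 2: global similarity, a direct child of the schema element
       (the case where idf came back as 1). Commented out here so that
       only one placement is active. -->
  <!-- <similarity class="my.package.similarity.MySimilarity"/> -->
</schema>
```

Swapping which of the two elements is commented out, then restarting and
re-indexing, reproduces the comparison described in this message.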

According to https://issues.apache.org/jira/browse/SOLR-2338, this should be
in trunk now, yes? I have run svn up on both my Lucene and Solr installs
and it is still not recognizing the similarity on a per-field basis.

Is the tag different inside a fieldType? Did I not update Solr correctly?
Where is my mistake?

Thanks,

Brian Lamb

Re: Similarity per field

Posted by Brian Lamb <br...@journalexperts.com>.
I'm still not having any luck with this. Has anyone actually gotten this to
work so far? I feel like I've followed the directions to the letter but it
just doesn't work.

Thanks,

Brian Lamb

On Wed, May 25, 2011 at 2:48 PM, Brian Lamb
<br...@journalexperts.com> wrote:

> I looked at the patch page and saw the files that were changed. I went into
> my install and looked at those same files and found that they had indeed
> been changed. So it looks like I have the correct version of solr.

Re: Similarity per field

Posted by Brian Lamb <br...@journalexperts.com>.
I looked at the patch page and saw the files that were changed. I went into
my install and looked at those same files and found that they had indeed
been changed. So it looks like I have the correct version of solr.
