Posted to solr-user@lucene.apache.org by PeterKerk <ve...@hotmail.com> on 2010/10/14 13:55:07 UTC

check if field CONTAINS a value, as opposed to IS of a value

I am trying to determine whether a certain word occurs within a field.

http://localhost:8983/solr/db/select/?indent=on&facet=true&fl=id,title&q=introtext:hi

This works only if there is an EXACT match on the field introtext, i.e. the
field value is just "hi".

But if the field value is "hi there, this is just some text", the above
URL no longer finds this record.

Which query parameter tells Solr to look inside the introtext field
for a value (and, even better, also for synonyms)?
-- 
View this message in context: http://lucene.472066.n3.nabble.com/check-if-field-CONTAINS-a-value-as-opposed-to-IS-of-a-value-tp1700495p1700495.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: check if field CONTAINS a value, as opposed to IS of a value

Posted by PeterKerk <ve...@hotmail.com>.
Nice! :)
No further questions SIR! ;)

Thanks!

Re: check if field CONTAINS a value, as opposed to IS of a value

Posted by Savvas-Andreas Moysidis <sa...@googlemail.com>.
Correct, it shows the transformations that happen to your indexed term (or
query term, if you use the *Field value (query)* box) after each
Tokenizer/Filter is executed.

On 14 October 2010 14:40, PeterKerk <ve...@hotmail.com> wrote:

>
> Awesome again!
>
> And for my understanding, I type a single word "Boston" and then I see 7
> lines of output:
> Boston
> Boston
> Boston
> Boston
> boston
> boston
> boston
>
>
> So each line represents what is done to the query value after it has passed
> through the filter?

Re: check if field CONTAINS a value, as opposed to IS of a value

Posted by PeterKerk <ve...@hotmail.com>.
Awesome again!

And for my understanding, I type a single word "Boston" and then I see 7
lines of output:
Boston
Boston
Boston
Boston
boston
boston
boston


So each line represents what is done to the query value after it has passed
through the filter?

Re: check if field CONTAINS a value, as opposed to IS of a value

Posted by Savvas-Andreas Moysidis <sa...@googlemail.com>.
Yep, the Solr Admin web app provides functionality that does exactly
that. It can be reached at
http://{serverName}:{serverPort}/solr/admin/analysis.jsp

On 14 October 2010 14:28, PeterKerk <ve...@hotmail.com> wrote:

>
> It DOES work :)
>
> Oh, and on the filters... is there some sort of debug/overview tool to see
> what each filter does and what an input string looks like after going
> through a filter?

Re: check if field CONTAINS a value, as opposed to IS of a value

Posted by PeterKerk <ve...@hotmail.com>.
It DOES work :)

Oh, and on the filters... is there some sort of debug/overview tool to see
what each filter does and what an input string looks like after going through
a filter?

Re: check if field CONTAINS a value, as opposed to IS of a value

Posted by Savvas-Andreas Moysidis <sa...@googlemail.com>.
I think this should work. It might also be a good idea to investigate how
exactly each filter in the chain modifies your original text. This way you
will be able to better understand why certain queries match certain
documents.

On 14 October 2010 14:18, PeterKerk <ve...@hotmail.com> wrote:

>
> Correct, thanks!
>
> I have used the following:
>
>    <fieldType name="text" class="solr.TextField"
> positionIncrementGap="100">
>      <analyzer type="index">
>        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>        <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords_dutch.txt"/>
>        <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>        <filter class="solr.LowerCaseFilterFactory"/>
>        <filter class="solr.EnglishPorterFilterFactory"
> protected="protwords.txt"/>
>        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>      </analyzer>
>      <analyzer type="query">
>        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> ignoreCase="true" expand="true"/>
>        <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords_dutch.txt"/>
>        <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="0"
> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>        <filter class="solr.LowerCaseFilterFactory"/>
>        <filter class="solr.EnglishPorterFilterFactory"
> protected="protwords.txt"/>
>        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>      </analyzer>
>    </fieldType>

Re: check if field CONTAINS a value, as opposed to IS of a value

Posted by PeterKerk <ve...@hotmail.com>.
Correct, thanks!

I have used the following: 

    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords_dutch.txt"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1" catenateWords="1"
                catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPorterFilterFactory"
                protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
                ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords_dutch.txt"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1" catenateWords="0"
                catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPorterFilterFactory"
                protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>

Re: check if field CONTAINS a value, as opposed to IS of a value

Posted by Savvas-Andreas Moysidis <sa...@googlemail.com>.
Verbatim from schema.xml:
" <!-- The StrField type is not analyzed, but indexed/stored verbatim.
       - StrField and TextField support an optional compressThreshold which
       limits compression (if enabled in the derived fields) to values which
       exceed a certain size (in characters).
    --> "

So basically, this means that when you index "Hello there mate" into a
StrField, the only term that is indexed, and therefore searchable, is the
exact string "Hello there mate" and *not* the individual terms Hello, there,
and mate. What you need is a solr.TextField-based type, which splits
(tokenizes) your text.
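
As a rough sketch of that change (an assumption pieced together from the
definitions posted in this thread, not a tested configuration), the introtext
field could point at a tokenized type instead of "string". The type name
"text" is taken from the TextField definition posted elsewhere in this
thread; any solr.TextField-based type with a tokenizer should behave
similarly.

```xml
<!-- Before: StrField indexes the whole value as one single term. -->
<!-- <field name="introtext" type="string" indexed="true" stored="true"/> -->

<!-- After (assumed): a solr.TextField-based type, here named "text",
     so that each word of the value becomes its own searchable term. -->
<field name="introtext" type="text" indexed="true" stored="true"/>
```

Documents indexed before such a change would need to be reindexed, since the
analysis is applied at index time.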

On 14 October 2010 14:07, PeterKerk <ve...@hotmail.com> wrote:

>
> This is the definition
>
> <fieldType name="string" class="solr.StrField" sortMissingLast="true"
> omitNorms="true"/>
>
> <field name="introtext" type="string" indexed="true" stored="true"/>

Re: check if field CONTAINS a value, as opposed to IS of a value

Posted by PeterKerk <ve...@hotmail.com>.
This is the definition

<fieldType name="string" class="solr.StrField" sortMissingLast="true"
           omitNorms="true"/>

<field name="introtext" type="string" indexed="true" stored="true"/>



Re: check if field CONTAINS a value, as opposed to IS of a value

Posted by Savvas-Andreas Moysidis <sa...@googlemail.com>.
It looks like you are not tokenizing your field properly. What does your
schema.xml look like?

On 14 October 2010 13:01, Allistair Crossley <al...@roxxor.co.uk> wrote:

> Actually, no you don't. If you want to find "hi" in a sentence such as
> "hi there this is me", that is just normal tokenizing and should work.
> Check your field type/analysers.
>
> On Oct 14, 2010, at 7:59 AM, Allistair Crossley wrote:
>
> > I think you need to look at ngram tokenizing.
> >
> > On Oct 14, 2010, at 7:55 AM, PeterKerk wrote:
> >
> >> I am trying to determine whether a certain word occurs within a field.
> >>
> >> http://localhost:8983/solr/db/select/?indent=on&facet=true&fl=id,title&q=introtext:hi
> >>
> >> This works only if there is an EXACT match on the field introtext, i.e. the
> >> field value is just "hi".
> >>
> >> But if the field value is "hi there, this is just some text", the above
> >> URL no longer finds this record.
> >>
> >> Which query parameter tells Solr to look inside the introtext field
> >> for a value (and, even better, also for synonyms)?

Re: check if field CONTAINS a value, as opposed to IS of a value

Posted by Allistair Crossley <al...@roxxor.co.uk>.
Actually, no you don't. If you want to find "hi" in a sentence such as "hi there this is me", that is just normal tokenizing and should work. Check your field type/analysers.

On Oct 14, 2010, at 7:59 AM, Allistair Crossley wrote:

> I think you need to look at ngram tokenizing.
>
> On Oct 14, 2010, at 7:55 AM, PeterKerk wrote:
>
>> I am trying to determine whether a certain word occurs within a field.
>>
>> http://localhost:8983/solr/db/select/?indent=on&facet=true&fl=id,title&q=introtext:hi
>>
>> This works only if there is an EXACT match on the field introtext, i.e. the
>> field value is just "hi".
>>
>> But if the field value is "hi there, this is just some text", the above
>> URL no longer finds this record.
>>
>> Which query parameter tells Solr to look inside the introtext field
>> for a value (and, even better, also for synonyms)?


Re: check if field CONTAINS a value, as opposed to IS of a value

Posted by Allistair Crossley <al...@roxxor.co.uk>.
I think you need to look at ngram tokenizing.

On Oct 14, 2010, at 7:55 AM, PeterKerk wrote:

> 
> I am trying to determine whether a certain word occurs within a field.
>
> http://localhost:8983/solr/db/select/?indent=on&facet=true&fl=id,title&q=introtext:hi
>
> This works only if there is an EXACT match on the field introtext, i.e. the
> field value is just "hi".
>
> But if the field value is "hi there, this is just some text", the above
> URL no longer finds this record.
>
> Which query parameter tells Solr to look inside the introtext field
> for a value (and, even better, also for synonyms)?