Posted to solr-user@lucene.apache.org by aaguilar <An...@nd.edu> on 2014/09/19 17:10:59 UTC

Issue Adding Filter Query

Hello All,

I recently came across a problem when I tried using description:"fatty
acid-binding protein" as a filter query through the query interface for
Solr running in Tomcat.  That filter query did not give me any results at
all; however, if I used description:"fatty acid-binding" as the filter
query, it gave me the results I wanted.

The thing is that some of the results I got back from Solr did have the
words "fatty acid-binding protein" in the description field.  So I really do
not know what might be causing Solr to miss those hits.

Another weird thing is that if I used description:"fatty acid-binding" AND
description:"protein" as the filter query when doing a query, it gave me the
results I anticipated (with some extra results that did not have the exact
phrase "fatty acid-binding protein").  Does anyone have an idea as to what
might be happening?  Just in case this is helpful, the version of Solr we
are using is 4.0.0.2012.10.06.03.04.33.  I appreciate any help anyone can
provide.

Thanks!



--
View this message in context: http://lucene.472066.n3.nabble.com/Issue-Adding-Filter-Query-tp4159990.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Issue Adding Filter Query

Posted by Erick Erickson <er...@gmail.com>.
Glad your problem isn't one any longer. Yeah, there are a lot
of nooks and crannies that one gets used to with Solr!

I'd estimate that, between learning how to read the debug
output and using the analysis page, 80-90% of the
"my search isn't working" questions on the list can be answered,
but it takes a while to get comfortable with those tools (and
to even know they exist!)...

Best
Erick

On Wed, Sep 24, 2014 at 6:57 AM, aaguilar <An...@nd.edu> wrote:

> Hello Erick,
>
> Just wanted to let you know that I did the change you suggested and
> everything works as expected.  Also, thanks for letting me know about the
> Analysis page in solr.  I did not know about it and I have found it very
> useful.
>
> Thanks!
>
> On Mon, Sep 22, 2014 at 5:41 PM, Antelmo Aguilar <
> Antelmo.Aguilar.17@nd.edu>
> wrote:
>
> > Hello Erick,
> >
> > Thank you so much for your help.  That makes perfect sense.  I will do the
> > changes you suggest and let you know how it goes.
> >
> > Thanks!
> >
> > On Mon, Sep 22, 2014 at 4:12 PM, Erick Erickson [via Lucene] <
> > ml-node+s472066n4160547h14@n3.nabble.com> wrote:
> >
> >> You have your index and query time analysis chains defined much
> >> differently. Omitting the WordDelimiterFilterFactory from the
> >> query-time analysis chain will lead to endless problems.
> >>
> > >> With the definition you have, the terms in the index and their term
> > >> positions are shown below. This is available from the
> > >> admin/analysis page if you click the "verbose" checkbox, although I
> > >> admit it's kind of hard to read:
> > >> 1        2               3          4
> > >> fatty    acid-binding    binding    protein
> > >>          acid
> > >>
> > >> But at query time, this is how they're being analyzed:
> > >> 1        2               3
> > >> fatty    acid-binding    protein
> >>
> > >> So searching for "fatty acid-binding protein" requires that the tokens
> > >> "fatty", "acid-binding" and "protein" appear in term positions 1, 2, 3
> > >> rather than where they actually are (1, 2, 4). Searching for "fatty
> > >> acid-binding protein"~1 would actually find this; the "~1" is phrase
> > >> slop, which allows one gap in there.
> >>
> > >> HOWEVER, that's the least of your problems. WordDelimiterFilterFactory
> > >> will _also_ "split on intra-word delimiters (all non alpha-numeric
> > >> characters)". While that doesn't really say so explicitly, it has
> > >> the effect of removing punctuation at index time. So searching for
> > >> "fatty acid-binding protein."~1 (note the period) will fail, since the
> > >> query-time token will still include the period.
> >>
> > >> I'd _really_ advise you to use WordDelimiterFilterFactory at both
> > >> index and query time, with the settings that ship in the stock Solr
> > >> release for, say, text_en_splitting, or even a single analyzer like
> > >> text_en_splitting_tight.
> >>
> >> Best,
> >> Erick
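
A minimal sketch of that advice, applied to the text_ws_finer type
quoted further down instead of swapping in text_en_splitting: the query
analyzer gets the same WordDelimiterFilterFactory and StopFilterFactory
as the index analyzer, so both sides should produce the same tokens at
the same positions. The attribute values are simply copied from the
index chain in the original post. That is an assumption made for
illustration; they are not the stock text_en_splitting settings, and
not necessarily the change that was finally deployed.

```xml
<!-- Sketch only: identical filters in both chains, so a phrase query
     is analyzed into the same term positions that the index holds. -->
<fieldType name="text_ws_finer" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" splitOnNumerics="0"
            splitOnCaseChange="0" generateWordParts="1" generateNumberParts="0"
            catenateWords="0" catenateNumbers="0" catenateAll="0"
            preserveOriginal="1"/>
    <filter class="solr.StopFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Assumption: mirror the index-time settings exactly. -->
    <filter class="solr.WordDelimiterFilterFactory" splitOnNumerics="0"
            splitOnCaseChange="0" generateWordParts="1" generateNumberParts="0"
            catenateWords="0" catenateNumbers="0" catenateAll="0"
            preserveOriginal="1"/>
    <filter class="solr.StopFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Since the index-time chain is unchanged in this sketch, existing
documents should not need re-indexing. The admin/analysis page is the
place to confirm that the index and query columns now line up for
"fatty acid-binding protein".
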
> >>
> > >> On Mon, Sep 22, 2014 at 6:33 AM, aaguilar <[hidden email]> wrote:
> >>
> >> > Hello Erick.
> >> >
> >> > Below is the information you requested.   Thanks for your help!
> >> >
> > >> > <fieldType name="text_ws_finer" class="solr.TextField" positionIncrementGap="100">
> > >> >   <analyzer type="index">
> > >> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> > >> >     <filter class="solr.WordDelimiterFilterFactory" splitOnNumerics="0"
> > >> >             splitOnCaseChange="0" generateWordParts="1" generateNumberParts="0"
> > >> >             catenateWords="0" catenateNumbers="0" catenateAll="0"
> > >> >             preserveOriginal="1"/>
> > >> >     <filter class="solr.StopFilterFactory"/>
> > >> >     <filter class="solr.LowerCaseFilterFactory"/>
> > >> >   </analyzer>
> > >> >   <analyzer type="query">
> > >> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> > >> >     <filter class="solr.LowerCaseFilterFactory"/>
> > >> >   </analyzer>
> > >> > </fieldType>
> > >> >
> > >> > <field name="description" type="text_ws_finer" indexed="true" stored="true"/>
> >> >
> > >> > On Fri, Sep 19, 2014 at 7:36 PM, Erick Erickson [via Lucene] <[hidden email]> wrote:
> >> >
> >> >> Hmmm, I'd have to see the schema definition for your description
> >> >> field. For this, the admin/analysis page is very helpful. Here's my
> >> >> guess:
> >> >>
> > >> >> Your analysis chain doesn't break the incoming tokens up quite the way
> > >> >> you think it does. Thus you have tokens in your index like
> > >> >> 'protein,' (notice the comma) and 'protein-like' rather than just
> > >> >> 'protein'. However, I can't quite reconcile this with your statement:
> >> >> "Another weird thing is that if I used description:"fatty
> >> >> acid-binding" AND description:"protein"
> >> >>
> >> >> so I'm at something of a loss. If you paste in your schema definition
> >> >> for the 'description' field _and_ the corresponding <fieldType>
> >> >> definition I can give it a quick whirl.
> >> >>
> >> >> Best,
> >> >> Erick
> >> >>
> > >> >> On Fri, Sep 19, 2014 at 11:53 AM, aaguilar <[hidden email]> wrote:
> >> >>
> >> >> > Hello Erick,
> >> >> >
> > >> >> > Thanks for the response.  I tried adding debug=True to the query, but I
> > >> >> > do not know exactly what I am looking for in the output.  Would it be
> > >> >> > possible for you to look at the results?  I would really appreciate it.  I
> > >> >> > attached two files: one is with the filter query description:"fatty
> > >> >> > acid-binding" and the other is with the filter query description:"fatty
> > >> >> > acid-binding protein".  If you look at the file that has the results for
> > >> >> > description:"fatty acid-binding", you can see that the hits do have "fatty
> > >> >> > acid-binding protein" with nothing in between.  I really appreciate any
> > >> >> > help you can provide.
> > >> >> >
> > >> >> > Thank you
> >> >> >
> > >> >> > On Fri, Sep 19, 2014 at 2:03 PM, Erick Erickson [via Lucene] <[hidden email]> wrote:
> >> >> >
> > >> >> >> Your very best friend here is attaching &debug=query to the URL and
> > >> >> >> looking at the parsed query results. Upon occasion there's some
> > >> >> >>
> > >> >> >> One possible explanation is that description field has something like
> > >> >> >> "fatty acid-binding some words protein" in which case your query
> > >> >> >> "fatty acid-binding protein" would fail, but "fatty acid-binding
> > >> >> >> protein"~4 would succeed.
> > >> >> >>
> > >> >> >> The other possibility is that your query parsing isn't quite doing
> > >> >> >> what you think, but adding &debug=query should help there.
> >> >> >>
> >> >> >> Best,
> >> >> >> Erick
> >> >> >>
> >> >> >> On Fri, Sep 19, 2014 at 8:10 AM, aaguilar <[hidden email]
> >> >> >> <http://user/SendEmail.jtp?type=node&node=4160036&i=0>> wrote:
> >> >> >>
> >> >> >> > Hello All,
> >> >> >> >
> >> >> >> > I recently came across a problem when I tried using
> >> >> description:"fatty
> >> >> >> > acid-binding protein" as a filter query when doing a query
> >> through
> >> >> the
> >> >> >> query
> >> >> >> > interface for Solr in the Tomcat server.  Using that filter
> query
> >> did
> >> >> >> not
> >> >> >> > give me any results at all, however if I used description:"fatty
> >> >> >> > acid-binding" as the filter query, it would give me the results
> I
> >> >> >> wanted.
> >> >> >> >
> >> >> >> > The thing is that some of the results I got back from Solr, did
> >> have
> >> >> the
> >> >> >> > words "fatty acid-binding protein" in the description field.  So
> >> I
> >> >> >> really do
> >> >> >> > not know what might be causing the issue of Solr not being able
> >> to
> >> >> find
> >> >> >> > those hits.
> >> >> >> >
> >> >> >> > Another weird thing is that if I used description:"fatty
> >> >> acid-binding"
> >> >> >> AND
> >> >> >> > description:"protein" as the filter query when doing a query, it
> >> gave
> >> >> me
> >> >> >> the
> >> >> >> > results I anticipated (with some extra results that did not have
> >> the
> >> >> >> exact
> >> >> >> > phrase "fatty acid-binding protein").  Does anyone have an idea
> >> as to
> >> >> >> what
> >> >> >> > might be happening?  Just in case this is helpful, the version
> of
> >> >> Solr
> >> >> >> we
> >> >> >> > are using is 4.0.0.2012.10.06.03.04.33.  I appreciate any help
> >> anyone
> >> >> >> can
> >> >> >> > provide.
> >> >> >> >
> >> >> >> > Thanks!
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >> > --
> >> >> >> > View this message in context:
> >> >> >>
> >> >>
> >>
> http://lucene.472066.n3.nabble.com/Issue-Adding-Filter-Query-tp4159990.html
> >> >> >> > Sent from the Solr - User mailing list archive at Nabble.com.
> >> >> >>
> >> >> >>
> >> >> >> ------------------------------
> >> >> >>  If you reply to this email, your message will be added to the
> >> >> discussion
> >> >> >> below:
> >> >> >>
> >> >> >>
> >> >>
> >>
> http://lucene.472066.n3.nabble.com/Issue-Adding-Filter-Query-tp4159990p4160036.html
> >> >> >>  To unsubscribe from Issue Adding Filter Query, click here
> >> >> >> <
> >> >> >> .
> >> >> >> NAML
> >> >> >> <
> >> >>
> >>
> http://lucene.472066.n3.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml
> >
> >>
> >> >>
> >> >> >>
> >> >> >
> >> >> >
> > >> >> >
> > >> >> > fatty_acid-binding_protein.xml (1K) <http://lucene.472066.n3.nabble.com/attachment/4160048/0/fatty_acid-binding_protein.xml>
> > >> >> > fatty_acid-binding.xml (63K) <http://lucene.472066.n3.nabble.com/attachment/4160048/1/fatty_acid-binding.xml>
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> > --
> >> >> > View this message in context:
> >> >>
> >>
> http://lucene.472066.n3.nabble.com/Issue-Adding-Filter-Query-tp4159990p4160048.html
> >> >> > Sent from the Solr - User mailing list archive at Nabble.com.
> >> >>
> >> >>
> >> >> ------------------------------
> >> >>  If you reply to this email, your message will be added to the
> >> discussion
> >> >> below:
> >> >>
> >> >>
> >>
> http://lucene.472066.n3.nabble.com/Issue-Adding-Filter-Query-tp4159990p4160122.html
> >> >>  To unsubscribe from Issue Adding Filter Query, click here
> >> >> <
> >> >> .
> >> >> NAML
> >> >> <
> >>
> http://lucene.472066.n3.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml
> >
> >>
> >> >>
> >> >
> >> >
> >> >
> >> >
> >> > --
> >> > View this message in context:
> >>
> http://lucene.472066.n3.nabble.com/Issue-Adding-Filter-Query-tp4159990p4160423.html
> >> > Sent from the Solr - User mailing list archive at Nabble.com.
> >>
> >>
> >> ------------------------------
> >>  If you reply to this email, your message will be added to the
> >> discussion below:
> >>
> >>
> http://lucene.472066.n3.nabble.com/Issue-Adding-Filter-Query-tp4159990p4160547.html
> >>  To unsubscribe from Issue Adding Filter Query, click here
> >> <
> http://lucene.472066.n3.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code&node=4159990&code=QW50ZWxtby5BZ3VpbGFyLjE3QG5kLmVkdXw0MTU5OTkwfC0xMDkyNTg2ODY3
> >
> >> .
> >> NAML
> >> <
> http://lucene.472066.n3.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml
> >
> >>
> >
> >
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Issue-Adding-Filter-Query-tp4159990p4160921.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

Re: Issue Adding Filter Query

Posted by aaguilar <An...@nd.edu>.
Hello Erick,

Just wanted to let you know that I did the change you suggested and
everything works as expected.  Also, thanks for letting me know about the
Analysis page in solr.  I did not know about it and I have found it very
useful.

Thanks!

--
View this message in context: http://lucene.472066.n3.nabble.com/Issue-Adding-Filter-Query-tp4159990p4160921.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Issue Adding Filter Query

Posted by aaguilar <An...@nd.edu>.
Hello Erick,

Thank you so much for your help.  That makes perfect sense.  I will do the
changes you suggest and let you know how it goes.

Thanks!

> >> <
> http://lucene.472066.n3.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml>
>
> >>
> >
> >
> >
> >
> > --
> > View this message in context:
> http://lucene.472066.n3.nabble.com/Issue-Adding-Filter-Query-tp4159990p4160423.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
>
>
> ------------------------------
>  If you reply to this email, your message will be added to the discussion
> below:
>
> http://lucene.472066.n3.nabble.com/Issue-Adding-Filter-Query-tp4159990p4160547.html
>  To unsubscribe from Issue Adding Filter Query, click here
> <http://lucene.472066.n3.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code&node=4159990&code=QW50ZWxtby5BZ3VpbGFyLjE3QG5kLmVkdXw0MTU5OTkwfC0xMDkyNTg2ODY3>
> .
> NAML
> <http://lucene.472066.n3.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml>
>




--
View this message in context: http://lucene.472066.n3.nabble.com/Issue-Adding-Filter-Query-tp4159990p4160576.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Issue Adding Filter Query

Posted by Erick Erickson <er...@gmail.com>.
Your index-time and query-time analysis chains are defined very
differently. Omitting WordDelimiterFilterFactory from the query-time
analysis chain will lead to endless problems.

With the definition you have, the terms in the index and their term
positions are shown below. This is available from the admin/analysis
page if you click the "verbose" checkbox, although I admit it's kind
of hard to read:

position   1       2              3         4
term       fatty   acid-binding   binding   protein
                   acid

But at query time, this is how they're being analyzed:

position   1       2              3
term       fatty   acid-binding   protein

So searching for "fatty acid-binding protein" requires that the tokens
"fatty", "acid-binding" and "protein" appear in consecutive term
positions (1, 2, 3) rather than where they actually are (1, 2, 4).
Searching for "fatty acid-binding protein"~1 would actually find this;
the "~1" is phrase slop, which here allows a one-position gap.
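
To make the position mismatch concrete, here is a rough Python sketch. It is
not Solr code: index_tokens, query_tokens and phrase_matches are hypothetical
names, the WordDelimiterFilterFactory model only covers generateWordParts="1"
with preserveOriginal="1", stop words and numeric-part handling are ignored,
and the slop check is a simplified model of how sloppy phrases are matched.

```python
import itertools
import re


def index_tokens(text):
    """Simplified index-time chain: whitespace split, word-delimiter split
    that keeps the original token, then lowercase.
    Returns {term: [1-based positions]}."""
    postings = {}
    pos = 0
    for raw in text.split():
        raw = raw.lower()
        parts = [p for p in re.split(r"[^0-9a-z]+", raw) if p]
        pos += 1
        # preserveOriginal="1": the unsplit token keeps its position.
        postings.setdefault(raw, []).append(pos)
        if parts != [raw]:
            for i, part in enumerate(parts):
                # The first part shares the original's position; each
                # further part advances the position by one.
                if i > 0:
                    pos += 1
                postings.setdefault(part, []).append(pos)
    return postings


def query_tokens(phrase):
    """Simplified query-time chain: whitespace split and lowercase only."""
    return [t.lower() for t in phrase.split()]


def phrase_matches(postings, terms, slop=0):
    """True if the terms occur in phrase order within `slop` extra positions
    (assumed model: spread of position-minus-offset must be <= slop)."""
    if not terms:
        return False
    position_lists = [postings.get(t, []) for t in terms]
    if not all(position_lists):
        return False
    for combo in itertools.product(*position_lists):
        shifted = [p - i for i, p in enumerate(combo)]
        if max(shifted) - min(shifted) <= slop:
            return True
    return False


postings = index_tokens("fatty acid-binding protein")
query = query_tokens("fatty acid-binding protein")
print(phrase_matches(postings, query))           # False
print(phrase_matches(postings, query, slop=1))   # True
print(phrase_matches(postings, query_tokens("fatty acid-binding")))  # True
```

Under these assumptions "protein" lands at position 4 in the index, so the
exact phrase misses while the shorter phrase and the ~1 variant both hit,
which matches the behaviour reported earlier in the thread.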

HOWEVER, that's the least of your problems. WordDelimiterFilterFactory
will _also_ "split on intra-word delimiters (all non alpha-numeric
characters)". While the documentation doesn't say so explicitly, that
has the effect of removing punctuation. So searching for "fatty
acid-binding protein."~1 (note the period) will fail, since the
query-time token will include the period.

I'd _really_ advise you to use the same WordDelimiterFilterFactory
settings at both index and query time, as in the field types shipped
with the stock Solr release: text_en_splitting, say, or even a type
with a single analyzer like text_en_splitting_tight.

Best,
Erick


Re: Issue Adding Filter Query

Posted by aaguilar <An...@nd.edu>.
Hello Erick.

Below is the information you requested.   Thanks for your help!

<fieldType name="text_ws_finer" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" splitOnNumerics="0"
            splitOnCaseChange="0" generateWordParts="1" generateNumberParts="0"
            catenateWords="0" catenateNumbers="0" catenateAll="0"
            preserveOriginal="1"/>
    <filter class="solr.StopFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="description" type="text_ws_finer" indexed="true" stored="true"/>


Re: Issue Adding Filter Query

Posted by Erick Erickson <er...@gmail.com>.
Hmmm, I'd have to see the schema definition for your description
field. For this, the admin/analysis page is very helpful. Here's my
guess:

Your analysis chain doesn't break the incoming tokens up quite the way
you think it does. Thus you have tokens in your index like 'protein,'
(notice the comma) and 'protein-like' rather than just 'protein'.
However, I can't quite reconcile this with your statement that using
description:"fatty acid-binding" AND description:"protein" as the
filter query gave you the results you anticipated,

so I'm at something of a loss. If you paste in your schema definition
for the 'description' field _and_ the corresponding <fieldType>
definition I can give it a quick whirl.

Best,
Erick


Re: Issue Adding Filter Query

Posted by aaguilar <An...@nd.edu>.
Hello Erick,

Thanks for the response.  I tried adding debug=True to the query, but I
do not know exactly what I am looking for in the output.  Would it be
possible for you to look at the results?  I would really appreciate it.
I attached two files: one is with the filter query description:"fatty
acid-binding" and the other is with the filter query description:"fatty
acid-binding protein".  In the file that has the results for
description:"fatty acid-binding", you can see that the hits do have
"fatty acid-binding protein" with nothing in between.  I really
appreciate any help you can provide.

Thank you

fatty_acid-binding_protein.xml (1K) <http://lucene.472066.n3.nabble.com/attachment/4160048/0/fatty_acid-binding_protein.xml>
fatty_acid-binding.xml (63K) <http://lucene.472066.n3.nabble.com/attachment/4160048/1/fatty_acid-binding.xml>





Re: Issue Adding Filter Query

Posted by Erick Erickson <er...@gmail.com>.
Your very best friend here is attaching &debug=query to the URL and
looking at the parsed query results. Upon occasion there's some

One possible explanation is that the description field has something
like "fatty acid-binding some words protein", in which case your query
"fatty acid-binding protein" would fail, but "fatty acid-binding
protein"~4 would succeed.

The other possibility is that your query parsing isn't quite doing
what you think, but adding &debug=query should help there.
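
As a small illustration of attaching debug=query, here is a Python sketch
that builds such a request URL. The host, port and core name are assumptions
made up for the example, and build_debug_url is a hypothetical helper, not
part of Solr; only the q, fq and debug parameters come from this thread.

```python
from urllib.parse import urlencode


def build_debug_url(base_url, q, fq):
    """Return a /select URL with the filter query and debug=query attached.

    base_url is assumed to point at a Solr core, for example
    http://localhost:8080/solr/collection1 (a made-up location).
    """
    params = [
        ("q", q),
        ("fq", fq),
        ("debug", "query"),  # ask Solr to include the parsed queries
    ]
    # urlencode takes care of escaping the colon, quotes and spaces in fq.
    return base_url.rstrip("/") + "/select?" + urlencode(params)


url = build_debug_url(
    "http://localhost:8080/solr/collection1",
    "*:*",
    'description:"fatty acid-binding protein"',
)
print(url)
```

The parsed form of the filter query should then show up in the debug section
of the response, which is the part to compare against what was intended.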

Best,
Erick
