Posted to solr-user@lucene.apache.org by Sandeep Mestry <sa...@gmail.com> on 2013/05/16 12:51:04 UTC

Question about Edismax - Solr 4.0

-- *Edismax and Filter Queries with Commas and spaces* --

Dear Experts,

This appears to be a bug; please correct me if I'm wrong.

If I search with the following filter query,

1) fq=title:(, 10)

- I get no results.
- The debug output does NOT show the section containing
parsed_filter_queries

If I carry out a search with the filter query,

2) fq=title:(,10) - (No space between , and 10)

- I get results and the debug output shows the parsed filter queries
section as,
<arr name="filter_queries">
<str>(titles:(,10))</str>
<str>(collection:assets)</str>

As you can see above, I'm also passing in other filter queries
(collection:assets) which appear correctly but they do not appear in case 1
above.

I can't make this part of the main query parameter, as that needs to be
searched against multiple fields.

Can someone please suggest a fix for this case? I'm using Solr 4.0.

Many Thanks,
Sandeep

Re: Question about Edismax - Solr 4.0

Posted by Sandeep Mestry <sa...@gmail.com>.

Hello Jack,

Thanks for pointing the issues out and for your valuable suggestion. My
preliminary tests were okay on search but I will be doing more testing to
see if this has impacted any other searches.

Thanks once again and have a nice sunny weekend,
Sandeep



Re: Question about Edismax - Solr 4.0

Posted by Jack Krupansky <ja...@basetechnology.com>.

Ah... I think your issue is the preserveOriginal=1 on the query analyzer as 
well as the fact that you have all of these catenatexx="1" options on the 
query analyzer - I indicated that you should remove them all.

The problem is that the whitespace analyzer leaves the leading comma in
place, and the preserveOriginal="1" also generates an extra token for the
term, with the comma in place. But, with the space, the comma and "10" are
separate terms and get analyzed independently.

The query results probably indicate that you don't have that exact 
combination of the term and leading punctuation - or that there is no 
standalone comma in your input data.

Try the following replacement for the query-time WDF:

<filter class="solr.WordDelimiterFilterFactory" stemEnglishPossessive="0" 
generateWordParts="1" generateNumberParts="1"
catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1" 
splitOnNumerics="0" preserveOriginal="0" />
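
The difference between the two filter queries can be sketched with a very rough Python model of the query analyzer. This is only an illustration under stated assumptions: the function `analyze` is hypothetical, it models nothing but whitespace splitting, removal of punctuation delimiters, the preserveOriginal flag and lowercasing, and it ignores the catenate and splitOnCaseChange options entirely. It is not how Solr implements the filter.

```python
import re

def analyze(text, preserve_original):
    """Rough model of whitespace tokenizing followed by a word-delimiter step.

    Returns one list of terms per whitespace-separated token.
    Assumption: punctuation is treated as a delimiter and dropped, and
    preserve_original additionally keeps the untouched token.
    """
    positions = []
    for token in text.split():
        # Keep runs of letters and digits, drop punctuation delimiters.
        terms = re.findall(r"[A-Za-z0-9]+", token)
        if preserve_original and token not in terms:
            terms.insert(0, token)
        positions.append([t.lower() for t in terms])
    return positions

print(analyze(",10", True))    # [[',10', '10']]
print(analyze(", 10", True))   # [[','], ['10']]
print(analyze(", 10", False))  # [[], ['10']]
```

In this model, ",10" yields both the original token and "10" at one position, while ", 10" with preserveOriginal="1" yields a standalone "," term that also has to match. With preserveOriginal="0" the lone comma produces no term at all, which is the effect the replacement filter above is aiming for.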

-- Jack Krupansky


Re: Question about Edismax - Solr 4.0

Posted by Sandeep Mestry <sa...@gmail.com>.

Hi Jack,

Thanks for your response again and for helping me out to get through this.

The URL is definitely encoded for spaces and it looks like the one below. As I
mentioned in my previous mail, I can't add it to the query parameter as that
searches on multiple fields.

The title field is defined as below:
<field name="title" type="text_wc" indexed="true" stored="false"
multiValued="true"/>

q=countryside&rows=20&qt=assdismax&fq=%28title%3A%28,10%29%29&fq=collection:assets

<requestHandler name="assdismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="qf">title^10 description^5 annotations^3 notes^2 categories</str>
    <str name="pf">title</str>
    <int name="ps">0</int>
    <str name="q.alt">*:*</str>
    <str name="fl">*,score</str>
    <str name="mm">100%</str>
    <str name="q.op">AND</str>
    <str name="sort">score desc</str>
    <str name="facet">true</str>
    <str name="facet.limit">-1</str>
    <str name="facet.mincount">1</str>
    <str name="facet.field">uniq_subtype_id</str>
    <str name="facet.field">component_type</str>
    <str name="facet.field">genre_type</str>
  </lst>
  <lst name="appends">
    <str name="fq">collection:assets</str>
  </lst>
</requestHandler>

The term 'countryside' needs to be searched against multiple fields
including titles, descriptions, annotations, categories, notes but the UI
also has a feature to limit results by providing a title field.


I can see that the filter queries are always parsed by LuceneQueryParser;
however, I'd expect it to generate the parsed_filter_queries debug output in
every situation.

I have tried it as the main query with both edismax and lucene defType and
it gives me correct output and correct results.
But there is some problem when this is used as a filter query, as the
parser is not able to parse a comma with a space.

Thanks again Jack, please let me know in case you need more inputs from my
side.

Best Regards,
Sandeep


Re: Question about Edismax - Solr 4.0

Posted by Jack Krupansky <ja...@basetechnology.com>.

Could you show us the full query URL - spaces must be encoded in URL query 
parameters.
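
As an illustration of the encoding point, here is a minimal Python sketch that builds the query string for the request discussed in this thread. The parameter values are taken from the thread; the use of Python's standard library here is an assumption about the client and is only meant to show what an encoded space looks like.

```python
from urllib.parse import urlencode, parse_qs

# A list of pairs is used because fq appears more than once.
params = [
    ("q", "countryside"),
    ("fq", "title:(, 10)"),
    ("fq", "collection:assets"),
]

query_string = urlencode(params)
print(query_string)
# q=countryside&fq=title%3A%28%2C+10%29&fq=collection%3Aassets

# Decoding gives back the original values, including the space.
decoded = parse_qs(query_string)
print(decoded["fq"])  # ['title:(, 10)', 'collection:assets']
```

The space in the filter query travels as "+" (it could equally be sent as %20), so it cannot be lost or misread as the end of the parameter.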

Also show the actual field XML - you omitted that.

Try the same query as a main query, using both defType=edismax and 
defType=lucene.

Note that the filter query is parsed using the Lucene query parser, not 
edismax, independent of the defType parameter. But you don't have any 
edismax features in your fq anyway.

But you can stick {!edismax} in front of the query to force edismax to be 
used for the fq, although it really shouldn't change anything:
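
A sketch of what such a filter query could look like when built from client code. The helper `with_parser` is hypothetical and exists only for this illustration; the {!edismax} prefix is the local-params form mentioned above, and the query text is the one from this thread.

```python
from urllib.parse import urlencode

def with_parser(parser_name, query):
    """Prefix a query with a local-params parser selector, e.g. {!edismax}."""
    return "{!%s}%s" % (parser_name, query)

fq = with_parser("edismax", "title:(, 10)")
print(fq)  # {!edismax}title:(, 10)

# The braces, the exclamation mark and the space all need URL encoding.
encoded = urlencode({"fq": fq})
print(encoded)
```

The prefix only selects which parser handles the fq; the field analysis applied to the terms stays the same, which is why it should not change the outcome here.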

Also, catenate is fine for indexing, but will mess up your queries at query
time, so set the catenate options to "0" in the query analyzer.

Also, make sure you have autoGeneratePhraseQueries="true" on the field type, 
but that's not the issue here.

-- Jack Krupansky


Re: Question about Edismax - Solr 4.0

Posted by Sandeep Mestry <sa...@gmail.com>.

Thanks, Jack, for your reply.

The problem is, I'm finding results for fq=title:(,10) but not for
fq=title:(, 10) - apologies if that was not clear from my first mail.
I have already mentioned the debug analysis in my previous mail.

Additionally, the title field is defined as below:
<fieldType name="text_wc" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.WordDelimiterFilterFactory"
                stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1"
                catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"
                splitOnNumerics="0" preserveOriginal="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.WordDelimiterFilterFactory"
                stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1"
                catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"
                splitOnNumerics="0" preserveOriginal="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>

I have set the catenate options to 1 for all types.
I can understand ',' being ignored when it is on its own (title:(, 10)),
but:
- Why is Solr not searching for 10 in that case, just like it did when the
query was (title:(,10))?
- And why did the other filter queries (collection:assets) not show up in the
debug section?


Thanks,
Sandeep



Re: Question about Edismax - Solr 4.0

Posted by Jack Krupansky <ja...@basetechnology.com>.

You haven't indicated any problem here! What is the symptom that you
actually think is a problem?

There is no comma operator in any of the Solr query parsers. Comma is just 
another character that may or may not be included or discarded depending on 
the specific field type and analyzer. For example, a white space analyzer 
will keep commas, but the standard analyzer or the word delimiter filter 
will discard them. If "title" were a "string" type, all punctuation would be 
preserved, including commas and spaces (but spaces would need to be escaped 
or the term text enclosed in parentheses.)

Let us know what your symptom is though, first.

I mean, the filter query looks perfectly reasonable from an abstract 
perspective.

-- Jack Krupansky
