Posted to solr-user@lucene.apache.org by Alfonso Muñoz-Pomer Fuentes <am...@ebi.ac.uk> on 2017/06/12 17:28:50 UTC

Use of blanks in context filter field with AnalyzingInfixLookupFactory

Hi all,

I was wondering if anybody has experience setting up a suggester with filtering using a context field that has blanks. Currently this is what I have in solr_config.xml:
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">property_value</str>
    <str name="contextField">species</str>
    <str name="suggestAnalyzerFieldType">text_en</str>
    <str name="queryAnalyzerFieldType">text_en</str>
    <str name="buildOnStartup">false</str>
  </lst>
</searchComponent>

And this is an example record in my index:
{
  "bioentity_identifier":["ENSG00000000419"],
  "bioentity_type":["ensgene"],
  "species":"homo sapiens",
  "property_value":["R-HSA-162699"],
  "property_name":["pathwayid"],
  "id":"795aedd9-54aa-44c9-99bf-8d195985b7cc",
  "_version_":1570016930397421568
}

When I request suggestions like this, everything’s fine:
http://localhost:8983/solr/bioentities/suggest?wt=json&indent=on&suggest.q=r

But if I try to narrow by species, I get 0 results:
http://localhost:8983/solr/bioentities/suggest?wt=json&indent=on&suggest.q=r&suggest.cfq=homo sapiens

I’ve tried escaping the space, URL-encoding it (with %20 and +), and enclosing it in single quotes, double quotes, and square brackets, all to no avail: I get 0 results, except when I enclose the parameter value in double quotes, in which case I get an exception. In the example record above, species is of type string. In schemaless mode the results are the same.
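
For reference, here is a minimal Python sketch of how a request with a percent-encoded context filter can be built. The host, core name, and parameters are the ones from the URLs above; the helper name build_suggest_url is hypothetical. It only shows what a well-formed request looks like. As described above, encoding the space by itself did not make the filter match.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the example URLs in this message.
BASE_URL = "http://localhost:8983/solr/bioentities/suggest"


def build_suggest_url(prefix, context=None, base_url=BASE_URL):
    """Build a suggest request URL (hypothetical helper, for illustration only)."""
    params = [("wt", "json"), ("indent", "on"), ("suggest.q", prefix)]
    if context is not None:
        # suggest.cfq carries the context filter query
        params.append(("suggest.cfq", context))
    # quote_via=quote encodes a space as %20; the default (quote_plus) would emit +
    return base_url + "?" + urlencode(params, quote_via=quote)


print(build_suggest_url("r", "homo sapiens"))
```

Swapping quote_via=quote for the default gives the + form of the same request.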

Using underscores in the species lets me filter properly, so the filtering mechanism per se works fine.
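
A rough sketch of that underscore workaround, assuming the documents can be rewritten before indexing and that the same transformation is applied to the value sent in suggest.cfq. The helper name to_context_token is hypothetical, and the sample document is a trimmed copy of the record above.

```python
def to_context_token(value: str) -> str:
    """Collapse each run of whitespace into a single underscore.

    Hypothetical helper: it turns a multi-word context value into one
    token without blanks, so the same string can be indexed and queried.
    """
    return "_".join(value.split())


# Trimmed copy of the example record from this message.
doc = {"species": "homo sapiens", "property_value": ["R-HSA-162699"]}

# Index the underscored form (assumption: the context field stores this value).
indexed_doc = {**doc, "species": to_context_token(doc["species"])}

# Apply the same transformation to the value used for suggest.cfq.
cfq_value = to_context_token("homo sapiens")

print(indexed_doc["species"], cfq_value)
```

The original species string can be kept in a separate field for display if the underscored value is only used for filtering.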

Any help greatly appreciated.

--
Alfonso Muñoz-Pomer Fuentes
Software Engineer @ Expression Atlas Team
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Tel:+ 44 (0) 1223 49 2633
Skype: amunozpomer


Re: Use of blanks in context filter field with AnalyzingInfixLookupFactory

Posted by Georg Sorst <ge...@gmail.com>.
Alfonso,

I've run into similar issues with the context filter query; maybe this is
caused by the StandardTokenizer.

I've written a patch in SOLR-9968 that makes the analyzer for the context
filter query configurable. This has helped me at least. SOLR-7963 also
allows you to change the query parser.

Good luck,
Georg

Alfonso Muñoz-Pomer Fuentes <am...@ebi.ac.uk> wrote on Mon, 12 June
2017, 21:11:

> suggestAnalyzerFieldType and queryAnalyzerFieldType are related to the
> field parameter (in my case property_value), not to the contextField.
> Moreover, the change you suggest makes AnalyzingInfixLookupFactory always
> return 0 results (something that’s not discussed in the reference guide and
> has confused other users previously).
>
> Cheers,
> Alfonso

Re: Use of blanks in context filter field with AnalyzingInfixLookupFactory

Posted by Alfonso Muñoz-Pomer Fuentes <am...@ebi.ac.uk>.
suggestAnalyzerFieldType and queryAnalyzerFieldType are related to the field parameter (in my case property_value), not to the contextField. Moreover, the change you suggest makes AnalyzingInfixLookupFactory always return 0 results (something that’s not discussed in the reference guide and has confused other users previously).

Cheers,
Alfonso


> On 12 Jun 2017, at 19:10, Susheel Kumar <su...@gmail.com> wrote:
> 
> Change below type to string and try...
> 
> <str name="suggestAnalyzerFieldType">text_en</str>
>    <str name="queryAnalyzerFieldType">text_en</str>
> 
> Thanks,
> Susheel



Re: Use of blanks in context filter field with AnalyzingInfixLookupFactory

Posted by Susheel Kumar <su...@gmail.com>.
Change the type in the lines below to string and try again:

    <str name="suggestAnalyzerFieldType">text_en</str>
    <str name="queryAnalyzerFieldType">text_en</str>

Thanks,
Susheel
