Posted to solr-user@lucene.apache.org by Wendy2 <we...@rcsb.org> on 2018/02/01 14:25:29 UTC

Re: Help with Boolean search using Solr parser edismax

Good morning, Emir,

Here is the debug output for case 1f-a (q=method:"x-ray*" "Solution NMR") and
case 1f-b (q=+method:"x-ray*" +"Solution NMR"). Both returned zero results. It
looks like the query strings are the same. Thanks for following up on my
post and for your help! -- Wendy


*===================== DebugQuery outputs for cases 1f-a, 1f-b =====================*
*1f-a (/search?q=method:"x-ray*" "Solution NMR"): result count = 0*
 "debug":{
    "rawquerystring":"method:\"x-ray*\" \"Solution NMR\"",
    "querystring":"method:\"x-ray*\" \"Solution NMR\"",
    "parsedquery":"(+(PhraseQuery(method:\"x rai\")
DisjunctionMaxQuery(((pdb_id:Solution NMR)^5.0 |
(entity_name_com.name:\"solut nmr\")^20.0 | (citation_author.name:\"solut
nmr\")^5.0 | (audit_author.name:\"solut nmr\")^5.0 |
rest_fields_stem:\"solut nmr\" | (title_fields_stem:\"solut nmr\")^3.0 |
(classification:\"solut nmr\")^15.0 | (struct_keywords.text:\"solut
nmr\")^12.0 | (entity.pdbx_description:\"solut nmr\")^10.0 |
(pdbx_descriptor_stem:\"solut nmr\")^10.0 | (citation.title:\"solut
nmr\")^25.0 | (struct_keywords.pdbx_keywords:\"solut nmr\")^15.0 |
(entity_src_gen_concat_stem:\"solut nmr\")^15.0 | (struct.title:\"solut
nmr\")^35.0 | (group_id_stem:\"solut nmr\")^10.0)))~2)/no_coord",
    "parsedquery_toString":"+((method:\"x rai\" ((pdb_id:Solution NMR)^5.0 |
(entity_name_com.name:\"solut nmr\")^20.0 | (citation_author.name:\"solut
nmr\")^5.0 | (audit_author.name:\"solut nmr\")^5.0 |
rest_fields_stem:\"solut nmr\" | (title_fields_stem:\"solut nmr\")^3.0 |
(classification:\"solut nmr\")^15.0 | (struct_keywords.text:\"solut
nmr\")^12.0 | (entity.pdbx_description:\"solut nmr\")^10.0 |
(pdbx_descriptor_stem:\"solut nmr\")^10.0 | (citation.title:\"solut
nmr\")^25.0 | (struct_keywords.pdbx_keywords:\"solut nmr\")^15.0 |
(entity_src_gen_concat_stem:\"solut nmr\")^15.0 | (struct.title:\"solut
nmr\")^35.0 | (group_id_stem:\"solut nmr\")^10.0))~2)",


*1f-b (/search?q=+method:"x-ray*" +"Solution NMR"): result count = 0*
"debug":{
    "rawquerystring":" method:\"x-ray*\"  \"Solution NMR\"",
    "querystring":" method:\"x-ray*\"  \"Solution NMR\"",
    "parsedquery":"(+(PhraseQuery(method:\"x rai\")
DisjunctionMaxQuery(((pdb_id:Solution NMR)^5.0 |
(entity_name_com.name:\"solut nmr\")^20.0 | (citation_author.name:\"solut
nmr\")^5.0 | (audit_author.name:\"solut nmr\")^5.0 |
rest_fields_stem:\"solut nmr\" | (title_fields_stem:\"solut nmr\")^3.0 |
(classification:\"solut nmr\")^15.0 | (struct_keywords.text:\"solut
nmr\")^12.0 | (entity.pdbx_description:\"solut nmr\")^10.0 |
(pdbx_descriptor_stem:\"solut nmr\")^10.0 | (citation.title:\"solut
nmr\")^25.0 | (struct_keywords.pdbx_keywords:\"solut nmr\")^15.0 |
(entity_src_gen_concat_stem:\"solut nmr\")^15.0 | (struct.title:\"solut
nmr\")^35.0 | (group_id_stem:\"solut nmr\")^10.0)))~2)/no_coord",
    "parsedquery_toString":"+((method:\"x rai\" ((pdb_id:Solution NMR)^5.0 |
(entity_name_com.name:\"solut nmr\")^20.0 | (citation_author.name:\"solut
nmr\")^5.0 | (audit_author.name:\"solut nmr\")^5.0 |
rest_fields_stem:\"solut nmr\" | (title_fields_stem:\"solut nmr\")^3.0 |
(classification:\"solut nmr\")^15.0 | (struct_keywords.text:\"solut
nmr\")^12.0 | (entity.pdbx_description:\"solut nmr\")^10.0 |
(pdbx_descriptor_stem:\"solut nmr\")^10.0 | (citation.title:\"solut
nmr\")^25.0 | (struct_keywords.pdbx_keywords:\"solut nmr\")^15.0 |
(entity_src_gen_concat_stem:\"solut nmr\")^15.0 | (struct.title:\"solut
nmr\")^35.0 | (group_id_stem:\"solut nmr\")^10.0))~2)",
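
A side note on the two outputs above: in 1f-b the rawquerystring starts with a space and has two spaces before the phrase. That is consistent with the literal + characters in the URL being decoded as spaces before the query reached the parser, which would explain why both cases parse identically. The Python sketch below shows the effect with the standard library and one way to percent-encode the query so that + is sent as %2B. The /search path comes from the post; build_search_url is a hypothetical helper name, and whether the client used here decodes + this way is an assumption.

```python
from urllib.parse import parse_qs, quote, urlsplit

QUERY = '+method:"x-ray*" +"Solution NMR"'


def build_search_url(path, q):
    """Hypothetical helper: percent-encode q so '+' is sent as %2B."""
    return f"{path}?q={quote(q, safe='')}"


# Unencoded: a form-style decoder turns each '+' into a space.
naive = parse_qs("q=" + QUERY)["q"][0]

# Encoded: the '+' operators survive the round trip.
url = build_search_url("/search", QUERY)
decoded = parse_qs(urlsplit(url).query)["q"][0]

print(repr(naive))
print(url)
print(repr(decoded))
```

The first printed value matches the rawquerystring reported for 1f-b, leading space included, while the encoded URL decodes back to the query with both + operators intact.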



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Help with Boolean search using Solr parser edismax

Posted by Wendy2 <we...@rcsb.org>.
Hi Erick,

Thank you very much for the clarification. I will keep it in mind, since
we are now in the process of migrating our MySQL database to MongoDB.

Best Regards,

Wendy 
a happy Solr user 




Re: Help with Boolean search using Solr parser edismax

Posted by Wendy2 <we...@rcsb.org>.
Hi Erick,

Yes. Currently I re-index the database on a weekly basis because we only
have a weekly release.
As part of the weekly Solr re-index, the batch job deletes the
/solr/core/data folder, restarts the Solr server, and then re-indexes.
We use Luigi to build and control the pipelines of Solr re-index batch jobs.

Thanks for all your help and support!

All the best,

Wendy 
a happy Solr user :-) 




Re: Help with Boolean search using Solr parser edismax

Posted by Erick Erickson <er...@gmail.com>.
From the ref guide:

"Field names should consist of alphanumeric or underscore characters
only and not start with a digit. This is not currently strictly
enforced, but other field names will not have first class support from
all components and back compatibility is not guaranteed."

You need to _completely_ blow away your index and re-index from
scratch though. By "completely blow away" I mean
1> shut down your Solrs and "rm -rf each_core/data"
or
2> create a new collection and index into that
or
3> delete your collection and re-create it.

If you just re-index your entire corpus after changing the field
names, the metadata for the old fields will be preserved. Likely you
won't notice, it's not very much data but....
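
Option 1> above can be sketched in Python roughly as follows. This is a minimal illustration and not a tested maintenance script: wipe_core_data is a hypothetical helper name, the directory layout is a throwaway stand-in for a real core, and the function assumes Solr has already been shut down (it does not check).

```python
import shutil
import tempfile
from pathlib import Path


def wipe_core_data(core_dir):
    """Remove <core_dir>/data, the equivalent of `rm -rf each_core/data`.

    Assumes Solr is already stopped. Returns True if a data
    directory existed and was removed, False otherwise.
    """
    data_dir = Path(core_dir) / "data"
    if not data_dir.is_dir():
        return False
    shutil.rmtree(data_dir)
    return True


# Demonstration against a temporary directory, not a real Solr install.
with tempfile.TemporaryDirectory() as tmp:
    core = Path(tmp) / "each_core"
    (core / "data" / "index").mkdir(parents=True)
    (core / "conf").mkdir()
    removed = wipe_core_data(core)
    data_gone = not (core / "data").exists()
    conf_kept = (core / "conf").is_dir()

print(removed, data_gone, conf_kept)
```

Only the data directory is removed; the core's configuration stays in place so the core can be started and re-indexed afterwards.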


Erick


Re: Help with Boolean search using Solr parser edismax

Posted by Wendy2 <we...@rcsb.org>.
Good morning, Emir,

Thanks for letting me know that. I used dots to add tableName. as a field
prefix because several columns from different tables have the same names.  
In your opinion, what will be the best way to replace dots?

Happy Friday!

Wendy




Re: Help with Boolean search using Solr parser edismax

Posted by Erick Erickson <er...@gmail.com>.
From the reference guide:

Field names should consist of alphanumeric or underscore characters
only and not start with a digit. This is not currently strictly
enforced, but other field names will not have first class support from
all components and back compatibility is not guaranteed.

Best,
Erick


Re: Help with Boolean search using Solr parser edismax

Posted by Emir Arnautović <em...@sematext.com>.
Hi Wendy,
A bit off-topic, but I forgot to mention this in my previous mail: dots in field names are not recommended. Even though it obviously works for you, I think I’ve seen people report issues caused by dots in field names (I cannot find a reference right now). So, if you plan a system upgrade in the future, you might want to get rid of field names with dots; you can safely use underscores.
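
As a rough illustration of the underscore suggestion, here is a small Python sketch that maps dotted field names to underscore names and checks each result against the reference guide rule quoted elsewhere in this thread (alphanumeric or underscore characters only, not starting with a digit). The sample names are taken from the debug output in the thread; rename_fields and is_recommended_name are hypothetical helpers, and reading "alphanumeric" as ASCII letters and digits is an assumption.

```python
import re

# Assumed reading of the rule: ASCII letters, digits, underscore; no leading digit.
VALID_FIELD_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_recommended_name(name):
    """True if name follows the quoted field-naming recommendation."""
    return VALID_FIELD_NAME.fullmatch(name) is not None


def rename_fields(names):
    """Map each field name to a version with '.' replaced by '_'.

    Raises ValueError if a new name is still not recommended or if
    two different old names would collapse into the same new name.
    """
    mapping = {}
    owners = {}
    for old in names:
        new = old.replace(".", "_")
        if not is_recommended_name(new):
            raise ValueError(f"{old!r} -> {new!r} is still not a recommended name")
        if owners.setdefault(new, old) != old:
            raise ValueError(f"{old!r} and {owners[new]!r} both map to {new!r}")
        mapping[old] = new
    return mapping


fields = ["entity_name_com.name", "citation_author.name", "struct.title", "pdb_id"]
mapping = rename_fields(fields)
for old, new in mapping.items():
    print(f"{old} -> {new}")
```

The collision check matters in a schema like this one, where the table name is used as a prefix: a column pair such as a.b_c and a_b.c would both become a_b_c.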

Regards,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/





Re: Help with Boolean search using Solr parser edismax

Posted by Wendy2 <we...@rcsb.org>.
And the coupon has no expiration date on it (LOL).  Thank you again, Emir!

Best Regards,

Wendy




Re: Help with Boolean search using Solr parser edismax

Posted by Emir Arnautović <em...@sematext.com>.
Hi Wendy,
You are welcome! I’ll put your lunch coupon in my wallet, just in case I get hungry around NJ ;)

Regards,
Emir





Re: Help with Boolean search using Solr parser edismax

Posted by Wendy2 <we...@rcsb.org>.
Excellent!!! Thank you so much for all your help, Emir!

Both work now, and I got 997 results back, which is the expected number :-) 

/rcsb/search?q=method:"x-ray*" "Solution NMR"&mm=1
/rcsb/search?q=+method:"x-ray*" +"Solution NMR"&mm=1

I will keep this in mind for queries that combine multiple parsers:
/select?q=method:"x-ray*" OR _query_:"{!edismax mm=7 qf='title_field_stem^3,….'}\"Solution NMR\""

Thanks again and have a wonderful Thursday!
If you ever come to NJ area, I would like to take you out for a lunch to
thank you for all your help!

Wendy




Re: Help with Boolean search using Solr parser edismax

Posted by Emir Arnautović <em...@sematext.com>.
Hi Wendy,
The query now looks as expected, but you are not getting the expected results. The reason is edismax's mm (minimum should match) parameter. You are setting it to 7 and the query has only two clauses to match, so both are always required (the ~2 in the parsed query). That makes it an AND, and you don't have documents that match both. Set mm to 1 and it behaves as OR.
If you really need an OR between an edismax query and some other query, you will have to use the standard parser and nest the edismax part with _query_, something like:

/select?q=method:"x-ray*" OR _query_:"{!edismax mm=7 qf='title_field_stem^3,….'}\"Solution NMR\""

You can put it in config and use placeholders to pass values.
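
To make the mm point concrete, here is a deliberately simplified Python model of how a plain positive integer mm plays out against the number of optional clauses. It only reflects what the debug output in this thread shows (mm=7 with two clauses ends up as ~2); percentages, negative values and conditional mm expressions are not modelled, and effective_mm and matches are hypothetical names, not Solr APIs.

```python
def effective_mm(optional_clauses, mm):
    """Simplified model: a positive integer mm is capped at the clause count."""
    if mm < 1:
        raise ValueError("this sketch only models positive integer mm values")
    return min(mm, optional_clauses)


def matches(clause_hits, mm):
    """True if enough clauses hit for a document to match under mm.

    clause_hits holds one boolean per optional clause.
    """
    return sum(clause_hits) >= effective_mm(len(clause_hits), mm)


# A document that is an x-ray structure but never mentions "Solution NMR".
doc = [True, False]

print(effective_mm(2, 7))  # 2: both clauses required, behaves like AND
print(effective_mm(2, 1))  # 1: any one clause is enough, behaves like OR
print(matches(doc, 7))     # False
print(matches(doc, 1))     # True
```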

HTH,
Emir


