Posted to users@solr.apache.org by Amrit Sarkar <sa...@gmail.com> on 2021/06/16 19:46:54 UTC

Debug Query showing match against raw bytes for string field

Hi team,

Hope everyone is well. We are running a fairly generic test with a
combination of text and string fields in the query fields. The parsed query,
however, shows some of the string fields being matched against raw bytes.
What causes this behavior, and how can we avoid it?

(((code_string:reebok code_string:footwear)~2)^3.0 |
((PendantType-Jewellery-classification_string_mv:[[72 65 65 62 6f 6b] TO
[72 65 65 62 6f 6b]] PendantType-Jewellery-classification_string_mv:[[66 6f
6f 74 77 65 61 72] TO [66 6f 6f 74 77 65 61 72]])~2)^3.0 |
((BangleType-Jewellery-classification_string_mv:[[72 65 65 62 6f 6b] TO [72
65 65 62 6f 6b]] BangleType-Jewellery-classification_string_mv:[[66 6f 6f
74 77 65 61 72] TO [66 6f 6f 74 77 65 61 72]])~2)^3.0 |
((Theme-Jewellery-classification_string_mv:[[72 65 65 62 6f 6b] TO [72 65
65 62 6f 6b]] Theme-Jewellery-classification_string_mv:[[66 6f 6f 74 77 65
61 72] TO [66 6f 6f 74 77 65 61 72]])~2)^3.0 | ((name_text_en:reebok
name_text_en:footwear)~2)^2.0 |
((lenssizemmeyewear-classification_en_string_mv:reebok
lenssizemmeyewear-classification_en_string_mv:footwear)~2)^3.0

Thanks in advance.

Amrit Sarkar
Engineer | Search and Kubernetes
https://seamadic.com/
Twitter https://twitter.com/sarkaramrit2
LinkedIn: https://www.linkedin.com/in/sarkaramrit2
Medium: https://medium.com/@sarkaramrit2

Re: Debug Query showing match against raw bytes for string field

Posted by Alessandro Benedetti <a....@sease.io>.
I think what you see is just how edismax behaves for non-analysed fields
(you should see the same for numeric fields).
Are you setting sow=false? (That is the default.)
The byte array you see may simply be an artifact of the query's toString().
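
As a quick sanity check on the toString() theory, the bracketed values in the
parsed query can be decoded by hand. The sketch below is not part of Solr; the
helper names decode_hex_bytes and humanize_parsed_query are made up for
illustration, and it assumes the bytes are UTF-8. It shows that
[72 65 65 62 6f 6b] is simply "reebok" and [66 6f 6f 74 77 65 61 72] is
"footwear", so each clause is a range whose lower and upper bounds are the same
query term.

```python
import re

# Matches one bracketed run of space-separated hex byte pairs,
# e.g. "[72 65 65 62 6f 6b]" (whitespace may include line breaks).
HEX_RUN = re.compile(r"\[((?:[0-9a-f]{2})(?:\s+[0-9a-f]{2})*)\]")


def decode_hex_bytes(hex_text: str) -> str:
    """Decode space-separated hex byte pairs, assuming UTF-8 content."""
    compact = "".join(hex_text.split())  # drop spaces and line breaks
    return bytes.fromhex(compact).decode("utf-8")


def humanize_parsed_query(parsed_query: str) -> str:
    """Replace every bracketed hex run in a debug string with its text."""
    return HEX_RUN.sub(lambda m: decode_hex_bytes(m.group(1)), parsed_query)


if __name__ == "__main__":
    clause = (
        "Theme-Jewellery-classification_string_mv:"
        "[[72 65 65 62 6f 6b] TO [72 65 65 62 6f 6b]]"
    )
    print(decode_hex_bytes("72 65 65 62 6f 6b"))
    print(decode_hex_bytes("66 6f 6f 74 77 65 61 72"))
    print(humanize_parsed_query(clause))
```

Run over the whole parsed query, this turns each byte-array clause into the
form field:[reebok TO reebok], which suggests the match is still against the
original term and only its printed form differs.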
I would recommend debugging the code; the relevant area should be closely
related to what was touched in this Jira issue I was working on a few weeks
ago:
https://github.com/apache/solr/pull/129/files
Cheers
--------------------------
Alessandro Benedetti
Apache Lucene/Solr Committer
Director, R&D Software Engineer, Search Consultant

www.sease.io

