Posted to users@solr.apache.org by "Casteel, Kayla Lynne" <Ka...@jacobs.com.INVALID> on 2021/10/19 15:27:11 UTC

Re: [EXTERNAL] Re: Having issues searching literal parentheses

Unfortunately I can't change the type of the allText field to string. We need the features that come with it being a text field.

(We did try changing it to string just to see what would happen -- it made the problem worse, and solr still didn't handle the escaped parentheses properly)


We're using solr 8.0.0, if it matters.


Some more details about the allText field: Right now it's a text_general type, which we define as:

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.SynonymGraphFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>

And the allText field itself:
<field name="allText" type="text_general" docValues="false" multiValued="true" indexed="true" stored="true"/>


I don't know if that helps at all. Solr automagically escaping the escape characters I use in the query is still bugging me.


Thank you,

Kayla Casteel

________________________________
From: Deepak Goel <de...@gmail.com>
Sent: Tuesday, October 19, 2021 2:49:54 AM
To: users@solr.apache.org
Subject: [EXTERNAL] Re: Having issues searching literal parentheses

Hey

It is possible that *allText* does not treat the parentheses *()* as text. You
might have to try a different field type (possibly string).

Deepak

On Tue, Oct 19, 2021 at 2:54 AM Casteel, Kayla Lynne
<Ka...@jacobs.com.invalid> wrote:

> Hello all,
>
> I have been going mad trying to get SOLR to search for parentheses as
> literals. For example, "(Figure 5)". I've tried entering it in the fq field
> as:
> allText:\(Figure 5\)
>
> (where allText is a facet). SOLR interprets this in the response as
>
> "fq":"allText:\\(Figure 5\\)"
>
> and it ends up finding text like "In Figure 5" with no parentheses. I
> assume this is because it is escaping the escape characters.
>
>
> I've tried escaping, I've tried URL encoding them, I've tried banging my
> head on the desk. I can't get solr to understand that this should be an
> exact match and that the parentheses are both literal and mandatory.
>
> In the wiki it even gives an example of escaping parentheses as part of
> the "Escaping Special Characters" section but it doesn't seem to work in my
> case.
>
> Has anyone else experienced this issue? Is there something I'm doing wrong?
>
>
> Thank you,
>
> Kayla Casteel

Re: [EXTERNAL] Re: Having issues searching literal parentheses

Posted by Shawn Heisey <ap...@elyograg.org>.
It is the StandardTokenizer that is eating the parentheses, both at index and query time. If you want to do exact matching, a string type is probably what you want.

A tokenized field is normally not a good candidate for facets, because usually the cardinality for such a field is VERY high. Facets on a high cardinality field will typically be very slow.
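
To illustrate the tokenizer point, here is a minimal sketch of a field type that would keep the parentheses, assuming whitespace-only tokenization is acceptable for the exact-match use case. The name text_ws_parens is hypothetical and does not come from the thread.

```xml
<!-- Hypothetical field type: the name "text_ws_parens" is an assumption, not from the thread. -->
<fieldType name="text_ws_parens" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Splits on whitespace only, so "(Figure" and "5)" survive as tokens. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Keeps matching case-insensitive, like the original text_general type. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

One trade-off with this kind of analyzer: trailing punctuation stays attached to the token, so "(Figure 5)." is indexed as "(figure" and "5)." and would not match a phrase query for "(Figure 5)".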


Re: [EXTERNAL] Re: Having issues searching literal parentheses

Posted by "Casteel, Kayla Lynne" <Ka...@jacobs.com.INVALID>.
Oh no... if you don't mind, can you elaborate on that? Is it the solr.TextField type we're using that's causing this?

I'm very new to solr and I haven't seen this in the documentation.


Thank you,

Kayla Casteel

________________________________
From: Erik Hatcher <er...@gmail.com>
Sent: Tuesday, October 19, 2021 10:46:19 AM
To: users@solr.apache.org
Subject: Re: [EXTERNAL] Re: Having issues searching literal parentheses

Your field type eats/removes parentheses - so there's no way to search for
them.

Adjustments will need to be made one way or another to get parentheses
indexed and queried.

    Erik


Re: [EXTERNAL] Re: Having issues searching literal parentheses

Posted by Erik Hatcher <er...@gmail.com>.
Your field type eats/removes parentheses - so there's no way to search for
them.

Adjustments will need to be made one way or another to get parentheses
indexed and queried.

    Erik


Re: [EXTERNAL] Re: Having issues searching literal parentheses

Posted by Thomas Corthals <th...@klascement.net>.
You can use copyField in your schema to have the same incoming text indexed
as different field types. As long as you don't need the features of the
text field and the literal parentheses at the same time, it's just a matter
of querying the field that has the analysers you want for a particular use
case.

https://solr.apache.org/guide/8_10/copying-fields.html

This will require reindexing all documents.
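
A minimal sketch of that approach, assuming a whitespace-tokenized field type called text_ws is available in the schema (Solr's default configset ships one; otherwise it would have to be defined). The destination field name allTextExact is hypothetical.

```xml
<!-- Hypothetical second field: "allTextExact" is an illustrative name, not from the thread. -->
<!-- Assumes a field type "text_ws" (WhitespaceTokenizerFactory) exists in the schema. -->
<field name="allTextExact" type="text_ws" indexed="true" stored="false" multiValued="true"/>

<!-- Copies the raw incoming value of allText, before any analysis, into allTextExact. -->
<copyField source="allText" dest="allTextExact"/>
```

A filter such as fq=allTextExact:"(Figure 5)" would then run against the field that kept the parentheses, while ordinary searches keep using allText. Two caveats: the stock text_ws type does not lowercase, so matching would be case-sensitive, and copyField rules do not chain, so if allText is itself filled by copyField, the new field would need copies from the same sources.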


Re: [EXTERNAL] Re: Having issues searching literal parentheses

Posted by Sujit Pal <su...@comcast.net>.
One possibility could be to borrow an idea from the NLP world and
pre-process parentheses into -LRB- and -RRB- tokens (and square and curly
brackets into their corresponding forms). This bypasses the escaping issues
but requires reindexing and preprocessing the query.

-sujit
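
One possible variant of this idea does the substitution inside the analysis chain rather than in client code, so the same rewrite is applied at index and query time. This is a sketch under assumptions, not something proposed in the thread: the field type name is hypothetical, and it uses the plain marker words LRB and RRB because StandardTokenizer would drop the hyphens of -LRB- anyway.

```xml
<!-- Hypothetical variant of text_general: the name and the LRB/RRB marker words are assumptions. -->
<fieldType name="text_general_parens" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Char filters run before the tokenizer, so the markers survive as ordinary tokens. -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="\(" replacement=" LRB "/>
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="\)" replacement=" RRB "/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With a single analyzer element the same chain serves both indexing and querying, so a phrase query for "(Figure 5)" would be analyzed to lrb, figure, 5, rrb on both sides. The stop-word and synonym filters from the original type are left out here for brevity, and text that already contains the literal words LRB or RRB would collide with the markers.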

