Posted to solr-user@lucene.apache.org by "d.kumar@technisat.de" <d....@technisat.de> on 2017/08/03 13:47:45 UTC

plus sign in request / looking for + in title

Hey,

Our title field contains the term "hd+".
I want to query for exactly this term, but when I do, Solr only looks for "hd" and ignores the plus sign. I really need to search for the whole string.
Of course, I URL-encoded the plus sign:

q=title:hd%2B

Can anyone please tell me how to search for the plus sign "+"?

thanks

David
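
For reference, a small Python sketch (not part of the original thread) of why the encoding matters: in a URL query string a bare "+" is decoded as a space, while "%2B" is decoded as a literal plus sign. It uses only urllib.parse from the standard library, and the query string is the one from the message.

```python
from urllib.parse import parse_qs

# A bare "+" in a query string is decoded as a space ...
bare = parse_qs("q=title:hd+")["q"][0]
# ... while "%2B" survives as a literal plus sign.
encoded = parse_qs("q=title:hd%2B")["q"][0]

print(repr(bare))
print(repr(encoded))
```

So the encoded request should arrive with the plus sign intact; whether the plus survives after that depends on the query parser and on the field's analysis chain.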

Re: AW: plus sign in request / looking for + in title

Posted by Erick Erickson <er...@gmail.com>.
Take a look at the other filters; there are a ton of them.
PatternReplaceFilterFactory is one possibility.
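
As an illustration of what such a filter could do, here is a rough Python sketch of a pattern-replace step that strips trailing sentence punctuation from whitespace-split tokens but leaves a plus sign alone. The pattern and the helper name strip_trailing_punct are assumptions made for this example, not a tested Solr configuration; in Solr, a similar regular expression would go into the filter's pattern attribute.

```python
import re

# Hypothetical pattern: a trailing run of common sentence punctuation.
TRAILING_PUNCT = re.compile(r"[.,;:!?]+$")

def strip_trailing_punct(token: str) -> str:
    """Remove trailing punctuation, keeping characters such as '+'."""
    return TRAILING_PUNCT.sub("", token)

# Sample input, split on whitespace the way a whitespace tokenizer would.
tokens = "My dog has fleas. hd+".split()
cleaned = [strip_trailing_punct(t) for t in tokens]
print(cleaned)
```

The character class deliberately omits "+", so "hd+" passes through unchanged while "fleas." loses its period.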

Best,
Erick

On Fri, Aug 4, 2017 at 11:01 AM, d.kumar@technisat.de
<d....@technisat.de> wrote:
> Hey,
>
> That is a good point. What is the best way to do the filtering? As for the plus sign in the request, we URL-encode the whole request.
>
>
>
> Thanks
> David

Re: AW: plus sign in request / looking for + in title

Posted by "d.kumar@technisat.de" <d....@technisat.de>.
Hey,

That is a good point. What is the best way to do the filtering? As for the plus sign in the request, we URL-encode the whole request.



Thanks
David


 

> On 04.08.2017 at 17:34, Erick Erickson <er...@gmail.com> wrote:
> 
> Glad to hear it. Two things:
> 
> 1> you might have to do some additional filtering when using
> WhitespaceTokenizer. It, well, splits on whitespace so things like
> punctuation will come through as part of the token. So "My dog has
> fleas." (note the period after fleas) would have the period included
> in the token "fleas.".
> 
> 2> getting the plus sign through URL encoding and the parser may be
> fun, you may have to escape it to keep it from being interpreted as an
> operator....
> 
> Best,
> Erick

Re: AW: plus sign in request / looking for + in title

Posted by Erick Erickson <er...@gmail.com>.
Glad to hear it. Two things:

1> you might have to do some additional filtering when using
WhitespaceTokenizer. It, well, splits on whitespace so things like
punctuation will come through as part of the token. So "My dog has
fleas." (note the period after fleas) would have the period included
in the token "fleas.".

2> getting the plus sign through URL encoding and the parser may be
fun, you may have to escape it to keep it from being interpreted as an
operator....
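
A minimal Python sketch of point 2, assuming the standard Lucene/Solr query syntax, in which a backslash escapes special characters such as "+". The helper escape_plus is a hypothetical name for this example and handles only the plus sign; urlencode comes from Python's standard library.

```python
from urllib.parse import urlencode

def escape_plus(term: str) -> str:
    """Backslash-escape '+' so the query parser treats it as a literal."""
    return term.replace("+", "\\+")

# Escape for the query parser first, then URL-encode the whole parameter.
query = "title:" + escape_plus("hd+")
params = urlencode({"q": query})

print(query)
print(params)
```

The order matters: the backslash is part of the query text, so it has to be added before URL-encoding rather than after.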

Best,
Erick

On Fri, Aug 4, 2017 at 5:55 AM, d.kumar@technisat.de
<d....@technisat.de> wrote:
> Hey, thanks.
>
> Yeah, I found a way.
> I used my own fieldType for these fields. In it I'm using the WhitespaceTokenizerFactory for both query and index, and now everything works as it should.
>
> :-)
>
> Thanks
>
> David

AW: AW: plus sign in request / looking for + in title

Posted by "d.kumar@technisat.de" <d....@technisat.de>.
Hey, thanks.

Yeah, I found a way.
I used my own fieldType for these fields. In it I'm using the WhitespaceTokenizerFactory for both query and index, and now everything works as it should.

:-)

Thanks

David

-----Original Message-----
From: Shawn Heisey [mailto:apache@elyograg.org]
Sent: Friday, 4 August 2017 14:53
To: solr-user@lucene.apache.org
Subject: Re: AW: plus sign in request / looking for + in title

On 8/4/2017 2:15 AM, d.kumar@technisat.de wrote:
> So how can I prevent e.g. the ST (StandardTokenizer) from removing the plus sign? Any suggestions?

You can't.  The standard tokenizer really isn't configurable at all.

You'd need to change your analysis chain (tokenizer and filters) to produce the results you want.

Thanks,
Shawn


Re: AW: plus sign in request / looking for + in title

Posted by Shawn Heisey <ap...@elyograg.org>.
On 8/4/2017 2:15 AM, d.kumar@technisat.de wrote:
> So how can I prevent e.g. the ST (StandardTokenizer) from removing the plus sign? Any suggestions?

You can't.  The standard tokenizer really isn't configurable at all.

You'd need to change your analysis chain (tokenizer and filters) to
produce the results you want.
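
To make the difference concrete, here is a rough Python approximation (an assumption for illustration, not the actual Lucene implementation). Splitting on whitespace keeps "hd+" intact, whereas a tokenizer that only keeps runs of letters and digits drops the plus sign, which matches what the analysis output reported in this thread showed for the standard tokenizer. The sample title string is made up.

```python
import re

def whitespace_tokens(text: str) -> list[str]:
    """Split on whitespace only; punctuation stays attached to tokens."""
    return text.split()

def alnum_tokens(text: str) -> list[str]:
    """Crude stand-in for a word-boundary tokenizer: keep letter/digit runs."""
    return re.findall(r"[A-Za-z0-9]+", text)

title = "receiver hd+ silver"  # hypothetical title value
print(whitespace_tokens(title))
print(alnum_tokens(title))
```

Whichever tokenizer is chosen, the same analysis generally has to apply at index time and at query time, otherwise the indexed token and the query token will not match.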

Thanks,
Shawn


AW: plus sign in request / looking for + in title

Posted by "d.kumar@technisat.de" <d....@technisat.de>.
Hey Erick,

thanks for the reply.
The analysis page was a good pointer. I tried "hd+" as the field value and you were right:

ST
text hd
raw_bytes  [68 64]
start 0
end 2
positionLength 1 
type <ALPHANUM>
position 1
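
The raw_bytes values in this output are hexadecimal. A short Python check (added for illustration) shows that [68 64] decodes to "hd" only; a surviving plus sign would have appeared as a third byte, 2b.

```python
# Hex bytes as shown in the raw_bytes row of the analysis output.
token = bytes.fromhex("68 64").decode("utf-8")

# What the bytes would look like if the plus sign had been kept.
expected = "hd+".encode("utf-8").hex(" ")

print(token)
print(expected)
```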

So how can I prevent e.g. the ST (StandardTokenizer) from removing the plus sign? Any suggestions?

thanks


-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com]
Sent: Thursday, 3 August 2017 16:46
To: solr-user
Subject: Re: plus sign in request / looking for + in title

Take a look at your analysis chain. My bet is that the + is being stripped by some part of the chain. See the admin UI>>analysis page.

Best,
Erick


Re: plus sign in request / looking for + in title

Posted by Erick Erickson <er...@gmail.com>.
Take a look at your analysis chain. My bet is that the + is being stripped
by some part of the chain. See the admin UI>>analysis page.

Best,
Erick

On Aug 3, 2017 06:47, "d.kumar@technisat.de" <d....@technisat.de> wrote:

> Hey,
>
> Our title field contains the term "hd+".
> I want to query for exactly this term, but when I do, Solr only looks for
> "hd" and ignores the plus sign. I really need to search for the whole string.
> Of course, I URL-encoded the plus sign:
>
> q=title:hd%2B
>
> Can anyone please tell me how to search for the plus sign "+"?
>
> thanks
>
> David
>