Posted to users@solr.apache.org by son hoang <so...@gmail.com> on 2021/10/21 00:19:26 UTC

Index for text with space

Hello

I have a config like this:

<fieldtype name="tok" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <!-- <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15"/> -->
    </analyzer>
</fieldtype>

Using this config:

1. When I search for "Abbas", the result for "Al Abbas" appears.

2. When I search for "Al Abbas" in the search field, I get no results.

It seems that "Al Abbas" is not indexed. What should I do in the config so that #2 returns the result?

Many thanks

Re: Index for text with space

Posted by Andy C <an...@gmail.com>.

I would think your problem goes beyond 1- and 2-character words not being
indexed.

With your current field type definition, if someone searches for "can" it
will retrieve documents that contain any word that starts with "can". So
"candidate", "canadian", "cantina", etc.

Is this really the desired search behavior?

On Mon, Oct 25, 2021 at 8:48 AM Dave <ha...@gmail.com> wrote:

> You can pre process the query to remove anything not indexed (less than 3
> characters) but that initial scheme decision was a mistake, and should be
> remedied and reindexed.
>
> > On Oct 25, 2021, at 8:36 AM, son hoang <so...@gmail.com> wrote:
> >
> > Is there any way in the query so that I do not need to reindex the whole data?
> >
> >> On 2021/10/23 15:39:18, Walter Underwood <wu...@wunderwood.org> wrote:
> >> Agreed. There is a simple fix. Index all the words. Also, stop using EdgeNgramFilter.
> >> That is only used for completion, not word search.
> >>
> >> wunder
> >> Walter Underwood
> >> wunder@wunderwood.org
> >> http://observer.wunderwood.org/  (my blog)
> >>
> >>>> On Oct 23, 2021, at 4:31 AM, Dave <ha...@gmail.com> wrote:
> >>>
> >>> Why ever would you not index less than three characters?
> >>> “To be or not to be”
> >>> Seems like a significant search
> >>>
> >>>> On Oct 23, 2021, at 7:28 AM, son hoang <so...@gmail.com> wrote:
> >>>>
> >>>> Yep, words less than 3 chars will not be indexed. But if "Al Abbas" text can be separated into a token "Abbas" (and "Al"  but it is not counted as a token as it has 2 chars only) then we can apply OR condition in the query?
> >>>>
> >>>>> On 2021/10/22 14:37:51, Andy C <an...@gmail.com> wrote:
> >>>>> The issue looks to me to be with the use of EdgeNGramFilterFactory in your
> >>>>> field type. You have configured it with minGramSize="3" and have not
> >>>>> specified preserveOriginal="true".
> >>>>>
> >>>>> So words less than 3 characters will not be indexed, and therefore can't be
> >>>>> searched.
> >>>>>
> >>>>> See
> >>>>> https://solr.apache.org/guide/8_8/filter-descriptions.html#edge-n-gram-filter
> >>>>>
> >>>>> - Andy -
> >>>>>
> >>>>>> On Fri, Oct 22, 2021 at 10:12 AM son hoang <so...@gmail.com> wrote:
> >>>>>>
> >>>>>> Thanks, Thamiz
> >>>>>>
> >>>>>> It seems that I have index=StandardTokenizerFactory causing the issue
> >>>>>>
> >>>>>> I do not want to re-index. Is there any solution ? Should I have query
> >>>>>> "OR" so that the search can return  "Al Abbas" when I have  "Al Abbas" in
> >>>>>> the query field  (eg: there is a OR match "Abbas" ?
> >>>>>>
> >>>>>> Thanks
> >>>>>>
> >>>>>> On 2021/10/21 07:56:20, Thamizhazhagan B <Th...@kp.org> wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> Create a copy field as below and use this copyfield in your query..
> >>>>>>>
> >>>>>>> <copyField source="_name" dest="itemFullName"/>
> >>>>>>> <field name="itemFullName" type="itemFullName_type" stored="true"
> >>>>>>> indexed="true" termVectors="true" termPositions="true" termOffsets="true"/>
> >>>>>>>
> >>>>>>> <fieldType name="itemFullName_type" class="solr.TextField"
> >>>>>>> sortMissingLast="true" omitNorms="true" positionIncrementGap="100"
> >>>>>>> multiValued="false">
> >>>>>>>  <analyzer type="index">
> >>>>>>>    <tokenizer class="solr.KeywordTokenizerFactory"/>
> >>>>>>>    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
> >>>>>>>    <filter class="solr.LowerCaseFilterFactory"/>
> >>>>>>>  </analyzer>
> >>>>>>>  <analyzer type="query">
> >>>>>>>    <tokenizer class="solr.KeywordTokenizerFactory"/>
> >>>>>>>    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
> >>>>>>>    <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
> >>>>>>>    <filter class="solr.LowerCaseFilterFactory"/>
> >>>>>>>  </analyzer>
> >>>>>>> </fieldType>
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Thamizh

Re: Index for text with space

Posted by Dave <ha...@gmail.com>.

You can preprocess the query to remove anything that is not indexed (terms shorter than 3 characters), but that initial schema decision was a mistake; it should be remedied and the data reindexed.
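
One place such preprocessing could live is the query analyzer itself, so that no client code has to change. The fragment below is only a sketch under assumptions: it reuses the query analyzer of the "tok" field type from the original post and adds a solr.LengthFilterFactory line that nobody in the thread proposed; max="255" is an arbitrary upper bound chosen for illustration. Because only the query side changes, existing documents should not need to be reindexed, but short words such as "al" still cannot be matched; they are simply dropped from the query.

```xml
<!-- Query analyzer only. The index analyzer stays exactly as posted,
     so data that is already indexed is left untouched. -->
<analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Assumption: discard query tokens shorter than minGramSize (3),
         because the index holds no grams that short.
         max="255" is an arbitrary illustrative limit. -->
    <filter class="solr.LengthFilterFactory" min="3" max="255"/>
</analyzer>
```

With this in place, a query for Al Abbas should be analyzed down to the single term abbas. A query made up only of short words would be left with no terms at all, which is one reason this is a workaround rather than a fix.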


Re: Index for text with space

Posted by son hoang <so...@gmail.com>.

Is there any way to handle this in the query so that I do not need to reindex all the data?


Re: Index for text with space

Posted by Walter Underwood <wu...@wunderwood.org>.

Agreed. There is a simple fix. Index all the words. Also, stop using EdgeNGramFilter.
That is only used for completion, not word search.
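
A field type along those lines might look like the sketch below. The name "text_words" is hypothetical, and the filter chain is simply the one from the original post with the edge n-gram filter removed; a single analyzer element with no type attribute is used for both indexing and querying. Switching to it would require reindexing, and prefix matches (such as "Abb" finding "Abbas") would no longer work on this field.

```xml
<!-- Hypothetical field type name. Plain word search with no n-grams,
     so short words such as "al" are indexed as ordinary tokens. -->
<fieldType name="text_words" class="solr.TextField" positionIncrementGap="100">
    <!-- One analyzer without a type attribute applies at index and query time. -->
    <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
```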

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)


Re: Index for text with space

Posted by Dave <ha...@gmail.com>.

Why ever would you not index words of fewer than three characters?
“To be or not to be”
Seems like a significant search 


Re: RE: Index for text with space

Posted by son hoang <so...@gmail.com>.

Yep, words shorter than 3 chars will not be indexed. But if the text "Al Abbas" is split into the token "Abbas" (and "Al", which is not counted as a token because it has only 2 chars), can we then apply an OR condition in the query?


Re: RE: Index for text with space

Posted by Andy C <an...@gmail.com>.

The issue looks to me to be with the use of EdgeNGramFilterFactory in your
field type. You have configured it with minGramSize="3" and have not
specified preserveOriginal="true".

So words shorter than 3 characters (such as "Al") will not be indexed, and
therefore can't be searched.

See
https://solr.apache.org/guide/8_8/filter-descriptions.html#edge-n-gram-filter
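
A minimal sketch of that change, assuming the "tok" field type from the
original post and a Solr version whose EdgeNGramFilterFactory supports
preserveOriginal (the 8.8 guide linked above documents it). Only the
index-time filter line differs from the posted config; this is an
untested illustration, not a verified fix:

```xml
<fieldtype name="tok" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <!-- With minGramSize="3" alone, a 2-letter token such as "al" yields
             no grams and is dropped. preserveOriginal="true" keeps the
             original token when it is shorter than minGramSize. -->
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="3"
                maxGramSize="15" preserveOriginal="true"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldtype>
```

With this in place, "Al Abbas" should be indexed as al, abb, abba, abbas,
so a query containing both words can match. Because this changes
index-time analysis, existing documents would need to be reindexed before
the short terms become searchable.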

- Andy -

On Fri, Oct 22, 2021 at 10:12 AM son hoang <so...@gmail.com> wrote:

> Thanks, Thamizh
>
> It seems that using StandardTokenizerFactory in my index analyzer is
> causing the issue.
>
> I do not want to re-index. Is there any solution? Should I use an "OR"
> query so that searching for "Al Abbas" still returns "Al Abbas" (e.g.,
> via an OR match on "Abbas")?
>
> Thanks
>

Re: RE: Index for text with space

Posted by son hoang <so...@gmail.com>.

Thanks, Thamizh

It seems that using StandardTokenizerFactory in my index analyzer is causing the issue.

I do not want to re-index. Is there any solution? Should I use an "OR" query so that searching for "Al Abbas" still returns "Al Abbas" (e.g., via an OR match on "Abbas")?

Thanks

On 2021/10/21 07:56:20, Thamizhazhagan B <Th...@kp.org> wrote: 
> Hi,
> 
> Create a copy field as below and use this copy field in your query.
> 
> <copyField source="_name" dest="itemFullName"/>
>   <field name="itemFullName" type="itemFullName_type" stored="true" indexed="true" termVectors="true" termPositions="true" termOffsets="true"/>
> 
> <fieldType name="itemFullName_type" class="solr.TextField" sortMissingLast="true" omitNorms="true" positionIncrementGap="100" multiValued="false">
>     <analyzer type="index">
>       <tokenizer class="solr.KeywordTokenizerFactory"/>
>       <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>     </analyzer>
>     <analyzer type="query">
>       <tokenizer class="solr.KeywordTokenizerFactory"/>
>       <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
>       <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>     </analyzer>
>   </fieldType>
> 
> Thanks,
> Thamizh
> 
>

RE: Index for text with space

Posted by Thamizhazhagan B <Th...@kp.org>.

Hi,

Create a copy field as below and use this copy field in your query.

<copyField source="_name" dest="itemFullName"/>
  <field name="itemFullName" type="itemFullName_type" stored="true" indexed="true" termVectors="true" termPositions="true" termOffsets="true"/>

<fieldType name="itemFullName_type" class="solr.TextField" sortMissingLast="true" omitNorms="true" positionIncrementGap="100" multiValued="false">
    <analyzer type="index">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

Thanks,
Thamizh


-----Original Message-----
From: son hoang <so...@gmail.com>
Sent: Thursday, October 21, 2021 8:19 AM
To: users@solr.apache.org
Subject: Index for text with space

Hello

I have a config like this:

<fieldtype name="tok" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <!-- <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15"/> -->
    </analyzer>
</fieldtype>

Using this config:

1. When I search for "Abbas", the result for "Al Abbas" appears.

2. When I search for "Al Abbas" in the search field, I get no results.

It seems that "Al Abbas" is not indexed. What I should do in the config so #2 can return the result

Many thanks

Re: Index for text with space

Posted by Aroop Ganguly <ar...@icloud.com.INVALID>.

Can you share your query syntax in both cases please?

> On Oct 20, 2021, at 5:19 PM, son hoang <so...@gmail.com> wrote:
> 
> Hello
> 
> I have a config like this:
> 
> <fieldtype name="tok" class="solr.TextField" positionIncrementGap="100">
>     <analyzer type="index">
>         <tokenizer class="solr.StandardTokenizerFactory"/>
>         <filter class="solr.ASCIIFoldingFilterFactory"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15"/>
>     </analyzer>
>     <analyzer type="query">
>         <tokenizer class="solr.StandardTokenizerFactory"/>
>         <filter class="solr.ASCIIFoldingFilterFactory"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <!-- <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15"/> -->
>     </analyzer>
> </fieldtype>
> 
> Using this config:
> 
> 1. When I search for "Abbas", the result for "Al Abbas" appears.
> 
> 2. When I search for "Al Abbas" in the search field, I get no results.
> 
> It seems that "Al Abbas" is not indexed. What I should do in the config so #2 can return the result
> 
> Many thanks