Posted to solr-user@lucene.apache.org by Siarhei Chystsiakou <br...@gmail.com> on 2017/12/27 12:34:04 UTC

Enable default wildcard search

Hi everybody!
I am trying to integrate Solr 6.6.1 with my email server (Dovecot 2.32). I have
the following settings:

schema.xml - https://pastebin.com/1XXWTs8V
solrconfig.xml - https://pastebin.com/5HSswCcv

With these settings, search only finds exact matches on whole words. For
instance, if I search for Chris, it doesn't find Christmas. The client
does not support wildcard search, so I would like to know how to turn on
wildcard search for all queries.

I tried to do that by adding the following line to schema.xml:

<filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="25"/>

But once I added it, Solr 6.6.1 very often reported errors during
indexing and then crashed completely: even the web interface stopped
responding, and only a full Solr restart helped. The problem occurred on both
Solr 6.6.1 and Solr 7.2.

Also, with this filter enabled, the search results were not what I expected.
For example, when I searched for the word domain, the results included domes
as well as domain. I suppose that is correct from the point of view of how
this filter works, but it is not what I need.

That is why I would like to know how to turn on standard wildcard
search. Since that is impossible on the client side, I would like to handle it
on the Solr side.

Thanks.
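
A rough illustration of the domain / domes result described above, assuming the NGram filter sat in an analyzer used for both indexing and querying: the two words share the 3-character gram "dom", and one shared gram is enough for a match. This is plain Python, not Solr code. The helper ngrams() is hypothetical and only approximates which substrings the filter would produce for minGramSize=3 and maxGramSize=25.

```python
def ngrams(term, min_gram=3, max_gram=25):
    """Return every substring of term with a length from min_gram to max_gram.

    Hypothetical helper: it approximates the set of tokens an n-gram filter
    would emit for one term, ignoring token order and positions.
    """
    grams = set()
    longest = min(len(term), max_gram)
    for size in range(min_gram, longest + 1):
        for start in range(len(term) - size + 1):
            grams.add(term[start:start + size])
    return grams


# Grams that the query word and an indexed word have in common.
shared = ngrams("domain") & ngrams("domes")
print(sorted(shared))  # ['dom']
```

So from the filter's point of view the match is legitimate, which is why the result looks wrong even though nothing is broken.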

Re: Enable default wildcard search

Posted by Siarhei Chystsiakou <br...@gmail.com>.
Hi Rick!
Yes, as soon as I get the required result I'll definitely publish it on
GitHub. The default Dovecot schema doesn't suit me: with it, search
matches only whole words, but I want wildcard search. I
don't have a solution yet. I hope this group can help me.


2017-12-29 17:56 GMT+01:00 Rick Leir <rl...@leirtech.com>:

> Siarhei:
> Will you be putting up your system at github? I would like to Solr-ize my
> dovecot.
>
> Maybe you saw this already:
> https://github.com/dovecot/core/blob/master/doc/solr-schema.xml
>
> https://github.com/dovecot/core/blob/master/src/plugins/fts-solr/solr-connection.c
>
> https://github.com/dovecot/core/blob/master/src/plugins/fts-solr/fts-solr-plugin.h
>
> https://github.com/bdraco/dovecot/blob/master/doc/wiki/Plugins.FTS.Solr.txt
> Cheers -- Rick

Re: Enable default wildcard search

Posted by Rick Leir <rl...@leirtech.com>.
Siarhei:
Will you be putting up your system at github? I would like to Solr-ize my dovecot.

Maybe you saw this already:
https://github.com/dovecot/core/blob/master/doc/solr-schema.xml

https://github.com/dovecot/core/blob/master/src/plugins/fts-solr/solr-connection.c

https://github.com/dovecot/core/blob/master/src/plugins/fts-solr/fts-solr-plugin.h

https://github.com/bdraco/dovecot/blob/master/doc/wiki/Plugins.FTS.Solr.txt
Cheers -- Rick

On December 28, 2017 4:15:06 PM EST, Siarhei Chystsiakou <br...@gmail.com> wrote:
>Hi
>Does anyone have any idea how to fix this?

-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 

Re: Enable default wildcard search

Posted by Siarhei Chystsiakou <br...@gmail.com>.
Hi
Does anyone have any idea how to fix this?


Re: Enable default wildcard search

Posted by Mikhail Khludnev <mk...@apache.org>.
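Right. Applying the EdgeNGram filter at index time only should eliminate false
matches such as 3116 for the query 3115. When the filter also runs on the
query, 3115 is itself split into 311 and 3115, and the gram 311 matches every
term that starts with 311.
As for the crashes, the log should contain something like an OutOfMemoryError
about the Java heap. That is the cause.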

On Fri, Dec 29, 2017 at 5:27 PM, Siarhei Chystsiakou <br...@gmail.com>
wrote:

> Thank you for your answer.
>
> I tried EdgeNGram with the same settings:
>
> <filter class="solr.EdgeNGramFilterFactory" minGramSize="3"
> maxGramSize="25"/>
>
> The same problem appeared: the search was not quite correct. For instance,
> I need to find the number 311570, so I enter 3115 into the search bar, and
> the results contain all the numbers that start with 311, not just 3115.
> Perhaps I should have enabled this filter for indexing only?
> I'm also still concerned that with this filter Solr often crashed during
> indexing. How do I enable debug output properly so that I can show you more
> detailed errors?



-- 
Sincerely yours
Mikhail Khludnev
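
To make the index-only suggestion above concrete, here is a small Python simulation (not Solr code) of the 3115 / 311570 case. The helpers edge_ngrams() and matches() are hypothetical, and the model is simplified: a document counts as a match when the query tokens and the indexed tokens share at least one token. The value 311600 is an invented second document, used only to show the false match.

```python
def edge_ngrams(term, min_gram=3, max_gram=25):
    """Return the leading substrings of term with lengths min_gram..max_gram."""
    longest = min(len(term), max_gram)
    return [term[:size] for size in range(min_gram, longest + 1)]


def matches(query, indexed_term, gram_query):
    """Simplified matching: any token shared by query and index is a hit.

    gram_query=True models the filter running at query time as well;
    gram_query=False models index-only analysis, where the query stays whole.
    """
    index_tokens = set(edge_ngrams(indexed_term))
    query_tokens = set(edge_ngrams(query)) if gram_query else {query}
    return bool(index_tokens & query_tokens)


# Filter on both sides: 3115 is also reduced to 311, which hits 311600.
print(matches("3115", "311600", gram_query=True))   # True (false match)
# Filter at index time only: the query must equal one of the stored prefixes.
print(matches("3115", "311600", gram_query=False))  # False
print(matches("3115", "311570", gram_query=False))  # True
```

In schema.xml this corresponds to declaring the filter only inside the <analyzer type="index"> section of the field type and leaving it out of <analyzer type="query">. The affected field has to be reindexed after such a change.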

Re: Enable default wildcard search

Posted by Siarhei Chystsiakou <br...@gmail.com>.
Thank you for your answer.

I tried EdgeNGram with the same settings:

<filter class="solr.EdgeNGramFilterFactory" minGramSize="3"
maxGramSize="25"/>

The same problem appeared: the search was not quite correct. For instance,
I need to find the number 311570, so I enter 3115 into the search bar, and
the results contain all the numbers that start with 311, not just 3115.
Perhaps I should have enabled this filter for indexing only?
I'm also still concerned that with this filter Solr often crashed during
indexing. How do I enable debug output properly so that I can show you more
detailed errors?



2017-12-28 22:47 GMT+01:00 Mikhail Khludnev <mk...@apache.org>:

> Obviously, to the index Chris has nothing in common with Christmas: they
> are two different terms, so this classic search behavior is correct.
> What you are asking for is autocomplete, which is a separate UX with its
> own algorithms.
> You can start exploring this area from
> https://lucidworks.com/2015/03/04/solr-suggester/
> As you have seen, n-gramming just blows up the heap. You can band-aid it
> with EdgeNGram (which is probably what you want anyway) and give your poor
> server some more heap.
> Another approach is to stop n-gramming and do a real wildcard search using
> http://yonik.com/solr-query-parameter-substitution/
> It should be something like q=${text}*, so that when the client passes
> text=foo, Solr searches for foo*. It does not work for multi-word input,
> though, and it is expensive as well.

Re: Enable default wildcard search

Posted by Mikhail Khludnev <mk...@apache.org>.
Obviously, to the index Chris has nothing in common with Christmas: they
are two different terms, so this classic search behavior is correct.
What you are asking for is autocomplete, which is a separate UX with its
own algorithms.
You can start exploring this area from
https://lucidworks.com/2015/03/04/solr-suggester/
As you have seen, n-gramming just blows up the heap. You can band-aid it
with EdgeNGram (which is probably what you want anyway) and give your poor
server some more heap.
Another approach is to stop n-gramming and do a real wildcard search using
http://yonik.com/solr-query-parameter-substitution/
It should be something like q=${text}*, so that when the client passes
text=foo, Solr searches for foo*. It does not work for multi-word input,
though, and it is expensive as well.




-- 
Sincerely yours
Mikhail Khludnev
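
To give a feel for why plain n-gramming is so much heavier on the heap than EdgeNGram, the following Python sketch (not Solr code) counts how many tokens each approach would produce for a single word with minGramSize=3 and maxGramSize=25. The helper names are hypothetical and only mimic which substrings the filters generate. Actual memory use in Solr depends on much more than the token count.

```python
def ngrams(term, min_gram=3, max_gram=25):
    """All substrings of term with lengths min_gram..max_gram (NGram-style)."""
    out = []
    longest = min(len(term), max_gram)
    for size in range(min_gram, longest + 1):
        for start in range(len(term) - size + 1):
            out.append(term[start:start + size])
    return out


def edge_ngrams(term, min_gram=3, max_gram=25):
    """Only the leading substrings of term (EdgeNGram-style)."""
    longest = min(len(term), max_gram)
    return [term[:size] for size in range(min_gram, longest + 1)]


# Word length, number of n-grams, number of edge n-grams.
for word in ("christmas", "a" * 25):
    print(len(word), len(ngrams(word)), len(edge_ngrams(word)))
# 9 28 7
# 25 276 23
```

The number of plain n-grams grows roughly with the square of the word length, while edge n-grams grow linearly, which fits the advice to prefer EdgeNGram and still add some heap.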