Posted to solr-user@lucene.apache.org by roySolr <ro...@gmail.com> on 2011/04/29 12:18:12 UTC

Autocomplete(terms) middle of words

Hello,

I use the TermsComponent to provide autocomplete on my website. With
terms.prefix I get the following results:

searching for manch:

manchester city(10)
manchester united(2)

When a user searches for ches, I want the following results:

chesterfield united(13)
manchester united(2)

I want to match in the middle of words. How can I do that? I have tried
the NGramFilterFactory at index time,
but it doesn't seem to work with the TermsComponent.
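To make the difference concrete, here is a minimal Python sketch (not Solr code) that contrasts prefix matching with substring matching over whole-field terms. The term counts are copied from the examples above; the names `TERMS`, `prefix_suggest` and `infix_suggest` are made up for illustration. Note that a plain substring match for ches also returns manchester city, because manchester contains ches.

```python
# Term -> document count, copied from the examples in this post.
TERMS = {
    "manchester city": 10,
    "manchester united": 2,
    "chesterfield united": 13,
}

def prefix_suggest(terms, typed):
    """Return (term, count) pairs whose term starts with the typed text."""
    typed = typed.lower()
    hits = [(t, c) for t, c in terms.items() if t.startswith(typed)]
    # Highest count first.
    return sorted(hits, key=lambda tc: -tc[1])

def infix_suggest(terms, typed):
    """Return (term, count) pairs whose term contains the typed text anywhere."""
    typed = typed.lower()
    hits = [(t, c) for t, c in terms.items() if typed in t]
    return sorted(hits, key=lambda tc: -tc[1])

print(prefix_suggest(TERMS, "manch"))
print(infix_suggest(TERMS, "ches"))
```

The prefix lookup finds nothing for ches in the manchester entries, which is the behaviour described above; the substring lookup finds all three terms.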

My current configuration:

<fieldType name="suggestion" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>



--
View this message in context: http://lucene.472066.n3.nabble.com/Autocomplete-terms-middle-of-words-tp2878694p2878694.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Autocomplete(terms) middle of words

Posted by Grijesh <pi...@gmail.com>.
Hello,
If you are using NGram, do not use the TermsComponent. Query normally, like this:
http://localhost:8983/solr/select?q=suggestionField:chest

It will give you the desired suggestions.
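As a rough illustration, the request above could be built from user input with Python's standard library. The host, port and field name come from the URL above; the helper name `build_select_url` and the `rows` value are assumptions added for the example.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # host and port taken from the URL above

def build_select_url(field, typed, rows=10):
    """Build a /select URL that queries one field for the typed text.

    `rows` limits how many documents come back; 10 is an arbitrary choice.
    """
    params = {"q": f"{field}:{typed}", "rows": rows}
    # urlencode percent-encodes the colon and any spaces in the typed text.
    return f"{SOLR_BASE}/select?{urlencode(params)}"

print(build_select_url("suggestionField", "chest"))
```

With this approach the suggestions are the stored field values of the matching documents, not entries from the term dictionary.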
-----Thanx: 
Grijesh 
www.gettinhahead.co.in

Re: Autocomplete(terms) middle of words

Posted by ramires <uy...@beriltech.com>.
I already use nutch trunk 4.0. I have a problem with the space.


Re: Autocomplete(terms) middle of words

Posted by Grijesh <pi...@gmail.com>.
Solr 1.4 does not support terms.regex, so you need to upgrade to
Solr 3.1.
-----Thanx: 
Grijesh 
www.gettinhahead.co.in

Re: Autocomplete(terms) middle of words

Posted by ramires <uy...@beriltech.com>.
Hi,
I already tried both %20 and " " in the regex and neither worked. Also, regex=(.*)(book)
drops the spaces and returns merged results like:

thebook 
asbook 
atbook  
songbook
yearbook

Re: Autocomplete(terms) middle of words

Posted by Quentin Proust <q....@gmail.com>.
@roySolr: terms.regex exists only since Solr 3.1, so it is not compatible with Solr 1.4.

@ramires: Did you try a space in your regex? Something like
terms.regex=(.*) book (.*) <-- I put a space before and after book. If that
doesn't work, try replacing the space with %20. I haven't tried it, so I don't know
whether it works.
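For reference, a small Python sketch of percent-encoding such a regex before putting it in the URL. The handler path and the other parameters are copied from ramires's request elsewhere in this thread; the helper name `encode_regex` is invented for the example.

```python
from urllib.parse import quote, unquote

def encode_regex(regex):
    """Percent-encode a regex so it survives as a single URL query value."""
    # safe="" also encodes "/" and "*", leaving only letters, digits and "_.-~".
    return quote(regex, safe="")

pattern = "(.*) book (.*)"
encoded = encode_regex(pattern)
url = (
    "http://localhost:8983/solr/terms?terms=true&terms.fl=content"
    "&terms.regex=" + encoded +
    "&terms.regex.flag=case_insensitive&terms.limit=50"
)

print(encoded)            # %28.%2A%29%20book%20%28.%2A%29
print(unquote(encoded))   # (.*) book (.*)
```

Encoding only fixes how the space travels over HTTP. If the `content` field is tokenized on whitespace, the indexed terms contain no spaces at all, so a regex with a space cannot match any term regardless of how it is encoded.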

2011/4/29 roySolr <ro...@gmail.com>

> terms.regex doesn't work for me. Prefix works fine. I use Solr 1.4. Is it
> compatible?



-- 
----------------------------------------------------------------
Quentin Proust
Email : q.proust@gmail.com
Tel : 06.78.81.15.94
http://www.linkedin.com/in/quentinproust
----------------------------------------------------------------

Re: Autocomplete(terms) middle of words

Posted by roySolr <ro...@gmail.com>.
terms.regex doesn't work for me. Prefix works fine. I use Solr 1.4. Is it
compatible?

Re: Autocomplete(terms) middle of words

Posted by ramires <uy...@beriltech.com>.
Hi,
I have a question about terms.regex. I am trying to find the terms before and after
a word, but I can't send a blank character. How can I send it through?

terms?terms=true&terms.fl=content&terms.regex=(.*)( book)&terms.regex.flag=case_insensitive&terms.limit=50

Re: Autocomplete(terms) middle of words

Posted by Quentin Proust <q....@gmail.com>.
You can do it without NGram, with a query like this:

http://localhost:8983/solr/terms?terms=true&terms.fl=suggestionField&terms.regex=(.*)chest(.*)&terms.regex.flag=case_insensitive
In my case I had to encode (.*), so replace it with %28.*%29 if needed.
It uses a regex. I don't know whether that has an impact on performance.
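A rough Python model of what this request asks for, assuming the regex has to match the whole term (which is why the pattern is wrapped in `(.*)` on both sides). This only imitates the idea with Python's `re` module; it is not Solr code, and the term list and the name `regex_terms` are made up.

```python
import re

def regex_terms(terms, regex, case_insensitive=True):
    """Return the terms that the regex matches in full."""
    flags = re.IGNORECASE if case_insensitive else 0
    pattern = re.compile(regex, flags)
    # Every term is tested once, so the cost grows with the number of terms.
    return [t for t in terms if pattern.fullmatch(t)]

terms = ["arsenal", "chesterfield united", "manchester city", "manchester united"]
print(regex_terms(terms, "(.*)chest(.*)"))
# ['chesterfield united', 'manchester city', 'manchester united']
```

In this model a prefix lookup can stop early on a sorted list, while the regex has to be tried against each term, which is a plausible source of the performance concern on large fields.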
2011/4/29 lboutros <bo...@gmail.com>

> you could use EdgeNGramFilterFactory :
>
>
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.EdgeNGramFilterFactory
>
> And you should mix front and back ngram process in your analyzer :
>
> <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
> <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="back"/>
>
> is it better ?
>
> Ludovic.
> -----Jouve
> France.




Re: Autocomplete(terms) middle of words

Posted by roySolr <ro...@gmail.com>.
The words are now split into n-grams in the index. It looks like this:

m
ma
man
manc
manch
manche
manches
manchest
mancheste
manchester

The TermsComponent does not see it as one word (manchester). It gives me the
results back as n-grams (m, ma, man, etc.).

Re: Autocomplete(terms) middle of words

Posted by lboutros <bo...@gmail.com>.
You could use EdgeNGramFilterFactory:

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.EdgeNGramFilterFactory

And you should combine front and back n-gram filters in your analyzer:

<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="back"/>

Is that better?
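To see what two chained edge n-gram filters might produce, here is a simplified Python model. It assumes the back filter runs on every token emitted by the front filter, and that a token shorter than the minimum size yields nothing; it is a sketch of the idea, not the Lucene implementation, and the function names are invented.

```python
def edge_ngrams(token, min_size, max_size, side):
    """Grams anchored at the front or the back of the token."""
    grams = []
    for size in range(min_size, min(max_size, len(token)) + 1):
        grams.append(token[:size] if side == "front" else token[-size:])
    return grams

def front_then_back(token, min_size=2, max_size=15):
    """Apply the front filter, then the back filter to each front gram."""
    out = set()
    for front in edge_ngrams(token, min_size, max_size, "front"):
        out.update(edge_ngrams(front, min_size, max_size, "back"))
    return out

grams = front_then_back("manchester")
print("ches" in grams)   # True
print(sorted(grams)[:5])
```

Under these assumptions the chain yields every substring of at least two characters, so ches becomes an indexed term. The term dictionary then contains the grams rather than the original value, so a terms request would still return grams.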

Ludovic.
-----Jouve
France.

Re: Autocomplete(terms) middle of words

Posted by roySolr <ro...@gmail.com>.
OK, I tried NGrams. My configuration looks like this:

<fieldType name="suggestion" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="suggestionField" type="suggestion" indexed="true" stored="true"/>

I ran this query:

http://localhost:8983/solr/terms?terms.fl=suggestionField&terms.prefix=chest

Result:
chest
cheste
chester


The result is not what I expected. Is something wrong with the query?
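A small Python model of why this happens, assuming the indexed value is the single word manchester and that the terms request simply lists indexed terms that start with the prefix. The n-gram function only imitates the idea of the index-time filter (sizes 1 to 15, as configured above); the function names are invented.

```python
def ngrams(token, min_size, max_size):
    """All substrings of the token with a length between min_size and max_size."""
    grams = set()
    for size in range(min_size, max_size + 1):
        for start in range(len(token) - size + 1):
            grams.add(token[start:start + size])
    return grams

def terms_with_prefix(term_dictionary, prefix):
    """Indexed terms that start with the prefix, in sorted order."""
    return sorted(t for t in term_dictionary if t.startswith(prefix))

indexed = ngrams("manchester", 1, 15)   # what ends up in the term dictionary
print(terms_with_prefix(indexed, "chest"))
# ['chest', 'cheste', 'chester']
```

The terms request reads the term dictionary, and after n-gramming the dictionary holds fragments. The original value manchester is only one gram among many, and it does not start with chest.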

Re: Autocomplete(terms) middle of words

Posted by Grijesh <pi...@gmail.com>.
NGram will work for you if you want to search in the middle of a word. You can
also look at wildcard search for that.
NGram will increase the size of the index, while wildcard queries are slow.
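To get a feel for the index growth, this Python sketch counts how many grams one field value produces. The gram sizes 1 to 15 are the ones used in roySolr's configuration in this thread; the function name and the sample value are only for illustration, and the count is per value before duplicate grams are merged.

```python
def count_ngrams(length, min_size, max_size):
    """Number of n-gram positions for a token of the given length."""
    total = 0
    for size in range(min_size, max_size + 1):
        # A token of length n has n - size + 1 grams of a given size.
        total += max(0, length - size + 1)
    return total

value = "manchester united"          # 17 characters, kept whole by the keyword tokenizer
print(count_ngrams(len(value), 1, 15))   # 150
```

One 17-character value turns into 150 gram positions instead of a single term, which is the index-size cost mentioned above. A wildcard query keeps the index small and pays at query time instead.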


-----Thanx: 
Grijesh 
www.gettinhahead.co.in