
Posted to solr-user@lucene.apache.org by hassancrowdc <ha...@gmail.com> on 2013/04/19 17:48:24 UTC

Searching

I want to search so that:

- If I type a single letter, it returns all the items that start with that
letter (e.g. 'a' returns apple, aspire, etc.).

- If I search for a whole string, it returns only the results that match that
exact string (e.g. a search for Samsung S3 returns only samsung s3).

- If I type part of a word, it returns anything similar to what I am
asking for (e.g. if I only type 'sam' it should return 'samsung').

Right now I am using text_en_splitting as my field type. It looks like this:

<fieldType name="text_en_splitting" class="solr.TextField"
           positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="lang/stopwords_en.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="lang/stopwords_en.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.PositionFilterFactory"/>
  </analyzer>
</fieldType>



--
View this message in context: http://lucene.472066.n3.nabble.com/Searching-tp4057328.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Searching

Posted by William Bell <bi...@gmail.com>.

Guys,

How high or low a result ranks comes down to relative scores.

For example, say I want exact matches to be scored highest and then text
matches. You generally use a copyField to copy the source into 2 or more
fields, set up a different fieldType for each, and then boost one field over
the other.

So for exact match I would set up a lowercase fieldType, and the other would
be text.

Then I would also boost each, like: field1^1 field2^.1
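
As a rough sketch of that setup (not taken from a working config): the field
names name, name_exact and name_prefix, the type names string_lowercase and
text_prefix, and the boost values are all made up for illustration, and the
edismax defaults are an assumption about how the query would be sent. The
edge n-gram field is an addition that is not described above; it is one
common way to cover the 'sam' -> 'samsung' and single-letter cases.

```xml
<!-- schema.xml (sketch; field and type names are hypothetical) -->

<!-- Exact match: the whole value is kept as one lowercased token -->
<fieldType name="string_lowercase" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Prefix match (an addition): index the leading n-grams of each word,
     but leave the query text as typed -->
<fieldType name="text_prefix" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="name"        type="text_en_splitting" indexed="true" stored="true"/>
<field name="name_exact"  type="string_lowercase"  indexed="true" stored="false"/>
<field name="name_prefix" type="text_prefix"       indexed="true" stored="false"/>

<!-- Copy the same source text into the differently analyzed fields -->
<copyField source="name" dest="name_exact"/>
<copyField source="name" dest="name_prefix"/>

<!-- solrconfig.xml (sketch): boost the exact field over the others;
     the weights are placeholders to tune -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">name_exact^1 name_prefix^0.3 name^0.1</str>
  </lst>
</requestHandler>
```

Note that this only pushes exact matches to the top; it does not limit the
results to them. Also, as far as I know, the query parser splits on
whitespace before the field analyzers run, so a multi-word value such as
Samsung S3 would likely only hit name_exact when sent as a quoted phrase.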

On Friday, April 19, 2013, hassancrowdc wrote:

> Thanks. I was expecting an answer that would help me choose analyzers or
> tokenizers. Any help with any one of the scenarios?

-- 
Bill Bell

Re: Searching

Posted by hassancrowdc <ha...@gmail.com>.

Thanks. I was expecting an answer that would help me choose analyzers or
tokenizers. Any help with any one of the scenarios?




Re: Searching

Posted by Jack Krupansky <ja...@basetechnology.com>.

Yes, you can do all of that... but it would be a non-trivial amount of 
effort - the kind of thing consultants get paid real money to do. You should 
also consider doing it in a middleware application layer, using possibly 
multiple queries of separate Solr collections. Otherwise, your index might 
become too large and unwieldy (and risk giving bad or misleading results), 
unless the number of products is rather small.

-- Jack Krupansky
