Posted to solr-user@lucene.apache.org by prashantc88 <pr...@searshc.com> on 2014/07/18 17:08:35 UTC

Match query string within indexed field?

Hi,

My requirement is to give a match whenever a string is found within the
indexed data of a field irrespective of where it is found.

For example, suppose a field is indexed with the data "abc". Then any of the
following query strings must give a match: xyzabc, xyabc, abcxyz.

I am using *solr.KeywordTokenizerFactory* as the tokenizer class and the
*solr.LowerCaseFilterFactory* filter at index time in *schema.xml*.

Could anyone help me work out how to achieve this functionality?

Thanks in advance.



--
View this message in context: http://lucene.472066.n3.nabble.com/Match-query-string-within-indexed-field-tp4147896.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Match query string within indexed field?

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
I wonder if AnalyzingInfixSuggester could be useful as an alternative
way to approach that issue. See a quick write-up and links at:
http://blog.mikemccandless.com/2014/01/finding-long-tail-suggestions-using.html

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853



Re: Match query string within indexed field?

Posted by Umesh Prasad <um...@gmail.com>.
*Span queries, for illustration:*
During analysis, inject a start sentinel and an end sentinel into your indexed
field, so that after analysis the field looks like:
   <start> abc def <end>
Then, at query time, you can programmatically expand the query into clauses
that look like:
     (<start> xyz <end>) OR (<start> abc <end>) OR ...   basically all
unigrams
  (<start> xyz abc <end>) OR (<start> abc def <end>) OR ...   bigrams
and so on.

Then, for each of these clauses, you will need to generate a SpanQuery.
The flexible query parser can help here; you will need to plug a custom
query builder into it.

However, as you can see, n-gram-based queries result in a lot of clauses: on
the order of n^2 for a query of n terms against just one field. If you are
searching across m fields, that grows to m * n^2.
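
The clause expansion described above can be sketched in plain Python. This is
only an illustration of the idea, not Lucene or Solr code: the function names
and the list-of-tokens representation are assumptions made for the example,
and a simple list comparison stands in for a real SpanQuery.

```python
def sentinel_ngram_clauses(query, start="<start>", end="<end>"):
    """Return every contiguous word n-gram of `query`, wrapped in sentinels."""
    tokens = query.lower().split()
    clauses = []
    for size in range(1, len(tokens) + 1):        # unigrams, bigrams, ...
        for i in range(len(tokens) - size + 1):
            clauses.append([start, *tokens[i:i + size], end])
    return clauses


def matches(indexed_value, query):
    """True if the sentinel-wrapped indexed value equals one of the clauses."""
    indexed = ["<start>", *indexed_value.lower().split(), "<end>"]
    return indexed in sentinel_ngram_clauses(query)


clauses = sentinel_ngram_clauses("xy abc def z")
print(len(clauses))                          # 4 terms -> 4 + 3 + 2 + 1 = 10
print(matches("abc def", "xy abc def z"))    # True
print(matches("abc xy", "xy abc def z"))     # False: not a contiguous run
```

A query of n terms yields n * (n + 1) / 2 clauses, which is where the
quadratic growth comes from.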


On 20 July 2014 10:31, Umesh Prasad <um...@gmail.com> wrote:

> Please ignore my earlier answer. I had missed that you wanted match
> spotting, i.e. that all the indexed terms must be present in the query.
>
> One way I can think of is SpanQueries, but it won't be efficient and
> won't scale to multiple fields.
>
> My suggestion would be to keep a mapping of keyword --> <field name,
> count> in some key-value store and use it at query time to find the
> field name for the query terms.



-- 
---
Thanks & Regards
Umesh Prasad

Re: Match query string within indexed field?

Posted by Umesh Prasad <um...@gmail.com>.
Please ignore my earlier answer. I had missed that you wanted match
spotting, i.e. that all the indexed terms must be present in the query.

One way I can think of is SpanQueries, but it won't be efficient and
won't scale to multiple fields.

My suggestion would be to keep a mapping of keyword --> <field name,
count> in some key-value store and use it at query time to find the
field name for the query terms.
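
A rough sketch of such a mapping, in plain Python. Everything here is an
assumption made for illustration: an in-memory dict stands in for the
key-value store, "count" is taken to mean the number of indexed values in a
field that contain the keyword, and the field names "brand" and "category"
are hypothetical.

```python
from collections import defaultdict


def build_keyword_map(field_values):
    """Map each keyword to {field name: number of indexed values containing it}."""
    keyword_map = defaultdict(lambda: defaultdict(int))
    for field, values in field_values.items():
        for value in values:
            for keyword in set(value.lower().split()):
                keyword_map[keyword][field] += 1
    return keyword_map


def fields_for_query(keyword_map, query):
    """For each query term that is a known keyword, return its fields and counts."""
    return {
        term: dict(keyword_map[term])
        for term in query.lower().split()
        if term in keyword_map          # skip terms that were never indexed
    }


# Hypothetical indexed data: field name -> indexed values.
indexed = {"brand": ["abc def"], "category": ["def ghi"]}
keyword_map = build_keyword_map(indexed)
print(fields_for_query(keyword_map, "xy abc def z"))
```

Query terms that were never indexed ("xy" and "z" here) are dropped, so only
the fields that can possibly match need to be queried.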

On 19 July 2014 02:34, prashantc88 <pr...@searshc.com> wrote:

> Hi,
>
> Thanks for the reply. Is there a better way to do it if the scenario is the
> following:
>
> Indexed values: "abc def"
>
> Query String:"xy abc def z"
>
> So basically the query string has to match all the words present in the
> indexed data to give a MATCH.



-- 
---
Thanks & Regards
Umesh Prasad

Re: Match query string within indexed field?

Posted by prashantc88 <pr...@searshc.com>.
Hi,

Thanks for the reply. Is there a better way to do it if the scenario is the
following:

Indexed values: "abc def"

Query String:"xy abc def z"

So basically the query string has to match all the words present in the
indexed data to give a MATCH.




--
View this message in context: http://lucene.472066.n3.nabble.com/Match-indexed-data-within-query-string-tp4147896p4147958.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Match query string within indexed field?

Posted by Umesh Prasad <um...@gmail.com>.
You are looking for wildcard queries, but they can be quite costly and you
will need to benchmark performance.
Suffix wildcard queries (of the form *abc) are especially costly.

You can convert a suffix query into a prefix query by using a reversing
token filter (in Solr, solr.ReversedWildcardFilterFactory) during
index-time analysis.
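
Why reversing helps can be shown with a small plain-Python sketch. It is not
Solr code: a sorted list stands in for the term dictionary, and the function
names are made up for the example.

```python
from bisect import bisect_left


def build_reversed_index(terms):
    """Index time: store each term reversed, in sorted order."""
    return sorted(term.lower()[::-1] for term in terms)


def suffix_search(reversed_index, suffix):
    """Query time: '*suffix' becomes a prefix scan for the reversed suffix."""
    prefix = suffix.lower()[::-1]
    i = bisect_left(reversed_index, prefix)   # jump straight to the first candidate
    hits = []
    while i < len(reversed_index) and reversed_index[i].startswith(prefix):
        hits.append(reversed_index[i][::-1])  # un-reverse for display
        i += 1
    return hits


index = build_reversed_index(["xyzabc", "abcxyz", "xyabc", "abc"])
print(suffix_search(index, "abc"))   # ['abc', 'xyabc', 'xyzabc']
```

Because the reversed terms are sorted, all terms ending in "abc" sit next to
each other and can be found with one seek plus a short scan, instead of
checking every term in the dictionary.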

A search of older mails will be useful:
http://search-lucene.com/?q=wild+card+performance

Uwe's mail explains why optimizing the performance of suffix wildcard
queries is difficult:
http://search-lucene.com/m/w1CAyxDpbC1/wild+card+performance&subj=Wild+Card+Query+Performance








-- 
---
Thanks & Regards
Umesh Prasad