Posted to solr-user@lucene.apache.org by Mikhail Khludnev <mk...@griddynamics.com> on 2012/07/10 21:31:23 UTC

Re: Searching for sentences containing a list of words with a configurable number of words not in the list inbetween?

Welcome!

A few points:
- Did you choose the right mailing list? (Let me reply to the other one.)
- Have you checked proximity searches in the query parser syntax?
http://lucene.apache.org/core/3_6_0/queryparsersyntax.html#Proximity%20Searches
- The same in the Lucene query API is PhraseQuery and SpanNearQuery:
http://lucene.apache.org/core/3_6_0/api/core/org/apache/lucene/search/PhraseQuery.html
http://lucene.apache.org/core/3_6_0/api/core/org/apache/lucene/search/spans/SpanNearQuery.html
- It seems to me you should familiarize yourself with "explain" soon:
http://wiki.apache.org/solr/SolrRelevancyFAQ#Why_does_id:archangel_come_before_id:hawkgirl_when_querying_for_.22wings.22
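
To make the "slop" idea concrete: in the query parser syntax linked above, a
proximity search is a quoted phrase followed by ~ and a distance, for example
"house together"~1. The sketch below is not Lucene code and does not claim to
reproduce how Lucene computes slop. It is only a rough illustration, in plain
Python, of the behaviour asked about: all query words must appear close
together in any order, with at most a given number of other words inside the
matching window. The names tokenize, min_gap and matches are made up for this
example.

```python
import re


def tokenize(text):
    # Assumption: lowercase and split on non-word characters.
    # A real analyzer chain would be configured in Lucene/Solr instead.
    return re.findall(r"\w+", text.lower())


def min_gap(text, query_words):
    """Smallest number of non-query words inside any window of `text`
    that contains every query word at least once, in any order.
    Returns None if some query word is missing from the text."""
    wanted = {w.lower() for w in query_words}
    if not wanted:
        return 0
    tokens = tokenize(text)
    counts = {}   # query word -> occurrences inside the current window
    best = None
    left = 0
    for right, tok in enumerate(tokens):
        if tok in wanted:
            counts[tok] = counts.get(tok, 0) + 1
        # Shrink from the left while the window still covers all words.
        while len(counts) == len(wanted):
            window = tokens[left:right + 1]
            gap = sum(1 for t in window if t not in wanted)
            if best is None or gap < best:
                best = gap
            dropped = tokens[left]
            if dropped in wanted:
                counts[dropped] -= 1
                if counts[dropped] == 0:
                    del counts[dropped]
            left += 1
    return best


def matches(text, query_words, slop):
    # True if all query words occur with at most `slop` other words between.
    gap = min_gap(text, query_words)
    return gap is not None and gap <= slop


query = ["in", "a", "house", "together"]
print(matches("In a house together.", query, 0))           # True
print(matches("In a house sleeping together.", query, 0))  # False
print(matches("In a house sleeping together.", query, 1))  # True
```

Note that this sketch requires every query word to be present, like a
SpanNearQuery built from one clause per word. Ranking by how many of the
listed words occur is a separate scoring question, which is where "explain"
helps.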

Regards

On Mon, Jul 9, 2012 at 10:28 PM, Svetlana <ma...@dswp.co.uk> wrote:

> Hi,
>
> I am just about to work through the demo and get to know Lucene now that I
> actually got it to build :)  I was wondering if someone could point me in
> the right direction for my project.
>
> I want to query using a list of words but the order that they appear in and
> how common they are is not relevant (i.e. no 'stop words' if I got that
> terminology correct).  The only relevant thing is how closely grouped they
> are and how many of the words in the list occur, and I want to be able to
> configure from 0 (no other non-queried words in between) up to 'n'
> non-queried words in between.
>
> So for example, if I query for 'a and in house I go together or' (stupid
> example I guess) and specify 0 words in between, then I would only want to
> get hits with those query words in any order, sorted by relevance based on
> how many of those words occurred.  For example:
>
> 'In a house together' may be the most relevant result
>
> If I allow 1 other non-query word, results may look like:
>
> 1. 'In a house together.'
> 2. 'In a house sleeping together.'  ('sleeping' being the one extra word
> allowed)
>
> These should also be complete sentences or clauses, i.e. not 'fragments' -
> I guess I need to use a grammar analyser to determine that.
>
> Any help very much appreciated, I realise that this is probably deceptively
> difficult but if anyone can give some pointers that would be amazing.
>
> Svetlana


-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics

<http://www.griddynamics.com>
 <mk...@griddynamics.com>