Posted to solr-user@lucene.apache.org by Frederico Azeiteiro <Fr...@cision.com> on 2010/03/30 15:25:17 UTC
search within sentence or paragraph
Hi all,
Is it possible to search for a combination of words within the same
sentence or paragraph?
Ex: American and McDonalds
Returns: "McDonalds is an American company...."
Does not return: "...went to McDonalds. After that we saw the American
flag..."
Is this possible?
Frederico Azeiteiro
Re: search within sentence or paragraph
Posted by Erik Hatcher <er...@gmail.com>.
On Mar 30, 2010, at 12:36 PM, Ahmet Arslan wrote:
>
>> Is it possible to search for a combination of words within the same
>> sentence or paragraph?
>
> Mark Miller's Qsol Parser can do that [1]. However, it seems that it is
> temporarily not publicly available [2] [3].
>
>
> [1] http://www.lucidimagination.com/blog/2009/02/22/exploring-query-parsers/
> [2] http://search-lucene.com/m/9UT8jpcUc5/Where+to+download+Mark+Miller's+Qsol+Parser
> [3] https://issues.apache.org/jira/browse/SOLR-896
>
> The basic idea is to insert artificial tokens between sentences and
> paragraphs during index-time analysis, and then use SpanNotQuery at
> search time.
Another perhaps more tractable approach is to have a token filter that
bumps the position increment gap high when a sentence boundary is
detected. Then plain phrase queries will only match within a sentence
(unless the slop factor extends beyond the gap distance).
Erik
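To illustrate the position-gap idea, here is a minimal Python sketch. It is a toy model of token positions, not Lucene or Solr code: the names index_positions and within, the gap of 100, and the regex-based sentence splitter are all assumptions made for illustration, and the proximity check is a simplification of how phrase slop is actually computed.

```python
import re

SENTENCE_GAP = 100  # hypothetical gap added to the position at each sentence boundary


def index_positions(text, gap=SENTENCE_GAP):
    """Map each lowercased token to its positions, jumping by `gap` between sentences."""
    positions = {}
    pos = 0
    # Naive sentence splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        tokens = re.findall(r"\w+", sentence.lower())
        for tok in tokens:
            positions.setdefault(tok, []).append(pos)
            pos += 1
        if tokens:
            pos += gap  # the "bump" that keeps sentences far apart
    return positions


def within(positions, a, b, slop):
    """True if some occurrence of `a` is at most `slop` positions away from some `b`."""
    return any(
        abs(pa - pb) <= slop
        for pa in positions.get(a.lower(), [])
        for pb in positions.get(b.lower(), [])
    )


same = index_positions("McDonalds is an American company.")
split = index_positions("We went to McDonalds. After that we saw the American flag.")

print(within(same, "American", "McDonalds", slop=10))
print(within(split, "American", "McDonalds", slop=10))
print(within(split, "American", "McDonalds", slop=200))
```

With a slop of 10 only the single-sentence text matches. A slop larger than the gap (200 here) matches across the boundary again, which is the caveat about the slop factor mentioned above.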
Re: search within sentence or paragraph
Posted by Ahmet Arslan <io...@yahoo.com>.
> Is it possible to search for a combination of words within the same
> sentence or paragraph?
Mark Miller's Qsol Parser can do that [1]. However, it seems that it is temporarily not publicly available [2] [3].
[1] http://www.lucidimagination.com/blog/2009/02/22/exploring-query-parsers/
[2] http://search-lucene.com/m/9UT8jpcUc5/Where+to+download+Mark+Miller's+Qsol+Parser
[3] https://issues.apache.org/jira/browse/SOLR-896
The basic idea is to insert artificial tokens between sentences and paragraphs during index-time analysis, and then use SpanNotQuery at search time.
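A rough Python sketch of the boundary-token idea follows. It does not use Lucene: the marker token "_eos_", the function names, and the regex sentence splitter are hypothetical, and the check only imitates what a SpanNotQuery would do, namely reject a match between two terms when a boundary token falls inside the span.

```python
import re

BOUNDARY = "_eos_"  # hypothetical artificial token inserted after every sentence


def tokenize_with_boundaries(text):
    """Lowercase word tokens with a BOUNDARY marker appended after each sentence."""
    tokens = []
    # Naive sentence splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = re.findall(r"\w+", sentence.lower())
        if words:
            tokens.extend(words)
            tokens.append(BOUNDARY)
    return tokens


def same_sentence(text, a, b):
    """True if `a` and `b` co-occur with no BOUNDARY token between them."""
    tokens = tokenize_with_boundaries(text)
    pos_a = [i for i, t in enumerate(tokens) if t == a.lower()]
    pos_b = [i for i, t in enumerate(tokens) if t == b.lower()]
    boundaries = [i for i, t in enumerate(tokens) if t == BOUNDARY]
    for pa in pos_a:
        for pb in pos_b:
            lo, hi = sorted((pa, pb))
            # Keep the span only if no sentence marker lies inside it.
            if not any(lo < x < hi for x in boundaries):
                return True
    return False


print(same_sentence("McDonalds is an American company.", "American", "McDonalds"))
print(same_sentence("We went to McDonalds. After that we saw the American flag.",
                    "American", "McDonalds"))
```

The same scheme would extend to paragraphs by inserting a second, distinct marker at paragraph breaks and excluding spans that contain it.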