Posted to solr-user@lucene.apache.org by Mark Tovey <mt...@ionagroup.com> on 2009/01/08 18:16:39 UTC

Querying based on term position possible?

I'm a relative newbie at Solr/Lucene so apologies if this question is
overly simplistic. I have an index built and functioning as expected,
but I am trying to build a query that can sort/score results based on
the search term's position in the document, with a document appearing
higher in the results list if the term appears earlier in the document.
For example, "Red fox in the forest" would be scored over "My shoes are
red today and my shirt is also red" if I search for the term "red". It
seems to me that the default scoring algorithm is based more on the term
frequency than term position, though this may be a simplistic
interpretation. Does anyone on the list know if there is a way to
achieve my desired results by structuring a query a certain way, or is
this more of an indexing issue where I should have set a parameter(s) in
my schema to a certain value? Any help is hugely appreciated as I have
been puzzling away at this for the past couple of days with no success.

 

Alternatively, is there a way to query on two fields for a search term
with documents being placed higher in the results if the term occurs in
field1 over field2? I ask this because one of the fields in my schema
(title in this case) is deemed more important in our scenario than
the "text" field (which holds the title plus the contents of the
remainder of the document). I tried, for example, title:red text:red but
again was stumped on the syntax to place an "importance" variable on
field1 over field2.

 

Of course, it may be that what I'm trying to accomplish is simply not
doable with the Lucene engine, at which point feel free to point out the
error of my ways ;)

 

Regards,

--Mark Tovey


Re: Querying based on term position possible?

Posted by Alexander Ramos Jardim <al...@gmail.com>.
2009/1/8 Otis Gospodnetic <ot...@yahoo.com>

> Hello Mark,
>


> As for assigning different weight to fields, have a look at DisMax request
> handler -
> http://wiki.apache.org/solr/DisMaxRequestHandler#head-af452050ee272a1c88e2ff89dc0012049e69e180
>

Field boosting should solve this issue too, right?
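
For reference, query-time field boosting in the standard Lucene query syntax appends ^N to a clause, so the query from the original question could become title:red^2 text:red. The Python sketch below only builds that query string and URL-encodes it; the helper name boosted_query, the boost value 2, and the fl=*,score parameter are illustrative assumptions, not something taken from this thread.

```python
from urllib.parse import urlencode


def boosted_query(term, boosts):
    """Build a Lucene query string such as 'title:red^2 text:red'.

    `boosts` maps field name -> boost factor (hypothetical values).
    The term is assumed to be a single word with no special characters;
    no query-syntax escaping is attempted here.
    """
    clauses = []
    for field, boost in boosts.items():
        clause = f"{field}:{term}"
        if boost != 1:
            clause += f"^{boost}"  # boost of 1 is the default, so omit it
        clauses.append(clause)
    return " ".join(clauses)


q = boosted_query("red", {"title": 2, "text": 1})
# Ask for the score as well, so the effect of the boost is visible.
query_string = urlencode({"q": q, "fl": "*,score"})
print(q)
print(query_string)
```

A match in title then contributes more to the score than the same match in text; the right boost value has to be found by experiment.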


>
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


-- 
Alexander Ramos Jardim

Re: Querying based on term position possible?

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hello Mark,

You could have position information play a role in scoring if you use the Span* family of queries.  I believe Solr does not currently support them, but you could use Qsol + https://issues.apache.org/jira/browse/SOLR-896 to get what you need.
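
To make the goal concrete, here is a toy Python sketch of position-sensitive ranking using the two example documents from the question. It is not Lucene code and not how Span queries score; the 1 / (1 + position) formula and the function names are assumptions chosen only for illustration. In Lucene itself, SpanFirstQuery is the Span* class that restricts matches to the first N positions of a field.

```python
import re


def first_position(text, term):
    """Return the 0-based token position of the first occurrence, or None."""
    tokens = re.findall(r"\w+", text.lower())
    try:
        return tokens.index(term.lower())
    except ValueError:
        return None


def position_score(text, term):
    """Toy score: earlier first occurrence gives a higher score (assumed formula)."""
    pos = first_position(text, term)
    return 0.0 if pos is None else 1.0 / (1 + pos)


docs = [
    "My shoes are red today and my shirt is also red",
    "Red fox in the forest",
]

# Sort so that documents where the term appears earliest come first.
ranked = sorted(docs, key=lambda d: position_score(d, "red"), reverse=True)
for doc in ranked:
    print(position_score(doc, "red"), doc)
```

Note that term frequency plays no part here: the second "red" in the shoes sentence does not help it, which is the behaviour the question asks for.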

As for assigning different weight to fields, have a look at DisMax request handler - http://wiki.apache.org/solr/DisMaxRequestHandler#head-af452050ee272a1c88e2ff89dc0012049e69e180
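
As a rough sketch of what a DisMax request could look like, the Python below builds the request parameters with the title field weighted above text through the qf parameter. The host, port, and path (localhost:8983/solr/select), the weights 2.0 and 1.0, and the helper name dismax_params are assumptions for illustration; depending on the Solr version and configuration, DisMax may instead be selected through a named request handler in solrconfig.xml.

```python
from urllib.parse import urlencode


def dismax_params(user_query, field_weights):
    """Build DisMax request parameters.

    `field_weights` maps field name -> weight (hypothetical values);
    they are joined into the space-separated `qf` list, e.g. 'title^2.0 text^1.0'.
    """
    qf = " ".join(f"{field}^{weight}" for field, weight in field_weights.items())
    return {
        "defType": "dismax",   # select the DisMax query parser
        "q": user_query,       # raw user input, no field prefixes needed
        "qf": qf,              # fields to search, with per-field weights
        "fl": "*,score",       # return the score to inspect the ranking
    }


params = dismax_params("red", {"title": 2.0, "text": 1.0})
# Assumed local Solr instance; adjust host, port and path for a real setup.
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```

With this, the user only types "red" and the field weighting lives in the request parameters (or in the handler defaults) rather than in the query string.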


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


