Posted to java-user@lucene.apache.org by Scott Phillips <sc...@gmail.com> on 2007/10/10 06:41:45 UTC

Field rank?

Hi everyone,

I have a question that I can't quite seem to find the answer to by
googling or searching the archives of this mailing list. The problem
is I would like to weight some fields more than others. Assume that I
have three fields: title, author, and default where title and author
contain their respective attributes and default contains all text
extracted from the document. When a user searches for 'foo' I would like to
see at the top of the result list documents that contain 'foo' in the
title.

Looking through the archive I see I could re-craft my user's query as such:

(title: foo)^3 OR (author: foo)^2 OR (default: foo)

But this doesn't seem like a very elegant solution to the problem, and it
becomes more problematic as the number of fields increases. Also I'm
not sure if this is possible if the user entered more complex
queries...

Does anyone have any ideas?

Scott--

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Field rank?

Posted by Scott Phillips <sc...@gmail.com>.

Thank you Doron & Kyle,

You both pointed me in the right direction. For anyone finding this
thread in the email archive, here is what I've learned.

You can boost field weights either at indexing time or at query time.
At indexing time, call .setBoost(float) on the field before it is
added to the document. Alternatively, at query time you can provide a
HashMap of field names to boosts as the third parameter of the
MultiFieldQueryParser constructor.

In either case you need to search across all of the fields, for example
with the MultiFieldQueryParser; otherwise you'll only be searching
against one specific field and it won't matter what weight the others
are boosted by. You'll need to pass the parser all the possible search
fields to combine together. In effect this will run your query against
each field and then combine the results. I am not sure what
performance impact this will have.
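
For anyone who wants to see what combining the search fields looks
like, here is a minimal sketch of the expansion. It is modelled on the
behaviour described in the MultiFieldQueryParser Javadoc, but it does
not call Lucene at all: the class name MultiFieldExpansionSketch and
the method expand are hypothetical, and the sketch only handles plain
whitespace-separated terms (no operators, phrases, or escaping).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, NOT part of Lucene: it only builds a query string
// that shows how one user term fans out across several boosted fields.
class MultiFieldExpansionSketch {

    /** Expands each whitespace-separated term into one group of per-field clauses. */
    public static String expand(String userQuery, Map<String, Float> boosts) {
        StringJoiner groups = new StringJoiner(" ");
        for (String term : userQuery.trim().split("\\s+")) {
            StringJoiner clauses = new StringJoiner(" ", "(", ")");
            for (Map.Entry<String, Float> e : boosts.entrySet()) {
                // field:term^boost, e.g. title:foo^3.0
                clauses.add(e.getKey() + ":" + term + "^" + e.getValue());
            }
            groups.add(clauses.toString());
        }
        return groups.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order for readable output.
        Map<String, Float> boosts = new LinkedHashMap<>();
        boosts.put("title", 3.0f);
        boosts.put("author", 2.0f);
        boosts.put("default", 1.0f);
        // prints: (title:foo^3.0 author:foo^2.0 default:foo^1.0)
        System.out.println(expand("foo", boosts));
    }
}
```

With the real parser you would pass the same kind of field-to-boost map
to the constructor instead of building the string yourself, so more
complex user queries are still handled by the parser.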

Thanks,
Scott--

On 10/10/07, Kyle Maxwell <ky...@casttv.com> wrote:
> > I have a question that I can't quite seem to find the answer to by
> > googling or searching the archives of this mailing list. The problem
> > is I would like to weight some fields more than others. Assume that I
> > have three fields: title, author, and default where title and author
> > contain their respective attributes and default contains all text
> > extracted from the document. When a user searches for 'foo' I would like to
> > see at the top of the result list documents that contain 'foo' in the
> > title.
> >
> > Looking through the archive I see I could re-craft my user's query as such:
> >
> > (title: foo)^3 OR (author: foo)^2 OR (default: foo)
> >
> > But this doesn't seem like a very elegant solution to the problem, and it
> > becomes more problematic as the number of fields increases. Also I'm
> > not sure if this is possible if the user entered more complex
> > queries...
>
> One of the MultiFieldQueryParser's constructors has a field boost argument.
>
> --
> Kyle Maxwell
> Software Engineer
> CastTV, Inc
> http://www.casttv.com

Re: Field rank?

Posted by Kyle Maxwell <ky...@casttv.com>.

> I have a question that I can't quite seem to find the answer to by
> googling or searching the archives of this mailing list. The problem
> is I would like to weight some fields more than others. Assume that I
> have three fields: title, author, and default where title and author
> contain their respective attributes and default contains all text
> extracted from the document. When a user searches for 'foo' I would like to
> see at the top of the result list documents that contain 'foo' in the
> title.
>
> Looking through the archive I see I could re-craft my user's query as such:
>
> (title: foo)^3 OR (author: foo)^2 OR (default: foo)
>
> But this doesn't seem like a very elegant solution to the problem, and it
> becomes more problematic as the number of fields increases. Also I'm
> not sure if this is possible if the user entered more complex
> queries...

One of the MultiFieldQueryParser's constructors has a field boost argument.

-- 
Kyle Maxwell
Software Engineer
CastTV, Inc
http://www.casttv.com

Re: indexing on NAS

Posted by Chris Lu <ch...@gmail.com>.

I have used Lucene on a SAN in a federal project, and it worked out great.
The setup supports search clustering, where several servers search a shared
index that is produced by another server. You only need to refresh the
searching servers' IndexSearcher after indexing is done.
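
A rough sketch of that refresh pattern, using plain Java collections as
stand-ins: the class SearcherRefreshSketch and its Snapshot type are
hypothetical and do not use Lucene. The list plays the role of the
shared index and a Snapshot plays the role of an open IndexSearcher,
which only sees the index as it was when the snapshot was taken. The
sketch is single-threaded apart from the atomic swap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical model, NOT Lucene code: shows why searchers must be
// refreshed before they can see documents added after they were opened.
class SearcherRefreshSketch {

    // Stand-in for an open IndexSearcher: a fixed view of the index.
    static final class Snapshot {
        private final List<String> docs;

        Snapshot(List<String> docs) {
            this.docs = new ArrayList<>(docs); // copy, so later writes are invisible
        }

        int count(String term) {
            int hits = 0;
            for (String doc : docs) {
                if (doc.contains(term)) {
                    hits++;
                }
            }
            return hits;
        }
    }

    // Stand-in for the index on shared storage, written by the indexing server.
    private final List<String> sharedIndex = new ArrayList<>();

    // The view currently used by a searching server.
    private final AtomicReference<Snapshot> current =
            new AtomicReference<>(new Snapshot(sharedIndex));

    public void index(String doc) {
        sharedIndex.add(doc);
    }

    // Called on the searching side once indexing is done.
    public void refresh() {
        current.set(new Snapshot(sharedIndex));
    }

    public int search(String term) {
        return current.get().count(term);
    }
}
```

Searches keep being answered from the old view until refresh() is
called, which mirrors reopening the searcher after the indexing server
has finished.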

-- 
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes

On 10/11/07, yasoja seneviratne <ya...@hotmail.com> wrote:
>
>
> Hi,
>
> I wonder if there are any known issues with having a Lucene index on a NAS or
> SAN drive? Some basic tests show that it works fine. But are there
> performance issues with indexing on a NAS, for instance?
>
> Thanks,
> Yasoja
>

indexing on NAS

Posted by yasoja seneviratne <ya...@hotmail.com>.

Hi,
 
I wonder if there are any known issues with having a Lucene index on a NAS or SAN drive? Some basic tests show that it works fine. But are there performance issues with indexing on a NAS, for instance?
 
Thanks,
Yasoja

Re: Field rank?

Posted by Doron Cohen <DO...@il.ibm.com>.

Hi Scott,

Would indexing time field boosts work for you?
http://lucene.apache.org/java/docs/scoring.html#Score%20Boosting

Doron

"Scott Phillips" wrote:

> Hi everyone,
>
> I have a question that I can't quite seem to find the answer to by
> googling or searching the archives of this mailing list. The problem
> is I would like to weight some fields more than others. Assume that I
> have three fields: title, author, and default where title and author
> contain their respective attributes and default contains all text
> extracted from the document. When a user searches for 'foo' I would like to
> see at the top of the result list documents that contain 'foo' in the
> title.
>
> Looking through the archive I see I could re-craft my user's
> query as such:
>
> (title: foo)^3 OR (author: foo)^2 OR (default: foo)
>
> But this doesn't seem like a very elegant solution to the problem, and it
> becomes more problematic as the number of fields increases. Also I'm
> not sure if this is possible if the user entered more complex
> queries...
>
> Does anyone have any ideas?
>
> Scott--