Posted to user@lucenenet.apache.org by Robert Jordan <ro...@gmx.net> on 2010/04/23 17:36:41 UTC

MoreLikeThis queries

Hi,

I've encountered a rather strange issue with queries
generated by the MoreLikeThis (Lucene.NET 2.9.2) class:

When I pass such a query to searcher.Search(query, null, 10)
(using the new TopDocs API), the TopDocs count is not limited as
specified. I'd expect max. 10 hits, but got far more.

Am I missing something fundamental?

Robert


Negate search

Posted by "Monteiro, Alvaro" <Al...@sage.pt>.
Hi!!

I'm using Lucene for simple database indexing. Until now it has been a
breeze. Lucene works great and searching is fast and reliable. However, a
problem has shown up.
First of all, let me give you an example of how my Lucene documents are
organized:

Each document represents an entity that reflects a row on a given
database table.
Imagine the entity Clients. It has the following fields:

'Code', 'Name', 'Address', 'Clients', 'content'.

'content' is the default field. It contains all information.
Clients field has the same content as the 'content' field in order to
allow search for Clients alone.

The problem:
Imagine that the 'Address' and 'Name' fields both contain the word "York".
"York" would then be found in 'Clients' and
'content' as well.

A given user is not allowed to see or obtain results from the
'Address' field. If I parse the query and transform it into
something like "-Address:writtenquery", I will not obtain any Lucene
documents that have that term in the 'Address' field. The thing is: if "York"
is also in the 'Name' field, I should get the Lucene document because
the user should be able to search that field.

What is the best way to restrict the search, taking this scenario
into account?

Thank you so much for your help.

Alvaro 
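
To make the scenario above concrete, here is a minimal sketch (not an answer given in this thread) that contrasts the negated query with one possible alternative: querying only the fields the user is permitted to search. The class name, the sample data, and the choice of 'Code' and 'Name' as the permitted fields are assumptions for illustration; the API calls are those of Lucene.NET 2.9.x as far as I know them.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

class PermittedFieldSearch
{
    // Hypothetical helper: builds one "Clients" document in the layout described above.
    static Document Client(string code, string name, string address)
    {
        Document doc = new Document();
        doc.Add(new Field("Code", code, Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.Add(new Field("Name", name, Field.Store.YES, Field.Index.ANALYZED));
        doc.Add(new Field("Address", address, Field.Store.YES, Field.Index.ANALYZED));
        // 'content' aggregates everything, so it also contains the address text.
        doc.Add(new Field("content", code + " " + name + " " + address,
                          Field.Store.NO, Field.Index.ANALYZED));
        return doc;
    }

    static void Print(string label, IndexSearcher searcher, Query query)
    {
        TopDocs top = searcher.Search(query, null, 10);
        Console.WriteLine(label + ": " + top.totalHits + " hit(s)");
        foreach (ScoreDoc sd in top.scoreDocs)
            Console.WriteLine("  " + searcher.Doc(sd.doc).Get("Name"));
    }

    static void Main()
    {
        RAMDirectory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(
            dir, new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29),
            true, IndexWriter.MaxFieldLength.UNLIMITED);
        writer.AddDocument(Client("C1", "York Traders", "1 High Street, York"));  // Name and Address
        writer.AddDocument(Client("C2", "Smith Ltd", "5 Mill Lane, York"));       // Address only
        writer.AddDocument(Client("C3", "York Supplies", "9 Dock Road, Leeds"));  // Name only
        writer.Close();

        IndexSearcher searcher = new IndexSearcher(dir, true);

        // content:york -Address:york
        // Drops every document with "york" in Address, including C1,
        // even though C1 also matches on the permitted Name field.
        BooleanQuery negated = new BooleanQuery();
        negated.Add(new TermQuery(new Term("content", "york")), BooleanClause.Occur.MUST);
        negated.Add(new TermQuery(new Term("Address", "york")), BooleanClause.Occur.MUST_NOT);
        Print("negated", searcher, negated);

        // Alternative: match only against the permitted fields (assumed here
        // to be Code and Name), so Address never influences the result.
        BooleanQuery permitted = new BooleanQuery();
        foreach (string field in new string[] { "Code", "Name" })
            permitted.Add(new TermQuery(new Term(field, "york")), BooleanClause.Occur.SHOULD);
        Print("permitted fields only", searcher, permitted);

        searcher.Close();
    }
}
```

In this sketch the negated query should return only "York Supplies", while the permitted-fields query should return both "York Traders" and "York Supplies". Note that this only controls which fields are matched; the stored 'Address' value would still have to be withheld when results are displayed.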

RE: Find articles I like

Posted by Digy <di...@gmail.com>.
See "MoreLikeThisQuery" in "Contrib/Queries.Net".
It does what you are looking for.
DIGY



-----Original Message-----
From: Robert Pohl [mailto:robban.p@gmail.com] 
Sent: Thursday, April 29, 2010 11:18 PM
To: lucene-net-user@lucene.apache.org
Subject: Find articles I like

Hi, I have an idea about a function to suggest feed articles that are
similar to the ones that I have already read.
I want to find the most common words in the n latest article titles,
and create a string with these words.
Then I take that string and find articles similar to it, to get
the latest articles that I "like to read".


Can this be solved in pure Lucene? Do you understand where I want to go 
with this? :)

Any ideas or suggestions are highly appreciated!

Thanks,
Rob


Find articles I like

Posted by Robert Pohl <ro...@gmail.com>.
Hi, I have an idea about a function to suggest feed articles that are
similar to the ones that I have already read.
I want to find the most common words in the n latest article titles,
and create a string with these words.
Then I take that string and find articles similar to it, to get
the latest articles that I "like to read".


Can this be solved in pure Lucene? Do you understand where I want to go 
with this? :)

Any ideas or suggestions are highly appreciated!

Thanks,
Rob


Re: MoreLikeThis queries

Posted by Robert Jordan <ro...@gmx.net>.
On 23.04.2010 17:36, Robert Jordan wrote:
> Hi,
>
> I've encountered a rather strange issue with queries
> generated by the MoreLikeThis (Lucene.NET 2.9.2) class:
>
> When I pass such a query to searcher.Search(query, null, 10)
> (using the new TopDocs API), the TopDocs count is not limited as
> specified. I'd expect max. 10 hits, but got far more.
>
> Am I missing something fundamental?

Indeed, I do :)

Instead of checking TopDocs.scoreDocs.Length, I was misled
by TopDocs.totalHits, which is the total number of documents in the
index that match the query, not the number of hits returned.
I'm still not used to the new API.

Sorry for the noise!

Robert
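
The difference between the two counts can be seen in a small self-contained sketch. It assumes Lucene.NET 2.9.x and uses a plain TermQuery over a made-up in-memory index instead of a MoreLikeThis-generated query; the class name and sample data are hypothetical.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

class TopDocsCounts
{
    static void Main()
    {
        // Build a small in-memory index: 25 documents that all contain "york".
        RAMDirectory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(
            dir, new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29),
            true, IndexWriter.MaxFieldLength.UNLIMITED);
        for (int i = 0; i < 25; i++)
        {
            Document doc = new Document();
            doc.Add(new Field("content", "york client " + i,
                              Field.Store.YES, Field.Index.ANALYZED));
            writer.AddDocument(doc);
        }
        writer.Close();

        IndexSearcher searcher = new IndexSearcher(dir, true); // read-only
        Query query = new TermQuery(new Term("content", "york"));

        // Ask for at most 10 results.
        TopDocs top = searcher.Search(query, null, 10);

        // totalHits counts every matching document in the index;
        // scoreDocs holds only the top results that were requested.
        Console.WriteLine("totalHits        = " + top.totalHits);
        Console.WriteLine("scoreDocs.Length = " + top.scoreDocs.Length);

        // Iterate over scoreDocs, never up to totalHits.
        foreach (ScoreDoc sd in top.scoreDocs)
            Console.WriteLine(searcher.Doc(sd.doc).Get("content"));

        searcher.Close();
    }
}
```

With 25 matching documents and a limit of 10, this should report a totalHits of 25 and a scoreDocs.Length of 10, which is the behaviour described above.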