Posted to solr-user@lucene.apache.org by Marcelk <ma...@gmail.com> on 2009/09/14 14:28:04 UTC

Solr results filtered on MoreLikeThis

Hi,

I hope someone can help me find the right solution for my search
application. I hope I'm not repeating a question that has been asked
before, but I could not find a similar one out there, so I'm asking it
here...

Here goes:

My index contains documents, some of which may be duplicates based on
content. The documents come from various locations on the internet. In
some cases these documents look the same and in some cases they are the
same.

What I am trying to achieve is a result list of matching documents in which
each result is unique with respect to MoreLikeThis. Similar documents
should appear only in the detail view, not in the result list. Each result
should state the number of similar (morelikethis) documents it has.

So if 3 similar documents match and another 4 similar documents match, I
only want 2 results, like this:

- document1 (3 similar documents)
- document2 (4 similar documents)

And when users click further I will let them see all the similar documents,
but not in the search result.

I have used MoreLikeThis via the standard query, not the
MoreLikeThisHandler, and I can see that the results are separate from the
"morelikethis" element in the response.
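
For reference, a minimal sketch of what such a standard query with the MoreLikeThis component switched on could look like. The host, path, and the field name "content" are assumptions for illustration; mlt, mlt.fl and mlt.count are the component's request parameters.

```python
from urllib.parse import urlencode

# Assumed location of the select handler; adjust to your own setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_mlt_query(user_query, rows=20, mlt_field="content", mlt_count=5):
    """Return a select URL that asks for MoreLikeThis data for each result."""
    params = {
        "q": user_query,
        "rows": rows,
        "mlt": "true",           # enable the MoreLikeThis component
        "mlt.fl": mlt_field,     # field(s) used to compute similarity (assumed name)
        "mlt.count": mlt_count,  # similar documents returned per result
        "wt": "json",
    }
    return SOLR_SELECT_URL + "?" + urlencode(params)


url = build_mlt_query("java")
print(url)
```

The response then carries the normal result list plus a separate "morelikethis" section keyed by document id, which is the separation described above.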

I would like the actual result list to be filtered based on the
morelikethis results.

Sorry, if I'm repeating myself, but I'm just trying to explain it as best as
I can.

Regards,
Marcel



-- 
View this message in context: http://www.nabble.com/Solr-results-filtered-on-MoreLikeThis-tp25434881p25434881.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr results filtered on MoreLikeThis

Posted by Marcelk <ma...@gmail.com>.
Hi Chantal,


Chantal Ackermann wrote:
> 
> Have you had a look at the facet query? Not sure but it might just do 
> what you are looking for.
> 
> http://wiki.apache.org/solr/SolrFacetingOverview
> http://wiki.apache.org/solr/SimpleFacetParameters
> 

I still don't really understand faceting, but it might help me with the
following trick.

When I index a document I check for morelikethis. Then each similar document
and the indexed document itself get references to each other via a
relatedIds array field. Then (maybe using faceting) I will filter the
results so that a document is left out when its id appears in the relatedIds
of a document that is already in the list. I don't yet know how to do that,
but perhaps you understand how this could be done?

Example:

document1
   - id = 1
   - relatedIds = [2,3,4,5]
   - content = 'some cool java job'
document2
   - id = 2
   - relatedIds = [1,3,4,5]
   - content = 'another cool java job'
document3
   - id = 3
   - relatedIds = [1,2,4,5]
   - content = 'yet another cool java job'
etc...
document6
   - id = 6
   - relatedIds = []
   - content = 'this java article is for you'
document7
   - id = 7
   - relatedIds = [8]
   - content = 'nice java book'
document8
   - id = 8
   - relatedIds = [7]
   - content = 'java book looks nice'

Now when I search, I would like to have the following results:

- document1 (4 related documents)
- document6
- document7 (1 related document)

Could you give me an example of how I could get that result, maybe using
facets?
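
To make the intended behaviour concrete, here is a minimal client-side sketch of the collapsing, not a Solr feature. It assumes the documents arrive in ranking order as dictionaries with the id and relatedIds fields from the example; the function name collapse_by_related_ids is made up for illustration.

```python
def collapse_by_related_ids(docs):
    """Keep the first document of each related group, in ranking order."""
    hidden = set()   # ids already represented by an earlier result
    collapsed = []
    for doc in docs:
        if doc["id"] in hidden:
            continue  # an earlier result already stands in for this document
        collapsed.append({"id": doc["id"], "related": len(doc["relatedIds"])})
        hidden.update(doc["relatedIds"])
    return collapsed


# The example documents from above (documents 4 and 5 are elided there too).
docs = [
    {"id": 1, "relatedIds": [2, 3, 4, 5]},
    {"id": 2, "relatedIds": [1, 3, 4, 5]},
    {"id": 3, "relatedIds": [1, 2, 4, 5]},
    {"id": 6, "relatedIds": []},
    {"id": 7, "relatedIds": [8]},
    {"id": 8, "relatedIds": [7]},
]

result = collapse_by_related_ids(docs)
for entry in result:
    print("document%d (%d related)" % (entry["id"], entry["related"]))
```

With the sample data this keeps documents 1, 6 and 7, with 4, 0 and 1 related documents respectively.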

Kind Regards,
Marcel


Re: Solr results filtered on MoreLikeThis

Posted by Chantal Ackermann <ch...@btelligent.de>.
Have you had a look at the facet query? Not sure but it might just do 
what you are looking for.

http://wiki.apache.org/solr/SolrFacetingOverview
http://wiki.apache.org/solr/SimpleFacetParameters





Re: Solr results filtered on MoreLikeThis

Posted by Marcelk <ma...@gmail.com>.
Hi All,

Should I create a plugin for this, or is there some functionality in Solr
that can help me?

I basically already have part of what I want. The search response gives me a
result list with (in my situation) 20 results and the attached morelikethis
NamedList. Filtering out the morelikethis 'duplicates' may leave only 12
results, meaning my result list is no longer complete, since I requested
20 results. So now I need to do a new search, which I need to filter yet
again, and so on until I have 20 results. This is not a very robust
implementation.
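
A minimal sketch of that client-side loop, to show where it gets awkward. Everything here is hypothetical: fetch_page stands in for a Solr request with start and rows, and each document is assumed to carry a similar_ids list taken from its morelikethis entry.

```python
def fetch_unique(fetch_page, wanted=20, page_size=20, max_pages=10):
    """Page through results, dropping documents similar to an earlier one."""
    unique, hidden, start = [], set(), 0
    for _ in range(max_pages):           # hard cap so the loop always ends
        page = fetch_page(start, page_size)
        if not page:
            break                        # no more hits
        for doc in page:
            if doc["id"] in hidden:
                continue                 # duplicate of an earlier result
            unique.append(doc)
            hidden.update(doc["similar_ids"])
            if len(unique) == wanted:
                return unique
        start += page_size
    return unique                        # may hold fewer than `wanted`


# Made-up corpus: documents 0/1, 2/3, ... are pairwise similar.
corpus = [{"id": i, "similar_ids": [i ^ 1]} for i in range(50)]


def fake_fetch_page(start, rows):
    """Stand-in for a Solr request using the start and rows parameters."""
    return corpus[start:start + rows]


results = fetch_unique(fake_fetch_page)
print([doc["id"] for doc in results])
```

Each extra round trip costs another search, and the number of unique results cannot be known up front, which is why doing this on the server side would be preferable.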

Can I do something like this on the Solr side (via a plugin)? For instance,
filter the Lucene hits based on the morelikethis results, or something like
that, so that I can return exactly 20 results and also add the morelikethis
data to the response. Grouping based on morelikethis would even be a
nice-to-have, using the field collapsing functionality once it is fully
implemented in Solr.

I hope someone can give me some pointers in the right direction. 

Kind Regards,
Marcel
