Posted to dev@nutch.apache.org by DS jha <ae...@gmail.com> on 2007/01/10 16:02:12 UTC

sort result on different set of terms

Hello -

I would like to score and summarize results on a different set of
words/conditions than my original search query criteria. Is that possible?

I was thinking of extending the NutchSimilarity class to change how documents
are scored (basically, to change the terms on which documents are scored) and
plugging in basicsummarizer to generate the hit summary. Do those sound like
logical extension points?

Let me know if anyone has solved this type of problem before.

Thanks,

Re: sort result on different set of terms

Posted by Dennis Kubes <nu...@dragonflymc.com>.

DS jha wrote:
> Thanks for your reply. I looked at the scoring-opic plugin, but it looks like it
> gets called at parsing/index time and not at search time, correct?

That is correct.
> 
> I am classifying content and assigning a category (or categories) at parse
> time and storing this information along with a category score in the index.
> Now, at query time, when I display results for a particular category, I
> would like to sort the results by this category score. I cannot simply sort
> on a category score field, since a document can be classified under multiple
> categories (and so have multiple category scores).

Are you trying to filter out categories, or do you actually have a
different score for each category that content gets indexed under? You
can sort a query, but only on a single field (I think).

> At search time, for each
> document that matches, say, 'category:A', I will have to get the corresponding
> category score and use that for sorting. Any thoughts?

You could populate the sort field dynamically, but the sort would still be on a
single field. Are you trying to sort on multiple category fields?

Dennis Kubes
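
One way to read the "single sort field" constraint above: if each category
score is written to its own field at index time, a search restricted to one
category only ever needs to sort on one field. The following is a minimal
sketch of that layout in plain Java. The "catscore_" prefix, the class name,
and both helper methods are assumptions made for illustration; none of them
are part of Nutch or Lucene.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper; not a Nutch or Lucene class. */
final class CategorySortFields {

    // Assumed naming convention for per-category score fields.
    static final String PREFIX = "catscore_";

    /** Search time: name of the single field to sort on for one category. */
    static String sortFieldFor(String category) {
        return PREFIX + category.trim().toLowerCase(Locale.ROOT);
    }

    /** Index time: expand a document's category scores into one field per category. */
    static Map<String, String> indexFields(Map<String, Float> categoryScores) {
        Map<String, String> fields = new TreeMap<>();
        for (Map.Entry<String, Float> e : categoryScores.entrySet()) {
            fields.put(sortFieldFor(e.getKey()), Float.toString(e.getValue()));
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, Float> scores = new TreeMap<>();
        scores.put("A", 0.2f);
        scores.put("B", 0.9f);
        // Fields that would be added to the document at index time.
        System.out.println(indexFields(scores));
        // Field a query for category B would sort on.
        System.out.println(sortFieldFor("B"));
    }
}
```

The field name returned by sortFieldFor would then be handed to whatever
single-field sort the search API offers. The cost of this layout is one extra
field per category, which may or may not be acceptable depending on how many
categories exist.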

Re: sort result on different set of terms

Posted by DS jha <ae...@gmail.com>.
Thanks for your reply. I looked at the scoring-opic plugin, but it looks like it
gets called at parsing/index time and not at search time, correct?

I am classifying content and assigning a category (or categories) at parse
time and storing this information along with a category score in the index.
Now, at query time, when I display results for a particular category, I
would like to sort the results by this category score. I cannot simply sort
on a category score field, since a document can be classified under multiple
categories (and so have multiple category scores). At search time, for each
document that matches, say, 'category:A', I will have to get the corresponding
category score and use that for sorting. Any thoughts?

Thanks,
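
The requirement described above (match on a category, then order by that
category's own score) can be sketched in plain Java as follows. This is an
in-memory illustration only: the Doc class, its fields, and topForCategory are
hypothetical stand-ins for indexed documents and a search call, not Nutch or
Lucene APIs.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for indexed documents. */
final class CategoryScoreSort {

    /** A document with one score per category it was classified under. */
    static final class Doc {
        final String url;
        final Map<String, Float> categoryScores = new LinkedHashMap<>();

        Doc(String url) {
            this.url = url;
        }

        Doc with(String category, float score) {
            categoryScores.put(category, score);
            return this;
        }
    }

    /** Returns the URLs of documents in the category, highest category score first. */
    static List<String> topForCategory(List<Doc> docs, String category) {
        List<Doc> hits = new ArrayList<>();
        for (Doc d : docs) {
            // Equivalent of matching 'category:A'.
            if (d.categoryScores.containsKey(category)) {
                hits.add(d);
            }
        }
        // Sort only on the score that belongs to the requested category.
        hits.sort(Comparator.comparing((Doc d) -> d.categoryScores.get(category)).reversed());
        List<String> urls = new ArrayList<>();
        for (Doc d : hits) {
            urls.add(d.url);
        }
        return urls;
    }

    public static void main(String[] args) {
        List<Doc> docs = new ArrayList<>();
        docs.add(new Doc("a.html").with("A", 0.2f).with("B", 0.9f));
        docs.add(new Doc("b.html").with("A", 0.7f));
        docs.add(new Doc("c.html").with("B", 0.4f));
        System.out.println(topForCategory(docs, "A")); // [b.html, a.html]
        System.out.println(topForCategory(docs, "B")); // [a.html, c.html]
    }
}
```

Note that a.html ranks last for category A but first for category B, which is
why a single shared "category score" field would not be enough.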








Re: sort result on different set of terms

Posted by Dennis Kubes <nu...@dragonflymc.com>.
You can write a scoring filter. That is much easier than changing
NutchSimilarity. Take a look at the scoring-opic plugin under src;
it demonstrates the default scoring algorithm.

Dennis Kubes
