Posted to java-user@lucene.apache.org by Ryan McKinley <ry...@gmail.com> on 2007/05/30 08:45:39 UTC

MoreLikeThis API changes?

I'm trying to build a custom MoreLikeThis implementation that will run 
within Solr, and I've run into a few API hurdles...

1. Can MLT.java be modified to optionally take the Similarity 
implementation in the constructor?  Currently it is hardcoded to:
  private Similarity similarity = new DefaultSimilarity();

2. Do retrieveTerms(int docNum) and createQuery(PriorityQueue q) need to 
be private?  Can they be public?  If not public, could they at least be 
protected?

3. Can isNoiseWord() be protected?
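
To make the three requests concrete, here is a rough sketch of the shape I have in mind. It is not the real MoreLikeThis code: SimilaritySketch, DefaultSimilaritySketch, MoreLikeThisSketch, the idf formula and the MIN_WORD_LEN threshold are all stand-ins invented for illustration, and retrieveTerms() is simplified to take a list of tokens instead of a document number.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Stand-in for Lucene's Similarity; only the one method this sketch needs.
interface SimilaritySketch {
    float idf(int docFreq, int numDocs);
}

// Stand-in default implementation; the formula is illustrative only.
class DefaultSimilaritySketch implements SimilaritySketch {
    @Override
    public float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }
}

// Hypothetical shape of MoreLikeThis after the three proposed changes.
class MoreLikeThisSketch {
    private static final int MIN_WORD_LEN = 3; // assumed threshold for this sketch

    private final SimilaritySketch similarity;
    private final Set<String> stopWords;

    // Existing behaviour: fall back to the default similarity.
    MoreLikeThisSketch() {
        this(new DefaultSimilaritySketch(), Collections.<String>emptySet());
    }

    // Change 1: the similarity can be supplied by the caller.
    MoreLikeThisSketch(SimilaritySketch similarity, Set<String> stopWords) {
        this.similarity = similarity;
        this.stopWords = stopWords;
    }

    SimilaritySketch getSimilarity() {
        return similarity;
    }

    // Change 3: protected, so a subclass can plug in its own noise-word rule.
    protected boolean isNoiseWord(String term) {
        return term.length() < MIN_WORD_LEN || stopWords.contains(term);
    }

    // Change 2: public, so a caller can ask any implementation for its terms.
    public List<String> retrieveTerms(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!isNoiseWord(token)) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

With this shape, a request handler could hold a reference typed as the base class and call retrieveTerms() without knowing which subclass or similarity it was built with.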


If these changes sound reasonable, I'll make a JIRA patch.

thanks
ryan

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: MoreLikeThis API changes?

Posted by Ryan McKinley <ry...@gmail.com>.
> 
>> 2. Do retrieveTerms(int docNum) and createQuery(PriorityQueue q) need 
>> to be private?  Can they be public?  If not public, could they at 
>> least be protected?
>>
> 
> I would think protected would be fine, what is your case for it being 
> public?
> 

From the Solr RequestHandler, I want to return the "interesting" terms 
used for MLT.  If retrieveTerms() is public, the handler could do this 
for any MLT implementation.  If protected, it would be locked to its own 
subclass of MLT (ok, but not ideal).

Since retrieveInterestingTerms(Reader) is public, it seems reasonable.

- - - -

In implementing the handler, I ran into another related problem...

Is there any way to walk through a PriorityQueue without destroying it? 
Everything I see calls pq.pop() in a loop.  I would like to be able to 
use the same queue both to construct the MLT query and for display.  For 
large documents, constructing the "interesting" terms can be very slow, 
so doing it twice isn't a good idea.
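
One workaround I can think of is to drain the queue exactly once into a list and then build both the query and the display output from that list. The sketch below uses java.util.PriorityQueue as a stand-in for Lucene's own PriorityQueue (poll() here plays the role of pop()); ScoredTerm, QueueSnapshot and the space-joined "query" string are hypothetical names and simplifications, not anything in MoreLikeThis.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical queue entry: a term and the score it was ranked by.
class ScoredTerm implements Comparable<ScoredTerm> {
    final String term;
    final float score;

    ScoredTerm(String term, float score) {
        this.term = term;
        this.score = score;
    }

    // Natural order is lowest score first, so the queue behaves as a min-heap.
    @Override
    public int compareTo(ScoredTerm other) {
        return Float.compare(score, other.score);
    }
}

class QueueSnapshot {
    // Empties the queue once and returns its entries, best score first.
    static List<ScoredTerm> drain(PriorityQueue<ScoredTerm> pq) {
        List<ScoredTerm> out = new ArrayList<>(pq.size());
        while (!pq.isEmpty()) {
            out.add(pq.poll()); // comes out in ascending score order
        }
        Collections.reverse(out); // flip to descending score order
        return Collections.unmodifiableList(out);
    }

    // First consumer: a simplified "query" that just joins the terms.
    static String buildQuery(List<ScoredTerm> terms) {
        StringBuilder sb = new StringBuilder();
        for (ScoredTerm t : terms) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(t.term);
        }
        return sb.toString();
    }

    // Second consumer: the term list a handler might return for display.
    static List<String> interestingTerms(List<ScoredTerm> terms) {
        List<String> names = new ArrayList<>(terms.size());
        for (ScoredTerm t : terms) {
            names.add(t.term);
        }
        return names;
    }
}
```

The queue is still consumed, but only once; the expensive term extraction does not have to be repeated because both consumers read the same list.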


thanks
ryan



Re: MoreLikeThis API changes?

Posted by Grant Ingersoll <gs...@apache.org>.
On May 30, 2007, at 2:45 AM, Ryan McKinley wrote:

> I'm trying to build a custom MoreLikeThis implementation that will  
> run within solr and I've run into a few API hurdles...
>
> 1. Can MLT.java be modified to optionally take the Similarity  
> implementation in the constructor?  Currently it is hardcoded to:
>  private Similarity similarity = new DefaultSimilarity();
>

Seems reasonable.  I suppose accessors would be reasonable too, since  
many of the other attributes have accessors.


> 2. Do retrieveTerms(int docNum) and createQuery(PriorityQueue q)  
> need to be private?  Can they be public?  If not public, could they  
> at least be protected?
>

I would think protected would be fine, what is your case for it being  
public?

> 3. Can isNoiseWord() be protected?
>

Seems reasonable.

>
> If these changes sound reasonable, I'll make a JIRA patch.
>
> thanks
> ryan
>

--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org/tech/lucene.asp

Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/LuceneFAQ


