Posted to solr-dev@lucene.apache.org by "Khee Chin (JIRA)" <ji...@apache.org> on 2008/12/03 01:12:46 UTC

[jira] Issue Comment Edited: (SOLR-877) Access to Lucene's TermEnum capabilities

    [ https://issues.apache.org/jira/browse/SOLR-877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652603#action_12652603 ] 

kheechin edited comment on SOLR-877 at 12/2/08 4:12 PM:
---------------------------------------------------------

As a Solr user who relies on this component for auto-complete, I'd like to filter out terms with a low document frequency. To that end, I've implemented a quick hack against a 28 Nov checkout.

{code:title=/src/java/org/apache/solr/common/params/TermsParams.java|borderStyle=solid}

  // Optional. The minimum docFreq a term must have to be returned. Defaults to 1.
  public static final String TERMS_FREQ_MIN = TERMS_PREFIX + "freqmin";
  // Optional. The maximum docFreq a term may have to be returned. Defaults to -1, meaning no upper bound.
  public static final String TERMS_FREQ_MAX = TERMS_PREFIX + "freqmax";
{code} 


{code:title=/src/java/org/apache/solr/handler/component/TermsComponent.java|borderStyle=solid}

    // At lines 55-56, after initializing boolean upperIncl and lowerIncl
    int freqmin = params.getInt(TermsParams.TERMS_FREQ_MIN,1); // initialize freqmin
    int freqmax = params.getInt(TermsParams.TERMS_FREQ_MAX,-1); // initialize freqmax
    
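    // At line 69, replacing terms.add(theText, termEnum.docFreq());
    int docFreq = termEnum.docFreq(); // read once instead of up to three times
    if (docFreq >= freqmin && (freqmax == -1 || docFreq <= freqmax)) {
        terms.add(theText, docFreq);
    } else {
        i--; // offset the loop counter so a skipped term does not use up a result slot (assumed intent)
    }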
{code} 
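
The `i--` in the else branch presumably exists so that a term rejected by the frequency filter does not count toward the number of terms returned. A minimal, self-contained sketch of that behaviour follows. It is not the TermsComponent code: the class `DocFreqFilterSketch`, the methods `inRange` and `collect`, and the sorted map standing in for Lucene's TermEnum are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the filtering loop; not part of Solr or of the patch.
public class DocFreqFilterSketch {

    // True when docFreq lies within [freqmin, freqmax]; freqmax == -1 means no upper bound.
    static boolean inRange(int docFreq, int freqmin, int freqmax) {
        return docFreq >= freqmin && (freqmax == -1 || docFreq <= freqmax);
    }

    // Walks the terms in sorted order and keeps at most `rows` terms that pass the filter.
    // Rejected terms are skipped without using up one of the `rows` slots.
    static Map<String, Integer> collect(TreeMap<String, Integer> termToDocFreq,
                                        int rows, int freqmin, int freqmax) {
        Map<String, Integer> out = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : termToDocFreq.entrySet()) {
            if (out.size() >= rows) {
                break;
            }
            if (inRange(e.getValue(), freqmin, freqmax)) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TreeMap<String, Integer> terms = new TreeMap<>();
        terms.put("apache", 12);
        terms.put("apple", 1);
        terms.put("application", 7);
        terms.put("apply", 40);
        // Two rows, freqmin = 5, no upper bound: "apple" is skipped, not counted.
        System.out.println(collect(terms, 2, 5, -1)); // prints {apache=12, application=7}
    }
}
```

With a row limit of 2, the low-frequency term "apple" is passed over and the result still contains two terms, which is the behaviour an auto-complete list would want.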

The new parameters can be passed on the request as
  terms.freqmin=<value>
  terms.freqmax=<value>
Both are optional.
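
As a rough illustration of how a client might assemble these parameters, here is a small sketch. Only `terms.freqmin` and `terms.freqmax` come from this comment; the class `TermsQuerySketch`, its methods, and the companion parameters `terms.fl` and `terms.prefix` are assumptions about how the component is called and may not match the draft patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration; not part of Solr or of the patch.
public class TermsQuerySketch {

    // Builds the parameter map. freqmax == -1 is treated as "no upper bound" and omitted.
    static Map<String, String> params(String field, String prefix, int freqmin, int freqmax) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("terms.fl", field);      // assumed name of the field parameter
        p.put("terms.prefix", prefix); // assumed name of the prefix parameter
        p.put("terms.freqmin", String.valueOf(freqmin));
        if (freqmax != -1) {
            p.put("terms.freqmax", String.valueOf(freqmax));
        }
        return p;
    }

    // Joins key=value pairs with '&'. Values are assumed to need no URL encoding.
    static String queryString(Map<String, String> p) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> e : p.entrySet()) {
            joiner.add(e.getKey() + "=" + e.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Only a lower bound: terms.freqmax is left out entirely.
        System.out.println(queryString(params("name", "app", 5, -1)));
        // Both bounds set.
        System.out.println(queryString(params("name", "app", 5, 100)));
    }
}
```

Leaving `terms.freqmax` off the request is equivalent to sending -1, since that is the default the patch reads.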


      was (Author: kheechin):
    As a solr-user who uses this function for auto-complete, I'd like to filter out terms with a low-frequency count. Thus, I've implemented a quick-hack, against a 28th Nov checkout.

/src/java/org/apache/solr/common/params/TermsParams.java

  // Optional.  The minimum value of docFreq to be returned.  1 by default
  public static final String TERMS_FREQ_MIN = TERMS_PREFIX + "freqmin";
   // Optional.  The maximum value of docFreq to be returned.  -1 by default means no boundary
  public static final String TERMS_FREQ_MAX = TERMS_PREFIX + "freqmax";

/src/java/org/apache/solr/handler/component/TermsComponent.java

    // At lines 55-56, after initializing boolean upperIncl and lowerIncl
    int freqmin = params.getInt(TermsParams.TERMS_FREQ_MIN,1); // initialize freqmin
    int freqmax = params.getInt(TermsParams.TERMS_FREQ_MAX,-1); // initialize freqmax
    
    // At line 69, within the if() { terms.add() } block,    
    if (termEnum.docFreq() >= freqmin && (freqmax==-1 || (termEnum.docFreq() <= freqmax))) {
        terms.add(theText, termEnum.docFreq());
    } else {
        i--;
    } 
    
The new parameters could be used by calling
  terms.freqmin=<value>
  terms.freqmax=<value>
both of which, are optional.

  
> Access to Lucene's TermEnum capabilities
> ----------------------------------------
>
>                 Key: SOLR-877
>                 URL: https://issues.apache.org/jira/browse/SOLR-877
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: SOLR-877.patch, SOLR-877.patch
>
>
> I wrote a simple SearchComponent on the plane the other day that gives access to Lucene's TermEnum capabilities.  I think this will be useful for doing auto-suggest and other term based operations.  My first draft is not distributed, but it probably should be made to do so eventually. 
