Posted to solr-dev@lucene.apache.org by "Dallan Quass (JIRA)" <ji...@apache.org> on 2010/02/25 21:34:28 UTC

[jira] Issue Comment Edited: (SOLR-1069) CSV document and field boosting support

    [ https://issues.apache.org/jira/browse/SOLR-1069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838526#action_12838526 ] 

Dallan Quass edited comment on SOLR-1069 at 2/25/10 8:33 PM:
-------------------------------------------------------------

FWIW, I made a few changes to CSVRequestHandler.java, which mainly involve extracting CSVLoader into a separate public class and making a few variables/functions visible outside the package.  The attached files show the changes I made.  

Doing this allowed me to create a subclass of CSVLoader that applies a per-document boost read from a CSV column named "boost":

{code}
public class BoostingCSVRequestHandler extends ContentStreamHandlerBase {
   // Hand each request to the boosting loader instead of the stock CSV loader.
   @Override
   protected ContentStreamLoader newLoader(SolrQueryRequest req, UpdateRequestProcessor processor) {
      return new BoostingCSVLoader(req, processor);
   }

   //////////////////////// SolrInfoMBeans methods //////////////////////
   @Override
   public String getDescription() {
     return "boost CSV documents";
   }

   @Override
   public String getVersion() {
     return "";
   }

   @Override
   public String getSourceId() {
     return "";
   }

   @Override
   public String getSource() {
     return "";
   }
}

class BoostingCSVLoader extends CSVLoader {
   // Index of the "boost" column in each incoming CSV row, or -1 if there is none.
   int boostFieldNum;

   BoostingCSVLoader(SolrQueryRequest req, UpdateRequestProcessor processor) {
      super(req, processor);
   }

   // Returns a copy of a with the element at index pos removed.
   private String[] removeElement(String[] a, int pos) {
      String[] n = new String[a.length-1];
      if (pos > 0) System.arraycopy(a, 0, n, 0, pos);
      if (pos < n.length) System.arraycopy(a, pos+1, n, pos, n.length - pos);
      return n;
   }

   @Override
   protected void prepareFields() {
      boostFieldNum = -1;
      for (int i = 0; i < fieldnames.length; i++) {
         if (fieldnames[i].equals("boost")) {
            boostFieldNum = i;
            break;
         }
      }
      if (boostFieldNum >= 0) {
         fieldnames = removeElement(fieldnames, boostFieldNum);
      }

      super.prepareFields();
   }

   @Override
   public void addDoc(int line, String[] vals) throws IOException {
      templateAdd.indexedId = null;
      SolrInputDocument doc = new SolrInputDocument();
      if (boostFieldNum >= 0) {
         // Use the boost column as the document boost, then drop it so the
         // remaining values line up with the field names kept in prepareFields().
         float boost = Float.parseFloat(vals[boostFieldNum]);
         doc.setDocumentBoost(boost);
         vals = removeElement(vals, boostFieldNum);
      }

      doAdd(line, vals, doc, templateAdd);
   }
}
{code}
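
For anyone who wants to try the column handling without a Solr checkout, here is a minimal stand-alone sketch of the same idea: find the "boost" column in the header, parse it per row, and strip it from both the field names and the values. It uses no Solr classes. The names BoostColumnSketch, findColumn and parseBoost are hypothetical and exist only for illustration, and returning 1.0f when there is no boost column is an assumption about what a neutral boost should be.

```java
import java.util.Arrays;

// Stand-alone sketch of the column handling in BoostingCSVLoader (no Solr classes).
// All names here are hypothetical and chosen for illustration only.
class BoostColumnSketch {

   // Returns the index of the first header equal to name, or -1 if absent.
   static int findColumn(String[] header, String name) {
      for (int i = 0; i < header.length; i++) {
         if (header[i].equals(name)) return i;
      }
      return -1;
   }

   // Returns a copy of a without the element at index pos.
   static String[] removeElement(String[] a, int pos) {
      String[] n = new String[a.length - 1];
      System.arraycopy(a, 0, n, 0, pos);
      System.arraycopy(a, pos + 1, n, pos, n.length - pos);
      return n;
   }

   // Parses the boost for one row; assumes 1.0f is neutral when no boost column exists.
   static float parseBoost(String[] row, int boostCol) {
      return boostCol >= 0 ? Float.parseFloat(row[boostCol]) : 1.0f;
   }

   public static void main(String[] args) {
      String[] header = {"id", "boost", "title"};
      String[] row = {"42", "2.5", "hello"};

      int boostCol = findColumn(header, "boost");
      float boost = parseBoost(row, boostCol);
      String[] fields = boostCol >= 0 ? removeElement(header, boostCol) : header;
      String[] vals = boostCol >= 0 ? removeElement(row, boostCol) : row;

      System.out.println(Arrays.toString(fields) + " " + Arrays.toString(vals) + " boost=" + boost);
   }
}
```

Note that a non-numeric value in the boost column makes Float.parseFloat throw NumberFormatException, both here and in the loader above; whether to reject the row or fall back to a default is a design choice this sketch leaves open.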

  
> CSV document and field boosting support
> ---------------------------------------
>
>                 Key: SOLR-1069
>                 URL: https://issues.apache.org/jira/browse/SOLR-1069
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Grant Ingersoll
>            Priority: Minor
>         Attachments: CSVLoader.java, CSVRequestHandler.java.diff
>
>
> It would be good if CSV loader could do document and field boosting.  
> I believe this could be handled via additional "special" columns that are tacked on such as "doc.boost" and <field.name>.boost, which are then filled in with boost values on a per row basis.  Obviously, this approach would prevent someone having an actual column named <field.name>.boost, so maybe we can make that configurable as well.
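
The "special" column scheme in the description above could be sketched roughly as follows: scan the header once, treat "doc.boost" as the document boost column, and treat any other column ending in the boost suffix as the boost for the field named by its prefix. This is only an illustration of the proposal, not code from Solr or from the attachments. The class and method names are hypothetical, and passing the suffix as a parameter is one assumed way to make it configurable.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the "doc.boost" / "<field.name>.boost" header convention.
class SpecialBoostColumns {

   // Returns the index of the "doc" + suffix column, or -1 if the header has none.
   static int docBoostColumn(String[] header, String suffix) {
      for (int i = 0; i < header.length; i++) {
         if (header[i].equals("doc" + suffix)) return i;
      }
      return -1;
   }

   // Maps each boosted field name to the index of its "<field>" + suffix column.
   static Map<String, Integer> fieldBoostColumns(String[] header, String suffix) {
      Map<String, Integer> result = new LinkedHashMap<>();
      for (int i = 0; i < header.length; i++) {
         String h = header[i];
         boolean isDocBoost = h.equals("doc" + suffix);
         if (!isDocBoost && h.endsWith(suffix) && h.length() > suffix.length()) {
            result.put(h.substring(0, h.length() - suffix.length()), i);
         }
      }
      return result;
   }

   public static void main(String[] args) {
      String[] header = {"id", "title", "title.boost", "doc.boost"};
      System.out.println("doc boost column: " + docBoostColumn(header, ".boost"));
      System.out.println("field boost columns: " + fieldBoostColumns(header, ".boost"));
   }
}
```

With this convention a real field called "doc" could not carry its own boost column, which is the same kind of name collision the description mentions and another reason to keep the suffix configurable.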
