Posted to solr-dev@lucene.apache.org by "Noble Paul (JIRA)" <ji...@apache.org> on 2008/06/16 18:05:45 UTC

[jira] Issue Comment Edited: (SOLR-486) Support binary formats for QueryresponseWriter

    [ https://issues.apache.org/jira/browse/SOLR-486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12605126#action_12605126 ] 

noble.paul edited comment on SOLR-486 at 6/16/08 9:04 AM:
----------------------------------------------------------

If we look at the data written by NamedListCodec, a lot of the "names" are repeated. If we could avoid the repetitions, the encoding would be more compact.
Can we have another type, EXTERN_STRING?
While writing, the NamedListCodec maintains a Map<String,Integer> of EXTERN_STRING vs. index. When the same string is written again, it looks the string up in the Map and, if it already has an index, writes the index instead of the string.

While decoding, every EXTERN_STRING that arrives as a string value is added to a List<String>. When an EXTERN_STRING arrives with an index, the value is looked up in the List.

{code:title=NamedListCodec.java}
  private int stringsCount = 0;
  private Map<String, Integer> stringsMap; // writer side: string -> 1-based index
  private List<String> stringsList;        // reader side: strings in order of first occurrence

  public void writeExternString(String s) throws IOException {
    if (s == null) {
      writeTag(NULL);
      return;
    }
    Integer idx = stringsMap == null ? null : stringsMap.get(s);
    if (idx == null) idx = 0;
    writeTag(EXTERN_STRING, idx);
    if (idx == 0) { // first occurrence: write the string itself and remember its index
      writeStr(s);
      if (stringsMap == null) stringsMap = new HashMap<String, Integer>();
      stringsMap.put(s, ++stringsCount);
    }
  }

  public String readExternString(FastInputStream fis) throws IOException {
    int idx = readSize(fis);
    if (idx != 0) { // idx != 0 is the 1-based index of an extern string already read
      return stringsList.get(idx - 1);
    } else { // idx == 0 means the string value follows
      String s = (String) readVal(fis);
      if (stringsList == null) stringsList = new ArrayList<String>();
      stringsList.add(s);
      return s;
    }
  }
{code}
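
To make the round trip easier to try outside Solr, here is a minimal standalone sketch of the same idea. It is an illustration only: the class name ExternStringSketch and its encode/decode helpers are hypothetical, and it uses plain java.io DataOutputStream/DataInputStream with a fixed 4-byte index rather than the NamedListCodec tags, so the byte layout is not the proposed wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the EXTERN_STRING idea; not the NamedListCodec format. */
final class ExternStringSketch {
  // Writer side: string -> 1-based index of its first occurrence.
  private final Map<String, Integer> stringsMap = new HashMap<>();
  // Reader side: strings in the order they were first seen.
  private final List<String> stringsList = new ArrayList<>();

  /** Writes index 0 followed by the string the first time, and only the index afterwards. */
  void write(DataOutputStream out, String s) throws IOException {
    Integer idx = stringsMap.get(s);
    if (idx == null) {
      out.writeInt(0);
      out.writeUTF(s);
      stringsMap.put(s, stringsMap.size() + 1);
    } else {
      out.writeInt(idx);
    }
  }

  /** Mirrors write(): index 0 means a new string follows, otherwise look it up. */
  String read(DataInputStream in) throws IOException {
    int idx = in.readInt();
    if (idx != 0) {
      return stringsList.get(idx - 1);
    }
    String s = in.readUTF();
    stringsList.add(s);
    return s;
  }

  /** Encodes all names with a fresh writer-side table. */
  static byte[] encode(List<String> names) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      ExternStringSketch codec = new ExternStringSketch();
      for (String name : names) {
        codec.write(out, name);
      }
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Decodes count names with a fresh reader-side table. */
  static List<String> decode(byte[] data, int count) {
    try {
      DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
      ExternStringSketch codec = new ExternStringSketch();
      List<String> names = new ArrayList<>();
      for (int i = 0; i < count; i++) {
        names.add(codec.read(in));
      }
      return names;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Calling ExternStringSketch.decode(ExternStringSketch.encode(names), names.size()) should give back the original list. With a fixed 4-byte index the saving only shows up for names longer than a couple of characters; a real codec would presumably want a more compact integer encoding for the index.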

      was (Author: noble.paul):
     If we take a look at the data that is written down by NamedListCodec there are a lot of "names" which are repeated.  If we could avoid the repetitions we can achieve better optimization. 
Can we have another type EXTERN_STRING 
The NamedListCodec maintains a Map<String,Integer>  of EXTERN_STRING vs index as it is written out. When the same string is written it checks up in the List whether it already has a reference.

While decoding all the EXTERN_STRING values are copied into a List <String>. When an EXTERN_STRING with an index comes it is copied from the List.

{code:title=NamedListCodec.java}
private int stringsCount  =  -1;
  private Map<String,Integer> stringsMap;
  private List<String > stringsList;
  public void writeExternString(String s) throws IOException {
    writeTag(EXTERN_STRING);
    if(s == null) {
      writeTag(NULL) ;
      return;
    }
    if(stringsMap.containsKey(s)){
      writeInt(stringsMap.get(s));
    } else {
      writeStr(s);
      stringsCount++;
      if(stringsMap == null) stringsMap = new HashMap<String, Integer>();
      stringsMap.put(s,stringsCount);
    }

  }
  public String  readExternString(FastInputStream fis) throws IOException {
    Object o = readVal(fis);
    if(o == null) return null;
    if (o instanceof String) {
      String s = (String) o;
      if(stringsList == null ) stringsList = new ArrayList<String>();
      stringsList.add(s);
      return s;
    } else {// this must be an integer
      int index = (Integer)o;
      return stringsList.get(index);
    }
  }

{code}
  
> Support binary formats for QueryresponseWriter
> ----------------------------------------------
>
>                 Key: SOLR-486
>                 URL: https://issues.apache.org/jira/browse/SOLR-486
>             Project: Solr
>          Issue Type: Improvement
>          Components: clients - java, search
>            Reporter: Noble Paul
>            Assignee: Yonik Seeley
>             Fix For: 1.3
>
>         Attachments: SOLR-486.patch, solr-486.patch, SOLR-486.patch, SOLR-486.patch, SOLR-486.patch, SOLR-486.patch, SOLR-486.patch, SOLR-486.patch, SOLR-486.patch, SOLR-486.patch
>
>
> QueryResponse writer only allows text data to be written.
> So it is not possible to implement a binary protocol. Create another interface which has a method
> write(OutputStream os, SolrQueryRequest request, SolrQueryResponse response)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.