Posted to soap-user@ws.apache.org by Ro...@cellzome.de on 2001/10/05 15:52:07 UTC

Compression

It seems that the problem of string packet processing time growing
exponentially still exists in the SOAP implementation. One idea I came up
with is to apply a simple sliding-window compression, which helps if your
packets are very redundant anyway. You'll find a code snippet implementing
a simple LZ77 derivative below. Since SOAP uses an XML framework, you
always have to re-encode compressed/binary data as text, so the compressed
packet gets blown up a little :-| Still, the result is 20 to 80% smaller
than the original.

Mind you, this does not solve the actual problem; it may only alleviate
it. There has been an ongoing discussion about it on this mailing list.
If anyone has come up with a good idea in the meantime, please post it.
The best thing we could think of was either to use FTP or to return a URL
that the client can use to fetch the result (made available by a servlet
that communicates with the SOAP server class implementation). That should
work, since the Apache SOAP implementation needs the servlet runner
anyway in order to provide the RPC access point.

Robert F Schmitt, CellZome
Meyerhofstr. 1, D-69117 Heidelberg, Germany
Tel + 49 6221 137 570 , Fax + 49 6221 137 57 202
www.cellzome.com
------------------------------------------------------------------------------------

"The idea is that we do whatever is most important - not necessarily most
urgent. Sun has 20,000 other people doing that.
I left the urgent behind to get to the important."
Bill Joy (Sun Microsystems)

"When you attack something with a particular technology--a
hammer--everything looks like a nail,
but that's not necessarily the right answer"
Ajei Gopai (IBM CTO)

<CODE>
  static StringBuffer m_compressedPacket = new StringBuffer();

  /**
   * auxiliary function that computes the hexadecimal nibble-wise
   * representation of an integer
   */
  static void addHexNibbles(int number, int nrOfNibbles) {
    char digit;
    char returnChars[] = new char[nrOfNibbles];

    for(int nibbleCnt=0;nibbleCnt<nrOfNibbles;nibbleCnt++) {
      digit = (char)(number&0xf);

      if(digit>9) {
        digit = (char)('A'+digit-10); //10..15 map to 'A'..'F'
      } else {
        digit = (char)('0'+digit);
      }

      returnChars[nrOfNibbles-nibbleCnt-1] = digit;

      number >>= 4;
    }

    m_compressedPacket.append(returnChars);
  }

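  /**
   * apply a simple LZ77 derivative to the given packet; a repetition is
   * emitted as '\007' + 3 hex nibbles (distance back) + 2 hex nibbles (length)
   */
  static String getCompressedPacket(String inPacket) {
    int refIndex;
    int lastRefIndex;
    int lengthCnt;
    int refLength;
    int inPacketLength = inPacket.length();

    m_compressedPacket.setLength(0); //start fresh; the buffer is static and shared between calls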

    //step through the input string's characters
    for(int chrPos=0;chrPos<inPacketLength;) {
      //try to find as large a repetition block as possible
      for(lengthCnt=7, refIndex=-1, lastRefIndex=-1;
           (lengthCnt<100) && (chrPos+lengthCnt<inPacketLength+1);
               lengthCnt++, lastRefIndex=refIndex) {
        if(lastRefIndex==-1) //no searches done yet
          refIndex = inPacket.indexOf(
              inPacket.substring(chrPos, chrPos+lengthCnt), chrPos-4000);
        else //previous search found a string => longer strings cannot start before that position
          refIndex = inPacket.indexOf(
              inPacket.substring(chrPos, chrPos+lengthCnt), lastRefIndex);

        //catch overlaps like \window{XXAAA} \curpos{AAAAXX} => would return AAAA!
        if(refIndex+lengthCnt>chrPos)
          refIndex = -1;

        if(refIndex==-1)
          break;
      }

      if(lastRefIndex!=-1) { //repetition block found?
        refLength = lengthCnt-1; //since the last iteration's length was already too long
        lastRefIndex = chrPos - lastRefIndex; //relative index to current position
        m_compressedPacket.append('\007');
        addHexNibbles(lastRefIndex, 3);
        addHexNibbles(refLength, 2);
        chrPos += refLength;
      }
      else if(inPacket.charAt(chrPos)=='\007') { //quote the quote
        m_compressedPacket.append('\007');
        addHexNibbles(0, 3);
        addHexNibbles(1, 2);
        chrPos += 1;
      }
      else { //nothing found, nothing to be quoted
        m_compressedPacket.append(inPacket.charAt(chrPos));
        chrPos += 1;
      }
    }//for all characters

    return m_compressedPacket.toString();
  }//getCompressedPacket
</CODE>
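
The snippet above only covers the sending side. For the receiving side, a
matching decoder might look roughly like the sketch below. It is not part of
the original code: the class name WindowDecompressor and the method
decompress are made up for illustration. It assumes the token layout used by
getCompressedPacket (the character '\007', then 3 hex digits for the distance
back, then 2 hex digits for the length, with distance 0 meaning a quoted
literal '\007') and that the hex fields use the standard digits 0-9 and A-F.

```java
class WindowDecompressor {
    // Marker character that introduces a 6-character token (assumed layout:
    // marker + 3 hex digits distance + 2 hex digits length).
    static final char MARKER = '\u0007';

    static String decompress(String packed) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (pos < packed.length()) {
            char c = packed.charAt(pos);
            if (c != MARKER) {          // plain character, copy through
                out.append(c);
                pos++;
                continue;
            }
            if (pos + 6 > packed.length()) {
                throw new IllegalArgumentException("truncated token at " + pos);
            }
            int distance = Integer.parseInt(packed.substring(pos + 1, pos + 4), 16);
            int length = Integer.parseInt(packed.substring(pos + 4, pos + 6), 16);
            pos += 6;
            if (distance == 0) {        // quoted literal marker character
                out.append(MARKER);
                continue;
            }
            // The compressor never emits overlapping references, so the
            // referenced block must lie entirely inside what is decoded so far.
            if (distance < 0 || length < 0
                    || length > distance || distance > out.length()) {
                throw new IllegalArgumentException("bad reference before " + pos);
            }
            int start = out.length() - distance;
            out.append(out.substring(start, start + length));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "hello, world" (12 chars) repeated 14 characters back: 00E / 0C
        String packed = "hello, world; " + MARKER + "00E0C!";
        System.out.println(decompress(packed));
    }
}
```

Since the decoder only needs the text decoded so far, it can run in a single
pass and does not need to know the window size of 4000 used when compressing.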