Posted to user@poi.apache.org by Dege Ratak <de...@pointclickcare.com> on 2005/09/01 14:57:30 UTC

Replacing text in Word using HWPF

I wanted to search and replace text inside a Word document, and I got
the following to work in most cases.  In some documents this code
doesn't work and Word crashes.  I think it may have to do with offsets
not being adjusted properly when there is a header and footer, since I
can remove everything from one of the failing documents and Word will
still crash.  Usually the solution is simply to start a new document
from scratch.  I'm using Word 2003 and only need this for specific Word
documents; that is, users can't just upload any Word document they want.
They tell me what they want and I write the document.  So sorry if this
doesn't work for your documents, but I couldn't find any other
documentation or solution.

I added the following method to the Range class:

  /**
   * Replaces the text of this range with some other text.
   *
   * @param text the replacement text
   */
  public void replaceWith(String text) {
    initAll();

    TextPiece tp = (TextPiece) _text.get(_textStart);
    StringBuffer sb = (StringBuffer) tp.getStringBuffer();

    // Offsets of this range relative to the start of the text piece,
    // clamped to the piece boundaries.
    int start = _start > tp.getStart() ? _start - tp.getStart() : 0;
    int end = _end <= tp.getEnd() ? _end - tp.getStart()
                                  : tp.getEnd() - tp.getStart();
    if (tp.usesUnicode()) { // convert the byte pointers to char pointers
      start /= 2;
      end /= 2;
    }
    String oldText = sb.substring(start, end);

    int length = 0;
    if (text.length() > oldText.length()) {
      // New text is longer: grow the piece and shift the tables.
      length = text.length() - oldText.length();
      sb.replace(start, end, text);
      int adjustedLength =
          _doc.getTextTable().adjustForInsert(_textStart, length);
      _doc.getCharacterTable().adjustForInsert(_charEnd - 1, adjustedLength);
      _doc.getParagraphTable().adjustForInsert(_parEnd - 1, adjustedLength);
      _doc.getSectionTable().adjustForInsert(_sectionEnd - 1, adjustedLength);

      adjustForInsert(length);
    } else {
      // New text is shorter or the same length: pad with spaces so that
      // no offsets have to change.
      StringBuffer newText = new StringBuffer(text);
      length = oldText.length() - text.length();
      for (int i = 0; i < length; ++i) {
        newText.append(" ");
      }
      sb.replace(start, end, newText.toString());
    }
  }
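
The offset arithmetic at the top of replaceWith can be looked at in
isolation.  The following is only a sketch with plain ints: the class
PieceOffsets and the method toCharIndexes are made-up names, not part of
HWPF, and it assumes (as the comment in the method says) that the range
and piece offsets are byte offsets and that a Unicode piece uses two
bytes per character.

```java
import java.util.Arrays;

// Hypothetical helper, not part of POI/HWPF: mirrors the start/end
// computation in replaceWith using plain ints.
final class PieceOffsets {

    /**
     * Clamps the range [rangeStart, rangeEnd) to the piece
     * [pieceStart, pieceEnd) and returns {start, end} as indexes into
     * the piece's character buffer.
     */
    static int[] toCharIndexes(int rangeStart, int rangeEnd,
                               int pieceStart, int pieceEnd,
                               boolean unicode) {
        // Same result as: _start > tp.getStart() ? _start - tp.getStart() : 0
        int start = Math.max(rangeStart, pieceStart) - pieceStart;
        // Same result as: _end <= tp.getEnd() ? _end - tp.getStart()
        //                                     : tp.getEnd() - tp.getStart()
        int end = Math.min(rangeEnd, pieceEnd) - pieceStart;
        if (unicode) {
            // Assumed: two bytes per character in a Unicode piece.
            start /= 2;
            end /= 2;
        }
        return new int[] {start, end};
    }

    public static void main(String[] args) {
        // Range fully inside a Unicode piece that spans bytes 100..200.
        System.out.println(Arrays.toString(toCharIndexes(120, 140, 100, 200, true)));
        // Range larger than the piece: both ends are clamped to the piece.
        System.out.println(Arrays.toString(toCharIndexes(90, 260, 100, 200, true)));
    }
}
```

Note that a range reaching past the end of its text piece is silently
cut off at the piece boundary, so only the part inside the first piece
would be replaced.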

Most of the code is simply a combination of the adjustForInsert method
and the insertBefore/insertAfter methods.
One caveat: if the new text is shorter than the placeholder in the Word
document, the new text is padded with spaces, as I didn't want to figure
out how to adjust the range for deletes.
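
To make the padding behaviour concrete, here is a small sketch on a bare
StringBuffer, outside of HWPF.  The class PaddedReplace and the method
replacePadded are invented for illustration; the sketch only shows that
shorter text is padded to the old length, while longer text grows the
buffer by a delta that would then have to be pushed through the offset
tables.

```java
// Hypothetical helper, not part of POI/HWPF: same replace-or-pad idea
// as replaceWith, applied to a plain StringBuffer.
final class PaddedReplace {

    /**
     * Replaces sb[start, end) with text.  If text is shorter than the
     * old content it is padded with trailing spaces, so the buffer
     * never shrinks.  Returns how many characters the buffer grew by
     * (0 when the text was padded or had the same length).
     */
    static int replacePadded(StringBuffer sb, int start, int end, String text) {
        int oldLength = end - start;
        StringBuffer replacement = new StringBuffer(text);
        // Pad with spaces until the replacement is as long as the old text.
        while (replacement.length() < oldLength) {
            replacement.append(' ');
        }
        sb.replace(start, end, replacement.toString());
        return replacement.length() - oldLength;
    }

    public static void main(String[] args) {
        String placeholder = "<[name]>";
        StringBuffer sb = new StringBuffer("Dear <[name]>, hello");
        int start = sb.indexOf(placeholder);
        int grown = replacePadded(sb, start, start + placeholder.length(), "Ann");
        // The comma stays where it was; the gap is filled with spaces.
        System.out.println("[" + sb + "] grew by " + grown);
    }
}
```

The visible cost of this shortcut is the run of trailing spaces left in
the document wherever a value is shorter than its placeholder.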

Also, this method only works properly if called on a CharacterRun.
Below is the code I use to loop through a Word document with replaceable
parameters like <[replaceme]>.  The fields variable is a Map that I use
to look up the values to replace the parameters with.

HWPFDocument doc = new HWPFDocument(new FileInputStream(args[0]));
Range r = doc.getRange();

for (int i = 0; i < r.numSections(); ++i) {
  Section s = r.getSection(i);
  for (int x = 0; x < s.numParagraphs(); x++) {
    Paragraph p = s.getParagraph(x);
    for (int z = 0; z < p.numCharacterRuns(); z++) {
      // character run
      CharacterRun run = p.getCharacterRun(z);

      // character run text
      String text = run.text();

      if (text != null) {
        // fields maps a placeholder such as <[replaceme]> to its value
        String value = (String) fields.get(text.trim());
        if (value != null) {
          run.replaceWith(value);
        }
      }
    }
  }
}
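
One thing worth spelling out about the lookup in this loop: a value is
only found when the whole trimmed text of a character run equals a key
in the Map.  The sketch below shows that with plain strings instead of
real CharacterRun objects; PlaceholderLookup and resolve are invented
names, and the sample run texts are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the loop above: run texts are plain strings.
final class PlaceholderLookup {

    /** Returns the value for each run text, or the text unchanged if no key matches. */
    static List<String> resolve(List<String> runTexts, Map<String, String> fields) {
        List<String> out = new ArrayList<String>();
        for (String text : runTexts) {
            // Same test as in the loop: the whole trimmed run must equal a key.
            String value = (text == null) ? null : fields.get(text.trim());
            out.add(value != null ? value : text);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> fields = new HashMap<String, String>();
        fields.put("<[name]>", "Ann");

        // The second run matches after trimming; the last two runs
        // together spell the placeholder but neither matches alone.
        List<String> runs = Arrays.asList("Dear ", " <[name]> ", "<[na", "me]>");
        System.out.println(resolve(runs, fields));
    }
}
```

So if Word happens to store a placeholder in more than one character
run, this approach will leave it untouched.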


Currently, to get the HWPF code you have to get it from the scratchpad.
I got it by using the open source cvsgrab utility (you can google it).
The command I used (all on one line) was

C:\cvsgrab-2.2.2>java -classpath lib\cvsgrab.jar net.sourceforge.cvsgrab.CVSGrab -url http://cvs.apache.org/viewcvs.cgi/jakarta-poi/src/scratchpad

as the batch file didn't work on my machine.
Then I created a project in Eclipse with the other POI jars added to the
lib in order to compile everything and export it to a jar.

Hope you find this useful.
