Posted to user@poi.apache.org by gipsy <sp...@gmail.com> on 2011/07/30 14:57:50 UTC

open doc -> replaceText -> save doc -> ERROR: “Word was unable to read this document. It may be corrupt."

Dear all

I have a problem replacing text elements with Apache POI. Finding the
replace method is not the problem; the document is damaged and cannot be
opened after writing.
The line that causes the error is the following:

"p.replaceText(searchText, replaceTxt, offset);"


error-message
-------------

“Word was unable to read this document. It may be corrupt.
Try one or more of the following:
* Open and Repair the file.
* Open the file with the Text Recovery converter.
(\…\test.doc)”


Can you help me - is there a workaround?

Thx Thomas

my code:


import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;
import org.apache.poi.hwpf.usermodel.Section;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

// Wrapper class added only so the fragment compiles on its own.
public class ReplaceTextTest {
	public static void main(String[] args) throws IOException {
		String file_in_path = "C:/Users/Thomas/Desktop/20110729/BN_Vertragsdaten_123.doc";
		String file_out_path = "C:/Users/Thomas/Desktop/20110729/ausgabe_BN_Vertragsdaten_123.doc";
		String searchText = "Vertragsart";
		String replaceTxt = "NEW_Vertragsart_NEW";

		// Load the document. I/O errors propagate instead of being swallowed,
		// so the code below never runs with doc == null.
		HWPFDocument doc;
		try (InputStream in = new FileInputStream(file_in_path)) {
			doc = new HWPFDocument(new POIFSFileSystem(in));
		}

		// Replace the first occurrence of searchText in every paragraph.
		Range range = doc.getRange();
		for (int x = 0; x < range.numSections(); x++) {
			Section s = range.getSection(x);
			for (int y = 0; y < s.numParagraphs(); y++) {
				Paragraph p = s.getParagraph(y);
				String paraText = p.text();
				int offset = paraText.indexOf(searchText);
				if (offset != -1) {
					System.out.println("REPLACED");
					p.replaceText(searchText, replaceTxt, offset);
				}
			}
		}

		// Write the modified document; the stream is closed automatically.
		try (OutputStream out = new FileOutputStream(file_out_path)) {
			doc.write(out);
		}
	}
}

--
View this message in context: http://apache-poi.1045710.n5.nabble.com/open-doc-replaceText-save-doc-ERROR-Word-was-unable-to-read-this-document-It-may-be-corrupt-tp4649703p4649703.html
Sent from the POI - User mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: open doc -> replaceText -> save doc -> ERROR: “Word was unable to read this document. It may be corrupt."

Posted by gipsy <sp...@gmail.com>.
No, there is only one line of output from System.out.println("REPLACED");

When I open the .doc file with OpenOffice and save it, I can open it on
Windows and the string is replaced. So I don't think there is an infinite
loop.

--
View this message in context: http://apache-poi.1045710.n5.nabble.com/open-doc-replaceText-save-doc-ERROR-Word-was-unable-to-read-this-document-It-may-be-corrupt-tp4649703p4652028.html


Re: open doc -> replaceText -> save doc -> ERROR: “Word was unable to read this document. It may be corrupt."

Posted by Kalyan Dev <ka...@gmail.com>.
Just to add here: when the strings passed to the replaceText() method are
the same, does it go into an infinite loop?

-Kalyan
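
The infinite-loop concern can be illustrated without POI. In this thread the
replacement "NEW_Vertragsart_NEW" contains the search text "Vertragsart", so
any code that rescans the text from the same position after replacing would
match its own output again. The sketch below uses plain String and
StringBuilder only; the names ReplaceOffsets, findOffsets and replaceAll are
hypothetical, and it does not model how HWPF's Range.replaceText works
internally. It collects all match offsets first, then replaces back to front
so that earlier offsets stay valid.

```java
import java.util.ArrayList;
import java.util.List;

public class ReplaceOffsets {

    /** Collects the start offset of every non-overlapping match of search in text. */
    static List<Integer> findOffsets(String text, String search) {
        List<Integer> offsets = new ArrayList<>();
        if (search.isEmpty()) {
            return offsets; // an empty pattern would match at every position
        }
        int from = 0;
        while (true) {
            int hit = text.indexOf(search, from);
            if (hit < 0) {
                break;
            }
            offsets.add(hit);
            from = hit + search.length(); // step past the match so it is not found again
        }
        return offsets;
    }

    /** Replaces every match, working back to front so earlier offsets stay valid. */
    static String replaceAll(String text, String search, String replacement) {
        StringBuilder sb = new StringBuilder(text);
        List<Integer> offsets = findOffsets(text, search);
        for (int i = offsets.size() - 1; i >= 0; i--) {
            int start = offsets.get(i);
            sb.replace(start, start + search.length(), replacement);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String paraText = "Vertragsart: A, Vertragsart: B";
        System.out.println(replaceAll(paraText, "Vertragsart", "NEW_Vertragsart_NEW"));
    }
}
```

Because the offsets are computed once on the original text, the loop always
terminates, even when the replacement contains the search text or is identical
to it. The same idea could be applied to the offsets passed to
p.replaceText(searchText, replaceTxt, offset), but that is untested here.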

On Sun, Jul 31, 2011 at 6:44 PM, gipsy <sp...@gmail.com> wrote:

> I use poi-3.8.beta4-20110725
>
> Yes, there are bookmarks inside the document and I want to replace them. It
> would be great if the replacement of bookmarks would work for every
> document
> a user creates (bookmarks in tables etc).
>
> Is this possible? Or is there a way to say something like
>
> "doc.change_bookmark(Bookmark bookmark, String replacementText)", similar to
> "doc.getBookmarks()", which is available in poi-3.8.beta4?
>
> best regards,
> Thomas

Re: open doc -> replaceText -> save doc -> ERROR: “Word was unable to read this document. It may be corrupt."

Posted by gipsy <sp...@gmail.com>.
I use poi-3.8.beta4-20110725

Yes, there are bookmarks inside the document and I want to replace them. It
would be great if the replacement of bookmarks would work for every document
a user creates (bookmarks in tables etc). 

Is this possible? Or is there a way to say something like

"doc.change_bookmark(Bookmark bookmark, String replacementText)", similar to
"doc.getBookmarks()", which is available in poi-3.8.beta4?

best regards,
Thomas

--
View this message in context: http://apache-poi.1045710.n5.nabble.com/open-doc-replaceText-save-doc-ERROR-Word-was-unable-to-read-this-document-It-may-be-corrupt-tp4649703p4651948.html


Re: open doc -> replaceText -> save doc -> ERROR: “Word was unable to read this document. It may be corrupt."

Posted by Sergey Vladimirov <vl...@gmail.com>.
Hi,

What version of POI are you using?

Does the document have "special" elements like fields, bookmarks, or annotations?

--
Best regards,
Sergey
