Posted to user@poi.apache.org by Cory Newey <co...@gmail.com> on 2016/08/07 20:29:45 UTC

Merge Word docx into another Word docx via XWPFDocument - preserving formatting

Hello All:

I've tried to check the FAQ for this question and was unable to find
anything. I'm not sure if this should be a question for the developers'
list; I couldn't really tell from the brief description of the various
lists. Anyway, I figured I'd ask the question here and you guys could
bounce me over to the developers' list if that is the appropriate place for
this question.

I have written a program that will merge changes (replace certain words
with other words/phrases) into a Word (docx) document using XWPFDocument.
But now I want to replace words/sections of one Word document with an
entire other Word document. I've tried to update my program so that it
iterates through the paragraphs/runs of the document to be merged in. It
merges the text just fine but it loses all formatting, tables, etc. I've
googled the question and found a few Stack Overflow posts that talked about
it, but nothing was of any use to me.

My question is: is it possible to merge one Word document into another Word
document, such that it copies all formatting, tables, etc. from the
merged-in document (minus any headers/footers - that would be a little
impossible I think) into the merged-into document - using the XWPFDocument
object?

Thanks in advance for any help.
~Cory

Re: Merge Word docx into another Word docx via XWPFDocument - preserving formatting

Posted by Алексей Ушаровский <us...@mail.ru.INVALID>.
Hello!

I tackled this problem a few weeks ago. My solution uses the AltChunk element of the OOXML format.
It already works, but there is one problem for which I have no simple solution: conflicts between the styles of the documents being merged.

Here is my thread about adding an AltChunk through the Apache POI library:
http://www.cyberforum.ru/java/thread1737393.html

It's in Russian, but the Java code should be easy enough to follow, I guess.
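
For readers who cannot follow the linked thread, the markup side of the AltChunk idea can be sketched without POI. This is not the code from that thread: it is a minimal illustration that uses only the JDK's DOM API on the XML of word/document.xml, and the class name, method names, the placeholder text "[[DOC_B]]" and the relationship id "rIdChunk1" are all hypothetical. It replaces a body-level paragraph whose text equals the placeholder with a w:altChunk element. It assumes the document root already declares the w: and r: prefixes. A working solution would also have to add the second docx to the package as a part and register a relationship with the same id (the relationship type ends in /aFChunk); that step is left out here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class AltChunkSketch {
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static final String R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

    /** Parses the content of document.xml into a namespace-aware DOM. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            return f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** Concatenates the text of all w:t elements inside one paragraph. */
    static String text(Element paragraph) {
        StringBuilder sb = new StringBuilder();
        NodeList ts = paragraph.getElementsByTagNameNS(W, "t");
        for (int i = 0; i < ts.getLength(); i++) {
            sb.append(ts.item(i).getTextContent());
        }
        return sb.toString();
    }

    /** Replaces the first body-level paragraph equal to the placeholder; false if none matches. */
    static boolean replaceWithAltChunk(Document doc, String placeholder, String relId) {
        Node body = doc.getElementsByTagNameNS(W, "body").item(0);
        for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
            boolean isParagraph = n.getNodeType() == Node.ELEMENT_NODE
                    && W.equals(n.getNamespaceURI()) && "p".equals(n.getLocalName());
            if (!isParagraph || !placeholder.equals(text((Element) n))) {
                continue;
            }
            Element chunk = doc.createElementNS(W, "w:altChunk");
            chunk.setAttributeNS(R, "r:id", relId); // hypothetical id; must match a real relationship
            body.replaceChild(chunk, n);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Document doc = parse("<w:document xmlns:w=\"" + W + "\" xmlns:r=\"" + R + "\"><w:body>"
                + "<w:p><w:r><w:t>Intro</w:t></w:r></w:p>"
                + "<w:p><w:r><w:t>[[DOC_B]]</w:t></w:r></w:p>"
                + "</w:body></w:document>");
        System.out.println(replaceWithAltChunk(doc, "[[DOC_B]]", "rIdChunk1"));
    }
}
```

Because the embedded document is only converted when Word opens the file, the style conflicts mentioned above are resolved by Word, not by this code.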


>Sunday, 7 August 2016, 23:29 +03:00, from Cory Newey <co...@gmail.com>:
>
>Hello All:
>
>I've tried to check the FAQ for this question and was unable to find
>anything. I'm not sure if this should be a question for the developers'
>list; I couldn't really tell from the brief description of the various
>lists. Anyway, I figured I'd ask the question here and you guys could
>bounce me over to the developers' list if that is the appropriate place for
>this question.
>
>I have written a program that will merge changes (replace certain words
>with other words/phrases) into a Word (docx) document using XWPFDocument.
>But now I want to replace words/sections of one Word document with an
>entire other Word document. I've tried to update my program so that it
>iterates through the paragraphs/runs of the document to be merged in. It
>merges the text just fine but it loses all formatting, tables, etc. I've
>googled the question and found a few Stack-Overflow posts that talked about
>it, but nothing was of any use to me.
>
>My question is: is it possible to merge one Word document into another Word
>document, such that it copies all formatting, tables, etc from the
>merged-in document (minus any headers/footers - that would be a little
>impossible I think) into the merged-into document - using the XWPFDocument
>object?
>
>Thanks in advance for any help.
>~Cory


Re: Merge Word docx into another Word docx via XWPFDocument - preserving formatting

Posted by Dominik Stadler <do...@gmx.at>.
You can also take a look at https://github.com/centic9/poi-mail-merge. It
uses a very simple replacement mechanism to fill in contents from an
xlsx/csv file to create multiple merged documents.

While it is not a complete fit, you might be able to adjust it for your
use case.

Dominik

On Aug 8, 2016 5:40 PM, "Angelo zerr" <an...@gmail.com> wrote:

> Hi Cory,
>
> It seems that you want "mail merge" features. I suggest that you try
> XDocReport
> <https://github.com/opensagres/xdocreport/wiki/DocxReportingQuickStart>,
> which lets you create a template docx with MS Word and use
> Velocity/Freemarker syntax to manage the fields to replace (your need),
> loops (for sections, etc.), conditions, etc. by using a Java model.
>
> You can also convert your generated docx to HTML and PDF (we use POI+iText
> for that, but it's not perfect).
>
> Hope this helps.
>
> Regards, Angelo

Re: Merge Word docx into another Word docx via XWPFDocument - preserving formatting

Posted by Angelo zerr <an...@gmail.com>.
Hi Cory,

It seems that you want "mail merge" features. I suggest that you try
XDocReport
<https://github.com/opensagres/xdocreport/wiki/DocxReportingQuickStart>,
which lets you create a template docx with MS Word and use
Velocity/Freemarker syntax to manage the fields to replace (your need),
loops (for sections, etc.), conditions, etc. by using a Java model.

You can also convert your generated docx to HTML and PDF (we use POI+iText
for that, but it's not perfect).

Hope this helps.

Regards, Angelo

2016-08-08 17:10 GMT+02:00 Cory Newey <co...@gmail.com>:

> Thanks for the replies.
>
> Unfortunately, I don't just want to tack one document onto the end of
> another document. I want to search for a particular word/phrase in the
> first document and replace that with the entire contents of the second
> document. That means that if I find the text to be replaced inside of a
> particular run, I need to preserve all of the contents of the runs before
> that run, insert all of the contents from the merged-in document at the
> place where the run that contains the text-to-be-replaced resides, and then
> preserve all of the contents of the runs that come after.
>
> Thanks again.
> ~Cory

RE: Merge Word docx into another Word docx via XWPFDocument - preserving formatting

Posted by "Murphy, Mark" <mu...@metalexmfg.com>.
So what you need to do is end the run (and the paragraph, and potentially the table, row, and cell), then insert the new document, then reopen the table/paragraph/run with the original properties.

You still have to be concerned with sections, though, because "Document B" will likely contain a section element at the end. Either that element must be excluded, or you need to copy the section from "Document A" to the properties of the last paragraph before "Document B" is inserted and move the section from "Document B" into the properties of that document's last paragraph. In the second case you will have three sections: the first half of "Document A", "Document B", and the last half of "Document A".

Any of these options must be done using the CT_ interface and XMLBeans, as the features do not exist in the XWPF usermodel.
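
The section handling described above can be illustrated at the raw XML level. The following is only a sketch under stated assumptions: it works on the DOM of word/document.xml with the JDK's own XML API instead of POI's CT_ classes, all class and method names are hypothetical, and it assumes the target paragraph does not already end a section. It does not check the schema order of children inside w:pPr.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class SectionSketch {
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Parses the content of document.xml into a namespace-aware DOM. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            return f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** True if n is a WordprocessingML element with the given local name. */
    static boolean isW(Node n, String local) {
        return n != null && n.getNodeType() == Node.ELEMENT_NODE
                && W.equals(n.getNamespaceURI()) && local.equals(n.getLocalName());
    }

    static Element lastElementChild(Node parent) {
        for (Node n = parent.getLastChild(); n != null; n = n.getPreviousSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) return (Element) n;
        }
        return null;
    }

    /** Stores a copy of sectPr in the paragraph's w:pPr, creating w:pPr if it is missing. */
    static void endSectionAt(Element paragraph, Node sectPr) {
        Document doc = paragraph.getOwnerDocument();
        Node first = paragraph.getFirstChild();
        while (first != null && first.getNodeType() != Node.ELEMENT_NODE) {
            first = first.getNextSibling();
        }
        Element pPr;
        if (isW(first, "pPr")) {
            pPr = (Element) first;
        } else {
            pPr = doc.createElementNS(W, "w:pPr");
            paragraph.insertBefore(pPr, paragraph.getFirstChild()); // w:pPr goes first in w:p
        }
        pPr.appendChild(doc.importNode(sectPr, true)); // importNode always returns a copy
    }

    /** Moves the body-level w:sectPr into the last paragraph; false if that is not possible. */
    static boolean closeFinalSection(Element body) {
        Element sect = lastElementChild(body);
        if (!isW(sect, "sectPr")) return false;
        Node prev = sect.getPreviousSibling();
        while (prev != null && prev.getNodeType() != Node.ELEMENT_NODE) {
            prev = prev.getPreviousSibling();
        }
        if (!isW(prev, "p")) return false; // e.g. the body ends with a table
        endSectionAt((Element) prev, sect);
        body.removeChild(sect);
        return true;
    }

    public static void main(String[] args) {
        Document b = parse("<w:document xmlns:w=\"" + W + "\"><w:body>"
                + "<w:p><w:r><w:t>Document B</w:t></w:r></w:p><w:sectPr/>"
                + "</w:body></w:document>");
        Element body = (Element) b.getElementsByTagNameNS(W, "body").item(0);
        System.out.println(closeFinalSection(body));
    }
}
```

In this model, closeFinalSection would be applied to the body of "Document B" before its elements are copied, and endSectionAt would be called with the body-level section of "Document A" on the last paragraph that precedes the insertion point.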

-----Original Message-----
From: Cory Newey [mailto:cory.newey@gmail.com] 
Sent: Monday, August 08, 2016 11:11 AM
To: POI Users List <us...@poi.apache.org>
Subject: Re: Merge Word docx into another Word dox via XWPFDocument - preserving formatting

Thanks for the replies.

Unfortunately, I don't just want to tack one document onto the end of another document. I want to search for a particular word/phrase in the first document and replace that with the entire contents of the second document. That means that if I find the text to be replaced inside of a particular run, I need to preserve all of the contents of the runs before that run, insert all of the contents from the merged-in document at the place where the run that contains the text-to-be-replaced resides, and then preserve all of the contents of the runs that come after.

Thanks again.
~Cory




Re: Merge Word docx into another Word docx via XWPFDocument - preserving formatting

Posted by Cory Newey <co...@gmail.com>.
Thanks for the replies.

Unfortunately, I don't just want to tack one document onto the end of
another document. I want to search for a particular word/phrase in the
first document and replace that with the entire contents of the second
document. That means that if I find the text to be replaced inside of a
particular run, I need to preserve all of the contents of the runs before
that run, insert all of the contents from the merged-in document at the
place where the run that contains the text-to-be-replaced resides, and then
preserve all of the contents of the runs that come after.
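
The "runs before / inserted content / runs after" split described above can be modelled on plain strings. This is a deliberately simplified sketch, not POI code: each run is reduced to its text, the names RunSplitSketch, Split and splitAt are hypothetical, and the placeholder is assumed to sit entirely inside one run. Word often breaks a visible phrase across several runs, so real code would also need to handle a placeholder that spans runs and would have to copy the run properties of the split run onto both halves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RunSplitSketch {
    /** Run texts before and after the placeholder; the placeholder itself is dropped. */
    static final class Split {
        final List<String> before = new ArrayList<>();
        final List<String> after = new ArrayList<>();
    }

    /** Splits the paragraph's runs at the placeholder; returns null if no single run contains it. */
    static Split splitAt(List<String> runTexts, String placeholder) {
        for (int i = 0; i < runTexts.size(); i++) {
            String text = runTexts.get(i);
            int at = text.indexOf(placeholder);
            if (at < 0) {
                continue;
            }
            Split s = new Split();
            s.before.addAll(runTexts.subList(0, i));          // untouched runs before the hit
            if (at > 0) {
                s.before.add(text.substring(0, at));          // head of the split run
            }
            String tail = text.substring(at + placeholder.length());
            if (!tail.isEmpty()) {
                s.after.add(tail);                            // tail of the split run
            }
            s.after.addAll(runTexts.subList(i + 1, runTexts.size()));
            return s;
        }
        return null;
    }

    public static void main(String[] args) {
        Split s = splitAt(
                Arrays.asList("Dear customer, ", "see [[DOC_B]] for details", "."),
                "[[DOC_B]]");
        System.out.println(s.before + " | " + s.after);
    }
}
```

The merged-in document's content would then go between the paragraph built from "before" and the paragraph built from "after".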

Thanks again.
~Cory


On Mon, Aug 8, 2016 at 5:19 AM, Murphy, Mark <mu...@metalexmfg.com>
wrote:

> That depends on what you mean by merge. Are you just trying to append
> document B onto the end of Document A, or are you trying to mix them
> together in some specific way?
>
> Appending should be just a matter of reading through the document body of
> Document B, and copying the elements to Document A. I wouldn't look for
> just paragraphs, but all elements. That way you shouldn't lose anything.
> The main thing you will have to watch out for is sections. These aren't
> handled well right now (ok, not handled at all). If there is one section in
> the document, the section element will be found at the end of the document,
> and must remain there. If there are multiple sections, all but the last
> section element will be found in the last paragraph properties element of
> the section.
>
> Just remember, the Word interface is still unstable, and subject to
> significant changes. And there is still much that has to be accomplished
> down in the weeds of the CT_ interfaces.

RE: Merge Word docx into another Word docx via XWPFDocument - preserving formatting

Posted by "Murphy, Mark" <mu...@metalexmfg.com>.
That depends on what you mean by merge. Are you just trying to append document B onto the end of Document A, or are you trying to mix them together in some specific way?

Appending should be just a matter of reading through the document body of Document B, and copying the elements to Document A. I wouldn't look for just paragraphs, but all elements. That way you shouldn't lose anything. The main thing you will have to watch out for is sections. These aren't handled well right now (OK, not handled at all). If there is one section in the document, the section element will be found at the end of the document body, and must remain there. If there are multiple sections, the section element of every section but the last will be found in the paragraph properties element of that section's last paragraph.

Just remember, the Word interface is still unstable, and subject to significant changes. And there is still much that has to be accomplished down in the weeds of the CT_ interfaces.
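
As an illustration of the append approach described above, here is a minimal sketch that works directly on the XML of word/document.xml with the JDK's DOM API. It is not POI code and all names in it are hypothetical. It copies every body-level element of Document B (paragraphs, tables, and anything else) in front of Document A's final section element and skips Document B's own body-level section element. It copies markup only: styles, numbering definitions, images and other related parts that the copied elements refer to are not transferred.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class BodyAppendSketch {
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Parses the content of document.xml into a namespace-aware DOM. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            return f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    static Element body(Document doc) {
        return (Element) doc.getElementsByTagNameNS(W, "body").item(0);
    }

    static boolean isSectPr(Node n) {
        return n.getNodeType() == Node.ELEMENT_NODE
                && W.equals(n.getNamespaceURI()) && "sectPr".equals(n.getLocalName());
    }

    /** Returns the body-level w:sectPr if it is the last element of the body, else null. */
    static Node bodySectPr(Element body) {
        for (Node n = body.getLastChild(); n != null; n = n.getPreviousSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return isSectPr(n) ? n : null;
            }
        }
        return null;
    }

    /** Copies all body elements of b into a, keeping a's section element last. */
    static void append(Document a, Document b) {
        Element bodyA = body(a);
        Node sectA = bodySectPr(bodyA);
        for (Node n = body(b).getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE || isSectPr(n)) {
                continue; // skip whitespace and b's own body-level section element
            }
            // insertBefore with a null reference node appends at the end
            bodyA.insertBefore(a.importNode(n, true), sectA);
        }
    }

    /** Comma-separated local names of the body's element children, for inspection. */
    static String childNames(Element body) {
        StringBuilder sb = new StringBuilder();
        for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) continue;
            if (sb.length() > 0) sb.append(',');
            sb.append(n.getLocalName());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String open = "<w:document xmlns:w=\"" + W + "\"><w:body>";
        String close = "</w:body></w:document>";
        Document a = parse(open + "<w:p/><w:sectPr/>" + close);
        Document b = parse(open + "<w:tbl/><w:p/><w:sectPr/>" + close);
        append(a, b);
        System.out.println(childNames(body(a)));
    }
}
```

Only direct children of the body are tested for being a section element, so section elements stored inside paragraph properties in Document B are copied along with their paragraphs.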

-----Original Message-----
From: Cory Newey [mailto:cory.newey@gmail.com] 
Sent: Sunday, August 07, 2016 4:30 PM
To: user@poi.apache.org
Subject: Merge Word docx into another Word dox via XWPFDocument - preserving formatting

Hello All:

I've tried to check the FAQ for this question and was unable to find anything. I'm not sure if this should be a question for the developers' list; I couldn't really tell from the brief description of the various lists. Anyway, I figured I'd ask the question here and you guys could bounce me over to the developers' list if that is the appropriate place for this question.

I have written a program that will merge changes (replace certain words with other words/phrases) into a Word (docx) document using XWPFDocument. But now I want to replace words/sections of one Word document with an entire other Word document. I've tried to update my program so that it iterates through the paragraphs/runs of the document to be merged in. It merges the text just fine but it loses all formatting, tables, etc. I've googled the question and found a few Stack Overflow posts that talked about it, but nothing was of any use to me.

My question is: is it possible to merge one Word document into another Word document, such that it copies all formatting, tables, etc. from the merged-in document (minus any headers/footers - that would be a little impossible I think) into the merged-into document - using the XWPFDocument object?

Thanks in advance for any help.
~Cory