Posted to user@poi.apache.org by Christian Märzinger <ch...@gmail.com> on 2011/01/03 00:15:12 UTC

Re: Extract Content_Types.xml from docx

And how is this done in POI?
Thanks
Christian

On 28.12.2010 01:56, Nick Burch wrote:
> On Tue, 28 Dec 2010, Christian Märzinger wrote:
>> So I add the comment rel to the /_rels/document.xml.rels if it
>> does not exist. And in the Content_Types.xml the comments.xml
>> part also has to be overridden.
>
> You probably shouldn't really be doing that yourself. If you ask POI
> to add the part with a relationship, it'll handle the _rels and
> content_types updates for you.
>
> Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Extract Content_Types.xml from docx

Posted by Christian Märzinger <ch...@gmail.com>.
Sorry!

It should have been "Now".

The comment part is created and all relations are set.

On 05.01.2011 08:29, Mark Beardsley wrote:
> In that case, then, are you experiencing problems linking the comment to the
> text (my point 3 in the above answer)?
>
> Yours
>
> Mark B



Re: Extract Content_Types.xml from docx

Posted by Mark Beardsley <ma...@tiscali.co.uk>.
In that case, then, are you experiencing problems linking the comment to the
text (my point 3 in the above answer)?

Yours

Mark B
-- 
View this message in context: http://apache-poi.1045710.n5.nabble.com/Extract-Content-Types-xml-from-docx-tp3319726p3328349.html
Sent from the POI - User mailing list archive at Nabble.com.



Re: Extract Content_Types.xml from docx

Posted by Christian Märzinger <ch...@gmail.com>.
Thanks!
No all relations, parts and Content_Types are set correctly.





Re: Extract Content_Types.xml from docx

Posted by Mark Beardsley <ma...@tiscali.co.uk>.
From a very cursory examination of two Word documents that I created with
Office 2007, there are three obvious differences between the one into which I
inserted a comment and the one without any comments.

The first, and most obvious I guess, is the following entry in the
document.xml.rels file:

<Relationship Id="rId4"
Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/comments"
Target="comments.xml" />

and I think that this is the part that could be created by adding a
relationship to the document.
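
To check whether a given document.xml.rels already carries that entry, something like the following could be used. It is only a minimal sketch that relies on the JDK's built-in XML parser rather than on POI; the class name RelsLookup and the method findTarget are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of POI: looks up a relationship by its Type.
final class RelsLookup {

    static final String COMMENTS_TYPE =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/comments";

    /** Returns the Target of the first relationship with the given Type, or null. */
    static String findTarget(String relsXml, String type) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(relsXml)));
            // "*" matches the element in whatever namespace the file declares.
            NodeList rels = doc.getElementsByTagNameNS("*", "Relationship");
            for (int i = 0; i < rels.getLength(); i++) {
                Element rel = (Element) rels.item(i);
                if (type.equals(rel.getAttribute("Type"))) {
                    return rel.getAttribute("Target");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable .rels document", e);
        }
    }

    public static void main(String[] args) {
        String rels =
            "<Relationships xmlns=\"http://schemas.openxmlformats.org/package/2006/relationships\">"
          + "<Relationship Id=\"rId4\" Type=\"" + COMMENTS_TYPE + "\" Target=\"comments.xml\"/>"
          + "</Relationships>";
        System.out.println(findTarget(rels, COMMENTS_TYPE));
    }
}
```

A null result would mean the relationship is missing from the file.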

The second, and I guess also fairly obvious, difference is that the document
that does contain a comment also contains a file called comments.xml.
Unsurprisingly, it contains markup that defines the comment and looks a
little like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<w:comments
    xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006"
    xmlns:o="urn:schemas-microsoft-com:office:office"
    xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
    xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"
    xmlns:v="urn:schemas-microsoft-com:vml"
    xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"
    xmlns:w10="urn:schemas-microsoft-com:office:word"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml">
  <w:comment w:id="0" w:author="win user" w:date="2011-01-04T15:20:00Z"
      w:initials="wu">
    <w:p w:rsidR="00306FCE" w:rsidRDefault="00306FCE">
      <w:pPr>
        <w:pStyle w:val="CommentText" />
      </w:pPr>
      <w:r>
        <w:rPr>
          <w:rStyle w:val="CommentReference" />
        </w:rPr>
        <w:annotationRef />
      </w:r>
      <w:r>
        <w:t>And this is the comment</w:t>
      </w:r>
    </w:p>
  </w:comment>
</w:comments>
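
For anyone wanting to read such a part back, here is a rough sketch that pulls the id, author and text out of each w:comment. It uses only the JDK's DOM parser, not POI, and the names CommentsReader and readComments are invented for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of POI: summarises the comments in a comments.xml.
final class CommentsReader {

    static final String W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns one "id|author|text" string per w:comment, in document order. */
    static List<String> readComments(String commentsXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(commentsXml)));
            List<String> result = new ArrayList<>();
            NodeList comments = doc.getElementsByTagNameNS(W, "comment");
            for (int i = 0; i < comments.getLength(); i++) {
                Element comment = (Element) comments.item(i);
                // The comment text is spread over the w:t elements of its runs.
                StringBuilder text = new StringBuilder();
                NodeList texts = comment.getElementsByTagNameNS(W, "t");
                for (int j = 0; j < texts.getLength(); j++) {
                    text.append(texts.item(j).getTextContent());
                }
                result.add(comment.getAttributeNS(W, "id") + "|"
                        + comment.getAttributeNS(W, "author") + "|" + text);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable comments part", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<w:comments xmlns:w=\"" + W + "\">"
            + "<w:comment w:id=\"0\" w:author=\"win user\" w:initials=\"wu\">"
            + "<w:p><w:r><w:annotationRef/></w:r>"
            + "<w:r><w:t>And this is the comment</w:t></w:r></w:p>"
            + "</w:comment></w:comments>";
        System.out.println(readComments(xml));
    }
}
```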

The final part of the puzzle lies in the way the comment is linked with the
text of the document. That can be found in the third difference between the
two files and, this time, that lies in the contents of the document.xml
file. Inserting the comment makes that file look like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<w:document
    xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006"
    xmlns:o="urn:schemas-microsoft-com:office:office"
    xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
    xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"
    xmlns:v="urn:schemas-microsoft-com:vml"
    xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"
    xmlns:w10="urn:schemas-microsoft-com:office:word"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml">
  <w:body>
    <w:p w:rsidR="008D0F90" w:rsidRDefault="00B82E75">
      <w:r>
        <w:t xml:space="preserve">This is the text within the body of the</w:t>
      </w:r>
      <w:commentRangeStart w:id="0" />
      <w:r>
        <w:t>document</w:t>
      </w:r>
      <w:commentRangeEnd w:id="0" />
      <w:r w:rsidR="00306FCE">
        <w:rPr>
          <w:rStyle w:val="CommentReference" />
        </w:rPr>
        <w:commentReference w:id="0" />
      </w:r>
      <w:r>
        <w:t>.</w:t>
      </w:r>
    </w:p>
    <w:sectPr w:rsidR="008D0F90" w:rsidSect="008D0F90">
      <w:pgSz w:w="11906" w:h="16838" />
      <w:pgMar w:top="1440" w:right="1440" w:bottom="1440" w:left="1440"
          w:header="708" w:footer="708" w:gutter="0" />
      <w:cols w:space="708" />
      <w:docGrid w:linePitch="360" />
    </w:sectPr>
  </w:body>
</w:document>

The key seems to be that the comment has been attached to the final word in
the single sentence this document contains - the word 'document' in this
case.
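
The anchoring can be made visible with a small sketch that collects the text lying between the w:commentRangeStart and w:commentRangeEnd markers for a given comment id. Again this is plain JDK DOM code, not POI, and the class and method names (CommentAnchor, commentedText) are assumptions made for the illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of POI: finds the text a comment is attached to.
final class CommentAnchor {

    static final String W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns the w:t text between commentRangeStart and commentRangeEnd for the id. */
    static String commentedText(String documentXml, String id) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(documentXml)));
            StringBuilder text = new StringBuilder();
            collect(doc.getDocumentElement(), id, new boolean[1], text);
            return text.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable document part", e);
        }
    }

    // Walks the tree in document order; inside[0] is true while within the range.
    private static void collect(Node node, String id, boolean[] inside, StringBuilder out) {
        if (node.getNodeType() == Node.ELEMENT_NODE && W.equals(node.getNamespaceURI())) {
            Element e = (Element) node;
            String name = e.getLocalName();
            boolean sameId = id.equals(e.getAttributeNS(W, "id"));
            if ("commentRangeStart".equals(name) && sameId) {
                inside[0] = true;
            } else if ("commentRangeEnd".equals(name) && sameId) {
                inside[0] = false;
            } else if ("t".equals(name) && inside[0]) {
                out.append(e.getTextContent());
            }
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            collect(c, id, inside, out);
        }
    }

    public static void main(String[] args) {
        String xml = "<w:document xmlns:w=\"" + W + "\"><w:body><w:p>"
            + "<w:r><w:t>This is the text within the body of the</w:t></w:r>"
            + "<w:commentRangeStart w:id=\"0\"/><w:r><w:t>document</w:t></w:r>"
            + "<w:commentRangeEnd w:id=\"0\"/>"
            + "<w:r><w:commentReference w:id=\"0\"/></w:r>"
            + "<w:r><w:t>.</w:t></w:r></w:p></w:body></w:document>";
        System.out.println(commentedText(xml, "0"));
    }
}
```

Because the start and end markers sit between runs rather than inside one, a range could in principle span several runs; the sketch simply concatenates whatever text falls inside it.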

So, it seems that you will have to find a way to accomplish the following:
1. Create the relationship. This should be fairly straightforward and I
think is the mechanism Nick is alluding to.
2. Create the comments.xml file. I am sure that there are methods in the
openxml code underpinning POI that will support this. I am guessing that you
will need to find the relevant factory method to create the 'comment'
document part, add a comment to that and then stream it out using the
OutputStream that you should be able to get from the package part.
3. This is the potentially tricky bit to my mind, as my research suggests
that you will need to identify a mechanism to somehow 'attach' or 'link' the
comment to the document. My simple example suggests that the comment is
'attached' to the final word in the sentence, suggesting that it is the
XWPFRun class instance that should offer you the ability to attach a
comment. I am not at all clear yet in my own mind that this is the
'best' solution, though, as it could be argued that the XWPFParagraph class
should allow you to insert a comment or even to create one. The key will be
to nail down just how Word handles comments. Are they attached to words,
phrases, sentences, paragraphs, sections or even the whole document?

I can make no promises at all but if I do have the time in the near future,
then I will try to look a little further into this question.

Yours

Mark B

PS There is one further difference: each comment seems to add a single
additional character to the size of the file as recorded in the app
properties file (I think; I do need to review this again).


-- 
View this message in context: http://apache-poi.1045710.n5.nabble.com/Extract-Content-Types-xml-from-docx-tp3319726p3327282.html
Sent from the POI - User mailing list archive at Nabble.com.



Re: Extract Content_Types.xml from docx

Posted by Nick Burch <ni...@alfresco.com>.
On Mon, 3 Jan 2011, Christian Märzinger wrote:
> Now I can add the comments.xml relation to the
> /word/_rels/document.xml.rels but the comments.xml in /word is not
> created and also in the [content_types].xml the comment relation will
> not be overridden.

You should first create the comments part and fill it, then link that with 
a relationship to the document part. Is that what you're doing?

> If I add the relation to the OPCPackage the file is corrupted.

That adds a root relation, and IIRC comments aren't supposed to be those;
only things like the document are.

Nick
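
One way to see whether a saved file really ended up with the override for the comments part is to open it as an ordinary zip archive and look at [Content_Types].xml directly. The following is just a sketch of that check using java.util.zip and the JDK XML parser instead of POI (it assumes Java 9 or later for readAllBytes); the class ContentTypesCheck, its methods and the in-memory stand-in for a .docx are all invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of POI: inspects [Content_Types].xml in a package.
final class ContentTypesCheck {

    /** Returns the text of [Content_Types].xml from a zipped package, or null if absent. */
    static String extractContentTypes(InputStream packageStream) {
        try (ZipInputStream zip = new ZipInputStream(packageStream)) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if ("[Content_Types].xml".equals(e.getName())) {
                    return new String(zip.readAllBytes(), StandardCharsets.UTF_8);
                }
            }
            return null;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    /** Returns the ContentType of the Override for partName, or null if there is none. */
    static String overrideFor(String contentTypesXml, String partName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            NodeList overrides = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(contentTypesXml)))
                    .getElementsByTagNameNS("*", "Override");
            for (int i = 0; i < overrides.getLength(); i++) {
                Element o = (Element) overrides.item(i);
                if (partName.equals(o.getAttribute("PartName"))) {
                    return o.getAttribute("ContentType");
                }
            }
            return null;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Unreadable [Content_Types].xml", ex);
        }
    }

    /** Demo helper: builds a one-entry zip in memory, standing in for a real .docx. */
    static byte[] singleEntryZip(String name, String content) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry(name));
            zip.write(content.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        String types =
            "<Types xmlns=\"http://schemas.openxmlformats.org/package/2006/content-types\">"
          + "<Override PartName=\"/word/comments.xml\" ContentType=\""
          + "application/vnd.openxmlformats-officedocument.wordprocessingml.comments+xml\"/>"
          + "</Types>";
        byte[] fakeDocx = singleEntryZip("[Content_Types].xml", types);
        String extracted = extractContentTypes(new ByteArrayInputStream(fakeDocx));
        System.out.println(overrideFor(extracted, "/word/comments.xml"));
    }
}
```

With a real file, a FileInputStream on the .docx would take the place of the in-memory archive. A null from overrideFor would mean the comments part has no override entry.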

Re: Extract Content_Types.xml from docx

Posted by Christian Märzinger <ch...@gmail.com>.
Now I can add the comments.xml relation to the /word/_rels/document.xml.rels
but the comments.xml in /word is not created, and in the
[content_types].xml the comment relation will not be overridden.

I add the relation to a PackagePart, which is the document.xml.

If I add the relation to the OPCPackage the file is corrupted.

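code
import org.apache.poi.openxml4j.opc.PackagePartName;
import org.apache.poi.openxml4j.opc.PackagingURIHelper;
import org.apache.poi.openxml4j.opc.TargetMode;

// coreDoc is the PackagePart for /word/document.xml
PackagePartName commentsPart = PackagingURIHelper.createPartName("/word/comments.xml");
coreDoc.addRelationship(commentsPart, TargetMode.INTERNAL,
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/comments");
end of code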

2011/1/3 Nick Burch <ni...@alfresco.com>

> On Mon, 3 Jan 2011, Christian Märzinger wrote:
>
>> And how is this done in POI?
>>
>
> For code examples, take a look at methods like createRelationship in
> POIXMLDocumentPart:
>
> http://svn.apache.org/repos/asf/poi/trunk/src/ooxml/java/org/apache/poi/POIXMLDocumentPart.java
>
> Nick

Re: Extract Content_Types.xml from docx

Posted by Nick Burch <ni...@alfresco.com>.
On Mon, 3 Jan 2011, Christian Märzinger wrote:
> And how is this done in POI?

For code examples, take a look at methods like createRelationship in 
POIXMLDocumentPart:
http://svn.apache.org/repos/asf/poi/trunk/src/ooxml/java/org/apache/poi/POIXMLDocumentPart.java

Nick