Posted to dev@poi.apache.org by "georgberky (via GitHub)" <gi...@apache.org> on 2023/04/20 08:24:06 UTC

[GitHub] [poi] georgberky opened a new pull request, #458: Insert images into document with shapes

georgberky opened a new pull request, #458:
URL: https://github.com/apache/poi/pull/458

   Dear POI crew,
   
   Thank you for your work on POI. We may have found a bug that occurs when inserting images into Word documents that already contain shapes. There appears to be a collision between the IDs of the shapes and those of the images. The problem only seems to occur once a certain number of shapes is already present in the document.
   
   We wrote a test that should illustrate the problem, but we don't know how to fix it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[GitHub] [poi] centic9 commented on pull request #458: Insert images into document with shapes

Posted by "centic9 (via GitHub)" <gi...@apache.org>.
centic9 commented on PR #458:
URL: https://github.com/apache/poi/pull/458#issuecomment-1518550277

   See also https://stackoverflow.com/questions/68074154/are-there-mcalternatecontent-and-mcfallback-support-in-poi-xwpf-parser and https://stackoverflow.com/questions/46802369/replace-text-in-text-box-of-docx-by-using-apache-poi/46894499#46894499 for related discussions about "AlternateContent"




[GitHub] [poi] georgberky commented on pull request #458: Insert images into document with shapes

Posted by "georgberky (via GitHub)" <gi...@apache.org>.
georgberky commented on PR #458:
URL: https://github.com/apache/poi/pull/458#issuecomment-1520253254

   Hi @centic9, we've done some more research and are now using documents with shapes in the footer. Applying the workaround as you posted it didn't fix the documents.
   
   We also tried adding the footer paragraphs to the list of paragraphs the workaround iterates over. That didn't help either.
   
   We also discarded our test assertions, as the `docPr` values alone don't seem to cause the problem for Word.




[GitHub] [poi] georgberky commented on pull request #458: Insert images into document with shapes

Posted by "georgberky (via GitHub)" <gi...@apache.org>.
georgberky commented on PR #458:
URL: https://github.com/apache/poi/pull/458#issuecomment-1520048412

   @centic9, thanks for replying so quickly.
   
   We're testing the workaround in our project. 
   
   Another observation: "unable to open document" in Word only seems to happen when the shape is part of the footer. Word seems to be able to handle shapes in the document body better 🤔.




[GitHub] [poi] centic9 commented on pull request #458: Insert images into document with shapes

Posted by "centic9 (via GitHub)" <gi...@apache.org>.
centic9 commented on PR #458:
URL: https://github.com/apache/poi/pull/458#issuecomment-1518549715

   The duplication seems to be caused by "AlternateContent" (i.e. Microsoft's way of adding new types of content without changing the spec ...).
   
   Can you try to add the following between opening the document and adding content? I am not 100% sure this is really the issue, so please verify that Microsoft Word can then read the document properly.
   
   This can also be used as a workaround if it really fixes the issue.
   
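   ```
   import org.apache.poi.xwpf.usermodel.XWPFParagraph;
   import org.apache.poi.xwpf.usermodel.XWPFRun;
   import org.apache.xmlbeans.XmlCursor;
   import org.apache.xmlbeans.XmlObject;
   import org.openxmlformats.schemas.drawingml.x2006.main.CTNonVisualDrawingProps;

   // Run this right after opening "document" (an XWPFDocument) and before adding content.
   // CTNonVisualDrawingProps.Factory.parse() can throw XmlException.
   for (XWPFParagraph paragraph : document.getParagraphs()) {
       for (XWPFRun run : paragraph.getRuns()) {
           XmlCursor cursor = run.getCTR().newCursor();
           // Select the docPr elements of anchored drawings inside mc:AlternateContent
           cursor.selectPath("declare namespace w='http://schemas.openxmlformats.org/wordprocessingml/2006/main' "
                   + "declare namespace mc='http://schemas.openxmlformats.org/markup-compatibility/2006' "
                   + "declare namespace wp='http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing' "
                   + ".//mc:AlternateContent/mc:Choice/w:drawing/wp:anchor/wp:docPr");

           while (cursor.hasNextSelection()) {
               cursor.toNextSelection();
               XmlObject obj = cursor.getObject();

               CTNonVisualDrawingProps docPr = CTNonVisualDrawingProps.Factory.parse(obj.xmlText());

               // Mark the id as used so that newly added drawings get a different one
               document.getDrawingIdManager().reserve(docPr.getId());
           }
       }
   }
   ```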




[GitHub] [poi] centic9 commented on pull request #458: Insert images into document with shapes

Posted by "centic9 (via GitHub)" <gi...@apache.org>.
centic9 commented on PR #458:
URL: https://github.com/apache/poi/pull/458#issuecomment-1520621664

   OK, as I said, I am not sure this will even make it work, but I see two things that might be off:
   * It's important to call "workaround()" immediately after loading the document, not at the end; otherwise avoiding duplicate IDs will not work
   * The current "workaround()" only looks at the XPath ".//mc:AlternateContent/mc:Choice/w:drawing/wp:anchor/wp:docPr"; if the "docPr" is at a different location in the XML of the paragraph/run, another XPath/cursor might be necessary to also find and "reserve()" those elements
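   
   The ordering point in the first bullet can be illustrated with a minimal, POI-free sketch. `IdPool` below is a hypothetical stand-in written for this illustration only; it is not POI's id manager and its allocation strategy is an assumption. It only shows why ids already present in the document have to be reserved before any new id is handed out.
   
   ```java
   import java.util.HashSet;
   import java.util.Set;

   final class ReserveOrderDemo {

       /** Hypothetical id pool for illustration; NOT POI's drawing id manager. */
       static final class IdPool {
           private final Set<Long> used = new HashSet<>();
           private long next = 1;

           /** Marks an id that already exists in the loaded document as taken. */
           void reserve(long id) {
               used.add(id);
           }

           /** Hands out the lowest id that is not yet taken. */
           long allocate() {
               while (used.contains(next)) {
                   next++;
               }
               used.add(next);
               return next;
           }
       }

       /** Returns true if a newly allocated picture id equals the id of an existing shape. */
       static boolean collides(boolean reserveFirst) {
           long existingShapeId = 1; // assumed id of a shape already in the document
           IdPool pool = new IdPool();
           if (reserveFirst) {
               pool.reserve(existingShapeId); // "workaround" run right after loading
           }
           long pictureId = pool.allocate(); // id for the newly inserted picture
           return pictureId == existingShapeId;
       }

       public static void main(String[] args) {
           System.out.println(collides(true));  // false: the picture gets id 2
           System.out.println(collides(false)); // true: the picture reuses id 1
       }
   }
   ```
   
   If the reservation only happens after the picture has been added, the duplicate id is already in the document and reserving it afterwards changes nothing.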
   
   Otherwise you will need to narrow down the problem as much as possible, both by making the document as simple as possible and by keeping the changes to the document minimal. Unfortunately I cannot help much because I only have the online version of Microsoft Word, and that one seems to be able to load the document just fine.




[GitHub] [poi] georgberky commented on pull request #458: Insert images into document with shapes

Posted by "georgberky (via GitHub)" <gi...@apache.org>.
georgberky commented on PR #458:
URL: https://github.com/apache/poi/pull/458#issuecomment-1521387979

   Thanks again for helping. 
   
   I'm now running the workaround directly after loading the document. I've also changed the XPath to `.//wp:docPr`. The XPath worked when running "evaluate XPath" in IntelliJ on `footer2.xml`, but it didn't fix the broken documents.
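   
   For reference, here is a minimal sketch of what the two XPaths select, using only the JDK's XPath API rather than XmlCursor. The XML is a hypothetical, heavily trimmed stand-in and not the real `footer2.xml`; the ids 5 and 7 are made up, and the expressions start with `//` instead of `.//` because they are evaluated against the whole fragment rather than a single run.
   
   ```java
   import java.io.StringReader;
   import java.util.ArrayList;
   import java.util.Collections;
   import java.util.Iterator;
   import java.util.List;
   import java.util.Map;
   import javax.xml.XMLConstants;
   import javax.xml.namespace.NamespaceContext;
   import javax.xml.parsers.DocumentBuilderFactory;
   import javax.xml.xpath.XPath;
   import javax.xml.xpath.XPathConstants;
   import javax.xml.xpath.XPathFactory;
   import org.w3c.dom.Document;
   import org.w3c.dom.Element;
   import org.w3c.dom.NodeList;
   import org.xml.sax.InputSource;

   final class DocPrXPathDemo {
       static final Map<String, String> NS = Map.of(
               "w", "http://schemas.openxmlformats.org/wordprocessingml/2006/main",
               "mc", "http://schemas.openxmlformats.org/markup-compatibility/2006",
               "wp", "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing");

       static final String NARROW = "//mc:AlternateContent/mc:Choice/w:drawing/wp:anchor/wp:docPr";
       static final String BROAD = "//wp:docPr";

       // Hypothetical fragment: one anchored shape inside AlternateContent, one inline picture outside it.
       static final String FOOTER_XML =
               "<w:ftr xmlns:w='" + NS.get("w") + "' xmlns:mc='" + NS.get("mc") + "' xmlns:wp='" + NS.get("wp") + "'>"
               + "<w:p><w:r><mc:AlternateContent><mc:Choice Requires='wps'><w:drawing><wp:anchor>"
               + "<wp:docPr id='5' name='Shape 5'/>"
               + "</wp:anchor></w:drawing></mc:Choice></mc:AlternateContent></w:r></w:p>"
               + "<w:p><w:r><w:drawing><wp:inline>"
               + "<wp:docPr id='7' name='Picture 7'/>"
               + "</wp:inline></w:drawing></w:r></w:p>"
               + "</w:ftr>";

       /** Returns the id attributes of all elements matched by the expression, in document order. */
       static List<Long> docPrIds(String xml, String expression) {
           try {
               DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
               factory.setNamespaceAware(true); // required for prefixed XPath steps to match
               Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

               XPath xpath = XPathFactory.newInstance().newXPath();
               xpath.setNamespaceContext(new NamespaceContext() {
                   public String getNamespaceURI(String prefix) {
                       return NS.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
                   }
                   public String getPrefix(String uri) { return null; }
                   public Iterator<String> getPrefixes(String uri) { return Collections.emptyIterator(); }
               });

               NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
               List<Long> ids = new ArrayList<>();
               for (int i = 0; i < nodes.getLength(); i++) {
                   ids.add(Long.parseLong(((Element) nodes.item(i)).getAttribute("id")));
               }
               return ids;
           } catch (Exception e) {
               throw new IllegalStateException("XML parsing or XPath evaluation failed", e);
           }
       }

       public static void main(String[] args) {
           System.out.println(docPrIds(FOOTER_XML, NARROW)); // [5]
           System.out.println(docPrIds(FOOTER_XML, BROAD));  // [5, 7]
       }
   }
   ```
   
   In this fragment the narrow path only sees the anchored shape, while `//wp:docPr` also picks up the inline picture. Whether the real footer has `docPr` elements in other places is something this sketch cannot tell.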
   
   I'll keep digging.

