Posted to dev@poi.apache.org by bu...@apache.org on 2019/02/28 05:17:58 UTC

[Bug 63215] New: Specific microsoft excel files(.xlsx - created using Microsoft Office 2016) getting corrupted while manipulated using apache-poi-3.10-FINAL libraries. Could you please let us know the root cause of this issue ?

https://bz.apache.org/bugzilla/show_bug.cgi?id=63215

            Bug ID: 63215
           Summary: Specific microsoft excel files(.xlsx - created using
                    Microsoft Office 2016) getting corrupted while
                    manipulated using apache-poi-3.10-FINAL libraries.
                    Could you please let us know the root cause of this
                    issue ?
           Product: POI
           Version: unspecified
          Hardware: PC
            Status: NEW
          Severity: normal
          Priority: P2
         Component: POI Overall
          Assignee: dev@poi.apache.org
          Reporter: snag@opentext.com
  Target Milestone: ---

Specific Microsoft Excel files (.xlsx, created using Microsoft Office 2016) are
getting corrupted when manipulated using the apache-poi-3.10-FINAL libraries.
Could you please let us know the root cause of this issue?

Attached sample file here.

Use case: we upload and manipulate the attached sample file on our server and
then download it. When we open the downloaded file, we get a warning message
saying that the file is corrupted. We have a legacy application that still uses
the poi 3.10-FINAL libraries. Upgrading poi to 3.15 solves the problem. However,
we would still like to know the root cause, that is, why this specific .xlsx
file gets corrupted when manipulated using the poi-3.10-FINAL libraries. Could
you please let us know?

Thanks in advance.

Regards,
Sushmita

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 63215] Specific microsoft excel files(.xlsx - created using Microsoft Office 2016) getting corrupted while manipulated using apache-poi-3.10-FINAL libraries. Could you please let us know the root cause of this issue ?

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=63215

--- Comment #2 from PJ Fanning <fa...@yahoo.com> ---
Apache POI is maintained by volunteers and I'm not sure if this task is going
to be taken on by someone in their spare time. Apache POI 3.10 is no longer
maintained.

If you do find the issue, please share it.



[Bug 63215] Specific microsoft excel files(.xlsx - created using Microsoft Office 2016) getting corrupted while manipulated using apache-poi-3.10-FINAL libraries. Could you please let us know the root cause of this issue ?

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=63215

Andreas Beeker <ki...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |NEEDINFO

--- Comment #7 from Andreas Beeker <ki...@apache.org> ---
I've downloaded both your files and found a ton of differences, not to mention
that their file sizes differ by 400 KB.
So this is what I did for the *.rels, *.vml and *.xml files (shown for *.rels):

find . -name "*.rels" -exec xmllint --format {} --output {} \;
# -i edits each file in place (GNU sed); a "> {}" redirect is not run per file
find . -name "*.rels" -exec sed -i -e "s/ standalone=.yes.//" {} \;

Then use WinMerge or Meld to compare the files.
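
The normalize-and-compare step can also be scripted. The following is a minimal
Python sketch, not part of the original comment. It assumes Python 3.8+ (for
xml.etree.ElementTree.canonicalize); the names normalized_parts and
diff_packages are hypothetical. It canonicalizes every XML-like part of two
packages and reports which parts exist on one side only or differ in content.

```python
import zipfile
import xml.etree.ElementTree as ET

XML_SUFFIXES = (".rels", ".vml", ".xml")


def normalized_parts(xlsx_path):
    """Map each part name in the package to a comparable form of its content."""
    parts = {}
    with zipfile.ZipFile(xlsx_path) as package:
        for name in package.namelist():
            data = package.read(name)
            if name.lower().endswith(XML_SUFFIXES):
                try:
                    # Canonical XML drops the XML declaration (including
                    # standalone="yes") and sorts attributes, so formatting
                    # noise does not show up as a difference.
                    data = ET.canonicalize(xml_data=data, strip_text=True)
                except ET.ParseError:
                    pass  # not well-formed XML: fall back to the raw bytes
            parts[name] = data
    return parts


def diff_packages(left_path, right_path):
    """Return (only_in_left, only_in_right, differing) lists of part names."""
    left = normalized_parts(left_path)
    right = normalized_parts(right_path)
    only_left = sorted(set(left) - set(right))
    only_right = sorted(set(right) - set(left))
    differing = sorted(n for n in set(left) & set(right) if left[n] != right[n])
    return only_left, only_right, differing
```

Calling diff_packages("working.xlsx", "corrupted.xlsx") (file names assumed)
narrows the search to a list of part names, which can then be inspected in a
visual diff tool.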

> Only in-case of hyperlink, this issue is occurring i guess.

The guess can easily be validated by adding the line to the .rels file and
trying again ...
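
Adding a relationship line to a .rels part and repacking the file can be done
by hand or with a small script. The sketch below is an illustration only,
assuming Python 3.8+; the names add_relationship and rewrite_part, and the file
names in the trailing comment, are hypothetical. The attribute values in that
comment are the ones quoted in comment #5.

```python
import zipfile
import xml.etree.ElementTree as ET

RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
HYPERLINK_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"
)


def add_relationship(rels_xml, rel_id, rel_type, target, target_mode=None):
    """Return the .rels content (bytes) with one more Relationship appended."""
    ET.register_namespace("", RELS_NS)  # serialize without an ns0: prefix
    root = ET.fromstring(rels_xml)
    attrs = {"Id": rel_id, "Type": rel_type, "Target": target}
    if target_mode:
        attrs["TargetMode"] = target_mode
    ET.SubElement(root, "{%s}Relationship" % RELS_NS, attrs)
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


def rewrite_part(src_path, dst_path, part_name, transform):
    """Copy a package, passing the bytes of one part through transform()."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == part_name:
                data = transform(data)
            dst.writestr(info.filename, data)


# Hypothetical usage (file names are assumptions):
# rewrite_part(
#     "corrupted.xlsx", "patched.xlsx",
#     "xl/drawings/_rels/drawing1.xml.rels",
#     lambda data: add_relationship(
#         data, "rId1", HYPERLINK_TYPE, "javascript:", "External"),
# )
```

If the patched file opens in Excel without the repair prompt, the guess is
confirmed; if not, the difference lies elsewhere.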

If I wanted to find that difference, I would reduce the example files to the
bare minimum in which the error still occurs.

I'm just curious: why would you invest so much time in finding the reason for a
solved problem?



[Bug 63215] Specific microsoft excel files(.xlsx - created using Microsoft Office 2016) getting corrupted while manipulated using apache-poi-3.10-FINAL libraries. Could you please let us know the root cause of this issue ?

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=63215

Andreas Beeker <ki...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|NEW                         |RESOLVED

--- Comment #4 from Andreas Beeker <ki...@apache.org> ---
To find the root cause, I would first roughly identify the version when it
started to work, e.g. not working in 3.13, but working in 3.14.

Then, identify the change in the XML documents, i.e. unzip the .xlsx files
produced by 3.13 and 3.14 and compare their contents. Update the 3.13 version of
the .xlsx iteratively to match the 3.14 version until it works. Then remove all
other modifications until you know the minimum set of differences responsible
for the error.
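
The iterative update can be done one package part at a time. The following
Python sketch is an illustration under stated assumptions, not a POI tool: the
name transplant_parts and the file names in the usage note are hypothetical. It
copies the broken package and takes only the listed parts from the working one,
so that each candidate file can be opened in Excel to see whether the error is
gone.

```python
import zipfile


def transplant_parts(base_path, donor_path, out_path, part_names):
    """Copy base_path to out_path, taking the listed parts from donor_path.

    Listed parts that exist only in the donor package are appended.
    Names that the donor does not contain are ignored.
    """
    wanted = set(part_names)
    with zipfile.ZipFile(base_path) as base, \
            zipfile.ZipFile(donor_path) as donor, \
            zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as out:
        base_names = set(base.namelist())
        from_donor = wanted & set(donor.namelist())
        # Keep the entry order of the base package.
        for info in base.infolist():
            source = donor if info.filename in from_donor else base
            out.writestr(info.filename, source.read(info.filename))
        for name in sorted(from_donor - base_names):
            out.writestr(name, donor.read(name))
```

For example, transplant_parts("out-3.13.xlsx", "out-3.14.xlsx", "try1.xlsx",
["xl/drawings/_rels/drawing1.xml.rels"]) produces a file that differs from the
3.13 output in exactly one part. Halving the list of transplanted parts on each
round converges on the minimum set fairly quickly.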

Then you can use "git bisect" [1] plus a test driver that checks the affected
document part for the modifications. Or find the code that makes the
modification and check its commit log.

[1] https://git-scm.com/docs/git-bisect-lk2009.html
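
A test driver for "git bisect run" could look like the following Python sketch.
It is an illustration, not something from the POI project: the function names
and the script name check_part.py are hypothetical, and it assumes that an
earlier step of the bisect script has already built the checked-out revision
and used it to produce the output .xlsx. "git bisect run" reads exit code 0 as
the old state, 1 to 127 (except 125) as the new state, and 125 as "skip this
commit". Because the search here is for the commit that fixed the output, the
sketch returns 1 once the expected text is present, which fits a session
started with custom terms such as
"git bisect start --term-old=broken --term-new=fixed".

```python
import sys
import zipfile


def part_contains(xlsx_path, part_name, needle):
    """Return True if the named package part exists and contains needle."""
    with zipfile.ZipFile(xlsx_path) as package:
        if part_name not in package.namelist():
            return False
        return needle.encode("utf-8") in package.read(part_name)


def bisect_exit_code(xlsx_path, part_name, needle):
    """Map the check to an exit code for `git bisect run`.

    0   -> old state: the expected text is still missing
    1   -> new state: the expected text is present
    125 -> the output could not be read, so the commit is skipped
    """
    try:
        return 1 if part_contains(xlsx_path, part_name, needle) else 0
    except (OSError, zipfile.BadZipFile):
        return 125


if __name__ == "__main__" and len(sys.argv) == 4:
    # Hypothetical usage:
    #   python check_part.py out.xlsx <part name> <expected text>
    sys.exit(bisect_exit_code(sys.argv[1], sys.argv[2], sys.argv[3]))
```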



[Bug 63215] Specific microsoft excel files(.xlsx - created using Microsoft Office 2016) getting corrupted while manipulated using apache-poi-3.10-FINAL libraries. Could you please let us know the root cause of this issue ?

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=63215

--- Comment #6 from Sushmita Nag <sn...@opentext.com> ---
Also, when opening the corrupted .xlsx document, Excel shows the error message
below.

Excel completed file level validation and repair. Some parts of this workbook
may have been repaired or discarded.
Repaired Records: Drawing from /xl/drawings/drawing1.xml part (Drawing shape)



[Bug 63215] Specific microsoft excel files(.xlsx - created using Microsoft Office 2016) getting corrupted while manipulated using apache-poi-3.10-FINAL libraries. Could you please let us know the root cause of this issue ?

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=63215

Sushmita Nag <sn...@opentext.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 OS|                            |All

--- Comment #1 from Sushmita Nag <sn...@opentext.com> ---
Link for the sample .xlsx file.

https://drive.google.com/open?id=1I3k3TA82iVlCkX8fZe_vmvRwKikqK4Yq



[Bug 63215] Specific microsoft excel files(.xlsx - created using Microsoft Office 2016) getting corrupted while manipulated using apache-poi-3.10-FINAL libraries. Could you please let us know the root cause of this issue ?

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=63215

--- Comment #5 from Sushmita Nag <sn...@opentext.com> ---
Hi Andreas,

I unzipped the files and compared their contents. I could see a one-line
difference between the drawing1.xml.rels files of the working and non-working
.xlsx documents.

File path: Corrupted\xl\drawings\_rels\drawing1.xml.rels

Both the corrupted and the working file can be found at the link below:

https://drive.google.com/open?id=1DoOYmJRtwJVug0A_4PgcIuP7Kq1aRDnt


The line that is missing from drawing1.xml.rels in the corrupted file is:

<Relationship Id="rId1"
Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"
Target="javascript:" TargetMode="External"/>


I guess this issue occurs only in the case of a hyperlink, as in all the other
valid cases I could see only images.
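
One way to check whether a missing relationship line matters is to look for
relationship ids that the drawing part refers to but its .rels part does not
declare. The thread does not confirm that drawing1.xml refers to rId1, so the
following Python sketch only tests that hypothesis. The function names are
hypothetical, and it assumes that references are carried by attributes in the
officeDocument relationships namespace (such as r:id or r:embed).

```python
import zipfile
import xml.etree.ElementTree as ET

RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"


def rels_name_for(part_name):
    """xl/drawings/drawing1.xml -> xl/drawings/_rels/drawing1.xml.rels"""
    folder, _, leaf = part_name.rpartition("/")
    return (folder + "/" if folder else "") + "_rels/" + leaf + ".rels"


def dangling_relationship_ids(xlsx_path, part_name):
    """Return ids referenced by the part but missing from its .rels part."""
    with zipfile.ZipFile(xlsx_path) as package:
        part = ET.fromstring(package.read(part_name))
        declared = set()
        rels_name = rels_name_for(part_name)
        if rels_name in package.namelist():
            rels = ET.fromstring(package.read(rels_name))
            declared = {
                rel.get("Id")
                for rel in rels.iter("{%s}Relationship" % RELS_NS)
            }
    referenced = {
        value
        for element in part.iter()
        for name, value in element.attrib.items()
        # Attributes such as r:id or r:embed; empty values point nowhere.
        if name.startswith("{%s}" % R_NS) and value
    }
    return sorted(referenced - declared)
```

A non-empty result for "xl/drawings/drawing1.xml" in the corrupted file, and an
empty one in the working file, would support the hyperlink guess.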

Could you please let us know your thoughts on this ?

Regards,
Sushmita



[Bug 63215] Specific microsoft excel files(.xlsx - created using Microsoft Office 2016) getting corrupted while manipulated using apache-poi-3.10-FINAL libraries. Could you please let us know the root cause of this issue ?

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=63215

--- Comment #3 from Greg Woolsey <gw...@apache.org> ---
You can start by looking through the changelog [1], and examining issues that
sound relevant.  The SVN repository and its Git mirror are publicly available
for read access [2].  Commit comments and change diffs would be where you would
find the answer you are looking for.  I doubt you will get much more help than
that for a question about an open source release that is over five years old.
Also, without knowing exactly what manipulations are being done, there is no
way to tell what is causing corruption in that old version.  There have been a
huge number of changes, new features, bug fixes, optimizations, and security
adjustments made in the last 5 years.

[1] https://poi.apache.org/changes.html
[2] https://poi.apache.org/devel/subversion.html



[Bug 63215] Microsoft Excel files(.xlsx - created using Microsoft Office 2016) getting corrupted while manipulated

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=63215

Dominik Stadler <do...@gmx.at> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Specific microsoft excel    |Microsoft Excel files(.xlsx
                   |files(.xlsx - created using |- created using Microsoft
                   |Microsoft Office 2016)      |Office 2016) getting
                   |getting corrupted while     |corrupted while manipulated
                   |manipulated using           |
                   |apache-poi-3.10-FINAL       |
                   |libraries. Could you please |
                   |let us know the root cause  |
                   |of this issue ?             |



[Bug 63215] Specific microsoft excel files(.xlsx - created using Microsoft Office 2016) getting corrupted while manipulated using apache-poi-3.10-FINAL libraries. Could you please let us know the root cause of this issue ?

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=63215

Sushmita Nag <sn...@opentext.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|INVALID                     |---
             Status|RESOLVED                    |REOPENED



[Bug 63215] Microsoft Excel files(.xlsx - created using Microsoft Office 2016) getting corrupted while manipulated

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=63215

PJ Fanning <fa...@yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |RESOLVED
         Resolution|---                         |INFORMATIONPROVIDED
