Posted to dev@poi.apache.org by bu...@apache.org on 2020/05/27 09:55:39 UTC
[Bug 64473] New: OPCPackage.open(fileName, PackageAccess.READ) does
not open valid xlsx file
https://bz.apache.org/bugzilla/show_bug.cgi?id=64473
Bug ID: 64473
Summary: OPCPackage.open(fileName, PackageAccess.READ) does not
open valid xlsx file
Product: POI
Version: 4.1.2-FINAL
Hardware: PC
Status: NEW
Severity: blocker
Priority: P2
Component: OPC
Assignee: dev@poi.apache.org
Reporter: berek@bk.ru
Target Milestone: ---
Created attachment 37268
--> https://bz.apache.org/bugzilla/attachment.cgi?id=37268&action=edit
corrupted file
Contents of the unpacked xlsx file and Apache POI
The initial problem is that the xlsx file cannot be opened through POI
(OPCPackage.open(fileName, PackageAccess.READ)), although it opens in Excel.
A closer look at POI showed that the problem lies in the contents of the
xlsx file.
If you unzip the xlsx file, the xl folder contains, in addition to all the other
files, two entries that cause the problem:
xl/metadata
xl/metadata.xml
When the POI method OPCPackage.open(fileName, PackageAccess.READ) is used, this
leads to an error:
org.apache.poi.openxml4j.exceptions.InvalidFormatException: You can't add a
part with a part name derived from another part ! [M1.11]
which is thrown in the PackagePartCollection.put method because the two file
names share the same base name.
If I copy the contents of the entire xlsx file into a newly created xlsx file
and save it, the xl/metadata file is no longer there and the file opens through
POI without problems.
However, my task is not just to fix the file; I need to figure out why this
problem arises.
It looks like a slightly incorrect xlsx, but I can still open it in Excel.
Is there any way to open it through POI?
Does anyone have an idea why xl/metadata appears in the contents of the xlsx?
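To check whether a given file has this layout without unzipping it by hand, the
entry names can be scanned directly. The following is a minimal sketch using
only java.util.zip; the class XlsxEntryScan and its methods are hypothetical
helpers for illustration, not part of POI. It reports pairs such as xl/metadata
and xl/metadata.xml, where one entry name is another entry's name plus an
extension.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.TreeSet;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of POI: finds ZIP entries whose name is
// another entry's name plus an extension, e.g. "xl/metadata" and
// "xl/metadata.xml".
class XlsxEntryScan {

    // Returns "shorter -> longer" pairs found among the given entry names.
    static List<String> findSuspectPairs(Iterable<String> entryNames) {
        TreeSet<String> names = new TreeSet<>();
        for (String name : entryNames) {
            names.add(name);
        }
        List<String> pairs = new ArrayList<>();
        for (String shorter : names) {
            for (String longer : names) {
                // The longer name only adds ".something" to the shorter one.
                if (longer.startsWith(shorter + ".")) {
                    pairs.add(shorter + " -> " + longer);
                }
            }
        }
        return pairs;
    }

    // Reads all entry names from a ZIP container such as an xlsx file.
    static List<String> entryNames(String path) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(path)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java XlsxEntryScan <file.xlsx>");
            return;
        }
        for (String pair : findSuspectPairs(entryNames(args[0]))) {
            System.out.println(pair);
        }
    }
}
```

For a list of names that contains both xl/metadata and xl/metadata.xml,
findSuspectPairs returns the single pair "xl/metadata -> xl/metadata.xml".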
--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org
[Bug 64473] [PATCH] OPCPackage.open(fileName, PackageAccess.READ)
does not open valid xlsx file
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64473
Yury <yu...@gmail.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|NEW |RESOLVED
[Bug 64473] OPCPackage.open(fileName, PackageAccess.READ) does not
open valid xlsx file
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64473
Eugene <be...@bk.ru> changed:
What |Removed |Added
----------------------------------------------------------------------------
OS| |All
--- Comment #1 from Eugene <be...@bk.ru> ---
Please also look at the specification. I found only the draft version, but I
think the differences from the final version are small.
https://www.ecma-international.org/activities/Office%20Open%20XML%20Formats/Draft%20ECMA-376%203rd%20edition,%20March%202011/Office%20Open%20XML%20Part%202%20-%20Open%20Packaging%20Conventions.pdf
Item 9.1.1.4, Part Naming:
A package implementer shall neither create nor recognize a part with a part name
derived from another part name by appending segments to it. [M1.11] [Example: If
a package contains a part named “/segment1/segment2/.../segmentn”, then other
parts in that package shall not have names such as: “/segment1”,
“segment1/segment2”, or “/segment1/segment2/.../segmentn-1”. end example]
But also look at the item:
9.1.1 Part Names
Each part has a name. Part names refer to parts within a package. [Example: The
part name “/hello/world/doc.xml” contains three segments: “hello”, “world”, and
“doc.xml”. The first two segments in the sample represent levels in the logical
hierarchy and serve to organize the parts of the package, whereas the
third contains actual content. Note that segments are not
explicitly represented as folders in the package model, and no directory of
folders exists in the package model. end example]
In this example, the segment “doc.xml” is the file name including its
extension, whereas in POI, in the class PackagePartCollection, the method
PackagePart put(final PackagePartName partName, final PackagePart part)
compares only the file names without their extensions, which is possibly a
mistake.
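The two quoted items can be read as a rule on whole segments. A minimal sketch
of that reading follows, assuming part names always start with a forward slash.
The class DerivedPartName and the part name /xl/metadata/item1.xml are made up
for illustration, and this is not POI's implementation.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: a segment-based reading of rule [M1.11]; not POI's code.
class DerivedPartName {

    // Splits "/hello/world/doc.xml" into ["hello", "world", "doc.xml"].
    // Assumes the name starts with "/".
    static List<String> segments(String partName) {
        return Arrays.asList(partName.substring(1).split("/"));
    }

    // True if 'candidate' is 'base' with one or more whole segments appended.
    static boolean isDerivedFrom(String candidate, String base) {
        List<String> c = segments(candidate);
        List<String> b = segments(base);
        return c.size() > b.size() && c.subList(0, b.size()).equals(b);
    }

    public static void main(String[] args) {
        // prints [hello, world, doc.xml]
        System.out.println(segments("/hello/world/doc.xml"));
        // prints true: a segment was appended to /xl/metadata
        System.out.println(isDerivedFrom("/xl/metadata/item1.xml", "/xl/metadata"));
        // prints false: "metadata.xml" is a different segment, not an appended one
        System.out.println(isDerivedFrom("/xl/metadata.xml", "/xl/metadata"));
    }
}
```

Under this reading, /xl/metadata.xml is not derived from /xl/metadata, because
the extension is part of the last segment rather than a segment of its own.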
[Bug 64473] [PATCH] OPCPackage.open(fileName, PackageAccess.READ)
does not open valid xlsx file
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64473
--- Comment #9 from PJ Fanning <fa...@yahoo.com> ---
Hi Simone - we need a reproducible test case to debug this or you can try
debugging yourself. Can you open a new issue? We fixed Yury's problem with this
issue - so it is best to track any similar issues with a new bugzilla issue.
[Bug 64473] OPCPackage.open(fileName, PackageAccess.READ) does not
open valid xlsx file
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64473
--- Comment #6 from Yury <yu...@gmail.com> ---
Created attachment 37964
--> https://bz.apache.org/bugzilla/attachment.cgi?id=37964&action=edit
[PATCH] for fixing the issue
created by the following command:
ant -f patch.xml
[Bug 64473] [PATCH] OPCPackage.open(fileName, PackageAccess.READ)
does not open valid xlsx file
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64473
--- Comment #8 from Simone D'Avico <si...@gmail.com> ---
I see the same error occur with poi 5.1.0 and poi-ooxml 5.1.0. The xlsx file I
am trying to open indeed contains both metadata and metadata.xml.
Is there any way I can help troubleshoot this?
[Bug 64473] [PATCH] OPCPackage.open(fileName, PackageAccess.READ)
does not open valid xlsx file
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64473
--- Comment #10 from PJ Fanning <fa...@yahoo.com> ---
Giving us a file that reproduces the issue would be the main step towards
debugging the problem.
[Bug 64473] OPCPackage.open(fileName, PackageAccess.READ) does not
open valid xlsx file
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64473
--- Comment #2 from PJ Fanning <fa...@yahoo.com> ---
It's possible we'll change POI code but the next release could be weeks away.
It's worth investigating where your xlsx file came from to find out why its
contents are not standard.
[Bug 64473] OPCPackage.open(fileName, PackageAccess.READ) does not
open valid xlsx file
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64473
Nail Samatov <sa...@yandex.ru> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |sanail@yandex.ru
--- Comment #3 from Nail Samatov <sa...@yandex.ru> ---
Created attachment 37929
--> https://bz.apache.org/bugzilla/attachment.cgi?id=37929&action=edit
Zip file with files to reproduce the bug
We have the same issue.
I tried to find steps that produce such files, which Apache POI can't read.
Pre-requisites:
Excel from MS Office 365
files 1.xlsx and 2.xlsx (you can find them in the attached zip file).
1.xlsx contains "xl/metadata" and 2.xlsx contains "xl/metadata.xml"
Steps:
1. Open 1.xlsx in Excel
2. Open 2.xlsx in Excel
3. Right click on the worksheet tab and select Move or Copy.
4. Select the 1.xlsx option at the To Book drop-down list.
5. Press OK.
6. Save 1.xlsx.
After saving, 1.xlsx contains both xl/metadata and xl/metadata.xml.
You can find the result of the steps above in the folder "result-of-merge" in
the same attached zip file. This file can't be read by POI but can be opened in
Excel.
[Bug 64473] [PATCH] OPCPackage.open(fileName, PackageAccess.READ)
does not open valid xlsx file
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64473
--- Comment #7 from PJ Fanning <fa...@yahoo.com> ---
Thanks Yury - merged with r1891692
[Bug 64473] OPCPackage.open(fileName, PackageAccess.READ) does not
open valid xlsx file
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64473
--- Comment #4 from yurkom <yu...@gmail.com> ---
The issue appeared after https://bz.apache.org/bugzilla/show_bug.cgi?id=61942
ticket in revision 1819708.
I think the dot symbol in the regexp is unnecessary in this line:
"(?=["+PackagingURIHelper.FORWARD_SLASH_STRING+".])";
                                                ^
                                                this dot
See
https://svn.apache.org/viewvc/poi/trunk/poi-ooxml/src/main/java/org/apache/poi/openxml4j/opc/PackagePartCollection.java?revision=1819708&view=markup#l64
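The effect of the dot can be seen with plain String.split. This is a simplified
sketch, not the POI source: it assumes that FORWARD_SLASH_STRING is "/" and
that the check compares every accumulated prefix of the split name against the
part names already registered. The class PartNamePrefixes is hypothetical.
Java 8 or later is assumed, where a zero-width match at the start of the input
produces no empty leading element.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration, not the POI source: split a part name at a
// lookahead delimiter and collect the accumulated prefixes.
class PartNamePrefixes {

    static List<String> prefixes(String partName, String delimiterRegex) {
        List<String> result = new ArrayList<>();
        StringBuilder prefix = new StringBuilder();
        // The lookahead keeps the delimiter character at the start of each piece.
        for (String piece : partName.split(delimiterRegex)) {
            prefix.append(piece);
            result.add(prefix.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        String name = "/xl/metadata.xml";
        // Splitting at "/" and ".": prints [/xl, /xl/metadata, /xl/metadata.xml]
        System.out.println(prefixes(name, "(?=[/.])"));
        // Splitting at "/" only: prints [/xl, /xl/metadata.xml]
        System.out.println(prefixes(name, "(?=[/])"));
    }
}
```

With the dot in the character class, /xl/metadata shows up as a prefix of
/xl/metadata.xml, which would collide with an existing /xl/metadata part.
Without the dot, only /xl and the full name are produced.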
[Bug 64473] [PATCH] OPCPackage.open(fileName, PackageAccess.READ)
does not open valid xlsx file
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64473
Yury <yu...@gmail.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|OPCPackage.open(fileName, |[PATCH]
|PackageAccess.READ) does |OPCPackage.open(fileName,
|not open valid xlsx file |PackageAccess.READ) does
| |not open valid xlsx file
[Bug 64473] OPCPackage.open(fileName, PackageAccess.READ) does not
open valid xlsx file
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64473
Yury <yu...@gmail.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Depends on| |61942
--- Comment #5 from Yury <yu...@gmail.com> ---
The issue appeared after https://bz.apache.org/bugzilla/show_bug.cgi?id=61942
ticket in revision 1819708.
I think the dot symbol in the regexp is unnecessary in this line:
"(?=["+PackagingURIHelper.FORWARD_SLASH_STRING+".])";
                                                ^
                                                this dot
See
https://svn.apache.org/viewvc/poi/trunk/poi-ooxml/src/main/java/org/apache/poi/openxml4j/opc/PackagePartCollection.java?revision=1819708&view=markup#l64
Referenced Bugs:
https://bz.apache.org/bugzilla/show_bug.cgi?id=61942
[Bug 61942] Refactor PackagePartName handling and add getUnusedPartIndex method