Posted to dev@poi.apache.org by bu...@apache.org on 2008/08/05 14:45:06 UTC

DO NOT REPLY [Bug 45556] New: poi-3.5-beta1-20080718.jar - content from the foot notes of a 2007 docx document is not extracted.

https://issues.apache.org/bugzilla/show_bug.cgi?id=45556

           Summary: poi-3.5-beta1-20080718.jar - content from the foot notes
                    of a 2007 docx document is not extracted.
           Product: POI
           Version: unspecified
          Platform: PC
        OS/Version: Windows Server 2003
            Status: NEW
          Severity: normal
          Priority: P2
         Component: POI Overall
        AssignedTo: dev@poi.apache.org
        ReportedBy: xtrimxtrim@yahoo.fr


Created an attachment (id=22379)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=22379)
Contains JUnit test class and documents used for testing.

The text contained in the footnotes (the notes inserted at the end of a page)
of a Word 2007 document is not extracted.
The JUnit test class and the documents used for testing are attached.
We expected the words "testdoc" and "test phrase" to be extracted.

Notes on the attached documents:

- the documents "classic_FootNote.docx" and "form_FootNotes.docx" contain the
words "testdoc" and "test phrase" in the notes inserted at the end of a page of
the documents.
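
For readers who want to confirm that the footnote text really is present in
such a file, here is a minimal sketch that uses only the JDK, not POI. It
relies on the fact that a .docx file is a ZIP package whose footnotes live in
the part "word/footnotes.xml", with the visible text in w:t elements. The
class name FootnoteTextSketch and its method names are hypothetical, chosen
for illustration; this is not the attached JUnit test and not the approach
taken by any patch on this bug.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads footnote text straight from the .docx package.
final class FootnoteTextSketch {
    // WordprocessingML main namespace, used by the w:footnote and w:t elements.
    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns one line per non-empty footnote, or "" if the part is absent. */
    static String extractFootnoteText(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("word/footnotes.xml".equals(entry.getName())) {
                    return textOfFootnotes(readCurrentEntry(zip));
                }
            }
        }
        return "";
    }

    // Copies the current ZIP entry into memory so that the XML parser
    // cannot close the underlying ZIP stream.
    private static byte[] readCurrentEntry(InputStream in) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    private static String textOfFootnotes(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
        StringBuilder all = new StringBuilder();
        NodeList footnotes = doc.getElementsByTagNameNS(W_NS, "footnote");
        for (int i = 0; i < footnotes.getLength(); i++) {
            // Join the text of every run inside this footnote.
            NodeList texts = ((Element) footnotes.item(i)).getElementsByTagNameNS(W_NS, "t");
            StringBuilder one = new StringBuilder();
            for (int j = 0; j < texts.getLength(); j++) {
                one.append(texts.item(j).getTextContent());
            }
            // Separator footnotes carry no w:t text, so they are skipped.
            if (one.length() > 0) {
                all.append(one).append('\n');
            }
        }
        return all.toString();
    }
}
```

Given a stream over "classic_FootNote.docx", extractFootnoteText should
return text that includes "testdoc" and "test phrase" if the words are stored
as described above. A text extractor that never reads this part would miss
them, which matches the behaviour reported here.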


"TestUnitPoi35Filter.java" is the JUnit class.


-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 45556] poi-3.5-beta1-20080718.jar - content from the foot notes of a 2007 docx document is not extracted.

--- Comment #3 from Maxim Valyanskiy <ma...@gmail.com>  2009-07-14 06:06:41 PST ---
Created an attachment (id=23976)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=23976)
patch



DO NOT REPLY [Bug 45556] [PATCH] poi-3.5-beta1-20080718.jar - content from the foot notes of a 2007 docx document is not extracted.

--- Comment #8 from Maxim Valyanskiy <ma...@gmail.com>  2009-07-17 05:40:48 PST ---
Created an attachment (id=24004)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24004)
Additional patch for endnotes



DO NOT REPLY [Bug 45556] [PATCH] poi-3.5-beta1-20080718.jar - content from the foot notes of a 2007 docx document is not extracted.

--- Comment #9 from Maxim Valyanskiy <ma...@gmail.com>  2009-07-17 05:41:34 PST ---
Created an attachment (id=24005)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24005)
src/scratchpad/testcases/org/apache/poi/hwpf/data/A Nepalese name for Tilaka.docx



DO NOT REPLY [Bug 45556] poi-3.5-beta1-20080718.jar - content from the foot notes of a 2007 docx document is not extracted.

Maxim Valyanskiy <ma...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |max.valjanski@gmail.com






DO NOT REPLY [Bug 45556] [PATCH] poi-3.5-beta1-20080718.jar - content from the foot notes of a 2007 docx document is not extracted.

Maxim Valyanskiy <ma...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|poi-3.5-beta1-20080718.jar  |[PATCH]
                   |- content from the foot     |poi-3.5-beta1-20080718.jar
                   |notes of a 2007 docx        |- content from the foot
                   |document is not extracted.  |notes of a 2007 docx
                   |                            |document is not extracted.






DO NOT REPLY [Bug 45556] [PATCH] poi-3.5-beta1-20080718.jar - content from the foot notes of a 2007 docx document is not extracted.

--- Comment #4 from Maxim Valyanskiy <ma...@gmail.com>  2009-07-17 01:12:11 PST ---
Created an attachment (id=24000)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24000)
Additional patch that adds text extraction of footnotes in tables



DO NOT REPLY [Bug 45556] [PATCH] poi-3.5-beta1-20080718.jar - content from the foot notes of a 2007 docx document is not extracted.

Yegor Kozlov <ye...@dinom.ru> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED




--- Comment #10 from Yegor Kozlov <ye...@dinom.ru>  2009-07-18 02:44:59 PST ---
Patch applied to svn trunk with some minor tweaks. 

Thanks,
Yegor



DO NOT REPLY [Bug 45556] [PATCH] poi-3.5-beta1-20080718.jar - content from the foot notes of a 2007 docx document is not extracted.

--- Comment #6 from Yegor Kozlov <ye...@dinom.ru>  2009-07-17 02:32:34 PST ---
Maxim,

XWPFFootnote.java is missing from the patch. Please attach it; I'm going to
look into it this weekend.

Regards,
Yegor



DO NOT REPLY [Bug 45556] [PATCH] poi-3.5-beta1-20080718.jar - content from the foot notes of a 2007 docx document is not extracted.

--- Comment #5 from Maxim Valyanskiy <ma...@gmail.com>  2009-07-17 01:12:50 PST ---
Created an attachment (id=24001)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24001)
src/scratchpad/testcases/org/apache/poi/hwpf/data/Table.docx



DO NOT REPLY [Bug 45556] poi-3.5-beta1-20080718.jar - content from the foot notes of a 2007 docx document is not extracted.

--- Comment #1 from Maxim Valyanskiy <ma...@gmail.com>  2009-07-14 06:01:27 PST ---
I created a patch that adds text extraction for docx footnotes. Please review
my solution; I'm going to add endnote extraction in the same way.



DO NOT REPLY [Bug 45556] [PATCH] poi-3.5-beta1-20080718.jar - content from the foot notes of a 2007 docx document is not extracted.

--- Comment #7 from Maxim Valyanskiy <ma...@gmail.com>  2009-07-17 03:46:01 PST ---
Created an attachment (id=24003)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24003)
XWPFFootnote.java

oops :-)



DO NOT REPLY [Bug 45556] poi-3.5-beta1-20080718.jar - content from the foot notes of a 2007 docx document is not extracted.

--- Comment #2 from Maxim Valyanskiy <ma...@gmail.com>  2009-07-14 06:04:02 PST ---
Created an attachment (id=23975)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=23975)
src/scratchpad/testcases/org/apache/poi/hwpf/data/snoska.docx
