Posted to dev@poi.apache.org by bu...@apache.org on 2015/07/21 09:08:32 UTC

[Bug 58159] New: getHeaderText() and getFooterText() duplicate text in sheet.getTextRuns()

https://bz.apache.org/bugzilla/show_bug.cgi?id=58159

            Bug ID: 58159
           Summary: getHeaderText() and getFooterText() duplicate text in
                    sheet.getTextRuns()
           Product: POI
           Version: 3.9-FINAL
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HSLF
          Assignee: dev@poi.apache.org
          Reporter: luke.quinane@gmail.com

Created attachment 32917
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=32917&action=edit
sample where text is duplicated

We are trying to write a text extractor which will convert a PPT to text, and
we've noticed that if we only get the text from the sheet's text runs, header
and footer content is sometimes missing. If we add calls to getHeaderText()
and getFooterText(), then for some items the text is duplicated, because it
already appears in the text runs.

Can we change this behaviour to always return the header/footer text in the
runs, or to remove the duplication?

Thanks!
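
A workaround on the caller's side can be sketched as follows. This is only a
minimal illustration, assuming the extractor has already collected the text
runs, the header text and the footer text as plain strings; SlideTextMerger
and its methods are hypothetical names and are not part of POI. Header and
footer text is appended only when an identical string has not already been
seen in the text runs.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of POI: merges text-run text with
// header/footer text and skips header/footer strings already in the runs.
class SlideTextMerger {

    static List<String> merge(List<String> runTexts, String headerText, String footerText) {
        List<String> out = new ArrayList<>();
        Set<String> seen = new HashSet<>();

        // Keep every non-empty text run, in order, including repeated runs.
        for (String run : runTexts) {
            String normalized = normalize(run);
            if (normalized != null) {
                out.add(normalized);
                seen.add(normalized);
            }
        }

        // Append header and footer only if the same text was not already emitted.
        for (String extra : new String[] { headerText, footerText }) {
            String normalized = normalize(extra);
            if (normalized != null && seen.add(normalized)) {
                out.add(normalized);
            }
        }
        return out;
    }

    // Returns the trimmed text, or null when there is nothing to emit.
    private static String normalize(String text) {
        if (text == null) {
            return null;
        }
        String trimmed = text.trim();
        return trimmed.isEmpty() ? null : trimmed;
    }
}
```

Comparing exact strings is a crude test: it would also drop a header that
happens to match unrelated body text, so it only hides the symptom described
above.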

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 58159] getHeaderText() and getFooterText() duplicate text in sheet.getTextRuns()

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58159

sits <da...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |david.sitsky@gmail.com



[Bug 58159] getHeaderText() and getFooterText() duplicate text in sheet.getTextRuns()

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58159

Nick Burch <ap...@gagravarr.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #1 from Nick Burch <ap...@gagravarr.org> ---
3.9 is rather old. What happens if you try with 3.12 or, better yet, the 3.13
beta 1 release, which is currently syncing out to all the mirrors?



[Bug 58159] getHeaderText() and getFooterText() duplicate text in sheet.getTextRuns()

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58159

--- Comment #4 from Andreas Beeker <ki...@apache.org> ---
Created attachment 33366
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=33366&action=edit
Adding common placeholder getter



[Bug 58159] getHeaderText() and getFooterText() duplicate text in sheet.getTextRuns()

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58159

--- Comment #2 from Luke Quinane <lu...@gmail.com> ---
Hi Nick,

We've retested with 3.13-beta1-20150723 and it has the same problem.

Cheers, Luke



[Bug 58159] getHeaderText() and getFooterText() duplicate text in sheet.getTextRuns()

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58159

Andreas Beeker <ki...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Hardware|PC                          |All
                 OS|Windows NT                  |All
           Keywords|                            |PatchAvailable

--- Comment #3 from Andreas Beeker <ki...@apache.org> ---
The patch adds a getter/setter for Placeholder, so duplicate text shapes can
be easily identified.
Apart from that, it also contains (a lot of) related changes, which I've fixed
in one go, i.e.:
- an HSLF-specific Escher client data record, for easier retrieval of child
records
- a RecordTypes enum, to minimize ambiguities between RecordTypes and the
actual Record
- the fix for #56570

I'll apply it after POI 3.14-Beta1 is out
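
How a placeholder getter helps an extractor can be sketched roughly as
follows. This is not the patch's code and does not use POI classes: Kind and
TextShape are hypothetical stand-ins for a placeholder type and a text shape
that exposes it. The idea is that shapes marked as header or footer
placeholders can be skipped while collecting body text, so their text is
emitted only once, from the header/footer getters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins, not POI classes: they only model the idea of a
// text shape that reports which placeholder (if any) it represents.
class PlaceholderFilter {

    enum Kind { NONE, TITLE, BODY, HEADER, FOOTER }

    static final class TextShape {
        final Kind placeholder;
        final String text;

        TextShape(Kind placeholder, String text) {
            this.placeholder = placeholder;
            this.text = text;
        }
    }

    // Collects text from all shapes except header and footer placeholders.
    static List<String> bodyText(List<TextShape> shapes) {
        List<String> out = new ArrayList<>();
        for (TextShape shape : shapes) {
            if (shape.placeholder == Kind.HEADER || shape.placeholder == Kind.FOOTER) {
                continue; // emitted separately via the header/footer text
            }
            out.add(shape.text);
        }
        return out;
    }
}
```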



[Bug 58159] getHeaderText() and getFooterText() duplicate text in sheet.getTextRuns()

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58159

Andreas Beeker <ki...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |56570



[Bug 58159] getHeaderText() and getFooterText() duplicate text in sheet.getTextRuns()

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58159

Luke Quinane <lu...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW



[Bug 58159] getHeaderText() and getFooterText() duplicate text in sheet.getTextRuns()

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58159

Andreas Beeker <ki...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #5 from Andreas Beeker <ki...@apache.org> ---
Patch applied via r1722476
