Posted to dev@poi.apache.org by bu...@apache.org on 2007/03/23 01:31:02 UTC

DO NOT REPLY [Bug 41933] New: - PicturesTable#getAllPictures() sometimes loses images

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=41933>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=41933

           Summary: PicturesTable#getAllPictures() sometimes loses images
           Product: POI
           Version: 3.0-dev
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: major
          Priority: P2
         Component: HWPF
        AssignedTo: poi-dev@jakarta.apache.org
        ReportedBy: trejkaz@trypticon.org


For a constructed document containing four embedded images,
PicturesTable#getAllPictures() returns only two of them.

On debugging, it turns out that the pos offset becomes incorrect at the third
image -- the block type at that position is not recognised as an image type, and
the skipOn value is too big to be realistic.

-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

---------------------------------------------------------------------
To unsubscribe, e-mail: poi-dev-unsubscribe@jakarta.apache.org
Mailing List:    http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta POI Project: http://jakarta.apache.org/poi/


DO NOT REPLY [Bug 41933] - PicturesTable#getAllPictures() sometimes loses images

Posted by bu...@apache.org.

------- Additional Comments From trejkaz@trypticon.org  2007-03-22 17:55 -------
Here's a bit of analysis of the data stream:

Start of first block (an image):

00000000 E6 0D 07 00 44 00 64 00 00 00 00 00 00 00 08 00 ....D.d.........

Start of second block (an image):

0 + 70DE6 = 70DE6
                 `----------.
                            V
00070DE0 4E 44 AE 42 60 82 CB F3 07 00 44 00 64 00 00 00 ND.B`.....D.d...

Start of third block, which seems to be wrong:

70DE6 + 7F3CB = F01B1
             ,----'
            v
000F01B0 82 9A 00 16 24 01 17 24 01 49 66 01 00 00 00 01 ....$..$.If.....
000F01C0 96 6C 00 21 76 00 02 68 01 35 D6 05 00 01 03 A1 .l.!v..h.5......
000F01D0 11 35 D6 05 01 02 03 0D 12 23 76 00 01 A1 11 23 .5.......#v....#
000F01E0 76 01 02 0D 12 3A 56 0B 00 02 96 6C 00 07 94 CE v....:V....l....
000F01F0 0D 0A 74 00 00 A0 04 13 D6 30 00 00 00 00 04 01 ..t......0......
000F0200 00 00 00 00 00 00 04 01 00 00 00 00 00 00 04 01 ................
000F0210 00 00 00 00 00 00 04 01 00 00 00 00 00 00 04 01 ................
000F0220 00 00 00 00 00 00 04 01 00 00 14 F6 01 00 00 15 ................
000F0230 36 01 35 D6 05 00 01 03 A1 11 35 D6 05 01 02 03 6.5.......5.....

           ACTUAL IMAGE BLOCK HEADER IS HERE----.
                                                v
000F0240 0D 12 61 F6 03 6C 00 79 74 40 0F 7D 00 D1 E0 01 ..a..l.yt@.}....
000F0250 00 44 00 64 00 00 00 00 00 00 00 08 00 00 00 00 .D.d............
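The offset arithmetic above can be reproduced with a minimal sketch. It assumes that each block starts with a 4-byte little-endian length and that the next block is expected at the current offset plus that length, which is how the dump reads. The class and method names (BlockChain, readLength, nextOffset) are hypothetical and are not POI's code; the byte values are copied from the dump, and the position of the real third header (row 0xF0240, column 0xD) is read off the arrow in the dump.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class BlockChain {
    /** Reads a 4-byte little-endian integer starting at pos. */
    public static int readLength(byte[] data, int pos) {
        return ByteBuffer.wrap(data, pos, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    /** Expected offset of the next block, assuming blocks lie back to back. */
    public static int nextOffset(int pos, int length) {
        return pos + length;
    }

    public static void main(String[] args) {
        // First four bytes of the first block (offset 0x00000000 in the dump).
        byte[] first = {(byte) 0xE6, 0x0D, 0x07, 0x00};
        // Four bytes at offset 0x70DE6 in the dump (start of the second block).
        byte[] second = {(byte) 0xCB, (byte) 0xF3, 0x07, 0x00};

        int secondStart = nextOffset(0, readLength(first, 0));
        int thirdStart = nextOffset(secondStart, readLength(second, 0));

        System.out.println(Integer.toHexString(secondStart).toUpperCase());
        System.out.println(Integer.toHexString(thirdStart).toUpperCase());

        // Where the dump marks the real third header (an assumption read
        // from the arrow position): row 0xF0240, byte column 0xD.
        int actualThird = 0xF0240 + 0xD;
        System.out.println(actualThird - thirdStart); // gap in bytes
    }
}
```

Run as-is, this prints 70DE6, F01B1 and 156: under these assumptions the computed third offset falls 156 bytes short of the header marked in the dump, so simply chaining block lengths does not reach the third image.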



DO NOT REPLY [Bug 41933] - PicturesTable#getAllPictures() sometimes loses images

Posted by bu...@apache.org.

------- Additional Comments From trejkaz@trypticon.org  2007-04-18 18:43 -------
Okay, the short answer is that the pictures table alone isn't enough.  The code
will need to read every CharacterRun, pull out the picture offsets, and then
remove duplicates (and probably sort the list too, so that the pictures stay in
the same order and nothing that already calls getAllPictures() breaks).
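
The "remove duplicates and sort" step can be sketched on plain integers, independent of POI. This is only an illustration: the class name PictureOffsets, the method distinctSorted, and the sample offsets are all hypothetical. A TreeSet does both jobs at once, since it discards repeated values and iterates in ascending order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class PictureOffsets {
    /** Returns the distinct offsets in ascending order. */
    public static List<Integer> distinctSorted(List<Integer> offsets) {
        // TreeSet drops duplicates and keeps its elements sorted.
        return new ArrayList<Integer>(new TreeSet<Integer>(offsets));
    }

    public static void main(String[] args) {
        // Hypothetical per-run offsets: two runs point at the same picture.
        List<Integer> perRun = Arrays.asList(0x70DE6, 0x0, 0xF024D, 0x70DE6);
        for (int offset : distinctSorted(perRun)) {
            System.out.println(Integer.toHexString(offset).toUpperCase());
        }
    }
}
```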



DO NOT REPLY [Bug 41933] - PicturesTable#getAllPictures() sometimes loses images

Posted by bu...@apache.org.

------- Additional Comments From trejkaz@trypticon.org  2007-04-18 19:44 -------
Here's a workaround for anyone else who needs to get all pictures out.  I'm not
sure if this is actually the best way to implement getAllPictures() because it
requires access to the whole document.

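  import java.util.ArrayList;
  import java.util.List;
  import java.util.SortedMap;
  import java.util.TreeMap;

  import org.apache.poi.hwpf.HWPFDocument;
  import org.apache.poi.hwpf.model.PicturesTable;
  import org.apache.poi.hwpf.usermodel.CharacterRun;
  import org.apache.poi.hwpf.usermodel.Picture;
  import org.apache.poi.hwpf.usermodel.Range;

  HWPFDocument document = ...

  // Runs keyed by picture offset: the map drops duplicate offsets and
  // keeps the entries sorted by offset.
  SortedMap<Integer, CharacterRun> runs = new TreeMap<Integer, CharacterRun>();

  PicturesTable picturesTable = document.getPicturesTable();
  Range wholeDocument = document.getRange();
  int runCount = wholeDocument.numCharacterRuns();
  for (int i = 0; i < runCount; i++) {
    CharacterRun run = wholeDocument.getCharacterRun(i);
    if (picturesTable.hasPicture(run)) {
      int picOffset = run.getPicOffset();
      // Keep only the first run seen for each offset.
      if (!runs.containsKey(picOffset)) {
        runs.put(picOffset, run);
      }
    }
  }

  // Extract one Picture per distinct offset, in offset order.
  List<Picture> allPictures = new ArrayList<Picture>();
  for (CharacterRun run : runs.values()) {
    allPictures.add(picturesTable.extractPicture(run, false));
  }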




DO NOT REPLY [Bug 41933] - PicturesTable#getAllPictures() sometimes loses images

Posted by bu...@apache.org.

------- Additional Comments From trejkaz@trypticon.org  2007-03-22 17:35 -------
Tried to attach the sample document but the bug tracker won't let me because
it's too big.

Here's a link:
http://trypticon.org/files/four-images.doc




DO NOT REPLY [Bug 41933] - PicturesTable#getAllPictures() sometimes loses images

Posted by bu...@apache.org.

nick@torchbox.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO




------- Additional Comments From nick@torchbox.com  2007-12-04 04:23 -------
In the absence of any other fixes, shall we go ahead and commit this?

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org