Posted to dev@poi.apache.org by Mathieu Migout <mm...@gmail.com> on 2005/12/13 08:31:44 UTC

Powerpoint Embedded Objects

Hi!

  I am new to Jakarta POI.  I am developing an application where I
extract embedded objects from Word, Excel and PowerPoint.

But in PowerPoint, embedded objects are compressed. Does anyone know
the compression protocol?

  Very Urgent..

  Thanks In Advance

Mathieu

---------------------------------------------------------------------
To unsubscribe, e-mail: poi-dev-unsubscribe@jakarta.apache.org
Mailing List:    http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta POI Project: http://jakarta.apache.org/poi/


Re: Powerpoint Embedded Objects

Posted by Nick Burch <ni...@torchbox.com>.
On Tue, 13 Dec 2005, Mathieu Migout wrote:
>  - ExOleObjStg : A variable length container, which has LZW compressed
> data, which corresponds to the Istorage data for this ole object. The
> uncompressed data is a docfile, which contains the ole object data.

Sounds like PowerPoint might strip off some of the OLE stuff when it
embeds it, or possibly have had some more stuff wrapped around it.

Once you've decompressed it, try comparing it to the thing you originally
embedded. You might find that the decompressed stream starts part way into
the original (and hence stuff was stripped off when embedding), or that the
original starts part way into the decompressed stream (and hence extra
stuff was added).
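
The comparison described above could be sketched roughly as follows. This is only an illustration: the class name StreamCompare, the method names and the probe length are hypothetical, and it assumes both the original file and the decompressed stream fit in memory as byte arrays.

```java
import java.util.Arrays;

final class StreamCompare {
    // Naive search: first index at which needle occurs in haystack, or -1.
    static int indexOf(byte[] haystack, byte[] needle) {
        if (needle.length == 0) return 0;
        outer:
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) continue outer;
            }
            return i;
        }
        return -1;
    }

    // Looks for the first probeLen bytes of each stream inside the other one.
    static String describe(byte[] original, byte[] decompressed, int probeLen) {
        byte[] dHead = Arrays.copyOf(decompressed, Math.min(probeLen, decompressed.length));
        byte[] oHead = Arrays.copyOf(original, Math.min(probeLen, original.length));
        int dInO = indexOf(original, dHead);     // decompressed starts inside original?
        int oInD = indexOf(decompressed, oHead); // original starts inside decompressed?
        if (dInO == 0 || oInD == 0) return "same start";
        if (dInO > 0) return "stripped " + dInO;  // leading bytes were removed on embedding
        if (oInD > 0) return "added " + oInD;     // leading bytes were added on embedding
        return "no overlap found";
    }
}
```

A result of "stripped N" would mean the decompressed stream begins N bytes into the original; "added N" would mean N extra bytes sit in front of the original.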

Do let us know what you find, so we can include it in a future HSLF 
version

Nick



Powerpoint Embedded Objects

Posted by Mathieu Migout <mm...@gmail.com>.
---------- Forwarded message ----------
From: Mathieu Migout <mm...@gmail.com>
Date: 13 Dec 2005 12:18
Subject: Re: Powerpoint Embedded Objects
To: POI Developers List <po...@jakarta.apache.org>

Hi

I want to extract embedded objects (other Office documents) from a
PowerPoint file.
An embedded object is a record whose type is "4113".
I can extract the record. I remove the record header (8 bytes), then I
dump the byte stream to a file.
But neither WinZip nor WinRAR can open it.
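
For reference, stripping the 8-byte header might look like the sketch below. It assumes the commonly documented PowerPoint record header layout (a 16-bit version/instance word, a 16-bit type and a 32-bit length, all little-endian); that layout is not stated in this thread, and the class name PptRecord is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

final class PptRecord {
    final int version;    // low 4 bits of the first 16-bit word (assumed layout)
    final int instance;   // high 12 bits of the first 16-bit word (assumed layout)
    final int type;       // e.g. 4113 for ExOleObjStg
    final byte[] payload; // record body with the 8 header bytes removed

    private PptRecord(int version, int instance, int type, byte[] payload) {
        this.version = version;
        this.instance = instance;
        this.type = type;
        this.payload = payload;
    }

    // Parses one record that starts at the given offset in data.
    static PptRecord parse(byte[] data, int offset) {
        if (data.length - offset < 8) {
            throw new IllegalArgumentException("need at least 8 header bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(data, offset, 8).order(ByteOrder.LITTLE_ENDIAN);
        int verInst = buf.getShort() & 0xFFFF;
        int type = buf.getShort() & 0xFFFF;
        long length = buf.getInt() & 0xFFFFFFFFL;
        int start = offset + 8;
        if (length > data.length - start) {
            throw new IllegalArgumentException("record length exceeds available data");
        }
        byte[] payload = Arrays.copyOfRange(data, start, start + (int) length);
        return new PptRecord(verInst & 0x0F, verInst >>> 4, type, payload);
    }
}
```

Checking that the parsed type is 4113 and that the length field matches the number of bytes dumped is a cheap way to confirm the header was cut at the right place.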

I read in a PowerPoint file format description:

record type = 4045:
  - ExEmbedAtom
record type = 4035:
  - ExOleObjAtom
record type = 4113:
  - ExOleObjStg : A variable length container, which has LZW compressed
data, which corresponds to the Istorage data for this ole object. The
uncompressed data is a docfile, which contains the ole object data.

I tried decompressing it with an LZW algorithm but it didn't work. I don't
know if my LZW implementation is wrong or if the data is not LZW compressed.

thanks.

Mathieu


2005/12/13, Nick Burch <ni...@torchbox.com>:
>
> On Tue, 13 Dec 2005, Mathieu Migout wrote:
> > But in powerpoint, embedded object are compressed. Can any one know
> > compression protocole.
>
> I'd take a punt at either cab format or zip format, but I've never
> checked! You could try passing it through "java.util.zip.GZIPInputStream"
> and see if it comes out sensible, otherwise try dumping the byte stream to
> a file and seeing if something like winzip will open it
>
> What sort of objects are they, and how are you hoping to get them out? If
> we can figure out how they're compressed, I could probably extend HSLF to
> make this sort of thing easier (though probably not until I've finished
> the rich text stuff)
>
> Nick

Re: Powerpoint Embedded Objects

Posted by Nick Burch <ni...@torchbox.com>.
On Tue, 13 Dec 2005, Mathieu Migout wrote:
> But in powerpoint, embedded object are compressed. Can any one know 
> compression protocole.

I'd take a punt at either cab format or zip format, but I've never 
checked! You could try passing it through "java.util.zip.GZIPInputStream" 
and see if it comes out sensible, otherwise try dumping the byte stream to 
a file and seeing if something like winzip will open it
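
A quick way to run that experiment is sketched below. It tries java.util.zip.GZIPInputStream as suggested, and also java.util.zip.InflaterInputStream for a plain zlib stream, which is an extra guess not mentioned above. The class and method names are hypothetical; each probe returns null when the data is not in that format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.InflaterInputStream;

final class DecompressProbe {
    // Reads the stream to the end and returns everything it produced.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    // Returns the decompressed bytes, or null if data is not a gzip stream.
    static byte[] tryGzip(byte[] data) {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return readAll(in);
        } catch (IOException e) {
            return null;
        }
    }

    // Returns the decompressed bytes, or null if data is not a zlib stream.
    static byte[] tryZlib(byte[] data) {
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(data))) {
            return readAll(in);
        } catch (IOException e) {
            return null;
        }
    }
}
```

If both probes return null, the payload either uses another scheme or has extra bytes in front of the compressed stream.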

What sort of objects are they, and how are you hoping to get them out? If 
we can figure out how they're compressed, I could probably extend HSLF to 
make this sort of thing easier (though probably not until I've finished 
the rich text stuff)

Nick



Re: Powerpoint Embedded Objects

Posted by Mathieu Migout <mm...@gmail.com>.
Hi,

I tried standard compression; it didn't work.

For a small file, the compressed data is bigger than the original file. I
think there is other data in there as well.
It is perhaps LZW compression, but I can't decompress it.
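
If there is extra data in front of the compressed stream, one experiment is to try decompressing from each of the first few offsets and check whether the output begins with the OLE2 docfile signature (D0 CF 11 E0 A1 B1 1A E1). The sketch below does that for zlib only. That the data is zlib at all is a guess (the format description quoted in this thread says LZW), and the class name ZlibOffsetScan and the scan limit are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

final class ZlibOffsetScan {
    // Signature found at the start of every OLE2 compound document (docfile).
    static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Inflates data starting at offset; returns null if that is not a complete zlib stream.
    static byte[] inflateFrom(byte[] data, int offset) {
        Inflater inf = new Inflater();
        try {
            inf.setInput(data, offset, data.length - offset);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inf.finished()) {
                int n = inf.inflate(buf);
                if (n == 0 && !inf.finished()) {
                    return null; // truncated input or preset dictionary required
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            return null;
        } finally {
            inf.end();
        }
    }

    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    // Returns the first offset in [0, maxOffset] whose inflated output looks like a docfile, or -1.
    static int findDocfileOffset(byte[] data, int maxOffset) {
        for (int off = 0; off <= maxOffset && off < data.length; off++) {
            byte[] out = inflateFrom(data, off);
            if (out != null && startsWith(out, OLE2_MAGIC)) {
                return off;
            }
        }
        return -1;
    }
}
```

A non-negative result would say how many leading bytes to skip; -1 only means this particular guess did not pan out.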

Mathieu


2005/12/14, ponthiaux.eric <po...@wanadoo.fr>:
>
> is there another fat system inside :)
>
> did you try standard compression like zip or rar ?
>
> Mathieu Migout wrote:
>
> >Hi!
> >
> >  I am new to Jakarta POI.  I am developing an application where i
> >extract embedded object from Word, EXCEL and powerpoint.
> >
> >But in powerpoint, embedded object are compressed. Can any one know
> >compression protocole.
> >
> >  Very Urgent..
> >
> >  Thanks In Advance
> >
> >Mathieu

Re: Powerpoint Embedded Objects

Posted by "ponthiaux.eric" <po...@wanadoo.fr>.
Is there another FAT system inside? :)

Did you try standard compression like zip or rar?

Mathieu Migout wrote:

>Hi!
>
>  I am new to Jakarta POI.  I am developing an application where i
>extract embedded object from Word, EXCEL and powerpoint.
>
>But in powerpoint, embedded object are compressed. Can any one know
>compression protocole.
>
>  Very Urgent..
>
>  Thanks In Advance
>
>Mathieu