Posted to dev@poi.apache.org by Mathieu Migout <mm...@gmail.com> on 2005/12/13 08:31:44 UTC
Powerpoint Embedded Objects
Hi!
I am new to Jakarta POI. I am developing an application that
extracts embedded objects from Word, Excel and PowerPoint files.
In PowerPoint, however, the embedded objects are compressed. Does anyone
know the compression protocol?
Very urgent.
Thanks in advance,
Mathieu
---------------------------------------------------------------------
To unsubscribe, e-mail: poi-dev-unsubscribe@jakarta.apache.org
Mailing List: http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta POI Project: http://jakarta.apache.org/poi/
Re: Powerpoint Embedded Objects
Posted by Nick Burch <ni...@torchbox.com>.
On Tue, 13 Dec 2005, Mathieu Migout wrote:
> - ExOleObjStg: a variable-length container holding LZW-compressed
> data that corresponds to the IStorage data for this OLE object. The
> uncompressed data is a docfile, which contains the OLE object data.
Sounds like PowerPoint might strip off some of the OLE stuff when it
embeds the object, or possibly wraps some extra stuff around it.
Once you've decompressed it, try comparing it to the thing you originally
embedded. You might find that the decompressed stream starts part way into
the original (and hence stuff was stripped off when embedding), or that the
original starts part way into the decompressed stream (and hence extra
stuff was added).
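The comparison described above could be sketched roughly as follows. This is only an illustration: the class name StreamOffsetProbe, its methods and the probe length are hypothetical and not part of POI or HSLF. It takes the first few bytes of each stream and looks for them in the other one.

```java
import java.util.Arrays;

// Hypothetical helper (not part of POI): works out how two byte streams line up.
final class StreamOffsetProbe {

    /** Returns the first index at which needle occurs in haystack, or -1 if absent. */
    static int indexOf(byte[] haystack, byte[] needle) {
        if (needle.length == 0) {
            return 0;
        }
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            int j = 0;
            while (j < needle.length && haystack[i + j] == needle[j]) {
                j++;
            }
            if (j == needle.length) {
                return i;
            }
        }
        return -1;
    }

    /** Compares the first probeLen bytes of each stream against the other stream. */
    static String describe(byte[] original, byte[] decompressed, int probeLen) {
        byte[] origHead = Arrays.copyOf(original, Math.min(probeLen, original.length));
        byte[] decHead = Arrays.copyOf(decompressed, Math.min(probeLen, decompressed.length));
        int inDec = indexOf(decompressed, origHead);   // original found inside decompressed?
        int inOrig = indexOf(original, decHead);       // decompressed found inside original?
        if (inDec == 0 || inOrig == 0) {
            return "streams start identically";
        }
        if (inDec > 0) {
            return "decompressed has " + inDec + " extra leading bytes";
        }
        if (inOrig > 0) {
            return "decompressed is missing the first " + inOrig + " bytes of the original";
        }
        return "no overlap found";
    }
}
```

A probe length of 16 or 32 bytes is usually enough to avoid accidental matches; a result of "no overlap found" would suggest the data is still compressed or otherwise transformed.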
Do let us know what you find, so we can include it in a future HSLF
version.
Nick
Powerpoint Embedded Objects
Posted by Mathieu Migout <mm...@gmail.com>.
---------- Forwarded message ----------
From: Mathieu Migout <mm...@gmail.com>
Date: 13 Dec 2005 12:18
Subject: Re: Powerpoint Embedded Objects
To: POI Developers List <po...@jakarta.apache.org>
Hi
I want to extract embedded objects (other Office documents) from a
PowerPoint file.
An embedded object is a record whose type is 4113.
I can extract the record. I remove the record header (8 bytes) and dump the
remaining byte stream to a file.
But neither WinZip nor WinRAR can open it.
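As a rough sketch of the header-stripping step: a PowerPoint binary record begins with an 8-byte little-endian header (2 bytes of version and instance bits, 2 bytes of record type, 4 bytes of payload length). The class below is a hypothetical illustration, not the HSLF API, and it assumes the whole record is already in memory.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical holder for one parsed record (not a POI class).
final class PptRecord {
    static final int HEADER_SIZE = 8;

    final int version;    // low 4 bits of the first 16-bit field
    final int instance;   // high 12 bits of the first 16-bit field
    final int type;       // e.g. 4113 for ExOleObjStg
    final byte[] payload; // record body with the 8-byte header removed

    private PptRecord(int version, int instance, int type, byte[] payload) {
        this.version = version;
        this.instance = instance;
        this.type = type;
        this.payload = payload;
    }

    /** Splits a raw record into its header fields and payload. */
    static PptRecord parse(byte[] record) {
        if (record.length < HEADER_SIZE) {
            throw new IllegalArgumentException("record shorter than its header");
        }
        ByteBuffer buf = ByteBuffer.wrap(record).order(ByteOrder.LITTLE_ENDIAN);
        int verInst = buf.getShort() & 0xFFFF;
        int type = buf.getShort() & 0xFFFF;
        long length = buf.getInt() & 0xFFFFFFFFL;
        if (length > record.length - HEADER_SIZE) {
            throw new IllegalArgumentException("declared length exceeds available bytes");
        }
        byte[] payload = Arrays.copyOfRange(record, HEADER_SIZE, HEADER_SIZE + (int) length);
        return new PptRecord(verInst & 0x0F, verInst >>> 4, type, payload);
    }
}
```

Checking that the parsed type is 4113 and that the declared length matches the bytes actually dumped is a cheap way to rule out an off-by-a-few-bytes error before blaming the decompressor.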
I read the following in a PowerPoint file format description:
record type = 4045:
- ExEmbedAtom
record type = 4035:
- ExOleObjAtom
record type = 4113:
- ExOleObjStg: a variable-length container holding LZW-compressed
data that corresponds to the IStorage data for this OLE object. The
uncompressed data is a docfile, which contains the OLE object data.
I tried decompressing it with an LZW algorithm, but it didn't work. I don't
know whether my LZW implementation is wrong or the data is not LZW compressed.
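One way to tell whether a decompression attempt worked is to test the output for the docfile signature: OLE2 compound documents start with the eight bytes D0 CF 11 E0 A1 B1 1A E1. A minimal sketch follows; the class and method names are hypothetical.

```java
// Hypothetical helper (not part of POI): recognises the OLE2 compound document header.
final class DocfileCheck {
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    /** Returns true if data begins with the OLE2 (docfile) signature. */
    static boolean looksLikeDocfile(byte[] data) {
        if (data == null || data.length < OLE2_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < OLE2_MAGIC.length; i++) {
            if (data[i] != OLE2_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```

If the output of a candidate decompressor passes this check, it can be handed to POIFS for a proper parse; if not, either the algorithm or the starting offset is probably wrong.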
Thanks.
Mathieu
2005/12/13, Nick Burch <ni...@torchbox.com>:
>
> On Tue, 13 Dec 2005, Mathieu Migout wrote:
> > In PowerPoint, however, the embedded objects are compressed. Does anyone
> > know the compression protocol?
>
> I'd take a punt at either CAB format or ZIP format, but I've never
> checked! You could try passing it through "java.util.zip.GZIPInputStream"
> and see if it comes out sensible; otherwise try dumping the byte stream to
> a file and seeing if something like WinZip will open it.
>
> What sort of objects are they, and how are you hoping to get them out? If
> we can figure out how they're compressed, I could probably extend HSLF to
> make this sort of thing easier (though probably not until I've finished
> the rich text stuff).
>
> Nick
Re: Powerpoint Embedded Objects
Posted by Nick Burch <ni...@torchbox.com>.
On Tue, 13 Dec 2005, Mathieu Migout wrote:
> In PowerPoint, however, the embedded objects are compressed. Does anyone
> know the compression protocol?
I'd take a punt at either CAB format or ZIP format, but I've never
checked! You could try passing it through "java.util.zip.GZIPInputStream"
and see if it comes out sensible; otherwise try dumping the byte stream to
a file and seeing if something like WinZip will open it.
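A quick probe along these lines might look like the sketch below. It tries GZIP and plain zlib (deflate) decoding at byte offsets 0 and 4 of the record payload. Whether the payload really uses either format is not established in this thread, and the offset of 4 is only a guess that the compressed data might be preceded by a 4-byte length field. The class and method names are hypothetical; only the java.util.zip and java.io classes are real.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical probe (not part of POI): tries standard java.util.zip decoders.
final class CompressionProbe {

    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    /** Returns the GZIP-decoded bytes starting at offset, or null if decoding fails. */
    static byte[] tryGzip(byte[] data, int offset) {
        try (InputStream in = new GZIPInputStream(
                new ByteArrayInputStream(data, offset, data.length - offset))) {
            return readAll(in);
        } catch (IOException e) {
            return null;
        }
    }

    /** Returns the zlib-inflated bytes starting at offset, or null if decoding fails. */
    static byte[] tryZlib(byte[] data, int offset) {
        try (InputStream in = new InflaterInputStream(
                new ByteArrayInputStream(data, offset, data.length - offset))) {
            return readAll(in);
        } catch (IOException e) {
            return null;
        }
    }

    /** Reports the first format/offset pair that decodes to non-empty output. */
    static String probe(byte[] data) {
        for (int offset : new int[] {0, 4}) {   // 4 = assumed length prefix, a guess
            if (offset >= data.length) {
                continue;
            }
            byte[] gz = tryGzip(data, offset);
            if (gz != null && gz.length > 0) {
                return "gzip@" + offset;
            }
            byte[] zl = tryZlib(data, offset);
            if (zl != null && zl.length > 0) {
                return "zlib@" + offset;
            }
        }
        return "none";
    }
}
```

A result of "none" only rules out these two formats at these two offsets; it says nothing about LZW or CAB.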
What sort of objects are they, and how are you hoping to get them out? If
we can figure out how they're compressed, I could probably extend HSLF to
make this sort of thing easier (though probably not until I've finished
the rich text stuff).
Nick
Re: Powerpoint Embedded Objects
Posted by Mathieu Migout <mm...@gmail.com>.
Hi,
I tried standard compression; it didn't work.
For small files, the compressed data is bigger than the original file, so I
think there is other data in there.
It is perhaps LZW compression, but I can't decompress it.
Mathieu
2005/12/14, ponthiaux.eric <po...@wanadoo.fr>:
>
> Is there another FAT system inside? :)
>
> Did you try standard compression like ZIP or RAR?
>
> Mathieu Migout wrote:
>
> >Hi!
> >
> >I am new to Jakarta POI. I am developing an application that
> >extracts embedded objects from Word, Excel and PowerPoint files.
> >
> >In PowerPoint, however, the embedded objects are compressed. Does anyone
> >know the compression protocol?
> >
> >Very urgent.
> >
> >Thanks in advance,
> >
> >Mathieu
Re: Powerpoint Embedded Objects
Posted by "ponthiaux.eric" <po...@wanadoo.fr>.
Is there another FAT system inside? :)
Did you try standard compression like ZIP or RAR?
Mathieu Migout wrote:
>Hi!
>
>I am new to Jakarta POI. I am developing an application that
>extracts embedded objects from Word, Excel and PowerPoint files.
>
>In PowerPoint, however, the embedded objects are compressed. Does anyone
>know the compression protocol?
>
>Very urgent.
>
>Thanks in advance,
>
>Mathieu