Posted to user@poi.apache.org by AK <ak...@gmail.com> on 2012/09/10 08:01:10 UTC

Word 2003 Charts

Hi,

Can anyone tell me how to get the embedded chart data from a Word 2003 document?

Thanks,
AK

Re: Word 2003 Charts

Posted by AK <ak...@gmail.com>.
Thanks, Yegor.
I will try to get the data accordingly.

AK



On Mon, Sep 10, 2012 at 4:40 PM, Yegor Kozlov <ye...@dinom.ru> wrote:


Re: Word 2003 Charts

Posted by Yegor Kozlov <ye...@dinom.ru>.
The best way to explore the POIFS tree is
org.apache.poi.poifs.poibrowser.POIBrowser from poi-examples. This
utility provides a user-friendly UI to explore the OLE2 file systems
of binary office formats.

Once you know the id of the chart data stream, you can read it as follows:

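        import org.apache.poi.hssf.usermodel.HSSFWorkbook;
        import org.apache.poi.hwpf.HWPFDocument;
        import org.apache.poi.hwpf.usermodel.ObjectsPool;
        import org.apache.poi.poifs.filesystem.DirectoryNode;

        // dataStream is an InputStream over the .doc file
        HWPFDocument doc = new HWPFDocument(dataStream);
        // storage of embedded objects
        ObjectsPool objects = doc.getObjectsPool();

        // get a specific node for id="_1408793253"
        DirectoryNode embeddedDoc =
                (DirectoryNode) objects.getObjectById("_1408793253");
        // make sense of the embedded data
        HSSFWorkbook wb = new HSSFWorkbook(embeddedDoc, true);

        // note that you can save the embedded data to disk as a .xls
        // file and view it in Excel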

How to get chart ids programmatically is up to you. According to the
docs, the naming convention is "_" + objId, where objId is the name of
the OLE object storage in hex format.
The id can be retrieved from CharacterRun objects as follows:

String objId = "_" + characterRun.getPicOffset();

So to get the chart data you need to first identify the chart object in
the document, get its OLE id from the corresponding CharacterRun, and
then grab the data from the ObjectsPool.
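
As a small illustration of the naming step only, here is a sketch with no
POI dependency. The class and method names (ObjectIds, objectId,
objectIds) are made up for this example. In real code the int would come
from characterRun.getPicOffset(). Note that concatenating an int in Java
produces decimal digits, which is the form of the sample id
"_1408793253" above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of POI: builds ObjectsPool lookup keys
// from picture offsets using the "_" + offset convention described above.
final class ObjectIds {

    // "_" + int yields the decimal form, e.g. 1408793253 -> "_1408793253".
    static String objectId(int picOffset) {
        return "_" + picOffset;
    }

    // Convenience for several offsets, e.g. one per OLE character run.
    static List<String> objectIds(int... picOffsets) {
        List<String> ids = new ArrayList<>();
        for (int offset : picOffsets) {
            ids.add(objectId(offset));
        }
        return ids;
    }
}
```

Each resulting string can then be passed to objects.getObjectById(...) as
in the snippet earlier in this message.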

Hope it helps.

Yegor


On Mon, Sep 10, 2012 at 1:21 PM, AK <ak...@gmail.com> wrote:

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Word 2003 Charts

Posted by AK <ak...@gmail.com>.
Hi Yegor,

Yes, it is an embedded object. I am not able to get the proper information
about the chart. Could you elaborate a bit on how to traverse the tree and
get the data?

AK

On Mon, Sep 10, 2012 at 12:51 PM, Yegor Kozlov <ye...@dinom.ru> wrote:


Re: Word 2003 Charts

Posted by Yegor Kozlov <ye...@dinom.ru>.
Charts in HWPF are not yet supported. My hunch is that charts are
stored as nodes in the underlying POIFS tree. Walk it and see if that
is so.
Then grab the data, see what format it is (most likely xls) and make
sense of it.
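
For the "see what format it is" step, one rough check is possible without
POI if the data ends up as a standalone byte stream or a file on disk:
every OLE2 compound document (.doc, .xls, .ppt) starts with the same
eight-byte signature. The sketch below is only that check. The class name
Ole2Signature is made up for this example, and a match says the bytes are
an OLE2 container, not that they are specifically an Excel workbook.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper, not part of POI: tests for the OLE2 header signature.
final class Ole2Signature {

    // First eight bytes of an OLE2 compound document.
    private static final byte[] MAGIC = {
        (byte) 0xD0, (byte) 0xCF, (byte) 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, (byte) 0x1A, (byte) 0xE1
    };

    // True if the array is at least eight bytes long and starts with the signature.
    static boolean isOle2(byte[] data) {
        if (data == null || data.length < MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, MAGIC.length), MAGIC);
    }

    // Reads up to eight bytes from the stream and checks them.
    // The stream is consumed; wrap it or reopen it before parsing further.
    static boolean isOle2(InputStream in) throws IOException {
        byte[] header = new byte[MAGIC.length];
        int read = 0;
        while (read < header.length) {
            int n = in.read(header, read, header.length - read);
            if (n < 0) {
                break; // end of stream before eight bytes were available
            }
            read += n;
        }
        return read == header.length && isOle2(header);
    }
}
```

If the check passes, the data can be handed to POI for a closer look;
if it fails, the object is probably stored in some other format.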

Yegor
