Posted to user@poi.apache.org by Yury Batrakov <ba...@gmail.com> on 2008/04/01 08:17:42 UTC

Re: Microsoft Graph objects

Nick, Yegor,

Thanks for your replies. I've also tried to get charts from Word: they
are stored as a Workbook stream in the OLE file system, but the storage
contains neither \005DocumentSummaryInformation nor
\005SummaryInformation. I commented out readProperties() in the
HSSFWorkbook constructor, but it still failed to construct an Excel
spreadsheet from the stream. Is it reasonable for me to continue
investigating in this direction, or are these objects not Excel at all?


On 3/31/08, Yegor Kozlov <ye...@dinom.ru> wrote:
>
>
>  In Excel, MS Graph and Organization Chart are intrinsic objects. Basically you need to
>  iterate through the worksheet records and look for the appropriate ones
>  (ChartRecord, ChartTitleFormatRecord, ChartFormatRecord, etc.).
>
>  In HSLF, MS Graph is an embedded OLE object. You can get it in raw
>  format using HSLFSlideShow.getEmbeddedObjects(). For now that's all;
>  we don't have a high-level API for it.
>  An Organization Chart is just a group of shapes. It should be accessible
>  using Slide.getShapes(). There should be a flag indicating that the
>  group is an Organization Chart, but I haven't found it.
>
>
>  Regards,
>
> Yegor
>
>
>  > Is there any way to extract MS Graph objects from Office documents using POI?
>  > First of all I'm interested in MS Graph charts, Organisation charts and WordArt.
>
>  > It looks like HSLF can extract these objects, but the other components cannot. Am I right?
>  > Are there plans to support them in other formats?
>
>
> > ---------------------------------------------------------------------
>  > To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>  > For additional commands, e-mail: user-help@poi.apache.org


Re: Re[2]: Microsoft Graph objects

Posted by Yury Batrakov <ba...@gmail.com>.
It looks like it is Escher: the chart can be extracted and saved with
getPicturesTable() and related methods, but this code:

       // modified version, tested ok on Word and Excel embeds
       InputStream stream = fs.createDocumentInputStream(dir, workbookName);
       EventRecordFactory factory = new EventRecordFactory();
       List records = RecordFactory.createRecords(stream);

causes:
Exception in thread "main" java.lang.RuntimeException: org.apache.poi.hssf.record.RecordFormatException: Unable to construct record instance
	at ru.mera.ofa.ReadOLE$MyPOIFSReaderListener.processPOIFSReaderEvent(ReadOLE.java:118)
	at org.apache.poi.poifs.eventfilesystem.POIFSReader.processProperties(POIFSReader.java:261)
	at org.apache.poi.poifs.eventfilesystem.POIFSReader.processProperties(POIFSReader.java:230)
	at org.apache.poi.poifs.eventfilesystem.POIFSReader.processProperties(POIFSReader.java:230)
	at org.apache.poi.poifs.eventfilesystem.POIFSReader.read(POIFSReader.java:97)
	at ru.mera.ofa.ReadOLE.main(ReadOLE.java:84)
Caused by: org.apache.poi.hssf.record.RecordFormatException: Unable to construct record instance
	at org.apache.poi.hssf.record.RecordFactory.createRecord(RecordFactory.java:199)
	at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:117)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:285)
	at ru.mera.ofa.ReadOLE$MyPOIFSReaderListener.processPOIFSReaderEvent(ReadOLE.java:111)
	... 5 more
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
	at java.lang.reflect.Constructor.newInstance(Unknown Source)
	at org.apache.poi.hssf.record.RecordFactory.createRecord(RecordFactory.java:187)
	... 8 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
	at org.apache.poi.hssf.record.RecordInputStream.checkRecordPosition(RecordInputStream.java:132)
	at org.apache.poi.hssf.record.RecordInputStream.readShort(RecordInputStream.java:152)
	at org.apache.poi.hssf.record.WindowOneRecord.fillFields(WindowOneRecord.java:94)
	at org.apache.poi.hssf.record.Record.<init>(Record.java:55)
	at org.apache.poi.hssf.record.WindowOneRecord.<init>(WindowOneRecord.java:76)
	... 13 more

I'll try org.apache.poi.ddf.EscherDump, thank you!
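
The innermost cause in the trace above is an ArrayIndexOutOfBoundsException
thrown while WindowOneRecord.fillFields reads a short, which may mean the
WINDOW1 record (sid 0x003D) in this stream is shorter than the layout POI
expects. A minimal sketch for checking that, assuming the stream bytes have
already been read into an array: it walks the generic BIFF record framing
(2-byte little-endian sid, 2-byte little-endian length, then the payload) and
lists each header without parsing any payload. It uses only the JDK, not POI;
the class BiffHeaderDump, its Header type and the sample bytes are
hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Lists BIFF record headers in a byte array without interpreting payloads. */
class BiffHeaderDump {

    /** One record header: where it starts, its sid and its declared payload length. */
    static final class Header {
        final int offset;
        final int sid;
        final int length;
        final boolean truncated; // true if the declared payload runs past the end of the data

        Header(int offset, int sid, int length, boolean truncated) {
            this.offset = offset;
            this.sid = sid;
            this.length = length;
            this.truncated = truncated;
        }

        @Override
        public String toString() {
            return String.format("offset=%d sid=0x%04X len=%d%s",
                    offset, sid, length, truncated ? " (truncated)" : "");
        }
    }

    /** Reads an unsigned 16-bit little-endian value at pos. */
    static int readUShortLE(byte[] data, int pos) {
        return (data[pos] & 0xFF) | ((data[pos + 1] & 0xFF) << 8);
    }

    /** Walks the array header by header and stops at the first truncated record. */
    static List<Header> scan(byte[] data) {
        List<Header> headers = new ArrayList<>();
        int pos = 0;
        while (pos + 4 <= data.length) {
            int sid = readUShortLE(data, pos);
            int length = readUShortLE(data, pos + 2);
            boolean truncated = pos + 4 + length > data.length;
            headers.add(new Header(pos, sid, length, truncated));
            if (truncated) {
                break;
            }
            pos += 4 + length;
        }
        return headers;
    }

    public static void main(String[] args) {
        // Synthetic bytes, not taken from a real document:
        // BOF (0x0809) with 2 payload bytes, WINDOW1 (0x003D) with 4, EOF (0x000A) with 0.
        byte[] sample = {
            0x09, 0x08, 0x02, 0x00, 0x00, 0x06,
            0x3D, 0x00, 0x04, 0x00, 1, 2, 3, 4,
            0x0A, 0x00, 0x00, 0x00
        };
        for (Header h : scan(sample)) {
            System.out.println(h);
        }
    }
}
```

Comparing the length printed for sid 0x003D against what a regular Excel
file produces would show whether the embedded stream uses a different record
layout.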

On 4/1/08, Yegor Kozlov <ye...@dinom.ru> wrote:
> It should be either in Excel or in Escher format.
>
>  In the first case it should be readable by
>  org.apache.poi.hssf.record.RecordFactory.createRecords(inputstream).
>  See if you can read the contents of the Workbook stream this way:
>
>  List records = RecordFactory.createRecords(new ByteArrayInputStream(ole_bytes));
>  where ole_bytes is what you read from the OLE stream.
>
>  In the second case try the same idea with org.apache.poi.ddf.EscherDump.
>
>
>  Yegor


Re[2]: Microsoft Graph objects

Posted by Yegor Kozlov <ye...@dinom.ru>.
It should be either in Excel or in Escher format.

In the first case it should be readable by
org.apache.poi.hssf.record.RecordFactory.createRecords(inputstream).
See if you can read the contents of the Workbook stream this way:

List records = RecordFactory.createRecords(new ByteArrayInputStream(ole_bytes));
where ole_bytes is what you read from the OLE stream.

In the second case try the same idea with org.apache.poi.ddf.EscherDump.
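
Before trying either parser, the first bytes of the stream can hint at which
of the two cases applies. The following is only a rough sketch under stated
assumptions: it treats a stream that opens with a BOF record of sid 0x0809
(BIFF5/BIFF8) as Excel, and a stream whose second 16-bit word falls in the
Escher record-type range 0xF000-0xFFFF as Escher. Older BIFF versions use
other BOF sids and are not covered. It uses only the JDK; StreamSniffer,
Kind and sniff are hypothetical names, not part of POI.

```java
/** Guesses whether a byte array starts like a BIFF or an Escher stream. */
class StreamSniffer {

    enum Kind { BIFF, ESCHER, UNKNOWN }

    /** Reads an unsigned 16-bit little-endian value at pos. */
    static int readUShortLE(byte[] data, int pos) {
        return (data[pos] & 0xFF) | ((data[pos + 1] & 0xFF) << 8);
    }

    /** Heuristic only: looks at the first four bytes and nothing else. */
    static Kind sniff(byte[] data) {
        if (data.length < 4) {
            return Kind.UNKNOWN;
        }
        int firstWord = readUShortLE(data, 0);
        int secondWord = readUShortLE(data, 2);
        // BIFF record framing: sid, then length. BIFF5/BIFF8 streams open with BOF (0x0809).
        if (firstWord == 0x0809) {
            return Kind.BIFF;
        }
        // Escher record framing: options word, then record type in 0xF000-0xFFFF.
        if (secondWord >= 0xF000) {
            return Kind.ESCHER;
        }
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        // Synthetic headers for illustration, not taken from a real document.
        byte[] biffLike = {0x09, 0x08, 0x10, 0x00};
        byte[] escherLike = {0x0F, 0x00, 0x00, (byte) 0xF0, 0, 0, 0, 0};
        System.out.println(sniff(biffLike));
        System.out.println(sniff(escherLike));
    }
}
```

A result of UNKNOWN says nothing definite; in that case dumping the raw bytes
is probably the next step.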

Yegor
