Posted to user@poi.apache.org by Dmitry Goldenberg <DG...@attivio.com> on 2008/08/07 15:55:06 UTC

MS Publisher?

Hello,

Does anyone know of an API to get at the metadata and content of MS Publisher files (.pub)?  Or as an alternative, is there a way to programmatically convert them to PDF, Word, or HTML?  I'm aware of things like PrimoPDF but they don't seem to offer an API or an exe I could use.

Are any of the POI contributors currently working on .pub support?

Thanks for any info.
- Dmitry

Re: MS Publisher?

Posted by Nick Burch <ni...@torchbox.com>.
On Thu, 7 Aug 2008, Dmitry Goldenberg wrote:
> Does anyone know of an API to get at the metadata and content of MS 
> Publisher files (.pub)?

I'm not aware of any, other than using OLE controls to automate the app, 
which is single threaded and very prone to failures (most Office apps 
"support" this).

> Are any of the POI contributors currently working on .pub support?

I don't think so, but it might not be too hard to get something very 
basic going (e.g. most of the text in the file, plus lots of stuff that 
isn't quite text...). Any chance you could open a new Bugzilla entry and 
attach a few sample files?

Ideally these would be fairly simple files, from very simple + one page up 
to simple and three pages. Along with these should be a textual 
description of what's in each file.

Armed with that, it should be possible for someone to take a look at the 
files and try to figure out the structure. If it's like Excel or 
PowerPoint, a basic extractor could be done in something like 5-10 hours. 
If it's more like Visio or Project, then in the absence of any docs it 
could take much, much longer :/
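
As a possible first step when looking at the sample files, one could check 
whether a .pub file even starts with the OLE2 compound document signature 
that Excel and PowerPoint files carry. The sketch below uses only the Java 
standard library (Java 11+). The class name Ole2Sniffer is made up for 
illustration, and the idea that .pub files use this container is an 
assumption that the thread does not confirm.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of POI.
public class Ole2Sniffer {
    // The 8-byte signature found at the start of an OLE2 compound document.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    /** Returns true if the given bytes begin with the OLE2 signature. */
    public static boolean hasOle2Signature(byte[] header) {
        return header.length >= OLE2_MAGIC.length
            && Arrays.equals(Arrays.copyOf(header, OLE2_MAGIC.length), OLE2_MAGIC);
    }

    /** Reads the first 8 bytes of a file and checks them against the signature. */
    public static boolean isOle2File(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return hasOle2Signature(in.readNBytes(OLE2_MAGIC.length));
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java Ole2Sniffer sample1.pub sample2.pub ...
        for (String name : args) {
            boolean ole2 = isOle2File(Path.of(name));
            System.out.println(name + ": " + (ole2 ? "OLE2 container" : "not OLE2"));
        }
    }
}
```

If the sample files pass this check, the existing POIFS code in POI would 
be the natural place to start listing the streams inside them.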

Nick


Re: MS Publisher?

Posted by Anthony Andrews <py...@yahoo.com>.
I have never used Publisher, so please forgive me if this is a stupid question, but is it possible to save a Publisher file in a different format? If so, is it possible to use VBA, for example, to create a macro in Publisher as you can in Excel or Word?

If the answer to both of these questions is yes, would it be possible to create a macro that opened a file for you and then saved it in a different format? If this is possible then it may also be possible to, for example, force the macro to loop through all of the files in a directory.
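
The directory loop itself is the easy part and does not have to live in a macro. Here is a rough sketch in Java (since this is the POI list) of walking a folder and handing each .pub file to some converter. The PubBatch class and its Converter interface are hypothetical placeholders: nothing in this sketch knows how to read or convert a Publisher file, it only shows the looping and the output naming.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical batch driver; the actual conversion step is left open.
public class PubBatch {
    /** Placeholder for whatever ends up doing the conversion (macro, tool, library). */
    public interface Converter {
        void convert(Path source, Path target) throws IOException;
    }

    /** Lists the .pub files directly inside a directory, sorted by path. */
    public static List<Path> findPubFiles(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().toLowerCase().endsWith(".pub"))
                .sorted()
                .collect(Collectors.toList());
        }
    }

    /** Derives the output path by swapping the .pub extension for another one. */
    public static Path targetFor(Path source, String newExtension) {
        String name = source.getFileName().toString();
        String base = name.substring(0, name.length() - ".pub".length());
        return source.resolveSibling(base + "." + newExtension);
    }

    /** Runs the converter over every .pub file in the directory. */
    public static void convertAll(Path dir, String newExtension, Converter converter)
            throws IOException {
        for (Path source : findPubFiles(dir)) {
            converter.convert(source, targetFor(source, newExtension));
        }
    }
}
```

A call such as PubBatch.convertAll(dir, "pdf", myConverter) would then visit every file, where myConverter is whatever mechanism turns out to work.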

Finally, have you searched for ActiveX controls? There are certainly controls that support the manipulation of PDF files and there may be ones that allow you to manipulate Publisher files.
