Posted to user@poi.apache.org by Mr Zappee <za...@gmail.com> on 2010/06/01 22:29:38 UTC

getPageCount() is not working

Hi All,

I want to write a Java method that returns the number of pages in an MS Word
document. I can see in the API that there is an
org.apache.poi.hpsf.SummaryInformation.getPageCount() method.


This is my code:
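   import java.io.FileInputStream;
   import org.apache.poi.hwpf.HWPFDocument;
   import org.apache.poi.poifs.filesystem.POIFSFileSystem;

   public class PageCount
   {
       public static void main(String[] args) throws Exception
       {
           String filename = "General_Partnership_Agreement.doc";

           // Open the OLE2 container and parse the Word document inside it.
           POIFSFileSystem fs = new POIFSFileSystem( new FileInputStream(filename) );
           HWPFDocument doc = new HWPFDocument(fs);

           //System.out.println( doc.getSummaryInformation().getCharCount() );
           // Print the page count stored in the document's summary information.
           System.out.println( doc.getSummaryInformation().getPageCount() );
       }
   }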



This code does not throw any exception, but it does not work correctly:
getPageCount() always returns 1. I tried it on lots of *.doc files, and
the value of getPageCount() was always 1.

Could you tell me what is wrong in my code?

Thank you very much, zappee

Re: getPageCount() is not working

Posted by Mr Zappee <za...@gmail.com>.
You can see my statistics window in MS Word here (screenshot):
http://kepfeltoltes.hu/view/100601/statistics_www.kepfeltoltes.hu_.png

Re: getPageCount() is not working

Posted by Nick Burch <ni...@alfresco.com>.
On Tue, 1 Jun 2010, Mr Zappee wrote:
> I checked the number of pages in Word. The statistics page shows that the
> document contains 16 pages.

If you view the statistics and re-save the file, does Word correct the 
property information?

> How can I update the properties? Can I update them with Apache POI?

Yup, sure, HPSF will let you set any of the properties to anything you'd 
like. Unfortunately though, HWPF isn't currently able to figure out how 
many pages a document would span, so it can't help you figure out the 
correct number to save via HPSF...

Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org
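
To illustrate the point above that HPSF will store whatever value it is given, here is a minimal sketch that writes a page count into a fresh SummaryInformation stream and reads it back, all in memory. It assumes a reasonably recent POI release on the classpath (one where POIFSFileSystem is Closeable); the class name PageCountRoundTrip and the method roundTrip are made up for this illustration. It does not compute how many pages a document has, since neither HPSF nor HWPF can do that.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

import org.apache.poi.hpsf.PropertySet;
import org.apache.poi.hpsf.PropertySetFactory;
import org.apache.poi.hpsf.SummaryInformation;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

// Hypothetical demo class: not part of POI, only an illustration.
public class PageCountRoundTrip {

    /** Stores pageCount in a new SummaryInformation stream and reads it back. */
    public static int roundTrip(int pageCount) throws Exception {
        // Write side: HPSF accepts any value; it does not check it against the text.
        SummaryInformation si = PropertySetFactory.newSummaryInformation();
        si.setPageCount(pageCount);

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (POIFSFileSystem fs = new POIFSFileSystem()) {
            // Store the property set under the standard summary stream name.
            si.write(fs.getRoot(), SummaryInformation.DEFAULT_STREAM_NAME);
            fs.writeFilesystem(buffer);
        }

        // Read side: this returns exactly what was stored, nothing more.
        try (POIFSFileSystem fs = new POIFSFileSystem(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            PropertySet ps = PropertySetFactory.create(
                    fs.getRoot(), SummaryInformation.DEFAULT_STREAM_NAME);
            return new SummaryInformation(ps).getPageCount();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(16));
    }
}
```

For an existing Word file, the same setter should be reachable through doc.getSummaryInformation().setPageCount(...) followed by doc.write(...), but the correct number still has to come from somewhere else, for example from Word itself.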


Re: getPageCount() is not working

Posted by Mr Zappee <za...@gmail.com>.
Hi Nick,

I found some *.doc files on the internet; for example, here is my test
doc file:
http://www.google.hu/url?sa=t&source=web&ct=res&cd=1&ved=0CBMQFjAA&url=http%3A%2F%2Fwww.tannedfeet.com%2FGeneral_Partnership_Agreement.doc&rct=j&q=General_Partnership_Agreement.doc+&ei=9HIFTJKgAqOJOO7LgYYF&usg=AFQjCNE1tqc4-TONsmOQxyOjbU_FsQD6rw&sig2=u1aYJsTu_fTLApsA5Puc0Q

I checked the number of pages in Word. The statistics page shows that the
document contains 16 pages.
How can I update the properties? Can I update them with Apache POI?

My problem is that I do not prepare the *.doc files myself. They were created
by unknown users...





On 1 June 2010 22:38, Nick Burch <ni...@alfresco.com> wrote:

> On Tue, 1 Jun 2010, Mr Zappee wrote:
>
>> This code does not throw any exception, but it does not work correctly:
>> getPageCount() always returns 1. I tried it on lots of *.doc files, and
>> the value of getPageCount() was always 1.
>>
>
> I suspect that if you look in Office, it'll show a page count of 1 too. HPSF
> only shows you the information that the application saved there as a cache,
> so you'll need to get Office to update the properties before they'll hold
> something useful for your documents.
>
> Nick

Re: getPageCount() is not working

Posted by Nick Burch <ni...@alfresco.com>.
On Tue, 1 Jun 2010, Mr Zappee wrote:
> This code does not throw any exception, but it does not work correctly:
> getPageCount() always returns 1. I tried it on lots of *.doc files, and
> the value of getPageCount() was always 1.

I suspect that if you look in Office, it'll show a page count of 1 too. 
HPSF only shows you the information that the application saved there as a 
cache, so you'll need to get Office to update the properties before 
they'll hold something useful for your documents.

Nick
