Posted to user@poi.apache.org by Anju Murthy <an...@cisco.com> on 2004/04/06 04:54:09 UTC

HWPF : IOException

Hi all,

I am using Textmining to extract text from a Word document.
Sometimes I get an IOException. The error is "java.io.IOException: Unable to
read entire header; -1 bytes read; expected 512 bytes".
This is slightly urgent. Could anyone help me out here?

Thanks
Anju


---------------------------------------------------------------------
To unsubscribe, e-mail: poi-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: poi-user-help@jakarta.apache.org


RE: Regarding HSSF Optimization

Posted by "Koundinya (Sudhakar Chavali)" <su...@yahoo.com>.
Hi Srinadh,

I have tested with both overloaded constructors, but the result
was the same. Hence I posted this query.


--- Daniel Joshua <da...@gridnode.com> wrote:
> What's the difference of using a POIFSFileSystem object in the constructor
> instead of a InputStream object?
> 
> 
> Regards,
> Daniel
> 
> 
> -----Original Message-----
> From: Srinadh Karumuri [mailto:skarumur@bbn.com]
> Sent: Wednesday, 07 April, 2004 1:05 AM
> To: POI Users List
> Subject: Re: Regarding HSSF Optimization
> 
> 
> I am not sure why it's slow. Maybe someone can help you there.
> 
> Here is what I do (same as in the examples):
>          POIFSFileSystem fs =
>          new POIFSFileSystem(new FileInputStream(_templateFile));
>          wb = new HSSFWorkbook(fs);


=====
"No one can earn a million dollars honestly."- William Jennings Bryan (1860-1925) 

"Make everything as simple as possible, but not simpler."- Albert Einstein (1879-1955)

"It is dangerous to be sincere unless you are also stupid."- George Bernard Shaw (1856-1950)



Re: Regarding HSSF Optimization

Posted by "Andrew C. Oliver" <ac...@apache.org>.
The other constructor immediately wraps it in a POIFSFileSystem object.  The
InputStream version is just a convenience method.
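
The delegation described above can be pictured with a small sketch. The two classes here are hypothetical stand-ins written only for illustration; they are not POI's source code, and the names FileSystemStandIn and WorkbookStandIn are assumptions. The point is only that a convenience constructor which wraps the stream and delegates ends up doing the same work as the explicit form, which would be consistent with both overloads taking the same time.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for POIFSFileSystem: it consumes the whole stream when constructed. */
final class FileSystemStandIn {
    final byte[] contents;

    FileSystemStandIn(InputStream in) {
        this.contents = drain(in);
    }

    // Reads the stream to its end and returns everything that was read.
    private static byte[] drain(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

/** Hypothetical stand-in for HSSFWorkbook with the two overloads discussed in the thread. */
final class WorkbookStandIn {
    final FileSystemStandIn fs;

    // Primary constructor: works from an already-built file system object.
    WorkbookStandIn(FileSystemStandIn fs) {
        this.fs = fs;
    }

    // Convenience overload: wraps the stream, then delegates to the constructor above.
    WorkbookStandIn(InputStream in) {
        this(new FileSystemStandIn(in));
    }
}
```

Under this assumption, switching between the two overloads changes only who writes the wrapping line, not how much work is done.
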
-- 
Andrew C. Oliver
http://www.superlinksoftware.com/poi.jsp
Custom enhancements and Commercial Implementation for Jakarta POI

http://jakarta.apache.org/poi
For Java and Excel, Got POI?

The views expressed in this email are those of the author and are almost
definitely not shared by the Apache Software Foundation, its board or its
general membership.  In fact they probably most definitively disagree with
everything espoused in the above email.

> From: "Daniel Joshua" <da...@gridnode.com>
> Reply-To: "POI Users List" <po...@jakarta.apache.org>
> Date: Wed, 7 Apr 2004 10:06:57 +0800
> To: "'POI Users List'" <po...@jakarta.apache.org>
> Subject: RE: Regarding HSSF Optimization
> 
> What's the difference of using a POIFSFileSystem object in the constructor
> instead of a InputStream object?
> 
> 
> Regards,
> Daniel


RE: Regarding HSSF Optimization

Posted by Daniel Joshua <da...@gridnode.com>.
What's the difference between using a POIFSFileSystem object in the
constructor and using an InputStream object?


Regards,
Daniel


-----Original Message-----
From: Srinadh Karumuri [mailto:skarumur@bbn.com]
Sent: Wednesday, 07 April, 2004 1:05 AM
To: POI Users List
Subject: Re: Regarding HSSF Optimization


I am not sure why it's slow. Maybe someone can help you there.

Here is what I do (same as in the examples):
         POIFSFileSystem fs =
         new POIFSFileSystem(new FileInputStream(_templateFile));
         wb = new HSSFWorkbook(fs);



Re: Regarding HSSF Optimization

Posted by Srinadh Karumuri <sk...@bbn.com>.
I am not sure why it's slow. Maybe someone can help you there.

Here is what I do (same as in the examples):
         POIFSFileSystem fs =
         new POIFSFileSystem(new FileInputStream(_templateFile));
         wb = new HSSFWorkbook(fs);


Regarding HSSF Optimization

Posted by "Koundinya (Sudhakar Chavali)" <su...@yahoo.com>.
Hi all,

Just curious about one thing: why does the workbook constructor
take most of the time?

For example, when I parse a 1 MB file that contains only text,
the constructor alone takes around 45 seconds on a 256MB/P4
machine, and getting the text from the HSSFSheet objects then
takes approximately another 30 to 35 seconds.

This is becoming a problem for me when dealing with large documents.

Here is how I am constructing the workbook object:


    //EXCEL DOCUMENT AS INPUT STREAM OBJECT
    InputStream input = new FileInputStream(strDocument);
    //HOLDS THE EXCEL BUFFER
    ByteArrayOutputStream excelBuffer = new ByteArrayOutputStream();

    //CREATE A WORKBOOK FOR AN EXCEL DOCUMENT
    HSSFWorkbook workbook = new HSSFWorkbook(input);
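
One way to narrow down where the time goes is to measure disk I/O separately from parsing. The sketch below uses only the standard library; the class name LoadTimer and its two helpers are hypothetical, and nothing here is taken from POI. It drains the file into memory through a buffer and times an arbitrary step, so the read and the constructor call can be timed one after the other.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Illustrative helpers for separating read time from parse time. */
final class LoadTimer {

    /** Drains a stream into memory through a buffer, so later parsing never waits on disk. */
    static byte[] readFully(InputStream in) {
        try (BufferedInputStream buffered = new BufferedInputStream(in);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = buffered.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs one step and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Runnable step) {
        long start = System.nanoTime();
        step.run();
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```

Timing readFully(new FileInputStream(strDocument)) on its own, and then timing the HSSFWorkbook constructor on a ByteArrayInputStream over the returned bytes, would show whether the 45 seconds is spent reading the file or parsing it. If the read is fast and the constructor is still slow, buffering the input is unlikely to help.
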




Re: Open Source API for RTF formats

Posted by Avik Sengupta <av...@itellix.com>.
Don't you have access to Google?

http://www.google.com/search?hl=en&lr=&ie=UTF-8&oe=UTF-8&q=RTF+parser+java&btnG=Search


On Tue, 2004-04-06 at 11:06, Koundinya (Sudhakar Chavali) wrote:
> Hi all,
> 
> Can any body provide me the links to Open Source API for parsing
> RTF format content  to text content
> 
> thanks,
> Sudhakar


Open Source API for RTF formats

Posted by "Koundinya (Sudhakar Chavali)" <su...@yahoo.com>.
Hi all,

Can anybody provide links to an open source API for parsing
RTF content into plain text?

thanks,
Sudhakar



Re: HWPF : IOException

Posted by "Koundinya (Sudhakar Chavali)" <su...@yahoo.com>.
Can you upload the document for which you are getting the exception?

This error is raised when the document is in plain-text format,
is RTF formatted, or is corrupted.

I just want to check whether the document is one that POI
supports or not.
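
A quick check of the file's leading bytes can tell these cases apart before the file reaches the parser. The sketch below uses only the standard library; the class name WordFileSniffer and its result labels are hypothetical. It relies on two facts about the OLE2 container used by .doc files: it starts with the 8-byte signature D0 CF 11 E0 A1 B1 1A E1, and its header block is 512 bytes, which matches the size in the error message. Also, InputStream.read returns -1 only at end of stream, so "-1 bytes read" suggests the stream was empty or already consumed; that is my reading of the message, not something confirmed in this thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

/** Illustrative pre-check for files that are about to be parsed as Word documents. */
final class WordFileSniffer {
    // First 8 bytes of every OLE2 compound document.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    private static final byte[] RTF_MAGIC = "{\\rtf".getBytes(StandardCharsets.US_ASCII);
    private static final int HEADER_SIZE = 512;

    /** Classifies a file from its leading bytes (at most the first 512 are needed). */
    static String classify(byte[] head) {
        if (head.length == 0) return "EMPTY";
        if (startsWith(head, RTF_MAGIC)) return "RTF";
        if (!startsWith(head, OLE2_MAGIC)) return "NOT_OLE2";
        return head.length < HEADER_SIZE ? "TRUNCATED" : "OLE2";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    /** Reads up to 512 bytes from the start of the stream. */
    static byte[] readHead(InputStream in) throws IOException {
        byte[] buf = new byte[HEADER_SIZE];
        int total = 0;
        int n;
        while (total < buf.length && (n = in.read(buf, total, buf.length - total)) != -1) {
            total += n;
        }
        return Arrays.copyOf(buf, total);
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(classify(readHead(in)));
        }
    }
}
```

Running it on a failing document with the file path as the only argument prints one label. Anything other than OLE2 means the file is not a complete binary Word document, so the extraction step should be skipped or the file handled some other way.
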

regards
Sudhakar
--- Anju Murthy <an...@cisco.com> wrote:
> Hi all,
> 
> I am using Textmining to extract text from a word document. 
> Sometimes I get an IOException. The error is "java.io.IOException: Unable to
> read entire header; -1 bytes read; expected 512 bytes".
> This is slightly urgent. Could anyone help me out here?
> 
> Thanks
> Anju