Posted to user@poi.apache.org by JohnRodey <ti...@yahoo.com> on 2010/03/25 15:06:15 UTC

Preserving word wrap

Using POI for Word documents with Nutch/Hadoop, is there a way to force the
plugin to add an EOL character where Word would typically wrap the text?  Or
would I have to rewrite the plugin and add custom logic to insert the EOL
character?  Currently the plugin prints each paragraph on one really long
line.
-- 
View this message in context: http://old.nabble.com/Preserving-word-wrap-tp28029360p28029360.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Preserving word wrap

Posted by MSB <ma...@tiscali.co.uk>.
That is a relief! I could not find what I was looking for and now I do not
have to pull out any more of my hair in frustration; far too little left for
that in any case.

Yours

Mark B


JohnRodey wrote:
> 
> No worries, mine just needs to be in a readable format.  So even something
> as simple as an eol at the next space character after the 80th character
> is good enough for me.  I could also create word wrapping on the display
> side.  Thanks for your help
> 

-- 
View this message in context: http://old.nabble.com/Preserving-word-wrap-tp28029360p28043831.html
Sent from the POI - User mailing list archive at Nabble.com.




Re: Preserving word wrap

Posted by JohnRodey <ti...@yahoo.com>.
No worries, mine just needs to be in a readable format.  So even something as
simple as an eol at the next space character after the 80th character is
good enough for me.  I could also create word wrapping on the display side. 
Thanks for your help
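
A minimal sketch of the rule described above, breaking each line at the next space after the 80th character. The class and method names (NextSpaceWrap, wrapAfter) are hypothetical and are not part of POI or the Nutch plugin; the sketch assumes the paragraph text has already been extracted into a String.

```java
final class NextSpaceWrap {

    /**
     * Replaces with a newline the first space found after the first
     * minWidth characters of each output line. Every line except possibly
     * the last is therefore at least minWidth characters long. Text that
     * contains no further space is left on one line.
     */
    static String wrapAfter(String paragraph, int minWidth) {
        if (minWidth < 1) {
            throw new IllegalArgumentException("minWidth must be >= 1");
        }
        StringBuilder out = new StringBuilder(paragraph.length() + 16);
        int lineStart = 0;
        while (paragraph.length() - lineStart > minWidth) {
            // Search starts just past the minWidth-th character of this line.
            int space = paragraph.indexOf(' ', lineStart + minWidth);
            if (space < 0) {
                break; // no more spaces: keep the tail on one line
            }
            out.append(paragraph, lineStart, space).append('\n');
            lineStart = space + 1;
        }
        out.append(paragraph, lineStart, paragraph.length());
        return out.toString();
    }
}
```

Calling NextSpaceWrap.wrapAfter(text, 80) gives lines of roughly 80 characters or a little more, since a word that straddles the limit is kept whole. Wrapping on the display side instead would avoid changing the indexed text at all.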


MSB wrote:
> 
> It's so long since I had to write any print routines using core Java code
> that I cannot really remember the details but I do know that there is
> support in the language for working out the space a line of text rendered
> in a specific font will occupy. If I have the time today, I will look out
> the code I put together way back when because I think it does something
> very much like this; takes a line of text and inserts hard line breaks so
> that it can be either printed out onto a page or rendered in some other
> way. Sorry to say that I forgot about this completely last night.
> 
> Yours
> 
> Mark B
> 

-- 
View this message in context: http://old.nabble.com/Preserving-word-wrap-tp28029360p28042096.html
Sent from the POI - User mailing list archive at Nabble.com.




Re: Preserving word wrap

Posted by MSB <ma...@tiscali.co.uk>.
It's so long since I had to write any print routines using core Java code
that I cannot really remember the details but I do know that there is
support in the language for working out the space a line of text rendered in
a specific font will occupy. If I have the time today, I will look out the
code I put together way back when because I think it does something very
much like this; takes a line of text and inserts hard line breaks so that it
can be either printed out onto a page or rendered in some other way. Sorry
to say that I forgot about this completely last night.
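
As a rough illustration of that idea, and not the code referred to above, the sketch below wraps text by rendered width. It assumes java.awt.Font.getStringBounds is a suitable way to measure a line in a given font; that choice is a guess at the kind of support meant here. The names WidthWrap, awtMeasure and wrap are hypothetical. The measuring function is passed in so that the wrapping logic does not depend on AWT.

```java
import java.awt.Font;
import java.awt.font.FontRenderContext;
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

final class WidthWrap {

    /** Returns a function giving the rendered width of a string in the font. */
    static ToDoubleFunction<String> awtMeasure(Font font) {
        // Identity transform, anti-aliasing and fractional metrics enabled.
        FontRenderContext frc = new FontRenderContext(null, true, true);
        return s -> font.getStringBounds(s, frc).getWidth();
    }

    /**
     * Greedy wrap: keeps adding words to the current line until the measured
     * width would exceed maxWidth, then starts a new line. A single word
     * wider than maxWidth is placed on a line of its own.
     */
    static List<String> wrap(String paragraph, double maxWidth,
                             ToDoubleFunction<String> measure) {
        List<String> lines = new ArrayList<>();
        String line = "";
        for (String word : paragraph.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // blank paragraph
            }
            String candidate = line.isEmpty() ? word : line + " " + word;
            if (!line.isEmpty() && measure.applyAsDouble(candidate) > maxWidth) {
                lines.add(line);
                line = word;
            } else {
                line = candidate;
            }
        }
        if (!line.isEmpty()) {
            lines.add(line);
        }
        return lines;
    }
}
```

A call might look like WidthWrap.wrap(text, widthInPoints, WidthWrap.awtMeasure(new Font("Serif", Font.PLAIN, 12))), where the font and width are placeholders. The JDK also ships java.awt.font.LineBreakMeasurer for this kind of task. Neither approach will reproduce the exact line breaks Word chooses.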

Yours

Mark B


JohnRodey wrote:
> 
> Thanks Mark.  That's kinda what I figured I would need to do.

-- 
View this message in context: http://old.nabble.com/Preserving-word-wrap-tp28029360p28038884.html
Sent from the POI - User mailing list archive at Nabble.com.




Re: Preserving word wrap

Posted by JohnRodey <ti...@yahoo.com>.
Thanks Mark.  That's kinda what I figured I would need to do.



MSB wrote:
> 
> I am not at all sure that I understand your question as I am not familiar
> with Nutch/Hadoop. So, I am going to guess that you want to use POI to
> extract the text of a paragraph or paragraphs from a Word document or
> documents and then present that text to the user in another application,
> still formatted from the perspective of the paragraph's layout. If
> this is correct then I think you will have to determine where Word would
> wrap the text and insert the hard line breaks yourself.
> 
> The problem you are facing is that Word does not add end of line
> characters to the text of a paragraph to wrap lines. Instead, the
> application will determine where the line should be broken based upon the
> width of the page amongst other factors. Therefore, you will need to
> decide where the line should be broken and insert your own line breaks if
> you want to emulate the look and feel of MS Word.
> 
> Yours
> 
> Mark B

-- 
View this message in context: http://old.nabble.com/Preserving-word-wrap-tp28029360p28032688.html
Sent from the POI - User mailing list archive at Nabble.com.




Re: Preserving word wrap

Posted by MSB <ma...@tiscali.co.uk>.
I am not at all sure that I understand your question as I am not familiar
with Nutch/Hadoop. So, I am going to guess that you want to use POI to extract
the text of a paragraph or paragraphs from a Word document or documents and
then present that text to the user in another application, still formatted
from the perspective of the paragraph's layout. If this is correct
then I think you will have to determine where Word would wrap the text and
insert the hard line breaks yourself.

The problem you are facing is that Word does not add end of line characters
to the text of a paragraph to wrap lines. Instead, the application will
determine where the line should be broken based upon the width of the page
amongst other factors. Therefore, you will need to decide where the line
should be broken and insert your own line breaks if you want to emulate the
look and feel of MS Word.
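
A minimal sketch of inserting your own line breaks, assuming each paragraph is already available as a String (for example from HWPF's WordExtractor.getParagraphText(); the extraction step is not shown). It wraps at a fixed column count and not at Word's real page width, so it only approximates the layout. The names ColumnWrap and wrap are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ColumnWrap {

    /**
     * Greedy word wrap: fills each line with as many whole words as fit in
     * maxColumns characters. A word longer than maxColumns is placed on a
     * line of its own and is not split.
     */
    static List<String> wrap(String paragraph, int maxColumns) {
        if (maxColumns < 1) {
            throw new IllegalArgumentException("maxColumns must be >= 1");
        }
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : paragraph.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // blank paragraph
            }
            // +1 accounts for the space that would precede the word.
            if (line.length() > 0
                    && line.length() + 1 + word.length() > maxColumns) {
                lines.add(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }
}
```

Joining the result with String.join("\n", ColumnWrap.wrap(paragraph, 80)) yields the paragraph with hard line breaks inserted.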

Yours

Mark B


JohnRodey wrote:
> 
> Using POI for word documents with nutch/hadoop is there a way to force the
> plugin to add an eol character where word would typically do wrapping?  Or
> would I have to rewrite the plugin and add custom logic to add the eol
> character?  Currently the plugin will print a paragraph on one really long
> line.
> 

-- 
View this message in context: http://old.nabble.com/Preserving-word-wrap-tp28029360p28031229.html
Sent from the POI - User mailing list archive at Nabble.com.

