Posted to user@poi.apache.org by Ajai <aj...@gmail.com> on 2009/10/22 10:46:43 UTC

Re: Bookmarks on word documents

Hi Mark,

I am also looking at retrieving bookmarks from a Word document.

Could you kindly let me know whether you were able to achieve it?

Regards,
Ajai G



MSB wrote:
> 
> Made a little more progress yesterday. I do not think it will be possible
> to simply search for control characters in the table stream; it is going
> to be necessary to interrogate the file information block for the offset
> to the list of bookmarks. That is not as scary as it sounds because a lot
> of the work has already been done in the FIBFieldHandler class. I think I
> am going to hack this by adding an associative list that allows me to link
> the name or ID number of the offset field to the number of bytes. Next, I
> need to work out all of the entities that can appear in the table stream
> before the bookmarks, get the offset for each along with the length, add
> all of this together and hopefully arrive at the offset for the bookmarks.
> Last night, I had the chance to check that simply accessing the field
> that relates to the offset for the bookmarks was not enough; it gave me an
> offset of something like 50 bytes, whereas I know from mapping the table
> stream that it should be something like 2000.
> 
> Cannot promise to work on the problem today - I have commitments both
> during the day and evening - but will post again if I make any progress.
> 
> Yours
> 
> Mark B.
> 
> 
> Fernando Antonio Prado wrote:
>> 
>> 
>> Hi there. I've just subscribed here so this is my first email. I searched
>> the POI site but couldn't find the answer to my problem. I've just
>> downloaded the most recent version of HWPF and I'd like to know if
>> there's any way I can get the bookmarks of a Word document. I'll be very
>> pleased with your help. Thanks!
>> 
>> 
> 
> 
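For anyone following the quoted discussion, here is a minimal sketch of the kind of lookup it describes. It assumes the file information block holds consecutive pairs of 32-bit little-endian integers, each pair giving the offset and byte length of one structure in the table stream. The class name FibPairSketch, both helper methods, the pair index and the sample values are hypothetical; this is not the FIBFieldHandler code and it does not read a real .doc file.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class FibPairSketch {
    // Assumption: each entry is two 4-byte little-endian integers.
    static final int PAIR_SIZE = 8;

    // Returns {offset, length} for the pair at pairIndex.
    static int[] readPair(byte[] block, int pairIndex) {
        ByteBuffer buf = ByteBuffer.wrap(block).order(ByteOrder.LITTLE_ENDIAN);
        int base = pairIndex * PAIR_SIZE;
        return new int[] { buf.getInt(base), buf.getInt(base + 4) };
    }

    // Builds a synthetic block of pairs so the sketch can run without a document.
    static byte[] writePairs(int[][] pairs) {
        ByteBuffer buf = ByteBuffer.allocate(pairs.length * PAIR_SIZE)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (int[] p : pairs) {
            buf.putInt(p[0]);
            buf.putInt(p[1]);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        // Made-up sample data: three structures in a pretend table stream.
        byte[] block = writePairs(new int[][] { {0, 120}, {120, 64}, {2000, 48} });
        int hypotheticalBookmarkIndex = 2; // not the real field number
        int[] entry = readPair(block, hypotheticalBookmarkIndex);
        System.out.println("offset=" + entry[0] + " length=" + entry[1]);
    }
}
```

With a real document, the block would come from the file information block itself and the index would be whichever field identifies the bookmark list.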

-- 
View this message in context: http://www.nabble.com/Bookmarks-on-word-documents-tp24573916p26006213.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Bookmarks on word documents

Posted by MSB <ma...@tiscali.co.uk>.
No, sorry, no progress at all. It will require some work to accomplish this
because of the way Word stores bookmarks; crudely, it uses a data structure
where each bookmark is listed along with a number. This number indicates
where, in terms of character positions, the bookmark was actually inserted
and also where the substitution text should be placed. On one level, this is
quite simple if you ignore the complexities of locating that data structure
within the doc file. The complications will multiply, though, when we try to
substitute the text for the bookmark at the location indicated; at least, I
fear this will be the case based upon my experience with search and replace
operations. As yet, I have no idea how and what to modify when the structure
of the Word document changes. Once you bear in mind that the file
'contains' up to four streams and that each stream is a linked list, at least
at the most basic level, the potential problems that could be caused by
corrupting any of these links become apparent.
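
To make the substitution problem concrete, here is a rough sketch in plain Java. It assumes a bookmark can be modelled as a name plus start and end character positions into the document text; the BookmarkSketch class, its Bookmark type and the replace method are invented for illustration and are not part of HWPF. It only shows the bookkeeping on an in-memory string and ignores the streams and links inside a real doc file.

```java
import java.util.ArrayList;
import java.util.List;

class BookmarkSketch {
    // Hypothetical model: a name plus [start, end) character positions.
    static final class Bookmark {
        final String name;
        int start;
        int end;

        Bookmark(String name, int start, int end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }
    }

    // Replaces the text covered by the named bookmark and shifts every
    // position at or after the old end by the change in length.
    // Bookmarks that only partly overlap the target are not handled.
    static void replace(StringBuilder text, List<Bookmark> marks,
                        String name, String replacement) {
        Bookmark target = null;
        for (Bookmark m : marks) {
            if (m.name.equals(name)) {
                target = m;
                break;
            }
        }
        if (target == null) {
            throw new IllegalArgumentException("No bookmark named " + name);
        }
        int oldEnd = target.end;
        int delta = replacement.length() - (oldEnd - target.start);
        text.replace(target.start, oldEnd, replacement);
        for (Bookmark m : marks) {
            if (m == target) {
                continue;
            }
            if (m.start >= oldEnd) {
                m.start += delta;
            }
            if (m.end >= oldEnd) {
                m.end += delta;
            }
        }
        target.end = target.start + replacement.length();
    }

    public static void main(String[] args) {
        StringBuilder text = new StringBuilder("Dear NAME, your order ORDER is ready.");
        List<Bookmark> marks = new ArrayList<>();
        marks.add(new Bookmark("name", 5, 9));
        marks.add(new Bookmark("order", 22, 27));
        replace(text, marks, "name", "Margaret");
        Bookmark order = marks.get(1);
        // The second bookmark must still cover the word ORDER after the shift.
        System.out.println(text.substring(order.start, order.end));
    }
}
```

Every stored position after the edit point has to move, and in the real file the same is true of any other structure that records character positions, which is where the risk of corruption comes from.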

Yours

Mark B





