Posted to user@poi.apache.org by architha <me...@yahoo.com> on 2010/02/18 07:23:25 UTC

hwpf : unable to read nested tables

I want to read a Word document and build a tree structure of the tables it
contains. I can get to the first-level tables by checking
para.isInTable()
and then getting the actual table with
range.getTable(para); // where range is HWPFDocument.getRange()
Next I read each table row and its cells, looking for any tables inside them
(the nested-table case). Once I have a TableCell, I get the cell's paragraphs
and check whether range.getTable(para) returns a table for each of them.
The problem is that this call always fails to return a table, even when the
paragraph really is inside a nested table.
In summary, range.getTable(para) works fine when the para is in a top-level
table but doesn't return the table when the para is part of a nested table.
Am I doing anything wrong here, or is this a bug? Please advise.
Thanks,
-Archita.
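
For reference, here is a minimal sketch of the tree-building step described
above, written without any POI dependency. It assumes each paragraph's table
nesting depth is already known as an int (0 = outside any table, 1 = top-level
table, 2 = nested once, and so on); in HWPF that value would presumably come
from Paragraph.getTableLevel(), but that mapping is an assumption and is not
confirmed anywhere in this thread. TableTreeSketch, TableNode and buildTree are
hypothetical names made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class TableTreeSketch {

    /** One table in the tree, identified by the paragraph indexes it spans. */
    static final class TableNode {
        final int level;            // 1 = top-level table, 2 = nested once, ...
        final int firstParagraph;   // index of the first paragraph in the table
        int lastParagraph;          // index of the last paragraph seen so far
        final List<TableNode> children = new ArrayList<>();

        TableNode(int level, int firstParagraph) {
            this.level = level;
            this.firstParagraph = firstParagraph;
            this.lastParagraph = firstParagraph;
        }
    }

    /**
     * Builds a tree of tables from per-paragraph nesting levels.
     * Assumption: two tables at the same level are always separated by at
     * least one paragraph at a shallower level; otherwise they are merged.
     */
    static List<TableNode> buildTree(int[] levels) {
        List<TableNode> roots = new ArrayList<>();
        Deque<TableNode> open = new ArrayDeque<>(); // innermost open table first
        for (int i = 0; i < levels.length; i++) {
            int level = levels[i];
            if (level < 0) {
                throw new IllegalArgumentException("negative level at paragraph " + i);
            }
            // Close every table that is deeper than this paragraph.
            while (open.size() > level) {
                open.pop();
            }
            // Open new tables until the depth matches this paragraph.
            while (open.size() < level) {
                TableNode node = new TableNode(open.size() + 1, i);
                if (open.isEmpty()) {
                    roots.add(node);
                } else {
                    open.peek().children.add(node);
                }
                open.push(node);
            }
            // Every table that is still open contains this paragraph.
            for (TableNode node : open) {
                node.lastParagraph = i;
            }
        }
        return roots;
    }

    static void print(TableNode node, String indent) {
        System.out.println(indent + "table level " + node.level + ", paragraphs "
                + node.firstParagraph + ".." + node.lastParagraph);
        for (TableNode child : node.children) {
            print(child, indent + "  ");
        }
    }

    public static void main(String[] args) {
        // Hypothetical input: one table with a nested table, then a second table.
        int[] levels = {0, 1, 2, 2, 1, 0, 1};
        for (TableNode root : buildTree(levels)) {
            print(root, "");
        }
    }
}
```

The sketch only needs the nesting depth of each paragraph, so it does not
depend on range.getTable(para) returning anything for nested paragraphs.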


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: hwpf : unable to read nested tables

Posted by Mark Beardsley <ma...@tiscali.co.uk>.
Yes, I understood that. My point was that it is possible to get the table
that starts with a particular paragraph from the paragraph object itself. A
long time ago, I wrote some test code that let me get the tables from a
document in sequence, so to speak, and I remember using this technique. The
trick was that you could only call either the isTable() or the getTable()
method (I cannot remember which, sorry) for the first paragraph in the table.
So, I was wondering if you could get the contents of the table and test them
in this manner to see whether they are contained within another table. The BIG
logical flaw in the approach is that, of course, the paragraph will always be
in a table: the outermost one. Still, I do wonder if the technique has some
merit: get the table from the paragraph object and test whether it is the
containing table. If not, you may have a nested table. I do not know if it
will work, as I have never tried it myself, but the code should be easy enough
to put together.
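
As a rough illustration of the comparison step only, here is a sketch that
tests whether a table obtained from a paragraph is the containing table or
sits strictly inside it. It assumes each table can be reduced to a pair of
character offsets, as an HWPF Range reports through getStartOffset() and
getEndOffset(). Span, isSameTable and isNestedIn are hypothetical names, and
the sketch does not address the lookup itself, which the thread reports as
failing for nested paragraphs.

```java
final class NestedTableCheck {

    /** Half-open character span [start, end), standing in for a table's offsets. */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            if (start > end) {
                throw new IllegalArgumentException("start must not exceed end");
            }
            this.start = start;
            this.end = end;
        }
    }

    /** True when both spans cover exactly the same characters. */
    static boolean isSameTable(Span a, Span b) {
        return a.start == b.start && a.end == b.end;
    }

    /**
     * True when the candidate lies entirely within the container but is not
     * the container itself, i.e. it may be a nested table.
     */
    static boolean isNestedIn(Span candidate, Span container) {
        return !isSameTable(candidate, container)
                && container.start <= candidate.start
                && candidate.end <= container.end;
    }

    public static void main(String[] args) {
        Span outer = new Span(10, 100);   // hypothetical containing table
        Span found = new Span(30, 60);    // hypothetical table found from a paragraph
        System.out.println("nested: " + isNestedIn(found, outer));
    }
}
```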

Yours

Mark B

--
View this message in context: http://apache-poi.1045710.n5.nabble.com/hwpf-unabe-to-read-nested-tables-tp2304357p4270856.html
Sent from the POI - User mailing list archive at Nabble.com.



Re: hwpf : unable to read nested tables

Posted by Francisco2626 <fa...@isaltda.com.uy>.
The problem is that you cannot access the nested table. Apparently there is a bug.


--
View this message in context: http://apache-poi.1045710.n5.nabble.com/hwpf-unabe-to-read-nested-tables-tp2304357p4269750.html


Re: hwpf : unable to read nested tables

Posted by Mark Beardsley <ma...@tiscali.co.uk>.
If I remember correctly, though, you can ask a paragraph whether it is in a
table or not. Would that help, or do you need to be able to access
information about the nested table, for example the number of rows and
columns?

Yours

Mark B

--
View this message in context: http://apache-poi.1045710.n5.nabble.com/hwpf-unabe-to-read-nested-tables-tp2304357p4269654.html


Re: hwpf : unable to read nested tables

Posted by Francisco2626 <fa...@isaltda.com.uy>.
And can it be solved with recursion? I'm doing the same recursion that
Apache Tika does, but it only parses the paragraph, not the table.


--
View this message in context: http://apache-poi.1045710.n5.nabble.com/hwpf-unabe-to-read-nested-tables-tp2304357p4269406.html


Re: hwpf : unable to read nested tables

Posted by Nick Burch <ni...@alfresco.com>.
On Tue, 29 Mar 2011, Francisco2626 wrote:
> Sorry for reviving this post, but I have exactly the same question and
> it would be pointless to write it all out again.

I think there's something not quite right with HWPF and nested tables. In 
Apache Tika we have to turn on table recursion because of this.

It needs someone to dig into the code and identify what's up, but alas
no-one has had the time to do that yet.

Nick



Re: hwpf : unable to read nested tables

Posted by Francisco2626 <fa...@isaltda.com.uy>.
Sorry for reviving this post, but I have exactly the same question and it
would be pointless to write it all out again.

Thanks.
Francisco.

--
View this message in context: http://apache-poi.1045710.n5.nabble.com/hwpf-unabe-to-read-nested-tables-tp2304357p4269376.html