Posted to dev@tika.apache.org by "Steve Gullion (JIRA)" <ji...@apache.org> on 2015/03/25 19:38:53 UTC

[jira] [Issue Comment Deleted] (TIKA-1440) Auto-Paragraph numbers not extracted from Word Document

     [ https://issues.apache.org/jira/browse/TIKA-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Gullion updated TIKA-1440:
--------------------------------
    Comment: was deleted

(was: I guess the comments don't support indentation either, ha.)

> Auto-Paragraph numbers not extracted from Word Document 
> --------------------------------------------------------
>
>                 Key: TIKA-1440
>                 URL: https://issues.apache.org/jira/browse/TIKA-1440
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>         Environment: Windows 7, Windows Server 2008, Tomcat
>            Reporter: Steve Gullion
>            Priority: Minor
>              Labels: numbering, paragraph, word
>
> When text is extracted from a Microsoft Word document that uses automatic numbering, the automatically generated numbers are missing from the extracted text. Because the numbers can be critical to the meaning of the document (as in the case of cross-references), they should be calculated and extracted if at all possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)