Posted to dev@tika.apache.org by "Tim Allison (JIRA)" <ji...@apache.org> on 2014/05/30 15:51:01 UTC
[jira] [Commented] (TIKA-1305) New list processing changes appear to be causing RTFParser exception
[ https://issues.apache.org/jira/browse/TIKA-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013630#comment-14013630 ]
Tim Allison commented on TIKA-1305:
-----------------------------------
The attached file fails because it contains a listoverride, but the override table contains no listlevel. Note, too, that the listOverrideCount is 0 (the spec [http://msdn.microsoft.com/en-us/library/office/aa140301(v=office.10).aspx#rtfspec_14] says that it has to be 1 or 9).
When the parser hits listoverride, it resets listTableLevel to -1. When the parser then hits levelnfc within the override table, it tries to store that value using listTableLevel (still -1) as the array index, and the ArrayIndexOutOfBoundsException (AIOOBE) happens.
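To make the failure mode concrete, here is a minimal standalone sketch of the state described above. It is not the RTFParser code: the class name, the static fields, and the array length of 9 are assumptions chosen only to mirror the description (a level index left at -1, followed by an unguarded array write).

```java
public class OverrideIndexSketch {
    // Hypothetical stand-in for the per-list number types; 9 slots is an assumption.
    static int[] numberType = new int[9];

    // Mirrors the described state: listoverride has reset the level to -1
    // and no listlevel has advanced it since.
    static int listTableLevel = -1;

    // Unguarded write, as described for levelnfc inside the override table.
    static void setLevelNfc(int param) {
        numberType[listTableLevel] = param;
    }

    // Returns true if the unguarded write throws ArrayIndexOutOfBoundsException.
    static boolean throwsOnWrite(int param) {
        try {
            setLevelNfc(param);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("unguarded write throws: " + throwsOnWrite(23));
    }
}
```

The value 23 is arbitrary; any param triggers the same exception, because the index is the problem, not the value.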
Some options:
1) Simple solution: add a range check before setting the param
{code}
// Skip the write when listTableLevel is out of range (e.g. -1 after a listoverride)
if (listTableLevel > -1 && listTableLevel < currentList.numberType.length) {
    currentList.numberType[listTableLevel] = param;
}
{code}
2) More complex: try to keep track of whether this is a valid override table and/or whether it actually contains any listlevel items.
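For comparison, option 2 might look roughly like the sketch below. Everything in it is hypothetical: the class, the handler method names (onListOverride, onListLevel, onLevelNfc), and the 9-slot array are illustrative assumptions, not the existing RTFParser API. It only shows the idea of remembering whether a listlevel was seen inside the current override and ignoring levelnfc otherwise.

```java
public class OverrideTrackerSketch {
    // Assumption: at most 9 list levels, as in the earlier description of the spec.
    private final int[] numberType = new int[9];
    private int listTableLevel = -1;
    // True once a listlevel has been seen since the last listoverride.
    private boolean sawListLevel = false;
    // Counts levelnfc values dropped because no valid level was open.
    private int ignored = 0;

    // Called on listoverride: reset the level and forget earlier listlevels.
    public void onListOverride() {
        listTableLevel = -1;
        sawListLevel = false;
    }

    // Called on listlevel: advance to the next level slot.
    public void onListLevel() {
        sawListLevel = true;
        listTableLevel++;
    }

    // Called on levelnfc: store only if a listlevel is open and in range.
    public void onLevelNfc(int param) {
        boolean valid = sawListLevel
                && listTableLevel >= 0
                && listTableLevel < numberType.length;
        if (!valid) {
            ignored++;
            return;
        }
        numberType[listTableLevel] = param;
    }

    public int ignoredCount() {
        return ignored;
    }

    public int numberTypeAt(int level) {
        return numberType[level];
    }
}
```

Compared with the plain range check, this keeps enough state to report or log malformed override tables, at the cost of more bookkeeping in the parser.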
[~mikemccand] and [~dkincaid], any preference or other recommendations?
> New list processing changes appear to be causing RTFParser exception
> --------------------------------------------------------------------
>
> Key: TIKA-1305
> URL: https://issues.apache.org/jira/browse/TIKA-1305
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.6
> Environment: Mac OSX 10.7.5
> Tika 1.6-SNAPSHOT
> Reporter: Chris Bamford
> Priority: Minor
> Labels: newbie
> Attachments: rtfparsererror_2.rtf
>
>
> Some RTFs cause RTFParser to throw a RuntimeException:
> Unexpected RuntimeException from org.apache.tika.parser.rtf.RTFParser@425e60f2
> When tracing in the debugger (surfaces in CompositeParser.parse() where it catches the RuntimeException, line 244 in my copy), the exception (e) is:
> java.lang.ArrayIndexOutOfBoundsException: -1
> A committer (Tim Allison) believes that it is being caused by recent list processing changes.