Posted to commits@stanbol.apache.org by "Rupert Westenthaler (JIRA)" <ji...@apache.org> on 2012/11/27 13:31:58 UTC

[jira] [Resolved] (STANBOL-818) EntitylinkingEngine encounters StringIndexOutOfBounds exceptions

     [ https://issues.apache.org/jira/browse/STANBOL-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rupert Westenthaler resolved STANBOL-818.
-----------------------------------------

    Resolution: Fixed

Fixed with http://svn.apache.org/viewvc?rev=1414147&view=rev

This was caused by a bug in the ProcessingState class. This class iterates over Sections (typically Sentences) in the parsed content and collects the Tokens within each Section. If a Section does not contain any linkable Token, it continues with the next Section. However, in those cases the Tokens of the skipped Section were not correctly reset.

Because of that, the tokens list still contained Tokens of the previous Section whenever the previous sentence had not a single linkable Token. In such situations the offset calculations were flawed, resulting in negative indexes for calls to substring().
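
The failure mode described above can be reproduced with a minimal, self-contained sketch. This is not the actual ProcessingState code: the Token and Section classes, the tokenTexts() method and its resetPerSection flag are hypothetical names made up for illustration, and it assumes that token text is read via section-relative offsets (token start minus section start). The token selection and the linkable flags for the sample text are also assumed, not taken from a real NLP run.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class StaleTokenSketch {

    /** Hypothetical token holding absolute character offsets into the full text. */
    static final class Token {
        final int start;
        final int end;
        final boolean linkable;

        Token(int start, int end, boolean linkable) {
            this.start = start;
            this.end = end;
            this.linkable = linkable;
        }
    }

    /** Hypothetical section (e.g. a sentence) with its tokens. */
    static final class Section {
        final int start;
        final int end;
        final List<Token> tokens;

        Section(int start, int end, List<Token> tokens) {
            this.start = start;
            this.end = end;
            this.tokens = tokens;
        }
    }

    static final String TEXT =
        "It comprises 114 counties and one independent city. Missouri's capital is Jefferson City.";

    /** Builds a token for the first occurrence of word in text. */
    static Token token(String text, String word, boolean linkable) {
        int start = text.indexOf(word);
        return new Token(start, start + word.length(), linkable);
    }

    /** Two sentences; only the second one has (assumed) linkable tokens. */
    static List<Section> sampleSections() {
        int firstEnd = TEXT.indexOf('.') + 1;
        Section first = new Section(0, firstEnd, Arrays.asList(
            token(TEXT, "counties", false), token(TEXT, "city", false)));
        Section second = new Section(firstEnd + 1, TEXT.length(), Arrays.asList(
            token(TEXT, "Missouri", true), token(TEXT, "Jefferson", true)));
        return Arrays.asList(first, second);
    }

    /**
     * Returns the token texts of every section that has at least one linkable token.
     * With resetPerSection == false the list is only cleared after a processed
     * section, so tokens of a skipped section leak into the next one.
     */
    static List<String> tokenTexts(String text, List<Section> sections, boolean resetPerSection) {
        List<String> result = new ArrayList<>();
        List<Token> tokens = new ArrayList<>();
        for (Section section : sections) {
            if (resetPerSection) {
                tokens.clear(); // the fix: always start a section with an empty list
            }
            boolean hasLinkable = false;
            for (Token t : section.tokens) {
                tokens.add(t);
                hasLinkable |= t.linkable;
            }
            if (!hasLinkable) {
                continue; // skipped section: stale tokens remain unless reset above
            }
            String sectionText = text.substring(section.start, section.end);
            for (Token t : tokens) {
                // Section-relative offsets; negative for tokens of an earlier section.
                result.add(sectionText.substring(t.start - section.start, t.end - section.start));
            }
            tokens.clear();
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokenTexts(TEXT, sampleSections(), true));
        try {
            tokenTexts(TEXT, sampleSections(), false);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("stale tokens: " + e.getMessage());
        }
    }
}
```

In this sketch the first sentence is skipped because none of its tokens is linkable. Without the per-section reset, its tokens are still in the list when the second sentence is processed, and subtracting the second sentence's start offset from their positions yields the negative substring() index.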
                
> EntitylinkingEngine encounters StringIndexOutOfBounds exceptions
> ----------------------------------------------------------------
>
>                 Key: STANBOL-818
>                 URL: https://issues.apache.org/jira/browse/STANBOL-818
>             Project: Stanbol
>          Issue Type: Bug
>          Components: Enhancer
>            Reporter: Rupert Westenthaler
>
> For some texts the EntityLinkingEngine encounters negative String indexes:
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -6
> 	at java.lang.String.substring(String.java:1931)
> 	at org.apache.stanbol.enhancer.engines.entitylinking.impl.ProcessingState.getTokenText(ProcessingState.java:324)
> A text that triggers this is "It comprises 114 counties and one independent city. Missouri's capital is Jefferson City. "
