Posted to dev@uima.apache.org by "Jérôme Rocheteau (JIRA)" <ui...@incubator.apache.org> on 2009/07/23 16:47:17 UTC
[jira] Issue Comment Edited: (UIMA-1447) Tabulations are annotated as tokens after a space
[ https://issues.apache.org/jira/browse/UIMA-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734606#action_12734606 ]
Jérôme Rocheteau edited comment on UIMA-1447 at 7/23/09 7:46 AM:
-----------------------------------------------------------------
I suggest this patch: it merely checks that the current character is not whitespace before creating a token annotation for a special character.
was (Author: jerome.rocheteau):
I suggest this patch: it merely checks if the current character isn't a whitespace while creating a token annotation is created for a special character.
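For illustration, the check described in the comment could look roughly like the sketch below. This is not the attached patch and not the WhitespaceTokenizer source: the class name SpecialCharGuardSketch, the method tokenize, and the simplified rule that any non-letter, non-digit character counts as a "special character" are all assumptions made for the example. Only Character.isWhitespace and Character.isLetterOrDigit are real JDK calls.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a tokenizer loop (not the UIMA code).
public class SpecialCharGuardSketch {

    // Returns the covered text of each token found in the input.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int start = -1; // start offset of the word being read, or -1 if none

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);

            if (Character.isLetterOrDigit(c)) {
                if (start < 0) {
                    start = i; // a new word begins here
                }
                continue;
            }

            // Any other character ends the word in progress, if there is one.
            if (start >= 0) {
                tokens.add(text.substring(start, i));
                start = -1;
            }

            // The guard from the comment: create a special-character token
            // only when the current character is not whitespace. Without it,
            // a tab following a space would be emitted as a token of its own.
            if (!Character.isWhitespace(c)) {
                tokens.add(String.valueOf(c));
            }
        }

        // Flush a word that runs to the end of the text.
        if (start >= 0) {
            tokens.add(text.substring(start));
        }
        return tokens;
    }
}
```

Calling SpecialCharGuardSketch.tokenize("a \tb, c") on a string that contains a tab right after a space yields tokens for the words and the comma only; neither the space nor the tab produces a token.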
> Tabulations are annotated as tokens after a space
> -------------------------------------------------
>
> Key: UIMA-1447
> URL: https://issues.apache.org/jira/browse/UIMA-1447
> Project: UIMA
> Issue Type: Bug
> Components: Sandbox-WhitespaceTokenizer
> Affects Versions: 2.3S
> Environment: Unix (ubuntu 8.04), Eclipse Galileo 3.5
> Reporter: Jérôme Rocheteau
> Attachments: patch-an-wst.txt
>
>
> This is a test-text for the Whitespace Tokenizer in the UIMA Sandbox.
> It behaves as follows: a '\t' character after a space is
> annotated as a token, and its covered text is set to the empty string ""!
> I suppose this shouldn't be the case; am I wrong?