Posted to issues@opennlp.apache.org by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/06/07 12:59:00 UTC

[jira] [Comment Edited] (OPENNLP-1266) Limit normalization regexes

    [ https://issues.apache.org/jira/browse/OPENNLP-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858606#comment-16858606 ] 

Tim Allison edited comment on OPENNLP-1266 at 6/7/19 12:58 PM:
---------------------------------------------------------------

When we limit []+ to []{1,100}:

||Length||Time(ms)||
|1000|14|
|2000|13|
|3000|12|
|4000|16|
|5000|17|
|6000|16|
|7000|52|
|8000|41|
|9000|25|
|10000|31|
|11000|24|
|12000|37|
|13000|39|
|14000|48|
|15000|43|
|16000|44|
|17000|41|
|18000|33|
|19000|33|
|20000|35|
|21000|51|
|22000|52|
|23000|78|
|24000|88|
|25000|69|
|26000|78|
|27000|67|
|28000|108|
|29000|113|
|30000|109|
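
For anyone who wants to reproduce this kind of measurement, here is a minimal sketch. The two patterns are hypothetical e-mail-like regexes picked for illustration, not copied from the OpenNLP normalizers, and the class and method names are made up as well. The input is a long run of characters with no '@', which makes the unbounded form rescan the rest of the run from every start position.

```java
import java.util.regex.Pattern;

public class BoundedQuantifierSketch {

    // Hypothetical patterns for illustration only (assumption: NOT the OpenNLP regexes).
    static final Pattern UNBOUNDED =
        Pattern.compile("[-_.0-9A-Za-z]+@[-_.0-9A-Za-z]+");
    static final Pattern BOUNDED =
        Pattern.compile("[-_.0-9A-Za-z]{1,100}@[-_.0-9A-Za-z]{1,100}");

    // Replace every match of the pattern with a single space.
    static String normalize(Pattern p, String text) {
        return p.matcher(text).replaceAll(" ");
    }

    // Build a string of n copies of c (avoids String.repeat, which needs Java 11+).
    static String repeat(char c, int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    // Wall-clock time of one normalize call, in milliseconds (rough, no JIT warm-up).
    static long timeMillis(Pattern p, String text) {
        long start = System.nanoTime();
        normalize(p, text);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int[] lengths = {1000, 5000, 10000, 20000};
        System.out.println("Length\tunbounded(ms)\tbounded(ms)");
        for (int len : lengths) {
            // Worst case for these patterns: a long run with no '@' anywhere.
            String text = repeat('a', len);
            System.out.println(len + "\t" + timeMillis(UNBOUNDED, text)
                + "\t" + timeMillis(BOUNDED, text));
        }
    }
}
```

One thing to keep in mind with this approach: the bounded form is not a drop-in equivalent. In the sketch above, a run of more than 100 characters before the '@' is only partially matched, so the leading characters are left in the output.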


> Limit normalization regexes
> ---------------------------
>
>                 Key: OPENNLP-1266
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1266
>             Project: OpenNLP
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>
> Several of the normalizer regexes are not bounded.  In rare cases, this can cause eye-opening performance costs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)