Posted to dev@opennlp.apache.org by GitBox <gi...@apache.org> on 2021/12/16 21:10:07 UTC

[GitHub] [opennlp] jonmv commented on pull request #355: OPENNLP-1266 -- Limit regexes in UrlCharSequenceNormalizer

jonmv commented on pull request #355:
URL: https://github.com/apache/opennlp/pull/355#issuecomment-996195973


   Please consider https://github.com/apache/opennlp/pull/399 instead. I believe the URL regex shouldn't cause super-linear complexity the way the MAIL regex does. The problem is that the regex is used in `String.replaceAll(...)`, so on bad input a match is attempted at every start position, that is, against each suffix of the input. For the MAIL regex each of those attempts can scan far into the input before failing; this does not happen for the URL regex, where an attempt can only consume a few characters before it either fails or succeeds definitively.
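
   To make the difference concrete, here is a rough sketch that counts how many characters the regex engine reads during a `replaceAll` over input that never matches. The two patterns are simplified stand-ins chosen for illustration, not the actual expressions in `UrlCharSequenceNormalizer`, and the class, helper, and wrapper names are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class RegexScanCost {

    // Simplified, hypothetical patterns for illustration only;
    // they are NOT the exact regexes used by OpenNLP.
    static final Pattern MAIL_LIKE = Pattern.compile("[-_.0-9A-Za-z]+@[-_0-9A-Za-z]+");
    static final Pattern URL_LIKE = Pattern.compile("https?://[-_.?&~;+=/#0-9A-Za-z]+");

    /** CharSequence wrapper that counts every character read by the regex engine. */
    static final class CountingSequence implements CharSequence {
        private final String text;
        long reads = 0;

        CountingSequence(String text) {
            this.text = text;
        }

        @Override
        public int length() {
            return text.length();
        }

        @Override
        public char charAt(int index) {
            reads++;
            return text.charAt(index);
        }

        @Override
        public CharSequence subSequence(int start, int end) {
            return text.subSequence(start, end);
        }

        @Override
        public String toString() {
            return text;
        }
    }

    /** Runs replaceAll over the input and returns the number of charAt calls made. */
    static long countCharReads(Pattern pattern, String input) {
        CountingSequence seq = new CountingSequence(input);
        pattern.matcher(seq).replaceAll(" ");
        return seq.reads;
    }

    /** Builds a string of n copies of c (avoids String.repeat, which needs Java 11). */
    static String repeat(char c, int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) {
        // "Bad" input: a long run of address-like characters with no '@' and no "http".
        for (int n : new int[] {1000, 2000, 4000}) {
            String bad = repeat('a', n);
            System.out.printf("n=%d mail-like reads=%d url-like reads=%d%n",
                    n, countCharReads(MAIL_LIKE, bad), countCharReads(URL_LIKE, bad));
        }
    }
}
```

   With these stand-in patterns, doubling the input length should roughly quadruple the read count for the mail-like pattern, because every start position scans to the end of the input before giving up, while the URL-like count grows roughly in proportion to the length, because each attempt fails on the first character that is not `h`.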

