Posted to dev@opennlp.apache.org by GitBox <gi...@apache.org> on 2022/01/06 08:39:59 UTC

[GitHub] [opennlp] kinow commented on pull request #399: OPENNLP-1350 Improve normaliser MAIL_REGEX

kinow commented on pull request #399:
URL: https://github.com/apache/opennlp/pull/399#issuecomment-1006378664


   > Hi, the unit tests do not need to be updated. They were valid before, and still are.
   
   :+1: 
   
   > This is primarily a performance optimisation. Additionally, the PR allows `+` in the local-part of emails, and disallows `_` in the domain part. This is just an improvement.
   
   I think this is a good improvement; I use `+` a lot for $work and also for myself. I think we should add a unit test for these two changes (`+` accepted in the local-part, `_` rejected in the domain part), just to make sure they are not reverted accidentally later. WDYT?
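
Something along these lines could serve as a starting point for such a test. This is only a rough sketch: the pattern below is a simplified stand-in and not the actual `MAIL_REGEX` from the PR, and the class and method names (`MailRegexSketch`, `isMail`) are made up for illustration. A real test would exercise the normaliser itself.

```java
import java.util.regex.Pattern;

class MailRegexSketch {

    // Hypothetical stand-in pattern, NOT the MAIL_REGEX from the PR.
    // Local-part: letters, digits and . _ % + -  (so '+' is accepted).
    // Domain: dot-separated labels of letters, digits and '-' (so '_' is
    // rejected), followed by an alphabetic top-level label.
    static final Pattern MAIL = Pattern.compile(
        "[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\\.)+[A-Za-z]{2,}");

    // Returns true only if the whole string looks like an email address
    // according to the stand-in pattern above.
    static boolean isMail(String s) {
        return MAIL.matcher(s).matches();
    }

    public static void main(String[] args) {
        // '+' in the local-part should be accepted.
        System.out.println(isMail("bruno+lists@example.org"));
        // '_' in the domain part should be rejected.
        System.out.println(isMail("bruno@my_host.example.org"));
    }
}
```

The two calls in `main` correspond to the two behaviours that should be pinned down: the first address is expected to match and the second is not.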
   
   > The actual email local-part rules are terrifying, and probably not worth expressing in a regex—the current is a compromise which should capture common email addresses out there.
   
   :+1: and I think the lookbehind and performance should be fine, no need for any benchmark IMO. I'll go through your comment above (thanks!) just to educate myself :neckbeard: 
   
   Thank you!
   -Bruno

