Posted to dev@tomcat.apache.org by "tbw777 (via GitHub)" <gi...@apache.org> on 2023/02/17 15:53:41 UTC

[GitHub] [tomcat] tbw777 opened a new pull request, #592: Improved regexp performance: "a-zA-Z0-9_" -> "\w"

tbw777 opened a new pull request, #592:
URL: https://github.com/apache/tomcat/pull/592

   https://gist.github.com/tbw777/8ce56d2cc3a0216012362d18b7262eb2


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[GitHub] [tomcat] aooohan commented on pull request #592: Improved regexp performance: "a-zA-Z0-9_" -> "\w"

Posted by "aooohan (via GitHub)" <gi...@apache.org>.
aooohan commented on PR #592:
URL: https://github.com/apache/tomcat/pull/592#issuecomment-1463387623

   Merged manually, thanks.


[GitHub] [tomcat] tbw777 commented on pull request #592: Improved regexp performance: "a-zA-Z0-9_" -> "\w"

Posted by "tbw777 (via GitHub)" <gi...@apache.org>.
tbw777 commented on PR #592:
URL: https://github.com/apache/tomcat/pull/592#issuecomment-1444501056

   https://gist.github.com/tbw777/cd394ec67a01f9e7e8fe4d0c66d74637


[GitHub] [tomcat] ChristopherSchultz commented on pull request #592: Improved regexp performance: "a-zA-Z0-9_" -> "\w"

Posted by "ChristopherSchultz (via GitHub)" <gi...@apache.org>.
ChristopherSchultz commented on PR #592:
URL: https://github.com/apache/tomcat/pull/592#issuecomment-1444608082

   Thanks @tbw777 for the updated micro-benchmarks. I agree that the performance improvement is significant enough to warrant a change to Tomcat.


[GitHub] [tomcat] markt-asf commented on a diff in pull request #592: Improved regexp performance: "a-zA-Z0-9_" -> "\w"

Posted by "markt-asf (via GitHub)" <gi...@apache.org>.
markt-asf commented on code in PR #592:
URL: https://github.com/apache/tomcat/pull/592#discussion_r1116803461


##########
java/jakarta/servlet/jsp/resources/jspxml.xsd:
##########
@@ -25,7 +25,7 @@
 <!ENTITY SetProp    "(&Identifier;|\*)">
 <!ENTITY RelativeURL  "[^:#/\?]*(:{0,0}|[#/\?].*)">
 <!ENTITY Length     "[0-9]*&#x25;?">
-<!ENTITY AsciiName    "[A-Za-z0-9_-]*">
+<!ENTITY AsciiName    "[\w-]*">

Review Comment:
   The Tomcat project is unable to change these schemas. We have to use the ones provided by the specification. Please direct this part of the PR to the Jakarta Schema project:
   https://github.com/eclipse-ee4j/jakartaee-schemas



[GitHub] [tomcat] aooohan closed pull request #592: Improved regexp performance: "a-zA-Z0-9_" -> "\w"

Posted by "aooohan (via GitHub)" <gi...@apache.org>.
aooohan closed pull request #592: Improved regexp performance: "a-zA-Z0-9_" -> "\w"
URL: https://github.com/apache/tomcat/pull/592


[GitHub] [tomcat] ChristopherSchultz commented on pull request #592: Improved regexp performance: "a-zA-Z0-9_" -> "\w"

Posted by "ChristopherSchultz (via GitHub)" <gi...@apache.org>.
ChristopherSchultz commented on PR #592:
URL: https://github.com/apache/tomcat/pull/592#issuecomment-1443631640

   I'm curious about performance data when the `\w` has other things added to it, which is the case for all examples in Tomcat. This microbenchmark only compares `[A-Za-z0-9_]` as the whole character class against `\w` and `[\w]` without any other characters added to the character class.
   
   Also, most expressions used in Tomcat have a trailing `+`, and for those the attached performance data show that performance is again terrible. `\w` still appears measurably and consistently better than `[A-Za-z0-9_]`, but I think it's important to benchmark what Tomcat _actually_ uses and not something _very close_ to what Tomcat uses.
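
The comparison asked for above can be sketched as follows. This is a minimal illustration, not code from Tomcat or from the linked gists: the class `WordClassCheck`, its methods, and the sample inputs are hypothetical, and the pattern `[A-Za-z0-9_-]+` is only assumed to be representative of a character class with an extra character and a trailing `+`. It first checks that the two spellings accept the same inputs (by default, Java's `\w` is ASCII-only, equivalent to `[a-zA-Z_0-9]`), then times them with a naive loop. A real measurement should use a harness such as JMH, since a loop like this is distorted by JIT warm-up.

```java
import java.util.List;
import java.util.regex.Pattern;

public class WordClassCheck {
    // Explicit ASCII class with an extra '-' and a trailing '+' (assumed representative).
    static final Pattern EXPLICIT = Pattern.compile("[A-Za-z0-9_-]+");
    // The same class written with the predefined \w shorthand.
    static final Pattern SHORTHAND = Pattern.compile("[\\w-]+");

    // True when both patterns give the same full-match result for the input.
    static boolean agree(String input) {
        return EXPLICIT.matcher(input).matches() == SHORTHAND.matcher(input).matches();
    }

    // Naive timing loop: total nanoseconds for the given number of full-match attempts.
    static long timeNanos(Pattern pattern, String input, int iterations) {
        int hits = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            if (pattern.matcher(input).matches()) {
                hits++;
            }
        }
        long elapsed = System.nanoTime() - start;
        // Use the counter so the loop body cannot be discarded as dead code.
        if (hits < 0) {
            throw new IllegalStateException("unreachable");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Hypothetical sample inputs: matching, non-matching, empty, and non-ASCII.
        List<String> samples = List.of("abc-DEF_123", "has space", "", "caf\u00e9", "a.b");
        for (String s : samples) {
            System.out.println("'" + s + "' agree=" + agree(s));
        }
        String input = "some_identifier-with-dashes_0123456789";
        int iterations = 1_000_000;
        // Warm-up pass so both patterns are measured after JIT compilation has started.
        timeNanos(EXPLICIT, input, iterations);
        timeNanos(SHORTHAND, input, iterations);
        System.out.println("explicit  ns: " + timeNanos(EXPLICIT, input, iterations));
        System.out.println("shorthand ns: " + timeNanos(SHORTHAND, input, iterations));
    }
}
```

The equivalence check only holds while `Pattern.UNICODE_CHARACTER_CLASS` is not set. With that flag, `\w` also matches non-ASCII word characters and the two patterns are no longer interchangeable.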


[GitHub] [tomcat] tbw777 commented on pull request #592: Improved regexp performance: "a-zA-Z0-9_" -> "\w"

Posted by "tbw777 (via GitHub)" <gi...@apache.org>.
tbw777 commented on PR #592:
URL: https://github.com/apache/tomcat/pull/592#issuecomment-1445017914

   Compile speed
   https://gist.github.com/tbw777/6a6303f64894e65a160d3f6321685d27

