Posted to dev@trafficserver.apache.org by "Leif Hedstrom (JIRA)" <ji...@apache.org> on 2009/11/19 23:01:39 UTC

[jira] Created: (TS-48) Change all regex code to use PCRE

Change all regex code to use PCRE
---------------------------------

                 Key: TS-48
                 URL: https://issues.apache.org/jira/browse/TS-48
             Project: Traffic Server
          Issue Type: Improvement
          Components: Core
    Affects Versions: 2.0a
            Reporter: Leif Hedstrom


We are adding some new regex stuff into TS 2.0, using PCRE, so we should examine all existing regex features, and unify on PCRE.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (TS-48) Change all regex code to use PCRE

Posted by "Andrew Hsu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TS-48?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780301#action_12780301 ] 

Andrew Hsu commented on TS-48:
------------------------------

A useful comparison of regex features:
http://www.regular-expressions.info/refflavors.html

Although this is a bit old, it still has some useful performance numbers, especially the "simple matches" at the bottom of the page:
http://www.boost.org/doc/libs/1_40_0/libs/regex/doc/gcc-performance.html

License looks good to me:
http://www.pcre.org/licence.txt

Besides, httpd uses it (when configured '--with-pcre'):
http://httpd.apache.org/docs/2.2/new_features_2_2.html

Thus far, PCRE looks favorable. It would be good to run a simple performance test of the POSIX regex we currently use against PCRE, to make sure the latest PCRE version we want to switch to still performs well.

Cheers,
Andrew




[jira] Updated: (TS-48) Change all regex code to use PCRE

Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TS-48?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Leif Hedstrom updated TS-48:
----------------------------

    Fix Version/s:     (was: 2.2.0)
                   2.1.0

> Change all regex code to use PCRE
> ---------------------------------
>
>                 Key: TS-48
>                 URL: https://issues.apache.org/jira/browse/TS-48
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 2.0.0a
>            Reporter: Leif Hedstrom
>             Fix For: 2.1.0
>
>
> We are adding some new regex stuff into TS 2.0, using PCRE, so we should examine all existing regex features, and unify on PCRE.



[jira] Assigned: (TS-48) Change all regex code to use PCRE

Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TS-48?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Leif Hedstrom reassigned TS-48:
-------------------------------

    Assignee: Leif Hedstrom

> Change all regex code to use PCRE
> ---------------------------------
>
>                 Key: TS-48
>                 URL: https://issues.apache.org/jira/browse/TS-48
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 2.0.0a
>            Reporter: Leif Hedstrom
>            Assignee: Leif Hedstrom
>             Fix For: 2.1.0
>
>
> We are adding some new regex stuff into TS 2.0, using PCRE, so we should examine all existing regex features, and unify on PCRE.



[jira] Updated: (TS-48) Change all regex code to use PCRE

Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TS-48?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Leif Hedstrom updated TS-48:
----------------------------

    Fix Version/s: 2.2.0

> Change all regex code to use PCRE
> ---------------------------------
>
>                 Key: TS-48
>                 URL: https://issues.apache.org/jira/browse/TS-48
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 2.0.0a
>            Reporter: Leif Hedstrom
>             Fix For: 2.2.0
>
>
> We are adding some new regex stuff into TS 2.0, using PCRE, so we should examine all existing regex features, and unify on PCRE.



[jira] Commented: (TS-48) Change all regex code to use PCRE

Posted by "Manjesh Nilange (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TS-48?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780321#action_12780321 ] 

Manjesh Nilange commented on TS-48:
-----------------------------------

I ran a simple, rudimentary performance test in which I took a set of 10 strings of the form

http://a#.subdom1.subdom2.subdom3.yahoo.com/path.php

(where # went from 1 to 10)

and matched the set 10k times against the regex "http://a([a-z0-9])+.subdom1.subdom2.subdom3.yahoo.com/(.*)". Over many runs, the POSIX matching took an average of 0.86 seconds and the PCRE matching took 0.07 seconds (in user time). So it looks like PCRE is roughly 12x faster than POSIX for this use case.
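
The shape of that test can be sketched roughly as follows. This is only an illustration, not the harness that was actually run: it uses Python's built-in re module as a stand-in engine (neither POSIX regex nor PCRE), so its timings are not comparable to the numbers above. The names build_urls and run_benchmark are made up for this sketch, and it reports total CPU time rather than user time alone.

```python
import re
import time

# The regex from the comment, copied verbatim. Note the unescaped dots,
# which match any character rather than only a literal ".".
PATTERN = r"http://a([a-z0-9])+.subdom1.subdom2.subdom3.yahoo.com/(.*)"


def build_urls(count=10):
    """Return the test set: a1 ... a<count> under the same domain."""
    return [
        f"http://a{i}.subdom1.subdom2.subdom3.yahoo.com/path.php"
        for i in range(1, count + 1)
    ]


def run_benchmark(rounds=10_000):
    """Match every URL `rounds` times; return (match_count, cpu_seconds)."""
    compiled = re.compile(PATTERN)  # compile once, outside the timed loop
    urls = build_urls()
    matched = 0
    start = time.process_time()  # CPU time (user + system), not wall clock
    for _ in range(rounds):
        for url in urls:
            if compiled.match(url):
                matched += 1
    return matched, time.process_time() - start


if __name__ == "__main__":
    matched, seconds = run_benchmark()
    print(f"{matched} matches in {seconds:.2f}s CPU time")
```

Compiling the pattern once and timing only the match loop keeps compilation cost out of the measurement, which is probably what matters for a proxy that compiles its rules at config load.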




[jira] Resolved: (TS-48) Change all regex code to use PCRE

Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TS-48?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Leif Hedstrom resolved TS-48.
-----------------------------

    Resolution: Fixed

> Change all regex code to use PCRE
> ---------------------------------
>
>                 Key: TS-48
>                 URL: https://issues.apache.org/jira/browse/TS-48
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 2.0.0a
>            Reporter: Leif Hedstrom
>            Assignee: Leif Hedstrom
>             Fix For: 2.1.0
>
>         Attachments: pcre.diff
>
>
> We are adding some new regex stuff into TS 2.0, using PCRE, so we should examine all existing regex features, and unify on PCRE.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (TS-48) Change all regex code to use PCRE

Posted by "Stephane Belmon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TS-48?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780708#action_12780708 ] 

Stephane Belmon commented on TS-48:
-----------------------------------

This is an old rant, but: http://swtch.com/~rsc/regexp/regexp1.html
It obviously depends on what regexes are used, and more importantly on whether they're exposed to input from the wire -- weird config files can trivially kill the proxy. Hopefully there's no "regex itself from the wire" case, which is even more interesting.
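
For anyone who wants to see the effect, here is a minimal sketch of the pathological case described in the linked article: the pattern a?^n a^n matched against the input a^n. It uses Python's re module only because it is a convenient backtracking engine to try this on; it is not PCRE, and the helper names pathological_case and time_match are invented for this example.

```python
import re
import time


def pathological_case(n):
    """Build the a?^n a^n pattern from the linked article and its input a^n."""
    pattern = "a?" * n + "a" * n
    text = "a" * n
    return pattern, text


def time_match(n):
    """Return (matched, seconds) for one full match attempt at size n."""
    pattern, text = pathological_case(n)
    compiled = re.compile(pattern)
    start = time.perf_counter()
    matched = compiled.fullmatch(text) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    # Keep n small: on a backtracking engine each step up can roughly
    # double the running time.
    for n in (10, 14, 18):
        matched, seconds = time_match(n)
        print(f"n={n:2d} matched={matched} time={seconds:.4f}s")
```

The match always succeeds, but a backtracking matcher may try a very large number of ways to split the input between the optional and the mandatory parts before it gets there. The same concern would apply to any config-supplied regex that is run against request data.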





[jira] Commented: (TS-48) Change all regex code to use PCRE

Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TS-48?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780324#action_12780324 ] 

Leif Hedstrom commented on TS-48:
---------------------------------

Cool, yeah, that's the experience I had too, particularly with the regex remap plugin. Besides being faster, PCRE-style regexes are also much more familiar to most people.

The only other "candidate" I can imagine here would be Boost, but I'm not sure we want to drag that into the build requirements. PCRE is a fairly small library, and it's available on pretty much any platform.



[jira] Updated: (TS-48) Change all regex code to use PCRE

Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TS-48?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Leif Hedstrom updated TS-48:
----------------------------

    Attachment: pcre.diff



[jira] Work started: (TS-48) Change all regex code to use PCRE

Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TS-48?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on TS-48 started by Leif Hedstrom.
