You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@httpd.apache.org by GitBox <gi...@apache.org> on 2022/02/02 16:24:41 UTC

[GitHub] [httpd] ylavic opened a new pull request #289: PCRE2 with Thread Local Storage

ylavic opened a new pull request #289:
URL: https://github.com/apache/httpd/pull/289


   First commit is the current backport proposal.
   
   Next is TLS work:
   r1897240
   r1897241
   r1897242
   r1897244
   r1897248
   r1897250
   r1897260
   r1897261
   r1897263
   r1897386
   r1897459
   r1897460
   r1897461
   r1897462
   r1897472
   r1897543
   r1897651
   r1897680
   r1897689
   r1897691
   r1897692
   
   Finally (but not chronologically), PCRE version printing:
   r1612934
   r1612940
   r1613189


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org
For additional commands, e-mail: notifications-help@httpd.apache.org


[GitHub] [httpd] ylavic commented on a change in pull request #289: PCRE2 with Thread Local Storage

Posted by GitBox <gi...@apache.org>.
ylavic commented on a change in pull request #289:
URL: https://github.com/apache/httpd/pull/289#discussion_r797216794



##########
File path: server/main.c
##########
@@ -348,6 +353,23 @@ static process_rec *init_process(int *argc, const char * const * *argv)
     process->argc = *argc;
     process->argv = *argv;
     process->short_name = apr_filepath_name_get((*argv)[0]);
+
+#if AP_HAS_THREAD_LOCAL
+    {
+        apr_status_t rv;
+        apr_thread_t *thd = NULL;
+        if ((rv = ap_thread_main_create(&thd, process->pool))) {
+            char ctimebuff[APR_CTIME_LEN];
+            apr_ctime(ctimebuff, apr_time_now());
+            fprintf(stderr, "[%s] [crit] (%d) %s: %s failed "
+                            "to initialize thread context (%i), exiting\n",
+                            ctimebuff, stat, (*argv)[0], failed, rv);

Review comment:
       Fixed in r1897680 (a31b463532e83d557a4898cfb6411e144bc39266) and cheery picked here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org
For additional commands, e-mail: notifications-help@httpd.apache.org


[GitHub] [httpd] rpluem commented on a change in pull request #289: PCRE2 with Thread Local Storage

Posted by GitBox <gi...@apache.org>.
rpluem commented on a change in pull request #289:
URL: https://github.com/apache/httpd/pull/289#discussion_r796690812



##########
File path: server/main.c
##########
@@ -348,6 +353,23 @@ static process_rec *init_process(int *argc, const char * const * *argv)
     process->argc = *argc;
     process->argv = *argv;
     process->short_name = apr_filepath_name_get((*argv)[0]);
+
+#if AP_HAS_THREAD_LOCAL
+    {
+        apr_status_t rv;
+        apr_thread_t *thd = NULL;
+        if ((rv = ap_thread_main_create(&thd, process->pool))) {
+            char ctimebuff[APR_CTIME_LEN];
+            apr_ctime(ctimebuff, apr_time_now());
+            fprintf(stderr, "[%s] [crit] (%d) %s: %s failed "
+                            "to initialize thread context (%i), exiting\n",
+                            ctimebuff, stat, (*argv)[0], failed, rv);

Review comment:
       ```suggestion
               fprintf(stderr, "[%s] [crit] (%d) %s: Failed "
                               "to initialize thread context, exiting\n",
                               ctimebuff, rv, (*argv)[0]);
   ```
   As otherwise we take over the stuff `apr_app_initialize`which seems wrong.
   

##########
File path: server/util_pcre.c
##########
@@ -217,7 +264,134 @@ AP_DECLARE(int) ap_regcomp(ap_regex_t * preg, const char *pattern, int cflags)
  * ints. However, if the number of possible capturing brackets is small, use a
  * block of store on the stack, to reduce the use of malloc/free. The threshold
  * is in a macro that can be changed at configure time.
+ * Yet more unfortunately, PCRE2 wants an opaque context by providing the API
+ * to allocate and free it, so to minimize these calls we maintain one opaque
+ * context per thread (in Thread Local Storage, TLS) grown as needed, and while
+ * at it we do the same for PCRE1 ints vectors. Note that this requires a fast
+ * TLS mechanism to be worth it, which is the case of apr_thread_data_get/set()
+ * from/to ap_thread_current() when AP_HAS_THREAD_LOCAL; otherwise we'll do
+ * the allocation and freeing for each ap_regexec().
  */
+
+#ifdef HAVE_PCRE2
+typedef pcre2_match_data* match_data_pt;
+typedef size_t*           match_vector_pt;
+#else
+typedef int*              match_data_pt;
+typedef int*              match_vector_pt;
+#endif
+
+static APR_INLINE
+match_data_pt alloc_match_data(apr_size_t size,
+                               match_vector_pt small_vector)
+{
+    match_data_pt data;
+
+#ifdef HAVE_PCRE2
+    data = pcre2_match_data_create(size, NULL);
+#else
+    if (size > POSIX_MALLOC_THRESHOLD) {
+        data = malloc(size * sizeof(int) * 3);
+    }
+    else {
+        data = small_vector;
+    }
+#endif
+
+    return data;
+}
+
+static APR_INLINE
+void free_match_data(match_data_pt data, apr_size_t size)
+{
+#ifdef HAVE_PCRE2
+    pcre2_match_data_free(data);
+#else
+    if (size > POSIX_MALLOC_THRESHOLD) {
+        free(data);
+    }
+#endif
+}
+
+#if AP_HAS_THREAD_LOCAL
+
+struct apreg_tls {
+    match_data_pt data;
+    apr_size_t size;
+};
+
+#ifdef HAVE_PCRE2
+static apr_status_t apreg_tls_cleanup(void *arg)
+{
+    struct apreg_tls *tls = arg;
+    pcre2_match_data_free(tls->data); /* NULL safe */
+    return APR_SUCCESS;
+}
+#endif
+
+static match_data_pt get_match_data(apr_size_t size,
+                                    match_vector_pt small_vector,
+                                    int *to_free)
+{
+    apr_thread_t *current;
+    struct apreg_tls *tls = NULL;
+
+    /* Even though AP_HAS_THREAD_LOCAL, we may still be called by a
+     * native/non-apr thread, let's fall back to alloc/free in this case.
+     */
+    current = ap_thread_current();
+    if (!current) {
+        *to_free = 1;
+        return alloc_match_data(size, small_vector);
+    }
+
+    apr_thread_data_get((void **)&tls, "apreg", current);
+    if (!tls || tls->size < size) {
+        apr_pool_t *tp = apr_thread_pool_get(current);
+        if (!tls) {
+            tls = apr_pcalloc(tp, sizeof(*tls));
+#ifdef HAVE_PCRE2
+            apr_thread_data_set(tls, "apreg", apreg_tls_cleanup, current);
+#else
+            apr_thread_data_set(tls, "apreg", NULL, current);
+#endif
+        }
+
+        tls->size *= 2;
+        if (tls->size < size) {
+            tls->size = size;
+            if (tls->size < POSIX_MALLOC_THRESHOLD) {
+                tls->size = POSIX_MALLOC_THRESHOLD;
+            }
+        }
+
+#ifdef HAVE_PCRE2
+        pcre2_match_data_free(tls->data); /* NULL safe */
+        tls->data = pcre2_match_data_create(tls->size, NULL);
+        if (!tls->data) {
+            tls->size = 0;
+            return NULL;
+        }
+#else
+        tls->data = apr_palloc(tp, tls->size * sizeof(int) * 3);

Review comment:
       If we are using PCRE and not PCRE2 we accumulate memory here. Shouldn't we create a subpool of `tp`, store it in `tls` as well and clean it before the `apr_palloc`? Or is this not worth the effort as we do not expect frequent reallocations? If it is not worth, we may should document this in a comment.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org
For additional commands, e-mail: notifications-help@httpd.apache.org


[GitHub] [httpd] ylavic commented on pull request #289: PCRE2 with Thread Local Storage

Posted by GitBox <gi...@apache.org>.
ylavic commented on pull request #289:
URL: https://github.com/apache/httpd/pull/289#issuecomment-1028115906


   close/reopen to restart ci


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org
For additional commands, e-mail: notifications-help@httpd.apache.org


[GitHub] [httpd] ylavic commented on a change in pull request #289: PCRE2 with Thread Local Storage

Posted by GitBox <gi...@apache.org>.
ylavic commented on a change in pull request #289:
URL: https://github.com/apache/httpd/pull/289#discussion_r797216794



##########
File path: server/main.c
##########
@@ -348,6 +353,23 @@ static process_rec *init_process(int *argc, const char * const * *argv)
     process->argc = *argc;
     process->argv = *argv;
     process->short_name = apr_filepath_name_get((*argv)[0]);
+
+#if AP_HAS_THREAD_LOCAL
+    {
+        apr_status_t rv;
+        apr_thread_t *thd = NULL;
+        if ((rv = ap_thread_main_create(&thd, process->pool))) {
+            char ctimebuff[APR_CTIME_LEN];
+            apr_ctime(ctimebuff, apr_time_now());
+            fprintf(stderr, "[%s] [crit] (%d) %s: %s failed "
+                            "to initialize thread context (%i), exiting\n",
+                            ctimebuff, stat, (*argv)[0], failed, rv);

Review comment:
       Fixed in r1897680 (a31b463532e83d557a4898cfb6411e144bc39266) and cheery picked here..




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org
For additional commands, e-mail: notifications-help@httpd.apache.org


[GitHub] [httpd] ylavic commented on a change in pull request #289: PCRE2 with Thread Local Storage

Posted by GitBox <gi...@apache.org>.
ylavic commented on a change in pull request #289:
URL: https://github.com/apache/httpd/pull/289#discussion_r797183848



##########
File path: server/main.c
##########
@@ -348,6 +353,23 @@ static process_rec *init_process(int *argc, const char * const * *argv)
     process->argc = *argc;
     process->argv = *argv;
     process->short_name = apr_filepath_name_get((*argv)[0]);
+
+#if AP_HAS_THREAD_LOCAL
+    {
+        apr_status_t rv;
+        apr_thread_t *thd = NULL;
+        if ((rv = ap_thread_main_create(&thd, process->pool))) {
+            char ctimebuff[APR_CTIME_LEN];
+            apr_ctime(ctimebuff, apr_time_now());
+            fprintf(stderr, "[%s] [crit] (%d) %s: %s failed "
+                            "to initialize thread context (%i), exiting\n",
+                            ctimebuff, stat, (*argv)[0], failed, rv);

Review comment:
       Bad copypasta, will fix thanks.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org
For additional commands, e-mail: notifications-help@httpd.apache.org


[GitHub] [httpd] ylavic closed pull request #289: PCRE2 with Thread Local Storage

Posted by GitBox <gi...@apache.org>.
ylavic closed pull request #289:
URL: https://github.com/apache/httpd/pull/289


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org
For additional commands, e-mail: notifications-help@httpd.apache.org


[GitHub] [httpd] rpluem commented on a change in pull request #289: PCRE2 with Thread Local Storage

Posted by GitBox <gi...@apache.org>.
rpluem commented on a change in pull request #289:
URL: https://github.com/apache/httpd/pull/289#discussion_r797316167



##########
File path: server/util_pcre.c
##########
@@ -217,7 +264,134 @@ AP_DECLARE(int) ap_regcomp(ap_regex_t * preg, const char *pattern, int cflags)
  * ints. However, if the number of possible capturing brackets is small, use a
  * block of store on the stack, to reduce the use of malloc/free. The threshold
  * is in a macro that can be changed at configure time.
+ * Yet more unfortunately, PCRE2 wants an opaque context by providing the API
+ * to allocate and free it, so to minimize these calls we maintain one opaque
+ * context per thread (in Thread Local Storage, TLS) grown as needed, and while
+ * at it we do the same for PCRE1 ints vectors. Note that this requires a fast
+ * TLS mechanism to be worth it, which is the case of apr_thread_data_get/set()
+ * from/to ap_thread_current() when AP_HAS_THREAD_LOCAL; otherwise we'll do
+ * the allocation and freeing for each ap_regexec().
  */
+
+#ifdef HAVE_PCRE2
+typedef pcre2_match_data* match_data_pt;
+typedef size_t*           match_vector_pt;
+#else
+typedef int*              match_data_pt;
+typedef int*              match_vector_pt;
+#endif
+
+static APR_INLINE
+match_data_pt alloc_match_data(apr_size_t size,
+                               match_vector_pt small_vector)
+{
+    match_data_pt data;
+
+#ifdef HAVE_PCRE2
+    data = pcre2_match_data_create(size, NULL);
+#else
+    if (size > POSIX_MALLOC_THRESHOLD) {
+        data = malloc(size * sizeof(int) * 3);
+    }
+    else {
+        data = small_vector;
+    }
+#endif
+
+    return data;
+}
+
+static APR_INLINE
+void free_match_data(match_data_pt data, apr_size_t size)
+{
+#ifdef HAVE_PCRE2
+    pcre2_match_data_free(data);
+#else
+    if (size > POSIX_MALLOC_THRESHOLD) {
+        free(data);
+    }
+#endif
+}
+
+#if AP_HAS_THREAD_LOCAL
+
+struct apreg_tls {
+    match_data_pt data;
+    apr_size_t size;
+};
+
+#ifdef HAVE_PCRE2
+static apr_status_t apreg_tls_cleanup(void *arg)
+{
+    struct apreg_tls *tls = arg;
+    pcre2_match_data_free(tls->data); /* NULL safe */
+    return APR_SUCCESS;
+}
+#endif
+
+static match_data_pt get_match_data(apr_size_t size,
+                                    match_vector_pt small_vector,
+                                    int *to_free)
+{
+    apr_thread_t *current;
+    struct apreg_tls *tls = NULL;
+
+    /* Even though AP_HAS_THREAD_LOCAL, we may still be called by a
+     * native/non-apr thread, let's fall back to alloc/free in this case.
+     */
+    current = ap_thread_current();
+    if (!current) {
+        *to_free = 1;
+        return alloc_match_data(size, small_vector);
+    }
+
+    apr_thread_data_get((void **)&tls, "apreg", current);
+    if (!tls || tls->size < size) {
+        apr_pool_t *tp = apr_thread_pool_get(current);
+        if (!tls) {
+            tls = apr_pcalloc(tp, sizeof(*tls));
+#ifdef HAVE_PCRE2
+            apr_thread_data_set(tls, "apreg", apreg_tls_cleanup, current);
+#else
+            apr_thread_data_set(tls, "apreg", NULL, current);
+#endif
+        }
+
+        tls->size *= 2;
+        if (tls->size < size) {
+            tls->size = size;
+            if (tls->size < POSIX_MALLOC_THRESHOLD) {
+                tls->size = POSIX_MALLOC_THRESHOLD;
+            }
+        }
+
+#ifdef HAVE_PCRE2
+        pcre2_match_data_free(tls->data); /* NULL safe */
+        tls->data = pcre2_match_data_create(tls->size, NULL);
+        if (!tls->data) {
+            tls->size = 0;
+            return NULL;
+        }
+#else
+        tls->data = apr_palloc(tp, tls->size * sizeof(int) * 3);

Review comment:
       Good points.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org
For additional commands, e-mail: notifications-help@httpd.apache.org


[GitHub] [httpd] ylavic commented on a change in pull request #289: PCRE2 with Thread Local Storage

Posted by GitBox <gi...@apache.org>.
ylavic commented on a change in pull request #289:
URL: https://github.com/apache/httpd/pull/289#discussion_r797196718



##########
File path: server/util_pcre.c
##########
@@ -217,7 +264,134 @@ AP_DECLARE(int) ap_regcomp(ap_regex_t * preg, const char *pattern, int cflags)
  * ints. However, if the number of possible capturing brackets is small, use a
  * block of store on the stack, to reduce the use of malloc/free. The threshold
  * is in a macro that can be changed at configure time.
+ * Yet more unfortunately, PCRE2 wants an opaque context by providing the API
+ * to allocate and free it, so to minimize these calls we maintain one opaque
+ * context per thread (in Thread Local Storage, TLS) grown as needed, and while
+ * at it we do the same for PCRE1 ints vectors. Note that this requires a fast
+ * TLS mechanism to be worth it, which is the case of apr_thread_data_get/set()
+ * from/to ap_thread_current() when AP_HAS_THREAD_LOCAL; otherwise we'll do
+ * the allocation and freeing for each ap_regexec().
  */
+
+#ifdef HAVE_PCRE2
+typedef pcre2_match_data* match_data_pt;
+typedef size_t*           match_vector_pt;
+#else
+typedef int*              match_data_pt;
+typedef int*              match_vector_pt;
+#endif
+
+static APR_INLINE
+match_data_pt alloc_match_data(apr_size_t size,
+                               match_vector_pt small_vector)
+{
+    match_data_pt data;
+
+#ifdef HAVE_PCRE2
+    data = pcre2_match_data_create(size, NULL);
+#else
+    if (size > POSIX_MALLOC_THRESHOLD) {
+        data = malloc(size * sizeof(int) * 3);
+    }
+    else {
+        data = small_vector;
+    }
+#endif
+
+    return data;
+}
+
+static APR_INLINE
+void free_match_data(match_data_pt data, apr_size_t size)
+{
+#ifdef HAVE_PCRE2
+    pcre2_match_data_free(data);
+#else
+    if (size > POSIX_MALLOC_THRESHOLD) {
+        free(data);
+    }
+#endif
+}
+
+#if AP_HAS_THREAD_LOCAL
+
+struct apreg_tls {
+    match_data_pt data;
+    apr_size_t size;
+};
+
+#ifdef HAVE_PCRE2
+static apr_status_t apreg_tls_cleanup(void *arg)
+{
+    struct apreg_tls *tls = arg;
+    pcre2_match_data_free(tls->data); /* NULL safe */
+    return APR_SUCCESS;
+}
+#endif
+
+static match_data_pt get_match_data(apr_size_t size,
+                                    match_vector_pt small_vector,
+                                    int *to_free)
+{
+    apr_thread_t *current;
+    struct apreg_tls *tls = NULL;
+
+    /* Even though AP_HAS_THREAD_LOCAL, we may still be called by a
+     * native/non-apr thread, let's fall back to alloc/free in this case.
+     */
+    current = ap_thread_current();
+    if (!current) {
+        *to_free = 1;
+        return alloc_match_data(size, small_vector);
+    }
+
+    apr_thread_data_get((void **)&tls, "apreg", current);
+    if (!tls || tls->size < size) {
+        apr_pool_t *tp = apr_thread_pool_get(current);
+        if (!tls) {
+            tls = apr_pcalloc(tp, sizeof(*tls));
+#ifdef HAVE_PCRE2
+            apr_thread_data_set(tls, "apreg", apreg_tls_cleanup, current);
+#else
+            apr_thread_data_set(tls, "apreg", NULL, current);
+#endif
+        }
+
+        tls->size *= 2;
+        if (tls->size < size) {
+            tls->size = size;
+            if (tls->size < POSIX_MALLOC_THRESHOLD) {
+                tls->size = POSIX_MALLOC_THRESHOLD;
+            }
+        }
+
+#ifdef HAVE_PCRE2
+        pcre2_match_data_free(tls->data); /* NULL safe */
+        tls->data = pcre2_match_data_create(tls->size, NULL);
+        if (!tls->data) {
+            tls->size = 0;
+            return NULL;
+        }
+#else
+        tls->data = apr_palloc(tp, tls->size * sizeof(int) * 3);

Review comment:
       The growth is x 2 so I don't expect to enter here many times. We'll hardly reach 8K per thread for the PCRE1 ovector since it's already `8K / 3*sizeof(int) ~= 682` captures. Filling the current node of `tp` (or unkileky reserve one more node at some point) like we do now sounds better to me than the reserving 8K from the begining with a subpool.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org
For additional commands, e-mail: notifications-help@httpd.apache.org


[GitHub] [httpd] ylavic commented on pull request #289: PCRE2 with Thread Local Storage

Posted by GitBox <gi...@apache.org>.
ylavic commented on pull request #289:
URL: https://github.com/apache/httpd/pull/289#issuecomment-1054183315


   Merged in r1898467 / 285b6285e799d3480150375cb853d40973d3ab4c


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org
For additional commands, e-mail: notifications-help@httpd.apache.org


[GitHub] [httpd] ylavic closed pull request #289: PCRE2 with Thread Local Storage

Posted by GitBox <gi...@apache.org>.
ylavic closed pull request #289:
URL: https://github.com/apache/httpd/pull/289


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@httpd.apache.org
For additional commands, e-mail: notifications-help@httpd.apache.org