You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by "Lu, Yingqi" <yi...@intel.com> on 2014/05/14 01:55:44 UTC

RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Dear All,

During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.

Quick test result on modern dual socket Intel platform (Linux Kernel 3.13.9)
SO_REUSEPORT patch (bugzilla #55897)

1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement

2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement

3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)

Bucket patch (bugzilla #56279, only impact multiple listen statement case)

1.       Prefork MPM: 2 listen statements: 42% throughput improvement

2.       Worker MPM: 2 listen statements: 7% throughput improvement

In all the above testing cases, significant response time reductions are observed, even with throughput improvements.

Please let me know your feedback and comments.

Thanks,
Yingqi
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
Sent: Monday, March 17, 2014 1:41 PM
To: dev@httpd.apache.org<ma...@httpd.apache.org>
Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Dear all,

Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).

Below are the changes we made into this new version:

According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.

According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.

According to Jeff Trawick's comments,
1. We generate the patches against the httpd trunk.
2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.

Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.

Thanks,
Yingqi Lu



From: Lu, Yingqi
Sent: Tuesday, March 04, 2014 10:43 AM
To: dev@httpd.apache.org<ma...@httpd.apache.org>
Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Hi Jeff,

Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.

Thanks,
Yingqi

From: Jeff Trawick [mailto:trawick@gmail.com]
Sent: Tuesday, March 04, 2014 10:08 AM
To: Apache HTTP Server Development List
Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com>> wrote:
Hi All,

I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.

If you need any additional information/supporting data, please let us know as well.

Yeah, it has been on my todo list, but I don't have time to give an in depth review at the moment.  Here are a few questions/comments.  (And you'll have to deal with the fact that it is unnecessarily tedious for me to evaluate higher-level considerations if there are a lot of distractions, such as the code comments below ;)  But others are of course free to chime in.)

The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.

What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?

Have you tried the event MPM?

Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.

Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???

We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.

For the configure-time check for 3.9 kernel: I think we'd also use AC_TRY_COMPILE at configure time to confirm that the SO_REUSEPORT definition is available, and not enable it if the system includes doesn't define it.  (Does that cause a problem for any significant number of people?)

Don't mention the patch in the patch ;) (e.g., "This function is added for the patch.")

Incomplete comments on style/syntax issues:

* mixing declarations and statements (e.g., "duplr->next = 0; apr_socket_t *temps;") isn't supported by all compilers and is distracting when reviewing
* suitable identifier names (e.g., fix global variable "flag" and whatever else isn't appropriate; "ap_post_config_listeners" should be renamed to indicate what it does
* APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" and don't add reminders in comments
* this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
and so on


Thanks,
Yingqi


From: Lu, Yingqi
Sent: Friday, January 24, 2014 3:26 PM
To: dev@httpd.apache.org<ma...@httpd.apache.org>
Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Dear All,

Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.

Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.

Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.

Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.

This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.

Configuration:
<IfModule prefork.c>
    ListenBacklog 105384
    ServerLimit 105000
    MaxClients 1024
    MaxRequestsPerChild 0
    StartServers 64
    MinSpareServers 8
    MaxSpareServers 16
</IfModule>

1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

Thanks,
Yingqi



--
Born in Roswell... married an alien...
http://emptyhammock.com/
http://edjective.org/




--
Born in Roswell... married an alien...
http://emptyhammock.com/
http://edjective.org/


RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by "Lu, Yingqi" <yi...@intel.com>.
Dear all,

Any other feedback/comments/questions? 

Thanks,
Yingqi

-----Original Message-----
From: Lu, Yingqi 
Sent: Wednesday, May 14, 2014 9:00 AM
To: dev@httpd.apache.org
Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Hi Jim,

Thanks very much for your email.

In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 

In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 

Please let me know if this answers your question.

Thanks,
Yingqi

-----Original Message-----
From: Jim Jagielski [mailto:jim@jaguNET.com]
Sent: Wednesday, May 14, 2014 6:57 AM
To: dev@httpd.apache.org
Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

This is very cool!

mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
We look OK on Linux, FreeBSD and OSX...

I'm +1 on folding into trunk.

On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:

> Dear All,
>  
> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>  
> Quick test result on modern dual socket Intel platform (Linux Kernel
> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>  
> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>  
> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>  
> Please let me know your feedback and comments.
>  
> Thanks,
> Yingqi
> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>  
> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
> Sent: Monday, March 17, 2014 1:41 PM
> To: dev@httpd.apache.org
> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> Dear all,
>  
> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>  
> Below are the changes we made into this new version:
>  
> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>  
> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>  
> According to Jeff Trawick's comments,
> 1. We generate the patches against the httpd trunk.
> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>  
> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>  
> Thanks,
> Yingqi Lu
>  
>  
>  
> From: Lu, Yingqi
> Sent: Tuesday, March 04, 2014 10:43 AM
> To: dev@httpd.apache.org
> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> Hi Jeff,
>  
> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>  
> Thanks,
> Yingqi
>  
> From: Jeff Trawick [mailto:trawick@gmail.com]
> Sent: Tuesday, March 04, 2014 10:08 AM
> To: Apache HTTP Server Development List
> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
> Hi All,
>  
> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>  
> If you need any additional information/supporting data, please let us know as well.
>  
> Yeah, it has been on my todo list, but I don't have time to give an in 
> depth review at the moment.  Here are a few questions/comments.  (And 
> you'll have to deal with the fact that it is unnecessarily tedious for 
> me to evaluate higher-level considerations if there are a lot of 
> distractions, such as the code comments below ;)  But others are of 
> course free to chime in.)
>  
> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>  
> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>  
> Have you tried the event MPM?
>  
> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>  
> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>  
> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>  
> For the configure-time check for 3.9 kernel: I think we'd also use 
> AC_TRY_COMPILE at configure time to confirm that the SO_REUSEPORT 
> definition is available, and not enable it if the system includes 
> doesn't define it.  (Does that cause a problem for any significant 
> number of people?)
>  
> Don't mention the patch in the patch ;) (e.g., "This function is added 
> for the patch.")
>  
> Incomplete comments on style/syntax issues:
>  
> * mixing declarations and statements (e.g., "duplr->next = 0; 
> apr_socket_t *temps;") isn't supported by all compilers and is 
> distracting when reviewing
> * suitable identifier names (e.g., fix global variable "flag" and 
> whatever else isn't appropriate; "ap_post_config_listeners" should be 
> renamed to indicate what it does
> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
> and don't add reminders in comments
> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
> and so on
>  
>  
> Thanks,
> Yingqi
>  
>  
> From: Lu, Yingqi
> Sent: Friday, January 24, 2014 3:26 PM
> To: dev@httpd.apache.org
> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> Dear All,
>  
> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>  
> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>  
> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>  
> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>  
> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>  
> Configuration:
> <IfModule prefork.c>
>     ListenBacklog 105384
>     ServerLimit 105000
>     MaxClients 1024
>     MaxRequestsPerChild 0
>     StartServers 64
>     MinSpareServers 8
>     MaxSpareServers 16
> </IfModule>
>  
> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>  
> Thanks,
> Yingqi
> 
> 
>  
> --
> Born in Roswell... married an alien...
> http://emptyhammock.com/
> http://edjective.org/
>  
> 
> 
>  
> --
> Born in Roswell... married an alien...
> http://emptyhammock.com/
> http://edjective.org/
>  
> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>


RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by "Lu, Yingqi" <yi...@intel.com>.
Dear all,

Any other feedback/comments/questions? 

Thanks,
Yingqi

-----Original Message-----
From: Lu, Yingqi 
Sent: Wednesday, May 14, 2014 9:00 AM
To: dev@httpd.apache.org
Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Hi Jim,

Thanks very much for your email.

In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 

In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 

Please let me know if this answers your question.

Thanks,
Yingqi

-----Original Message-----
From: Jim Jagielski [mailto:jim@jaguNET.com]
Sent: Wednesday, May 14, 2014 6:57 AM
To: dev@httpd.apache.org
Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

This is very cool!

mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
We look OK on Linux, FreeBSD and OSX...

I'm +1 on folding into trunk.

On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:

> Dear All,
>  
> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>  
> Quick test result on modern dual socket Intel platform (Linux Kernel
> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>  
> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>  
> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>  
> Please let me know your feedback and comments.
>  
> Thanks,
> Yingqi
> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>  
> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
> Sent: Monday, March 17, 2014 1:41 PM
> To: dev@httpd.apache.org
> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> Dear all,
>  
> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>  
> Below are the changes we made into this new version:
>  
> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>  
> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>  
> According to Jeff Trawick's comments,
> 1. We generate the patches against the httpd trunk.
> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>  
> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>  
> Thanks,
> Yingqi Lu
>  
>  
>  
> From: Lu, Yingqi
> Sent: Tuesday, March 04, 2014 10:43 AM
> To: dev@httpd.apache.org
> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> Hi Jeff,
>  
> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>  
> Thanks,
> Yingqi
>  
> From: Jeff Trawick [mailto:trawick@gmail.com]
> Sent: Tuesday, March 04, 2014 10:08 AM
> To: Apache HTTP Server Development List
> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
> Hi All,
>  
> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>  
> If you need any additional information/supporting data, please let us know as well.
>  
> Yeah, it has been on my todo list, but I don't have time to give an in 
> depth review at the moment.  Here are a few questions/comments.  (And 
> you'll have to deal with the fact that it is unnecessarily tedious for 
> me to evaluate higher-level considerations if there are a lot of 
> distractions, such as the code comments below ;)  But others are of 
> course free to chime in.)
>  
> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>  
> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>  
> Have you tried the event MPM?
>  
> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>  
> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>  
> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>  
> For the configure-time check for 3.9 kernel: I think we'd also use 
> AC_TRY_COMPILE at configure time to confirm that the SO_REUSEPORT 
> definition is available, and not enable it if the system includes 
> doesn't define it.  (Does that cause a problem for any significant 
> number of people?)
>  
> Don't mention the patch in the patch ;) (e.g., "This function is added 
> for the patch.")
>  
> Incomplete comments on style/syntax issues:
>  
> * mixing declarations and statements (e.g., "duplr->next = 0; 
> apr_socket_t *temps;") isn't supported by all compilers and is 
> distracting when reviewing
> * suitable identifier names (e.g., fix global variable "flag" and 
> whatever else isn't appropriate; "ap_post_config_listeners" should be 
> renamed to indicate what it does
> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
> and don't add reminders in comments
> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
> and so on
>  
>  
> Thanks,
> Yingqi
>  
>  
> From: Lu, Yingqi
> Sent: Friday, January 24, 2014 3:26 PM
> To: dev@httpd.apache.org
> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> Dear All,
>  
> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>  
> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>  
> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>  
> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>  
> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>  
> Configuration:
> <IfModule prefork.c>
>     ListenBacklog 105384
>     ServerLimit 105000
>     MaxClients 1024
>     MaxRequestsPerChild 0
>     StartServers 64
>     MinSpareServers 8
>     MaxSpareServers 16
> </IfModule>
>  
> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>  
> Thanks,
> Yingqi
> 
> 
>  
> --
> Born in Roswell... married an alien...
> http://emptyhammock.com/
> http://edjective.org/
>  
> 
> 
>  
> --
> Born in Roswell... married an alien...
> http://emptyhammock.com/
> http://edjective.org/
>  
> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>


RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by "Lu, Yingqi" <yi...@intel.com>.
Dear all,

I sent this message twice in the morning, but I did not receive anything back. I am resending this one more time, just trying to gather your feedback/comments/questions on both of the patches. 

Sorry for the duplications.

Thanks,
Yingqi

-----Original Message-----
From: Lu, Yingqi
Sent: Wednesday, May 14, 2014 9:00 AM
To: dev@httpd.apache.org
Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Hi Jim,

Thanks very much for your email.

In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 

In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 

Please let me know if this answers your question.

Thanks,
Yingqi

-----Original Message-----
From: Jim Jagielski [mailto:jim@jaguNET.com]
Sent: Wednesday, May 14, 2014 6:57 AM
To: dev@httpd.apache.org
Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

This is very cool!

mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
We look OK on Linux, FreeBSD and OSX...

I'm +1 on folding into trunk.

On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:

> Dear All,
>  
> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>  
> Quick test result on modern dual socket Intel platform (Linux Kernel
> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>  
> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>  
> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>  
> Please let me know your feedback and comments.
>  
> Thanks,
> Yingqi
> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>  
> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
> Sent: Monday, March 17, 2014 1:41 PM
> To: dev@httpd.apache.org
> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> Dear all,
>  
> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>  
> Below are the changes we made into this new version:
>  
> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>  
> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>  
> According to Jeff Trawick's comments,
> 1. We generate the patches against the httpd trunk.
> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>  
> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>  
> Thanks,
> Yingqi Lu
>  
>  
>  
> From: Lu, Yingqi
> Sent: Tuesday, March 04, 2014 10:43 AM
> To: dev@httpd.apache.org
> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> Hi Jeff,
>  
> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>  
> Thanks,
> Yingqi
>  
> From: Jeff Trawick [mailto:trawick@gmail.com]
> Sent: Tuesday, March 04, 2014 10:08 AM
> To: Apache HTTP Server Development List
> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
> Hi All,
>  
> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>  
> If you need any additional information/supporting data, please let us know as well.
>  
> Yeah, it has been on my todo list, but I don't have time to give an in 
> depth review at the moment.  Here are a few questions/comments.  (And 
> you'll have to deal with the fact that it is unnecessarily tedious for 
> me to evaluate higher-level considerations if there are a lot of 
> distractions, such as the code comments below ;)  But others are of 
> course free to chime in.)
>  
> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>  
> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>  
> Have you tried the event MPM?
>  
> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>  
> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>  
> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>  
> For the configure-time check for 3.9 kernel: I think we'd also use 
> AC_TRY_COMPILE at configure time to confirm that the SO_REUSEPORT 
> definition is available, and not enable it if the system includes 
> doesn't define it.  (Does that cause a problem for any significant 
> number of people?)
>  
> Don't mention the patch in the patch ;) (e.g., "This function is added 
> for the patch.")
>  
> Incomplete comments on style/syntax issues:
>  
> * mixing declarations and statements (e.g., "duplr->next = 0; 
> apr_socket_t *temps;") isn't supported by all compilers and is 
> distracting when reviewing
> * suitable identifier names (e.g., fix global variable "flag" and 
> whatever else isn't appropriate; "ap_post_config_listeners" should be 
> renamed to indicate what it does
> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
> and don't add reminders in comments
> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
> and so on
>  
>  
> Thanks,
> Yingqi
>  
>  
> From: Lu, Yingqi
> Sent: Friday, January 24, 2014 3:26 PM
> To: dev@httpd.apache.org
> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> Dear All,
>  
> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>  
> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>  
> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>  
> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>  
> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>  
> Configuration:
> <IfModule prefork.c>
>     ListenBacklog 105384
>     ServerLimit 105000
>     MaxClients 1024
>     MaxRequestsPerChild 0
>     StartServers 64
>     MinSpareServers 8
>     MaxSpareServers 16
> </IfModule>
>  
> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>  
> Thanks,
> Yingqi
> 
> 
>  
> --
> Born in Roswell... married an alien...
> http://emptyhammock.com/
> http://edjective.org/
>  
> 
> 
>  
> --
> Born in Roswell... married an alien...
> http://emptyhammock.com/
> http://edjective.org/
>  
> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>


RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by "Lu, Yingqi" <yi...@intel.com>.
Hi All,

I just want to ping again on these two patches. 

Thanks,
Yingqi

-----Original Message-----
From: Lu, Yingqi [mailto:yingqi.lu@intel.com] 
Sent: Friday, May 23, 2014 9:03 AM
To: dev@httpd.apache.org
Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Dear All,

These two patches are the modified version based on the original single patch I sent out early January. Since I sent out the original patch, I have gotten lots of positive feedbacks and suggestions from this community. Thanks very much for your support and help!

The two most recent patches have addressed all the suggestions I got from you so far and they include changes for all three Linux MPMs. They have also been fully tested. Please refer to the change details and test results in the email below.

If there are no other comments/suggestions, should we fold them into the trunk?

Thanks,
Yingqi

-----Original Message-----
From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
Sent: Friday, May 16, 2014 11:53 AM
To: dev@httpd.apache.org
Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Hi Jim,

Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.

Attached are the two most recent patches. I already updated the bugzilla #55897 as well.

Thanks,
Yingqi

-----Original Message-----
From: Jim Jagielski [mailto:jim@jaguNET.com]
Sent: Thursday, May 15, 2014 7:53 AM
To: dev@httpd.apache.org
Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).

On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:

> Hi Jim,
> 
> Thanks very much for your email.
> 
> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
> 
> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
> 
> Please let me know if this answers your question.
> 
> Thanks,
> Yingqi
> 
> -----Original Message-----
> From: Jim Jagielski [mailto:jim@jaguNET.com]
> Sent: Wednesday, May 14, 2014 6:57 AM
> To: dev@httpd.apache.org
> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
> 
> This is very cool!
> 
> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
> We look OK on Linux, FreeBSD and OSX...
> 
> I'm +1 on folding into trunk.
> 
> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
> 
>> Dear All,
>> 
>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>> 
>> Quick test result on modern dual socket Intel platform (Linux Kernel
>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>> 
>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>> 
>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>> 
>> Please let me know your feedback and comments.
>> 
>> Thanks,
>> Yingqi
>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>> 
>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>> Sent: Monday, March 17, 2014 1:41 PM
>> To: dev@httpd.apache.org
>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> Dear all,
>> 
>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>> 
>> Below are the changes we made into this new version:
>> 
>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>> 
>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>> 
>> According to Jeff Trawick's comments, 1. We generate the patches 
>> against the httpd trunk.
>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>> 
>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>> 
>> Thanks,
>> Yingqi Lu
>> 
>> 
>> 
>> From: Lu, Yingqi
>> Sent: Tuesday, March 04, 2014 10:43 AM
>> To: dev@httpd.apache.org
>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> Hi Jeff,
>> 
>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>> 
>> Thanks,
>> Yingqi
>> 
>> From: Jeff Trawick [mailto:trawick@gmail.com]
>> Sent: Tuesday, March 04, 2014 10:08 AM
>> To: Apache HTTP Server Development List
>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>> Hi All,
>> 
>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>> 
>> If you need any additional information/supporting data, please let us know as well.
>> 
>> Yeah, it has been on my todo list, but I don't have time to give an 
>> in depth review at the moment.  Here are a few questions/comments.
>> (And you'll have to deal with the fact that it is unnecessarily 
>> tedious for me to evaluate higher-level considerations if there are a 
>> lot of distractions, such as the code comments below ;)  But others 
>> are of course free to chime in.)
>> 
>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>> 
>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>> 
>> Have you tried the event MPM?
>> 
>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>> 
>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>> 
>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>> 
>> For the configure-time check for 3.9 kernel: I think we'd also use 
>> AC_TRY_COMPILE at configure time to confirm that the SO_REUSEPORT 
>> definition is available, and not enable it if the system includes 
>> doesn't define it.  (Does that cause a problem for any significant 
>> number of people?)
>> 
>> Don't mention the patch in the patch ;) (e.g., "This function is 
>> added for the patch.")
>> 
>> Incomplete comments on style/syntax issues:
>> 
>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>> apr_socket_t *temps;") isn't supported by all compilers and is 
>> distracting when reviewing
>> * suitable identifier names (e.g., fix global variable "flag" and 
>> whatever else isn't appropriate; "ap_post_config_listeners" should be 
>> renamed to indicate what it does
>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>> and don't add reminders in comments
>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>> and so on
>> 
>> 
>> Thanks,
>> Yingqi
>> 
>> 
>> From: Lu, Yingqi
>> Sent: Friday, January 24, 2014 3:26 PM
>> To: dev@httpd.apache.org
>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> Dear All,
>> 
>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>> 
>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>> 
>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>> 
>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>> 
>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>> 
>> Configuration:
>> <IfModule prefork.c>
>>    ListenBacklog 105384
>>    ServerLimit 105000
>>    MaxClients 1024
>>    MaxRequestsPerChild 0
>>    StartServers 64
>>    MinSpareServers 8
>>    MaxSpareServers 16
>> </IfModule>
>> 
>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>> 
>> Thanks,
>> Yingqi
>> 
>> 
>> 
>> --
>> Born in Roswell... married an alien...
>> http://emptyhammock.com/
>> http://edjective.org/
>> 
>> 
>> 
>> 
>> --
>> Born in Roswell... married an alien...
>> http://emptyhammock.com/
>> http://edjective.org/
>> 
>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
> 


RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by "Lu, Yingqi" <yi...@intel.com>.
Dear All,

These two patches are the modified version based on the original single patch I sent out early January. Since I sent out the original patch, I have gotten lots of positive feedbacks and suggestions from this community. Thanks very much for your support and help!

The two most recent patches have addressed all the suggestions I got from you so far and they include changes for all three Linux MPMs. They have also been fully tested. Please refer to the change details and test results in the email below.

If there are no other comments/suggestions, should we fold them into the trunk?

Thanks,
Yingqi

-----Original Message-----
From: Lu, Yingqi [mailto:yingqi.lu@intel.com] 
Sent: Friday, May 16, 2014 11:53 AM
To: dev@httpd.apache.org
Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Hi Jim,

Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.

Attached are the two most recent patches. I already updated the bugzilla #55897 as well.

Thanks,
Yingqi

-----Original Message-----
From: Jim Jagielski [mailto:jim@jaguNET.com]
Sent: Thursday, May 15, 2014 7:53 AM
To: dev@httpd.apache.org
Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).

On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:

> Hi Jim,
> 
> Thanks very much for your email.
> 
> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
> 
> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
> 
> Please let me know if this answers your question.
> 
> Thanks,
> Yingqi
> 
> -----Original Message-----
> From: Jim Jagielski [mailto:jim@jaguNET.com]
> Sent: Wednesday, May 14, 2014 6:57 AM
> To: dev@httpd.apache.org
> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
> 
> This is very cool!
> 
> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
> We look OK on Linux, FreeBSD and OSX...
> 
> I'm +1 on folding into trunk.
> 
> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
> 
>> Dear All,
>> 
>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>> 
>> Quick test result on modern dual socket Intel platform (Linux Kernel
>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>> 
>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>> 
>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>> 
>> Please let me know your feedback and comments.
>> 
>> Thanks,
>> Yingqi
>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>> 
>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>> Sent: Monday, March 17, 2014 1:41 PM
>> To: dev@httpd.apache.org
>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> Dear all,
>> 
>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>> 
>> Below are the changes we made into this new version:
>> 
>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>> 
>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>> 
>> According to Jeff Trawick's comments, 1. We generate the patches 
>> against the httpd trunk.
>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>> 
>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>> 
>> Thanks,
>> Yingqi Lu
>> 
>> 
>> 
>> From: Lu, Yingqi
>> Sent: Tuesday, March 04, 2014 10:43 AM
>> To: dev@httpd.apache.org
>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> Hi Jeff,
>> 
>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>> 
>> Thanks,
>> Yingqi
>> 
>> From: Jeff Trawick [mailto:trawick@gmail.com]
>> Sent: Tuesday, March 04, 2014 10:08 AM
>> To: Apache HTTP Server Development List
>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>> Hi All,
>> 
>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>> 
>> If you need any additional information/supporting data, please let us know as well.
>> 
>> Yeah, it has been on my todo list, but I don't have time to give an 
>> in depth review at the moment.  Here are a few questions/comments.
>> (And you'll have to deal with the fact that it is unnecessarily 
>> tedious for me to evaluate higher-level considerations if there are a 
>> lot of distractions, such as the code comments below ;)  But others 
>> are of course free to chime in.)
>> 
>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>> 
>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>> 
>> Have you tried the event MPM?
>> 
>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>> 
>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>> 
>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>> 
>> For the configure-time check for 3.9 kernel: I think we'd also use 
>> AC_TRY_COMPILE at configure time to confirm that the SO_REUSEPORT 
>> definition is available, and not enable it if the system includes 
>> doesn't define it.  (Does that cause a problem for any significant 
>> number of people?)
>> 
>> Don't mention the patch in the patch ;) (e.g., "This function is 
>> added for the patch.")
>> 
>> Incomplete comments on style/syntax issues:
>> 
>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>> apr_socket_t *temps;") isn't supported by all compilers and is 
>> distracting when reviewing
>> * suitable identifier names (e.g., fix global variable "flag" and 
>> whatever else isn't appropriate; "ap_post_config_listeners" should be 
>> renamed to indicate what it does
>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>> and don't add reminders in comments
>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>> and so on
>> 
>> 
>> Thanks,
>> Yingqi
>> 
>> 
>> From: Lu, Yingqi
>> Sent: Friday, January 24, 2014 3:26 PM
>> To: dev@httpd.apache.org
>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> Dear All,
>> 
>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>> 
>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>> 
>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>> 
>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>> 
>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>> 
>> Configuration:
>> <IfModule prefork.c>
>>    ListenBacklog 105384
>>    ServerLimit 105000
>>    MaxClients 1024
>>    MaxRequestsPerChild 0
>>    StartServers 64
>>    MinSpareServers 8
>>    MaxSpareServers 16
>> </IfModule>
>> 
>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>> 
>> Thanks,
>> Yingqi
>> 
>> 
>> 
>> --
>> Born in Roswell... married an alien...
>> http://emptyhammock.com/
>> http://edjective.org/
>> 
>> 
>> 
>> 
>> --
>> Born in Roswell... married an alien...
>> http://emptyhammock.com/
>> http://edjective.org/
>> 
>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
> 


RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by "Lu, Yingqi" <yi...@intel.com>.
Dear All,

I am checking if there are any questions/comments on both of the patches? Also, I am wondering what the process of patch acceptance is.

Please let me know if there is anything I can do to help.

Thanks,
Yingqi

-----Original Message-----
From: Lu, Yingqi [mailto:yingqi.lu@intel.com] 
Sent: Friday, May 16, 2014 11:53 AM
To: dev@httpd.apache.org
Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Hi Jim,

Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.

Attached are the two most recent patches. I already updated the bugzilla #55897 as well.

Thanks,
Yingqi

-----Original Message-----
From: Jim Jagielski [mailto:jim@jaguNET.com]
Sent: Thursday, May 15, 2014 7:53 AM
To: dev@httpd.apache.org
Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).

On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:

> Hi Jim,
> 
> Thanks very much for your email.
> 
> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
> 
> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
> 
> Please let me know if this answers your question.
> 
> Thanks,
> Yingqi
> 
> -----Original Message-----
> From: Jim Jagielski [mailto:jim@jaguNET.com]
> Sent: Wednesday, May 14, 2014 6:57 AM
> To: dev@httpd.apache.org
> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
> 
> This is very cool!
> 
> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
> We look OK on Linux, FreeBSD and OSX...
> 
> I'm +1 on folding into trunk.
> 
> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
> 
>> Dear All,
>> 
>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>> 
>> Quick test result on modern dual socket Intel platform (Linux Kernel
>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>> 
>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>> 
>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>> 
>> Please let me know your feedback and comments.
>> 
>> Thanks,
>> Yingqi
>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>> 
>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>> Sent: Monday, March 17, 2014 1:41 PM
>> To: dev@httpd.apache.org
>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> Dear all,
>> 
>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>> 
>> Below are the changes we made into this new version:
>> 
>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>> 
>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>> 
>> According to Jeff Trawick's comments, 1. We generate the patches 
>> against the httpd trunk.
>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>> 
>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>> 
>> Thanks,
>> Yingqi Lu
>> 
>> 
>> 
>> From: Lu, Yingqi
>> Sent: Tuesday, March 04, 2014 10:43 AM
>> To: dev@httpd.apache.org
>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> Hi Jeff,
>> 
>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>> 
>> Thanks,
>> Yingqi
>> 
>> From: Jeff Trawick [mailto:trawick@gmail.com]
>> Sent: Tuesday, March 04, 2014 10:08 AM
>> To: Apache HTTP Server Development List
>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>> Hi All,
>> 
>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>> 
>> If you need any additional information/supporting data, please let us know as well.
>> 
>> Yeah, it has been on my todo list, but I don't have time to give an 
>> in depth review at the moment.  Here are a few questions/comments.
>> (And you'll have to deal with the fact that it is unnecessarily 
>> tedious for me to evaluate higher-level considerations if there are a 
>> lot of distractions, such as the code comments below ;)  But others 
>> are of course free to chime in.)
>> 
>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>> 
>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>> 
>> Have you tried the event MPM?
>> 
>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>> 
>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>> 
>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>> 
>> For the configure-time check for 3.9 kernel: I think we'd also use 
>> AC_TRY_COMPILE at configure time to confirm that the SO_REUSEPORT 
>> definition is available, and not enable it if the system includes 
>> doesn't define it.  (Does that cause a problem for any significant 
>> number of people?)
>> 
>> Don't mention the patch in the patch ;) (e.g., "This function is 
>> added for the patch.")
>> 
>> Incomplete comments on style/syntax issues:
>> 
>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>> apr_socket_t *temps;") isn't supported by all compilers and is 
>> distracting when reviewing
>> * suitable identifier names (e.g., fix global variable "flag" and 
>> whatever else isn't appropriate; "ap_post_config_listeners" should be 
>> renamed to indicate what it does
>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>> and don't add reminders in comments
>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>> and so on
>> 
>> 
>> Thanks,
>> Yingqi
>> 
>> 
>> From: Lu, Yingqi
>> Sent: Friday, January 24, 2014 3:26 PM
>> To: dev@httpd.apache.org
>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> Dear All,
>> 
>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>> 
>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>> 
>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>> 
>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>> 
>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>> 
>> Configuration:
>> <IfModule prefork.c>
>>    ListenBacklog 105384
>>    ServerLimit 105000
>>    MaxClients 1024
>>    MaxRequestsPerChild 0
>>    StartServers 64
>>    MinSpareServers 8
>>    MaxSpareServers 16
>> </IfModule>
>> 
>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>> 
>> Thanks,
>> Yingqi
>> 
>> 
>> 
>> --
>> Born in Roswell... married an alien...
>> http://emptyhammock.com/
>> http://edjective.org/
>> 
>> 
>> 
>> 
>> --
>> Born in Roswell... married an alien...
>> http://emptyhammock.com/
>> http://edjective.org/
>> 
>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
> 


RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by "Lu, Yingqi" <yi...@intel.com>.
Hi Jim,

I found that original patch code does not guard "ap_daemons_to_start" to be equal or bigger than num_buckets. Since each bucket has a dedicated listener assigned to it, I think we need to make sure that each bucket has at least 1 child process to start with.

I created this small patch to fix the issue. Can you please help to add the changes into the trunk?

Thanks,
Yingqi

-----Original Message-----
From: Lu, Yingqi [mailto:yingqi.lu@intel.com] 
Sent: Thursday, June 05, 2014 9:12 AM
To: dev@httpd.apache.org
Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Thanks very much for your help!

-----Original Message-----
From: Jim Jagielski [mailto:jim@jaguNET.com]
Sent: Thursday, June 05, 2014 6:38 AM
To: dev@httpd.apache.org
Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Committed r1600656

Thx
On Jun 4, 2014, at 3:39 PM, Lu, Yingqi <yi...@intel.com> wrote:

> Hi Jim,
> 
> I just found that prefork and worker has issue with restart. Event mpm code is good. 
> 
> I created this small patch to fix the issue for both prefork and worker. The patch is based on rev #1600451.
> 
> Can you please help add the changes in the trunk?
> 
> Thanks,
> Yingqi
> 
> -----Original Message-----
> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
> Sent: Tuesday, June 03, 2014 8:50 AM
> To: dev@httpd.apache.org
> Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
> 
> Thank you very much for your help!
> 
> Thanks,
> Yingqi
> 
> -----Original Message-----
> From: Jim Jagielski [mailto:jim@jaguNET.com]
> Sent: Tuesday, June 03, 2014 8:31 AM
> To: dev@httpd.apache.org
> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
> 
> Next on the agenda is to push into eventopt
> 
> On Jun 3, 2014, at 10:21 AM, Jim Jagielski <ji...@jaguNET.com> wrote:
> 
>> FTR: I saw no reason to try to handle both patches... I used the 
>> so_reuseport patch as the primary patch to focus on.
>> 
>> I have some minor changes coming up to follow-up post the initial 
>> commit
>> 
>> On Jun 3, 2014, at 8:51 AM, Jim Jagielski <ji...@jagunet.com> wrote:
>> 
>>> I have folded this into trunk and am currently fixing some compile 
>>> warnings and errors...
>>> 
>>> On Jun 2, 2014, at 4:22 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>> 
>>>> Hi Jim,
>>>> 
>>>> Personally, I think the second approach is better, it keeps ap_mpm_pod_signal () and ap_mpm_pod_killpg () exactly as the original ones, only modifies dummy_connection (). Please let me know if you have different opinions.
>>>> 
>>>> Attached is the latest version of the two patches. They were both generated against trunk rev. 1598561. Please review them and let me know if there is anything missing.
>>>> 
>>>> I already updated the Bugzilla database for the item 55897 and item 56279.
>>>> 
>>>> Thanks,
>>>> Yingqi
>>>> 
>>>> -----Original Message-----
>>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>>> Sent: Saturday, May 31, 2014 11:48 PM
>>>> To: dev@httpd.apache.org
>>>> Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>> SO_REUSEPORT support
>>>> 
>>>> Hi Jim,
>>>> 
>>>> Regarding to your comment #2, yes, you are right, it should be ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i). Thanks very much for catching this.
>>>> 
>>>> Regarding to your comment #1, the patch modifies the dummy_connection(ap_pod_t *pod) to be dummy_connection(ap_pod_t *pod, int child_bucket). Inside the function, the reference listen statement is mpm_listen[child_bucket]. And, ap_mpm_pod_signal() calls dummy_connection(). 
>>>> 
>>>> Can we just modify the return of ap_mpm_pod_signal() from dummy_connection(pod) to dummy_connection(pod, 0) and add ap_mpm_pod_signal_ex()? 
>>>> 
>>>> Or, if we need to keep ap_mpm_pod_signal() exactly as the original, I can modify dummy_connection() to send dummy data via all the duplicated listen statements. Then, we do not need child_bucket as the input parameter for dummy_connection(). In this case, we do not need adding ap_mpm_pod_signal_ex() too.
>>>> 
>>>> I already tested the code for the above approaches and they both work. 
>>>> 
>>>> Please let me know which way you think is better. I can quickly send you an update for review.
>>>> 
>>>> Thanks,
>>>> Yingqi
>>>> 
>>>> -----Original Message-----
>>>> From: Lu, Yingqi
>>>> Sent: Saturday, May 31, 2014 3:28 PM
>>>> To: <de...@httpd.apache.org>
>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>> SO_REUSEPORT support
>>>> 
>>>> Hi Jim,
>>>> 
>>>> Thanks very much for your email! I will look into both of them and send an update tonight!
>>>> 
>>>> Thanks,
>>>> Yingqi
>>>> 
>>>>> On May 31, 2014, at 9:43 AM, "Jim Jagielski" <ji...@jaguNET.com> wrote:
>>>>> 
>>>>> I also see:
>>>>> 
>>>>>     /* kill off the idle ones */
>>>>> -        ap_mpm_pod_killpg(pod, retained->max_daemons_limit);
>>>>> +        for (i = 0; i < num_buckets; i++) {
>>>>> +            ap_mpm_pod_killpg(pod[i], i, retained->max_daemons_limit);
>>>>> +        }
>>>>> 
>>>>> 
>>>>> Is that right? Why isn't it: ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i); ??
>>>>> 
>>>>> /**
>>>>> * Write data to the pipe-of-death, signalling that all child 
>>>>> process
>>>>> * should die.
>>>>> * @param pod The pipe-of-death to write to.
>>>>> * @param num The number of child processes to kill
>>>>> + * @param my_bucket the bucket that holds the dying child process.
>>>>> */
>>>>> -AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num);
>>>>> +AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num, int 
>>>>> +child_bucket);
>>>>> 
>>>>> Isn't 'num' the same in both implementation??
>>>>> 
>>>>>> On May 31, 2014, at 12:03 PM, Jim Jagielski <ji...@jagunet.com> wrote:
>>>>>> 
>>>>>> Sorry I didn't catch this earlier:
>>>>>> 
>>>>>> I see
>>>>>> 
>>>>>> +++ httpd-trunk.new/include/mpm_common.h    2014-05-16 13:07:03.892987491 -0400
>>>>>> @@ -267,16 +267,18 @@
>>>>>> * Write data to the pipe-of-death, signalling that one child 
>>>>>> process
>>>>>> * should die.
>>>>>> * @param pod the pipe-of-death to write to.
>>>>>> + * @param my_bucket the bucket that holds the dying child process.
>>>>>> */
>>>>>> -AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod);
>>>>>> +AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod, int 
>>>>>> +child_bucket);
>>>>>> 
>>>>>> We can change the API at this point. We could add another 
>>>>>> function, eg ap_mpm_pod_signal_ex() which takes the int param, 
>>>>>> but we can't modify ap_mpm_pod_signal() itself.
>>>>>> 
>>>>>>> On May 30, 2014, at 11:15 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>> 
>>>>>>> Thank you very much!
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Yingqi
>>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>>> Sent: Friday, May 30, 2014 7:07 AM
>>>>>>> To: dev@httpd.apache.org
>>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>> SO_REUSEPORT support
>>>>>>> 
>>>>>>> Thx! Let me review. My plan is to fold into trunk this weekend.
>>>>>>> 
>>>>>>>> On May 16, 2014, at 2:53 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>> 
>>>>>>>> Hi Jim,
>>>>>>>> 
>>>>>>>> Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.
>>>>>>>> 
>>>>>>>> Attached are the two most recent patches. I already updated the bugzilla #55897 as well.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Yingqi
>>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>>>> Sent: Thursday, May 15, 2014 7:53 AM
>>>>>>>> To: dev@httpd.apache.org
>>>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>>> SO_REUSEPORT support
>>>>>>>> 
>>>>>>>> I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
>>>>>>>> We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).
>>>>>>>> 
>>>>>>>>> On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>>> 
>>>>>>>>> Hi Jim,
>>>>>>>>> 
>>>>>>>>> Thanks very much for your email.
>>>>>>>>> 
>>>>>>>>> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
>>>>>>>>> 
>>>>>>>>> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
>>>>>>>>> 
>>>>>>>>> Please let me know if this answers your question.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Yingqi
>>>>>>>>> 
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>>>>> Sent: Wednesday, May 14, 2014 6:57 AM
>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>>>> SO_REUSEPORT support
>>>>>>>>> 
>>>>>>>>> This is very cool!
>>>>>>>>> 
>>>>>>>>> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
>>>>>>>>> We look OK on Linux, FreeBSD and OSX...
>>>>>>>>> 
>>>>>>>>> I'm +1 on folding into trunk.
>>>>>>>>> 
>>>>>>>>>> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Dear All,
>>>>>>>>>> 
>>>>>>>>>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>>>>>>>>>> 
>>>>>>>>>> Quick test result on modern dual socket Intel platform (Linux 
>>>>>>>>>> Kernel
>>>>>>>>>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>>>>>>>>>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>>>>>>>>>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>>>>>>>>>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>>>>>>>>>> 
>>>>>>>>>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>>>>>>>>>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>>>>>>>>>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>>>>>>>>>> 
>>>>>>>>>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>>>>>>>>>> 
>>>>>>>>>> Please let me know your feedback and comments.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Yingqi
>>>>>>>>>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>>>>>> 
>>>>>>>>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>>>>>>>>> Sent: Monday, March 17, 2014 1:41 PM
>>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>>>> with SO_REUSEPORT support
>>>>>>>>>> 
>>>>>>>>>> Dear all,
>>>>>>>>>> 
>>>>>>>>>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>>>>>>>>>> 
>>>>>>>>>> Below are the changes we made into this new version:
>>>>>>>>>> 
>>>>>>>>>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>>>>>>>>>> 
>>>>>>>>>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>>>>>>>>>> 
>>>>>>>>>> According to Jeff Trawick's comments, 1. We generate the 
>>>>>>>>>> patches against the httpd trunk.
>>>>>>>>>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>>>>>>>>>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>>>>>>>>>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>>>>>>>>>> 
>>>>>>>>>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Yingqi Lu
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> From: Lu, Yingqi
>>>>>>>>>> Sent: Tuesday, March 04, 2014 10:43 AM
>>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>>>> with SO_REUSEPORT support
>>>>>>>>>> 
>>>>>>>>>> Hi Jeff,
>>>>>>>>>> 
>>>>>>>>>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Yingqi
>>>>>>>>>> 
>>>>>>>>>> From: Jeff Trawick [mailto:trawick@gmail.com]
>>>>>>>>>> Sent: Tuesday, March 04, 2014 10:08 AM
>>>>>>>>>> To: Apache HTTP Server Development List
>>>>>>>>>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>>>> with SO_REUSEPORT support
>>>>>>>>>> 
>>>>>>>>>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>>>> Hi All,
>>>>>>>>>> 
>>>>>>>>>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>>>>>>>>>> 
>>>>>>>>>> If you need any additional information/supporting data, please let us know as well.
>>>>>>>>>> 
>>>>>>>>>> Yeah, it has been on my todo list, but I don't have time to 
>>>>>>>>>> give an in depth review at the moment.  Here are a few questions/comments.
>>>>>>>>>> (And you'll have to deal with the fact that it is 
>>>>>>>>>> unnecessarily tedious for me to evaluate higher-level 
>>>>>>>>>> considerations if there are a lot of distractions, such as 
>>>>>>>>>> the code comments below ;) But others are of course free to 
>>>>>>>>>> chime
>>>>>>>>>> in.)
>>>>>>>>>> 
>>>>>>>>>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>>>>>>>>>> 
>>>>>>>>>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>>>>>>>>>> 
>>>>>>>>>> Have you tried the event MPM?
>>>>>>>>>> 
>>>>>>>>>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>>>>>>>>>> 
>>>>>>>>>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>>>>>>>>>> 
>>>>>>>>>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>>>>>>>>>> 
>>>>>>>>>> For the configure-time check for 3.9 kernel: I think we'd 
>>>>>>>>>> also use AC_TRY_COMPILE at configure time to confirm that the 
>>>>>>>>>> SO_REUSEPORT definition is available, and not enable it if 
>>>>>>>>>> the system includes doesn't define it.  (Does that cause a 
>>>>>>>>>> problem for any significant number of people?)
>>>>>>>>>> 
>>>>>>>>>> Don't mention the patch in the patch ;) (e.g., "This function 
>>>>>>>>>> is added for the patch.")
>>>>>>>>>> 
>>>>>>>>>> Incomplete comments on style/syntax issues:
>>>>>>>>>> 
>>>>>>>>>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>>>>>>>>>> apr_socket_t *temps;") isn't supported by all compilers and 
>>>>>>>>>> is distracting when reviewing
>>>>>>>>>> * suitable identifier names (e.g., fix global variable "flag" 
>>>>>>>>>> and whatever else isn't appropriate; "ap_post_config_listeners"
>>>>>>>>>> should be renamed to indicate what it does
>>>>>>>>>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>>>>>>>>>> and don't add reminders in comments
>>>>>>>>>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>>>>>>>>>> and so on
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Yingqi
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> From: Lu, Yingqi
>>>>>>>>>> Sent: Friday, January 24, 2014 3:26 PM
>>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>>>>> SO_REUSEPORT support
>>>>>>>>>> 
>>>>>>>>>> Dear All,
>>>>>>>>>> 
>>>>>>>>>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>>>>>>>>>> 
>>>>>>>>>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>>>>>>>>>> 
>>>>>>>>>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>>>>>>>>>> 
>>>>>>>>>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>>>>>>>>>> 
>>>>>>>>>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>>>>>>>>>> 
>>>>>>>>>> Configuration:
>>>>>>>>>> <IfModule prefork.c>
>>>>>>>>>> ListenBacklog 105384
>>>>>>>>>> ServerLimit 105000
>>>>>>>>>> MaxClients 1024
>>>>>>>>>> MaxRequestsPerChild 0
>>>>>>>>>> StartServers 64
>>>>>>>>>> MinSpareServers 8
>>>>>>>>>> MaxSpareServers 16
>>>>>>>>>> </IfModule>
>>>>>>>>>> 
>>>>>>>>>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Yingqi
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> Born in Roswell... married an alien...
>>>>>>>>>> http://emptyhammock.com/
>>>>>>>>>> http://edjective.org/
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> Born in Roswell... married an alien...
>>>>>>>>>> http://emptyhammock.com/
>>>>>>>>>> http://edjective.org/
>>>>>>>>>> 
>>>>>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>>>>>> 
>>>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>>> 
>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>> 
>> 
> 
> <restart_fix.patch>


RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by "Lu, Yingqi" <yi...@intel.com>.
Thanks very much for your help!

-----Original Message-----
From: Jim Jagielski [mailto:jim@jaguNET.com] 
Sent: Thursday, June 05, 2014 6:38 AM
To: dev@httpd.apache.org
Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Committed r1600656

Thx
On Jun 4, 2014, at 3:39 PM, Lu, Yingqi <yi...@intel.com> wrote:

> Hi Jim,
> 
> I just found that prefork and worker has issue with restart. Event mpm code is good. 
> 
> I created this small patch to fix the issue for both prefork and worker. The patch is based on rev #1600451.
> 
> Can you please help add the changes in the trunk?
> 
> Thanks,
> Yingqi
> 
> -----Original Message-----
> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
> Sent: Tuesday, June 03, 2014 8:50 AM
> To: dev@httpd.apache.org
> Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
> 
> Thank you very much for your help!
> 
> Thanks,
> Yingqi
> 
> -----Original Message-----
> From: Jim Jagielski [mailto:jim@jaguNET.com]
> Sent: Tuesday, June 03, 2014 8:31 AM
> To: dev@httpd.apache.org
> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
> 
> Next on the agenda is to push into eventopt
> 
> On Jun 3, 2014, at 10:21 AM, Jim Jagielski <ji...@jaguNET.com> wrote:
> 
>> FTR: I saw no reason to try to handle both patches... I used the 
>> so_reuseport patch as the primary patch to focus on.
>> 
>> I have some minor changes coming up to follow-up post the initial 
>> commit
>> 
>> On Jun 3, 2014, at 8:51 AM, Jim Jagielski <ji...@jagunet.com> wrote:
>> 
>>> I have folded this into trunk and am currently fixing some compile 
>>> warnings and errors...
>>> 
>>> On Jun 2, 2014, at 4:22 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>> 
>>>> Hi Jim,
>>>> 
>>>> Personally, I think the second approach is better, it keeps ap_mpm_pod_signal () and ap_mpm_pod_killpg () exactly as the original ones, only modifies dummy_connection (). Please let me know if you have different opinions.
>>>> 
>>>> Attached is the latest version of the two patches. They were both generated against trunk rev. 1598561. Please review them and let me know if there is anything missing.
>>>> 
>>>> I already updated the Bugzilla database for the item 55897 and item 56279.
>>>> 
>>>> Thanks,
>>>> Yingqi
>>>> 
>>>> -----Original Message-----
>>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>>> Sent: Saturday, May 31, 2014 11:48 PM
>>>> To: dev@httpd.apache.org
>>>> Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>> SO_REUSEPORT support
>>>> 
>>>> Hi Jim,
>>>> 
>>>> Regarding to your comment #2, yes, you are right, it should be ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i). Thanks very much for catching this.
>>>> 
>>>> Regarding to your comment #1, the patch modifies the dummy_connection(ap_pod_t *pod) to be dummy_connection(ap_pod_t *pod, int child_bucket). Inside the function, the reference listen statement is mpm_listen[child_bucket]. And, ap_mpm_pod_signal() calls dummy_connection(). 
>>>> 
>>>> Can we just modify the return of ap_mpm_pod_signal() from dummy_connection(pod) to dummy_connection(pod, 0) and add ap_mpm_pod_signal_ex()? 
>>>> 
>>>> Or, if we need to keep ap_mpm_pod_signal() exactly as the original, I can modify dummy_connection() to send dummy data via all the duplicated listen statements. Then, we do not need child_bucket as the input parameter for dummy_connection(). In this case, we do not need adding ap_mpm_pod_signal_ex() too.
>>>> 
>>>> I already tested the code for the above approaches and they both work. 
>>>> 
>>>> Please let me know which way you think is better. I can quickly send you an update for review.
>>>> 
>>>> Thanks,
>>>> Yingqi
>>>> 
>>>> -----Original Message-----
>>>> From: Lu, Yingqi
>>>> Sent: Saturday, May 31, 2014 3:28 PM
>>>> To: <de...@httpd.apache.org>
>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>> SO_REUSEPORT support
>>>> 
>>>> Hi Jim,
>>>> 
>>>> Thanks very much for your email! I will look into both of them and send an update tonight!
>>>> 
>>>> Thanks,
>>>> Yingqi
>>>> 
>>>>> On May 31, 2014, at 9:43 AM, "Jim Jagielski" <ji...@jaguNET.com> wrote:
>>>>> 
>>>>> I also see:
>>>>> 
>>>>>     /* kill off the idle ones */
>>>>> -        ap_mpm_pod_killpg(pod, retained->max_daemons_limit);
>>>>> +        for (i = 0; i < num_buckets; i++) {
>>>>> +            ap_mpm_pod_killpg(pod[i], i, retained->max_daemons_limit);
>>>>> +        }
>>>>> 
>>>>> 
>>>>> Is that right? Why isn't it: ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i); ??
>>>>> 
>>>>> /**
>>>>> * Write data to the pipe-of-death, signalling that all child 
>>>>> process
>>>>> * should die.
>>>>> * @param pod The pipe-of-death to write to.
>>>>> * @param num The number of child processes to kill
>>>>> + * @param my_bucket the bucket that holds the dying child process.
>>>>> */
>>>>> -AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num);
>>>>> +AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num, int 
>>>>> +child_bucket);
>>>>> 
>>>>> Isn't 'num' the same in both implementation??
>>>>> 
>>>>>> On May 31, 2014, at 12:03 PM, Jim Jagielski <ji...@jagunet.com> wrote:
>>>>>> 
>>>>>> Sorry I didn't catch this earlier:
>>>>>> 
>>>>>> I see
>>>>>> 
>>>>>> +++ httpd-trunk.new/include/mpm_common.h    2014-05-16 13:07:03.892987491 -0400
>>>>>> @@ -267,16 +267,18 @@
>>>>>> * Write data to the pipe-of-death, signalling that one child 
>>>>>> process
>>>>>> * should die.
>>>>>> * @param pod the pipe-of-death to write to.
>>>>>> + * @param my_bucket the bucket that holds the dying child process.
>>>>>> */
>>>>>> -AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod);
>>>>>> +AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod, int 
>>>>>> +child_bucket);
>>>>>> 
>>>>>> We can change the API at this point. We could add another 
>>>>>> function, eg ap_mpm_pod_signal_ex() which takes the int param, 
>>>>>> but we can't modify ap_mpm_pod_signal() itself.
>>>>>> 
>>>>>>> On May 30, 2014, at 11:15 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>> 
>>>>>>> Thank you very much!
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Yingqi
>>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>>> Sent: Friday, May 30, 2014 7:07 AM
>>>>>>> To: dev@httpd.apache.org
>>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>> SO_REUSEPORT support
>>>>>>> 
>>>>>>> Thx! Let me review. My plan is to fold into trunk this weekend.
>>>>>>> 
>>>>>>>> On May 16, 2014, at 2:53 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>> 
>>>>>>>> Hi Jim,
>>>>>>>> 
>>>>>>>> Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.
>>>>>>>> 
>>>>>>>> Attached are the two most recent patches. I already updated the bugzilla #55897 as well.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Yingqi
>>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>>>> Sent: Thursday, May 15, 2014 7:53 AM
>>>>>>>> To: dev@httpd.apache.org
>>>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>>> SO_REUSEPORT support
>>>>>>>> 
>>>>>>>> I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
>>>>>>>> We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).
>>>>>>>> 
>>>>>>>>> On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>>> 
>>>>>>>>> Hi Jim,
>>>>>>>>> 
>>>>>>>>> Thanks very much for your email.
>>>>>>>>> 
>>>>>>>>> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
>>>>>>>>> 
>>>>>>>>> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
>>>>>>>>> 
>>>>>>>>> Please let me know if this answers your question.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Yingqi
>>>>>>>>> 
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>>>>> Sent: Wednesday, May 14, 2014 6:57 AM
>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>>>> SO_REUSEPORT support
>>>>>>>>> 
>>>>>>>>> This is very cool!
>>>>>>>>> 
>>>>>>>>> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
>>>>>>>>> We look OK on Linux, FreeBSD and OSX...
>>>>>>>>> 
>>>>>>>>> I'm +1 on folding into trunk.
>>>>>>>>> 
>>>>>>>>>> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Dear All,
>>>>>>>>>> 
>>>>>>>>>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>>>>>>>>>> 
>>>>>>>>>> Quick test result on modern dual socket Intel platform (Linux 
>>>>>>>>>> Kernel
>>>>>>>>>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>>>>>>>>>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>>>>>>>>>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>>>>>>>>>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>>>>>>>>>> 
>>>>>>>>>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>>>>>>>>>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>>>>>>>>>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>>>>>>>>>> 
>>>>>>>>>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>>>>>>>>>> 
>>>>>>>>>> Please let me know your feedback and comments.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Yingqi
>>>>>>>>>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>>>>>> 
>>>>>>>>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>>>>>>>>> Sent: Monday, March 17, 2014 1:41 PM
>>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>>>> with SO_REUSEPORT support
>>>>>>>>>> 
>>>>>>>>>> Dear all,
>>>>>>>>>> 
>>>>>>>>>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>>>>>>>>>> 
>>>>>>>>>> Below are the changes we made into this new version:
>>>>>>>>>> 
>>>>>>>>>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>>>>>>>>>> 
>>>>>>>>>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>>>>>>>>>> 
>>>>>>>>>> According to Jeff Trawick's comments, 1. We generate the 
>>>>>>>>>> patches against the httpd trunk.
>>>>>>>>>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>>>>>>>>>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>>>>>>>>>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>>>>>>>>>> 
>>>>>>>>>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Yingqi Lu
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> From: Lu, Yingqi
>>>>>>>>>> Sent: Tuesday, March 04, 2014 10:43 AM
>>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>>>> with SO_REUSEPORT support
>>>>>>>>>> 
>>>>>>>>>> Hi Jeff,
>>>>>>>>>> 
>>>>>>>>>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Yingqi
>>>>>>>>>> 
>>>>>>>>>> From: Jeff Trawick [mailto:trawick@gmail.com]
>>>>>>>>>> Sent: Tuesday, March 04, 2014 10:08 AM
>>>>>>>>>> To: Apache HTTP Server Development List
>>>>>>>>>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>>>> with SO_REUSEPORT support
>>>>>>>>>> 
>>>>>>>>>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>>>> Hi All,
>>>>>>>>>> 
>>>>>>>>>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>>>>>>>>>> 
>>>>>>>>>> If you need any additional information/supporting data, please let us know as well.
>>>>>>>>>> 
>>>>>>>>>> Yeah, it has been on my todo list, but I don't have time to 
>>>>>>>>>> give an in depth review at the moment.  Here are a few questions/comments.
>>>>>>>>>> (And you'll have to deal with the fact that it is 
>>>>>>>>>> unnecessarily tedious for me to evaluate higher-level 
>>>>>>>>>> considerations if there are a lot of distractions, such as 
>>>>>>>>>> the code comments below ;) But others are of course free to 
>>>>>>>>>> chime
>>>>>>>>>> in.)
>>>>>>>>>> 
>>>>>>>>>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>>>>>>>>>> 
>>>>>>>>>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>>>>>>>>>> 
>>>>>>>>>> Have you tried the event MPM?
>>>>>>>>>> 
>>>>>>>>>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>>>>>>>>>> 
>>>>>>>>>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>>>>>>>>>> 
>>>>>>>>>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>>>>>>>>>> 
>>>>>>>>>> For the configure-time check for 3.9 kernel: I think we'd 
>>>>>>>>>> also use AC_TRY_COMPILE at configure time to confirm that the 
>>>>>>>>>> SO_REUSEPORT definition is available, and not enable it if 
>>>>>>>>>> the system includes doesn't define it.  (Does that cause a 
>>>>>>>>>> problem for any significant number of people?)
>>>>>>>>>> 
>>>>>>>>>> Don't mention the patch in the patch ;) (e.g., "This function 
>>>>>>>>>> is added for the patch.")
>>>>>>>>>> 
>>>>>>>>>> Incomplete comments on style/syntax issues:
>>>>>>>>>> 
>>>>>>>>>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>>>>>>>>>> apr_socket_t *temps;") isn't supported by all compilers and 
>>>>>>>>>> is distracting when reviewing
>>>>>>>>>> * suitable identifier names (e.g., fix global variable "flag" 
>>>>>>>>>> and whatever else isn't appropriate; "ap_post_config_listeners"
>>>>>>>>>> should be renamed to indicate what it does
>>>>>>>>>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>>>>>>>>>> and don't add reminders in comments
>>>>>>>>>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>>>>>>>>>> and so on
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Yingqi
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> From: Lu, Yingqi
>>>>>>>>>> Sent: Friday, January 24, 2014 3:26 PM
>>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>>>>> SO_REUSEPORT support
>>>>>>>>>> 
>>>>>>>>>> Dear All,
>>>>>>>>>> 
>>>>>>>>>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>>>>>>>>>> 
>>>>>>>>>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>>>>>>>>>> 
>>>>>>>>>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>>>>>>>>>> 
>>>>>>>>>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>>>>>>>>>> 
>>>>>>>>>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>>>>>>>>>> 
>>>>>>>>>> Configuration:
>>>>>>>>>> <IfModule prefork.c>
>>>>>>>>>> ListenBacklog 105384
>>>>>>>>>> ServerLimit 105000
>>>>>>>>>> MaxClients 1024
>>>>>>>>>> MaxRequestsPerChild 0
>>>>>>>>>> StartServers 64
>>>>>>>>>> MinSpareServers 8
>>>>>>>>>> MaxSpareServers 16
>>>>>>>>>> </IfModule>
>>>>>>>>>> 
>>>>>>>>>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Yingqi
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> Born in Roswell... married an alien...
>>>>>>>>>> http://emptyhammock.com/
>>>>>>>>>> http://edjective.org/
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> Born in Roswell... married an alien...
>>>>>>>>>> http://emptyhammock.com/
>>>>>>>>>> http://edjective.org/
>>>>>>>>>> 
>>>>>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>>>>>> 
>>>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>>> 
>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>> 
>> 
> 
> <restart_fix.patch>


Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by Jim Jagielski <ji...@jaguNET.com>.
Committed r1600656

Thx
On Jun 4, 2014, at 3:39 PM, Lu, Yingqi <yi...@intel.com> wrote:

> Hi Jim,
> 
> I just found that prefork and worker has issue with restart. Event mpm code is good. 
> 
> I created this small patch to fix the issue for both prefork and worker. The patch is based on rev #1600451.
> 
> Can you please help add the changes in the trunk?
> 
> Thanks,
> Yingqi
> 
> -----Original Message-----
> From: Lu, Yingqi [mailto:yingqi.lu@intel.com] 
> Sent: Tuesday, June 03, 2014 8:50 AM
> To: dev@httpd.apache.org
> Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
> 
> Thank you very much for your help!
> 
> Thanks,
> Yingqi
> 
> -----Original Message-----
> From: Jim Jagielski [mailto:jim@jaguNET.com]
> Sent: Tuesday, June 03, 2014 8:31 AM
> To: dev@httpd.apache.org
> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
> 
> Next on the agenda is to push into eventopt
> 
> On Jun 3, 2014, at 10:21 AM, Jim Jagielski <ji...@jaguNET.com> wrote:
> 
>> FTR: I saw no reason to try to handle both patches... I used the 
>> so_reuseport patch as the primary patch to focus on.
>> 
>> I have some minor changes coming up to follow-up post the initial 
>> commit
>> 
>> On Jun 3, 2014, at 8:51 AM, Jim Jagielski <ji...@jagunet.com> wrote:
>> 
>>> I have folded this into trunk and am currently fixing some compile 
>>> warnings and errors...
>>> 
>>> On Jun 2, 2014, at 4:22 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>> 
>>>> Hi Jim,
>>>> 
>>>> Personally, I think the second approach is better, it keeps ap_mpm_pod_signal () and ap_mpm_pod_killpg () exactly as the original ones, only modifies dummy_connection (). Please let me know if you have different opinions.
>>>> 
>>>> Attached is the latest version of the two patches. They were both generated against trunk rev. 1598561. Please review them and let me know if there is anything missing.
>>>> 
>>>> I already updated the Bugzilla database for the item 55897 and item 56279.
>>>> 
>>>> Thanks,
>>>> Yingqi
>>>> 
>>>> -----Original Message-----
>>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>>> Sent: Saturday, May 31, 2014 11:48 PM
>>>> To: dev@httpd.apache.org
>>>> Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>> SO_REUSEPORT support
>>>> 
>>>> Hi Jim,
>>>> 
>>>> Regarding to your comment #2, yes, you are right, it should be ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i). Thanks very much for catching this.
>>>> 
>>>> Regarding to your comment #1, the patch modifies the dummy_connection(ap_pod_t *pod) to be dummy_connection(ap_pod_t *pod, int child_bucket). Inside the function, the reference listen statement is mpm_listen[child_bucket]. And, ap_mpm_pod_signal() calls dummy_connection(). 
>>>> 
>>>> Can we just modify the return of ap_mpm_pod_signal() from dummy_connection(pod) to dummy_connection(pod, 0) and add ap_mpm_pod_signal_ex()? 
>>>> 
>>>> Or, if we need to keep ap_mpm_pod_signal() exactly as the original, I can modify dummy_connection() to send dummy data via all the duplicated listen statements. Then, we do not need child_bucket as the input parameter for dummy_connection(). In this case, we do not need adding ap_mpm_pod_signal_ex() too.
>>>> 
>>>> I already tested the code for the above approaches and they both work. 
>>>> 
>>>> Please let me know which way you think is better. I can quickly send you an update for review.
>>>> 
>>>> Thanks,
>>>> Yingqi
>>>> 
>>>> -----Original Message-----
>>>> From: Lu, Yingqi
>>>> Sent: Saturday, May 31, 2014 3:28 PM
>>>> To: <de...@httpd.apache.org>
>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>> SO_REUSEPORT support
>>>> 
>>>> Hi Jim,
>>>> 
>>>> Thanks very much for your email! I will look into both of them and send an update tonight!
>>>> 
>>>> Thanks,
>>>> Yingqi
>>>> 
>>>>> On May 31, 2014, at 9:43 AM, "Jim Jagielski" <ji...@jaguNET.com> wrote:
>>>>> 
>>>>> I also see:
>>>>> 
>>>>>     /* kill off the idle ones */
>>>>> -        ap_mpm_pod_killpg(pod, retained->max_daemons_limit);
>>>>> +        for (i = 0; i < num_buckets; i++) {
>>>>> +            ap_mpm_pod_killpg(pod[i], i, retained->max_daemons_limit);
>>>>> +        }
>>>>> 
>>>>> 
>>>>> Is that right? Why isn't it: ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i); ??
>>>>> 
>>>>> /**
>>>>> * Write data to the pipe-of-death, signalling that all child 
>>>>> process
>>>>> * should die.
>>>>> * @param pod The pipe-of-death to write to.
>>>>> * @param num The number of child processes to kill
>>>>> + * @param my_bucket the bucket that holds the dying child process.
>>>>> */
>>>>> -AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num);
>>>>> +AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num, int 
>>>>> +child_bucket);
>>>>> 
>>>>> Isn't 'num' the same in both implementation??
>>>>> 
>>>>>> On May 31, 2014, at 12:03 PM, Jim Jagielski <ji...@jagunet.com> wrote:
>>>>>> 
>>>>>> Sorry I didn't catch this earlier:
>>>>>> 
>>>>>> I see
>>>>>> 
>>>>>> +++ httpd-trunk.new/include/mpm_common.h    2014-05-16 13:07:03.892987491 -0400
>>>>>> @@ -267,16 +267,18 @@
>>>>>> * Write data to the pipe-of-death, signalling that one child 
>>>>>> process
>>>>>> * should die.
>>>>>> * @param pod the pipe-of-death to write to.
>>>>>> + * @param my_bucket the bucket that holds the dying child process.
>>>>>> */
>>>>>> -AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod);
>>>>>> +AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod, int 
>>>>>> +child_bucket);
>>>>>> 
>>>>>> We can change the API at this point. We could add another 
>>>>>> function, eg ap_mpm_pod_signal_ex() which takes the int param, but 
>>>>>> we can't modify ap_mpm_pod_signal() itself.
>>>>>> 
>>>>>>> On May 30, 2014, at 11:15 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>> 
>>>>>>> Thank you very much!
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Yingqi
>>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>>> Sent: Friday, May 30, 2014 7:07 AM
>>>>>>> To: dev@httpd.apache.org
>>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>> SO_REUSEPORT support
>>>>>>> 
>>>>>>> Thx! Let me review. My plan is to fold into trunk this weekend.
>>>>>>> 
>>>>>>>> On May 16, 2014, at 2:53 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>> 
>>>>>>>> Hi Jim,
>>>>>>>> 
>>>>>>>> Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.
>>>>>>>> 
>>>>>>>> Attached are the two most recent patches. I already updated the bugzilla #55897 as well.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Yingqi
>>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>>>> Sent: Thursday, May 15, 2014 7:53 AM
>>>>>>>> To: dev@httpd.apache.org
>>>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>>> SO_REUSEPORT support
>>>>>>>> 
>>>>>>>> I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
>>>>>>>> We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).
>>>>>>>> 
>>>>>>>>> On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>>> 
>>>>>>>>> Hi Jim,
>>>>>>>>> 
>>>>>>>>> Thanks very much for your email.
>>>>>>>>> 
>>>>>>>>> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
>>>>>>>>> 
>>>>>>>>> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
>>>>>>>>> 
>>>>>>>>> Please let me know if this answers your question.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Yingqi
>>>>>>>>> 
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>>>>> Sent: Wednesday, May 14, 2014 6:57 AM
>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>>>> SO_REUSEPORT support
>>>>>>>>> 
>>>>>>>>> This is very cool!
>>>>>>>>> 
>>>>>>>>> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
>>>>>>>>> We look OK on Linux, FreeBSD and OSX...
>>>>>>>>> 
>>>>>>>>> I'm +1 on folding into trunk.
>>>>>>>>> 
>>>>>>>>>> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Dear All,
>>>>>>>>>> 
>>>>>>>>>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>>>>>>>>>> 
>>>>>>>>>> Quick test result on modern dual socket Intel platform (Linux 
>>>>>>>>>> Kernel
>>>>>>>>>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>>>>>>>>>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>>>>>>>>>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>>>>>>>>>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>>>>>>>>>> 
>>>>>>>>>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>>>>>>>>>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>>>>>>>>>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>>>>>>>>>> 
>>>>>>>>>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>>>>>>>>>> 
>>>>>>>>>> Please let me know your feedback and comments.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Yingqi
>>>>>>>>>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>>>>>> 
>>>>>>>>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>>>>>>>>> Sent: Monday, March 17, 2014 1:41 PM
>>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>>>> with SO_REUSEPORT support
>>>>>>>>>> 
>>>>>>>>>> Dear all,
>>>>>>>>>> 
>>>>>>>>>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>>>>>>>>>> 
>>>>>>>>>> Below are the changes we made into this new version:
>>>>>>>>>> 
>>>>>>>>>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>>>>>>>>>> 
>>>>>>>>>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>>>>>>>>>> 
>>>>>>>>>> According to Jeff Trawick's comments, 1. We generate the 
>>>>>>>>>> patches against the httpd trunk.
>>>>>>>>>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>>>>>>>>>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>>>>>>>>>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>>>>>>>>>> 
>>>>>>>>>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Yingqi Lu
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> From: Lu, Yingqi
>>>>>>>>>> Sent: Tuesday, March 04, 2014 10:43 AM
>>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>>>> with SO_REUSEPORT support
>>>>>>>>>> 
>>>>>>>>>> Hi Jeff,
>>>>>>>>>> 
>>>>>>>>>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Yingqi
>>>>>>>>>> 
>>>>>>>>>> From: Jeff Trawick [mailto:trawick@gmail.com]
>>>>>>>>>> Sent: Tuesday, March 04, 2014 10:08 AM
>>>>>>>>>> To: Apache HTTP Server Development List
>>>>>>>>>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>>>> with SO_REUSEPORT support
>>>>>>>>>> 
>>>>>>>>>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>>>> Hi All,
>>>>>>>>>> 
>>>>>>>>>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>>>>>>>>>> 
>>>>>>>>>> If you need any additional information/supporting data, please let us know as well.
>>>>>>>>>> 
>>>>>>>>>> Yeah, it has been on my todo list, but I don't have time to 
>>>>>>>>>> give an in depth review at the moment.  Here are a few questions/comments.
>>>>>>>>>> (And you'll have to deal with the fact that it is 
>>>>>>>>>> unnecessarily tedious for me to evaluate higher-level 
>>>>>>>>>> considerations if there are a lot of distractions, such as the 
>>>>>>>>>> code comments below ;) But others are of course free to chime
>>>>>>>>>> in.)
>>>>>>>>>> 
>>>>>>>>>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>>>>>>>>>> 
>>>>>>>>>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>>>>>>>>>> 
>>>>>>>>>> Have you tried the event MPM?
>>>>>>>>>> 
>>>>>>>>>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>>>>>>>>>> 
>>>>>>>>>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>>>>>>>>>> 
>>>>>>>>>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>>>>>>>>>> 
>>>>>>>>>> For the configure-time check for 3.9 kernel: I think we'd also 
>>>>>>>>>> use AC_TRY_COMPILE at configure time to confirm that the 
>>>>>>>>>> SO_REUSEPORT definition is available, and not enable it if the 
>>>>>>>>>> system includes doesn't define it.  (Does that cause a problem 
>>>>>>>>>> for any significant number of people?)
>>>>>>>>>> 
>>>>>>>>>> Don't mention the patch in the patch ;) (e.g., "This function 
>>>>>>>>>> is added for the patch.")
>>>>>>>>>> 
>>>>>>>>>> Incomplete comments on style/syntax issues:
>>>>>>>>>> 
>>>>>>>>>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>>>>>>>>>> apr_socket_t *temps;") isn't supported by all compilers and is 
>>>>>>>>>> distracting when reviewing
>>>>>>>>>> * suitable identifier names (e.g., fix global variable "flag" 
>>>>>>>>>> and whatever else isn't appropriate; "ap_post_config_listeners"
>>>>>>>>>> should be renamed to indicate what it does
>>>>>>>>>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>>>>>>>>>> and don't add reminders in comments
>>>>>>>>>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>>>>>>>>>> and so on
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Yingqi
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> From: Lu, Yingqi
>>>>>>>>>> Sent: Friday, January 24, 2014 3:26 PM
>>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>>>>> SO_REUSEPORT support
>>>>>>>>>> 
>>>>>>>>>> Dear All,
>>>>>>>>>> 
>>>>>>>>>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>>>>>>>>>> 
>>>>>>>>>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>>>>>>>>>> 
>>>>>>>>>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>>>>>>>>>> 
>>>>>>>>>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>>>>>>>>>> 
>>>>>>>>>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>>>>>>>>>> 
>>>>>>>>>> Configuration:
>>>>>>>>>> <IfModule prefork.c>
>>>>>>>>>> ListenBacklog 105384
>>>>>>>>>> ServerLimit 105000
>>>>>>>>>> MaxClients 1024
>>>>>>>>>> MaxRequestsPerChild 0
>>>>>>>>>> StartServers 64
>>>>>>>>>> MinSpareServers 8
>>>>>>>>>> MaxSpareServers 16
>>>>>>>>>> </IfModule>
>>>>>>>>>> 
>>>>>>>>>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Yingqi
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> Born in Roswell... married an alien...
>>>>>>>>>> http://emptyhammock.com/
>>>>>>>>>> http://edjective.org/
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> Born in Roswell... married an alien...
>>>>>>>>>> http://emptyhammock.com/
>>>>>>>>>> http://edjective.org/
>>>>>>>>>> 
>>>>>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>>>>>> 
>>>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>>> 
>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>> 
>> 
> 
> <restart_fix.patch>


RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by "Lu, Yingqi" <yi...@intel.com>.
Hi Jim,

I just found that prefork and worker has issue with restart. Event mpm code is good. 

I created this small patch to fix the issue for both prefork and worker. The patch is based on rev #1600451.

Can you please help add the changes in the trunk?

Thanks,
Yingqi

-----Original Message-----
From: Lu, Yingqi [mailto:yingqi.lu@intel.com] 
Sent: Tuesday, June 03, 2014 8:50 AM
To: dev@httpd.apache.org
Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Thank you very much for your help!

Thanks,
Yingqi

-----Original Message-----
From: Jim Jagielski [mailto:jim@jaguNET.com]
Sent: Tuesday, June 03, 2014 8:31 AM
To: dev@httpd.apache.org
Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Next on the agenda is to push into eventopt

On Jun 3, 2014, at 10:21 AM, Jim Jagielski <ji...@jaguNET.com> wrote:

> FTR: I saw no reason to try to handle both patches... I used the 
> so_reuseport patch as the primary patch to focus on.
> 
> I have some minor changes coming up to follow-up post the initial 
> commit
> 
> On Jun 3, 2014, at 8:51 AM, Jim Jagielski <ji...@jagunet.com> wrote:
> 
>> I have folded this into trunk and am currently fixing some compile 
>> warnings and errors...
>> 
>> On Jun 2, 2014, at 4:22 AM, Lu, Yingqi <yi...@intel.com> wrote:
>> 
>>> Hi Jim,
>>> 
>>> Personally, I think the second approach is better, it keeps ap_mpm_pod_signal () and ap_mpm_pod_killpg () exactly as the original ones, only modifies dummy_connection (). Please let me know if you have different opinions.
>>> 
>>> Attached is the latest version of the two patches. They were both generated against trunk rev. 1598561. Please review them and let me know if there is anything missing.
>>> 
>>> I already updated the Bugzilla database for the item 55897 and item 56279.
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>> -----Original Message-----
>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>> Sent: Saturday, May 31, 2014 11:48 PM
>>> To: dev@httpd.apache.org
>>> Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>> SO_REUSEPORT support
>>> 
>>> Hi Jim,
>>> 
>>> Regarding to your comment #2, yes, you are right, it should be ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i). Thanks very much for catching this.
>>> 
>>> Regarding to your comment #1, the patch modifies the dummy_connection(ap_pod_t *pod) to be dummy_connection(ap_pod_t *pod, int child_bucket). Inside the function, the reference listen statement is mpm_listen[child_bucket]. And, ap_mpm_pod_signal() calls dummy_connection(). 
>>> 
>>> Can we just modify the return of ap_mpm_pod_signal() from dummy_connection(pod) to dummy_connection(pod, 0) and add ap_mpm_pod_signal_ex()? 
>>> 
>>> Or, if we need to keep ap_mpm_pod_signal() exactly as the original, I can modify dummy_connection() to send dummy data via all the duplicated listen statements. Then, we do not need child_bucket as the input parameter for dummy_connection(). In this case, we do not need adding ap_mpm_pod_signal_ex() too.
>>> 
>>> I already tested the code for the above approaches and they both work. 
>>> 
>>> Please let me know which way you think is better. I can quickly send you an update for review.
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>> -----Original Message-----
>>> From: Lu, Yingqi
>>> Sent: Saturday, May 31, 2014 3:28 PM
>>> To: <de...@httpd.apache.org>
>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>> SO_REUSEPORT support
>>> 
>>> Hi Jim,
>>> 
>>> Thanks very much for your email! I will look into both of them and send an update tonight!
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>>> On May 31, 2014, at 9:43 AM, "Jim Jagielski" <ji...@jaguNET.com> wrote:
>>>> 
>>>> I also see:
>>>> 
>>>>      /* kill off the idle ones */
>>>> -        ap_mpm_pod_killpg(pod, retained->max_daemons_limit);
>>>> +        for (i = 0; i < num_buckets; i++) {
>>>> +            ap_mpm_pod_killpg(pod[i], i, retained->max_daemons_limit);
>>>> +        }
>>>> 
>>>> 
>>>> Is that right? Why isn't it: ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i); ??
>>>> 
>>>> /**
>>>> * Write data to the pipe-of-death, signalling that all child 
>>>> process
>>>> * should die.
>>>> * @param pod The pipe-of-death to write to.
>>>> * @param num The number of child processes to kill
>>>> + * @param my_bucket the bucket that holds the dying child process.
>>>> */
>>>> -AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num);
>>>> +AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num, int 
>>>> +child_bucket);
>>>> 
>>>> Isn't 'num' the same in both implementation??
>>>> 
>>>>> On May 31, 2014, at 12:03 PM, Jim Jagielski <ji...@jagunet.com> wrote:
>>>>> 
>>>>> Sorry I didn't catch this earlier:
>>>>> 
>>>>> I see
>>>>> 
>>>>> +++ httpd-trunk.new/include/mpm_common.h    2014-05-16 13:07:03.892987491 -0400
>>>>> @@ -267,16 +267,18 @@
>>>>> * Write data to the pipe-of-death, signalling that one child 
>>>>> process
>>>>> * should die.
>>>>> * @param pod the pipe-of-death to write to.
>>>>> + * @param my_bucket the bucket that holds the dying child process.
>>>>> */
>>>>> -AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod);
>>>>> +AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod, int 
>>>>> +child_bucket);
>>>>> 
>>>>> We can change the API at this point. We could add another 
>>>>> function, eg ap_mpm_pod_signal_ex() which takes the int param, but 
>>>>> we can't modify ap_mpm_pod_signal() itself.
>>>>> 
>>>>>> On May 30, 2014, at 11:15 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>> 
>>>>>> Thank you very much!
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>> Sent: Friday, May 30, 2014 7:07 AM
>>>>>> To: dev@httpd.apache.org
>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>> SO_REUSEPORT support
>>>>>> 
>>>>>> Thx! Let me review. My plan is to fold into trunk this weekend.
>>>>>> 
>>>>>>> On May 16, 2014, at 2:53 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>> 
>>>>>>> Hi Jim,
>>>>>>> 
>>>>>>> Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.
>>>>>>> 
>>>>>>> Attached are the two most recent patches. I already updated the bugzilla #55897 as well.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Yingqi
>>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>>> Sent: Thursday, May 15, 2014 7:53 AM
>>>>>>> To: dev@httpd.apache.org
>>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>> SO_REUSEPORT support
>>>>>>> 
>>>>>>> I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
>>>>>>> We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).
>>>>>>> 
>>>>>>>> On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>> 
>>>>>>>> Hi Jim,
>>>>>>>> 
>>>>>>>> Thanks very much for your email.
>>>>>>>> 
>>>>>>>> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
>>>>>>>> 
>>>>>>>> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
>>>>>>>> 
>>>>>>>> Please let me know if this answers your question.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Yingqi
>>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>>>> Sent: Wednesday, May 14, 2014 6:57 AM
>>>>>>>> To: dev@httpd.apache.org
>>>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>>> SO_REUSEPORT support
>>>>>>>> 
>>>>>>>> This is very cool!
>>>>>>>> 
>>>>>>>> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
>>>>>>>> We look OK on Linux, FreeBSD and OSX...
>>>>>>>> 
>>>>>>>> I'm +1 on folding into trunk.
>>>>>>>> 
>>>>>>>>> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>>> 
>>>>>>>>> Dear All,
>>>>>>>>> 
>>>>>>>>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>>>>>>>>> 
>>>>>>>>> Quick test result on modern dual socket Intel platform (Linux 
>>>>>>>>> Kernel
>>>>>>>>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>>>>>>>>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>>>>>>>>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>>>>>>>>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>>>>>>>>> 
>>>>>>>>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>>>>>>>>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>>>>>>>>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>>>>>>>>> 
>>>>>>>>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>>>>>>>>> 
>>>>>>>>> Please let me know your feedback and comments.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Yingqi
>>>>>>>>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>>>>> 
>>>>>>>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>>>>>>>> Sent: Monday, March 17, 2014 1:41 PM
>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>>> with SO_REUSEPORT support
>>>>>>>>> 
>>>>>>>>> Dear all,
>>>>>>>>> 
>>>>>>>>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>>>>>>>>> 
>>>>>>>>> Below are the changes we made into this new version:
>>>>>>>>> 
>>>>>>>>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>>>>>>>>> 
>>>>>>>>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>>>>>>>>> 
>>>>>>>>> According to Jeff Trawick's comments, 1. We generate the 
>>>>>>>>> patches against the httpd trunk.
>>>>>>>>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>>>>>>>>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>>>>>>>>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>>>>>>>>> 
>>>>>>>>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Yingqi Lu
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> From: Lu, Yingqi
>>>>>>>>> Sent: Tuesday, March 04, 2014 10:43 AM
>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>>> with SO_REUSEPORT support
>>>>>>>>> 
>>>>>>>>> Hi Jeff,
>>>>>>>>> 
>>>>>>>>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Yingqi
>>>>>>>>> 
>>>>>>>>> From: Jeff Trawick [mailto:trawick@gmail.com]
>>>>>>>>> Sent: Tuesday, March 04, 2014 10:08 AM
>>>>>>>>> To: Apache HTTP Server Development List
>>>>>>>>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>>> with SO_REUSEPORT support
>>>>>>>>> 
>>>>>>>>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>>> Hi All,
>>>>>>>>> 
>>>>>>>>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>>>>>>>>> 
>>>>>>>>> If you need any additional information/supporting data, please let us know as well.
>>>>>>>>> 
>>>>>>>>> Yeah, it has been on my todo list, but I don't have time to 
>>>>>>>>> give an in depth review at the moment.  Here are a few questions/comments.
>>>>>>>>> (And you'll have to deal with the fact that it is 
>>>>>>>>> unnecessarily tedious for me to evaluate higher-level 
>>>>>>>>> considerations if there are a lot of distractions, such as the 
>>>>>>>>> code comments below ;) But others are of course free to chime
>>>>>>>>> in.)
>>>>>>>>> 
>>>>>>>>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>>>>>>>>> 
>>>>>>>>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>>>>>>>>> 
>>>>>>>>> Have you tried the event MPM?
>>>>>>>>> 
>>>>>>>>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>>>>>>>>> 
>>>>>>>>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>>>>>>>>> 
>>>>>>>>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>>>>>>>>> 
>>>>>>>>> For the configure-time check for 3.9 kernel: I think we'd also 
>>>>>>>>> use AC_TRY_COMPILE at configure time to confirm that the 
>>>>>>>>> SO_REUSEPORT definition is available, and not enable it if the 
>>>>>>>>> system includes doesn't define it.  (Does that cause a problem 
>>>>>>>>> for any significant number of people?)
>>>>>>>>> 
>>>>>>>>> Don't mention the patch in the patch ;) (e.g., "This function 
>>>>>>>>> is added for the patch.")
>>>>>>>>> 
>>>>>>>>> Incomplete comments on style/syntax issues:
>>>>>>>>> 
>>>>>>>>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>>>>>>>>> apr_socket_t *temps;") isn't supported by all compilers and is 
>>>>>>>>> distracting when reviewing
>>>>>>>>> * suitable identifier names (e.g., fix global variable "flag" 
>>>>>>>>> and whatever else isn't appropriate; "ap_post_config_listeners"
>>>>>>>>> should be renamed to indicate what it does
>>>>>>>>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>>>>>>>>> and don't add reminders in comments
>>>>>>>>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>>>>>>>>> and so on
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Yingqi
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> From: Lu, Yingqi
>>>>>>>>> Sent: Friday, January 24, 2014 3:26 PM
>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>>>> SO_REUSEPORT support
>>>>>>>>> 
>>>>>>>>> Dear All,
>>>>>>>>> 
>>>>>>>>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>>>>>>>>> 
>>>>>>>>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>>>>>>>>> 
>>>>>>>>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>>>>>>>>> 
>>>>>>>>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>>>>>>>>> 
>>>>>>>>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>>>>>>>>> 
>>>>>>>>> Configuration:
>>>>>>>>> <IfModule prefork.c>
>>>>>>>>> ListenBacklog 105384
>>>>>>>>> ServerLimit 105000
>>>>>>>>> MaxClients 1024
>>>>>>>>> MaxRequestsPerChild 0
>>>>>>>>> StartServers 64
>>>>>>>>> MinSpareServers 8
>>>>>>>>> MaxSpareServers 16
>>>>>>>>> </IfModule>
>>>>>>>>> 
>>>>>>>>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Yingqi
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Born in Roswell... married an alien...
>>>>>>>>> http://emptyhammock.com/
>>>>>>>>> http://edjective.org/
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Born in Roswell... married an alien...
>>>>>>>>> http://emptyhammock.com/
>>>>>>>>> http://edjective.org/
>>>>>>>>> 
>>>>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>>>>> 
>>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>> 
>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>> 
> 


RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by "Lu, Yingqi" <yi...@intel.com>.
Thank you very much for your help!

Thanks,
Yingqi

-----Original Message-----
From: Jim Jagielski [mailto:jim@jaguNET.com] 
Sent: Tuesday, June 03, 2014 8:31 AM
To: dev@httpd.apache.org
Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Next on the agenda is to push into eventopt

On Jun 3, 2014, at 10:21 AM, Jim Jagielski <ji...@jaguNET.com> wrote:

> FTR: I saw no reason to try to handle both patches... I used the 
> so_reuseport patch as the primary patch to focus on.
> 
> I have some minor changes coming up to follow-up post the initial 
> commit
> 
> On Jun 3, 2014, at 8:51 AM, Jim Jagielski <ji...@jagunet.com> wrote:
> 
>> I have folded this into trunk and am currently fixing some compile 
>> warnings and errors...
>> 
>> On Jun 2, 2014, at 4:22 AM, Lu, Yingqi <yi...@intel.com> wrote:
>> 
>>> Hi Jim,
>>> 
>>> Personally, I think the second approach is better, it keeps ap_mpm_pod_signal () and ap_mpm_pod_killpg () exactly as the original ones, only modifies dummy_connection (). Please let me know if you have different opinions.
>>> 
>>> Attached is the latest version of the two patches. They were both generated against trunk rev. 1598561. Please review them and let me know if there is anything missing.
>>> 
>>> I already updated the Bugzilla database for the item 55897 and item 56279.
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>> -----Original Message-----
>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>> Sent: Saturday, May 31, 2014 11:48 PM
>>> To: dev@httpd.apache.org
>>> Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>> SO_REUSEPORT support
>>> 
>>> Hi Jim,
>>> 
>>> Regarding to your comment #2, yes, you are right, it should be ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i). Thanks very much for catching this.
>>> 
>>> Regarding to your comment #1, the patch modifies the dummy_connection(ap_pod_t *pod) to be dummy_connection(ap_pod_t *pod, int child_bucket). Inside the function, the reference listen statement is mpm_listen[child_bucket]. And, ap_mpm_pod_signal() calls dummy_connection(). 
>>> 
>>> Can we just modify the return of ap_mpm_pod_signal() from dummy_connection(pod) to dummy_connection(pod, 0) and add ap_mpm_pod_signal_ex()? 
>>> 
>>> Or, if we need to keep ap_mpm_pod_signal() exactly as the original, I can modify dummy_connection() to send dummy data via all the duplicated listen statements. Then, we do not need child_bucket as the input parameter for dummy_connection(). In this case, we do not need adding ap_mpm_pod_signal_ex() too.
>>> 
>>> I already tested the code for the above approaches and they both work. 
>>> 
>>> Please let me know which way you think is better. I can quickly send you an update for review.
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>> -----Original Message-----
>>> From: Lu, Yingqi
>>> Sent: Saturday, May 31, 2014 3:28 PM
>>> To: <de...@httpd.apache.org>
>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>> SO_REUSEPORT support
>>> 
>>> Hi Jim,
>>> 
>>> Thanks very much for your email! I will look into both of them and send an update tonight!
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>>> On May 31, 2014, at 9:43 AM, "Jim Jagielski" <ji...@jaguNET.com> wrote:
>>>> 
>>>> I also see:
>>>> 
>>>>      /* kill off the idle ones */
>>>> -        ap_mpm_pod_killpg(pod, retained->max_daemons_limit);
>>>> +        for (i = 0; i < num_buckets; i++) {
>>>> +            ap_mpm_pod_killpg(pod[i], i, retained->max_daemons_limit);
>>>> +        }
>>>> 
>>>> 
>>>> Is that right? Why isn't it: ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i); ??
>>>> 
>>>> /**
>>>> * Write data to the pipe-of-death, signalling that all child 
>>>> process
>>>> * should die.
>>>> * @param pod The pipe-of-death to write to.
>>>> * @param num The number of child processes to kill
>>>> + * @param my_bucket the bucket that holds the dying child process.
>>>> */
>>>> -AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num);
>>>> +AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num, int 
>>>> +child_bucket);
>>>> 
>>>> Isn't 'num' the same in both implementation??
>>>> 
>>>>> On May 31, 2014, at 12:03 PM, Jim Jagielski <ji...@jagunet.com> wrote:
>>>>> 
>>>>> Sorry I didn't catch this earlier:
>>>>> 
>>>>> I see
>>>>> 
>>>>> +++ httpd-trunk.new/include/mpm_common.h    2014-05-16 13:07:03.892987491 -0400
>>>>> @@ -267,16 +267,18 @@
>>>>> * Write data to the pipe-of-death, signalling that one child 
>>>>> process
>>>>> * should die.
>>>>> * @param pod the pipe-of-death to write to.
>>>>> + * @param my_bucket the bucket that holds the dying child process.
>>>>> */
>>>>> -AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod);
>>>>> +AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod, int 
>>>>> +child_bucket);
>>>>> 
>>>>> We can change the API at this point. We could add another 
>>>>> function, eg ap_mpm_pod_signal_ex() which takes the int param, but 
>>>>> we can't modify ap_mpm_pod_signal() itself.
>>>>> 
>>>>>> On May 30, 2014, at 11:15 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>> 
>>>>>> Thank you very much!
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>> Sent: Friday, May 30, 2014 7:07 AM
>>>>>> To: dev@httpd.apache.org
>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>> SO_REUSEPORT support
>>>>>> 
>>>>>> Thx! Let me review. My plan is to fold into trunk this weekend.
>>>>>> 
>>>>>>> On May 16, 2014, at 2:53 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>> 
>>>>>>> Hi Jim,
>>>>>>> 
>>>>>>> Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.
>>>>>>> 
>>>>>>> Attached are the two most recent patches. I already updated the bugzilla #55897 as well.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Yingqi
>>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>>> Sent: Thursday, May 15, 2014 7:53 AM
>>>>>>> To: dev@httpd.apache.org
>>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>> SO_REUSEPORT support
>>>>>>> 
>>>>>>> I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
>>>>>>> We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).
>>>>>>> 
>>>>>>>> On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>> 
>>>>>>>> Hi Jim,
>>>>>>>> 
>>>>>>>> Thanks very much for your email.
>>>>>>>> 
>>>>>>>> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
>>>>>>>> 
>>>>>>>> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
>>>>>>>> 
>>>>>>>> Please let me know if this answers your question.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Yingqi
>>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>>>> Sent: Wednesday, May 14, 2014 6:57 AM
>>>>>>>> To: dev@httpd.apache.org
>>>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>>> SO_REUSEPORT support
>>>>>>>> 
>>>>>>>> This is very cool!
>>>>>>>> 
>>>>>>>> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
>>>>>>>> We look OK on Linux, FreeBSD and OSX...
>>>>>>>> 
>>>>>>>> I'm +1 on folding into trunk.
>>>>>>>> 
>>>>>>>>> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>>> 
>>>>>>>>> Dear All,
>>>>>>>>> 
>>>>>>>>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>>>>>>>>> 
>>>>>>>>> Quick test result on modern dual socket Intel platform (Linux 
>>>>>>>>> Kernel
>>>>>>>>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>>>>>>>>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>>>>>>>>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>>>>>>>>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>>>>>>>>> 
>>>>>>>>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>>>>>>>>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>>>>>>>>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>>>>>>>>> 
>>>>>>>>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>>>>>>>>> 
>>>>>>>>> Please let me know your feedback and comments.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Yingqi
>>>>>>>>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>>>>> 
>>>>>>>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>>>>>>>> Sent: Monday, March 17, 2014 1:41 PM
>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>>> with SO_REUSEPORT support
>>>>>>>>> 
>>>>>>>>> Dear all,
>>>>>>>>> 
>>>>>>>>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>>>>>>>>> 
>>>>>>>>> Below are the changes we made into this new version:
>>>>>>>>> 
>>>>>>>>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>>>>>>>>> 
>>>>>>>>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>>>>>>>>> 
>>>>>>>>> According to Jeff Trawick's comments, 1. We generate the 
>>>>>>>>> patches against the httpd trunk.
>>>>>>>>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>>>>>>>>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>>>>>>>>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>>>>>>>>> 
>>>>>>>>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Yingqi Lu
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> From: Lu, Yingqi
>>>>>>>>> Sent: Tuesday, March 04, 2014 10:43 AM
>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>>> with SO_REUSEPORT support
>>>>>>>>> 
>>>>>>>>> Hi Jeff,
>>>>>>>>> 
>>>>>>>>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Yingqi
>>>>>>>>> 
>>>>>>>>> From: Jeff Trawick [mailto:trawick@gmail.com]
>>>>>>>>> Sent: Tuesday, March 04, 2014 10:08 AM
>>>>>>>>> To: Apache HTTP Server Development List
>>>>>>>>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>>> with SO_REUSEPORT support
>>>>>>>>> 
>>>>>>>>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>>> Hi All,
>>>>>>>>> 
>>>>>>>>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>>>>>>>>> 
>>>>>>>>> If you need any additional information/supporting data, please let us know as well.
>>>>>>>>> 
>>>>>>>>> Yeah, it has been on my todo list, but I don't have time to 
>>>>>>>>> give an in depth review at the moment.  Here are a few questions/comments.
>>>>>>>>> (And you'll have to deal with the fact that it is 
>>>>>>>>> unnecessarily tedious for me to evaluate higher-level 
>>>>>>>>> considerations if there are a lot of distractions, such as the 
>>>>>>>>> code comments below ;) But others are of course free to chime 
>>>>>>>>> in.)
>>>>>>>>> 
>>>>>>>>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>>>>>>>>> 
>>>>>>>>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>>>>>>>>> 
>>>>>>>>> Have you tried the event MPM?
>>>>>>>>> 
>>>>>>>>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>>>>>>>>> 
>>>>>>>>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>>>>>>>>> 
>>>>>>>>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>>>>>>>>> 
>>>>>>>>> For the configure-time check for 3.9 kernel: I think we'd also 
>>>>>>>>> use AC_TRY_COMPILE at configure time to confirm that the 
>>>>>>>>> SO_REUSEPORT definition is available, and not enable it if the 
>>>>>>>>> system includes doesn't define it.  (Does that cause a problem 
>>>>>>>>> for any significant number of people?)
>>>>>>>>> 
>>>>>>>>> Don't mention the patch in the patch ;) (e.g., "This function 
>>>>>>>>> is added for the patch.")
>>>>>>>>> 
>>>>>>>>> Incomplete comments on style/syntax issues:
>>>>>>>>> 
>>>>>>>>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>>>>>>>>> apr_socket_t *temps;") isn't supported by all compilers and is 
>>>>>>>>> distracting when reviewing
>>>>>>>>> * suitable identifier names (e.g., fix global variable "flag" 
>>>>>>>>> and whatever else isn't appropriate; "ap_post_config_listeners"
>>>>>>>>> should be renamed to indicate what it does
>>>>>>>>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>>>>>>>>> and don't add reminders in comments
>>>>>>>>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>>>>>>>>> and so on
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Yingqi
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> From: Lu, Yingqi
>>>>>>>>> Sent: Friday, January 24, 2014 3:26 PM
>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>>>> SO_REUSEPORT support
>>>>>>>>> 
>>>>>>>>> Dear All,
>>>>>>>>> 
>>>>>>>>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>>>>>>>>> 
>>>>>>>>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>>>>>>>>> 
>>>>>>>>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>>>>>>>>> 
>>>>>>>>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>>>>>>>>> 
>>>>>>>>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>>>>>>>>> 
>>>>>>>>> Configuration:
>>>>>>>>> <IfModule prefork.c>
>>>>>>>>> ListenBacklog 105384
>>>>>>>>> ServerLimit 105000
>>>>>>>>> MaxClients 1024
>>>>>>>>> MaxRequestsPerChild 0
>>>>>>>>> StartServers 64
>>>>>>>>> MinSpareServers 8
>>>>>>>>> MaxSpareServers 16
>>>>>>>>> </IfModule>
>>>>>>>>> 
>>>>>>>>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Yingqi
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Born in Roswell... married an alien...
>>>>>>>>> http://emptyhammock.com/
>>>>>>>>> http://edjective.org/
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Born in Roswell... married an alien...
>>>>>>>>> http://emptyhammock.com/
>>>>>>>>> http://edjective.org/
>>>>>>>>> 
>>>>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>>>>> 
>>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>> 
>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>> 
> 


Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by Jim Jagielski <ji...@jaguNET.com>.
Next on the agenda is to push into eventopt

On Jun 3, 2014, at 10:21 AM, Jim Jagielski <ji...@jaguNET.com> wrote:

> FTR: I saw no reason to try to handle both patches... I used
> the so_reuseport patch as the primary patch to focus on.
> 
> I have some minor changes coming up to follow-up post
> the initial commit
> 
> On Jun 3, 2014, at 8:51 AM, Jim Jagielski <ji...@jagunet.com> wrote:
> 
>> I have folded this into trunk and am currently fixing some
>> compile warnings and errors...
>> 
>> On Jun 2, 2014, at 4:22 AM, Lu, Yingqi <yi...@intel.com> wrote:
>> 
>>> Hi Jim,
>>> 
>>> Personally, I think the second approach is better, it keeps ap_mpm_pod_signal () and ap_mpm_pod_killpg () exactly as the original ones, only modifies dummy_connection (). Please let me know if you have different opinions.
>>> 
>>> Attached is the latest version of the two patches. They were both generated against trunk rev. 1598561. Please review them and let me know if there is anything missing.
>>> 
>>> I already updated the Bugzilla database for the item 55897 and item 56279.
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>> -----Original Message-----
>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com] 
>>> Sent: Saturday, May 31, 2014 11:48 PM
>>> To: dev@httpd.apache.org
>>> Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
>>> 
>>> Hi Jim,
>>> 
>>> Regarding to your comment #2, yes, you are right, it should be ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i). Thanks very much for catching this.
>>> 
>>> Regarding to your comment #1, the patch modifies the dummy_connection(ap_pod_t *pod) to be dummy_connection(ap_pod_t *pod, int child_bucket). Inside the function, the reference listen statement is mpm_listen[child_bucket]. And, ap_mpm_pod_signal() calls dummy_connection(). 
>>> 
>>> Can we just modify the return of ap_mpm_pod_signal() from dummy_connection(pod) to dummy_connection(pod, 0) and add ap_mpm_pod_signal_ex()? 
>>> 
>>> Or, if we need to keep ap_mpm_pod_signal() exactly as the original, I can modify dummy_connection() to send dummy data via all the duplicated listen statements. Then, we do not need child_bucket as the input parameter for dummy_connection(). In this case, we do not need adding ap_mpm_pod_signal_ex() too.
>>> 
>>> I already tested the code for the above approaches and they both work. 
>>> 
>>> Please let me know which way you think is better. I can quickly send you an update for review.
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>> -----Original Message-----
>>> From: Lu, Yingqi
>>> Sent: Saturday, May 31, 2014 3:28 PM
>>> To: <de...@httpd.apache.org>
>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
>>> 
>>> Hi Jim,
>>> 
>>> Thanks very much for your email! I will look into both of them and send an update tonight!
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>>> On May 31, 2014, at 9:43 AM, "Jim Jagielski" <ji...@jaguNET.com> wrote:
>>>> 
>>>> I also see:
>>>> 
>>>>      /* kill off the idle ones */
>>>> -        ap_mpm_pod_killpg(pod, retained->max_daemons_limit);
>>>> +        for (i = 0; i < num_buckets; i++) {
>>>> +            ap_mpm_pod_killpg(pod[i], i, retained->max_daemons_limit);
>>>> +        }
>>>> 
>>>> 
>>>> Is that right? Why isn't it: ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i); ??
>>>> 
>>>> /**
>>>> * Write data to the pipe-of-death, signalling that all child process
>>>> * should die.
>>>> * @param pod The pipe-of-death to write to.
>>>> * @param num The number of child processes to kill
>>>> + * @param my_bucket the bucket that holds the dying child process.
>>>> */
>>>> -AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num);
>>>> +AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num, int 
>>>> +child_bucket);
>>>> 
>>>> Isn't 'num' the same in both implementation??
>>>> 
>>>>> On May 31, 2014, at 12:03 PM, Jim Jagielski <ji...@jagunet.com> wrote:
>>>>> 
>>>>> Sorry I didn't catch this earlier:
>>>>> 
>>>>> I see
>>>>> 
>>>>> +++ httpd-trunk.new/include/mpm_common.h    2014-05-16 13:07:03.892987491 -0400
>>>>> @@ -267,16 +267,18 @@
>>>>> * Write data to the pipe-of-death, signalling that one child process
>>>>> * should die.
>>>>> * @param pod the pipe-of-death to write to.
>>>>> + * @param my_bucket the bucket that holds the dying child process.
>>>>> */
>>>>> -AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod);
>>>>> +AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod, int 
>>>>> +child_bucket);
>>>>> 
>>>>> We can change the API at this point. We could add another function, 
>>>>> eg ap_mpm_pod_signal_ex() which takes the int param, but we can't 
>>>>> modify ap_mpm_pod_signal() itself.
>>>>> 
>>>>>> On May 30, 2014, at 11:15 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>> 
>>>>>> Thank you very much!
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>> Sent: Friday, May 30, 2014 7:07 AM
>>>>>> To: dev@httpd.apache.org
>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>> SO_REUSEPORT support
>>>>>> 
>>>>>> Thx! Let me review. My plan is to fold into trunk this weekend.
>>>>>> 
>>>>>>> On May 16, 2014, at 2:53 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>> 
>>>>>>> Hi Jim,
>>>>>>> 
>>>>>>> Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.
>>>>>>> 
>>>>>>> Attached are the two most recent patches. I already updated the bugzilla #55897 as well.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Yingqi
>>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>>> Sent: Thursday, May 15, 2014 7:53 AM
>>>>>>> To: dev@httpd.apache.org
>>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>> SO_REUSEPORT support
>>>>>>> 
>>>>>>> I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
>>>>>>> We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).
>>>>>>> 
>>>>>>>> On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>> 
>>>>>>>> Hi Jim,
>>>>>>>> 
>>>>>>>> Thanks very much for your email.
>>>>>>>> 
>>>>>>>> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
>>>>>>>> 
>>>>>>>> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
>>>>>>>> 
>>>>>>>> Please let me know if this answers your question.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Yingqi
>>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>>>> Sent: Wednesday, May 14, 2014 6:57 AM
>>>>>>>> To: dev@httpd.apache.org
>>>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>>> SO_REUSEPORT support
>>>>>>>> 
>>>>>>>> This is very cool!
>>>>>>>> 
>>>>>>>> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
>>>>>>>> We look OK on Linux, FreeBSD and OSX...
>>>>>>>> 
>>>>>>>> I'm +1 on folding into trunk.
>>>>>>>> 
>>>>>>>>> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>>> 
>>>>>>>>> Dear All,
>>>>>>>>> 
>>>>>>>>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>>>>>>>>> 
>>>>>>>>> Quick test result on modern dual socket Intel platform (Linux 
>>>>>>>>> Kernel
>>>>>>>>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>>>>>>>>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>>>>>>>>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>>>>>>>>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>>>>>>>>> 
>>>>>>>>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>>>>>>>>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>>>>>>>>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>>>>>>>>> 
>>>>>>>>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>>>>>>>>> 
>>>>>>>>> Please let me know your feedback and comments.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Yingqi
>>>>>>>>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>>>>> 
>>>>>>>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>>>>>>>> Sent: Monday, March 17, 2014 1:41 PM
>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>>> with SO_REUSEPORT support
>>>>>>>>> 
>>>>>>>>> Dear all,
>>>>>>>>> 
>>>>>>>>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>>>>>>>>> 
>>>>>>>>> Below are the changes we made into this new version:
>>>>>>>>> 
>>>>>>>>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>>>>>>>>> 
>>>>>>>>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>>>>>>>>> 
>>>>>>>>> According to Jeff Trawick's comments, 1. We generate the patches 
>>>>>>>>> against the httpd trunk.
>>>>>>>>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>>>>>>>>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>>>>>>>>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>>>>>>>>> 
>>>>>>>>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Yingqi Lu
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> From: Lu, Yingqi
>>>>>>>>> Sent: Tuesday, March 04, 2014 10:43 AM
>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>>> with SO_REUSEPORT support
>>>>>>>>> 
>>>>>>>>> Hi Jeff,
>>>>>>>>> 
>>>>>>>>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Yingqi
>>>>>>>>> 
>>>>>>>>> From: Jeff Trawick [mailto:trawick@gmail.com]
>>>>>>>>> Sent: Tuesday, March 04, 2014 10:08 AM
>>>>>>>>> To: Apache HTTP Server Development List
>>>>>>>>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>>> with SO_REUSEPORT support
>>>>>>>>> 
>>>>>>>>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>>> Hi All,
>>>>>>>>> 
>>>>>>>>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>>>>>>>>> 
>>>>>>>>> If you need any additional information/supporting data, please let us know as well.
>>>>>>>>> 
>>>>>>>>> Yeah, it has been on my todo list, but I don't have time to give 
>>>>>>>>> an in depth review at the moment.  Here are a few questions/comments.
>>>>>>>>> (And you'll have to deal with the fact that it is unnecessarily 
>>>>>>>>> tedious for me to evaluate higher-level considerations if there 
>>>>>>>>> are a lot of distractions, such as the code comments below ;) But 
>>>>>>>>> others are of course free to chime in.)
>>>>>>>>> 
>>>>>>>>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>>>>>>>>> 
>>>>>>>>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>>>>>>>>> 
>>>>>>>>> Have you tried the event MPM?
>>>>>>>>> 
>>>>>>>>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>>>>>>>>> 
>>>>>>>>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>>>>>>>>> 
>>>>>>>>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>>>>>>>>> 
>>>>>>>>> For the configure-time check for 3.9 kernel: I think we'd also 
>>>>>>>>> use AC_TRY_COMPILE at configure time to confirm that the 
>>>>>>>>> SO_REUSEPORT definition is available, and not enable it if the 
>>>>>>>>> system includes doesn't define it.  (Does that cause a problem 
>>>>>>>>> for any significant number of people?)
>>>>>>>>> 
>>>>>>>>> Don't mention the patch in the patch ;) (e.g., "This function is 
>>>>>>>>> added for the patch.")
>>>>>>>>> 
>>>>>>>>> Incomplete comments on style/syntax issues:
>>>>>>>>> 
>>>>>>>>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>>>>>>>>> apr_socket_t *temps;") isn't supported by all compilers and is 
>>>>>>>>> distracting when reviewing
>>>>>>>>> * suitable identifier names (e.g., fix global variable "flag" and 
>>>>>>>>> whatever else isn't appropriate; "ap_post_config_listeners"
>>>>>>>>> should be renamed to indicate what it does
>>>>>>>>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>>>>>>>>> and don't add reminders in comments
>>>>>>>>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>>>>>>>>> and so on
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Yingqi
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> From: Lu, Yingqi
>>>>>>>>> Sent: Friday, January 24, 2014 3:26 PM
>>>>>>>>> To: dev@httpd.apache.org
>>>>>>>>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>>>> SO_REUSEPORT support
>>>>>>>>> 
>>>>>>>>> Dear All,
>>>>>>>>> 
>>>>>>>>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>>>>>>>>> 
>>>>>>>>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>>>>>>>>> 
>>>>>>>>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>>>>>>>>> 
>>>>>>>>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>>>>>>>>> 
>>>>>>>>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>>>>>>>>> 
>>>>>>>>> Configuration:
>>>>>>>>> <IfModule prefork.c>
>>>>>>>>> ListenBacklog 105384
>>>>>>>>> ServerLimit 105000
>>>>>>>>> MaxClients 1024
>>>>>>>>> MaxRequestsPerChild 0
>>>>>>>>> StartServers 64
>>>>>>>>> MinSpareServers 8
>>>>>>>>> MaxSpareServers 16
>>>>>>>>> </IfModule>
>>>>>>>>> 
>>>>>>>>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Yingqi
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Born in Roswell... married an alien...
>>>>>>>>> http://emptyhammock.com/
>>>>>>>>> http://edjective.org/
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Born in Roswell... married an alien...
>>>>>>>>> http://emptyhammock.com/
>>>>>>>>> http://edjective.org/
>>>>>>>>> 
>>>>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>>>>> 
>>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>> 
>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>> 
> 


Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by Jim Jagielski <ji...@jaguNET.com>.
FTR: I saw no reason to try to handle both patches... I used
the so_reuseport patch as the primary patch to focus on.

I have some minor changes coming up to follow-up post
the initial commit

On Jun 3, 2014, at 8:51 AM, Jim Jagielski <ji...@jagunet.com> wrote:

> I have folded this into trunk and am currently fixing some
> compile warnings and errors...
> 
> On Jun 2, 2014, at 4:22 AM, Lu, Yingqi <yi...@intel.com> wrote:
> 
>> Hi Jim,
>> 
>> Personally, I think the second approach is better, it keeps ap_mpm_pod_signal () and ap_mpm_pod_killpg () exactly as the original ones, only modifies dummy_connection (). Please let me know if you have different opinions.
>> 
>> Attached is the latest version of the two patches. They were both generated against trunk rev. 1598561. Please review them and let me know if there is anything missing.
>> 
>> I already updated the Bugzilla database for the item 55897 and item 56279.
>> 
>> Thanks,
>> Yingqi
>> 
>> -----Original Message-----
>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com] 
>> Sent: Saturday, May 31, 2014 11:48 PM
>> To: dev@httpd.apache.org
>> Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
>> 
>> Hi Jim,
>> 
>> Regarding to your comment #2, yes, you are right, it should be ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i). Thanks very much for catching this.
>> 
>> Regarding to your comment #1, the patch modifies the dummy_connection(ap_pod_t *pod) to be dummy_connection(ap_pod_t *pod, int child_bucket). Inside the function, the reference listen statement is mpm_listen[child_bucket]. And, ap_mpm_pod_signal() calls dummy_connection(). 
>> 
>> Can we just modify the return of ap_mpm_pod_signal() from dummy_connection(pod) to dummy_connection(pod, 0) and add ap_mpm_pod_signal_ex()? 
>> 
>> Or, if we need to keep ap_mpm_pod_signal() exactly as the original, I can modify dummy_connection() to send dummy data via all the duplicated listen statements. Then, we do not need child_bucket as the input parameter for dummy_connection(). In this case, we do not need adding ap_mpm_pod_signal_ex() too.
>> 
>> I already tested the code for the above approaches and they both work. 
>> 
>> Please let me know which way you think is better. I can quickly send you an update for review.
>> 
>> Thanks,
>> Yingqi
>> 
>> -----Original Message-----
>> From: Lu, Yingqi
>> Sent: Saturday, May 31, 2014 3:28 PM
>> To: <de...@httpd.apache.org>
>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
>> 
>> Hi Jim,
>> 
>> Thanks very much for your email! I will look into both of them and send an update tonight!
>> 
>> Thanks,
>> Yingqi
>> 
>>> On May 31, 2014, at 9:43 AM, "Jim Jagielski" <ji...@jaguNET.com> wrote:
>>> 
>>> I also see:
>>> 
>>>       /* kill off the idle ones */
>>> -        ap_mpm_pod_killpg(pod, retained->max_daemons_limit);
>>> +        for (i = 0; i < num_buckets; i++) {
>>> +            ap_mpm_pod_killpg(pod[i], i, retained->max_daemons_limit);
>>> +        }
>>> 
>>> 
>>> Is that right? Why isn't it: ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i); ??
>>> 
>>> /**
>>> * Write data to the pipe-of-death, signalling that all child process
>>> * should die.
>>> * @param pod The pipe-of-death to write to.
>>> * @param num The number of child processes to kill
>>> + * @param my_bucket the bucket that holds the dying child process.
>>> */
>>> -AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num);
>>> +AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num, int 
>>> +child_bucket);
>>> 
>>> Isn't 'num' the same in both implementation??
>>> 
>>>> On May 31, 2014, at 12:03 PM, Jim Jagielski <ji...@jagunet.com> wrote:
>>>> 
>>>> Sorry I didn't catch this earlier:
>>>> 
>>>> I see
>>>> 
>>>> +++ httpd-trunk.new/include/mpm_common.h    2014-05-16 13:07:03.892987491 -0400
>>>> @@ -267,16 +267,18 @@
>>>> * Write data to the pipe-of-death, signalling that one child process
>>>> * should die.
>>>> * @param pod the pipe-of-death to write to.
>>>> + * @param my_bucket the bucket that holds the dying child process.
>>>> */
>>>> -AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod);
>>>> +AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod, int 
>>>> +child_bucket);
>>>> 
>>>> We can change the API at this point. We could add another function, 
>>>> eg ap_mpm_pod_signal_ex() which takes the int param, but we can't 
>>>> modify ap_mpm_pod_signal() itself.
>>>> 
>>>>> On May 30, 2014, at 11:15 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>> 
>>>>> Thank you very much!
>>>>> 
>>>>> Thanks,
>>>>> Yingqi
>>>>> 
>>>>> -----Original Message-----
>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>> Sent: Friday, May 30, 2014 7:07 AM
>>>>> To: dev@httpd.apache.org
>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>> SO_REUSEPORT support
>>>>> 
>>>>> Thx! Let me review. My plan is to fold into trunk this weekend.
>>>>> 
>>>>>> On May 16, 2014, at 2:53 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>> 
>>>>>> Hi Jim,
>>>>>> 
>>>>>> Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.
>>>>>> 
>>>>>> Attached are the two most recent patches. I already updated the bugzilla #55897 as well.
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>> Sent: Thursday, May 15, 2014 7:53 AM
>>>>>> To: dev@httpd.apache.org
>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>> SO_REUSEPORT support
>>>>>> 
>>>>>> I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
>>>>>> We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).
>>>>>> 
>>>>>>> On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>> 
>>>>>>> Hi Jim,
>>>>>>> 
>>>>>>> Thanks very much for your email.
>>>>>>> 
>>>>>>> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
>>>>>>> 
>>>>>>> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
>>>>>>> 
>>>>>>> Please let me know if this answers your question.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Yingqi
>>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>>> Sent: Wednesday, May 14, 2014 6:57 AM
>>>>>>> To: dev@httpd.apache.org
>>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>> SO_REUSEPORT support
>>>>>>> 
>>>>>>> This is very cool!
>>>>>>> 
>>>>>>> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
>>>>>>> We look OK on Linux, FreeBSD and OSX...
>>>>>>> 
>>>>>>> I'm +1 on folding into trunk.
>>>>>>> 
>>>>>>>> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>> 
>>>>>>>> Dear All,
>>>>>>>> 
>>>>>>>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>>>>>>>> 
>>>>>>>> Quick test result on modern dual socket Intel platform (Linux 
>>>>>>>> Kernel
>>>>>>>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>>>>>>>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>>>>>>>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>>>>>>>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>>>>>>>> 
>>>>>>>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>>>>>>>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>>>>>>>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>>>>>>>> 
>>>>>>>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>>>>>>>> 
>>>>>>>> Please let me know your feedback and comments.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Yingqi
>>>>>>>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>>>> 
>>>>>>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>>>>>>> Sent: Monday, March 17, 2014 1:41 PM
>>>>>>>> To: dev@httpd.apache.org
>>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>> with SO_REUSEPORT support
>>>>>>>> 
>>>>>>>> Dear all,
>>>>>>>> 
>>>>>>>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>>>>>>>> 
>>>>>>>> Below are the changes we made into this new version:
>>>>>>>> 
>>>>>>>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>>>>>>>> 
>>>>>>>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>>>>>>>> 
>>>>>>>> According to Jeff Trawick's comments, 1. We generate the patches 
>>>>>>>> against the httpd trunk.
>>>>>>>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>>>>>>>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>>>>>>>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>>>>>>>> 
>>>>>>>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Yingqi Lu
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> From: Lu, Yingqi
>>>>>>>> Sent: Tuesday, March 04, 2014 10:43 AM
>>>>>>>> To: dev@httpd.apache.org
>>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>> with SO_REUSEPORT support
>>>>>>>> 
>>>>>>>> Hi Jeff,
>>>>>>>> 
>>>>>>>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Yingqi
>>>>>>>> 
>>>>>>>> From: Jeff Trawick [mailto:trawick@gmail.com]
>>>>>>>> Sent: Tuesday, March 04, 2014 10:08 AM
>>>>>>>> To: Apache HTTP Server Development List
>>>>>>>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>>> with SO_REUSEPORT support
>>>>>>>> 
>>>>>>>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>>> Hi All,
>>>>>>>> 
>>>>>>>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>>>>>>>> 
>>>>>>>> If you need any additional information/supporting data, please let us know as well.
>>>>>>>> 
>>>>>>>> Yeah, it has been on my todo list, but I don't have time to give 
>>>>>>>> an in depth review at the moment.  Here are a few questions/comments.
>>>>>>>> (And you'll have to deal with the fact that it is unnecessarily 
>>>>>>>> tedious for me to evaluate higher-level considerations if there 
>>>>>>>> are a lot of distractions, such as the code comments below ;) But 
>>>>>>>> others are of course free to chime in.)
>>>>>>>> 
>>>>>>>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>>>>>>>> 
>>>>>>>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>>>>>>>> 
>>>>>>>> Have you tried the event MPM?
>>>>>>>> 
>>>>>>>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>>>>>>>> 
>>>>>>>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>>>>>>>> 
>>>>>>>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>>>>>>>> 
>>>>>>>> For the configure-time check for 3.9 kernel: I think we'd also 
>>>>>>>> use AC_TRY_COMPILE at configure time to confirm that the 
>>>>>>>> SO_REUSEPORT definition is available, and not enable it if the 
>>>>>>>> system includes doesn't define it.  (Does that cause a problem 
>>>>>>>> for any significant number of people?)
>>>>>>>> 
>>>>>>>> Don't mention the patch in the patch ;) (e.g., "This function is 
>>>>>>>> added for the patch.")
>>>>>>>> 
>>>>>>>> Incomplete comments on style/syntax issues:
>>>>>>>> 
>>>>>>>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>>>>>>>> apr_socket_t *temps;") isn't supported by all compilers and is 
>>>>>>>> distracting when reviewing
>>>>>>>> * suitable identifier names (e.g., fix global variable "flag" and 
>>>>>>>> whatever else isn't appropriate; "ap_post_config_listeners"
>>>>>>>> should be renamed to indicate what it does
>>>>>>>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>>>>>>>> and don't add reminders in comments
>>>>>>>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>>>>>>>> and so on
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Yingqi
>>>>>>>> 
>>>>>>>> 
>>>>>>>> From: Lu, Yingqi
>>>>>>>> Sent: Friday, January 24, 2014 3:26 PM
>>>>>>>> To: dev@httpd.apache.org
>>>>>>>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>>> SO_REUSEPORT support
>>>>>>>> 
>>>>>>>> Dear All,
>>>>>>>> 
>>>>>>>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>>>>>>>> 
>>>>>>>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>>>>>>>> 
>>>>>>>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>>>>>>>> 
>>>>>>>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>>>>>>>> 
>>>>>>>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>>>>>>>> 
>>>>>>>> Configuration:
>>>>>>>> <IfModule prefork.c>
>>>>>>>> ListenBacklog 105384
>>>>>>>> ServerLimit 105000
>>>>>>>> MaxClients 1024
>>>>>>>> MaxRequestsPerChild 0
>>>>>>>> StartServers 64
>>>>>>>> MinSpareServers 8
>>>>>>>> MaxSpareServers 16
>>>>>>>> </IfModule>
>>>>>>>> 
>>>>>>>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Yingqi
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Born in Roswell... married an alien...
>>>>>>>> http://emptyhammock.com/
>>>>>>>> http://edjective.org/
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Born in Roswell... married an alien...
>>>>>>>> http://emptyhammock.com/
>>>>>>>> http://edjective.org/
>>>>>>>> 
>>>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>>>> 
>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>> 
>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
> 


Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by Jim Jagielski <ji...@jaguNET.com>.
I have folded this into trunk and am currently fixing some
compile warnings and errors...

On Jun 2, 2014, at 4:22 AM, Lu, Yingqi <yi...@intel.com> wrote:

> Hi Jim,
> 
> Personally, I think the second approach is better, it keeps ap_mpm_pod_signal () and ap_mpm_pod_killpg () exactly as the original ones, only modifies dummy_connection (). Please let me know if you have different opinions.
> 
> Attached is the latest version of the two patches. They were both generated against trunk rev. 1598561. Please review them and let me know if there is anything missing.
> 
> I already updated the Bugzilla database for the item 55897 and item 56279.
> 
> Thanks,
> Yingqi
> 
> -----Original Message-----
> From: Lu, Yingqi [mailto:yingqi.lu@intel.com] 
> Sent: Saturday, May 31, 2014 11:48 PM
> To: dev@httpd.apache.org
> Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
> 
> Hi Jim,
> 
> Regarding to your comment #2, yes, you are right, it should be ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i). Thanks very much for catching this.
> 
> Regarding to your comment #1, the patch modifies the dummy_connection(ap_pod_t *pod) to be dummy_connection(ap_pod_t *pod, int child_bucket). Inside the function, the reference listen statement is mpm_listen[child_bucket]. And, ap_mpm_pod_signal() calls dummy_connection(). 
> 
> Can we just modify the return of ap_mpm_pod_signal() from dummy_connection(pod) to dummy_connection(pod, 0) and add ap_mpm_pod_signal_ex()? 
> 
> Or, if we need to keep ap_mpm_pod_signal() exactly as the original, I can modify dummy_connection() to send dummy data via all the duplicated listen statements. Then, we do not need child_bucket as the input parameter for dummy_connection(). In this case, we do not need adding ap_mpm_pod_signal_ex() too.
> 
> I already tested the code for the above approaches and they both work. 
> 
> Please let me know which way you think is better. I can quickly send you an update for review.
> 
> Thanks,
> Yingqi
> 
> -----Original Message-----
> From: Lu, Yingqi
> Sent: Saturday, May 31, 2014 3:28 PM
> To: <de...@httpd.apache.org>
> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
> 
> Hi Jim,
> 
> Thanks very much for your email! I will look into both of them and send an update tonight!
> 
> Thanks,
> Yingqi
> 
>> On May 31, 2014, at 9:43 AM, "Jim Jagielski" <ji...@jaguNET.com> wrote:
>> 
>> I also see:
>> 
>>        /* kill off the idle ones */
>> -        ap_mpm_pod_killpg(pod, retained->max_daemons_limit);
>> +        for (i = 0; i < num_buckets; i++) {
>> +            ap_mpm_pod_killpg(pod[i], i, retained->max_daemons_limit);
>> +        }
>> 
>> 
>> Is that right? Why isn't it: ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i); ??
>> 
>> /**
>> * Write data to the pipe-of-death, signalling that all child process
>> * should die.
>> * @param pod The pipe-of-death to write to.
>> * @param num The number of child processes to kill
>> + * @param my_bucket the bucket that holds the dying child process.
>> */
>> -AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num);
>> +AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num, int 
>> +child_bucket);
>> 
>> Isn't 'num' the same in both implementation??
>> 
>>> On May 31, 2014, at 12:03 PM, Jim Jagielski <ji...@jagunet.com> wrote:
>>> 
>>> Sorry I didn't catch this earlier:
>>> 
>>> I see
>>> 
>>> +++ httpd-trunk.new/include/mpm_common.h    2014-05-16 13:07:03.892987491 -0400
>>> @@ -267,16 +267,18 @@
>>> * Write data to the pipe-of-death, signalling that one child process
>>> * should die.
>>> * @param pod the pipe-of-death to write to.
>>> + * @param my_bucket the bucket that holds the dying child process.
>>> */
>>> -AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod);
>>> +AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod, int 
>>> +child_bucket);
>>> 
>>> We can change the API at this point. We could add another function, 
>>> eg ap_mpm_pod_signal_ex() which takes the int param, but we can't 
>>> modify ap_mpm_pod_signal() itself.
>>> 
>>>> On May 30, 2014, at 11:15 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>> 
>>>> Thank you very much!
>>>> 
>>>> Thanks,
>>>> Yingqi
>>>> 
>>>> -----Original Message-----
>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>> Sent: Friday, May 30, 2014 7:07 AM
>>>> To: dev@httpd.apache.org
>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>> SO_REUSEPORT support
>>>> 
>>>> Thx! Let me review. My plan is to fold into trunk this weekend.
>>>> 
>>>>> On May 16, 2014, at 2:53 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>> 
>>>>> Hi Jim,
>>>>> 
>>>>> Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.
>>>>> 
>>>>> Attached are the two most recent patches. I already updated the bugzilla #55897 as well.
>>>>> 
>>>>> Thanks,
>>>>> Yingqi
>>>>> 
>>>>> -----Original Message-----
>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>> Sent: Thursday, May 15, 2014 7:53 AM
>>>>> To: dev@httpd.apache.org
>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>> SO_REUSEPORT support
>>>>> 
>>>>> I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
>>>>> We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).
>>>>> 
>>>>>> On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>> 
>>>>>> Hi Jim,
>>>>>> 
>>>>>> Thanks very much for your email.
>>>>>> 
>>>>>> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
>>>>>> 
>>>>>> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
>>>>>> 
>>>>>> Please let me know if this answers your question.
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>> Sent: Wednesday, May 14, 2014 6:57 AM
>>>>>> To: dev@httpd.apache.org
>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>> SO_REUSEPORT support
>>>>>> 
>>>>>> This is very cool!
>>>>>> 
>>>>>> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
>>>>>> We look OK on Linux, FreeBSD and OSX...
>>>>>> 
>>>>>> I'm +1 on folding into trunk.
>>>>>> 
>>>>>>> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>> 
>>>>>>> Dear All,
>>>>>>> 
>>>>>>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>>>>>>> 
>>>>>>> Quick test result on modern dual socket Intel platform (Linux 
>>>>>>> Kernel
>>>>>>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>>>>>>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>>>>>>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>>>>>>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>>>>>>> 
>>>>>>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>>>>>>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>>>>>>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>>>>>>> 
>>>>>>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>>>>>>> 
>>>>>>> Please let me know your feedback and comments.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Yingqi
>>>>>>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>>> 
>>>>>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>>>>>> Sent: Monday, March 17, 2014 1:41 PM
>>>>>>> To: dev@httpd.apache.org
>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>> with SO_REUSEPORT support
>>>>>>> 
>>>>>>> Dear all,
>>>>>>> 
>>>>>>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>>>>>>> 
>>>>>>> Below are the changes we made into this new version:
>>>>>>> 
>>>>>>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>>>>>>> 
>>>>>>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>>>>>>> 
>>>>>>> According to Jeff Trawick's comments, 1. We generate the patches 
>>>>>>> against the httpd trunk.
>>>>>>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>>>>>>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>>>>>>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>>>>>>> 
>>>>>>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Yingqi Lu
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> From: Lu, Yingqi
>>>>>>> Sent: Tuesday, March 04, 2014 10:43 AM
>>>>>>> To: dev@httpd.apache.org
>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>> with SO_REUSEPORT support
>>>>>>> 
>>>>>>> Hi Jeff,
>>>>>>> 
>>>>>>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Yingqi
>>>>>>> 
>>>>>>> From: Jeff Trawick [mailto:trawick@gmail.com]
>>>>>>> Sent: Tuesday, March 04, 2014 10:08 AM
>>>>>>> To: Apache HTTP Server Development List
>>>>>>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>> with SO_REUSEPORT support
>>>>>>> 
>>>>>>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>> Hi All,
>>>>>>> 
>>>>>>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>>>>>>> 
>>>>>>> If you need any additional information/supporting data, please let us know as well.
>>>>>>> 
>>>>>>> Yeah, it has been on my todo list, but I don't have time to give 
>>>>>>> an in depth review at the moment.  Here are a few questions/comments.
>>>>>>> (And you'll have to deal with the fact that it is unnecessarily 
>>>>>>> tedious for me to evaluate higher-level considerations if there 
>>>>>>> are a lot of distractions, such as the code comments below ;) But 
>>>>>>> others are of course free to chime in.)
>>>>>>> 
>>>>>>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>>>>>>> 
>>>>>>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>>>>>>> 
>>>>>>> Have you tried the event MPM?
>>>>>>> 
>>>>>>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>>>>>>> 
>>>>>>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>>>>>>> 
>>>>>>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>>>>>>> 
>>>>>>> For the configure-time check for 3.9 kernel: I think we'd also 
>>>>>>> use AC_TRY_COMPILE at configure time to confirm that the 
>>>>>>> SO_REUSEPORT definition is available, and not enable it if the 
>>>>>>> system includes doesn't define it.  (Does that cause a problem 
>>>>>>> for any significant number of people?)
>>>>>>> 
>>>>>>> Don't mention the patch in the patch ;) (e.g., "This function is 
>>>>>>> added for the patch.")
>>>>>>> 
>>>>>>> Incomplete comments on style/syntax issues:
>>>>>>> 
>>>>>>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>>>>>>> apr_socket_t *temps;") isn't supported by all compilers and is 
>>>>>>> distracting when reviewing
>>>>>>> * suitable identifier names (e.g., fix global variable "flag" and 
>>>>>>> whatever else isn't appropriate; "ap_post_config_listeners"
>>>>>>> should be renamed to indicate what it does
>>>>>>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>>>>>>> and don't add reminders in comments
>>>>>>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>>>>>>> and so on
>>>>>>> 
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Yingqi
>>>>>>> 
>>>>>>> 
>>>>>>> From: Lu, Yingqi
>>>>>>> Sent: Friday, January 24, 2014 3:26 PM
>>>>>>> To: dev@httpd.apache.org
>>>>>>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>> SO_REUSEPORT support
>>>>>>> 
>>>>>>> Dear All,
>>>>>>> 
>>>>>>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>>>>>>> 
>>>>>>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>>>>>>> 
>>>>>>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>>>>>>> 
>>>>>>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>>>>>>> 
>>>>>>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>>>>>>> 
>>>>>>> Configuration:
>>>>>>> <IfModule prefork.c>
>>>>>>> ListenBacklog 105384
>>>>>>> ServerLimit 105000
>>>>>>> MaxClients 1024
>>>>>>> MaxRequestsPerChild 0
>>>>>>> StartServers 64
>>>>>>> MinSpareServers 8
>>>>>>> MaxSpareServers 16
>>>>>>> </IfModule>
>>>>>>> 
>>>>>>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Yingqi
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Born in Roswell... married an alien...
>>>>>>> http://emptyhammock.com/
>>>>>>> http://edjective.org/
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Born in Roswell... married an alien...
>>>>>>> http://emptyhammock.com/
>>>>>>> http://edjective.org/
>>>>>>> 
>>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>>> 
>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>> 
> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>


Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by Marion et Christophe JAILLET <ch...@wanadoo.fr>.
Hi,

 

see http://httpd.apache.org/lists.html#http-dev

You should just send a mail to  dev-unsubscribe@httpd.apache.org.

 

best regards,

 

CJ





> Message du 02/06/14 16:16
> De : "Mihai Iacob" 
> A : dev@httpd.apache.org
> Copie à : 
> Objet : Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
> 
>
> How do I unsubscribe from this list ?
> 
> Regards,
> Mihai Iacob
> DB2 Security Development
> DB2 pureScale Development
> Phone: (905) 413-5378
> Email: miacob@ca.ibm.com
>


Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by Mihai Iacob <mi...@ca.ibm.com>.
How do I unsubscribe from this list ?

Regards,
Mihai Iacob
DB2 Security Development
DB2 pureScale Development
Phone: (905) 413-5378
Email: miacob@ca.ibm.com



From:	Jim Jagielski <ji...@jaguNET.com>
To:	dev@httpd.apache.org
Date:	06/02/2014 07:36 AM
Subject:	Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with
            SO_REUSEPORT support



Thanks!
On Jun 2, 2014, at 4:22 AM, Lu, Yingqi <yi...@intel.com> wrote:

> Hi Jim,
>
> Personally, I think the second approach is better, it keeps
ap_mpm_pod_signal () and ap_mpm_pod_killpg () exactly as the original ones,
only modifies dummy_connection (). Please let me know if you have different
opinions.
>
> Attached is the latest version of the two patches. They were both
generated against trunk rev. 1598561. Please review them and let me know if
there is anything missing.
>
> I already updated the Bugzilla database for the item 55897 and item
56279.
>
> Thanks,
> Yingqi
>
> -----Original Message-----
> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
> Sent: Saturday, May 31, 2014 11:48 PM
> To: dev@httpd.apache.org
> Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with
SO_REUSEPORT support
>
> Hi Jim,
>
> Regarding to your comment #2, yes, you are right, it should be
ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i). Thanks very much
for catching this.
>
> Regarding to your comment #1, the patch modifies the dummy_connection
(ap_pod_t *pod) to be dummy_connection(ap_pod_t *pod, int child_bucket).
Inside the function, the reference listen statement is mpm_listen
[child_bucket]. And, ap_mpm_pod_signal() calls dummy_connection().
>
> Can we just modify the return of ap_mpm_pod_signal() from
dummy_connection(pod) to dummy_connection(pod, 0) and add
ap_mpm_pod_signal_ex()?
>
> Or, if we need to keep ap_mpm_pod_signal() exactly as the original, I can
modify dummy_connection() to send dummy data via all the duplicated listen
statements. Then, we do not need child_bucket as the input parameter for
dummy_connection(). In this case, we do not need adding
ap_mpm_pod_signal_ex() too.
>
> I already tested the code for the above approaches and they both work.
>
> Please let me know which way you think is better. I can quickly send you
an update for review.
>
> Thanks,
> Yingqi
>
> -----Original Message-----
> From: Lu, Yingqi
> Sent: Saturday, May 31, 2014 3:28 PM
> To: <de...@httpd.apache.org>
> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with
SO_REUSEPORT support
>
> Hi Jim,
>
> Thanks very much for your email! I will look into both of them and send
an update tonight!
>
> Thanks,
> Yingqi
>
>> On May 31, 2014, at 9:43 AM, "Jim Jagielski" <ji...@jaguNET.com> wrote:
>>
>> I also see:
>>
>>        /* kill off the idle ones */
>> -        ap_mpm_pod_killpg(pod, retained->max_daemons_limit);
>> +        for (i = 0; i < num_buckets; i++) {
>> +            ap_mpm_pod_killpg(pod[i], i, retained->max_daemons_limit);
>> +        }
>>
>>
>> Is that right? Why isn't it: ap_mpm_pod_killpg(pod, retained->
max_daemons_limit, i); ??
>>
>> /**
>> * Write data to the pipe-of-death, signalling that all child process
>> * should die.
>> * @param pod The pipe-of-death to write to.
>> * @param num The number of child processes to kill
>> + * @param my_bucket the bucket that holds the dying child process.
>> */
>> -AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num);
>> +AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num, int
>> +child_bucket);
>>
>> Isn't 'num' the same in both implementation??
>>
>>> On May 31, 2014, at 12:03 PM, Jim Jagielski <ji...@jagunet.com> wrote:
>>>
>>> Sorry I didn't catch this earlier:
>>>
>>> I see
>>>
>>> +++ httpd-trunk.new/include/mpm_common.h    2014-05-16
13:07:03.892987491 -0400
>>> @@ -267,16 +267,18 @@
>>> * Write data to the pipe-of-death, signalling that one child process
>>> * should die.
>>> * @param pod the pipe-of-death to write to.
>>> + * @param my_bucket the bucket that holds the dying child process.
>>> */
>>> -AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod);
>>> +AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod, int
>>> +child_bucket);
>>>
>>> We can change the API at this point. We could add another function,
>>> eg ap_mpm_pod_signal_ex() which takes the int param, but we can't
>>> modify ap_mpm_pod_signal() itself.
>>>
>>>> On May 30, 2014, at 11:15 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>
>>>> Thank you very much!
>>>>
>>>> Thanks,
>>>> Yingqi
>>>>
>>>> -----Original Message-----
>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>> Sent: Friday, May 30, 2014 7:07 AM
>>>> To: dev@httpd.apache.org
>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with
>>>> SO_REUSEPORT support
>>>>
>>>> Thx! Let me review. My plan is to fold into trunk this weekend.
>>>>
>>>>> On May 16, 2014, at 2:53 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>
>>>>> Hi Jim,
>>>>>
>>>>> Thanks very much for clarifying this with me. I added #ifdef in the
code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch
does not use this parameter so that it remains the same.
>>>>>
>>>>> Attached are the two most recent patches. I already updated the
bugzilla #55897 as well.
>>>>>
>>>>> Thanks,
>>>>> Yingqi
>>>>>
>>>>> -----Original Message-----
>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>> Sent: Thursday, May 15, 2014 7:53 AM
>>>>> To: dev@httpd.apache.org
>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with
>>>>> SO_REUSEPORT support
>>>>>
>>>>> I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
>>>>> We could either check for that during config/build or protect it with
a #ifdef in the code (and create some logging so the admin nows if it was
found or not).
>>>>>
>>>>>> On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com>
wrote:
>>>>>>
>>>>>> Hi Jim,
>>>>>>
>>>>>> Thanks very much for your email.
>>>>>>
>>>>>> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside
listen.c file. If the feature is not supported on the OS (for example,
Linux kernel < 3.9), it will fall back to the original behavior.
>>>>>>
>>>>>> In the bucket patch, there is no need to check the params. With
single listen statement, it is just the default behavior.
>>>>>>
>>>>>> Please let me know if this answers your question.
>>>>>>
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>> Sent: Wednesday, May 14, 2014 6:57 AM
>>>>>> To: dev@httpd.apache.org
>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with
>>>>>> SO_REUSEPORT support
>>>>>>
>>>>>> This is very cool!
>>>>>>
>>>>>> mod_status assumes that sysconf() exists, but do we need to do a
config check on the params we use in these patches?
>>>>>> We look OK on Linux, FreeBSD and OSX...
>>>>>>
>>>>>> I'm +1 on folding into trunk.
>>>>>>
>>>>>>> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com>
wrote:
>>>>>>>
>>>>>>> Dear All,
>>>>>>>
>>>>>>> During the last couple weeks, I spent some time extending the
original two patches from prefork MPM only to all three Linux MPMs
(prefork, worker and event). Attached is the latest version of the two
patches. Bugzilla database has also been updated already. The ID for the
two patches are #55897 and #56279. Please refer to messages below for
details on both of the patches.
>>>>>>>
>>>>>>> Quick test result on modern dual socket Intel platform (Linux
>>>>>>> Kernel
>>>>>>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>>>>>>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput
improvement; 2 listen statements: 2.33X throughput improvement
>>>>>>> 2.       Worker MPM: 1 listen statement: 10% throughput
improvement; 2 listen statements: 35% throughput improvement
>>>>>>> 3.       Event MPM: 1 listen statement: 13% throughput improvement;
2 listen statements: throughput parity, but 62% response time reduction
(with patch, 38% response time as original SW)
>>>>>>>
>>>>>>> Bucket patch (bugzilla #56279, only impact multiple listen
statement case)
>>>>>>> 1.       Prefork MPM: 2 listen statements: 42% throughput
improvement
>>>>>>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>>>>>>>
>>>>>>> In all the above testing cases, significant response time
reductions are observed, even with throughput improvements.
>>>>>>>
>>>>>>> Please let me know your feedback and comments.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Yingqi
>>>>>>> Software and workloads used in performance tests may have been
optimized for performance only on Intel microprocessors. Performance tests,
such as SYSmark and MobileMark, are measured using specific computer
systems, components, software, operations and functions. Any change to any
of those factors may cause the results to vary. You should consult other
information and performance tests to assist you in fully evaluating your
contemplated purchases, including the performance of that product when
combined with other products.
>>>>>>>
>>>>>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>>>>>> Sent: Monday, March 17, 2014 1:41 PM
>>>>>>> To: dev@httpd.apache.org
>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch
>>>>>>> with SO_REUSEPORT support
>>>>>>>
>>>>>>> Dear all,
>>>>>>>
>>>>>>> Based on the feedback we received, we modified this patch. Here is
the most recent version. We also modified the Bugzilla database(Bugzilla#
55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>>>>>>>
>>>>>>> Below are the changes we made into this new version:
>>>>>>>
>>>>>>> According to Yann Ylavic and other people's comments, we separate
the original patch between with and without SO_REUSEPORT into two separated
patches. The SO_REUSEPORT patch does not change the original listen
sockets, it just duplicate the original one into multiple ones. Since the
listen sockets are identical, there is no need to change the
idle_server_maintenance function. The bucket patch (without SO_REUSEPORT),
on the other hand, it breaks down the original listen record (if there are
multiple listen socks) to multiple listen record linked lists. In this
case, idle_server_maintenance is implemented at bucket level to address the
situation that imbalanced traffic occurs among different listen
sockets/children buckets. In the bucket patch, the polling in the child
process is removed since each child only listens to 1 sock.
>>>>>>>
>>>>>>> According to Arkadiusz Miskiewicz's comment, we make the "detection
of SO_REUSEPORT" at run time.
>>>>>>>
>>>>>>> According to Jeff Trawick's comments, 1. We generate the patches
>>>>>>> against the httpd trunk.
>>>>>>> 2. We tested the current patches and they do not impact event and
worker mpms. If current patches can be accepted, we would be happy to
extend them to other Linux based mpms. There are not much code changes, but
require some time to setup the workload to test.
>>>>>>> 3. We removed unnecessary comments and changed APLOGNO(). We also
changed some of the parameter/variable/function names to better represent
their meanings.
>>>>>>> 4. There should be no build-in limitations for SO_REUSEPORT patch.
For bucket patch, the only thing is the number of children bucket only
scales to MAX_SPAWN_RATE. If there are more than 32 (current default
MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of
buckets will be fixed to 32. The reason for this is because that we
implement the idle_server_maintenance at bucket level, each bucket's own
max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>>>>>>>
>>>>>>> Again, thanks very much for all the comments and feedback. Please
let us know if there are more changes we need to complete to make them
accepted.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Yingqi Lu
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> From: Lu, Yingqi
>>>>>>> Sent: Tuesday, March 04, 2014 10:43 AM
>>>>>>> To: dev@httpd.apache.org
>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch
>>>>>>> with SO_REUSEPORT support
>>>>>>>
>>>>>>> Hi Jeff,
>>>>>>>
>>>>>>> Thanks very much for your time reviewing the patch! We will modify
the patch according to your comments and repost it here.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Yingqi
>>>>>>>
>>>>>>> From: Jeff Trawick [mailto:trawick@gmail.com]
>>>>>>> Sent: Tuesday, March 04, 2014 10:08 AM
>>>>>>> To: Apache HTTP Server Development List
>>>>>>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch
>>>>>>> with SO_REUSEPORT support
>>>>>>>
>>>>>>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com>
wrote:
>>>>>>> Hi All,
>>>>>>>
>>>>>>> I just want to ping again on this patch to gather your feedback and
comments. Please refer to the messages below for patch details.
>>>>>>>
>>>>>>> If you need any additional information/supporting data, please let
us know as well.
>>>>>>>
>>>>>>> Yeah, it has been on my todo list, but I don't have time to give
>>>>>>> an in depth review at the moment.  Here are a few
questions/comments.
>>>>>>> (And you'll have to deal with the fact that it is unnecessarily
>>>>>>> tedious for me to evaluate higher-level considerations if there
>>>>>>> are a lot of distractions, such as the code comments below ;) But
>>>>>>> others are of course free to chime in.)
>>>>>>>
>>>>>>> The patch should be against httpd trunk.  It probably won't take
much time for you to create that patch and confirm basic operation.
>>>>>>>
>>>>>>> What is the impact to other MPMs, even if they shouldn't use or
don't have the necessary code to use SO_REUSEPORT at this time?
>>>>>>>
>>>>>>> Have you tried the event MPM?
>>>>>>>
>>>>>>> Is there a way for the admin to choose this behavior?  Most won't
care, but everyone's behavior is changed AFAICT.
>>>>>>>
>>>>>>> Are there built-in limitations in this patch that we should be
aware of?  E.g., the free slot/spawn rate changes suggest to me that there
can't be more than 1025 children???
>>>>>>>
>>>>>>> We should assume for now that there's no reason this couldn't be
committed to trunk after review/rework, so make sure it is as close as you
can get it to what you think is the final form.
>>>>>>>
>>>>>>> For the configure-time check for 3.9 kernel: I think we'd also
>>>>>>> use AC_TRY_COMPILE at configure time to confirm that the
>>>>>>> SO_REUSEPORT definition is available, and not enable it if the
>>>>>>> system includes doesn't define it.  (Does that cause a problem
>>>>>>> for any significant number of people?)
>>>>>>>
>>>>>>> Don't mention the patch in the patch ;) (e.g., "This function is
>>>>>>> added for the patch.")
>>>>>>>
>>>>>>> Incomplete comments on style/syntax issues:
>>>>>>>
>>>>>>> * mixing declarations and statements (e.g., "duplr->next = 0;
>>>>>>> apr_socket_t *temps;") isn't supported by all compilers and is
>>>>>>> distracting when reviewing
>>>>>>> * suitable identifier names (e.g., fix global variable "flag" and
>>>>>>> whatever else isn't appropriate; "ap_post_config_listeners"
>>>>>>> should be renamed to indicate what it does
>>>>>>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO
()"
>>>>>>> and don't add reminders in comments
>>>>>>> * this doesn't seem portable: "int free_slots
[MAX_SPAWN_RATE/num_buckets];"
>>>>>>> and so on
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Yingqi
>>>>>>>
>>>>>>>
>>>>>>> From: Lu, Yingqi
>>>>>>> Sent: Friday, January 24, 2014 3:26 PM
>>>>>>> To: dev@httpd.apache.org
>>>>>>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with
>>>>>>> SO_REUSEPORT support
>>>>>>>
>>>>>>> Dear All,
>>>>>>>
>>>>>>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread
Intel Xeon 2600 series systems, using an open source three tier social
networking web server workload, revealed performance scaling issues.  In
current software single listen statement (listen 80) provides better
scalability due to un-serialized accept. However, when system is under very
high load, this can lead to big number of child processes stuck in D state.
On the other hand, the serialized accept approach cannot scale with the
high load either.  In our analysis, a 32-thread system, with 2 listen
statements specified, could scale to just 70% utilization, and a 64-thread
system, with signal listen statement specified (listen 80, 4 network
interfaces), could scale to only 60% utilization.
>>>>>>>
>>>>>>> Based on those findings, we created a prototype patch for prefork
mpm which extends performance and thread utilization. In Linux kernel newer
than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets
listen to the same IP:port and automatically round robins connections. We
use this feature to create multiple duplicated listener records of the
original one and partition the child processes into buckets. Each bucket
listens to 1 IP:port. In case of old kernel which does not have the
SO_REUSEPORT enabled, we modified the "multiple listen statement case" by
creating 1 listen record for each listen statement and partitioning the
child processes into different buckets. Each bucket listens to 1 IP:port.
>>>>>>>
>>>>>>> Quick tests of the patch, running the same workload, demonstrated a
22% throughput increase with 32-threads system and 2 listen statements
(Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without
SO_REUSEPORT), 10% performance gain was measured. With single listen
statement (listen 80) configuration, we observed over 2X performance
improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4).
We also observed big reduction in response time, in addition to the
throughput improvement gained in our tests 1.
>>>>>>>
>>>>>>> Following the feedback from the bugzilla website where we
originally submitted the patch, we removed the dependency of APR change to
simplify the patch testing process. Thanks Jeff Trawick for his good
suggestion! We are also actively working on extending the patch to worker
and event MPMs, as a next step. Meanwhile, we would like to gather comments
from all of you on the current prefork patch. Please take some time test it
and let us know how it works in your environment.
>>>>>>>
>>>>>>> This is our first patch to the Apache community. Please help us
review it and let us know if there is anything we might revise to improve
it. Your feedback is very much appreciated.
>>>>>>>
>>>>>>> Configuration:
>>>>>>> <IfModule prefork.c>
>>>>>>> ListenBacklog 105384
>>>>>>> ServerLimit 105000
>>>>>>> MaxClients 1024
>>>>>>> MaxRequestsPerChild 0
>>>>>>> StartServers 64
>>>>>>> MinSpareServers 8
>>>>>>> MaxSpareServers 16
>>>>>>> </IfModule>
>>>>>>>
>>>>>>> 1. Software and workloads used in performance tests may have been
optimized for performance only on Intel microprocessors. Performance tests,
such as SYSmark and MobileMark, are measured using specific computer
systems, components, software, operations and functions. Any change to any
of those factors may cause the results to vary. You should consult other
information and performance tests to assist you in fully evaluating your
contemplated purchases, including the performance of that product when
combined with other products.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Yingqi
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Born in Roswell... married an alien...
>>>>>>> http://emptyhammock.com/
>>>>>>> http://edjective.org/
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Born in Roswell... married an alien...
>>>>>>> http://emptyhammock.com/
>>>>>>> http://edjective.org/
>>>>>>>
>>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>>>
>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>
> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>


Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by Jim Jagielski <ji...@jaguNET.com>.
Thanks!
On Jun 2, 2014, at 4:22 AM, Lu, Yingqi <yi...@intel.com> wrote:

> Hi Jim,
> 
> Personally, I think the second approach is better, it keeps ap_mpm_pod_signal () and ap_mpm_pod_killpg () exactly as the original ones, only modifies dummy_connection (). Please let me know if you have different opinions.
> 
> Attached is the latest version of the two patches. They were both generated against trunk rev. 1598561. Please review them and let me know if there is anything missing.
> 
> I already updated the Bugzilla database for the item 55897 and item 56279.
> 
> Thanks,
> Yingqi
> 
> -----Original Message-----
> From: Lu, Yingqi [mailto:yingqi.lu@intel.com] 
> Sent: Saturday, May 31, 2014 11:48 PM
> To: dev@httpd.apache.org
> Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
> 
> Hi Jim,
> 
> Regarding to your comment #2, yes, you are right, it should be ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i). Thanks very much for catching this.
> 
> Regarding to your comment #1, the patch modifies the dummy_connection(ap_pod_t *pod) to be dummy_connection(ap_pod_t *pod, int child_bucket). Inside the function, the reference listen statement is mpm_listen[child_bucket]. And, ap_mpm_pod_signal() calls dummy_connection(). 
> 
> Can we just modify the return of ap_mpm_pod_signal() from dummy_connection(pod) to dummy_connection(pod, 0) and add ap_mpm_pod_signal_ex()? 
> 
> Or, if we need to keep ap_mpm_pod_signal() exactly as the original, I can modify dummy_connection() to send dummy data via all the duplicated listen statements. Then, we do not need child_bucket as the input parameter for dummy_connection(). In this case, we do not need adding ap_mpm_pod_signal_ex() too.
> 
> I already tested the code for the above approaches and they both work. 
> 
> Please let me know which way you think is better. I can quickly send you an update for review.
> 
> Thanks,
> Yingqi
> 
> -----Original Message-----
> From: Lu, Yingqi
> Sent: Saturday, May 31, 2014 3:28 PM
> To: <de...@httpd.apache.org>
> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
> 
> Hi Jim,
> 
> Thanks very much for your email! I will look into both of them and send an update tonight!
> 
> Thanks,
> Yingqi
> 
>> On May 31, 2014, at 9:43 AM, "Jim Jagielski" <ji...@jaguNET.com> wrote:
>> 
>> I also see:
>> 
>>        /* kill off the idle ones */
>> -        ap_mpm_pod_killpg(pod, retained->max_daemons_limit);
>> +        for (i = 0; i < num_buckets; i++) {
>> +            ap_mpm_pod_killpg(pod[i], i, retained->max_daemons_limit);
>> +        }
>> 
>> 
>> Is that right? Why isn't it: ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i); ??
>> 
>> /**
>> * Write data to the pipe-of-death, signalling that all child process
>> * should die.
>> * @param pod The pipe-of-death to write to.
>> * @param num The number of child processes to kill
>> + * @param my_bucket the bucket that holds the dying child process.
>> */
>> -AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num);
>> +AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num, int 
>> +child_bucket);
>> 
>> Isn't 'num' the same in both implementation??
>> 
>>> On May 31, 2014, at 12:03 PM, Jim Jagielski <ji...@jagunet.com> wrote:
>>> 
>>> Sorry I didn't catch this earlier:
>>> 
>>> I see
>>> 
>>> +++ httpd-trunk.new/include/mpm_common.h    2014-05-16 13:07:03.892987491 -0400
>>> @@ -267,16 +267,18 @@
>>> * Write data to the pipe-of-death, signalling that one child process
>>> * should die.
>>> * @param pod the pipe-of-death to write to.
>>> + * @param my_bucket the bucket that holds the dying child process.
>>> */
>>> -AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod);
>>> +AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod, int 
>>> +child_bucket);
>>> 
>>> We can change the API at this point. We could add another function, 
>>> eg ap_mpm_pod_signal_ex() which takes the int param, but we can't 
>>> modify ap_mpm_pod_signal() itself.
>>> 
>>>> On May 30, 2014, at 11:15 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>> 
>>>> Thank you very much!
>>>> 
>>>> Thanks,
>>>> Yingqi
>>>> 
>>>> -----Original Message-----
>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>> Sent: Friday, May 30, 2014 7:07 AM
>>>> To: dev@httpd.apache.org
>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>> SO_REUSEPORT support
>>>> 
>>>> Thx! Let me review. My plan is to fold into trunk this weekend.
>>>> 
>>>>> On May 16, 2014, at 2:53 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>> 
>>>>> Hi Jim,
>>>>> 
>>>>> Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.
>>>>> 
>>>>> Attached are the two most recent patches. I already updated the bugzilla #55897 as well.
>>>>> 
>>>>> Thanks,
>>>>> Yingqi
>>>>> 
>>>>> -----Original Message-----
>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>> Sent: Thursday, May 15, 2014 7:53 AM
>>>>> To: dev@httpd.apache.org
>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>> SO_REUSEPORT support
>>>>> 
>>>>> I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
>>>>> We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).
>>>>> 
>>>>>> On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>> 
>>>>>> Hi Jim,
>>>>>> 
>>>>>> Thanks very much for your email.
>>>>>> 
>>>>>> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
>>>>>> 
>>>>>> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
>>>>>> 
>>>>>> Please let me know if this answers your question.
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>>> Sent: Wednesday, May 14, 2014 6:57 AM
>>>>>> To: dev@httpd.apache.org
>>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>> SO_REUSEPORT support
>>>>>> 
>>>>>> This is very cool!
>>>>>> 
>>>>>> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
>>>>>> We look OK on Linux, FreeBSD and OSX...
>>>>>> 
>>>>>> I'm +1 on folding into trunk.
>>>>>> 
>>>>>>> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>> 
>>>>>>> Dear All,
>>>>>>> 
>>>>>>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>>>>>>> 
>>>>>>> Quick test result on modern dual socket Intel platform (Linux 
>>>>>>> Kernel
>>>>>>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>>>>>>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>>>>>>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>>>>>>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>>>>>>> 
>>>>>>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>>>>>>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>>>>>>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>>>>>>> 
>>>>>>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>>>>>>> 
>>>>>>> Please let me know your feedback and comments.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Yingqi
>>>>>>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>>> 
>>>>>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>>>>>> Sent: Monday, March 17, 2014 1:41 PM
>>>>>>> To: dev@httpd.apache.org
>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>> with SO_REUSEPORT support
>>>>>>> 
>>>>>>> Dear all,
>>>>>>> 
>>>>>>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>>>>>>> 
>>>>>>> Below are the changes we made into this new version:
>>>>>>> 
>>>>>>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>>>>>>> 
>>>>>>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>>>>>>> 
>>>>>>> According to Jeff Trawick's comments, 1. We generate the patches 
>>>>>>> against the httpd trunk.
>>>>>>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>>>>>>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>>>>>>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>>>>>>> 
>>>>>>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Yingqi Lu
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> From: Lu, Yingqi
>>>>>>> Sent: Tuesday, March 04, 2014 10:43 AM
>>>>>>> To: dev@httpd.apache.org
>>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>> with SO_REUSEPORT support
>>>>>>> 
>>>>>>> Hi Jeff,
>>>>>>> 
>>>>>>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Yingqi
>>>>>>> 
>>>>>>> From: Jeff Trawick [mailto:trawick@gmail.com]
>>>>>>> Sent: Tuesday, March 04, 2014 10:08 AM
>>>>>>> To: Apache HTTP Server Development List
>>>>>>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>>> with SO_REUSEPORT support
>>>>>>> 
>>>>>>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>>> Hi All,
>>>>>>> 
>>>>>>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>>>>>>> 
>>>>>>> If you need any additional information/supporting data, please let us know as well.
>>>>>>> 
>>>>>>> Yeah, it has been on my todo list, but I don't have time to give 
>>>>>>> an in depth review at the moment.  Here are a few questions/comments.
>>>>>>> (And you'll have to deal with the fact that it is unnecessarily 
>>>>>>> tedious for me to evaluate higher-level considerations if there 
>>>>>>> are a lot of distractions, such as the code comments below ;) But 
>>>>>>> others are of course free to chime in.)
>>>>>>> 
>>>>>>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>>>>>>> 
>>>>>>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>>>>>>> 
>>>>>>> Have you tried the event MPM?
>>>>>>> 
>>>>>>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>>>>>>> 
>>>>>>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>>>>>>> 
>>>>>>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>>>>>>> 
>>>>>>> For the configure-time check for 3.9 kernel: I think we'd also 
>>>>>>> use AC_TRY_COMPILE at configure time to confirm that the 
>>>>>>> SO_REUSEPORT definition is available, and not enable it if the 
>>>>>>> system includes doesn't define it.  (Does that cause a problem 
>>>>>>> for any significant number of people?)
>>>>>>> 
>>>>>>> Don't mention the patch in the patch ;) (e.g., "This function is 
>>>>>>> added for the patch.")
>>>>>>> 
>>>>>>> Incomplete comments on style/syntax issues:
>>>>>>> 
>>>>>>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>>>>>>> apr_socket_t *temps;") isn't supported by all compilers and is 
>>>>>>> distracting when reviewing
>>>>>>> * suitable identifier names (e.g., fix global variable "flag" and 
>>>>>>> whatever else isn't appropriate; "ap_post_config_listeners"
>>>>>>> should be renamed to indicate what it does
>>>>>>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>>>>>>> and don't add reminders in comments
>>>>>>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>>>>>>> and so on
>>>>>>> 
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Yingqi
>>>>>>> 
>>>>>>> 
>>>>>>> From: Lu, Yingqi
>>>>>>> Sent: Friday, January 24, 2014 3:26 PM
>>>>>>> To: dev@httpd.apache.org
>>>>>>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>>> SO_REUSEPORT support
>>>>>>> 
>>>>>>> Dear All,
>>>>>>> 
>>>>>>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>>>>>>> 
>>>>>>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>>>>>>> 
>>>>>>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>>>>>>> 
>>>>>>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>>>>>>> 
>>>>>>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>>>>>>> 
>>>>>>> Configuration:
>>>>>>> <IfModule prefork.c>
>>>>>>> ListenBacklog 105384
>>>>>>> ServerLimit 105000
>>>>>>> MaxClients 1024
>>>>>>> MaxRequestsPerChild 0
>>>>>>> StartServers 64
>>>>>>> MinSpareServers 8
>>>>>>> MaxSpareServers 16
>>>>>>> </IfModule>
>>>>>>> 
>>>>>>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Yingqi
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Born in Roswell... married an alien...
>>>>>>> http://emptyhammock.com/
>>>>>>> http://edjective.org/
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Born in Roswell... married an alien...
>>>>>>> http://emptyhammock.com/
>>>>>>> http://edjective.org/
>>>>>>> 
>>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>>> 
>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>> 
> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>


RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by "Lu, Yingqi" <yi...@intel.com>.
Hi Jim,

Personally, I think the second approach is better, it keeps ap_mpm_pod_signal () and ap_mpm_pod_killpg () exactly as the original ones, only modifies dummy_connection (). Please let me know if you have different opinions.

Attached is the latest version of the two patches. They were both generated against trunk rev. 1598561. Please review them and let me know if there is anything missing.

I already updated the Bugzilla database for the item 55897 and item 56279.

Thanks,
Yingqi

-----Original Message-----
From: Lu, Yingqi [mailto:yingqi.lu@intel.com] 
Sent: Saturday, May 31, 2014 11:48 PM
To: dev@httpd.apache.org
Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Hi Jim,

Regarding to your comment #2, yes, you are right, it should be ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i). Thanks very much for catching this.

Regarding to your comment #1, the patch modifies the dummy_connection(ap_pod_t *pod) to be dummy_connection(ap_pod_t *pod, int child_bucket). Inside the function, the reference listen statement is mpm_listen[child_bucket]. And, ap_mpm_pod_signal() calls dummy_connection(). 

Can we just modify the return of ap_mpm_pod_signal() from dummy_connection(pod) to dummy_connection(pod, 0) and add ap_mpm_pod_signal_ex()? 

Or, if we need to keep ap_mpm_pod_signal() exactly as the original, I can modify dummy_connection() to send dummy data via all the duplicated listen statements. Then, we do not need child_bucket as the input parameter for dummy_connection(). In this case, we do not need adding ap_mpm_pod_signal_ex() too.

I already tested the code for the above approaches and they both work. 

Please let me know which way you think is better. I can quickly send you an update for review.

Thanks,
Yingqi

-----Original Message-----
From: Lu, Yingqi
Sent: Saturday, May 31, 2014 3:28 PM
To: <de...@httpd.apache.org>
Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Hi Jim,

Thanks very much for your email! I will look into both of them and send an update tonight!

Thanks,
Yingqi

> On May 31, 2014, at 9:43 AM, "Jim Jagielski" <ji...@jaguNET.com> wrote:
> 
> I also see:
> 
>         /* kill off the idle ones */
> -        ap_mpm_pod_killpg(pod, retained->max_daemons_limit);
> +        for (i = 0; i < num_buckets; i++) {
> +            ap_mpm_pod_killpg(pod[i], i, retained->max_daemons_limit);
> +        }
> 
> 
> Is that right? Why isn't it: ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i); ??
> 
> /**
>  * Write data to the pipe-of-death, signalling that all child process
>  * should die.
>  * @param pod The pipe-of-death to write to.
>  * @param num The number of child processes to kill
> + * @param my_bucket the bucket that holds the dying child process.
>  */
> -AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num);
> +AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num, int 
> +child_bucket);
> 
> Isn't 'num' the same in both implementation??
> 
>> On May 31, 2014, at 12:03 PM, Jim Jagielski <ji...@jagunet.com> wrote:
>> 
>> Sorry I didn't catch this earlier:
>> 
>> I see
>> 
>> +++ httpd-trunk.new/include/mpm_common.h    2014-05-16 13:07:03.892987491 -0400
>> @@ -267,16 +267,18 @@
>> * Write data to the pipe-of-death, signalling that one child process
>> * should die.
>> * @param pod the pipe-of-death to write to.
>> + * @param my_bucket the bucket that holds the dying child process.
>> */
>> -AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod);
>> +AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod, int 
>> +child_bucket);
>> 
>> We can change the API at this point. We could add another function, 
>> eg ap_mpm_pod_signal_ex() which takes the int param, but we can't 
>> modify ap_mpm_pod_signal() itself.
>> 
>>> On May 30, 2014, at 11:15 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>> 
>>> Thank you very much!
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>> -----Original Message-----
>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>> Sent: Friday, May 30, 2014 7:07 AM
>>> To: dev@httpd.apache.org
>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>> SO_REUSEPORT support
>>> 
>>> Thx! Let me review. My plan is to fold into trunk this weekend.
>>> 
>>>> On May 16, 2014, at 2:53 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>> 
>>>> Hi Jim,
>>>> 
>>>> Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.
>>>> 
>>>> Attached are the two most recent patches. I already updated the bugzilla #55897 as well.
>>>> 
>>>> Thanks,
>>>> Yingqi
>>>> 
>>>> -----Original Message-----
>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>> Sent: Thursday, May 15, 2014 7:53 AM
>>>> To: dev@httpd.apache.org
>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>> SO_REUSEPORT support
>>>> 
>>>> I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
>>>> We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).
>>>> 
>>>>> On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>> 
>>>>> Hi Jim,
>>>>> 
>>>>> Thanks very much for your email.
>>>>> 
>>>>> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
>>>>> 
>>>>> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
>>>>> 
>>>>> Please let me know if this answers your question.
>>>>> 
>>>>> Thanks,
>>>>> Yingqi
>>>>> 
>>>>> -----Original Message-----
>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>> Sent: Wednesday, May 14, 2014 6:57 AM
>>>>> To: dev@httpd.apache.org
>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>> SO_REUSEPORT support
>>>>> 
>>>>> This is very cool!
>>>>> 
>>>>> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
>>>>> We look OK on Linux, FreeBSD and OSX...
>>>>> 
>>>>> I'm +1 on folding into trunk.
>>>>> 
>>>>>> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>> 
>>>>>> Dear All,
>>>>>> 
>>>>>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>>>>>> 
>>>>>> Quick test result on modern dual socket Intel platform (Linux 
>>>>>> Kernel
>>>>>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>>>>>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>>>>>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>>>>>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>>>>>> 
>>>>>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>>>>>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>>>>>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>>>>>> 
>>>>>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>>>>>> 
>>>>>> Please let me know your feedback and comments.
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>> 
>>>>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>>>>> Sent: Monday, March 17, 2014 1:41 PM
>>>>>> To: dev@httpd.apache.org
>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>> with SO_REUSEPORT support
>>>>>> 
>>>>>> Dear all,
>>>>>> 
>>>>>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>>>>>> 
>>>>>> Below are the changes we made into this new version:
>>>>>> 
>>>>>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>>>>>> 
>>>>>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>>>>>> 
>>>>>> According to Jeff Trawick's comments, 1. We generate the patches 
>>>>>> against the httpd trunk.
>>>>>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>>>>>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>>>>>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>>>>>> 
>>>>>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi Lu
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> From: Lu, Yingqi
>>>>>> Sent: Tuesday, March 04, 2014 10:43 AM
>>>>>> To: dev@httpd.apache.org
>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>> with SO_REUSEPORT support
>>>>>> 
>>>>>> Hi Jeff,
>>>>>> 
>>>>>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>> 
>>>>>> From: Jeff Trawick [mailto:trawick@gmail.com]
>>>>>> Sent: Tuesday, March 04, 2014 10:08 AM
>>>>>> To: Apache HTTP Server Development List
>>>>>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>> with SO_REUSEPORT support
>>>>>> 
>>>>>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>> Hi All,
>>>>>> 
>>>>>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>>>>>> 
>>>>>> If you need any additional information/supporting data, please let us know as well.
>>>>>> 
>>>>>> Yeah, it has been on my todo list, but I don't have time to give 
>>>>>> an in depth review at the moment.  Here are a few questions/comments.
>>>>>> (And you'll have to deal with the fact that it is unnecessarily 
>>>>>> tedious for me to evaluate higher-level considerations if there 
>>>>>> are a lot of distractions, such as the code comments below ;) But 
>>>>>> others are of course free to chime in.)
>>>>>> 
>>>>>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>>>>>> 
>>>>>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>>>>>> 
>>>>>> Have you tried the event MPM?
>>>>>> 
>>>>>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>>>>>> 
>>>>>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>>>>>> 
>>>>>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>>>>>> 
>>>>>> For the configure-time check for 3.9 kernel: I think we'd also 
>>>>>> use AC_TRY_COMPILE at configure time to confirm that the 
>>>>>> SO_REUSEPORT definition is available, and not enable it if the 
>>>>>> system includes doesn't define it.  (Does that cause a problem 
>>>>>> for any significant number of people?)
>>>>>> 
>>>>>> Don't mention the patch in the patch ;) (e.g., "This function is 
>>>>>> added for the patch.")
>>>>>> 
>>>>>> Incomplete comments on style/syntax issues:
>>>>>> 
>>>>>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>>>>>> apr_socket_t *temps;") isn't supported by all compilers and is 
>>>>>> distracting when reviewing
>>>>>> * suitable identifier names (e.g., fix global variable "flag" and 
>>>>>> whatever else isn't appropriate; "ap_post_config_listeners"
>>>>>> should be renamed to indicate what it does
>>>>>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>>>>>> and don't add reminders in comments
>>>>>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>>>>>> and so on
>>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>> 
>>>>>> 
>>>>>> From: Lu, Yingqi
>>>>>> Sent: Friday, January 24, 2014 3:26 PM
>>>>>> To: dev@httpd.apache.org
>>>>>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>> SO_REUSEPORT support
>>>>>> 
>>>>>> Dear All,
>>>>>> 
>>>>>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>>>>>> 
>>>>>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>>>>>> 
>>>>>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>>>>>> 
>>>>>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>>>>>> 
>>>>>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>>>>>> 
>>>>>> Configuration:
>>>>>> <IfModule prefork.c>
>>>>>> ListenBacklog 105384
>>>>>> ServerLimit 105000
>>>>>> MaxClients 1024
>>>>>> MaxRequestsPerChild 0
>>>>>> StartServers 64
>>>>>> MinSpareServers 8
>>>>>> MaxSpareServers 16
>>>>>> </IfModule>
>>>>>> 
>>>>>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Born in Roswell... married an alien...
>>>>>> http://emptyhammock.com/
>>>>>> http://edjective.org/
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Born in Roswell... married an alien...
>>>>>> http://emptyhammock.com/
>>>>>> http://edjective.org/
>>>>>> 
>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>> 
>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
> 

RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by "Lu, Yingqi" <yi...@intel.com>.
Hi Jim,

Regarding to your comment #2, yes, you are right, it should be ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i). Thanks very much for catching this.

Regarding to your comment #1, the patch modifies the dummy_connection(ap_pod_t *pod) to be dummy_connection(ap_pod_t *pod, int child_bucket). Inside the function, the reference listen statement is mpm_listen[child_bucket]. And, ap_mpm_pod_signal() calls dummy_connection(). 

Can we just modify the return of ap_mpm_pod_signal() from dummy_connection(pod) to dummy_connection(pod, 0) and add ap_mpm_pod_signal_ex()? 

Or, if we need to keep ap_mpm_pod_signal() exactly as the original, I can modify dummy_connection() to send dummy data via all the duplicated listen statements. Then, we do not need child_bucket as the input parameter for dummy_connection(). In this case, we do not need adding ap_mpm_pod_signal_ex() too.

I already tested the code for the above approaches and they both work. 

Please let me know which way you think is better. I can quickly send you an update for review.

Thanks,
Yingqi

-----Original Message-----
From: Lu, Yingqi 
Sent: Saturday, May 31, 2014 3:28 PM
To: <de...@httpd.apache.org>
Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Hi Jim,

Thanks very much for your email! I will look into both of them and send an update tonight!

Thanks,
Yingqi

> On May 31, 2014, at 9:43 AM, "Jim Jagielski" <ji...@jaguNET.com> wrote:
> 
> I also see:
> 
>         /* kill off the idle ones */
> -        ap_mpm_pod_killpg(pod, retained->max_daemons_limit);
> +        for (i = 0; i < num_buckets; i++) {
> +            ap_mpm_pod_killpg(pod[i], i, retained->max_daemons_limit);
> +        }
> 
> 
> Is that right? Why isn't it: ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i); ??
> 
> /**
>  * Write data to the pipe-of-death, signalling that all child process
>  * should die.
>  * @param pod The pipe-of-death to write to.
>  * @param num The number of child processes to kill
> + * @param my_bucket the bucket that holds the dying child process.
>  */
> -AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num);
> +AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num, int 
> +child_bucket);
> 
> Isn't 'num' the same in both implementation??
> 
>> On May 31, 2014, at 12:03 PM, Jim Jagielski <ji...@jagunet.com> wrote:
>> 
>> Sorry I didn't catch this earlier:
>> 
>> I see
>> 
>> +++ httpd-trunk.new/include/mpm_common.h    2014-05-16 13:07:03.892987491 -0400
>> @@ -267,16 +267,18 @@
>> * Write data to the pipe-of-death, signalling that one child process
>> * should die.
>> * @param pod the pipe-of-death to write to.
>> + * @param my_bucket the bucket that holds the dying child process.
>> */
>> -AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod);
>> +AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod, int 
>> +child_bucket);
>> 
>> We can change the API at this point. We could add another function, 
>> eg ap_mpm_pod_signal_ex() which takes the int param, but we can't 
>> modify ap_mpm_pod_signal() itself.
>> 
>>> On May 30, 2014, at 11:15 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>> 
>>> Thank you very much!
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>> -----Original Message-----
>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>> Sent: Friday, May 30, 2014 7:07 AM
>>> To: dev@httpd.apache.org
>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>> SO_REUSEPORT support
>>> 
>>> Thx! Let me review. My plan is to fold into trunk this weekend.
>>> 
>>>> On May 16, 2014, at 2:53 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>> 
>>>> Hi Jim,
>>>> 
>>>> Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.
>>>> 
>>>> Attached are the two most recent patches. I already updated the bugzilla #55897 as well.
>>>> 
>>>> Thanks,
>>>> Yingqi
>>>> 
>>>> -----Original Message-----
>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>> Sent: Thursday, May 15, 2014 7:53 AM
>>>> To: dev@httpd.apache.org
>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>> SO_REUSEPORT support
>>>> 
>>>> I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
>>>> We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).
>>>> 
>>>>> On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>> 
>>>>> Hi Jim,
>>>>> 
>>>>> Thanks very much for your email.
>>>>> 
>>>>> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
>>>>> 
>>>>> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
>>>>> 
>>>>> Please let me know if this answers your question.
>>>>> 
>>>>> Thanks,
>>>>> Yingqi
>>>>> 
>>>>> -----Original Message-----
>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>> Sent: Wednesday, May 14, 2014 6:57 AM
>>>>> To: dev@httpd.apache.org
>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>> SO_REUSEPORT support
>>>>> 
>>>>> This is very cool!
>>>>> 
>>>>> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
>>>>> We look OK on Linux, FreeBSD and OSX...
>>>>> 
>>>>> I'm +1 on folding into trunk.
>>>>> 
>>>>>> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>> 
>>>>>> Dear All,
>>>>>> 
>>>>>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>>>>>> 
>>>>>> Quick test result on modern dual socket Intel platform (Linux 
>>>>>> Kernel
>>>>>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>>>>>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>>>>>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>>>>>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>>>>>> 
>>>>>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>>>>>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>>>>>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>>>>>> 
>>>>>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>>>>>> 
>>>>>> Please let me know your feedback and comments.
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>> 
>>>>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>>>>> Sent: Monday, March 17, 2014 1:41 PM
>>>>>> To: dev@httpd.apache.org
>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>> with SO_REUSEPORT support
>>>>>> 
>>>>>> Dear all,
>>>>>> 
>>>>>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>>>>>> 
>>>>>> Below are the changes we made into this new version:
>>>>>> 
>>>>>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>>>>>> 
>>>>>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>>>>>> 
>>>>>> According to Jeff Trawick's comments, 1. We generate the patches 
>>>>>> against the httpd trunk.
>>>>>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>>>>>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>>>>>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>>>>>> 
>>>>>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi Lu
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> From: Lu, Yingqi
>>>>>> Sent: Tuesday, March 04, 2014 10:43 AM
>>>>>> To: dev@httpd.apache.org
>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>> with SO_REUSEPORT support
>>>>>> 
>>>>>> Hi Jeff,
>>>>>> 
>>>>>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>> 
>>>>>> From: Jeff Trawick [mailto:trawick@gmail.com]
>>>>>> Sent: Tuesday, March 04, 2014 10:08 AM
>>>>>> To: Apache HTTP Server Development List
>>>>>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch 
>>>>>> with SO_REUSEPORT support
>>>>>> 
>>>>>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>> Hi All,
>>>>>> 
>>>>>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>>>>>> 
>>>>>> If you need any additional information/supporting data, please let us know as well.
>>>>>> 
>>>>>> Yeah, it has been on my todo list, but I don't have time to give 
>>>>>> an in depth review at the moment.  Here are a few questions/comments.
>>>>>> (And you'll have to deal with the fact that it is unnecessarily 
>>>>>> tedious for me to evaluate higher-level considerations if there 
>>>>>> are a lot of distractions, such as the code comments below ;)  
>>>>>> But others are of course free to chime in.)
>>>>>> 
>>>>>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>>>>>> 
>>>>>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>>>>>> 
>>>>>> Have you tried the event MPM?
>>>>>> 
>>>>>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>>>>>> 
>>>>>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>>>>>> 
>>>>>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>>>>>> 
>>>>>> For the configure-time check for 3.9 kernel: I think we'd also 
>>>>>> use AC_TRY_COMPILE at configure time to confirm that the 
>>>>>> SO_REUSEPORT definition is available, and not enable it if the 
>>>>>> system includes doesn't define it.  (Does that cause a problem 
>>>>>> for any significant number of people?)
>>>>>> 
>>>>>> Don't mention the patch in the patch ;) (e.g., "This function is 
>>>>>> added for the patch.")
>>>>>> 
>>>>>> Incomplete comments on style/syntax issues:
>>>>>> 
>>>>>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>>>>>> apr_socket_t *temps;") isn't supported by all compilers and is 
>>>>>> distracting when reviewing
>>>>>> * suitable identifier names (e.g., fix global variable "flag" and 
>>>>>> whatever else isn't appropriate; "ap_post_config_listeners" 
>>>>>> should be renamed to indicate what it does
>>>>>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>>>>>> and don't add reminders in comments
>>>>>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>>>>>> and so on
>>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>> 
>>>>>> 
>>>>>> From: Lu, Yingqi
>>>>>> Sent: Friday, January 24, 2014 3:26 PM
>>>>>> To: dev@httpd.apache.org
>>>>>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>> SO_REUSEPORT support
>>>>>> 
>>>>>> Dear All,
>>>>>> 
>>>>>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>>>>>> 
>>>>>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>>>>>> 
>>>>>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>>>>>> 
>>>>>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>>>>>> 
>>>>>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>>>>>> 
>>>>>> Configuration:
>>>>>> <IfModule prefork.c>
>>>>>> ListenBacklog 105384
>>>>>> ServerLimit 105000
>>>>>> MaxClients 1024
>>>>>> MaxRequestsPerChild 0
>>>>>> StartServers 64
>>>>>> MinSpareServers 8
>>>>>> MaxSpareServers 16
>>>>>> </IfModule>
>>>>>> 
>>>>>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Born in Roswell... married an alien...
>>>>>> http://emptyhammock.com/
>>>>>> http://edjective.org/
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Born in Roswell... married an alien...
>>>>>> http://emptyhammock.com/
>>>>>> http://edjective.org/
>>>>>> 
>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>> 
>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
> 

Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by "Lu, Yingqi" <yi...@intel.com>.
Hi Jim,

Thanks very much for your email! I will look into both of them and send an update tonight!

Thanks,
Yingqi

> On May 31, 2014, at 9:43 AM, "Jim Jagielski" <ji...@jaguNET.com> wrote:
> 
> I also see:
> 
>         /* kill off the idle ones */
> -        ap_mpm_pod_killpg(pod, retained->max_daemons_limit);
> +        for (i = 0; i < num_buckets; i++) {
> +            ap_mpm_pod_killpg(pod[i], i, retained->max_daemons_limit);
> +        }
> 
> 
> Is that right? Why isn't it: ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i); ??
> 
> /**
>  * Write data to the pipe-of-death, signalling that all child process
>  * should die.
>  * @param pod The pipe-of-death to write to.
>  * @param num The number of child processes to kill
> + * @param my_bucket the bucket that holds the dying child process.
>  */
> -AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num);
> +AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num, int child_bucket);
> 
> Isn't 'num' the same in both implementation??
> 
>> On May 31, 2014, at 12:03 PM, Jim Jagielski <ji...@jagunet.com> wrote:
>> 
>> Sorry I didn't catch this earlier:
>> 
>> I see
>> 
>> +++ httpd-trunk.new/include/mpm_common.h    2014-05-16 13:07:03.892987491 -0400
>> @@ -267,16 +267,18 @@
>> * Write data to the pipe-of-death, signalling that one child process
>> * should die.
>> * @param pod the pipe-of-death to write to.
>> + * @param my_bucket the bucket that holds the dying child process.
>> */
>> -AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod);
>> +AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod, int child_bucket);
>> 
>> We can change the API at this point. We could
>> add another function, eg ap_mpm_pod_signal_ex() which
>> takes the int param, but we can't modify ap_mpm_pod_signal()
>> itself.
>> 
>>> On May 30, 2014, at 11:15 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>> 
>>> Thank you very much!
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>> -----Original Message-----
>>> From: Jim Jagielski [mailto:jim@jaguNET.com] 
>>> Sent: Friday, May 30, 2014 7:07 AM
>>> To: dev@httpd.apache.org
>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
>>> 
>>> Thx! Let me review. My plan is to fold into trunk this weekend.
>>> 
>>>> On May 16, 2014, at 2:53 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>> 
>>>> Hi Jim,
>>>> 
>>>> Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.
>>>> 
>>>> Attached are the two most recent patches. I already updated the bugzilla #55897 as well.
>>>> 
>>>> Thanks,
>>>> Yingqi
>>>> 
>>>> -----Original Message-----
>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>> Sent: Thursday, May 15, 2014 7:53 AM
>>>> To: dev@httpd.apache.org
>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>> SO_REUSEPORT support
>>>> 
>>>> I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
>>>> We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).
>>>> 
>>>>> On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>> 
>>>>> Hi Jim,
>>>>> 
>>>>> Thanks very much for your email.
>>>>> 
>>>>> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
>>>>> 
>>>>> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
>>>>> 
>>>>> Please let me know if this answers your question.
>>>>> 
>>>>> Thanks,
>>>>> Yingqi
>>>>> 
>>>>> -----Original Message-----
>>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>>> Sent: Wednesday, May 14, 2014 6:57 AM
>>>>> To: dev@httpd.apache.org
>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>> SO_REUSEPORT support
>>>>> 
>>>>> This is very cool!
>>>>> 
>>>>> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
>>>>> We look OK on Linux, FreeBSD and OSX...
>>>>> 
>>>>> I'm +1 on folding into trunk.
>>>>> 
>>>>>> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>> 
>>>>>> Dear All,
>>>>>> 
>>>>>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>>>>>> 
>>>>>> Quick test result on modern dual socket Intel platform (Linux Kernel
>>>>>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>>>>>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>>>>>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>>>>>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>>>>>> 
>>>>>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>>>>>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>>>>>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>>>>>> 
>>>>>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>>>>>> 
>>>>>> Please let me know your feedback and comments.
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>> 
>>>>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>>>>> Sent: Monday, March 17, 2014 1:41 PM
>>>>>> To: dev@httpd.apache.org
>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>> SO_REUSEPORT support
>>>>>> 
>>>>>> Dear all,
>>>>>> 
>>>>>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>>>>>> 
>>>>>> Below are the changes we made into this new version:
>>>>>> 
>>>>>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>>>>>> 
>>>>>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>>>>>> 
>>>>>> According to Jeff Trawick's comments, 1. We generate the patches 
>>>>>> against the httpd trunk.
>>>>>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>>>>>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>>>>>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>>>>>> 
>>>>>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi Lu
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> From: Lu, Yingqi
>>>>>> Sent: Tuesday, March 04, 2014 10:43 AM
>>>>>> To: dev@httpd.apache.org
>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>> SO_REUSEPORT support
>>>>>> 
>>>>>> Hi Jeff,
>>>>>> 
>>>>>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>> 
>>>>>> From: Jeff Trawick [mailto:trawick@gmail.com]
>>>>>> Sent: Tuesday, March 04, 2014 10:08 AM
>>>>>> To: Apache HTTP Server Development List
>>>>>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>> SO_REUSEPORT support
>>>>>> 
>>>>>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>>> Hi All,
>>>>>> 
>>>>>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>>>>>> 
>>>>>> If you need any additional information/supporting data, please let us know as well.
>>>>>> 
>>>>>> Yeah, it has been on my todo list, but I don't have time to give an 
>>>>>> in depth review at the moment.  Here are a few questions/comments.
>>>>>> (And you'll have to deal with the fact that it is unnecessarily 
>>>>>> tedious for me to evaluate higher-level considerations if there are 
>>>>>> a lot of distractions, such as the code comments below ;)  But 
>>>>>> others are of course free to chime in.)
>>>>>> 
>>>>>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>>>>>> 
>>>>>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>>>>>> 
>>>>>> Have you tried the event MPM?
>>>>>> 
>>>>>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>>>>>> 
>>>>>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>>>>>> 
>>>>>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>>>>>> 
>>>>>> For the configure-time check for 3.9 kernel: I think we'd also use 
>>>>>> AC_TRY_COMPILE at configure time to confirm that the SO_REUSEPORT 
>>>>>> definition is available, and not enable it if the system includes 
>>>>>> doesn't define it.  (Does that cause a problem for any significant 
>>>>>> number of people?)
>>>>>> 
>>>>>> Don't mention the patch in the patch ;) (e.g., "This function is 
>>>>>> added for the patch.")
>>>>>> 
>>>>>> Incomplete comments on style/syntax issues:
>>>>>> 
>>>>>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>>>>>> apr_socket_t *temps;") isn't supported by all compilers and is 
>>>>>> distracting when reviewing
>>>>>> * suitable identifier names (e.g., fix global variable "flag" and 
>>>>>> whatever else isn't appropriate; "ap_post_config_listeners" should 
>>>>>> be renamed to indicate what it does
>>>>>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>>>>>> and don't add reminders in comments
>>>>>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>>>>>> and so on
>>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>> 
>>>>>> 
>>>>>> From: Lu, Yingqi
>>>>>> Sent: Friday, January 24, 2014 3:26 PM
>>>>>> To: dev@httpd.apache.org
>>>>>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>>> SO_REUSEPORT support
>>>>>> 
>>>>>> Dear All,
>>>>>> 
>>>>>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>>>>>> 
>>>>>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>>>>>> 
>>>>>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>>>>>> 
>>>>>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>>>>>> 
>>>>>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>>>>>> 
>>>>>> Configuration:
>>>>>> <IfModule prefork.c>
>>>>>> ListenBacklog 105384
>>>>>> ServerLimit 105000
>>>>>> MaxClients 1024
>>>>>> MaxRequestsPerChild 0
>>>>>> StartServers 64
>>>>>> MinSpareServers 8
>>>>>> MaxSpareServers 16
>>>>>> </IfModule>
>>>>>> 
>>>>>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>> 
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Born in Roswell... married an alien...
>>>>>> http://emptyhammock.com/
>>>>>> http://edjective.org/
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Born in Roswell... married an alien...
>>>>>> http://emptyhammock.com/
>>>>>> http://edjective.org/
>>>>>> 
>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>> 
>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
> 

Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by Jim Jagielski <ji...@jaguNET.com>.
I also see:

         /* kill off the idle ones */
-        ap_mpm_pod_killpg(pod, retained->max_daemons_limit);
+        for (i = 0; i < num_buckets; i++) {
+            ap_mpm_pod_killpg(pod[i], i, retained->max_daemons_limit);
+        }


Is that right? Why isn't it: ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i); ??

/**
  * Write data to the pipe-of-death, signalling that all child process
  * should die.
  * @param pod The pipe-of-death to write to.
  * @param num The number of child processes to kill
+ * @param my_bucket the bucket that holds the dying child process.
  */
-AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num);
+AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num, int child_bucket);

Isn't 'num' the same in both implementation??

On May 31, 2014, at 12:03 PM, Jim Jagielski <ji...@jagunet.com> wrote:

> Sorry I didn't catch this earlier:
> 
> I see
> 
> +++ httpd-trunk.new/include/mpm_common.h	2014-05-16 13:07:03.892987491 -0400
> @@ -267,16 +267,18 @@
>  * Write data to the pipe-of-death, signalling that one child process
>  * should die.
>  * @param pod the pipe-of-death to write to.
> + * @param my_bucket the bucket that holds the dying child process.
>  */
> -AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod);
> +AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod, int child_bucket);
> 
> We can change the API at this point. We could
> add another function, eg ap_mpm_pod_signal_ex() which
> takes the int param, but we can't modify ap_mpm_pod_signal()
> itself.
> 
> On May 30, 2014, at 11:15 AM, Lu, Yingqi <yi...@intel.com> wrote:
> 
>> Thank you very much!
>> 
>> Thanks,
>> Yingqi
>> 
>> -----Original Message-----
>> From: Jim Jagielski [mailto:jim@jaguNET.com] 
>> Sent: Friday, May 30, 2014 7:07 AM
>> To: dev@httpd.apache.org
>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
>> 
>> Thx! Let me review. My plan is to fold into trunk this weekend.
>> 
>> On May 16, 2014, at 2:53 PM, Lu, Yingqi <yi...@intel.com> wrote:
>> 
>>> Hi Jim,
>>> 
>>> Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.
>>> 
>>> Attached are the two most recent patches. I already updated the bugzilla #55897 as well.
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>> -----Original Message-----
>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>> Sent: Thursday, May 15, 2014 7:53 AM
>>> To: dev@httpd.apache.org
>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>> SO_REUSEPORT support
>>> 
>>> I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
>>> We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).
>>> 
>>> On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>> 
>>>> Hi Jim,
>>>> 
>>>> Thanks very much for your email.
>>>> 
>>>> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
>>>> 
>>>> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
>>>> 
>>>> Please let me know if this answers your question.
>>>> 
>>>> Thanks,
>>>> Yingqi
>>>> 
>>>> -----Original Message-----
>>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>>> Sent: Wednesday, May 14, 2014 6:57 AM
>>>> To: dev@httpd.apache.org
>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>> SO_REUSEPORT support
>>>> 
>>>> This is very cool!
>>>> 
>>>> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
>>>> We look OK on Linux, FreeBSD and OSX...
>>>> 
>>>> I'm +1 on folding into trunk.
>>>> 
>>>> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>>> 
>>>>> Dear All,
>>>>> 
>>>>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>>>>> 
>>>>> Quick test result on modern dual socket Intel platform (Linux Kernel
>>>>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>>>>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>>>>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>>>>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>>>>> 
>>>>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>>>>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>>>>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>>>>> 
>>>>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>>>>> 
>>>>> Please let me know your feedback and comments.
>>>>> 
>>>>> Thanks,
>>>>> Yingqi
>>>>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>> 
>>>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>>>> Sent: Monday, March 17, 2014 1:41 PM
>>>>> To: dev@httpd.apache.org
>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>> SO_REUSEPORT support
>>>>> 
>>>>> Dear all,
>>>>> 
>>>>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>>>>> 
>>>>> Below are the changes we made into this new version:
>>>>> 
>>>>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>>>>> 
>>>>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>>>>> 
>>>>> According to Jeff Trawick's comments, 1. We generate the patches 
>>>>> against the httpd trunk.
>>>>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>>>>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>>>>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>>>>> 
>>>>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>>>>> 
>>>>> Thanks,
>>>>> Yingqi Lu
>>>>> 
>>>>> 
>>>>> 
>>>>> From: Lu, Yingqi
>>>>> Sent: Tuesday, March 04, 2014 10:43 AM
>>>>> To: dev@httpd.apache.org
>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>> SO_REUSEPORT support
>>>>> 
>>>>> Hi Jeff,
>>>>> 
>>>>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>>>>> 
>>>>> Thanks,
>>>>> Yingqi
>>>>> 
>>>>> From: Jeff Trawick [mailto:trawick@gmail.com]
>>>>> Sent: Tuesday, March 04, 2014 10:08 AM
>>>>> To: Apache HTTP Server Development List
>>>>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>> SO_REUSEPORT support
>>>>> 
>>>>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>>> Hi All,
>>>>> 
>>>>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>>>>> 
>>>>> If you need any additional information/supporting data, please let us know as well.
>>>>> 
>>>>> Yeah, it has been on my todo list, but I don't have time to give an 
>>>>> in depth review at the moment.  Here are a few questions/comments.
>>>>> (And you'll have to deal with the fact that it is unnecessarily 
>>>>> tedious for me to evaluate higher-level considerations if there are 
>>>>> a lot of distractions, such as the code comments below ;)  But 
>>>>> others are of course free to chime in.)
>>>>> 
>>>>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>>>>> 
>>>>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>>>>> 
>>>>> Have you tried the event MPM?
>>>>> 
>>>>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>>>>> 
>>>>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>>>>> 
>>>>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>>>>> 
>>>>> For the configure-time check for 3.9 kernel: I think we'd also use 
>>>>> AC_TRY_COMPILE at configure time to confirm that the SO_REUSEPORT 
>>>>> definition is available, and not enable it if the system includes 
>>>>> doesn't define it.  (Does that cause a problem for any significant 
>>>>> number of people?)
>>>>> 
>>>>> Don't mention the patch in the patch ;) (e.g., "This function is 
>>>>> added for the patch.")
>>>>> 
>>>>> Incomplete comments on style/syntax issues:
>>>>> 
>>>>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>>>>> apr_socket_t *temps;") isn't supported by all compilers and is 
>>>>> distracting when reviewing
>>>>> * suitable identifier names (e.g., fix global variable "flag" and 
>>>>> whatever else isn't appropriate; "ap_post_config_listeners" should 
>>>>> be renamed to indicate what it does
>>>>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>>>>> and don't add reminders in comments
>>>>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>>>>> and so on
>>>>> 
>>>>> 
>>>>> Thanks,
>>>>> Yingqi
>>>>> 
>>>>> 
>>>>> From: Lu, Yingqi
>>>>> Sent: Friday, January 24, 2014 3:26 PM
>>>>> To: dev@httpd.apache.org
>>>>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>>> SO_REUSEPORT support
>>>>> 
>>>>> Dear All,
>>>>> 
>>>>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>>>>> 
>>>>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>>>>> 
>>>>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>>>>> 
>>>>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>>>>> 
>>>>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>>>>> 
>>>>> Configuration:
>>>>> <IfModule prefork.c>
>>>>> ListenBacklog 105384
>>>>> ServerLimit 105000
>>>>> MaxClients 1024
>>>>> MaxRequestsPerChild 0
>>>>> StartServers 64
>>>>> MinSpareServers 8
>>>>> MaxSpareServers 16
>>>>> </IfModule>
>>>>> 
>>>>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>> 
>>>>> Thanks,
>>>>> Yingqi
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Born in Roswell... married an alien...
>>>>> http://emptyhammock.com/
>>>>> http://edjective.org/
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Born in Roswell... married an alien...
>>>>> http://emptyhammock.com/
>>>>> http://edjective.org/
>>>>> 
>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>> 
>>> 
>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>> 
> 


Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by Jim Jagielski <ji...@jaguNET.com>.
Sorry I didn't catch this earlier:

I see

+++ httpd-trunk.new/include/mpm_common.h	2014-05-16 13:07:03.892987491 -0400
@@ -267,16 +267,18 @@
  * Write data to the pipe-of-death, signalling that one child process
  * should die.
  * @param pod the pipe-of-death to write to.
+ * @param my_bucket the bucket that holds the dying child process.
  */
-AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod);
+AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod, int child_bucket);

We can change the API at this point. We could
add another function, eg ap_mpm_pod_signal_ex() which
takes the int param, but we can't modify ap_mpm_pod_signal()
itself.

On May 30, 2014, at 11:15 AM, Lu, Yingqi <yi...@intel.com> wrote:

> Thank you very much!
> 
> Thanks,
> Yingqi
> 
> -----Original Message-----
> From: Jim Jagielski [mailto:jim@jaguNET.com] 
> Sent: Friday, May 30, 2014 7:07 AM
> To: dev@httpd.apache.org
> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
> 
> Thx! Let me review. My plan is to fold into trunk this weekend.
> 
> On May 16, 2014, at 2:53 PM, Lu, Yingqi <yi...@intel.com> wrote:
> 
>> Hi Jim,
>> 
>> Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.
>> 
>> Attached are the two most recent patches. I already updated the bugzilla #55897 as well.
>> 
>> Thanks,
>> Yingqi
>> 
>> -----Original Message-----
>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>> Sent: Thursday, May 15, 2014 7:53 AM
>> To: dev@httpd.apache.org
>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
>> We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).
>> 
>> On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:
>> 
>>> Hi Jim,
>>> 
>>> Thanks very much for your email.
>>> 
>>> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
>>> 
>>> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
>>> 
>>> Please let me know if this answers your question.
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>> -----Original Message-----
>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>> Sent: Wednesday, May 14, 2014 6:57 AM
>>> To: dev@httpd.apache.org
>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>> SO_REUSEPORT support
>>> 
>>> This is very cool!
>>> 
>>> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
>>> We look OK on Linux, FreeBSD and OSX...
>>> 
>>> I'm +1 on folding into trunk.
>>> 
>>> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>> 
>>>> Dear All,
>>>> 
>>>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>>>> 
>>>> Quick test result on modern dual socket Intel platform (Linux Kernel
>>>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>>>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>>>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>>>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>>>> 
>>>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>>>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>>>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>>>> 
>>>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>>>> 
>>>> Please let me know your feedback and comments.
>>>> 
>>>> Thanks,
>>>> Yingqi
>>>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>> 
>>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>>> Sent: Monday, March 17, 2014 1:41 PM
>>>> To: dev@httpd.apache.org
>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>> SO_REUSEPORT support
>>>> 
>>>> Dear all,
>>>> 
>>>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>>>> 
>>>> Below are the changes we made into this new version:
>>>> 
>>>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>>>> 
>>>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>>>> 
>>>> According to Jeff Trawick's comments, 1. We generate the patches 
>>>> against the httpd trunk.
>>>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>>>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>>>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>>>> 
>>>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>>>> 
>>>> Thanks,
>>>> Yingqi Lu
>>>> 
>>>> 
>>>> 
>>>> From: Lu, Yingqi
>>>> Sent: Tuesday, March 04, 2014 10:43 AM
>>>> To: dev@httpd.apache.org
>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>> SO_REUSEPORT support
>>>> 
>>>> Hi Jeff,
>>>> 
>>>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>>>> 
>>>> Thanks,
>>>> Yingqi
>>>> 
>>>> From: Jeff Trawick [mailto:trawick@gmail.com]
>>>> Sent: Tuesday, March 04, 2014 10:08 AM
>>>> To: Apache HTTP Server Development List
>>>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>> SO_REUSEPORT support
>>>> 
>>>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>>> Hi All,
>>>> 
>>>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>>>> 
>>>> If you need any additional information/supporting data, please let us know as well.
>>>> 
>>>> Yeah, it has been on my todo list, but I don't have time to give an 
>>>> in depth review at the moment.  Here are a few questions/comments.
>>>> (And you'll have to deal with the fact that it is unnecessarily 
>>>> tedious for me to evaluate higher-level considerations if there are 
>>>> a lot of distractions, such as the code comments below ;)  But 
>>>> others are of course free to chime in.)
>>>> 
>>>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>>>> 
>>>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>>>> 
>>>> Have you tried the event MPM?
>>>> 
>>>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>>>> 
>>>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>>>> 
>>>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>>>> 
>>>> For the configure-time check for 3.9 kernel: I think we'd also use 
>>>> AC_TRY_COMPILE at configure time to confirm that the SO_REUSEPORT 
>>>> definition is available, and not enable it if the system includes 
>>>> doesn't define it.  (Does that cause a problem for any significant 
>>>> number of people?)
>>>> 
>>>> Don't mention the patch in the patch ;) (e.g., "This function is 
>>>> added for the patch.")
>>>> 
>>>> Incomplete comments on style/syntax issues:
>>>> 
>>>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>>>> apr_socket_t *temps;") isn't supported by all compilers and is 
>>>> distracting when reviewing
>>>> * suitable identifier names (e.g., fix global variable "flag" and 
>>>> whatever else isn't appropriate; "ap_post_config_listeners" should 
>>>> be renamed to indicate what it does
>>>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>>>> and don't add reminders in comments
>>>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>>>> and so on
>>>> 
>>>> 
>>>> Thanks,
>>>> Yingqi
>>>> 
>>>> 
>>>> From: Lu, Yingqi
>>>> Sent: Friday, January 24, 2014 3:26 PM
>>>> To: dev@httpd.apache.org
>>>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>>> SO_REUSEPORT support
>>>> 
>>>> Dear All,
>>>> 
>>>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>>>> 
>>>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>>>> 
>>>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>>>> 
>>>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>>>> 
>>>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>>>> 
>>>> Configuration:
>>>> <IfModule prefork.c>
>>>>  ListenBacklog 105384
>>>>  ServerLimit 105000
>>>>  MaxClients 1024
>>>>  MaxRequestsPerChild 0
>>>>  StartServers 64
>>>>  MinSpareServers 8
>>>>  MaxSpareServers 16
>>>> </IfModule>
>>>> 
>>>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>> 
>>>> Thanks,
>>>> Yingqi
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Born in Roswell... married an alien...
>>>> http://emptyhammock.com/
>>>> http://edjective.org/
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Born in Roswell... married an alien...
>>>> http://emptyhammock.com/
>>>> http://edjective.org/
>>>> 
>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>> 
>> 
>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
> 


RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by "Lu, Yingqi" <yi...@intel.com>.
Thank you very much!

Thanks,
Yingqi

-----Original Message-----
From: Jim Jagielski [mailto:jim@jaguNET.com] 
Sent: Friday, May 30, 2014 7:07 AM
To: dev@httpd.apache.org
Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Thx! Let me review. My plan is to fold into trunk this weekend.

On May 16, 2014, at 2:53 PM, Lu, Yingqi <yi...@intel.com> wrote:

> Hi Jim,
> 
> Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.
> 
> Attached are the two most recent patches. I already updated the bugzilla #55897 as well.
> 
> Thanks,
> Yingqi
> 
> -----Original Message-----
> From: Jim Jagielski [mailto:jim@jaguNET.com]
> Sent: Thursday, May 15, 2014 7:53 AM
> To: dev@httpd.apache.org
> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
> 
> I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
> We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).
> 
> On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:
> 
>> Hi Jim,
>> 
>> Thanks very much for your email.
>> 
>> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
>> 
>> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
>> 
>> Please let me know if this answers your question.
>> 
>> Thanks,
>> Yingqi
>> 
>> -----Original Message-----
>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>> Sent: Wednesday, May 14, 2014 6:57 AM
>> To: dev@httpd.apache.org
>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> This is very cool!
>> 
>> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
>> We look OK on Linux, FreeBSD and OSX...
>> 
>> I'm +1 on folding into trunk.
>> 
>> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
>> 
>>> Dear All,
>>> 
>>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>>> 
>>> Quick test result on modern dual socket Intel platform (Linux Kernel
>>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>>> 
>>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>>> 
>>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>>> 
>>> Please let me know your feedback and comments.
>>> 
>>> Thanks,
>>> Yingqi
>>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>> 
>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>> Sent: Monday, March 17, 2014 1:41 PM
>>> To: dev@httpd.apache.org
>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>> SO_REUSEPORT support
>>> 
>>> Dear all,
>>> 
>>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>>> 
>>> Below are the changes we made into this new version:
>>> 
>>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>>> 
>>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>>> 
>>> According to Jeff Trawick's comments, 1. We generate the patches 
>>> against the httpd trunk.
>>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>>> 
>>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>>> 
>>> Thanks,
>>> Yingqi Lu
>>> 
>>> 
>>> 
>>> From: Lu, Yingqi
>>> Sent: Tuesday, March 04, 2014 10:43 AM
>>> To: dev@httpd.apache.org
>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>> SO_REUSEPORT support
>>> 
>>> Hi Jeff,
>>> 
>>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>> From: Jeff Trawick [mailto:trawick@gmail.com]
>>> Sent: Tuesday, March 04, 2014 10:08 AM
>>> To: Apache HTTP Server Development List
>>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>> SO_REUSEPORT support
>>> 
>>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>> Hi All,
>>> 
>>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>>> 
>>> If you need any additional information/supporting data, please let us know as well.
>>> 
>>> Yeah, it has been on my todo list, but I don't have time to give an 
>>> in depth review at the moment.  Here are a few questions/comments.
>>> (And you'll have to deal with the fact that it is unnecessarily 
>>> tedious for me to evaluate higher-level considerations if there are 
>>> a lot of distractions, such as the code comments below ;)  But 
>>> others are of course free to chime in.)
>>> 
>>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>>> 
>>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>>> 
>>> Have you tried the event MPM?
>>> 
>>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>>> 
>>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>>> 
>>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>>> 
>>> For the configure-time check for 3.9 kernel: I think we'd also use 
>>> AC_TRY_COMPILE at configure time to confirm that the SO_REUSEPORT 
>>> definition is available, and not enable it if the system includes 
>>> doesn't define it.  (Does that cause a problem for any significant 
>>> number of people?)
>>> 
>>> Don't mention the patch in the patch ;) (e.g., "This function is 
>>> added for the patch.")
>>> 
>>> Incomplete comments on style/syntax issues:
>>> 
>>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>>> apr_socket_t *temps;") isn't supported by all compilers and is 
>>> distracting when reviewing
>>> * suitable identifier names (e.g., fix global variable "flag" and 
>>> whatever else isn't appropriate; "ap_post_config_listeners" should 
>>> be renamed to indicate what it does
>>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>>> and don't add reminders in comments
>>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>>> and so on
>>> 
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>> 
>>> From: Lu, Yingqi
>>> Sent: Friday, January 24, 2014 3:26 PM
>>> To: dev@httpd.apache.org
>>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>> SO_REUSEPORT support
>>> 
>>> Dear All,
>>> 
>>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>>> 
>>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>>> 
>>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>>> 
>>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>>> 
>>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>>> 
>>> Configuration:
>>> <IfModule prefork.c>
>>>   ListenBacklog 105384
>>>   ServerLimit 105000
>>>   MaxClients 1024
>>>   MaxRequestsPerChild 0
>>>   StartServers 64
>>>   MinSpareServers 8
>>>   MaxSpareServers 16
>>> </IfModule>
>>> 
>>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>> 
>>> 
>>> --
>>> Born in Roswell... married an alien...
>>> http://emptyhammock.com/
>>> http://edjective.org/
>>> 
>>> 
>>> 
>>> 
>>> --
>>> Born in Roswell... married an alien...
>>> http://emptyhammock.com/
>>> http://edjective.org/
>>> 
>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>> 
> 
> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>


Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by Mihai Iacob <mi...@ca.ibm.com>.
how do I unsubsribe from this list ?

Regards,
Mihai Iacob
DB2 Security Development
DB2 pureScale Development
Phone: (905) 413-5378
Email: miacob@ca.ibm.com



From:	Jim Jagielski <ji...@jaguNET.com>
To:	dev@httpd.apache.org
Date:	05/30/2014 10:07 AM
Subject:	Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with
            SO_REUSEPORT support



Thx! Let me review. My plan is to fold into trunk
this weekend.

On May 16, 2014, at 2:53 PM, Lu, Yingqi <yi...@intel.com> wrote:

> Hi Jim,
>
> Thanks very much for clarifying this with me. I added #ifdef in the code
to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does
not use this parameter so that it remains the same.
>
> Attached are the two most recent patches. I already updated the bugzilla
#55897 as well.
>
> Thanks,
> Yingqi
>
> -----Original Message-----
> From: Jim Jagielski [mailto:jim@jaguNET.com]
> Sent: Thursday, May 15, 2014 7:53 AM
> To: dev@httpd.apache.org
> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with
SO_REUSEPORT support
>
> I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
> We could either check for that during config/build or protect it with a
#ifdef in the code (and create some logging so the admin nows if it was
found or not).
>
> On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:
>
>> Hi Jim,
>>
>> Thanks very much for your email.
>>
>> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside
listen.c file. If the feature is not supported on the OS (for example,
Linux kernel < 3.9), it will fall back to the original behavior.
>>
>> In the bucket patch, there is no need to check the params. With single
listen statement, it is just the default behavior.
>>
>> Please let me know if this answers your question.
>>
>> Thanks,
>> Yingqi
>>
>> -----Original Message-----
>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>> Sent: Wednesday, May 14, 2014 6:57 AM
>> To: dev@httpd.apache.org
>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with
>> SO_REUSEPORT support
>>
>> This is very cool!
>>
>> mod_status assumes that sysconf() exists, but do we need to do a config
check on the params we use in these patches?
>> We look OK on Linux, FreeBSD and OSX...
>>
>> I'm +1 on folding into trunk.
>>
>> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
>>
>>> Dear All,
>>>
>>> During the last couple weeks, I spent some time extending the original
two patches from prefork MPM only to all three Linux MPMs (prefork, worker
and event). Attached is the latest version of the two patches. Bugzilla
database has also been updated already. The ID for the two patches are
#55897 and #56279. Please refer to messages below for details on both of
the patches.
>>>
>>> Quick test result on modern dual socket Intel platform (Linux Kernel
>>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement;
2 listen statements: 2.33X throughput improvement
>>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2
listen statements: 35% throughput improvement
>>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2
listen statements: throughput parity, but 62% response time reduction (with
patch, 38% response time as original SW)
>>>
>>> Bucket patch (bugzilla #56279, only impact multiple listen statement
case)
>>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>>>
>>> In all the above testing cases, significant response time reductions
are observed, even with throughput improvements.
>>>
>>> Please let me know your feedback and comments.
>>>
>>> Thanks,
>>> Yingqi
>>> Software and workloads used in performance tests may have been
optimized for performance only on Intel microprocessors. Performance tests,
such as SYSmark and MobileMark, are measured using specific computer
systems, components, software, operations and functions. Any change to any
of those factors may cause the results to vary. You should consult other
information and performance tests to assist you in fully evaluating your
contemplated purchases, including the performance of that product when
combined with other products.
>>>
>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>> Sent: Monday, March 17, 2014 1:41 PM
>>> To: dev@httpd.apache.org
>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with
>>> SO_REUSEPORT support
>>>
>>> Dear all,
>>>
>>> Based on the feedback we received, we modified this patch. Here is the
most recent version. We also modified the Bugzilla database(Bugzilla# 55897
for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>>>
>>> Below are the changes we made into this new version:
>>>
>>> According to Yann Ylavic and other people's comments, we separate the
original patch between with and without SO_REUSEPORT into two separated
patches. The SO_REUSEPORT patch does not change the original listen
sockets, it just duplicate the original one into multiple ones. Since the
listen sockets are identical, there is no need to change the
idle_server_maintenance function. The bucket patch (without SO_REUSEPORT),
on the other hand, it breaks down the original listen record (if there are
multiple listen socks) to multiple listen record linked lists. In this
case, idle_server_maintenance is implemented at bucket level to address the
situation that imbalanced traffic occurs among different listen
sockets/children buckets. In the bucket patch, the polling in the child
process is removed since each child only listens to 1 sock.
>>>
>>> According to Arkadiusz Miskiewicz's comment, we make the "detection of
SO_REUSEPORT" at run time.
>>>
>>> According to Jeff Trawick's comments, 1. We generate the patches
>>> against the httpd trunk.
>>> 2. We tested the current patches and they do not impact event and
worker mpms. If current patches can be accepted, we would be happy to
extend them to other Linux based mpms. There are not much code changes, but
require some time to setup the workload to test.
>>> 3. We removed unnecessary comments and changed APLOGNO(). We also
changed some of the parameter/variable/function names to better represent
their meanings.
>>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For
bucket patch, the only thing is the number of children bucket only scales
to MAX_SPAWN_RATE. If there are more than 32 (current default
MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of
buckets will be fixed to 32. The reason for this is because that we
implement the idle_server_maintenance at bucket level, each bucket's own
max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>>>
>>> Again, thanks very much for all the comments and feedback. Please let
us know if there are more changes we need to complete to make them
accepted.
>>>
>>> Thanks,
>>> Yingqi Lu
>>>
>>>
>>>
>>> From: Lu, Yingqi
>>> Sent: Tuesday, March 04, 2014 10:43 AM
>>> To: dev@httpd.apache.org
>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with
>>> SO_REUSEPORT support
>>>
>>> Hi Jeff,
>>>
>>> Thanks very much for your time reviewing the patch! We will modify the
patch according to your comments and repost it here.
>>>
>>> Thanks,
>>> Yingqi
>>>
>>> From: Jeff Trawick [mailto:trawick@gmail.com]
>>> Sent: Tuesday, March 04, 2014 10:08 AM
>>> To: Apache HTTP Server Development List
>>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with
>>> SO_REUSEPORT support
>>>
>>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com>
wrote:
>>> Hi All,
>>>
>>> I just want to ping again on this patch to gather your feedback and
comments. Please refer to the messages below for patch details.
>>>
>>> If you need any additional information/supporting data, please let us
know as well.
>>>
>>> Yeah, it has been on my todo list, but I don't have time to give an
>>> in depth review at the moment.  Here are a few questions/comments.
>>> (And you'll have to deal with the fact that it is unnecessarily
>>> tedious for me to evaluate higher-level considerations if there are a
>>> lot of distractions, such as the code comments below ;)  But others
>>> are of course free to chime in.)
>>>
>>> The patch should be against httpd trunk.  It probably won't take much
time for you to create that patch and confirm basic operation.
>>>
>>> What is the impact to other MPMs, even if they shouldn't use or don't
have the necessary code to use SO_REUSEPORT at this time?
>>>
>>> Have you tried the event MPM?
>>>
>>> Is there a way for the admin to choose this behavior?  Most won't care,
but everyone's behavior is changed AFAICT.
>>>
>>> Are there built-in limitations in this patch that we should be aware
of?  E.g., the free slot/spawn rate changes suggest to me that there can't
be more than 1025 children???
>>>
>>> We should assume for now that there's no reason this couldn't be
committed to trunk after review/rework, so make sure it is as close as you
can get it to what you think is the final form.
>>>
>>> For the configure-time check for 3.9 kernel: I think we'd also use
>>> AC_TRY_COMPILE at configure time to confirm that the SO_REUSEPORT
>>> definition is available, and not enable it if the system includes
>>> doesn't define it.  (Does that cause a problem for any significant
>>> number of people?)
>>>
>>> Don't mention the patch in the patch ;) (e.g., "This function is
>>> added for the patch.")
>>>
>>> Incomplete comments on style/syntax issues:
>>>
>>> * mixing declarations and statements (e.g., "duplr->next = 0;
>>> apr_socket_t *temps;") isn't supported by all compilers and is
>>> distracting when reviewing
>>> * suitable identifier names (e.g., fix global variable "flag" and
>>> whatever else isn't appropriate; "ap_post_config_listeners" should be
>>> renamed to indicate what it does
>>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()"
>>> and don't add reminders in comments
>>> * this doesn't seem portable: "int free_slots
[MAX_SPAWN_RATE/num_buckets];"
>>> and so on
>>>
>>>
>>> Thanks,
>>> Yingqi
>>>
>>>
>>> From: Lu, Yingqi
>>> Sent: Friday, January 24, 2014 3:26 PM
>>> To: dev@httpd.apache.org
>>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with
>>> SO_REUSEPORT support
>>>
>>> Dear All,
>>>
>>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread
Intel Xeon 2600 series systems, using an open source three tier social
networking web server workload, revealed performance scaling issues.  In
current software single listen statement (listen 80) provides better
scalability due to un-serialized accept. However, when system is under very
high load, this can lead to big number of child processes stuck in D state.
On the other hand, the serialized accept approach cannot scale with the
high load either.  In our analysis, a 32-thread system, with 2 listen
statements specified, could scale to just 70% utilization, and a 64-thread
system, with signal listen statement specified (listen 80, 4 network
interfaces), could scale to only 60% utilization.
>>>
>>> Based on those findings, we created a prototype patch for prefork mpm
which extends performance and thread utilization. In Linux kernel newer
than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets
listen to the same IP:port and automatically round robins connections. We
use this feature to create multiple duplicated listener records of the
original one and partition the child processes into buckets. Each bucket
listens to 1 IP:port. In case of old kernel which does not have the
SO_REUSEPORT enabled, we modified the "multiple listen statement case" by
creating 1 listen record for each listen statement and partitioning the
child processes into different buckets. Each bucket listens to 1 IP:port.
>>>
>>> Quick tests of the patch, running the same workload, demonstrated a 22%
throughput increase with 32-threads system and 2 listen statements (Linux
kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without
SO_REUSEPORT), 10% performance gain was measured. With single listen
statement (listen 80) configuration, we observed over 2X performance
improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4).
We also observed big reduction in response time, in addition to the
throughput improvement gained in our tests 1.
>>>
>>> Following the feedback from the bugzilla website where we originally
submitted the patch, we removed the dependency of APR change to simplify
the patch testing process. Thanks Jeff Trawick for his good suggestion! We
are also actively working on extending the patch to worker and event MPMs,
as a next step. Meanwhile, we would like to gather comments from all of you
on the current prefork patch. Please take some time test it and let us know
how it works in your environment.
>>>
>>> This is our first patch to the Apache community. Please help us review
it and let us know if there is anything we might revise to improve it. Your
feedback is very much appreciated.
>>>
>>> Configuration:
>>> <IfModule prefork.c>
>>>   ListenBacklog 105384
>>>   ServerLimit 105000
>>>   MaxClients 1024
>>>   MaxRequestsPerChild 0
>>>   StartServers 64
>>>   MinSpareServers 8
>>>   MaxSpareServers 16
>>> </IfModule>
>>>
>>> 1. Software and workloads used in performance tests may have been
optimized for performance only on Intel microprocessors. Performance tests,
such as SYSmark and MobileMark, are measured using specific computer
systems, components, software, operations and functions. Any change to any
of those factors may cause the results to vary. You should consult other
information and performance tests to assist you in fully evaluating your
contemplated purchases, including the performance of that product when
combined with other products.
>>>
>>> Thanks,
>>> Yingqi
>>>
>>>
>>>
>>> --
>>> Born in Roswell... married an alien...
>>> http://emptyhammock.com/
>>> http://edjective.org/
>>>
>>>
>>>
>>>
>>> --
>>> Born in Roswell... married an alien...
>>> http://emptyhammock.com/
>>> http://edjective.org/
>>>
>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>
>
> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>


Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by Jim Jagielski <ji...@jaguNET.com>.
Thx! Let me review. My plan is to fold into trunk
this weekend.

On May 16, 2014, at 2:53 PM, Lu, Yingqi <yi...@intel.com> wrote:

> Hi Jim,
> 
> Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.
> 
> Attached are the two most recent patches. I already updated the bugzilla #55897 as well.
> 
> Thanks,
> Yingqi
> 
> -----Original Message-----
> From: Jim Jagielski [mailto:jim@jaguNET.com] 
> Sent: Thursday, May 15, 2014 7:53 AM
> To: dev@httpd.apache.org
> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
> 
> I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
> We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).
> 
> On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:
> 
>> Hi Jim,
>> 
>> Thanks very much for your email.
>> 
>> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
>> 
>> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
>> 
>> Please let me know if this answers your question.
>> 
>> Thanks,
>> Yingqi
>> 
>> -----Original Message-----
>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>> Sent: Wednesday, May 14, 2014 6:57 AM
>> To: dev@httpd.apache.org
>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> This is very cool!
>> 
>> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
>> We look OK on Linux, FreeBSD and OSX...
>> 
>> I'm +1 on folding into trunk.
>> 
>> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
>> 
>>> Dear All,
>>> 
>>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>>> 
>>> Quick test result on modern dual socket Intel platform (Linux Kernel
>>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>>> 
>>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>>> 
>>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>>> 
>>> Please let me know your feedback and comments.
>>> 
>>> Thanks,
>>> Yingqi
>>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>> 
>>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>>> Sent: Monday, March 17, 2014 1:41 PM
>>> To: dev@httpd.apache.org
>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>> SO_REUSEPORT support
>>> 
>>> Dear all,
>>> 
>>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>>> 
>>> Below are the changes we made into this new version:
>>> 
>>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>>> 
>>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>>> 
>>> According to Jeff Trawick's comments, 1. We generate the patches 
>>> against the httpd trunk.
>>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>>> 
>>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>>> 
>>> Thanks,
>>> Yingqi Lu
>>> 
>>> 
>>> 
>>> From: Lu, Yingqi
>>> Sent: Tuesday, March 04, 2014 10:43 AM
>>> To: dev@httpd.apache.org
>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>> SO_REUSEPORT support
>>> 
>>> Hi Jeff,
>>> 
>>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>> From: Jeff Trawick [mailto:trawick@gmail.com]
>>> Sent: Tuesday, March 04, 2014 10:08 AM
>>> To: Apache HTTP Server Development List
>>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>> SO_REUSEPORT support
>>> 
>>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>>> Hi All,
>>> 
>>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>>> 
>>> If you need any additional information/supporting data, please let us know as well.
>>> 
>>> Yeah, it has been on my todo list, but I don't have time to give an 
>>> in depth review at the moment.  Here are a few questions/comments.  
>>> (And you'll have to deal with the fact that it is unnecessarily 
>>> tedious for me to evaluate higher-level considerations if there are a 
>>> lot of distractions, such as the code comments below ;)  But others 
>>> are of course free to chime in.)
>>> 
>>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>>> 
>>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>>> 
>>> Have you tried the event MPM?
>>> 
>>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>>> 
>>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>>> 
>>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>>> 
>>> For the configure-time check for 3.9 kernel: I think we'd also use 
>>> AC_TRY_COMPILE at configure time to confirm that the SO_REUSEPORT 
>>> definition is available, and not enable it if the system includes 
>>> doesn't define it.  (Does that cause a problem for any significant 
>>> number of people?)
>>> 
>>> Don't mention the patch in the patch ;) (e.g., "This function is 
>>> added for the patch.")
>>> 
>>> Incomplete comments on style/syntax issues:
>>> 
>>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>>> apr_socket_t *temps;") isn't supported by all compilers and is 
>>> distracting when reviewing
>>> * suitable identifier names (e.g., fix global variable "flag" and 
>>> whatever else isn't appropriate; "ap_post_config_listeners" should be 
>>> renamed to indicate what it does
>>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>>> and don't add reminders in comments
>>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>>> and so on
>>> 
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>> 
>>> From: Lu, Yingqi
>>> Sent: Friday, January 24, 2014 3:26 PM
>>> To: dev@httpd.apache.org
>>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>>> SO_REUSEPORT support
>>> 
>>> Dear All,
>>> 
>>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>>> 
>>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>>> 
>>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>>> 
>>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>>> 
>>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>>> 
>>> Configuration:
>>> <IfModule prefork.c>
>>>   ListenBacklog 105384
>>>   ServerLimit 105000
>>>   MaxClients 1024
>>>   MaxRequestsPerChild 0
>>>   StartServers 64
>>>   MinSpareServers 8
>>>   MaxSpareServers 16
>>> </IfModule>
>>> 
>>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>> 
>>> Thanks,
>>> Yingqi
>>> 
>>> 
>>> 
>>> --
>>> Born in Roswell... married an alien...
>>> http://emptyhammock.com/
>>> http://edjective.org/
>>> 
>>> 
>>> 
>>> 
>>> --
>>> Born in Roswell... married an alien...
>>> http://emptyhammock.com/
>>> http://edjective.org/
>>> 
>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>> 
> 
> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>


RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by "Lu, Yingqi" <yi...@intel.com>.
Hi Jim,

Thanks very much for clarifying this with me. I added #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. Bucket patch does not use this parameter so that it remains the same.

Attached are the two most recent patches. I already updated the bugzilla #55897 as well.

Thanks,
Yingqi

-----Original Message-----
From: Jim Jagielski [mailto:jim@jaguNET.com] 
Sent: Thursday, May 15, 2014 7:53 AM
To: dev@httpd.apache.org
Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
We could either check for that during config/build or protect it with a #ifdef in the code (and create some logging so the admin nows if it was found or not).

On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:

> Hi Jim,
> 
> Thanks very much for your email.
> 
> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
> 
> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
> 
> Please let me know if this answers your question.
> 
> Thanks,
> Yingqi
> 
> -----Original Message-----
> From: Jim Jagielski [mailto:jim@jaguNET.com]
> Sent: Wednesday, May 14, 2014 6:57 AM
> To: dev@httpd.apache.org
> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
> 
> This is very cool!
> 
> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
> We look OK on Linux, FreeBSD and OSX...
> 
> I'm +1 on folding into trunk.
> 
> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
> 
>> Dear All,
>> 
>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>> 
>> Quick test result on modern dual socket Intel platform (Linux Kernel
>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>> 
>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>> 
>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>> 
>> Please let me know your feedback and comments.
>> 
>> Thanks,
>> Yingqi
>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>> 
>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>> Sent: Monday, March 17, 2014 1:41 PM
>> To: dev@httpd.apache.org
>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> Dear all,
>> 
>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>> 
>> Below are the changes we made into this new version:
>> 
>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>> 
>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>> 
>> According to Jeff Trawick's comments, 1. We generate the patches 
>> against the httpd trunk.
>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>> 
>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>> 
>> Thanks,
>> Yingqi Lu
>> 
>> 
>> 
>> From: Lu, Yingqi
>> Sent: Tuesday, March 04, 2014 10:43 AM
>> To: dev@httpd.apache.org
>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> Hi Jeff,
>> 
>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>> 
>> Thanks,
>> Yingqi
>> 
>> From: Jeff Trawick [mailto:trawick@gmail.com]
>> Sent: Tuesday, March 04, 2014 10:08 AM
>> To: Apache HTTP Server Development List
>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>> Hi All,
>> 
>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>> 
>> If you need any additional information/supporting data, please let us know as well.
>> 
>> Yeah, it has been on my todo list, but I don't have time to give an 
>> in depth review at the moment.  Here are a few questions/comments.  
>> (And you'll have to deal with the fact that it is unnecessarily 
>> tedious for me to evaluate higher-level considerations if there are a 
>> lot of distractions, such as the code comments below ;)  But others 
>> are of course free to chime in.)
>> 
>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>> 
>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>> 
>> Have you tried the event MPM?
>> 
>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>> 
>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>> 
>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>> 
>> For the configure-time check for 3.9 kernel: I think we'd also use 
>> AC_TRY_COMPILE at configure time to confirm that the SO_REUSEPORT 
>> definition is available, and not enable it if the system includes 
>> doesn't define it.  (Does that cause a problem for any significant 
>> number of people?)
>> 
>> Don't mention the patch in the patch ;) (e.g., "This function is 
>> added for the patch.")
>> 
>> Incomplete comments on style/syntax issues:
>> 
>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>> apr_socket_t *temps;") isn't supported by all compilers and is 
>> distracting when reviewing
>> * suitable identifier names (e.g., fix global variable "flag" and 
>> whatever else isn't appropriate; "ap_post_config_listeners" should be 
>> renamed to indicate what it does
>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>> and don't add reminders in comments
>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>> and so on
>> 
>> 
>> Thanks,
>> Yingqi
>> 
>> 
>> From: Lu, Yingqi
>> Sent: Friday, January 24, 2014 3:26 PM
>> To: dev@httpd.apache.org
>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> Dear All,
>> 
>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>> 
>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>> 
>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>> 
>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>> 
>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>> 
>> Configuration:
>> <IfModule prefork.c>
>>    ListenBacklog 105384
>>    ServerLimit 105000
>>    MaxClients 1024
>>    MaxRequestsPerChild 0
>>    StartServers 64
>>    MinSpareServers 8
>>    MaxSpareServers 16
>> </IfModule>
>> 
>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>> 
>> Thanks,
>> Yingqi
>> 
>> 
>> 
>> --
>> Born in Roswell... married an alien...
>> http://emptyhammock.com/
>> http://edjective.org/
>> 
>> 
>> 
>> 
>> --
>> Born in Roswell... married an alien...
>> http://emptyhammock.com/
>> http://edjective.org/
>> 
>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
> 


Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by Jim Jagielski <ji...@jaguNET.com>.
I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff...
We could either check for that during config/build or protect
it with a #ifdef in the code (and create some logging so
the admin nows if it was found or not).

On May 14, 2014, at 11:59 AM, Lu, Yingqi <yi...@intel.com> wrote:

> Hi Jim,
> 
> Thanks very much for your email.
> 
> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 
> 
> In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 
> 
> Please let me know if this answers your question.
> 
> Thanks,
> Yingqi
> 
> -----Original Message-----
> From: Jim Jagielski [mailto:jim@jaguNET.com] 
> Sent: Wednesday, May 14, 2014 6:57 AM
> To: dev@httpd.apache.org
> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
> 
> This is very cool!
> 
> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
> We look OK on Linux, FreeBSD and OSX...
> 
> I'm +1 on folding into trunk.
> 
> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:
> 
>> Dear All,
>> 
>> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>> 
>> Quick test result on modern dual socket Intel platform (Linux Kernel 
>> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
>> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>> 
>> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
>> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
>> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>> 
>> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>> 
>> Please let me know your feedback and comments.
>> 
>> Thanks,
>> Yingqi
>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>> 
>> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
>> Sent: Monday, March 17, 2014 1:41 PM
>> To: dev@httpd.apache.org
>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> Dear all,
>> 
>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>> 
>> Below are the changes we made into this new version:
>> 
>> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>> 
>> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>> 
>> According to Jeff Trawick's comments,
>> 1. We generate the patches against the httpd trunk.
>> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>> 
>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>> 
>> Thanks,
>> Yingqi Lu
>> 
>> 
>> 
>> From: Lu, Yingqi
>> Sent: Tuesday, March 04, 2014 10:43 AM
>> To: dev@httpd.apache.org
>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> Hi Jeff,
>> 
>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>> 
>> Thanks,
>> Yingqi
>> 
>> From: Jeff Trawick [mailto:trawick@gmail.com]
>> Sent: Tuesday, March 04, 2014 10:08 AM
>> To: Apache HTTP Server Development List
>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
>> Hi All,
>> 
>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>> 
>> If you need any additional information/supporting data, please let us know as well.
>> 
>> Yeah, it has been on my todo list, but I don't have time to give an in 
>> depth review at the moment.  Here are a few questions/comments.  (And 
>> you'll have to deal with the fact that it is unnecessarily tedious for 
>> me to evaluate higher-level considerations if there are a lot of 
>> distractions, such as the code comments below ;)  But others are of 
>> course free to chime in.)
>> 
>> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>> 
>> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>> 
>> Have you tried the event MPM?
>> 
>> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>> 
>> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>> 
>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>> 
>> For the configure-time check for 3.9 kernel: I think we'd also use 
>> AC_TRY_COMPILE at configure time to confirm that the SO_REUSEPORT 
>> definition is available, and not enable it if the system includes 
>> doesn't define it.  (Does that cause a problem for any significant 
>> number of people?)
>> 
>> Don't mention the patch in the patch ;) (e.g., "This function is added 
>> for the patch.")
>> 
>> Incomplete comments on style/syntax issues:
>> 
>> * mixing declarations and statements (e.g., "duplr->next = 0; 
>> apr_socket_t *temps;") isn't supported by all compilers and is 
>> distracting when reviewing
>> * suitable identifier names (e.g., fix global variable "flag" and 
>> whatever else isn't appropriate; "ap_post_config_listeners" should be 
>> renamed to indicate what it does
>> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
>> and don't add reminders in comments
>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>> and so on
>> 
>> 
>> Thanks,
>> Yingqi
>> 
>> 
>> From: Lu, Yingqi
>> Sent: Friday, January 24, 2014 3:26 PM
>> To: dev@httpd.apache.org
>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
>> SO_REUSEPORT support
>> 
>> Dear All,
>> 
>> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>> 
>> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>> 
>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>> 
>> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>> 
>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>> 
>> Configuration:
>> <IfModule prefork.c>
>>    ListenBacklog 105384
>>    ServerLimit 105000
>>    MaxClients 1024
>>    MaxRequestsPerChild 0
>>    StartServers 64
>>    MinSpareServers 8
>>    MaxSpareServers 16
>> </IfModule>
>> 
>> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>> 
>> Thanks,
>> Yingqi
>> 
>> 
>> 
>> --
>> Born in Roswell... married an alien...
>> http://emptyhammock.com/
>> http://edjective.org/
>> 
>> 
>> 
>> 
>> --
>> Born in Roswell... married an alien...
>> http://emptyhammock.com/
>> http://edjective.org/
>> 
>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
> 


RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by "Lu, Yingqi" <yi...@intel.com>.
Now, I just got the message I sent out yesterday, what a long delay!

Actually, this morning I sent out several messages checking for other feedback/comments/questions. I guess those messages are still in the air since I have not received any copy of those mails.

All, please let me know your feedback/comments/questions. Thanks very much for your time reviewing the patches.

Thanks,
Yingqi 

-----Original Message-----
From: Lu, Yingqi [mailto:yingqi.lu@intel.com] 
Sent: Wednesday, May 14, 2014 9:00 AM
To: dev@httpd.apache.org
Subject: RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Hi Jim,

Thanks very much for your email.

In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 

In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 

Please let me know if this answers your question.

Thanks,
Yingqi

-----Original Message-----
From: Jim Jagielski [mailto:jim@jaguNET.com]
Sent: Wednesday, May 14, 2014 6:57 AM
To: dev@httpd.apache.org
Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

This is very cool!

mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
We look OK on Linux, FreeBSD and OSX...

I'm +1 on folding into trunk.

On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:

> Dear All,
>  
> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>  
> Quick test result on modern dual socket Intel platform (Linux Kernel
> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>  
> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>  
> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>  
> Please let me know your feedback and comments.
>  
> Thanks,
> Yingqi
> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>  
> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
> Sent: Monday, March 17, 2014 1:41 PM
> To: dev@httpd.apache.org
> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> Dear all,
>  
> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>  
> Below are the changes we made into this new version:
>  
> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>  
> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>  
> According to Jeff Trawick's comments,
> 1. We generate the patches against the httpd trunk.
> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>  
> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>  
> Thanks,
> Yingqi Lu
>  
>  
>  
> From: Lu, Yingqi
> Sent: Tuesday, March 04, 2014 10:43 AM
> To: dev@httpd.apache.org
> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> Hi Jeff,
>  
> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>  
> Thanks,
> Yingqi
>  
> From: Jeff Trawick [mailto:trawick@gmail.com]
> Sent: Tuesday, March 04, 2014 10:08 AM
> To: Apache HTTP Server Development List
> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
> Hi All,
>  
> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>  
> If you need any additional information/supporting data, please let us know as well.
>  
> Yeah, it has been on my todo list, but I don't have time to give an in 
> depth review at the moment.  Here are a few questions/comments.  (And 
> you'll have to deal with the fact that it is unnecessarily tedious for 
> me to evaluate higher-level considerations if there are a lot of 
> distractions, such as the code comments below ;)  But others are of 
> course free to chime in.)
>  
> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>  
> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>  
> Have you tried the event MPM?
>  
> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>  
> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>  
> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>  
> For the configure-time check for 3.9 kernel: I think we'd also use 
> AC_TRY_COMPILE at configure time to confirm that the SO_REUSEPORT 
> definition is available, and not enable it if the system includes 
> doesn't define it.  (Does that cause a problem for any significant 
> number of people?)
>  
> Don't mention the patch in the patch ;) (e.g., "This function is added 
> for the patch.")
>  
> Incomplete comments on style/syntax issues:
>  
> * mixing declarations and statements (e.g., "duplr->next = 0; 
> apr_socket_t *temps;") isn't supported by all compilers and is 
> distracting when reviewing
> * suitable identifier names (e.g., fix global variable "flag" and 
> whatever else isn't appropriate; "ap_post_config_listeners" should be 
> renamed to indicate what it does
> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
> and don't add reminders in comments
> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
> and so on
>  
>  
> Thanks,
> Yingqi
>  
>  
> From: Lu, Yingqi
> Sent: Friday, January 24, 2014 3:26 PM
> To: dev@httpd.apache.org
> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> Dear All,
>  
> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>  
> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>  
> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>  
> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>  
> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>  
> Configuration:
> <IfModule prefork.c>
>     ListenBacklog 105384
>     ServerLimit 105000
>     MaxClients 1024
>     MaxRequestsPerChild 0
>     StartServers 64
>     MinSpareServers 8
>     MaxSpareServers 16
> </IfModule>
>  
> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>  
> Thanks,
> Yingqi
> 
> 
>  
> --
> Born in Roswell... married an alien...
> http://emptyhammock.com/
> http://edjective.org/
>  
> 
> 
>  
> --
> Born in Roswell... married an alien...
> http://emptyhammock.com/
> http://edjective.org/
>  
> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>


RE: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by "Lu, Yingqi" <yi...@intel.com>.
Hi Jim,

Thanks very much for your email.

In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior. 

In the bucket patch, there is no need to check the params. With single listen statement, it is just the default behavior. 

Please let me know if this answers your question.

Thanks,
Yingqi

-----Original Message-----
From: Jim Jagielski [mailto:jim@jaguNET.com] 
Sent: Wednesday, May 14, 2014 6:57 AM
To: dev@httpd.apache.org
Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

This is very cool!

mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches?
We look OK on Linux, FreeBSD and OSX...

I'm +1 on folding into trunk.

On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:

> Dear All,
>  
> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>  
> Quick test result on modern dual socket Intel platform (Linux Kernel 
> 3.13.9) SO_REUSEPORT patch (bugzilla #55897)
> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>  
> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>  
> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>  
> Please let me know your feedback and comments.
>  
> Thanks,
> Yingqi
> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>  
> From: Lu, Yingqi [mailto:yingqi.lu@intel.com]
> Sent: Monday, March 17, 2014 1:41 PM
> To: dev@httpd.apache.org
> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> Dear all,
>  
> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>  
> Below are the changes we made into this new version:
>  
> According to Yann Ylavic and other people's comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>  
> According to Arkadiusz Miskiewicz's comment, we make the "detection of SO_REUSEPORT" at run time.
>  
> According to Jeff Trawick's comments,
> 1. We generate the patches against the httpd trunk.
> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket's own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>  
> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>  
> Thanks,
> Yingqi Lu
>  
>  
>  
> From: Lu, Yingqi
> Sent: Tuesday, March 04, 2014 10:43 AM
> To: dev@httpd.apache.org
> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> Hi Jeff,
>  
> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>  
> Thanks,
> Yingqi
>  
> From: Jeff Trawick [mailto:trawick@gmail.com]
> Sent: Tuesday, March 04, 2014 10:08 AM
> To: Apache HTTP Server Development List
> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
> Hi All,
>  
> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>  
> If you need any additional information/supporting data, please let us know as well.
>  
> Yeah, it has been on my todo list, but I don't have time to give an in 
> depth review at the moment.  Here are a few questions/comments.  (And 
> you'll have to deal with the fact that it is unnecessarily tedious for 
> me to evaluate higher-level considerations if there are a lot of 
> distractions, such as the code comments below ;)  But others are of 
> course free to chime in.)
>  
> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>  
> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>  
> Have you tried the event MPM?
>  
> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>  
> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>  
> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>  
> For the configure-time check for 3.9 kernel: I think we'd also use 
> AC_TRY_COMPILE at configure time to confirm that the SO_REUSEPORT 
> definition is available, and not enable it if the system includes 
> doesn't define it.  (Does that cause a problem for any significant 
> number of people?)
>  
> Don't mention the patch in the patch ;) (e.g., "This function is added 
> for the patch.")
>  
> Incomplete comments on style/syntax issues:
>  
> * mixing declarations and statements (e.g., "duplr->next = 0; 
> apr_socket_t *temps;") isn't supported by all compilers and is 
> distracting when reviewing
> * suitable identifier names (e.g., fix global variable "flag" and 
> whatever else isn't appropriate; "ap_post_config_listeners" should be 
> renamed to indicate what it does
> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" 
> and don't add reminders in comments
> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
> and so on
>  
>  
> Thanks,
> Yingqi
>  
>  
> From: Lu, Yingqi
> Sent: Friday, January 24, 2014 3:26 PM
> To: dev@httpd.apache.org
> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with 
> SO_REUSEPORT support
>  
> Dear All,
>  
> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>  
> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>  
> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>  
> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>  
> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>  
> Configuration:
> <IfModule prefork.c>
>     ListenBacklog 105384
>     ServerLimit 105000
>     MaxClients 1024
>     MaxRequestsPerChild 0
>     StartServers 64
>     MinSpareServers 8
>     MaxSpareServers 16
> </IfModule>
>  
> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>  
> Thanks,
> Yingqi
> 
> 
>  
> --
> Born in Roswell... married an alien...
> http://emptyhammock.com/
> http://edjective.org/
>  
> 
> 
>  
> --
> Born in Roswell... married an alien...
> http://emptyhammock.com/
> http://edjective.org/
>  
> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>


Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support

Posted by Jim Jagielski <ji...@jaguNET.com>.
This is very cool!

mod_status assumes that sysconf() exists, but do we need
to do a config check on the params we use in these patches?
We look OK on Linux, FreeBSD and OSX...

I'm +1 on folding into trunk.

On May 13, 2014, at 7:55 PM, Lu, Yingqi <yi...@intel.com> wrote:

> Dear All,
>  
> During the last couple weeks, I spent some time extending the original two patches from prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. Bugzilla database has also been updated already. The ID for the two patches are #55897 and #56279. Please refer to messages below for details on both of the patches.
>  
> Quick test result on modern dual socket Intel platform (Linux Kernel 3.13.9)
> SO_REUSEPORT patch (bugzilla #55897)
> 1.       Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
> 2.       Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
> 3.       Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with patch, 38% response time as original SW)
>  
> Bucket patch (bugzilla #56279, only impact multiple listen statement case)
> 1.       Prefork MPM: 2 listen statements: 42% throughput improvement
> 2.       Worker MPM: 2 listen statements: 7% throughput improvement
>  
> In all the above testing cases, significant response time reductions are observed, even with throughput improvements.
>  
> Please let me know your feedback and comments.
>  
> Thanks,
> Yingqi
> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>  
> From: Lu, Yingqi [mailto:yingqi.lu@intel.com] 
> Sent: Monday, March 17, 2014 1:41 PM
> To: dev@httpd.apache.org
> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
>  
> Dear all,
>  
> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database(Bugzilla# 55897 for SO_REUSEPORT patch; Bugzilla# 56279 for bucket patch).
>  
> Below are the changes we made into this new version:
>  
> According to Yann Ylavic and other people’s comments, we separate the original patch between with and without SO_REUSEPORT into two separated patches. The SO_REUSEPORT patch does not change the original listen sockets, it just duplicate the original one into multiple ones. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, it breaks down the original listen record (if there are multiple listen socks) to multiple listen record linked lists. In this case, idle_server_maintenance is implemented at bucket level to address the situation that imbalanced traffic occurs among different listen sockets/children buckets. In the bucket patch, the polling in the child process is removed since each child only listens to 1 sock.
>  
> According to Arkadiusz Miskiewicz’s comment, we make the “detection of SO_REUSEPORT” at run time.
>  
> According to Jeff Trawick’s comments,
> 1. We generate the patches against the httpd trunk.
> 2. We tested the current patches and they do not impact event and worker mpms. If current patches can be accepted, we would be happy to extend them to other Linux based mpms. There are not much code changes, but require some time to setup the workload to test.
> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
> 4. There should be no build-in limitations for SO_REUSEPORT patch. For bucket patch, the only thing is the number of children bucket only scales to MAX_SPAWN_RATE. If there are more than 32 (current default MAX_SPQN_RATE) listen statements specified in the httpd.conf, the number of buckets will be fixed to 32. The reason for this is because that we implement the idle_server_maintenance at bucket level, each bucket’s own max_spawn_rate is set to MAX_SPAWN_RATE/num_buckets.
>  
> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to complete to make them accepted.
>  
> Thanks,
> Yingqi Lu
>  
>  
>  
> From: Lu, Yingqi 
> Sent: Tuesday, March 04, 2014 10:43 AM
> To: dev@httpd.apache.org
> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
>  
> Hi Jeff,
>  
> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>  
> Thanks,
> Yingqi
>  
> From: Jeff Trawick [mailto:trawick@gmail.com] 
> Sent: Tuesday, March 04, 2014 10:08 AM
> To: Apache HTTP Server Development List
> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
>  
> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yi...@intel.com> wrote:
> Hi All, 
>  
> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>  
> If you need any additional information/supporting data, please let us know as well.
>  
> Yeah, it has been on my todo list, but I don't have time to give an in depth review at the moment.  Here are a few questions/comments.  (And you'll have to deal with the fact that it is unnecessarily tedious for me to evaluate higher-level considerations if there are a lot of distractions, such as the code comments below ;)  But others are of course free to chime in.)
>  
> The patch should be against httpd trunk.  It probably won't take much time for you to create that patch and confirm basic operation.
>  
> What is the impact to other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>  
> Have you tried the event MPM?
>  
> Is there a way for the admin to choose this behavior?  Most won't care, but everyone's behavior is changed AFAICT.
>  
> Are there built-in limitations in this patch that we should be aware of?  E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>  
> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>  
> For the configure-time check for 3.9 kernel: I think we'd also use AC_TRY_COMPILE at configure time to confirm that the SO_REUSEPORT definition is available, and not enable it if the system includes doesn't define it.  (Does that cause a problem for any significant number of people?)
>  
> Don't mention the patch in the patch ;) (e.g., "This function is added for the patch.")
>  
> Incomplete comments on style/syntax issues:
>  
> * mixing declarations and statements (e.g., "duplr->next = 0; apr_socket_t *temps;") isn't supported by all compilers and is distracting when reviewing
> * suitable identifier names (e.g., fix global variable "flag" and whatever else isn't appropriate; "ap_post_config_listeners" should be renamed to indicate what it does
> * APLOGNO(99999) and comments about fixing it: Instead put "APLOGNO()" and don't add reminders in comments
> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
> and so on
>  
>  
> Thanks,
> Yingqi
>  
>  
> From: Lu, Yingqi 
> Sent: Friday, January 24, 2014 3:26 PM
> To: dev@httpd.apache.org
> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
>  
> Dear All,
>  
> Our analysis of Apache httpd 2.4.7 prefork mpm, on 32 and 64 thread Intel Xeon 2600 series systems, using an open source three tier social networking web server workload, revealed performance scaling issues.  In current software single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when system is under very high load, this can lead to big number of child processes stuck in D state. On the other hand, the serialized accept approach cannot scale with the high load either.  In our analysis, a 32-thread system, with 2 listen statements specified, could scale to just 70% utilization, and a 64-thread system, with signal listen statement specified (listen 80, 4 network interfaces), could scale to only 60% utilization.
>  
> Based on those findings, we created a prototype patch for prefork mpm which extends performance and thread utilization. In Linux kernel newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets listen to the same IP:port and automatically round robins connections. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets. Each bucket listens to 1 IP:port. In case of old kernel which does not have the SO_REUSEPORT enabled, we modified the "multiple listen statement case" by creating 1 listen record for each listen statement and partitioning the child processes into different buckets. Each bucket listens to 1 IP:port.
>  
> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with 32-threads system and 2 listen statements (Linux kernel 3.10.4). With the older kernel (Linux Kernel 3.8.8, without SO_REUSEPORT), 10% performance gain was measured. With single listen statement (listen 80) configuration, we observed over 2X performance improvements on modern dual socket Intel platforms (Linux Kernel 3.10.4). We also observed big reduction in response time, in addition to the throughput improvement gained in our tests 1.
>  
> Following the feedback from the bugzilla website where we originally submitted the patch, we removed the dependency of APR change to simplify the patch testing process. Thanks Jeff Trawick for his good suggestion! We are also actively working on extending the patch to worker and event MPMs, as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time test it and let us know how it works in your environment.
>  
> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>  
> Configuration:
> <IfModule prefork.c>
>     ListenBacklog 105384
>     ServerLimit 105000
>     MaxClients 1024
>     MaxRequestsPerChild 0
>     StartServers 64
>     MinSpareServers 8
>     MaxSpareServers 16
> </IfModule>
>  
> 1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>  
> Thanks,
> Yingqi
> 
> 
>  
> --
> Born in Roswell... married an alien...
> http://emptyhammock.com/
> http://edjective.org/
>  
> 
> 
>  
> --
> Born in Roswell... married an alien...
> http://emptyhammock.com/
> http://edjective.org/
>  
> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>