
Posted to issues@trafficserver.apache.org by "kuotai (JIRA)" <ji...@apache.org> on 2012/08/14 05:01:38 UTC

[jira] [Created] (TS-1405) apply time-wheel scheduler about event system

kuotai created TS-1405:
--------------------------

             Summary: apply time-wheel scheduler  about event system
                 Key: TS-1405
                 URL: https://issues.apache.org/jira/browse/TS-1405
             Project: Traffic Server
          Issue Type: Improvement
          Components: Core
    Affects Versions: 3.2.0
            Reporter: kuotai
            Assignee: kuotai
             Fix For: 3.3.0


As more and more events accumulate in the event system scheduler, its performance gets worse. This is the reason why we use inactivecop to handle keepalive. The new scheduler is a time wheel, which has better time complexity (O(1)).
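
For illustration, a minimal single-level time wheel might look like the sketch below. It is not the attached patch: the class name `TimerWheel`, the slot count and the 5 ms tick are assumptions made here. Scheduling and cancelling are O(1), and each tick touches only one slot.

```python
class TimerWheel:
    """Hypothetical single-level time wheel (illustration only, not ATS code)."""

    def __init__(self, num_slots=8, tick_ms=5):
        self.num_slots = num_slots
        self.tick_ms = tick_ms
        # One dict per slot: event_id -> [remaining_rounds, callback]
        self.slots = [{} for _ in range(num_slots)]
        self.slot_of = {}   # event_id -> slot index, for O(1) cancel
        self.current = 0    # slot the cursor points at
        self.next_id = 0

    def schedule(self, delay_ms, callback):
        """Insert in O(1): the target slot is computed from the delay."""
        ticks = max(1, -(-delay_ms // self.tick_ms))  # ceiling, at least 1 tick
        slot = (self.current + ticks) % self.num_slots
        rounds = (ticks - 1) // self.num_slots        # full turns still to wait
        event_id = self.next_id
        self.next_id += 1
        self.slots[slot][event_id] = [rounds, callback]
        self.slot_of[event_id] = slot
        return event_id

    def cancel(self, event_id):
        """Cancel in O(1): the entry is removed at once, no later scan needed."""
        slot = self.slot_of.pop(event_id, None)
        if slot is None:
            return False
        del self.slots[slot][event_id]
        return True

    def tick(self):
        """Advance one tick and fire whatever is due in the current slot."""
        self.current = (self.current + 1) % self.num_slots
        bucket = self.slots[self.current]
        due = []
        for event_id, entry in list(bucket.items()):
            if entry[0] == 0:
                del bucket[event_id]
                del self.slot_of[event_id]
                due.append(entry[1])
            else:
                entry[0] -= 1   # not due yet: wait one more turn of the wheel
        for callback in due:
            callback()
        return len(due)
```

With `num_slots=4` and a 5 ms tick, an event scheduled 10 ms out fires on the second call to `tick()`, and one scheduled 30 ms out waits one extra turn of the wheel and fires on the sixth. The sketch is single-threaded, so it does not model cancellation from another thread.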

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (TS-1405) apply time-wheel scheduler about event system

Posted by "weijin (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448862#comment-13448862 ] 

weijin commented on TS-1405:
----------------------------

If an event is cancelled after it has been dequeued from the protectQueue but before it is inserted into the PriorityEventQueue, how can it be freed as soon as possible?

Should we check the cancel flag before putting it into the PriorityEventQueue?
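
One way to picture the check being asked about is sketched below. It is a single-threaded illustration, not the patch: `Event`, `drain_handoff`, the `deque` standing in for the handoff queue and the heap standing in for the PriorityEventQueue are all hypothetical.

```python
import heapq
from collections import deque


class Event:
    """Hypothetical stand-in for a scheduled event."""

    def __init__(self, due_ms, name):
        self.due_ms = due_ms
        self.name = name
        self.cancelled = False


def drain_handoff(handoff, timed_heap, freed):
    """Move events from the handoff queue into the timed heap.

    Events already marked cancelled are dropped here instead of being
    inserted, so they can be released without waiting for their timeout.
    """
    while handoff:
        ev = handoff.popleft()
        if ev.cancelled:
            freed.append(ev)  # stand-in for returning the event to its allocator
            continue
        # id(ev) breaks ties so that Event objects are never compared directly
        heapq.heappush(timed_heap, (ev.due_ms, id(ev), ev))


handoff = deque([Event(10, "a"), Event(20, "b")])
handoff[0].cancelled = True
timed_heap, freed = [], []
drain_handoff(handoff, timed_heap, freed)
print([ev.name for ev in freed], [item[2].name for item in timed_heap])
```

A check like this only narrows the window: a cancel that lands after the check still has to be caught when the event comes due.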
                


[jira] [Updated] (TS-1405) apply time-wheel scheduler about event system

Posted by "kuotai (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kuotai updated TS-1405:
-----------------------

    Attachment:     (was: linux_time_wheel.patch)
    


[jira] [Commented] (TS-1405) apply time-wheel scheduler about event system

Posted by "John Plevyak (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443583#comment-13443583 ] 

John Plevyak commented on TS-1405:
----------------------------------

Sorry, the numbers for 30 seconds should be 30/5 + ~17 (every time a power-of-2 bucket is touched, 1/2 of the elements will be moved out, and 1/2 of those will be moved down 2 levels, etc.) = 27 vs. 7 for the time wheel.

So the time wheel, in the case of short expired timeouts, can be several times more efficient.
                


[jira] [Updated] (TS-1405) apply time-wheel scheduler about event system

Posted by "kuotai (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kuotai updated TS-1405:
-----------------------

    Attachment: linux_time_wheel.patch
    


[jira] [Updated] (TS-1405) apply time-wheel scheduler about event system

Posted by "kuotai (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kuotai updated TS-1405:
-----------------------

    Attachment: time_wheel_v2.patch
    


[jira] [Updated] (TS-1405) apply time-wheel scheduler about event system

Posted by "kuotai (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kuotai updated TS-1405:
-----------------------

    Attachment: time-wheel.patch
    


[jira] [Commented] (TS-1405) apply time-wheel scheduler about event system

Posted by "kuotai (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447456#comment-13447456 ] 

kuotai commented on TS-1405:
----------------------------

{code}
ab -n 500000 -c 1000 -k -H "Host: ts.cn" http://115.238.23.222:8080/1024/1.bmp

orig:
[root@test58 ~]# ab -n 500000 -c 1000 -k -H "Host: ts.cn" http://115.238.23.222:8080/1024/1.bmp
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 115.238.23.222 (be patient)
Completed 50000 requests
Completed 100000 requests
Completed 150000 requests
Completed 200000 requests
Completed 250000 requests
Completed 300000 requests
Completed 350000 requests
Completed 400000 requests
Completed 450000 requests
Completed 500000 requests
Finished 500000 requests


Server Software:        ATS/3.2.0
Server Hostname:        115.238.23.222
Server Port:            8080

Document Path:          /1024/1.bmp
Document Length:        1024 bytes

Concurrency Level:      1000
Time taken for tests:   41.269 seconds
Complete requests:      500000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    500000
Total transferred:      691506050 bytes
HTML transferred:       512001024 bytes
Requests per second:    12115.56 [#/sec] (mean)
Time per request:       82.538 [ms] (mean)
Time per request:       0.083 [ms] (mean, across all concurrent requests)
Transfer rate:          16363.25 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0  30.6      0    3002
Processing:     0   82  83.8     56    3105
Waiting:        0   81  82.7     56    3105
Total:          0   82  89.4     56    3676

Percentage of the requests served within a certain time (ms)
  50%     56
  66%     87
  75%    112
  80%    131
  90%    193
  95%    253
  98%    328
  99%    383
 100%   3676 (longest request)

time_wheel:

[root@test58 ~]# ab -n 500000 -c 1000 -k -H "Host: ts.cn" http://115.238.23.222:8080/1024/1.bmp
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 115.238.23.222 (be patient)
Completed 50000 requests
Completed 100000 requests
Completed 150000 requests
Completed 200000 requests
Completed 250000 requests
Completed 300000 requests
Completed 350000 requests
Completed 400000 requests
Completed 450000 requests
Completed 500000 requests
Finished 500000 requests


Server Software:        ATS/3.2.0
Server Hostname:        115.238.23.222
Server Port:            8080

Document Path:          /1024/1.bmp
Document Length:        1024 bytes

Concurrency Level:      1000
Time taken for tests:   35.423 seconds
Complete requests:      500000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    500000
Total transferred:      691504308 bytes
HTML transferred:       512000000 bytes
Requests per second:    14115.08 [#/sec] (mean)
Time per request:       70.846 [ms] (mean)
Time per request:       0.071 [ms] (mean, across all concurrent requests)
Transfer rate:          19063.74 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0  30.0      0    3002
Processing:     0   70  69.6     51    3033
Waiting:        0   70  68.7     51    3033
Total:          0   71  76.2     51    3346

Percentage of the requests served within a certain time (ms)
  50%     51
  66%     76
  75%     96
  80%    110
  90%    158
  95%    210
  98%    276
  99%    326
 100%   3346 (longest request)

{code}
                


[jira] [Commented] (TS-1405) apply time-wheel scheduler about event system

Posted by "kuotai (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443716#comment-13443716 ] 

kuotai commented on TS-1405:
----------------------------

Thanks for your comments :-) Yes, we will run more tests. In my environment (cluster mode), TS handles 15K+ QPS, with 20W+ events in the scheduler.
                


[jira] [Comment Edited] (TS-1405) apply time-wheel scheduler about event system

Posted by "kuotai (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443716#comment-13443716 ] 

kuotai edited comment on TS-1405 at 8/29/12 12:15 PM:
------------------------------------------------------

Thanks for your comments :-) Yes, we will run more tests. In my environment (cluster mode), TS handles 15K+ QPS, with 70K+ events in the scheduler.
                
      was (Author: kuotai):
    Thanks for your comments :-) Yes, we will run more tests. In my environment (cluster mode), TS handles 15K+ QPS, with 20W+ events in the scheduler.
                  


[jira] [Commented] (TS-1405) apply time-wheel scheduler about event system

Posted by "John Plevyak (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443575#comment-13443575 ] 

John Plevyak commented on TS-1405:
----------------------------------

The current code "should" have a complexity which is bounded by the need to scan the entire queue every 5 seconds.  This is necessary because cancelling an event involves setting the volatile "cancelled" flag, and not scanning the events would result in running out of memory.  Assuming an event is inserted with a 30-second timeout and waits until it runs, it will be touched 30/5 + 10 = 6 + 10 = 16 times.  For a 300-second timeout it will be touched 300/5 + 10 = 60 + 10 = 70 times.

If an event is cancelled (the normal case for timeouts), then it will be touched once (after an average of 2.5 seconds).  So, at least according to the design, the cost of the current design should be only a small constant factor worse than the time wheel and should average slightly more than 1 touch per event, which is the best that can be expected.  Of course, that is the design; if it is causing problems, then likely there is a bug or something about the workload which is causing problems.

The time wheel can bring this down to 1 touch every N seconds, with an expected 1 touch per event, or 6 and 60 for the examples above.

So, I think this is a very reasonable change, assuming that it can deal with the out-of-memory issue, and I am interested in seeing the benchmarks, as I am curious to see how the theory and practice collide.
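
The touch counts in this comment can be reproduced with a few lines of arithmetic. The sketch takes the 5-second scan interval and the "+ 10" term from the comment as given, and assumes N = 5 seconds for the time wheel, since that is the value that yields 6 and 60. The function and constant names are made up for illustration.

```python
SCAN_INTERVAL_S = 5   # full-queue scan period quoted in the comment
EXTRA_TOUCHES = 10    # the "+ 10" term from the comment, taken as given


def touches_current(timeout_s):
    """Touches for an event that runs to expiry under the current design."""
    return timeout_s // SCAN_INTERVAL_S + EXTRA_TOUCHES


def touches_time_wheel(timeout_s, n_seconds=SCAN_INTERVAL_S):
    """Touches under a time wheel visited once every n_seconds (assumed N = 5)."""
    return timeout_s // n_seconds


for timeout_s in (30, 300):
    print(f"{timeout_s}s timeout: current={touches_current(timeout_s)}, "
          f"time wheel={touches_time_wheel(timeout_s)}")
```

For the 30-second and 300-second timeouts this gives 16 and 70 touches for the current design against 6 and 60 for the time wheel, matching the figures quoted.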
                


[jira] [Updated] (TS-1405) apply time-wheel scheduler about event system

Posted by "kuotai (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kuotai updated TS-1405:
-----------------------

    Attachment: linux_time_wheel_v2.patch

Free cancelled events only in the atomic list.
                


[jira] [Updated] (TS-1405) apply time-wheel scheduler about event system

Posted by "kuotai (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kuotai updated TS-1405:
-----------------------

    Attachment:     (was: linux_time_wheel_v3.patch)
    


[jira] [Commented] (TS-1405) apply time-wheel scheduler about event system

Posted by "weijin (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451365#comment-13451365 ] 

weijin commented on TS-1405:
----------------------------

I'm afraid the v2 patch still has a race in event cancellation: when the cancelling thread has set the cancel flag but has not yet set in_the_cancel_queue, the thread that owns the event may run ProtectedQueue::dequeue_timed.


                


[jira] [Commented] (TS-1405) apply time-wheel scheduler about event system

Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443773#comment-13443773 ] 

Leif Hedstrom commented on TS-1405:
-----------------------------------

Hmmm, I need to play with this some more, but with "few" connections (300), the time wheel patch has a very noticeable performance degradation. Just doing a quick test (I will fiddle with it some more), I get:

{code}
http_load  -parallel 100 -seconds 20 -keep_alive 100 /tmp/URL
2644059 fetches on 26310 conns, 300 max parallel, 2.644059E+06 bytes, in 20 seconds
1 mean bytes/fetch
132202.7 fetches/sec, 1.322027E+05 bytes/sec
msecs/connect: 0.156 mean, 1.884 max, 0.048 min
msecs/first-response: 2.156 mean, 82.044 max, 0.076 min

tinkerballa (21:15) 272/0 $ ~/benchit.sh 100 20 100
http_load  -parallel 100 -seconds 20 -keep_alive 100 /tmp/URL
3275553 fetches on 32567 conns, 300 max parallel, 3.275550E+06 bytes, in 20 seconds
1 mean bytes/fetch
163776.5 fetches/sec, 1.637765E+05 bytes/sec
msecs/connect: 0.171 mean, 2.251 max, 0.047 min
msecs/first-response: 1.440 mean, 117.784 max, 0.090 min
{code}

The first is with the time wheel patch, the second is basic trunk (which is still a little slower than I normally would see it, need to look into that too). But both throughput (QPS) and latency are worse with the patch.
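
To put numbers on that comparison, the relative change can be computed from the figures in the output above. `percent_change` is a helper introduced here for illustration, and only the mean first-response time is used as the latency figure.

```python
def percent_change(baseline, candidate):
    """Relative change of candidate against baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


# Figures copied from the two http_load runs quoted above.
trunk = {"fetches_per_sec": 163776.5, "first_response_mean_ms": 1.440}
patched = {"fetches_per_sec": 132202.7, "first_response_mean_ms": 2.156}

for key in trunk:
    print(f"{key}: {percent_change(trunk[key], patched[key]):+.1f}%")
```

By this measure the patched build delivers roughly 19% fewer fetches per second, and its mean first-response time is roughly 50% higher.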
                


[jira] [Updated] (TS-1405) apply time-wheel scheduler about event system

Posted by "kuotai (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kuotai updated TS-1405:
-----------------------

    Attachment: linux_time_wheel_v3.patch

Yes. In the first version, the in_the_cancel_queue flag is set before the cancel flag. It's my mistake. Thanks.
                


[jira] [Commented] (TS-1405) apply time-wheel scheduler about event system

Posted by "kuotai (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447455#comment-13447455 ] 

kuotai commented on TS-1405:
----------------------------

This is because the original scheduler's accuracy is 5 ms (an event due in less than 5 ms is inserted into after[0] and processed in the next loop). So some events can't be processed, and the thread then enters epoll_wait (sleeps). The new patch also changes to 5 ms.
Test of the new patch:
{code}
orig:
[root@test58 ~]# ab -n 500000 -c 50 -k -H "Host: ts.cn" http://115.238.23.222:8080/1024/1.bmp
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 115.238.23.222 (be patient)
Completed 50000 requests
Completed 100000 requests
Completed 150000 requests
Completed 200000 requests
Completed 250000 requests
Completed 300000 requests
Completed 350000 requests
Completed 400000 requests
Completed 450000 requests
Completed 500000 requests
Finished 500000 requests


Server Software:        ATS/3.2.0
Server Hostname:        115.238.23.222
Server Port:            8080

Document Path:          /1024/1.bmp
Document Length:        1024 bytes

Concurrency Level:      50
Time taken for tests:   34.522 seconds
Complete requests:      500000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    500000
Total transferred:      691500000 bytes
HTML transferred:       512000000 bytes
Requests per second:    14483.42 [#/sec] (mean)
Time per request:       3.452 [ms] (mean)
Time per request:       0.069 [ms] (mean, across all concurrent requests)
Transfer rate:          19561.10 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       1
Processing:     0    3  10.8      1     316
Waiting:        0    3  10.8      1     285
Total:          0    3  10.8      1     316

Percentage of the requests served within a certain time (ms)
  50%      1
  66%      1
  75%      1
  80%      1
  90%      3
  95%     20
  98%     41
  99%     52
 100%    316 (longest request)

time_wheel:
[root@test58 ~]# ab -n 500000 -c 50 -k -H "Host: ts.cn" http://115.238.23.222:8080/1024/1.bmp
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 115.238.23.222 (be patient)
Completed 50000 requests
Completed 100000 requests
Completed 150000 requests
Completed 200000 requests
Completed 250000 requests
Completed 300000 requests
Completed 350000 requests
Completed 400000 requests
Completed 450000 requests
Completed 500000 requests
Finished 500000 requests


Server Software:        ATS/3.2.0
Server Hostname:        115.238.23.222
Server Port:            8080

Document Path:          /1024/1.bmp
Document Length:        1024 bytes

Concurrency Level:      50
Time taken for tests:   35.486 seconds
Complete requests:      500000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    500000
Total transferred:      691500000 bytes
HTML transferred:       512000000 bytes
Requests per second:    14090.22 [#/sec] (mean)
Time per request:       3.549 [ms] (mean)
Time per request:       0.071 [ms] (mean, across all concurrent requests)
Transfer rate:          19030.05 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0  29.1      0    3000
Processing:     0    3  10.2      1     263
Waiting:        0    3  10.2      1     263
Total:          0    4  31.4      1    3262

Percentage of the requests served within a certain time (ms)
  50%      1
  66%      1
  75%      1
  80%      1
  90%      2
  95%     20
  98%     40
  99%     51
 100%   3262 (longest request)

{code}
                
> apply time-wheel scheduler  about event system
> ----------------------------------------------
>
>                 Key: TS-1405
>                 URL: https://issues.apache.org/jira/browse/TS-1405
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 3.2.0
>            Reporter: kuotai
>            Assignee: kuotai
>             Fix For: 3.3.1
>
>         Attachments: time-wheel.patch, time_wheel_v2.patch
>
>
> As more and more events accumulate in the event system scheduler, its performance degrades. This is why we use the inactivity cop to handle keep-alive. The new scheduler is a time wheel, which has better time complexity (O(1)).


[jira] [Commented] (TS-1405) apply time-wheel scheduler about event system

Posted by "weijin (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451549#comment-13451549 ] 

weijin commented on TS-1405:
----------------------------

The v3 patch has a more severe race than v2. It can cause the continuation to be called back even though we cancelled the event.
 
                
> apply time-wheel scheduler  about event system
> ----------------------------------------------
>
>                 Key: TS-1405
>                 URL: https://issues.apache.org/jira/browse/TS-1405
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 3.2.0
>            Reporter: kuotai
>            Assignee: kuotai
>             Fix For: 3.3.1
>
>         Attachments: linux_time_wheel.patch, linux_time_wheel_v2.patch, linux_time_wheel_v3.patch
>
>
> As more and more events accumulate in the event system scheduler, its performance degrades. This is why we use the inactivity cop to handle keep-alive. The new scheduler is a time wheel, which has better time complexity (O(1)).


[jira] [Commented] (TS-1405) apply time-wheel scheduler about event system

Posted by "John Plevyak (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449953#comment-13449953 ] 

John Plevyak commented on TS-1405:
----------------------------------

weijin: I don't know that freeing it as soon as possible is as big a goal as race conditions are a problem :)  The current code can take up to 5 seconds to free a cancelled event, so this code is much better in that regard, even if we have to wait for the next time the event loop runs.
                
> apply time-wheel scheduler  about event system
> ----------------------------------------------
>
>                 Key: TS-1405
>                 URL: https://issues.apache.org/jira/browse/TS-1405
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 3.2.0
>            Reporter: kuotai
>            Assignee: kuotai
>             Fix For: 3.3.1
>
>         Attachments: linux_time_wheel.patch
>
>
> As more and more events accumulate in the event system scheduler, its performance degrades. This is why we use the inactivity cop to handle keep-alive. The new scheduler is a time wheel, which has better time complexity (O(1)).


[jira] [Updated] (TS-1405) apply time-wheel scheduler about event system

Posted by "kuotai (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kuotai updated TS-1405:
-----------------------

    Attachment: time_wheel_v4.patch

Changes:
1. The condition for inserting into cancel_queue
2. Moved process_cancel_event to class EThread
3. The handling of cancelled events

This patch has been running in our environment (cluster_type == 1), and it runs well.
                
> apply time-wheel scheduler  about event system
> ----------------------------------------------
>
>                 Key: TS-1405
>                 URL: https://issues.apache.org/jira/browse/TS-1405
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 3.2.0
>            Reporter: kuotai
>            Assignee: kuotai
>             Fix For: 3.3.1
>
>         Attachments: linux_time_wheel.patch, linux_time_wheel_v2.patch, linux_time_wheel_v3.patch, time_wheel_v4.patch
>
>
> As more and more events accumulate in the event system scheduler, its performance degrades. This is why we use the inactivity cop to handle keep-alive. The new scheduler is a time wheel, which has better time complexity (O(1)).


[jira] [Commented] (TS-1405) apply time-wheel scheduler about event system

Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443216#comment-13443216 ] 

Leif Hedstrom commented on TS-1405:
-----------------------------------

I'm wondering, with these improvements (they are improvements, right? :) ), could we get rid of inactivity cop, and enable the old code path which injected inactivity events ? I believe the inactivity cop was added as a response to "performance concerns" with the events, but right now inactivity cop can itself be a serious performance problem.
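
To make the O(1) claim concrete, here is a minimal sketch of a single-level time wheel in Python. It is not the code from the attached patches; the class `TimeWheel` and its methods `schedule`, `cancel`, and `tick` are hypothetical names chosen for illustration. Arming and cancelling a timer are constant-time operations, which is the property that could make per-connection inactivity events cheap. Each tick only visits the bucket for the current slot.

```python
class TimeWheel:
    """Single-level time wheel: one bucket per tick (illustrative only)."""

    def __init__(self, slots=8):
        # Each bucket maps timer id -> (remaining full rotations, callback).
        self.slots = [dict() for _ in range(slots)]
        self.current = 0   # index of the bucket for the current tick
        self.next_id = 0
        self.where = {}    # timer id -> bucket index, so cancel is O(1)

    def schedule(self, delay_ticks, callback):
        """Arm a timer that fires after delay_ticks ticks (>= 1). O(1)."""
        if delay_ticks < 1:
            raise ValueError("delay_ticks must be at least 1")
        n = len(self.slots)
        index = (self.current + delay_ticks) % n
        rounds = (delay_ticks - 1) // n   # full rotations to skip first
        timer_id = self.next_id
        self.next_id += 1
        self.slots[index][timer_id] = (rounds, callback)
        self.where[timer_id] = index
        return timer_id

    def cancel(self, timer_id):
        """Remove a pending timer. O(1). Returns False if it is not pending."""
        index = self.where.pop(timer_id, None)
        if index is None:
            return False
        del self.slots[index][timer_id]
        return True

    def tick(self):
        """Advance one tick and fire the timers that are due in this bucket."""
        self.current = (self.current + 1) % len(self.slots)
        bucket = self.slots[self.current]
        due = []
        for timer_id, (rounds, callback) in list(bucket.items()):
            if rounds == 0:
                del bucket[timer_id]
                del self.where[timer_id]
                due.append(callback)
            else:
                bucket[timer_id] = (rounds - 1, callback)
        for callback in due:
            callback()


fired = []
wheel = TimeWheel(slots=4)
a = wheel.schedule(2, lambda: fired.append("a"))
b = wheel.schedule(6, lambda: fired.append("b"))  # wraps around the wheel once
c = wheel.schedule(3, lambda: fired.append("c"))
wheel.cancel(c)                                   # cancelled before it is due
for _ in range(6):
    wheel.tick()
```

After six ticks only `a` and `b` have fired, in that order; the cancelled timer `c` was removed immediately rather than waiting for its slot to come around.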
                
> apply time-wheel scheduler  about event system
> ----------------------------------------------
>
>                 Key: TS-1405
>                 URL: https://issues.apache.org/jira/browse/TS-1405
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 3.2.0
>            Reporter: kuotai
>            Assignee: kuotai
>             Fix For: 3.3.0
>
>         Attachments: time-wheel.patch
>
>
> As more and more events accumulate in the event system scheduler, its performance degrades. This is why we use the inactivity cop to handle keep-alive. The new scheduler is a time wheel, which has better time complexity (O(1)).


[jira] [Updated] (TS-1405) apply time-wheel scheduler about event system

Posted by "kuotai (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kuotai updated TS-1405:
-----------------------

    Attachment: linux_time_wheel_v3.patch

Yeah, in the first version, ethread->EventQueue.cancel() was called before Action::cancel_action(c). That was my mistake :-(
                
> apply time-wheel scheduler  about event system
> ----------------------------------------------
>
>                 Key: TS-1405
>                 URL: https://issues.apache.org/jira/browse/TS-1405
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 3.2.0
>            Reporter: kuotai
>            Assignee: kuotai
>             Fix For: 3.3.1
>
>         Attachments: linux_time_wheel.patch, linux_time_wheel_v2.patch, linux_time_wheel_v3.patch
>
>
> As more and more events accumulate in the event system scheduler, its performance degrades. This is why we use the inactivity cop to handle keep-alive. The new scheduler is a time wheel, which has better time complexity (O(1)).


[jira] [Commented] (TS-1405) apply time-wheel scheduler about event system

Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443774#comment-13443774 ] 

Leif Hedstrom commented on TS-1405:
-----------------------------------

I should point out that CPU usage is less with the time wheel patch. So, perhaps there's a lock contention or something that triggers now, preventing us from consuming all available CPU ?
                
> apply time-wheel scheduler  about event system
> ----------------------------------------------
>
>                 Key: TS-1405
>                 URL: https://issues.apache.org/jira/browse/TS-1405
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 3.2.0
>            Reporter: kuotai
>            Assignee: kuotai
>             Fix For: 3.3.0
>
>         Attachments: time-wheel.patch
>
>
> As more and more events accumulate in the event system scheduler, its performance degrades. This is why we use the inactivity cop to handle keep-alive. The new scheduler is a time wheel, which has better time complexity (O(1)).


[jira] [Updated] (TS-1405) apply time-wheel scheduler about event system

Posted by "kuotai (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kuotai updated TS-1405:
-----------------------

    Attachment:     (was: time-wheel.patch)
    
> apply time-wheel scheduler  about event system
> ----------------------------------------------
>
>                 Key: TS-1405
>                 URL: https://issues.apache.org/jira/browse/TS-1405
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 3.2.0
>            Reporter: kuotai
>            Assignee: kuotai
>             Fix For: 3.3.1
>
>         Attachments: linux_time_wheel.patch
>
>
> As more and more events accumulate in the event system scheduler, its performance degrades. This is why we use the inactivity cop to handle keep-alive. The new scheduler is a time wheel, which has better time complexity (O(1)).


[jira] [Commented] (TS-1405) apply time-wheel scheduler about event system

Posted by "John Plevyak (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449950#comment-13449950 ] 

John Plevyak commented on TS-1405:
----------------------------------

There is a race between adding to the atomic list in the cancelling thread, getting dequeued in the controlling thread, and setting the cancelled flag in the cancelling thread. One solution is to take the mutex lock in the check_ready code, as the cancelling thread must be holding that lock over the insert into the atomic list and the setting of the cancelled flag. Note, you could set the cancelled flag before adding to the atomic list and then just ignore it in process_thread() (and any other place), counting on it getting freed eventually via the atomic list.
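
The second option (flag first, then publish, and never free from the event-processing path) can be sketched as follows. This is only an illustration of the ordering in Python, not the patch's code: `Event`, `cancel`, `process_event`, and `drain_cancel_queue` are hypothetical names, `queue.SimpleQueue` stands in for the atomic list, and Python does not model the memory-ordering concerns of the real C++ code.

```python
import queue


class Event:
    def __init__(self, callback):
        self.callback = callback
        self.cancelled = False
        self.freed = False   # stands in for returning the event to its allocator


# Stand-in for the atomic list shared between threads (an assumption).
cancel_queue = queue.SimpleQueue()


def cancel(event):
    """Cancelling thread: set the flag first, then publish the event."""
    event.cancelled = True
    cancel_queue.put(event)


def process_event(event):
    """Controlling thread: skip cancelled events, but do not free them here."""
    if event.cancelled:
        return False
    event.callback()
    return True


def drain_cancel_queue():
    """Controlling thread, once per loop: the only place cancelled events are freed."""
    freed = 0
    while True:
        try:
            event = cancel_queue.get_nowait()
        except queue.Empty:
            break
        event.freed = True
        freed += 1
    return freed


calls = []
ev = Event(lambda: calls.append("fired"))
cancel(ev)
ran = process_event(ev)          # the event was already dequeued as ready
freed = drain_cancel_queue()     # freed later, via the cancel queue only
```

Because the flag is set before the event becomes visible on the queue, any thread that sees the event on the queue also sees it as cancelled, and because freeing happens in exactly one place, the event cannot be freed twice.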
                
> apply time-wheel scheduler  about event system
> ----------------------------------------------
>
>                 Key: TS-1405
>                 URL: https://issues.apache.org/jira/browse/TS-1405
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 3.2.0
>            Reporter: kuotai
>            Assignee: kuotai
>             Fix For: 3.3.1
>
>         Attachments: linux_time_wheel.patch
>
>
> As more and more events accumulate in the event system scheduler, its performance degrades. This is why we use the inactivity cop to handle keep-alive. The new scheduler is a time wheel, which has better time complexity (O(1)).


[jira] [Updated] (TS-1405) apply time-wheel scheduler about event system

Posted by "kuotai (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kuotai updated TS-1405:
-----------------------

    Attachment: linux_time_wheel.patch
    
> apply time-wheel scheduler  about event system
> ----------------------------------------------
>
>                 Key: TS-1405
>                 URL: https://issues.apache.org/jira/browse/TS-1405
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 3.2.0
>            Reporter: kuotai
>            Assignee: kuotai
>             Fix For: 3.3.1
>
>         Attachments: linux_time_wheel.patch
>
>
> As more and more events accumulate in the event system scheduler, its performance degrades. This is why we use the inactivity cop to handle keep-alive. The new scheduler is a time wheel, which has better time complexity (O(1)).


[jira] [Commented] (TS-1405) apply time-wheel scheduler about event system

Posted by "kuotai (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13464484#comment-13464484 ] 

kuotai commented on TS-1405:
----------------------------

Our environment: cluster (cluster_type == 1), 10 cache servers:
CPU: Intel(R) Xeon(R) CPU L5630 @ 2.13GHz
RAM: MemTotal: 49416984 kB
Interface: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network
Total throughput: > 8 Gbps
                
> apply time-wheel scheduler  about event system
> ----------------------------------------------
>
>                 Key: TS-1405
>                 URL: https://issues.apache.org/jira/browse/TS-1405
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 3.2.0
>            Reporter: kuotai
>            Assignee: kuotai
>             Fix For: 3.3.1
>
>         Attachments: linux_time_wheel.patch, linux_time_wheel_v2.patch, linux_time_wheel_v3.patch, time_wheel_v4.patch
>
>
> As more and more events accumulate in the event system scheduler, its performance degrades. This is why we use the inactivity cop to handle keep-alive. The new scheduler is a time wheel, which has better time complexity (O(1)).


[jira] [Commented] (TS-1405) apply time-wheel scheduler about event system

Posted by "weijin (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450257#comment-13450257 ] 

weijin commented on TS-1405:
----------------------------

@John, yes, the current code ensures that a cancelled event is freed within 5 seconds, but the patch cannot ensure that: the event may be left in the PriorityEventQueue for a long time.
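
The concern can be illustrated with a generic deadline-ordered queue that cancels lazily. This is a sketch of the general technique in Python, not the actual PriorityEventQueue; `LazyTimerQueue` and its methods are hypothetical. A cancelled entry is only marked, so it keeps occupying memory until its original deadline reaches the front of the queue.

```python
import heapq
import itertools


class LazyTimerQueue:
    """Deadline-ordered queue with lazy cancellation (illustrative only)."""

    def __init__(self):
        self.heap = []
        self.counter = itertools.count()   # tie-breaker for equal deadlines

    def schedule(self, deadline, name):
        # Entry layout: [deadline, sequence number, name, cancelled flag].
        entry = [deadline, next(self.counter), name, False]
        heapq.heappush(self.heap, entry)
        return entry

    def cancel(self, entry):
        # O(1), but the entry stays in the heap until its deadline passes.
        entry[3] = True

    def pop_due(self, now):
        """Remove every entry whose deadline has passed; return the live ones."""
        fired = []
        while self.heap and self.heap[0][0] <= now:
            _, _, name, cancelled = heapq.heappop(self.heap)
            if not cancelled:
                fired.append(name)
        return fired


q = LazyTimerQueue()
short = q.schedule(5, "short")
long_timer = q.schedule(3600, "long")
q.cancel(long_timer)
fired = q.pop_due(now=10)
still_queued = len(q.heap)   # the cancelled entry is still held here
```

In this sketch the cancelled one-hour timer is only released when `pop_due` is called with `now >= 3600`, which is the kind of delay described above.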
                
> apply time-wheel scheduler  about event system
> ----------------------------------------------
>
>                 Key: TS-1405
>                 URL: https://issues.apache.org/jira/browse/TS-1405
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 3.2.0
>            Reporter: kuotai
>            Assignee: kuotai
>             Fix For: 3.3.1
>
>         Attachments: linux_time_wheel.patch
>
>
> As more and more events accumulate in the event system scheduler, its performance degrades. This is why we use the inactivity cop to handle keep-alive. The new scheduler is a time wheel, which has better time complexity (O(1)).


[jira] [Updated] (TS-1405) apply time-wheel scheduler about event system

Posted by "kuotai (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kuotai updated TS-1405:
-----------------------

    Attachment:     (was: time_wheel_v2.patch)
    
> apply time-wheel scheduler  about event system
> ----------------------------------------------
>
>                 Key: TS-1405
>                 URL: https://issues.apache.org/jira/browse/TS-1405
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 3.2.0
>            Reporter: kuotai
>            Assignee: kuotai
>             Fix For: 3.3.1
>
>         Attachments: linux_time_wheel.patch
>
>
> As more and more events accumulate in the event system scheduler, its performance degrades. This is why we use the inactivity cop to handle keep-alive. The new scheduler is a time wheel, which has better time complexity (O(1)).
