You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Bill Stoddard <bi...@wstoddard.com> on 2002/07/02 22:29:59 UTC
Apache 2 cycle vs instruction profiling
Profiled serving a 500 byte file out of mod_mem_cache (CacheEnable mem /).
This analysis covers one complete request starting at ap_read_request and
ending with ap_graceful_stop_signalled.
I've hacked Apache 2 in several places...
The most notable hack is that I remove ap_http_header_filter and send the
pre-built headers straight out of the cache. This substantially reduced the
time spent in apr_brigade_puts (and was good for a 4% or so performance
boost). Can get a lot fancier here by making sure the buffer holding the
headers is properly aligned on a page boundary.
I also shortcircuit the age and freshness calculations in mod_cache (to save
gettimeofday calls). I know we have performance problems with apr_time_now
with potential fixes being discussed.
Last hack is to do a call apr_setsocketopt(csd, APR_INCOMPLETE_READ, 1)
immediately after the call to ap_graceful_stop_signaled to cause the next
ap_read_request to go directly to select rather than calling read first. I
think this is a good optimization because we just checked for a pipelined
request (read returned EAGAIN) so no sense in calling read again. Will
submit a patch for this piece later.
Instructions = Cycles/Ratio
I assume a large ratio is probably a bad thing and indicates, among other
things, that pipeline flushes, pagefaults, etc, are occuring... I have a
routine flow which should show transtions to the kernel (not sure if it will
show pagefaults or not) that I have not spent a lot of time on. We spend a
suprising amount of time in module log routines and logging is not enabled.
Bill
Space % Cycles
===== ==== =====
User 19.0 46645
Shared Library 48.6 119222
Kernel 32.4 79405
Total 245272 (4.22 per instruction)
/usr/local/apachetest/bin/httpd :
Subroutine Name Source File Ratio Enter %
Cycles
=============== =========== ===== ===== ====
======
.core_input_filter core.c 3.15 11 1.5 3628
.ap_rgetline_core protocol.c 3.05 6 1.3 3241
.get_filter_handle util_filter.c 4.58 2 1.1 2705
.ap_find_token util.c 4.16 6 0.9 2306
.add_any_filter_handle util_filter.c 4.37 6 0.9 2090
.net_time_filter core.c 2.90 12 0.7 1713
._ptrgl ptrgl.s 4.13 66 0.7 1637
.fix_hostname vhost.c 3.65 1 0.7 1626
.core_output_filter core.c 3.63 2 0.6 1469
.ap_update_child_status_from_indexes scoreboard.c 17.94 2 0.6 1435
.add_any_filter util_filter.c 5.46 1 0.6 1403
.ap_get_brigade util_filter.c 3.32 25 0.4 1079
.ap_set_keepalive http_protocol.c 3.50 2 0.4 1036
._moveeq moveeq.s 4.72 14 0.4 1025
.ap_read_request protocol.c 6.32 1 0.3 834
.ap_getword_white util.c 3.09 2 0.3 784
.ap_get_mime_headers protocol.c 3.32 1 0.3 756
.ap_content_length_filter protocol.c 4.33 1 0.3 736
.ap_pass_brigade util_filter.c 4.82 5 0.2 584
.ap_recent_rfc822_date util_time.c 2.74 1 0.2 520
.log_error_core log.c 4.28 4 0.2 513
.read_request_line protocol.c 4.15 1 0.2 502
.apr_palloc glink.s 3.71 22 0.2 490
.apr_brigade_destroy glink.s 6.51 12 0.2 469
.isspace glink.s 3.45 22 0.2 455
.core_create_req core.c 7.57 1 0.2 439
.apr_table_get glink.s 4.50 16 0.2 432
.ap_run_log_transaction protocol.c 12.05 1 0.2 398
.ap_run_quick_handler config.c 13.96 1 0.2 391
.cached_explode util_time.c 5.61 1 0.1 365
.ap_run_create_request request.c 8.06 1 0.1 363
.ap_log_error log.c 3.82 4 0.1 336
.tolower glink.s 2.93 19 0.1 334
.ap_http_filter http_protocol.c 3.35 2 0.1 332
.ap_add_output_filters_by_type core.c 14.32 1 0.1 329
.apr_brigade_create glink.s 3.48 15 0.1 313
.__divi64 glink.s 12.72 4 0.1 305
.isalpha glink.s 2.64 19 0.1 301
.lookup_builtin_method http_protocol.c 10.91 1 0.1 295
.apr_setsocketopt glink.s 4.39 11 0.1 290
.ap_process_http_connection http_core.c 8.17 0 0.1 286
.writev_it_all core.c 6.07 1 0.1 273
.basic_http_header http_protocol.c 3.09 1 0.1 260
.http_create_request http_core.c 6.19 1 0.1 235
.strlen glink.s 9.76 4 0.1 234
.apr_brigade_split_line glink.s 6.39 6 0.1 230
.apr_table_make glink.s 6.28 6 0.1 226
.form_header_field http_protocol.c 2.73 2 0.1 213
.avoid_xlc_bug core.c 4.79 11 0.1 211
.ap_byterange_filter http_protocol.c 7.26 1 0.1 210
.ap_run_post_read_request protocol.c 5.88 1 0.1 200
.ap_discard_request_body http_protocol.c 2.82 1 0.1 200
.apr_pstrcatv glink.s 29.67 1 0.1 178
.strchr glink.s 7.39 4 0.1 177
.basic_http_header_check http_protocol.c 4.18 1 0.1 176
.check_pipeline_flush http_request.c 3.97 1 0.1 175
.ap_update_vhost_from_headers vhost.c 5.61 1 0.1 174
.memset glink.s 6.95 4 0.1 167
.strncasecmp glink.s 5.44 5 0.1 163
.create_empty_config config.c 7.75 1 0.1 163
.apr_brigade_puts glink.s 6.48 4 0.1 156
.ap_process_request http_request.c 5.68 1 0.1 153
.ap_graceful_stop_signalled worker.c 51.13 1 0.1 153
.apr_brigade_write glink.s 3.18 8 0.1 153
.ap_parse_uri protocol.c 3.40 1 0.1 150
.ap_set_byterange http_protocol.c 4.54 1 0.1 145
.ap_add_output_filter util_filter.c 17.40 1 0.1 139
.apr_pool_destroy glink.s 22.80 1 0.1 137
.ap_basic_http_header http_protocol.c 9.04 1 0.1 136
.apr_table_unset glink.s 10.73 2 0.1 129
.apr_sendv glink.s 20.00 1 0.0 120
.ap_update_child_status scoreboard.c 8.30 2 0.0 116
.ap_index_of_response http_protocol.c 6.09 1 0.0 116
.ap_get_limit_req_body core.c 15.43 1 0.0 108
.apr_array_make glink.s 17.97 1 0.0 108
.ap_get_server_version core.c 21.36 1 0.0 107
.remove_any_filter util_filter.c 3.44 2 0.0 107
.apr_table_mergen glink.s 8.72 2 0.0 105
.isdigit glink.s 8.20 2 0.0 98
.apr_table_overlap glink.s 16.00 1 0.0 96
.apr_pstrdup glink.s 15.73 1 0.0 94
.apr_table_addn glink.s 3.93 4 0.0 94
.ap_get_output_filter_handle util_filter.c 15.53 2 0.0 93
.apr_off_t_toa glink.s 15.53 1 0.0 93
.apr_uri_parse glink.s 15.33 1 0.0 92
.apr_brigade_partition glink.s 3.73 4 0.0 90
.apr_pool_create_ex glink.s 14.60 1 0.0 88
.apr_brigade_split glink.s 2.85 5 0.0 86
.ap_make_method_list http_protocol.c 3.25 1 0.0 81
.apr_parse_addr_port glink.s 13.10 1 0.0 79
.ap_set_content_length protocol.c 3.73 1 0.0 75
.ap_remove_output_filter util_filter.c 3.42 2 0.0 68
.ap_explode_recent_gmt util_time.c 17.05 1 0.0 68
.apr_psprintf glink.s 5.57 2 0.0 67
.ap_set_content_type http_protocol.c 2.91 1 0.0 61
.apr_bucket_flush_create glink.s 9.67 1 0.0 58
.apr_brigade_cleanup glink.s 9.17 1 0.0 55
.ap_create_request_config config.c 54.20 1 0.0 54
.ap_add_output_filter_handle util_filter.c 2.83 3 0.0 51
.ap_method_number_of http_protocol.c 2.40 1 0.0 48
.ap_finalize_request_protocol protocol.c 2.47 1 0.0 47
.apr_table_setn glink.s 2.39 3 0.0 43
.apr_time_now glink.s 6.23 1 0.0 37
.ap_add_input_filter_handle util_filter.c 2.83 2 0.0 34
.apr_pstrmemdup glink.s 5.67 1 0.0 34
.apr_bucket_eos_create glink.s 2.17 1 0.0 13
Shlib Subroutine Source File Ratio Enter %
Cycles
================ =========== ===== ===== ====
======
.memset memset.s 4.99 9 5.5 13580
.apr_table_get apr_tables.c 3.15 21 2.1 5030
.__is_wctype_std libc/__is_wctype_std.c 3.34 68 1.8 4536
.strcasecmp libaixinet/strcasecmp.c 3.47 12 1.8 4458
.apr_palloc apr_pools.c 1.97 104 1.5 3680
.apr_vformatter apr_snprintf.c 3.83 2 1.3 3231
.apr_table_setn apr_tables.c 2.97 13 1.2 2936
._moveeq moveeq.s 3.83 11 1.1 2818
.apr_setsocketopt sockopt.c 3.53 13 1.1 2753
.__divu64 divu64.s 2.75 3 1.1 2675
.memchr libc/memchr.c 3.61 11 1.1 2668
.__divi64 divi64.s 2.71 5 1.0 2509
.strlen strlen.s 3.18 17 0.8 2077
.pthread_mutex_lock libpthreads/mutex.c 10.40 3 0.8 1841
.apr_bucket_alloc apr_buckets_alloc.c 2.72 28 0.7 1684
.apr_brigade_cleanup apr_brigade.c 2.74 17 0.7 1648
.apr_brigade_puts apr_brigade.c 2.32 5 0.6 1585
.isspace libc/isspace.c 2.84 23 0.5 1306
.apr_brigade_split_line apr_brigade.c 2.43 6 0.5 1297
.apr_brigade_write apr_brigade.c 2.22 11 0.5 1247
.apr_brigade_create apr_brigade.c 2.11 21 0.5 1199
.strncasecmp libaixinet/strcasecmp.c 2.74 6 0.5 1177
.cache_url_handler mod_cache.c 5.88 1 0.5 1147
.allocator_alloc apr_pools.c 4.47 5 0.5 1143
._moveeq moveeq.s 4.70 16 0.5 1118
.apr_pool_cleanup_register apr_pools.c 2.17 22 0.4 1099
.apr_bucket_free apr_buckets_alloc.c 2.87 28 0.4 1080
._ptrgl ptrgl.s 2.46 72 0.4 1063
._ptrgl ptrgl.s 2.65 63 0.4 1000
.allocator_free apr_pools.c 4.23 5 0.4 960
.tolower libc/tolower.c 2.87 19 0.4 926
.apr_bucket_simple_copy apr_buckets_simple.c 3.39 14 0.4 902
.multi_log_transaction mod_log_config.c 28.55 1 0.4 885
.make_array_core apr_tables.c 2.40 12 0.3 790
.apr_pool_cleanup_kill apr_pools.c 4.05 12 0.3 786
.unserialize_table mod_mem_cache.c 5.17 4 0.3 781
.strchr strchr.s 2.90 5 0.3 778
.apr_array_push_noclear apr_tables.c 1.69 20 0.3 776
.isupper libc/isupper.c 1.94 19 0.3 739
.isalpha libc/isalpha.c 1.92 19 0.3 730
.apr_recv sendrecv.c 6.67 2 0.3 693
.heap_bucket_destroy apr_buckets_heap.c 2.49 17 0.3 644
.apr_uri_parse apr_uri.c 5.22 1 0.3 627
.match_headers mod_setenvif.c 16.49 1 0.3 627
.apr_table_mergen apr_tables.c 3.11 2 0.3 621
.find_entry cache_hash.c 4.42 1 0.3 619
.apr_table_overlap apr_tables.c 2.98 1 0.2 611
.pthread_mutex_unlock libpthreads/mutex.c 5.02 3 0.2 603
.ap_cache_get_cachetype cache_util.c 10.48 1 0.2 597
.overlap_hash apr_tables.c 3.50 4 0.2 588
.apr_palloc glink.s 2.62 37 0.2 583
.apr_bucket_simple_split apr_buckets_simple.c 1.82 10 0.2 564
.apr_table_unset apr_tables.c 2.72 2 0.2 564
.read libc/read.c 7.89 2 0.2 552
.spin_lock_global_ppc_up locks_ppc_up.s 16.71 3 0.2 551
.apr_brigade_destroy apr_brigade.c 2.68 12 0.2 547
.apr_bucket_alloc glink.s 3.21 28 0.2 539
.heap_bucket_read apr_buckets_heap.c 1.95 30 0.2 528
.open_entity mod_mem_cache.c 6.32 1 0.2 512
.apr_brigade_partition apr_brigade.c 3.31 4 0.2 476
.apr_table_make apr_tables.c 2.50 10 0.2 475
.islower libc/islower.c 5.89 4 0.2 471
.run_cleanups apr_pools.c 9.58 1 0.2 460
.apr_pool_cleanup_kill glink.s 6.06 12 0.2 437
.socket_bucket_read apr_buckets_socket.c 3.62 2 0.2 428
.pthread_mutex_lock glink.s 23.62 3 0.2 425
.cache_out_filter mod_cache.c 4.35 1 0.2 387
.apr_palloc glink.s 3.07 21 0.2 387
.apr_sendv sendrecv.c 8.37 1 0.2 385
.apr_pvsprintf apr_pools.c 3.91 2 0.2 383
.apr_pstrcat apr_strings.c 3.15 1 0.2 378
.conv_10 apr_snprintf.c 2.64 4 0.2 370
.apr_bucket_heap_make apr_buckets_heap.c 3.81 3 0.1 365
.apr_pool_cleanup_register glink.s 2.88 21 0.1 362
.apr_bucket_shared_split apr_buckets_refcount.c 2.00 10 0.1 360
.apr_wait_for_io_or_timeout sendrecv.c 6.40 1 0.1 352
.apr_bucket_shared_destroy glink.s 3.45 17 0.1 352
.cache_select_url cache_storage.c 4.76 1 0.1 343
.strcasecmp glink.s 5.13 11 0.1 339
.apr_table_addn apr_tables.c 2.14 4 0.1 333
.apr_pstrdup apr_strings.c 2.84 5 0.1 330
.apr_parse_addr_port sockaddr.c 5.30 1 0.1 323
.apr_bucket_shared_destroy apr_buckets_refcount.c 3.07 17 0.1 313
.kread glink.s 26.03 2 0.1 312
.pthread_mutex_unlock glink.s 16.82 3 0.1 303
.isdigit libc/isdigit.c 5.03 3 0.1 302
.apr_brigade_split apr_brigade.c 2.12 5 0.1 292
.apr_allocator_alloc glink.s 12.09 4 0.1 290
.spin_unlock_global_ppc_up locks_ppc_up.s 6.94 3 0.1 271
.apr_thread_mutex_lock thread_mutex.c 5.60 3 0.1 269
.decrement_refcount mod_mem_cache.c 10.18 1 0.1 265
.apr_pool_destroy apr_pools.c 4.97 1 0.1 263
.writev libc/write.c 13.13 1 0.1 263
._ptrgl ptrgl.s 10.81 4 0.1 259
.strlen glink.s 5.28 8 0.1 254
.read_headers mod_mem_cache.c 3.96 1 0.1 253
.apr_bucket_simple_split glink.s 4.07 10 0.1 244
.apr_pstrcatv apr_strings.c 2.88 1 0.1 242
.read_real_time read_real_time.s 8.61 1 0.1 241
.__pthread_geterrno_addr libpthreads/lib_lock.c 4.91 4 0.1 236
.apr_pool_create_ex apr_pools.c 3.50 1 0.1 235
._Errno libc/errno.c 3.99 3 0.1 228
.apr_bucket_shared_make glink.s 12.47 3 0.1 224
.read_body mod_mem_cache.c 5.32 1 0.1 223
.apr_atomic_dec apr_atomic.c 7.10 1 0.1 220
.apr_pstrmemdup glink.s 36.37 1 0.1 218
._ptrgl ptrgl.s 8.62 4 0.1 207
.memchr glink.s 3.18 10 0.1 191
._ptrgl ptrgl.s 6.33 5 0.1 190
.memcache_gdsf_algorithm mod_mem_cache.c 6.68 2 0.1 187
.apr_off_t_toa apr_strings.c 2.84 1 0.1 182
.apr_atomic_inc apr_atomic.c 6.14 1 0.1 178
.apr_pstrndup apr_strings.c 5.72 1 0.1 177
.cache_run_open_entity cache_storage.c 5.50 1 0.1 176
.ap_cache_liststr cache_util.c 4.14 2 0.1 174
.read glink.s 14.18 2 0.1 170
.apr_allocator_free glink.s 7.08 4 0.1 170
.gettimeofday libc/gettimeofday.c 3.96 1 0.1 166
.ap_cache_tokstr cache_util.c 3.66 1 0.1 165
.memcmp memcmp.s 4.41 1 0.1 159
.apr_thread_mutex_unlock thread_mutex.c 3.24 3 0.1 156
.apr_pstrmemdup apr_strings.c 3.33 2 0.1 153
.apr_table_setn glink.s 2.50 10 0.1 150
.memset glink.s 12.15 2 0.1 146
.apr_recv glink.s 11.87 2 0.1 142
.islower glink.s 5.60 4 0.1 134
.apr_thread_mutex_lock glink.s 11.12 2 0.1 133
.apr_time_now time.c 5.29 1 0.1 132
.ap_set_content_type glink.s 21.80 1 0.1 131
.__divi64 glink.s 21.70 1 0.1 130
.apr_vformatter glink.s 10.83 2 0.1 130
.strlen glink.s 4.33 5 0.1 130
.cache_hash_get glink.s 21.33 1 0.1 128
.memcmp glink.s 21.30 1 0.1 128
._Errno glink.s 7.04 3 0.1 127
.ap_cache_get_cachetype glink.s 20.77 1 0.1 125
.ap_cache_tokstr glink.s 20.60 1 0.1 124
.apr_table_make glink.s 5.09 4 0.0 122
.apr_bucket_heap_create glink.s 10.13 2 0.0 122
.ap_cache_liststr glink.s 10.08 2 0.0 121
.cache_find glink.s 18.63 1 0.0 112
.cache_update glink.s 18.63 1 0.0 112
.cache_pq_change_priority cache_pqueue.c 5.30 1 0.0 111
.apr_bucket_shared_copy apr_buckets_refcount.c 1.69 4 0.0 108
.apr_atomic_dec glink.s 17.50 1 0.0 105
.kwritev glink.s 17.13 1 0.0 103
.gettimeofday glink.s 17.10 1 0.0 103
.apr_bucket_eos_create apr_buckets_eos.c 2.55 2 0.0 102
.apr_bucket_socket_create apr_buckets_socket.c 4.82 1 0.0 101
.cache_find cache_cache.c 7.71 1 0.0 100
.bzero glink.s 16.67 1 0.0 100
.ap_log_error glink.s 4.16 4 0.0 100
.free_proc_chain apr_pools.c 6.97 1 0.0 98
.apr_table_get glink.s 3.25 5 0.0 98
.cache_read_entity_body cache_storage.c 9.74 1 0.0 97
.writev glink.s 16.00 1 0.0 96
.apr_bucket_heap_create apr_buckets_heap.c 1.79 2 0.0 93
.time_base_to_time libc/POWER/time_base_to_time.c 1.66 1 0.0 93
.apr_pstrdup glink.s 5.16 3 0.0 93
.memset glink.s 15.47 1 0.0 93
.ap_pass_brigade glink.s 7.70 2 0.0 92
.select glink.s 15.30 1 0.0 92
.apr_bucket_shared_make apr_buckets_refcount.c 5.08 3 0.0 91
.apr_thread_mutex_lock glink.s 15.13 1 0.0 91
.apr_bucket_immortal_create apr_buckets_simple.c 3.54 1 0.0 85
.isdigit glink.s 13.77 1 0.0 83
.ap_add_output_filter glink.s 13.53 1 0.0 81
.minchild cache_pqueue.c 4.48 1 0.0 76
.apr_bucket_simple_copy glink.s 3.08 4 0.0 74
.cache_generate_key_default cache_storage.c 4.29 1 0.0 73
.cache_read_entity_body glink.s 11.87 1 0.0 71
.apr_bucket_free glink.s 2.93 4 0.0 70
.apr_bucket_eos_make apr_buckets_eos.c 4.80 2 0.0 67
.apr_array_make apr_tables.c 3.36 1 0.0 67
.cache_pq_percolate_down cache_pqueue.c 3.94 1 0.0 67
.cache_update cache_cache.c 2.16 1 0.0 67
.apr_psprintf apr_pools.c 1.83 2 0.0 66
.apr_brigade_create glink.s 10.97 1 0.0 66
.apr_thread_mutex_unlock glink.s 5.48 2 0.0 66
.ap_set_keepalive glink.s 5.45 2 0.0 65
.apr_bucket_eos_create glink.s 10.77 1 0.0 65
.apr_allocator_mutex_get apr_pools.c 15.90 2 0.0 64
.strncasecmp glink.s 10.47 1 0.0 63
.memchr glink.s 10.17 1 0.0 61
.brigade_cleanup apr_brigade.c 15.00 4 0.0 60
.apr_bucket_flush_create apr_buckets_flush.c 2.86 1 0.0 57
.cache_read_entity_headers cache_storage.c 2.60 1 0.0 57
.apr_bucket_immortal_create glink.s 9.23 1 0.0 55
.apr_getsocketopt glink.s 9.10 1 0.0 55
.apr_atomic_inc glink.s 9.00 1 0.0 54
.simple_bucket_read apr_buckets_simple.c 3.26 2 0.0 52
.apr_pstrcat glink.s 8.33 1 0.0 50
.cache_hash_get cache_hash.c 3.54 1 0.0 50
.ap_remove_output_filter glink.s 8.10 1 0.0 49
.memcache_get_pos mod_mem_cache.c 16.00 1 0.0 48
.apr_palloc glink.s 3.93 2 0.0 47
.ap_get_output_filter_handle glink.s 3.90 2 0.0
47
.apr_brigade_write glink.s 2.39 3 0.0 43
.strcasecmp glink.s 7.07 1 0.0 42
.apr_bucket_immortal_make apr_buckets_simple.c 5.86 1 0.0 41
.bzero memset.s 13.27 1 0.0 40
.apr_bucket_heap_make glink.s 6.50 1 0.0 39
.ap_basic_http_header glink.s 6.17 1 0.0 37
.apr_bucket_socket_make apr_buckets_socket.c 5.23 1 0.0 37
.apr_pstrndup glink.s 5.63 1 0.0 34
.strchr glink.s 5.23 1 0.0 31
.apr_bucket_destroy_noop apr_buckets.c 7.00 4 0.0 28
.apr_allocator_free apr_pools.c 7.00 4 0.0 28
.apr_allocator_alloc apr_pools.c 7.00 4 0.0 28
.apr_getsocketopt sockopt.c 4.47 1 0.0 27
.apr_setsocketopt glink.s 2.17 2 0.0 26
.isspace glink.s 4.07 1 0.0 24
.apr_allocator_owner_get apr_pools.c 5.00 2 0.0 20
.memcache_inc_frequency mod_mem_cache.c 2.14 1 0.0 15
.apr_bucket_flush_make apr_buckets_flush.c 2.00 1 0.0 14
.apr_brigade_puts glink.s 2.17 1 0.0 13
.memset glink.s 2.17 1 0.0 13
.cache_select_url glink.s 2.17 1 0.0 13
.apr_pstrdup glink.s 2.17 1 0.0 13
.cache_pq_change_priority glink.s 2.17 1 0.0 13
.apr_pool_cleanup_register glink.s 2.17 1 0.0 13
.apr_thread_mutex_unlock glink.s 2.17 1 0.0 13
.eos_bucket_read apr_buckets_eos.c 2.75 1 0.0 11
Re: Apache 2 cycle vs instruction profiling
Posted by Justin Erenkrantz <je...@apache.org>.
On Tue, Jul 02, 2002 at 03:22:51PM -0700, Ryan Morgan wrote:
>
> I sent a patch to the list a while back for that one:
>
> http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=101445378225873&w=2
All I can say is that when you don't hear about a patch, please
resubmit until you get feedback. -- justin
Re: Apache 2 cycle vs instruction profiling
Posted by Ryan Morgan <rm...@covalent.net>.
I sent a patch to the list a while back for that one:
http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=101445378225873&w=2
On Tue, Jul 02, 2002 at 02:33:38PM -0700, Justin Erenkrantz wrote:
> On Tue, Jul 02, 2002 at 02:13:23PM -0700, Brian Pane wrote:
> > We may be able to replace some more ap_add_*_filter() calls with
> > ap_add_*_filter_handle() to reduce the cost of the add_any_filter()
> > trie lookup.
>
> CORE_IN and CORE are added via ap_add_*_filter() (server/core.c:4063).
>
> I'll bet that's why this function is so high. -- justin
>
Re: Apache 2 cycle vs instruction profiling
Posted by Justin Erenkrantz <je...@apache.org>.
On Tue, Jul 02, 2002 at 02:13:23PM -0700, Brian Pane wrote:
> We may be able to replace some more ap_add_*_filter() calls with
> ap_add_*_filter_handle() to reduce the cost of the add_any_filter()
> trie lookup.
CORE_IN and CORE are added via ap_add_*_filter() (server/core.c:4063).
I'll bet that's why this function is so high. -- justin
RE: Apache 2 cycle vs instruction profiling
Posted by Bill Stoddard <bi...@wstoddard.com>.
> Bill Stoddard wrote:
>
> >Subroutine Name Source File Ratio Enter %
> >Cycles
> >=============== =========== ===== ===== ====
> >======
> >.core_input_filter core.c 3.15 11
> 1.5 3628
> >.ap_rgetline_core protocol.c 3.05 6
> 1.3 3241
> >
>
> Are these numbers from the latest version of ap_rgetline_core(), with
> the optimization to re-use the same temp brigade for the lookahead read?
> If so, I'll proceed with the next wave of fixes on that function (using
> a persistent brigade) to try to optimize it out of the top 5.
These results are from 1.107 and the optimization went into 1.108. I'll
rerun the profile with HEAD and post the results.
>
> >.get_filter_handle util_filter.c 4.58 2
> 1.1 2705
> >.ap_find_token util.c 4.16 6
> 0.9 2306
> >
>
> Wow, I've never seen ap_find_token() rank so high in a profile before.
> Does your profiling tool show any information on where it's
> getting called?
ap_set_keepalive calls ap_find_token three times. Ooops, I see I am calling
ap_set_keepalive twice (once to many) in mod_cache. I'll correct this.
>
> >.fix_hostname vhost.c 3.65 1
> 0.7 1626
> >
>
> fix_hostname() is spending all its time in string processing and
> string-to-int
> conversions. There's probably an optimization opportunity there.
>
> >.ap_update_child_status_from_indexes scoreboard.c 17.94 2
> 0.6 1435
> >
>
> Is the high ratio here just a result of the cost of writing to the
> scoreboard
> in shared memory?
I suspect so.
Bill
Re: Apache 2 cycle vs instruction profiling
Posted by Brian Pane <br...@apache.org>.
Bill Stoddard wrote:
>Subroutine Name Source File Ratio Enter %
>Cycles
>=============== =========== ===== ===== ====
>======
>.core_input_filter core.c 3.15 11 1.5 3628
>.ap_rgetline_core protocol.c 3.05 6 1.3 3241
>
Are these numbers from the latest version of ap_rgetline_core(), with
the optimization to re-use the same temp brigade for the lookahead read?
If so, I'll proceed with the next wave of fixes on that function (using
a persistent brigade) to try to optimize it out of the top 5.
>.get_filter_handle util_filter.c 4.58 2 1.1 2705
>.ap_find_token util.c 4.16 6 0.9 2306
>
Wow, I've never seen ap_find_token() rank so high in a profile before.
Does your profiling tool show any information on where it's getting called?
>.fix_hostname vhost.c 3.65 1 0.7 1626
>
fix_hostname() is spending all its time in string processing and
string-to-int
conversions. There's probably an optimization opportunity there.
>.ap_update_child_status_from_indexes scoreboard.c 17.94 2 0.6 1435
>
Is the high ratio here just a result of the cost of writing to the
scoreboard
in shared memory?
>.add_any_filter util_filter.c 5.46 1 0.6 1403
>
We may be able to replace some more ap_add_*_filter() calls with
ap_add_*_filter_handle() to reduce the cost of the add_any_filter()
trie lookup.
--Brian