Posted to dev@trafficserver.apache.org by "Leif Hedstrom (JIRA)" <ji...@apache.org> on 2010/02/10 23:08:27 UTC
[jira] Created: (TS-159) Severe performance degradation with
transaction logging enabled.
Severe performance degradation with transaction logging enabled.
----------------------------------------------------------------
Key: TS-159
URL: https://issues.apache.org/jira/browse/TS-159
Project: Traffic Server
Issue Type: Bug
Components: Logging
Reporter: Leif Hedstrom
Priority: Critical
Fix For: 2.0.0
With logging enabled, e.g.
CONFIG proxy.config.log2.logging_enabled INT 3
CONFIG proxy.config.log2.squid_log_enabled INT 1
CONFIG proxy.config.log2.squid_log_is_ascii INT 0
ATS "trunk" (and the "dev" branch) suffers from serious performance degradation over time. I ran a number of tests, each 30s long, fetching a 500-byte object from cache (ram cache enabled). The tests start off at about 20k QPS on my box, but performance degrades with every iteration (more so in the beginning). Within 30 minutes, performance had dropped to 3,800 QPS, at which point the process segfaulted (I think the trace is meaningless, but I'm attaching it nonetheless).
The CPU is completely pegged at all times, so it's definitely doing less and less work for the same (or more) CPU. I did not observe any memory growth, so I don't think it's a memory leak per se (though it could still be related to starving some resource inside TS).
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (TS-159) Severe performance degradation with
transaction logging enabled.
Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TS-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832235#action_12832235 ]
Leif Hedstrom commented on TS-159:
----------------------------------
One final comment: I don't have hard numbers, but CPU usage does increase as performance goes down.
[jira] Commented: (TS-159) Severe performance degradation with
transaction logging enabled.
Posted by "John Plevyak (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TS-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832291#action_12832291 ]
John Plevyak commented on TS-159:
---------------------------------
jplevyak:svn-trunk [649] % opreport --exclude-dependent --demangle=smart --symbols proxy/traffic_server
CPU: Core 2, speed 1596 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples % symbol name
2766813 62.7434 LogBuffer::checkout_write(unsigned long*, unsigned long)
368837 8.3642 LogObject::_checkout_write(unsigned long*, unsigned long)
295131 6.6927 iObject::operator new(unsigned long)
104961 2.3802 ink_freelist_new
82684 1.8750 ink_freelist_free
49592 1.1246 HttpSM::_instantiate_func(HttpSM*, HttpSM*)
18463 0.4187 mime_hdr_field_find(MIMEHdrImpl*, char const*, int)
16488 0.3739 HttpSM::cleanup()
15614 0.3541 EThread::process_event(Event*, int)
14378 0.3261 read_from_net(NetHandler*, UnixNetVConnection*, EThread*)
13483 0.3058 Cache::open_read(Continuation*, INK_MD5*, HTTPHdr*, CacheLookupHttpConfig*, CacheFragType, char*, int)
13029 0.2955 PriorityEventQueue::check_ready(long long, EThread*)
12445 0.2822 write_to_net_io(NetHandler*, UnixNetVConnection*, EThread*)
12029 0.2728 NetHandler::mainNetEvent(int, Event*)
11777 0.2671 EThread::execute()
10563 0.2395 CacheVC::openReadStartHead(int, Event*)
10453 0.2370 HdrHeap::allocate_obj(int, int)
9189 0.2084 MIOBuffer::write_avail()
8920 0.2023 HdrHeap::allocate_str(int)
[jira] Commented: (TS-159) Severe performance degradation with
transaction logging enabled.
Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TS-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832231#action_12832231 ]
Leif Hedstrom commented on TS-159:
----------------------------------
./bin/traffic_server(RemapProcessor::perform_remap(Continuation*, HttpTransact::State*)+0xc4)[0x81a83d4]
[0xb2309e08]
./bin/traffic_server(HttpSM::call_transact_and_set_next_state(void (*)(HttpTransact::State*))+0xae)[0x81553de]
./bin/traffic_server(HttpSM::handle_api_return()+0xa8)[0x816b4f8]
./bin/traffic_server(HttpSM::set_next_state()+0x87f)[0x816aa3f]
./bin/traffic_server(HttpSM::call_transact_and_set_next_state(void (*)(HttpTransact::State*))+0xae)[0x81553de]
./bin/traffic_server(HttpSM::state_read_client_request_header(int, void*)+0x65b)[0x815c89b]
./bin/traffic_server(HttpSM::setup_client_read_request_header()+0x5f)[0x815ca1f]
./bin/traffic_server(HttpSM::state_add_to_list(int, void*)+0x19e)[0x8161cee]
./bin/traffic_server(HttpClientSession::new_transaction()+0x181)[0x81461c1]
./bin/traffic_server(HttpClientSession::state_keep_alive(int, void*)+0x97)[0x8145fb7]
./bin/traffic_server[0x82c5aa3]
./bin/traffic_server(NetHandler::mainNetEvent(int, Event*)+0x2b1)[0x82bde31]
./bin/traffic_server(EThread::process_event(Event*, int)+0xa1)[0x82e60f1]
./bin/traffic_server(EThread::execute()+0x3b9)[0x82e6869]
[jira] Updated: (TS-159) Severe performance degradation with
transaction logging enabled.
Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TS-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Leif Hedstrom updated TS-159:
-----------------------------
Fix Version/s: (was: 2.0.0)
2.0.0a
[jira] Updated: (TS-159) Severe performance degradation with
transaction logging enabled.
Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TS-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Leif Hedstrom updated TS-159:
-----------------------------
Attachment: TS-159.diff
This diff fixes the problem; ignore the "indentation" fixes (indent did a crappy job on this file, and I fixed the worst of it). The issue is the initializer list of the ctor: I think it gets run when iObject is subclassed, which is wrong because the size is already properly set in ::operator new.
I'm not a C++ expert in any way, so maybe jplevyak can correct me if this is wrong. It does seem to solve the problem, though.
[jira] Commented: (TS-159) Severe performance degradation with
transaction logging enabled.
Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TS-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832379#action_12832379 ]
Leif Hedstrom commented on TS-159:
----------------------------------
Just to clarify, these are the offending lines that really matter:
- iObject():class_size(0), next_object(NULL)
- { /* nop */
- }
The fix is to remove the initialization part.
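The effect of those lines can be sketched in a few lines of C++. This is a hypothetical reconstruction, not the actual ATS code: the class and member names (iObjectSketch, class_size, next_object) are modeled on the snippet above, and the assumption is that a class-specific operator new records the allocation size in the object header, much as the thread describes. Because a constructor's initializer list runs after operator new returns, the `class_size(0)` initializer clobbers the size for every subclass allocation:

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hedged sketch of the TS-159 bug pattern (names are illustrative).
struct iObjectSketch {
  size_t class_size;
  iObjectSketch *next_object;

  // operator new runs first and stashes the true allocation size in
  // the header, as the size-keyed allocator in the thread is said to do.
  static void *operator new(size_t size) {
    void *p = ::operator new(size);
    static_cast<iObjectSketch *>(p)->class_size = size;
    return p;
  }

  // The offending initializer list from the diff: it runs *after*
  // operator new and resets class_size to 0. The fix removes it.
  iObjectSketch() : class_size(0), next_object(nullptr) { /* nop */ }
};

struct LoggingObjectSketch : iObjectSketch {
  char payload[64];  // stand-in for real subclass members
};
```

With the initializer list present, `new LoggingObjectSketch` leaves class_size at 0 instead of sizeof(LoggingObjectSketch), so any downstream logic keyed on the recorded size (e.g. picking a free list) misbehaves; dropping the initializers preserves the value written in operator new.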
[jira] Commented: (TS-159) Severe performance degradation with
transaction logging enabled.
Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TS-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832232#action_12832232 ]
Leif Hedstrom commented on TS-159:
----------------------------------
I should also say, disabling logging avoids this problem entirely.
[jira] Resolved: (TS-159) Severe performance degradation with
transaction logging enabled.
Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TS-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Leif Hedstrom resolved TS-159.
------------------------------
Resolution: Fixed
[jira] Assigned: (TS-159) Severe performance degradation with
transaction logging enabled.
Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TS-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Leif Hedstrom reassigned TS-159:
--------------------------------
Assignee: Leif Hedstrom
[jira] Commented: (TS-159) Severe performance degradation with
transaction logging enabled.
Posted by "John Plevyak (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TS-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832611#action_12832611 ]
John Plevyak commented on TS-159:
---------------------------------
Reviewed and tested the critical offending lines; the rest is just indenting, which should be fixed as well.
Check it in.