Posted to issues@trafficserver.apache.org by "Leif Hedstrom (JIRA)" <ji...@apache.org> on 2011/02/26 05:33:21 UTC

[jira] Created: (TS-678) Should we reduce MUTEX_RETRY_DELAY ?

Should we reduce MUTEX_RETRY_DELAY ?
------------------------------------

                 Key: TS-678
                 URL: https://issues.apache.org/jira/browse/TS-678
             Project: Traffic Server
          Issue Type: Improvement
            Reporter: Leif Hedstrom
             Fix For: 2.1.6


We have a define

#define MUTEX_RETRY_DELAY HRTIME_MSECONDS(20)


which might be overly long? A suggestion was to set it to 11ms. bcall reports this being an issue with the old code base as well.

Long term (post v3.0) I believe John is considering changing several of these locks in cache (and perhaps other areas) to be small critical sections, and just plain locks (and not try-locks). So it's probably not worthwhile for v3.0 to spend significant time on the existing code (hence the quick and dirty reduction in the delay).
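
For readers unfamiliar with the two locking styles contrasted above, here is a minimal sketch using std::mutex. It is not Traffic Server code: the names run_with_try_lock and run_with_plain_lock are hypothetical, and the sleep is only a stand-in for rescheduling the work after the retry delay, which is what the real code would do instead of blocking a thread.

```cpp
#include <chrono>
#include <mutex>
#include <thread>

// Try-lock style: if the lock is busy, back off for a fixed delay and retry.
// Every miss adds a full `delay` to the latency of the operation.
// Returns the number of failed attempts before the lock was acquired.
template <typename Fn>
int run_with_try_lock(std::mutex &m, std::chrono::milliseconds delay, Fn &&critical_section)
{
  int misses = 0;
  while (!m.try_lock()) {
    ++misses;
    std::this_thread::sleep_for(delay); // stand-in for "reschedule after delay"
  }
  std::lock_guard<std::mutex> guard(m, std::adopt_lock); // unlocks on scope exit
  critical_section();
  return misses;
}

// Plain-lock style: block until the lock is free. This is only reasonable
// when the critical section is small, as the ticket suggests.
template <typename Fn>
void run_with_plain_lock(std::mutex &m, Fn &&critical_section)
{
  std::lock_guard<std::mutex> guard(m);
  critical_section();
}
```

With a 20ms delay, a single miss in the try-lock version costs at least 20ms even if the lock is released a microsecond later; the plain-lock version waits only as long as the holder actually keeps the lock.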

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Updated: (TS-678) Should we reduce MUTEX_RETRY_DELAY ?

Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TS-678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Leif Hedstrom updated TS-678:
-----------------------------

    Attachment: mutex-try.diff

Add a records.config option to control the retry delay for the cache.
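
As a rough illustration of what replacing the compile-time define with a configurable value could look like (this is not the attached patch), the sketch below reads the delay from an already-parsed key/value map. The key name example.cache.mutex_retry_delay_ms and the function load_retry_delay are made up for this example; the ticket does not name the actual option.

```cpp
#include <chrono>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical key name, for illustration only.
const std::string kRetryDelayKey = "example.cache.mutex_retry_delay_ms";

// Default matching the compile-time value quoted in the ticket (20 ms).
constexpr std::chrono::milliseconds kDefaultRetryDelay{20};

// Look up the retry delay in a parsed config, falling back to the default
// when the key is absent, not a number, or negative.
std::chrono::milliseconds
load_retry_delay(const std::map<std::string, std::string> &config)
{
  auto it = config.find(kRetryDelayKey);
  if (it == config.end()) {
    return kDefaultRetryDelay;
  }
  try {
    long ms = std::stol(it->second);
    if (ms < 0) {
      return kDefaultRetryDelay;
    }
    return std::chrono::milliseconds(ms);
  } catch (const std::exception &) {
    // std::stol throws on non-numeric or out-of-range input.
    return kDefaultRetryDelay;
  }
}
```

Keeping 20ms as the fallback means an unconfigured install behaves as before, while testers can try smaller values without rebuilding.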



        

[jira] Commented: (TS-678) Should we reduce MUTEX_RETRY_DELAY ?

Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TS-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000039#comment-13000039 ] 

Leif Hedstrom commented on TS-678:
----------------------------------

Fwiw, I think the congestion is always where we end up calling VC_SCHED_LOCK_RETRY(). That doesn't narrow it down a lot, but might be useful to know.

I think I'm going to commit a change making this configurable, which allows us to do further testing. By mistake, I set the delay to 0ms (which would be equivalent to "schedule_imm", or very close at least), and latency went down 4x (from 8ms to 2ms), with no noticeable difference in CPU usage or throughput (it actually does slightly better on throughput).



        

[jira] Commented: (TS-678) Should we reduce MUTEX_RETRY_DELAY ?

Posted by "Bryan Call (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TS-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12999717#comment-12999717 ] 

Bryan Call commented on TS-678:
-------------------------------

I was talking to Leif tonight about how a group at Yahoo! found a problem with this. Here is the graph with the delay at 20ms:
http://people.apache.org/~bcall/no-store_latency_plot.gif

Here is a graph after making it a 2ms delay:
http://people.apache.org/~bcall/fixed_latency_plot.gif



        

[jira] Assigned: (TS-678) Should we reduce MUTEX_RETRY_DELAY ?

Posted by "Leif Hedstrom (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TS-678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Leif Hedstrom reassigned TS-678:
--------------------------------

    Assignee: Leif Hedstrom

