Posted to issues@stdcxx.apache.org by "Martin Sebor (JIRA)" <ji...@apache.org> on 2008/02/12 03:09:07 UTC

[jira] Created: (STDCXX-723) [gcc] use __builtin_prefetch to optimize string

[gcc] use __builtin_prefetch to optimize string
-----------------------------------------------

                 Key: STDCXX-723
                 URL: https://issues.apache.org/jira/browse/STDCXX-723
             Project: C++ Standard Library
          Issue Type: Sub-task
          Components: 21. Strings
    Affects Versions: 4.2.0, 4.1.4, 4.1.3, 4.1.2
            Reporter: Martin Sebor
            Priority: Minor
             Fix For: 4.2.1


We might be able to use the gcc {{__builtin_prefetch}} function in {{basic_string}} to give the hardware a hint when a string object's data is about to be accessed (e.g., the reference count, which is stored at a negative offset from the {{basic_string::_C_data}} member pointer). This could improve performance on modern processors that implement prefetching (e.g., IA-64, x86_64, or PowerPC).
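
As a rough illustration of the idea (not the actual stdcxx implementation), the sketch below keeps a reference count in a header placed immediately before the character data and issues a write prefetch for that header before the count is touched. The names {{RepHeader}}, {{rep_create}}, {{rep_header}}, {{rep_share}}, {{rep_release}}, and {{STR_PREFETCH_WRITE}} are all hypothetical, the code is single-threaded only, and whether the hint helps at all would have to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical wrapper: emit the hint only where the gcc builtin exists.
#if defined(__GNUC__)
#  define STR_PREFETCH_WRITE(addr) __builtin_prefetch((addr), 1, 3)
#else
#  define STR_PREFETCH_WRITE(addr) ((void)0)
#endif

// Hypothetical header stored immediately before the character data.
struct RepHeader {
    long        ref_count;
    std::size_t size;
};

// The header lives at a negative offset from the data pointer.
inline RepHeader* rep_header(char* data) {
    assert(data != 0);
    return reinterpret_cast<RepHeader*>(data) - 1;
}

// Allocate header and characters in one block; return the data pointer.
inline char* rep_create(const char* s) {
    const std::size_t n = std::strlen(s);
    void* raw = std::malloc(sizeof(RepHeader) + n + 1);
    if (!raw)
        throw std::bad_alloc();
    RepHeader* h = static_cast<RepHeader*>(raw);
    h->ref_count = 1;
    h->size = n;
    char* data = reinterpret_cast<char*>(h + 1);
    std::memcpy(data, s, n + 1);  // copies the terminating NUL as well
    return data;
}

// Share the representation: hint first, then bump the count.
inline char* rep_share(char* data) {
    RepHeader* h = rep_header(data);
    STR_PREFETCH_WRITE(h);  // hint only; it never changes program behavior
    /* ... in real code, unrelated work would go here so that the
       prefetch has time to complete before the access ... */
    ++h->ref_count;
    return data;
}

// Drop one reference; free the whole block when the count reaches zero.
inline void rep_release(char* data) {
    RepHeader* h = rep_header(data);
    if (--h->ref_count == 0)
        std::free(h);
}
```

A prefetch issued immediately before the access, as in this toy {{rep_share}}, is unlikely to buy anything; the hint only pays off when enough independent work separates it from the increment.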

Quoting from section [5.46 Other built-in functions provided by GCC|http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Other-Builtins.html#Other-Builtins] of the gcc manual:

{quote}
Built-in Function: {{void __builtin_prefetch (const void *addr, ...)}}

    This function is used to minimize cache-miss latency by moving data into a cache before it is accessed. You can insert calls to {{__builtin_prefetch}} into code for which you know addresses of data in memory that is likely to be accessed soon. If the target supports them, data prefetch instructions will be generated. If the prefetch is done early enough before the access then the data will be in the cache by the time it is accessed.

    The value of {{addr}} is the address of the memory to prefetch. There are two optional arguments, {{rw}} and {{locality}}. The value of {{rw}} is a compile-time constant one or zero; one means that the prefetch is preparing for a write to the memory address and zero, the default, means that the prefetch is preparing for a read. The value {{locality}} must be a compile-time constant integer between zero and three. A value of zero means that the data has no temporal locality, so it need not be left in the cache after the access. A value of three means that the data has a high degree of temporal locality and should be left in all levels of cache possible. Values of one and two mean, respectively, a low or moderate degree of temporal locality. The default is three.

{code}
for (i = 0; i < n; i++)
{
    a[i] = a[i] + b[i];
    __builtin_prefetch (&a[i+j], 1, 1);
    __builtin_prefetch (&b[i+j], 0, 1);
    /* ... */
}
{code}         

    Data prefetch does not generate faults if {{addr}} is invalid, but the address expression itself must be valid. For example, a prefetch of {{p->next}} will not fault if {{p->next}} is not a valid address, but evaluation will fault if {{p}} is not a valid address.

    If the target does not support data prefetch, the address expression is evaluated if it includes side effects but no other code is generated and GCC does not issue a warning. 
{quote}
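
The {{p->next}} case from the manual could look roughly like the following list traversal. This is only a sketch: {{Node}}, {{sum_list}}, and {{LIST_PREFETCH_READ}} are made-up names, and the locality value of 1 is an arbitrary choice for data that is visited once.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical wrapper: read prefetch with low temporal locality on gcc,
// nothing on compilers that lack the builtin.
#if defined(__GNUC__)
#  define LIST_PREFETCH_READ(addr) __builtin_prefetch((addr), 0, 1)
#else
#  define LIST_PREFETCH_READ(addr) ((void)0)
#endif

// Hypothetical singly linked list node.
struct Node {
    int   value;
    Node* next;
};

// Sum the list, hinting at the next node while the current one is processed.
inline long sum_list(const Node* p) {
    long total = 0;
    while (p) {
        // p is valid here, so evaluating p->next is safe. Per the manual,
        // the prefetch itself does not fault even when p->next is null.
        LIST_PREFETCH_READ(p->next);
        total += p->value;
        p = p->next;
    }
    return total;
}
```

With so little work per node the hint probably arrives too late to hide much latency; it is more likely to matter when each node requires substantial processing.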

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (STDCXX-723) [gcc] use __builtin_prefetch to optimize string

Posted by "Martin Sebor (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/STDCXX-723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567931#action_12567931 ] 

Martin Sebor commented on STDCXX-723:
-------------------------------------

The gcc [i386-prefetch.exp|http://gcc.gnu.org/viewcvs/trunk/gcc/testsuite/gcc.misc-tests/i386-prefetch.exp?view=markup] test suite file contains some info on when {{__builtin_prefetch}} is useful.

> [gcc] use __builtin_prefetch to optimize string
> -----------------------------------------------
>
>                 Key: STDCXX-723
>                 URL: https://issues.apache.org/jira/browse/STDCXX-723
>             Project: C++ Standard Library
>          Issue Type: Sub-task
>          Components: 21. Strings
>    Affects Versions: 4.1.2, 4.1.3, 4.1.4, 4.2.0
>            Reporter: Martin Sebor
>            Priority: Minor
>             Fix For: 4.2.1
>
>   Original Estimate: 4h
>  Remaining Estimate: 4h



[jira] Updated: (STDCXX-723) [gcc] use __builtin_prefetch to optimize string

Posted by "Martin Sebor (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/STDCXX-723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martin Sebor updated STDCXX-723:
--------------------------------

    Fix Version/s:     (was: 4.2.1)
                   4.2.2

Deferred until 4.2.2.

