You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@apr.apache.org by "William A. Rowe, Jr." <wr...@rowe-clan.net> on 2005/05/16 20:39:20 UTC

LP64/P64 model API issue #2

Issue  #1; How to manipulate nelts in apr_table_entry?

We can preserve the existing behavior of nelts, defining it as
a simple int.  A few casts are required out of the result of some
pointer arithmetic.

The alternative is to use apr_size_t to define nelts, however
that will percolate many changes down to the users code.

Comments?

Re: LP64/P64 model API issue #2

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.

At 09:35 AM 5/18/2005, Joe Orton wrote:
>On Wed, May 18, 2005 at 02:41:48AM -0500, William Rowe wrote:
>> At 09:01 AM 5/17/2005, Joe Orton wrote:
>> >On Mon, May 16, 2005 at 01:39:20PM -0500, William Rowe wrote:
>> >> Issue  #1; How to manipulate nelts in apr_table_entry?
>> >> 
>> >> We can preserve the existing behavior of nelts, defining it as
>> >> a simple int.  A few casts are required out of the result of some
>> >> pointer arithmetic.
>> >
>> >Can you explain what casts are needed exactly, and why?
>> 
>> It's embodied by the patch below - the delta of two pointer
>> offsets result in a size_t (by definition).  On LP64/P64, the
>> sizeof(int) < sizeof(size_t).
>
>The code is really fine in both places - you're really out here to fix
>compiler warnings, right?

No, I'm using the compiler warnings to identify code with the
wrong apr_size_t/apr_off_t/int choice of arguments.  The patch 
only fixes emits we are willing to accept...

>  But a different compiler could now issue a
>new warning in table_mergesort:
>
>    for (i = 0; i + 1 < n; i += 2) {
>
>for comparison of (apr_size_t) i and (int) n since they have both
>different size and signedness.  So that's not improved the code quality
>a great deal really.

Very true.  As table_mergesort is a static local function, this
is easy to clear up without a cast.

That said; on an LP64 platform, we *could* use size_t instead of int
as the nelts domain in apr-util v2.0.  Question: Do we want to?

Re: LP64/P64 model API issue #2

Posted by Joe Orton <jo...@redhat.com>.

On Wed, May 18, 2005 at 02:41:48AM -0500, William Rowe wrote:
> At 09:01 AM 5/17/2005, Joe Orton wrote:
> >On Mon, May 16, 2005 at 01:39:20PM -0500, William Rowe wrote:
> >> Issue  #1; How to manipulate nelts in apr_table_entry?
> >> 
> >> We can preserve the existing behavior of nelts, defining it as
> >> a simple int.  A few casts are required out of the result of some
> >> pointer arithmetic.
> >
> >Can you explain what casts are needed exactly, and why?
> 
> It's embodied by the patch below - the delta of two pointer
> offsets result in a size_t (by definition).  On LP64/P64, the
> sizeof(int) < sizeof(size_t).

The code is really fine in both places - you're really out here to fix
compiler warnings, right?  But a different compiler could now issue a
new warning in table_mergesort:

    for (i = 0; i + 1 < n; i += 2) {

for comparison of (apr_size_t) i and (int) n since they have both
different size and signedness.  So that's not improved the code quality
a great deal really.

> Author: wrowe
> Date: Mon May 16 14:49:49 2005
> New Revision: 170468
> 
> URL: http://svn.apache.org/viewcvs?rev=170468&view=rev
> Log:
> 
>   We play pointer math with local 'i', so it must be apr_size_t.
> 
>   Until we decide otherwise, cast the pointer math result back to an
>   int for a nelts result.
> 
> Modified:
>     apr/apr/trunk/tables/apr_tables.c
> 
> Modified: apr/apr/trunk/tables/apr_tables.c
> URL: http://svn.apache.org/viewcvs/apr/apr/trunk/tables/apr_tables.c?rev=170468&r1=170467&r2=170468&view=diff
> ==============================================================================
> --- apr/apr/trunk/tables/apr_tables.c (original)
> +++ apr/apr/trunk/tables/apr_tables.c Mon May 16 14:49:49 2005
> @@ -970,7 +970,7 @@
>       */
>      apr_table_entry_t **values_tmp =
>          (apr_table_entry_t **)apr_palloc(pool, n * sizeof(apr_table_entry_t*));
> -    int i;
> +    apr_size_t i;
>      int blocksize;
>  
>      /* First pass: sort pairs of elements (blocksize=1) */
> @@ -1156,7 +1156,7 @@
>                  *dst++ = *src;
>              }
>          } while (++src < last_elt);
> -        t->a.nelts -= (last_elt - dst);
> +        t->a.nelts -= (int)(last_elt - dst);
>      }
>  
>      table_reindex(t);
>

Re: LP64/P64 model API issue #2

Posted by Rici Lake <ri...@ricilake.net>.

On 21-May-05, at 8:04 AM, Wesley W. Garland wrote:

>> do we want (size_t) members of apr_table nelts values, or are we 
>> happy to
>> have them int?
>
> I'll vote size_t -- int suggests that it's possible for us to have
> negative-sized arrays, which strikes me a kind of silly. I've always
> been fond of size_t for nelts-type members, because when I write
> *static* arrarys, I use this code pattern requently:
>
> for (i = 0; i < (sizeof(array) / sizeof(array[0])); i++)
>   do_stuff(array[i]);
>
> Besides which, if it gets made into an unsigned quantity, it will
> clear up a *pile* of casts (to clear up signed/unsigned comparison
> warnings) in my code. ;)

In theory, I agree with you. In practice, I think there is a lot of
code out there which is aware that nelts is an int and would need to
have casts added.

>> I think int is fine. If apr_table_t were rewritten to be scalable to >
>> 2^31 elements, it would probably acquire a different iterator
>> interface.
>
> I'm not sure that's a valid concern, unless I'm missing something
> non-obvious. Are you worried only about the cost running comp() more
> than two billion times? If so -- I would suggest that if you're
> storing that many records in an apr_table_t, and need to
> apr_table_do() over them regularly, that you have simply chosen the
> wrong data storage/abstraction and basically deserve what you get...

That's exactly my point. The expected use case for an apr_table is
storage of a relatively small number of <key, value> pairs where the
both the key and the value are strings and moreover the keys are
case-insensitive and can appear multiple times. Furthermore, the
iteration order, at least for multiple matching keys, ought to be
preserved by iteration.

That's a use case which comes up a fair amount in servers, because
of the historical design based on email headers (a relatively
small number of <key: value> assignments where both the key and the
value ... etc.)

That could be done in other ways than simply storing the
<key, value> pairs as an array, but given that the expected
number of elements is small, it doesn't appear to be a priority.
If it were reimplemented using some variant of a hash table, it
may well be more efficient and capable of scaling to more than
2^31 elements, but:

1) The expected use case for apr_table's doesn't have that
requirement;

2) No-one, least of all me, seems to be clamouring for a datatype
which does have exactly those requirements; and

3) Such a datatype would probably have an iteration protocol
which was not based on extracting internal members (nelts,
elts).

So my conclusion is that int is fine, because it is in wide use
and the theoretical nicety of using an unsigned and possibly more
capacious datatype does not reflect any real need.

Re: LP64/P64 model API issue #2

Posted by "Wesley W. Garland" <we...@gmail.com>.

> do we want (size_t) members of apr_table nelts values, or are we happy to 
> have them int?

I'll vote size_t -- int suggests that it's possible for us to have
negative-sized arrays, which strikes me a kind of silly. I've always
been fond of size_t for nelts-type members, because when I write
*static* arrarys, I use this code pattern requently:

for (i = 0; i < (sizeof(array) / sizeof(array[0])); i++)
  do_stuff(array[i]);

Besides which, if it gets made into an unsigned quantity, it will
clear up a *pile* of casts (to clear up signed/unsigned comparison
warnings) in my code. ;)

> I think int is fine. If apr_table_t were rewritten to be scalable to >
> 2^31 elements, it would probably acquire a different iterator
> interface.

I'm not sure that's a valid concern, unless I'm missing something
non-obvious. Are you worried only about the cost running comp() more
than two billion times? If so -- I would suggest that if you're
storing that many records in an apr_table_t, and need to
apr_table_do() over them regularly, that you have simply chosen the
wrong data storage/abstraction and basically deserve what you get...

I currently store about 150K highly-volatile records in an apr hash
table (with frequent disk flushes), and recognize that it's not the
best choice (although the allocation changes made to it about six
months ago, combined with fiddling the initial alloc quantity made a
BIG difference)... I'm looking at either stuffing those records into
Oracle or a Berkeley-style DBM (haven't decided which yet, I have
beta-code written for both so I can pick which solution agrees best
with real-life).

Wes

-- 
Wesley W. Garland
Director, Product Development
PageMail, Inc.
+1 613 542 2787 x 102

Re: LP64/P64 model API issue #2

Posted by Rici Lake <ri...@ricilake.net>.

On 20-May-05, at 3:45 PM, William A. Rowe, Jr. wrote:

>
> Enough mental gymnastics, do we want (size_t) members of
> apr_table nelts values, or are we happy to have them int?
>
> Bill
>

I think int is fine. If apr_table_t were rewritten to be scalable to > 
2^31 elements, it would probably acquire a different iterator 
interface.

Re: LP64/P64 model API issue #2

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.

At 11:32 AM 5/20/2005, Branko Äibej wrote:
So? That's just how things are done on Win64. It says nothing about (L)P64 architectures in general. Or any other 64-bit architecture for that matter, because size_t is allowed to be smaller than int.

size_t and ptrdiff_t and ssize_t must all map to the domain
of void*, such that all of memory (and nothing more than the
domain of memory) can be referenced (considering wrap-around
etc etc).  They may be smaller than int only on a I128P64 
system.  On an I[L]P64 system, they both are defined 64 bits.
On an [L]P64 system, int is smaller than size_t, generally 
assumed as 32 bits.

Enough mental gymnastics, do we want (size_t) members of 
apr_table nelts values, or are we happy to have them int?

Bill

Re: LP64/P64 model API issue #2

Posted by Branko Čibej <br...@xbc.nu>.

Greg Marr wrote:

> At 01:23 AM 5/19/2005, Branko ÄŒibej wrote:
>
>> Um. Even on those architectures, size_t can be smaller than ptrdiff_t.
>
>
> #ifdef  _WIN64
> typedef unsigned __int64    size_t;
> #else
> typedef _W64 unsigned int   size_t;
> #endif
>
> #ifdef  _WIN64
> typedef __int64             ptrdiff_t;
> #else
> typedef _W64 int            ptrdiff_t;
> #endif

So? That's just how things are done on Win64. It says nothing about 
(L)P64 architectures in general. Or any other 64-bit architecture for 
that matter, because size_t is allowed to be smaller than int.

-- Brane

Re: LP64/P64 model API issue #2

Posted by Greg Marr <gr...@alum.wpi.edu>.

At 01:23 AM 5/19/2005, Branko Äibej wrote:
>Um. Even on those architectures, size_t can be smaller than ptrdiff_t.

#ifdef  _WIN64
typedef unsigned __int64    size_t;
#else
typedef _W64 unsigned int   size_t;
#endif

#ifdef  _WIN64
typedef __int64             ptrdiff_t;
#else
typedef _W64 int            ptrdiff_t;
#endif

Re: LP64/P64 model API issue #2

Posted by Branko Čibej <br...@xbc.nu>.

William A. Rowe, Jr. wrote:

>>size_t is an unsigned integral type that holds the value
>>of a sizeof operation. The relationship you mention 
>>need not be true on a 64-bit platform.
>>    
>>
>
>There you are wrong - as I've been posting about LP64 or P64
>architectures (such as Win64).
>
Um. Even on those architectures, size_t can be smaller than ptrdiff_t. 
the size_t type mus be able to hold the size of the largest object (or 
"allocated block of memory") allowed by a platform, and it's quite 
possible that's smaller than the size of addressable memory. I wouldn't 
be surprised if, on an (L)P64 architecture, sizeof(size_t) == 
sizeof(int) (but sizeof(ptrdiff_t) == sizeof(void*)).

-- Brane

Re: LP64/P64 model API issue #2

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.

At 11:24 AM 5/18/2005, Andrew Binstock wrote:
>Quick note re:
>
>> It's embodied by the patch below - the delta of two pointer
>> offsets result in a size_t (by definition).  On LP64/P64, the
>> sizeof(int) < sizeof(size_t).
>
>That's not the definition of size_t. The delta of 
>two pointers you're referring to is ptrdiff_t.

Ack.  sizeof(ptrdiff_t) == sizeof(size_t).  ptrdiff_t, of course,
is a signed quantity (al la ssize_t).

>size_t is an unsigned integral type that holds the value
>of a sizeof operation. The relationship you mention 
>need not be true on a 64-bit platform.

There you are wrong - as I've been posting about LP64 or P64
architectures (such as Win64).  You were thinking of an ILP64
platform such as Linux, where we had mopped up most issues 
in APR well over a year ago.

Bill

Re: LP64/P64 model API issue #2

Posted by Andrew Binstock <ab...@pacificdataworks.com>.

Quick note re:

 > It's embodied by the patch below - the delta of two pointer
 > offsets result in a size_t (by definition).  On LP64/P64, the
 > sizeof(int) < sizeof(size_t).

That's not the definition of size_t. The delta of
two pointers you're referring to is ptrdiff_t.

size_t is an unsigned integral type that holds the value
of a sizeof operation. The relationship you mention
need not be true on a 64-bit platform.

---Andrew Binstock

 >>> re >>>

At 12:41 AM 5/18/2005, William A. Rowe, Jr. wrote:

>At 09:01 AM 5/17/2005, Joe Orton wrote:
> >On Mon, May 16, 2005 at 01:39:20PM -0500, William Rowe wrote:
> >> Issue  #1; How to manipulate nelts in apr_table_entry?
> >>
> >> We can preserve the existing behavior of nelts, defining it as
> >> a simple int.  A few casts are required out of the result of some
> >> pointer arithmetic.
> >
> >Can you explain what casts are needed exactly, and why?
>
>It's embodied by the patch below - the delta of two pointer
>offsets result in a size_t (by definition).  On LP64/P64, the
>sizeof(int) < sizeof(size_t).
>
>Author: wrowe
>Date: Mon May 16 14:49:49 2005
>New Revision: 170468
>
>URL: http://svn.apache.org/viewcvs?rev=170468&view=rev
>Log:
>
>   We play pointer math with local 'i', so it must be apr_size_t.
>
>   Until we decide otherwise, cast the pointer math result back to an
>   int for a nelts result.
>
>Modified:
>     apr/apr/trunk/tables/apr_tables.c
>
>Modified: apr/apr/trunk/tables/apr_tables.c
>URL: 
>http://svn.apache.org/viewcvs/apr/apr/trunk/tables/apr_tables.c?rev=170468&r1=170467&r2=170468&view=diff
>==============================================================================
>--- apr/apr/trunk/tables/apr_tables.c (original)
>+++ apr/apr/trunk/tables/apr_tables.c Mon May 16 14:49:49 2005
>@@ -970,7 +970,7 @@
>       */
>      apr_table_entry_t **values_tmp =
>          (apr_table_entry_t **)apr_palloc(pool, n * 
> sizeof(apr_table_entry_t*));
>-    int i;
>+    apr_size_t i;
>      int blocksize;
>
>      /* First pass: sort pairs of elements (blocksize=1) */
>@@ -1156,7 +1156,7 @@
>                  *dst++ = *src;
>              }
>          } while (++src < last_elt);
>-        t->a.nelts -= (last_elt - dst);
>+        t->a.nelts -= (int)(last_elt - dst);
>      }
>
>      table_reindex(t);


Pacific Data Works LLC
Technology White Papers
www.pacificdataworks.com

Re: LP64/P64 model API issue #2

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.

At 09:01 AM 5/17/2005, Joe Orton wrote:
>On Mon, May 16, 2005 at 01:39:20PM -0500, William Rowe wrote:
>> Issue  #1; How to manipulate nelts in apr_table_entry?
>> 
>> We can preserve the existing behavior of nelts, defining it as
>> a simple int.  A few casts are required out of the result of some
>> pointer arithmetic.
>
>Can you explain what casts are needed exactly, and why?

It's embodied by the patch below - the delta of two pointer
offsets result in a size_t (by definition).  On LP64/P64, the
sizeof(int) < sizeof(size_t).

Author: wrowe
Date: Mon May 16 14:49:49 2005
New Revision: 170468

URL: http://svn.apache.org/viewcvs?rev=170468&view=rev
Log:

  We play pointer math with local 'i', so it must be apr_size_t.

  Until we decide otherwise, cast the pointer math result back to an
  int for a nelts result.

Modified:
    apr/apr/trunk/tables/apr_tables.c

Modified: apr/apr/trunk/tables/apr_tables.c
URL: http://svn.apache.org/viewcvs/apr/apr/trunk/tables/apr_tables.c?rev=170468&r1=170467&r2=170468&view=diff
==============================================================================
--- apr/apr/trunk/tables/apr_tables.c (original)
+++ apr/apr/trunk/tables/apr_tables.c Mon May 16 14:49:49 2005
@@ -970,7 +970,7 @@
      */
     apr_table_entry_t **values_tmp =
         (apr_table_entry_t **)apr_palloc(pool, n * sizeof(apr_table_entry_t*));
-    int i;
+    apr_size_t i;
     int blocksize;
 
     /* First pass: sort pairs of elements (blocksize=1) */
@@ -1156,7 +1156,7 @@
                 *dst++ = *src;
             }
         } while (++src < last_elt);
-        t->a.nelts -= (last_elt - dst);
+        t->a.nelts -= (int)(last_elt - dst);
     }
 
     table_reindex(t);

Re: LP64/P64 model API issue #2

Posted by Joe Orton <jo...@redhat.com>.

On Mon, May 16, 2005 at 01:39:20PM -0500, William Rowe wrote:
> Issue  #1; How to manipulate nelts in apr_table_entry?
> 
> We can preserve the existing behavior of nelts, defining it as
> a simple int.  A few casts are required out of the result of some
> pointer arithmetic.

Can you explain what casts are needed exactly, and why?

Regards,

joe