You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@apr.apache.org by Ivan Zhakov <iv...@visualsvn.com> on 2019/06/27 14:01:56 UTC

Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c

On Tue, 25 Jun 2019 at 17:21, <jo...@apache.org> wrote:

> Author: jorton
> Date: Tue Jun 25 14:21:56 2019
> New Revision: 1862071
>
> URL: http://svn.apache.org/viewvc?rev=1862071&view=rev
> Log:
> Add apr_dir_pread(), a variant of apr_dir_read() which allows callers
> to read a directory with constant memory consumption:
>
> * include/apr_file_info.h: Add warning on memory consumption for
>   apr_dir_read; declare apr_dir_pread.
>
> * file_io/unix/dir.c (apr_dir_pread): Rename from apr_dir_read and
>   take pool argument.  (apr_dir_read): Reimplement using it.
>
> * file_io/win32/dir.c, file_io/os2/dir.c: Likewise, but untested.


> * test/testdir.c (test_pread) [APR_POOL_DEBUG]: Add test case.
>
> I'm not sure it's best fix. Better solution would be allocate buffer for
dirname + MAX_FILE_NAME in apr_dir_t and reuse it on every iteration. Win32
already has such code.


-- 
Ivan Zhakov

Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c

Posted by Joe Orton <jo...@redhat.com>.
On Tue, Jul 02, 2019 at 08:22:47AM -0500, Greg Stein wrote:
> On Tue, Jul 2, 2019 at 8:04 AM Branko Čibej <br...@apache.org> wrote:
> > Perhaps this is the right time to note (again) that over in Subversion
> > land, we found that functions should take _two_ pool parameters: one for
> > allocating the returned values and one for temporary allocations. This
> > is better than functions creating temporary subpools for temporaries,
> > because it allows the API consumer better control over allocations.
> 
> https://subversion.apache.org/docs/community-guide/conventions.html#apr-pools

No argument from me, applying this guidance consistently to the APR API 
would be a massive improvement - and also a massive BC break!

APR is particularly bad about embedding pool pointers in public 
structures - sockaddr is a big PITA here.


Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c

Posted by Greg Stein <gs...@gmail.com>.
On Tue, Jul 2, 2019 at 8:04 AM Branko Čibej <br...@apache.org> wrote:
>...

> Perhaps this is the right time to note (again) that over in Subversion
> land, we found that functions should take _two_ pool parameters: one for
> allocating the returned values and one for temporary allocations. This
> is better than functions creating temporary subpools for temporaries,
> because it allows the API consumer better control over allocations.
>

https://subversion.apache.org/docs/community-guide/conventions.html#apr-pools

Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c

Posted by Branko Čibej <br...@apache.org>.
On 02.07.2019 14:04, Joe Orton wrote:
> On Tue, Jul 02, 2019 at 11:18:31AM +0100, Joe Orton wrote:
>> Reading the win32 code a bit more, it is not constant-memory even with 
>> the buffer in apr_dir_t, because it can allocate from dir->pool (and 
>> even create cleanups!) in the more_finfo() call, so I think the current 
>> behaviour is not justifiable.
> Also, since the Unix implementation calls apr_stat(), which takes a 
> pool, I actually can't imagine there is another way around the need for 
> a pool passed to apr_dir_read() call to guarantee constant memory.

Any function that allocates memory should take a pool parameter. We
can't change apr_dir_read() retroactively, but we can deprecate it and I
think we should.

Perhaps this is the right time to note (again) that over in Subversion
land, we found that functions should take _two_ pool parameters: one for
allocating the returned values and one for temporary allocations. This
is better than functions creating temporary subpools for temporaries,
because it allows the API consumer better control over allocations.

-- Brane


Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c

Posted by Joe Orton <jo...@redhat.com>.
On Tue, Jul 02, 2019 at 11:18:31AM +0100, Joe Orton wrote:
> Reading the win32 code a bit more, it is not constant-memory even with 
> the buffer in apr_dir_t, because it can allocate from dir->pool (and 
> even create cleanups!) in the more_finfo() call, so I think the current 
> behaviour is not justifiable.

Also, since the Unix implementation calls apr_stat(), which takes a 
pool, I actually can't imagine there is another way around the need for 
a pool passed to apr_dir_read() call to guarantee constant memory.


Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c

Posted by Ruediger Pluem <rp...@apache.org>.

On 07/03/2019 05:37 PM, Joe Orton wrote:
> On Wed, Jul 03, 2019 at 02:43:02PM +0300, Ivan Zhakov wrote:
>> They also make the existing old API unusable for many cases thus 
>> making a switch to the new API mandatory, even though it doesn't have 
>> to be so (see below).
>>
>> APR 1.x did not specify that the result of apr_dir_read() has to be allocated
>> in dir->pool:
>>
>>   https://apr.apache.org/docs/apr/1.7/group__apr__dir.html#ga3e4ee253e0c712160bee10bfb9c8e4a8
>>
>> This function does not accept a pool argument, and I think that it's natural
>> to assume limited lifetime in such case. This is what the current major APR
>> users (httpd, svn) do, and other users should probably be ready for that too,
>> since on Win32 the subsequent calls to apr_dir_read() already invalidate the
>> previous apr_finfo_t.
> 
> Are there many examples in the APR API where returned strings are not 
> either pool-allocated or constant (process lifetime)?  It seems unusual 
> and very surprising to me.
> 
> A hypothetical API consumer can have made either assumption:
> 
> a) of constant memory (true with limits on Win32, breaks for Unix), or
> b) of pool-allocated string lifetime (works for Unix, breaks for Win32)
> 
> neither is documented and both are broken by current implementations. It 
> is impossible to satisfy both so anybody can argue that fixing either 
> implementation to match the other is a regression.  
> 
> Doubling down on (a) over (b) seems strongly inferior to me, the API 
> cannot support this without introducing runtime overhead for all callers 
> (a new iterpool in the dir object), so I'd rather fix this properly with 
> a new API.
> 
> I'd be fine with leaving Win32 apr_dir_read() behaviour as-is for 1.x at 
> the same as introducing _pread() if desired - making the strdup only 
> happen for _pread would only be a minor tweak, not a whole new subpool.
> 
+1

Regards

Rüdiger



Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On Wed, 28 Aug 2019 at 10:08, Joe Orton <jo...@redhat.com> wrote:
>
> On Sat, Aug 24, 2019 at 05:39:17PM +0300, Ivan Zhakov wrote:
> > For what it's worth, I'm -1 on the changes done in r1862071 and
> > r1862435 for the reasons listed above.
> >
> > PS: No idea if I can actually veto changes, because while I'm a
> > committer on the APR project, I am not in its PMC. But "-1" properly
> > expresses my technical opinion on this topic.
>
> OK fair enough, Ivan - reverted in r1866019 and please go ahead with the
> iterpool-based proposal - thanks a lot for reviewing this.
>
Thanks!

I'm currently busy with other thing, but I'll add this problem to my
TODO list and will try to fix it later.

-- 
Ivan Zhakov

Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c

Posted by Joe Orton <jo...@redhat.com>.
On Sat, Aug 24, 2019 at 05:39:17PM +0300, Ivan Zhakov wrote:
> For what it's worth, I'm -1 on the changes done in r1862071 and
> r1862435 for the reasons listed above.
> 
> PS: No idea if I can actually veto changes, because while I'm a
> committer on the APR project, I am not in its PMC. But "-1" properly
> expresses my technical opinion on this topic.

OK fair enough, Ivan - reverted in r1866019 and please go ahead with the 
iterpool-based proposal - thanks a lot for reviewing this.

Regards, Joe



Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c

Posted by Branko Čibej <br...@apache.org>.
On 27.08.2019 11:25, Ruediger Pluem wrote:
>
> On 08/25/2019 04:04 PM, Branko Čibej wrote:
>> On 24.08.2019 16:39, Ivan Zhakov wrote:
>>> On Thu, 25 Jul 2019 at 15:58, Ivan Zhakov <iv...@visualsvn.com> wrote:
>>>> On Mon, 8 Jul 2019 at 20:05, Ivan Zhakov <iv...@visualsvn.com> wrote:
>>>>> On Wed, 3 Jul 2019 at 18:37, Joe Orton <jo...@redhat.com> wrote:
>>>>>> On Wed, Jul 03, 2019 at 02:43:02PM +0300, Ivan Zhakov wrote:
>>>>>>> They also make the existing old API unusable for many cases thus
>>>>>>> making a switch to the new API mandatory, even though it doesn't have
>>>>>>> to be so (see below).
>>>>>>>
>>>>>>> APR 1.x did not specify that the result of apr_dir_read() has to be allocated
>>>>>>> in dir->pool:
>>>>>>>
>>>>>>>   https://apr.apache.org/docs/apr/1.7/group__apr__dir.html#ga3e4ee253e0c712160bee10bfb9c8e4a8
>>>>>>>
>>>>>>> This function does not accept a pool argument, and I think that it's natural
>>>>>>> to assume limited lifetime in such case. This is what the current major APR
>>>>>>> users (httpd, svn) do, and other users should probably be ready for that too,
>>>>>>> since on Win32 the subsequent calls to apr_dir_read() already invalidate the
>>>>>>> previous apr_finfo_t.
>>>>>> Are there many examples in the APR API where returned strings are not
>>>>>> either pool-allocated or constant (process lifetime)?  It seems unusual
>>>>>> and very surprising to me.
>>>>>>
>>>>>> A hypothetical API consumer can have made either assumption:
>>>>>>
>>>>>> a) of constant memory (true with limits on Win32, breaks for Unix), or
>>>>>> b) of pool-allocated string lifetime (works for Unix, breaks for Win32)
>>>>>>
>>>>>> neither is documented and both are broken by current implementations. It
>>>>>> is impossible to satisfy both so anybody can argue that fixing either
>>>>>> implementation to match the other is a regression.
>>>>>>
>>>>>> Doubling down on (a) over (b) seems strongly inferior to me, the API
>>>>>> cannot support this without introducing runtime overhead for all callers
>>>>>> (a new iterpool in the dir object), so I'd rather fix this properly with
>>>>>> a new API.
>>>>>>
>>>>>> I'd be fine with leaving Win32 apr_dir_read() behaviour as-is for 1.x at
>>>>>> the same as introducing _pread() if desired - making the strdup only
>>>>>> happen for _pread would only be a minor tweak, not a whole new subpool.
>>>>>>
>>>>> There is example of apr_hash_t.ITERATOR that is statically allocated in
>>>>> the hash object and returned to the caller.
>>>>>
>>>>> Also the native readdir() API behaves the same way [1]:
>>>>> [[[
>>>>> The returned pointer, and pointers within the structure, might be invalidated
>>>>> or the structure or the storage areas might be overwritten by a subsequent call
>>>>> to readdir() on the same directory stream.
>>>>> ]]]
>>>>>
>>>>> From this perspective the current behavior of apr_dir_read on Unix looks
>>>>> inconsistent and surprising for the API user because it is impossible to use
>>>>> in a loop without unbounded memory usage.
>>>>>
>>>>> I strongly agree that the revved version of this function that accepts a POOL
>>>>> argument will be good from both usage and implementation terms. I would be +1
>>>>> on adding such API. But I also think that unless we fix the original issue
>>>>> by using the preallocated buffer/iterpool, the whole thing would still be
>>>>> problematic for any existing API user.
>>>>>
>>>>> More detailed: if we take this opportunity to choose and document the API to
>>>>> consistently return a temporary result, then as the caller I can either process
>>>>> the data during iteration or manually put it into a long-living pool if that is
>>>>> necessary. I think that all current callers work with the API this way. And I
>>>>> can also switch to the new revved function to manually specify the pool and
>>>>> better control the allocations. If I would like to support both APR 1.x and 2.x
>>>>> the compatibility is also straightforward to achieve.
>>>>>
>>>>> If we instead choose an option where the results may be allocated
>>>>> in the dir object pool, then as the caller I cannot avoid unbounded memory
>>>>> usage when supporting both APR 1.x and 2.x. And this approach also introduces
>>>>> the unavoidable the unbounded memory usage for all Win32 users that
>>>>> do not update their code.
>>>>>
>>>>> [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html#tag_16_475
>>>>>
>>>> Is there any feedback or thoughts on these proposed changes?
>>>>
>>>> Thanks!
>>>>
>>> For what it's worth, I'm -1 on the changes done in r1862071 and
>>> r1862435 for the reasons listed above.
>>>
>>> PS: No idea if I can actually veto changes, because while I'm a
>>> committer on the APR project, I am not in its PMC. But "-1" properly
>>> expresses my technical opinion on this topic.
>>
>> I agree with your analysis. I think it would be best if you just
>> implemented your proposal, including the revised version of the API. The
>> unbounded-memory case you describe is pretty much a showstopper. The
>> allocation choices do have to be documented better, IMO.
>>
>> That would be trunk and 1.7.x, although we can't add a revised API in
>> the 1.7.x line since it's released already. For compatibility it would
>> probably be best to leave apr_dir_read on trunk without the extra pool
>> parameter.
> The API that allows to specify a pool can only be introduced with 1.8 / trunk, but what about the current API which
> needs to stay for 1.8? Either way we fix the code we might break applications. If we use the Windows approach on Unix
> you break applications that rely on getting a copy back, if we use the Unix approach we have an unbounded-memory issue.
> Do you see any other option but choosing one implementation (Windows or Unix) for the other case as well and place
> a fat note about the changed behavior somewhere (Release notes / Change log)?


The way stuff is allocated was only documented a couple months ago on
trunk. Fix those docs and that's it. Anyone who previously relied on
getting a copy was looking at the implementation, not the documentation.

This is release notes/API errata territory.

-- Brane


Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c

Posted by Ruediger Pluem <rp...@apache.org>.

On 08/25/2019 04:04 PM, Branko Čibej wrote:
> On 24.08.2019 16:39, Ivan Zhakov wrote:
>> On Thu, 25 Jul 2019 at 15:58, Ivan Zhakov <iv...@visualsvn.com> wrote:
>>> On Mon, 8 Jul 2019 at 20:05, Ivan Zhakov <iv...@visualsvn.com> wrote:
>>>> On Wed, 3 Jul 2019 at 18:37, Joe Orton <jo...@redhat.com> wrote:
>>>>> On Wed, Jul 03, 2019 at 02:43:02PM +0300, Ivan Zhakov wrote:
>>>>>> They also make the existing old API unusable for many cases thus
>>>>>> making a switch to the new API mandatory, even though it doesn't have
>>>>>> to be so (see below).
>>>>>>
>>>>>> APR 1.x did not specify that the result of apr_dir_read() has to be allocated
>>>>>> in dir->pool:
>>>>>>
>>>>>>   https://apr.apache.org/docs/apr/1.7/group__apr__dir.html#ga3e4ee253e0c712160bee10bfb9c8e4a8
>>>>>>
>>>>>> This function does not accept a pool argument, and I think that it's natural
>>>>>> to assume limited lifetime in such case. This is what the current major APR
>>>>>> users (httpd, svn) do, and other users should probably be ready for that too,
>>>>>> since on Win32 the subsequent calls to apr_dir_read() already invalidate the
>>>>>> previous apr_finfo_t.
>>>>> Are there many examples in the APR API where returned strings are not
>>>>> either pool-allocated or constant (process lifetime)?  It seems unusual
>>>>> and very surprising to me.
>>>>>
>>>>> A hypothetical API consumer can have made either assumption:
>>>>>
>>>>> a) of constant memory (true with limits on Win32, breaks for Unix), or
>>>>> b) of pool-allocated string lifetime (works for Unix, breaks for Win32)
>>>>>
>>>>> neither is documented and both are broken by current implementations. It
>>>>> is impossible to satisfy both so anybody can argue that fixing either
>>>>> implementation to match the other is a regression.
>>>>>
>>>>> Doubling down on (a) over (b) seems strongly inferior to me, the API
>>>>> cannot support this without introducing runtime overhead for all callers
>>>>> (a new iterpool in the dir object), so I'd rather fix this properly with
>>>>> a new API.
>>>>>
>>>>> I'd be fine with leaving Win32 apr_dir_read() behaviour as-is for 1.x at
>>>>> the same as introducing _pread() if desired - making the strdup only
>>>>> happen for _pread would only be a minor tweak, not a whole new subpool.
>>>>>
>>>> There is example of apr_hash_t.ITERATOR that is statically allocated in
>>>> the hash object and returned to the caller.
>>>>
>>>> Also the native readdir() API behaves the same way [1]:
>>>> [[[
>>>> The returned pointer, and pointers within the structure, might be invalidated
>>>> or the structure or the storage areas might be overwritten by a subsequent call
>>>> to readdir() on the same directory stream.
>>>> ]]]
>>>>
>>>> From this perspective the current behavior of apr_dir_read on Unix looks
>>>> inconsistent and surprising for the API user because it is impossible to use
>>>> in a loop without unbounded memory usage.
>>>>
>>>> I strongly agree that the revved version of this function that accepts a POOL
>>>> argument will be good from both usage and implementation terms. I would be +1
>>>> on adding such API. But I also think that unless we fix the original issue
>>>> by using the preallocated buffer/iterpool, the whole thing would still be
>>>> problematic for any existing API user.
>>>>
>>>> More detailed: if we take this opportunity to choose and document the API to
>>>> consistently return a temporary result, then as the caller I can either process
>>>> the data during iteration or manually put it into a long-living pool if that is
>>>> necessary. I think that all current callers work with the API this way. And I
>>>> can also switch to the new revved function to manually specify the pool and
>>>> better control the allocations. If I would like to support both APR 1.x and 2.x
>>>> the compatibility is also straightforward to achieve.
>>>>
>>>> If we instead choose an option where the results may be allocated
>>>> in the dir object pool, then as the caller I cannot avoid unbounded memory
>>>> usage when supporting both APR 1.x and 2.x. And this approach also introduces
>>>> the unavoidable the unbounded memory usage for all Win32 users that
>>>> do not update their code.
>>>>
>>>> [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html#tag_16_475
>>>>
>>> Is there any feedback or thoughts on these proposed changes?
>>>
>>> Thanks!
>>>
>> For what it's worth, I'm -1 on the changes done in r1862071 and
>> r1862435 for the reasons listed above.
>>
>> PS: No idea if I can actually veto changes, because while I'm a
>> committer on the APR project, I am not in its PMC. But "-1" properly
>> expresses my technical opinion on this topic.
> 
> 
> I agree with your analysis. I think it would be best if you just
> implemented your proposal, including the revised version of the API. The
> unbounded-memory case you describe is pretty much a showstopper. The
> allocation choices do have to be documented better, IMO.
> 
> That would be trunk and 1.7.x, although we can't add a revised API in
> the 1.7.x line since it's released already. For compatibility it would
> probably be best to leave apr_dir_read on trunk without the extra pool
> parameter.

The API that allows to specify a pool can only be introduced with 1.8 / trunk, but what about the current API which
needs to stay for 1.8? Either way we fix the code we might break applications. If we use the Windows approach on Unix
you break applications that rely on getting a copy back, if we use the Unix approach we have an unbounded-memory issue.
Do you see any other option but choosing one implementation (Windows or Unix) for the other case as well and place
a fat note about the changed behavior somewhere (Release notes / Change log)?

Regards

Rüdiger



Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c

Posted by Branko Čibej <br...@apache.org>.
On 24.08.2019 16:39, Ivan Zhakov wrote:
> On Thu, 25 Jul 2019 at 15:58, Ivan Zhakov <iv...@visualsvn.com> wrote:
>> On Mon, 8 Jul 2019 at 20:05, Ivan Zhakov <iv...@visualsvn.com> wrote:
>>> On Wed, 3 Jul 2019 at 18:37, Joe Orton <jo...@redhat.com> wrote:
>>>> On Wed, Jul 03, 2019 at 02:43:02PM +0300, Ivan Zhakov wrote:
>>>>> They also make the existing old API unusable for many cases thus
>>>>> making a switch to the new API mandatory, even though it doesn't have
>>>>> to be so (see below).
>>>>>
>>>>> APR 1.x did not specify that the result of apr_dir_read() has to be allocated
>>>>> in dir->pool:
>>>>>
>>>>>   https://apr.apache.org/docs/apr/1.7/group__apr__dir.html#ga3e4ee253e0c712160bee10bfb9c8e4a8
>>>>>
>>>>> This function does not accept a pool argument, and I think that it's natural
>>>>> to assume limited lifetime in such case. This is what the current major APR
>>>>> users (httpd, svn) do, and other users should probably be ready for that too,
>>>>> since on Win32 the subsequent calls to apr_dir_read() already invalidate the
>>>>> previous apr_finfo_t.
>>>> Are there many examples in the APR API where returned strings are not
>>>> either pool-allocated or constant (process lifetime)?  It seems unusual
>>>> and very surprising to me.
>>>>
>>>> A hypothetical API consumer can have made either assumption:
>>>>
>>>> a) of constant memory (true with limits on Win32, breaks for Unix), or
>>>> b) of pool-allocated string lifetime (works for Unix, breaks for Win32)
>>>>
>>>> neither is documented and both are broken by current implementations. It
>>>> is impossible to satisfy both so anybody can argue that fixing either
>>>> implementation to match the other is a regression.
>>>>
>>>> Doubling down on (a) over (b) seems strongly inferior to me, the API
>>>> cannot support this without introducing runtime overhead for all callers
>>>> (a new iterpool in the dir object), so I'd rather fix this properly with
>>>> a new API.
>>>>
>>>> I'd be fine with leaving Win32 apr_dir_read() behaviour as-is for 1.x at
>>>> the same as introducing _pread() if desired - making the strdup only
>>>> happen for _pread would only be a minor tweak, not a whole new subpool.
>>>>
>>> There is example of apr_hash_t.ITERATOR that is statically allocated in
>>> the hash object and returned to the caller.
>>>
>>> Also the native readdir() API behaves the same way [1]:
>>> [[[
>>> The returned pointer, and pointers within the structure, might be invalidated
>>> or the structure or the storage areas might be overwritten by a subsequent call
>>> to readdir() on the same directory stream.
>>> ]]]
>>>
>>> From this perspective the current behavior of apr_dir_read on Unix looks
>>> inconsistent and surprising for the API user because it is impossible to use
>>> in a loop without unbounded memory usage.
>>>
>>> I strongly agree that the revved version of this function that accepts a POOL
>>> argument will be good from both usage and implementation terms. I would be +1
>>> on adding such API. But I also think that unless we fix the original issue
>>> by using the preallocated buffer/iterpool, the whole thing would still be
>>> problematic for any existing API user.
>>>
>>> More detailed: if we take this opportunity to choose and document the API to
>>> consistently return a temporary result, then as the caller I can either process
>>> the data during iteration or manually put it into a long-living pool if that is
>>> necessary. I think that all current callers work with the API this way. And I
>>> can also switch to the new revved function to manually specify the pool and
>>> better control the allocations. If I would like to support both APR 1.x and 2.x
>>> the compatibility is also straightforward to achieve.
>>>
>>> If we instead choose an option where the results may be allocated
>>> in the dir object pool, then as the caller I cannot avoid unbounded memory
>>> usage when supporting both APR 1.x and 2.x. And this approach also introduces
>>> the unavoidable the unbounded memory usage for all Win32 users that
>>> do not update their code.
>>>
>>> [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html#tag_16_475
>>>
>> Is there any feedback or thoughts on these proposed changes?
>>
>> Thanks!
>>
> For what it's worth, I'm -1 on the changes done in r1862071 and
> r1862435 for the reasons listed above.
>
> PS: No idea if I can actually veto changes, because while I'm a
> committer on the APR project, I am not in its PMC. But "-1" properly
> expresses my technical opinion on this topic.


I agree with your analysis. I think it would be best if you just
implemented your proposal, including the revised version of the API. The
unbounded-memory case you describe is pretty much a showstopper. The
allocation choices do have to be documented better, IMO.

That would be trunk and 1.7.x, although we can't add a revised API in
the 1.7.x line since it's released already. For compatibility it would
probably be best to leave apr_dir_read on trunk without the extra pool
parameter.

-- Brane


Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On Thu, 25 Jul 2019 at 15:58, Ivan Zhakov <iv...@visualsvn.com> wrote:
>
> On Mon, 8 Jul 2019 at 20:05, Ivan Zhakov <iv...@visualsvn.com> wrote:
> >
> > On Wed, 3 Jul 2019 at 18:37, Joe Orton <jo...@redhat.com> wrote:
> > >
> > > On Wed, Jul 03, 2019 at 02:43:02PM +0300, Ivan Zhakov wrote:
> > > > They also make the existing old API unusable for many cases thus
> > > > making a switch to the new API mandatory, even though it doesn't have
> > > > to be so (see below).
> > > >
> > > > APR 1.x did not specify that the result of apr_dir_read() has to be allocated
> > > > in dir->pool:
> > > >
> > > >   https://apr.apache.org/docs/apr/1.7/group__apr__dir.html#ga3e4ee253e0c712160bee10bfb9c8e4a8
> > > >
> > > > This function does not accept a pool argument, and I think that it's natural
> > > > to assume limited lifetime in such case. This is what the current major APR
> > > > users (httpd, svn) do, and other users should probably be ready for that too,
> > > > since on Win32 the subsequent calls to apr_dir_read() already invalidate the
> > > > previous apr_finfo_t.
> > >
> > > Are there many examples in the APR API where returned strings are not
> > > either pool-allocated or constant (process lifetime)?  It seems unusual
> > > and very surprising to me.
> > >
> > > A hypothetical API consumer can have made either assumption:
> > >
> > > a) of constant memory (true with limits on Win32, breaks for Unix), or
> > > b) of pool-allocated string lifetime (works for Unix, breaks for Win32)
> > >
> > > neither is documented and both are broken by current implementations. It
> > > is impossible to satisfy both so anybody can argue that fixing either
> > > implementation to match the other is a regression.
> > >
> > > Doubling down on (a) over (b) seems strongly inferior to me, the API
> > > cannot support this without introducing runtime overhead for all callers
> > > (a new iterpool in the dir object), so I'd rather fix this properly with
> > > a new API.
> > >
> > > I'd be fine with leaving Win32 apr_dir_read() behaviour as-is for 1.x at
> > > the same as introducing _pread() if desired - making the strdup only
> > > happen for _pread would only be a minor tweak, not a whole new subpool.
> > >
> >
> > There is example of apr_hash_t.ITERATOR that is statically allocated in
> > the hash object and returned to the caller.
> >
> > Also the native readdir() API behaves the same way [1]:
> > [[[
> > The returned pointer, and pointers within the structure, might be invalidated
> > or the structure or the storage areas might be overwritten by a subsequent call
> > to readdir() on the same directory stream.
> > ]]]
> >
> > From this perspective the current behavior of apr_dir_read on Unix looks
> > inconsistent and surprising for the API user because it is impossible to use
> > in a loop without unbounded memory usage.
> >
> > I strongly agree that the revved version of this function that accepts a POOL
> > argument will be good from both usage and implementation terms. I would be +1
> > on adding such API. But I also think that unless we fix the original issue
> > by using the preallocated buffer/iterpool, the whole thing would still be
> > problematic for any existing API user.
> >
> > More detailed: if we take this opportunity to choose and document the API to
> > consistently return a temporary result, then as the caller I can either process
> > the data during iteration or manually put it into a long-living pool if that is
> > necessary. I think that all current callers work with the API this way. And I
> > can also switch to the new revved function to manually specify the pool and
> > better control the allocations. If I would like to support both APR 1.x and 2.x
> > the compatibility is also straightforward to achieve.
> >
> > If we instead choose an option where the results may be allocated
> > in the dir object pool, then as the caller I cannot avoid unbounded memory
> > usage when supporting both APR 1.x and 2.x. And this approach also introduces
> > the unavoidable the unbounded memory usage for all Win32 users that
> > do not update their code.
> >
> > [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html#tag_16_475
> >
> Is there any feedback or thoughts on these proposed changes?
>
> Thanks!
>

For what it's worth, I'm -1 on the changes done in r1862071 and
r1862435 for the reasons listed above.

PS: No idea if I can actually veto changes, because while I'm a
committer on the APR project, I am not in its PMC. But "-1" properly
expresses my technical opinion on this topic.

--
Ivan Zhakov

Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On Mon, 8 Jul 2019 at 20:05, Ivan Zhakov <iv...@visualsvn.com> wrote:
>
> On Wed, 3 Jul 2019 at 18:37, Joe Orton <jo...@redhat.com> wrote:
> >
> > On Wed, Jul 03, 2019 at 02:43:02PM +0300, Ivan Zhakov wrote:
> > > They also make the existing old API unusable for many cases thus
> > > making a switch to the new API mandatory, even though it doesn't have
> > > to be so (see below).
> > >
> > > APR 1.x did not specify that the result of apr_dir_read() has to be allocated
> > > in dir->pool:
> > >
> > >   https://apr.apache.org/docs/apr/1.7/group__apr__dir.html#ga3e4ee253e0c712160bee10bfb9c8e4a8
> > >
> > > This function does not accept a pool argument, and I think that it's natural
> > > to assume limited lifetime in such case. This is what the current major APR
> > > users (httpd, svn) do, and other users should probably be ready for that too,
> > > since on Win32 the subsequent calls to apr_dir_read() already invalidate the
> > > previous apr_finfo_t.
> >
> > Are there many examples in the APR API where returned strings are not
> > either pool-allocated or constant (process lifetime)?  It seems unusual
> > and very surprising to me.
> >
> > A hypothetical API consumer can have made either assumption:
> >
> > a) of constant memory (true with limits on Win32, breaks for Unix), or
> > b) of pool-allocated string lifetime (works for Unix, breaks for Win32)
> >
> > neither is documented and both are broken by current implementations. It
> > is impossible to satisfy both so anybody can argue that fixing either
> > implementation to match the other is a regression.
> >
> > Doubling down on (a) over (b) seems strongly inferior to me, the API
> > cannot support this without introducing runtime overhead for all callers
> > (a new iterpool in the dir object), so I'd rather fix this properly with
> > a new API.
> >
> > I'd be fine with leaving Win32 apr_dir_read() behaviour as-is for 1.x at
> > the same as introducing _pread() if desired - making the strdup only
> > happen for _pread would only be a minor tweak, not a whole new subpool.
> >
>
> There is example of apr_hash_t.ITERATOR that is statically allocated in
> the hash object and returned to the caller.
>
> Also the native readdir() API behaves the same way [1]:
> [[[
> The returned pointer, and pointers within the structure, might be invalidated
> or the structure or the storage areas might be overwritten by a subsequent call
> to readdir() on the same directory stream.
> ]]]
>
> From this perspective the current behavior of apr_dir_read on Unix looks
> inconsistent and surprising for the API user because it is impossible to use
> in a loop without unbounded memory usage.
>
> I strongly agree that the revved version of this function that accepts a POOL
> argument will be good from both usage and implementation terms. I would be +1
> on adding such API. But I also think that unless we fix the original issue
> by using the preallocated buffer/iterpool, the whole thing would still be
> problematic for any existing API user.
>
> More detailed: if we take this opportunity to choose and document the API to
> consistently return a temporary result, then as the caller I can either process
> the data during iteration or manually put it into a long-living pool if that is
> necessary. I think that all current callers work with the API this way. And I
> can also switch to the new revved function to manually specify the pool and
> better control the allocations. If I would like to support both APR 1.x and 2.x
> the compatibility is also straightforward to achieve.
>
> If we instead choose an option where the results may be allocated
> in the dir object pool, then as the caller I cannot avoid unbounded memory
> usage when supporting both APR 1.x and 2.x. And this approach also introduces
> the unavoidable the unbounded memory usage for all Win32 users that
> do not update their code.
>
> [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html#tag_16_475
>
Is there any feedback or thoughts on these proposed changes?

Thanks!

--
Ivan Zhakov

Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On Wed, 3 Jul 2019 at 18:37, Joe Orton <jo...@redhat.com> wrote:
>
> On Wed, Jul 03, 2019 at 02:43:02PM +0300, Ivan Zhakov wrote:
> > They also make the existing old API unusable for many cases thus
> > making a switch to the new API mandatory, even though it doesn't have
> > to be so (see below).
> >
> > APR 1.x did not specify that the result of apr_dir_read() has to be allocated
> > in dir->pool:
> >
> >   https://apr.apache.org/docs/apr/1.7/group__apr__dir.html#ga3e4ee253e0c712160bee10bfb9c8e4a8
> >
> > This function does not accept a pool argument, and I think that it's natural
> > to assume limited lifetime in such case. This is what the current major APR
> > users (httpd, svn) do, and other users should probably be ready for that too,
> > since on Win32 the subsequent calls to apr_dir_read() already invalidate the
> > previous apr_finfo_t.
>
> Are there many examples in the APR API where returned strings are not
> either pool-allocated or constant (process lifetime)?  It seems unusual
> and very surprising to me.
>
> A hypothetical API consumer can have made either assumption:
>
> a) of constant memory (true with limits on Win32, breaks for Unix), or
> b) of pool-allocated string lifetime (works for Unix, breaks for Win32)
>
> neither is documented and both are broken by current implementations. It
> is impossible to satisfy both so anybody can argue that fixing either
> implementation to match the other is a regression.
>
> Doubling down on (a) over (b) seems strongly inferior to me, the API
> cannot support this without introducing runtime overhead for all callers
> (a new iterpool in the dir object), so I'd rather fix this properly with
> a new API.
>
> I'd be fine with leaving Win32 apr_dir_read() behaviour as-is for 1.x at
> the same as introducing _pread() if desired - making the strdup only
> happen for _pread would only be a minor tweak, not a whole new subpool.
>

There is example of apr_hash_t.ITERATOR that is statically allocated in
the hash object and returned to the caller.

Also the native readdir() API behaves the same way [1]:
[[[
The returned pointer, and pointers within the structure, might be invalidated
or the structure or the storage areas might be overwritten by a subsequent call
to readdir() on the same directory stream.
]]]

From this perspective the current behavior of apr_dir_read on Unix looks
inconsistent and surprising for the API user because it is impossible to use
in a loop without unbounded memory usage.

I strongly agree that the revved version of this function that accepts a POOL
argument will be good from both usage and implementation terms. I would be +1
on adding such API. But I also think that unless we fix the original issue
by using the preallocated buffer/iterpool, the whole thing would still be
problematic for any existing API user.

More detailed: if we take this opportunity to choose and document the API to
consistently return a temporary result, then as the caller I can either process
the data during iteration or manually put it into a long-living pool if that is
necessary. I think that all current callers work with the API this way. And I
can also switch to the new revved function to manually specify the pool and
better control the allocations. If I would like to support both APR 1.x and 2.x
the compatibility is also straightforward to achieve.

If we instead choose an option where the results may be allocated
in the dir object pool, then as the caller I cannot avoid unbounded memory
usage when supporting both APR 1.x and 2.x. And this approach also introduces
the unavoidable the unbounded memory usage for all Win32 users that
do not update their code.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html#tag_16_475


-- 
Ivan Zhakov

Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c

Posted by Joe Orton <jo...@redhat.com>.
On Wed, Jul 03, 2019 at 02:43:02PM +0300, Ivan Zhakov wrote:
> They also make the existing old API unusable for many cases thus 
> making a switch to the new API mandatory, even though it doesn't have 
> to be so (see below).
>
> APR 1.x did not specify that the result of apr_dir_read() has to be allocated
> in dir->pool:
> 
>   https://apr.apache.org/docs/apr/1.7/group__apr__dir.html#ga3e4ee253e0c712160bee10bfb9c8e4a8
> 
> This function does not accept a pool argument, and I think that it's natural
> to assume limited lifetime in such case. This is what the current major APR
> users (httpd, svn) do, and other users should probably be ready for that too,
> since on Win32 the subsequent calls to apr_dir_read() already invalidate the
> previous apr_finfo_t.

Are there many examples in the APR API where returned strings are not 
either pool-allocated or constant (process lifetime)?  It seems unusual 
and very surprising to me.

A hypothetical API consumer can have made either assumption:

a) of constant memory (true with limits on Win32, breaks for Unix), or
b) of pool-allocated string lifetime (works for Unix, breaks for Win32)

neither is documented and both are broken by current implementations. It 
is impossible to satisfy both so anybody can argue that fixing either 
implementation to match the other is a regression.  

Doubling down on (a) over (b) seems strongly inferior to me, the API 
cannot support this without introducing runtime overhead for all callers 
(a new iterpool in the dir object), so I'd rather fix this properly with 
a new API.

I'd be fine with leaving Win32 apr_dir_read() behaviour as-is for 1.x at 
the same as introducing _pread() if desired - making the strdup only 
happen for _pread would only be a minor tweak, not a whole new subpool.

Regards, Joe

Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On Tue, 2 Jul 2019 at 13:18, Joe Orton <jo...@redhat.com> wrote:
>
> On Tue, Jul 02, 2019 at 09:59:20AM +0200, Branko Čibej wrote:
> > On 02.07.2019 08:49, Joe Orton wrote:
> > > On Thu, Jun 27, 2019 at 05:01:56PM +0300, Ivan Zhakov wrote:
> > >> On Tue, 25 Jun 2019 at 17:21, <jo...@apache.org> wrote:
> > >>
> > >>> Author: jorton
> > >>> Date: Tue Jun 25 14:21:56 2019
> > >>> New Revision: 1862071
> > >>>
> > >>> URL: http://svn.apache.org/viewvc?rev=1862071&view=rev
> > >>> Log:
> > >>> Add apr_dir_pread(), a variant of apr_dir_read() which allows callers
> > >>> to read a directory with constant memory consumption:
> > > ...
> > >> I'm not sure it's best fix. Better solution would be allocate buffer for
> > >> dirname + MAX_FILE_NAME in apr_dir_t and reuse it on every iteration. Win32
> > >> already has such code.
> >
> > > I didn't look at win32 too closely, so on win32 currently doing:
> > >
> > >   apr_dir_open(&x, ...);
> > >   apr_dir_read(&f1, ..., x);
> > >   apr_dir_read(&f2, ..., x);
> > >
> > > invalidates f1?  That's pretty ugly, there is no indication in the API
> > > that apr_dir_read() has side-effects like that.
> >
> > The 2.0 docstring says explicitly that "memory is allocated in the pool
> > passed to apr_dir_open()". While this is really bad API design (hence
> > apr_dir_pread() ...), it pretty much forbids the implementation from
> > reusing an internal name buffer. Looks like the Windows implementation
> > needs fixing?
>
> Reading the win32 code a bit more, it is not constant-memory even with
> the buffer in apr_dir_t, because it can allocate from dir->pool (and
> even create cleanups!) in the more_finfo() call, so I think the current
> behaviour is not justifiable.
>
> Ivan, do you have a counter-proposal to apr_dir_pread() to allow
> constant memory while reading a directory?  If we could have a hard
> constraint on apr_dir_read() *never* allocating any memory I guess that
> would work but appears to require substantial changes to the win32 side
> which I'm not qualified for.
>
> IMO not pstrdup'ing the filenames on Win32 is an API violation.  You
> have to be very clear in documenting things like that if someone is
> passing in a struct *.
>
Hi Joe,

I think that these changes to apr_dir_read() may be troublesome, because
they basically *introduce* unbounded memory usage for any existing user of
apr_dir_read() on Win32. They also make the existing old API unusable for
many cases thus making a switch to the new API mandatory, even though it
doesn't have to be so (see below).

APR 1.x did not specify that the result of apr_dir_read() has to be allocated
in dir->pool:

  https://apr.apache.org/docs/apr/1.7/group__apr__dir.html#ga3e4ee253e0c712160bee10bfb9c8e4a8

This function does not accept a pool argument, and I think that it's natural
to assume limited lifetime in such case. This is what the current major APR
users (httpd, svn) do, and other users should probably be ready for that too,
since on Win32 the subsequent calls to apr_dir_read() already invalidate the
previous apr_finfo_t.

With all that in mind, here is what I propose instead:

1) First, fix the apr_dir_read()'s unbounded memory usage without introducing
   the new API.

  I think that this can be done by using the preallocated buffer for the
  filename or by introducing an internal `iterpool` that will be cleared
  by subsequent apr_dir_read() calls.

   If there are other Win32-related unbounded memory usages in the current
   apr_dir_read() implementation, I'll also be willing to fix them.

2) For APR 2.0, revv apr_dir_read() to apr_dir_read2() that accepts two pools:
   `result_pool` and `scratch_pool`.

   I think that this would be an appropriate change by itself (by giving the
   caller more fine-grained control over the allocations), but I also see it
   as an improvement that is unrelated to (1).

   In other words, I think that we should fix the existing problem first by
   changing or plumbing the existing API and only then add the new API that
   improves over it, because doing so doesn't make the switch to the new API
   mandatory for existing users who want to avoid regressions.

-- 
Ivan Zhakov

Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c

Posted by Joe Orton <jo...@redhat.com>.
On Tue, Jul 02, 2019 at 09:59:20AM +0200, Branko Čibej wrote:
> On 02.07.2019 08:49, Joe Orton wrote:
> > On Thu, Jun 27, 2019 at 05:01:56PM +0300, Ivan Zhakov wrote:
> >> On Tue, 25 Jun 2019 at 17:21, <jo...@apache.org> wrote:
> >>
> >>> Author: jorton
> >>> Date: Tue Jun 25 14:21:56 2019
> >>> New Revision: 1862071
> >>>
> >>> URL: http://svn.apache.org/viewvc?rev=1862071&view=rev
> >>> Log:
> >>> Add apr_dir_pread(), a variant of apr_dir_read() which allows callers
> >>> to read a directory with constant memory consumption:
> > ...
> >> I'm not sure it's best fix. Better solution would be allocate buffer for
> >> dirname + MAX_FILE_NAME in apr_dir_t and reuse it on every iteration. Win32
> >> already has such code.
>
> > I didn't look at win32 too closely, so on win32 currently doing:
> >
> >   apr_dir_open(&x, ...);
> >   apr_dir_read(&f1, ..., x);
> >   apr_dir_read(&f2, ..., x);
> >
> > invalidates f1?  That's pretty ugly, there is no indication in the API 
> > that apr_dir_read() has side-effects like that.
> 
> The 2.0 docstring says explicitly that "memory is allocated in the pool
> passed to apr_dir_open()". While this is really bad API design (hence
> apr_dir_pread() ...), it pretty much forbids the implementation from
> reusing an internal name buffer. Looks like the Windows implementation
> needs fixing?

Reading the win32 code a bit more, it is not constant-memory even with 
the buffer in apr_dir_t, because it can allocate from dir->pool (and 
even create cleanups!) in the more_finfo() call, so I think the current 
behaviour is not justifiable.

Ivan, do you have a counter-proposal to apr_dir_pread() to allow 
constant memory while reading a directory?  If we could have a hard 
constraint on apr_dir_read() *never* allocating any memory I guess that 
would work but appears to require substantial changes to the win32 side 
which I'm not qualified for.

IMO not pstrdup'ing the filenames on Win32 is an API violation.  You 
have to be very clear in documenting things like that if someone is 
passing in a struct *.

Regards, Joe

Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c

Posted by Branko Čibej <br...@apache.org>.
On 02.07.2019 08:49, Joe Orton wrote:
> On Thu, Jun 27, 2019 at 05:01:56PM +0300, Ivan Zhakov wrote:
>> On Tue, 25 Jun 2019 at 17:21, <jo...@apache.org> wrote:
>>
>>> Author: jorton
>>> Date: Tue Jun 25 14:21:56 2019
>>> New Revision: 1862071
>>>
>>> URL: http://svn.apache.org/viewvc?rev=1862071&view=rev
>>> Log:
>>> Add apr_dir_pread(), a variant of apr_dir_read() which allows callers
>>> to read a directory with constant memory consumption:
> ...
>> I'm not sure it's best fix. Better solution would be allocate buffer for
>> dirname + MAX_FILE_NAME in apr_dir_t and reuse it on every iteration. Win32
>> already has such code.
> I didn't look at win32 too closely, so on win32 currently doing:
>
>   apr_dir_open(&x, ...);
>   apr_dir_read(&f1, ..., x);
>   apr_dir_read(&f2, ..., x);
>
> invalidates f1?  That's pretty ugly, there is no indication in the API 
> that apr_dir_read() has side-effects like that.

The 2.0 docstring says explicitly that "memory is allocated in the pool
passed to apr_dir_open()". While this is really bad API design (hence
apr_dir_pread() ...), it pretty much forbids the implementation from
reusing an internal name buffer. Looks like the Windows implementation
needs fixing?

-- Brane


Re: svn commit: r1862071 - in /apr/apr/trunk: file_io/os2/dir.c file_io/unix/dir.c file_io/win32/dir.c include/apr_file_info.h test/testdir.c

Posted by Joe Orton <jo...@redhat.com>.
On Thu, Jun 27, 2019 at 05:01:56PM +0300, Ivan Zhakov wrote:
> On Tue, 25 Jun 2019 at 17:21, <jo...@apache.org> wrote:
> 
> > Author: jorton
> > Date: Tue Jun 25 14:21:56 2019
> > New Revision: 1862071
> >
> > URL: http://svn.apache.org/viewvc?rev=1862071&view=rev
> > Log:
> > Add apr_dir_pread(), a variant of apr_dir_read() which allows callers
> > to read a directory with constant memory consumption:
...
> I'm not sure it's best fix. Better solution would be allocate buffer for
> dirname + MAX_FILE_NAME in apr_dir_t and reuse it on every iteration. Win32
> already has such code.

I didn't look at win32 too closely, so on win32 currently doing:

  apr_dir_open(&x, ...);
  apr_dir_read(&f1, ..., x);
  apr_dir_read(&f2, ..., x);

invalidates f1?  That's pretty ugly, there is no indication in the API 
that apr_dir_read() has side-effects like that.