You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@stdcxx.apache.org by "Martin Sebor (JIRA)" <ji...@apache.org> on 2007/04/30 22:09:15 UTC
[jira] Assigned: (STDCXX-397) std::sort introsort implementation
error
[ https://issues.apache.org/jira/browse/STDCXX-397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Martin Sebor reassigned STDCXX-397:
-----------------------------------
Assignee: Martin Sebor
> std::sort introsort implementation error
> ----------------------------------------
>
> Key: STDCXX-397
> URL: https://issues.apache.org/jira/browse/STDCXX-397
> Project: C++ Standard Library
> Issue Type: Bug
> Components: 25. Algorithms
> Affects Versions: 4.1.3
> Environment: Bug in algorithm.cc effects all platforms
> Reporter: Joshua Lehrer
> Assigned To: Martin Sebor
>
> introsort is designed to detect when an input set would push quicksort into its worst-case scenario N^2 and fall back on a slower, yet still NLogN algorithm.
> The implementation in __introsort_loop has a bug, however, and it fails to catch all of the scenarios. While I can not supply an exact input set to demonstrate the bug, I can explain the bug very easily.
> First, allow me to paste in the code:
> // David R. Musser's Introspective Sorting algorithm
> // O(N * log (N)) worst case complexity
> _EXPORT
> template <class _RandomAccessIter, class _Dist, class _Compare>
> void __introsort_loop (_RandomAccessIter __first, _RandomAccessIter __last,
> _Dist __max_depth, _Compare __comp)
> {
> for (; __last - __first > __rw_threshold; __max_depth /= 2) {
> if (0 == __max_depth) {
> __partial_sort (__first, __last, __last,
> _RWSTD_VALUE_TYPE (_RandomAccessIter), __comp);
> break;
> }
> _RandomAccessIter __cut =
> __unguarded_partition (__first, __last,
> __median (*__first,
> *(__first + (__last - __first) /2),
> *(__last - 1), __comp), __comp);
> // limit the depth of the recursion tree to log2 (last - first)
> // where first and last are the initial values passed in from sort()
> __introsort_loop (__cut, __last, __max_depth, __comp);
> __last = __cut;
> }
> }
> the variable '__max_depth' is supposed to be cut in half on each subsequent "recursive" call. Once it reaches zero, LogN recurisve calls have been made, and the algorithm falls back on a different sorting algorithm for the remainder.
> The algorithm, as implemented, uses real recursion and tail recursion.
> First, the pivot is selected, the pivot is done, and the algorithm has a left and a right half, hopefully balanced.
> Consider what happens for the LEFT half, which is done using tail recursion. '__last' gets assigned '__cut', then the code goes to the top of the 'for' loop. The test condition of the loop is run, which divides '__max_depth' by two, bringing it closer to zero.
> Now consider what happens for the RIGHT half, which is done using real recursion. The function is called recurisvely on the right. __max_depth is NOT cut in half.
> What would happen if a poor pivot was selected causing the right half to be large and the left half to be small? What if that happens again and again? The real-recursion case is failing to decrement __max_depth until it starts working on the left half. You can see how if the algorithm continually built right-halves that were relatively large that __max_depth never gets decremented, and the algorithm never detects that it has made LogN recurisve calls.
> I believe the proper fix is as follows:
> // David R. Musser's Introspective Sorting algorithm
> // O(N * log (N)) worst case complexity
> _EXPORT
> template <class _RandomAccessIter, class _Dist, class _Compare>
> void __introsort_loop (_RandomAccessIter __first, _RandomAccessIter __last,
> _Dist __max_depth, _Compare __comp)
> {
> for (; __last - __first > __rw_threshold; ) {
> if (0 == __max_depth) {
> __partial_sort (__first, __last, __last,
> _RWSTD_VALUE_TYPE (_RandomAccessIter), __comp);
> break;
> }
> _RandomAccessIter __cut =
> __unguarded_partition (__first, __last,
> __median (*__first,
> *(__first + (__last - __first) /2),
> *(__last - 1), __comp), __comp);
> // limit the depth of the recursion tree to log2 (last - first)
> // where first and last are the initial values passed in from sort()
> __max_depth /= 2;
> __introsort_loop (__cut, __last, __max_depth, __comp);
> __last = __cut;
> }
> }
> "__max_depth/=2" is removed from the "for" loop and placed just above the two recursive calls.
> This fixes the worst-case sample set that I have generated.
> I look forward to your response,
> joshua lehrer
> http://www.lehrerfamily.com/
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.