Posted to dev@stdcxx.apache.org by "Martin Sebor (JIRA)" <ji...@apache.org> on 2007/08/27 01:55:31 UTC
[jira] Updated: (STDCXX-239) std::num_get::do_get() cannot parse nan, infinity
[ https://issues.apache.org/jira/browse/STDCXX-239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Martin Sebor updated STDCXX-239:
--------------------------------
Affects Version/s: 4.1.4
Fix Version/s: 4.3
Added 4.1.4 to the set of affected versions and scheduled for 4.3 (I expect this feature to be bigger than what would be appropriate for a patch release, possibly even too big for a minor release).
> std::num_get::do_get() cannot parse nan, infinity
> -------------------------------------------------
>
> Key: STDCXX-239
> URL: https://issues.apache.org/jira/browse/STDCXX-239
> Project: C++ Standard Library
> Issue Type: New Feature
> Components: 22. Localization
> Affects Versions: 4.1.2, 4.1.3, 4.1.4
> Environment: all
> Reporter: Martin Sebor
> Fix For: 4.3
>
>
> Moved from the Rogue Wave bug tracking database:
> ****Created By: sebor @ Apr 04, 2000 07:13:59 PM****
> The num_get<> facet's do_get() members fail to take the special strings [-]inf[inity] and [-]nan into account. The facet reports an error when it encounters such strings. See 7.19.6.1 and 7.19.6.2 of C99 for a list of allowed strings.
> The fix for this will not be trivial due to the messy implementation of the facets. It might be easier just to rewrite them from scratch.
> The testcase below demonstrates the incorrect behavior. Modified test case added as tests/regress/src/test_issue22564.cpp - see p4 describe 22408.
> $ g++ ... test.cpp
> $ a.out 0 1 inf infinity nan INF INFINITY NAN
> sscanf("0", "%lf") --> 0.000000
> num_get<>::do_get("0", ...) --> 0.000000
> sscanf("1", "%lf") --> 1.000000
> num_get<>::do_get("1", ...) --> 1.000000
> sscanf("inf", "%lf") --> inf
> num_get<>::do_get("inf", ...) --> error
> sscanf("infinity", "%lf") --> inf
> num_get<>::do_get("infinity", ...) --> error
> sscanf("nan", "%lf") --> nan
> num_get<>::do_get("nan", ...) --> error
> sscanf("INF", "%lf") --> inf
> num_get<>::do_get("INF", ...) --> error
> sscanf("INFINITY", "%lf") --> inf
> num_get<>::do_get("INFINITY", ...) --> error
> sscanf("NAN", "%lf") --> nan
> num_get<>::do_get("NAN", ...) --> error
> $ cat test.cpp
> #include <iostream>
> #include <locale>
> #include <stdio.h>
> #include <string.h>
> using namespace std;
> int main (int argc, const char *argv[])
> {
>     num_get<char, const char*> nget;
>     for (int i = 1; i != argc; ++i) {
>         double x = 0, y = 0;
>         ios::iostate err = ios::goodbit;
>         nget.get (argv [i], argv [i] + strlen (argv [i]), cin, err, x);
>         if (1 != sscanf (argv [i], "%lf", &y))
>             printf ("sscanf(\"%s\", \"%%lf\") --> error\n", argv [i]);
>         else
>             printf ("sscanf(\"%s\", \"%%lf\") --> %f\n", argv [i], y);
>         if ((ios::failbit | ios::badbit) & err)
>             printf ("num_get<>::do_get(\"%s\", ...) --> error\n", argv [i]);
>         else
>             printf ("num_get<>::do_get(\"%s\", ...) --> %f\n", argv [i], x);
>     }
> }
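Note for anyone rebuilding the test case above against another implementation: the standard declares the destructor of std::num_get<> protected, so the stack object "nget" may not compile everywhere. The sketch below is one possible workaround, not the original test. It hands the facet to a locale, which then owns it. The helper name parse_with_num_get is made up for illustration.

```cpp
#include <cstring>
#include <ios>
#include <locale>
#include <sstream>

// Hypothetical helper, not part of the original test case.
// Parses [s, s + strlen(s)) with num_get<char, const char*> and reports
// whether the facet left both failbit and badbit clear.
bool parse_with_num_get(const char* s, double& out)
{
    typedef std::num_get<char, const char*> facet_type;

    // The locale takes ownership of the facet and destroys it later.
    const std::locale loc(std::locale::classic(), new facet_type);
    const facet_type& nget = std::use_facet<facet_type>(loc);

    // Any ios_base will do as the source of format flags and numpunct<char>.
    std::istringstream fmt;
    fmt.imbue(loc);

    std::ios_base::iostate err = std::ios_base::goodbit;
    out = 0;
    nget.get(s, s + std::strlen(s), fmt, err, out);
    return !(err & (std::ios_base::failbit | std::ios_base::badbit));
}
```

Calling parse_with_num_get(argv [i], x) in place of the nget.get() call and the err test should give the same two outcomes (value or error) that the original loop prints.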
> ****Modified By: sebor @ Apr 09, 2000 09:31:49 PM****
> Fixed with p4 describe 22544. Test case fixed with p4 describe 22545. Closed.
> ****Modified By: leroy @ Mar 30, 2001 03:09:11 PM****
> Change 22544 by sebor@sebor_dev_killer on 2000/04/09 20:30:50
> Added support for inf[inity] and nan[(n-char-sequence)] as described
> in 7.19.6.1, p8 of C99.
> nan(n-char-sequence) currently treated the same as nan due to poor
> implementation of std::num_get<> and supporting classes - fix requires
> at least a partial rewrite of the facet.
> Resolves Onyx #22564 (and the duplicate #22601).
> Affected files ...
> ... //stdlib2/dev/source/src/include/rw/numbrw#17 edit
> ... //stdlib2/dev/source/src/include/rw/numbrw.cc#12 edit
> ... //stdlib2/dev/source/vendor.cpp#17 edit
> ****Modified By: sebor @ Apr 03, 2001 08:46:50 PM****
> It looks like this is actually not a bug and the fix is wrong (even as an extension). Here's some background...
> Subject: Is this a permissible extension?
> Date: Thu, 8 Feb 2001 18:16:18 -0500 (EST)
> From: Andrew Koenig <ar...@research.att.com>
> Reply-To: c++std-lib@research.att.com
> To: C++ libraries mailing list
> Message c++std-lib-8281
> Suppose we execute
> double x;
> std::cin >> x;
> at a point where the input stream contains
> NaN
> followed perhaps by other characters.
> One might plausibly expect an implementation to set x to NaN
> on an implementation that supports IEEE floating-point.
> Surely the standard cannot mandate such behavior, because not
> every implementation knows what NaN is. However, on an implementation
> that does support NaN, is such behavior a permitted extension?
> My first attempt at an answer is no, because if I track through the
> standard, I find that the behavior of this statement is defined
> as being identical to the behavior of strtod in c89, and that behavior
> requires at least one digit in the input in order for the input to
> be valid. However, I might have missed something. Have I?
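The message above argues from the C89 definition of strtod. A minimal sketch for checking what a particular C library's strtod actually does with such input follows. The helper name strtod_consumed is hypothetical, and the result for "NaN" depends on the library: a C99-conforming strtod accepts it, while a strict C89 one would consume nothing.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: stores the converted value in out and returns the
// number of characters of s that strtod() consumed (0 means no conversion).
std::size_t strtod_consumed(const char* s, double& out)
{
    char* end = 0;
    out = std::strtod(s, &end);
    return static_cast<std::size_t>(end - s);
}
```

For example, strtod_consumed("1.5abc", x) consumes the three characters "1.5" and leaves "abc" unread. Whether "NaN" is consumed is exactly the C89 versus C99 difference at issue.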
> ****Modified By: sebor @ Apr 03, 2001 08:48:03 PM****
> Subject: Re: Is this a permissible extension?
> Date: Fri, 09 Feb 2001 09:28:25 -0800
> From: Matt Austern <au...@research.att.com>
> Reply-To: c++std-lib@research.att.com
> Organization: AT&T Labs - Research
> References: 1 , 2
> To: C++ libraries mailing list
> Message c++std-lib-8284
> Andrew Koenig wrote:
> > Fred> In "C" locale, only decimal floating-point constants are valid.
> > Fred> So, no NaN nor Infinity is allowed.
> >
> > Yes -- I was talking about the default locale.
> Actually, I think that strtod isn't the important part, at least for
> discussing C++. I think that this is an illegal extension in all
> named locales.
> First, let me explain why I said *named* locales. If you construct
> a locale with locale("foo"), the way it works is that the locale is
> built up out of _byname facets instead of base class facets. Except
> that not all facets have _byname derived classes, so in some cases
> you've still got the default behavior from the facet base class.
> One of the facets that has no _byname variant is num_get<>. So if I
> can construct an argument that the documented behavior of num_get<>
> precludes this extension, I have also proved that this extension is
> impossible in any named locale. This argument does not apply to
> arbitrary locales, since an arbitrary locale may replace any base
> class facet with a facet that inherits from it.
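To illustrate the last point (an arbitrary locale may replace a base class facet with one derived from it), here is a rough sketch of such a replacement. The class nan_inf_num_get and the helper read_double are hypothetical and are not part of stdcxx. The sketch handles only the unsigned words "nan", "inf" and "infinity", ignores nan(n-char-sequence), and overrides only the double overload.

```cpp
#include <cctype>
#include <cstddef>
#include <ios>
#include <limits>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: recognizes unsigned "nan", "inf" and "infinity"
// (any case) and delegates everything else to the base class.
class nan_inf_num_get : public std::num_get<char> {
public:
    explicit nan_inf_num_get(std::size_t refs = 0)
        : std::num_get<char>(refs) {}

protected:
    iter_type do_get(iter_type in, iter_type end, std::ios_base& str,
                     std::ios_base::iostate& err, double& val) const
    {
        // Input that does not start with a letter is ordinary numeric input.
        if (in == end || !std::isalpha(static_cast<unsigned char>(*in)))
            return std::num_get<char>::do_get(in, end, str, err, val);

        // Collect the run of letters, lowercased.
        std::string word;
        for (; in != end && std::isalpha(static_cast<unsigned char>(*in)); ++in)
            word += static_cast<char>(
                std::tolower(static_cast<unsigned char>(*in)));

        if (word == "nan")
            val = std::numeric_limits<double>::quiet_NaN();
        else if (word == "inf" || word == "infinity")
            val = std::numeric_limits<double>::infinity();
        else
            err |= std::ios_base::failbit;

        if (in == end)
            err |= std::ios_base::eofbit;
        return in;
    }
};

// Hypothetical helper: extracts one double from text through a stream
// whose locale carries the facet above; ok reports whether it succeeded.
double read_double(const std::string& text, bool& ok)
{
    std::istringstream in(text);
    in.imbue(std::locale(in.getloc(), new nan_inf_num_get));
    double x = 0;
    ok = !(in >> x).fail();
    return x;
}
```

The facet is installed under num_get<char>::id, so operator>> picks it up without further changes. Only a stream imbued with such a locale is affected.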
> OK, now the argument I promised, saying that num_get<> can't recognize
> the character string "NaN".
> 22.2.2.1.2, paragraph 2: num_get's overloaded conversion function,
> num_get::do_get(), works in three stages.
> (1) It determines conversion specifiers. We're OK so far.
> (2) It accumulates characters from a provided input iterator.
> (3) It uses the conversion specifiers and the characters it has
> accumulated to produce a number.
> Stage 2 is the crucial one. It's described in 22.2.2.1.2/8-10, in
> great detail.
> For each character,
> (a) We get it from a supplied input iterator.
> (b) We look it up in a lookup table whose contents are prescribed
> by the standard. (This has to do with wide characters, but there
> is no exception for the special case where you're reading narrow
> characters.)
> (c) If a character is found in the lookup table, or if it's a decimal
> point or a thousands sep, then it's checked to see if it can
> legally appear in the number at that point. If so, we keep
> accumulating characters.
> The characters in the lookup table are "0123456789abcdefABCDEF+-".
> Library issue 221 would amend that to "0123456789abcdefxABCDEFX+-".
> "N" isn't present in the lookup table, so stage 2 of num_get<>::do_get()
> is not permitted to read the character sequence "NaN".
> If you want to argue that num_get<>::do_get() is overspecified, I
> wouldn't disagree too violently.
> --Matt
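The stage 2 argument in the message above can be modeled roughly as follows. This is a simplified sketch under stated assumptions, not the facet's actual implementation: it uses the lookup table quoted in the message, treats '.' as the decimal point, and leaves out thousands separators and the check that a character is legal at its position. The function name accumulate_stage2 is made up for illustration.

```cpp
#include <cstring>
#include <string>

// Simplified model of stage 2: accumulate characters until one is found
// that is neither in the lookup table nor the decimal point.
std::string accumulate_stage2(const std::string& input,
                              char decimal_point = '.')
{
    // Lookup table as quoted in the message (before library issue 221).
    static const char atoms[] = "0123456789abcdefABCDEF+-";

    std::string accumulated;
    for (std::string::size_type i = 0; i != input.size(); ++i) {
        const char c = input[i];
        const bool in_table = 0 != std::memchr(atoms, c, sizeof atoms - 1);
        if (!in_table && c != decimal_point)
            break;  // stage 2 stops at the first character it cannot use
        accumulated += c;
    }
    return accumulated;
}
```

With this model, "NaN", "nan" and "inf" all yield an empty string, because 'N', 'n' and 'i' are not in the table, so stage 3 has nothing to convert. An input such as "12NaN" yields "12".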