Posted to reviews@impala.apache.org by "Amogh Margoor (Code Review)" <ge...@cloudera.org> on 2021/07/05 14:34:48 UTC
[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17389 )
Change subject: IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library
......................................................................
Patch Set 6:
> (4 comments)
>
> It is great to know that Impala can achieve a 926 MB/s conversion
> rate, and it is very tempting to get the best from fast_double_parser() :-)
>
> The key is not to populate a new std::string when the original
> input already conforms to the requirements of the library (a
> well-formed null-terminated string, available via string::c_str()
> in constant time), which should be true in most cases.
>
> Throughout the code base of Impala, I was able to find only the
> following call that needs the service of converting string to
> double which makes the above idea feasible.
>
> static bool ParseProbability(const string& prob_str, bool* should_execute) {
>   StringParser::ParseResult parse_result;
>   double probability = StringParser::StringToFloat<double>(
>       prob_str.c_str(), prob_str.size(), &parse_result);
>   if (parse_result != StringParser::PARSE_SUCCESS ||
>       probability < 0.0 || probability > 1.0) {
>     return false;
>   }
>   // +1L ensures probability of 0.0 and 1.0 work as expected.
>   *should_execute = rand() < probability * (RAND_MAX + 1L);
>   return true;
> }
Hi Qifan, sorry for the late reply. The other important code path that can produce non-null-terminated strings is the cast, e.g. 'select cast("0.454" as double)' or 'select cast(x as double) from foo'. That path goes through CastFunctions::CastToDoubleVal, which is generated by this macro:
#define CAST_FROM_STRING(num_type, native_type, string_parser_fn) \
  num_type CastFunctions::CastTo##num_type(FunctionContext* ctx, const StringVal& val) { \
    if (val.is_null) return num_type::null(); \
    StringParser::ParseResult result; \
    num_type ret; \
    ret.val = StringParser::string_parser_fn<native_type>( \
        reinterpret_cast<char*>(val.ptr), val.len, &result); \
    if (UNLIKELY(result != StringParser::PARSE_SUCCESS)) return num_type::null(); \
    return ret; \
  }
This path is probably exercised frequently, depending on how heavily clients use casts. But your point is valid: a well-formed null-terminated string needs no extra processing and should be passed directly to the library function.
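
To make the two cases concrete, here is a minimal sketch of the idea. It is not the code in the patch. ParseDoubleFromBuffer and its null_terminated flag are hypothetical names, and std::strtod stands in for any parser that needs a null-terminated input (such as the fast_double_parser entry point). The caller is assumed to know whether the buffer is terminated, as it does in ParseProbability via c_str(), while a StringVal coming from a cast gives no such guarantee.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper for illustration only; not Impala's actual API.
// std::strtod is a stand-in for a parser that requires a '\0' terminator.
bool ParseDoubleFromBuffer(const char* s, std::size_t len, bool null_terminated,
                           double* out) {
  if (len == 0) return false;
  std::string copy;  // Populated only on the slow path.
  const char* begin = s;
  if (!null_terminated) {
    // Slow path: the buffer (e.g. a StringVal) may not end in '\0', so make a
    // terminated copy before handing it to the parser.
    copy.assign(s, len);
    begin = copy.c_str();
  }
  char* end = nullptr;
  *out = std::strtod(begin, &end);
  // Succeed only if the parser consumed exactly the len bytes we were given.
  return end == begin + len;
}
```

With this shape, the ParseProbability call site would pass null_terminated = true and skip the copy, while the cast path would pass false.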
--
To view, visit http://gerrit.cloudera.org:8080/17389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141
Gerrit-Change-Number: 17389
Gerrit-PatchSet: 6
Gerrit-Owner: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Mon, 05 Jul 2021 14:34:48 +0000
Gerrit-HasComments: No