Posted to jira@arrow.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2022/10/12 00:33:00 UTC
[jira] [Created] (ARROW-17995) [C++] arrow::json::DecimalConverter should rescale values based on the explicit_schema
Quanlong Huang created ARROW-17995:
--------------------------------------
Summary: [C++] arrow::json::DecimalConverter should rescale values based on the explicit_schema
Key: ARROW-17995
URL: https://issues.apache.org/jira/browse/ARROW-17995
Project: Apache Arrow
Issue Type: Bug
Components: C++
Affects Versions: 9.0.0, 8.0.1, 8.0.0, 7.0.1, 7.0.0, 6.0.2, 6.0.1, 6.0.0
Reporter: Quanlong Huang
Assignee: Quanlong Huang
The C++ library does not read JSON decimal values correctly with respect to the explicit_schema: parsed values are not rescaled to the scale of the requested decimal type. This can be reproduced with this hello-world program: [https://github.com/stiga-huang/arrow-helloworld/tree/d267862]
The input JSON file has the following rows:
{code:json}
{"id":1,"str":"Some","price":"30.04"}
{"id":2,"str":"data","price":"1.234"}
{code}
If we read the price column using decimal128(9, 2), the values are
{noformat}
30.04,
12.34
{noformat}
If we use decimal128(9, 3) instead, the values are
{noformat}
3.004,
1.234
{noformat}
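Both outputs are consistent with the parsed digits being stored unchanged and then interpreted with the scale of the target type: "1.234" becomes the integer 1234, which reads as 12.34 at scale 2, and "30.04" becomes 3004, which reads as 3.004 at scale 3. The following is a minimal standalone sketch of that arithmetic. It does not use Arrow; {{ParsedDecimal}}, {{ParseDecimal}} and {{FormatWithScale}} are hypothetical helpers written for this illustration, and they only handle non-negative plain decimals.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a parsed decimal: the digits as an integer
// plus the number of fractional digits found in the string.
struct ParsedDecimal {
  int64_t unscaled;
  int32_t scale;
};

// Parses a non-negative decimal such as "30.04" into {3004, 2}.
// Illustration only: no sign, exponent, or overflow handling.
ParsedDecimal ParseDecimal(const std::string& repr) {
  ParsedDecimal result{0, 0};
  bool seen_point = false;
  for (char c : repr) {
    if (c == '.') {
      assert(!seen_point);
      seen_point = true;
      continue;
    }
    assert(c >= '0' && c <= '9');
    result.unscaled = result.unscaled * 10 + (c - '0');
    if (seen_point) ++result.scale;
  }
  return result;
}

// Renders a non-negative unscaled integer as if it had `scale`
// fractional digits, e.g. (1234, 2) -> "12.34".
std::string FormatWithScale(int64_t unscaled, int32_t scale) {
  assert(unscaled >= 0 && scale >= 0);
  std::string digits = std::to_string(unscaled);
  if (scale == 0) return digits;
  const std::size_t frac = static_cast<std::size_t>(scale);
  // Left-pad with zeros so at least one digit remains before the point.
  if (digits.size() <= frac) {
    digits = std::string(frac + 1 - digits.size(), '0') + digits;
  }
  digits.insert(digits.size() - frac, ".");
  return digits;
}
```

Formatting {{ParseDecimal("1.234").unscaled}} with scale 2 reproduces the 12.34 reported above, and formatting {{ParseDecimal("30.04").unscaled}} with scale 3 reproduces 3.004.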
The decimal type in the explicit_schema is set here: https://github.com/stiga-huang/arrow-helloworld/blob/d26786270e87d9ab847658ead96a96190461b98f/json_decimal_example.cc#L38
The cause is that {{arrow::json::DecimalConverter}} doesn't rescale the parsed value to the scale of {{out_type_}}:
{code:cpp}
  Status Convert(const std::shared_ptr<Array>& in, std::shared_ptr<Array>* out) override {
    if (in->type_id() == Type::NA) {
      return MakeArrayOfNull(out_type_, in->length(), pool_).Value(out);
    }
    const auto& dict_array = GetDictionaryArray(in);

    using Builder = typename TypeTraits<T>::BuilderType;
    Builder builder(out_type_, pool_);
    RETURN_NOT_OK(builder.Resize(dict_array.indices()->length()));

    auto visit_valid = [&builder](string_view repr) {
      ARROW_ASSIGN_OR_RAISE(value_type value,
                            TypeTraits<T>::BuilderType::ValueType::FromString(repr));
      // Should rescale the value based on out_type_ here
      builder.UnsafeAppend(value);
      return Status::OK();
    };

    auto visit_null = [&builder]() {
      builder.UnsafeAppendNull();
      return Status::OK();
    };

    RETURN_NOT_OK(VisitDictionaryEntries(dict_array, visit_valid, visit_null));
    return builder.Finish(out);
  }
{code}
https://github.com/apache/arrow/blob/cdd0fdf39033b9cf132a5cfc9caa5ed60713845a/cpp/src/arrow/json/converter.cc#L171-L173
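A rough sketch of the missing rescale step, written without Arrow so that it stands alone (C++17 for {{std::optional}}). {{RescaleUnscaled}} is a hypothetical helper, not an Arrow function. In the converter itself, the same idea would presumably use the {{Decimal128::FromString}} overload that also reports the parsed scale, followed by {{Decimal128::Rescale}}. Returning an error when fractional digits would be dropped is an assumption of this sketch; a real fix may prefer a different policy.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper: converts an unscaled integer from `original_scale`
// fractional digits to `new_scale` fractional digits.
// Returns std::nullopt if non-zero fractional digits would be lost.
// Illustration only: overflow while scaling up is not checked.
std::optional<int64_t> RescaleUnscaled(int64_t unscaled, int32_t original_scale,
                                       int32_t new_scale) {
  assert(original_scale >= 0 && new_scale >= 0);
  int64_t result = unscaled;
  // Scaling up: append one trailing zero per extra fractional digit.
  for (int32_t s = original_scale; s < new_scale; ++s) {
    result *= 10;
  }
  // Scaling down: drop trailing digits, but only if they are zero.
  for (int32_t s = original_scale; s > new_scale; --s) {
    if (result % 10 != 0) return std::nullopt;
    result /= 10;
  }
  return result;
}
```

With this step, "30.04" (3004 at scale 2) read as decimal128(9, 3) becomes 30040, which is 30.040. "1.234" (1234 at scale 3) read as decimal128(9, 2) is rejected in this sketch, because the trailing 4 cannot be represented at scale 2.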