Posted to issues-all@impala.apache.org by "Gabor Kaszab (Jira)" <ji...@apache.org> on 2021/08/31 06:43:00 UTC
[jira] [Created] (IMPALA-10901) Clean up Datasketches serialization and deserialization
Gabor Kaszab created IMPALA-10901:
-------------------------------------
Summary: Clean up Datasketches serialization and deserialization
Key: IMPALA-10901
URL: https://issues.apache.org/jira/browse/IMPALA-10901
Project: IMPALA
Issue Type: Improvement
Components: Backend
Affects Versions: Impala 4.0.0
Reporter: Gabor Kaszab
(copy-paste from a mail thread)
Regarding serialization to bytes as opposed to a stream: this has nothing to do with the BINARY data type in Impala.
Currently the Impala code does something like this (simplified):
std::stringstream tmp;
sketch.serialize(tmp);
std::string str = tmp.str(); // in StringStreamToStringVal
StringVal result(context, str.size());
memcpy(result.ptr, str.c_str(), str.size());
You could do it faster by serializing straight to bytes, which skips the stringstream and the intermediate std::string copy:
auto bytes = sketch.serialize();
StringVal result(context, bytes.size());
memcpy(result.ptr, bytes.data(), bytes.size());
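
To make the difference concrete, here is a minimal self-contained sketch of both paths. MockSketch, ViaStream and ViaBytes are hypothetical stand-ins, not Impala or DataSketches code, and a plain std::vector<uint8_t> plays the role of the StringVal buffer. The real library's method names and return types differ per sketch type, so treat this only as an illustration of where the copies happen.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a sketch; only the two serialize() shapes matter.
struct MockSketch {
  std::vector<uint8_t> payload;

  // Stream-based serialization: writes the bytes into an output stream.
  void serialize(std::ostream& os) const {
    os.write(reinterpret_cast<const char*>(payload.data()),
             static_cast<std::streamsize>(payload.size()));
  }

  // Byte-based serialization: hands back the bytes directly.
  std::vector<uint8_t> serialize() const { return payload; }
};

// Pattern from the current code: stream -> std::string -> result buffer.
std::vector<uint8_t> ViaStream(const MockSketch& sketch) {
  std::stringstream tmp;
  sketch.serialize(tmp);
  std::string str = tmp.str();  // extra copy out of the stream buffer
  std::vector<uint8_t> result(str.size());
  if (!str.empty()) std::memcpy(result.data(), str.data(), str.size());
  return result;
}

// Suggested pattern: bytes -> result buffer, with no stream in between.
std::vector<uint8_t> ViaBytes(const MockSketch& sketch) {
  auto bytes = sketch.serialize();
  std::vector<uint8_t> result(bytes.size());
  if (!bytes.empty()) std::memcpy(result.data(), bytes.data(), bytes.size());
  return result;
}
```

Both functions yield the same bytes; the second one simply does less work, since it never builds the stream or the temporary string.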
Regarding the unnecessary constructor call during deserialization: I see code like this (HLL is just an example; the pattern is the same elsewhere):
datasketches::hll_sketch src_sketch(DS_SKETCH_CONFIG, DS_HLL_TYPE); // construct an empty sketch, which is not needed
DeserializeDsSketch(src, &src_sketch); // pass it into a function, which will replace it by an assignment (hopefully a move, not copy)
// in the function
*sketch = T::deserialize((void*)serialized_sketch.ptr, serialized_sketch.len);
This can be done directly, avoiding the unnecessary constructor call:
datasketches::hll_sketch src_sketch = datasketches::hll_sketch::deserialize((void*)serialized_sketch.ptr, serialized_sketch.len);
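
The effect can be checked with a small stand-in type that counts how often its configuring constructor runs. CountingSketch, DeserializeInto, ConstructThenAssign and DirectInit are hypothetical names invented for this sketch; they only mirror the shape of the code above and say nothing about the actual cost of constructing a real hll_sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a sketch type with a static deserialize().
class CountingSketch {
 public:
  // Counts calls to the configuring constructor (the "empty sketch" one).
  static int& ConfigConstructions() {
    static int count = 0;
    return count;
  }

  explicit CountingSketch(int lg_k) : lg_k_(lg_k), value_(0) {
    ++ConfigConstructions();
  }

  // Builds a sketch straight from bytes without the configuring constructor.
  static CountingSketch deserialize(const void* bytes, size_t len) {
    assert(len == sizeof(uint32_t));
    uint32_t v = 0;
    std::memcpy(&v, bytes, sizeof(v));
    return CountingSketch(FromBytes{}, v);
  }

  uint32_t value() const { return value_; }

 private:
  struct FromBytes {};
  CountingSketch(FromBytes, uint32_t v) : lg_k_(0), value_(v) {}
  int lg_k_;
  uint32_t value_;
};

// Mirrors the helper pattern: assign over an already constructed object.
template <typename T>
void DeserializeInto(const void* ptr, size_t len, T* sketch) {
  *sketch = T::deserialize(ptr, len);
}

// Current pattern: construct a throwaway sketch, then overwrite it.
uint32_t ConstructThenAssign(const void* ptr, size_t len) {
  CountingSketch sketch(12);  // discarded as soon as the assignment runs
  DeserializeInto(ptr, len, &sketch);
  return sketch.value();
}

// Suggested pattern: initialize directly from deserialize().
uint32_t DirectInit(const void* ptr, size_t len) {
  CountingSketch sketch = CountingSketch::deserialize(ptr, len);
  return sketch.value();
}
```

With this stand-in, ConstructThenAssign runs the configuring constructor once per call, while DirectInit never runs it; both return the same deserialized value.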