Posted to dev@iotdb.apache.org by Xiangdong Huang <sa...@gmail.com> on 2019/03/03 16:20:19 UTC

[PR summary] #83 Enable float precision control

Hi,

As discussed in the other mailing thread, I will post summaries of my new
PRs to this mailing list.

This email summarizes PR #83 from GitHub.

PR #83 and PR #81 fix issue IoTDB-31 on JIRA. Yi Xu has reviewed the code
and approved it.

The program logic is now:
* When writing data into memory, we do not control the precision, so that
we get the best ingestion performance.
* When querying data from disk, no additional operation is needed, because
the data is reduced to the required precision when it is flushed to disk.
* When querying data in memory that has not yet been flushed to disk, we
process that data at query time to control its precision.

I use the `Math.round()` method to implement the precision control, which
adds less than 1/10 to the time cost. (For the performance test code, see
the comments in JIRA IoTDB-31.)
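
The write and in-memory query paths described above can be sketched roughly
as follows. This is only an illustration under stated assumptions, not the
code in the PR: `PrecisionSketch`, `MemBuffer` and `roundTo` are hypothetical
names, and scaling by a power of ten before calling `Math.round(double)` is
one common way to keep a fixed number of decimal places.

```java
import java.util.ArrayList;
import java.util.List;

class PrecisionSketch {
    // Hypothetical helper: keep `digits` decimal places via Math.round(double).
    static double roundTo(double value, int digits) {
        double scale = Math.pow(10, digits);
        return Math.round(value * scale) / scale;
    }

    // Hypothetical stand-in for an in-memory buffer (not IoTDB's memtable class).
    static class MemBuffer {
        private final List<Double> values = new ArrayList<>();
        private final int digits;

        MemBuffer(int digits) {
            this.digits = digits;
        }

        // Write path: store the raw value, no rounding, to keep ingestion cheap.
        void write(double v) {
            values.add(v);
        }

        // Query path for unflushed data: round each value as it is read.
        List<Double> query() {
            List<Double> out = new ArrayList<>(values.size());
            for (double v : values) {
                out.add(roundTo(v, digits));
            }
            return out;
        }
    }

    public static void main(String[] args) {
        MemBuffer buf = new MemBuffer(2);
        buf.write(3.14159);
        System.out.println(buf.query()); // prints [3.14]
    }
}
```

In this sketch the stored value stays untouched and only the query result is
rounded, which mirrors the idea that ingestion pays no extra cost.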

Because of the limitations of Java's `Math.round(float)` and
`Math.round(double)`, the value range of float/double is constrained:

* For float values, the data range is (-Integer.MAX_VALUE,
Integer.MAX_VALUE) rather than Float.MAX_VALUE, and the max_point_number
is 19.
* For double values, the data range is (-Long.MAX_VALUE, Long.MAX_VALUE)
rather than Double.MAX_VALUE, and the max_point_number is 19.
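
The reason for the range limit can be seen in a small standalone example:
`Math.round(float)` returns an `int` and `Math.round(double)` returns a
`long`, so inputs beyond those ranges saturate at the integer type's bound.
The class and method names here are hypothetical and only wrap the standard
library calls.

```java
class RoundLimitSketch {
    // Math.round(float) returns int, so large inputs saturate.
    static int roundFloat(float v) {
        return Math.round(v);
    }

    // Math.round(double) returns long, so large inputs saturate.
    static long roundDouble(double v) {
        return Math.round(v);
    }

    public static void main(String[] args) {
        // 3.0e9f is larger than Integer.MAX_VALUE (2147483647).
        System.out.println(roundFloat(3.0e9f) == Integer.MAX_VALUE); // true

        // 1.0e19 is larger than Long.MAX_VALUE (about 9.22e18).
        System.out.println(roundDouble(1.0e19) == Long.MAX_VALUE); // true
    }
}
```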

As for performance, I think the loss is less than 10% if the queried data
is in the memtable, and zero if there is no matching data in the memtable.
JIRA [IoTDB-31] has a code snippet for the performance test.

In this PR, I change the declared type of many parameters from
`TimeValueSortor` to `ReadOnlyMemChunk`, because the owners of these
parameters never receive a `TimeValueSortor` instance that is not a
`ReadOnlyMemChunk`.
I think using the parent interface is not a good fit at this point, because
you lose many features that the child class has.
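
To illustrate the trade-off, here is a minimal sketch with hypothetical
stand-in types (`Sorter`, `MemChunk` and `NarrowingSketch` are invented for
this example; the real `TimeValueSortor` and `ReadOnlyMemChunk` have
different members). It shows that a parameter declared with the parent
interface needs a cast to reach a method that only the child class has.

```java
import java.util.Arrays;

// Hypothetical parent interface, standing in for the idea of TimeValueSortor.
interface Sorter {
    long[] sortedTimestamps();
}

// Hypothetical child class, standing in for the idea of ReadOnlyMemChunk.
class MemChunk implements Sorter {
    private final long[] times;

    MemChunk(long[] times) {
        this.times = times.clone();
    }

    @Override
    public long[] sortedTimestamps() {
        long[] copy = times.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Available only on the child class, not on the interface.
    boolean isEmpty() {
        return times.length == 0;
    }
}

class NarrowingSketch {
    // Parameter declared as the interface: a type check and cast are needed.
    static boolean emptyViaInterface(Sorter s) {
        return s instanceof MemChunk && ((MemChunk) s).isEmpty();
    }

    // Parameter declared as the child class: the method is directly usable.
    static boolean emptyViaConcrete(MemChunk c) {
        return c.isEmpty();
    }
}
```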

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院

Re: [PR summary] #83 Enable float precision control

Posted by Julian Feinauer <j....@pragmaticminds.de>.
Hi Xiangdong,

I cannot comment much on the content as I'm still lacking detailed knowledge.

But a big +1 for the excellent write-up on the list!

Julian

Sent from my mobile phone

