Posted to commits@iotdb.apache.org by ha...@apache.org on 2023/02/19 10:23:29 UTC

[iotdb] branch master updated: Improve the document of FREQ encoding (#9095)

This is an automated email from the ASF dual-hosted git repository.

haonan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/iotdb.git


The following commit(s) were added to refs/heads/master by this push:
     new 0cd44616f8 Improve the document of FREQ encoding (#9095)
0cd44616f8 is described below

commit 0cd44616f80442decc593a6fcfcb6a7fe07c5c18
Author: Haoyu Wang <37...@users.noreply.github.com>
AuthorDate: Sun Feb 19 18:23:21 2023 +0800

    Improve the document of FREQ encoding (#9095)
---
 docs/UserGuide/Data-Concept/Encoding.md    | 4 ++--
 docs/zh/UserGuide/Data-Concept/Encoding.md | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/UserGuide/Data-Concept/Encoding.md b/docs/UserGuide/Data-Concept/Encoding.md
index 505325d094..b57d96736a 100644
--- a/docs/UserGuide/Data-Concept/Encoding.md
+++ b/docs/UserGuide/Data-Concept/Encoding.md
@@ -56,9 +56,9 @@ DICTIONARY encoding is lossless. It is suitable for TEXT data with low cardinali
 
 * FREQ
 
-FREQ encoding is lossy. It transforms the time sequence to the frequency domain and only reserve part of the frequency components with high energy. It is more suitable for sequence with obvious periodicity.
+FREQ encoding is lossy. Based on the idea of transform coding, it transforms the time series into the frequency domain and retains only a subset of the frequency components, those with high energy. This greatly improves space efficiency at the cost of a small loss of accuracy. It is suitable for data whose energy is concentrated in a few frequency components (especially data with obvious periodicity) and unsuitable for data whose energy is uniformly distributed (such as white noise).
 
-> There are two parameters of FREQ encoding in the configuration file: `freq_snr` defines the signal-noise-ratio (SNR). Both the compression ratio and accuracy loss decrease when it increases. `freq_block_size` defines the data size in a time-frequency transformation. It is not recommended to modify the default value. The detailed experimental results and analysis of the influences of parameters are in the design document. 
+> FREQ encoding has two parameters in the configuration file. `freq_snr` defines the signal-to-noise ratio (SNR), which is related to the normalized root-mean-square error (NRMSE) by $NRMSE = 10^{-SNR/20}$. Both the compression ratio and the accuracy loss decrease when `freq_snr` increases. `freq_block_size` defines the data size in a time-frequency transformation. It is not recommended to modify the default value. The detailed experimental results and analysis of the influences of parameters are in  [...]
 
 * ZIGZAG 
   
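
Note on the FREQ description in the hunk above: the idea it states (transform to the frequency domain, keep only high-energy components, and tie the loss to `freq_snr` through NRMSE = 10^{-SNR/20}) can be sketched as follows. This is a minimal illustration and not IoTDB's implementation. A naive DFT stands in for whatever transform IoTDB actually uses, NRMSE is assumed to mean the RMS error divided by the RMS of the original signal, and every function name here is hypothetical.

```python
import cmath
import math


def nrmse_from_snr(snr_db):
    """NRMSE implied by an SNR in dB, per NRMSE = 10^(-SNR/20)."""
    return 10 ** (-snr_db / 20)


def dft(x):
    """Naive O(n^2) discrete Fourier transform (stand-in transform)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(coeffs):
    """Inverse of dft(); returns only the real part of each sample."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def freq_encode_sketch(x, snr_db):
    """Keep the highest-energy components until the discarded energy
    is small enough to satisfy the SNR target (hypothetical helper)."""
    coeffs = dft(x)
    energy = [abs(c) ** 2 for c in coeffs]
    total = sum(energy)
    # SNR in dB is 10*log10(signal energy / noise energy).
    allowed_noise = total * 10 ** (-snr_db / 10)
    order = sorted(range(len(coeffs)), key=lambda k: energy[k], reverse=True)
    kept = {}
    remaining = total
    for k in order:
        if remaining <= allowed_noise:
            break
        kept[k] = coeffs[k]
        remaining -= energy[k]
    return kept


def freq_decode_sketch(kept, n):
    """Rebuild the series from the kept components; dropped ones are zero."""
    return idft([kept.get(k, 0j) for k in range(n)])


def nrmse(x, y):
    """RMS error normalized by the RMS of x (assumed NRMSE definition)."""
    err = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    ref = math.sqrt(sum(a ** 2 for a in x))
    return err / ref


# A strongly periodic series: its energy sits in a few frequency bins.
n = 64
signal = [
    math.sin(2 * math.pi * 4 * t / n) + 0.5 * math.sin(2 * math.pi * 9 * t / n)
    for t in range(n)
]
kept = freq_encode_sketch(signal, snr_db=40)
restored = freq_decode_sketch(kept, n)
error = nrmse(signal, restored)
```

For an SNR of 40 the formula gives an NRMSE of 0.01, and `error` stays within that bound because the discarded energy is capped before decoding. A series like white noise, whose energy is spread across all bins, would force this sketch to keep most components, which is the case the documentation calls unsuitable.
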
diff --git a/docs/zh/UserGuide/Data-Concept/Encoding.md b/docs/zh/UserGuide/Data-Concept/Encoding.md
index 2d1d438340..a6194a5bd2 100644
--- a/docs/zh/UserGuide/Data-Concept/Encoding.md
+++ b/docs/zh/UserGuide/Data-Concept/Encoding.md
@@ -55,9 +55,9 @@ GORILLA encoding is a lossless encoding; it is better suited to encoding adjacent values that are close
 
 * Frequency-domain encoding (FREQ)
 
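-Frequency-domain encoding is lossy. It transforms time series data into the frequency domain and retains only a subset of high-energy frequency components. It is suitable for data with obvious periodicity.
+Frequency-domain encoding is lossy. Based on the idea of transform coding, it transforms time series data into the frequency domain and retains only a subset of high-energy frequency components, greatly improving space efficiency at the cost of a small loss of accuracy. It is suitable for data whose frequency-domain energy is fairly concentrated (especially data with obvious periodicity) and unsuitable for data whose energy is uniformly distributed (such as white noise).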
 
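-> Frequency-domain encoding has two parameters in the configuration file: `freq_snr` specifies the signal-to-noise ratio of the encoding, and increasing it lowers both the compression ratio and the accuracy loss; `freq_block_size` specifies the block size used for the time-frequency transformation, and modifying the default value is not recommended. Experimental results and analysis of the parameters' influence are in the design document.
+> Frequency-domain encoding has two parameters in the configuration file: `freq_snr` specifies the signal-to-noise ratio of the encoding (related to the normalized root-mean-square error by $NRMSE=10^{-SNR/20}$), and increasing it lowers both the compression ratio and the accuracy loss, so set it according to the needs of your application; `freq_block_size` specifies the block size used for the time-frequency transformation, and modifying the default value is not recommended. Experimental results and analysis of the parameters' influence are in the design document.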
 
 * ZIGZAG encoding
 
@@ -87,4 +87,4 @@ CHIMP is a lossless encoding. It is a new streaming floating-point data compression algorithm
 ```
 IoTDB> create timeseries root.ln.wf02.wt02.status WITH DATATYPE=BOOLEAN, ENCODING=TS_2DIFF
 Msg: 507: encoding TS_2DIFF does not support BOOLEAN
-```
\ No newline at end of file
+```