You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@iotdb.apache.org by ha...@apache.org on 2021/09/02 07:09:42 UTC

[iotdb] branch master updated: [IOTDB-1601]Add Like and REGEXP statement user guide in DML (#3889)

This is an automated email from the ASF dual-hosted git repository.

haonan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/iotdb.git


The following commit(s) were added to refs/heads/master by this push:
     new b92a16a  [IOTDB-1601]Add Like and REGEXP statement user guide in DML (#3889)
b92a16a is described below

commit b92a16af1c6e08135744c28cb940b6ad2e60d595
Author: Hang Ji <55...@users.noreply.github.com>
AuthorDate: Thu Sep 2 15:09:09 2021 +0800

    [IOTDB-1601]Add Like and REGEXP statement user guide in DML (#3889)
    
    Co-authored-by: Haonan <hh...@outlook.com>
---
 .../DML-Data-Manipulation-Language.md              | 74 +++++++++++++++++++++
 .../DML-Data-Manipulation-Language.md              | 77 ++++++++++++++++++++++
 2 files changed, 151 insertions(+)

diff --git a/docs/UserGuide/IoTDB-SQL-Language/DML-Data-Manipulation-Language.md b/docs/UserGuide/IoTDB-SQL-Language/DML-Data-Manipulation-Language.md
index 5028426..19be865 100644
--- a/docs/UserGuide/IoTDB-SQL-Language/DML-Data-Manipulation-Language.md
+++ b/docs/UserGuide/IoTDB-SQL-Language/DML-Data-Manipulation-Language.md
@@ -924,7 +924,81 @@ Total line number = 2
 It costs 0.002s
 ```
 
+### Fuzzy query
 
+Fuzzy query is divided into Like statement and Regexp statement, both of which can support fuzzy matching of TEXT type data.
+
+Like statement:
+
+Example 1: Query data containing `'cc'` in `value` under `root.sg.device`. 
+The percentage (`%`) wildcard matches any string of zero or more characters.
+
+
+```
+IoTDB> select * from root.sg.device where value like '%cc%'
++-----------------------------+--------------------+
+|                         Time|root.sg.device.value|
++-----------------------------+--------------------+
+|2017-11-07T23:59:00.000+08:00|            aabbccdd| 
+|2017-11-07T23:59:00.000+08:00|                  cc|
++-----------------------------+--------------------+
+Total line number = 2
+It costs 0.002s
+```
+
+Example 2: Query data that consists of 3 characters and the second character is `'b'` in `value` under `root.sg.device`.
+The underscore (`_`) wildcard matches any single character.
+```
+IoTDB> select * from root.sg.device where value like '_b_'
++-----------------------------+--------------------+
+|                         Time|root.sg.device.value|
++-----------------------------+--------------------+
+|2017-11-07T23:59:00.000+08:00|                 abc| 
++-----------------------------+--------------------+
+Total line number = 1
+It costs 0.002s
+```
+
+Regexp statement:
+
+The filter conditions that need to be passed in are regular expressions in the Java standard library style
+
+Example 1: Query a string composed of 26 English characters for the value under root.sg.device
+```
+IoTDB> select * from root.sg.device where value regexp '^[A-Za-z]+$'
++-----------------------------+--------------------+
+|                         Time|root.sg.device.value|
++-----------------------------+--------------------+
+|2017-11-07T23:59:00.000+08:00|            aabbccdd| 
+|2017-11-07T23:59:00.000+08:00|                  cc|
++-----------------------------+--------------------+
+Total line number = 2
+It costs 0.002s
+```
+
+Example 2: Query root.sg.device where the value value is a string composed of 26 lowercase English characters and the time is greater than 100
+```
+IoTDB> select * from root.sg.device where value regexp '^[a-z]+$' and time > 100
++-----------------------------+--------------------+
+|                         Time|root.sg.device.value|
++-----------------------------+--------------------+
+|2017-11-07T23:59:00.000+08:00|            aabbccdd| 
+|2017-11-07T23:59:00.000+08:00|                  cc|
++-----------------------------+--------------------+
+Total line number = 2
+It costs 0.002s
+```
+
+Examples of common regular matching:
+
+```
+All characters with a length of 3-20: ^.{3,20}$
+Uppercase english characters: ^[A-Z]+$
+Numbers and English characters: ^[A-Za-z0-9]+$
+Beginning with a: ^a.*
+```
+
+For more syntax description, please read [SQL Reference](../Appendix/SQL-Reference.md).
 ### Automated Fill
 
 In the actual use of IoTDB, when doing the query operation of timeseries, situations where the value is null at some time points may appear, which will obstruct the further analysis by users. In order to better reflect the degree of data change, users expect missing values to be automatically filled. Therefore, the IoTDB system introduces the function of Automated Fill.
diff --git a/docs/zh/UserGuide/IoTDB-SQL-Language/DML-Data-Manipulation-Language.md b/docs/zh/UserGuide/IoTDB-SQL-Language/DML-Data-Manipulation-Language.md
index db5e5bb..8ae15ff 100644
--- a/docs/zh/UserGuide/IoTDB-SQL-Language/DML-Data-Manipulation-Language.md
+++ b/docs/zh/UserGuide/IoTDB-SQL-Language/DML-Data-Manipulation-Language.md
@@ -696,6 +696,83 @@ Total line number = 2
 It costs 0.002s
 ```
 
+### 模糊查询
+
+模糊查询分为Like语句和Regexp语句,都可以支持对Text类型的数据进行模糊匹配
+
+Like语句:
+
+示例 1:查询 `root.sg.device` 下 `value` 含有`'cc'`的数据。
+`%` 表示任意0个或多个字符。
+
+```
+IoTDB> select * from root.sg.device where value like '%cc%'
++-----------------------------+--------------------+
+|                         Time|root.sg.device.value|
++-----------------------------+--------------------+
+|2017-11-07T23:59:00.000+08:00|            aabbccdd| 
+|2017-11-07T23:59:00.000+08:00|                  cc|
++-----------------------------+--------------------+
+Total line number = 2
+It costs 0.002s
+```
+
+示例 2:查询 `root.sg.device` 下 `value` 中间为 `'b'`、前后为任意单个字符的数据。
+`_` 表示任意单个字符。
+
+```
+IoTDB> select * from root.sg.device where value like '_b_'
++-----------------------------+--------------------+
+|                         Time|root.sg.device.value|
++-----------------------------+--------------------+
+|2017-11-07T23:59:00.000+08:00|                 abc| 
++-----------------------------+--------------------+
+Total line number = 1
+It costs 0.002s
+```
+
+Regexp语句:
+
+需要传入的过滤条件为 Java 标准库风格的正则表达式
+
+示例 1:查询 root.sg.device 下 value 值为26个英文字符组成的字符串
+
+```
+IoTDB> select * from root.sg.device where value regexp '^[A-Za-z]+$'
++-----------------------------+--------------------+
+|                         Time|root.sg.device.value|
++-----------------------------+--------------------+
+|2017-11-07T23:59:00.000+08:00|            aabbccdd| 
+|2017-11-07T23:59:00.000+08:00|                  cc|
++-----------------------------+--------------------+
+Total line number = 2
+It costs 0.002s
+```
+
+示例 2:查询 root.sg.device 下 value 值为26个小写英文字符组成的字符串 且时间大于100的
+
+```
+IoTDB> select * from root.sg.device where value regexp '^[a-z]+$' and time > 100
++-----------------------------+--------------------+
+|                         Time|root.sg.device.value|
++-----------------------------+--------------------+
+|2017-11-07T23:59:00.000+08:00|            aabbccdd| 
+|2017-11-07T23:59:00.000+08:00|                  cc|
++-----------------------------+--------------------+
+Total line number = 2
+It costs 0.002s
+```
+
+常见的正则匹配举例:
+
+```
+长度为3-20的所有字符:^.{3,20}$
+大写英文字符:^[A-Z]+$
+数字和英文字符:^[A-Za-z0-9]+$
+以a开头的:^a.*
+```
+
+更多语法请参照 [SQL REFERENCE](../Appendix/SQL-Reference.md).
 ### 空值填充
 
 在 IoTDB 的实际使用中,当进行时间序列的查询操作时,可能会出现在某些时间点值为 null 的情况,这会妨碍用户进行进一步的分析。 为了更好地反映数据更改的程度,用户希望可以自动填充缺失值。 因此,IoTDB 系统引入了自动填充功能。