Posted to issues@calcite.apache.org by "Forward Xu (Jira)" <ji...@apache.org> on 2019/11/20 15:04:00 UTC

[jira] [Commented] (CALCITE-3510) Implement Redis adapter

    [ https://issues.apache.org/jira/browse/CALCITE-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978496#comment-16978496 ] 

Forward Xu commented on CALCITE-3510:
-------------------------------------

Hi [~julianhyde], thank you very much. For XML, Avro, Arrow, ORC, Parquet, and Protobuf, could we abstract a format-parsing class to handle content processing? Redis is not a plain key-value store. It is a data structure server that supports different types of values, which means Redis values can hold more complex data structures than the string-key/string-value pairs of a traditional key-value store:

1. Strings: Binary-safe strings.
2. Lists: Collections of string elements sorted by insertion order. They are basically linked lists.
3. Sets: Collections of unique, unordered string elements.
4. Sorted sets: Similar to Sets, but every string element is associated with a floating-point number called a score. The elements are always sorted by score, so unlike Sets you can retrieve a range of elements (for example, the top ten or the bottom ten).
5. Hashes: Maps composed of fields associated with values. Both fields and values are strings. This is very similar to a Ruby or Python hash.
6. Bit arrays (or simply bitmaps): Using special commands, you can handle string values like an array of bits. For example, you can set and clear individual bits, count all the bits set to 1, find the first set or unset bit, and so on.
7. HyperLogLogs: A probabilistic data structure used to estimate the cardinality of a set.
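
The format-parsing abstraction asked about above could look roughly like the following. This is only a minimal sketch: FormatParserSketch, ValueParser, CsvParser, RawParser and forFormat are hypothetical names made up for illustration, not existing Calcite classes, and the assumption that RAW yields the value as a single cell is mine. Only the CSV and RAW formats are covered, because they need nothing beyond the JDK.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

/** Hypothetical sketch of a pluggable value parser; not part of Calcite. */
final class FormatParserSketch {

  /** Turns one Redis string value into the cells of a row. */
  interface ValueParser {
    List<String> parse(String value);
  }

  /** CSV: split the value on a literal delimiter and trim each cell. */
  static final class CsvParser implements ValueParser {
    private final String delimiter;

    CsvParser(String delimiter) {
      this.delimiter = delimiter;
    }

    @Override public List<String> parse(String value) {
      List<String> cells = new ArrayList<>();
      // Pattern.quote treats the delimiter literally; -1 keeps trailing empty cells.
      for (String cell : value.split(Pattern.quote(delimiter), -1)) {
        cells.add(cell.trim());
      }
      return cells;
    }
  }

  /** RAW (assumption): keep the value intact as a single cell. */
  static final class RawParser implements ValueParser {
    @Override public List<String> parse(String value) {
      return Collections.singletonList(value);
    }
  }

  /** Picks a parser by format name; other formats would plug in here. */
  static ValueParser forFormat(String dataFormat, String delimiter) {
    switch (dataFormat.toLowerCase(Locale.ROOT)) {
    case "csv":
      return new CsvParser(delimiter);
    case "raw":
      return new RawParser();
    default:
      throw new IllegalArgumentException("Unsupported format: " + dataFormat);
    }
  }

  public static void main(String[] args) {
    System.out.println(forFormat("csv", ",").parse("1, james, 10").size());
    System.out.println(forFormat("raw", ",").parse("1, james, 10").size());
  }
}
```

A parser for JSON, Avro or Protobuf would implement the same interface, so the table code would not need to know which format is in use.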

> Implement Redis adapter
> -----------------------
>
>                 Key: CALCITE-3510
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3510
>             Project: Calcite
>          Issue Type: New Feature
>            Reporter: Forward Xu
>            Assignee: Forward Xu
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The Redis adapter allows querying of live data stored in Redis. Each Redis key/value pair is presented as a single row in Calcite. Rows can be broken down into cells by using table definition files.
>  The Redis `string`, `hash`, `sets`, `zsets`, and `list` value types are supported.
> CSV format data
> {code:java}
> Set hello_world_1 1,james,10
> Set hello_world_2 2,bond,20
> Set hello_world_3 3,lily,30
> Set hello_world_4 4,lucy,20
> {code}
> JSON format data
> {code:java}
> Set hello_foo_1 '{"id":1,"name":"james","age":110}'
> Set hello_foo_2 '{"id": 2, "name": "bond", "age": 210}'
> Set hello_foo_3 '{"id": 3, "name": "lily", "age": 310}'
> Set hello_foo_4 '{"id": 3, "name": "lucy", "age": 210}'
> {code}
> RAW format data
> {code:java}
> Set hello_raw_1 "1, james, 10"
> Set hello_raw_2 2,bond,20
> Set hello_raw_3 3,lily,30
> Set hello_raw_4 "4, lucy, 20"
> {code}
> We inserted data in three formats: CSV, JSON, and RAW. These are the three formats currently supported, and they are demonstrated separately below.
> Then you can define the corresponding mapping table in the JSON file:
> {code:java}
> {
>   "version": "1.0",
>   "defaultSchema": "foodmart",
>   "schemas": [
>     {
>       "type": "custom",
>       "name": "foodmart",
>       "factory": "org.apache.calcite.adapter.redis.RedisSchemaFactory",
>       "operand": {
>         "host": "localhost",
>         "port": 6379,
>         "database": 0,
>         "password": ""
>       },
>       "tables": [
>         {
>           "name": "raw_01",
>           "factory": "org.apache.calcite.adapter.redis.RedisTableFactory",
>           "operand": {
>             "dataFormat": "csv",
>             "keyDelimiter": ":",
>             "fields": [
>               {
>                 "name": "id",
>                 "type": "varchar",
>                 "mapping": "id"
>               }
>             ]
>           }
>         }
>       ]
>     }
>   ]
> }
> {code}
> Here are a few details about the fields:
> keyDelimiter is used to split the value; the default is a colon. The split values are mapped to the field columns. It only applies to the CSV format.
> Format specifies the format of the data in Redis. Currently, CSV, JSON, and RAW are supported. The raw format keeps the original Redis key and value intact, and only one field key is used for the query. Its details are not described below.
> COLUMN_MAPPING maps the Redis columns to the underlying data. Since Redis has no concept of a column, the specific mapping method varies according to the format. For CSV, for example, parsing the data produces a string array, so the corresponding column_mapping maps each column to an index (subscript) of that array. Here, id is mapped to subscript 2, name to subscript 1, and so on.
> You can query the data in the Redis database:
> Mysql> select * from dla_person;
> ||name||id||age||
> |bond|20|2|
> |lily|30|3|
> |lucy|20|4|
> |james|10|1|
> 4 rows in set (0.18 sec)
>  Users who are familiar with SQL will appreciate this: you can use familiar SQL syntax to work with the Redis database.
> JSON
>  The above demonstrates the data in CSV format. Let's try the data in JSON format. Let's create a new table:
> {code:java}
> {
>   "version": "1.0",
>   "defaultSchema": "foodmart",
>   "schemas": [
>     {
>       "type": "custom",
>       "name": "foodmart",
>       "factory": "org.apache.calcite.adapter.redis.RedisSchemaFactory",
>       "operand": {
>         "host": "localhost",
>         "port": 6379,
>         "database": 0,
>         "password": ""
>       },
>       "tables": [
>         {
>           "name": "raw_01",
>           "factory": "org.apache.calcite.adapter.redis.RedisTableFactory",
>           "operand": {
>             "dataFormat": "json",
>             "fields": [
>               {
>                 "name": "id",
>                 "type": "varchar",
>                 "mapping": "id"
>               }
>             ]
>           }
>         }
>       ]
>     }
>   ]
> }
> {code}
> Note that we have specified COLUMN_MAPPING here. Because JSON data has field names, each column name of the Redis layer is mapped to the name of a field in the JSON data. Here, the id column is deliberately mapped to the age field in Redis for demonstration purposes. Let's look at the results:
> Mysql> select * from dla_person_json;
> ||name||id||age||
> |lucy|210|3|
> |james|110|1|
> |bond|210|2|
> |lily|310|3|
> 4 rows in set (0.12 sec)
>  As expected, the id column shows the value of the corresponding age field in Redis.
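
The two mapping styles in the quoted description can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the adapter's code: ColumnMappingSketch, mapCsv and mapJson are hypothetical names, the JSON value is assumed to be already parsed into a Map, and the sample data is split on commas. The index for age (0) and the age-to-id field mapping are inferred from the result tables rather than stated in the text.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical sketch of the column mappings described above; not Calcite code. */
final class ColumnMappingSketch {

  /** CSV: each column name maps to an index into the split value. */
  static Map<String, String> mapCsv(String value, String delimiter,
      Map<String, Integer> columnToIndex) {
    String[] cells = value.split(Pattern.quote(delimiter), -1);
    Map<String, String> row = new LinkedHashMap<>();
    for (Map.Entry<String, Integer> e : columnToIndex.entrySet()) {
      row.put(e.getKey(), cells[e.getValue()].trim());
    }
    return row;
  }

  /** JSON: each column name maps to a field name of the parsed object. */
  static Map<String, Object> mapJson(Map<String, Object> parsed,
      Map<String, String> columnToField) {
    Map<String, Object> row = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : columnToField.entrySet()) {
      row.put(e.getKey(), parsed.get(e.getValue()));
    }
    return row;
  }

  public static void main(String[] args) {
    // CSV: name -> subscript 1, id -> subscript 2, age -> subscript 0 (inferred).
    Map<String, Integer> csvMapping = new LinkedHashMap<>();
    csvMapping.put("name", 1);
    csvMapping.put("id", 2);
    csvMapping.put("age", 0);
    System.out.println(mapCsv("2,bond,20", ",", csvMapping));

    // JSON: the id column is deliberately mapped to the age field.
    Map<String, Object> parsed = new LinkedHashMap<>();
    parsed.put("id", 2);
    parsed.put("name", "bond");
    parsed.put("age", 210);
    Map<String, String> jsonMapping = new LinkedHashMap<>();
    jsonMapping.put("name", "name");
    jsonMapping.put("id", "age");
    jsonMapping.put("age", "id");
    System.out.println(mapJson(parsed, jsonMapping));
  }
}
```

With these mappings the CSV value 2,bond,20 becomes the row bond, 20, 2 in name, id, age order, and the JSON object for bond becomes bond, 210, 2, matching the query results shown in the description.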


