Posted to dev@iotdb.apache.org by Jialin Qiao <qi...@apache.org> on 2020/02/07 06:29:21 UTC

[DISCUSS] Table schema of group by device

Hi,

In IOTDB-243 [1], we want to allow creating measurements with the same name
but with different types in the same storage group.

For example,
root.sg1.d1.s1, int32
root.sg1.d1.s2, int32
root.sg1.d2.s1, boolean
root.sg1.d2.s2, int32

This may cause trouble in group by device queries. How should we organize
the result (table schema)? I can think of three ways:

(1) Time, Device, s1_int32, s1_boolean, s2_int32

* advantage:
- No ambiguity
- The number of columns is acceptable.

* disadvantage:
- In most cases, the datatype indicator is redundant and weird.
- Difficult to use parallelization among devices in the query.
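
As a rough illustration of option (1), the sketch below derives the column header from the example timeseries listed above. The function name `typed_columns` and the `<measurement>_<type>` naming rule are assumptions made for illustration; this is not IoTDB code.

```python
# Hypothetical sketch of option (1): one column per (measurement, type) pair.
# The paths and types come from the example above; nothing here is IoTDB API.
SCHEMA = {
    "root.sg1.d1.s1": "int32",
    "root.sg1.d1.s2": "int32",
    "root.sg1.d2.s1": "boolean",
    "root.sg1.d2.s2": "int32",
}


def typed_columns(schema):
    """Return the header row: Time, Device, then <measurement>_<type> columns."""
    types_by_measurement = {}
    for path, dtype in schema.items():
        measurement = path.rsplit(".", 1)[1]  # last path segment, e.g. "s1"
        types = types_by_measurement.setdefault(measurement, [])
        if dtype not in types:
            types.append(dtype)

    columns = ["Time", "Device"]
    for measurement, types in types_by_measurement.items():
        columns.extend(f"{measurement}_{dtype}" for dtype in types)
    return columns


print(typed_columns(SCHEMA))
```

The type suffix appears on every column, including s2, which has only one type across devices; that is the redundancy listed as a disadvantage.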

(2) Time, d1, s1, s2 Time, d2, s1, s2

* advantage:
- No ambiguity
- This could leverage the parallelization among devices in the query.

* disadvantage:
- The number of columns may be large.

(3) Time, DeviceId, s1, s2

This may require a lot of work in the QueryDataSet, and users need to read
each value carefully according to the measurement type of the device it
belongs to. Otherwise, it may cause a RuntimeException in the JDBC client.

* advantage:
- The number of columns is minimal.

* disadvantage:
- May cause ambiguity: a column of one table can have more than one type,
which also conflicts with the Spark connector and the Hive connector.
- Difficult to use parallelization in the query.
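
To make the ambiguity in option (3) concrete, here is a small sketch with made-up sample values (the timestamps and readings are purely illustrative). The helper `get_int` is hypothetical and only mimics a typed getter; it is not the IoTDB JDBC API.

```python
# Hypothetical rows in the option (3) layout. The values are invented for
# illustration: s1 is int32 on d1 but boolean on d2.
COLUMNS = ("Time", "DeviceId", "s1", "s2")
ROWS = [
    (1, "root.sg1.d1", 10, 20),
    (1, "root.sg1.d2", True, 30),
]


def get_int(row, column):
    """Typed getter: fail if the stored value is not an int."""
    value = row[COLUMNS.index(column)]
    # bool is a subclass of int in Python, so compare the exact type.
    if type(value) is not int:
        raise TypeError(f"{column} holds {type(value).__name__}, not int")
    return value


print(get_int(ROWS[0], "s1"))  # the d1 row reads fine
try:
    get_int(ROWS[1], "s1")     # the d2 row stores a boolean in the same column
except TypeError as err:
    print("error:", err)
```

The same column name is read successfully for one device and fails for the other, which is the kind of runtime failure described above.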

_______________

From my perspective, I prefer (1) ≈ (2) > (3).

What's your opinion?

[1] https://issues.apache.org/jira/browse/IOTDB-243

Thanks,
—————————————————
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

Re: [DISCUSS] Table schema of group by device

Posted by Xiangdong Huang <sa...@gmail.com>.

Hi Jialin,

Very glad that you can agree with that. :-D

-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院


Jialin Qiao <qi...@apache.org> wrote on Tue, Feb 11, 2020 at 5:50 PM:

> Hi,
>
> If we use text when a column has multiple types, I'm ok with (3).
>
> Thanks,
> —————————————————
> Jialin Qiao
> School of Software, Tsinghua University
>
> 乔嘉林
> 清华大学 软件学院

Re: [DISCUSS] Table schema of group by device

Posted by Jialin Qiao <qi...@apache.org>.

Hi,

If we use text when a column has multiple types, I'm ok with (3).

Thanks,
—————————————————
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院


魏祥威 (Xiangwei Wei) <52...@qq.com> wrote on Sun, Feb 9, 2020 at 5:30 PM:

> Hi,
>
>
> I agree with the opinion of Xiangdong Huang.
>
>
> (3) is the friendliest for users who are used to a relational DB, and if
> they want a relational query (group by device query), their applications
> should guarantee that the data types are consistent.
>
> Best,
> Xiangwei Wei

Re: [DISCUSS] Table schema of group by device

Posted by 魏祥威 <52...@qq.com>.

Hi,


I agree with the opinion of Xiangdong Huang.


(3) is the friendliest for users who are used to a relational DB, and if they want a relational query (group by device query), their applications should guarantee that the data types are consistent.

Best,
Xiangwei Wei



------------------ Original message ------------------
From: "Xiangdong Huang" <sainthxd@gmail.com>
Sent: Friday, Feb 7, 2020, 2:58 PM
To: "dev" <dev@iotdb.apache.org>
Subject: Re: [DISCUSS] Table schema of group by device

One more suggestion: "align by device" is clearer than "group by
device".


Re: [DISCUSS] Table schema of group by device

Posted by Xiangdong Huang <sa...@gmail.com>.

One more suggestion: "align by device" is clearer than "group by
device".

-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院


Xiangdong Huang <sa...@gmail.com> wrote on Fri, Feb 7, 2020 at 2:56 PM:

> -1 for (2), and I don't think I will ever vote +1 for it...
>
> If you do it like that, there is no chance of replacing the applications
> that use a relational DB to manage timeseries data.
>
> (3) is the friendliest for developers who are used to a relational DB,
> because when they write SQL like "select c1, c2, c3 FROM", they naturally
> expect the result set to have 3 columns...
>
> Of course, for users who are using an RDB and want a table like "Time,
> DeviceId, s1, s2", their applications can guarantee that the data type of
> s2 stays constant.
> If s2 holds several data types, RDB users can use a "text" or "varchar2"
> column directly.
>
> Considering that, I think the choice is: if all data in a column has the
> same data type, use that data type. Otherwise, use String.
>
> (1) Well, it can be an option. But my suggestion is: if all data in a
> column has the same data type, do not change its column name.
>
> Best,
> -----------------------------------
> Xiangdong Huang
> School of Software, Tsinghua University
>
>  黄向东
> 清华大学 软件学院

Re: [DISCUSS] Table schema of group by device

Posted by Xiangdong Huang <sa...@gmail.com>.

-1 for (2), and I don't think I will ever vote +1 for it...

If you do it like that, there is no chance of replacing the applications
that use a relational DB to manage timeseries data.

(3) is the friendliest for developers who are used to a relational DB,
because when they write SQL like "select c1, c2, c3 FROM", they naturally
expect the result set to have 3 columns...

Of course, for users who are using an RDB and want a table like "Time,
DeviceId, s1, s2", their applications can guarantee that the data type of
s2 stays constant.
If s2 holds several data types, RDB users can use a "text" or "varchar2"
column directly.

Considering that, I think the choice is: if all data in a column has the
same data type, use that data type. Otherwise, use String.
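
A minimal sketch of that rule, applied to the example timeseries from the original post. The function name `resolve_column_types` and the use of the label "text" for the fallback type are assumptions for illustration, not an existing IoTDB interface.

```python
# Hypothetical sketch: pick one type per result column for the
# "Time, DeviceId, s1, s2" layout. Paths and types are from the example
# in the original post; the fallback label "text" is an assumption.
SCHEMA = {
    "root.sg1.d1.s1": "int32",
    "root.sg1.d1.s2": "int32",
    "root.sg1.d2.s1": "boolean",
    "root.sg1.d2.s2": "int32",
}


def resolve_column_types(schema):
    """Keep a measurement's type if all devices agree, else fall back to text."""
    resolved = {}
    for path, dtype in schema.items():
        measurement = path.rsplit(".", 1)[1]  # last path segment, e.g. "s1"
        if measurement not in resolved:
            resolved[measurement] = dtype
        elif resolved[measurement] != dtype:
            resolved[measurement] = "text"
    return resolved


print(resolve_column_types(SCHEMA))
```

With the example schema, s1 falls back to text because d1 and d2 disagree, while s2 keeps int32. The column names themselves are unchanged.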

(1) Well, it can be an option. But my suggestion is: if all data in a
column has the same data type, do not change its column name.

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院

