You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@iotdb.apache.org by Trevor Hart <tr...@ope.nz> on 2021/11/01 22:09:26 UTC

New UDTF iotDBExtras

I have put my first UDTF on GitHub - https://github.com/ope-nz/iotDBExtras



The first function I have developed is called UDTFDistinctCount. It returns the distinct values (similar to IoTDB-Quality UDTFDistinct) but it includes the count of each distinct values (as the time column).



Example query;



IoTDB> select distinct_count(temperature) from root.ln.wf01.wt01 

+-----------------------------+---------------------------------------------+

|                         Time|distinct_count(root.ln.wf01.wt01.temperature)|

+-----------------------------+---------------------------------------------+

|1970-01-01T12:00:00.020+12:00|                                        24.37|

|1970-01-01T12:00:00.009+12:00|                                        24.12|

|1970-01-01T12:00:00.016+12:00|                                        24.87|

+-----------------------------+---------------------------------------------+



Or as digital time;



+----+---------------------------------------------+ 


|Time|distinct_count(root.ln.wf01.wt01.temperature)|

+----+---------------------------------------------+

|  20|                                        24.37|

|   9|                                        24.12|

|  16|                                        24.87|

+----+---------------------------------------------+





Thanks 

Trevor Hart

Ope Limited

w: http://www.ope.nz/

m: +64212728039

Re: New UDTF iotDBExtras

Posted by Trevor Hart <tr...@ope.nz>.
I've updated UDTFDistinctCount to allow sorting using the Time column (the pseudo "count" column)



This is useful where you only want to get the top 10 values (based on their occurrence count)



Example;



select distinct_count(temperature,'sort'='asc') from root.ln.wf01.wt01 limit 10;






---- On Wed, 03 Nov 2021 05:53:02 +1300 Steve Su <st...@qq.com.INVALID> wrote ----


Hi Trevor, 
 
Thanks for sharing the usage of UDTF. I have starred your GitHub repo :D 
 
> Besides, could we rename the column from time to count in UDF? 
 
Sadly, no. This can be a new feature. 
 
> Unless I am mistaken a time column must be returned via UDTF - if you look at IoTDB-Quality UDTFDistinct it returns a incrementing Long value as the time column. 
 
In our design, we do expect a UDTF to return an increasing long value column as a time column. But as you see, a non-increasing long value column is also okay. To be honest, I was very surprised to see this can work. 
 
> Initially I wanted to return Col1 (Distinct Values) and Col2 (Count as Int) but at this release it looks like you have to return Col1 (Time) and Col2 (A computed value) - so at this version it does not look possible to return more than 1 computed value so using the time column is just a work around to get the count in the results. Not ideal but gets the job done. 
 
In fact, many users wanted us to support UDTFs that can output multiple columns, but we haven't figured out how to define this feature: what the Java interface should look like, what the SQL should look like, and so on. 
 
I want to hear your suggestions :D 
 
Steve Su 
 
------------------ Original ------------------ 
From: "mailto:dev@iotdb.apache.orgtrevor@ope.nz" <ma...@ope.nz>; 
Date: Tue, Nov 2, 2021 09:15 AM 
To: "dev"<ma...@iotdb.apache.org>; 
Cc: "qiaojialin"<ma...@qq.com.invalid>; 
Subject: Re: New UDTF iotDBExtras 
 
Hello Jialin 
 
 
 
Yes digital time does look better - this is how I will access it through JDBC anyway. I treat all times as Long. 
 
 
 
Unless I am mistaken a time column must be returned via UDTF - if you look at IoTDB-Quality UDTFDistinct it returns a incrementing Long value as the time column. 
 
 
 
The UTDF method to put the values together only accepts two parameters ie time and a value. 
 
 
 
Initially I wanted to return Col1 (Distinct Values) and Col2 (Count as Int) but at this release it looks like you have to return Col1 (Time) and Col2 (A computed value) - so at this version it does not look possible to return more than 1 computed value so using the time column is just a work around to get the count in the results. Not ideal but gets the job done. 
 
 
 
Thanks 
 
Trevor 
 
 
 
 
 
 
---- On Tue, 02 Nov 2021 14:07:03 +1300 Jialin Qiao <ma...@apache.org> wrote ---- 
 
 
 
Hi, 
 
Digital time looks more intuitive. 
Besides, could we rename the column from time to count in UDF? 
@mailto:mailto:steveyurongsu@qq.com.invalid <ma...@qq.com.invalid> 
 
Thanks, 
————————————————— 
Jialin Qiao 
 
Trevor Hart <ma...@ope.nz> 于2021年11月2日周二 上午6:09写道: 
 
> I have put my first UDTF on GitHub - https://github.com/ope-nz/iotDBExtras 
> 
> 
> 
> The first function I have developed is called UDTFDistinctCount. It 
> returns the distinct values (similar to IoTDB-Quality UDTFDistinct) but it 
> includes the count of each distinct values (as the time column). 
> 
> 
> 
> Example query; 
> 
> 
> 
> IoTDB> select distinct_count(temperature) from root.ln.wf01.wt01 
> 
> 
> +-----------------------------+---------------------------------------------+ 
> 
> | 
> Time|distinct_count(root.ln.wf01.wt01.temperature)| 
> 
> 
> +-----------------------------+---------------------------------------------+ 
> 
> |1970-01-01T12:00:00.020+12:00| 
> 24.37| 
> 
> |1970-01-01T12:00:00.009+12:00| 
> 24.12| 
> 
> |1970-01-01T12:00:00.016+12:00| 
> 24.87| 
> 
> 
> +-----------------------------+---------------------------------------------+ 
> 
> 
> 
> Or as digital time; 
> 
> 
> 
> +----+---------------------------------------------+ 
> 
> 
> |Time|distinct_count(root.ln.wf01.wt01.temperature)| 
> 
> +----+---------------------------------------------+ 
> 
> |  20|                                        24.37| 
> 
> |   9|                                        24.12| 
> 
> |  16|                                        24.87| 
> 
> +----+---------------------------------------------+ 
> 
> 
> 
> 
> 
> Thanks 
> 
> Trevor Hart 
> 
> Ope Limited 
> 
> w: http://www.ope.nz/ 
> 
> m: +64212728039

Re: New UDTF iotDBExtras

Posted by Steve Su <st...@qq.com.INVALID>.
Hi Trevor,

Thanks for sharing the usage of UDTF. I have starred your GitHub repo :D

> Besides, could we rename the column from time to count in UDF?

Sadly, no. This can be a new feature.

> Unless I am mistaken a time column must be returned via UDTF - if you look at IoTDB-Quality UDTFDistinct it returns a incrementing Long value as the time column.

In our design, we do expect a UDTF to return an increasing long value column as a time column. But as you see, a non-increasing long value column is also okay. To be honest, I was very surprised to see this can work. 

> Initially I wanted to return Col1 (Distinct Values) and Col2 (Count as Int) but at this release it looks like you have to return Col1 (Time) and Col2 (A computed value) - so at this version it does not look possible to return more than 1 computed value so using the time column is just a work around to get the count in the results. Not ideal but gets the job done.

In fact, many users wanted us to support UDTFs that can output multiple columns, but we haven't figured out how to define this feature: what the Java interface should look like, what the SQL should look like, and so on.

I want to hear your suggestions :D 

Steve Su

------------------ Original ------------------
From: "dev@iotdb.apache.orgtrevor@ope.nz" <tr...@ope.nz>;
Date: Tue, Nov 2, 2021 09:15 AM
To: "dev"<de...@iotdb.apache.org>;
Cc: "qiaojialin"<qi...@qq.com.invalid>;
Subject: Re: New UDTF iotDBExtras

Hello Jialin



Yes digital time does look better - this is how I will access it through JDBC anyway. I treat all times as Long.



Unless I am mistaken a time column must be returned via UDTF - if you look at IoTDB-Quality UDTFDistinct it returns a incrementing Long value as the time column.



The UTDF method to put the values together only accepts two parameters ie time and a value.



Initially I wanted to return Col1 (Distinct Values) and Col2 (Count as Int) but at this release it looks like you have to return Col1 (Time) and Col2 (A computed value) - so at this version it does not look possible to return more than 1 computed value so using the time column is just a work around to get the count in the results. Not ideal but gets the job done.



Thanks 

Trevor






---- On Tue, 02 Nov 2021 14:07:03 +1300 Jialin Qiao <qi...@apache.org> wrote ----



Hi,
 
Digital time looks more intuitive.
Besides, could we rename the column from time to count in UDF?
@mailto:steveyurongsu@qq.com.invalid <ma...@qq.com.invalid>
 
Thanks,
—————————————————
Jialin Qiao
 
Trevor Hart <ma...@ope.nz> 于2021年11月2日周二 上午6:09写道:
 
> I have put my first UDTF on GitHub - https://github.com/ope-nz/iotDBExtras
>
>
>
> The first function I have developed is called UDTFDistinctCount. It
> returns the distinct values (similar to IoTDB-Quality UDTFDistinct) but it
> includes the count of each distinct values (as the time column).
>
>
>
> Example query;
>
>
>
> IoTDB> select distinct_count(temperature) from root.ln.wf01.wt01
>
>
> +-----------------------------+---------------------------------------------+
>
> |
> Time|distinct_count(root.ln.wf01.wt01.temperature)|
>
>
> +-----------------------------+---------------------------------------------+
>
> |1970-01-01T12:00:00.020+12:00|
> 24.37|
>
> |1970-01-01T12:00:00.009+12:00|
> 24.12|
>
> |1970-01-01T12:00:00.016+12:00|
> 24.87|
>
>
> +-----------------------------+---------------------------------------------+
>
>
>
> Or as digital time;
>
>
>
> +----+---------------------------------------------+
>
>
> |Time|distinct_count(root.ln.wf01.wt01.temperature)|
>
> +----+---------------------------------------------+
>
> |  20|                                        24.37|
>
> |   9|                                        24.12|
>
> |  16|                                        24.87|
>
> +----+---------------------------------------------+
>
>
>
>
>
> Thanks
>
> Trevor Hart
>
> Ope Limited
>
> w: http://www.ope.nz/
>
> m: +64212728039

Re: New UDTF iotDBExtras

Posted by Trevor Hart <tr...@ope.nz>.
Hello Jialin



Yes digital time does look better - this is how I will access it through JDBC anyway. I treat all times as Long.



Unless I am mistaken a time column must be returned via UDTF - if you look at IoTDB-Quality UDTFDistinct it returns a incrementing Long value as the time column.



The UTDF method to put the values together only accepts two parameters ie time and a value.



Initially I wanted to return Col1 (Distinct Values) and Col2 (Count as Int) but at this release it looks like you have to return Col1 (Time) and Col2 (A computed value) - so at this version it does not look possible to return more than 1 computed value so using the time column is just a work around to get the count in the results. Not ideal but gets the job done.



Thanks 

Trevor






---- On Tue, 02 Nov 2021 14:07:03 +1300 Jialin Qiao <qi...@apache.org> wrote ----



Hi, 
 
Digital time looks more intuitive. 
Besides, could we rename the column from time to count in UDF? 
@mailto:steveyurongsu@qq.com.invalid <ma...@qq.com.invalid> 
 
Thanks, 
————————————————— 
Jialin Qiao 
 
Trevor Hart <ma...@ope.nz> 于2021年11月2日周二 上午6:09写道: 
 
> I have put my first UDTF on GitHub - https://github.com/ope-nz/iotDBExtras 
> 
> 
> 
> The first function I have developed is called UDTFDistinctCount. It 
> returns the distinct values (similar to IoTDB-Quality UDTFDistinct) but it 
> includes the count of each distinct values (as the time column). 
> 
> 
> 
> Example query; 
> 
> 
> 
> IoTDB> select distinct_count(temperature) from root.ln.wf01.wt01 
> 
> 
> +-----------------------------+---------------------------------------------+ 
> 
> | 
> Time|distinct_count(root.ln.wf01.wt01.temperature)| 
> 
> 
> +-----------------------------+---------------------------------------------+ 
> 
> |1970-01-01T12:00:00.020+12:00| 
> 24.37| 
> 
> |1970-01-01T12:00:00.009+12:00| 
> 24.12| 
> 
> |1970-01-01T12:00:00.016+12:00| 
> 24.87| 
> 
> 
> +-----------------------------+---------------------------------------------+ 
> 
> 
> 
> Or as digital time; 
> 
> 
> 
> +----+---------------------------------------------+ 
> 
> 
> |Time|distinct_count(root.ln.wf01.wt01.temperature)| 
> 
> +----+---------------------------------------------+ 
> 
> |  20|                                        24.37| 
> 
> |   9|                                        24.12| 
> 
> |  16|                                        24.87| 
> 
> +----+---------------------------------------------+ 
> 
> 
> 
> 
> 
> Thanks 
> 
> Trevor Hart 
> 
> Ope Limited 
> 
> w: http://www.ope.nz/ 
> 
> m: +64212728039

Re: New UDTF iotDBExtras

Posted by Jialin Qiao <qi...@apache.org>.
Hi,

Digital time looks more intuitive.
Besides, could we rename the column from time to count in UDF?
@steveyurongsu@qq.com.invalid <st...@qq.com.invalid>

Thanks,
—————————————————
Jialin Qiao

Trevor Hart <tr...@ope.nz> 于2021年11月2日周二 上午6:09写道:

> I have put my first UDTF on GitHub - https://github.com/ope-nz/iotDBExtras
>
>
>
> The first function I have developed is called UDTFDistinctCount. It
> returns the distinct values (similar to IoTDB-Quality UDTFDistinct) but it
> includes the count of each distinct values (as the time column).
>
>
>
> Example query;
>
>
>
> IoTDB> select distinct_count(temperature) from root.ln.wf01.wt01
>
>
> +-----------------------------+---------------------------------------------+
>
> |
> Time|distinct_count(root.ln.wf01.wt01.temperature)|
>
>
> +-----------------------------+---------------------------------------------+
>
> |1970-01-01T12:00:00.020+12:00|
> 24.37|
>
> |1970-01-01T12:00:00.009+12:00|
> 24.12|
>
> |1970-01-01T12:00:00.016+12:00|
> 24.87|
>
>
> +-----------------------------+---------------------------------------------+
>
>
>
> Or as digital time;
>
>
>
> +----+---------------------------------------------+
>
>
> |Time|distinct_count(root.ln.wf01.wt01.temperature)|
>
> +----+---------------------------------------------+
>
> |  20|                                        24.37|
>
> |   9|                                        24.12|
>
> |  16|                                        24.87|
>
> +----+---------------------------------------------+
>
>
>
>
>
> Thanks
>
> Trevor Hart
>
> Ope Limited
>
> w: http://www.ope.nz/
>
> m: +64212728039