Posted to dev@iotdb.apache.org by Igor Gois <ig...@gmail.com> on 2021/02/01 22:23:14 UTC

New here - Python client to pandas

Hi everyone,

I am sorry if this is not the right place to ask, but I couldn't install QQ
or WeChat. I am from Brazil.

I am using the Python client and would like to know whether there is a way to
convert a session dataset (the result of session.execute_query_statement) into
a pandas DataFrame.

I tried the example code with the while loop. It worked, but it was slow.

I am working on a project to store and analyze wind turbine data, and IoTDB
seems to be a great fit.

Thanks in advance, and sorry if this is the wrong place.

[image: image.png]


Igor Gois

Re: New here - Python client to pandas

Posted by Igor Gois <ig...@gmail.com>.
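Just for the record, this way is faster:

import pandas as pd

# Execute the SQL query statement (`session` is an already opened IoTDB session).
session_data_set = session.execute_query_statement(
    "select "
    "s9,s10,s11,s12,s13,s14,s15,s16,s17,s18,s19,s22,s23,s24,s25,s26,s27,s28,s29,s30,s31,s32,s33,s61,s62,s64,s72,s56,s57 "
    "from root.sae.trends.1 order by time desc limit 100000"
)

session_data_set.set_fetch_size(1024)

results_columns = session_data_set.get_column_names()

# Collect every row in a plain list first instead of growing a DataFrame.
x = []

while session_data_set.has_next():
    # Each row prints as whitespace-separated values, starting with the timestamp.
    x.append(tuple(str(session_data_set.next()).split()))

# Build the DataFrame once, after all rows have been fetched.
results = pd.DataFrame(x, columns=results_columns)

print(results)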
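
The same idea can be wrapped in a small helper. The following is only a sketch, not part of the IoTDB client: dataset_to_rows and rows_to_dataframe are hypothetical names, and FakeRow and FakeDataSet are stand-ins that mimic only the has_next(), next() and get_timestamp() calls used in this thread, so that the snippet runs without a server. It assumes, as above, that a row prints as whitespace-separated values beginning with the timestamp in milliseconds, and it formats that timestamp in UTC.

```python
from datetime import datetime, timezone


class FakeRow:
    """Stand-in for a row record: a millisecond timestamp plus values."""

    def __init__(self, timestamp_ms, values):
        self._timestamp_ms = timestamp_ms
        self._values = values

    def get_timestamp(self):
        return self._timestamp_ms

    def __str__(self):
        return "\t\t".join([str(self._timestamp_ms)] + [str(v) for v in self._values])


class FakeDataSet:
    """Stand-in for a session dataset exposing has_next() and next()."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._pos = 0

    def has_next(self):
        return self._pos < len(self._rows)

    def next(self):
        row = self._rows[self._pos]
        self._pos += 1
        return row


def dataset_to_rows(data_set):
    """Drain the dataset into a list of tuples, formatting the timestamp in UTC."""
    rows = []
    while data_set.has_next():
        row = data_set.next()
        values = str(row).split()
        stamp = datetime.fromtimestamp(row.get_timestamp() / 1000, tz=timezone.utc)
        values[0] = stamp.strftime("%Y-%m-%d %H:%M:%S")
        rows.append(tuple(values))
    return rows


def rows_to_dataframe(rows, columns):
    """Build the DataFrame in a single call once all rows are collected."""
    import pandas as pd  # imported here so dataset_to_rows works without pandas

    return pd.DataFrame(rows, columns=columns)
```

With a real session dataset, rows_to_dataframe(dataset_to_rows(session_data_set), results_columns) would take the place of the loop above. All values stay strings in this sketch; converting the sensor columns to numbers is left out.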


Regards,

Igor Gois







Re: New here - Python client to pandas

Posted by Igor Gois <ig...@gmail.com>.
Thank you, Xiangdong

Here is the python code that I used:

import pandas as pd
from datetime import datetime

# Execute the SQL query statement (`session` is an already opened IoTDB session).
session_data_set = session.execute_query_statement(
    "select "
    "s9,s10,s11,s12,s13,s14,s15,s16,s17,s18,s19,s22,s23,s24,s25,s26,s27,s28,s29,s30,s31,s32,s33,s61,s62,s64,s72,s56,s57 "
    "from root.sae.trends.1 order by time desc limit 100"
)

session_data_set.set_fetch_size(1024)
results_columns = session_data_set.get_column_names()
print(session_data_set.get_column_types())
print(results_columns)

results = pd.DataFrame(columns=results_columns)

while session_data_set.has_next():
    row = session_data_set.next()
    row_str = str(row).split()
    # Replace the raw millisecond timestamp with a readable local date-time string.
    row_str[0] = datetime.fromtimestamp(row.get_timestamp() / 1000).strftime("%Y-%m-%d %H:%M:%S")

    # Appending one row at a time returns a new DataFrame on every iteration.
    results = results.append(pd.Series(row_str, index=results_columns), ignore_index=True)

print(results)
session_data_set.close_operation_handle()

When I remove the 'limit 100', it takes too long. There are 72 time series
(one per sensor), each with about 40k data points.

Is there a faster way to do this?
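
One way to see why removing the limit hurts so much: results.append(...) returns a new DataFrame each time, so every iteration copies all the rows gathered so far. The sketch below only counts row copies under that assumption. It is a back-of-the-envelope model, not a measurement, and the function names are made up for illustration.

```python
def copies_when_appending_to_frame(n_rows):
    """Rows copied if each append rebuilds the frame from the existing rows.

    Before the k-th append there are k - 1 rows to copy, so the total is
    0 + 1 + ... + (n_rows - 1) = n_rows * (n_rows - 1) / 2.
    """
    return n_rows * (n_rows - 1) // 2


def copies_when_building_once(n_rows):
    """Rows copied if they are collected in a list and converted a single time."""
    return n_rows
```

For 100 rows the model gives 4,950 row copies. For the 40,778 rows reported by the count query it gives 831,402,253, against 40,778 when the DataFrame is built once from a list.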

Thank you in advance


IoTDB> select count(s1) from root.sae.trends.1
+---------------------------+
|count(root.sae.trends.1.s1)|
+---------------------------+
|                      40778|
+---------------------------+

IoTDB> count timeseries root
+-----+
|count|
+-----+
|   72|
+-----+

Regards,

Igor Gois







Re: New here - Python client to pandas

Posted by Xiangdong Huang <sa...@gmail.com>.
By the way, we should indeed consider whether our iotdb-session Python API
is efficient enough (the Python library performs many array and serialization
operations).

As I am not a Python expert, I cannot draw a conclusion myself, so I am also
calling on contributors to take a look.
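
A first step towards answering this could be to time the fetch loop separately from the DataFrame construction. The sketch below uses only time.perf_counter from the standard library. The helper name timed is hypothetical, and the lambda is a stand-in workload that splits row strings instead of talking to a server.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in workload: splitting 10,000 row strings, similar to str(row).split().
lines = ["1612218194000\t\t1.5\t\t2.0"] * 10_000
rows, seconds = timed(lambda: [tuple(line.split()) for line in lines])
print(f"parsed {len(rows)} rows in {seconds:.4f} s")
```

Wrapping the real fetch loop and the pd.DataFrame(...) call in timed separately would show which of the two dominates.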

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院



Re: New here - Python client to pandas

Posted by Xiangdong Huang <sa...@gmail.com>.
Hi Igor,

> I am sorry if this is not the right place to ask

The mailing list is absolutely the right place :D
However, our mailing list only supports plain text.
If you need to share screenshots, please create a Jira issue (
https://issues.apache.org/jira/projects/IOTDB/issues).

Actually, several days ago my teammate @steveyurongsu@qq.com
<st...@qq.com> also talked with me about integrating the IoTDB
Python client with pandas.
I think it is a great idea, and we can develop it together.

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院

