Posted to dev@iotdb.apache.org by Jialin Qiao <qj...@mails.tsinghua.edu.cn> on 2020/07/09 07:16:48 UTC

Fetch query result iteratively to avoid OOM

Hi,


Haihang and Yuan Tian recently added an improvement that fetches the results of "show timeseries" iteratively to avoid OOM [1].


For data queries, we use QueryDataSet, which provides a hasNext/next iterator, so the client can fetch the result in batches of fetchSize.


However, most metadata queries use ListDataSet, which caches all results on the server. This may cause OOM.
"show timeseries" is just one case of a query that uses ListDataSet and whose results may be very large.
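
To make the difference concrete, here is a minimal sketch of batch-wise fetching from a lazy iterator. BatchFetcher and LazyPaths are hypothetical names for illustration only; they are not the actual QueryDataSet or ListDataSet classes, and the path format is made up.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical helper: serves rows in batches of at most fetchSize. */
final class BatchFetcher<T> {
    private final Iterator<T> source;

    BatchFetcher(Iterator<T> source) {
        this.source = source;
    }

    /** Returns up to fetchSize rows; an empty list means the result is exhausted. */
    List<T> fetch(int fetchSize) {
        if (fetchSize <= 0) {
            throw new IllegalArgumentException("fetchSize must be positive");
        }
        List<T> batch = new ArrayList<>();
        // Only pull as many rows as the client asked for.
        while (batch.size() < fetchSize && source.hasNext()) {
            batch.add(source.next());
        }
        return batch;
    }
}

/** Hypothetical lazy source: generates paths on demand instead of caching a full list. */
final class LazyPaths implements Iterator<String> {
    private final int total;
    private int produced = 0;

    LazyPaths(int total) {
        this.total = total;
    }

    @Override
    public boolean hasNext() {
        return produced < total;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return "root.sg.d" + (produced++) + ".s0";
    }

    /** Number of rows materialized so far. */
    int produced() {
        return produced;
    }
}
```

With this shape, the server holds at most one batch per open query in memory, whereas a list-backed result holds every row until the query is closed.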


We should check the other metadata queries as well. When you add a new query type, please consider its memory consumption.


[1] https://github.com/apache/incubator-iotdb/pull/1470

Thanks,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

Re: Fetch query result iteratively to avoid OOM

Posted by Julian Feinauer <j....@pragmaticminds.de>.
Hi Jialin,

This is a very good point indeed.
I remember that Calcite has the Linq4j module, which is based on an extension of "Iterable / Iterator" that essentially does lazy loading / computation.
This object was assembled throughout the query and finally handed to the server's handler, which could then do all the chunking or fetching with the client.

Of course, some implementations require some amount of memory, but in most situations one can realize them in a "bounded" way.

Here are some details: https://calcite.apache.org/docs/
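
To illustrate the lazy idea, here is a rough sketch built on plain java.util.Iterator. It is not Linq4j's API; LazyOps and its map/filter methods are hypothetical names that only show how operators can be stacked during planning while each element is computed when the handler pulls it.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Function;
import java.util.function.Predicate;

/** Hypothetical lazy operators; each wrapper buffers at most one element. */
final class LazyOps {
    private LazyOps() {}

    /** Transforms elements one at a time, only when next() is called. */
    static <A, B> Iterator<B> map(Iterator<A> in, Function<A, B> fn) {
        return new Iterator<B>() {
            @Override
            public boolean hasNext() {
                return in.hasNext();
            }

            @Override
            public B next() {
                return fn.apply(in.next());
            }
        };
    }

    /** Skips non-matching elements, keeping a single look-ahead slot. */
    static <A> Iterator<A> filter(Iterator<A> in, Predicate<A> keep) {
        return new Iterator<A>() {
            private A lookahead;
            private boolean ready = false;

            @Override
            public boolean hasNext() {
                // Advance the upstream iterator until a match is found or it ends.
                while (!ready && in.hasNext()) {
                    A candidate = in.next();
                    if (keep.test(candidate)) {
                        lookahead = candidate;
                        ready = true;
                    }
                }
                return ready;
            }

            @Override
            public A next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                ready = false;
                A out = lookahead;
                lookahead = null;
                return out;
            }
        };
    }
}
```

Nothing is read from the source when the pipeline is assembled. Stateful operators such as sorting cannot be written this way without buffering, which is where the "bounded" memory consideration comes in.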

Julian
