Posted to user@kylin.apache.org by 张天生 <zh...@gmail.com> on 2016/07/29 08:54:55 UTC

Query server occupy memory too high

I'm using Kylin 1.5.2.1. I built a cube from one month of event data
for advertisement impressions/clicks/conversions. It has 6 dimensions
and 8 measures, 2 of which are UV measures computed with COUNT
DISTINCT. The cube size is 2 GB. When I queried the UV measures over
the month of data, memory usage quickly grew to 30 GB+, and the query
was also slow. I don't know why it used so much memory when the cube
is only 2 GB; the data expanded enormously in memory. However, when I
ran a similar simple SUM or COUNT query, it was fast and did not use
much memory.

Re: Query server occupy memory too high

Posted by 张天生 <zh...@gmail.com>.
We have 30+ million event log records each day, and a cardinality of about
150,000+ (15w+) for the 6 dimensions. We used an HLL measure with 4.86%
precision, and ran the queries page by page.

On Thu, Aug 4, 2016 at 11:39 PM, hongbin ma <ma...@apache.org> wrote:

> After you run such a query, check KYLIN_HOME/logs/kylin.log; there
> should be a snippet like:
>
> 2016-08-04 00:48:31,990 INFO  [http-bio-7070-exec-7]
> service.QueryService:399 : Scan count for each storageContext: 12306477,
> 2016-08-04 00:48:31,991 INFO  [http-bio-7070-exec-7]
> controller.QueryController:197 : Stats of SQL response: isException: false,
> duration: 56152, total scan count 12306477
> 2016-08-04 00:48:32,000 WARN  [http-bio-7070-exec-7]
>
> Can you let us know the "Scan count for each storageContext" value and
> the size of your query result?
>
> On Thu, Aug 4, 2016 at 2:21 PM, Li Yang <li...@apache.org> wrote:
>
>> Depending on how many rows and how many count distinct values are
>> returned, the query may take a lot of memory and become slow.
>>
>> When you say you are querying the UV of a month of data, how many rows
>> do you expect? Also, what's the precision of the HLL measure? Lowering
>> the precision can ease the problem too.
>>
>> On Fri, Jul 29, 2016 at 4:54 PM, 张天生 <zh...@gmail.com> wrote:
>>
>>> I'm using Kylin 1.5.2.1. I built a cube from one month of event data
>>> for advertisement impressions/clicks/conversions. It has 6 dimensions
>>> and 8 measures, 2 of which are UV measures computed with COUNT
>>> DISTINCT. The cube size is 2 GB. When I queried the UV measures over
>>> the month of data, memory usage quickly grew to 30 GB+, and the query
>>> was also slow. I don't know why it used so much memory when the cube
>>> is only 2 GB; the data expanded enormously in memory. However, when I
>>> ran a similar simple SUM or COUNT query, it was fast and did not use
>>> much memory.
>>>
>>
>>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
>

Re: Query server occupy memory too high

Posted by hongbin ma <ma...@apache.org>.
After you run such a query, check KYLIN_HOME/logs/kylin.log; there
should be a snippet like:

2016-08-04 00:48:31,990 INFO  [http-bio-7070-exec-7]
service.QueryService:399 : Scan count for each storageContext: 12306477,
2016-08-04 00:48:31,991 INFO  [http-bio-7070-exec-7]
controller.QueryController:197 : Stats of SQL response: isException: false,
duration: 56152, total scan count 12306477
2016-08-04 00:48:32,000 WARN  [http-bio-7070-exec-7]

Can you let us know the "Scan count for each storageContext" value and
the size of your query result?
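
For anyone who wants to pull those two numbers out of kylin.log without
reading it by hand, here is a minimal sketch. It assumes the log lines look
like the excerpt above once the email line wrapping is undone; the function
name extract_scan_stats and the regular expressions are illustrative only
and are not part of Kylin.

```python
import re

# Sample text copied from the log excerpt in this thread, with the
# email line wrapping undone (assumption: each entry is one line in the file).
SAMPLE_LOG = (
    "2016-08-04 00:48:31,990 INFO  [http-bio-7070-exec-7] "
    "service.QueryService:399 : Scan count for each storageContext: 12306477,\n"
    "2016-08-04 00:48:31,991 INFO  [http-bio-7070-exec-7] "
    "controller.QueryController:197 : Stats of SQL response: isException: false, "
    "duration: 56152, total scan count 12306477\n"
)

# Digits, commas and spaces only, so the match stops at the end of the line.
PER_CONTEXT = re.compile(r"Scan count for each storageContext:\s*([\d, ]+)")
TOTAL = re.compile(r"duration:\s*(\d+),\s*total scan count\s+(\d+)")


def extract_scan_stats(log_text):
    """Return (per-context scan counts, [(duration, total scan count), ...])."""
    per_context = [
        [int(n) for n in re.findall(r"\d+", match.group(1))]
        for match in PER_CONTEXT.finditer(log_text)
    ]
    totals = [(int(d), int(c)) for d, c in TOTAL.findall(log_text)]
    return per_context, totals


if __name__ == "__main__":
    contexts, totals = extract_scan_stats(SAMPLE_LOG)
    print("per-context scan counts:", contexts)
    print("(duration, total scan count):", totals)
```

To use it on a real log, read KYLIN_HOME/logs/kylin.log into a string and
pass it to extract_scan_stats instead of SAMPLE_LOG.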

On Thu, Aug 4, 2016 at 2:21 PM, Li Yang <li...@apache.org> wrote:

> Depending on how many rows and how many count distinct values are
> returned, the query may take a lot of memory and become slow.
>
> When you say you are querying the UV of a month of data, how many rows
> do you expect? Also, what's the precision of the HLL measure? Lowering
> the precision can ease the problem too.
>
> On Fri, Jul 29, 2016 at 4:54 PM, 张天生 <zh...@gmail.com> wrote:
>
>> I'm using Kylin 1.5.2.1. I built a cube from one month of event data
>> for advertisement impressions/clicks/conversions. It has 6 dimensions
>> and 8 measures, 2 of which are UV measures computed with COUNT
>> DISTINCT. The cube size is 2 GB. When I queried the UV measures over
>> the month of data, memory usage quickly grew to 30 GB+, and the query
>> was also slow. I don't know why it used so much memory when the cube
>> is only 2 GB; the data expanded enormously in memory. However, when I
>> ran a similar simple SUM or COUNT query, it was fast and did not use
>> much memory.
>>
>
>


-- 
Regards,

*Bin Mahone | 马洪宾*

Re: Query server occupy memory too high

Posted by Li Yang <li...@apache.org>.
Depending on how many rows and how many count distinct values are returned,
the query may take a lot of memory and become slow.

When you say you are querying the UV of a month of data, how many rows do
you expect? Also, what's the precision of the HLL measure? Lowering the
precision can ease the problem too.
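
To make the row-count and precision point concrete, here is a rough
back-of-the-envelope sketch. It rests on assumptions that are not confirmed
anywhere in this thread: each HLL value is treated as a dense array of 2^p
one-byte registers, and every scanned row is assumed to hold one such array
per UV measure in memory at the same time. That is a pessimistic upper
bound, not a description of how Kylin actually stores or aggregates HLL
counters. The parameter p and all function names are hypothetical.

```python
def hll_dense_bytes(precision_bits):
    """Bytes for one dense HLL sketch, assuming one byte per register."""
    return 1 << precision_bits


def standard_error(precision_bits):
    """Textbook HyperLogLog standard error: 1.04 / sqrt(m), with m = 2**p."""
    return 1.04 / (2 ** precision_bits) ** 0.5


def worst_case_bytes(rows, hll_measures, precision_bits):
    """Upper bound if every scanned row keeps a dense sketch per measure."""
    return rows * hll_measures * hll_dense_bytes(precision_bits)


if __name__ == "__main__":
    # Scan count taken from the sample log in this thread; it is NOT the
    # original poster's real number, which was never reported.
    rows = 12_306_477
    uv_measures = 2  # the cube in question has 2 UV measures
    for p in (10, 12, 14):  # hypothetical precision settings
        gib = worst_case_bytes(rows, uv_measures, p) / 2**30
        print(f"p={p}: std error ~{standard_error(p):.2%}, "
              f"worst case ~{gib:.1f} GiB")
```

The point of the sketch is only the scaling: under these assumptions memory
grows linearly with the number of rows and measures, and each step down in
precision bits halves the per-row cost, whereas SUM and COUNT carry a single
number per row.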

On Fri, Jul 29, 2016 at 4:54 PM, 张天生 <zh...@gmail.com> wrote:

> I'm using Kylin 1.5.2.1. I built a cube from one month of event data
> for advertisement impressions/clicks/conversions. It has 6 dimensions
> and 8 measures, 2 of which are UV measures computed with COUNT
> DISTINCT. The cube size is 2 GB. When I queried the UV measures over
> the month of data, memory usage quickly grew to 30 GB+, and the query
> was also slow. I don't know why it used so much memory when the cube
> is only 2 GB; the data expanded enormously in memory. However, when I
> ran a similar simple SUM or COUNT query, it was fast and did not use
> much memory.
>