Posted to user@phoenix.apache.org by Aaron Molitor <am...@splicemachine.com> on 2016/08/24 14:54:30 UTC

high client cpu usage

Seeing higher than expected CPU/MEM usage for the java process started by the sqlline.py client. 

From top:
top - 14:37:32 up 7 days, 22:15,  2 users,  load average: 25.52, 9.74, 7.89
Tasks: 509 total,   1 running, 508 sleeping,   0 stopped,   0 zombie
Cpu(s): 61.2%us,  6.3%sy,  0.0%ni, 31.0%id,  0.5%wa,  0.0%hi,  1.0%si,  0.0%st
Mem:  65920564k total, 31913580k used, 34006984k free,   647004k buffers
Swap: 33030140k total,        0k used, 33030140k free, 10464056k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                                                                                                                                                                                                                                                                                    
 8729 splice    20   0 29.9g  14g 889m S 1616.8 22.4  26:19.21 java    

Why is the JDBC client using so much memory/CPU? The expected result is only a single row.

QUERY:
SELECT SUM(L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY
FROM
  TPCH.LINEITEM,
  TPCH.PART
WHERE
  P_PARTKEY = L_PARTKEY
  AND P_BRAND = 'BRAND#23'
  AND P_CONTAINER = 'MED BOX'
  AND L_QUANTITY < (
    SELECT 0.2 * AVG(L_QUANTITY)
    FROM
      TPCH.LINEITEM
    WHERE
      L_PARTKEY = P_PARTKEY
  );

ROW COUNTS: 
TPCH.REGION             5
TPCH.NATION            25
TPCH.SUPPLIER     1000000
TPCH.PART        20000000
TPCH.PARTSUPP    80000000
TPCH.ORDERS     150000000
TPCH.CUSTOMER    15000000
TPCH.LINEITEM   600037902 

Thanks, 
Aaron

Re: high client cpu usage

Posted by Aaron Molitor <am...@splicemachine.com>.
1/1          EXPLAIN SELECT SUM(L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY 
FROM 
TPCH.LINEITEM, 
TPCH.PART 
WHERE 
P_PARTKEY = L_PARTKEY 
AND P_BRAND = 'BRAND#23' 
AND P_CONTAINER = 'MED BOX' 
AND L_QUANTITY < ( 
SELECT 0.2 * AVG(L_QUANTITY) 
FROM 
TPCH.LINEITEM 
WHERE 
L_PARTKEY = P_PARTKEY 
) 
;
+--------------------------------------------------------------------------------------------------------------+
|                                                     PLAN                                                     |   
+--------------------------------------------------------------------------------------------------------------+
| CLIENT 1422-CHUNK 1186588769 ROWS 436312507108 BYTES PARALLEL 1-WAY FULL SCAN OVER TPCH.LINEITEM             |   
|     SERVER AGGREGATE INTO SINGLE ROW                                                                         |   
|     PARALLEL INNER-JOIN TABLE 0                                                                              |   
|         CLIENT 28-CHUNK 19045177 ROWS 7549747697 BYTES PARALLEL 1-WAY ROUND ROBIN FULL SCAN OVER TPCH.PART   |   
|             SERVER FILTER BY (P_BRAND = 'BRAND#23' AND P_CONTAINER = 'MED BOX')                              |   
|     PARALLEL INNER-JOIN TABLE 1(DELAYED EVALUATION)                                                          |   
|         CLIENT 1422-CHUNK 1186588769 ROWS 436312507108 BYTES PARALLEL 1422-WAY FULL SCAN OVER TPCH.LINEITEM  |
|             SERVER AGGREGATE INTO DISTINCT ROWS BY [L_PARTKEY]                                               |   
|         CLIENT MERGE SORT                                                                                    |   
|     AFTER-JOIN SERVER FILTER BY TPCH.LINEITEM.L_QUANTITY < $1.$3                                             |   
+--------------------------------------------------------------------------------------------------------------+
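Read literally, the plan does two full passes over LINEITEM: one to build per-partkey averages for the correlated subquery (PARALLEL INNER-JOIN TABLE 1), one for the outer aggregate, plus a filtered scan of PART (join table 0), with the subquery result applied as the AFTER-JOIN filter. A toy Python sketch of that same dataflow (illustrative only, not Phoenix code; the table data below is made up):

```python
from collections import defaultdict

def avg_yearly(lineitem, part):
    """Re-enact the plan above on in-memory rows: build per-partkey
    averages (join table 1), filter PART (join table 0), then apply
    the after-join filter and the final single-row aggregate."""
    # Join table 1: AVG(L_QUANTITY) grouped by L_PARTKEY (full LINEITEM scan).
    sums = defaultdict(float)
    counts = defaultdict(int)
    for partkey, qty, _price in lineitem:
        sums[partkey] += qty
        counts[partkey] += 1
    avg_qty = {k: sums[k] / counts[k] for k in sums}

    # Join table 0: PART keys surviving the brand/container filter.
    wanted = {pk for pk, brand, container in part
              if brand == 'BRAND#23' and container == 'MED BOX'}

    # After-join filter: L_QUANTITY < 0.2 * AVG(L_QUANTITY) for that partkey.
    total = sum(price for partkey, qty, price in lineitem
                if partkey in wanted and qty < 0.2 * avg_qty[partkey])
    return total / 7.0

# Tiny made-up rows: (L_PARTKEY, L_QUANTITY, L_EXTENDEDPRICE)
lineitem = [(1, 1.0, 700.0), (1, 10.0, 100.0), (1, 10.0, 100.0), (2, 1.0, 50.0)]
# (P_PARTKEY, P_BRAND, P_CONTAINER)
part = [(1, 'BRAND#23', 'MED BOX'), (2, 'BRAND#01', 'JUMBO CAN')]
print(avg_yearly(lineitem, part))  # only the qty=1.0 row of partkey 1 qualifies -> 100.0
```

The point of the sketch: the single-row answer still requires scanning all of LINEITEM twice, which matches the two 1186588769-row entries in the plan.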


> On Aug 25, 2016, at 14:23, Maryann Xue <ma...@gmail.com> wrote:
> 
> Hi John,
> 
> Would you mind sharing the query plan for this query (by running "EXPLAIN <query>")?
> 
> 
> Thanks,
> Maryann
> 
> On Thu, Aug 25, 2016 at 11:19 AM, John Leach <jl...@gmail.com> wrote:
> Yeah, this query.
> 
> QUERY:
> SELECT SUM(L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY
> FROM
>  TPCH.LINEITEM,
>  TPCH.PART
> WHERE
>  P_PARTKEY = L_PARTKEY
>  AND P_BRAND = 'BRAND#23'
>  AND P_CONTAINER = 'MED BOX'
>  AND L_QUANTITY < (
>    SELECT 0.2 * AVG(L_QUANTITY)
>    FROM
>      TPCH.LINEITEM
>    WHERE
>      L_PARTKEY = P_PARTKEY
>  );
> 
> ROW COUNTS: 
> TPCH.REGION             5
> TPCH.NATION            25
> TPCH.SUPPLIER     1000000
> TPCH.PART        20000000
> TPCH.PARTSUPP    80000000
> TPCH.ORDERS     150000000
> TPCH.CUSTOMER    15000000
> TPCH.LINEITEM   600037902 
> 
> Does the client have to sort all of LINEITEM and PART?
> 
> Thanks James.
> 
> Regards,
> John
> 
> 
> 
>> On Aug 25, 2016, at 1:16 PM, James Taylor <ja...@apache.org> wrote:
>> 
>> The client typically does a merge sort when combining results from parallel scans. Not sure if this would explain the CPU/memory usage you're seeing. Can you narrow it down to a particular query that's causing the issue?
>> 
>> Thanks,
>> James
>> 
>> On Thu, Aug 25, 2016 at 6:49 AM, John Leach <jl...@splicemachine.com> wrote:
>> Can anyone explain why the client would be burning so much CPU and memory if the result is a single row?
>> 
>> I suspect we configured something wrong on Phoenix but we are having a hard time figuring it out.
>> 
>> Thanks in advance.
>> 
>> Regards,
>> John
>> 
>> > On Aug 24, 2016, at 9:54 AM, Aaron Molitor <am...@splicemachine.com> wrote:
>> >
>> > Seeing higher than expected CPU/MEM usage for the java process started by the sqlline.py client.
>> >
>> > From top:
>> > top - 14:37:32 up 7 days, 22:15,  2 users,  load average: 25.52, 9.74, 7.89
>> > Tasks: 509 total,   1 running, 508 sleeping,   0 stopped,   0 zombie
>> > Cpu(s): 61.2%us,  6.3%sy,  0.0%ni, 31.0%id,  0.5%wa,  0.0%hi,  1.0%si,  0.0%st
>> > Mem:  65920564k total, 31913580k used, 34006984k free,   647004k buffers
>> > Swap: 33030140k total,        0k used, 33030140k free, 10464056k cached
>> >
>> >  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> > 8729 splice    20   0 29.9g  14g 889m S 1616.8 22.4  26:19.21 java
>> >
>> > Why is the JDBC client using so much memory/CPU? The expected result is only a single row.
>> >
>> > QUERY:
>> > SELECT SUM(L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY
>> > FROM
>> >  TPCH.LINEITEM,
>> >  TPCH.PART
>> > WHERE
>> >  P_PARTKEY = L_PARTKEY
>> >  AND P_BRAND = 'BRAND#23'
>> >  AND P_CONTAINER = 'MED BOX'
>> >  AND L_QUANTITY < (
>> >    SELECT 0.2 * AVG(L_QUANTITY)
>> >    FROM
>> >      TPCH.LINEITEM
>> >    WHERE
>> >      L_PARTKEY = P_PARTKEY
>> >  );
>> >
>> > ROW COUNTS:
>> > TPCH.REGION             5
>> > TPCH.NATION            25
>> > TPCH.SUPPLIER     1000000
>> > TPCH.PART        20000000
>> > TPCH.PARTSUPP    80000000
>> > TPCH.ORDERS     150000000
>> > TPCH.CUSTOMER    15000000
>> > TPCH.LINEITEM   600037902
>> >
>> > Thanks,
>> > Aaron
>> 
>> 
> 
> 


Re: high client cpu usage

Posted by Maryann Xue <ma...@gmail.com>.
Hi John,

Would you mind sharing the query plan for this query (by running "EXPLAIN
<query>")?


Thanks,
Maryann

On Thu, Aug 25, 2016 at 11:19 AM, John Leach <jl...@gmail.com> wrote:

> Yeah, this query.
>
> QUERY:
> SELECT SUM(L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY
> FROM
>  TPCH.LINEITEM,
>  TPCH.PART
> WHERE
>  P_PARTKEY = L_PARTKEY
>  AND P_BRAND = 'BRAND#23'
>  AND P_CONTAINER = 'MED BOX'
>  AND L_QUANTITY < (
>    SELECT 0.2 * AVG(L_QUANTITY)
>    FROM
>      TPCH.LINEITEM
>    WHERE
>      L_PARTKEY = P_PARTKEY
>  );
>
> ROW COUNTS:
> TPCH.REGION             5
> TPCH.NATION            25
> TPCH.SUPPLIER     1000000
> TPCH.PART        20000000
> TPCH.PARTSUPP    80000000
> TPCH.ORDERS     150000000
> TPCH.CUSTOMER    15000000
> TPCH.LINEITEM   600037902
>
> Does the client have to sort all of LINEITEM and PART?
>
> Thanks James.
>
> Regards,
> John
>
>
>
> On Aug 25, 2016, at 1:16 PM, James Taylor <ja...@apache.org> wrote:
>
> The client typically does a merge sort when combining results from
> parallel scans. Not sure if this would explain the CPU/memory usage you're
> seeing. Can you narrow it down to a particular query that's causing the
> issue?
>
> Thanks,
> James
>
> On Thu, Aug 25, 2016 at 6:49 AM, John Leach <jl...@splicemachine.com>
> wrote:
>
>> Can anyone explain why the client would be burning so much CPU and memory
>> if the result is a single row?
>>
>> I suspect we configured something wrong on Phoenix but we are having a
>> hard time figuring it out.
>>
>> Thanks in advance.
>>
>> Regards,
>> John
>>
>> > On Aug 24, 2016, at 9:54 AM, Aaron Molitor <am...@splicemachine.com>
>> wrote:
>> >
>> > Seeing higher than expected CPU/MEM usage for the java process started
>> by the sqlline.py client.
>> >
>> > From top:
>> > top - 14:37:32 up 7 days, 22:15,  2 users,  load average: 25.52, 9.74,
>> 7.89
>> > Tasks: 509 total,   1 running, 508 sleeping,   0 stopped,   0 zombie
>> > Cpu(s): 61.2%us,  6.3%sy,  0.0%ni, 31.0%id,  0.5%wa,  0.0%hi,  1.0%si,
>> 0.0%st
>> > Mem:  65920564k total, 31913580k used, 34006984k free,   647004k buffers
>> > Swap: 33030140k total,        0k used, 33030140k free, 10464056k cached
>> >
>> >  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> > 8729 splice    20   0 29.9g  14g 889m S 1616.8 22.4  26:19.21 java
>> >
>> > Why is the JDBC client using so much memory/CPU?  The expected result
>> is only a single row.
>> >
>> > QUERY:
>> > SELECT SUM(L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY
>> > FROM
>> >  TPCH.LINEITEM,
>> >  TPCH.PART
>> > WHERE
>> >  P_PARTKEY = L_PARTKEY
>> >  AND P_BRAND = 'BRAND#23'
>> >  AND P_CONTAINER = 'MED BOX'
>> >  AND L_QUANTITY < (
>> >    SELECT 0.2 * AVG(L_QUANTITY)
>> >    FROM
>> >      TPCH.LINEITEM
>> >    WHERE
>> >      L_PARTKEY = P_PARTKEY
>> >  );
>> >
>> > ROW COUNTS:
>> > TPCH.REGION             5
>> > TPCH.NATION            25
>> > TPCH.SUPPLIER     1000000
>> > TPCH.PART        20000000
>> > TPCH.PARTSUPP    80000000
>> > TPCH.ORDERS     150000000
>> > TPCH.CUSTOMER    15000000
>> > TPCH.LINEITEM   600037902
>> >
>> > Thanks,
>> > Aaron
>>
>>
>
>

Re: high client cpu usage

Posted by John Leach <jl...@gmail.com>.
Yeah, this query.

QUERY:
SELECT SUM(L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY
FROM
 TPCH.LINEITEM,
 TPCH.PART
WHERE
 P_PARTKEY = L_PARTKEY
 AND P_BRAND = 'BRAND#23'
 AND P_CONTAINER = 'MED BOX'
 AND L_QUANTITY < (
   SELECT 0.2 * AVG(L_QUANTITY)
   FROM
     TPCH.LINEITEM
   WHERE
     L_PARTKEY = P_PARTKEY
 );

ROW COUNTS: 
TPCH.REGION             5
TPCH.NATION            25
TPCH.SUPPLIER     1000000
TPCH.PART        20000000
TPCH.PARTSUPP    80000000
TPCH.ORDERS     150000000
TPCH.CUSTOMER    15000000
TPCH.LINEITEM   600037902 

Does the client have to sort all of LINEITEM and PART?

Thanks James.

Regards,
John



> On Aug 25, 2016, at 1:16 PM, James Taylor <ja...@apache.org> wrote:
> 
> The client typically does a merge sort when combining results from parallel scans. Not sure if this would explain the CPU/memory usage you're seeing. Can you narrow it down to a particular query that's causing the issue?
> 
> Thanks,
> James
> 
> On Thu, Aug 25, 2016 at 6:49 AM, John Leach <jleach@splicemachine.com <ma...@splicemachine.com>> wrote:
> Can anyone explain why the client would be burning so much CPU and memory if the result is a single row?
> 
> I suspect we configured something wrong on Phoenix but we are having a hard time figuring it out.
> 
> Thanks in advance.
> 
> Regards,
> John
> 
> > On Aug 24, 2016, at 9:54 AM, Aaron Molitor <amolitor@splicemachine.com <ma...@splicemachine.com>> wrote:
> >
> > Seeing higher than expected CPU/MEM usage for the java process started by the sqlline.py client.
> >
> > From top:
> > top - 14:37:32 up 7 days, 22:15,  2 users,  load average: 25.52, 9.74, 7.89
> > Tasks: 509 total,   1 running, 508 sleeping,   0 stopped,   0 zombie
> > Cpu(s): 61.2%us,  6.3%sy,  0.0%ni, 31.0%id,  0.5%wa,  0.0%hi,  1.0%si,  0.0%st
> > Mem:  65920564k total, 31913580k used, 34006984k free,   647004k buffers
> > Swap: 33030140k total,        0k used, 33030140k free, 10464056k cached
> >
> >  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> > 8729 splice    20   0 29.9g  14g 889m S 1616.8 22.4  26:19.21 java
> >
> > Why is the JDBC client using so much memory/CPU?  The expected result is only a single row.
> >
> > QUERY:
> > SELECT SUM(L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY
> > FROM
> >  TPCH.LINEITEM,
> >  TPCH.PART
> > WHERE
> >  P_PARTKEY = L_PARTKEY
> >  AND P_BRAND = 'BRAND#23'
> >  AND P_CONTAINER = 'MED BOX'
> >  AND L_QUANTITY < (
> >    SELECT 0.2 * AVG(L_QUANTITY)
> >    FROM
> >      TPCH.LINEITEM
> >    WHERE
> >      L_PARTKEY = P_PARTKEY
> >  );
> >
> > ROW COUNTS:
> > TPCH.REGION             5
> > TPCH.NATION            25
> > TPCH.SUPPLIER     1000000
> > TPCH.PART        20000000
> > TPCH.PARTSUPP    80000000
> > TPCH.ORDERS     150000000
> > TPCH.CUSTOMER    15000000
> > TPCH.LINEITEM   600037902
> >
> > Thanks,
> > Aaron
> 
> 


Re: high client cpu usage

Posted by James Taylor <ja...@apache.org>.
The client typically does a merge sort when combining results from
parallel scans. Not sure if this would explain the CPU/memory usage you're
seeing. Can you narrow it down to a particular query that's causing the
issue?

Thanks,
James

On Thu, Aug 25, 2016 at 6:49 AM, John Leach <jl...@splicemachine.com>
wrote:

> Can anyone explain why the client would be burning so much CPU and memory
> if the result is a single row?
>
> I suspect we configured something wrong on Phoenix but we are having a
> hard time figuring it out.
>
> Thanks in advance.
>
> Regards,
> John
>
> > On Aug 24, 2016, at 9:54 AM, Aaron Molitor <am...@splicemachine.com>
> wrote:
> >
> > Seeing higher than expected CPU/MEM usage for the java process started
> by the sqlline.py client.
> >
> > From top:
> > top - 14:37:32 up 7 days, 22:15,  2 users,  load average: 25.52, 9.74,
> 7.89
> > Tasks: 509 total,   1 running, 508 sleeping,   0 stopped,   0 zombie
> > Cpu(s): 61.2%us,  6.3%sy,  0.0%ni, 31.0%id,  0.5%wa,  0.0%hi,  1.0%si,
> 0.0%st
> > Mem:  65920564k total, 31913580k used, 34006984k free,   647004k buffers
> > Swap: 33030140k total,        0k used, 33030140k free, 10464056k cached
> >
> >  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> > 8729 splice    20   0 29.9g  14g 889m S 1616.8 22.4  26:19.21 java
> >
> > Why is the JDBC client using so much memory/CPU?  The expected result
> is only a single row.
> >
> > QUERY:
> > SELECT SUM(L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY
> > FROM
> >  TPCH.LINEITEM,
> >  TPCH.PART
> > WHERE
> >  P_PARTKEY = L_PARTKEY
> >  AND P_BRAND = 'BRAND#23'
> >  AND P_CONTAINER = 'MED BOX'
> >  AND L_QUANTITY < (
> >    SELECT 0.2 * AVG(L_QUANTITY)
> >    FROM
> >      TPCH.LINEITEM
> >    WHERE
> >      L_PARTKEY = P_PARTKEY
> >  );
> >
> > ROW COUNTS:
> > TPCH.REGION             5
> > TPCH.NATION            25
> > TPCH.SUPPLIER     1000000
> > TPCH.PART        20000000
> > TPCH.PARTSUPP    80000000
> > TPCH.ORDERS     150000000
> > TPCH.CUSTOMER    15000000
> > TPCH.LINEITEM   600037902
> >
> > Thanks,
> > Aaron
>
>
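To make the merge sort James describes concrete: each parallel scan returns its chunk of rows already sorted, so the client only interleaves the sorted streams rather than re-sorting everything. A minimal Python sketch (toy data; the chunk boundaries are invented for illustration):

```python
import heapq

# Each parallel scan hands back (key, value) rows already sorted by key;
# the client's merge sort is a k-way interleave of these sorted streams.
scan_results = [
    [(1, 10.0), (4, 2.5)],   # rows from one scan chunk
    [(2, 7.0), (5, 1.0)],    # another chunk
    [(3, 3.0)],              # a third chunk
]
merged = list(heapq.merge(*scan_results, key=lambda row: row[0]))
print([k for k, _ in merged])  # keys come out globally ordered: [1, 2, 3, 4, 5]
```

The merge itself is cheap per row; cost grows with the number of streams and total rows pulled to the client, which is why a plan with a 1422-way scan can still keep the client busy.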

Re: high client cpu usage

Posted by John Leach <jl...@splicemachine.com>.
Can anyone explain why the client would be burning so much CPU and memory if the result is a single row?  

I suspect we configured something wrong on Phoenix but we are having a hard time figuring it out.  

Thanks in advance.

Regards,
John

> On Aug 24, 2016, at 9:54 AM, Aaron Molitor <am...@splicemachine.com> wrote:
> 
> Seeing higher than expected CPU/MEM usage for the java process started by the sqlline.py client. 
> 
> From top:
> top - 14:37:32 up 7 days, 22:15,  2 users,  load average: 25.52, 9.74, 7.89
> Tasks: 509 total,   1 running, 508 sleeping,   0 stopped,   0 zombie
> Cpu(s): 61.2%us,  6.3%sy,  0.0%ni, 31.0%id,  0.5%wa,  0.0%hi,  1.0%si,  0.0%st
> Mem:  65920564k total, 31913580k used, 34006984k free,   647004k buffers
> Swap: 33030140k total,        0k used, 33030140k free, 10464056k cached
> 
>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                                                                                                                                                                                                                                                                                    
> 8729 splice    20   0 29.9g  14g 889m S 1616.8 22.4  26:19.21 java    
> 
> Why is the JDBC client using so much memory/CPU? The expected result is only a single row.
> 
> QUERY:
> SELECT SUM(L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY
> FROM
>  TPCH.LINEITEM,
>  TPCH.PART
> WHERE
>  P_PARTKEY = L_PARTKEY
>  AND P_BRAND = 'BRAND#23'
>  AND P_CONTAINER = 'MED BOX'
>  AND L_QUANTITY < (
>    SELECT 0.2 * AVG(L_QUANTITY)
>    FROM
>      TPCH.LINEITEM
>    WHERE
>      L_PARTKEY = P_PARTKEY
>  );
> 
> ROW COUNTS: 
> TPCH.REGION             5
> TPCH.NATION            25
> TPCH.SUPPLIER     1000000
> TPCH.PART        20000000
> TPCH.PARTSUPP    80000000
> TPCH.ORDERS     150000000
> TPCH.CUSTOMER    15000000
> TPCH.LINEITEM   600037902 
> 
> Thanks, 
> Aaron