Posted to issues@systemml.apache.org by "Matthias Boehm (JIRA)" <ji...@apache.org> on 2018/03/19 05:57:00 UTC

[jira] [Created] (SYSTEMML-2200) KMeans w/ codegen shows very bad performance

Matthias Boehm created SYSTEMML-2200:
----------------------------------------

             Summary: KMeans w/ codegen shows very bad performance
                 Key: SYSTEMML-2200
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2200
             Project: SystemML
          Issue Type: Sub-task
            Reporter: Matthias Boehm


While codegen worked extremely well for KMeans with 1 run, we currently see performance issues in a parfor setting with 10 concurrent runs, all of which spawn distributed Spark operations. In detail, this is due to particular plan choices that are affected by the reduced local memory budget per parfor worker. However, these issues can be overcome by avoiding unnecessary RDD joins in distributed codegen operations via better broadcast handling (currently, the first input is always assumed to be an RDD).

{code}
Total elapsed time:		9305.981 sec.
Total compilation time:		3.023 sec.
Total execution time:		9302.958 sec.
Number of compiled Spark inst:	21.
Number of executed Spark inst:	193.
Cache hits (Mem, WB, FS, HDFS):	1242/0/0/91.
Cache writes (WB, FS, HDFS):	456/188/1.
Cache times (ACQr/m, RLS, EXP):	10086.631/0.011/114.967/1.291 sec.
HOP DAGs recompiled (PRED, SB):	0/108.
HOP DAGs recompile time:	2.733 sec.
Functions recompiled:		1.
Functions recompile time:	0.043 sec.
Codegen compile (DAG,CP,JC):	176/430/21.
Codegen enum (ALLt/p,EVALt/p):	48076/47974/39249/38324.
Codegen compile times (DAG,JC):	3.024/0.491 sec.
Codegen enum plan cache hits:	0/0.
Codegen op plan cache hits:	395/416.
Spark ctx create time (lazy):	19.506 sec.
Spark trans counts (par,bc,col):0/179/91.
Spark trans times (par,bc,col):	0.000/1.954/10086.614 secs.
ParFor loops optimized:		1.
ParFor optimize time:		0.141 sec.
ParFor initialize time:		0.022 sec.
ParFor result merge time:	0.059 sec.
ParFor total update in-place:	0/40/50
Total JIT compile time:		98.963 sec.
Total JVM GC count:		374.
Total JVM GC time:		72.456 sec.
Heavy hitter instructions:
  #  Instruction          Time(s)  Count
  1  sp_spoofRATMP63   73,750.553     89
  2  spoofRATMP43      10,195.724     89
  3  sp_chkpoint           20.239     12
  4  sp_uasqk+             14.347      1
  5  spoofRATMP52          10.496     89
  6  ba+*                   9.273     15
  7  sp_mapmm               1.543      1
  8  write                  1.291      1
  9  /                      1.127     92
 10  sp_spoofRATMP116       0.930     89
{code}
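
The broadcast handling proposed in the description can be illustrated with a small, self-contained sketch. It is not SystemML code: the names `Input`, `plan_first_as_rdd`, and `plan_size_aware`, the greedy selection, and the sizes and budget in the example are all assumptions made for illustration. It only shows why fixing the first input as the RDD can force a join that a size-aware choice would avoid.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Input:
    """Hypothetical descriptor of one operator input (not a SystemML class)."""
    name: str
    size_bytes: int


Plan = Tuple[Input, List[Input], List[Input]]  # (main RDD, broadcasts, joins)


def _assign_side_inputs(main: Input, inputs: List[Input], budget: int) -> Plan:
    """Broadcast side inputs that fit into the budget; join the rest."""
    broadcasts: List[Input] = []
    joins: List[Input] = []
    remaining = budget
    # Smallest first, so that as many side inputs as possible are broadcast.
    for inp in sorted((i for i in inputs if i is not main),
                      key=lambda i: i.size_bytes):
        if inp.size_bytes <= remaining:
            broadcasts.append(inp)
            remaining -= inp.size_bytes
        else:
            joins.append(inp)
    return main, broadcasts, joins


def plan_first_as_rdd(inputs: List[Input], budget: int) -> Plan:
    """Behavior described in the issue: the first input is always the RDD."""
    return _assign_side_inputs(inputs[0], inputs, budget)


def plan_size_aware(inputs: List[Input], budget: int) -> Plan:
    """Assumed alternative: the largest input becomes the main RDD."""
    main = max(inputs, key=lambda i: i.size_bytes)
    return _assign_side_inputs(main, inputs, budget)


if __name__ == "__main__":
    # Made-up sizes: a small first input and a large second input,
    # with a broadcast budget reduced by sharing memory across workers.
    inputs = [Input("C", 8_000), Input("X", 8_000_000_000)]
    budget = 100_000_000
    for planner in (plan_first_as_rdd, plan_size_aware):
        main, bcs, joins = planner(inputs, budget)
        print(planner.__name__,
              "main=" + main.name,
              "broadcast=" + str([i.name for i in bcs]),
              "join=" + str([i.name for i in joins]))
```

In this made-up case the first planner keeps the small input as the RDD and has to join the large one, while the size-aware planner broadcasts the small input and needs no join. A real implementation would also have to account for partitioning, existing broadcast state, and the per-worker memory budget.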



