Posted to dev@mahout.apache.org by "Andrew Palumbo (JIRA)" <ji...@apache.org> on 2016/03/08 23:46:40 UTC

[jira] [Created] (MAHOUT-1802) Capture attached checkpoints (if cached)

Andrew Palumbo created MAHOUT-1802:
--------------------------------------

             Summary:  Capture attached checkpoints (if cached)
                 Key: MAHOUT-1802
                 URL: https://issues.apache.org/jira/browse/MAHOUT-1802
             Project: Mahout
          Issue Type: Improvement
    Affects Versions: 0.11.1
            Reporter: Andrew Palumbo
            Assignee: Andrew Palumbo
             Fix For: 0.11.2


Currently, the optimizer generates checkpoints and attaches them to actual logical elements of the DAG via CheckpointAction$cp. 

The way it works today is as follows:

{code}
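val drmC = drmA + drmB

val cp1 = drmC.checkpoint() // creates the checkpoint and attaches it to drmC
val cp2 = drmC.checkpoint() // returns the attached checkpoint: cp2 == cp1

val drmD = cp1 + drmE // plan is built on cp1, so drmA + drmB is not recomputed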
{code}
But in:
{code}
val drmD = drmC + drmE // recomputes drmA + drmB all over
{code}

{{drmC}} already has {{cp1}} attached to it, so we should assume that reusing the common computational path is the intent here, and the optimizer should use it instead of building plans that recompute it. That is,

{{drmD = drmC + drmE}} should imply {{cp1 + drmE}} as well, even if the checkpoint is not used explicitly.
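
To make the proposed behavior concrete, here is a minimal, self-contained sketch of such a capture pass over a toy logical plan. Everything in it (Plan, Leaf, Plus, Checkpoint, the attached field, capture, render) is a hypothetical stand-in for illustration and not Mahout's actual classes or optimizer code; it only assumes that a logical node can remember the checkpoint attached to it and whether that checkpoint is cached.

```scala
// Toy model only: all names below are hypothetical, not Mahout's API.
object CheckpointCaptureSketch {

  sealed trait Plan {
    // Checkpoint attached to this logical node, if one was ever taken.
    var attached: Option[Checkpoint] = None

    // The first call creates and attaches a checkpoint; later calls return it.
    def checkpoint(cached: Boolean = true): Checkpoint = attached.getOrElse {
      val cp = new Checkpoint(this, cached)
      attached = Some(cp)
      cp
    }

    def +(that: Plan): Plan = Plus(this, that)
  }

  final case class Leaf(name: String) extends Plan
  final case class Plus(a: Plan, b: Plan) extends Plan
  final class Checkpoint(val source: Plan, val cached: Boolean) extends Plan

  // Rewrite pass: any node carrying a cached checkpoint is replaced by that
  // checkpoint; other nodes are rebuilt from their rewritten children.
  def capture(p: Plan): Plan = p.attached match {
    case Some(cp) if cp.cached => cp
    case _ => p match {
      case Plus(a, b) => Plus(capture(a), capture(b))
      case other      => other
    }
  }

  // Pretty-printer so the effect of the rewrite is visible.
  def render(p: Plan): String = p match {
    case Leaf(n)        => n
    case Plus(a, b)     => s"(${render(a)} + ${render(b)})"
    case cp: Checkpoint => s"cp(${render(cp.source)})"
  }

  def main(args: Array[String]): Unit = {
    val drmA = Leaf("drmA"); val drmB = Leaf("drmB"); val drmE = Leaf("drmE")
    val drmC = drmA + drmB
    drmC.checkpoint()               // attaches a cached checkpoint to drmC
    val drmD = drmC + drmE          // checkpoint not referenced explicitly
    println(render(drmD))           // ((drmA + drmB) + drmE)
    println(render(capture(drmD)))  // (cp((drmA + drmB)) + drmE)
  }
}
```

In this sketch a checkpoint that is attached but not cached is deliberately left alone, which matches the "(if cached)" qualifier in the summary: substituting it would not save any recomputation.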


