You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@systemml.apache.org by "Matthias Boehm (JIRA)" <ji...@apache.org> on 2016/02/11 06:58:18 UTC
[jira] [Commented] (SYSTEMML-510) Generalized wdivmm w/ eps all
patterns
[ https://issues.apache.org/jira/browse/SYSTEMML-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15142282#comment-15142282 ]
Matthias Boehm commented on SYSTEMML-510:
-----------------------------------------
step 1: generalization wdivmm pattern (a) and (b) (w/ b(/) as pseudo sparse-safe operation)
step 2: generalization wcemm
> Generalized wdivmm w/ eps all patterns
> --------------------------------------
>
> Key: SYSTEMML-510
> URL: https://issues.apache.org/jira/browse/SYSTEMML-510
> Project: SystemML
> Issue Type: Task
> Components: Compiler, Parser, Runtime
> Reporter: Mike Dusenberry
> Original Estimate: 6h
> Remaining Estimate: 6h
>
> If we look at the inner loop of Poisson nonnegative matrix factorization (PNMF) in general, we update the factors as
> {code}
> H = (H * (t(W) %*% (V/(W%*%H + 1e-17))))/t(colSums(W)
> W = (W * ((V/(W%*%H + 1e-17)) %*% t(H)))/t(rowSums(H))
> {code}.
> Notice the addition of the "1e-17" epsilon term in the denominators. Mathematically, we need this in case any cell of {code}W%*%H{code} evaluates to zero so that we can avoid dividing by zero. R needs this, but SystemML technically does not due to a fused operator, "wdivmm", that takes care of these situations (or this may be done in the general case?). This fused operator is currently applied to the pattern {code}t(W) %*% (V / %* (W %*% H)){code}, amongst other similar patterns. Ideally, this would easily apply to {code}t(W) %*% (V/(W%*%H + 1e-17){code}, regardless of the unneeded epsilon term. Currently, the addition of the epsilon term causes the algorithm to run in non-linear time (quad or exponential). Initially, the behavior pointed towards the possibility of the optimizer avoiding the rewrite to the fused operator, resulting in naive computation, and non-linear growth in training time. Further exploration seems to show that the rewrite is indeed still being applied, but there seems to also be a recursive nesting of the same rewrite over various regions of the above statements that is not found when the epsilon term is removed.
> The following is the full PNMF DML script used:
> {code}
> V = read($X)
> max_iteration = $maxiter
> rank = $rank
> n = nrow(V)
> m = ncol(V)
> range = 0.01
> W = Rand(rows=n, cols=rank, min=0, max=range, pdf="uniform")
> H = Rand(rows=rank, cols=m, min=0, max=range, pdf="uniform")
> loglik0 = sum(V*log(W%*%H)) - as.scalar(colSums(W)%*%rowSums(H))
> i=0
> while(i < max_iteration) {
> # Addition of epsilon (1e-17) term causes script to run in non-linear time:
> H = (H * (t(W) %*% (V/(W%*%H + 1e-17))))/t(colSums(W))
> W = (W * ((V/(W%*%H + 1e-17)) %*% t(H)))/t(rowSums(H))
> # Removal of epsilon works correctly:
> #H = (H * (t(W) %*% (V/(W%*%H))))/t(colSums(W))
> #W = (W * ((V/(W%*%H)) %*% t(H)))/t(rowSums(H))
> i = i + 1;
> }
> loglik = sum(V*log(W%*%H+1e-17)) - as.scalar(colSums(W)%*%rowSums(H))
> print("pnmf: " + loglik0 + " -> " + loglik)
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)