Posted to issues@systemml.apache.org by "Mike Dusenberry (JIRA)" <ji...@apache.org> on 2017/06/19 03:50:00 UTC

[jira] [Commented] (SYSTEMML-1676) Add a new 2D softmax layer

    [ https://issues.apache.org/jira/browse/SYSTEMML-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053450#comment-16053450 ] 

Mike Dusenberry commented on SYSTEMML-1676:
-------------------------------------------

I thought about this a bit more, and if we can transpose the data from shape `(N, C*H*W)` to `(N*H*W, C)`, then we can apply the regular `softmax` layer and transpose the outputs back to the original shape afterwards.  We could do the same for the cross entropy loss as well.  Ideally, we should add a `transpose4d(X, C, Hin, Win, new_dim1, new_dim2, new_dim3, new_dim4)` built-in function with the same functionality as NumPy's [transpose function](https://docs.scipy.org/doc/numpy-1.12.0/reference/generated/numpy.transpose.html).  In the short term, if we could create a couple of efficient utility functions, `util::NCHW_to_NHWC(X, C, H, W)` and `util::NHWC_to_NCHW(X, C, H, W)`, that convert from `(N, C*H*W)` to `(N*H*W, C)` and back (or maybe just to `(N, H*W*C)`, which could then be trivially reshaped to `(N*H*W, C)`), then we could easily implement 2D softmax and 2D cross entropy loss (or just let the user take care of the conversions using the utility functions directly, rather than adding new layers).
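
To make the idea concrete, here is a rough sketch in plain Python (not DML) of the conversion and the resulting 2D softmax. The function names `nchw_to_nhwc`, `nhwc_to_nchw`, `reshape_rows`, `softmax_rows`, and `softmax2d` are hypothetical and only illustrate the index arithmetic; they are not existing SystemML functions, and an efficient DML version would presumably look quite different. A matrix is represented as a list of rows, mirroring the `(N, C*H*W)` layout.

```python
import math


def nchw_to_nhwc(X, C, H, W):
    """Reorder each row from (C, H, W) layout to (H, W, C) layout (hypothetical helper)."""
    out = []
    for row in X:
        new_row = [0.0] * (C * H * W)
        for c in range(C):
            for h in range(H):
                for w in range(W):
                    new_row[(h * W + w) * C + c] = row[(c * H + h) * W + w]
        out.append(new_row)
    return out


def nhwc_to_nchw(X, C, H, W):
    """Inverse of nchw_to_nhwc: reorder each row from (H, W, C) back to (C, H, W)."""
    out = []
    for row in X:
        new_row = [0.0] * (C * H * W)
        for c in range(C):
            for h in range(H):
                for w in range(W):
                    new_row[(c * H + h) * W + w] = row[(h * W + w) * C + c]
        out.append(new_row)
    return out


def reshape_rows(X, cols):
    """Row-major reshape of a list-of-rows matrix to rows of length `cols`."""
    flat = [v for row in X for v in row]
    return [flat[i:i + cols] for i in range(0, len(flat), cols)]


def softmax_rows(X):
    """Regular row-wise softmax, shifted by the row max for numerical stability."""
    out = []
    for row in X:
        m = max(row)
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def softmax2d(X, C, H, W):
    """Softmax over the channel axis for every pixel of an (N, C*H*W) input."""
    nhwc = nchw_to_nhwc(X, C, H, W)          # (N, H*W*C)
    pixels = reshape_rows(nhwc, C)           # (N*H*W, C): one row per pixel
    probs = softmax_rows(pixels)             # regular softmax over classes
    back = reshape_rows(probs, H * W * C)    # (N, H*W*C)
    return nhwc_to_nchw(back, C, H, W)       # (N, C*H*W), original layout
```

The only layout-specific work is in the two reordering helpers; the softmax itself is the ordinary row-wise version, which is the point of the proposal. The same wrapping would apply to the cross entropy loss.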

> Add a new 2D softmax layer
> --------------------------
>
>                 Key: SYSTEMML-1676
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1676
>             Project: SystemML
>          Issue Type: Sub-task
>            Reporter: Mike Dusenberry
>            Assignee: Mike Dusenberry
>
> A 2D softmax layer would accept a tensor of shape {{(N,C,H,W)}}, where the {{C}} axis contains scores for {{C}} classes, and output a tensor of the same shape, with the scores transformed to normalized probabilities.  The typical use case would be a segmentation problem, in which every pixel has a multiclass prediction.


