
Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/05/29 12:08:00 UTC

[GitHub] asmushetzel commented on issue #11031: Use dtype=int for the indices returned by TopK

URL: https://github.com/apache/incubator-mxnet/issues/11031#issuecomment-392752863

A few general remarks about indices in MXNet:

MXNet has no specific convention for which datatype to use when encoding indices. The choice is left to the user or inferred from other data. This is generally fine and consistent. For example, look at the operators that consume index arrays (pick, take, gather_nd):
They are all written so that the indices can have any datatype, as long as it can be cast to an int internally. In particular, the index datatype does not have to match that of the other inputs (e.g. take() accepts a data array of type double together with an index array of type int/float/half_t/whatever). I think this is a really good design, and we should not require that incoming indices be of an integral type, nor that every index value be integral.
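
As a rough illustration of that behaviour, here is a pure-Python sketch of an index consumer that accepts indices of any numeric type and casts them to int internally. This is not MXNet's implementation: `take_sketch` is a hypothetical name, and the use of `int()` (which truncates toward zero) is an assumption made only for the example.

```python
def take_sketch(data, indices):
    """Hypothetical stand-in for an index-consuming operator such as take()."""
    # Cast every index to int, whatever numeric type it arrived as.
    return [data[int(i)] for i in indices]


data = [10.5, 20.5, 30.5, 40.5]

# The same lookup works whether the indices arrive as ints or as floats.
from_ints = take_sketch(data, [0, 2, 3])
from_floats = take_sketch(data, [0.0, 2.0, 3.0])
```

Both calls select the same elements, which is the point: the consumer does not care about the index datatype as long as the cast to int is possible.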

The situation is slightly different for operators that produce index arrays without taking any index array as input. These are topk, argsort, argmax, argmin and argmax_channel (I hope I didn't forget one). Currently they infer the index type of the output array from the datatype of the input. That is odd, because there is no objective correlation between data type and index type: if the data input is float, why should the index type also be float? It can also cause real problems. If you want the index of the maximum element of a very large array of floats (more than 2^24, about 16.7M, elements), the float type used for the output cannot represent all potential indices exactly, while an int could. To make this work today, you would have to cast the entire data input to double, receive the indices as double, and then cast them back to int. Even that may not work, because the operator may not support doubles (which is the case for topk).
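
The precision concern can be checked without MXNet. The sketch below round-trips integers through IEEE 754 single precision using Python's standard `struct` module; `to_float32` is a helper name introduced only for this example. Single precision has a 24-bit significand, so it represents every integer exactly only up to 2^24 = 16,777,216, and an index beyond that may come back as a different number.

```python
import struct


def to_float32(x):
    """Round-trip a Python number through IEEE 754 single precision."""
    return struct.unpack("f", struct.pack("f", float(x)))[0]


LIMIT = 2 ** 24  # 16,777,216: largest range in which every integer is exact

# Indices up to the limit survive the round trip unchanged.
exact_below = to_float32(LIMIT - 1) == LIMIT - 1
exact_at = to_float32(LIMIT) == LIMIT

# The next index is not representable and is rounded to a neighbour.
stored = to_float32(LIMIT + 1)
index_is_lost = int(stored) != LIMIT + 1
```

With half precision the same effect appears much earlier (integers are exact only up to 2048), so the output index type matters even for modest array sizes when the data is half_t.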

So my proposal is to change all index-producing operators (listed above) so that the output index type can be set explicitly via a parameter. To keep backward compatibility, the default should stay as is (inferred from the data, etc.).
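
A minimal pure-Python sketch of what such a parameter could look like follows. Everything here is hypothetical: `argmax_sketch` and its `index_type` argument are names invented for illustration and are not an existing MXNet signature. When the parameter is omitted, the index type follows the data type, which mirrors the current behaviour described above.

```python
def argmax_sketch(data, index_type=None):
    """Hypothetical index-producing operator with an optional output index type."""
    # Position of the largest element.
    best = max(range(len(data)), key=lambda i: data[i])
    if index_type is None:
        # Backward-compatible default: index type is inferred from the data.
        index_type = type(data[0])
    return index_type(best)


values = [0.1, 0.9, 0.3]

default_idx = argmax_sketch(values)                   # float, as today
explicit_idx = argmax_sketch(values, index_type=int)  # int, on request
```

Existing callers keep getting a float index for float data, while callers that need exact indices can ask for an integer type explicitly.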
