Posted to issues@systemml.apache.org by "Imran Younus (JIRA)" <ji...@apache.org> on 2017/03/14 18:51:41 UTC

[jira] [Created] (SYSTEMML-1401) Data mismatch problem with Cox Predict script

Imran Younus created SYSTEMML-1401:
--------------------------------------

             Summary: Data mismatch problem with Cox Predict script
                 Key: SYSTEMML-1401
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1401
             Project: SystemML
          Issue Type: Bug
          Components: Algorithms
         Environment: 

            Reporter: Imran Younus


The Cox predict script internally sorts the input/test data set by time. This is necessary to calculate the cumulative hazard function, but it creates a serious problem for the user: all results returned from the predict script are sorted by time while the input data is not, so the user has no way of matching the input data with the predictions.

There are two possible solutions to this problem:

1) We should restore the original order inside the predict script before returning the final results, so that the order of the predictions matches the order of the input data exactly.

2) We can add the sorted time column to the final output to let the user know which prediction corresponds to which time value. This may be easier to implement, but I think it is not an ideal solution because, when time values are tied, the user will still have trouble matching the input with the predictions.
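
To illustrate option 1, here is a minimal sketch in plain Python (not DML, and not the actual Cox predict script). It assumes the sort permutation is kept alongside the sorted results; the function names predict_sorted and restore_input_order, and the sample data, are hypothetical and exist only for this example. It also shows why option 2 is ambiguous when time values are tied.

```python
def predict_sorted(times, row_results):
    """Hypothetical stand-in for the predict step: it sorts rows by time
    and returns the results in sorted order, plus the permutation used."""
    # order[k] is the original row index that ended up at sorted position k.
    order = sorted(range(len(times)), key=lambda i: times[i])
    sorted_results = [row_results[i] for i in order]
    return order, sorted_results


def restore_input_order(order, sorted_results):
    """Undo the sort by writing each result back to its original row."""
    restored = [None] * len(order)
    for sorted_pos, original_row in enumerate(order):
        restored[original_row] = sorted_results[sorted_pos]
    return restored


# Made-up sample data: rows 0 and 2 share the same time value.
times = [5.0, 2.0, 5.0, 1.0]
row_results = ["a", "b", "c", "d"]

order, sorted_results = predict_sorted(times, row_results)
restored = restore_input_order(order, sorted_results)

# Option 2 only exposes the sorted times, so a tied time value maps
# back to more than one input row.
rows_with_time_5 = [i for i, t in enumerate(times) if t == 5.0]
```

With the permutation available, every prediction can be placed back on its input row regardless of ties. With only the sorted time column, the two rows at time 5.0 cannot be told apart.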



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)