Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:13:25 UTC

[jira] [Resolved] (SPARK-19234) AFTSurvivalRegression chokes silently or with confusing errors when any labels are zero

     [ https://issues.apache.org/jira/browse/SPARK-19234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-19234.
----------------------------------
    Resolution: Incomplete

> AFTSurvivalRegression chokes silently or with confusing errors when any labels are zero
> ---------------------------------------------------------------------------------------
>
>                 Key: SPARK-19234
>                 URL: https://issues.apache.org/jira/browse/SPARK-19234
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 2.1.0
>         Environment: spark-shell or pyspark
>            Reporter: Andrew MacKinlay
>            Priority: Minor
>              Labels: bulk-closed
>         Attachments: spark-aft-failure.txt
>
>
> If you try to use AFTSurvivalRegression and any label in your input data is 0.0, the returned coefficients are all 0.0 and, in many cases, you get errors like this:
> {{17/01/16 15:10:50 ERROR StrongWolfeLineSearch: Encountered bad values in function evaluation. Decreasing step size to NaN}}
> Zero should, I think, be an allowed value for survival analysis. I don't know whether this is a pathological case for AFT specifically, as I don't know enough about it, but this behaviour is clearly undesirable. If you have any labels of 0.0, you get either a) obscure error messages that give no indication of the cause, along with coefficients which are all zero, or b) no error messages at all and coefficients of zero (arguably worse, since you don't even have console output to tell you something's gone awry). If AFT doesn't work with zero-valued labels, Spark should fail fast and let the developer know why. If it does, we should get results here.
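
For illustration only, here is a minimal sketch of one plausible cause of the symptoms described above. It assumes the optimizer evaluates a Weibull AFT log-likelihood that takes the logarithm of each label, as in the formulation given in the Spark ML documentation; that this is what happens internally in 2.1.0 is an assumption, not something the report confirms. The names aft_loglik_term and check_labels are hypothetical and are not part of Spark's API. The second function shows the kind of fail-fast check the reporter asks for.

```python
import math


def aft_loglik_term(label, censor, margin, sigma):
    """Per-row Weibull AFT log-likelihood contribution (illustrative sketch).

    label  -- observed time t
    censor -- 1.0 if the event occurred, 0.0 if the row is censored
    margin -- linear predictor (features . coefficients + intercept)
    sigma  -- scale parameter, assumed > 0
    """
    # Python's math.log(0.0) raises ValueError, so zero is mapped to -inf
    # explicitly here; that is the value JVM Math.log(0.0) returns.
    log_t = math.log(label) if label > 0.0 else float("-inf")
    epsilon = (log_t - margin) / sigma
    # -censor * log(sigma) + censor * epsilon - exp(epsilon)
    return censor * (epsilon - math.log(sigma)) - math.exp(epsilon)


def check_labels(labels):
    """Hypothetical fail-fast check: reject labels that are not strictly positive."""
    # "not t > 0.0" also catches NaN, which compares false against everything.
    bad = [t for t in labels if not t > 0.0]
    if bad:
        raise ValueError(
            "AFT labels must be strictly positive; found "
            f"{len(bad)} invalid value(s), first one: {bad[0]}"
        )
```

Under these assumptions, aft_loglik_term(1.0, 1.0, 0.0, 1.0) is -1.0, while a zero label gives -inf for an uncensored row and NaN for a censored one (0.0 multiplied by -inf). Either value would be enough to derail a line search. Calling check_labels([1.0, 0.0]) raises a ValueError before any fitting starts.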


