Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/12/07 15:55:36 UTC

[GitHub] [spark] ahmed-mahran commented on pull request #38966: [SPARK-41008][MLLIB] Dedup isotonic regression duplicate features

ahmed-mahran commented on PR #38966:
URL: https://github.com/apache/spark/pull/38966#issuecomment-1341173543

   > Is it fair to say this keeps existing tests and adds new ones? my only concern is about introducing new unintended behavior changes, but if it's passing all the previous tests, that's good enough for purposes here
   
   Well, the behavior has changed for inputs with duplicate features. I only had to change these two tests to conform to the new behavior; on `master`, they produce results that differ from `sklearn`:
   - [`test("isotonic regression prediction with duplicate features")`](https://github.com/apache/spark/pull/38966/files#diff-6eae24fe4c455659652a83f423f0df48ad994237b3350f18953c39258ae1ab86R248-R261)
   - [`test("antitonic regression prediction with duplicate features")`](https://github.com/apache/spark/pull/38966/files#diff-6eae24fe4c455659652a83f423f0df48ad994237b3350f18953c39258ae1ab86R263-R276)
   
   Added new tests:
   - [`test("SPARK-41008 isotonic regression with duplicate features differs from sklearn")`](https://github.com/apache/spark/pull/38966/files#diff-6eae24fe4c455659652a83f423f0df48ad994237b3350f18953c39258ae1ab86R226-R246)
   - [`test("makeUnique: handle duplicate features")`](https://github.com/apache/spark/pull/38966/files#diff-6eae24fe4c455659652a83f423f0df48ad994237b3350f18953c39258ae1ab86R327-R380)
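
For illustration, here is a minimal Python sketch of the kind of deduplication this is about: points that share a feature value are collapsed into a single point whose label is the weighted average of their labels and whose weight is the sum of their weights. The helper name `dedup_points`, the exact-equality match on features, and the assumption of strictly positive weights are all hypothetical choices for this sketch; it is not the `makeUnique` code from the PR, which may differ in its details.

```python
from itertools import groupby


def dedup_points(points):
    """Collapse points that share a feature value into one point each.

    points: iterable of (label, feature, weight) tuples.
    Returns a list sorted by feature with one tuple per distinct feature.
    Assumes every weight is strictly positive (hypothetical simplification).
    """
    result = []
    # groupby only merges adjacent items, so sort by feature first.
    ordered = sorted(points, key=lambda p: p[1])
    for feature, group in groupby(ordered, key=lambda p: p[1]):
        members = list(group)
        total_weight = sum(w for _, _, w in members)
        # Weighted average of the labels that share this feature value.
        avg_label = sum(l * w for l, _, w in members) / total_weight
        result.append((avg_label, feature, total_weight))
    return result


# Two points share feature 1.0, so they merge into one point.
points = [(1.0, 1.0, 1.0), (3.0, 1.0, 1.0), (2.0, 2.0, 1.0)]
deduped = dedup_points(points)
print(deduped)
```

With the sample input, the two points at feature `1.0` become one point with label `2.0` and weight `2.0`, while the point at feature `2.0` passes through unchanged.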
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: reviews-help@spark.apache.org