Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2018/07/05 02:08:00 UTC

[jira] [Resolved] (SPARK-24698) In Pyspark's ML, an Identifiable's UID has 20 random characters rather than the 12 mentioned in the documentation.

     [ https://issues.apache.org/jira/browse/SPARK-24698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-24698.
----------------------------------
       Resolution: Fixed
    Fix Version/s: 2.4.0

Issue resolved by pull request 21675
[https://github.com/apache/spark/pull/21675]

> In Pyspark's ML, an Identifiable's UID has 20 random characters rather than the 12 mentioned in the documentation.
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-24698
>                 URL: https://issues.apache.org/jira/browse/SPARK-24698
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 2.3.1
>            Reporter: Thomas Dunne
>            Assignee: Thomas Dunne
>            Priority: Trivial
>              Labels: easyfix
>             Fix For: 2.4.0
>
>
> Hi.
> In PySpark, an Identifiable object is assigned a random ID to help distinguish instances from each other. This ID is made by concatenating the name of the class with part of a UUID generated by Python's built-in uuid module.
> The docstring of the method that generates this ID (_randomUID()) says that 12 random hex characters from the UUID are used, but the code actually skips the first 12 characters. The hex representation of a UUID is 32 characters long, so the last 20 characters are used instead.
> The code can be found in python/pyspark/ml/util.py (https://github.com/apache/spark/blob/master/python/pyspark/ml/util.py#L66) and is reproduced below:
> {code}
> @classmethod
> def _randomUID(cls):
>     """
>     Generate a unique unicode id for the object. The default implementation
>     concatenates the class name, "_", and 12 random hex chars.
>     """
>     return unicode(cls.__name__ + "_" + uuid.uuid4().hex[12:])
> {code}
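
The length mismatch described above can be checked with a short standalone sketch. It does not import PySpark. The helper names (random_uid_reported, random_uid_last12) and the class name "Example" are hypothetical, and the hex[-12:] variant is only one way to match the docstring, assumed here for illustration; this message does not state what pull request 21675 changed.

```python
import uuid


def random_uid_reported(class_name):
    # Same slice as the quoted snippet: drops the first 12 hex characters
    # of the 32-character UUID hex string, keeping the remaining 20.
    return class_name + "_" + uuid.uuid4().hex[12:]


def random_uid_last12(class_name):
    # Hypothetical alternative: keep only the last 12 hex characters,
    # which would agree with the docstring.
    return class_name + "_" + uuid.uuid4().hex[-12:]


hex_len = len(uuid.uuid4().hex)
reported_suffix = random_uid_reported("Example").split("_", 1)[1]
last12_suffix = random_uid_last12("Example").split("_", 1)[1]

print(hex_len, len(reported_suffix), len(last12_suffix))  # prints: 32 20 12
```

The only difference between the two helpers is the sign of the slice index: hex[12:] means "everything after the first 12 characters", while hex[-12:] means "the last 12 characters".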



