Posted to issues@spark.apache.org by "Yikun Jiang (Jira)" <ji...@apache.org> on 2021/09/06 02:36:00 UTC

[jira] [Commented] (SPARK-36337) decimal('Nan') is unsupported in net.razorvine.pickle

    [ https://issues.apache.org/jira/browse/SPARK-36337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17410300#comment-17410300 ] 

Yikun Jiang commented on SPARK-36337:
-------------------------------------

Good news:
The double NaN pickling issue has been solved in https://github.com/irmen/pickle/issues/7

Bad news:
Spark uses Pyrolite, which includes the following packages:
- net.razorvine.pickle (the only one Spark currently uses; [~hyukjin.kwon], could you double-check?)
- net.razorvine.pyro

Pyrolite has since been split into two separate repositories: https://github.com/irmen/Pyrolite/tree/master#where-is-pickle

So it looks like we have two choices:
1. Wait for the [NaN fix to be backported to Pyrolite|https://github.com/irmen/pickle/issues/7#issuecomment-913293219] and bump the version to 4.32.
2. Change the dependency from irmen/pyrolite to irmen/pickle.


[1] https://github.com/apache/spark/blob/1ccb06ca8cf439e0c13ffbfb50365402e7d1330d/core/pom.xml#L418

> decimal('Nan') is unsupported in net.razorvine.pickle 
> ------------------------------------------------------
>
>                 Key: SPARK-36337
>                 URL: https://issues.apache.org/jira/browse/SPARK-36337
>             Project: Spark
>          Issue Type: Sub-task
>          Components: PySpark
>    Affects Versions: 3.2.0
>            Reporter: Yikun Jiang
>            Priority: Major
>
> Decimal('NaN') is currently not supported by net.razorvine.pickle.
> In Python:
> {code:python}
> >>> import decimal, pickle, cloudpickle
> >>> pickled = cloudpickle.dumps(decimal.Decimal('NaN'))
> >>> pickled
> b'\x80\x05\x95!\x00\x00\x00\x00\x00\x00\x00\x8c\x07decimal\x94\x8c\x07Decimal\x94\x93\x94\x8c\x03NaN\x94\x85\x94R\x94.'
> >>> pickle.loads(pickled)
> Decimal('NaN')
> {code}
> In Scala:
> {code:scala}
> scala> import net.razorvine.pickle.{Pickler, Unpickler, PickleUtils}
> scala> val unpickle = new Unpickler
> scala> unpickle.loads(PickleUtils.str2bytes("\u0080\u0005\u0095!\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u008c\u0007decimal\u0094\u008c\u0007Decimal\u0094\u0093\u0094\u008c\u0003NaN\u0094\u0085\u0094R\u0094."))
> net.razorvine.pickle.PickleException: problem construction object: java.lang.reflect.InvocationTargetException
>  at net.razorvine.pickle.objects.AnyClassConstructor.construct(AnyClassConstructor.java:29)
>  at net.razorvine.pickle.Unpickler.load_reduce(Unpickler.java:773)
>  at net.razorvine.pickle.Unpickler.dispatch(Unpickler.java:213)
>  at net.razorvine.pickle.Unpickler.load(Unpickler.java:123)
>  at net.razorvine.pickle.Unpickler.loads(Unpickler.java:136)
>  ... 48 elided
> {code}
> I have filed an issue with the upstream pickle project: [https://github.com/irmen/pickle/issues/7].
> We should bump pickle to the latest version once it is fixed.
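
To make the failure path in the description easier to follow, here is a minimal sketch using only the Python standard library (it uses pickle rather than cloudpickle, and assumes Python 3.8+ for protocol 5). It lists the opcodes in the pickled Decimal('NaN'). The stream ends in a REDUCE that calls decimal.Decimal with the string 'NaN', which lines up with the load_reduce and AnyClassConstructor.construct frames in the stack trace. The variable names are illustrative only.

```python
import decimal
import pickle
import pickletools

# Pickle Decimal('NaN') with protocol 5, the protocol seen in the report (\x80\x05).
value = decimal.Decimal("NaN")
pickled = pickle.dumps(value, protocol=5)

# Python itself round-trips the value without trouble.
restored = pickle.loads(pickled)

# Walk the opcode stream to see how the object is rebuilt on load.
ops = list(pickletools.genops(pickled))
opcodes = [op.name for op, arg, pos in ops]
string_args = [arg for op, arg, pos in ops if isinstance(arg, str)]

print(opcodes)      # opcode names in order
print(string_args)  # module name, class name, and constructor argument
```

NaN never compares equal to itself, so check the round trip with restored.is_nan() rather than with ==. Any unpickler on the JVM side has to build its own decimal type from the string 'NaN', which is presumably where the exception above originates.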



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
