Posted to issues@spark.apache.org by "Jungtaek Lim (Jira)" <ji...@apache.org> on 2020/07/14 01:44:00 UTC

[jira] [Resolved] (SPARK-32146) ValueError when loading a PipelineModel on a personal computer

     [ https://issues.apache.org/jira/browse/SPARK-32146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jungtaek Lim resolved SPARK-32146.
----------------------------------
    Resolution: Invalid

Please use the user mailing list for questions. If your issue is specific to a particular vendor, please go through that vendor's support channel.

> ValueError when loading a PipelineModel on a personal computer
> --------------------------------------------------------------
>
>                 Key: SPARK-32146
>                 URL: https://issues.apache.org/jira/browse/SPARK-32146
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 2.4.5
>         Environment: * OS: Windows
>  * SparkSession: spark = SparkSession.builder.appName("annonces_organiques").getOrCreate()
>            Reporter: LoicH
>            Priority: Major
>
> I have a PipelineModel saved on my computer that I can't load using {{PipelineModel.load(path)}}.
> When I run my code on a Databricks cluster, it works. {{path}} is the path to my model saved on DBFS, accessible via a mount point: {{path = "/dbfs/path/to/my/model"}}.
> However, on my machine, calling {{PipelineModel.load("C:\\Users\\path\\to\\my\\model")}} throws a {{ValueError("RDD is empty")}}.
> Here is how the model is saved on my computer:
> {code:title=pipeline.txt}
> \---model
>     +---metadata
>     |       part-00000
>     |       _SUCCESS
>     |
>     \---stages
>         +---0_CountVectorizer_b92625354bf7
>         |   +---data
>         |   |       part-00000-tid-9156766819779394023-5cf6aecb-8959-48b3-be24-65bfa0543465-62-1-c000.snappy.parquet
>         |   |       _committed_9156766819779394023
>         |   |       _started_9156766819779394023
>         |   |       _SUCCESS
>         |   |
>         |   \---metadata
>         |           part-00000
>         |           _SUCCESS
>         |
>         \---1_LinearSVC_108fa01daf43
>             +---data
>             |       part-00000-tid-4403060754466700849-27841dd9-de88-4015-9dfa-7854c2a15f15-65-1-c000.snappy.parquet
>             |       _committed_4403060754466700849
>             |       _started_4403060754466700849
>             |       _SUCCESS
>             |
>             \---metadata
>                     part-00000
>                     _SUCCESS
> {code}
> (I just downloaded the model from my DataLake to my computer)
> How can I load this model when running my code locally?
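
The ticket was closed without a diagnosis, so the following is only a sketch of a first check, not a confirmed fix. As far as I can tell, PySpark 2.4 loads a Python-side pipeline by reading the first line of the text files under {{metadata}} with {{sc.textFile(...).first()}}, and {{first()}} raises {{ValueError("RDD is empty")}} when nothing is read. The helper names {{find_empty_metadata}} and {{to_file_uri}} are hypothetical, and the check assumes the directory layout shown in the listing above.

```python
from pathlib import Path


def find_empty_metadata(model_dir):
    """Return metadata directories under model_dir with no non-empty part-* file."""
    root = Path(model_dir)
    problems = []
    # The pipeline's own metadata, plus one metadata directory per stage.
    candidates = [root / "metadata"]
    stages = root / "stages"
    if stages.is_dir():
        candidates += sorted(p / "metadata" for p in stages.iterdir() if p.is_dir())
    for meta in candidates:
        parts = list(meta.glob("part-*")) if meta.is_dir() else []
        # A missing directory or only zero-byte part files would yield no lines.
        if not any(p.stat().st_size > 0 for p in parts):
            problems.append(meta)
    return problems


def to_file_uri(model_dir):
    """Build an explicit file: URI so the path is not resolved against another filesystem."""
    return Path(model_dir).resolve().as_uri()
```

If {{find_empty_metadata(path)}} returns an empty list, the metadata files were downloaded intact, and one thing to try is {{PipelineModel.load(to_file_uri(path))}}. Whether that resolves the reporter's error is not established anywhere in this thread.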


