Posted to issues@spark.apache.org by "Jungtaek Lim (Jira)" <ji...@apache.org> on 2020/07/14 01:44:00 UTC
[jira] [Resolved] (SPARK-32146) ValueError when loading a PipelineModel on a personal computer
[ https://issues.apache.org/jira/browse/SPARK-32146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jungtaek Lim resolved SPARK-32146.
----------------------------------
Resolution: Invalid
Please use the user mailing list for questions. If your issue is specific to a vendor, please go through that vendor's support channel.
> ValueError when loading a PipelineModel on a personal computer
> --------------------------------------------------------------
>
> Key: SPARK-32146
> URL: https://issues.apache.org/jira/browse/SPARK-32146
> Project: Spark
> Issue Type: Bug
> Components: ML
> Affects Versions: 2.4.5
> Environment: * OS: Windows
> * SparkSession: {{spark = SparkSession.builder.appName("annonces_organiques").getOrCreate()}}
> Reporter: LoicH
> Priority: Major
>
> I have a PipelineModel saved on my computer that I can't load using {{PipelineModel.load(path)}}.
> When I launch my code on a Databricks cluster, it works. There, {{path}} is the path to my model saved on DBFS, accessible via a mount point: {{path = "/dbfs/path/to/my/model"}}.
> However, on my machine, calling {{PipelineModel.load("C:\\Users\\path\\to\\my\\model")}} throws a {{ValueError("RDD is empty")}}.
> Here is how the model is saved on my computer:
> {code:title=pipeline.txt}
> \---model
>     +---metadata
>     |       part-00000
>     |       _SUCCESS
>     |
>     \---stages
>         +---0_CountVectorizer_b92625354bf7
>         |   +---data
>         |   |       part-00000-tid-9156766819779394023-5cf6aecb-8959-48b3-be24-65bfa0543465-62-1-c000.snappy.parquet
>         |   |       _committed_9156766819779394023
>         |   |       _started_9156766819779394023
>         |   |       _SUCCESS
>         |   |
>         |   \---metadata
>         |           part-00000
>         |           _SUCCESS
>         |
>         \---1_LinearSVC_108fa01daf43
>             +---data
>             |       part-00000-tid-4403060754466700849-27841dd9-de88-4015-9dfa-7854c2a15f15-65-1-c000.snappy.parquet
>             |       _committed_4403060754466700849
>             |       _started_4403060754466700849
>             |       _SUCCESS
>             |
>             \---metadata
>                     part-00000
>                     _SUCCESS
> {code}
> (I just downloaded the model from my DataLake to my computer)
> How can I load this model when running my code locally?
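A hedged note for readers who hit the same error: in PySpark, {{ValueError("RDD is empty")}} is what {{RDD.first()}} raises on an empty RDD. As far as I can tell, the Python model readers take the first line of each {{metadata}} directory, so one plausible (unconfirmed) cause is that Spark finds no readable, non-empty {{part-*}} file at the path it resolves. The sketch below is plain Python with no Spark dependency and only inspects the local copy. {{find_empty_metadata}} is a hypothetical helper name, not a Spark API.

```python
from pathlib import Path


def find_empty_metadata(model_dir):
    """Return the metadata directories under model_dir that have no
    non-empty part-* file (missing directories are reported too)."""
    root = Path(model_dir)
    # The pipeline root is expected to have its own metadata directory,
    # so include it even when it does not exist on disk.
    candidates = {root / "metadata"}
    if root.is_dir():
        # Each stage (stages/<n>_<name>/) carries its own metadata directory.
        candidates.update(p for p in root.rglob("metadata") if p.is_dir())

    problems = []
    for meta in sorted(candidates):
        parts = [f for f in meta.glob("part-*") if f.is_file()] if meta.is_dir() else []
        # An absent or zero-byte part file would leave Spark with nothing to read.
        if not any(f.stat().st_size > 0 for f in parts):
            problems.append(meta)
    return problems
```

For the layout shown above, calling {{find_empty_metadata(r"C:\Users\path\to\my\model")}} should return an empty list if all three {{metadata/part-00000}} files survived the download intact. This check cannot detect the other possibility, that Spark resolves the path against a different default filesystem. Passing an explicit {{file:///}} URI is one thing to try in that case, though nothing in this thread confirms it.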