Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2022/04/08 02:34:00 UTC

[jira] [Resolved] (SPARK-38806) Unable to initialize the empty pyspark.pandas dataframe

     [ https://issues.apache.org/jira/browse/SPARK-38806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-38806.
----------------------------------
    Resolution: Invalid

> Unable to initialize the empty pyspark.pandas dataframe
> -------------------------------------------------------
>
>                 Key: SPARK-38806
>                 URL: https://issues.apache.org/jira/browse/SPARK-38806
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 3.2.1
>            Reporter: Prakhar Sandhu
>            Priority: Major
>
> I am trying to replace the pandas library with the pyspark.pandas library, but after the replacement the following line of code fails:
> {code:java}
> import pyspark.pandas as pd 
> self._df = pd.DataFrame()
>  {code}
>  
> It throws the following error:
>  
> {code:java}
>     self._df = pd.DataFrame()
>   File "C:\Users\eapasnr\Anaconda3\envs\oden2\lib\site-packages\pyspark\pandas\frame.py", line 520, in __init__        
>     internal = InternalFrame.from_pandas(pdf)
>   File "C:\Users\eapasnr\Anaconda3\envs\oden2\lib\site-packages\pyspark\pandas\internal.py", line 1464, in from_pandas 
>     sdf = default_session().createDataFrame(pdf, schema=schema)
>   File "C:\Users\eapasnr\Anaconda3\envs\oden2\lib\site-packages\pyspark\pandas\utils.py", line 477, in default_session 
>     return builder.getOrCreate()
>   File "C:\Users\eapasnr\Anaconda3\envs\oden2\lib\site-packages\pyspark\sql\session.py", line 228, in getOrCreate      
>     sc = SparkContext.getOrCreate(sparkConf)
>   File "C:\Users\eapasnr\Anaconda3\envs\oden2\lib\site-packages\pyspark\context.py", line 392, in getOrCreate
>     SparkContext(conf=conf or SparkConf())
>   File "C:\Users\eapasnr\Anaconda3\envs\oden2\lib\site-packages\pyspark\context.py", line 144, in __init__
>     SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
>   File "C:\Users\eapasnr\Anaconda3\envs\oden2\lib\site-packages\pyspark\context.py", line 339, in _ensure_initialized  
>     SparkContext._gateway = gateway or launch_gateway(conf)
>   File "C:\Users\eapasnr\Anaconda3\envs\oden2\lib\site-packages\pyspark\java_gateway.py", line 101, in launch_gateway  
>     proc = Popen(command, **popen_kwargs)
>   File "C:\Users\eapasnr\Anaconda3\envs\oden2\lib\subprocess.py", line 800, in __init__
>     restore_signals, start_new_session)
>   File "C:\Users\eapasnr\Anaconda3\envs\oden2\lib\subprocess.py", line 1207, in _execute_child
>     startupinfo)
> FileNotFoundError: [WinError 2] The system cannot find the file specified {code}
> The code was previously working fine with pandas.
>  
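The traceback above ends inside `launch_gateway`, where PySpark uses `subprocess.Popen` to start the Java gateway process. On Windows, `FileNotFoundError: [WinError 2]` from that call usually means the executable being launched (the JVM via `spark-submit`) cannot be found, which points to a missing or misconfigured Java/Spark installation rather than a bug in `pyspark.pandas` itself (consistent with the "Invalid" resolution). The following stdlib-only sketch checks the environment variables and executables that launch typically depends on; the function name is illustrative, not part of the PySpark API:

```python
import os
import shutil


def diagnose_spark_launch_env() -> dict:
    """Report whether the executables PySpark's gateway launch relies on are visible.

    A None value for "java_on_path" or an unset JAVA_HOME is a common cause
    of FileNotFoundError [WinError 2] when SparkContext starts on Windows.
    """
    return {
        "JAVA_HOME": os.environ.get("JAVA_HOME"),
        "java_on_path": shutil.which("java"),
        "SPARK_HOME": os.environ.get("SPARK_HOME"),
        "spark_submit_on_path": shutil.which("spark-submit"),
    }


if __name__ == "__main__":
    for name, value in diagnose_spark_launch_env().items():
        print(f"{name}: {value}")
```

If `java_on_path` comes back `None`, installing a supported JDK and exposing it via `JAVA_HOME`/`PATH` is the usual fix before `pd.DataFrame()` (or any other call that creates a SparkSession) can succeed.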



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org