Posted to reviews@spark.apache.org by "hengoren (via GitHub)" <gi...@apache.org> on 2023/05/22 12:52:45 UTC

[GitHub] [spark] hengoren commented on pull request #37232: [SPARK-39821][PYTHON][PS] Fix error during using DatetimeIndex

hengoren commented on PR #37232:
URL: https://github.com/apache/spark/pull/37232#issuecomment-1557166208

   With the release of pandas 2.0, I think this PR should be re-opened, right?
   
   I can reproduce the issue originally described with:
   
   ```python
   Python 3.9.16 (main, May  3 2023, 09:54:39) 
   [GCC 10.2.1 20210110] on linux
   Type "help", "copyright", "credits" or "license" for more information.
   >>> import pyspark
   >>> pyspark.__version__
   '3.4.0'
   >>> import pandas
   >>> pandas.__version__
   '2.0.1'
   >>> import pyspark.pandas as ps
   >>> ps.DatetimeIndex(["1970-01-01", "1970-01-02", "1970-01-03"])
   Setting default log level to "WARN".
   To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
   23/05/18 21:07:30 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
   23/05/18 21:07:31 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/home/ubuntu/.local/lib/python3.9/site-packages/pyspark/pandas/indexes/base.py", line 2705, in __repr__
       pindex = self._psdf._get_or_create_repr_pandas_cache(max_display_count).index
     File "/home/ubuntu/.local/lib/python3.9/site-packages/pyspark/pandas/frame.py", line 13347, in _get_or_create_repr_pandas_cache
       self, "_repr_pandas_cache", {n: self.head(n + 1)._to_internal_pandas()}
     File "/home/ubuntu/.local/lib/python3.9/site-packages/pyspark/pandas/frame.py", line 13342, in _to_internal_pandas
       return self._internal.to_pandas_frame
     File "/home/ubuntu/.local/lib/python3.9/site-packages/pyspark/pandas/utils.py", line 588, in wrapped_lazy_property
       setattr(self, attr_name, fn(self))
     File "/home/ubuntu/.local/lib/python3.9/site-packages/pyspark/pandas/internal.py", line 1056, in to_pandas_frame
       pdf = sdf.toPandas()
     File "/home/ubuntu/.local/lib/python3.9/site-packages/pyspark/sql/pandas/conversion.py", line 251, in toPandas
       if (t is not None and not all([is_timedelta64_dtype(t),is_datetime64_dtype(t)])) or should_check_timedelta:
     File "/home/ubuntu/.local/lib/python3.9/site-packages/pandas/core/generic.py", line 6324, in astype
       new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
     File "/home/ubuntu/.local/lib/python3.9/site-packages/pandas/core/internals/managers.py", line 451, in astype
       return self.apply(
     File "/home/ubuntu/.local/lib/python3.9/site-packages/pandas/core/internals/managers.py", line 352, in apply
       applied = getattr(b, f)(**kwargs)
     File "/home/ubuntu/.local/lib/python3.9/site-packages/pandas/core/internals/blocks.py", line 511, in astype
       new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
     File "/home/ubuntu/.local/lib/python3.9/site-packages/pandas/core/dtypes/astype.py", line 242, in astype_array_safe
       new_values = astype_array(values, dtype, copy=copy)
     File "/home/ubuntu/.local/lib/python3.9/site-packages/pandas/core/dtypes/astype.py", line 184, in astype_array
       values = values.astype(dtype, copy=copy)
     File "/home/ubuntu/.local/lib/python3.9/site-packages/pandas/core/arrays/datetimes.py", line 694, in astype
       raise TypeError(
   TypeError: Casting to unit-less dtype 'datetime64' is not supported. Pass e.g. 'datetime64[ns]' instead.
   ```
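   
   For context, the TypeError at the bottom comes from a pandas 2.0 behavior change rather than from the index data itself: casting to the unit-less `datetime64` dtype, which older pandas releases accepted (and later 1.x releases deprecated), now raises, while an explicit unit such as `datetime64[ns]` still works. Here is a minimal pandas-only sketch of mine (not part of the reproduction above, and independent of Spark) showing that change:
   
   ```python
   # Standalone illustration of the pandas 2.0 change behind the TypeError above.
   import pandas as pd
   
   s = pd.Series(pd.to_datetime(["1970-01-01", "1970-01-02", "1970-01-03"]))
   
   # On pandas 2.x this raises the same TypeError as in the traceback:
   # "Casting to unit-less dtype 'datetime64' is not supported.
   #  Pass e.g. 'datetime64[ns]' instead."
   try:
       s.astype("datetime64")
   except TypeError as exc:
       print(exc)
   
   # Spelling out the unit works on both pandas 1.x and 2.x.
   print(s.astype("datetime64[ns]").dtype)  # datetime64[ns]
   ```
   
   If the cast requested by `toPandas` in `pyspark/sql/pandas/conversion.py` is indeed the unit-less dtype (as the traceback suggests), that would explain why the `DatetimeIndex` repr above only fails once pandas 2.x is installed.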


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

