Posted to issues@spark.apache.org by "Bjørn Jørgensen (Jira)" <ji...@apache.org> on 2021/09/12 11:01:00 UTC

[jira] [Created] (SPARK-36728) Can't create datetime object from anything other then year column Pyspark - koalas

Bjørn Jørgensen created SPARK-36728:
---------------------------------------

             Summary: Can't create datetime object from anything other then year column Pyspark - koalas
                 Key: SPARK-36728
                 URL: https://issues.apache.org/jira/browse/SPARK-36728
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 3.3.0
            Reporter: Bjørn Jørgensen


Creating a datetime object with ps.to_datetime only works when the source columns use the standard names ('year', 'month', 'day', ...). Columns with any other names raise a KeyError.

 

df = ps.DataFrame({'year': [2015, 2016],
                   'month': [2, 3],
                   'day': [4, 5],
                   'hour': [2, 3],
                   'minute': [10, 30],
                   'second': [21, 25]})
df.info()

<class 'pyspark.pandas.frame.DataFrame'>
Int64Index: 2 entries, 1 to 0
Data columns (total 6 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   year    2 non-null      int64
 1   month   2 non-null      int64
 2   day     2 non-null      int64
 3   hour    2 non-null      int64
 4   minute  2 non-null      int64
 5   second  2 non-null      int64
dtypes: int64(6)

df['date'] = ps.to_datetime(df[['year', 'month', 'day']])
df.info()

<class 'pyspark.pandas.frame.DataFrame'>
Int64Index: 2 entries, 1 to 0
Data columns (total 7 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   year    2 non-null      int64
 1   month   2 non-null      int64
 2   day     2 non-null      int64
 3   hour    2 non-null      int64
 4   minute  2 non-null      int64
 5   second  2 non-null      int64
 6   date    2 non-null      datetime64
dtypes: datetime64(1), int64(6)


df_test = ps.DataFrame({'testyear': [2015, 2016],
                        'testmonth': [2, 3],
                        'testday': [4, 5],
                        'hour': [2, 3],
                        'minute': [10, 30],
                        'second': [21, 25]})
df_test['date'] = ps.to_datetime(df[['testyear', 'testmonth', 'testday']])

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/tmp/ipykernel_73/904491906.py in <module>
----> 1 df_test['date'] = ps.to_datetime(df[['testyear', 'testmonth', 'testday']])

/opt/spark/python/pyspark/pandas/frame.py in __getitem__(self, key)
  11853             return self.loc[:, key]
  11854         elif is_list_like(key):
> 11855             return self.loc[:, list(key)]
  11856         raise NotImplementedError(key)
  11857

/opt/spark/python/pyspark/pandas/indexing.py in __getitem__(self, key)
    476                 returns_series,
    477                 series_name,
--> 478             ) = self._select_cols(cols_sel)
    479
    480             if cond is None and limit is None and returns_series:

/opt/spark/python/pyspark/pandas/indexing.py in _select_cols(self, cols_sel, missing_keys)
    322             return self._select_cols_else(cols_sel, missing_keys)
    323         elif is_list_like(cols_sel):
--> 324             return self._select_cols_by_iterable(cols_sel, missing_keys)
    325         else:
    326             return self._select_cols_else(cols_sel, missing_keys)

/opt/spark/python/pyspark/pandas/indexing.py in _select_cols_by_iterable(self, cols_sel, missing_keys)
   1352                 if not found:
   1353                     if missing_keys is None:
-> 1354                         raise KeyError("['{}'] not in index".format(name_like_string(key)))
   1355                     else:
   1356                         missing_keys.append(key)

KeyError: "['testyear'] not in index"
df_test

   testyear  testmonth  testday  hour  minute  second
0      2015          2        4     2      10      21
1      2016          3        5     3      30      25
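A possible workaround, sketched below with plain pandas (whose to_datetime API pyspark.pandas mirrors): rename the non-standard columns to the required names ('year', 'month', 'day') before calling to_datetime. The column names and data here are taken from the report; using pandas instead of pyspark.pandas is an assumption made so the sketch runs without a Spark session.

```python
import pandas as pd

# Same data as df_test in the report, but built with plain pandas
# (assumption: pyspark.pandas behaves the same for this call).
df_test = pd.DataFrame({'testyear': [2015, 2016],
                        'testmonth': [2, 3],
                        'testday': [4, 5],
                        'hour': [2, 3],
                        'minute': [10, 30],
                        'second': [21, 25]})

# to_datetime() requires the component columns to carry the standard
# names, so rename them on a temporary selection first.
renamed = df_test[['testyear', 'testmonth', 'testday']].rename(
    columns={'testyear': 'year', 'testmonth': 'month', 'testday': 'day'})

df_test['date'] = pd.to_datetime(renamed)
print(df_test['date'].dt.year.tolist())  # [2015, 2016]
```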



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org