Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/06/25 09:45:56 UTC

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #7537: ARROW-842: [Python] Recognize pandas.NaT as null when converting object arrays with from_pandas=True

jorisvandenbossche commented on a change in pull request #7537:
URL: https://github.com/apache/arrow/pull/7537#discussion_r445438326



##########
File path: cpp/src/arrow/python/helpers.cc
##########
@@ -254,14 +255,45 @@ bool PyFloat_IsNaN(PyObject* obj) {
   return PyFloat_Check(obj) && std::isnan(PyFloat_AsDouble(obj));
 }
 
+namespace {
+
+static std::once_flag pandas_static_initialized;
+static PyTypeObject* pandas_NaTType = nullptr;
+
+void GetPandasStaticSymbols() {
+  OwnedRef pandas;
+  Status s = ImportModule("pandas", &pandas);
+  if (!s.ok()) {
+    return;
+  }
+
+  OwnedRef nat_value;
+  s = ImportFromModule(pandas.obj(), "NaT", &nat_value);
+  if (!s.ok()) {
+    return;
+  }
+  PyObject* nat_type = PyObject_Type(nat_value.obj());
+  pandas_NaTType = reinterpret_cast<PyTypeObject*>(nat_type);
+
+  // PyObject_Type returns a new reference but we trust that pandas.NaT will
+  // outlive our use of this PyObject*
+  Py_DECREF(nat_type);
+}
+
+}  // namespace
+
+void InitPandasStaticData() {
+  std::call_once(pandas_static_initialized, GetPandasStaticSymbols);
+}
+
 bool PandasObjectIsNull(PyObject* obj) {
   if (!MayHaveNaN(obj)) {
     return false;
   }
   if (obj == Py_None) {
     return true;
   }
-  if (PyFloat_IsNaN(obj) ||
+  if (PyFloat_IsNaN(obj) || (pandas_NaTType && PyObject_TypeCheck(obj, pandas_NaTType)) ||

Review comment:
       In practice, I think it is a singleton (a single value is instantiated in the pandas code base and used throughout), but I just checked and this is not enforced in `__new__`. So you can actually create another NaT object with `type(pd.NaT)()`, which is not identical to `pd.NaT` (probably something to change in pandas).
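
       To illustrate why a type check is safer than an identity check here, a minimal sketch follows. It does not import pandas: `NaTTypeStandIn`, `is_null_by_identity` and `is_null_by_type` are hypothetical names made up for this example, and the stand-in class only mimics the property described above (a `__new__` that does not enforce a singleton). It is not the pandas or Arrow implementation.

```python
class NaTTypeStandIn:
    """Hypothetical stand-in for a type whose __new__ does not enforce a singleton."""

    def __repr__(self):
        return "NaT"


# The one instance that the library itself would hand out everywhere.
NaT = NaTTypeStandIn()


def is_null_by_identity(obj):
    # Only recognizes the canonical instance.
    return obj is NaT


def is_null_by_type(obj):
    # Recognizes any instance of the type, similar in spirit to PyObject_TypeCheck.
    return isinstance(obj, type(NaT))


# A second, distinct instance of the same type, created the same way as
# `type(pd.NaT)()` in the comment above.
other = type(NaT)()

print(other is NaT)                # False
print(is_null_by_identity(other))  # False: the identity check misses it
print(is_null_by_type(other))      # True: the type check still catches it
print(is_null_by_type(NaT))        # True
```

       Under that assumption, the `PyObject_TypeCheck` approach in the diff would treat both objects as null, while a pointer comparison against the canonical `pd.NaT` would only catch the first.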



