You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/02/28 12:30:08 UTC

[GitHub] [arrow] AlenkaF opened a new pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

AlenkaF opened a new pull request #12522:
URL: https://github.com/apache/arrow/pull/12522


   This PR tries to change `pytz` module to be an optional dependency.
   
   The tests that install/use `pytz` had to be changed. 
   
   I ran all `pyarrow` tests locally with and without `pytz` installed. We could update the build without Pandas to not include `pytz` either so the code could be checked on the CI also.
   
   - [ ] Change `pytz` to an optional dependency
   - [ ] Change the use of `PyObject_HasAttrString` to use `PyObject_IsInstance`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1062799778


   @github-actions crossbow submit test-conda-python-3.8-hypothesis


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on a change in pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r835278334



##########
File path: cpp/src/arrow/python/datetime.cc
##########
@@ -391,62 +391,165 @@ Result<std::string> PyTZInfo_utcoffset_hhmm(PyObject* pytzinfo) {
 Result<PyObject*> StringToTzinfo(const std::string& tz) {
   util::string_view sign_str, hour_str, minute_str;
   OwnedRef pytz;
-  RETURN_NOT_OK(internal::ImportModule("pytz", &pytz));
+  OwnedRef zoneinfo;
+  OwnedRef datetime;
 
+  if (internal::ImportModule("pytz", &pytz).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("pytz", &pytz));
+
+    if (MatchFixedOffset(tz, &sign_str, &hour_str, &minute_str)) {
+      int sign = -1;
+      if (sign_str == "+") {
+        sign = 1;
+      }
+      OwnedRef fixed_offset;
+      RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "FixedOffset", &fixed_offset));
+      uint32_t minutes, hours;
+      if (!::arrow::internal::ParseUnsigned(hour_str.data(), hour_str.size(), &hours) ||
+          !::arrow::internal::ParseUnsigned(minute_str.data(), minute_str.size(),
+                                            &minutes)) {
+        return Status::Invalid("Invalid timezone: ", tz);
+      }
+      OwnedRef total_minutes(PyLong_FromLong(
+          sign * ((static_cast<int>(hours) * 60) + static_cast<int>(minutes))));
+      RETURN_IF_PYERROR();
+      auto tzinfo =
+          PyObject_CallFunctionObjArgs(fixed_offset.obj(), total_minutes.obj(), NULL);
+      RETURN_IF_PYERROR();
+      return tzinfo;
+    }
+
+    OwnedRef timezone;
+    RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "timezone", &timezone));
+    OwnedRef py_tz_string(
+        PyUnicode_FromStringAndSize(tz.c_str(), static_cast<Py_ssize_t>(tz.size())));
+    auto tzinfo = PyObject_CallFunctionObjArgs(timezone.obj(), py_tz_string.obj(), NULL);
+    RETURN_IF_PYERROR();
+    return tzinfo;
+  }
+
+  // catch fixed offset if pytz is not present
   if (MatchFixedOffset(tz, &sign_str, &hour_str, &minute_str)) {
+    RETURN_NOT_OK(internal::ImportModule("datetime", &datetime));
     int sign = -1;
     if (sign_str == "+") {
       sign = 1;
     }
-    OwnedRef fixed_offset;
-    RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "FixedOffset", &fixed_offset));
+
+    // import timezone and timedelta module to create a tzinfo object
+    OwnedRef class_timezone;
+    OwnedRef class_timedelta;
+    RETURN_NOT_OK(
+        internal::ImportFromModule(datetime.obj(), "timezone", &class_timezone));
+    RETURN_NOT_OK(
+        internal::ImportFromModule(datetime.obj(), "timedelta", &class_timedelta));
+
+    // check input
     uint32_t minutes, hours;
     if (!::arrow::internal::ParseUnsigned(hour_str.data(), hour_str.size(), &hours) ||
         !::arrow::internal::ParseUnsigned(minute_str.data(), minute_str.size(),
                                           &minutes)) {
       return Status::Invalid("Invalid timezone: ", tz);
     }
+
+    // save offset as a signed integer
     OwnedRef total_minutes(PyLong_FromLong(
         sign * ((static_cast<int>(hours) * 60) + static_cast<int>(minutes))));
+    // create zero integers for empty arguments in datetime.timedelta
+    OwnedRef zero(PyLong_FromLong(static_cast<int>(0)));
+
+    // call datetime.timedelta to get correct offset object for datetime.timezone
+    auto offset =
+        PyObject_CallFunctionObjArgs(class_timedelta.obj(), zero.obj(), zero.obj(),
+                                     zero.obj(), zero.obj(), total_minutes.obj(), NULL);
     RETURN_IF_PYERROR();
+    // call datetime.timezone
+    auto tzinfo = PyObject_CallFunctionObjArgs(class_timezone.obj(), offset, NULL);
+    RETURN_IF_PYERROR();
+    return tzinfo;
+  }
+
+  // fallback on zoneinfo if tz is string and pytz is not present
+  if (internal::ImportModule("zoneinfo", &zoneinfo).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("zoneinfo", &zoneinfo));
+
+    OwnedRef class_zoneinfo;
+    RETURN_NOT_OK(
+        internal::ImportFromModule(zoneinfo.obj(), "ZoneInfo", &class_zoneinfo));
+    OwnedRef py_tz_string(
+        PyUnicode_FromStringAndSize(tz.c_str(), static_cast<Py_ssize_t>(tz.size())));
     auto tzinfo =
-        PyObject_CallFunctionObjArgs(fixed_offset.obj(), total_minutes.obj(), NULL);
+        PyObject_CallFunctionObjArgs(class_zoneinfo.obj(), py_tz_string.obj(), NULL);
     RETURN_IF_PYERROR();
     return tzinfo;
   }
 
-  OwnedRef timezone;
-  RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "timezone", &timezone));
-  OwnedRef py_tz_string(
-      PyUnicode_FromStringAndSize(tz.c_str(), static_cast<Py_ssize_t>(tz.size())));
-  auto tzinfo = PyObject_CallFunctionObjArgs(timezone.obj(), py_tz_string.obj(), NULL);
-  RETURN_IF_PYERROR();
-  return tzinfo;
+  return Status::Invalid("Pytz package must be installed.");
 }
 
 Result<std::string> TzinfoToString(PyObject* tzinfo) {
   OwnedRef module_pytz;        // import pytz
   OwnedRef module_datetime;    // import datetime
+  OwnedRef module_zoneinfo;    // import zoneinfo
+  OwnedRef module_dateutil;    // import dateutil
   OwnedRef class_timezone;     // from datetime import timezone
   OwnedRef class_fixedoffset;  // from pytz import _FixedOffset
+  OwnedRef class_basetzinfo;   // from pytz import BaseTzInfo
+  OwnedRef class_zoneinfo;     // from zoneinfo import ZoneInfo
+  OwnedRef class_tzfile;       // from zoneinfo import tzfile
 
   // import necessary modules
-  RETURN_NOT_OK(internal::ImportModule("pytz", &module_pytz));
   RETURN_NOT_OK(internal::ImportModule("datetime", &module_datetime));
   // import necessary classes
-  RETURN_NOT_OK(
-      internal::ImportFromModule(module_pytz.obj(), "_FixedOffset", &class_fixedoffset));
   RETURN_NOT_OK(
       internal::ImportFromModule(module_datetime.obj(), "timezone", &class_timezone));
 
+  // Try to import pytz if it is available
+  if (internal::ImportModule("pytz", &module_pytz).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("pytz", &module_pytz));
+    RETURN_NOT_OK(internal::ImportFromModule(module_pytz.obj(), "_FixedOffset",
+                                             &class_fixedoffset));
+    RETURN_NOT_OK(
+        internal::ImportFromModule(module_pytz.obj(), "BaseTzInfo", &class_basetzinfo));
+  }
+
+  // Try to import zoneinfo if it is available
+  if (internal::ImportModule("zoneinfo", &module_zoneinfo).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("zoneinfo", &module_zoneinfo));
+    RETURN_NOT_OK(
+        internal::ImportFromModule(module_zoneinfo.obj(), "ZoneInfo", &class_zoneinfo));
+  }
+  // Try to import dateutil if it is available
+  if (internal::ImportModule("dateutil.tz", &module_dateutil).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("dateutil.tz", &module_dateutil));
+    RETURN_NOT_OK(
+        internal::ImportFromModule(module_dateutil.obj(), "tzfile", &class_tzfile));
+  }
+
   // check that it's a valid tzinfo object
   if (!PyTZInfo_Check(tzinfo)) {
     return Status::TypeError("Not an instance of datetime.tzinfo");
   }
 
-  // if tzinfo is an instance of pytz._FixedOffset or datetime.timezone return the
+  // if tzinfo is an instance of datetime.timezone return the
+  // HH:MM offset string representation
+  if (PyObject_IsInstance(tzinfo, class_timezone.obj())) {
+    // still recognize datetime.timezone.utc as UTC (instead of +00:00)
+    OwnedRef tzname_object(PyObject_CallMethod(tzinfo, "tzname", "O", Py_None));
+    RETURN_IF_PYERROR();
+    if (PyUnicode_Check(tzname_object.obj())) {
+      std::string result;
+      RETURN_NOT_OK(internal::PyUnicode_AsStdString(tzname_object.obj(), &result));
+      if (result == "UTC") {
+        return result;
+      }
+    }
+    return PyTZInfo_utcoffset_hhmm(tzinfo);
+  }
+
+  // if tzinfo is an instance of pytz._FixedOffset return the
   // HH:MM offset string representation
-  if (PyObject_IsInstance(tzinfo, class_timezone.obj()) ||
+  if (module_pytz.obj() != nullptr &&
       PyObject_IsInstance(tzinfo, class_fixedoffset.obj())) {
     // still recognize datetime.timezone.utc as UTC (instead of +00:00)

Review comment:
       Ups, sorry for this.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1082076337


   Benchmark runs are scheduled for baseline = 09a794a6977a3d44cab34db1373c632170d5bae7 and contender = d7fbf5260433dee08ae219e8aec34169cd610652. d7fbf5260433dee08ae219e8aec34169cd610652 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/8e18089ea59546be8bcd20f990ee630e...c4f6194385564fe3adc8b1d67f8ec34f/)
   [Scheduled] [test-mac-arm](https://conbench.ursa.dev/compare/runs/ab20b344f53a46e8983e94e9556e72ce...be48b219ba9641f08d77b32d60ba2a96/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/7d2c25745e164215a537d751ce393465...3c18a7fcd8aa47b8a6686cf021875c70/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/1ac3d1167959409095e909e69d18c4e4...28606d8b6d094ef2a9a6a85788d5f7d3/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1082076337


   Benchmark runs are scheduled for baseline = 09a794a6977a3d44cab34db1373c632170d5bae7 and contender = d7fbf5260433dee08ae219e8aec34169cd610652. d7fbf5260433dee08ae219e8aec34169cd610652 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/8e18089ea59546be8bcd20f990ee630e...c4f6194385564fe3adc8b1d67f8ec34f/)
   [Finished :arrow_down:0.71% :arrow_up:0.0%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/ab20b344f53a46e8983e94e9556e72ce...be48b219ba9641f08d77b32d60ba2a96/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/7d2c25745e164215a537d751ce393465...3c18a7fcd8aa47b8a6686cf021875c70/)
   [Finished :arrow_down:0.13% :arrow_up:0.0%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/1ac3d1167959409095e909e69d18c4e4...28606d8b6d094ef2a9a6a85788d5f7d3/)
   Buildkite builds:
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/409| `d7fbf526` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/395| `d7fbf526` test-mac-arm>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/405| `d7fbf526` ursa-thinkcentre-m75q>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/408| `09a794a6` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/394| `09a794a6` test-mac-arm>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/404| `09a794a6` ursa-thinkcentre-m75q>
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r816758132



##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -224,6 +231,9 @@ def _pymap(draw, key_type, value_type, size, nullable=True):
 
 @st.composite
 def arrays(draw, type, size=None, nullable=True):
+    pytest.importorskip("pytz")

Review comment:
       We might want to change this to use `zoneinfo` for recent Python versions, so we don't skip this full test just when pytz is not present.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on a change in pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r818590997



##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -224,6 +231,9 @@ def _pymap(draw, key_type, value_type, size, nullable=True):
 
 @st.composite
 def arrays(draw, type, size=None, nullable=True):
+    pytest.importorskip("pytz")

Review comment:
       With  `--enable-hypothesis` flag in pytest 5 tests are still being skipped and I wasn't able to check the code change from the `test_arrays`. Any clue what is causing them to get skipped? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1081499194


   I see other PRs are getting segmentation fault on the `tests/test_dataset.py::test_write_dataset_s3 ` also ...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1082076337


   Benchmark runs are scheduled for baseline = 09a794a6977a3d44cab34db1373c632170d5bae7 and contender = d7fbf5260433dee08ae219e8aec34169cd610652. d7fbf5260433dee08ae219e8aec34169cd610652 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/8e18089ea59546be8bcd20f990ee630e...c4f6194385564fe3adc8b1d67f8ec34f/)
   [Finished :arrow_down:0.71% :arrow_up:0.0%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/ab20b344f53a46e8983e94e9556e72ce...be48b219ba9641f08d77b32d60ba2a96/)
   [Failed :arrow_down:0.71% :arrow_up:0.71%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/7d2c25745e164215a537d751ce393465...3c18a7fcd8aa47b8a6686cf021875c70/)
   [Finished :arrow_down:0.13% :arrow_up:0.0%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/1ac3d1167959409095e909e69d18c4e4...28606d8b6d094ef2a9a6a85788d5f7d3/)
   Buildkite builds:
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/409| `d7fbf526` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/395| `d7fbf526` test-mac-arm>
   [Failed] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/395| `d7fbf526` ursa-i9-9960x>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/405| `d7fbf526` ursa-thinkcentre-m75q>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/408| `09a794a6` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/394| `09a794a6` test-mac-arm>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/394| `09a794a6` ursa-i9-9960x>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/404| `09a794a6` ursa-thinkcentre-m75q>
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r834409885



##########
File path: cpp/src/arrow/python/datetime.cc
##########
@@ -391,62 +391,165 @@ Result<std::string> PyTZInfo_utcoffset_hhmm(PyObject* pytzinfo) {
 Result<PyObject*> StringToTzinfo(const std::string& tz) {
   util::string_view sign_str, hour_str, minute_str;
   OwnedRef pytz;
-  RETURN_NOT_OK(internal::ImportModule("pytz", &pytz));
+  OwnedRef zoneinfo;
+  OwnedRef datetime;
 
+  if (internal::ImportModule("pytz", &pytz).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("pytz", &pytz));
+
+    if (MatchFixedOffset(tz, &sign_str, &hour_str, &minute_str)) {
+      int sign = -1;
+      if (sign_str == "+") {
+        sign = 1;
+      }
+      OwnedRef fixed_offset;
+      RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "FixedOffset", &fixed_offset));
+      uint32_t minutes, hours;
+      if (!::arrow::internal::ParseUnsigned(hour_str.data(), hour_str.size(), &hours) ||
+          !::arrow::internal::ParseUnsigned(minute_str.data(), minute_str.size(),
+                                            &minutes)) {
+        return Status::Invalid("Invalid timezone: ", tz);
+      }
+      OwnedRef total_minutes(PyLong_FromLong(
+          sign * ((static_cast<int>(hours) * 60) + static_cast<int>(minutes))));
+      RETURN_IF_PYERROR();
+      auto tzinfo =
+          PyObject_CallFunctionObjArgs(fixed_offset.obj(), total_minutes.obj(), NULL);
+      RETURN_IF_PYERROR();
+      return tzinfo;
+    }
+
+    OwnedRef timezone;
+    RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "timezone", &timezone));
+    OwnedRef py_tz_string(
+        PyUnicode_FromStringAndSize(tz.c_str(), static_cast<Py_ssize_t>(tz.size())));
+    auto tzinfo = PyObject_CallFunctionObjArgs(timezone.obj(), py_tz_string.obj(), NULL);
+    RETURN_IF_PYERROR();
+    return tzinfo;
+  }
+
+  // catch fixed offset if pytz is not present
   if (MatchFixedOffset(tz, &sign_str, &hour_str, &minute_str)) {
+    RETURN_NOT_OK(internal::ImportModule("datetime", &datetime));
     int sign = -1;
     if (sign_str == "+") {
       sign = 1;
     }
-    OwnedRef fixed_offset;
-    RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "FixedOffset", &fixed_offset));
+
+    // import timezone and timedelta module to create a tzinfo object
+    OwnedRef class_timezone;
+    OwnedRef class_timedelta;
+    RETURN_NOT_OK(
+        internal::ImportFromModule(datetime.obj(), "timezone", &class_timezone));
+    RETURN_NOT_OK(
+        internal::ImportFromModule(datetime.obj(), "timedelta", &class_timedelta));
+
+    // check input
     uint32_t minutes, hours;
     if (!::arrow::internal::ParseUnsigned(hour_str.data(), hour_str.size(), &hours) ||
         !::arrow::internal::ParseUnsigned(minute_str.data(), minute_str.size(),
                                           &minutes)) {
       return Status::Invalid("Invalid timezone: ", tz);
     }
+
+    // save offset as a signed integer
     OwnedRef total_minutes(PyLong_FromLong(
         sign * ((static_cast<int>(hours) * 60) + static_cast<int>(minutes))));
+    // create zero integers for empty arguments in datetime.timedelta
+    OwnedRef zero(PyLong_FromLong(static_cast<int>(0)));
+
+    // call datetime.timedelta to get correct offset object for datetime.timezone
+    auto offset =
+        PyObject_CallFunctionObjArgs(class_timedelta.obj(), zero.obj(), zero.obj(),
+                                     zero.obj(), zero.obj(), total_minutes.obj(), NULL);
     RETURN_IF_PYERROR();
+    // call datetime.timezone
+    auto tzinfo = PyObject_CallFunctionObjArgs(class_timezone.obj(), offset, NULL);
+    RETURN_IF_PYERROR();
+    return tzinfo;
+  }
+
+  // fallback on zoneinfo if tz is string and pytz is not present
+  if (internal::ImportModule("zoneinfo", &zoneinfo).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("zoneinfo", &zoneinfo));
+
+    OwnedRef class_zoneinfo;
+    RETURN_NOT_OK(
+        internal::ImportFromModule(zoneinfo.obj(), "ZoneInfo", &class_zoneinfo));
+    OwnedRef py_tz_string(
+        PyUnicode_FromStringAndSize(tz.c_str(), static_cast<Py_ssize_t>(tz.size())));
     auto tzinfo =
-        PyObject_CallFunctionObjArgs(fixed_offset.obj(), total_minutes.obj(), NULL);
+        PyObject_CallFunctionObjArgs(class_zoneinfo.obj(), py_tz_string.obj(), NULL);
     RETURN_IF_PYERROR();
     return tzinfo;
   }
 
-  OwnedRef timezone;
-  RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "timezone", &timezone));
-  OwnedRef py_tz_string(
-      PyUnicode_FromStringAndSize(tz.c_str(), static_cast<Py_ssize_t>(tz.size())));
-  auto tzinfo = PyObject_CallFunctionObjArgs(timezone.obj(), py_tz_string.obj(), NULL);
-  RETURN_IF_PYERROR();
-  return tzinfo;
+  return Status::Invalid("Pytz package must be installed.");
 }
 
 Result<std::string> TzinfoToString(PyObject* tzinfo) {
   OwnedRef module_pytz;        // import pytz
   OwnedRef module_datetime;    // import datetime
+  OwnedRef module_zoneinfo;    // import zoneinfo
+  OwnedRef module_dateutil;    // import dateutil
   OwnedRef class_timezone;     // from datetime import timezone
   OwnedRef class_fixedoffset;  // from pytz import _FixedOffset
+  OwnedRef class_basetzinfo;   // from pytz import BaseTzInfo
+  OwnedRef class_zoneinfo;     // from zoneinfo import ZoneInfo
+  OwnedRef class_tzfile;       // from zoneinfo import tzfile
 
   // import necessary modules
-  RETURN_NOT_OK(internal::ImportModule("pytz", &module_pytz));
   RETURN_NOT_OK(internal::ImportModule("datetime", &module_datetime));
   // import necessary classes
-  RETURN_NOT_OK(
-      internal::ImportFromModule(module_pytz.obj(), "_FixedOffset", &class_fixedoffset));
   RETURN_NOT_OK(
       internal::ImportFromModule(module_datetime.obj(), "timezone", &class_timezone));
 
+  // Try to import pytz if it is available

Review comment:
       You can handle `datetime.timezone` before this, and so on for other possibilities. This will avoid trying to import those modules when not necessary.

##########
File path: cpp/src/arrow/python/datetime.cc
##########
@@ -482,7 +586,9 @@ Result<std::string> TzinfoToString(PyObject* tzinfo) {
 
   // try to look up _filename attribute
   // in case of dateutil.tz object
-  if (PyObject_HasAttrString(tzinfo, "_filename")) {
+  // if dateutil is installed and tzinfo is and instance of dateutil.tz.tzfile

Review comment:
       ```suggestion
     // if dateutil is installed and tzinfo is an instance of dateutil.tz.tzfile
   ```

##########
File path: cpp/src/arrow/python/datetime.cc
##########
@@ -461,18 +564,19 @@ Result<std::string> TzinfoToString(PyObject* tzinfo) {
     return PyTZInfo_utcoffset_hhmm(tzinfo);
   }
 
-  // try to look up zone attribute
-  if (PyObject_HasAttrString(tzinfo, "zone")) {
+  // if pytz is installed and tzinfo is and instance of pytz.BaseTzInfo
+  if (module_pytz.obj() != nullptr &&
+      PyObject_IsInstance(tzinfo, class_basetzinfo.obj())) {
     OwnedRef zone(PyObject_GetAttrString(tzinfo, "zone"));
     RETURN_IF_PYERROR();
     std::string result;
     RETURN_NOT_OK(internal::PyUnicode_AsStdString(zone.obj(), &result));
     return result;
   }
 
-  // try to look up key attribute
-  // in case of zoneinfo object
-  if (PyObject_HasAttrString(tzinfo, "key")) {
+  // if zoneinfo is installed and tzinfo is and instance of zoneinfo.ZoneInfo

Review comment:
       ```suggestion
     // if zoneinfo is installed and tzinfo is an instance of zoneinfo.ZoneInfo
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r837216548



##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -96,10 +103,14 @@
     pa.time64('us'),
     pa.time64('ns')
 ])
+if tzst and zoneinfo:
+    timezones = st.one_of(st.none(), tzst.timezones(), st.timezones())
+else:

Review comment:
       ```suggestion
   if tzst and zoneinfo:
       timezones = st.one_of(st.none(), tzst.timezones(), st.timezones())
   elif tzst:
       timezones = st.one_of(st.none(), tzst.timezones())
   elif zoneinfo:
       timezones = st.one_of(st.none(), st.timezones())
   else:
   ```

##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -261,17 +274,23 @@ def arrays(draw, type, size=None, nullable=True):
     elif pa.types.is_date(ty):
         value = st.dates()
     elif pa.types.is_timestamp(ty):
+        if zoneinfo is None:
+            pytest.skip('no module named zoneinfo')
+        if ty.tz is None:
+            pytest.skip('requires timezone not None')

Review comment:
       We can maybe try to tackle this as a follow-up JIRA?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on a change in pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r816758829



##########
File path: cpp/src/arrow/python/datetime.cc
##########
@@ -391,37 +391,43 @@ Result<std::string> PyTZInfo_utcoffset_hhmm(PyObject* pytzinfo) {
 Result<PyObject*> StringToTzinfo(const std::string& tz) {
   util::string_view sign_str, hour_str, minute_str;
   OwnedRef pytz;
-  RETURN_NOT_OK(internal::ImportModule("pytz", &pytz));
 
-  if (MatchFixedOffset(tz, &sign_str, &hour_str, &minute_str)) {
-    int sign = -1;
-    if (sign_str == "+") {
-      sign = 1;
-    }
-    OwnedRef fixed_offset;
-    RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "FixedOffset", &fixed_offset));
-    uint32_t minutes, hours;
-    if (!::arrow::internal::ParseUnsigned(hour_str.data(), hour_str.size(), &hours) ||
-        !::arrow::internal::ParseUnsigned(minute_str.data(), minute_str.size(),
-                                          &minutes)) {
-      return Status::Invalid("Invalid timezone: ", tz);
-    }
-    OwnedRef total_minutes(PyLong_FromLong(
-        sign * ((static_cast<int>(hours) * 60) + static_cast<int>(minutes))));
-    RETURN_IF_PYERROR();
-    auto tzinfo =
-        PyObject_CallFunctionObjArgs(fixed_offset.obj(), total_minutes.obj(), NULL);
-    RETURN_IF_PYERROR();
-    return tzinfo;
+  if (internal::ImportModule("pytz", &pytz).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("pytz", &pytz));
+
+    if (MatchFixedOffset(tz, &sign_str, &hour_str, &minute_str)) {
+      int sign = -1;
+      if (sign_str == "+") {
+        sign = 1;
+      }
+      OwnedRef fixed_offset;
+      RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "FixedOffset", &fixed_offset));
+      uint32_t minutes, hours;
+      if (!::arrow::internal::ParseUnsigned(hour_str.data(), hour_str.size(), &hours) ||
+          !::arrow::internal::ParseUnsigned(minute_str.data(), minute_str.size(),
+                                            &minutes)) {
+        return Status::Invalid("Invalid timezone: ", tz);
+      }
+      OwnedRef total_minutes(PyLong_FromLong(
+          sign * ((static_cast<int>(hours) * 60) + static_cast<int>(minutes))));
+      RETURN_IF_PYERROR();
+      auto tzinfo =
+          PyObject_CallFunctionObjArgs(fixed_offset.obj(), total_minutes.obj(), NULL);
+      RETURN_IF_PYERROR();
+      return tzinfo;
+      }
+
+      OwnedRef timezone;

Review comment:
       Yes indeed. Will correct - it will resolve the liner CI error also.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1074928989


   @jorisvandenbossche I added `zoneinfo` option to the `StringToTzinfo` and rebased. Waiting for the CI checks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on a change in pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r822637011



##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -266,10 +277,12 @@ def arrays(draw, type, size=None, nullable=True):
         min_datetime = datetime.datetime.fromtimestamp(min_int64 // 10**9)
         max_datetime = datetime.datetime.fromtimestamp(max_int64 // 10**9)
         try:
-            offset_hours = int(ty.tz)
-            tz = pytz.FixedOffset(offset_hours * 60)
+            offset = ty.tz.split(":")
+            offset_hours = int(offset[0])
+            offset_min = int(offset[1])
+            tz = datetime.timedelta(hours=offset_hours, minutes=offset_min)
         except ValueError:
-            tz = pytz.timezone(ty.tz)
+            tz = zoneinfo.ZoneInfo(ty.tz)

Review comment:
       Yes, I wasn't able to get to this correction till today.
   
   Trouble is that in case of `tz=None` the method `tzinfo_to_string` errors with `Not an instance of datetime.tzinfo` which I think is correct behaviour. Putting an extra skip in the test seems off. I will try to use the `timezones` from `pyarrow.tests.strategies` as you suggested in another comment and see if it works.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on a change in pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r822679857



##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -266,10 +277,12 @@ def arrays(draw, type, size=None, nullable=True):
         min_datetime = datetime.datetime.fromtimestamp(min_int64 // 10**9)
         max_datetime = datetime.datetime.fromtimestamp(max_int64 // 10**9)
         try:
-            offset_hours = int(ty.tz)
-            tz = pytz.FixedOffset(offset_hours * 60)
+            offset = ty.tz.split(":")
+            offset_hours = int(offset[0])
+            offset_min = int(offset[1])
+            tz = datetime.timedelta(hours=offset_hours, minutes=offset_min)

Review comment:
       https://issues.apache.org/jira/browse/ARROW-15887




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1082076337


   Benchmark runs are scheduled for baseline = 09a794a6977a3d44cab34db1373c632170d5bae7 and contender = d7fbf5260433dee08ae219e8aec34169cd610652. d7fbf5260433dee08ae219e8aec34169cd610652 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/8e18089ea59546be8bcd20f990ee630e...c4f6194385564fe3adc8b1d67f8ec34f/)
   [Scheduled] [test-mac-arm](https://conbench.ursa.dev/compare/runs/ab20b344f53a46e8983e94e9556e72ce...be48b219ba9641f08d77b32d60ba2a96/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/7d2c25745e164215a537d751ce393465...3c18a7fcd8aa47b8a6686cf021875c70/)
   [Finished :arrow_down:0.13% :arrow_up:0.0%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/1ac3d1167959409095e909e69d18c4e4...28606d8b6d094ef2a9a6a85788d5f7d3/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r819369771



##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -224,6 +231,9 @@ def _pymap(draw, key_type, value_type, size, nullable=True):
 
 @st.composite
 def arrays(draw, type, size=None, nullable=True):
+    pytest.importorskip("pytz")

Review comment:
       > 5 tests are still being skipped and I wasn't able to check the code change from the `test_arrays`
   
   For seeing the reason something is skipped, you can pass `-r S` to pytest:
   
   ```
   $ pytest python/pyarrow/tests/test_array.py --enable-hypothesis -r S
   ..
   ====================================== short test summary info ==============================
   SKIPPED [1] python/pyarrow/tests/conftest.py:221: nopandas NOT enabled
   SKIPPED [1] python/pyarrow/tests/strategies.py:236: could not import 'ZoneInfo': No module named 'ZoneInfo'
   SKIPPED [2] python/pyarrow/tests/conftest.py:221: large_memory NOT enabled
   ====================================== 213 passed, 4 skipped in 0.87s =========================
   ```
   
   So one test is skipped because of pandas being present, and two tests are skipped by default because they are "large memory" tests. One because `ZoneInfo` could not be imported, but so that was actually because of a bug here (the module name is lowercase, see the inline comment below).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r816755156



##########
File path: cpp/src/arrow/python/datetime.cc
##########
@@ -391,37 +391,43 @@ Result<std::string> PyTZInfo_utcoffset_hhmm(PyObject* pytzinfo) {
 Result<PyObject*> StringToTzinfo(const std::string& tz) {
   util::string_view sign_str, hour_str, minute_str;
   OwnedRef pytz;
-  RETURN_NOT_OK(internal::ImportModule("pytz", &pytz));
 
-  if (MatchFixedOffset(tz, &sign_str, &hour_str, &minute_str)) {
-    int sign = -1;
-    if (sign_str == "+") {
-      sign = 1;
-    }
-    OwnedRef fixed_offset;
-    RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "FixedOffset", &fixed_offset));
-    uint32_t minutes, hours;
-    if (!::arrow::internal::ParseUnsigned(hour_str.data(), hour_str.size(), &hours) ||
-        !::arrow::internal::ParseUnsigned(minute_str.data(), minute_str.size(),
-                                          &minutes)) {
-      return Status::Invalid("Invalid timezone: ", tz);
-    }
-    OwnedRef total_minutes(PyLong_FromLong(
-        sign * ((static_cast<int>(hours) * 60) + static_cast<int>(minutes))));
-    RETURN_IF_PYERROR();
-    auto tzinfo =
-        PyObject_CallFunctionObjArgs(fixed_offset.obj(), total_minutes.obj(), NULL);
-    RETURN_IF_PYERROR();
-    return tzinfo;
+  if (internal::ImportModule("pytz", &pytz).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("pytz", &pytz));
+
+    if (MatchFixedOffset(tz, &sign_str, &hour_str, &minute_str)) {
+      int sign = -1;
+      if (sign_str == "+") {
+        sign = 1;
+      }
+      OwnedRef fixed_offset;
+      RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "FixedOffset", &fixed_offset));
+      uint32_t minutes, hours;
+      if (!::arrow::internal::ParseUnsigned(hour_str.data(), hour_str.size(), &hours) ||
+          !::arrow::internal::ParseUnsigned(minute_str.data(), minute_str.size(),
+                                            &minutes)) {
+        return Status::Invalid("Invalid timezone: ", tz);
+      }
+      OwnedRef total_minutes(PyLong_FromLong(
+          sign * ((static_cast<int>(hours) * 60) + static_cast<int>(minutes))));
+      RETURN_IF_PYERROR();
+      auto tzinfo =
+          PyObject_CallFunctionObjArgs(fixed_offset.obj(), total_minutes.obj(), NULL);
+      RETURN_IF_PYERROR();
+      return tzinfo;
+      }
+
+      OwnedRef timezone;

Review comment:
       This part needs to be indented one level less?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r818420435



##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -224,6 +231,9 @@ def _pymap(draw, key_type, value_type, size, nullable=True):
 
 @st.composite
 def arrays(draw, type, size=None, nullable=True):
+    pytest.importorskip("pytz")

Review comment:
       > It seems all the tests in `test_strategies.py` are being skipped?
   
   Yeah, it seems you need to manually enable them by passing `--enable-hypothesis` flag to pytest (I struggled with this in the past as well, I remember. This is documented .. but easy to miss: https://arrow.apache.org/docs/dev/developers/python.html#test-groups
   
   > I would need to install `timedelta`,
   
   That's not about the standard library `datetime.timedelta`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on a change in pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r832321102



##########
File path: cpp/src/arrow/python/datetime.cc
##########
@@ -426,8 +427,19 @@ Result<PyObject*> StringToTzinfo(const std::string& tz) {
     return tzinfo;
   }
 
+  if (internal::ImportModule("zoneinfo", &zoneinfo).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("zoneinfo", &zoneinfo));
+
+    OwnedRef class_zoneinfo;
+    RETURN_NOT_OK(internal::ImportFromModule(zoneinfo.obj(), "ZoneInfo", &class_zoneinfo));

Review comment:
       Yes, I missed that. Will do 👍 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1074150420


   @AlenkaF I merged the other hypothesis PR, so if you rebase this one, we should be able to run the hypothesis tests


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on a change in pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r836574248



##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -96,10 +103,16 @@
     pa.time64('us'),
     pa.time64('ns')
 ])
+if tzst:

Review comment:
       Seems that when starting with a non-pytz timezone, the roundtrip `pandas->arrow->pandas` changes the timezone (as we convert it to `pytz` if it is installed) and produces an error when asserting Pandas dataframes. Will amend the test to ignore this kind of cases.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on a change in pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r818433297



##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -224,6 +231,9 @@ def _pymap(draw, key_type, value_type, size, nullable=True):
 
 @st.composite
 def arrays(draw, type, size=None, nullable=True):
+    pytest.importorskip("pytz")

Review comment:
       Thanks for the suggestion! I have totally missed the Test Groups part of the documentation.
   
   As for the `timedelta` I got a bit confused, yes, you are absolutely right. Will add the change today.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1064843822


   @jorisvandenbossche this PR now removes `pytz` from `ci/conda_env_python.txt` and uses `pytz` module for the roundtrip test in `python/pyarrow/tests/test_types.py`, if it is present. I tested it locally without `pytz` installed and it didn't complain.
   
   `pa.lib.string_to_tzinfo` produces a `pytz` timezone or gives a warning, that the module is not installed. There should be a fallback to `zoneinfo` at some point.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1067904778


   That wasn't stated correctly from my side, I think. It returns an Invalid Status:
   https://github.com/apache/arrow/blob/0e21278ef3dddffefcce5e5cd3b91dee352334d3/cpp/src/arrow/python/datetime.cc#L429-L430


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1075100278


   Lots of CI errors but I do not see any related. Still worried though ...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1081559057


   Indeed, I don't think the S3 failures on Mac are related (they are also happening on C++), see https://issues.apache.org/jira/browse/ARROW-16043


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r834286616



##########
File path: cpp/src/arrow/python/datetime.cc
##########
@@ -391,62 +391,165 @@ Result<std::string> PyTZInfo_utcoffset_hhmm(PyObject* pytzinfo) {
 Result<PyObject*> StringToTzinfo(const std::string& tz) {
   util::string_view sign_str, hour_str, minute_str;
   OwnedRef pytz;
-  RETURN_NOT_OK(internal::ImportModule("pytz", &pytz));
+  OwnedRef zoneinfo;
+  OwnedRef datetime;
 
+  if (internal::ImportModule("pytz", &pytz).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("pytz", &pytz));

Review comment:
       ```suggestion
     Status has_pytz = internal::ImportModule("pytz", &pytz)
     if (has_pytz.ok()) {
   ```
   
   Something like this would avoid doing the ImportModule twice. Not that importing twice is really a problem, but this might be a bit cleaner.
   
   (same for the other places where an ImportModule happens inside an `if`)

##########
File path: cpp/src/arrow/python/datetime.cc
##########
@@ -391,62 +391,165 @@ Result<std::string> PyTZInfo_utcoffset_hhmm(PyObject* pytzinfo) {
 Result<PyObject*> StringToTzinfo(const std::string& tz) {
   util::string_view sign_str, hour_str, minute_str;
   OwnedRef pytz;
-  RETURN_NOT_OK(internal::ImportModule("pytz", &pytz));
+  OwnedRef zoneinfo;
+  OwnedRef datetime;
 
+  if (internal::ImportModule("pytz", &pytz).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("pytz", &pytz));
+
+    if (MatchFixedOffset(tz, &sign_str, &hour_str, &minute_str)) {
+      int sign = -1;
+      if (sign_str == "+") {
+        sign = 1;
+      }
+      OwnedRef fixed_offset;
+      RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "FixedOffset", &fixed_offset));
+      uint32_t minutes, hours;
+      if (!::arrow::internal::ParseUnsigned(hour_str.data(), hour_str.size(), &hours) ||
+          !::arrow::internal::ParseUnsigned(minute_str.data(), minute_str.size(),
+                                            &minutes)) {
+        return Status::Invalid("Invalid timezone: ", tz);
+      }
+      OwnedRef total_minutes(PyLong_FromLong(
+          sign * ((static_cast<int>(hours) * 60) + static_cast<int>(minutes))));
+      RETURN_IF_PYERROR();
+      auto tzinfo =
+          PyObject_CallFunctionObjArgs(fixed_offset.obj(), total_minutes.obj(), NULL);
+      RETURN_IF_PYERROR();
+      return tzinfo;
+    }
+
+    OwnedRef timezone;
+    RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "timezone", &timezone));
+    OwnedRef py_tz_string(
+        PyUnicode_FromStringAndSize(tz.c_str(), static_cast<Py_ssize_t>(tz.size())));
+    auto tzinfo = PyObject_CallFunctionObjArgs(timezone.obj(), py_tz_string.obj(), NULL);
+    RETURN_IF_PYERROR();
+    return tzinfo;
+  }
+
+  // catch fixed offset if pytz is not present
   if (MatchFixedOffset(tz, &sign_str, &hour_str, &minute_str)) {
+    RETURN_NOT_OK(internal::ImportModule("datetime", &datetime));
     int sign = -1;
     if (sign_str == "+") {
       sign = 1;
     }
-    OwnedRef fixed_offset;
-    RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "FixedOffset", &fixed_offset));
+
+    // import timezone and timedelta module to create a tzinfo object
+    OwnedRef class_timezone;
+    OwnedRef class_timedelta;
+    RETURN_NOT_OK(
+        internal::ImportFromModule(datetime.obj(), "timezone", &class_timezone));
+    RETURN_NOT_OK(
+        internal::ImportFromModule(datetime.obj(), "timedelta", &class_timedelta));
+
+    // check input
     uint32_t minutes, hours;
     if (!::arrow::internal::ParseUnsigned(hour_str.data(), hour_str.size(), &hours) ||
         !::arrow::internal::ParseUnsigned(minute_str.data(), minute_str.size(),
                                           &minutes)) {
       return Status::Invalid("Invalid timezone: ", tz);
     }
+
+    // save offset as a signed integer
     OwnedRef total_minutes(PyLong_FromLong(
         sign * ((static_cast<int>(hours) * 60) + static_cast<int>(minutes))));
+    // create zero integers for empty arguments in datetime.timedelta
+    OwnedRef zero(PyLong_FromLong(static_cast<int>(0)));
+
+    // call datetime.timedelta to get correct offset object for datetime.timezone
+    auto offset =
+        PyObject_CallFunctionObjArgs(class_timedelta.obj(), zero.obj(), zero.obj(),
+                                     zero.obj(), zero.obj(), total_minutes.obj(), NULL);
     RETURN_IF_PYERROR();
+    // call datetime.timezone
+    auto tzinfo = PyObject_CallFunctionObjArgs(class_timezone.obj(), offset, NULL);
+    RETURN_IF_PYERROR();
+    return tzinfo;
+  }
+
+  // fallback on zoneinfo if tz is string and pytz is not present
+  if (internal::ImportModule("zoneinfo", &zoneinfo).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("zoneinfo", &zoneinfo));
+
+    OwnedRef class_zoneinfo;
+    RETURN_NOT_OK(
+        internal::ImportFromModule(zoneinfo.obj(), "ZoneInfo", &class_zoneinfo));
+    OwnedRef py_tz_string(
+        PyUnicode_FromStringAndSize(tz.c_str(), static_cast<Py_ssize_t>(tz.size())));
     auto tzinfo =
-        PyObject_CallFunctionObjArgs(fixed_offset.obj(), total_minutes.obj(), NULL);
+        PyObject_CallFunctionObjArgs(class_zoneinfo.obj(), py_tz_string.obj(), NULL);
     RETURN_IF_PYERROR();
     return tzinfo;
   }
 
-  OwnedRef timezone;
-  RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "timezone", &timezone));
-  OwnedRef py_tz_string(
-      PyUnicode_FromStringAndSize(tz.c_str(), static_cast<Py_ssize_t>(tz.size())));
-  auto tzinfo = PyObject_CallFunctionObjArgs(timezone.obj(), py_tz_string.obj(), NULL);
-  RETURN_IF_PYERROR();
-  return tzinfo;
+  return Status::Invalid("Pytz package must be installed.");

Review comment:
       or Python>=3.8 for zoneinfo module

##########
File path: python/pyarrow/tests/test_convert_builtin.py
##########
@@ -995,6 +994,9 @@ def test_sequence_timestamp():
     'ns'
 ])
 def test_sequence_timestamp_with_timezone(timezone, unit):
+    pytest.importorskip("pytz")
+    import pytz

Review comment:
       This one is not yet addressed I think

##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -261,17 +274,23 @@ def arrays(draw, type, size=None, nullable=True):
     elif pa.types.is_date(ty):
         value = st.dates()
     elif pa.types.is_timestamp(ty):
+        if zoneinfo is None:
+            pytest.skip('no module named zoneinfo')
+        if ty.tz is None:
+            pytest.skip('requires timezone not None')

Review comment:
       Although I now see some previous comments about that below ...

##########
File path: python/pyarrow/tests/test_pandas.py
##########
@@ -1044,7 +1044,7 @@ def test_python_datetime_with_pytz_tzinfo(self):
             df = pd.DataFrame({'datetime': values})
             _check_pandas_roundtrip(df)
 
-    @h.given(st.none() | tzst.timezones())
+    @h.given(st.none() | st.timezones())

Review comment:
       Or this can be `past.timezones()` instead?

##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -96,10 +103,16 @@
     pa.time64('us'),
     pa.time64('ns')
 ])
+if tzst:

Review comment:
       I don't know if I already asked this before, but is it possible to do something like this?
   
   ```suggestion
   if tzst and zoneinfo:
       timezones = st.one_of(st.none(), tzst.timezones(), st.timezones())
   elif tzst:
   ```

##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -261,17 +274,23 @@ def arrays(draw, type, size=None, nullable=True):
     elif pa.types.is_date(ty):
         value = st.dates()
     elif pa.types.is_timestamp(ty):
+        if zoneinfo is None:
+            pytest.skip('no module named zoneinfo')
+        if ty.tz is None:
+            pytest.skip('requires timezone not None')

Review comment:
       I think we can also create a datetime without timezone in this case, instead of skipping the whole test? (that might require rearranging the code below a bit)

##########
File path: cpp/src/arrow/python/datetime.cc
##########
@@ -391,62 +391,165 @@ Result<std::string> PyTZInfo_utcoffset_hhmm(PyObject* pytzinfo) {
 Result<PyObject*> StringToTzinfo(const std::string& tz) {
   util::string_view sign_str, hour_str, minute_str;
   OwnedRef pytz;
-  RETURN_NOT_OK(internal::ImportModule("pytz", &pytz));
+  OwnedRef zoneinfo;
+  OwnedRef datetime;
 
+  if (internal::ImportModule("pytz", &pytz).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("pytz", &pytz));
+
+    if (MatchFixedOffset(tz, &sign_str, &hour_str, &minute_str)) {
+      int sign = -1;
+      if (sign_str == "+") {
+        sign = 1;
+      }
+      OwnedRef fixed_offset;
+      RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "FixedOffset", &fixed_offset));
+      uint32_t minutes, hours;
+      if (!::arrow::internal::ParseUnsigned(hour_str.data(), hour_str.size(), &hours) ||
+          !::arrow::internal::ParseUnsigned(minute_str.data(), minute_str.size(),
+                                            &minutes)) {
+        return Status::Invalid("Invalid timezone: ", tz);
+      }
+      OwnedRef total_minutes(PyLong_FromLong(
+          sign * ((static_cast<int>(hours) * 60) + static_cast<int>(minutes))));
+      RETURN_IF_PYERROR();
+      auto tzinfo =
+          PyObject_CallFunctionObjArgs(fixed_offset.obj(), total_minutes.obj(), NULL);
+      RETURN_IF_PYERROR();
+      return tzinfo;
+    }
+
+    OwnedRef timezone;
+    RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "timezone", &timezone));
+    OwnedRef py_tz_string(
+        PyUnicode_FromStringAndSize(tz.c_str(), static_cast<Py_ssize_t>(tz.size())));
+    auto tzinfo = PyObject_CallFunctionObjArgs(timezone.obj(), py_tz_string.obj(), NULL);
+    RETURN_IF_PYERROR();
+    return tzinfo;
+  }
+
+  // catch fixed offset if pytz is not present
   if (MatchFixedOffset(tz, &sign_str, &hour_str, &minute_str)) {
+    RETURN_NOT_OK(internal::ImportModule("datetime", &datetime));
     int sign = -1;
     if (sign_str == "+") {
       sign = 1;
     }
-    OwnedRef fixed_offset;
-    RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "FixedOffset", &fixed_offset));
+
+    // import timezone and timedelta module to create a tzinfo object
+    OwnedRef class_timezone;
+    OwnedRef class_timedelta;
+    RETURN_NOT_OK(
+        internal::ImportFromModule(datetime.obj(), "timezone", &class_timezone));
+    RETURN_NOT_OK(
+        internal::ImportFromModule(datetime.obj(), "timedelta", &class_timedelta));
+
+    // check input
     uint32_t minutes, hours;
     if (!::arrow::internal::ParseUnsigned(hour_str.data(), hour_str.size(), &hours) ||
         !::arrow::internal::ParseUnsigned(minute_str.data(), minute_str.size(),
                                           &minutes)) {
       return Status::Invalid("Invalid timezone: ", tz);
     }
+
+    // save offset as a signed integer
     OwnedRef total_minutes(PyLong_FromLong(
         sign * ((static_cast<int>(hours) * 60) + static_cast<int>(minutes))));
+    // create zero integers for empty arguments in datetime.timedelta
+    OwnedRef zero(PyLong_FromLong(static_cast<int>(0)));
+
+    // call datetime.timedelta to get correct offset object for datetime.timezone
+    auto offset =
+        PyObject_CallFunctionObjArgs(class_timedelta.obj(), zero.obj(), zero.obj(),
+                                     zero.obj(), zero.obj(), total_minutes.obj(), NULL);
     RETURN_IF_PYERROR();
+    // call datetime.timezone
+    auto tzinfo = PyObject_CallFunctionObjArgs(class_timezone.obj(), offset, NULL);
+    RETURN_IF_PYERROR();
+    return tzinfo;
+  }
+
+  // fallback on zoneinfo if tz is string and pytz is not present
+  if (internal::ImportModule("zoneinfo", &zoneinfo).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("zoneinfo", &zoneinfo));
+
+    OwnedRef class_zoneinfo;
+    RETURN_NOT_OK(
+        internal::ImportFromModule(zoneinfo.obj(), "ZoneInfo", &class_zoneinfo));
+    OwnedRef py_tz_string(
+        PyUnicode_FromStringAndSize(tz.c_str(), static_cast<Py_ssize_t>(tz.size())));
     auto tzinfo =
-        PyObject_CallFunctionObjArgs(fixed_offset.obj(), total_minutes.obj(), NULL);
+        PyObject_CallFunctionObjArgs(class_zoneinfo.obj(), py_tz_string.obj(), NULL);
     RETURN_IF_PYERROR();
     return tzinfo;
   }
 
-  OwnedRef timezone;
-  RETURN_NOT_OK(internal::ImportFromModule(pytz.obj(), "timezone", &timezone));
-  OwnedRef py_tz_string(
-      PyUnicode_FromStringAndSize(tz.c_str(), static_cast<Py_ssize_t>(tz.size())));
-  auto tzinfo = PyObject_CallFunctionObjArgs(timezone.obj(), py_tz_string.obj(), NULL);
-  RETURN_IF_PYERROR();
-  return tzinfo;
+  return Status::Invalid("Pytz package must be installed.");
 }
 
 Result<std::string> TzinfoToString(PyObject* tzinfo) {
   OwnedRef module_pytz;        // import pytz
   OwnedRef module_datetime;    // import datetime
+  OwnedRef module_zoneinfo;    // import zoneinfo
+  OwnedRef module_dateutil;    // import dateutil
   OwnedRef class_timezone;     // from datetime import timezone
   OwnedRef class_fixedoffset;  // from pytz import _FixedOffset
+  OwnedRef class_basetzinfo;   // from pytz import BaseTzInfo
+  OwnedRef class_zoneinfo;     // from zoneinfo import ZoneInfo
+  OwnedRef class_tzfile;       // from zoneinfo import tzfile
 
   // import necessary modules
-  RETURN_NOT_OK(internal::ImportModule("pytz", &module_pytz));
   RETURN_NOT_OK(internal::ImportModule("datetime", &module_datetime));
   // import necessary classes
-  RETURN_NOT_OK(
-      internal::ImportFromModule(module_pytz.obj(), "_FixedOffset", &class_fixedoffset));
   RETURN_NOT_OK(
       internal::ImportFromModule(module_datetime.obj(), "timezone", &class_timezone));
 
+  // Try to import pytz if it is available
+  if (internal::ImportModule("pytz", &module_pytz).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("pytz", &module_pytz));
+    RETURN_NOT_OK(internal::ImportFromModule(module_pytz.obj(), "_FixedOffset",
+                                             &class_fixedoffset));
+    RETURN_NOT_OK(
+        internal::ImportFromModule(module_pytz.obj(), "BaseTzInfo", &class_basetzinfo));
+  }
+
+  // Try to import zoneinfo if it is available
+  if (internal::ImportModule("zoneinfo", &module_zoneinfo).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("zoneinfo", &module_zoneinfo));
+    RETURN_NOT_OK(
+        internal::ImportFromModule(module_zoneinfo.obj(), "ZoneInfo", &class_zoneinfo));
+  }
+  // Try to import dateutil if it is available
+  if (internal::ImportModule("dateutil.tz", &module_dateutil).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("dateutil.tz", &module_dateutil));
+    RETURN_NOT_OK(
+        internal::ImportFromModule(module_dateutil.obj(), "tzfile", &class_tzfile));
+  }
+
   // check that it's a valid tzinfo object
   if (!PyTZInfo_Check(tzinfo)) {
     return Status::TypeError("Not an instance of datetime.tzinfo");
   }
 
-  // if tzinfo is an instance of pytz._FixedOffset or datetime.timezone return the
+  // if tzinfo is an instance of datetime.timezone return the
+  // HH:MM offset string representation
+  if (PyObject_IsInstance(tzinfo, class_timezone.obj())) {
+    // still recognize datetime.timezone.utc as UTC (instead of +00:00)
+    OwnedRef tzname_object(PyObject_CallMethod(tzinfo, "tzname", "O", Py_None));
+    RETURN_IF_PYERROR();
+    if (PyUnicode_Check(tzname_object.obj())) {
+      std::string result;
+      RETURN_NOT_OK(internal::PyUnicode_AsStdString(tzname_object.obj(), &result));
+      if (result == "UTC") {
+        return result;
+      }
+    }
+    return PyTZInfo_utcoffset_hhmm(tzinfo);
+  }
+
+  // if tzinfo is an instance of pytz._FixedOffset return the
   // HH:MM offset string representation
-  if (PyObject_IsInstance(tzinfo, class_timezone.obj()) ||
+  if (module_pytz.obj() != nullptr &&
       PyObject_IsInstance(tzinfo, class_fixedoffset.obj())) {
     // still recognize datetime.timezone.utc as UTC (instead of +00:00)

Review comment:
       This part was only relevant for the stdlib datetime.timezone, as pytz.utc is not a subclass of pytz._FixedOffset. So this can be cleaned up a bit here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1080948878


   This PR is now ready. I have run the tests with and without `pytz`.
   
   <details>
   <summary>With pytz installed</summary>
   <pre>
   
   ```python
   (pyarrow-dev-9) (base) alenkafrim@Alenkas-MacBook-Pro arrow % pip install pytz
   Collecting pytz
     Using cached pytz-2022.1-py2.py3-none-any.whl (503 kB)
   Installing collected packages: pytz
   Successfully installed pytz-2022.1
   (pyarrow-dev-9) (base) alenkafrim@Alenkas-MacBook-Pro arrow % python -m pytest python/pyarrow/tests --enable-hypothesis
   ================================================================= test session starts =================================================================
   platform darwin -- Python 3.9.10, pytest-7.1.1, pluggy-1.0.0
   rootdir: /Users/alenkafrim/repos/arrow/python, configfile: setup.cfg
   plugins: hypothesis-6.39.4, cython-0.2.0.dev0, lazy-fixture-0.6.3
   collected 4142 items / 3 skipped                                                                                                                      
   
   python/pyarrow/tests/test_adhoc_memory_leak.py s                                                                                                [  0%]
   python/pyarrow/tests/test_array.py ........................s................................................................................... [  2%]
   ..............................................................................................ss.............                                   [  5%]
   python/pyarrow/tests/test_builder.py ....                                                                                                       [  5%]
   python/pyarrow/tests/test_cffi.py ...........                                                                                                   [  5%]
   python/pyarrow/tests/test_compute.py .......................................................................................................... [  8%]
   ..............................................................................................................                                  [ 10%]
   python/pyarrow/tests/test_convert_builtin.py .................................................................................................. [ 13%]
   .................................................................................................................x............................. [ 16%]
   .........................................................x.......................ssssss........................................................ [ 20%]
   ................................................sssssssss                                                                                       [ 21%]
   python/pyarrow/tests/test_csv.py .............................................................................................................. [ 24%]
   ......                                                                                                                                          [ 24%]
   python/pyarrow/tests/test_cython.py ..                                                                                                          [ 24%]
   python/pyarrow/tests/test_dataset.py .......................................................................................................... [ 26%]
   ..................................................ssss...................ssss.................................................................s [ 30%]
   ..                                                                                                                                              [ 30%]
   python/pyarrow/tests/test_extension_type.py ........................................                                                            [ 31%]
   python/pyarrow/tests/test_feather.py .............................ss...............................ss...x........s.....s.                       [ 33%]
   python/pyarrow/tests/test_filesystem.py ....                                                                                                    [ 33%]
   python/pyarrow/tests/test_flight.py ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss                                            [ 35%]
   python/pyarrow/tests/test_fs.py s.......ssx.x..s...ssx.x..s....ss.....s...ss.....s...ss.....s...ss.....s...ss.....s...ss.....s...ss.....s....ss [ 37%]
   .....s...ss.....s...ss.....s...ss.....s............ssssssss....................ssss...ss.....s...ss.....s............ssssssss.................. [ 41%]
   ..ssss............ssssssss....................ssss...ss.....s....sss...............s...........ssss.                                            [ 43%]
   python/pyarrow/tests/test_gandiva.py sssssssssss                                                                                                [ 43%]
   python/pyarrow/tests/test_gdb.py ssssssssssssssssssssssss                                                                                       [ 44%]
   python/pyarrow/tests/test_hdfs.py ssssssssssssssssssssssss                                                                                      [ 45%]
   python/pyarrow/tests/test_io.py ........s..........................x.....x...............s...s................................x................ [ 47%]
   .................                                                                                                                               [ 48%]
   python/pyarrow/tests/test_ipc.py ....................................................                                                           [ 49%]
   python/pyarrow/tests/test_json.py ..........................                                                                                    [ 49%]
   python/pyarrow/tests/test_memory.py .................                                                                                           [ 50%]
   python/pyarrow/tests/test_misc.py .s...............................................................................................             [ 52%]
   python/pyarrow/tests/test_orc.py sssssssssss                                                                                                    [ 52%]
   python/pyarrow/tests/test_pandas.py ........................................................................................................... [ 55%]
   sss.....................................................s.........s............................................................................ [ 59%]
   ......................................................                                                                                          [ 60%]
   python/pyarrow/tests/test_plasma.py sssssssssssssssssssssssssssssssss                                                                           [ 61%]
   python/pyarrow/tests/test_plasma_tf_op.py s                                                                                                     [ 61%]
   python/pyarrow/tests/test_scalars.py ..............................................s....s....................                                   [ 62%]
   python/pyarrow/tests/test_schema.py ................................                                                                            [ 63%]
   python/pyarrow/tests/test_serialization.py ........s..ss....................................................................................... [ 66%]
   .............................................................................................ss................................................ [ 69%]
   ................................................................................................................................s.............. [ 72%]
   ............................................................................................................................................... [ 76%]
   ...................s........................................................................................................................... [ 79%]
   ........................................................s..............                                                                         [ 81%]
   python/pyarrow/tests/test_serialization_deprecated.py ..                                                                                        [ 81%]
   python/pyarrow/tests/test_sparse_tensor.py .................................................................................................... [ 84%]
   ................................sssssssssssssssssssssssssssssssss                                                                               [ 85%]
   python/pyarrow/tests/test_strategies.py ...ss.ss                                                                                                [ 85%]
   python/pyarrow/tests/test_table.py .......................s................................................................................     [ 88%]
   python/pyarrow/tests/test_tensor.py .....................                                                                                       [ 88%]
   python/pyarrow/tests/test_types.py ............................................................................                                 [ 90%]
   python/pyarrow/tests/test_util.py s                                                                                                             [ 90%]
   python/pyarrow/tests/parquet/test_basic.py ...............s........................................s.....................................       [ 92%]
   python/pyarrow/tests/parquet/test_compliant_nested_type.py ........                                                                             [ 93%]
   python/pyarrow/tests/parquet/test_data_types.py ..............X.................sssssss.ss                                                      [ 94%]
   python/pyarrow/tests/parquet/test_dataset.py ....x..........xx.......x...x.ssssssss..........................x.x...........ssssss.s..s......... [ 96%]
                                                                                                                                                   [ 96%]
   python/pyarrow/tests/parquet/test_datetime.py ...................                                                                               [ 97%]
   python/pyarrow/tests/parquet/test_encryption.py ssssssssssssssss                                                                                [ 97%]
   python/pyarrow/tests/parquet/test_metadata.py .........................s                                                                        [ 98%]
   python/pyarrow/tests/parquet/test_pandas.py .......s.........................................                                                   [ 99%]
   python/pyarrow/tests/parquet/test_parquet_file.py ..............                                                                                [ 99%]
   python/pyarrow/tests/parquet/test_parquet_writer.py .............sss...                                                                         [100%]
   
   ================================================================== warnings summary ===================================================================
   pyarrow/tests/test_pandas.py::TestConvertMetadata::test_column_index_names_with_tz
   pyarrow/tests/test_pandas.py::TestConvertMetadata::test_column_index_names_with_tz
   pyarrow/tests/test_pandas.py::TestConvertMetadata::test_column_index_names_with_tz
     /Users/alenkafrim/repos/pyarrow-dev-9/lib/python3.9/site-packages/pandas/core/indexes/base.py:1059: DeprecationWarning: parsing timezone aware datetimes is deprecated; this will raise an error in the future
       new_values = values.astype(dtype, copy=copy)
   
   pyarrow/tests/test_pandas.py::TestConvertMetadata::test_rangeindex_doesnt_warn
   pyarrow/tests/test_pandas.py::TestConvertMetadata::test_multiindex_doesnt_warn
   pyarrow/tests/parquet/test_basic.py::test_read_table_doesnt_warn[True]
   pyarrow/tests/parquet/test_basic.py::test_read_table_doesnt_warn[False]
     /Users/alenkafrim/repos/pyarrow-dev-9/lib/python3.9/site-packages/_pytest/python.py:192: PytestRemovedIn8Warning: Passing None has been deprecated.
     See https://docs.pytest.org/en/latest/how-to/capture-warnings.html#additional-use-cases-of-warnings-in-tests for alternatives in common use cases.
       result = testfunction(**testargs)
   
   pyarrow/tests/test_pandas.py::TestConvertListTypes::test_infer_lists
   pyarrow/tests/test_pandas.py::TestConvertListTypes::test_to_list_of_structs_pandas
   pyarrow/tests/test_pandas.py::TestConvertListTypes::test_nested_large_list
     /Users/alenkafrim/repos/pyarrow-dev-9/lib/python3.9/site-packages/pandas/core/dtypes/missing.py:521: DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
       if np.any(np.asarray(left_value != right_value)):
   
   pyarrow/tests/test_pandas.py::TestConvertListTypes::test_nested_large_list
     /Users/alenkafrim/repos/pyarrow-dev-9/lib/python3.9/site-packages/pandas/core/dtypes/missing.py:521: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
       if np.any(np.asarray(left_value != right_value)):
   
   pyarrow/tests/test_pandas.py::test_timestamp_as_object_parquet
   pyarrow/tests/parquet/test_pandas.py::test_spark_flavor_preserves_pandas_metadata
     /Users/alenkafrim/repos/arrow/python/pyarrow/parquet.py:2101: FutureWarning: Parquet format '2.0' pseudo version is deprecated, use '2.4' or '2.6' for fine-grained feature selection
       with ParquetWriter(
   
   pyarrow/tests/parquet/test_metadata.py::test_write_metadata
     /Users/alenkafrim/repos/arrow/python/pyarrow/parquet.py:2345: FutureWarning: Parquet format '2.0' pseudo version is deprecated, use '2.4' or '2.6' for fine-grained feature selection
       writer = ParquetWriter(where, schema, **kwargs)
   
   -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
   ======================================= 3723 passed, 404 skipped, 17 xfailed, 1 xpassed, 14 warnings in 39.40s ========================================
   ```
   
   </pre>
   </details>
   
   <details>
   <summary>Without pytz</summary>
   <pre>
   
   ```python
   (pyarrow-dev-9) (base) alenkafrim@Alenkas-MacBook-Pro arrow % pip uninstall pytz
   Found existing installation: pytz 2022.1
   Uninstalling pytz-2022.1:
     Would remove:
       /Users/alenkafrim/repos/pyarrow-dev-9/lib/python3.9/site-packages/pytz-2022.1.dist-info/*
       /Users/alenkafrim/repos/pyarrow-dev-9/lib/python3.9/site-packages/pytz/*
   Proceed (Y/n)? y 
     Successfully uninstalled pytz-2022.1
   (pyarrow-dev-9) (base) alenkafrim@Alenkas-MacBook-Pro arrow % python -m pytest python/pyarrow/tests --enable-hypothesis
   ================================================================= test session starts =================================================================
   platform darwin -- Python 3.9.10, pytest-7.1.1, pluggy-1.0.0
   rootdir: /Users/alenkafrim/repos/arrow/python, configfile: setup.cfg
   plugins: hypothesis-6.39.4, cython-0.2.0.dev0, lazy-fixture-0.6.3
   collected 4142 items / 3 skipped                                                                                                                      
   
   python/pyarrow/tests/test_adhoc_memory_leak.py s                                                                                                [  0%]
   python/pyarrow/tests/test_array.py .......................ss....s.............................................................................. [  2%]
   .............s.........................s...................s.s................................ss...s........s                                   [  5%]
   python/pyarrow/tests/test_builder.py ....                                                                                                       [  5%]
   python/pyarrow/tests/test_cffi.py ...........                                                                                                   [  5%]
   python/pyarrow/tests/test_compute.py ........................................................................s................................. [  8%]
   ........................................................................ssssssssss............................                                  [ 10%]
   python/pyarrow/tests/test_convert_builtin.py .................................................................................................. [ 13%]
   .................................................................................................................x............................. [ 16%]
   .........................................................x.......................ssssss...............ssssssssssssssssssssss...Xs.............. [ 20%]
   ...............................................ssssssssss                                                                                       [ 21%]
   python/pyarrow/tests/test_csv.py .............................................................................................................. [ 24%]
   ......                                                                                                                                          [ 24%]
   python/pyarrow/tests/test_cython.py ..                                                                                                          [ 24%]
   python/pyarrow/tests/test_dataset.py ...........................s....sssssss.ssss.ssssssssss......s....s...................s..........s........ [ 26%]
   .................................................sssss.ss......ssss......ssss.ssssssssss......sss.ssssss..ssss..sss.s.....s....ss..........s..s [ 30%]
   ..                                                                                                                                              [ 30%]
   python/pyarrow/tests/test_extension_type.py .................s.s....................                                                            [ 31%]
   python/pyarrow/tests/test_feather.py s.sssssssss..ssssssssssssss.sss.s..sssssssss..ssssssssssssss.sss.s.sss.sss..s.....s.                       [ 33%]
   python/pyarrow/tests/test_filesystem.py ....                                                                                                    [ 33%]
   python/pyarrow/tests/test_flight.py ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss                                            [ 35%]
   python/pyarrow/tests/test_fs.py s.......ssx.x..s...ssx.x..s....ss.....s...ss.....s...ss.....s...ss.....s...ss.....s...ss.....s...ss.....s....ss [ 37%]
   .....s...ss.....s...ss.....s...ss.....s............ssssssss....................ssss...ss.....s...ss.....s............ssssssss.................. [ 41%]
   ..ssss............ssssssss....................ssss...ss.....s....sss...............s...........ssss.                                            [ 43%]
   python/pyarrow/tests/test_gandiva.py sssssssssss                                                                                                [ 43%]
   python/pyarrow/tests/test_gdb.py ssssssssssssssssssssssss                                                                                       [ 44%]
   python/pyarrow/tests/test_hdfs.py ssssssssssssssssssssssss                                                                                      [ 45%]
   python/pyarrow/tests/test_io.py ........s..........................x.....x...............s...s................................x................ [ 47%]
   .................                                                                                                                               [ 48%]
   python/pyarrow/tests/test_ipc.py ......s..s.ss..............s........ss.sssssssss....                                                           [ 49%]
   python/pyarrow/tests/test_json.py ..........................                                                                                    [ 49%]
   python/pyarrow/tests/test_memory.py .................                                                                                           [ 50%]
   python/pyarrow/tests/test_misc.py .s...............................................................................................             [ 52%]
   python/pyarrow/tests/test_orc.py sssssssssss                                                                                                    [ 52%]
   python/pyarrow/tests/test_pandas.py sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 55%]
   sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 59%]
   ssssssssssssssssssssssssssssssssssssssssssssssssssssss                                                                                          [ 60%]
   python/pyarrow/tests/test_plasma.py sssssssssssssssssssssssssssssssss                                                                           [ 61%]
   python/pyarrow/tests/test_plasma_tf_op.py s                                                                                                     [ 61%]
   python/pyarrow/tests/test_scalars.py .............................................ssss.s..............s......                                   [ 62%]
   python/pyarrow/tests/test_schema.py ...s........................s...                                                                            [ 63%]
   python/pyarrow/tests/test_serialization.py ........s..ss....................................................................................... [ 66%]
   .............................................................................................ss................................................ [ 69%]
   ................................................................................................................................s.............. [ 72%]
   ............................................................................................................................................... [ 76%]
   ...................s........................................................................................................................... [ 79%]
   ........................................................s..............                                                                         [ 81%]
   python/pyarrow/tests/test_serialization_deprecated.py ..                                                                                        [ 81%]
   python/pyarrow/tests/test_sparse_tensor.py .................................................................................................... [ 84%]
   ................................sssssssssssssssssssssssssssssssss                                                                               [ 85%]
   python/pyarrow/tests/test_strategies.py ...sssss                                                                                                [ 85%]
   python/pyarrow/tests/test_table.py ..................ssssss..................s....s.....................s...................ss.s...........     [ 88%]
   python/pyarrow/tests/test_tensor.py .....................                                                                                       [ 88%]
   python/pyarrow/tests/test_types.py ................s...s..s....................................................                                 [ 90%]
   python/pyarrow/tests/test_util.py s                                                                                                             [ 90%]
   python/pyarrow/tests/parquet/test_basic.py ...ssssssss....sssss..ss...................sssss....ss..s.....................................       [ 92%]
   python/pyarrow/tests/parquet/test_compliant_nested_type.py ssssssss                                                                             [ 93%]
   python/pyarrow/tests/parquet/test_data_types.py ssssssssss..sss...........ss.ss.sssssss.ss                                                      [ 94%]
   python/pyarrow/tests/parquet/test_dataset.py ss..x.ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssx.sssssssssssssssssssssssss..... [ 96%]
                                                                                                                                                   [ 96%]
   python/pyarrow/tests/parquet/test_datetime.py sssssssssssss..sss.                                                                               [ 97%]
   python/pyarrow/tests/parquet/test_encryption.py ssssssssssssssss                                                                                [ 97%]
   python/pyarrow/tests/parquet/test_metadata.py s.ssssssssssssss...s..ss.s                                                                        [ 98%]
   python/pyarrow/tests/parquet/test_pandas.py sssssssssssssssssssssssssssssssssssssssssssssssss                                                   [ 99%]
   python/pyarrow/tests/parquet/test_parquet_file.py ssssss..ssssss                                                                                [ 99%]
   python/pyarrow/tests/parquet/test_parquet_writer.py ss.ssssssssssssssss                                                                         [100%]
   
   ================================================================== warnings summary ===================================================================
   pyarrow/tests/parquet/test_basic.py::test_read_table_doesnt_warn[True]
   pyarrow/tests/parquet/test_basic.py::test_read_table_doesnt_warn[False]
     /Users/alenkafrim/repos/pyarrow-dev-9/lib/python3.9/site-packages/_pytest/python.py:192: PytestRemovedIn8Warning: Passing None has been deprecated.
     See https://docs.pytest.org/en/latest/how-to/capture-warnings.html#additional-use-cases-of-warnings-in-tests for alternatives in common use cases.
       result = testfunction(**testargs)
   
   pyarrow/tests/parquet/test_metadata.py::test_write_metadata
     /Users/alenkafrim/repos/arrow/python/pyarrow/parquet.py:2345: FutureWarning: Parquet format '2.0' pseudo version is deprecated, use '2.4' or '2.6' for fine-grained feature selection
       writer = ParquetWriter(where, schema, **kwargs)
   
   -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
   ======================================= 3001 passed, 1132 skipped, 11 xfailed, 1 xpassed, 3 warnings in 29.25s ========================================
   ```
   
   </pre>
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche closed pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche closed pull request #12522:
URL: https://github.com/apache/arrow/pull/12522


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1082076337


   Benchmark runs are scheduled for baseline = 09a794a6977a3d44cab34db1373c632170d5bae7 and contender = d7fbf5260433dee08ae219e8aec34169cd610652. d7fbf5260433dee08ae219e8aec34169cd610652 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/8e18089ea59546be8bcd20f990ee630e...c4f6194385564fe3adc8b1d67f8ec34f/)
   [Finished :arrow_down:0.71% :arrow_up:0.0%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/ab20b344f53a46e8983e94e9556e72ce...be48b219ba9641f08d77b32d60ba2a96/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/7d2c25745e164215a537d751ce393465...3c18a7fcd8aa47b8a6686cf021875c70/)
   [Finished :arrow_down:0.13% :arrow_up:0.0%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/1ac3d1167959409095e909e69d18c4e4...28606d8b6d094ef2a9a6a85788d5f7d3/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on a change in pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r833177515



##########
File path: python/pyarrow/tests/test_pandas.py
##########
@@ -1044,7 +1044,7 @@ def test_python_datetime_with_pytz_tzinfo(self):
             df = pd.DataFrame({'datetime': values})
             _check_pandas_roundtrip(df)
 
-    @h.given(st.none() | tzst.timezones())
+    @h.given(st.none() | st.timezones())

Review comment:
       Can't remember if this is the correct change deleted with rebasing or not :(




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1067889979


   > `pa.lib.string_to_tzinfo` produces a `pytz` timezone or gives a warning, that the module is not installed.
   
   What does it then return, when it gives a warning? (None?)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r834413860



##########
File path: cpp/src/arrow/python/datetime.cc
##########
@@ -461,18 +564,19 @@ Result<std::string> TzinfoToString(PyObject* tzinfo) {
     return PyTZInfo_utcoffset_hhmm(tzinfo);
   }
 
-  // try to look up zone attribute
-  if (PyObject_HasAttrString(tzinfo, "zone")) {
+  // if pytz is installed and tzinfo is and instance of pytz.BaseTzInfo

Review comment:
       ```suggestion
     // if pytz is installed and tzinfo is an instance of pytz.BaseTzInfo
   ```

##########
File path: cpp/src/arrow/python/datetime.cc
##########
@@ -482,7 +586,9 @@ Result<std::string> TzinfoToString(PyObject* tzinfo) {
 
   // try to look up _filename attribute
   // in case of dateutil.tz object

Review comment:
       Did you mean to keep these comments?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1081635355


   Follow-up JIRA: https://issues.apache.org/jira/browse/ARROW-16056


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1082076337


   Benchmark runs are scheduled for baseline = 09a794a6977a3d44cab34db1373c632170d5bae7 and contender = d7fbf5260433dee08ae219e8aec34169cd610652. d7fbf5260433dee08ae219e8aec34169cd610652 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/8e18089ea59546be8bcd20f990ee630e...c4f6194385564fe3adc8b1d67f8ec34f/)
   [Finished :arrow_down:0.71% :arrow_up:0.0%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/ab20b344f53a46e8983e94e9556e72ce...be48b219ba9641f08d77b32d60ba2a96/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/7d2c25745e164215a537d751ce393465...3c18a7fcd8aa47b8a6686cf021875c70/)
   [Finished :arrow_down:0.13% :arrow_up:0.0%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/1ac3d1167959409095e909e69d18c4e4...28606d8b6d094ef2a9a6a85788d5f7d3/)
   Buildkite builds:
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/409| `d7fbf526` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/395| `d7fbf526` test-mac-arm>
   [Scheduled] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/395| `d7fbf526` ursa-i9-9960x>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/405| `d7fbf526` ursa-thinkcentre-m75q>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/408| `09a794a6` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/394| `09a794a6` test-mac-arm>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/394| `09a794a6` ursa-i9-9960x>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/404| `09a794a6` ursa-thinkcentre-m75q>
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on a change in pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r835335739



##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -96,10 +103,16 @@
     pa.time64('us'),
     pa.time64('ns')
 ])
+if tzst:

Review comment:
       Just a note to myself: I get strange errors if applying this correction ⬆️ 
   
   ```
   python/pyarrow/tests/test_pandas.py:1052: in test_python_datetime_with_pytz_timezone
       _check_pandas_roundtrip(df)
   _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
   
   df =                    datetime
   0 2018-01-01 12:23:45+00:00
   expected =                    datetime
   0 2018-01-01 12:23:45+00:00, use_threads = False, expected_schema = None
   check_dtype = True, schema = None, preserve_index = False, as_batch = False
   
       def _check_pandas_roundtrip(df, expected=None, use_threads=False,
                                   expected_schema=None,
                                   check_dtype=True, schema=None,
                                   preserve_index=False,
                                   as_batch=False):
           klass = pa.RecordBatch if as_batch else pa.Table
           table = klass.from_pandas(df, schema=schema,
                                     preserve_index=preserve_index,
                                     nthreads=2 if use_threads else 1)
           result = table.to_pandas(use_threads=use_threads)
       
           if expected_schema:
               # all occurrences of _check_pandas_roundtrip passes expected_schema
               # without the pandas generated key-value metadata
               assert table.schema.equals(expected_schema)
       
           if expected is None:
               expected = df
       
   >       tm.assert_frame_equal(result, expected, check_dtype=check_dtype,
                                 check_index_type=('equiv' if preserve_index
                                                   else False))
   E       AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="datetime") are different
   E       
   E       Attribute "dtype" are different
   E       [left]:  datetime64[ns, UTC]
   E       [right]: datetime64[ns, UTC]
   
   python/pyarrow/tests/test_pandas.py:98: AssertionError
   ```
   
   Will have to further look into this. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on a change in pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r822683300



##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -266,10 +277,12 @@ def arrays(draw, type, size=None, nullable=True):
         min_datetime = datetime.datetime.fromtimestamp(min_int64 // 10**9)
         max_datetime = datetime.datetime.fromtimestamp(max_int64 // 10**9)
         try:
-            offset_hours = int(ty.tz)
-            tz = pytz.FixedOffset(offset_hours * 60)
+            offset = ty.tz.split(":")
+            offset_hours = int(offset[0])
+            offset_min = int(offset[1])
+            tz = datetime.timedelta(hours=offset_hours, minutes=offset_min)
         except ValueError:
-            tz = pytz.timezone(ty.tz)
+            tz = zoneinfo.ZoneInfo(ty.tz)

Review comment:
       I added a `pytest.skip()` in case timezone is None and also used `timezones` from `pyarrow.tests.strategies` instead of `hypothesis.extra.pytz` (`tzst`) where relevant.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1062800945


   Revision: 5e531a0ddb23d927f9f9edefc7e13a50ca973e95
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-1733](https://github.com/ursacomputing/crossbow/branches/all?query=actions-1733)
   
   |Task|Status|
   |----|------|
   |test-conda-python-3.8-hypothesis|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1733-github-test-conda-python-3.8-hypothesis)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1733-github-test-conda-python-3.8-hypothesis)|


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1082076337


   Benchmark runs are scheduled for baseline = 09a794a6977a3d44cab34db1373c632170d5bae7 and contender = d7fbf5260433dee08ae219e8aec34169cd610652. d7fbf5260433dee08ae219e8aec34169cd610652 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/8e18089ea59546be8bcd20f990ee630e...c4f6194385564fe3adc8b1d67f8ec34f/)
   [Finished :arrow_down:0.71% :arrow_up:0.0%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/ab20b344f53a46e8983e94e9556e72ce...be48b219ba9641f08d77b32d60ba2a96/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/7d2c25745e164215a537d751ce393465...3c18a7fcd8aa47b8a6686cf021875c70/)
   [Finished :arrow_down:0.13% :arrow_up:0.0%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/1ac3d1167959409095e909e69d18c4e4...28606d8b6d094ef2a9a6a85788d5f7d3/)
   Buildkite builds:
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/409| `d7fbf526` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/395| `d7fbf526` test-mac-arm>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/405| `d7fbf526` ursa-thinkcentre-m75q>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/408| `09a794a6` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/394| `09a794a6` test-mac-arm>
   [Scheduled] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/394| `09a794a6` ursa-i9-9960x>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/404| `09a794a6` ursa-thinkcentre-m75q>
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1054208912


   https://issues.apache.org/jira/browse/ARROW-15580


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on a change in pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r817522143



##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -224,6 +231,9 @@ def _pymap(draw, key_type, value_type, size, nullable=True):
 
 @st.composite
 def arrays(draw, type, size=None, nullable=True):
+    pytest.importorskip("pytz")

Review comment:
       There are couple of issues I am tying to solve to make this work:
   1. It seems all the tests in `test_strategies.py` are being skipped?
   2. The code I need to change to use `zoneinfo` is here:
   https://github.com/apache/arrow/blob/69682ec944d053a60316c0af903a167eca0f24db/python/pyarrow/tests/strategies.py#L263-L274
   
   but: * I would need to install `timedelta,` which we do not depend on currently, to get the offset into a `timedelta` instance. ** The current code for the offset doesn't seem to work if I play with it locally. Example:
   
   ```python
   >>> import pyarrow as pa
   >>> 
   >>> ty = pa.timestamp('s', tz='+07:30')
   >>> pa.types.is_timestamp(ty)
   True
   
   >>> offset_hours = int(ty.tz) # This errors
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   ValueError: invalid literal for int() with base 10: '+07:30'
   
   >>> import pytz
   >>> tz = pytz.timezone(ty.tz) # This also errors
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/Users/alenkafrim/repos/pyarrow-dev-9/lib/python3.9/site-packages/pytz/__init__.py", line 188, in timezone
       raise UnknownTimeZoneError(zone)
   pytz.exceptions.UnknownTimeZoneError: '+07:30'
   ```
   This works:
   ```python
   >>> ty = pa.timestamp('s', tz='America/New_York')
   >>> pa.types.is_timestamp(ty)
   True
   >>> pytz.timezone(ty.tz)
   <DstTzInfo 'America/New_York' LMT-1 day, 19:04:00 STD>
   ```
   
   My idea for a current change would be as follows. But it uses `timedelta` and the tests are being skipped so I am not sure if this works.
   ```python
     elif pa.types.is_timestamp(ty):
         min_int64 = -(2**63)
         max_int64 = 2**63 - 1
         min_datetime = datetime.datetime.fromtimestamp(min_int64 // 10**9)
         max_datetime = datetime.datetime.fromtimestamp(max_int64 // 10**9)
         try:
             offset = ty.tz.split(":")
             offset_hours = int(offset[0])
             offset_min = int(offset[1])
             tz = timedelta(hours=offset_hours, minutes=offset_min)
         except ValueError:
             tz = ZoneInfo(ty.tz)
         value = st.datetimes(timezones=st.just(tz), min_value=min_datetime,
                              max_value=max_datetime)
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1079097011


   I still need to resolve the issue of the failing test (`test_python_datetime_with_pytz_timezone`) and then I need to run the tests again with and without `pytz` installed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r819367694



##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -266,10 +278,12 @@ def arrays(draw, type, size=None, nullable=True):
         min_datetime = datetime.datetime.fromtimestamp(min_int64 // 10**9)
         max_datetime = datetime.datetime.fromtimestamp(max_int64 // 10**9)
         try:
-            offset_hours = int(ty.tz)
-            tz = pytz.FixedOffset(offset_hours * 60)
+            offset = ty.tz.split(":")
+            offset_hours = int(offset[0])
+            offset_min = int(offset[1])
+            tz = datetime.timedelta(hours=offset_hours, minutes=offset_min)
         except ValueError:
-            tz = pytz.timezone(ty.tz)
+            tz = ZoneInfo(ty.tz)

Review comment:
       ```suggestion
               tz = zoneinfo.ZoneInfo(ty.tz)
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r822521291



##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -224,6 +233,8 @@ def _pymap(draw, key_type, value_type, size, nullable=True):
 
 @st.composite
 def arrays(draw, type, size=None, nullable=True):
+    zoneinfo = pytest.importorskip("zoneinfo")

Review comment:
       I _think_ we can move this import to the `elif pa.types.is_timestamp(ty):` block, so the test gets only skipped when using a timestamp type 

##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -96,10 +99,16 @@
     pa.time64('us'),
     pa.time64('ns')
 ])
+if tzst:
+    timezones = st.one_of(st.none(), tzst.timezones())
+elif zoneinfo:

Review comment:
       `zoneinfo` is not yet defined or imported at this point?

##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -266,10 +277,12 @@ def arrays(draw, type, size=None, nullable=True):
         min_datetime = datetime.datetime.fromtimestamp(min_int64 // 10**9)
         max_datetime = datetime.datetime.fromtimestamp(max_int64 // 10**9)
         try:
-            offset_hours = int(ty.tz)
-            tz = pytz.FixedOffset(offset_hours * 60)
+            offset = ty.tz.split(":")
+            offset_hours = int(offset[0])
+            offset_min = int(offset[1])
+            tz = datetime.timedelta(hours=offset_hours, minutes=offset_min)

Review comment:
       We are not actually using this, I think? (since the `tz` that gets generated above is either None or a zone, but never a fixed offset string?)
   
   (we could update the `timezones` strategy to also included fixed offsets, but since that's not supported out of the box by hypothesis, I think it is fine to leave this as a potential follow-up)

##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -266,10 +277,12 @@ def arrays(draw, type, size=None, nullable=True):
         min_datetime = datetime.datetime.fromtimestamp(min_int64 // 10**9)
         max_datetime = datetime.datetime.fromtimestamp(max_int64 // 10**9)
         try:
-            offset_hours = int(ty.tz)
-            tz = pytz.FixedOffset(offset_hours * 60)
+            offset = ty.tz.split(":")
+            offset_hours = int(offset[0])
+            offset_min = int(offset[1])
+            tz = datetime.timedelta(hours=offset_hours, minutes=offset_min)
         except ValueError:
-            tz = pytz.timezone(ty.tz)
+            tz = zoneinfo.ZoneInfo(ty.tz)

Review comment:
       This might need a `if ty.tz is not None:` ? (since the timezone strategy can now also return None)

##########
File path: python/pyarrow/tests/test_pandas.py
##########
@@ -25,12 +25,14 @@
 from datetime import date, datetime, time, timedelta, timezone
 
 import hypothesis as h
-import hypothesis.extra.pytz as tzst
+try:
+    import hypothesis.extra.pytz as tzst

Review comment:
       I _think_ we could also use the `timezones` strategy from `pyarrow.tests.strategies` (which is already import as `past`), so then we don't have to redefine it here with the `if tzst: .. else: ..`

##########
File path: python/pyarrow/tests/test_convert_builtin.py
##########
@@ -995,6 +994,9 @@ def test_sequence_timestamp():
     'ns'
 ])
 def test_sequence_timestamp_with_timezone(timezone, unit):
+    pytest.importorskip("pytz")
+    import pytz

Review comment:
       I think you can simplify this to a single line as `pytz = pytest.importorskip("pytz")`
   
   (and the same for all other occurrences)

##########
File path: python/pyarrow/tests/test_types.py
##########
@@ -23,10 +23,12 @@
 
 import pickle
 import pytest
-import pytz
 import hypothesis as h
 import hypothesis.strategies as st
-import hypothesis.extra.pytz as tzst
+try:
+    import hypothesis.extra.pytz as tzst

Review comment:
       same here




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on a change in pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r822674499



##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -266,10 +277,12 @@ def arrays(draw, type, size=None, nullable=True):
         min_datetime = datetime.datetime.fromtimestamp(min_int64 // 10**9)
         max_datetime = datetime.datetime.fromtimestamp(max_int64 // 10**9)
         try:
-            offset_hours = int(ty.tz)
-            tz = pytz.FixedOffset(offset_hours * 60)
+            offset = ty.tz.split(":")
+            offset_hours = int(offset[0])
+            offset_min = int(offset[1])
+            tz = datetime.timedelta(hours=offset_hours, minutes=offset_min)

Review comment:
       Yes, that is correct, this isn't actually tested. But it would be good to update the `timezone` strategy to include fixed offsets. Will make a JIRA for it and leave this part of the code as is.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1081494633


   I am not sure if `write_dataset` CI errors are connected to this PR. Am trying to build PyArrow with S3 to run the full tests locally.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot commented on pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
ursabot commented on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1082076337


   Benchmark runs are scheduled for baseline = 09a794a6977a3d44cab34db1373c632170d5bae7 and contender = d7fbf5260433dee08ae219e8aec34169cd610652. d7fbf5260433dee08ae219e8aec34169cd610652 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Scheduled] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/8e18089ea59546be8bcd20f990ee630e...c4f6194385564fe3adc8b1d67f8ec34f/)
   [Scheduled] [test-mac-arm](https://conbench.ursa.dev/compare/runs/ab20b344f53a46e8983e94e9556e72ce...be48b219ba9641f08d77b32d60ba2a96/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/7d2c25745e164215a537d751ce393465...3c18a7fcd8aa47b8a6686cf021875c70/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/1ac3d1167959409095e909e69d18c4e4...28606d8b6d094ef2a9a6a85788d5f7d3/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1082076337


   Benchmark runs are scheduled for baseline = 09a794a6977a3d44cab34db1373c632170d5bae7 and contender = d7fbf5260433dee08ae219e8aec34169cd610652. d7fbf5260433dee08ae219e8aec34169cd610652 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Failed] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/8e18089ea59546be8bcd20f990ee630e...c4f6194385564fe3adc8b1d67f8ec34f/)
   [Failed] [test-mac-arm](https://conbench.ursa.dev/compare/runs/ab20b344f53a46e8983e94e9556e72ce...be48b219ba9641f08d77b32d60ba2a96/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/7d2c25745e164215a537d751ce393465...3c18a7fcd8aa47b8a6686cf021875c70/)
   [Failed] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/1ac3d1167959409095e909e69d18c4e4...28606d8b6d094ef2a9a6a85788d5f7d3/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r819366943



##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -224,6 +233,9 @@ def _pymap(draw, key_type, value_type, size, nullable=True):
 
 @st.composite
 def arrays(draw, type, size=None, nullable=True):
+    pytest.importorskip("ZoneInfo")
+    import ZoneInfo

Review comment:
       ```suggestion
       zoneinfo = pytest.importorskip("zoneinfo")
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1062905029


   It seems it is not actually running the hypothesis tests in CI -> https://github.com/apache/arrow/pull/12588


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#issuecomment-1074863921


   I also need to add the option for using `zoneinfo` in `StringToTzinfo`. Am on it today so we can close this asap.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12522: ARROW-15580: [Python] PyArrow imports pytz, which is not included as dependency

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r832306534



##########
File path: cpp/src/arrow/python/datetime.cc
##########
@@ -426,8 +427,19 @@ Result<PyObject*> StringToTzinfo(const std::string& tz) {
     return tzinfo;
   }
 
+  if (internal::ImportModule("zoneinfo", &zoneinfo).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("zoneinfo", &zoneinfo));
+
+    OwnedRef class_zoneinfo;
+    RETURN_NOT_OK(internal::ImportFromModule(zoneinfo.obj(), "ZoneInfo", &class_zoneinfo));

Review comment:
       And in that case, we can probably fall back to the stdlib `datetime.timezone`, which handles fixed offsets? 

##########
File path: cpp/src/arrow/python/datetime.cc
##########
@@ -426,8 +427,19 @@ Result<PyObject*> StringToTzinfo(const std::string& tz) {
     return tzinfo;
   }
 
+  if (internal::ImportModule("zoneinfo", &zoneinfo).ok()) {
+    RETURN_NOT_OK(internal::ImportModule("zoneinfo", &zoneinfo));
+
+    OwnedRef class_zoneinfo;
+    RETURN_NOT_OK(internal::ImportFromModule(zoneinfo.obj(), "ZoneInfo", &class_zoneinfo));

Review comment:
       You will probably need to check `MatchFixedOffset`, to ensure we don't pass a fixed offset string to `zoneinfo.ZoneInfo`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] AlenkaF commented on a change in pull request #12522: ARROW-15580: [Python] Make pytz an actual optional dependency of PyArrow

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r835295309



##########
File path: cpp/src/arrow/python/datetime.cc
##########
@@ -482,7 +586,9 @@ Result<std::string> TzinfoToString(PyObject* tzinfo) {
 
   // try to look up _filename attribute
   // in case of dateutil.tz object

Review comment:
       No, I don't think so. Will delete them.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org