You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/20 03:15:38 UTC

[GitHub] [arrow] cyb70289 opened a new pull request #9274: ARROW-11299: [C++][Python] Fix invalid-offsetof warnings

cyb70289 opened a new pull request #9274:
URL: https://github.com/apache/arrow/pull/9274


   Many `invalid-offsetof` warnings in python building:
   ...
   ~/arrow/python/build/temp.linux-x86_64-3.6/_compute.cpp:26034:146:
   warning: offsetof within non-standard-layout type ‘__pyx_obj_7pyarrow_8_compute__CastOptions’ is undefined [-Winvalid-offsetof]
    x_type_7pyarrow_8_compute__CastOptions.tp_weaklistoffset = offsetof(struct __pyx_obj_7pyarrow_8_compute__CastOptions, __pyx_base.__pyx_base.__weakref__);
   ...
   
   Cython generated code calls `offsetof` on FunctionOptions based classes.
   `offsetof` only applies to C++ classes with `standard layout type` [1],
   which excludes virtual functions.
   
   Fix the warning by removing virtual destructor in FunctionOptions class.
   No FunctionOptions derived class has explicit destructor, so removing
   the virtual destructor looks safe now. But it may cause memory leak in
   the future when some derived classes have explicit destructors.
   
   Any better way to handle this issue?
   
   [1] https://en.cppreference.com/w/cpp/named_req/StandardLayoutType


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] cyb70289 commented on pull request #9274: ARROW-11299: [C++][Python] Fix invalid-offsetof warnings

Posted by GitBox <gi...@apache.org>.
cyb70289 commented on pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#issuecomment-764217852


   > @cyb70289 `arrow::dataset::Expression::Call::options` requires that virtual destructor since a `shared_ptr<FunctionOptions>` is used to store subclasses of FunctionOptions. Removing this virtual destructor results in a resource leak since the derived classes' (implicit) destructors are not being called, leaking data members such as `arrow::compute::SetLookupOptions::value_set` which are not destroyed.
   
   Thanks @bkietz , I missed this point.
   Will update according to your comment. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] cyb70289 commented on a change in pull request #9274: ARROW-11299: [Python] Fix invalid-offsetof warnings

Posted by GitBox <gi...@apache.org>.
cyb70289 commented on a change in pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#discussion_r562484017



##########
File path: python/pyarrow/_compute.pyx
##########
@@ -481,88 +481,89 @@ cdef class FunctionOptions(_Weakrefable):
 
 cdef class _CastOptions(FunctionOptions):
     cdef:
-        CCastOptions options
+        unique_ptr[CCastOptions] options
 
     __slots__ = ()  # avoid mistakingly creating attributes
 
     cdef const CFunctionOptions* get_options(self) except NULL:
-        return &self.options
+        return self.options.get()
 
     def _set_options(self, DataType target_type, allow_int_overflow,
                      allow_time_truncate, allow_time_overflow,
                      allow_float_truncate, allow_invalid_utf8):
+        self.options.reset(new CCastOptions())
         self._set_type(target_type)
         if allow_int_overflow is not None:
-            self.allow_int_overflow = allow_int_overflow
+            deref(self.options).allow_int_overflow = allow_int_overflow

Review comment:
       Looks `options` is missed in current code.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] cyb70289 commented on a change in pull request #9274: ARROW-11299: [Python] Fix invalid-offsetof warnings

Posted by GitBox <gi...@apache.org>.
cyb70289 commented on a change in pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#discussion_r562483502



##########
File path: python/pyarrow/_compute.pyx
##########
@@ -481,88 +481,89 @@ cdef class FunctionOptions(_Weakrefable):
 
 cdef class _CastOptions(FunctionOptions):
     cdef:
-        CCastOptions options
+        unique_ptr[CCastOptions] options
 
     __slots__ = ()  # avoid mistakingly creating attributes
 
     cdef const CFunctionOptions* get_options(self) except NULL:
-        return &self.options
+        return self.options.get()
 
     def _set_options(self, DataType target_type, allow_int_overflow,
                      allow_time_truncate, allow_time_overflow,
                      allow_float_truncate, allow_invalid_utf8):
+        self.options.reset(new CCastOptions())
         self._set_type(target_type)
         if allow_int_overflow is not None:
-            self.allow_int_overflow = allow_int_overflow
+            deref(self.options).allow_int_overflow = allow_int_overflow
         if allow_time_truncate is not None:
-            self.allow_time_truncate = allow_time_truncate
+            deref(self.options).allow_time_truncate = allow_time_truncate
         if allow_time_overflow is not None:
-            self.allow_time_overflow = allow_time_overflow
+            deref(self.options).allow_time_overflow = allow_time_overflow
         if allow_float_truncate is not None:
-            self.allow_float_truncate = allow_float_truncate
+            deref(self.options).allow_float_truncate = allow_float_truncate
         if allow_invalid_utf8 is not None:
-            self.allow_invalid_utf8 = allow_invalid_utf8
+            deref(self.options).allow_invalid_utf8 = allow_invalid_utf8
 
     def _set_type(self, target_type=None):
         if target_type is not None:
-            self.options.to_type = (
+            deref(self.options).to_type = (
                 (<DataType> ensure_type(target_type)).sp_type
             )
 
     def _set_safe(self):
-        self.options = CCastOptions.Safe()
+        self.options.reset(new CCastOptions(CCastOptions.Safe()))

Review comment:
       Any way to assign `self.options` directly?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] bkietz commented on pull request #9274: ARROW-11299: [C++][Python] Fix invalid-offsetof warnings

Posted by GitBox <gi...@apache.org>.
bkietz commented on pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#issuecomment-763833521


   I think the best way to resolve this issue is to avoid direct instances of derived classes of FunctionOptions in Cython, since Cython inappropriately applies `offsetof`. For example, note that the warning is emitted for [MinMaxOptions](https://github.com/apache/arrow/blob/fdc63acdb68adade0d68e232dfcc892dc0ecf700/python/pyarrow/_compute.pyx#L670) but not for [SetLookupOptions](https://github.com/apache/arrow/blob/fdc63acdb68adade0d68e232dfcc892dc0ecf700/python/pyarrow/_compute.pyx#L732) (where the options are held by unique_ptr rather than being a direct member).
   
   I'd recommend the following approach: rewrite `_compute.pyx::FunctionOptions` to hold a `shared_ptr[CFunctionOptions]`, and let the `cdef` classes acquire a non owning pointer to the appropriate FunctionOptions subclass. For example:
   
   `_compute.pxd`
   ```python
   cdef class FunctionOptions(_Weakrefable):
       cdef:
           shared_ptr[CFunctionOptions] options
   
       cdef const CFunctionOptions* get_options(self) except NULL
   ```
   
   `_compute.pyx`
   ```python
   cdef class _CastOptions(FunctionOptions):
       cdef:
           const CCastOptions* cast_options
   
       __slots__ = ()  # avoid mistakingly creating attributes
   
       cdef const CFunctionOptions* get_options(self) except NULL:
           return self.cast_options
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] bkietz commented on pull request #9274: ARROW-11299: [C++][Python] Fix invalid-offsetof warnings

Posted by GitBox <gi...@apache.org>.
bkietz commented on pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#issuecomment-763823491


   @cyb70289 `arrow::dataset::Expression::Call::options` requires that virtual destructor since a `shared_ptr<FunctionOptions>` is used to store subclasses of FunctionOptions. Removing this virtual destructor results in a resource leak since the derived classes' (implicit) destructors are not being called, leaking data members such as `arrow::compute::SetLookupOptions::value_set` which are not destroyed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] cyb70289 commented on pull request #9274: ARROW-11299: [C++][Python] Fix invalid-offsetof warnings

Posted by GitBox <gi...@apache.org>.
cyb70289 commented on pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#issuecomment-763319885


   All debug build failed due to this change. FunctionOptions is supposed to be polymorphic, but looks it not necessary.
   Would like to hear comments before moving on.
   @bkietz 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] bkietz commented on a change in pull request #9274: ARROW-11299: [Python] Fix invalid-offsetof warnings

Posted by GitBox <gi...@apache.org>.
bkietz commented on a change in pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#discussion_r562633900



##########
File path: python/pyarrow/_compute.pyx
##########
@@ -481,88 +481,89 @@ cdef class FunctionOptions(_Weakrefable):
 
 cdef class _CastOptions(FunctionOptions):
     cdef:
-        CCastOptions options
+        unique_ptr[CCastOptions] options
 
     __slots__ = ()  # avoid mistakingly creating attributes
 
     cdef const CFunctionOptions* get_options(self) except NULL:
-        return &self.options
+        return self.options.get()
 
     def _set_options(self, DataType target_type, allow_int_overflow,
                      allow_time_truncate, allow_time_overflow,
                      allow_float_truncate, allow_invalid_utf8):
+        self.options.reset(new CCastOptions())
         self._set_type(target_type)
         if allow_int_overflow is not None:
-            self.allow_int_overflow = allow_int_overflow
+            deref(self.options).allow_int_overflow = allow_int_overflow

Review comment:
       I'm not sure what you mean here. Note that `self.allow_int_overflow = ` is a property setter which modifies the corresponding field in CCastOptions, so substituting `deref(self.options).allow_int_overflow = ` here is unnecessary




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] bkietz commented on a change in pull request #9274: ARROW-11299: [Python] Fix invalid-offsetof warnings

Posted by GitBox <gi...@apache.org>.
bkietz commented on a change in pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#discussion_r562634717



##########
File path: python/pyarrow/_compute.pyx
##########
@@ -481,88 +481,89 @@ cdef class FunctionOptions(_Weakrefable):
 
 cdef class _CastOptions(FunctionOptions):
     cdef:
-        CCastOptions options
+        unique_ptr[CCastOptions] options
 
     __slots__ = ()  # avoid mistakingly creating attributes
 
     cdef const CFunctionOptions* get_options(self) except NULL:
-        return &self.options
+        return self.options.get()
 
     def _set_options(self, DataType target_type, allow_int_overflow,
                      allow_time_truncate, allow_time_overflow,
                      allow_float_truncate, allow_invalid_utf8):
+        self.options.reset(new CCastOptions())
         self._set_type(target_type)
         if allow_int_overflow is not None:
-            self.allow_int_overflow = allow_int_overflow
+            deref(self.options).allow_int_overflow = allow_int_overflow
         if allow_time_truncate is not None:
-            self.allow_time_truncate = allow_time_truncate
+            deref(self.options).allow_time_truncate = allow_time_truncate
         if allow_time_overflow is not None:
-            self.allow_time_overflow = allow_time_overflow
+            deref(self.options).allow_time_overflow = allow_time_overflow
         if allow_float_truncate is not None:
-            self.allow_float_truncate = allow_float_truncate
+            deref(self.options).allow_float_truncate = allow_float_truncate
         if allow_invalid_utf8 is not None:
-            self.allow_invalid_utf8 = allow_invalid_utf8
+            deref(self.options).allow_invalid_utf8 = allow_invalid_utf8
 
     def _set_type(self, target_type=None):
         if target_type is not None:
-            self.options.to_type = (
+            deref(self.options).to_type = (
                 (<DataType> ensure_type(target_type)).sp_type
             )
 
     def _set_safe(self):
-        self.options = CCastOptions.Safe()
+        self.options.reset(new CCastOptions(CCastOptions.Safe()))

Review comment:
       I think this is the most efficient way to write this in cython.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] cyb70289 edited a comment on pull request #9274: ARROW-11299: [C++][Python] Fix invalid-offsetof warnings

Posted by GitBox <gi...@apache.org>.
cyb70289 edited a comment on pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#issuecomment-763319885


   All debug build failed due to this change. FunctionOptions is supposed to be polymorphic, but looks not necessary.
   Would like to hear comments before moving on.
   @bkietz @pitrou 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] cyb70289 commented on pull request #9274: ARROW-11299: [C++][Python] Fix invalid-offsetof warnings

Posted by GitBox <gi...@apache.org>.
cyb70289 commented on pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#issuecomment-764217852


   > @cyb70289 `arrow::dataset::Expression::Call::options` requires that virtual destructor since a `shared_ptr<FunctionOptions>` is used to store subclasses of FunctionOptions. Removing this virtual destructor results in a resource leak since the derived classes' (implicit) destructors are not being called, leaking data members such as `arrow::compute::SetLookupOptions::value_set` which are not destroyed.
   
   Thanks @bkietz , I missed this point.
   Will update according to your comment. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #9274: ARROW-11299: [C++][Python] Fix invalid-offsetof warnings

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#issuecomment-763300961


   https://issues.apache.org/jira/browse/ARROW-11299


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] cyb70289 commented on a change in pull request #9274: ARROW-11299: [Python] Fix invalid-offsetof warnings

Posted by GitBox <gi...@apache.org>.
cyb70289 commented on a change in pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#discussion_r562660367



##########
File path: python/pyarrow/_compute.pyx
##########
@@ -481,88 +481,89 @@ cdef class FunctionOptions(_Weakrefable):
 
 cdef class _CastOptions(FunctionOptions):
     cdef:
-        CCastOptions options
+        unique_ptr[CCastOptions] options
 
     __slots__ = ()  # avoid mistakingly creating attributes
 
     cdef const CFunctionOptions* get_options(self) except NULL:
-        return &self.options
+        return self.options.get()
 
     def _set_options(self, DataType target_type, allow_int_overflow,
                      allow_time_truncate, allow_time_overflow,
                      allow_float_truncate, allow_invalid_utf8):
+        self.options.reset(new CCastOptions())
         self._set_type(target_type)
         if allow_int_overflow is not None:
-            self.allow_int_overflow = allow_int_overflow
+            deref(self.options).allow_int_overflow = allow_int_overflow

Review comment:
       Restored to original code.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] bkietz closed pull request #9274: ARROW-11299: [Python] Fix invalid-offsetof warnings

Posted by GitBox <gi...@apache.org>.
bkietz closed pull request #9274:
URL: https://github.com/apache/arrow/pull/9274


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org