You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/20 03:15:38 UTC
[GitHub] [arrow] cyb70289 opened a new pull request #9274: ARROW-11299: [C++][Python] Fix invalid-offsetof warnings
cyb70289 opened a new pull request #9274:
URL: https://github.com/apache/arrow/pull/9274
Many `invalid-offsetof` warnings in python building:
...
~/arrow/python/build/temp.linux-x86_64-3.6/_compute.cpp:26034:146:
warning: offsetof within non-standard-layout type ‘__pyx_obj_7pyarrow_8_compute__CastOptions’ is undefined [-Winvalid-offsetof]
x_type_7pyarrow_8_compute__CastOptions.tp_weaklistoffset = offsetof(struct __pyx_obj_7pyarrow_8_compute__CastOptions, __pyx_base.__pyx_base.__weakref__);
...
Cython generated code calls `offsetof` on FunctionOptions based classes.
`offsetof` only applies to C++ classes with `standard layout type` [1],
which excludes virtual functions.
Fix the warning by removing virtual destructor in FunctionOptions class.
No FunctionOptions derived class has explicit destructor, so removing
the virtual destructor looks safe now. But it may cause memory leak in
the future when some derived classes have explicit destructors.
Any better way to handle this issue?
[1] https://en.cppreference.com/w/cpp/named_req/StandardLayoutType
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] cyb70289 commented on pull request #9274: ARROW-11299: [C++][Python] Fix invalid-offsetof warnings
Posted by GitBox <gi...@apache.org>.
cyb70289 commented on pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#issuecomment-764217852
> @cyb70289 `arrow::dataset::Expression::Call::options` requires that virtual destructor since a `shared_ptr<FunctionOptions>` is used to store subclasses of FunctionOptions. Removing this virtual destructor results in a resource leak since the derived classes' (implicit) destructors are not being called, leaking data members such as `arrow::compute::SetLookupOptions::value_set` which are not destroyed.
Thanks @bkietz , I missed this point.
Will update according to your comment.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] cyb70289 commented on a change in pull request #9274: ARROW-11299: [Python] Fix invalid-offsetof warnings
Posted by GitBox <gi...@apache.org>.
cyb70289 commented on a change in pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#discussion_r562484017
##########
File path: python/pyarrow/_compute.pyx
##########
@@ -481,88 +481,89 @@ cdef class FunctionOptions(_Weakrefable):
cdef class _CastOptions(FunctionOptions):
cdef:
- CCastOptions options
+ unique_ptr[CCastOptions] options
__slots__ = () # avoid mistakingly creating attributes
cdef const CFunctionOptions* get_options(self) except NULL:
- return &self.options
+ return self.options.get()
def _set_options(self, DataType target_type, allow_int_overflow,
allow_time_truncate, allow_time_overflow,
allow_float_truncate, allow_invalid_utf8):
+ self.options.reset(new CCastOptions())
self._set_type(target_type)
if allow_int_overflow is not None:
- self.allow_int_overflow = allow_int_overflow
+ deref(self.options).allow_int_overflow = allow_int_overflow
Review comment:
Looks `options` is missed in current code.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] cyb70289 commented on a change in pull request #9274: ARROW-11299: [Python] Fix invalid-offsetof warnings
Posted by GitBox <gi...@apache.org>.
cyb70289 commented on a change in pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#discussion_r562483502
##########
File path: python/pyarrow/_compute.pyx
##########
@@ -481,88 +481,89 @@ cdef class FunctionOptions(_Weakrefable):
cdef class _CastOptions(FunctionOptions):
cdef:
- CCastOptions options
+ unique_ptr[CCastOptions] options
__slots__ = () # avoid mistakingly creating attributes
cdef const CFunctionOptions* get_options(self) except NULL:
- return &self.options
+ return self.options.get()
def _set_options(self, DataType target_type, allow_int_overflow,
allow_time_truncate, allow_time_overflow,
allow_float_truncate, allow_invalid_utf8):
+ self.options.reset(new CCastOptions())
self._set_type(target_type)
if allow_int_overflow is not None:
- self.allow_int_overflow = allow_int_overflow
+ deref(self.options).allow_int_overflow = allow_int_overflow
if allow_time_truncate is not None:
- self.allow_time_truncate = allow_time_truncate
+ deref(self.options).allow_time_truncate = allow_time_truncate
if allow_time_overflow is not None:
- self.allow_time_overflow = allow_time_overflow
+ deref(self.options).allow_time_overflow = allow_time_overflow
if allow_float_truncate is not None:
- self.allow_float_truncate = allow_float_truncate
+ deref(self.options).allow_float_truncate = allow_float_truncate
if allow_invalid_utf8 is not None:
- self.allow_invalid_utf8 = allow_invalid_utf8
+ deref(self.options).allow_invalid_utf8 = allow_invalid_utf8
def _set_type(self, target_type=None):
if target_type is not None:
- self.options.to_type = (
+ deref(self.options).to_type = (
(<DataType> ensure_type(target_type)).sp_type
)
def _set_safe(self):
- self.options = CCastOptions.Safe()
+ self.options.reset(new CCastOptions(CCastOptions.Safe()))
Review comment:
Any way to assign `self.options` directly?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] bkietz commented on pull request #9274: ARROW-11299: [C++][Python] Fix invalid-offsetof warnings
Posted by GitBox <gi...@apache.org>.
bkietz commented on pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#issuecomment-763833521
I think the best way to resolve this issue is to avoid direct instances of derived classes of FunctionOptions in Cython, since Cython inappropriately applies `offsetof`. For example, note that the warning is emitted for [MinMaxOptions](https://github.com/apache/arrow/blob/fdc63acdb68adade0d68e232dfcc892dc0ecf700/python/pyarrow/_compute.pyx#L670) but not for [SetLookupOptions](https://github.com/apache/arrow/blob/fdc63acdb68adade0d68e232dfcc892dc0ecf700/python/pyarrow/_compute.pyx#L732) (where the options are held by unique_ptr rather than being a direct member).
I'd recommend the following approach: rewrite `_compute.pyx::FunctionOptions` to hold a `shared_ptr[CFunctionOptions]`, and let the `cdef` classes acquire a non owning pointer to the appropriate FunctionOptions subclass. For example:
`_compute.pxd`
```python
cdef class FunctionOptions(_Weakrefable):
cdef:
shared_ptr[CFunctionOptions] options
cdef const CFunctionOptions* get_options(self) except NULL
```
`_compute.pyx`
```python
cdef class _CastOptions(FunctionOptions):
cdef:
const CCastOptions* cast_options
__slots__ = () # avoid mistakingly creating attributes
cdef const CFunctionOptions* get_options(self) except NULL:
return self.cast_options
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] bkietz commented on pull request #9274: ARROW-11299: [C++][Python] Fix invalid-offsetof warnings
Posted by GitBox <gi...@apache.org>.
bkietz commented on pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#issuecomment-763823491
@cyb70289 `arrow::dataset::Expression::Call::options` requires that virtual destructor since a `shared_ptr<FunctionOptions>` is used to store subclasses of FunctionOptions. Removing this virtual destructor results in a resource leak since the derived classes' (implicit) destructors are not being called, leaking data members such as `arrow::compute::SetLookupOptions::value_set` which are not destroyed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] cyb70289 commented on pull request #9274: ARROW-11299: [C++][Python] Fix invalid-offsetof warnings
Posted by GitBox <gi...@apache.org>.
cyb70289 commented on pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#issuecomment-763319885
All debug build failed due to this change. FunctionOptions is supposed to be polymorphic, but looks it not necessary.
Would like to hear comments before moving on.
@bkietz
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] bkietz commented on a change in pull request #9274: ARROW-11299: [Python] Fix invalid-offsetof warnings
Posted by GitBox <gi...@apache.org>.
bkietz commented on a change in pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#discussion_r562633900
##########
File path: python/pyarrow/_compute.pyx
##########
@@ -481,88 +481,89 @@ cdef class FunctionOptions(_Weakrefable):
cdef class _CastOptions(FunctionOptions):
cdef:
- CCastOptions options
+ unique_ptr[CCastOptions] options
__slots__ = () # avoid mistakingly creating attributes
cdef const CFunctionOptions* get_options(self) except NULL:
- return &self.options
+ return self.options.get()
def _set_options(self, DataType target_type, allow_int_overflow,
allow_time_truncate, allow_time_overflow,
allow_float_truncate, allow_invalid_utf8):
+ self.options.reset(new CCastOptions())
self._set_type(target_type)
if allow_int_overflow is not None:
- self.allow_int_overflow = allow_int_overflow
+ deref(self.options).allow_int_overflow = allow_int_overflow
Review comment:
I'm not sure what you mean here. Note that `self.allow_int_overflow = ` is a property setter which modifies the corresponding field in CCastOptions, so substituting `deref(self.options).allow_int_overflow = ` here is unnecessary
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] bkietz commented on a change in pull request #9274: ARROW-11299: [Python] Fix invalid-offsetof warnings
Posted by GitBox <gi...@apache.org>.
bkietz commented on a change in pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#discussion_r562634717
##########
File path: python/pyarrow/_compute.pyx
##########
@@ -481,88 +481,89 @@ cdef class FunctionOptions(_Weakrefable):
cdef class _CastOptions(FunctionOptions):
cdef:
- CCastOptions options
+ unique_ptr[CCastOptions] options
__slots__ = () # avoid mistakingly creating attributes
cdef const CFunctionOptions* get_options(self) except NULL:
- return &self.options
+ return self.options.get()
def _set_options(self, DataType target_type, allow_int_overflow,
allow_time_truncate, allow_time_overflow,
allow_float_truncate, allow_invalid_utf8):
+ self.options.reset(new CCastOptions())
self._set_type(target_type)
if allow_int_overflow is not None:
- self.allow_int_overflow = allow_int_overflow
+ deref(self.options).allow_int_overflow = allow_int_overflow
if allow_time_truncate is not None:
- self.allow_time_truncate = allow_time_truncate
+ deref(self.options).allow_time_truncate = allow_time_truncate
if allow_time_overflow is not None:
- self.allow_time_overflow = allow_time_overflow
+ deref(self.options).allow_time_overflow = allow_time_overflow
if allow_float_truncate is not None:
- self.allow_float_truncate = allow_float_truncate
+ deref(self.options).allow_float_truncate = allow_float_truncate
if allow_invalid_utf8 is not None:
- self.allow_invalid_utf8 = allow_invalid_utf8
+ deref(self.options).allow_invalid_utf8 = allow_invalid_utf8
def _set_type(self, target_type=None):
if target_type is not None:
- self.options.to_type = (
+ deref(self.options).to_type = (
(<DataType> ensure_type(target_type)).sp_type
)
def _set_safe(self):
- self.options = CCastOptions.Safe()
+ self.options.reset(new CCastOptions(CCastOptions.Safe()))
Review comment:
I think this is the most efficient way to write this in cython.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] cyb70289 edited a comment on pull request #9274: ARROW-11299: [C++][Python] Fix invalid-offsetof warnings
Posted by GitBox <gi...@apache.org>.
cyb70289 edited a comment on pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#issuecomment-763319885
All debug build failed due to this change. FunctionOptions is supposed to be polymorphic, but looks not necessary.
Would like to hear comments before moving on.
@bkietz @pitrou
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] cyb70289 commented on pull request #9274: ARROW-11299: [C++][Python] Fix invalid-offsetof warnings
Posted by GitBox <gi...@apache.org>.
cyb70289 commented on pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#issuecomment-764217852
> @cyb70289 `arrow::dataset::Expression::Call::options` requires that virtual destructor since a `shared_ptr<FunctionOptions>` is used to store subclasses of FunctionOptions. Removing this virtual destructor results in a resource leak since the derived classes' (implicit) destructors are not being called, leaking data members such as `arrow::compute::SetLookupOptions::value_set` which are not destroyed.
Thanks @bkietz , I missed this point.
Will update according to your comment.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #9274: ARROW-11299: [C++][Python] Fix invalid-offsetof warnings
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#issuecomment-763300961
https://issues.apache.org/jira/browse/ARROW-11299
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] cyb70289 commented on a change in pull request #9274: ARROW-11299: [Python] Fix invalid-offsetof warnings
Posted by GitBox <gi...@apache.org>.
cyb70289 commented on a change in pull request #9274:
URL: https://github.com/apache/arrow/pull/9274#discussion_r562660367
##########
File path: python/pyarrow/_compute.pyx
##########
@@ -481,88 +481,89 @@ cdef class FunctionOptions(_Weakrefable):
cdef class _CastOptions(FunctionOptions):
cdef:
- CCastOptions options
+ unique_ptr[CCastOptions] options
__slots__ = () # avoid mistakingly creating attributes
cdef const CFunctionOptions* get_options(self) except NULL:
- return &self.options
+ return self.options.get()
def _set_options(self, DataType target_type, allow_int_overflow,
allow_time_truncate, allow_time_overflow,
allow_float_truncate, allow_invalid_utf8):
+ self.options.reset(new CCastOptions())
self._set_type(target_type)
if allow_int_overflow is not None:
- self.allow_int_overflow = allow_int_overflow
+ deref(self.options).allow_int_overflow = allow_int_overflow
Review comment:
Restored to original code.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] bkietz closed pull request #9274: ARROW-11299: [Python] Fix invalid-offsetof warnings
Posted by GitBox <gi...@apache.org>.
bkietz closed pull request #9274:
URL: https://github.com/apache/arrow/pull/9274
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org