Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/04/26 11:07:06 UTC

[GitHub] [arrow-rs] alamb opened a new issue #29: Mark methods that do not perform bounds checking as unsafe

alamb opened a new issue #29:
URL: https://github.com/apache/arrow-rs/issues/29


   *Note*: migrated from original JIRA: https://issues.apache.org/jira/browse/ARROW-3776
   
   None


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] vertexclique commented on issue #29: Mark methods that do not perform bounds checking as unsafe

Posted by GitBox <gi...@apache.org>.
vertexclique commented on issue #29:
URL: https://github.com/apache/arrow-rs/issues/29#issuecomment-1002140009


   I raised this on the Arrow mailing list as an "exposed" API that would expose unsafe methods and pointers (if needed) to performance-hungry users.
   @alamb 
   
   Ref: https://lists.apache.org/thread/vk6m5lwljsv9sd7f9zm4z3o6l9997nw4
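
As an illustration of why performance-sensitive callers might want unchecked access, here is a minimal sketch that validates a range once and then reads each element without a per-index check. The `sum_range` helper is hypothetical and operates on a plain slice, not on any arrow-rs type; whether this is measurably faster than checked indexing depends on what the compiler can already optimize away, so treat the speedup as an assumption to be benchmarked.

```rust
/// Hypothetical helper (not part of arrow-rs): sums `values[start..end]`
/// after a single up-front bounds check.
fn sum_range(values: &[i32], start: usize, end: usize) -> Option<i64> {
    // Validate the whole range once instead of checking every index.
    if start > end || end > values.len() {
        return None;
    }
    let mut total: i64 = 0;
    for i in start..end {
        // SAFETY: `i < end` and `end <= values.len()` were established above.
        total += i64::from(unsafe { *values.get_unchecked(i) });
    }
    Some(total)
}
```

The public function stays safe because the only `unsafe` block is guarded by the check at the top; callers never have to uphold an invariant themselves.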





[GitHub] [arrow-rs] alamb closed issue #29: Mark methods that do not perform bounds checking as unsafe

Posted by GitBox <gi...@apache.org>.
alamb closed issue #29:
URL: https://github.com/apache/arrow-rs/issues/29


   





[GitHub] [arrow-rs] alamb edited a comment on issue #29: Mark methods that do not perform bounds checking as unsafe

Posted by GitBox <gi...@apache.org>.
alamb edited a comment on issue #29:
URL: https://github.com/apache/arrow-rs/issues/29#issuecomment-826746621


   comment from Paul Kernfeld(paulkernfeld) @ 2019-01-24T00:13:03.622+0000:
   <pre>I'm interested in working on this, although there could be a lot of downstream effects. A good example of a tricky function is arrow::array::PrimitiveArray::value, which appears to be used in a couple dozen places. A few possible strategies are:
    # Add in bounds checking so that we don't need to deal with unsafe at all.
    # Propagate the unsafes up through the code.
    # Maintain a safe and unsafe version of each function that is currently unsafe.
   
   Personally I'm a fan of #1 because I think that reducing unsafe code will help developers and users avoid mistakes (I [accidentally wrote|https://github.com/apache/arrow/pull/3448] a nondeterministic unit test earlier this week). However, I'm new to the project so I'm happy to do what others think is best.</pre>
   
   comment from Paddy Horan(paddyhoran) @ 2019-01-24T01:35:09.674+0000:
   <pre>I was actually thinking we would need #3.  Taking `value` as an example I would be in favor of adding bounds checking to `value` and having a `value_unchecked` that does no bounds checking and is unsafe.
   
   I think that we need to provide the unsafe versions for maximum performance due to Arrow often being described as a development platform rather than a front end API.  i.e. people will use it as the foundation of other higher level libraries and so developers will want the option to avoid bounds checking for performance reasons.</pre>
   
   comment from Andrew Lamb(alamb) @ 2021-04-26T10:52:32.742+0000:
   <pre>Migrated to github: https://api.github.com/repos/apache/arrow-rs/issues/28</pre>
   
   comment from Andrew Lamb(alamb) @ 2021-04-26T10:54:56.406+0000:
   <pre>https://github.com/apache/arrow-rs/issues/28</pre>
   
   comment from Andrew Lamb(alamb) @ 2021-04-26T11:05:26.331+0000:
   <pre>Testing automated script to migrate issues, trying again</pre>
   
   comment from Andrew Lamb(alamb) @ 2021-04-26T11:06:54.444+0000:
   <pre>Migrated to github: https://github.com/apache/arrow-rs/issues/29</pre>
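
The third strategy in the comments above (a bounds-checked `value` alongside an `unsafe` `value_unchecked`) could look roughly like the sketch below. `Int32Column` is a hypothetical stand-in backed by a `Vec<i32>`; it is not the real `arrow::array::PrimitiveArray`, and the actual arrow-rs signatures may differ.

```rust
/// Hypothetical stand-in for a primitive array; not the arrow-rs type.
pub struct Int32Column {
    values: Vec<i32>,
}

impl Int32Column {
    pub fn new(values: Vec<i32>) -> Self {
        Self { values }
    }

    pub fn len(&self) -> usize {
        self.values.len()
    }

    /// Safe accessor: panics if `i` is out of bounds.
    pub fn value(&self, i: usize) -> i32 {
        assert!(
            i < self.len(),
            "index {} out of bounds for length {}",
            i,
            self.len()
        );
        // SAFETY: the assertion above guarantees `i < self.len()`.
        unsafe { self.value_unchecked(i) }
    }

    /// Unchecked accessor that performs no bounds checking.
    ///
    /// # Safety
    /// The caller must guarantee `i < self.len()`; otherwise the
    /// behavior is undefined.
    pub unsafe fn value_unchecked(&self, i: usize) -> i32 {
        // SAFETY: the caller upholds `i < self.len()`.
        unsafe { *self.values.get_unchecked(i) }
    }
}
```

With this split, ordinary callers use `value` and get a panic on a bad index, while callers that have already validated their indices can opt in to `value_unchecked` inside an explicit `unsafe` block.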





[GitHub] [arrow-rs] alamb commented on issue #29: Mark methods that do not perform bounds checking as unsafe

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #29:
URL: https://github.com/apache/arrow-rs/issues/29#issuecomment-956117874


   For anyone else reading this issue, all the calls to `*Array::value()` I checked (below) had bounds checking in them and would `panic` if an out of bounds request was made. 
   
   As written, the ticket seems more broadly defined and refers to more than just `value()` so I'll leave it open, but at the moment this seems unactionable to me (as the methods needing bounds checking are not clear).
   
   -----
   
   BinaryArray:  https://sourcegraph.com/github.com/apache/arrow-rs/-/blob/arrow/src/array/array_binary.rs?L104&subtree=true
   
   ListArray: https://sourcegraph.com/github.com/apache/arrow-rs/-/blob/arrow/src/array/array_list.rs?L83:9&subtree=true
   
   PrimitiveArray: https://sourcegraph.com/github.com/apache/arrow-rs/-/blob/arrow/src/array/array_primitive.rs?L119:9&subtree=true
   
   StringArray: https://sourcegraph.com/github.com/apache/arrow-rs/-/blob/arrow/src/array/array_string.rs?L108:9&subtree=true
   
   BooleanArray: https://sourcegraph.com/github.com/apache/arrow-rs/-/blob/arrow/src/array/array_boolean.rs?L119:9&subtree=true
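
The behavior described above (an accessor that panics on an out-of-bounds index rather than reading invalid memory) can be observed with a small stand-in. This is a sketch over a plain slice using hypothetical helpers `checked_value` and `read_panics`; it does not call arrow-rs, and `std::panic::catch_unwind` is used only to turn the panic into a value that can be inspected.

```rust
/// Hypothetical bounds-checked accessor over a plain slice.
fn checked_value(values: &[i64], i: usize) -> i64 {
    // Slice indexing is bounds-checked and panics when `i >= values.len()`.
    values[i]
}

/// Returns true if reading index `i` panics.
fn read_panics(values: &[i64], i: usize) -> bool {
    std::panic::catch_unwind(|| checked_value(values, i)).is_err()
}
```

A panic here is a controlled failure, which is why a method with this behavior does not need to be marked `unsafe`.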





[GitHub] [arrow-rs] alamb commented on issue #29: Mark methods that do not perform bounds checking as unsafe

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #29:
URL: https://github.com/apache/arrow-rs/issues/29#issuecomment-954637753


   I believe this has been completed.





[GitHub] [arrow-rs] alamb edited a comment on issue #29: Mark methods that do not perform bounds checking as unsafe

Posted by GitBox <gi...@apache.org>.
alamb edited a comment on issue #29:
URL: https://github.com/apache/arrow-rs/issues/29#issuecomment-826746621


   Comment from Paul Kernfeld(paulkernfeld) @ 2019-01-24T00:13:03.622+0000:
   <pre>I'm interested in working on this, although there could be a lot of downstream effects. A good example of a tricky function is arrow::array::PrimitiveArray::value, which appears to be used in a couple dozen places. A few possible strategies are:
    # Add in bounds checking so that we don't need to deal with unsafe at all.
    # Propagate the unsafes up through the code.
    # Maintain a safe and unsafe version of each function that is currently unsafe.
   
   Personally I'm a fan of #1 because I think that reducing unsafe code will help developers and users avoid mistakes (I [accidentally wrote|https://github.com/apache/arrow/pull/3448] a nondeterministic unit test earlier this week). However, I'm new to the project so I'm happy to do what others think is best.</pre>
   
   Comment from Paddy Horan(paddyhoran) @ 2019-01-24T01:35:09.674+0000:
   <pre>I was actually thinking we would need #3.  Taking `value` as an example I would be in favor of adding bounds checking to `value` and having a `value_unchecked` that does no bounds checking and is unsafe.
   
   I think that we need to provide the unsafe versions for maximum performance due to Arrow often being described as a development platform rather than a front end API.  i.e. people will use it as the foundation of other higher level libraries and so developers will want the option to avoid bounds checking for performance reasons.
   
    </pre>





[GitHub] [arrow-rs] alamb commented on issue #29: Mark methods that do not perform bounds checking as unsafe

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #29:
URL: https://github.com/apache/arrow-rs/issues/29#issuecomment-826746621


   comment from Paul Kernfeld(paulkernfeld) @ 2019-01-24T00:13:03.622+0000:
   I'm interested in working on this, although there could be a lot of downstream effects. A good example of a tricky function is arrow::array::PrimitiveArray::value, which appears to be used in a couple dozen places. A few possible strategies are:
    # Add in bounds checking so that we don't need to deal with unsafe at all.
    # Propagate the unsafes up through the code.
    # Maintain a safe and unsafe version of each function that is currently unsafe.
   
   Personally I'm a fan of #1 because I think that reducing unsafe code will help developers and users avoid mistakes (I [accidentally wrote|https://github.com/apache/arrow/pull/3448] a nondeterministic unit test earlier this week). However, I'm new to the project so I'm happy to do what others think is best.
   
   comment from Paddy Horan(paddyhoran) @ 2019-01-24T01:35:09.674+0000:
   I was actually thinking we would need #3.  Taking `value` as an example I would be in favor of adding bounds checking to `value` and having a `value_unchecked` that does no bounds checking and is unsafe.
   
   I think that we need to provide the unsafe versions for maximum performance due to Arrow often being described as a development platform rather than a front end API.  i.e. people will use it as the foundation of other higher level libraries and so developers will want the option to avoid bounds checking for performance reasons.
   
    
   
   comment from Andrew Lamb(alamb) @ 2021-04-26T10:52:32.742+0000:
   Migrated to github: https://api.github.com/repos/apache/arrow-rs/issues/28
   
   comment from Andrew Lamb(alamb) @ 2021-04-26T10:54:56.406+0000:
   https://github.com/apache/arrow-rs/issues/28
   
   comment from Andrew Lamb(alamb) @ 2021-04-26T11:05:26.331+0000:
   Testing automated script to migrate issues, trying again





[GitHub] [arrow-rs] alamb edited a comment on issue #29: Mark methods that do not perform bounds checking as unsafe

Posted by GitBox <gi...@apache.org>.
alamb edited a comment on issue #29:
URL: https://github.com/apache/arrow-rs/issues/29#issuecomment-826746621


   Comment from Paul Kernfeld(paulkernfeld) @ 2019-01-24T00:13:03.622+0000:
   <pre>I'm interested in working on this, although there could be a lot of downstream effects. A good example of a tricky function is arrow::array::PrimitiveArray::value, which appears to be used in a couple dozen places. A few possible strategies are:
    # Add in bounds checking so that we don't need to deal with unsafe at all.
    # Propagate the unsafes up through the code.
    # Maintain a safe and unsafe version of each function that is currently unsafe.
   
   Personally I'm a fan of #1 because I think that reducing unsafe code will help developers and users avoid mistakes (I [accidentally wrote|https://github.com/apache/arrow/pull/3448] a nondeterministic unit test earlier this week). However, I'm new to the project so I'm happy to do what others think is best.</pre>
   
   Comment from Paddy Horan(paddyhoran) @ 2019-01-24T01:35:09.674+0000:
   <pre>I was actually thinking we would need #3.  Taking `value` as an example I would be in favor of adding bounds checking to `value` and having a `value_unchecked` that does no bounds checking and is unsafe.
   
   I think that we need to provide the unsafe versions for maximum performance due to Arrow often being described as a development platform rather than a front end API.  i.e. people will use it as the foundation of other higher level libraries and so developers will want the option to avoid bounds checking for performance reasons.
   
    </pre>
   
   Comment from Andrew Lamb(alamb) @ 2021-04-26T10:52:32.742+0000:
   <pre>Migrated to github: https://api.github.com/repos/apache/arrow-rs/issues/28</pre>
   
   Comment from Andrew Lamb(alamb) @ 2021-04-26T10:54:56.406+0000:
   <pre>https://github.com/apache/arrow-rs/issues/28</pre>
   
   Comment from Andrew Lamb(alamb) @ 2021-04-26T11:05:26.331+0000:
   <pre>Testing automated script to migrate issues, trying again</pre>
   
   Comment from Andrew Lamb(alamb) @ 2021-04-26T11:06:54.444+0000:
   <pre>Migrated to github: https://github.com/apache/arrow-rs/issues/29</pre>





[GitHub] [arrow-rs] alamb removed a comment on issue #29: Mark methods that do not perform bounds checking as unsafe

Posted by GitBox <gi...@apache.org>.
alamb removed a comment on issue #29:
URL: https://github.com/apache/arrow-rs/issues/29#issuecomment-954637753


   I believe this has been completed.

