Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/02/18 02:46:08 UTC

[GitHub] [arrow-rs] viirya opened a new pull request #1334: `ArrowArray::try_from_raw` should not assume the pointers are from Arc

viirya opened a new pull request #1334:
URL: https://github.com/apache/arrow-rs/pull/1334


   # Which issue does this PR close?
   
   Closes #1333.
   
   # Rationale for this change
    
   # What changes are included in this PR?
   
   # Are there any user-facing changes?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] sunchao commented on pull request #1334: `ArrowArray::try_from_raw` should not assume the pointers are from Arc

Posted by GitBox <gi...@apache.org>.
sunchao commented on pull request #1334:
URL: https://github.com/apache/arrow-rs/pull/1334#issuecomment-1046294625


   Merged, thanks @viirya !





[GitHub] [arrow-rs] viirya commented on pull request #1334: `ArrowArray::try_from_raw` should not assume the pointers are from Arc

Posted by GitBox <gi...@apache.org>.
viirya commented on pull request #1334:
URL: https://github.com/apache/arrow-rs/pull/1334#issuecomment-1046306542


   Thanks @sunchao !





[GitHub] [arrow-rs] alamb commented on a change in pull request #1334: `ArrowArray::try_from_raw` should not assume the pointers are from Arc

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #1334:
URL: https://github.com/apache/arrow-rs/pull/1334#discussion_r816205894



##########
File path: arrow/src/ffi.rs
##########
@@ -721,9 +721,11 @@ impl ArrowArray {
                     .to_string(),
             ));
         };
+        let ffi_array = (*array).clone();

Review comment:
       I am not super familiar with this code but it makes sense to me. The clone here seems to just be a clone of the `FFI_ArrowArray` struct itself (which is some ints and pointers) which seems reasonable enough to me.
   
   If someone sees performance issues from this code, we can always add a `try_from_raw_arc` or something, but this looks good to me for now.
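
To illustrate why such a clone is cheap, here is a minimal sketch. `FfiLikeArray` is a hypothetical stand-in with one integer and one raw pointer; it is not the real `FFI_ArrowArray` definition, and the derived `Clone` is an assumption made for the example. Cloning copies the fields only, so the copy points at the same buffer as the original.

```rust
use std::os::raw::c_void;

/// Hypothetical stand-in for an FFI struct: an integer plus a raw pointer.
/// This is NOT the real `FFI_ArrowArray` definition.
#[repr(C)]
#[derive(Clone)]
struct FfiLikeArray {
    length: i64,
    buffer: *const c_void,
}

/// Clones the struct and reports whether the copy shares the original buffer,
/// together with the copied length.
fn shallow_copy_demo() -> (bool, i64) {
    let data: Vec<i32> = vec![1, 2, 3, 4];
    let original = FfiLikeArray {
        length: data.len() as i64,
        buffer: data.as_ptr() as *const c_void,
    };

    // Only the struct's fields are copied; the `Vec` contents are untouched.
    let copy = original.clone();

    (copy.buffer == original.buffer, copy.length)
}

fn main() {
    let (same_buffer, length) = shallow_copy_demo();
    println!("same buffer: {same_buffer}, length: {length}");
}
```

The flip side of a shallow copy is that two structs now refer to the same data, so only one of them should ever trigger the release.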







[GitHub] [arrow-rs] viirya commented on pull request #1334: `ArrowArray::try_from_raw` should not assume the pointers are from Arc

Posted by GitBox <gi...@apache.org>.
viirya commented on pull request #1334:
URL: https://github.com/apache/arrow-rs/pull/1334#issuecomment-1064843112


   > If we clone the FFI struct, we need to free the pointer ourselves. But if we free the `FFI_ArrowArray`, will the data in this array also be freed? If so, we can't free the pointer until the data has been used and is ready to be freed, and in a big project we can't realistically hold on to this otherwise useless pointer for that long, which creates a memory leak.
   
   Because the raw pointers are converted to `Arc`, they will eventually be released correctly. The data is still in use until then, so of course you should not release the pointers yourself.
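
A rough sketch of that lifetime, under stated assumptions: `FfiLike`, `import_from_raw`, and the `RELEASED` counter are hypothetical names, and the `Drop` impl only simulates a release. The sketch moves the struct out of caller-owned memory with `std::ptr::read` and wraps it in a fresh `Arc`, which is a substitute for the `(*array).clone()` in the diff, not the actual arrow-rs code. It avoids `Arc::from_raw`, which is only valid for pointers that came from `Arc::into_raw`.

```rust
use std::mem::ManuallyDrop;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Counts how many times the simulated release logic has run.
static RELEASED: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for an FFI struct whose `Drop` releases the data.
struct FfiLike {
    length: i64,
}

impl Drop for FfiLike {
    fn drop(&mut self) {
        RELEASED.fetch_add(1, Ordering::SeqCst);
    }
}

/// Imports from a raw pointer without assuming it came from `Arc::into_raw`.
///
/// Safety: `ptr` must point to a valid `FfiLike`, and the caller must not
/// use or drop the pointee afterwards (ownership moves into the `Arc`).
unsafe fn import_from_raw(ptr: *const FfiLike) -> Arc<FfiLike> {
    Arc::new(unsafe { std::ptr::read(ptr) })
}

/// Returns the release count while a reference is still alive, and after the
/// last reference is dropped.
fn demo() -> (usize, usize) {
    let start = RELEASED.load(Ordering::SeqCst);

    // The struct lives in caller-owned storage, not inside an `Arc`.
    // `ManuallyDrop` keeps the caller from dropping it a second time.
    let foreign = ManuallyDrop::new(FfiLike { length: 3 });
    let raw: *const FfiLike = &*foreign;

    let first = unsafe { import_from_raw(raw) };
    let second = Arc::clone(&first);
    assert_eq!(second.length, 3);

    drop(first);
    let while_shared = RELEASED.load(Ordering::SeqCst) - start;

    drop(second);
    let after_last = RELEASED.load(Ordering::SeqCst) - start;

    (while_shared, after_last)
}

fn main() {
    let (while_shared, after_last) = demo();
    println!("released while shared: {while_shared}, after last drop: {after_last}");
}
```

The release runs exactly once, when the last `Arc` goes away, which is why the importing side does not free anything by hand while the data is still in use.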
   
   





[GitHub] [arrow-rs] sunchao merged pull request #1334: `ArrowArray::try_from_raw` should not assume the pointers are from Arc

Posted by GitBox <gi...@apache.org>.
sunchao merged pull request #1334:
URL: https://github.com/apache/arrow-rs/pull/1334


   





[GitHub] [arrow-rs] viirya commented on pull request #1334: `ArrowArray::try_from_raw` should not assume the pointers are from Arc

Posted by GitBox <gi...@apache.org>.
viirya commented on pull request #1334:
URL: https://github.com/apache/arrow-rs/pull/1334#issuecomment-1064839953


   > As to the question @viirya raised in https://github.com/apache/arrow-rs/issues/1333: when managing memory, whoever allocates it should free it. In our case that means we need to allocate the struct in Rust, pass the pointer to Java, and then also free the memory in Rust.
   
   Simply put, I don't think this is correct. That is the whole point of the C data interface: the release callback is designed to be called not only by the party that allocates the C data interface structure.
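
As a minimal sketch of that idea (the names `FfiLike`, `release_vec`, `export`, and `consume` are hypothetical, and the struct is a simplified stand-in rather than the real C data interface layout): the producer attaches a callback that knows how to free its own allocation, and the consumer, which knows nothing about that allocation, simply calls the callback when it is done.

```rust
use std::os::raw::c_void;

/// Hypothetical, simplified stand-in for a C data interface structure.
#[repr(C)]
struct FfiLike {
    length: i64,
    /// Release callback; `None` marks the structure as already released.
    release: Option<unsafe extern "C" fn(*mut FfiLike)>,
    /// Producer-owned state that only the callback knows how to free.
    private_data: *mut c_void,
}

/// Producer-side callback: frees the producer's buffer and marks the
/// structure as released.
unsafe extern "C" fn release_vec(array: *mut FfiLike) {
    unsafe {
        let array = &mut *array;
        // Reclaim the `Vec` that the producer boxed into `private_data`.
        drop(Box::from_raw(array.private_data as *mut Vec<i32>));
        array.private_data = std::ptr::null_mut();
        array.release = None;
    }
}

/// Producer: exports data and attaches its own release callback.
fn export(data: Vec<i32>) -> FfiLike {
    FfiLike {
        length: data.len() as i64,
        release: Some(release_vec),
        private_data: Box::into_raw(Box::new(data)) as *mut c_void,
    }
}

/// Consumer: did not allocate anything, only calls the callback when done.
/// Returns the length it saw and whether the structure is now released.
fn consume(mut array: FfiLike) -> (i64, bool) {
    let length = array.length;
    if let Some(release) = array.release {
        unsafe { release(&mut array) };
    }
    (length, array.release.is_none())
}

fn main() {
    let (length, released) = consume(export(vec![1, 2, 3]));
    println!("length: {length}, released: {released}");
}
```

In this sketch the consumer never needs to know how the producer allocated the data; it only has to call the callback once.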





[GitHub] [arrow-rs] viirya commented on a change in pull request #1334: `ArrowArray::try_from_raw` should not assume the pointers are from Arc

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #1334:
URL: https://github.com/apache/arrow-rs/pull/1334#discussion_r816221790



##########
File path: arrow/src/ffi.rs
##########
@@ -721,9 +721,11 @@ impl ArrowArray {
                     .to_string(),
             ));
         };
+        let ffi_array = (*array).clone();

Review comment:
       > The clone here seems to just be a clone of the FFI_ArrowArray struct itself (which is some ints and pointers) which seems reasonable enough to me.
   
   That's right.

##########
File path: arrow/src/ffi.rs
##########
@@ -721,9 +721,11 @@ impl ArrowArray {
                     .to_string(),
             ));
         };
+        let ffi_array = (*array).clone();

Review comment:
       Thanks @alamb 







[GitHub] [arrow-rs] codecov-commenter commented on pull request #1334: `ArrowArray::try_from_raw` should not assume the pointers are from Arc

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #1334:
URL: https://github.com/apache/arrow-rs/pull/1334#issuecomment-1043797128


   # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1334) Report
   > Merging #1334 (133cb01) into master (f4c7102) will **increase** coverage by `0.00%`.
   > The diff coverage is `100.00%`.
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #1334   +/-   ##
   =======================================
     Coverage   83.03%   83.03%           
   =======================================
     Files         181      181           
     Lines       52949    52951    +2     
   =======================================
   + Hits        43965    43968    +3     
   + Misses       8984     8983    -1     
   ```
   
   | Impacted Files | Coverage Δ | |
   |---|---|---|
   | arrow/src/ffi.rs | `84.61% <100.00%> (+0.08%)` | :arrow_up: |
   | arrow/src/datatypes/field.rs | `53.79% <0.00%> (-0.31%)` | :arrow_down: |
   | arrow/src/array/transform/mod.rs | `84.52% <0.00%> (+0.13%)` | :arrow_up: |
   | parquet/src/encodings/encoding.rs | `93.71% <0.00%> (+0.19%)` | :arrow_up: |
   
   > **Legend**: `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Last update f4c7102...133cb01.
   





[GitHub] [arrow-rs] wangfenjin commented on pull request #1334: `ArrowArray::try_from_raw` should not assume the pointers are from Arc

Posted by GitBox <gi...@apache.org>.
wangfenjin commented on pull request #1334:
URL: https://github.com/apache/arrow-rs/pull/1334#issuecomment-1064828113


   Guys, I'm not sure my understanding is right, but I think this commit breaks the design and creates a memory leak.
   
   If we clone the FFI struct, we need to free the pointer ourselves. But if we free the `FFI_ArrowArray`, will the data in this array also be freed? If so, we can't free the pointer until the data has been used and is ready to be freed, and in a big project we can't realistically hold on to this otherwise useless pointer for that long, which creates a memory leak.
   
   As to the question @viirya raised in #1333: when managing memory, whoever allocates it should free it. In our case that means we need to allocate the struct in Rust, pass the pointer to Java, and then also free the memory in Rust.
   
   * You can check my code here: https://github.com/wangfenjin/duckdb-rs/blob/5083d39a4147f8017613304ae5f217a88ac42c2e/src/raw_statement.rs#L58
   * When I tried to upgrade to version 10, a [memory leak was detected](https://github.com/wangfenjin/duckdb-rs/runs/5506654342?check_suite_focus=true) and there is no easy way to fix it.
   
   I suggest we revert this commit. cc @alamb @sunchao

