You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@arrow.apache.org by "ivanmemruk (via GitHub)" <gi...@apache.org> on 2023/04/05 00:54:19 UTC

[GitHub] [arrow] ivanmemruk opened a new issue, #34895: Golang arenas

ivanmemruk opened a new issue, #34895:
URL: https://github.com/apache/arrow/issues/34895

   ### Describe the enhancement requested
   
   Golang v1.20 has added an experimental `arenas` package, which allows more "manual" memory allocation and could be potentially useful for the golang arrow library.
   
   I am opening this issue to see if there has been any interest or any attempt to try to use `arenas` for arrow. If there hasn't, I could potentially try working on that in the nearest few months and contribute back.
   
   ### Component(s)
   
   Go


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade commented on issue #34895: Golang arenas

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on issue #34895:
URL: https://github.com/apache/arrow/issues/34895#issuecomment-1497721445

   I think it's a great idea to try using an arena. To my knowledge there hasn't been any attempts yet, and I know that I've gotten at least one question about them, but that's all the interest I've seen so far. There's a few areas I can think of that might be useful for it:
   
   1. An implementation of the `memory.Allocator` interface that uses an arena under the hood for the Allocate, Reallocate, etc. methods for arrays. (this is probably the easiest thing to implement)
   2. An arena in the compute package that we can use for all the small allocations that are used intermittently during the compute execution
   3. As an option for IPC read/write and/or Flight/FlightSQL along with in the Parquet reader and writer.
   4. As an option for the CSV and JSON readers/writers
   
   There's probably other areas that could benefit that I'm not thinking of offhand. 
   
   The one thing to keep in mind is that we do want to try to maintain compatibility with the older versions of go (ie go1.17) so you'd need to keep anything with arenas in files that are protected with a build constraint (for example, notice that all the files in the compute package have `//go:build go1.18` restricting them to only be used when compiling with go1.18+). So that does make things a little complicated, and is why I mentioned that an arena based `memory.Allocator` is likely the easiest thing to implement first.
   
   If you do take a stab at it, I'll happily review the PRs and feel free to reach out if you need any assistance implementing it. Thanks in advance if you do take this on!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] spenczar commented on issue #34895: [Go] Implement Usage of Golang arenas (go1.20+)

Posted by "spenczar (via GitHub)" <gi...@apache.org>.
spenczar commented on issue #34895:
URL: https://github.com/apache/arrow/issues/34895#issuecomment-1501129204

   I think this might be premature. See https://go-review.googlesource.com/c/go/+/462355 
   
   > The arena goexperiment contains code used inside Google in very
   limited use cases that we will maintain, but the discussion on [#51317](https://go.dev/issue/51317)
   identified serious problems with the very idea of adding arenas to the
   standard library. In particular the concept tends to infect many other
   APIs in the name of efficiency, a bit like sync.Pool except more
   publicly visible.
   >
   > It is unclear when, if ever, we will pick up the idea and try to push
   it forward into a public API, but it's not going to happen any time
   soon, and we don't want users to start depending on it: it's a true
   experiment and may be changed or deleted without warning.
   
   This isn't to say it's a terrible idea of course - just to be very cautious about depending on it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] ivanmemruk commented on issue #34895: [Go] Implement Usage of Golang arenas (go1.20+)

Posted by "ivanmemruk (via GitHub)" <gi...@apache.org>.
ivanmemruk commented on issue #34895:
URL: https://github.com/apache/arrow/issues/34895#issuecomment-1498038765

   Agree that starting with the Allocator makes sense. I plan to start working on this next week. Just to clarify, is there an existing performance test suite that would focus on memory allocations, that could be used as a benchmark? I have my personal project where this change would probably be quite beneficial, but we need something publicly accessible.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade commented on issue #34895: [Go] Implement Usage of Golang arenas (go1.20+)

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on issue #34895:
URL: https://github.com/apache/arrow/issues/34895#issuecomment-1501929262

   @ivanmemruk I think the most interesting aspect here is going to be to look at the existing benchmarks after you give a stab at this and see if there's actually benefits shown from it. That's what I'm most looking forward to, and would be a valid reason to add CI for go1.20 (or update the existing go1.18 CI to use go1.20)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade commented on issue #34895: [Go] Implement Usage of Golang arenas (go1.20+)

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on issue #34895:
URL: https://github.com/apache/arrow/issues/34895#issuecomment-1498192917

   there are benchmarks in the package that can be run via `go test -bench -benchmem .....` at a few different levels including in the compute package. Those should be sufficient to see differences, you can also find benchmarks on conbench.ursa.dev for Go commits, such as here: https://conbench.ursa.dev/runs/50fba6849b2e48f7b0f4a31230b27675/


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] ivanmemruk commented on issue #34895: [Go] Implement Usage of Golang arenas (go1.20+)

Posted by "ivanmemruk (via GitHub)" <gi...@apache.org>.
ivanmemruk commented on issue #34895:
URL: https://github.com/apache/arrow/issues/34895#issuecomment-1501197791

   thanks @spenczar - even if the whole thing gets thrown out later on, it might still makes sense for my own project in the nearest month or two - assuming the benefits do materialize. this work (we can consider it a PoC from the arrows perspective) could also end up as a useful data point for the golang team in their ultimate decision on the feature.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org