You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/11/15 05:10:48 UTC

[GitHub] [arrow] vertexclique opened a new pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

vertexclique opened a new pull request #8665:
URL: https://github.com/apache/arrow/pull/8665


   Implements bit and on avx512.
   
   ```
   AVX2
   =============
   
   buffer_bit_ops and      time:   [729.17 ns 729.31 ns 729.49 ns]                                
   Found 12 outliers among 100 measurements (12.00%)
     7 (7.00%) low mild
     1 (1.00%) high mild
     4 (4.00%) high severe
   
   AVX512
   ==============
   buffer_bit_ops and      time:   [332.39 ns 332.55 ns 332.71 ns]                               
                           change: [-54.427% -54.390% -54.355%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     2 (2.00%) high mild
     1 (1.00%) high severe
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
nevi-me commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-727578474


   Reran the tests, failure now confirmed: https://github.com/nevi-me/arrow/runs/1402675504#step:8:2084
   
   The C++ implementation has avx512 support, so maybe @kszucs or someone else deeply familiar with our CI knows what a probable solution is. Otherwise this might be a matter for the mailing list.
   
   My position is that I'm happy to still proceed with merging this, because this doesn't break CI for stable and nightly. Else, we could open a separate branch where we can merge this into, to avoid @vertexclique having to pile up PRs on top of each other.
   It gets very irritating after a while, from my experience with the parquet writer.
   This separate branch could also temporarily house ARMv7, as we don't yet have CI for it.
   
   ___
   
   @jorgecarleitao what's your opinion on us testing `arrow` separately on stable, to ensure that we don't regress? Something like:
   
   ```sh
   # run unit tests, excluding arrow
   cargo test --exclude arrow
   # run unit tests on arrow separately
   pushd arrow
   # run arrow unit tests on stable
   cargo +stable test
   # run arrow unit tests with features, separate test for mutually exclusive oens
   cargo test --features "simd"
   cargo test --features "avx512"
   popd
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan edited a comment on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
Dandandan edited a comment on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-770384503


   @vertexclique I was actually wondering whether AVX-512 is really faster than compiling the kernel with "simd" and the right rustflags , i.e. `RUSTFLAGS='-C target-cpu=native'`?
   I don't have a `AVX-512`-enabled cpu to try this out, but I would have the same intuition as @jhorstmann that packed simd should generate similar `AVX-512` instructions?
   Would be nice maybe to revisit this once to see whether we need to maintain 2 different implementations, especially when it's going to be in `std` / stable.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] vertexclique commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
vertexclique commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-727574114


   > @vertexclique from my reviewing the code, it looked like simd and avx512 were able to co-exist, but I'm getting some errors on the macOS CI (https://github.com/nevi-me/arrow/runs/1402635245#step:8:1511).
   Should I disable one for the other?
   
   Yes, they shouldn't coexist since they are conflicting implementations I would like to document that when everything settles down. Another option would make all of them mutually exclusive but that is going to be problematic. So the person who enables `avx512` should use avx512 feature or fallback to autovec. Not need to bring packed_simd or something else to do fallback.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] vertexclique commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
vertexclique commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-727560444


   I will do that.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] andygrove commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
andygrove commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-727583378


   I can see both sides of the argument here but I would be supportive of merging this PR as long as we have a JIRA filed to follow up on the CI support (which I agree is really important). My understanding is that these changes will not break anything for users who are using Arrow without this new features enabled, and that is the configuration that we are currently certifying in CI. 
   
   We have a similar situation already with DataFusion, where we have to run benchmarks locally before merging some PRs because we don't have those set up in CI yet, and we have seen performance regressions as a result, so I don't see this as being particulary different.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
Dandandan commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-770384503


   @vertexclique I was actually wondering whether AVX-512 is really faster than compiling the kernel with "simd" and the right rustflags , i.e. `RUSTFLAGS='-C target-cpu=native`?
   I don't have a `AVX-512`-enabled cpu to try this out, but I would have the same intuition as @jhorstmann that packed simd should generate similar `AVX-512` instructions?
   Would be nice maybe to revisit this once to see whether we need to maintain 2 different implementations, especially when it's going to be in `std` / stable.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-727517072


   https://issues.apache.org/jira/browse/ARROW-10589


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
nevi-me commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-727572262


   I'm running a CI job at https://github.com/nevi-me/arrow/runs/1402635214, which includes `"simd avx512"` features.
   
   We do test "simd" in one of the CI tasks. Perhaps I didn't word my previous comment correctly.
   If GH CI does support AVX512 (I might be going on outdated assumptions that not every Intel CPU supports AVX512), then we can add it to the tests in `ci/scripts/rust_test.sh`.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] vertexclique commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
vertexclique commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-727568453


   > The main thing we have to be wary of, which I'm fine if we test locally and confirm; is whether anything here breaks the arrow crate building on stable. I haven't tested this yet.
   
   I tested this before pushing this pr and that was my concern before implementing this. I build it with the latest stable and tested too before opening this one.
   
   p.s. linker failed on windows no disk space left.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb closed pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
alamb closed pull request #8665:
URL: https://github.com/apache/arrow/pull/8665


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan edited a comment on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
Dandandan edited a comment on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-770384503


   @vertexclique I was actually wondering whether AVX-512 is really faster than compiling the kernel with "simd" and the right rustflags , i.e. `RUSTFLAGS='-C target-cpu=native'`?
   I don't have a `AVX-512`-enabled cpu to try this out, but I would have the same intuition as @jhorstmann that `packed_simd`should generate similar `AVX-512` instructions?
   Would be nice maybe to revisit this once to see whether we need to maintain 2 different implementations, especially when it's going to be in `std` / stable.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jhorstmann commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
jhorstmann commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-727847232


   Nice performance improvement! I'm a bit surprised by that since the packed_simd version also uses 512-bit wide types (`u8x64`) which should then generate avx-512 instructions when targeting a machine that supports them.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-727571958


   @vertexclique [here](https://github.com/apache/arrow/blob/master/ci/scripts/rust_test.sh#L34). The timings are important: e.g. we decided to not place coverage on every PR because it was taking too long (it is under a cron job).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] vertexclique edited a comment on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
vertexclique edited a comment on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-727560444


   I will do that in a min.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] vertexclique commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
vertexclique commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-770385902


   For some operations, it will use wider registers and their transfers with avx512, but not all algorithms are expressable using other simd sets. Main idea was instead of creating ordinary instructions of both feature sets, creating fast operations on a specific collection of data. packed_simd or stdsimd doesn't generate the optimal ordering as a code—neither some intrinsics in the language's core. Useful intrinsics are bound to llvm procedures. Not compiler optimized ones. AVX512 set is one of those sets.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
nevi-me commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-727563248


   > Is this being tested on the CI? I did not see any changes to the CI to test with that feature.
   
   The main thing we have to be wary of, which I'm fine if we test locally and confirm; is whether anything here breaks the arrow crate building on stable. I haven't tested this yet.
   
   Do the GHA machines support avx512? We might have to rely on those with AVX512-capable Intels to check when new PRs that use AVX512 are submitted


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] vertexclique commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
vertexclique commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-727575454


   > Okay, great. For tracking purposes, once we find a resolution to the CI question; we should open an umbrella JIRA to implement avx512 for other simd equivalents. I'm assuming that's your plan?
   
   Yes, I am going to open PRs one by one on top of each other and by opening an umbrella ticket for each one of them. About CI resolution, AWS offers avx512 machines. That can be a solution to the CI problem.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
nevi-me commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-727521423


   Let's make the nightly update a separate PR, because there's a few other changes that we need to make for CI to pass. I'll work on that today


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] vertexclique commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
vertexclique commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-727571404


   feature `simd` is not tested on this CI too. Am I missing something?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
nevi-me commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-727574928


   > So the person who enables `avx512` should use avx512 feature or fallback to autovec. Not need to bring packed_simd or something else to do fallback.
   
   Okay, great. For tracking purposes, once we find a resolution to the CI question; we should open an umbrella JIRA to implement `avx512` for other `simd` equivalents. I'm assuming that's your plan?
   
   I'm trying CI again at https://github.com/nevi-me/arrow/runs/1402675547 with simd and avx512 running separately:
   
   ```sh
   # run unit tests with non-default features on
   pushd arrow
   cargo test --features "simd"
   cargo test --features "avx512"
   popd
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
nevi-me commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-727573373


   @vertexclique from my reviewing the code, it looked like `simd` and `avx512` were able to co-exist, but I'm getting some errors on the macOS CI (https://github.com/nevi-me/arrow/runs/1402635245#step:8:1511).
   Should I disable one for the other?
   
   @jorgecarleitao @alamb CI fails because the AVX512 intrinsics aren't found (https://github.com/nevi-me/arrow/runs/1402635245#step:8:1537), so my concern seems valid that we might be running CI on CPUs that don't support AVX512, and thus make it a challenge to enable the feature.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] vertexclique commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
vertexclique commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-727912338


   @andygrove @nevi-me https://issues.apache.org/jira/browse/ARROW-10612 Umbrella issue for AVX-512. Includes CI support follow up subtask. I will create a subtask for every operation that I will implement.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] vertexclique edited a comment on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
vertexclique edited a comment on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-727560444


   I will do that in a min. edit: @nevi-me updated.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-727584271


   Running the CI on separate feature fits exactly what I was thinking about.
   
   IMO we could then benefit from some clarity over how we merge PRs from here on: do we wait for someone with a machine with avx512 to run the PR locally before merging? Or do we accept breaking that feature set? Note that any PR can break a feature (e.g. removing a `use` from the top of the module is often sufficient, as it just happened recently to me on #8670).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on pull request #8665: ARROW-10589: [Rust] Implement AVX-512 bit and operation

Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #8665:
URL: https://github.com/apache/arrow/pull/8665#issuecomment-728507289


   I'm looking at contributing an AVX-512-capable machine to run occasional builds on Buildkite, I'd guess we're looking at 2-3 month time frame for that though. Note that anyone can hook up an AVX-512 machine to Buildkite and arrange for builds to be triggered on it


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org