
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/11/25 14:27:32 UTC

[GitHub] [arrow-rs] psvri opened a new pull request, #3192: Improve regex related kernels by upto 85%

psvri opened a new pull request, #3192:
URL: https://github.com/apache/arrow-rs/pull/3192

   # Which issue does this PR close?
   
   NA
   
   # Rationale for this change
    
   Reduces the run time of the regex-related kernels by up to 85% on the benchmarks posted for this PR.
   
   # What changes are included in this PR?
   The regex crate was compiled without its `perf` feature, so the regex-based kernels were missing a lot of performance. This seems to be an unintended side effect of https://github.com/apache/arrow-rs/issues/1876.
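   
   As a rough illustration of the kind of manifest change this describes (not the exact diff from this PR; the version requirement and the other listed features are assumptions): the regex crate enables `perf` by default, so a dependency declared with `default-features = false` has to list `perf` explicitly to get those optimizations back.
   
   ```toml
   # Hypothetical Cargo.toml fragment, for illustration only.
   # `std`, `unicode` and `perf` are real regex crate features, but the
   # exact feature list used by arrow-rs is not shown in this message.
   [dependencies]
   regex = { version = "1", default-features = false, features = ["std", "unicode", "perf"] }
   ```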
   
   # Are there any user-facing changes?
   No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow-rs] Dandandan commented on pull request #3192: Improve regex related kernels by upto 85%

Posted by GitBox <gi...@apache.org>.

Dandandan commented on PR #3192:
URL: https://github.com/apache/arrow-rs/pull/3192#issuecomment-1327609056

   Nice catch @psvri !



[GitHub] [arrow-rs] tustvold merged pull request #3192: Improve regex related kernels by upto 85%

Posted by GitBox <gi...@apache.org>.

tustvold merged PR #3192:
URL: https://github.com/apache/arrow-rs/pull/3192



[GitHub] [arrow-rs] psvri commented on pull request #3192: Improve regex related kernels by upto 85%

Posted by GitBox <gi...@apache.org>.

psvri commented on PR #3192:
URL: https://github.com/apache/arrow-rs/pull/3192#issuecomment-1327552314

   Like kernel improvements
     ```
     like_utf8 scalar equals time:   [379.59 µs 379.62 µs 379.66 µs]
                           change: [-0.2632% -0.1372% -0.0478%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 12 outliers among 100 measurements (12.00%)
     3 (3.00%) low mild
     1 (1.00%) high mild
     8 (8.00%) high severe
   
   like_utf8 scalar contains
                           time:   [1.9998 ms 2.0014 ms 2.0031 ms]
                           change: [+0.1456% +0.2614% +0.3704%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   
   like_utf8 scalar ends with
                           time:   [358.22 µs 358.30 µs 358.40 µs]
                           change: [+0.0299% +0.0737% +0.1209%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 12 outliers among 100 measurements (12.00%)
     1 (1.00%) low severe
     4 (4.00%) high mild
     7 (7.00%) high severe
   
   like_utf8 scalar starts with
                           time:   [379.94 µs 380.12 µs 380.34 µs]
                           change: [+0.0506% +0.1122% +0.1742%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 6 outliers among 100 measurements (6.00%)
     3 (3.00%) high mild
     3 (3.00%) high severe
   
   Benchmarking like_utf8 scalar complex: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.4s, enable flat sampling, or reduce sample count to 60.
   like_utf8 scalar complex
                           time:   [1.2768 ms 1.2770 ms 1.2772 ms]
                           change: [-85.872% -85.868% -85.865%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 8 outliers among 100 measurements (8.00%)
     1 (1.00%) low mild
     4 (4.00%) high mild
     3 (3.00%) high severe
   
   nlike_utf8 scalar equals
                           time:   [380.18 µs 380.32 µs 380.51 µs]
                           change: [-0.0281% +0.0227% +0.0706%] (p = 0.39 > 0.05)
                           No change in performance detected.
   Found 8 outliers among 100 measurements (8.00%)
     1 (1.00%) high mild
     7 (7.00%) high severe
   
   nlike_utf8 scalar contains
                           time:   [2.0066 ms 2.0083 ms 2.0099 ms]
                           change: [+0.6815% +0.7869% +0.8975%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   
   nlike_utf8 scalar ends with
                           time:   [379.45 µs 379.65 µs 379.91 µs]
                           change: [-0.1908% -0.0220% +0.1077%] (p = 0.81 > 0.05)
                           No change in performance detected.
   Found 13 outliers among 100 measurements (13.00%)
     1 (1.00%) low severe
     3 (3.00%) high mild
     9 (9.00%) high severe
   
   nlike_utf8 scalar starts with
                           time:   [379.84 µs 380.10 µs 380.46 µs]
                           change: [+0.0558% +0.1208% +0.1876%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 10 outliers among 100 measurements (10.00%)
     1 (1.00%) low mild
     2 (2.00%) high mild
     7 (7.00%) high severe
   
   Benchmarking nlike_utf8 scalar complex: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.4s, enable flat sampling, or reduce sample count to 60.
   nlike_utf8 scalar complex
                           time:   [1.2763 ms 1.2765 ms 1.2768 ms]
                           change: [-85.881% -85.878% -85.874%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 5 outliers among 100 measurements (5.00%)
     4 (4.00%) high mild
     1 (1.00%) high severe
   
   ilike_utf8 scalar equals
                           time:   [2.8738 ms 2.8751 ms 2.8762 ms]
                           change: [+0.0368% +0.0849% +0.1312%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 5 outliers among 100 measurements (5.00%)
     2 (2.00%) low severe
     2 (2.00%) high mild
     1 (1.00%) high severe
   
   ilike_utf8 scalar contains
                           time:   [4.4475 ms 4.4509 ms 4.4543 ms]
                           change: [-1.9928% -1.9116% -1.8345%] (p = 0.00 < 0.05)
                           Performance has improved.
   
   ilike_utf8 scalar ends with
                           time:   [2.8618 ms 2.8633 ms 2.8645 ms]
                           change: [-0.1859% -0.1240% -0.0760%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 6 outliers among 100 measurements (6.00%)
     5 (5.00%) low severe
     1 (1.00%) low mild
   
   ilike_utf8 scalar starts with
                           time:   [2.8435 ms 2.8444 ms 2.8454 ms]
                           change: [-0.2926% -0.2414% -0.1961%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 5 outliers among 100 measurements (5.00%)
     1 (1.00%) low severe
     2 (2.00%) low mild
     1 (1.00%) high mild
     1 (1.00%) high severe
   
   ilike_utf8 scalar complex
                           time:   [2.4073 ms 2.4078 ms 2.4084 ms]
                           change: [-78.295% -78.290% -78.284%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   
   nilike_utf8 scalar equals
                           time:   [2.9062 ms 2.9072 ms 2.9083 ms]
                           change: [-1.7614% -1.7197% -1.6777%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 17 outliers among 100 measurements (17.00%)
     4 (4.00%) low severe
     3 (3.00%) low mild
     8 (8.00%) high mild
     2 (2.00%) high severe
   
   nilike_utf8 scalar contains
                           time:   [4.4728 ms 4.4755 ms 4.4784 ms]
                           change: [-2.0144% -1.9452% -1.8747%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 2 outliers among 100 measurements (2.00%)
     2 (2.00%) high mild
   
   nilike_utf8 scalar ends with
                           time:   [2.9045 ms 2.9055 ms 2.9066 ms]
                           change: [-0.0610% -0.0224% +0.0171%] (p = 0.29 > 0.05)
                           No change in performance detected.
   Found 3 outliers among 100 measurements (3.00%)
     2 (2.00%) low mild
     1 (1.00%) high severe
   
   nilike_utf8 scalar starts with
                           time:   [2.8983 ms 2.9002 ms 2.9025 ms]
                           change: [-0.0080% +0.0594% +0.1436%] (p = 0.16 > 0.05)
                           No change in performance detected.
   Found 7 outliers among 100 measurements (7.00%)
     1 (1.00%) low severe
     1 (1.00%) high mild
     5 (5.00%) high severe
   
   nilike_utf8 scalar complex
                           time:   [2.4668 ms 2.4672 ms 2.4677 ms]
                           change: [-77.898% -77.892% -77.885%] (p = 0.00 < 0.05)
                           Performance has improved.
     ```
     
     Regex improvements
     
     ```
     regexp_matches_utf8 scalar starts with
                           time:   [1.3229 ms 1.3238 ms 1.3247 ms]
                           change: [-58.274% -58.244% -58.215%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   
   Benchmarking regexp_matches_utf8 scalar ends with: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.7s, enable flat sampling, or reduce sample count to 60.
   regexp_matches_utf8 scalar ends with
                           time:   [1.3368 ms 1.3372 ms 1.3376 ms]
                           change: [-81.771% -81.762% -81.754%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 8 outliers among 100 measurements (8.00%)
     1 (1.00%) low severe
     1 (1.00%) low mild
     3 (3.00%) high mild
     3 (3.00%) high severe
     ```



[GitHub] [arrow-rs] ursabot commented on pull request #3192: Improve regex related kernels by upto 85%

Posted by GitBox <gi...@apache.org>.

ursabot commented on PR #3192:
URL: https://github.com/apache/arrow-rs/pull/3192#issuecomment-1327718932

   Benchmark runs are scheduled for baseline = cbe5af071ce68b2a36d9e9881767ebd95bfdac83 and contender = 14e6212198ce75c9c17147edd3deedf126dae452. 14e6212198ce75c9c17147edd3deedf126dae452 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Skipped :warning: Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/78c9daee10cd4050890df42e115e06cd...65338995ae2242798a826a52fb306368/)
   [Skipped :warning: Benchmarking of arrow-rs-commits is not supported on test-mac-arm] [test-mac-arm](https://conbench.ursa.dev/compare/runs/68b29c08449048aa974b767dfcefc9ec...ab0f00646209464382339a9e3c72b984/)
   [Skipped :warning: Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/6a17dffc20f74b8d864b510f79986753...d0f0c374c26745688b582407fd61929a/)
   [Skipped :warning: Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/f7f9bae0316e4606a3072c1348f9bbab...454e08af24854a16985b9b36e27e14d6/)
   Buildkite builds:
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   

