You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "comphead (via GitHub)" <gi...@apache.org> on 2024/03/27 15:40:06 UTC

[PR] [WIP] Analyzer: Arc vs Cow [arrow-datafusion]

comphead opened a new pull request, #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824

   ## Which issue does this PR close?
   
   <!--
   We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123.
   -->
   
   Closes #.
   
   Related to #9764
   
   ## Rationale for this change
   This is a POC on if `Arc<String>` performs better than `Cow<&'a str>`
   <!--
    Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed.
    Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes.  
   -->
   
   ## What changes are included in this PR?
   
   <!--
   There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR.
   -->
   
   ## Are these changes tested?
   
   <!--
   We typically require tests for all PRs in order to:
   1. Prevent the code from being accidentally broken by subsequent changes
   2. Serve as another way to document the expected behavior of the code
   
   If tests are not included in your PR, please explain why (for example, are they covered by existing tests)?
   -->
   
   ## Are there any user-facing changes?
   
   <!--
   If there are user-facing changes then we may require documentation to be updated before approving the PR.
   -->
   
   <!--
   If there are any breaking changes to public APIs, please add the `api change` label.
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] perf: Use `Arc` instead of `Cow<&'a>` in the analyzer [arrow-datafusion]

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on code in PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#discussion_r1552195109


##########
datafusion/common/src/table_reference.rs:
##########
@@ -246,24 +240,7 @@ impl<'a> TableReference<'a> {
     /// Converts directly into an [`OwnedTableReference`] by cloning
     /// the underlying data.
     pub fn to_owned_reference(&self) -> OwnedTableReference {

Review Comment:
   https://github.com/apache/arrow-datafusion/pull/9933



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] perf: Use `Arc` instead of `Cow<&'a>` in the analyzer [arrow-datafusion]

Posted by "crepererum (via GitHub)" <gi...@apache.org>.
crepererum commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2031412423

   If you measure a difference between `Arc<String>` and `Arc<str>` I would say its either noise or a compiler optimizer bug. The technical difference between the two is:
   
   - `Arc<String>` stores pointer to allocation, the allocation capacity and the used length (because `String` is mutable, like `Vec`)
   - `Arc<str>` stores pointer to allocation (which always is always trimmed) and the length
   
   So `Arc<String>` is pointless since you pay for a mutable data structure that you cannot mutate anymore. Same goes for `Arc<Vec<T>>` which should always be `Arc<[T]>`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] perf: Use `Arc` instead of `Cow<&'a>` in the analyzer [arrow-datafusion]

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on code in PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#discussion_r1552194789


##########
datafusion/common/src/schema_reference.rs:
##########
@@ -38,9 +33,9 @@ impl SchemaReference<'_> {
     }
 }
 
-pub type OwnedSchemaReference = SchemaReference<'static>;
+pub type OwnedSchemaReference = SchemaReference;

Review Comment:
   https://github.com/apache/arrow-datafusion/pull/9933



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] [WIP] Analyzer: Arc vs Cow [arrow-datafusion]

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2030071719

   Latest `Arc<String>` consistently improves 5-7%, in very good cases even 14%
   
   ```
   logical_select_one_from_700
                           time:   [576.23 µs 577.70 µs 579.05 µs]
                           change: [-6.5082% -5.9265% -5.3353%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 9 outliers among 100 measurements (9.00%)
     1 (1.00%) low severe
     4 (4.00%) low mild
     1 (1.00%) high mild
     3 (3.00%) high severe
   
   physical_select_one_from_700
                           time:   [2.1965 ms 2.2016 ms 2.2071 ms]
                           change: [-14.414% -14.055% -13.708%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 9 outliers among 100 measurements (9.00%)
     4 (4.00%) high mild
     5 (5.00%) high severe
   
   Benchmarking logical_select_all_from_1000: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.6s, or reduce sample count to 70.
   logical_select_all_from_1000
                           time:   [65.417 ms 65.721 ms 66.037 ms]
                           change: [-3.9689% -3.3293% -2.7003%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 2 outliers among 100 measurements (2.00%)
     2 (2.00%) high mild
   
   Benchmarking physical_select_all_from_1000: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 47.2s, or reduce sample count to 10.
   physical_select_all_from_1000
                           time:   [469.29 ms 470.48 ms 471.69 ms]
                           change: [-4.8974% -4.5601% -4.1916%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 2 outliers among 100 measurements (2.00%)
     2 (2.00%) high mild
   
   logical_trivial_join_low_numbered_columns
                           time:   [606.50 µs 608.04 µs 609.58 µs]
                           change: [-4.6959% -4.1899% -3.6424%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 6 outliers among 100 measurements (6.00%)
     1 (1.00%) low severe
     3 (3.00%) low mild
     1 (1.00%) high mild
     1 (1.00%) high severe
   
   logical_trivial_join_high_numbered_columns
                           time:   [634.81 µs 637.24 µs 639.53 µs]
                           change: [-4.7751% -4.3011% -3.8219%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 11 outliers among 100 measurements (11.00%)
     7 (7.00%) low mild
     1 (1.00%) high mild
     3 (3.00%) high severe
   
   logical_aggregate_with_join
                           time:   [833.85 µs 835.68 µs 837.53 µs]
                           change: [-7.1013% -6.7186% -6.3543%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     1 (1.00%) low severe
     1 (1.00%) low mild
     1 (1.00%) high mild
   
   physical_plan_tpch_q1   time:   [3.5415 ms 3.5513 ms 3.5618 ms]
                           change: [-7.6268% -7.2103% -6.8134%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 7 outliers among 100 measurements (7.00%)
     5 (5.00%) high mild
     2 (2.00%) high severe
   
   physical_plan_tpch_q2   time:   [5.5515 ms 5.5835 ms 5.6185 ms]
                           change: [-11.115% -10.514% -9.8843%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 5 outliers among 100 measurements (5.00%)
     4 (4.00%) high mild
     1 (1.00%) high severe
   
   Benchmarking physical_plan_tpch_q3: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.9s, enable flat sampling, or reduce sample count to 50.
   physical_plan_tpch_q3   time:   [2.0002 ms 2.0202 ms 2.0393 ms]
                           change: [-6.7269% -5.8744% -5.0422%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   
   Benchmarking physical_plan_tpch_q4: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.5s, enable flat sampling, or reduce sample count to 50.
   physical_plan_tpch_q4   time:   [1.6361 ms 1.6525 ms 1.6692 ms]
                           change: [-7.7611% -6.8830% -6.0578%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     3 (3.00%) high mild
   
   physical_plan_tpch_q5   time:   [2.8525 ms 2.8742 ms 2.8964 ms]
                           change: [-8.1109% -7.3859% -6.6532%] (p = 0.00 < 0.05)
                           Performance has improved.
   
   Benchmarking physical_plan_tpch_q6: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.1s, enable flat sampling, or reduce sample count to 60.
   physical_plan_tpch_q6   time:   [1.1371 ms 1.1439 ms 1.1512 ms]
                           change: [-4.8222% -4.1028% -3.3634%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     3 (3.00%) high mild
   
   physical_plan_tpch_q7   time:   [4.0255 ms 4.0757 ms 4.1341 ms]
                           change: [-8.3332% -7.0531% -5.6627%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 5 outliers among 100 measurements (5.00%)
     3 (3.00%) high mild
     2 (2.00%) high severe
   
   physical_plan_tpch_q8   time:   [5.6720 ms 5.6937 ms 5.7166 ms]
                           change: [-12.538% -12.035% -11.493%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     3 (3.00%) high mild
   
   physical_plan_tpch_q9   time:   [4.2870 ms 4.2995 ms 4.3133 ms]
                           change: [-11.457% -11.051% -10.665%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 6 outliers among 100 measurements (6.00%)
     4 (4.00%) high mild
     2 (2.00%) high severe
   
   physical_plan_tpch_q10  time:   [2.8187 ms 2.8270 ms 2.8359 ms]
                           change: [-10.879% -10.380% -9.9175%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 6 outliers among 100 measurements (6.00%)
     5 (5.00%) high mild
     1 (1.00%) high severe
   
   physical_plan_tpch_q11  time:   [2.2546 ms 2.2614 ms 2.2689 ms]
                           change: [-9.0121% -8.6145% -8.1883%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     2 (2.00%) high mild
     1 (1.00%) high severe
   
   physical_plan_tpch_q12  time:   [2.0119 ms 2.0176 ms 2.0239 ms]
                           change: [-8.0297% -7.6041% -7.1867%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     1 (1.00%) high mild
     2 (2.00%) high severe
   
   Benchmarking physical_plan_tpch_q13: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.4s, enable flat sampling, or reduce sample count to 60.
   physical_plan_tpch_q13  time:   [1.2584 ms 1.2620 ms 1.2663 ms]
                           change: [-7.9586% -7.4495% -6.9731%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 8 outliers among 100 measurements (8.00%)
     2 (2.00%) low mild
     4 (4.00%) high mild
     2 (2.00%) high severe
   
   Benchmarking physical_plan_tpch_q14: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.3s, enable flat sampling, or reduce sample count to 50.
   physical_plan_tpch_q14  time:   [1.6386 ms 1.6432 ms 1.6482 ms]
                           change: [-17.798% -15.212% -12.713%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 4 outliers among 100 measurements (4.00%)
     1 (1.00%) low mild
     1 (1.00%) high mild
     2 (2.00%) high severe
   
   physical_plan_tpch_q16  time:   [2.3289 ms 2.3359 ms 2.3439 ms]
                           change: [-14.684% -14.050% -13.412%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 4 outliers among 100 measurements (4.00%)
     3 (3.00%) high mild
     1 (1.00%) high severe
   
   physical_plan_tpch_q17  time:   [2.1194 ms 2.1255 ms 2.1323 ms]
                           change: [-9.7062% -9.3504% -8.9638%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 6 outliers among 100 measurements (6.00%)
     4 (4.00%) high mild
     2 (2.00%) high severe
   
   physical_plan_tpch_q18  time:   [2.3301 ms 2.3373 ms 2.3448 ms]
                           change: [-11.888% -11.391% -10.910%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     3 (3.00%) high mild
   
   physical_plan_tpch_q19  time:   [5.2007 ms 5.2191 ms 5.2392 ms]
                           change: [-9.2573% -8.7779% -8.2815%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 4 outliers among 100 measurements (4.00%)
     4 (4.00%) high mild
   
   physical_plan_tpch_q20  time:   [2.8280 ms 2.8368 ms 2.8466 ms]
                           change: [-9.9140% -9.4954% -9.0702%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high severe
   
   physical_plan_tpch_q21  time:   [4.1423 ms 4.1551 ms 4.1689 ms]
                           change: [-12.567% -12.169% -11.735%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 7 outliers among 100 measurements (7.00%)
     5 (5.00%) high mild
     2 (2.00%) high severe
   
   physical_plan_tpch_q22  time:   [2.0965 ms 2.1030 ms 2.1100 ms]
                           change: [-8.2668% -7.7013% -7.1833%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 4 outliers among 100 measurements (4.00%)
     4 (4.00%) high mild
   
   Benchmarking physical_plan_tpch_all: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.5s, or reduce sample count to 70.
   physical_plan_tpch_all  time:   [64.117 ms 64.298 ms 64.491 ms]
                           change: [-12.073% -11.663% -11.246%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 4 outliers among 100 measurements (4.00%)
     4 (4.00%) high mild
   
   logical_plan_tpch_all   time:   [12.591 ms 12.638 ms 12.686 ms]
                           change: [-2.1208% -1.5156% -0.8282%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 4 outliers among 100 measurements (4.00%)
     1 (1.00%) low mild
     3 (3.00%) high mild
   ```
   
   @alamb @jayzhan211 which way you guys prefer? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] perf: Use `Arc` instead of `Cow<&'a>` in the analyzer [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2032014365

   > @alamb @jayzhan211 which way you guys prefer?
   
   I prefer `Arc<str>` for the reasons mentioned by @crepererum https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2031412423


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] perf: Use `Arc` instead of `Cow<&'a>` in the analyzer [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2032949830

   > I'll merge it as is, and will create another one with Arc again, so we can see if there is more benefits
   
   Sounds good -- we can also potentially switch to using `Arc<String>` as a follow on if we run a benchmark and show it going faster


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] [WIP] Analyzer: Arc vs Cow [arrow-datafusion]

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2025450686

   > Is it possible to replace with `Arc<str>`?
   
   I'll try but imho I dont expect much diff as those strings are mostly immutable and should be much diff between `Vec<u8>` which is `String` and `[u8]` which is str


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] [WIP] Analyzer: Arc vs Cow [arrow-datafusion]

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2025744064

   yes, its 1 more level of indirection, I'll try to go with `Arc<str>`, lets see how many changes it will take


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] perf: Use `Arc` instead of `Cow<&'a>` in the analyzer [arrow-datafusion]

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on code in PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#discussion_r1548135759


##########
datafusion/common/src/table_reference.rs:
##########
@@ -246,24 +240,7 @@ impl<'a> TableReference<'a> {
     /// Converts directly into an [`OwnedTableReference`] by cloning
     /// the underlying data.
     pub fn to_owned_reference(&self) -> OwnedTableReference {

Review Comment:
   gonna deprecate this method in following PR



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] [WIP] Analyzer: Arc vs Cow [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2025726557

   > @alamb @jayzhan211 I can see some improvements for the `Arc<String>` let me know folks if you wanna proceed with this
   
   I think we should proceed 🚀 
   
   What I think we should do is get version 37 released so we can proceed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] [WIP] Analyzer: Arc vs Cow [arrow-datafusion]

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2024189261

   @alamb @jayzhan211 I can see some improvements for the `Arc<String>` let me know folks if you wanna proceed with this


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] [WIP] Analyzer: Arc vs Cow [arrow-datafusion]

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead closed pull request #9824: [WIP] Analyzer: Arc<String> vs Cow
URL: https://github.com/apache/arrow-datafusion/pull/9824


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] [WIP] Analyzer: Arc vs Cow [arrow-datafusion]

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2027777796

   oops


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] perf: Use `Arc` instead of `Cow<&'a>` in the analyzer [arrow-datafusion]

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2032414750

   Modified to `Arc<str>` as discussed. The better performance benefit for `Arc<String>` I think still explained by using String more often than str and the conversion is cheaper


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] perf: Use `Arc` instead of `Cow<&'a>` in the analyzer [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2032016131

   I am feeling DataFusion planning time is going to be very much improved in Version 38.0.0 🚀  -- I'll check this PR out later today. Very excited


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] [WIP] Analyzer: Arc vs Cow [arrow-datafusion]

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2023092603

   @alamb do you run sql planner benchmark vs baseline? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] [WIP] Analyzer: Arc vs Cow [arrow-datafusion]

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2027823916

   with Arc<str>. I'd say with Arc<String> numbers looked slightly better
   ```
   logical_select_one_from_700
                           time:   [580.12 µs 585.06 µs 589.73 µs]
                           change: [-6.3590% -5.7328% -5.0564%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   
   physical_select_one_from_700
                           time:   [2.3826 ms 2.3916 ms 2.4006 ms]
                           change: [-7.0877% -6.6384% -6.1781%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     3 (3.00%) high mild
   
   Benchmarking logical_select_all_from_1000: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.2s, or reduce sample count to 60.
   logical_select_all_from_1000
                           time:   [70.414 ms 70.720 ms 71.037 ms]
                           change: [+3.3675% +4.0240% +4.6863%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   
   Benchmarking physical_select_all_from_1000: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 50.0s, or reduce sample count to 10.
   physical_select_all_from_1000
                           time:   [506.06 ms 507.48 ms 508.95 ms]
                           change: [+2.5742% +2.9462% +3.3299%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 4 outliers among 100 measurements (4.00%)
     4 (4.00%) high mild
   
   logical_trivial_join_low_numbered_columns
                           time:   [587.37 µs 592.07 µs 595.97 µs]
                           change: [-7.8608% -6.8978% -6.0233%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 13 outliers among 100 measurements (13.00%)
     7 (7.00%) low severe
     2 (2.00%) low mild
     2 (2.00%) high mild
     2 (2.00%) high severe
   
   logical_trivial_join_high_numbered_columns
                           time:   [630.89 µs 632.24 µs 633.53 µs]
                           change: [-5.9321% -5.4946% -5.0383%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 9 outliers among 100 measurements (9.00%)
     1 (1.00%) low severe
     4 (4.00%) low mild
     4 (4.00%) high mild
   
   logical_aggregate_with_join
                           time:   [823.32 µs 827.32 µs 830.69 µs]
                           change: [-8.0075% -7.5928% -7.1907%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     1 (1.00%) low severe
     2 (2.00%) low mild
   
   physical_plan_tpch_q1   time:   [3.7308 ms 3.7435 ms 3.7561 ms]
                           change: [-2.6559% -2.1877% -1.7373%] (p = 0.00 < 0.05)
                           Performance has improved.
   
   physical_plan_tpch_q2   time:   [5.7711 ms 5.7942 ms 5.8175 ms]
                           change: [-7.6441% -7.1368% -6.6634%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 2 outliers among 100 measurements (2.00%)
     1 (1.00%) low mild
     1 (1.00%) high mild
   
   Benchmarking physical_plan_tpch_q3: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 10.0s, enable flat sampling, or reduce sample count to 40.
   physical_plan_tpch_q3   time:   [1.9749 ms 1.9794 ms 1.9844 ms]
                           change: [-7.4539% -6.9879% -6.5380%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 7 outliers among 100 measurements (7.00%)
     1 (1.00%) low severe
     1 (1.00%) low mild
     4 (4.00%) high mild
     1 (1.00%) high severe
   
   Benchmarking physical_plan_tpch_q4: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.3s, enable flat sampling, or reduce sample count to 50.
   physical_plan_tpch_q4   time:   [1.6496 ms 1.6537 ms 1.6581 ms]
                           change: [-6.7606% -6.0924% -5.4526%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     1 (1.00%) low mild
     2 (2.00%) high mild
   
   physical_plan_tpch_q5   time:   [2.8363 ms 2.8451 ms 2.8541 ms]
                           change: [-8.7362% -8.3231% -7.8946%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 6 outliers among 100 measurements (6.00%)
     1 (1.00%) low mild
     5 (5.00%) high mild
   
   Benchmarking physical_plan_tpch_q6: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.7s, enable flat sampling, or reduce sample count to 60.
   physical_plan_tpch_q6   time:   [1.1385 ms 1.1419 ms 1.1455 ms]
                           change: [-4.6024% -3.9168% -3.1923%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 7 outliers among 100 measurements (7.00%)
     2 (2.00%) low mild
     4 (4.00%) high mild
     1 (1.00%) high severe
   
   physical_plan_tpch_q7   time:   [4.0771 ms 4.0917 ms 4.1067 ms]
                           change: [-7.1486% -6.6862% -6.1909%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   
   physical_plan_tpch_q8   time:   [6.0481 ms 6.0697 ms 6.0910 ms]
                           change: [-6.7553% -6.2264% -5.7076%] (p = 0.00 < 0.05)
                           Performance has improved.
   
   physical_plan_tpch_q9   time:   [4.5076 ms 4.5234 ms 4.5394 ms]
                           change: [-6.8838% -6.4167% -5.9600%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   
   physical_plan_tpch_q10  time:   [2.9380 ms 2.9475 ms 2.9574 ms]
                           change: [-7.1054% -6.5586% -6.0542%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 5 outliers among 100 measurements (5.00%)
     5 (5.00%) high mild
   
   physical_plan_tpch_q11  time:   [2.3309 ms 2.3379 ms 2.3450 ms]
                           change: [-5.9245% -5.5252% -5.1065%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     1 (1.00%) low mild
     2 (2.00%) high mild
   
   physical_plan_tpch_q12  time:   [2.0939 ms 2.1058 ms 2.1193 ms]
                           change: [-4.2735% -3.5686% -2.8289%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 7 outliers among 100 measurements (7.00%)
     3 (3.00%) high mild
     4 (4.00%) high severe
   
   Benchmarking physical_plan_tpch_q13: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.7s, enable flat sampling, or reduce sample count to 60.
   physical_plan_tpch_q13  time:   [1.3206 ms 1.3255 ms 1.3305 ms]
                           change: [-3.3886% -2.7905% -2.2341%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   
   Benchmarking physical_plan_tpch_q14: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.8s, enable flat sampling, or reduce sample count to 50.
   physical_plan_tpch_q14  time:   [1.7025 ms 1.7112 ms 1.7211 ms]
                           change: [-14.099% -11.398% -8.8080%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     3 (3.00%) high mild
   
   physical_plan_tpch_q16  time:   [2.4419 ms 2.4517 ms 2.4616 ms]
                           change: [-10.500% -9.7909% -9.1054%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   
   physical_plan_tpch_q17  time:   [2.2205 ms 2.2303 ms 2.2406 ms]
                           change: [-5.3559% -4.8812% -4.3619%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     1 (1.00%) low mild
     1 (1.00%) high mild
     1 (1.00%) high severe
   
   physical_plan_tpch_q18  time:   [2.4850 ms 2.4959 ms 2.5070 ms]
                           change: [-5.9772% -5.3788% -4.8142%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     3 (3.00%) high mild
   
   physical_plan_tpch_q19  time:   [5.5130 ms 5.5335 ms 5.5539 ms]
                           change: [-3.7923% -3.2835% -2.7258%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) low mild
   
   physical_plan_tpch_q20  time:   [2.9745 ms 2.9860 ms 2.9981 ms]
                           change: [-5.2189% -4.7349% -4.2429%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 2 outliers among 100 measurements (2.00%)
     2 (2.00%) high mild
   
   physical_plan_tpch_q21  time:   [4.4102 ms 4.4247 ms 4.4394 ms]
                           change: [-6.9218% -6.4708% -6.0108%] (p = 0.00 < 0.05)
                           Performance has improved.
   
   physical_plan_tpch_q22  time:   [2.1663 ms 2.1753 ms 2.1852 ms]
                           change: [-5.1750% -4.5239% -3.8635%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 8 outliers among 100 measurements (8.00%)
     1 (1.00%) low mild
     3 (3.00%) high mild
     4 (4.00%) high severe
   
   Benchmarking physical_plan_tpch_all: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.8s, or reduce sample count to 70.
   physical_plan_tpch_all  time:   [67.608 ms 67.925 ms 68.253 ms]
                           change: [-7.2306% -6.6810% -6.1284%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   
   logical_plan_tpch_all   time:   [12.434 ms 12.540 ms 12.636 ms]
                           change: [-3.4299% -2.2768% -1.3315%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 6 outliers among 100 measurements (6.00%)
     4 (4.00%) low severe
     2 (2.00%) low mild
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] perf: Use `Arc` instead of `Cow<&'a>` in the analyzer [arrow-datafusion]

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead merged PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] perf: Use `Arc` instead of `Cow<&'a>` in the analyzer [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2033115088

   Update it turns out that I got significant performance improvements by using `impl Into<Arc<str>>` -- https://github.com/apache/arrow-datafusion/pull/9916 🚀 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] [WIP] Analyzer: Arc vs Cow [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2025725269

   > > Is it possible to replace with `Arc<str>`?
   > 
   > I'll try but imho I dont expect much diff as those strings are mostly immutable and should not be much diff between `Vec<u8>` which is `String` and `[u8]` which is str
   
   I agree there is not a lot of practical difference -- but I think the `Arc<str>`is the more standard (and does avoid one level of indirection:
   
   for `Arc<str>` it is `Arc` --> `[u8]`
   For `Arc<String>` it is `Arc` -> `String/Vec` -> `[u8]`
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] perf: Use `Arc` instead of `Cow<&'a>` in the analyzer [arrow-datafusion]

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2033123242

   > Update it turns out that I got significant performance improvements by using `impl Into<Arc<str>>` -- #9916 🚀
   
   Yes, my very first implementation was to get the `impl Into<String>` instead of &str. Looks like compiler in this case can find optimal construction and avoid reallocation


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] perf: Use `Arc` instead of `Cow<&'a>` in the analyzer [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2032903998

   I am also running the benchmarks as well and I am going to try and make a test PR that tries using `Impl Into<Arc<str>>` and see if the Rust compiler is smart enough to make that faster. Will update here with results
   
   ps. it would be sweet to see what this does for the TPC-DS benchmarks https://github.com/apache/arrow-datafusion/pull/9907


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] perf: Use `Arc` instead of `Cow<&'a>` in the analyzer [arrow-datafusion]

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2032946816

   I'll merge it as is, and will create another one with Arc<String> again, so we can see if there is more benefits


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] [WIP] Analyzer: Arc vs Cow [arrow-datafusion]

Posted by "jayzhan211 (via GitHub)" <gi...@apache.org>.
jayzhan211 commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2024741872

   Is it possible to replace with `Arc<str>`?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] [WIP] Analyzer: Arc vs Cow [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2026069505

   Most of the time, you can make an `Arc<str>` like `Arc::from(my_str_variable)` (which I didn't fully appreciate when I first encountered the pattern. I think @crepererum  showed me that one back in the day)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] [WIP] Analyzer: Arc vs Cow [arrow-datafusion]

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#issuecomment-2024153821

   With `Arc<String>`
   ```
   logical_select_one_from_700
                           time:   [590.03 µs 591.75 µs 593.50 µs]
                           change: [-3.8311% -3.3439% -2.8110%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 8 outliers among 100 measurements (8.00%)
     3 (3.00%) low mild
     4 (4.00%) high mild
     1 (1.00%) high severe
   
   physical_select_one_from_700
                           time:   [2.5903 ms 2.5966 ms 2.6029 ms]
                           change: [+0.9541% +1.3643% +1.7661%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   
   Benchmarking logical_select_all_from_1000: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.9s, or reduce sample count to 70.
   logical_select_all_from_1000
                           time:   [68.628 ms 68.932 ms 69.243 ms]
                           change: [+0.7792% +1.3932% +2.0657%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 3 outliers among 100 measurements (3.00%)
     1 (1.00%) low mild
     2 (2.00%) high mild
   
   Benchmarking physical_select_all_from_1000: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 51.2s, or reduce sample count to 10.
   physical_select_all_from_1000
                           time:   [509.48 ms 510.84 ms 512.22 ms]
                           change: [+3.2665% +3.6282% +4.0088%] (p = 0.00 < 0.05)
                           Performance has regressed.
   
   logical_trivial_join_low_numbered_columns
                           time:   [608.20 µs 610.94 µs 613.41 µs]
                           change: [-4.3832% -3.7492% -3.0800%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 7 outliers among 100 measurements (7.00%)
     5 (5.00%) low mild
     1 (1.00%) high mild
     1 (1.00%) high severe
   
   logical_trivial_join_high_numbered_columns
                           time:   [646.02 µs 648.13 µs 650.31 µs]
                           change: [-3.2821% -2.7965% -2.3021%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 5 outliers among 100 measurements (5.00%)
     1 (1.00%) low severe
     2 (2.00%) low mild
     1 (1.00%) high mild
     1 (1.00%) high severe
   
   logical_aggregate_with_join
                           time:   [849.03 µs 851.57 µs 854.03 µs]
                           change: [-5.5567% -5.1253% -4.6791%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 5 outliers among 100 measurements (5.00%)
     1 (1.00%) low severe
     3 (3.00%) low mild
     1 (1.00%) high mild
   
   physical_plan_tpch_q1   time:   [3.7277 ms 3.7385 ms 3.7494 ms]
                           change: [-2.7537% -2.3180% -1.8824%] (p = 0.00 < 0.05)
                           Performance has improved.
   
   physical_plan_tpch_q2   time:   [5.7276 ms 5.7456 ms 5.7636 ms]
                           change: [-8.3477% -7.9163% -7.4741%] (p = 0.00 < 0.05)
                           Performance has improved.
   
   Benchmarking physical_plan_tpch_q3: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 10.0s, enable flat sampling, or reduce sample count to 40.
   physical_plan_tpch_q3   time:   [1.9718 ms 1.9769 ms 1.9820 ms]
                           change: [-7.8350% -7.3778% -6.9367%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 5 outliers among 100 measurements (5.00%)
     4 (4.00%) low mild
     1 (1.00%) high mild
   
   Benchmarking physical_plan_tpch_q4: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.4s, enable flat sampling, or reduce sample count to 50.
   physical_plan_tpch_q4   time:   [1.6532 ms 1.6574 ms 1.6617 ms]
                           change: [-6.2634% -5.5640% -4.9105%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     2 (2.00%) high mild
     1 (1.00%) high severe
   
   physical_plan_tpch_q5   time:   [2.8426 ms 2.8507 ms 2.8590 ms]
                           change: [-8.5504% -8.1407% -7.7324%] (p = 0.00 < 0.05)
                           Performance has improved.
   
   Benchmarking physical_plan_tpch_q6: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.8s, enable flat sampling, or reduce sample count to 60.
   physical_plan_tpch_q6   time:   [1.1498 ms 1.1532 ms 1.1563 ms]
                           change: [-4.0547% -3.4204% -2.7697%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     2 (2.00%) low mild
     1 (1.00%) high mild
   
   physical_plan_tpch_q7   time:   [4.0766 ms 4.0897 ms 4.1029 ms]
                           change: [-7.1757% -6.7318% -6.2923%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) low mild
   
   physical_plan_tpch_q8   time:   [5.9329 ms 5.9529 ms 5.9730 ms]
                           change: [-8.5300% -8.0299% -7.5207%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   
   physical_plan_tpch_q9   time:   [4.4875 ms 4.5015 ms 4.5155 ms]
                           change: [-7.3168% -6.8709% -6.4349%] (p = 0.00 < 0.05)
                           Performance has improved.
   
   physical_plan_tpch_q10  time:   [2.9242 ms 2.9325 ms 2.9411 ms]
                           change: [-7.5414% -7.0326% -6.5287%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   
   physical_plan_tpch_q11  time:   [2.3420 ms 2.3486 ms 2.3553 ms]
                           change: [-5.4976% -5.0941% -4.6762%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   
   physical_plan_tpch_q12  time:   [2.0776 ms 2.0854 ms 2.0939 ms]
                           change: [-5.0134% -4.5018% -3.9740%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 5 outliers among 100 measurements (5.00%)
     4 (4.00%) high mild
     1 (1.00%) high severe
   
   Benchmarking physical_plan_tpch_q13: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.5s, enable flat sampling, or reduce sample count to 60.
   physical_plan_tpch_q13  time:   [1.2825 ms 1.2861 ms 1.2898 ms]
                           change: [-6.2738% -5.7308% -5.2177%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) low mild
   
   Benchmarking physical_plan_tpch_q14: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.6s, enable flat sampling, or reduce sample count to 50.
   physical_plan_tpch_q14  time:   [1.6919 ms 1.6980 ms 1.7048 ms]
                           change: [-15.030% -12.371% -9.8084%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 7 outliers among 100 measurements (7.00%)
     2 (2.00%) low mild
     4 (4.00%) high mild
     1 (1.00%) high severe
   
   physical_plan_tpch_q16  time:   [2.4186 ms 2.4252 ms 2.4318 ms]
                           change: [-11.400% -10.767% -10.137%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   
   physical_plan_tpch_q17  time:   [2.2207 ms 2.2270 ms 2.2333 ms]
                           change: [-5.3964% -5.0215% -4.6541%] (p = 0.00 < 0.05)
                           Performance has improved.
   
   physical_plan_tpch_q18  time:   [2.4501 ms 2.4612 ms 2.4730 ms]
                           change: [-7.2748% -6.6948% -6.0694%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 5 outliers among 100 measurements (5.00%)
     3 (3.00%) high mild
     2 (2.00%) high severe
   
   physical_plan_tpch_q19  time:   [5.5188 ms 5.5361 ms 5.5539 ms]
                           change: [-3.7142% -3.2374% -2.7330%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 2 outliers among 100 measurements (2.00%)
     2 (2.00%) high mild
   
   physical_plan_tpch_q20  time:   [2.9689 ms 2.9796 ms 2.9906 ms]
                           change: [-5.3842% -4.9395% -4.4579%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 2 outliers among 100 measurements (2.00%)
     2 (2.00%) high mild
   
   physical_plan_tpch_q21  time:   [4.4561 ms 4.4756 ms 4.4965 ms]
                           change: [-5.9124% -5.3952% -4.8319%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 2 outliers among 100 measurements (2.00%)
     1 (1.00%) high mild
     1 (1.00%) high severe
   
   physical_plan_tpch_q22  time:   [2.1949 ms 2.2018 ms 2.2088 ms]
                           change: [-3.9778% -3.3629% -2.7916%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   
   Benchmarking physical_plan_tpch_all: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.8s, or reduce sample count to 70.
   physical_plan_tpch_all  time:   [67.107 ms 67.291 ms 67.476 ms]
                           change: [-7.9768% -7.5521% -7.1246%] (p = 0.00 < 0.05)
                           Performance has improved.
   
   logical_plan_tpch_all   time:   [12.417 ms 12.473 ms 12.528 ms]
                           change: [-3.4870% -2.8011% -2.1019%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 2 outliers among 100 measurements (2.00%)
     1 (1.00%) low mild
     1 (1.00%) high mild
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] perf: Use `Arc` instead of `Cow<&'a>` in the analyzer [arrow-datafusion]

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on code in PR #9824:
URL: https://github.com/apache/arrow-datafusion/pull/9824#discussion_r1548135140


##########
datafusion/common/src/schema_reference.rs:
##########
@@ -38,9 +33,9 @@ impl SchemaReference<'_> {
     }
 }
 
-pub type OwnedSchemaReference = SchemaReference<'static>;
+pub type OwnedSchemaReference = SchemaReference;

Review Comment:
   Gonna deprecate this type in following PR 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org