Posted to github@arrow.apache.org by "andygrove (via GitHub)" <gi...@apache.org> on 2023/04/17 14:18:30 UTC

[GitHub] [arrow-datafusion] andygrove commented on issue #5999: Improve DataFusion scalability as more cores are added

andygrove commented on issue #5999:
URL: https://github.com/apache/arrow-datafusion/issues/5999#issuecomment-1511449038

   I ran the benchmarks with a simpler version of q1 with just the table scan and filter:
   
   ```sql
   select
       *
   from
       lineitem
   where
       l_shipdate <= date '1998-12-01' - interval '113 days';
   ```
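
The filter's cutoff date can be checked with a few lines of standard-library Python. This is only a sketch of the date arithmetic in the `where` clause; it does not run the query, and the variable name `cutoff` is illustrative.

```python
from datetime import date, timedelta

# Evaluate the filter's right-hand side:
#   date '1998-12-01' - interval '113 days'
cutoff = date(1998, 12, 1) - timedelta(days=113)

print(cutoff.isoformat())
```

The result is 1998-08-10, so the query keeps every `lineitem` row with `l_shipdate` on or before that date.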
   
   Here are the results (lower is better; the last column is the DataFusion time divided by the DuckDB time):
   
   Cores | DataFusion Python 21.0.0 | DuckDB | DuckDB speedup over DataFusion (x)
   -- | -- | -- | --
   1 | 14376 | 15000.9 | 0.96
   2 | 7547.9 | 7915.7 | 0.95
   4 | 4243.9 | 3850.7 | 1.10
   8 | 2655.8 | 1869.3 | 1.42
   16 | 2096.6 | 954.9 | 2.20
   32 | 2252.2 | 531.7 | 4.24
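
As a rough check of the last column, and to make the scaling trend explicit, the sketch below recomputes the ratios from the timings in the table. It assumes the values are elapsed times where lower is better (the unit is not stated in the comment); the helper names `duckdb_speedup` and `self_speedup` are hypothetical and exist only for this illustration.

```python
# Timings copied from the table above, keyed by core count:
# (DataFusion Python 21.0.0, DuckDB).
# The unit is not stated in the comment; lower is assumed to be better.
TIMINGS = {
    1: (14376.0, 15000.9),
    2: (7547.9, 7915.7),
    4: (4243.9, 3850.7),
    8: (2655.8, 1869.3),
    16: (2096.6, 954.9),
    32: (2252.2, 531.7),
}


def duckdb_speedup(datafusion_time, duckdb_time):
    """How many times faster DuckDB is than DataFusion (hypothetical helper)."""
    return datafusion_time / duckdb_time


def self_speedup(times_by_cores, cores):
    """Speedup of one engine at `cores` relative to its own 1-core run."""
    return times_by_cores[1] / times_by_cores[cores]


datafusion = {cores: pair[0] for cores, pair in TIMINGS.items()}
duckdb = {cores: pair[1] for cores, pair in TIMINGS.items()}

# Columns: cores, DuckDB vs DataFusion, DataFusion vs its 1-core run,
# DuckDB vs its 1-core run.
for cores, (df_time, duck_time) in TIMINGS.items():
    print(
        cores,
        round(duckdb_speedup(df_time, duck_time), 2),
        round(self_speedup(datafusion, cores), 2),
        round(self_speedup(duckdb, cores), 2),
    )
```

On these numbers, DuckDB at 32 cores is roughly 28x faster than its own single-core run, while DataFusion reaches roughly 6.4x and is slower at 32 cores than at 16 (2252.2 versus 2096.6).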
   
   

