Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/10/10 18:57:57 UTC

[GitHub] [arrow] andygrove edited a comment on pull request #8428: ARROW-10251: [Rust] [DataFusion] MemTable::load() now loads partitions in parallel

andygrove edited a comment on pull request #8428:
URL: https://github.com/apache/arrow/pull/8428#issuecomment-706595614


   For the TPC-H benchmark with `--mem-table`, this gave me a ~10x speedup in load times. FYI @jhorstmann
   
   ```
   Running benchmarks with the following options: TpchOpt { query: 1, debug: false, iterations: 3, concurrency: 24, batch_size: 4096, path: "/mnt/tpch/s1/parquet", file_format: "parquet", mem_table: true }
   Loading data into memory
   Loaded data into memory in 486 ms
   Query 1 iteration 0 took 166 ms
   Query 1 iteration 1 took 154 ms
   Query 1 iteration 2 took 156 ms
   ```
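
For readers who want a picture of what "loading partitions in parallel" can look like, here is a minimal std-only sketch. It is not the code from this PR and does not use DataFusion's API: `load_partition` and `load_all_parallel` are hypothetical names, and the "load" just builds a vector in place of reading a Parquet file. It only illustrates spawning one task per partition and joining the results in partition order.

```rust
use std::thread;

// Hypothetical stand-in for reading one partition (e.g. one Parquet file)
// into memory. Here it just builds a vector of synthetic row values.
fn load_partition(partition: usize, rows: usize) -> Vec<u64> {
    (0..rows as u64)
        .map(|i| partition as u64 * 1_000 + i)
        .collect()
}

// Load all partitions concurrently, one OS thread per partition, and
// return the results in partition order.
fn load_all_parallel(num_partitions: usize, rows: usize) -> Vec<Vec<u64>> {
    // Spawn every loader first so they all run at the same time.
    let handles: Vec<thread::JoinHandle<Vec<u64>>> = (0..num_partitions)
        .map(|p| thread::spawn(move || load_partition(p, rows)))
        .collect();

    // Join in spawn order, which preserves partition order.
    handles
        .into_iter()
        .map(|h| h.join().expect("partition loader panicked"))
        .collect()
}

fn main() {
    let partitions = load_all_parallel(4, 3);
    for (i, p) in partitions.iter().enumerate() {
        println!("partition {} has {} rows", i, p.len());
    }
}
```

With I/O-bound or decode-heavy partitions, the wall-clock load time in a scheme like this tends toward that of the slowest partition rather than the sum of all of them, which is the kind of effect the load-time numbers above reflect.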

