Posted to github@arrow.apache.org by "jychen7 (via GitHub)" <gi...@apache.org> on 2023/04/09 02:47:12 UTC

[GitHub] [arrow-datafusion] jychen7 commented on issue #5404: Datafusion v19.rc1 scan parquet 20x slower than DuckDB v0.6.1 on 15GB ClickBench data

jychen7 commented on issue #5404:
URL: https://github.com/apache/arrow-datafusion/issues/5404#issuecomment-1501025273

   > Perhaps you may be able to try with https://github.com/apache/arrow-datafusion/pull/5416 and the various predicate pushdown features enabled on ParquetOptions
   
   @tustvold based on the `explain` result in https://github.com/apache/arrow-datafusion/issues/5404#issuecomment-1447472667:
   Expected: predicate pushdown and limit pushdown
   Actual: predicate pushdown, but the limit is NOT pushed down
   
   I understand that your PR https://github.com/apache/arrow-datafusion/pull/5416 supports `limit pushdown` in the physical plan, but it looks like the above query does not get a limit pushdown in either the logical plan or the physical plan.
   
   Do you think this is something worth improving ❓  I haven't checked the related code in the logical plan yet, so I will try to take a look tomorrow.
   (I feel like when ALL predicates are pushed down, we can push down the limit as well. Is this generally true?)
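
To make the question above concrete, here is a toy sketch in plain Python (not DataFusion code; `scan`, `filter_then_limit`, and `limit_below_filter` are hypothetical helpers invented for illustration). It assumes the scan evaluates the predicate exactly, row by row. Under that assumption, a limit pushed into the scan gives the same rows as filtering first and limiting afterwards. If the limit is pushed below a filter that the scan does not evaluate, rows can go missing.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]
Predicate = Callable[[Row], bool]


def scan(rows: List[Row],
         predicate: Optional[Predicate] = None,
         limit: Optional[int] = None) -> List[Row]:
    """Toy scan: applies the predicate exactly, then stops at `limit` rows."""
    out: List[Row] = []
    for row in rows:
        if limit is not None and len(out) >= limit:
            break  # enough matching rows, stop reading early
        if predicate is not None and not predicate(row):
            continue
        out.append(row)
    return out


def filter_then_limit(rows: List[Row], predicate: Predicate, limit: int) -> List[Row]:
    """Reference semantics: Limit on top of Filter on top of a full scan."""
    return [r for r in rows if predicate(r)][:limit]


def limit_below_filter(rows: List[Row], predicate: Predicate, limit: int) -> List[Row]:
    """Unsafe rewrite: the scan gets the limit but NOT the predicate."""
    return [r for r in scan(rows, limit=limit) if predicate(r)]


# Made-up data: every third row matches the predicate.
rows = [{"id": i, "url": "google" if i % 3 == 0 else "other"} for i in range(10)]
pred = lambda r: r["url"] == "google"

expected = filter_then_limit(rows, pred, 2)
exact = scan(rows, predicate=pred, limit=2)   # predicate and limit both in the scan
wrong = limit_below_filter(rows, pred, 2)     # only the limit in the scan

print([r["id"] for r in expected])  # [0, 3]
print([r["id"] for r in exact])     # [0, 3]
print([r["id"] for r in wrong])     # [0]
```

In this toy model the rewrite is safe only because the scan filters exactly. If a pushed-down predicate is used only to skip chunks of data that cannot match, the filter above the scan still has to run, and the limit has to stay above it.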

