Posted to github@arrow.apache.org by "avantgardnerio (via GitHub)" <gi...@apache.org> on 2023/02/22 21:00:03 UTC

[GitHub] [arrow-datafusion] avantgardnerio opened a new pull request, #5367: Try to push down full filter before break-up

avantgardnerio opened a new pull request, #5367:
URL: https://github.com/apache/arrow-datafusion/pull/5367

   # Which issue does this PR close?
   
   Closes #5357.
   
   # Rationale for this change
   
   Allow TableProviders with multi-column indexes to fully filter results. Presently, if a TableProvider is ordered by `[col_a, col_b]` and gets a query `where col_a=1 and col_b=2`, it receives two calls to `supports_filter_pushdown()`: one for `col_a` and one for `col_b`. It has to return `Exact` for `col_a` and `Unsupported` for `col_b`: it can filter the two columns together, and it can filter `col_a` alone because that column is first in the sort order, but it cannot filter `col_b` on its own.
   
   This change would allow the TableProvider to say "Yes, I can filter expressions like `col_a = x and col_b = y`."
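
To make the prefix rule concrete, here is a minimal sketch of the provider-side decision. `Expr`, `PushDown`, `IndexedTable`, and `eq_columns` are simplified, hypothetical stand-ins introduced only for illustration; they are not DataFusion's real types, and a real provider would inspect DataFusion's own `Expr` tree.

```rust
// Simplified stand-ins for illustration; these are NOT DataFusion's real types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// `column = literal`
    Eq(String, i64),
    /// `left AND right`
    And(Box<Expr>, Box<Expr>),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum PushDown {
    Unsupported,
    Exact,
}

/// Hypothetical provider whose rows are sorted by `index_cols`, in order.
struct IndexedTable {
    index_cols: Vec<String>,
}

/// Collect the columns that a conjunction constrains by equality.
fn eq_columns<'a>(expr: &'a Expr, out: &mut Vec<&'a str>) {
    match expr {
        Expr::Eq(col, _) => out.push(col.as_str()),
        Expr::And(l, r) => {
            eq_columns(l, out);
            eq_columns(r, out);
        }
    }
}

impl IndexedTable {
    /// A filter can be answered from the index only if the constrained columns
    /// form a prefix of it: `col_a`, or `col_a` and `col_b`, never `col_b` alone.
    fn supports_filter_pushdown(&self, filter: &Expr) -> PushDown {
        let mut cols = Vec::new();
        eq_columns(filter, &mut cols);
        let prefix = &self.index_cols[..cols.len().min(self.index_cols.len())];
        let covered = cols.len() <= self.index_cols.len()
            && prefix.iter().all(|c| cols.contains(&c.as_str()));
        if covered {
            PushDown::Exact
        } else {
            PushDown::Unsupported
        }
    }
}

fn main() {
    let table = IndexedTable {
        index_cols: vec!["col_a".to_string(), "col_b".to_string()],
    };
    let a = Expr::Eq("col_a".to_string(), 1);
    let b = Expr::Eq("col_b".to_string(), 2);
    let both = Expr::And(Box::new(a.clone()), Box::new(b.clone()));
    println!("col_a alone: {:?}", table.supports_filter_pushdown(&a));
    println!("col_b alone: {:?}", table.supports_filter_pushdown(&b));
    println!("both:        {:?}", table.supports_filter_pushdown(&both));
}
```

The point of the sketch is that the answer for `col_b` depends on whether `col_a` is present in the same expression, which is information a per-column call cannot carry.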
   
   # What changes are included in this PR?
   
   Attempt pushing down the full expression before breaking it up.
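
One way to read "full expression first, then break it up" is sketched below. The types, `plan_pushdown`, and `index_on_a_then_b` are hypothetical toy stand-ins, not the optimizer code from this PR; only the three-way handling of `Unsupported` / `Inexact` / `Exact` mirrors the existing rule in `push_down_filter.rs`.

```rust
// Toy stand-ins for illustration only; not DataFusion's real types or optimizer code.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Eq(&'static str, i64),
    And(Box<Expr>, Box<Expr>),
}

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum PushDown {
    Unsupported,
    Inexact,
    Exact,
}

/// Break `a AND b AND c` into `[a, b, c]` without cloning.
fn split_conjunction(expr: &Expr) -> Vec<&Expr> {
    match expr {
        Expr::And(l, r) => {
            let mut parts = split_conjunction(l);
            parts.extend(split_conjunction(r));
            parts
        }
        other => vec![other],
    }
}

/// Returns `(filters handed to the scan, filters kept in the Filter node)`.
fn plan_pushdown(
    predicate: &Expr,
    supports: impl Fn(&Expr) -> PushDown,
) -> (Vec<Expr>, Vec<Expr>) {
    // Step 1: offer the whole predicate to the provider.
    let candidates = match supports(predicate) {
        // Step 2: only if that fails, fall back to the individual conjuncts.
        PushDown::Unsupported => split_conjunction(predicate),
        _ => vec![predicate],
    };
    let (mut to_scan, mut kept) = (Vec::new(), Vec::new());
    for expr in candidates {
        match supports(expr) {
            PushDown::Unsupported => kept.push(expr.clone()),
            PushDown::Inexact => {
                // The provider filters approximately, so the plan must re-check.
                to_scan.push(expr.clone());
                kept.push(expr.clone());
            }
            PushDown::Exact => to_scan.push(expr.clone()),
        }
    }
    (to_scan, kept)
}

/// Hypothetical provider sorted by `[col_a, col_b]`.
fn index_on_a_then_b(e: &Expr) -> PushDown {
    match e {
        Expr::Eq("col_a", _) => PushDown::Exact,
        Expr::And(l, r) => match (&**l, &**r) {
            (Expr::Eq("col_a", _), Expr::Eq("col_b", _)) => PushDown::Exact,
            _ => PushDown::Unsupported,
        },
        _ => PushDown::Unsupported,
    }
}

fn main() {
    let both = Expr::And(
        Box::new(Expr::Eq("col_a", 1)),
        Box::new(Expr::Eq("col_b", 2)),
    );
    let (to_scan, kept) = plan_pushdown(&both, index_on_a_then_b);
    println!("pushed to scan: {to_scan:?}");
    println!("kept in filter: {kept:?}");
}
```

With this provider, the full conjunction is pushed to the scan and nothing stays in the filter node, whereas a per-conjunct check would have left `col_b = 2` behind.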
   
   # Are these changes tested?
   
   Yes.
   
   # Are there any user-facing changes?
   
   No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] avantgardnerio commented on pull request #5367: Try to push down full filter before break-up

Posted by "avantgardnerio (via GitHub)" <gi...@apache.org>.
avantgardnerio commented on PR #5367:
URL: https://github.com/apache/arrow-datafusion/pull/5367#issuecomment-1440790749

   As an alternative to jumping to full support of ordered indexes, here is a lighter-weight PR that would still unblock me.




[GitHub] [arrow-datafusion] avantgardnerio commented on pull request #5367: Try to push down full filter before break-up

Posted by "avantgardnerio (via GitHub)" <gi...@apache.org>.
avantgardnerio commented on PR #5367:
URL: https://github.com/apache/arrow-datafusion/pull/5367#issuecomment-1445190333

   These tests are showing up as broken in this branch, but they appear to be broken in main as well:
   
   ```
   expected:
   
   [
       "Explain [plan_type:Utf8, plan:Utf8]",
       "  Projection: t1.t1_id, t1.t1_name, t1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "    Filter: EXISTS (<subquery>) [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "      Subquery: [t1_int:UInt32;N]",
       "        Projection: t1.t1_int [t1_int:UInt32;N]",
       "          Filter: t1.t1_id > t1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "            TableScan: t1 [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "      TableScan: t1 projection=[t1_id, t1_name, t1_int] [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
   ]
   actual:
   
   [
       "Explain [plan_type:Utf8, plan:Utf8]",
       "  Filter: EXISTS (<subquery>) [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "    Subquery: [t1_int:UInt32;N]",
       "      Projection: t1.t1_int [t1_int:UInt32;N]",
       "        Filter: t1.t1_id > t1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "          TableScan: t1 [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "    TableScan: t1 projection=[t1_id, t1_name, t1_int] [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
   ]
   
   
   Left:  ["Explain [plan_type:Utf8, plan:Utf8]", "  Projection: t1.t1_id, t1.t1_name, t1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", "    Filter: EXISTS (<subquery>) [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", "      Subquery: [t1_int:U ...
   
   Right: ["Explain [plan_type:Utf8, plan:Utf8]", "  Filter: EXISTS (<subquery>) [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", "    Subquery: [t1_int:UInt32;N]", "      Projection: t1.t1_int [t1_int:UInt32;N]", "        Filter: t1.t1_id > t1.t1_int [t1_i ...
   
   <Click to see difference>
   
   thread 'sql::subqueries::exists_subquery_with_same_table' panicked at 'assertion failed: `(left == right)`
     left: `["Explain [plan_type:Utf8, plan:Utf8]", "  Projection: t1.t1_id, t1.t1_name, t1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", "    Filter: EXISTS (<subquery>) [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", "      Subquery: [t1_int:UInt32;N]", "        Projection: t1.t1_int [t1_int:UInt32;N]", "          Filter: t1.t1_id > t1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", "            TableScan: t1 [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", "      TableScan: t1 projection=[t1_id, t1_name, t1_int] [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]"]`,
    right: `["Explain [plan_type:Utf8, plan:Utf8]", "  Filter: EXISTS (<subquery>) [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", "    Subquery: [t1_int:UInt32;N]", "      Projection: t1.t1_int [t1_int:UInt32;N]", "        Filter: t1.t1_id > t1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", "          TableScan: t1 [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", "    TableScan: t1 projection=[t1_id, t1_name, t1_int] [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]"]`: 
   
   expected:
   
   [
       "Explain [plan_type:Utf8, plan:Utf8]",
       "  Projection: t1.t1_id, t1.t1_name, t1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "    Filter: EXISTS (<subquery>) [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "      Subquery: [t1_int:UInt32;N]",
       "        Projection: t1.t1_int [t1_int:UInt32;N]",
       "          Filter: t1.t1_id > t1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "            TableScan: t1 [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "      TableScan: t1 projection=[t1_id, t1_name, t1_int] [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
   ]
   actual:
   
   [
       "Explain [plan_type:Utf8, plan:Utf8]",
       "  Filter: EXISTS (<subquery>) [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "    Subquery: [t1_int:UInt32;N]",
       "      Projection: t1.t1_int [t1_int:UInt32;N]",
       "        Filter: t1.t1_id > t1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "          TableScan: t1 [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "    TableScan: t1 projection=[t1_id, t1_name, t1_int] [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
   ]
   
   ', datafusion/core/tests/sql/subqueries.rs:143:5
   stack backtrace:
      0: rust_begin_unwind
                at /rustc/75a0be98f25a4b9de5afa0e15eb016e7f9627032/library/std/src/panicking.rs:575:5
      1: core::panicking::panic_fmt
                at /rustc/75a0be98f25a4b9de5afa0e15eb016e7f9627032/library/core/src/panicking.rs:64:14
      2: core::panicking::assert_failed_inner
      3: core::panicking::assert_failed
                at /rustc/75a0be98f25a4b9de5afa0e15eb016e7f9627032/library/core/src/panicking.rs:211:5
      4: sql_integration::sql::subqueries::exists_subquery_with_same_table::{{closure}}
                at ./tests/sql/subqueries.rs:143:5
      5: <core::pin::Pin<P> as core::future::future::Future>::poll
                at /rustc/75a0be98f25a4b9de5afa0e15eb016e7f9627032/library/core/src/future/future.rs:125:9
      6: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}::{{closure}}::{{closure}}
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/scheduler/current_thread.rs:525:48
      7: tokio::coop::with_budget::{{closure}}
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/coop.rs:102:9
      8: std::thread::local::LocalKey<T>::try_with
                at /rustc/75a0be98f25a4b9de5afa0e15eb016e7f9627032/library/std/src/thread/local.rs:446:16
      9: std::thread::local::LocalKey<T>::with
                at /rustc/75a0be98f25a4b9de5afa0e15eb016e7f9627032/library/std/src/thread/local.rs:422:9
     10: tokio::coop::with_budget
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/coop.rs:95:5
     11: tokio::coop::budget
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/coop.rs:72:5
     12: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}::{{closure}}
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/scheduler/current_thread.rs:525:25
     13: tokio::runtime::scheduler::current_thread::Context::enter
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/scheduler/current_thread.rs:349:19
     14: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/scheduler/current_thread.rs:524:36
     15: tokio::runtime::scheduler::current_thread::CoreGuard::enter::{{closure}}
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/scheduler/current_thread.rs:595:57
     16: tokio::macros::scoped_tls::ScopedKey<T>::set
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/macros/scoped_tls.rs:61:9
     17: tokio::runtime::scheduler::current_thread::CoreGuard::enter
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/scheduler/current_thread.rs:595:27
     18: tokio::runtime::scheduler::current_thread::CoreGuard::block_on
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/scheduler/current_thread.rs:515:19
     19: tokio::runtime::scheduler::current_thread::CurrentThread::block_on
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/scheduler/current_thread.rs:161:24
     20: tokio::runtime::Runtime::block_on
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/mod.rs:490:46
     21: sql_integration::sql::subqueries::exists_subquery_with_same_table
                at ./tests/sql/subqueries.rs:148:5
     22: sql_integration::sql::subqueries::exists_subquery_with_same_table::{{closure}}
                at ./tests/sql/subqueries.rs:121:47
     23: core::ops::function::FnOnce::call_once
                at /rustc/75a0be98f25a4b9de5afa0e15eb016e7f9627032/library/core/src/ops/function.rs:250:5
     24: core::ops::function::FnOnce::call_once
                at /rustc/75a0be98f25a4b9de5afa0e15eb016e7f9627032/library/core/src/ops/function.rs:250:5
   note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
   
   
   
   
   expected:
   
   [
       "Explain [plan_type:Utf8, plan:Utf8]",
       "  Projection: t1.t1_id, t1.t1_name, t1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "    LeftSemi Join: t1.t1_id = __correlated_sq_1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "      TableScan: t1 projection=[t1_id, t1_name, t1_int] [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "      SubqueryAlias: __correlated_sq_1 [t1_int:UInt32;N]",
       "        Projection: t1.t1_int AS t1_int [t1_int:UInt32;N]",
       "          Filter: t1.t1_id > t1.t1_int [t1_id:UInt32;N, t1_int:UInt32;N]",
       "            TableScan: t1 projection=[t1_id, t1_int] [t1_id:UInt32;N, t1_int:UInt32;N]",
   ]
   actual:
   
   [
       "Explain [plan_type:Utf8, plan:Utf8]",
       "  LeftSemi Join: t1.t1_id = __correlated_sq_1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "    TableScan: t1 projection=[t1_id, t1_name, t1_int] [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "    SubqueryAlias: __correlated_sq_1 [t1_int:UInt32;N]",
       "      Projection: t1.t1_int AS t1_int [t1_int:UInt32;N]",
       "        Filter: t1.t1_id > t1.t1_int [t1_id:UInt32;N, t1_int:UInt32;N]",
       "          TableScan: t1 projection=[t1_id, t1_int] [t1_id:UInt32;N, t1_int:UInt32;N]",
   ]
   
   
   Left:  ["Explain [plan_type:Utf8, plan:Utf8]", "  Projection: t1.t1_id, t1.t1_name, t1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", "    LeftSemi Join: t1.t1_id = __correlated_sq_1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", "   ...
   
   Right: ["Explain [plan_type:Utf8, plan:Utf8]", "  LeftSemi Join: t1.t1_id = __correlated_sq_1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", "    TableScan: t1 projection=[t1_id, t1_name, t1_int] [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N] ...
   
   <Click to see difference>
   
   thread 'sql::subqueries::in_subquery_with_same_table' panicked at 'assertion failed: `(left == right)`
     left: `["Explain [plan_type:Utf8, plan:Utf8]", "  Projection: t1.t1_id, t1.t1_name, t1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", "    LeftSemi Join: t1.t1_id = __correlated_sq_1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", "      TableScan: t1 projection=[t1_id, t1_name, t1_int] [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", "      SubqueryAlias: __correlated_sq_1 [t1_int:UInt32;N]", "        Projection: t1.t1_int AS t1_int [t1_int:UInt32;N]", "          Filter: t1.t1_id > t1.t1_int [t1_id:UInt32;N, t1_int:UInt32;N]", "            TableScan: t1 projection=[t1_id, t1_int] [t1_id:UInt32;N, t1_int:UInt32;N]"]`,
    right: `["Explain [plan_type:Utf8, plan:Utf8]", "  LeftSemi Join: t1.t1_id = __correlated_sq_1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", "    TableScan: t1 projection=[t1_id, t1_name, t1_int] [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", "    SubqueryAlias: __correlated_sq_1 [t1_int:UInt32;N]", "      Projection: t1.t1_int AS t1_int [t1_int:UInt32;N]", "        Filter: t1.t1_id > t1.t1_int [t1_id:UInt32;N, t1_int:UInt32;N]", "          TableScan: t1 projection=[t1_id, t1_int] [t1_id:UInt32;N, t1_int:UInt32;N]"]`: 
   
   expected:
   
   [
       "Explain [plan_type:Utf8, plan:Utf8]",
       "  Projection: t1.t1_id, t1.t1_name, t1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "    LeftSemi Join: t1.t1_id = __correlated_sq_1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "      TableScan: t1 projection=[t1_id, t1_name, t1_int] [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "      SubqueryAlias: __correlated_sq_1 [t1_int:UInt32;N]",
       "        Projection: t1.t1_int AS t1_int [t1_int:UInt32;N]",
       "          Filter: t1.t1_id > t1.t1_int [t1_id:UInt32;N, t1_int:UInt32;N]",
       "            TableScan: t1 projection=[t1_id, t1_int] [t1_id:UInt32;N, t1_int:UInt32;N]",
   ]
   actual:
   
   [
       "Explain [plan_type:Utf8, plan:Utf8]",
       "  LeftSemi Join: t1.t1_id = __correlated_sq_1.t1_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "    TableScan: t1 projection=[t1_id, t1_name, t1_int] [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]",
       "    SubqueryAlias: __correlated_sq_1 [t1_int:UInt32;N]",
       "      Projection: t1.t1_int AS t1_int [t1_int:UInt32;N]",
       "        Filter: t1.t1_id > t1.t1_int [t1_id:UInt32;N, t1_int:UInt32;N]",
       "          TableScan: t1 projection=[t1_id, t1_int] [t1_id:UInt32;N, t1_int:UInt32;N]",
   ]
   
   ', datafusion/core/tests/sql/subqueries.rs:174:5
   stack backtrace:
      0: rust_begin_unwind
                at /rustc/75a0be98f25a4b9de5afa0e15eb016e7f9627032/library/std/src/panicking.rs:575:5
      1: core::panicking::panic_fmt
                at /rustc/75a0be98f25a4b9de5afa0e15eb016e7f9627032/library/core/src/panicking.rs:64:14
      2: core::panicking::assert_failed_inner
      3: core::panicking::assert_failed
                at /rustc/75a0be98f25a4b9de5afa0e15eb016e7f9627032/library/core/src/panicking.rs:211:5
      4: sql_integration::sql::subqueries::in_subquery_with_same_table::{{closure}}
                at ./tests/sql/subqueries.rs:174:5
      5: <core::pin::Pin<P> as core::future::future::Future>::poll
                at /rustc/75a0be98f25a4b9de5afa0e15eb016e7f9627032/library/core/src/future/future.rs:125:9
      6: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}::{{closure}}::{{closure}}
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/scheduler/current_thread.rs:525:48
      7: tokio::coop::with_budget::{{closure}}
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/coop.rs:102:9
      8: std::thread::local::LocalKey<T>::try_with
                at /rustc/75a0be98f25a4b9de5afa0e15eb016e7f9627032/library/std/src/thread/local.rs:446:16
      9: std::thread::local::LocalKey<T>::with
                at /rustc/75a0be98f25a4b9de5afa0e15eb016e7f9627032/library/std/src/thread/local.rs:422:9
     10: tokio::coop::with_budget
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/coop.rs:95:5
     11: tokio::coop::budget
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/coop.rs:72:5
     12: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}::{{closure}}
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/scheduler/current_thread.rs:525:25
     13: tokio::runtime::scheduler::current_thread::Context::enter
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/scheduler/current_thread.rs:349:19
     14: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/scheduler/current_thread.rs:524:36
     15: tokio::runtime::scheduler::current_thread::CoreGuard::enter::{{closure}}
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/scheduler/current_thread.rs:595:57
     16: tokio::macros::scoped_tls::ScopedKey<T>::set
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/macros/scoped_tls.rs:61:9
     17: tokio::runtime::scheduler::current_thread::CoreGuard::enter
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/scheduler/current_thread.rs:595:27
     18: tokio::runtime::scheduler::current_thread::CoreGuard::block_on
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/scheduler/current_thread.rs:515:19
     19: tokio::runtime::scheduler::current_thread::CurrentThread::block_on
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/scheduler/current_thread.rs:161:24
     20: tokio::runtime::Runtime::block_on
                at /home/bgardner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/mod.rs:490:46
     21: sql_integration::sql::subqueries::in_subquery_with_same_table
                at ./tests/sql/subqueries.rs:179:5
     22: sql_integration::sql::subqueries::in_subquery_with_same_table::{{closure}}
                at ./tests/sql/subqueries.rs:152:43
     23: core::ops::function::FnOnce::call_once
                at /rustc/75a0be98f25a4b9de5afa0e15eb016e7f9627032/library/core/src/ops/function.rs:250:5
     24: core::ops::function::FnOnce::call_once
                at /rustc/75a0be98f25a4b9de5afa0e15eb016e7f9627032/library/core/src/ops/function.rs:250:5
   note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
   
   
   ```




[GitHub] [arrow-datafusion] ursabot commented on pull request #5367: Try to push down full filter before break-up

Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #5367:
URL: https://github.com/apache/arrow-datafusion/pull/5367#issuecomment-1445250018

   Benchmark runs are scheduled for baseline = 38185cacda6e56573c1aa4b316458679a81948af and contender = fad360df0132a2fcb264a7c07b2b02f0b1dfc644. fad360df0132a2fcb264a7c07b2b02f0b1dfc644 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/bfe6f69e4afd4603a2e61009c13c58bd...8ec4869d232442de9d4267d5343f0a03/)
   [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] [test-mac-arm](https://conbench.ursa.dev/compare/runs/aaa1dbdf33084f3b815b09da77bac6f4...e8e9bed8f9314aa5b792e87c5bdc9a54/)
   [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/c86fcb8eb3544f73a802cf25de4fa19a...ded6af4ae3564e808ae4a83ce37ddd5c/)
   [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/afeb50dcf3cb467d82c170060f8fed13...24984c865b7f4d119a71df7913322a52/)
   Buildkite builds:
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   




[GitHub] [arrow-datafusion] alamb commented on pull request #5367: Try to push down full filter before break-up

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #5367:
URL: https://github.com/apache/arrow-datafusion/pull/5367#issuecomment-1441724946

   I have a question / comment: https://github.com/apache/arrow-datafusion/issues/5357#issuecomment-1441721936




[GitHub] [arrow-datafusion] avantgardnerio merged pull request #5367: Try to push down full filter before break-up

Posted by "avantgardnerio (via GitHub)" <gi...@apache.org>.
avantgardnerio merged PR #5367:
URL: https://github.com/apache/arrow-datafusion/pull/5367




[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5367: Try to push down full filter before break-up

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5367:
URL: https://github.com/apache/arrow-datafusion/pull/5367#discussion_r1117930542


##########
datafusion/optimizer/src/push_down_filter.rs:
##########
@@ -699,30 +700,28 @@ impl OptimizerRule for PushDownFilter {
                 push_down_all_join(predicates, &filter.input, left, right, vec![])?
             }
             LogicalPlan::TableScan(scan) => {
-                let mut new_scan_filters = scan.filters.clone();
-                let mut new_predicate = vec![];
-
-                let filter_predicates =
-                    utils::split_conjunction_owned(filter.predicate.clone());
-
-                for filter_expr in &filter_predicates {
-                    let (preserve_filter_node, add_to_provider) =
-                        match scan.source.supports_filter_pushdown(filter_expr)? {
-                            TableProviderFilterPushDown::Unsupported => (true, false),
-                            TableProviderFilterPushDown::Inexact => (true, true),
-                            TableProviderFilterPushDown::Exact => (false, true),
-                        };
-                    if preserve_filter_node {
-                        new_predicate.push(filter_expr.clone());
-                    }
-                    if add_to_provider {
-                        // avoid reduplicated filter expr.
-                        if new_scan_filters.contains(filter_expr) {
-                            continue;
-                        }
-                        new_scan_filters.push(filter_expr.clone());
-                    }
-                }
+                let filter_predicates = split_conjunction_owned(filter.predicate.clone());

Review Comment:
   It is unfortunate to require a clone here. I wonder if we could change the signature to take `&[&Expr]` and use `split_conjunction`?
   
   https://docs.rs/datafusion-optimizer/18.0.0/datafusion_optimizer/utils/fn.split_conjunction.html
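
The trade-off behind this suggestion can be sketched with a toy expression type. The two functions below are modeled on the borrowing and owning splitters named in the comment, but they are hypothetical re-implementations over a stand-in `Expr`, not the library code.

```rust
// Toy expression type for illustration; not DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(&'static str),
    And(Box<Expr>, Box<Expr>),
}

/// Borrowing split: returns references into the original tree, so no clone is needed.
fn split_conjunction(expr: &Expr) -> Vec<&Expr> {
    match expr {
        Expr::And(l, r) => {
            let mut parts = split_conjunction(l);
            parts.extend(split_conjunction(r));
            parts
        }
        leaf => vec![leaf],
    }
}

/// Owning split: consumes the tree, so a caller that only holds a reference must clone first.
fn split_conjunction_owned(expr: Expr) -> Vec<Expr> {
    match expr {
        Expr::And(l, r) => {
            let mut parts = split_conjunction_owned(*l);
            parts.extend(split_conjunction_owned(*r));
            parts
        }
        leaf => vec![leaf],
    }
}

fn main() {
    let predicate = Expr::And(
        Box::new(Expr::Column("a")),
        Box::new(Expr::Column("b")),
    );
    // Owned variant: clone because `predicate` is still needed afterwards.
    let owned = split_conjunction_owned(predicate.clone());
    // Borrowed variant: no clone, and `predicate` stays usable.
    let borrowed = split_conjunction(&predicate);
    println!("owned:    {owned:?}");
    println!("borrowed: {borrowed:?}");
}
```

A function that accepts `&[&Expr]` can consume the borrowed result directly, which is what would make the clone unnecessary.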
   
   





[GitHub] [arrow-datafusion] yjshen commented on a diff in pull request #5367: Try to push down full filter before break-up

Posted by "yjshen (via GitHub)" <gi...@apache.org>.
yjshen commented on code in PR #5367:
URL: https://github.com/apache/arrow-datafusion/pull/5367#discussion_r1117865065


##########
datafusion/expr/src/table_source.rs:
##########
@@ -70,13 +71,27 @@ pub trait TableSource: Sync + Send {
 
     /// Tests whether the table provider can make use of a filter expression
     /// to optimise data retrieval.
+    #[deprecated(since = "20.0.0", note = "use supports_filters_pushdown instead")]
     fn supports_filter_pushdown(
         &self,
         _filter: &Expr,
-    ) -> datafusion_common::Result<TableProviderFilterPushDown> {
+    ) -> Result<TableProviderFilterPushDown> {
         Ok(TableProviderFilterPushDown::Unsupported)
     }
 
+    /// Tests whether the table provider can make use of any or all filter expressions
+    /// to optimise data retrieval.
+    #[allow(deprecated)]
+    fn supports_filters_pushdown(
+        &self,
+        filters: &[Expr],
+    ) -> Result<Vec<TableProviderFilterPushDown>> {
+        filters
+            .iter()
+            .map(|f| self.supports_filter_pushdown(f))

Review Comment:
   👍 
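
The pattern in this hunk, a new batch method whose default body delegates to the older per-filter method, can be sketched as follows. Everything here is a toy stand-in: `Expr` is a plain string wrapper, `LegacyTable` and `IndexedTable` are hypothetical implementors, the `collect()` call is an assumption about how the truncated hunk ends, and the deprecation attributes from the diff are left out to keep the sketch short.

```rust
// Toy stand-ins for illustration; not the real DataFusion trait or types.
#[derive(Debug, Clone, PartialEq)]
struct Expr(&'static str);

#[derive(Debug, Clone, Copy, PartialEq)]
enum PushDown {
    Unsupported,
    Exact,
}

type Result<T> = std::result::Result<T, String>;

trait TableSource {
    /// Older per-filter hook: sees one expression at a time.
    fn supports_filter_pushdown(&self, _filter: &Expr) -> Result<PushDown> {
        Ok(PushDown::Unsupported)
    }

    /// Newer batch hook: sees every filter at once. The default delegates to the
    /// per-filter hook, so existing implementors keep their behavior.
    fn supports_filters_pushdown(&self, filters: &[Expr]) -> Result<Vec<PushDown>> {
        filters
            .iter()
            .map(|f| self.supports_filter_pushdown(f))
            .collect()
    }
}

/// Implements only the old hook and inherits the batch default.
struct LegacyTable;

impl TableSource for LegacyTable {
    fn supports_filter_pushdown(&self, filter: &Expr) -> Result<PushDown> {
        Ok(if filter.0.starts_with("col_a =") {
            PushDown::Exact
        } else {
            PushDown::Unsupported
        })
    }
}

/// Overrides the batch hook so `col_b` can be accepted when `col_a` is also present.
struct IndexedTable;

impl TableSource for IndexedTable {
    fn supports_filters_pushdown(&self, filters: &[Expr]) -> Result<Vec<PushDown>> {
        let has = |prefix: &str| filters.iter().any(|f| f.0.starts_with(prefix));
        let both = has("col_a =") && has("col_b =");
        Ok(filters
            .iter()
            .map(|f| {
                if f.0.starts_with("col_a =") || (both && f.0.starts_with("col_b =")) {
                    PushDown::Exact
                } else {
                    PushDown::Unsupported
                }
            })
            .collect())
    }
}

fn main() {
    let filters = [Expr("col_a = 1"), Expr("col_b = 2")];
    println!("legacy:  {:?}", LegacyTable.supports_filters_pushdown(&filters));
    println!("indexed: {:?}", IndexedTable.supports_filters_pushdown(&filters));
}
```

Collecting an iterator of `Result` values into `Result<Vec<_>>` stops at the first error, so a failing per-filter check surfaces as a single error from the batch call.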


