Posted to github@arrow.apache.org by "tshauck (via GitHub)" <gi...@apache.org> on 2023/09/28 03:51:51 UTC

[GitHub] [arrow-datafusion] tshauck opened a new pull request, #7680: docs: add section on supports_filters_pushdown

tshauck opened a new pull request, #7680:
URL: https://github.com/apache/arrow-datafusion/pull/7680

   ## Which issue does this PR close?
   
   Closes #7676 
   
   ## Rationale for this change
   
   See the question raised in the linked issue (#7676).
   
   ## What changes are included in this PR?
   
   Adds a section that expands on the `TableProvider` trait's functionality, with a subsection on `supports_filters_pushdown`.
   
   ## Are these changes tested?
   
   <img width="789" alt="image" src="https://github.com/apache/arrow-datafusion/assets/421839/1bee5b89-1771-46e7-850a-dddc1d366017">
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb merged pull request #7680: docs: add section on supports_filters_pushdown

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb merged PR #7680:
URL: https://github.com/apache/arrow-datafusion/pull/7680




[GitHub] [arrow-datafusion] alamb commented on pull request #7680: docs: add section on supports_filters_pushdown

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #7680:
URL: https://github.com/apache/arrow-datafusion/pull/7680#issuecomment-1740953259

   Thanks again @tshauck and @andygrove 




[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #7680: docs: add section on supports_filters_pushdown

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #7680:
URL: https://github.com/apache/arrow-datafusion/pull/7680#discussion_r1340522395


##########
docs/source/library-user-guide/custom-table-providers.md:
##########
@@ -121,6 +121,22 @@ impl TableProvider for CustomDataSource {
 
 With this, and the implementation of the omitted methods, we can now use the `CustomDataSource` as a `TableProvider` in DataFusion.
 
+##### Additional `TableProvider` Methods
+
+`scan` has no default implementation, so it needed to be written. There are other methods on the `TableProvider` that have default implementations, but can be overridden if needed to provide additional functionality.
+
+###### `supports_filters_pushdown`
+
+The `supports_filters_pushdown` method can be overridden to indicate which filter expressions support being pushed down to the data source and within that the specificity of the pushdown.
+
+This returns a `Vec` of `TableProviderFilterPushDown` enums where each enum represents a filter that can be pushed down. The `TableProviderFilterPushDown` enum has three variants:
+
+- `TableProviderFilterPushDown::Unsupported` - the filter cannot be pushed down
+- `TableProviderFilterPushDown::Exact` - the filter can be pushed down and the data source can guarantee that the filter will be applied exactly as specified
+- `TableProviderFilterPushDown::Inexact` - the filter can be pushed down, but the data source cannot guarantee that the filter will be applied exactly as specified

Review Comment:
   ```suggestion
   - `TableProviderFilterPushDown::Exact` - the filter can be pushed down and the data source can guarantee that the filter will be applied completely to all rows. This is the highest performance option. 
   - `TableProviderFilterPushDown::Inexact` - the filter can be pushed down, but the data source cannot guarantee that the filter will be applied to all rows. DataFusion will apply `Inexact` filters again after the scan to ensure correctness. 
   ```
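
To make the three variants concrete, here is a minimal sketch of the decision a provider makes for each filter. It is not DataFusion code: `PushDown` is a local stand-in for `TableProviderFilterPushDown`, and `Filter`, `classify`, `engine_must_apply`, and the `id`/`ts` columns are hypothetical names invented for illustration. The real method lives on `TableProvider`, receives DataFusion `Expr` values rather than this toy `Filter` type, and returns a `Result`.

```rust
/// Local stand-in for DataFusion's `TableProviderFilterPushDown`, redefined
/// here only so the sketch compiles without the crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PushDown {
    Unsupported,
    Inexact,
    Exact,
}

/// Hypothetical, heavily simplified filter of the form `column <op> value`.
/// Real DataFusion filters are `Expr` trees.
#[allow(dead_code)]
#[derive(Debug)]
enum Filter {
    Eq { column: &'static str, value: i64 },
    Gt { column: &'static str, value: i64 },
    Like { column: &'static str, pattern: &'static str },
}

/// Assumed data source for this sketch:
/// - it has an index on `id`, so equality on `id` is evaluated for every row;
/// - it keeps min/max statistics for `ts`, so `ts > x` can skip whole blocks
///   but rows that fail the filter may still be returned;
/// - it cannot evaluate anything else.
fn classify(filter: &Filter) -> PushDown {
    match filter {
        Filter::Eq { column, .. } if *column == "id" => PushDown::Exact,
        Filter::Gt { column, .. } if *column == "ts" => PushDown::Inexact,
        _ => PushDown::Unsupported,
    }
}

/// Mirrors the shape of `supports_filters_pushdown`: one decision per
/// incoming filter, in the same order as the input.
fn supports_filters_pushdown(filters: &[Filter]) -> Vec<PushDown> {
    filters.iter().map(classify).collect()
}

/// Whether the query engine still has to evaluate the filter after the scan.
/// Only `Exact` lets the engine skip it.
fn engine_must_apply(decision: PushDown) -> bool {
    decision != PushDown::Exact
}
```

Under these assumptions, the filters `id = 7`, `ts > 100`, and `name LIKE 'a%'` are classified as `Exact`, `Inexact`, and `Unsupported` respectively, so only the first one can be dropped from the plan above the scan.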


