You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2020/12/03 05:18:47 UTC

[GitHub] [drill] cgivre opened a new pull request #2120: DRILL-7814: Add Limit Pushdown to MongoDB Storage Plugin

cgivre opened a new pull request #2120:
URL: https://github.com/apache/drill/pull/2120


   # [DRILL-7814](https://issues.apache.org/jira/browse/DRILL-7814): Add Limit Pushdown to MongoDB Storage Plugin
   
   ## Description
   This PR adds a limit pushdown capability to the MongoDB storage plugin. 
   
   ## Documentation
   No user visible changes
   
   ## Testing
   Added new class of unit tests for the pushdown.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [drill] vvysotskyi commented on pull request #2120: DRILL-7814: Add Limit Pushdown to MongoDB Storage Plugin

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on pull request #2120:
URL: https://github.com/apache/drill/pull/2120#issuecomment-812619249


   @cgivre, there are some additional follow-up changes that should be made. In `matches()` method and in the first line of `onMatch()` please replace `DrillScanRel` with `DrillScanRelBase`. When creating `newScanPrel`, it should be either `ScanPrel` or `DrillScanRel`, depending on the type of `scan`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [drill] cgivre commented on pull request #2120: DRILL-7814: Add Limit Pushdown to MongoDB Storage Plugin

Posted by GitBox <gi...@apache.org>.
cgivre commented on pull request #2120:
URL: https://github.com/apache/drill/pull/2120#issuecomment-812559616


   > @cgivre, looks like I've found the reason why the limit cannot be applied for the case you have pointed.
   > Drill uses the `DrillPushLimitToScanRule` rule to push the limit to scan. This rule can be applied at the **logical** planning phase.
   > Mongo storage plugin uses the `MongoPushDownFilterForScan` rule to push filter to scan. This rule is applied at the **physical** planning phase (in its constructor it passes `FilterPrel`and `ScanPrel` rel nodes to match).
   > 
   > The solution for this issue is quite simple - update `MongoPushDownFilterForScan` to be able to work apply at any phase, for it, please replace `FilterPrel` with `org.apache.calcite.rel.core.Filter` and `ScanPrel` with `DrillScanRelBase`.
   
   Hey @vvysotskyi, I tried doing this in the `MongoPushDownFilterForScan` and either I missed something, but I'm getting all kinds of errors when I did this.  Could you take a quick look?
   
   Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [drill] cgivre commented on pull request #2120: DRILL-7814: Add Limit Pushdown to MongoDB Storage Plugin

Posted by GitBox <gi...@apache.org>.
cgivre commented on pull request #2120:
URL: https://github.com/apache/drill/pull/2120#issuecomment-738176989


   > @cgivre, thanks for pointing to this issue, I think it is better to address it in the scope of this PR.
   
   Could you give me any suggestions on this?  It didn't seem like `scanStats` were doing anything with the filters.  Do you think reworking this method would fix this?  I was trying to keep this change as small as possible. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [drill] cgivre commented on pull request #2120: DRILL-7814: Add Limit Pushdown to MongoDB Storage Plugin

Posted by GitBox <gi...@apache.org>.
cgivre commented on pull request #2120:
URL: https://github.com/apache/drill/pull/2120#issuecomment-738156883


   @vvysotskyi 
   Thanks for the review!  I was messing with this a bit more and found that it doesn't seem to push down the limit when there are filters in the query.  Should we address that in future work and commit this as is?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [drill] vvysotskyi commented on pull request #2120: DRILL-7814: Add Limit Pushdown to MongoDB Storage Plugin

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on pull request #2120:
URL: https://github.com/apache/drill/pull/2120#issuecomment-766965951


   @cgivre, looks like I've found the reason why the limit cannot be applied for the case you have pointed.
   Drill uses the `DrillPushLimitToScanRule` rule to push the limit to scan. This rule can be applied at the **logical** planning phase.
   Mongo storage plugin uses the `MongoPushDownFilterForScan` rule to push filter to scan. This rule is applied at the **physical** planning phase (in its constructor it passes `FilterPrel`and `ScanPrel` rel nodes to match).
   
   The solution for this issue is quite simple - update `MongoPushDownFilterForScan` to be able to work apply at any phase, for it, please replace  `FilterPrel` with `org.apache.calcite.rel.core.Filter` and `ScanPrel` with `DrillScanRelBase`.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [drill] vvysotskyi commented on pull request #2120: DRILL-7814: Add Limit Pushdown to MongoDB Storage Plugin

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on pull request #2120:
URL: https://github.com/apache/drill/pull/2120#issuecomment-738167008


   @cgivre, thanks for pointing to this issue, I think it is better to address it in the scope of this PR.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [drill] vvysotskyi commented on pull request #2120: DRILL-7814: Add Limit Pushdown to MongoDB Storage Plugin

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on pull request #2120:
URL: https://github.com/apache/drill/pull/2120#issuecomment-813239607


   @cgivre, the failure should be fixed by addressing the comment above:
   > When creating newScanPrel, it should be either ScanPrel or DrillScanRel, depending on the type of scan.
   
   It may be done by adding a copy method that accepts group scan to `DrillScanRelBase`, implementing it in both inheritors, and using this method in the rule instead of explicitly creating `ScanPrel`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [drill] vvysotskyi commented on pull request #2120: DRILL-7814: Add Limit Pushdown to MongoDB Storage Plugin

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on pull request #2120:
URL: https://github.com/apache/drill/pull/2120#issuecomment-766965951


   @cgivre, looks like I've found the reason why the limit cannot be applied for the case you have pointed.
   Drill uses the `DrillPushLimitToScanRule` rule to push the limit to scan. This rule can be applied at the **logical** planning phase.
   Mongo storage plugin uses the `MongoPushDownFilterForScan` rule to push filter to scan. This rule is applied at the **physical** planning phase (in its constructor it passes `FilterPrel`and `ScanPrel` rel nodes to match).
   
   The solution for this issue is quite simple - update `MongoPushDownFilterForScan` to be able to work apply at any phase, for it, please replace  `FilterPrel` with `org.apache.calcite.rel.core.Filter` and `ScanPrel` with `DrillScanRelBase`.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [drill] cgivre commented on pull request #2120: DRILL-7814: Add Limit Pushdown to MongoDB Storage Plugin

Posted by GitBox <gi...@apache.org>.
cgivre commented on pull request #2120:
URL: https://github.com/apache/drill/pull/2120#issuecomment-813171609


   > @cgivre, there are some additional follow-up changes that should be made. In `matches()` method and in the first line of `onMatch()` please replace `DrillScanRel` with `DrillScanRelBase`. When creating `newScanPrel`, it should be either `ScanPrel` or `DrillScanRel`, depending on the type of `scan`.
   
   @vvysotskyi 
   Thanks for the response.  I made a few changes but am stuck again.  It would seem that I need to check the query phase, but I'm not quite sure how to do that.  This is the error I'm getting now. 
   
   ```
   org.apache.drill.exec.rpc.RpcException: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: AssertionError: Relational expression ScanPrel#133 has calling-convention LOGICAL but does not implement the required interface 'interface org.apache.drill.exec.planner.logical.DrillRel' of that convention
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [drill] cgivre edited a comment on pull request #2120: DRILL-7814: Add Limit Pushdown to MongoDB Storage Plugin

Posted by GitBox <gi...@apache.org>.
cgivre edited a comment on pull request #2120:
URL: https://github.com/apache/drill/pull/2120#issuecomment-738156883


   @vvysotskyi 
   Thanks for the review!  I was messing with this a bit more and found that it doesn't seem to push down the limit when there are filters in the query.  Should we address that in future work and commit this as is? Or do you think I should investigate further before committing this?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [drill] cgivre merged pull request #2120: DRILL-7814: Add Limit Pushdown to MongoDB Storage Plugin

Posted by GitBox <gi...@apache.org>.
cgivre merged pull request #2120:
URL: https://github.com/apache/drill/pull/2120


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org