Posted to notifications@kyuubi.apache.org by "cfmcgrady (via GitHub)" <gi...@apache.org> on 2023/04/14 08:44:27 UTC

[GitHub] [kyuubi] cfmcgrady opened a new pull request, #4710: [ARROW] LocalTableScanExec should not trigger job

cfmcgrady opened a new pull request, #4710:
URL: https://github.com/apache/kyuubi/pull/4710

   
   ### _Why are the changes needed?_
   
   TODO
   
   ### _How was this patch tested?_
   - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
   
   - [ ] Add screenshots for manual tests if appropriate
   
   - [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before making a pull request
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org


[GitHub] [kyuubi] cfmcgrady commented on a diff in pull request #4710: [ARROW] LocalTableScanExec should not trigger job

Posted by "cfmcgrady (via GitHub)" <gi...@apache.org>.
cfmcgrady commented on code in PR #4710:
URL: https://github.com/apache/kyuubi/pull/4710#discussion_r1172416558


##########
externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/sql/kyuubi/SparkDatasetHelper.scala:
##########
@@ -51,6 +51,8 @@ object SparkDatasetHelper extends Logging {
       doCollectLimit(collectLimit)
     case collectLimit: CollectLimitExec if collectLimit.limit < 0 =>
       executeArrowBatchCollect(collectLimit.child)
+    case localTableScan: LocalTableScanExec =>
+      doLocalTableScan(localTableScan)

Review Comment:
   I filed a PR against Spark, https://github.com/apache/spark/pull/40875, and will send a follow-up here once Spark has addressed this issue.





[GitHub] [kyuubi] codecov-commenter commented on pull request #4710: [ARROW] LocalTableScanExec should not trigger job

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #4710:
URL: https://github.com/apache/kyuubi/pull/4710#issuecomment-1511151955

   ## [Codecov](https://codecov.io/gh/apache/kyuubi/pull/4710?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#4710](https://codecov.io/gh/apache/kyuubi/pull/4710?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (e4c2891) into [master](https://codecov.io/gh/apache/kyuubi/commit/17514a3acaf5e6b4c3a424e839e07854e674dd4c?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (17514a3) will **increase** coverage by `0.08%`.
   > The diff coverage is `76.92%`.
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #4710      +/-   ##
   ============================================
   + Coverage     58.03%   58.11%   +0.08%     
     Complexity       13       13              
   ============================================
     Files           580      580              
     Lines         32270    32281      +11     
     Branches       4307     4309       +2     
   ============================================
   + Hits          18728    18761      +33     
   + Misses        11739    11724      -15     
   + Partials       1803     1796       -7     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/kyuubi/pull/4710?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...rk/sql/execution/arrow/KyuubiArrowConverters.scala](https://codecov.io/gh/apache/kyuubi/pull/4710?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZXh0ZXJuYWxzL2t5dXViaS1zcGFyay1zcWwtZW5naW5lL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvc3Bhcmsvc3FsL2V4ZWN1dGlvbi9hcnJvdy9LeXV1YmlBcnJvd0NvbnZlcnRlcnMuc2NhbGE=) | `88.73% <0.00%> (+0.07%)` | :arrow_up: |
   | [...g/apache/spark/sql/kyuubi/SparkDatasetHelper.scala](https://codecov.io/gh/apache/kyuubi/pull/4710?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZXh0ZXJuYWxzL2t5dXViaS1zcGFyay1zcWwtZW5naW5lL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvc3Bhcmsvc3FsL2t5dXViaS9TcGFya0RhdGFzZXRIZWxwZXIuc2NhbGE=) | `81.90% <100.00%> (+1.90%)` | :arrow_up: |
   
   ... and [8 files with indirect coverage changes](https://codecov.io/gh/apache/kyuubi/pull/4710/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   




[GitHub] [kyuubi] pan3793 closed pull request #4710: [ARROW] LocalTableScanExec should not trigger job

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 closed pull request #4710: [ARROW] LocalTableScanExec should not trigger job
URL: https://github.com/apache/kyuubi/pull/4710




[GitHub] [kyuubi] cfmcgrady commented on a diff in pull request #4710: [ARROW] LocalTableScanExec should not trigger job

Posted by "cfmcgrady (via GitHub)" <gi...@apache.org>.
cfmcgrady commented on code in PR #4710:
URL: https://github.com/apache/kyuubi/pull/4710#discussion_r1169560427


##########
externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/sql/kyuubi/SparkDatasetHelper.scala:
##########
@@ -51,6 +51,8 @@ object SparkDatasetHelper extends Logging {
       doCollectLimit(collectLimit)
     case collectLimit: CollectLimitExec if collectLimit.limit < 0 =>
       executeArrowBatchCollect(collectLimit.child)
+    case localTableScan: LocalTableScanExec =>
+      doLocalTableScan(localTableScan)

Review Comment:
   Spark itself doesn't post the driver-side SQLMetrics either. Is that a Spark bug?
   
   The SQL UI will change once we post the metrics.
   
   Before:
   
   ![Screenshot 2023-04-18 2.37.50 PM](https://user-images.githubusercontent.com/8537877/232692094-501e14e0-7797-4f81-8c82-eb091ce12c13.png)
   
   
   After:
   
   ![Screenshot 2023-04-18 2.37.04 PM](https://user-images.githubusercontent.com/8537877/232692109-d0f8c412-9753-4b2d-a557-8b67a42e8e76.png)
   





[GitHub] [kyuubi] pan3793 commented on pull request #4710: [ARROW] LocalTableScanExec should not trigger job

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on PR #4710:
URL: https://github.com/apache/kyuubi/pull/4710#issuecomment-1516053726

   Thanks, merged to master/1.7




[GitHub] [kyuubi] ulysses-you commented on a diff in pull request #4710: [ARROW] LocalTableScanExec should not trigger job

Posted by "ulysses-you (via GitHub)" <gi...@apache.org>.
ulysses-you commented on code in PR #4710:
URL: https://github.com/apache/kyuubi/pull/4710#discussion_r1168600486


##########
externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/sql/kyuubi/SparkDatasetHelper.scala:
##########
@@ -51,6 +51,8 @@ object SparkDatasetHelper extends Logging {
       doCollectLimit(collectLimit)
     case collectLimit: CollectLimitExec if collectLimit.limit < 0 =>
       executeArrowBatchCollect(collectLimit.child)
+    case localTableScan: LocalTableScanExec =>
+      doLocalTableScan(localTableScan)

Review Comment:
   It seems we need to post `SparkListenerDriverAccumUpdates` to update the metrics.
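
   For illustration, a minimal Scala sketch of posting driver-side metric updates. It relies on Spark's `SQLMetrics.postDriverMetricUpdates`, which posts a `SparkListenerDriverAccumUpdates` event for the current SQL execution. The `DriverMetrics` object and its `post` method are hypothetical names, and this is not necessarily the code merged in this PR.

   ```scala
   import org.apache.spark.SparkContext
   import org.apache.spark.sql.execution.SQLExecution
   import org.apache.spark.sql.execution.metric.{SQLMetric, SQLMetrics}

   // Hypothetical helper, named here only for illustration.
   object DriverMetrics {
     // Posts the current values of `metrics` to the listener bus so the SQL UI
     // can display them even though no task ran to report them.
     def post(sc: SparkContext, metrics: Map[String, SQLMetric]): Unit = {
       // The execution id is kept as a local property while a query runs;
       // it is null outside of a tracked SQL execution.
       val executionId = sc.getLocalProperty(SQLExecution.EXECUTION_ID_KEY)
       // postDriverMetricUpdates does nothing when executionId is null.
       SQLMetrics.postDriverMetricUpdates(sc, executionId, metrics.values.toSeq)
     }
   }
   ```

   A caller would typically invoke this after collecting rows on the driver, passing the physical plan's `metrics` map, e.g. `DriverMetrics.post(sc, plan.metrics)`.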





[GitHub] [kyuubi] ulysses-you commented on a diff in pull request #4710: [ARROW] LocalTableScanExec should not trigger job

Posted by "ulysses-you (via GitHub)" <gi...@apache.org>.
ulysses-you commented on code in PR #4710:
URL: https://github.com/apache/kyuubi/pull/4710#discussion_r1171270816


##########
externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/sql/kyuubi/SparkDatasetHelper.scala:
##########
@@ -51,6 +51,8 @@ object SparkDatasetHelper extends Logging {
       doCollectLimit(collectLimit)
     case collectLimit: CollectLimitExec if collectLimit.limit < 0 =>
       executeArrowBatchCollect(collectLimit.child)
+    case localTableScan: LocalTableScanExec =>
+      doLocalTableScan(localTableScan)

Review Comment:
   I think so. Spark should post the driver-side metrics when no task is run.


