
Posted to reviews@spark.apache.org by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/09/27 17:39:29 UTC

[GitHub] [spark] LuciferYang opened a new pull request, #43155: [SPARK-45357][CONNECT][TESTS] Ignore `dataframeId` when comparing CollectMetrics in `SparkConnectProtoSuite`.

LuciferYang opened a new pull request, #43155:
URL: https://github.com/apache/spark/pull/43155

### What changes were proposed in this pull request?

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

### Was this patch authored or co-authored using generative AI tooling?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

Re: [PR] [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite` [spark]

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.

LuciferYang commented on PR #43155:
URL: https://github.com/apache/spark/pull/43155#issuecomment-1746251416

   The GitHub Actions (GA) failure is unrelated to this PR:
   
   ```
   starting mypy annotations test...
   annotations failed mypy checks:
   /usr/local/lib/python3.9/dist-packages/torch/_dynamo/variables/tensor.py:369: error: INTERNAL ERROR -- Please try using mypy master on GitHub:
   https://mypy.readthedocs.io/en/stable/common_issues.html#using-a-development-mypy-build
   If this issue continues with mypy master, please report a bug at https://github.com/python/mypy/issues
   version: 0.982
   /usr/local/lib/python3.9/dist-packages/torch/_dynamo/variables/tensor.py:369: : note: please use --show-traceback to print a traceback when reporting a bug
   2
   Error: Process completed with exit code 2.
   ```



[GitHub] [spark] LuciferYang commented on pull request #43155: [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite`

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.

LuciferYang commented on PR #43155:
URL: https://github.com/apache/spark/pull/43155#issuecomment-1738367041

   cc @cloud-fan @zhengruifeng 



Re: [PR] [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite` [spark]

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng closed pull request #43155: [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite`
URL: https://github.com/apache/spark/pull/43155



[GitHub] [spark] amaliujia commented on a diff in pull request #43155: [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite`

Posted by "amaliujia (via GitHub)" <gi...@apache.org>.

amaliujia commented on code in PR #43155:
URL: https://github.com/apache/spark/pull/43155#discussion_r1340316526


##########
connector/connect/server/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala:
##########
@@ -1068,6 +1068,10 @@ class SparkConnectProtoSuite extends PlanTest with SparkConnectPlanTest {
   // Compares proto plan with LogicalPlan.
   private def comparePlans(connectPlan: proto.Relation, sparkPlan: LogicalPlan): Unit = {
     val connectAnalyzed = analyzePlan(transform(connectPlan))
-    comparePlans(connectAnalyzed, sparkPlan, false)
+    (connectAnalyzed, sparkPlan) match {

Review Comment:
   Thanks for the clarification! 




[GitHub] [spark] cloud-fan commented on a diff in pull request #43155: [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite`

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.

cloud-fan commented on code in PR #43155:
URL: https://github.com/apache/spark/pull/43155#discussion_r1339488349


##########
connector/connect/server/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala:
##########
@@ -1067,7 +1067,11 @@ class SparkConnectProtoSuite extends PlanTest with SparkConnectPlanTest {
 
   // Compares proto plan with LogicalPlan.
   private def comparePlans(connectPlan: proto.Relation, sparkPlan: LogicalPlan): Unit = {
+    def normalizeDataframeId(plan: LogicalPlan): LogicalPlan = plan match {

Review Comment:
   Shall we use `transform`? Why handle only the top-level `CollectMetrics`?




Re: [PR] [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite` [spark]

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.

LuciferYang commented on PR #43155:
URL: https://github.com/apache/spark/pull/43155#issuecomment-1746273704

   @zhengruifeng Moreover, this test case is in the `connect-server` module, and the function used by `connectTestRelation.observe` in the test is:
   
   https://github.com/apache/spark/blob/5e6986b819821c4ea5e341217b6c901690d93e90/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/dsl/package.scala#L1089-L1111
   
   It seems that there is no `planId` here, and the `observe` function on the client side has not been implemented yet:
   
   
   https://github.com/apache/spark/blob/5e6986b819821c4ea5e341217b6c901690d93e90/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Dataset.scala#L3286-L3288
   
   
   



Re: [PR] [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite` [spark]

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.

LuciferYang commented on PR #43155:
URL: https://github.com/apache/spark/pull/43155#issuecomment-1746279579

   Rebase to fix the Python lint failure.



[GitHub] [spark] LuciferYang commented on a diff in pull request #43155: [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite`

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.

LuciferYang commented on code in PR #43155:
URL: https://github.com/apache/spark/pull/43155#discussion_r1339473976


##########
connector/connect/server/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala:
##########
@@ -1067,7 +1067,11 @@ class SparkConnectProtoSuite extends PlanTest with SparkConnectPlanTest {
 
   // Compares proto plan with LogicalPlan.
   private def comparePlans(connectPlan: proto.Relation, sparkPlan: LogicalPlan): Unit = {
+    def normalizeDataframeId(plan: LogicalPlan): LogicalPlan = plan match {

Review Comment:
   add a new small function




Re: [PR] [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite` [spark]

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.

LuciferYang commented on PR #43155:
URL: https://github.com/apache/spark/pull/43155#issuecomment-1750269782

   Thanks @zhengruifeng @cloud-fan @amaliujia ~



Re: [PR] [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite` [spark]

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng commented on PR #43155:
URL: https://github.com/apache/spark/pull/43155#issuecomment-1746211815

   In `PlanGenerationTestSuite`, the `planId` was reset before each test
   
   https://github.com/apache/spark/blob/4863dec91f1cd596e60b42f1cbef9f6df327e562/connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala#L119-L121
   
   IIRC, the `dataframeId` in `CollectMetrics` is also the `planId`, so is it possible to simply reset the `planId` before problematic test suites?



[GitHub] [spark] LuciferYang commented on a diff in pull request #43155: [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite`

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.

LuciferYang commented on code in PR #43155:
URL: https://github.com/apache/spark/pull/43155#discussion_r1339473976


##########
connector/connect/server/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala:
##########
@@ -1067,7 +1067,11 @@ class SparkConnectProtoSuite extends PlanTest with SparkConnectPlanTest {
 
   // Compares proto plan with LogicalPlan.
   private def comparePlans(connectPlan: proto.Relation, sparkPlan: LogicalPlan): Unit = {
+    def normalizeDataframeId(plan: LogicalPlan): LogicalPlan = plan match {

Review Comment:
   add a new small function, is it ok?




[GitHub] [spark] amaliujia commented on pull request #43155: [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite`

Posted by "amaliujia (via GitHub)" <gi...@apache.org>.

amaliujia commented on PR #43155:
URL: https://github.com/apache/spark/pull/43155#issuecomment-1739491907

   QQ: why was this not caught before? Does the CI check here run the tests in `connector/connect/server` at all?



[GitHub] [spark] amaliujia commented on pull request #43155: [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite`

Posted by "amaliujia (via GitHub)" <gi...@apache.org>.

amaliujia commented on PR #43155:
URL: https://github.com/apache/spark/pull/43155#issuecomment-1739514235

   LGTM
   
   Another way to fix this is to manually construct the `CollectMetrics` and compare it with the proto-generated version. But this PR's approach is fine too.



[GitHub] [spark] cloud-fan commented on a diff in pull request #43155: [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite`

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.

cloud-fan commented on code in PR #43155:
URL: https://github.com/apache/spark/pull/43155#discussion_r1339470161


##########
connector/connect/server/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala:
##########
@@ -1068,6 +1068,10 @@ class SparkConnectProtoSuite extends PlanTest with SparkConnectPlanTest {
   // Compares proto plan with LogicalPlan.
   private def comparePlans(connectPlan: proto.Relation, sparkPlan: LogicalPlan): Unit = {
     val connectAnalyzed = analyzePlan(transform(connectPlan))
-    comparePlans(connectAnalyzed, sparkPlan, false)
+    (connectAnalyzed, sparkPlan) match {
+      case (l: CollectMetrics, r: CollectMetrics) =>

Review Comment:
   Why not just add a small normalize function that resets the DataFrame id to 0 for all `CollectMetrics` nodes in the query plan?
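   
   As a rough illustration of that suggestion (not code from the PR), the sketch below uses a hypothetical miniature plan tree. `Plan`, `Relation`, `Filter`, the toy `CollectMetrics`, and `transformUp` here are stand-ins invented for the example, not Spark's Catalyst classes. It only shows the idea: rewrite every `CollectMetrics` node, at any depth, so that `dataframeId` is 0 before the two plans are compared.
   
   ```scala
   // Hypothetical miniature plan tree; these are NOT Spark's LogicalPlan classes.
   sealed trait Plan {
     def children: Seq[Plan]
     def withChildren(newChildren: Seq[Plan]): Plan

     // Bottom-up rewrite: rewrite the children first, then try the rule on this node.
     def transformUp(rule: PartialFunction[Plan, Plan]): Plan = {
       val rewritten = withChildren(children.map(_.transformUp(rule)))
       rule.applyOrElse(rewritten, (p: Plan) => p)
     }
   }

   final case class Relation(name: String) extends Plan {
     def children: Seq[Plan] = Nil
     def withChildren(newChildren: Seq[Plan]): Plan = this
   }

   final case class Filter(condition: String, child: Plan) extends Plan {
     def children: Seq[Plan] = Seq(child)
     def withChildren(newChildren: Seq[Plan]): Plan = copy(child = newChildren.head)
   }

   final case class CollectMetrics(name: String, dataframeId: Long, child: Plan) extends Plan {
     def children: Seq[Plan] = Seq(child)
     def withChildren(newChildren: Seq[Plan]): Plan = copy(child = newChildren.head)
   }

   object Normalize {
     // Resets dataframeId to 0 on every CollectMetrics node, not only the root.
     def normalizeDataframeId(plan: Plan): Plan = plan.transformUp {
       case cm: CollectMetrics => cm.copy(dataframeId = 0L)
     }
   }
   ```
   
   With a helper of this shape, both sides of the comparison would be normalized first, so plans that differ only in `dataframeId` compare as equal even when a `CollectMetrics` node is nested under other operators.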




Re: [PR] [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite` [spark]

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng commented on PR #43155:
URL: https://github.com/apache/spark/pull/43155#issuecomment-1750035503

   merged to master



Re: [PR] [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite` [spark]

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng commented on PR #43155:
URL: https://github.com/apache/spark/pull/43155#issuecomment-1746330281

   also cc @hvanhovell 



[GitHub] [spark] LuciferYang commented on a diff in pull request #43155: [SPARK-45357][CONNECT][TESTS] Ignore `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite`.

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.

LuciferYang commented on code in PR #43155:
URL: https://github.com/apache/spark/pull/43155#discussion_r1338979671


##########
connector/connect/server/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala:
##########
@@ -1068,6 +1068,10 @@ class SparkConnectProtoSuite extends PlanTest with SparkConnectPlanTest {
   // Compares proto plan with LogicalPlan.
   private def comparePlans(connectPlan: proto.Relation, sparkPlan: LogicalPlan): Unit = {
     val connectAnalyzed = analyzePlan(transform(connectPlan))
-    comparePlans(connectAnalyzed, sparkPlan, false)
+    (connectAnalyzed, sparkPlan) match {

Review Comment:
   Since this is the first such case, this PR only makes a simple fix.
   
   




[GitHub] [spark] LuciferYang commented on a diff in pull request #43155: [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite`

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.

LuciferYang commented on code in PR #43155:
URL: https://github.com/apache/spark/pull/43155#discussion_r1339014960


##########
connector/connect/server/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala:
##########
@@ -1068,6 +1068,10 @@ class SparkConnectProtoSuite extends PlanTest with SparkConnectPlanTest {
   // Compares proto plan with LogicalPlan.
   private def comparePlans(connectPlan: proto.Relation, sparkPlan: LogicalPlan): Unit = {
     val connectAnalyzed = analyzePlan(transform(connectPlan))
-    comparePlans(connectAnalyzed, sparkPlan, false)
+    (connectAnalyzed, sparkPlan) match {

Review Comment:
   In the current scenario, `connectAnalyzed` is transformed from a `proto.Relation`. When it is a `CollectMetrics`, its `dataframeId` is always 0.
   
   But if `sparkPlan` is a `CollectMetrics`, its `dataframeId` is determined by its corresponding `DataFrame`.
   
   In the sbt test run, `SparkConnectProtoSuite` is tested earlier, so `sparkTestRelation` is the first `DataFrame` created and its `id` is 0. That is why the GA test didn't trigger the failure described in the PR.
   
   When testing with Maven, `SparkConnectProtoSuite` is tested later, so `sparkTestRelation` is not the first `DataFrame` created and its `id` is not 0, which causes the test to fail.
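   
   The order dependence can be sketched with a toy model. This is only an illustration under stated assumptions, not Spark code: `FakeDataset` and `FakeDataFrame` are hypothetical names, and the single process-wide counter stands in for the id source that each real `DataFrame` draws from.
   
   ```scala
   import java.util.concurrent.atomic.AtomicLong

   // Hypothetical stand-in for a process-wide DataFrame id counter.
   object FakeDataset {
     private val curId = new AtomicLong()
     def nextId(): Long = curId.getAndIncrement()
   }

   // Hypothetical stand-in for a DataFrame that takes its id when it is created.
   final case class FakeDataFrame(name: String, id: Long = FakeDataset.nextId())

   object OrderDependence {
     def main(args: Array[String]): Unit = {
       // If nothing else in the JVM has created a FakeDataFrame yet, this one gets id 0.
       val first = FakeDataFrame("sparkTestRelation")
       // Anything created afterwards gets a larger id, because the counter is shared.
       val later = FakeDataFrame("relation created by a later suite")
       println(s"first.id=${first.id}, later.id=${later.id}")
     }
   }
   ```
   
   A test that hard-codes 0 as the expected id therefore passes only when its relation happens to be created first in the JVM, which depends on the order in which the build tool runs the suites.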




[GitHub] [spark] LuciferYang commented on pull request #43155: [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite`

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.

LuciferYang commented on PR #43155:
URL: https://github.com/apache/spark/pull/43155#issuecomment-1738368044

   also cc @amaliujia for double check



Re: [PR] [SPARK-45357][CONNECT][TESTS] Normalize `dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite` [spark]

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.

LuciferYang commented on PR #43155:
URL: https://github.com/apache/spark/pull/43155#issuecomment-1746224877

   > In `PlanGenerationTestSuite`, the `planId` was reset before each test
   > 
   > https://github.com/apache/spark/blob/4863dec91f1cd596e60b42f1cbef9f6df327e562/connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala#L119-L121
   > 
   > IIRC, the `dataframeId` in `CollectMetrics` is also the `planId`, so is it possible to simply reset the `planId` before problematic test suites?
   
   No, that may not be feasible. It's not just a matter of `planId` here. If we followed that approach, we would also need to add a new function for xx to reset `Dataset#curId`.
   
   
   https://github.com/apache/spark/blob/97597ba52842c6fcda5db27171439f9c02c1a782/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L70
   
   https://github.com/apache/spark/blob/97597ba52842c6fcda5db27171439f9c02c1a782/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L204
   
   https://github.com/apache/spark/blob/97597ba52842c6fcda5db27171439f9c02c1a782/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L2219-L2222

