You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "William1104 (via GitHub)" <gi...@apache.org> on 2024/02/14 15:28:04 UTC

[PR] SPARK-47043 add `jackson-core` and `jackson-annotations` dependencies to module `spark-common-utils` [spark]

William1104 opened a new pull request, #45103:
URL: https://github.com/apache/spark/pull/45103

   ### What changes were proposed in this pull request?
   This PR aims to fix `spark-common-utils` module by adding explicit `jackson-core` and `jackson-annotations` dependencies. 
   
   ### Why are the changes needed?
   `spark-common-utils` uses `jackson-core` and `jackson-annotations` like the following, but we didn't declare it explicitly. 
   
   https://github.com/apache/spark/blob/a6bed5e9bcc54dac51421263d5ef73c0b6e0b12c/common/utils/src/main/scala/org/apache/spark/ErrorClassesJSONReader.scala#L26
   https://github.com/apache/spark/blob/a6bed5e9bcc54dac51421263d5ef73c0b6e0b12c/common/utils/src/main/scala/org/apache/spark/ErrorClassesJSONReader.scala#L25
   https://github.com/apache/spark/blob/a6bed5e9bcc54dac51421263d5ef73c0b6e0b12c/common/utils/src/main/scala/org/apache/spark/util/JsonUtils.scala#L23
   
   ### How was this patch tested?
   Pass the CIs.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47043][BUILD] add `jackson-core` and `jackson-annotations` dependencies to module `spark-common-utils` [spark]

Posted by "William1104 (via GitHub)" <gi...@apache.org>.
William1104 commented on PR #45103:
URL: https://github.com/apache/spark/pull/45103#issuecomment-1948854042

   Hi @dongjoon-hyun,
   
   Thank you for providing the script. If I understand correctly, the script is designed to check if any dependencies have been updated in an unexpected manner for every profile. This functionality is extremely valuable as it helps us avoid making careless mistakes and provides the reviewer with a clearer understanding of how dependencies will be updated.
   
   It appears that several Spark modules are using libraries without proper declaration, or they have declared certain libraries but are not actually using them. To generate a report highlighting these issues, I ran the following command:
   
   ```
    ./build/mvn dependency:analyze | sed -n '/<<< dependency:.*:analyze/,/>>> dependency:.*:analyze/p' > dependency-analyze
   ```
   
   In the "dependency-analyze" report, let's take the module "spark-common-utils" as an example. It currently has a compile scope dependency on "commons-io," which could be changed to the test scope to avoid unnecessary transitive dependencies. Here are the relevant excerpts from the report:
   
   ```
   [INFO] --- dependency:3.6.1:analyze (default-cli) @ spark-common-utils_2.13 ---
   [WARNING] Used undeclared dependencies found:
   [WARNING]    com.fasterxml.jackson.core:jackson-annotations:jar:2.16.1:compile
   [WARNING]    org.apache.commons:commons-lang3:jar:3.14.0:compile
   [WARNING]    com.fasterxml.jackson.core:jackson-core:jar:2.16.1:compile
   [WARNING]    org.scala-lang:scala-library:jar:2.13.12:compile
   [WARNING]    org.scalatest:scalatest-funsuite_2.13:jar:3.2.17:test
   [WARNING]    org.scalactic:scalactic_2.13:jar:3.2.17:test
   [WARNING]    org.scalatest:scalatest-compatible:jar:3.2.17:test
   [WARNING]    org.scalatest:scalatest-core_2.13:jar:3.2.17:test
   [WARNING] Unused declared dependencies found:
   [WARNING]    com.fasterxml.jackson.module:jackson-module-scala_2.13:jar:2.16.1:compile
   [WARNING]    oro:oro:jar:2.0.8:compile
   [WARNING]    org.slf4j:jul-to-slf4j:jar:2.0.11:compile
   [WARNING]    org.slf4j:jcl-over-slf4j:jar:2.0.11:compile
   [WARNING]    org.apache.logging.log4j:log4j-slf4j2-impl:jar:2.22.1:compile
   [WARNING]    org.apache.logging.log4j:log4j-1.2-api:jar:2.22.1:compile
   [WARNING]    org.spark-project.spark:unused:jar:1.0.0:compile
   [WARNING]    org.scalatest:scalatest_2.13:jar:3.2.17:test
   [WARNING]    org.scalatestplus:scalacheck-1-17_2.13:jar:3.2.17.0:test
   [WARNING]    org.scalatestplus:mockito-4-11_2.13:jar:3.2.17.0:test
   [WARNING]    org.scalatestplus:selenium-4-12_2.13:jar:3.2.17.0:test
   [WARNING]    org.junit.jupiter:junit-jupiter:jar:5.9.3:test
   [WARNING]    net.aichler:jupiter-interface:jar:0.11.1:test
   [WARNING] Non-test scoped test only dependencies found:
   [WARNING]    commons-io:commons-io:jar:2.15.1:compile
   [INFO]
   ```
   This information suggests that there are both used undeclared dependencies and unused declared dependencies. Additionally, there are non-test scoped test-only dependencies, such as "commons-io:commons-io:jar:2.15.1:compile." These findings can help us identify areas where we can optimize and refine the dependency management within the "spark-common-utils" module.
   
   I would like to create PR to fix the dependency scope. 
   
   Thanks and regards, 
   William


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47043][BUILD] add `jackson-core` and `jackson-annotations` dependencies to module `spark-common-utils` [spark]

Posted by "William1104 (via GitHub)" <gi...@apache.org>.
William1104 commented on PR #45103:
URL: https://github.com/apache/spark/pull/45103#issuecomment-1949062510

   Hi @dongjoon-hyun , 
   
   Thanks for your explanation. Yes, the spark is a large and very dynamic community. It could be challenging to enforce this, especially since some modules already have more than required dependencies. I would propose fixing dependencies issue module by module, and then enforcing it in a systematic way (via maven dependency plugin). 
   
   Let me send an email to dev@spark.apache.org on this topic. Thanks a lot. 
   
   Best regards, 
   William


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47043][BUILD] add `jackson-core` and `jackson-annotations` dependencies to module `spark-common-utils` [spark]

Posted by "William1104 (via GitHub)" <gi...@apache.org>.
William1104 commented on PR #45103:
URL: https://github.com/apache/spark/pull/45103#issuecomment-1944365291

   Hi @dongjoon-hyun 
   
   I am sorry about that. I created the JIRA via cloning. I don't know how to update the assignee back to myself. Would you mind to change it back to me, or let me know how I can update the assignee. 
   
   Best regards,
   William


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47043][BUILD] add `jackson-core` and `jackson-annotations` dependencies to module `spark-common-utils` [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #45103:
URL: https://github.com/apache/spark/pull/45103#issuecomment-1944331715

   BTW, @William1104 , please don't borrow someone-else name like this.
   
   ![Screenshot 2024-02-14 at 09 59 56](https://github.com/apache/spark/assets/9700541/9dfafa5c-6335-4450-998e-6832c66f8b6b)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47043][BUILD] add `jackson-core` and `jackson-annotations` dependencies to module `spark-common-utils` [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #45103:
URL: https://github.com/apache/spark/pull/45103#issuecomment-1944373538

   No problem. I just wanted to inform you. I fixed the field by removing my name. :) 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47043][BUILD] add `jackson-core` and `jackson-annotations` dependencies to module `spark-common-utils` [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #45103:
URL: https://github.com/apache/spark/pull/45103#issuecomment-1944194406

   Please note that we usually fix the wrong dependency like the following situation in order to unblock other PRs. If you have a similar issue, please describe in the PR description, @William1104 .
   > Previously, it was provided by some unused `htmlunit-driver`'s transitive dependency accidentally. This causes a weird situation which `kvstore` module starts to fail to compile when we upgrade `htmlunit-driver`. We need to fix this first.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47043][BUILD] add `jackson-core` and `jackson-annotations` dependencies to module `spark-common-utils` [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #45103:
URL: https://github.com/apache/spark/pull/45103#issuecomment-1960064027

   Did you send an email to dev, @William1104 ? It seems that I missed it.
   > Let me send an email to [dev@spark.apache.org](mailto:dev@spark.apache.org) on this topic. Thanks a lot.
   
   Ya, I totally agree with you and want to get some further consensus in dev mailing list because @William1104 seems to be interested in this area. (Already 2 opened and I guess more to come).
   
   > Not sure if we want to make lots of changes like this, but the occasional targeted one seems OK.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47043][BUILD] add `jackson-core` and `jackson-annotations` dependencies to module `spark-common-utils` [spark]

Posted by "srowen (via GitHub)" <gi...@apache.org>.
srowen commented on PR #45103:
URL: https://github.com/apache/spark/pull/45103#issuecomment-1960006926

   I think this change itself is fine, and indeed technically correct. Not sure if we want to make lots of changes like this, but the occasional targeted one seems OK. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47043][BUILD] add `jackson-core` and `jackson-annotations` dependencies to module `spark-common-utils` [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #45103:
URL: https://github.com/apache/spark/pull/45103#issuecomment-1948928099

   Unlike other communities like Apache ORC, Apache Spark community doesn't  take advantage of `maven-dependency-plugin` to enforce those `Used undeclared` and `Unused declared` and `Non-test scoped test only dependencies`. The main reason is because it would be quite inconvenient in a large and dynamic community like Apache Spark community.
   
   Apache ORC Example:
   https://github.com/apache/orc/blob/0a02e4cde165b81fb93fc99456a130da4625ef30/java/core/pom.xml#L160-L172
   
   In addition, if you want to propose it, it should be applied systematically like Apache ORC community for all modules, not like many small PRs like this PR and #45101 .
   
   However, if you have a concern about this area and want to contribute that area, I'd like to recommend you to send an email to dev@spark.apache.org . We can discuss more and build a consensus on the necessity of your proposal, @William1104 . You can use your PRs' links in your email.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org