You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "wangyum (via GitHub)" <gi...@apache.org> on 2023/05/17 07:48:21 UTC

[GitHub] [spark] wangyum opened a new pull request, #41195: [SPARK-43534][BUILD] Add log4j-1.2-api and log4j-slf4j2-impl to classpath if active hadoop-provided

wangyum opened a new pull request, #41195:
URL: https://github.com/apache/spark/pull/41195

   ### What changes were proposed in this pull request?
   
   This PR adds `log4j-1.2-api` and `log4j-slf4j2-impl` to classpath if active `hadoop-provided`.
   
   ### Why are the changes needed?
   
   To fix log issue.
   
   How to reproduce this issue:
   1. Build Spark:
      ```
      ./dev/make-distribution.sh --name default --tgz -Phive -Phive-thriftserver -Pyarn -Phadoop-provided
      tar -zxf spark-3.5.0-SNAPSHOT-bin-default.tgz 
      ```
   2. Remove the following jars from spark-3.5.0-SNAPSHOT-bin-default:
      ```
      jars/log4j-1.2-api-2.20.0.jar
      jars/log4j-slf4j2-impl-2.20.0.jar
      ```
   3. Add a new log4j2.properties to spark-3.5.0-SNAPSHOT-bin-default/conf:
      ```
      rootLogger.level = info
      rootLogger.appenderRef.file.ref = File
      rootLogger.appenderRef.stderr.ref = console
      
      appender.console.type = Console
      appender.console.name = console
      appender.console.target = SYSTEM_ERR
      appender.console.layout.type = PatternLayout
      appender.console.layout.pattern = %d{yy/MM/dd HH:mm:ss,SSS} %p [%t] %c{2}:%L : %m%n
      
      appender.file.type = RollingFile
      appender.file.name = File
      appender.file.fileName = /tmp/spark/logs/spark.log
      appender.file.filePattern = /tmp/spark/logs/spark.%d{yyyyMMdd-HH}.log
      appender.file.append = true
      appender.file.layout.type = PatternLayout
      appender.file.layout.pattern = %d{yy/MM/dd HH:mm:ss,SSS} %p [%t] %c{2}:%L : %m%n
      appender.file.policies.type = Policies
      appender.file.policies.time.type = TimeBasedTriggeringPolicy
      appender.file.policies.time.interval = 1
      appender.file.policies.time.modulate = true
      appender.file.policies.size.type = SizeBasedTriggeringPolicy
      appender.file.policies.size.size = 256M
      appender.file.strategy.type = DefaultRolloverStrategy
      appender.file.strategy.max = 100
      ```
   4. Start Spark thriftserver: `sbin/start-thriftserver.sh`
   
   5. The log file is empty: `cat /tmp/spark/logs/spark.log`.
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Manual test.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #41195: [SPARK-43534][BUILD] Add log4j-1.2-api and log4j-slf4j2-impl to classpath if active hadoop-provided

Posted by "srowen (via GitHub)" <gi...@apache.org>.
srowen commented on PR #41195:
URL: https://github.com/apache/spark/pull/41195#issuecomment-1556043789

   Merged to master


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on pull request #41195: [SPARK-43534][BUILD] Add log4j-1.2-api and log4j-slf4j2-impl to classpath if active hadoop-provided

Posted by "Kimahriman (via GitHub)" <gi...@apache.org>.
Kimahriman commented on PR #41195:
URL: https://github.com/apache/spark/pull/41195#issuecomment-1555905091

   Nevermind it looks like including `log4j-1.2-api` actually fixes the issue, I wasn't including that before. In fact, just including `log4j-1.2-api` fixes the issue I made for https://issues.apache.org/jira/browse/SPARK-40246. That might not fix custom log4j2 configs, but at least makes `SparkContext.setLogLevel` work with a hadoop-provided build


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] wangyum commented on pull request #41195: [SPARK-43534][BUILD] Add log4j-1.2-api and log4j-slf4j2-impl to classpath if active hadoop-provided

Posted by "wangyum (via GitHub)" <gi...@apache.org>.
wangyum commented on PR #41195:
URL: https://github.com/apache/spark/pull/41195#issuecomment-1555428658

   @Kimahriman Do you have a way to reproduce?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #41195: [SPARK-43534][BUILD] Add log4j-1.2-api and log4j-slf4j2-impl to classpath if active hadoop-provided

Posted by "srowen (via GitHub)" <gi...@apache.org>.
srowen commented on PR #41195:
URL: https://github.com/apache/spark/pull/41195#issuecomment-1554550499

   Seems reasonable then. Let's just get the tests to run again.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen closed pull request #41195: [SPARK-43534][BUILD] Add log4j-1.2-api and log4j-slf4j2-impl to classpath if active hadoop-provided

Posted by "srowen (via GitHub)" <gi...@apache.org>.
srowen closed pull request #41195: [SPARK-43534][BUILD] Add log4j-1.2-api and log4j-slf4j2-impl to classpath if active hadoop-provided
URL: https://github.com/apache/spark/pull/41195


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #41195: [SPARK-43534][BUILD] Add log4j-1.2-api and log4j-slf4j2-impl to classpath if active hadoop-provided

Posted by "srowen (via GitHub)" <gi...@apache.org>.
srowen commented on PR #41195:
URL: https://github.com/apache/spark/pull/41195#issuecomment-1551422222

   Can you explain this more - your steps to reproduce require you to remove files from the build, and that causes the problem? but the JARs were there before you removed them. How does removing the scope change the result?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on pull request #41195: [SPARK-43534][BUILD] Add log4j-1.2-api and log4j-slf4j2-impl to classpath if active hadoop-provided

Posted by "Kimahriman (via GitHub)" <gi...@apache.org>.
Kimahriman commented on PR #41195:
URL: https://github.com/apache/spark/pull/41195#issuecomment-1555080586

   Actually hit a new issue related to this after finally being able to test out 3.4 from the Delta release. Because of the bump to slf4j 2, it seems `log4j-slf4j2-impl` doesn't get recognized, so it always falls back to the log4j/reload4j 1.x for logging. I had to also include slf4j-api 2.0.6 to get Spark's built-in log handling to work correctly (i.e. SparkContext.setLogLevel)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #41195: [SPARK-43534][BUILD] Add log4j-1.2-api and log4j-slf4j2-impl to classpath if active hadoop-provided

Posted by "srowen (via GitHub)" <gi...@apache.org>.
srowen commented on PR #41195:
URL: https://github.com/apache/spark/pull/41195#issuecomment-1551701267

   It seems weird that log4j 2 config works, if you add log4j 1.x. Maybe so, just trying to figure out if this is really what's going on and if we have to let log4j 1.x back in? because then we have both log4j in the distribution


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on pull request #41195: [SPARK-43534][BUILD] Add log4j-1.2-api and log4j-slf4j2-impl to classpath if active hadoop-provided

Posted by "Kimahriman (via GitHub)" <gi...@apache.org>.
Kimahriman commented on PR #41195:
URL: https://github.com/apache/spark/pull/41195#issuecomment-1552245794

   Maybe similar reason I made https://github.com/apache/spark/pull/37694 a while ago? Basically Spark logging setup assumes log4j2, but with hadoop provided you get 1.x from Hadoop. So you get weird logging behavior. We just have both providers in the class path and slf seems to pick the 2.x one


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] viirya commented on a diff in pull request #41195: [SPARK-43534][BUILD] Add log4j-1.2-api and log4j-slf4j2-impl to classpath if active hadoop-provided

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #41195:
URL: https://github.com/apache/spark/pull/41195#discussion_r1198004599


##########
pom.xml:
##########
@@ -763,7 +762,6 @@
         <groupId>org.apache.logging.log4j</groupId>
         <artifactId>log4j-1.2-api</artifactId>
         <version>${log4j.version}</version>

Review Comment:
   This looks okay. This is bridge between log4j1 and log4j2. In case of hadoop-provided, we still need this as hadoop classpath doesn't provide it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] wangyum commented on pull request #41195: [SPARK-43534][BUILD] Add log4j-1.2-api and log4j-slf4j2-impl to classpath if active hadoop-provided

Posted by "wangyum (via GitHub)" <gi...@apache.org>.
wangyum commented on PR #41195:
URL: https://github.com/apache/spark/pull/41195#issuecomment-1551463081

   Thank you @srowen. I have updated the PR description.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org