Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/03/04 04:40:43 UTC

[GitHub] [spark] pan3793 opened a new pull request #35730: [SPARK-38411] Use UTF-8 to read event log

pan3793 opened a new pull request #35730:
URL: https://github.com/apache/spark/pull/35730


   ### What changes were proposed in this pull request?
   
   Use UTF-8 instead of the system default encoding to read event logs.
   
   ### Why are the changes needed?
   
   After SPARK-29160, we should always use UTF-8 to read event logs. Otherwise, a Spark History Server running with a default charset other than UTF-8 will hit the following error.
   
   ```
   2022-03-04 12:16:00,143 [3752440] - INFO  [log-replay-executor-19:Logging@57] - Parsing hdfs://hz-cluster11/spark2-history/application_1640597251469_2453817_1.lz4 for listing data...
   2022-03-04 12:16:00,145 [3752442] - ERROR [log-replay-executor-18:Logging@94] - Exception while merging application listings
   java.nio.charset.MalformedInputException: Input length = 1
   	at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
   	at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
   	at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
   	at java.io.InputStreamReader.read(InputStreamReader.java:184)
   	at java.io.BufferedReader.fill(BufferedReader.java:161)
   	at java.io.BufferedReader.readLine(BufferedReader.java:324)
   	at java.io.BufferedReader.readLine(BufferedReader.java:389)
   	at scala.io.BufferedSource$BufferedLineIterator.hasNext(BufferedSource.scala:74)
   	at scala.collection.Iterator$$anon$20.hasNext(Iterator.scala:884)
   	at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:511)
   	at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:82)
   	at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$doMergeApplicationListing$4(FsHistoryProvider.scala:819)
   	at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$doMergeApplicationListing$4$adapted(FsHistoryProvider.scala:801)
   	at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2626)
   	at org.apache.spark.deploy.history.FsHistoryProvider.doMergeApplicationListing(FsHistoryProvider.scala:801)
   	at org.apache.spark.deploy.history.FsHistoryProvider.mergeApplicationListing(FsHistoryProvider.scala:715)
   	at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$checkForLogs$15(FsHistoryProvider.scala:581)
   	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
   	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:748)
   ```
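
   The failure mode in the trace above can be reproduced outside Spark. The following is a minimal sketch, not the code from this PR: the class name `EventLogCharsetDemo`, the helper `readFirstLine`, and the sample log line are all hypothetical, and US-ASCII is assumed as a stand-in for whatever non-UTF-8 default charset the History Server was running with. As in the trace, the reader uses a decoder that reports malformed input instead of replacing it.

   ```java
   import java.io.BufferedReader;
   import java.io.ByteArrayInputStream;
   import java.io.IOException;
   import java.io.InputStreamReader;
   import java.nio.charset.Charset;
   import java.nio.charset.MalformedInputException;
   import java.nio.charset.StandardCharsets;

   public class EventLogCharsetDemo {

       // Reads the first line with a fresh CharsetDecoder. A new decoder reports
       // malformed input (throws) rather than silently replacing it.
       static String readFirstLine(byte[] data, Charset charset) throws IOException {
           try (BufferedReader reader = new BufferedReader(
                   new InputStreamReader(new ByteArrayInputStream(data), charset.newDecoder()))) {
               return reader.readLine();
           }
       }

       public static void main(String[] args) throws IOException {
           // Hypothetical event-log line with a non-ASCII value, encoded as UTF-8.
           byte[] line = "{\"App Name\":\"caf\u00e9\"}\n".getBytes(StandardCharsets.UTF_8);

           // An explicit UTF-8 charset decodes the line regardless of the JVM default.
           System.out.println(readFirstLine(line, StandardCharsets.UTF_8));

           // US-ASCII is an assumed stand-in for a non-UTF-8 system default.
           try {
               readFirstLine(line, StandardCharsets.US_ASCII);
           } catch (MalformedInputException e) {
               System.out.println("Failed: " + e.getMessage());
           }
       }
   }
   ```

   Passing the charset explicitly at the read site, rather than relying on the JVM default, is the idea behind this change: the result no longer depends on the environment the History Server happens to run in.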
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, bug fix.
   
   ### How was this patch tested?
   
   Existing unit tests.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] pan3793 commented on pull request #35730: [SPARK-38411] Use UTF-8 to read event log

Posted by GitBox <gi...@apache.org>.
pan3793 commented on pull request #35730:
URL: https://github.com/apache/spark/pull/35730#issuecomment-1058822211


   cc @HeartSaVioR 




[GitHub] [spark] dongjoon-hyun commented on pull request #35730: [SPARK-38411][CORE] Use `UTF-8` when `doMergeApplicationListingInternal` reads event logs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #35730:
URL: https://github.com/apache/spark/pull/35730#issuecomment-1060015942


   Thank you so much for providing that, @pan3793.




[GitHub] [spark] dongjoon-hyun closed pull request #35730: [SPARK-38411][CORE] Use `UTF-8` when `doMergeApplicationListingInternal` reads event logs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun closed pull request #35730:
URL: https://github.com/apache/spark/pull/35730


   




[GitHub] [spark] AmplabJenkins commented on pull request #35730: [SPARK-38411][CORE] Use UTF-8 to read event log

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #35730:
URL: https://github.com/apache/spark/pull/35730#issuecomment-1059749574


   Can one of the admins verify this patch?




[GitHub] [spark] pan3793 commented on pull request #35730: [SPARK-38411][CORE] Use `UTF-8` when `doMergeApplicationListingInternal` reads event logs

Posted by GitBox <gi...@apache.org>.
pan3793 commented on pull request #35730:
URL: https://github.com/apache/spark/pull/35730#issuecomment-1059965467


   @dongjoon-hyun Sorry for the late reply. I updated the PR description to add the verification steps.

