Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/03/01 07:54:26 UTC

[GitHub] [spark] weixiuli opened a new pull request #35693: [SPARK-38191][CORE][FOLLOWUP] The staging directory of write job only needs to be initialized once in HadoopMapReduceCommitProtocol

weixiuli opened a new pull request #35693:
URL: https://github.com/apache/spark/pull/35693


   ### What changes were proposed in this pull request?
   
   This PR follows up on https://github.com/apache/spark/pull/35492 and uses a `stagingDir` constant instead of the `stagingDir` method in `HadoopMapReduceCommitProtocol`.
   
   ### Why are the changes needed?
   
   As reported in https://github.com/apache/spark/pull/35492#issuecomment-1054910730, the following test fails when run with the Hadoop 2 profile:
   
   ```
   ./build/sbt -mem 4096 -Phadoop-2 "sql/testOnly org.apache.spark.sql.sources.PartitionedWriteSuite -- -z SPARK-27194"
   ...
   [info]   Cause: org.apache.spark.SparkException: Task not serializable
   ...
   [info]   Cause: java.io.NotSerializableException: org.apache.hadoop.fs.Path
   ...
   
   ```
   This happens because `org.apache.hadoop.fs.Path` is serializable in Hadoop 3 but not in Hadoop 2. So `stagingDir` should be marked transient: it is then skipped when the commit protocol is serialized, and the failure is avoided.
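   
   The mechanism can be illustrated with a minimal, self-contained sketch using plain Java serialization. Everything in it is hypothetical and stands in for the real classes: `FakePath` plays the role of a non-serializable `Path`, `EagerProtocol` and `LazyProtocol` are toy commit protocols, and the `.staging-<jobId>` name is only illustrative. It is not the code from this PR; in Scala the analogous idiom would be a `@transient lazy val`.
   
   ```java
   import java.io.ByteArrayInputStream;
   import java.io.ByteArrayOutputStream;
   import java.io.IOException;
   import java.io.NotSerializableException;
   import java.io.ObjectInputStream;
   import java.io.ObjectOutputStream;
   import java.io.Serializable;
   import java.io.UncheckedIOException;
   
   public class TransientStagingDirSketch {
   
       // Hypothetical stand-in for a class that is NOT Serializable (as Path is in Hadoop 2).
       static final class FakePath {
           final String value;
           FakePath(String value) { this.value = value; }
       }
   
       // Holds the non-serializable value in an ordinary field: serialization fails.
       static final class EagerProtocol implements Serializable {
           private static final long serialVersionUID = 1L;
           private final FakePath stagingDir;
           EagerProtocol(String path, String jobId) {
               this.stagingDir = new FakePath(path + "/.staging-" + jobId);
           }
       }
   
       // Marks the field transient and computes it on first use: it is skipped during
       // serialization and rebuilt from the serializable inputs after deserialization.
       static final class LazyProtocol implements Serializable {
           private static final long serialVersionUID = 1L;
           private final String path;
           private final String jobId;
           private transient FakePath stagingDir;
           LazyProtocol(String path, String jobId) {
               this.path = path;
               this.jobId = jobId;
           }
           FakePath stagingDir() {
               if (stagingDir == null) {
                   stagingDir = new FakePath(path + "/.staging-" + jobId);
               }
               return stagingDir;
           }
       }
   
       // Serializes a value to bytes and reads it back, wrapping checked exceptions.
       static <T> T roundTrip(T value) {
           try {
               ByteArrayOutputStream bytes = new ByteArrayOutputStream();
               try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                   out.writeObject(value);
               }
               try (ObjectInputStream in =
                        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                   @SuppressWarnings("unchecked")
                   T copy = (T) in.readObject();
                   return copy;
               }
           } catch (IOException e) {
               throw new UncheckedIOException(e);
           } catch (ClassNotFoundException e) {
               throw new IllegalStateException(e);
           }
       }
   
       // Returns false only when the failure is a NotSerializableException.
       static boolean serializes(Object value) {
           try {
               roundTrip(value);
               return true;
           } catch (UncheckedIOException e) {
               if (e.getCause() instanceof NotSerializableException) {
                   return false;
               }
               throw e;
           }
       }
   
       public static void main(String[] args) {
           EagerProtocol eager = new EagerProtocol("/tmp/out", "job-1");
           System.out.println("eager serializes: " + serializes(eager));
   
           LazyProtocol lazy = new LazyProtocol("/tmp/out", "job-1");
           lazy.stagingDir(); // initialized on the sending side, yet not written out
           LazyProtocol copy = roundTrip(lazy);
           System.out.println("lazy copy stagingDir: " + copy.stagingDir().value);
       }
   }
   ```
   
   The transient field is still initialized at most once per deserialized instance, which keeps the "initialize once" intent of the original change while leaving the non-serializable value out of the serialized task.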
   
   ### Does this PR introduce _any_ user-facing change?
   No
   ### How was this patch tested?
   
   Verified that `./build/sbt -mem 4096 -Phadoop-2 "sql/testOnly org.apache.spark.sql.sources.PartitionedWriteSuite -- -z SPARK-27194"` passes.
   
   Passes the CI checks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #35693: [SPARK-38191][CORE][FOLLOWUP] The staging directory of write job only needs to be initialized once in HadoopMapReduceCommitProtocol

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #35693:
URL: https://github.com/apache/spark/pull/35693#issuecomment-1057520933


   Merged to master




[GitHub] [spark] srowen closed pull request #35693: [SPARK-38191][CORE][FOLLOWUP] The staging directory of write job only needs to be initialized once in HadoopMapReduceCommitProtocol

Posted by GitBox <gi...@apache.org>.
srowen closed pull request #35693:
URL: https://github.com/apache/spark/pull/35693


   




[GitHub] [spark] AmplabJenkins commented on pull request #35693: [SPARK-38191][CORE][FOLLOWUP] The staging directory of write job only needs to be initialized once in HadoopMapReduceCommitProtocol

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #35693:
URL: https://github.com/apache/spark/pull/35693#issuecomment-1056306915


   Can one of the admins verify this patch?




[GitHub] [spark] weixiuli commented on pull request #35693: [SPARK-38191][CORE][FOLLOWUP] The staging directory of write job only needs to be initialized once in HadoopMapReduceCommitProtocol

Posted by GitBox <gi...@apache.org>.
weixiuli commented on pull request #35693:
URL: https://github.com/apache/spark/pull/35693#issuecomment-1055125957


   cc @srowen @Ngone51 PTAL.

