Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/02/05 16:06:52 UTC

[GitHub] attilapiros commented on a change in pull request #23688: [SPARK-25035][Core] Avoiding memory mapping at disk-stored blocks replication

URL: https://github.com/apache/spark/pull/23688#discussion_r253930633
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/storage/DiskStore.scala
 ##########
 @@ -126,6 +127,12 @@ private[spark] class DiskStore(
     }
   }
 
+  def moveFileToBlock(sourceFile: File, blockSize: Long, targetBlockId: BlockId): Unit = {
+    blockSizes.put(targetBlockId, blockSize)
 
 Review comment:
  Yes, you are right: `blockSizes` does not contain the size of the block stored in the temporary file, so `toBlockData` would return an invalid value. However, because `TempFileBasedBlockStoreUpdater` was only used for DISK_ONLY storage levels, where the replication factor is 1, this code path was never executed.
   
  I started to correct it, then realised that with a bit more restructuring of the `BlockStoreUpdater` source we can achieve further performance improvements and better code readability. I will add some more tests as well.
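
To make the issue concrete, here is a minimal, self-contained Scala sketch (not the actual Spark `DiskStore` code; the names `SimpleDiskStore`, `fileFor`, and `getSize` are illustrative) showing why the block size has to be recorded in a `blockSizes` map when a temporary file is adopted as a block, so that a later size lookup returns the correct value instead of a sentinel:

```scala
import java.io.File
import java.nio.file.{Files, StandardCopyOption}
import java.util.concurrent.ConcurrentHashMap

class SimpleDiskStore(rootDir: File) {
  // Maps a block id to its size in bytes; a stand-in for DiskStore.blockSizes.
  private val blockSizes = new ConcurrentHashMap[String, Long]()

  private def fileFor(blockId: String): File = new File(rootDir, blockId)

  // Adopt a temporary file as the on-disk data for `targetBlockId`.
  // Registering the size here is what keeps later size lookups consistent.
  def moveFileToBlock(sourceFile: File, blockSize: Long, targetBlockId: String): Unit = {
    blockSizes.put(targetBlockId, blockSize)
    Files.move(sourceFile.toPath, fileFor(targetBlockId).toPath,
      StandardCopyOption.REPLACE_EXISTING)
  }

  // Stand-in for the size lookup done when the block is later served (e.g. replication).
  def getSize(blockId: String): Long = blockSizes.getOrDefault(blockId, -1L)
}

object SimpleDiskStoreDemo extends App {
  val root = Files.createTempDirectory("blocks").toFile
  val tmp  = File.createTempFile("temp-block", ".bin")
  Files.write(tmp.toPath, Array.fill[Byte](128)(1.toByte))

  val store = new SimpleDiskStore(root)
  store.moveFileToBlock(tmp, blockSize = 128L, targetBlockId = "rdd_0_0")

  // Without the blockSizes.put call above, this would return the -1 sentinel:
  // the analogue of the invalid value described in the comment.
  println(store.getSize("rdd_0_0")) // prints 128
}
```

In this sketch, skipping the `blockSizes.put` line is exactly the situation the comment describes: the file exists on disk, but any size-dependent read of the block would use a missing or invalid value.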

