Posted to reviews@spark.apache.org by jinxing64 <gi...@git.apache.org> on 2017/10/24 03:38:04 UTC

[GitHub] spark issue #19560: [SPARK-22334][SQL] Check table size from filesystem in c...

Github user jinxing64 commented on the issue:

    https://github.com/apache/spark/pull/19560
  
    @gatorsmile @dongjoon-hyun 
    
    Thanks a lot for looking into this.
    This PR aims to avoid OOM when the metastore fails to update table properties after the data has already been produced. With the config introduced in this PR enabled, we check the size on the filesystem only when `totalSize` is below `spark.sql.autoBroadcastJoinThreshold`, so I think the cost is acceptable.
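
    The idea above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the names `sizeOnFilesystem` and `effectiveTableSize` are hypothetical, and the real change works against Spark's catalog and Hadoop `FileSystem` API rather than `java.nio.file`.

```scala
import java.nio.file.{Files, Path}

object TableSizeSketch {
  // Hypothetical helper: total size of all regular files under a table's
  // storage location (stands in for a recursive Hadoop FileSystem scan).
  def sizeOnFilesystem(location: Path): Long =
    Files.walk(location)
      .filter(p => Files.isRegularFile(p))
      .mapToLong(p => Files.size(p))
      .sum()

  // Hypothetical decision logic: only when the metastore-reported totalSize
  // is below the broadcast threshold do we pay the cost of scanning the
  // filesystem, guarding against a stale/zero totalSize that would otherwise
  // trigger a broadcast join and risk OOM.
  def effectiveTableSize(totalSizeFromMetastore: Long,
                         broadcastThreshold: Long,
                         location: Path): Long =
    if (totalSizeFromMetastore < broadcastThreshold)
      math.max(totalSizeFromMetastore, sizeOnFilesystem(location))
    else
      totalSizeFromMetastore
}
```

    A table whose reported size already exceeds the threshold is never scanned, which is why the extra cost is bounded.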
    
    Yes, the storage can be on other filesystems. I have refined the name. Please take another look when you have time.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org