Posted to user@spark.apache.org by matd <ma...@gmail.com> on 2017/06/12 21:14:19 UTC

Calling broadcast() multiple times on the same df: is it cached?

Hi spark folks,

In our application, we have to join a dataframe with several other dataframes
(not always on the same join column).

This left-hand side df is not very large, so a broadcast hint may be
beneficial.

My questions:
- If the same df gets broadcast multiple times, will the transfer occur once
(with the broadcast data somehow cached on the executors), or multiple times?
- If the joins use different columns, will it be cached as well?

Thanks for your insights
Mathieu




