Posted to user@spark.apache.org by matd <ma...@gmail.com> on 2016/07/06 13:23:50 UTC

spark 2.0 bloom filters

A question for Spark developers

I see that Bloom filters have been integrated into Spark 2.0:
<https://spark.apache.org/docs/2.0.0-preview/api/scala/index.html#org.apache.spark.util.sketch.BloomFilter>
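For context, here is a minimal sketch of how the Spark 2.0 class in the linked Scaladoc can be used. The class and method names come from the Scaladoc; the sizing parameters and keys below are illustrative only:

```scala
import org.apache.spark.util.sketch.BloomFilter

object BloomFilterSketch {
  def main(args: Array[String]): Unit = {
    // Size the filter for ~10,000 expected items with a 3% false-positive rate.
    // Unlike Hadoop's DynamicBloomFilter, the capacity is fixed at creation time.
    val bf = BloomFilter.create(10000L, 0.03)

    bf.putString("user-42")
    bf.putString("user-99")

    // Bloom filters have no false negatives: inserted items always report true.
    println(bf.mightContainString("user-42"))  // true

    // Items never inserted usually report false, but may report true
    // with probability up to the configured false-positive rate.
    println(bf.mightContainString("user-7"))
  }
}
```

Two filters built with compatible parameters can also be combined with `bf.mergeInPlace(other)`, which is what Spark does when aggregating partial filters across partitions.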

Hadoop already has some Bloom filter implementations, notably a dynamic one
<https://hadoop.apache.org/docs/r2.7.2/api/org/apache/hadoop/util/bloom/DynamicBloomFilter.html>
that is very interesting when the number of keys largely exceeds what was anticipated.

Is there any rationale (performance, implementation details, ...) for this new implementation in Spark instead of reusing the one from Hadoop?

Thanks !



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-2-0-bloom-filters-tp27297.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org