Posted to user@spark.apache.org by morr0723 <mi...@gmail.com> on 2014/12/22 03:10:37 UTC

locality sensitive hashing for spark

I've pushed out an implementation of locality-sensitive hashing (LSH) for Spark.
LSH has a number of use cases, the most prominent being when the features are
not based in Euclidean space.

Code, documentation, and a small example dataset are available on GitHub:

https://github.com/mrsqueeze/spark-hash

Feel free to pass along any comments or issues.

Enjoy!





--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/locality-sensitive-hashing-for-spark-tp20803.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: locality sensitive hashing for spark

Posted by Michael Orr <mi...@gmail.com>.
The implementation is closely aligned with Jaccard similarity. It should be possible to swap out the hash functions for a family that is compatible with other distance measures.
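
To make the idea of swapping hash families concrete, here is a minimal Python sketch. It is not code from spark-hash, and every name in it (make_minhash_family, make_hyperplane_family, agreement) is hypothetical. It assumes set elements are encoded as non-negative integer ids and vectors are plain lists of floats. The first family is minhash, which pairs with Jaccard similarity; the second is random-hyperplane hashing, a standard family for cosine similarity.

```python
import random

PRIME = 2_147_483_647  # Mersenne prime 2^31 - 1, used as the hash modulus


def make_minhash_family(num_hashes, seed=0):
    """Hypothetical helper: minhash signatures for sets of integer ids.

    Two sets agree on one signature position with probability roughly
    equal to their Jaccard similarity.
    """
    rng = random.Random(seed)
    # One (a, b) pair per hash function h(x) = (a*x + b) mod PRIME, a != 0.
    params = [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
              for _ in range(num_hashes)]

    def signature(items):
        # items: non-empty set of integer ids in [0, PRIME)
        return [min((a * x + b) % PRIME for x in items) for a, b in params]

    return signature


def make_hyperplane_family(num_hashes, dim, seed=0):
    """Hypothetical helper: random-hyperplane signatures for dense vectors.

    Two vectors agree on one bit with probability 1 - theta/pi, where
    theta is the angle between them, so this family tracks cosine similarity.
    """
    rng = random.Random(seed)
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
              for _ in range(num_hashes)]

    def signature(vec):
        # Each bit records which side of a random hyperplane the vector is on.
        return [1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
                for plane in planes]

    return signature


def agreement(sig_a, sig_b):
    """Fraction of positions where two equal-length signatures match."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


# Usage: the downstream code only sees signatures, so the family can be swapped.
minhash = make_minhash_family(num_hashes=128, seed=42)
jaccard_estimate = agreement(minhash({1, 2, 3, 4}), minhash({3, 4, 5, 6}))

hyperplane = make_hyperplane_family(num_hashes=128, dim=3, seed=42)
bit_agreement = agreement(hyperplane([1.0, 0.0, 0.5]), hyperplane([0.9, 0.1, 0.4]))
```

In this sketch both families return fixed-length signatures, so any banding or bucketing step that consumes signatures would not need to know which distance measure is in play. Whether spark-hash can be restructured this way is an assumption, not something verified against the repository.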



> On Dec 22, 2014, at 1:16 AM, Nick Pentreath <ni...@gmail.com> wrote:
> 
> Looks interesting, thanks for sharing.
> 
> Does it support cosine similarity? I only saw Jaccard mentioned from a quick glance.


Re: locality sensitive hashing for spark

Posted by Nick Pentreath <ni...@gmail.com>.
Looks interesting, thanks for sharing.

Does it support cosine similarity? I only saw Jaccard mentioned from a quick glance.


