Posted to common-dev@hadoop.apache.org by Arnoldo Muller <ar...@gmail.com> on 2008/10/09 13:49:42 UTC
Student Project: Filesystem namespace partitioning
Hello,
My name is Arnoldo Muller, and I am a final-year PhD candidate.
I am working on similarity search for detecting Open Source license violations
(www.furiachan.org). In my spare time, I also code a similarity
search engine (www.obsearch.net).
I am interested in the Apache Hadoop Open Source Student Project:
"Performance evaluation of existing Locality Sensitive Hashing schemes.
Research on new hashing schemes for filesystem namespace partitioning"
If nobody is working on this, I would like to know more about the scope of the
project. Do you want to define a distance function so that
similar namespaces are grouped together into the same "bucket"?
If so, I have three or four metric trees that could be used for the comparison.
Thanks,
Arnoldo Muller