Just pushed a version of the refactor that passes unit tests; see https://github.com/apache/mahout/pull/86 for the PR. It doesn’t have the isDirectory fix yet, so it will not run on Hadoop 1.2.1, but if anyone can test a clustered Spark or Hadoop job on it, that would be appreciated.