Posted to dev@mahout.apache.org by Jeff Eastman <jd...@windwardsolutions.com> on 2010/06/16 03:13:51 UTC
Testing Wikipedia?
I've made changes (patch in MAHOUT-167e.patch) to migrate the
WikipediaDatasetCreatorEtc to 0.20.2. The changes compile and the
existing unit tests all run. But I had to port new 0.20 versions of
MultipleOutputFormat and MultipleTextOutputFormat to do this, and there
are no unit tests for any of the wikipedia code in this package.
Further, the code snippets to run the full example in the wiki
(https://cwiki.apache.org/MAHOUT/wikipediabayesexample.html) are
obsolete, and build-deprecated.xml is no longer in trunk. This makes
verifying the correctness of my port pretty difficult, for me at least,
since this is all unfamiliar code. What shall I do?
A. commit it, since the unit tests all run, and hope somebody else will
verify the example
B. get help to run the example to verify it is correct, then commit it
C. leave the patch in jira and move on to utils
I'm loath to do A and would prefer to do B; however, C is what I'm going
to have to do in the short term due to my schedule.
Jeff