Posted to dev@mahout.apache.org by Grant Ingersoll <gs...@apache.org> on 2009/08/01 03:12:07 UTC
Re: About Frequent Pattern Mining
On Jul 28, 2009, at 1:39 PM, Ted Dunning wrote:
> On Tue, Jul 28, 2009 at 12:18 AM, Robin Anil <ro...@gmail.com>
> wrote:
>
>> ... We need modules
>> to convert data in databases (flat files, XML dumps, MySQL, different
>> formats on HDFS, HBase) into an intermediate form (say, vector).
>
>
> Yes. We do need that.
+1
>
>
>> Ever considered having a workflow where we select an InputformatReader
>> job and an algorithm to perform (classification, clustering, itemset
>> mining), where the first process breaks the different sources into the
>> vector format and then launches the algorithm?
>
>
> That is an intriguing thought. How many algorithms have the same
> shape (as in, one input, one output, one algorithm, one input format)?
This might be a bit tricky due to the large number of options.
However, I do agree that we should try to standardize names, etc., and
use CLI2 in all places.
-Grant