Posted to dev@mahout.apache.org by Grant Ingersoll <gs...@apache.org> on 2009/08/01 03:12:07 UTC

Re: About Frequent Pattern Mining

On Jul 28, 2009, at 1:39 PM, Ted Dunning wrote:

> On Tue, Jul 28, 2009 at 12:18 AM, Robin Anil <ro...@gmail.com> wrote:
>
>> ... We need modules
>> to convert data in databases (flat files, XML dumps, MySQL, different
>> formats on HDFS, HBase) into an intermediate form (say, vector).
>
>
> Yes.  We do need that.

+1

>
>
>> Ever considered having a workflow where we select an InputformatReader
>> job and an algorithm to perform (classification, clustering, itemset
>> mining), where the first process breaks different sources into the
>> vector format and then launches the algorithms?
>
>
> That is an intriguing thought.  How many algorithms have the same shape
> (as in, one input, one output, one algorithm, one input format)?

This might be a bit tricky due to the large number of options.
However, I do agree that we should try to standardize names, etc., and
use CLI2 in all places.

-Grant