Posted to dev@mahout.apache.org by Paritosh Ranjan <pr...@xebia.com> on 2011/11/05 08:03:45 UTC

Top Down Clustering : CLI

Hi,

I have created the Java API for Top Down Clustering. 
https://issues.apache.org/jira/browse/MAHOUT-843

But the patch also needs a CLI in order to be accepted. I don't have any
knowledge of creating a CLI for Mahout. Can I get some guidance, help,
or suggestions?

Regards,
Paritosh

Re: Top Down Clustering : CLI

Posted by Paritosh Ranjan <pr...@xebia.com>.
Thanks for the help.

On 06-11-2011 09:16, Lance Norskog wrote:
> Except for the KMeansDriver part, yes, this is it. In fact you can give the
> full class name instead of 'clusterpp' when you run bin/mahout. Look at the
> different props files in src/conf: you can set up default params for your
> named job.


Re: Top Down Clustering : CLI

Posted by Lance Norskog <go...@gmail.com>.
Except for the KMeansDriver part, yes, this is it. In fact you can give the
full class name instead of 'clusterpp' when you run bin/mahout. Look at the
different props files in src/conf: you can set up default params for your
named job.

On Sat, Nov 5, 2011 at 11:44 AM, Paritosh Ranjan <pr...@xebia.com> wrote:

> I missed a line in driver.classes.props
>
> org.apache.mahout.clustering.topdown.postprocessor.ClusterOutputPostProcessor
> = clusterpp : ClusterPostProcessor
>
> So, is this all I need to create the CLI? Extending AbstractJob and adding
> this line in driver.classes.props.


-- 
Lance Norskog
goksron@gmail.com

Re: Top Down Clustering : CLI

Posted by Paritosh Ranjan <pr...@xebia.com>.
I missed a line in driver.classes.props:

org.apache.mahout.clustering.topdown.postprocessor.ClusterOutputPostProcessor
= clusterpp : ClusterPostProcessor

So, is this all I need to create the CLI: extending AbstractJob and
adding this line to driver.classes.props?
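
For readers unfamiliar with the format, here is a minimal sketch of how an entry of the form "className = shortName : description" can be read and used to look up a job by its short name, falling back to treating the argument as a full class name (which, as noted elsewhere in this thread, bin/mahout also accepts). This is an illustration only and not Mahout's actual driver code: DriverRegistry, load and resolve are hypothetical names, and the sketch uses nothing beyond java.util.Properties.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

/** Illustrative only: a hypothetical registry, not Mahout's own driver code. */
final class DriverRegistry {

  /** Maps a short job name (e.g. "clusterpp") to a fully qualified class name. */
  private final Map<String, String> classByShortName = new LinkedHashMap<>();

  /** Loads entries of the form: className = shortName : description */
  static DriverRegistry load(String propsText) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(propsText));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    DriverRegistry registry = new DriverRegistry();
    for (String className : props.stringPropertyNames()) {
      String value = props.getProperty(className);
      // Everything before the first ':' is the short name; the rest is a description.
      String shortName = value.split(":", 2)[0].trim();
      registry.classByShortName.put(shortName, className);
    }
    return registry;
  }

  /** Accepts either a registered short name or a fully qualified class name. */
  String resolve(String nameOrClass) {
    return classByShortName.getOrDefault(nameOrClass, nameOrClass);
  }

  public static void main(String[] args) {
    String propsText =
        "org.apache.mahout.clustering.topdown.postprocessor.ClusterOutputPostProcessor"
            + " = clusterpp : ClusterPostProcessor\n";
    DriverRegistry registry = DriverRegistry.load(propsText);
    System.out.println(registry.resolve("clusterpp"));
  }
}
```

Unknown names are passed through unchanged, so the caller can try to load them as classes directly.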

On 06-11-2011 00:10, Paritosh Ranjan wrote:
> I have extended the AbstractJob. Added methods run(String[] args) and 
> main(String[] args) to it as shown in the code below. Is this all I 
> need to create the CLI? ( I just need the input path, so , there is no 
> need of DefaultOptionCreator in this case ).
>
>   @Override
>   public int run(String[] args) throws Exception {
>
>       addInputOption();
>       if (parseArguments(args) == null) {
>         return -1;
>       }
>
>       Path input = getInputPath();
>
>       if (getConf() == null) {
>         setConf(new Configuration());
>       }
>
>     ClusterOutputPostProcessor clusterOutputPostProcessor =
>         new ClusterOutputPostProcessor(input, getConf());
>     clusterOutputPostProcessor.distributeVectors();
>     return 0;
> }
>
> public static void main(String[] args) throws Exception {
>     ToolRunner.run(new Configuration(), new KMeansDriver(), args);
> }


Re: Top Down Clustering : CLI

Posted by Paritosh Ranjan <pr...@xebia.com>.
I have extended AbstractJob and added run(String[] args) and
main(String[] args) methods to it, as shown in the code below. Is this
all I need to create the CLI? (I just need the input path, so there is
no need for DefaultOptionCreator in this case.)

  @Override
  public int run(String[] args) throws Exception {

    addInputOption();
    if (parseArguments(args) == null) {
      return -1;
    }

    Path input = getInputPath();

    if (getConf() == null) {
      setConf(new Configuration());
    }

    ClusterOutputPostProcessor clusterOutputPostProcessor =
        new ClusterOutputPostProcessor(input, getConf());
    clusterOutputPostProcessor.distributeVectors();
    return 0;
  }

  public static void main(String[] args) throws Exception {
    ToolRunner.run(new Configuration(), new KMeansDriver(), args);
  }
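
One point in the main() above, raised in the replies, is that it hands KMeansDriver to ToolRunner, so the run() method shown here would never be invoked. The sketch below illustrates the intended shape using plain-Java stand-ins, so that it runs without Hadoop or Mahout on the classpath: SimpleTool stands in for Hadoop's Tool interface, runTool for ToolRunner.run, and PostProcessorDriverSketch is a hypothetical name for the job class (the thread does not name it). The argument handling is a simplified assumption and not what AbstractJob's parseArguments does.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical driver; a plain-Java stand-in, not a real Mahout class. */
final class PostProcessorDriverSketch implements PostProcessorDriverSketch.SimpleTool {

  /** Stand-in for Hadoop's Tool interface (assumption for illustration). */
  interface SimpleTool {
    int run(String[] args);
  }

  /** Input path parsed from the command line; set by run() on success. */
  String input;

  @Override
  public int run(String[] args) {
    List<String> argList = Arrays.asList(args);
    int i = argList.indexOf("--input");
    if (i < 0 || i + 1 >= argList.size()) {
      System.err.println("Usage: --input <path>");
      return -1;
    }
    input = argList.get(i + 1);
    // The real job would build ClusterOutputPostProcessor here
    // and call distributeVectors() on it.
    return 0;
  }

  /** Stand-in for ToolRunner.run: delegates to the tool it is given. */
  static int runTool(SimpleTool tool, String[] args) {
    return tool.run(args);
  }

  public static void main(String[] args) {
    // Pass the job's own class (not KMeansDriver) and propagate its exit code.
    System.exit(runTool(new PostProcessorDriverSketch(), args));
  }
}
```

In the real code the equivalent change would be to pass an instance of the class that defines run() to ToolRunner.run. Since ToolRunner.run returns the tool's int result, main can also forward it as the process exit status.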


On 05-11-2011 17:30, Grant Ingersoll wrote:
> Have a look at KMeansDriver and driver.classes.props.  If you mirror KMeansDriver, you should be able to get what you are looking for.  For that matter, pretty much any Driver has a similar structure.


Re: Top Down Clustering : CLI

Posted by Grant Ingersoll <gs...@apache.org>.
Have a look at KMeansDriver and driver.classes.props.  If you mirror KMeansDriver, you should be able to get what you are looking for.  For that matter, pretty much any Driver has a similar structure.

On Nov 5, 2011, at 3:03 AM, Paritosh Ranjan wrote:

> Hi,
> 
> I have created the Java API for Top Down Clustering. https://issues.apache.org/jira/browse/MAHOUT-843
> 
> But the patch also needs a CLI in order to be accepted. I don't have any knowledge of creating CLI for Mahout. Can I get some guidance/help/suggestions?
> 
> Regards,
> Paritosh

--------------------------
Grant Ingersoll
http://www.lucidimagination.com