Posted to dev@mahout.apache.org by Paritosh Ranjan <pr...@xebia.com> on 2011/11/05 08:03:45 UTC
Top Down Clustering : CLI
Hi,
I have created the Java API for Top Down Clustering.
https://issues.apache.org/jira/browse/MAHOUT-843
But the patch also needs a CLI in order to be accepted. I don't have any
experience creating a CLI for Mahout. Can I get some
guidance/help/suggestions?
Regards,
Paritosh
Re: Top Down Clustering : CLI
Posted by Paritosh Ranjan <pr...@xebia.com>.
Thanks for the help.
On 06-11-2011 09:16, Lance Norskog wrote:
> Except for the KMeansDriver part, yes, this is it. In fact you can give the
> full class name instead of 'clusterpp' when you run bin/mahout. Look at the
> different props files in src/conf: you can set up default params for your
> named job.
Re: Top Down Clustering : CLI
Posted by Lance Norskog <go...@gmail.com>.
Except for the KMeansDriver part, yes, this is it. In fact you can give the
full class name instead of 'clusterpp' when you run bin/mahout. Look at the
different props files in src/conf: you can set up default params for your
named job.
On Sat, Nov 5, 2011 at 11:44 AM, Paritosh Ranjan <pr...@xebia.com> wrote:
> I missed a line in driver.classes.props
>
> org.apache.mahout.clustering.topdown.postprocessor.ClusterOutputPostProcessor = clusterpp : ClusterPostProcessor
>
> So, is this all I need to create the CLI? Extending AbstractJob and adding
> this line in driver.classes.props.
--
Lance Norskog
goksron@gmail.com
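The point that bin/mahout accepts either the short name ('clusterpp') or the full class name can be illustrated with a small lookup sketch. This is not MahoutDriver's actual code; the class DriverLookupSketch and its resolve method are hypothetical, and the only thing taken from the thread is the "class = shortName : description" layout of driver.classes.props.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch of resolving a job name against driver.classes.props.
// It only assumes the "fully.qualified.Class = shortName : description" layout.
final class DriverLookupSketch {

    /**
     * Returns the class name registered under the given short name, or the
     * argument unchanged if no entry matches (treating it as a full class name).
     */
    static String resolve(String propsText, String nameOrClass) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propsText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String className : props.stringPropertyNames()) {
            // The value looks like "clusterpp : ClusterPostProcessor".
            String shortName = props.getProperty(className).split(":", 2)[0].trim();
            if (shortName.equals(nameOrClass)) {
                return className;
            }
        }
        return nameOrClass;
    }

    public static void main(String[] args) {
        String propsText =
            "org.apache.mahout.clustering.topdown.postprocessor.ClusterOutputPostProcessor"
                + " = clusterpp : ClusterPostProcessor\n";
        System.out.println(resolve(propsText, "clusterpp"));
        System.out.println(resolve(propsText, "org.example.SomeOtherDriver"));
    }
}
```

With a lookup of this kind, the short name is only a convenience: a class that is not listed in the props file can presumably still be launched by passing its fully qualified name.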
Re: Top Down Clustering : CLI
Posted by Paritosh Ranjan <pr...@xebia.com>.
I missed a line in driver.classes.props:
org.apache.mahout.clustering.topdown.postprocessor.ClusterOutputPostProcessor = clusterpp : ClusterPostProcessor
So, is this all I need to create the CLI: extending AbstractJob and
adding this line to driver.classes.props?
On 06-11-2011 00:10, Paritosh Ranjan wrote:
> I have extended AbstractJob and added the methods run(String[] args) and
> main(String[] args) to it, as shown in the code below. Is this all I
> need to create the CLI? (I just need the input path, so there is no
> need for DefaultOptionCreator in this case.)
>
> @Override
> public int run(String[] args) throws Exception {
>
>   addInputOption();
>   if (parseArguments(args) == null) {
>     return -1;
>   }
>
>   Path input = getInputPath();
>
>   if (getConf() == null) {
>     setConf(new Configuration());
>   }
>
>   ClusterOutputPostProcessor clusterOutputPostProcessor =
>       new ClusterOutputPostProcessor(input, getConf());
>   clusterOutputPostProcessor.distributeVectors();
>   return 0;
> }
>
> public static void main(String[] args) throws Exception {
>   ToolRunner.run(new Configuration(), new KMeansDriver(), args);
> }
Re: Top Down Clustering : CLI
Posted by Paritosh Ranjan <pr...@xebia.com>.
I have extended AbstractJob and added the methods run(String[] args) and
main(String[] args) to it, as shown in the code below. Is this all I need
to create the CLI? (I just need the input path, so there is no need for
DefaultOptionCreator in this case.)
@Override
public int run(String[] args) throws Exception {
  addInputOption();
  if (parseArguments(args) == null) {
    return -1;
  }
  Path input = getInputPath();
  if (getConf() == null) {
    setConf(new Configuration());
  }
  ClusterOutputPostProcessor clusterOutputPostProcessor =
      new ClusterOutputPostProcessor(input, getConf());
  clusterOutputPostProcessor.distributeVectors();
  return 0;
}

public static void main(String[] args) throws Exception {
  ToolRunner.run(new Configuration(), new KMeansDriver(), args);
}
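Note that main above hands a KMeansDriver to ToolRunner, which a later reply in this thread flags as the one part to change: the entry point should launch the new driver itself. Below is a dependency-free sketch of that shape. Tool and runTool are simplified stand-ins for Hadoop's Tool and ToolRunner.run, ClusterPostProcessorDriver is a hypothetical name, and the hand-rolled --input parsing only imitates what addInputOption() and parseArguments() provide in AbstractJob.

```java
import java.util.Arrays;
import java.util.List;

// Dependency-free sketch: Tool and runTool are simplified stand-ins for
// org.apache.hadoop.util.Tool and ToolRunner.run, NOT the real Hadoop API.
final class PostProcessorDriverSketch {

    /** Stand-in for Hadoop's Tool interface. */
    interface Tool {
        int run(String[] args) throws Exception;
    }

    /** Hypothetical driver; in Mahout this would extend AbstractJob. */
    static final class ClusterPostProcessorDriver implements Tool {
        String input; // set by run() once --input has been parsed

        @Override
        public int run(String[] args) {
            List<String> argList = Arrays.asList(args);
            int i = argList.indexOf("--input");
            if (i < 0 || i + 1 >= argList.size()) {
                System.err.println("Usage: --input <path>");
                return -1; // mirrors the parseArguments(args) == null branch
            }
            input = argList.get(i + 1);
            // The real driver would build the post processor from the input
            // path and call distributeVectors() here.
            return 0;
        }
    }

    /** Stand-in for ToolRunner.run(conf, tool, args). */
    static int runTool(Tool tool, String[] args) throws Exception {
        return tool.run(args);
    }

    public static void main(String[] args) throws Exception {
        // Launch this driver itself, not KMeansDriver.
        System.exit(runTool(new ClusterPostProcessorDriver(), args));
    }
}
```

In the real class, the equivalent change would presumably be a one-word swap in main: pass an instance of the class that defines run(), so that the input option declared there is the one that gets parsed.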
On 05-11-2011 17:30, Grant Ingersoll wrote:
> Have a look at KMeansDriver and driver.classes.props. If you mirror KMeansDriver, you should be able to get what you are looking for. For that matter, pretty much any Driver has a similar structure.
Re: Top Down Clustering : CLI
Posted by Grant Ingersoll <gs...@apache.org>.
Have a look at KMeansDriver and driver.classes.props. If you mirror KMeansDriver, you should be able to get what you are looking for. For that matter, pretty much any Driver has a similar structure.
On Nov 5, 2011, at 3:03 AM, Paritosh Ranjan wrote:
> Hi,
>
> I have created the Java API for Top Down Clustering. https://issues.apache.org/jira/browse/MAHOUT-843
>
> But the patch also needs a CLI in order to be accepted. I don't have any knowledge of creating CLI for Mahout. Can I get some guidance/help/suggestions?
>
> Regards,
> Paritosh
--------------------------
Grant Ingersoll
http://www.lucidimagination.com