Posted to dev@mahout.apache.org by Drew Farris <dr...@apache.org> on 2012/06/09 14:27:56 UTC
cluster-reuters.sh clusterdump arguments
Hi All,
In kicking the tires of the 0.7 release, I've discovered that the
arguments for clusterdump in examples/bin/cluster-reuters.sh aren't
quite right.
When running what's checked in, I get:
12/06/09 08:10:47 ERROR common.AbstractJob: Unexpected -s while processing Job-Specific Options:
usage: <command> [Generic Options] [Job-Specific Options]
The current dump commands look like:
$MAHOUT clusterdump \
-s ${WORK_DIR}/reuters-kmeans/clusters-*-final \
-d ${WORK_DIR}/reuters-out-seqdir-sparse-kmeans/dictionary.file-0 \
-dt sequencefile -b 100 -n 20 --evaluate -dm org.apache.mahout.common.distance.CosineDistanceMeasure \
--pointsDir ${WORK_DIR}/reuters-kmeans/clusteredPoints
I think they should be:
$MAHOUT clusterdump \
-i ${WORK_DIR}/reuters-kmeans/clusters-*-final \
-o ${WORK_DIR}/reuters-kmeans/clusters-dump -of TEXT \
-d ${WORK_DIR}/reuters-out-seqdir-sparse-kmeans/dictionary.file-0 \
-dt sequencefile -b 100 -n 20 --evaluate -dm org.apache.mahout.common.distance.CosineDistanceMeasure \
--pointsDir ${WORK_DIR}/reuters-kmeans/clusteredPoints
Anyone opposed to getting this fix in for 0.7?
Drew
Re: cluster-reuters.sh clusterdump arguments
Posted by Jeff Eastman <jd...@windwardsolutions.com>.
+1. The -s option was changed to -i some time back, and it looks like some of
the $MAHOUT clusterdump invocations didn't get updated. I agree it needs
fixing.
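
To find any other clusterdump invocations that still pass -s, something like the following might help. It is only a rough sketch: the function name find_stale_clusterdump_flags is made up for illustration, and the heuristic assumes each invocation starts on a line containing "clusterdump" and continues across lines ending in a backslash, as in the commands quoted in this thread.

```shell
# Hypothetical helper (not part of Mahout): print file:line for every
# standalone "-s" token found inside a multi-line clusterdump invocation.
find_stale_clusterdump_flags() {
  awk '
    # A line mentioning clusterdump starts an invocation.
    /clusterdump/ { in_cmd = 1 }
    # Inside an invocation, report "-s" when it appears as its own token.
    in_cmd && /(^|[ \t])-s([ \t]|$)/ { print FILENAME ":" FNR ": " $0 }
    # A line without a trailing backslash ends the invocation.
    in_cmd && !/\\[ \t]*$/ { in_cmd = 0 }
  ' "$@"
}
```

For example, running find_stale_clusterdump_flags examples/bin/cluster-reuters.sh should list each offending line. Uses of -s outside a clusterdump invocation (for instance ln -s) are deliberately ignored, so the result still needs a quick manual review.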