Posted to dev@mahout.apache.org by "Reinis Vicups (JIRA)" <ji...@apache.org> on 2014/03/28 22:21:14 UTC

[jira] [Updated] (MAHOUT-1497) mahout resplit not producing splited files

     [ https://issues.apache.org/jira/browse/MAHOUT-1497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reinis Vicups updated MAHOUT-1497:
----------------------------------

    Description: 
When I run "mahout resplit", I get the output below, but no split files are produced.

{code}
support@hadoop1:~$ mahout resplit --input .../final/clusteredPoints/part-m-* --output .../final/split --numSplits 4
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /opt/cloudera/parcels/CDH-5.0.0-0.cdh5b2.p0.27/bin/../lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=/etc/hadoop/conf
MAHOUT-JOB: /opt/cloudera/parcels/CDH-5.0.0-0.cdh5b2.p0.27/lib/mahout/mahout-examples-0.8-cdh5.0.0-beta-2-job.jar
14/03/28 16:22:50 WARN driver.MahoutDriver: No resplit.props found on classpath, will use command-line arguments only
Writing 4 splits
Writing split 0
Writing split 1
Writing split 2
Writing split 3
14/03/28 16:22:52 INFO driver.MahoutDriver: Program took 2077 ms (Minutes: 0.034616666666666664)
{code}

The folder "clusteredPoints" passed to --input of resplit contains clustered points generated by Mahout's k-means algorithm.

  was:
When I run "mahout resplit", I get the output below, but no split files are produced.

{code}
support@hadoop1:~$ mahout resplit --input .../final/clusteredPoints/part-m-*
--output .../final/split --numSplits 4
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /opt/cloudera/parcels/CDH-5.0.
0-0.cdh5b2.p0.27/bin/../lib/hadoop/bin/hadoop and
HADOOP_CONF_DIR=/etc/hadoop/conf
MAHOUT-JOB: /opt/cloudera/parcels/CDH-5.0.0-0.cdh5b2.p0.27/lib/mahout/
mahout-examples-0.8-cdh5.0.0-beta-2-job.jar
14/03/28 16:22:50 WARN driver.MahoutDriver: No resplit.props found on
classpath, will use command-line arguments only
Writing 4 splits
Writing split 0
Writing split 1
Writing split 2
Writing split 3
14/03/28 16:22:52 INFO driver.MahoutDriver: Program took 2077 ms (Minutes:
0.034616666666666664)
{code}

The folder "clusteredPoints" passed to --input of resplit contains clustered points generated by Mahout's k-means algorithm.


> mahout resplit not producing splited files
> ------------------------------------------
>
>                 Key: MAHOUT-1497
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1497
>             Project: Mahout
>          Issue Type: Bug
>          Components: CLI
>    Affects Versions: 0.8
>            Reporter: Reinis Vicups


