Posted to commits@spark.apache.org by me...@apache.org on 2014/07/14 04:28:04 UTC
[12/12] git commit: SPARK-2363. Clean MLlib's sample data files
SPARK-2363. Clean MLlib's sample data files
(I just opened a PR for this; mengxr reported the following:)
MLlib has sample data under several folders:
1) data/mllib
2) data/
3) mllib/data/*
Per previous discussion with Matei Zaharia, we want to put them under `data/mllib` and clean up outdated files.
Author: Sean Owen <so...@cloudera.com>
Closes #1394 from srowen/SPARK-2363 and squashes the following commits:
54313dd [Sean Owen] Move ML example data from /mllib/data/ and /data/ into /data/mllib/
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/635888cb
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/635888cb
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/635888cb
Branch: refs/heads/master
Commit: 635888cbed0e3f4127252fb84db449f0cc9ed659
Parents: 4c8be64
Author: Sean Owen <so...@cloudera.com>
Authored: Sun Jul 13 19:27:43 2014 -0700
Committer: Xiangrui Meng <me...@databricks.com>
Committed: Sun Jul 13 19:27:43 2014 -0700
----------------------------------------------------------------------
data/kmeans_data.txt | 6 -
data/lr_data.txt | 1000 ---------------------------
data/mllib/als/test.data | 16 +
data/mllib/kmeans_data.txt | 6 +
data/mllib/lr-data/random.data | 1000 +++++++++++++++++++++++++++
data/mllib/lr_data.txt | 1000 +++++++++++++++++++++++++++
data/mllib/pagerank_data.txt | 6 +
data/mllib/ridge-data/lpsa.data | 67 ++
data/mllib/sample_libsvm_data.txt | 100 +++
data/mllib/sample_naive_bayes_data.txt | 6 +
data/mllib/sample_svm_data.txt | 322 +++++++++
data/mllib/sample_tree_data.csv | 569 +++++++++++++++
data/pagerank_data.txt | 6 -
docs/bagel-programming-guide.md | 2 +-
docs/mllib-basics.md | 6 +-
docs/mllib-clustering.md | 4 +-
docs/mllib-collaborative-filtering.md | 4 +-
docs/mllib-decision-tree.md | 4 +-
docs/mllib-linear-methods.md | 8 +-
docs/mllib-naive-bayes.md | 2 +-
docs/mllib-optimization.md | 2 +-
mllib/data/als/test.data | 16 -
mllib/data/lr-data/random.data | 1000 ---------------------------
mllib/data/ridge-data/lpsa.data | 67 --
mllib/data/sample_libsvm_data.txt | 100 ---
mllib/data/sample_naive_bayes_data.txt | 6 -
mllib/data/sample_svm_data.txt | 322 ---------
mllib/data/sample_tree_data.csv | 569 ---------------
28 files changed, 3108 insertions(+), 3108 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/spark/blob/635888cb/data/kmeans_data.txt
----------------------------------------------------------------------
diff --git a/data/kmeans_data.txt b/data/kmeans_data.txt
deleted file mode 100644
index 338664f..0000000
--- a/data/kmeans_data.txt
+++ /dev/null
@@ -1,6 +0,0 @@
-0.0 0.0 0.0
-0.1 0.1 0.1
-0.2 0.2 0.2
-9.0 9.0 9.0
-9.1 9.1 9.1
-9.2 9.2 9.2