Posted to commits@griffin.apache.org by gu...@apache.org on 2018/09/14 01:32:28 UTC

[1/2] incubator-griffin-site git commit: update quickstart

Repository: incubator-griffin-site
Updated Branches:
  refs/heads/master fd5b06aad -> 053b3172e


update quickstart


Project: http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/commit/7c926c57
Tree: http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/tree/7c926c57
Diff: http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/diff/7c926c57

Branch: refs/heads/master
Commit: 7c926c570c7c2b70055cc028c264ba54eab7a9ff
Parents: fd5b06a
Author: Lionel Liu <bh...@163.com>
Authored: Fri Sep 14 09:20:42 2018 +0800
Committer: Lionel Liu <bh...@163.com>
Committed: Fri Sep 14 09:20:42 2018 +0800

----------------------------------------------------------------------
 quickstart.md | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/blob/7c926c57/quickstart.md
----------------------------------------------------------------------
diff --git a/quickstart.md b/quickstart.md
index c36beaf..047a4e9 100644
--- a/quickstart.md
+++ b/quickstart.md
@@ -77,13 +77,21 @@ LOCATION
   'hdfs:///griffin/data/batch/demo_tgt';
 
 ```
-and we will load data into both two tables for every hour.
-
+The data could be generated like this:
 ```
-#load data here...
+1|18|student
+2|23|engineer
+3|42|cook
+...
 ```
-
-
+The demo_src and demo_tgt data may contain a few items that differ from each other.
+You can download [this directory](https://github.com/bhlx3lyx7/griffin-docker/tree/master/griffin_spark2/prep/data) and execute `./gen_demo_data.sh` to get the two data source files.
+Then we will load data into both tables every hour.
+```
+LOAD DATA LOCAL INPATH 'demo_src' INTO TABLE demo_src PARTITION (dt='20180912',hour='09');
+LOAD DATA LOCAL INPATH 'demo_tgt' INTO TABLE demo_tgt PARTITION (dt='20180912',hour='09');
+```
+Or you can just execute `./gen-hive-data.sh` in the downloaded directory above, to generate and load data into the tables hourly.
 
 ## Define data quality measure
 
@@ -195,7 +203,6 @@ spark-submit --class org.apache.griffin.measure.Application --master yarn --depl
 ## Report data quality metrics
 The calculation log is printed to the console, and the result metrics are printed after the job finishes. The metrics will also be saved in HDFS: `hdfs:///griffin/persist/<job name>/<timestamp>/_METRICS`.
 
-
 ## Refine Data Quality report
 Depending on your business needs, you might need to refine your data quality measure further until you are satisfied.
 

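For readers who want to see the idea behind the demo data without downloading the linked directory, the following is a minimal sketch in Python. It is not the `gen_demo_data.sh` script referenced in the commit, whose contents are not shown here. The field meanings (id, age, description), the function names, and the rule of altering every tenth row in the target are all assumptions made for illustration; only the pipe-delimited row shape and the `LOAD DATA` statement form come from the diff above.

```python
import random

# Description values taken from the sample rows in the quickstart diff.
DESCRIPTIONS = ["student", "engineer", "cook"]


def gen_src_rows(n, seed=0):
    """Return n pipe-delimited rows shaped like '1|18|student' (assumed: id|age|description)."""
    rng = random.Random(seed)  # seeded so repeated runs give the same data
    return [
        f"{i}|{rng.randint(18, 60)}|{rng.choice(DESCRIPTIONS)}"
        for i in range(1, n + 1)
    ]


def gen_tgt_rows(src_rows, mismatch_every=10):
    """Copy the source rows, changing the age on every `mismatch_every`-th row.

    This is a hypothetical way to make the two data sets differ slightly.
    """
    tgt_rows = []
    for index, row in enumerate(src_rows, start=1):
        row_id, age, description = row.split("|")
        if index % mismatch_every == 0:
            age = str(int(age) + 1)  # introduce a deliberate mismatch
        tgt_rows.append(f"{row_id}|{age}|{description}")
    return tgt_rows


def load_statement(table, path, dt, hour):
    """Build a Hive LOAD DATA statement for one dt/hour partition."""
    return (
        f"LOAD DATA LOCAL INPATH '{path}' INTO TABLE {table} "
        f"PARTITION (dt='{dt}',hour='{hour:02d}');"
    )


if __name__ == "__main__":
    src = gen_src_rows(100)
    tgt = gen_tgt_rows(src)
    # Write the two local files that the LOAD DATA statements refer to.
    for name, rows in (("demo_src", src), ("demo_tgt", tgt)):
        with open(name, "w") as handle:
            handle.write("\n".join(rows) + "\n")
        print(load_statement(name, name, "20180912", 9))
```

Running the script writes `demo_src` and `demo_tgt` in the current directory and prints one `LOAD DATA` statement per table for the `dt='20180912'`, `hour='09'` partition; the statements would still need to be executed in Hive separately.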

[2/2] incubator-griffin-site git commit: This closes #5 merge

Posted by gu...@apache.org.
This closes #5   merge


Project: http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/commit/053b3172
Tree: http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/tree/053b3172
Diff: http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/diff/053b3172

Branch: refs/heads/master
Commit: 053b3172e210f0304bdfe20aa6b3e6709aa6f697
Parents: 7c926c5
Author: William Guo <gu...@apache.org>
Authored: Fri Sep 14 09:32:18 2018 +0800
Committer: William Guo <gu...@apache.org>
Committed: Fri Sep 14 09:32:18 2018 +0800

----------------------------------------------------------------------
 quickstart.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/blob/053b3172/quickstart.md
----------------------------------------------------------------------
diff --git a/quickstart.md b/quickstart.md
index 047a4e9..0deb6f6 100644
--- a/quickstart.md
+++ b/quickstart.md
@@ -18,7 +18,7 @@ both dt and hour are partitions,
 
 as every day we have one daily partition dt(like 20180912), 
 
-for every day we have 24 hourly partitions(like 01,02, ...).
+for every day we have 24 hourly partitions(like 00, 01, 02, ..., 23).
 
 ## Environment Preparation
 You need to prepare the environment for Apache Griffin measure module, including the following software: