Posted to commits@griffin.apache.org by gu...@apache.org on 2018/04/08 04:52:28 UTC

incubator-griffin git commit: [GRIFFIN-138] update readme.md, highlighted docker guide

Repository: incubator-griffin
Updated Branches:
  refs/heads/master 95e45dca4 -> 4e0f25d2c


[GRIFFIN-138] update readme.md, highlighted docker guide

update readme.md, describing the docker guide, debug guide and deploy guide in order, for their specific user groups

Author: Lionel Liu <bh...@163.com>

Closes #248 from bhlx3lyx7/tmst.


Project: http://git-wip-us.apache.org/repos/asf/incubator-griffin/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-griffin/commit/4e0f25d2
Tree: http://git-wip-us.apache.org/repos/asf/incubator-griffin/tree/4e0f25d2
Diff: http://git-wip-us.apache.org/repos/asf/incubator-griffin/diff/4e0f25d2

Branch: refs/heads/master
Commit: 4e0f25d2c9fd64c56a128e3ddde7c5c7addd916c
Parents: 95e45dc
Author: Lionel Liu <bh...@163.com>
Authored: Sun Apr 8 12:52:21 2018 +0800
Committer: Lionel Liu <bh...@163.com>
Committed: Sun Apr 8 12:52:21 2018 +0800

----------------------------------------------------------------------
 README.md                          | 174 ++++----------------------------
 griffin-doc/deploy/deploy-guide.md | 160 +++++++++++++++++++++++++++++
 2 files changed, 179 insertions(+), 155 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-griffin/blob/4e0f25d2/README.md
----------------------------------------------------------------------
diff --git a/README.md b/README.md
index 5bc0e1c..37987d0 100644
--- a/README.md
+++ b/README.md
@@ -27,176 +27,40 @@ Apache Griffin is a model driven data quality solution for modern data systems.
 
 ## Getting Started
 
+### Trying Out Griffin
 
-You can try Griffin in Docker by following the [docker guide](https://github.com/apache/incubator-griffin/blob/master/griffin-doc/docker/griffin-docker-guide.md).
-
-To run Griffin locally, you can follow the instructions below.
-
-### Prerequisites
-You need to install the following items:
-- JDK (1.8 or later).
-- MySQL.
-- PostgreSQL.
-- npm (version 6.0.0+).
-- [Hadoop](http://apache.claz.org/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz) (2.6.0 or later), you can get some help [here](https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SingleCluster.html).
-- [Spark](http://spark.apache.org/downloads.html) (version 1.6.x; Griffin does not currently support 2.0.x). If you want to install a Pseudo Distributed/Single Node Cluster, you can get some help [here](http://why-not-learn-something.blogspot.com/2015/06/spark-installation-pseudo.html).
-- [Hive](http://apache.claz.org/hive/hive-1.2.1/apache-hive-1.2.1-bin.tar.gz) (version 1.2.1 or later), you can get some help [here](https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-RunningHive).
-    Make sure that your Spark cluster can access your HiveContext.
-- [Livy](http://archive.cloudera.com/beta/livy/livy-server-0.3.0.zip), you can get some help [here](http://livy.io/quickstart.html).
-    Griffin needs to schedule Spark jobs from its server; we use Livy to submit the jobs.
-    Due to some issues between Livy and HiveContext, we need to download the following 3 files and put them into HDFS.
-    ```
-    datanucleus-api-jdo-3.2.6.jar
-    datanucleus-core-3.2.10.jar
-    datanucleus-rdbms-3.2.9.jar
-    ```
-- Elasticsearch.
-	Elasticsearch works as a metrics collector: Griffin publishes metrics to it, and the default UI reads metrics from it; you can use your own metrics store as well.
-
-### Configuration
-
-Create the database 'quartz' in MySQL:
-```
-mysql -u username -e "create database quartz" -p
-```
-Initialize the Quartz tables in MySQL with service/src/main/resources/Init_quartz.sql:
-```
-mysql -u username -p quartz < service/src/main/resources/Init_quartz.sql
-```
-
-
-You should also modify some configurations of Griffin for your environment.
-
-- <b>service/src/main/resources/application.properties</b>
-
-    ```
-    # jpa
-    spring.datasource.url = jdbc:postgresql://<your IP>:5432/quartz?autoReconnect=true&useSSL=false
-    spring.datasource.username = <user name>
-    spring.datasource.password = <password>
-    spring.jpa.generate-ddl=true
-    spring.datasource.driverClassName = org.postgresql.Driver
-    spring.jpa.show-sql = true
-    
-    # hive metastore
-    hive.metastore.uris = thrift://<your IP>:9083
-    hive.metastore.dbname = <hive database name>    # default is "default"
-    
-    # external properties directory location, ignore it if not required
-    external.config.location =
-
-	# login strategy, default is "default"
-	login.strategy = <default or ldap>
-
-	# ldap properties, ignore them if ldap is not enabled
-	ldap.url = ldap://hostname:port
-	ldap.email = @example.com
-	ldap.searchBase = DC=org,DC=example
-	ldap.searchPattern = (sAMAccountName={0})
-
-	# hdfs, ignore it if you do not need predicate job
-	fs.defaultFS = hdfs://<hdfs-default-name>
-
-	# elasticsearch
-	elasticsearch.host = <your IP>
-	elasticsearch.port = <your elasticsearch rest port>
-	# authentication properties, uncomment if basic authentication is enabled
-	# elasticsearch.user = user
-	# elasticsearch.password = password
-    ```
-
-- <b>measure/src/main/resources/env.json</b> 
-	```
-	"persist": [
-	    ...
-	    {
-			"type": "http",
-			"config": {
-		        "method": "post",
-		        "api": "http://<your ES IP>:<ES rest port>/griffin/accuracy"
-			}
-		}
-	]
-	```
-	Put the modified env.json file into HDFS.
-	
-- <b>service/src/main/resources/sparkJob.properties</b>
-    ```
-    sparkJob.file = hdfs://<griffin measure path>/griffin-measure.jar
-    sparkJob.args_1 = hdfs://<griffin env path>/env.json
-    
-    sparkJob.jars = hdfs://<datanucleus path>/spark-avro_2.11-2.0.1.jar\
-	    hdfs://<datanucleus path>/datanucleus-api-jdo-3.2.6.jar\
-	    hdfs://<datanucleus path>/datanucleus-core-3.2.10.jar\
-	    hdfs://<datanucleus path>/datanucleus-rdbms-3.2.9.jar
-	    
-	spark.yarn.dist.files = hdfs:///<spark conf path>/hive-site.xml
-	
-    livy.uri = http://<your IP>:8998/batches
-    spark.uri = http://<your IP>:8088
-    ```
-    - \<griffin measure path> is the HDFS location where you should put the measure module's jar file.
-    - \<griffin env path> is the HDFS location where you should put the env.json file.
-    - \<datanucleus path> is the HDFS location where you should put the 3 datanucleus jar files required by Livy, and the spark-avro jar file if you need it.
-    - \<spark conf path> is the location of the Spark conf directory.
-    
-### Build and Run
-
-Build the whole project and deploy (npm must be installed first):
-
-  ```
-  mvn clean install
-  ```
- 
-Put the measure module's jar file into \<griffin measure path> in HDFS:
-
-```
-cp measure/target/measure-<version>-incubating-SNAPSHOT.jar measure/target/griffin-measure.jar
-hdfs dfs -put measure/target/griffin-measure.jar <griffin measure path>/
-```
-
-After all environment services have started up, we can start the Griffin server.
-
-  ```
-  java -jar service/target/service.jar
-  ```
-    
-After a few seconds, we can visit the default UI of Griffin (by default, the Spring Boot port is 8080).
-
-  ```
-  http://<your IP>:8080
-  ```
-
-You can use the UI by following the steps [here](https://github.com/apache/incubator-griffin/blob/master/griffin-doc/ui/user-guide.md).
-
-**Note**: The front-end UI is still under development; only some basic features are accessible currently.
-
-
-### Build and Debug
+You can try Griffin in Docker by following the [docker guide](griffin-doc/docker/griffin-docker-guide.md).
+
+### Development Environment
 
 If you want to develop Griffin, please follow [this document](griffin-doc/dev/dev-env-build.md) to skip the complex environment setup work.
 
+### Local Deployment
 
-## Community
+If you want to deploy Griffin in your local environment, please follow [this document](griffin-doc/deploy/deploy-guide.md).
 
-You can contact us via email: <a href="mailto:dev@griffin.incubator.apache.org">dev@griffin.incubator.apache.org</a>
+## Community
 
-You can also subscribe to this list by sending an email [here](mailto:dev-subscribe@griffin.incubator.apache.org).
+You can visit the [Griffin home page](http://griffin.apache.org).
 
-You can access our JIRA issues page [here](https://issues.apache.org/jira/browse/GRIFFIN).
+You can contact us via email:
+- dev-list: <a href="mailto:dev@griffin.incubator.apache.org">dev@griffin.incubator.apache.org</a>
+- user-list: <a href="mailto:user@griffin.incubator.apache.org">user@griffin.incubator.apache.org</a>
 
+You can also subscribe to these lists by sending an email to [subscribe dev-list](mailto:dev-subscribe@griffin.incubator.apache.org) or [subscribe user-list](mailto:user-subscribe@griffin.incubator.apache.org).
 
+You can browse our issues on the [JIRA page](https://issues.apache.org/jira/browse/GRIFFIN).
 
 ## Contributing
 
-See [Contributing Guide](./CONTRIBUTING.md) for details on how to contribute code, documentation, etc.
+See [How to Contribute](http://griffin.apache.org/2017/03/04/community) for details on how to contribute code, documentation, etc.
 
 ## References
 - [Home Page](http://griffin.incubator.apache.org/)
 - [Wiki](https://cwiki.apache.org/confluence/display/GRIFFIN/Apache+Griffin)
 - Documents:
-	- [Measure](https://github.com/apache/incubator-griffin/tree/master/griffin-doc/measure)
-	- [Service](https://github.com/apache/incubator-griffin/tree/master/griffin-doc/service)
-	- [UI](https://github.com/apache/incubator-griffin/tree/master/griffin-doc/ui)
-	- [Docker usage](https://github.com/apache/incubator-griffin/tree/master/griffin-doc/docker)
-	- [Postman API](https://github.com/apache/incubator-griffin/tree/master/griffin-doc/service/postman)
\ No newline at end of file
+	- [Measure](griffin-doc/measure)
+	- [Service](griffin-doc/service)
+	- [UI](griffin-doc/ui)
+	- [Docker usage](griffin-doc/docker)
+	- [Postman API](griffin-doc/service/postman)
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-griffin/blob/4e0f25d2/griffin-doc/deploy/deploy-guide.md
----------------------------------------------------------------------
diff --git a/griffin-doc/deploy/deploy-guide.md b/griffin-doc/deploy/deploy-guide.md
new file mode 100644
index 0000000..0693c25
--- /dev/null
+++ b/griffin-doc/deploy/deploy-guide.md
@@ -0,0 +1,160 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Apache Griffin Deployment Guide
+As a Griffin user, you can deploy Griffin together with its dependencies in your own environment by following the instructions below.
+
+### Prerequisites
+You need to install the following items:
+- JDK (1.8 or later).
+- MySQL or PostgreSQL.
+- npm (version 6.0.0+).
+- [Hadoop](http://apache.claz.org/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz) (2.6.0 or later), you can get some help [here](https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SingleCluster.html).
+- [Spark](http://spark.apache.org/downloads.html) (version 1.6.x; Griffin does not currently support 2.0.x). If you want to install a Pseudo Distributed/Single Node Cluster, you can get some help [here](http://why-not-learn-something.blogspot.com/2015/06/spark-installation-pseudo.html).
+- [Hive](http://apache.claz.org/hive/hive-1.2.1/apache-hive-1.2.1-bin.tar.gz) (version 1.2.1 or later), you can get some help [here](https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-RunningHive).
+    Make sure that your Spark cluster can access your HiveContext.
+- [Livy](http://archive.cloudera.com/beta/livy/livy-server-0.3.0.zip), you can get some help [here](http://livy.io/quickstart.html).
+    Griffin needs to schedule Spark jobs from its server; we use Livy to submit the jobs.
+    Due to some issues between Livy and HiveContext, we need to download the following 3 files, or get them from the Spark lib directory `$SPARK_HOME/lib/`, and put them into HDFS (see the sketch after this list).
+    ```
+    datanucleus-api-jdo-3.2.6.jar
+    datanucleus-core-3.2.10.jar
+    datanucleus-rdbms-3.2.9.jar
+    ```
+- Elasticsearch.
+	Elasticsearch works as a metrics collector: Griffin publishes metrics to it, and the default UI reads metrics from it; you can use your own metrics store as well.
+
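+A minimal sketch of uploading those 3 jar files to HDFS (assuming `<datanucleus path>` is the HDFS directory you will reference in sparkJob.properties below):
+```
+hdfs dfs -mkdir -p <datanucleus path>
+hdfs dfs -put datanucleus-api-jdo-3.2.6.jar \
+             datanucleus-core-3.2.10.jar \
+             datanucleus-rdbms-3.2.9.jar <datanucleus path>/
+```
+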
+### Configuration
+
+Create the database 'quartz' in MySQL:
+```
+mysql -u username -e "create database quartz" -p
+```
+Initialize the Quartz tables in MySQL with service/src/main/resources/Init_quartz.sql:
+```
+mysql -u username -p quartz < service/src/main/resources/Init_quartz.sql
+```
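+
+If you use PostgreSQL instead (as the sample application.properties below does), a rough equivalent is sketched here; note that Init_quartz.sql targets MySQL, so you would need a PostgreSQL-compatible Quartz DDL script (a hypothetical placeholder below):
+```
+psql -U username -c "create database quartz"
+psql -U username -d quartz -f <postgres-compatible quartz DDL file>
+```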
+
+
+You should also modify some configurations of Griffin for your environment.
+
+- <b>service/src/main/resources/application.properties</b>
+
+    ```
+    # jpa
+    spring.datasource.url = jdbc:postgresql://<your IP>:5432/quartz?autoReconnect=true&useSSL=false
+    spring.datasource.username = <user name>
+    spring.datasource.password = <password>
+    spring.jpa.generate-ddl=true
+    spring.datasource.driverClassName = org.postgresql.Driver
+    spring.jpa.show-sql = true
+
+    # hive metastore
+    hive.metastore.uris = thrift://<your IP>:9083
+    hive.metastore.dbname = <hive database name>    # default is "default"
+
+    # external properties directory location, ignore it if not required
+    external.config.location =
+
+	# login strategy, default is "default"
+	login.strategy = <default or ldap>
+
+	# ldap properties, ignore them if ldap is not enabled
+	ldap.url = ldap://hostname:port
+	ldap.email = @example.com
+	ldap.searchBase = DC=org,DC=example
+	ldap.searchPattern = (sAMAccountName={0})
+
+	# hdfs, ignore it if you do not need predicate job
+	fs.defaultFS = hdfs://<hdfs-default-name>
+
+	# elasticsearch
+	elasticsearch.host = <your IP>
+	elasticsearch.port = <your elasticsearch rest port>
+	# authentication properties, uncomment if basic authentication is enabled
+	# elasticsearch.user = user
+	# elasticsearch.password = password
+    ```
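+
+    As a quick sanity check of the Elasticsearch settings, you can hit the REST root from the Griffin server host (assuming no authentication is enabled; Elasticsearch answers with its cluster info as JSON):
+    ```
+    curl http://<your IP>:<your elasticsearch rest port>
+    ```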
+
+- <b>measure/src/main/resources/env.json</b>
+	```
+	"persist": [
+	    ...
+	    {
+			"type": "http",
+			"config": {
+		        "method": "post",
+		        "api": "http://<your ES IP>:<ES rest port>/griffin/accuracy"
+			}
+		}
+	]
+	```
+	Put the modified env.json file into HDFS.
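+
+	A minimal sketch of that step (assuming `<griffin env path>` is the HDFS directory used for sparkJob.args_1 below):
+	```
+	hdfs dfs -mkdir -p <griffin env path>
+	hdfs dfs -put measure/src/main/resources/env.json <griffin env path>/
+	```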
+
+- <b>service/src/main/resources/sparkJob.properties</b>
+    ```
+    sparkJob.file = hdfs://<griffin measure path>/griffin-measure.jar
+    sparkJob.args_1 = hdfs://<griffin env path>/env.json
+
+    sparkJob.jars = hdfs://<datanucleus path>/spark-avro_2.11-2.0.1.jar\
+	    hdfs://<datanucleus path>/datanucleus-api-jdo-3.2.6.jar\
+	    hdfs://<datanucleus path>/datanucleus-core-3.2.10.jar\
+	    hdfs://<datanucleus path>/datanucleus-rdbms-3.2.9.jar
+
+	spark.yarn.dist.files = hdfs:///<spark conf path>/hive-site.xml
+
+    livy.uri = http://<your IP>:8998/batches
+    spark.uri = http://<your IP>:8088
+    ```
+    - \<griffin measure path> is the HDFS location where you should put the measure module's jar file.
+    - \<griffin env path> is the HDFS location where you should put the env.json file.
+    - \<datanucleus path> is the HDFS location where you should put the 3 datanucleus jar files required by Livy, and the spark-avro jar file if you need to support Avro data.
+    - \<spark conf path> is the location of the Spark conf directory.
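+
+Before moving on, you can verify that Livy and YARN are reachable at the configured URIs; these are standard REST endpoints of Livy and the YARN ResourceManager (adjust host and port to your setup):
+```
+curl http://<your IP>:8998/batches              # Livy: list batch sessions
+curl http://<your IP>:8088/ws/v1/cluster/info   # YARN ResourceManager cluster info
+```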
+
+### Build and Run
+
+Build the whole project and deploy (npm must be installed first):
+
+  ```
+  mvn clean install
+  ```
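+
+If you only need the build artifacts, skipping tests with a standard Maven flag makes the build faster:
+```
+mvn clean install -DskipTests
+```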
+
+Put the measure module's jar file into \<griffin measure path> in HDFS:
+
+```
+cp measure/target/measure-<version>-incubating-SNAPSHOT.jar measure/target/griffin-measure.jar
+hdfs dfs -put measure/target/griffin-measure.jar <griffin measure path>/
+```
+
+After all environment services have started up, we can start the Griffin server.
+
+  ```
+  java -jar service/target/service.jar
+  ```
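+
+To keep the server running after your shell exits, a common plain-shell pattern (not Griffin-specific) is:
+```
+nohup java -jar service/target/service.jar > service.out 2>&1 &
+```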
+
+After a few seconds, we can visit the default UI of Griffin (by default, the Spring Boot port is 8080).
+
+  ```
+  http://<your IP>:8080
+  ```
+
+You can use the UI by following the steps [here](../ui/user-guide.md).
+
+**Note**: The front-end UI is still under development; only some basic features are accessible currently.
+