Posted to dev@griffin.apache.org by GitBox <gi...@apache.org> on 2020/08/11 07:48:17 UTC

[GitHub] [griffin] chitralverma commented on a change in pull request #581: [Griffin-339]Import griffin tool for debug and run user jobs

chitralverma commented on a change in pull request #581:
URL: https://github.com/apache/griffin/pull/581#discussion_r468372517



##########
File path: measure/assembly.xml
##########
@@ -0,0 +1,52 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one or more
+  ~ contributor license agreements.  See the NOTICE file distributed with
+  ~ this work for additional information regarding copyright ownership.
+  ~ The ASF licenses this file to You under the Apache License, Version 2.0
+  ~ (the "License"); you may not use this file except in compliance with
+  ~ the License.  You may obtain a copy of the License at
+  ~
+  ~    http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing, software
+  ~ distributed under the License is distributed on an "AS IS" BASIS,
+  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  ~ See the License for the specific language governing permissions and
+  ~ limitations under the License.
+  -->
+
+<assembly xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2"
+  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+  xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2 http://maven.apache.org/xsd/assembly-1.1.2.xsd">
+  <id>package</id>
+  <formats>
+    <format>tar.gz</format>
+  </formats>
+  <fileSets>
+    <fileSet>
+      <directory>${project.basedir}/sbin</directory>
+      <outputDirectory>/bin</outputDirectory>
+      <includes>
+        <include>/**</include>
+      </includes>
+      <lineEnding>unix</lineEnding>
+      <fileMode>0777</fileMode> <!-- 所有文件文件权限为777 -->
+      <directoryMode>0755</directoryMode> <!-- 所有目录权限为777  -->

Review comment:
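       Please translate the Chinese comments into English so they are easier to read and understand. They say "all files have permission 777" and "all directories have permission 777"; the second does not match the `0755` value it annotates.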

##########
File path: measure/sbin/griffin-tool.sh
##########
@@ -0,0 +1,58 @@
+#!/usr/bin/env bash
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+BASEDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && cd .. && pwd )"
+CONFDIR="${BASEDIR}/conf"
+
+. "${BASEDIR}/bin/griffin-env.sh"
+
+if [ ! $# -ge 2 ]; then
+  echo "env file and dq file must be provided!"
+  exit 1
+fi
+
+envFile=$1
+if [ ! -f ${envFile} ];then
+  envFile="${CONFDIR}/${envFile}"
+  if [ ! -f ${envFile} ];then
+    echo "Not found env file: $1"
+    exit
+  fi
+fi
+shift
+
+dqFile=$1
+if [ ! -f ${dqFile} ];then
+  dqFile="${CONFDIR}/${dqFile}"
+  if [ ! -f ${dqFile} ];then
+    echo "Not found dq file: $2"
+    exit
+  fi
+fi
+shift
+
+cd ${BASEDIR}
+
+# export CLASSPATH and JAVA_OPTS
+export CLASSPATH=$(echo ${SPARK_HOME}/jars/*.jar | tr ' ' ':'):${CLASSPATH}

Review comment:
       What if `SPARK_HOME` is not set?
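
One possible way to guard against an unset `SPARK_HOME`, as a minimal sketch only. The helper name `spark_classpath` and the error messages are illustrative and not part of the PR. It also builds the list in a loop rather than with `echo ... | tr ' ' ':'`, so jar paths that contain spaces are not split.

```shell
# Hypothetical helper (not part of the PR): print the jars under
# ${SPARK_HOME}/jars as a colon-separated list, or fail with a message
# on stderr when SPARK_HOME is unset or has no jars directory.
spark_classpath() {
  local jar classpath=""
  if [ -z "${SPARK_HOME:-}" ]; then
    echo "SPARK_HOME is not set" >&2
    return 1
  fi
  if [ ! -d "${SPARK_HOME}/jars" ]; then
    echo "No jars directory under SPARK_HOME: ${SPARK_HOME}" >&2
    return 1
  fi
  for jar in "${SPARK_HOME}"/jars/*.jar; do
    # Skip the unexpanded pattern when the directory holds no jars.
    [ -e "${jar}" ] || continue
    classpath="${classpath:+${classpath}:}${jar}"
  done
  echo "${classpath}"
}

# Intended use in the script (commented out so the sketch has no side effects):
#   sparkJars="$(spark_classpath)" || exit 1
#   export CLASSPATH="${sparkJars}:${CLASSPATH}"
```

Capturing the output in a separate assignment before `export` matters here: `export VAR=$(cmd)` reports the status of `export`, so a failure inside the command substitution would otherwise go unnoticed.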

##########
File path: griffin-doc/measure/griffin-tool.md
##########
@@ -0,0 +1,140 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Apache Griffin Tool Guide
+
+With Griffin tool, user can run dq jobs in command line. 
+This is helpful for user to debug and run user dq jobs.
+
+# Install
+
+* Compile Griffin using maven.
+* Decompress Griffin tool file `measure-x.x.x-package.tar.gz` in the target directory of measure module.
+* Go to the decompress directory, and run Griffin tool with user defined env file and dq file. eg: `./bin/griffin-tool.sh ENV_FILE DQ_FILE`
+
+# ENV_FILE demo
+
+```json
+{
+  "spark": {
+    "log.level": "WARN",
+    "config": {
+      "spark.master": "local[*]"
+    }
+  },
+
+  "sinks": [
+    {
+      "type": "CONSOLE",

Review comment:
       ```suggestion
         "name": "MyConsoleSink",
         "type": "CONSOLE",
   ```

##########
File path: griffin-doc/measure/griffin-tool.md
##########
@@ -0,0 +1,140 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Apache Griffin Tool Guide
+
+With Griffin tool, user can run dq jobs in command line. 
+This is helpful for user to debug and run user dq jobs.
+
+# Install
+
+* Compile Griffin using maven.
+* Decompress Griffin tool file `measure-x.x.x-package.tar.gz` in the target directory of measure module.
+* Go to the decompress directory, and run Griffin tool with user defined env file and dq file. eg: `./bin/griffin-tool.sh ENV_FILE DQ_FILE`
+
+# ENV_FILE demo
+
+```json
+{
+  "spark": {
+    "log.level": "WARN",
+    "config": {
+      "spark.master": "local[*]"
+    }
+  },
+
+  "sinks": [
+    {
+      "type": "CONSOLE",
+      "config": {
+        "max.log.lines": 10
+      }
+    },
+    {
+      "type": "HDFS",
+      "config": {
+        "path": "hdfs://localhost/griffin/batch/persist",
+        "max.persist.lines": 10000,
+        "max.lines.per.file": 10000
+      }
+    },
+    {
+      "type": "ELASTICSEARCH",

Review comment:
       ```suggestion
         "name": "MyElasticSearchSink",
         "type": "ELASTICSEARCH",
   ```

##########
File path: measure/sbin/griffin-tool.sh
##########
@@ -0,0 +1,58 @@
+#!/usr/bin/env bash
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+BASEDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && cd .. && pwd )"
+CONFDIR="${BASEDIR}/conf"
+
+. "${BASEDIR}/bin/griffin-env.sh"
+
+if [ ! $# -ge 2 ]; then

Review comment:
       Shouldn't this be `-eq` instead of `-ge`?
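
A minimal sketch of the stricter check, assuming the tool takes exactly two arguments and forwards nothing else. The function name `require_two_args` and the usage text are illustrative, not from the PR. If the arguments left over after the two `shift` calls are meant to be passed on, `-ge` would be the right test and this sketch would not apply.

```shell
# Hypothetical helper (not part of the PR): succeed only when exactly
# two arguments are supplied, otherwise print usage on stderr.
require_two_args() {
  if [ "$#" -ne 2 ]; then
    echo "Usage: griffin-tool.sh ENV_FILE DQ_FILE" >&2
    return 1
  fi
}

# Intended use near the top of the script (commented out here):
#   require_two_args "$@" || exit 1
```

Writing the test as `[ "$#" -ne 2 ]` also avoids the negated form `[ ! $# -ge 2 ]`, which is harder to read.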

##########
File path: measure/sbin/griffin-tool.sh
##########
@@ -0,0 +1,58 @@
+#!/usr/bin/env bash
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+

Review comment:
       As a suggestion, we could add dedicated logging functions like the ones below:
   
   ```
   # Logging functions
   
   function log_info() {
     echo "[INFO] | $(date -u +"%D %T") UTC | $1"
   }
   
   function log_error() {
     echo "[ERROR] | $(date -u +"%D %T") UTC | $1"
   }
   ```
   
   and then replace every `echo` with these logging functions, for example:
   
   ```
   if [ ! -f "${envFile}" ]; then
     envFile="${CONFDIR}/${envFile}"
     if [ ! -f "${envFile}" ]; then
       log_error "Not found env file: $1"
       exit 1
     fi
   fi
   ```
   
   This makes the log output consistent and easier to read.
   

##########
File path: measure/sbin/griffin-tool.sh
##########
@@ -0,0 +1,58 @@
+#!/usr/bin/env bash
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+BASEDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && cd .. && pwd )"
+CONFDIR="${BASEDIR}/conf"
+
+. "${BASEDIR}/bin/griffin-env.sh"
+
+if [ ! $# -ge 2 ]; then
+  echo "env file and dq file must be provided!"
+  exit 1
+fi
+
+envFile=$1
+if [ ! -f ${envFile} ];then
+  envFile="${CONFDIR}/${envFile}"

Review comment:
       We shouldn't assume that the config files will be available in the config directory. If a config file does not exist at the given path, we can fail the process directly.
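
The fail-fast behaviour could look roughly like this. It is a sketch only: the helper name `require_file` and its messages are made up for illustration and do not appear in the PR.

```shell
# Hypothetical helper (not part of the PR): print the path when it names
# a regular file, otherwise report the problem on stderr and fail.
# There is no fallback lookup in the conf directory.
require_file() {
  local label="$1"
  local path="$2"
  if [ ! -f "${path}" ]; then
    echo "${label} not found: ${path}" >&2
    return 1
  fi
  echo "${path}"
}

# Intended use in the script (commented out here):
#   envFile="$(require_file "env file" "$1")" || exit 1
#   dqFile="$(require_file "dq file" "$2")" || exit 1
```

Using the same helper for both files also keeps the two error messages consistent and makes a non-zero exit status the only failure path.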

##########
File path: griffin-doc/measure/griffin-tool.md
##########
@@ -0,0 +1,140 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Apache Griffin Tool Guide
+
+With Griffin tool, user can run dq jobs in command line. 
+This is helpful for user to debug and run user dq jobs.
+
+# Install
+
+* Compile Griffin using maven.
+* Decompress Griffin tool file `measure-x.x.x-package.tar.gz` in the target directory of measure module.
+* Go to the decompress directory, and run Griffin tool with user defined env file and dq file. eg: `./bin/griffin-tool.sh ENV_FILE DQ_FILE`
+
+# ENV_FILE demo
+
+```json
+{
+  "spark": {
+    "log.level": "WARN",
+    "config": {
+      "spark.master": "local[*]"
+    }
+  },
+
+  "sinks": [
+    {
+      "type": "CONSOLE",
+      "config": {
+        "max.log.lines": 10
+      }
+    },
+    {
+      "type": "HDFS",

Review comment:
       ```suggestion
         "name": "MyHDFSSink",
         "type": "HDFS",
   ```

##########
File path: griffin-doc/measure/griffin-tool.md
##########
@@ -0,0 +1,140 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Apache Griffin Tool Guide
+
+With Griffin tool, user can run dq jobs in command line. 
+This is helpful for user to debug and run user dq jobs.
+
+# Install
+
+* Compile Griffin using maven.
+* Decompress Griffin tool file `measure-x.x.x-package.tar.gz` in the target directory of measure module.
+* Go to the decompress directory, and run Griffin tool with user defined env file and dq file. eg: `./bin/griffin-tool.sh ENV_FILE DQ_FILE`
+
+# ENV_FILE demo
+
+```json
+{
+  "spark": {
+    "log.level": "WARN",
+    "config": {
+      "spark.master": "local[*]"
+    }
+  },
+
+  "sinks": [

Review comment:
       Since the latest merge into master, each sink also requires a `name`. See the suggestions on the individual sink entries in this review for examples.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org