Posted to commits@griffin.apache.org by gu...@apache.org on 2017/05/17 01:04:56 UTC

incubator-griffin-site git commit: update architecture

Repository: incubator-griffin-site
Updated Branches:
  refs/heads/master 1d228cf19 -> ebdd32b4e


update architecture


Project: http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/commit/ebdd32b4
Tree: http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/tree/ebdd32b4
Diff: http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/diff/ebdd32b4

Branch: refs/heads/master
Commit: ebdd32b4ec2389630cdd6bacf5339fffc9fbd2e6
Parents: 1d228cf
Author: William Guo <gu...@icloud.com>
Authored: Wed May 17 09:04:15 2017 +0800
Committer: William Guo <gu...@icloud.com>
Committed: Wed May 17 09:04:15 2017 +0800

----------------------------------------------------------------------
 source/_posts/home.md       |  11 +++++++++--
 source/images/arch.png      | Bin 0 -> 307285 bytes
 source/images/techstack.png | Bin 0 -> 127993 bytes
 3 files changed, 9 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/blob/ebdd32b4/source/_posts/home.md
----------------------------------------------------------------------
diff --git a/source/_posts/home.md b/source/_posts/home.md
index d8711e5..42967f6 100644
--- a/source/_posts/home.md
+++ b/source/_posts/home.md
@@ -8,7 +8,7 @@ Apache Griffin is a Data Quality Service platform built on Apache Hadoop and Apa
 
 
 ## Overview of Apache Griffin  
-At eBay, when people use big data (Hadoop or other streaming systems), measurement of data quality is a big challenge. Different teams have built customized tools to detect and analyze data quality issues within their own domains. As a platform organization, we think of taking a platform approach to commonly occurring patterns. As such, we are building a platform to provide shared Infrastructure and generic features to solve common data quality pain points. This would enable us to build trusted data assets.
+When people use big data (Hadoop or other streaming systems), measurement of data quality is a big challenge. Different teams have built customized tools to detect and analyze data quality issues within their own domains. As a platform organization, we think of taking a platform approach to commonly occurring patterns. As such, we are building a platform to provide shared Infrastructure and generic features to solve common data quality pain points. This would enable us to build trusted data assets.
 
 Currently it is very difficult and costly to do data quality validation when we have large volumes of related data flowing across multi-platforms (streaming and batch). Take eBay's Real-time Personalization Platform as a sample; Everyday we have to validate the data quality for ~600M records. Data quality often becomes one big challenge in this complex environment and massive scale.
 
@@ -50,10 +50,17 @@ For near real time analysis, we consume data from messaging system, then our dat
 We have RESTful web services to accomplish all the functionalities of Apache Griffin, such as register data-set, create data quality model, publish metrics, retrieve metrics, add subscription, etc. So, the developers can develop their own user interface based on these web serivces.
 
 ## Main business process
-Here's the business process diagram
 
 ![](/images/Business_Process.png)
 
+## Architecture diagram
+
+![](/images/arch.png)
+
+## Tech stack
+
+![](/images/techstack.png)
+
 ## Rationale
 The challenge we face at eBay is that our data volume is becoming bigger and bigger, systems process become more complex, while we do not have a unified data quality solution to ensure the trusted data sets which provide confidences on data quality to our data consumers.  The key challenges on data quality includes:
 

http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/blob/ebdd32b4/source/images/arch.png
----------------------------------------------------------------------
diff --git a/source/images/arch.png b/source/images/arch.png
new file mode 100644
index 0000000..93bc755
Binary files /dev/null and b/source/images/arch.png differ

http://git-wip-us.apache.org/repos/asf/incubator-griffin-site/blob/ebdd32b4/source/images/techstack.png
----------------------------------------------------------------------
diff --git a/source/images/techstack.png b/source/images/techstack.png
new file mode 100644
index 0000000..ebc5540
Binary files /dev/null and b/source/images/techstack.png differ