Posted to commits@samza.apache.org by ja...@apache.org on 2018/10/13 00:58:42 UTC

samza git commit: Clean-up the case-studies page for Ebay, add a diagram

Repository: samza
Updated Branches:
  refs/heads/master e5ea9bef1 -> 0b0a0cabf


Clean-up the case-studies page for Ebay, add a diagram

Author: Jagadish <jv...@linkedin.com>

Reviewers: Jagadish<ja...@apache.org>

Closes #724 from vjagadish1989/website-reorg17


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/0b0a0cab
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/0b0a0cab
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/0b0a0cab

Branch: refs/heads/master
Commit: 0b0a0cabf3a6edfe4cf9c9de26c2d5185b677d0f
Parents: e5ea9be
Author: Jagadish <jv...@linkedin.com>
Authored: Fri Oct 12 17:58:39 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Fri Oct 12 17:58:39 2018 -0700

----------------------------------------------------------------------
 docs/_case-studies/ebay.md                      |  57 ++++++++++++-------
 .../learn/documentation/case-study/ebay.png     | Bin 0 -> 27064 bytes
 2 files changed, 35 insertions(+), 22 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/0b0a0cab/docs/_case-studies/ebay.md
----------------------------------------------------------------------
diff --git a/docs/_case-studies/ebay.md b/docs/_case-studies/ebay.md
index e967558..96821f0 100644
--- a/docs/_case-studies/ebay.md
+++ b/docs/_case-studies/ebay.md
@@ -1,7 +1,7 @@
 ---
 layout: case-study
 hide_title: true # so we have control in case-study layout, but can still use page
-title: Low Latency Web Scale Fraud Prevention
+title: Low Latency Web-Scale Fraud Prevention
 study_domain: ebay.com
 menu_title: eBay
 excerpt_separator: <!--more-->
@@ -27,30 +27,43 @@ How Samza powers low-latency, web-scale fraud prevention at Ebay?
 
 <!--more-->
 
-eBay Enterprise is the world’s largest omni-channel commerce provider with 
-hundreds millions of units shipped annually, as commerce gets more 
-convenient and complex, so does fraud. The engineering team at eBay 
-Enterprise selected Samza as the platform to build the horizontally 
-scalable, realtime (sub-seconds) and fault tolerant abnormality detection 
-system. For example, the system computes and evaluates key metrics to 
-detect abnormal behaviors
+eBay Enterprise is the world’s largest omni-channel commerce provider. The engineering team at eBay chose Apache Samza to build _PreCog_, their 
+horizontally scalable anomaly detection system. 
 
--   Transaction velocity (#tnx/day) and change (#tnx/day vs #tnx/day over n days)
--   Amount velocity ($tnx/day) and change ($tnx/day vs $tnx/day over n days)
+_PreCog_ extensively leverages Samza's high-performance, fault-tolerant local storage. Its architecture had the following requirements, for which Samza perfectly fit the bill: <br/>
 
-A wide range of realtime and historical adjunct data from various sources 
-including people, places, interests, social and connections are ingested 
-through Kafka, and stored in local RocksDB state store with changelog 
-enabled for recovery. Incoming transaction data is aggregated using 
-windowing and then joined with adjunct data stores in multiple stages. 
-The system generates potential fraud cases for review real time. Finally, 
-the engineering team at eBay Enterprise has built an OpenTSDB and Grafana 
-based monitoring system using metrics collected through JMX.
+_Web-scale:_ Scale to a large number of users and a large volume of data per user. Additionally, it should be possible to add more commodity hardware and scale horizontally. <br/>
+_Low-latency:_ Process customer interactions in real time by reacting in milliseconds instead of hours. <br/>
+_Fault-tolerance:_ Gracefully tolerate and handle hardware failures. <br/>
 
-Key Samza features: *Stateful processing*, *Windowing*, *Kafka-integration*,
-*JMX-metrics*
+![diagram-large](/img/{{site.version}}/learn/documentation/case-study/ebay.png)
 
-More information
+The PreCog anomaly-detection system comprises multiple tiers, with each tier consisting of multiple Samza jobs, which process the output of the previous tier.
+
+_Ingestion tier:_ In this tier, a variety of historical and realtime data from various
+sources including people, places etc., is ingested into Kafka.
+
+_Fanout tier:_ This tier consists of Samza jobs which process the Kafka events, fan them out and re-partition them based on various
+facets such as email address, IP address, credit-card number and shipping address.
+
+_Compute tier:_ The Samza jobs in this tier consume messages from the fan-out tier and compute various key metrics and derived features. Features used to evaluate fraud include: 
+
+1. Number of transactions per-customer per-day <br/>
+2. Change in the number of daily transactions over the past few days <br/>
+3. Amount value ($$) of each transaction per-day <br/>
+4. Change in the amount value of transactions over a sliding time-window <br/>
+5. Number of transactions per shipping-address
+
+_Assembly tier:_ This tier comprises Samza jobs which join the output of the compute tier with additional data sources
+and make a final determination on transaction-fraud. 
+
+For monitoring the _PreCog_ pipeline, eBay leverages Samza's [JMXMetricsReporter](/learn/documentation/{{site.version}}/operations/monitoring.html) and ingests the reported metrics into OpenTSDB/HBase. The metrics are then 
+visualized using [Grafana](https://grafana.com/).
+
+
+Key Samza features: *Stateful processing*, *Windowing*, *Kafka-integration*, *JMX-metrics*
+
+More information:
 
 -   [https://www.slideshare.net/edibice/extremely-low-latency-web-scale-fraud-prevention-with-apache-samza-kafka-and-friends](https://www.slideshare.net/edibice/extremely-low-latency-web-scale-fraud-prevention-with-apache-samza-kafka-and-friends)
--   [http://ebayenterprise.com/](http://ebayenterprise.com/)
+-   [http://ebayenterprise.com/](http://ebayenterprise.com/)
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/samza/blob/0b0a0cab/docs/img/versioned/learn/documentation/case-study/ebay.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/case-study/ebay.png b/docs/img/versioned/learn/documentation/case-study/ebay.png
new file mode 100644
index 0000000..a9976ac
Binary files /dev/null and b/docs/img/versioned/learn/documentation/case-study/ebay.png differ