Posted to commits@eagle.apache.org by ha...@apache.org on 2017/04/03 11:23:43 UTC
svn commit: r1789954 [5/5] - in /eagle/site: ./ docs/ docs/tutorial/
post/2015/10/27/
Modified: eagle/site/docs/tutorial/site-0.3.0.html
URL: http://svn.apache.org/viewvc/eagle/site/docs/tutorial/site-0.3.0.html?rev=1789954&r1=1789953&r2=1789954&view=diff
==============================================================================
--- eagle/site/docs/tutorial/site-0.3.0.html (original)
+++ eagle/site/docs/tutorial/site-0.3.0.html Mon Apr 3 11:23:42 2017
@@ -129,86 +129,27 @@
<li class="divider"></li>
- <li class="heading">Download</li>
-
- <li class="sidenavli "><a href="/docs/download-latest.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Latest version</a></li>
-
- <li class="sidenavli "><a href="/docs/download.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Archived</a></li>
-
- <li class="divider"></li>
-
- <li class="heading">Installation</li>
-
- <li class="sidenavli "><a href="/docs/quick-start.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Get Started with Sandbox</a></li>
-
- <li class="sidenavli "><a href="/docs/deployment-in-docker.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Get Started with Docker</a></li>
-
- <li class="sidenavli "><a href="/docs/deployment-env.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Setup Environment</a></li>
-
- <li class="sidenavli "><a href="/docs/deployment-in-production.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Setup Eagle in Production</a></li>
-
- <li class="sidenavli "><a href="/docs/configuration.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Eagle Application Configuration</a></li>
-
- <li class="sidenavli "><a href="/docs/serviceconfiguration.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Eagle Service Configuration</a></li>
-
- <li class="sidenavli "><a href="/docs/tutorial/ldap.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Eagle LDAP Authentication</a></li>
-
- <li class="divider"></li>
-
- <li class="heading">Tutorial</li>
-
- <li class="sidenavli "><a href="/docs/tutorial/site.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Site Management</a></li>
+ <li class="heading">Documentations</li>
- <li class="sidenavli "><a href="/docs/tutorial/policy.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Policy Management</a></li>
-
- <li class="sidenavli "><a href="/docs/tutorial/policy-capabilities.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Policy Engine Capabilities</a></li>
-
- <li class="sidenavli "><a href="/docs/hdfs-data-activity-monitoring.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">HDFS Data Activity Monitoring</a></li>
-
- <li class="sidenavli "><a href="/docs/hive-query-activity-monitoring.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">HIVE Query Activity Monitoring</a></li>
-
- <li class="sidenavli "><a href="/docs/hbase-data-activity-monitoring.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">HBASE Data Activity Monitoring</a></li>
-
- <li class="sidenavli "><a href="/docs/mapr-integration.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">MapR FS Data Activity Monitoring</a></li>
-
- <li class="sidenavli "><a href="/docs/cloudera-integration.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Cloudera FS Data Activity Monitoring</a></li>
-
- <li class="sidenavli "><a href="/docs/jmx-metric-monitoring.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Hadoop JMX Metrics Monitoring</a></li>
-
- <li class="sidenavli "><a href="/docs/import-hdfs-auditLog.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Stream HDFS audit logs into Kafka</a></li>
-
- <li class="sidenavli "><a href="/docs/tutorial/userprofile.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">User Profile Feature</a></li>
-
- <li class="sidenavli "><a href="/docs/tutorial/classification.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Data Classification Feature</a></li>
-
- <li class="sidenavli "><a href="/docs/tutorial/topologymanagement.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Topology Management Feature</a></li>
-
- <li class="sidenavli "><a href="/docs/tutorial/notificationplugin.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Alert Notification Plugin</a></li>
-
- <li class="sidenavli "><a href="/docs/metadata-api.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Metadata RESTful API</a></li>
+ <li class="sidenavli "><a href="/docs/latest/" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Latest version (v0.5.0)</a></li>
<li class="divider"></li>
- <li class="heading">Development Guide</li>
-
- <li class="sidenavli "><a href="/docs/development-quick-guide.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Development Quick Guide</a></li>
+ <li class="heading">Download</li>
- <li class="sidenavli "><a href="/docs/development-in-macosx.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Development in Mac OSX</a></li>
+ <li class="sidenavli "><a href="/docs/download-latest.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Latest version (v0.5.0)</a></li>
- <li class="sidenavli "><a href="/docs/development-in-intellij.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Development in Intellij</a></li>
+ <li class="sidenavli "><a href="/docs/download.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Archived</a></li>
<li class="divider"></li>
- <li class="heading">Advanced</li>
+ <li class="heading">Supplement</li>
- <li class="sidenavli "><a href="/docs/user-profile-ml.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">User Profile Machine Learning</a></li>
+ <li class="sidenavli "><a href="/docs/security.html" data-permalink="/docs/tutorial/site-0.3.0.html" id="">Security</a></li>
<li class="divider"></li>
<li class="sidenavli">
- <a href="/sup/index.html">Go To Supplement</a>
- </li>
- <li class="sidenavli">
<a href="mailto:dev@eagle.apache.org" target="_blank">Need Help?</a>
</li>
</ul>
@@ -239,35 +180,32 @@ Here we give configuration examples for
<p>You may configure the default path for Hadoop clients to connect to the remote HDFS namenode.</p>
- <div class="highlighter-rouge"><pre class="highlight"><code><span class="w"> </span><span class="p">{</span><span class="nt">"fs.defaultFS"</span><span class="p">:</span><span class="s2">"hdfs://sandbox.hortonworks.com:8020"</span><span class="p">}</span><span class="w">
-</span></code></pre>
- </div>
+ <pre><code> {"fs.defaultFS":"hdfs://sandbox.hortonworks.com:8020"}
+</code></pre>
</li>
<li>
<p>HA case</p>
<p>Basically, you point your fs.defaultFS at your nameservice and let the client know how it is configured (the backing namenodes) and how to fail over between them in HA mode.</p>
- <div class="highlighter-rouge"><pre class="highlight"><code><span class="w"> </span><span class="p">{</span><span class="nt">"fs.defaultFS"</span><span class="p">:</span><span class="s2">"hdfs://nameservice1"</span><span class="p">,</span><span class="w">
- </span><span class="nt">"dfs.nameservices"</span><span class="p">:</span><span class="w"> </span><span class="s2">"nameservice1"</span><span class="p">,</span><span class="w">
- </span><span class="nt">"dfs.ha.namenodes.nameservice1"</span><span class="p">:</span><span class="s2">"namenode1,namenode2"</span><span class="p">,</span><span class="w">
- </span><span class="nt">"dfs.namenode.rpc-address.nameservice1.namenode1"</span><span class="p">:</span><span class="w"> </span><span class="s2">"hadoopnamenode01:8020"</span><span class="p">,</span><span class="w">
- </span><span class="nt">"dfs.namenode.rpc-address.nameservice1.namenode2"</span><span class="p">:</span><span class="w"> </span><span class="s2">"hadoopnamenode02:8020"</span><span class="p">,</span><span class="w">
- </span><span class="nt">"dfs.client.failover.proxy.provider.nameservice1"</span><span class="p">:</span><span class="w"> </span><span class="s2">"org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"</span><span class="w">
- </span><span class="p">}</span><span class="w">
-</span></code></pre>
- </div>
+ <pre><code> {"fs.defaultFS":"hdfs://nameservice1",
+ "dfs.nameservices": "nameservice1",
+ "dfs.ha.namenodes.nameservice1":"namenode1,namenode2",
+ "dfs.namenode.rpc-address.nameservice1.namenode1": "hadoopnamenode01:8020",
+ "dfs.namenode.rpc-address.nameservice1.namenode2": "hadoopnamenode02:8020",
+ "dfs.client.failover.proxy.provider.nameservice1": "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
+ }
+</code></pre>
</li>
<li>
<p>Kerberos-secured cluster</p>
<p>For a Kerberos-secured cluster, you need to get a keytab file and its principal from your admin, and configure “eagle.keytab.file” and “eagle.kerberos.principal” to authenticate Eagle's access.</p>
- <div class="highlighter-rouge"><pre class="highlight"><code><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nt">"eagle.keytab.file"</span><span class="p">:</span><span class="s2">"/EAGLE-HOME/.keytab/eagle.keytab"</span><span class="p">,</span><span class="w">
- </span><span class="nt">"eagle.kerberos.principal"</span><span class="p">:</span><span class="s2">"eagle@SOMEWHERE.COM"</span><span class="w">
- </span><span class="p">}</span><span class="w">
-</span></code></pre>
- </div>
+ <pre><code> { "eagle.keytab.file":"/EAGLE-HOME/.keytab/eagle.keytab",
+ "eagle.kerberos.principal":"eagle@SOMEWHERE.COM"
+ }
+</code></pre>
<p>If there is an exception about “invalid server principal name”, you may need to check the DNS resolver, or the data transfer properties, such as “dfs.encrypt.data.transfer”, “dfs.encrypt.data.transfer.algorithm”, “dfs.trustedchannel.resolver.class”, “dfs.datatransfer.client.encrypt”.</p>
</li>
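For the HA case, the properties above must agree with one another: the nameservice named in fs.defaultFS has to be declared in dfs.nameservices, and every namenode listed for that nameservice needs an RPC address. A minimal consistency check, sketched in plain Python (the helper name is illustrative, not part of Eagle):

```python
def check_hdfs_ha_config(conf):
    """Validate that an HDFS HA connection config is internally consistent."""
    service = conf["fs.defaultFS"].split("://", 1)[1]     # e.g. "nameservice1"
    assert service in conf["dfs.nameservices"].split(","), \
        "fs.defaultFS must point at a declared nameservice"
    # every namenode in the HA group needs a resolvable RPC address
    for nn in conf["dfs.ha.namenodes." + service].split(","):
        assert "dfs.namenode.rpc-address.%s.%s" % (service, nn) in conf, \
            "missing RPC address for namenode " + nn
    # without a failover proxy provider the client cannot switch namenodes
    assert "dfs.client.failover.proxy.provider." + service in conf, \
        "a client failover proxy provider is required in HA mode"
    return True

ha_conf = {
    "fs.defaultFS": "hdfs://nameservice1",
    "dfs.nameservices": "nameservice1",
    "dfs.ha.namenodes.nameservice1": "namenode1,namenode2",
    "dfs.namenode.rpc-address.nameservice1.namenode1": "hadoopnamenode01:8020",
    "dfs.namenode.rpc-address.nameservice1.namenode2": "hadoopnamenode02:8020",
    "dfs.client.failover.proxy.provider.nameservice1":
        "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
}
```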
@@ -278,15 +216,14 @@ Here we give configuration examples for
<li>
<p>Basic</p>
- <div class="highlighter-rouge"><pre class="highlight"><code><span class="w"> </span><span class="p">{</span><span class="w">
- </span><span class="nt">"accessType"</span><span class="p">:</span><span class="w"> </span><span class="s2">"metastoredb_jdbc"</span><span class="p">,</span><span class="w">
- </span><span class="nt">"password"</span><span class="p">:</span><span class="w"> </span><span class="s2">"hive"</span><span class="p">,</span><span class="w">
- </span><span class="nt">"user"</span><span class="p">:</span><span class="w"> </span><span class="s2">"hive"</span><span class="p">,</span><span class="w">
- </span><span class="nt">"jdbcDriverClassName"</span><span class="p">:</span><span class="w"> </span><span class="s2">"com.mysql.jdbc.Driver"</span><span class="p">,</span><span class="w">
- </span><span class="nt">"jdbcUrl"</span><span class="p">:</span><span class="w"> </span><span class="s2">"jdbc:mysql://sandbox.hortonworks.com/hive?createDatabaseIfNotExist=true"</span><span class="w">
- </span><span class="p">}</span><span class="w">
-</span></code></pre>
- </div>
+ <pre><code> {
+ "accessType": "metastoredb_jdbc",
+ "password": "hive",
+ "user": "hive",
+ "jdbcDriverClassName": "com.mysql.jdbc.Driver",
+ "jdbcUrl": "jdbc:mysql://sandbox.hortonworks.com/hive?createDatabaseIfNotExist=true"
+ }
+</code></pre>
</li>
</ul>
</li>
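The jdbcUrl above follows the usual jdbc:mysql://host/database?options shape. Purely as an illustration (plain Python, not part of Eagle), the pieces can be pulled apart with the standard URL parser once the jdbc: prefix is dropped:

```python
from urllib.parse import urlsplit, parse_qs

def parse_jdbc_mysql_url(jdbc_url):
    """Split a MySQL JDBC URL into host, database and option map."""
    assert jdbc_url.startswith("jdbc:"), "not a JDBC URL"
    parts = urlsplit(jdbc_url[len("jdbc:"):])   # leaves a regular mysql:// URL
    return {
        "host": parts.hostname,
        "database": parts.path.lstrip("/"),
        "options": {k: v[0] for k, v in parse_qs(parts.query).items()},
    }

info = parse_jdbc_mysql_url(
    "jdbc:mysql://sandbox.hortonworks.com/hive?createDatabaseIfNotExist=true")
```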
@@ -299,29 +236,27 @@ Here we give configuration examples for
<p>You need to set the "hbase.zookeeper.quorum" property (e.g. "localhost") and the "hbase.zookeeper.property.clientPort" property.</p>
- <div class="highlighter-rouge"><pre class="highlight"><code><span class="w"> </span><span class="p">{</span><span class="w">
- </span><span class="nt">"hbase.zookeeper.property.clientPort"</span><span class="p">:</span><span class="s2">"2181"</span><span class="p">,</span><span class="w">
- </span><span class="nt">"hbase.zookeeper.quorum"</span><span class="p">:</span><span class="s2">"localhost"</span><span class="w">
- </span><span class="p">}</span><span class="w">
-</span></code></pre>
- </div>
+ <pre><code> {
+ "hbase.zookeeper.property.clientPort":"2181",
+ "hbase.zookeeper.quorum":"localhost"
+ }
+</code></pre>
</li>
<li>
<p>Kerberos-secured cluster</p>
<p>Depending on your environment, you can add or remove some of the following properties. Here is a reference.</p>
- <div class="highlighter-rouge"><pre class="highlight"><code><span class="w"> </span><span class="p">{</span><span class="w">
- </span><span class="nt">"hbase.zookeeper.property.clientPort"</span><span class="p">:</span><span class="s2">"2181"</span><span class="p">,</span><span class="w">
- </span><span class="nt">"hbase.zookeeper.quorum"</span><span class="p">:</span><span class="s2">"localhost"</span><span class="p">,</span><span class="w">
- </span><span class="nt">"hbase.security.authentication"</span><span class="p">:</span><span class="s2">"kerberos"</span><span class="p">,</span><span class="w">
- </span><span class="nt">"hbase.master.kerberos.principal"</span><span class="p">:</span><span class="s2">"hadoop/_HOST@EXAMPLE.COM"</span><span class="p">,</span><span class="w">
- </span><span class="nt">"zookeeper.znode.parent"</span><span class="p">:</span><span class="s2">"/hbase"</span><span class="p">,</span><span class="w">
- </span><span class="nt">"eagle.keytab.file"</span><span class="p">:</span><span class="s2">"/EAGLE-HOME/.keytab/eagle.keytab"</span><span class="p">,</span><span class="w">
- </span><span class="nt">"eagle.kerberos.principal"</span><span class="p">:</span><span class="s2">"eagle@EXAMPLE.COM"</span><span class="w">
- </span><span class="p">}</span><span class="w">
-</span></code></pre>
- </div>
+ <pre><code> {
+ "hbase.zookeeper.property.clientPort":"2181",
+ "hbase.zookeeper.quorum":"localhost",
+ "hbase.security.authentication":"kerberos",
+ "hbase.master.kerberos.principal":"hadoop/_HOST@EXAMPLE.COM",
+ "zookeeper.znode.parent":"/hbase",
+ "eagle.keytab.file":"/EAGLE-HOME/.keytab/eagle.keytab",
+ "eagle.kerberos.principal":"eagle@EXAMPLE.COM"
+ }
+</code></pre>
</li>
</ul>
</li>
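Since the secured configuration is just the basic ZooKeeper settings plus Kerberos-related overrides, one way to keep the two variants in sync is to compose them from layered dictionaries. A sketch in plain Python with illustrative names (not Eagle code):

```python
BASIC = {
    "hbase.zookeeper.property.clientPort": "2181",
    "hbase.zookeeper.quorum": "localhost",
}

KERBEROS_OVERLAY = {
    "hbase.security.authentication": "kerberos",
    "hbase.master.kerberos.principal": "hadoop/_HOST@EXAMPLE.COM",
    "zookeeper.znode.parent": "/hbase",
    "eagle.keytab.file": "/EAGLE-HOME/.keytab/eagle.keytab",
    "eagle.kerberos.principal": "eagle@EXAMPLE.COM",
}

def hbase_client_config(secured=False, extra=None):
    """Compose the HBase connection properties, layering optional overrides."""
    conf = dict(BASIC)                 # start from the unsecured baseline
    if secured:
        conf.update(KERBEROS_OVERLAY)  # add Kerberos properties on top
    conf.update(extra or {})           # environment-specific overrides win
    return conf
```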
Modified: eagle/site/docs/tutorial/site.html
URL: http://svn.apache.org/viewvc/eagle/site/docs/tutorial/site.html?rev=1789954&r1=1789953&r2=1789954&view=diff
==============================================================================
--- eagle/site/docs/tutorial/site.html (original)
+++ eagle/site/docs/tutorial/site.html Mon Apr 3 11:23:42 2017
@@ -129,86 +129,27 @@
<li class="divider"></li>
- <li class="heading">Download</li>
-
- <li class="sidenavli "><a href="/docs/download-latest.html" data-permalink="/docs/tutorial/site.html" id="">Latest version</a></li>
-
- <li class="sidenavli "><a href="/docs/download.html" data-permalink="/docs/tutorial/site.html" id="">Archived</a></li>
-
- <li class="divider"></li>
-
- <li class="heading">Installation</li>
-
- <li class="sidenavli "><a href="/docs/quick-start.html" data-permalink="/docs/tutorial/site.html" id="">Get Started with Sandbox</a></li>
-
- <li class="sidenavli "><a href="/docs/deployment-in-docker.html" data-permalink="/docs/tutorial/site.html" id="">Get Started with Docker</a></li>
-
- <li class="sidenavli "><a href="/docs/deployment-env.html" data-permalink="/docs/tutorial/site.html" id="">Setup Environment</a></li>
-
- <li class="sidenavli "><a href="/docs/deployment-in-production.html" data-permalink="/docs/tutorial/site.html" id="">Setup Eagle in Production</a></li>
-
- <li class="sidenavli "><a href="/docs/configuration.html" data-permalink="/docs/tutorial/site.html" id="">Eagle Application Configuration</a></li>
-
- <li class="sidenavli "><a href="/docs/serviceconfiguration.html" data-permalink="/docs/tutorial/site.html" id="">Eagle Service Configuration</a></li>
-
- <li class="sidenavli "><a href="/docs/tutorial/ldap.html" data-permalink="/docs/tutorial/site.html" id="">Eagle LDAP Authentication</a></li>
-
- <li class="divider"></li>
-
- <li class="heading">Tutorial</li>
-
- <li class="sidenavli current"><a href="/docs/tutorial/site.html" data-permalink="/docs/tutorial/site.html" id="">Site Management</a></li>
+ <li class="heading">Documentations</li>
- <li class="sidenavli "><a href="/docs/tutorial/policy.html" data-permalink="/docs/tutorial/site.html" id="">Policy Management</a></li>
-
- <li class="sidenavli "><a href="/docs/tutorial/policy-capabilities.html" data-permalink="/docs/tutorial/site.html" id="">Policy Engine Capabilities</a></li>
-
- <li class="sidenavli "><a href="/docs/hdfs-data-activity-monitoring.html" data-permalink="/docs/tutorial/site.html" id="">HDFS Data Activity Monitoring</a></li>
-
- <li class="sidenavli "><a href="/docs/hive-query-activity-monitoring.html" data-permalink="/docs/tutorial/site.html" id="">HIVE Query Activity Monitoring</a></li>
-
- <li class="sidenavli "><a href="/docs/hbase-data-activity-monitoring.html" data-permalink="/docs/tutorial/site.html" id="">HBASE Data Activity Monitoring</a></li>
-
- <li class="sidenavli "><a href="/docs/mapr-integration.html" data-permalink="/docs/tutorial/site.html" id="">MapR FS Data Activity Monitoring</a></li>
-
- <li class="sidenavli "><a href="/docs/cloudera-integration.html" data-permalink="/docs/tutorial/site.html" id="">Cloudera FS Data Activity Monitoring</a></li>
-
- <li class="sidenavli "><a href="/docs/jmx-metric-monitoring.html" data-permalink="/docs/tutorial/site.html" id="">Hadoop JMX Metrics Monitoring</a></li>
-
- <li class="sidenavli "><a href="/docs/import-hdfs-auditLog.html" data-permalink="/docs/tutorial/site.html" id="">Stream HDFS audit logs into Kafka</a></li>
-
- <li class="sidenavli "><a href="/docs/tutorial/userprofile.html" data-permalink="/docs/tutorial/site.html" id="">User Profile Feature</a></li>
-
- <li class="sidenavli "><a href="/docs/tutorial/classification.html" data-permalink="/docs/tutorial/site.html" id="">Data Classification Feature</a></li>
-
- <li class="sidenavli "><a href="/docs/tutorial/topologymanagement.html" data-permalink="/docs/tutorial/site.html" id="">Topology Management Feature</a></li>
-
- <li class="sidenavli "><a href="/docs/tutorial/notificationplugin.html" data-permalink="/docs/tutorial/site.html" id="">Alert Notification Plugin</a></li>
-
- <li class="sidenavli "><a href="/docs/metadata-api.html" data-permalink="/docs/tutorial/site.html" id="">Metadata RESTful API</a></li>
+ <li class="sidenavli "><a href="/docs/latest/" data-permalink="/docs/tutorial/site.html" id="">Latest version (v0.5.0)</a></li>
<li class="divider"></li>
- <li class="heading">Development Guide</li>
-
- <li class="sidenavli "><a href="/docs/development-quick-guide.html" data-permalink="/docs/tutorial/site.html" id="">Development Quick Guide</a></li>
+ <li class="heading">Download</li>
- <li class="sidenavli "><a href="/docs/development-in-macosx.html" data-permalink="/docs/tutorial/site.html" id="">Development in Mac OSX</a></li>
+ <li class="sidenavli "><a href="/docs/download-latest.html" data-permalink="/docs/tutorial/site.html" id="">Latest version (v0.5.0)</a></li>
- <li class="sidenavli "><a href="/docs/development-in-intellij.html" data-permalink="/docs/tutorial/site.html" id="">Development in Intellij</a></li>
+ <li class="sidenavli "><a href="/docs/download.html" data-permalink="/docs/tutorial/site.html" id="">Archived</a></li>
<li class="divider"></li>
- <li class="heading">Advanced</li>
+ <li class="heading">Supplement</li>
- <li class="sidenavli "><a href="/docs/user-profile-ml.html" data-permalink="/docs/tutorial/site.html" id="">User Profile Machine Learning</a></li>
+ <li class="sidenavli "><a href="/docs/security.html" data-permalink="/docs/tutorial/site.html" id="">Security</a></li>
<li class="divider"></li>
<li class="sidenavli">
- <a href="/sup/index.html">Go To Supplement</a>
- </li>
- <li class="sidenavli">
<a href="mailto:dev@eagle.apache.org" target="_blank">Need Help?</a>
</li>
</ul>
Modified: eagle/site/feed.xml
URL: http://svn.apache.org/viewvc/eagle/site/feed.xml?rev=1789954&r1=1789953&r2=1789954&view=diff
==============================================================================
--- eagle/site/feed.xml (original)
+++ eagle/site/feed.xml Mon Apr 3 11:23:42 2017
@@ -5,9 +5,9 @@
<description>Eagle - Analyze Big Data Platforms for Security and Performance</description>
<link>http://goeagle.io/</link>
<atom:link href="http://goeagle.io/feed.xml" rel="self" type="application/rss+xml"/>
- <pubDate>Thu, 12 Jan 2017 15:28:13 +0800</pubDate>
- <lastBuildDate>Thu, 12 Jan 2017 15:28:13 +0800</lastBuildDate>
- <generator>Jekyll v3.3.1</generator>
+ <pubDate>Mon, 03 Apr 2017 19:17:59 +0800</pubDate>
+ <lastBuildDate>Mon, 03 Apr 2017 19:17:59 +0800</lastBuildDate>
+ <generator>Jekyll v2.5.3</generator>
<item>
<title>Apache Eagle 正式发布:分布式实时Hadoop数据安全方案</title>
@@ -17,7 +17,7 @@
<p>日前,eBay公司隆重宣布正式向开源业界推出分布式实时安全监控方案 - Apache Eagle (http://goeagle.io),该项目已于2015年10月26日正式加入Apache 成为孵化器项目。Apache Eagle提供一套高效分布式的流式策略引擎,具有高实时、可伸缩、易扩展、交互友好等特点,同时集成机器学习对用户行为建立Profile以实现智能实时地保护Hadoop生态系统中大数据的安全。</p>
-<h2 id="背景">背景</h2>
+<h2 id="section">背景</h2>
<p>随着大数据的发展,越来越多的成功企业或者组织开始采取数据驱动商业的运作模式。在eBay,我们拥有数万名工程师、分析师和数据科学家,他们每天访问分析数PB级的数据,以为我们的用户带来无与伦比的体验。在全球业务中,我们也广泛地利用海量大数据来连接我们数以亿计的用户。</p>
<p>近年来,Hadoop已经逐渐成为大数据分析领域最受欢迎的解决方案,eBay也一直在使用Hadoop技术从数据中挖掘价值,例如,我们通过大数据提高用户的搜索体验,识别和优化精准广告投放,充实我们的产品目录,以及通过点击流分析以理解用户如何使用我们的在线市场平台等。</p>
@@ -54,20 +54,20 @@
<li><strong>开源</strong>:Eagle一直根据开源的标准开发,并构建于诸多大数据领域的开源产品之上,因此我们决定以Apache许可证开源Eagle,以回馈社区,同时也期待获得社区的反馈、协作与支持。</li>
</ul>
-<h2 id="eagle概览">Eagle概览</h2>
+<h2 id="eagle">Eagle概览</h2>
<p><img src="/images/posts/eagle-group.png" alt="" /></p>
-<h4 id="数据流接入和存储data-collection-and-storage">数据流接入和存储(Data Collection and Storage)</h4>
+<h4 id="data-collection-and-storage">数据流接入和存储(Data Collection and Storage)</h4>
<p>Eagle提供高度可扩展的编程API,可以支持将任何类型的数据源集成到Eagle的策略执行引擎中。例如,在Eagle HDFS 审计事件(Audit)监控模块中,通过Kafka来实时接收来自Namenode Log4j Appender 或者 Logstash Agent 收集的数据;在Eagle Hive 监控模块中,通过YARN API 收集正在运行Job的Hive 查询日志,并保证比较高的可伸缩性和容错性。</p>
-<h4 id="数据实时处理data-processing">数据实时处理(Data Processing)</h4>
+<h4 id="data-processing">数据实时处理(Data Processing)</h4>
<p><strong>流处理API(Stream Processing API)Eagle</strong> 提供独立于物理平台而高度抽象的流处理API,目前默认支持Apache Storm,但是也允许扩展到其他任意流处理引擎,比如Flink 或者 Samza等。该层抽象允许开发者在定义监控数据处理逻辑时,无需在物理执行层绑定任何特定流处理平台,而只需通过复用、拼接和组装例如数据转换、过滤、外部数据Join等组件,以实现满足需求的DAG(有向无环图),同时,开发者也可以很容易地以编程的方式将业务逻辑流程和Eagle 策略引擎框架集成起来。Eagle框架内部会将描述业务逻辑的DAG编译成底层流处理架构的原生应用,例如Apache Storm Topology 等,从而实现平台的独立。</p>
<p><strong>以下是一个Eagle如何处理事件和告警的示例:</strong></p>
-<div class="highlighter-rouge"><pre class="highlight"><code>StormExecutionEnvironment env = ExecutionEnvironmentFactory.getStorm(config); // storm env
+<pre><code>StormExecutionEnvironment env = ExecutionEnvironmentFactory.getStorm(config); // storm env
StreamProducer producer = env.newSource(new KafkaSourcedSpoutProvider().getSpout(config)).renameOutputFields(1) // declare kafka source
.flatMap(new AuditLogTransformer()) // transform event
.groupBy(Arrays.asList(0)) // group by 1st field
@@ -75,7 +75,6 @@ StreamProducer producer = env.newSource(
.alertWithConsumer(“userActivity“,”userProfileExecutor“) // ML policy evaluation
env.execute(); // execute stream processing and alert
</code></pre>
-</div>
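The fluent DAG-assembly style shown above can be mimicked with a toy stream builder in plain Python (purely illustrative, unrelated to Eagle's actual Stream Processing API):

```python
class Stream:
    """A toy fluent stream DSL, illustrating the DAG-assembly style only."""
    def __init__(self, events):
        self.events = list(events)

    def flat_map(self, fn):
        # each input event may expand into zero or more output events
        return Stream(out for e in self.events for out in fn(e))

    def filter(self, pred):
        return Stream(e for e in self.events if pred(e))

    def group_by(self, key):
        groups = {}
        for e in self.events:
            groups.setdefault(key(e), []).append(e)
        return groups

raw = ["alice /tmp/private open", "bob /data read"]
groups = (Stream(raw)
          .flat_map(lambda line: [tuple(line.split())])   # transform event
          .group_by(lambda e: e[0]))                      # group by 1st field
```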
<p><strong>告警框架(Alerting Framework)Eagle</strong>告警框架由流元数据API、策略引擎服务提供API、策略Partitioner API 以及预警去重框架等组成:</p>
@@ -85,7 +84,7 @@ env.execute(); // execute stream process
<li>
<p><strong>扩展性</strong> Eagle的策略引擎服务提供API允许你插入新的策略引擎</p>
- <div class="highlighter-rouge"><pre class="highlight"><code> public interface PolicyEvaluatorServiceProvider {
+ <pre><code> public interface PolicyEvaluatorServiceProvider {
public String getPolicyType(); // literal string to identify one type of policy
public Class&lt;? extends PolicyEvaluator&gt; getPolicyEvaluator(); // get policy evaluator implementation
public List&lt;Module&gt; getBindingModules(); // policy text with json format to object mapping
@@ -96,17 +95,15 @@ env.execute(); // execute stream process
public void onPolicyDelete(); // invoked when policy is deleted
}
</code></pre>
- </div>
</li>
<li><strong>策略Partitioner API</strong> 允许策略在不同的物理节点上并行执行。也允许你自定义策略Partitioner类。这些功能使得策略和事件完全以分布式的方式执行。</li>
<li>
<p><strong>可伸缩性</strong> Eagle 通过支持策略的分区接口来实现大量的策略可伸缩并发地运行</p>
- <div class="highlighter-rouge"><pre class="highlight"><code> public interface PolicyPartitioner extends Serializable {
+ <pre><code> public interface PolicyPartitioner extends Serializable {
int partition(int numTotalPartitions, String policyType, String policyId); // method to distribute policies
}
</code></pre>
- </div>
<p><img src="/images/posts/policy-partition.png" alt="" /></p>
@@ -163,29 +160,26 @@ Eagle 支持根据用�
<li>
<p>单一事件执行策略(用户访问Hive中的敏感数据列)</p>
- <div class="highlighter-rouge"><pre class="highlight"><code> from hiveAccessLogStream[sensitivityType=='PHONE_NUMBER'] select * insert into outputStream;
+ <pre><code> from hiveAccessLogStream[sensitivityType=='PHONE_NUMBER'] select * insert into outputStream;
</code></pre>
- </div>
</li>
<li>
<p>基于窗口的策略(用户在10分钟内访问目录 /tmp/private 多于 5次)</p>
- <div class="highlighter-rouge"><pre class="highlight"><code> hdfsAuditLogEventStream[(src == '/tmp/private')]#window.externalTime(timestamp,10 min) select user, count(timestamp) as aggValue group by user having aggValue &gt;= 5 insert into outputStream;
+ <pre><code> hdfsAuditLogEventStream[(src == '/tmp/private')]#window.externalTime(timestamp,10 min) select user, count(timestamp) as aggValue group by user having aggValue &gt;= 5 insert into outputStream;
</code></pre>
- </div>
</li>
</ul>
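The window-based policy above (alert when a user accesses /tmp/private five or more times within 10 minutes) can be illustrated with an external-time sliding window in plain Python; this only mimics the Siddhi semantics and is not Eagle's engine:

```python
from collections import defaultdict, deque

WINDOW_MS = 10 * 60 * 1000   # 10 min, matching #window.externalTime(timestamp, 10 min)
THRESHOLD = 5

events = defaultdict(deque)  # user -> timestamps of /tmp/private accesses

def on_event(user, src, timestamp, alerts):
    """Evaluate one hdfsAuditLogEventStream event against the policy."""
    if src != "/tmp/private":
        return
    window = events[user]
    window.append(timestamp)
    # "external time": expiry is driven by event timestamps, not wall clock
    while window and timestamp - window[0] > WINDOW_MS:
        window.popleft()
    if len(window) >= THRESHOLD:
        alerts.append((user, len(window)))  # would be emitted to outputStream

alerts = []
base = 1_000_000
for i in range(6):
    on_event("alice", "/tmp/private", base + i * 60_000, alerts)  # one access per minute
```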
<p><strong>查询服务(Query Service)</strong> Eagle 提供类SQL的REST API用来实现针对海量数据集的综合计算、查询和分析的能力,支持例如过滤、聚合、直方运算、排序、top、算术表达式以及分页等。Eagle优先支持HBase 作为其默认数据存储,但是同时也支持基于JDBC的关系型数据库。特别是当选择以HBase作为存储时,Eagle便原生拥有了HBase存储和查询海量监控数据的能力,Eagle 查询框架会将用户提供的类SQL查询语法最终编译成为HBase 原生的Filter 对象,并支持通过HBase Coprocessor进一步提升响应速度。</p>
-<div class="highlighter-rouge"><pre class="highlight"><code>query=AlertDefinitionService[@dataSource="hiveQueryLog"]{@policyDef}&amp;pageSize=100000
+<pre><code>query=AlertDefinitionService[@dataSource="hiveQueryLog"]{@policyDef}&amp;pageSize=100000
</code></pre>
-</div>
-<h2 id="eagle在ebay的使用场景">Eagle在eBay的使用场景</h2>
+<h2 id="eagleebay">Eagle在eBay的使用场景</h2>
<p>目前,Eagle的数据行为监控系统已经部署到一个拥有2500多个节点的Hadoop集群之上,用以保护数百PB数据的安全,并正计划于今年年底之前扩展到其他上十个Hadoop集群上,从而覆盖eBay 所有主要Hadoop的10000多台节点。在我们的生产环境中,我们已针对HDFS、Hive 等集群中的数据配置了一些基础的安全策略,并将于年底之前不断引入更多的策略,以确保重要数据的绝对安全。目前,Eagle的策略涵盖多种模式,包括从访问模式、频繁访问数据集,预定义查询类型、Hive 表和列、HBase 表以及基于机器学习模型生成的用户Profile相关的所有策略等。同时,我们也有广泛的策略来防止数据的丢失、数据被拷贝到不安全地点、敏感数据被未授权区域访问等。Eagle策略定义上极大的灵活性和扩展性使得我们未来可以轻易地继续扩展更多更复杂的策略以支持更多多元化的用例场景。</p>
-<h2 id="后续计划">后续计划</h2>
+<h2 id="section-1">后续计划</h2>
<p>过去两年中,在eBay 除了被用于数据行为监控以外,Eagle 核心框架还被广泛用于监控节点健康状况、Hadoop应用性能指标、Hadoop 核心服务以及整个Hadoop集群的健康状况等诸多领域。我们还建立一系列的自动化机制,例如节点修复等,帮助我们平台部门极大地节省了人工劳力,并有效地提升了整个集群资源的利用率。</p>

<p>以下是我们目前正在开发中的一些特性:</p>
@@ -202,7 +196,7 @@ Eagle 支持根据用�
</li>
</ul>
-<h2 id="关于作者">关于作者</h2>
+<h2 id="section-2">关于作者</h2>
<p><a href="https://github.com/haoch">陈浩</a>,Apache Eagle Committer 和 PMC 成员,eBay 分析平台基础架构部门高级软件工程师,负责Eagle的产品设计、技术架构、核心实现以及开源社区推广等。</p>
<p>感谢以下来自Apache Eagle社区和eBay公司的联合作者们对本文的贡献:</p>
@@ -216,7 +210,7 @@ Eagle 支持根据用�
<p>eBay 分析平台基础架构部(Analytics Data Infrastructure)是eBay的全球数据及分析基础架构部门,负责eBay在数据库、数据仓库、Hadoop、商务智能以及机器学习等各个数据平台开发、管理等,支持eBay全球各部门运用高端的数据分析解决方案作出及时有效的作业决策,为遍布全球的业务用户提供数据分析解决方案。</p>
-<h2 id="参考资料">参考资料</h2>
+<h2 id="section-3">参考资料</h2>
<ul>
<li>Apache Eagle 文档:<a href="http://goeagle.io">http://goeagle.io</a></li>
@@ -224,7 +218,7 @@ Eagle 支持根据用�
<li>Apache Eagle 项目:<a href="http://incubator.apache.org/projects/eagle.html">http://incubator.apache.org/projects/eagle.html</a></li>
</ul>
-<h2 id="引用链接">引用链接</h2>
+<h2 id="section-4">引用链接</h2>
<ul>
<li><strong>CSDN</strong>: <a href="http://www.csdn.net/article/2015-10-29/2826076">http://www.csdn.net/article/2015-10-29/2826076</a></li>
<li><strong>OSCHINA</strong>: <a href="http://www.oschina.net/news/67515/apache-eagle">http://www.oschina.net/news/67515/apache-eagle</a></li>
Modified: eagle/site/post/2015/10/27/apache-eagle-announce-cn.html
URL: http://svn.apache.org/viewvc/eagle/site/post/2015/10/27/apache-eagle-announce-cn.html?rev=1789954&r1=1789953&r2=1789954&view=diff
==============================================================================
--- eagle/site/post/2015/10/27/apache-eagle-announce-cn.html (original)
+++ eagle/site/post/2015/10/27/apache-eagle-announce-cn.html Mon Apr 3 11:23:42 2017
@@ -93,7 +93,7 @@
<p>日前,eBay公司隆重宣布正式向开源业界推出分布式实时安全监控方案 - Apache Eagle (http://goeagle.io),该项目已于2015年10月26日正式加入Apache 成为孵化器项目。Apache Eagle提供一套高效分布式的流式策略引擎,具有高实时、可伸缩、易扩展、交互友好等特点,同时集成机器学习对用户行为建立Profile以实现智能实时地保护Hadoop生态系统中大数据的安全。</p>
-<h2 id="背景">背景</h2>
+<h2 id="section">背景</h2>
<p>随着大数据的发展,越来越多的成功企业或者组织开始采取数据驱动商业的运作模式。在eBay,我们拥有数万名工程师、分析师和数据科学家,他们每天访问分析数PB级的数据,以为我们的用户带来无与伦比的体验。在全球业务中,我们也广泛地利用海量大数据来连接我们数以亿计的用户。</p>
<p>近年来,Hadoop已经逐渐成为大数据分析领域最受欢迎的解决方案,eBay也一直在使用Hadoop技术从数据中挖掘价值,例如,我们通过大数据提高用户的搜索体验,识别和优化精准广告投放,充实我们的产品目录,以及通过点击流分析以理解用户如何使用我们的在线市场平台等。</p>
@@ -130,20 +130,20 @@
<li><strong>开源</strong>:Eagle一直根据开源的标准开发,并构建于诸多大数据领域的开源产品之上,因此我们决定以Apache许可证开源Eagle,以回馈社区,同时也期待获得社区的反馈、协作与支持。</li>
</ul>
-<h2 id="eagle概览">Eagle概览</h2>
+<h2 id="eagle">Eagle概览</h2>
<p><img src="/images/posts/eagle-group.png" alt="" /></p>
-<h4 id="数据流接入和存储data-collection-and-storage">数据流接入和存储(Data Collection and Storage)</h4>
+<h4 id="data-collection-and-storage">数据流接入和存储(Data Collection and Storage)</h4>
<p>Eagle提供高度可扩展的编程API,可以支持将任何类型的数据源集成到Eagle的策略执行引擎中。例如,在Eagle HDFS 审计事件(Audit)监控模块中,通过Kafka来实时接收来自Namenode Log4j Appender 或者 Logstash Agent 收集的数据;在Eagle Hive 监控模块中,通过YARN API 收集正在运行Job的Hive 查询日志,并保证比较高的可伸缩性和容错性。</p>
-<h4 id="数据实时处理data-processing">数据实时处理(Data Processing)</h4>
+<h4 id="data-processing">数据实时处理(Data Processing)</h4>
<p><strong>流处理API(Stream Processing API)Eagle</strong> 提供独立于物理平台而高度抽象的流处理API,目前默认支持Apache Storm,但是也允许扩展到其他任意流处理引擎,比如Flink 或者 Samza等。该层抽象允许开发者在定义监控数据处理逻辑时,无需在物理执行层绑定任何特定流处理平台,而只需通过复用、拼接和组装例如数据转换、过滤、外部数据Join等组件,以实现满足需求的DAG(有向无环图),同时,开发者也可以很容易地以编程的方式将业务逻辑流程和Eagle 策略引擎框架集成起来。Eagle框架内部会将描述业务逻辑的DAG编译成底层流处理架构的原生应用,例如Apache Storm Topology 等,从而实现平台的独立。</p>
<p><strong>以下是一个Eagle如何处理事件和告警的示例:</strong></p>
-<div class="highlighter-rouge"><pre class="highlight"><code>StormExecutionEnvironment env = ExecutionEnvironmentFactory.getStorm(config); // storm env
+<pre><code>StormExecutionEnvironment env = ExecutionEnvironmentFactory.getStorm(config); // storm env
StreamProducer producer = env.newSource(new KafkaSourcedSpoutProvider().getSpout(config)).renameOutputFields(1) // declare kafka source
.flatMap(new AuditLogTransformer()) // transform event
.groupBy(Arrays.asList(0)) // group by 1st field
@@ -151,7 +151,6 @@ StreamProducer producer = env.newSource(
.alertWithConsumer(“userActivity“,”userProfileExecutor“) // ML policy evaluation
env.execute(); // execute stream processing and alert
</code></pre>
-</div>
<p><strong>告警框架(Alerting Framework)Eagle</strong>告警框架由流元数据API、策略引擎服务提供API、策略Partitioner API 以及预警去重框架等组成:</p>
@@ -161,7 +160,7 @@ env.execute(); // execute stream process
<li>
<p><strong>扩展性</strong> Eagle的策略引擎服务提供API允许你插入新的策略引擎</p>
- <div class="highlighter-rouge"><pre class="highlight"><code> public interface PolicyEvaluatorServiceProvider {
+ <pre><code> public interface PolicyEvaluatorServiceProvider {
public String getPolicyType(); // literal string to identify one type of policy
public Class<? extends PolicyEvaluator> getPolicyEvaluator(); // get policy evaluator implementation
public List<Module> getBindingModules(); // policy text with json format to object mapping
@@ -172,17 +171,15 @@ env.execute(); // execute stream process
public void onPolicyDelete(); // invoked when policy is deleted
}
</code></pre>
- </div>
</li>
<li><strong>策略Partitioner API</strong> 允许策略在不同的物理节点上并行执行。也允许你自定义策略Partitioner类。这些功能使得策略和事件完全以分布式的方式执行。</li>
<li>
<p><strong>可伸缩性</strong> Eagle 通过支持策略的分区接口来实现大量的策略可伸缩并发地运行</p>
- <div class="highlighter-rouge"><pre class="highlight"><code> public interface PolicyPartitioner extends Serializable {
+ <pre><code> public interface PolicyPartitioner extends Serializable {
int partition(int numTotalPartitions, String policyType, String policyId); // method to distribute policies
}
</code></pre>
- </div>
<p><img src="/images/posts/policy-partition.png" alt="" /></p>
@@ -239,29 +236,26 @@ Eagle 支持根据用�
<li>
<p>单一事件执行策略(用户访问Hive中的敏感数据列)</p>
- <div class="highlighter-rouge"><pre class="highlight"><code> from hiveAccessLogStream[sensitivityType=='PHONE_NUMBER'] select * insert into outputStream;
+ <pre><code> from hiveAccessLogStream[sensitivityType=='PHONE_NUMBER'] select * insert into outputStream;
</code></pre>
- </div>
</li>
<li>
<p>基于窗口的策略(用户在10分钟内访问目录 /tmp/private 多于 5次)</p>
- <div class="highlighter-rouge"><pre class="highlight"><code> hdfsAuditLogEventStream[(src == '/tmp/private')]#window.externalTime(timestamp,10 min) select user, count(timestamp) as aggValue group by user having aggValue >= 5 insert into outputStream;
+ <pre><code> hdfsAuditLogEventStream[(src == '/tmp/private')]#window.externalTime(timestamp,10 min) select user, count(timestamp) as aggValue group by user having aggValue >= 5 insert into outputStream;
</code></pre>
- </div>
</li>
</ul>
<p><strong>查询服务(Query Service)</strong> Eagle 提供类SQL的REST API用来实现针对海量数据集的综合计算、查询和分析的能力,支持例如过滤、聚合、直方运算、排序、top、算术表达式以及分页等。Eagle优先支持HBase 作为其默认数据存储,但是同时也支持基于JDBC的关系型数据库。特别是当选择以HBase作为存储时,Eagle便原生拥有了HBase存储和查询海量监控数据的能力,Eagle 查询框架会将用户提供的类SQL查询语法最终编译成为HBase 原生的Filter 对象,并支持通过HBase Coprocessor进一步提升响应速度。</p>
-<div class="highlighter-rouge"><pre class="highlight"><code>query=AlertDefinitionService[@dataSource="hiveQueryLog"]{@policyDef}&pageSize=100000
+<pre><code>query=AlertDefinitionService[@dataSource="hiveQueryLog"]{@policyDef}&pageSize=100000
</code></pre>
-</div>
-<h2 id="eagle在ebay的使用场景">Eagle在eBay的使用场景</h2>
+<h2 id="eagleebay">Eagle在eBay的使用场景</h2>
<p>目前,Eagle的数据行为监控系统已经部署到一个拥有2500多个节点的Hadoop集群之上,用以保护数百PB数据的安全,并正计划于今年年底之前扩展到其他上十个Hadoop集群上,从而覆盖eBay 所有主要Hadoop的10000多台节点。在我们的生产环境中,我们已针对HDFS、Hive 等集群中的数据配置了一些基础的安全策略,并将于年底之前不断引入更多的策略,以确保重要数据的绝对安全。目前,Eagle的策略涵盖多种模式,包括从访问模式、频繁访问数据集,预定义查询类型、Hive 表和列、HBase 表以及基于机器学习模型生成的用户Profile相关的所有策略等。同时,我们也有广泛的策略来防止数据的丢失、数据被拷贝到不安全地点、敏感数据被未授权区域访问等。Eagle策略定义上极大的灵活性和扩展性使得我们未来可以轻易地继续扩展更多更复杂的策略以支持更多多元化的用例场景。</p>
-<h2 id="后续计划">后续计划</h2>
+<h2 id="section-1">后续计划</h2>
<p>过去两年中,在eBay 除了被用于数据行为监控以外,Eagle 核心框架还被广泛用于监控节点健康状况、Hadoop应用性能指标、Hadoop 核心服务以及整个Hadoop集群的健康状况等诸多领域。我们还建立一系列的自动化机制,例如节点修复等,帮助我们平台部门极大地节省了人工劳力,并有效地提升了整个集群资源的利用率。</p>

<p>以下是我们目前正在开发中的一些特性:</p>
@@ -278,7 +272,7 @@ Eagle 支持根据用�
</li>
</ul>
-<h2 id="关于作者">关于作者</h2>
+<h2 id="section-2">关于作者</h2>
<p><a href="https://github.com/haoch">陈浩</a>,Apache Eagle Committer 和 PMC 成员,eBay 分析平台基础架构部门高级软件工程师,负责Eagle的产品设计、技术架构、核心实现以及开源社区推广等。</p>
<p>感谢以下来自Apache Eagle社区和eBay公司的联合作者们对本文的贡献:</p>
@@ -292,7 +286,7 @@ Eagle 支持根据用�
<p>eBay 分析平台基础架构部(Analytics Data Infrastructure)是eBay的全球数据及分析基础架构部门,负责eBay在数据库、数据仓库、Hadoop、商务智能以及机器学习等各个数据平台开发、管理等,支持eBay全球各部门运用高端的数据分析解决方案作出及时有效的作业决策,为遍布全球的业务用户提供数据分析解决方案。</p>
-<h2 id="参考资料">参考资料</h2>
+<h2 id="section-3">参考资料</h2>
<ul>
<li>Apache Eagle 文档:<a href="http://goeagle.io">http://goeagle.io</a></li>
@@ -300,7 +294,7 @@ Eagle 支持根据用�
<li>Apache Eagle 项目:<a href="http://incubator.apache.org/projects/eagle.html">http://incubator.apache.org/projects/eagle.html</a></li>
</ul>
-<h2 id="引用链接">引用链接</h2>
+<h2 id="section-4">引用链接</h2>
<ul>
<li><strong>CSDN</strong>: <a href="http://www.csdn.net/article/2015-10-29/2826076">http://www.csdn.net/article/2015-10-29/2826076</a></li>
<li><strong>OSCHINA</strong>: <a href="http://www.oschina.net/news/67515/apache-eagle">http://www.oschina.net/news/67515/apache-eagle</a></li>