Posted to commits@kylin.apache.org by sh...@apache.org on 2015/09/26 13:02:50 UTC

svn commit: r1705409 - in /incubator/kylin/site: blog/2015/09/25/hybrid-model/index.html feed.xml

Author: shaofengshi
Date: Sat Sep 26 11:02:50 2015
New Revision: 1705409

URL: http://svn.apache.org/viewvc?rev=1705409&view=rev
Log:
update hybrid blog

Modified:
    incubator/kylin/site/blog/2015/09/25/hybrid-model/index.html
    incubator/kylin/site/feed.xml

Modified: incubator/kylin/site/blog/2015/09/25/hybrid-model/index.html
URL: http://svn.apache.org/viewvc/incubator/kylin/site/blog/2015/09/25/hybrid-model/index.html?rev=1705409&r1=1705408&r2=1705409&view=diff
==============================================================================
--- incubator/kylin/site/blog/2015/09/25/hybrid-model/index.html (original)
+++ incubator/kylin/site/blog/2015/09/25/hybrid-model/index.html Sat Sep 26 11:02:50 2015
@@ -41,7 +41,7 @@
   <meta name="viewport" content="width=device-width, initial-scale=1">
 
   <title>Apache Kylin | Hybrid Model in Apache Kylin 1.0</title>
-  <meta name="description" content="Apache Kylin v1.0 introduces a new realization “hybrid model” (also called “dynamic model”); This post introduces the concept and how to create it.">
+  <meta name="description" content="Apache Kylin v1.0 introduces a new realization “hybrid model” (also called “dynamic model”); This post introduces the concept and how to create a hybrid inst...">
   <meta name="author"      content="Apache Kylin">
   <link rel="shortcut icon" href="fav.png" type="image/png">
 
@@ -185,7 +185,7 @@
   </header>
 
   <article class="post-content" >
-    <p><strong>Apache Kylin v1.0 introduces a new realization “hybrid model” (also called “dynamic model”); This post introduces the concept and how to create it.</strong></p>
+    <p><strong>Apache Kylin v1.0 introduces a new realization, the “hybrid model” (also called “dynamic model”). This post introduces the concept and explains how to create a hybrid instance.</strong></p>
 
 <h1 id="problem">Problem</h1>
 
@@ -198,10 +198,11 @@
 <ul>
   <li>History source data has been dropped from Hadoop, not possible to build “Cube_V2” from the very beginning;</li>
   <li>The cube is large, rebuilding takes very long time;</li>
-  <li>New dimension/metrics is only available or applied since some day, or user feels fine if they were absent for old dates; etc.</li>
+  <li>The new dimensions/metrics are only available, or only applied, from a certain date onward;</li>
+  <li>Users accept that results are empty for older dates when a query uses the new dimensions/metrics.</li>
 </ul>
 
-<p>For some queries on the common measures/metrics, user expects both “Cube_V1” and “Cube_V2” be scanned to get a full result, such as “select count(*)…”, “select sum(column)…”, etc; Under such a background, the “hybrid model” is introduced.</p>
+<p>For queries against the common dimensions/metrics, users expect both “Cube_V1” and “Cube_V2” to be scanned so that the result set is complete. The “hybrid model” is introduced to solve this problem.</p>
 
 <h2 id="hybrid-model">Hybrid Model</h2>
 
@@ -209,11 +210,11 @@
 
 <p><img src="/images/blog/hybrid-model.png" alt="" /></p>
 
-<p>Hybrid doesn’t have its real storage; It is just like a virtual database view over tables; It acts as a delegator who forward the requests to its children realizations and then consolidates the results.</p>
+<p>A hybrid has no storage of its own; it is like a virtual database view over tables. A hybrid instance acts as a delegator that forwards requests to its child realizations and then unions the results they return.</p>
 
-<h2 id="how-to-add-a-hybrid-model">How to add a Hybrid model</h2>
+<h2 id="how-to-add-a-hybrid-instance">How to add a hybrid instance</h2>
 
-<p>So far there is no UI for creating/editing hybrid model; if have the need, you need manually edit Kylin metadata;</p>
+<p>So far there is no UI for creating or editing a hybrid; if you need one, you have to edit the Kylin metadata manually:</p>
 
 <h3 id="step-1-take-a-backup-of-kylin-metadata-store">Step 1: Take a backup of kylin metadata store</h3>
 
@@ -226,20 +227,20 @@ $KYLIN_HOME/bin/metastore.sh backup
 
 <p>A backup folder will be created, assume it is $KYLIN_HOME/metadata_backup/2015-09-25/</p>
 
-<h3 id="step-2-create-sub-folder-hybrid-in-the-metadata-folder">Step 2: Create sub-folder “hybrid” in the metadata folder,</h3>
+<h3 id="step-2-create-sub-folder-hybrid">Step 2: Create sub-folder “hybrid”</h3>
 
 <div class="highlighter-rouge"><pre class="highlight"><code>mkdir -p $KYLIN_HOME/metadata_backup/2015-09-25/hybrid
 </code></pre>
 </div>
 
-<h3 id="step-3-create-a-hybrid-json-file">Step 3: Create a hybrid json file:</h3>
+<h3 id="step-3-create-a-hybrid-instance-json-file">Step 3: Create a hybrid instance json file:</h3>
 
 <div class="highlighter-rouge"><pre class="highlight"><code>vi $KYLIN_HOME/metadata_backup/2015-09-25/hybrid/my_hybrid.json
 
 </code></pre>
 </div>
 
-<p>Input content like this:</p>
+<p>Enter content like the following; the “name” and “uuid” must be unique:</p>
 
 <div class="highlighter-rouge"><pre class="highlight"><code><span class="p">{</span><span class="w">
   </span><span class="nt">"uuid"</span><span class="p">:</span><span class="w"> </span><span class="s2">"9iiu8590-64b6-4367-8fb5-7500eb95fd9c"</span><span class="p">,</span><span class="w">
@@ -260,7 +261,7 @@ $KYLIN_HOME/bin/metastore.sh backup
 </div>
 <p>Here “Cube_V1” and “Cube_V2” are the cubes that you want to combine.</p>
 
-<h3 id="step-4-add-hybrid-model-to-project">Step 4: Add hybrid model to project</h3>
+<h3 id="step-4-add-hybrid-instance-to-project">Step 4: Add hybrid instance to project</h3>
 
 <p>Open project json file (for example project “default”) with text editor:</p>
 
@@ -269,7 +270,7 @@ $KYLIN_HOME/bin/metastore.sh backup
 </code></pre>
 </div>
 
-<p>In the “realizations” array, add one entry like:</p>
+<p>In the “realizations” array, add one entry like the following; the type must be “HYBRID”, and “realization” is the name of the hybrid instance:</p>
 
 <div class="highlighter-rouge"><pre class="highlight"><code><span class="w">    </span><span class="p">{</span><span class="w">
       </span><span class="nt">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"my_hybrid"</span><span class="p">,</span><span class="w">
@@ -289,7 +290,7 @@ $KYLIN_HOME/bin/metastore.sh backup
 
 <h3 id="step-6-reload-metadata">Step 6: Reload metadata</h3>
 
-<p>Restart Kylin server, or click “Reload metadata” in the “Admin” tab on Kylin web UI to load the changes; Ideally the hybrid will start to work; You can do some verifications.</p>
+<p>Restart the Kylin server, or click “Reload metadata” in the “Admin” tab of the Kylin web UI, to load the changes. The hybrid should then start to work; you can verify it by running some SQL queries.</p>
 
 <h2 id="faq">FAQ:</h2>
 
@@ -303,13 +304,16 @@ Hybrid will delegate the query to each o
 No; it depends on user to ensure the cubes in a hybrid don’t have date/time range duplication; For example, the “Cube_V1” is ended at 2015-9-20 (excluding), the “Cube_V2” should start from 2015-9-20 (including);</p>
 
 <p><strong>Question 4</strong>: Will hybrid restrict the children cubes having the same data model?<br />
-No; To provide as much as flexibility, hybrid doesn’t check whether the children cubes’ fact/lookup tables and join conditions are the matched; But user should understand what they’re doing to avoid unexpected behavior.</p>
+No; to provide as much flexibility as possible, hybrid doesn’t check whether the child cubes’ fact/lookup tables and join conditions are the same. Users should understand what they’re doing to avoid unexpected behavior.</p>
 
 <p><strong>Question 5</strong>: Can hybrid have another hybrid as child?<br />
-No; didn’t see the need; so far it assumes all children are Cubes;</p>
+No; we don’t see the need, and so far hybrid assumes all children are cubes.</p>
 
 <p><strong>Question 6</strong>: Can I use hybrid to join multiple cubes?<br />
-No; the purpose of hybrid is to consolidate history cube and new cube, something like a “union”, not join relationship;</p>
+No; the purpose of hybrid is to consolidate a history cube and a new cube, which is like a “union”, not a “join”.</p>
+
+<p><strong>Question 7</strong>: If a child cube is disabled, will it be scanned via the hybrid?<br />
+No; the hybrid instance checks each child realization’s status before sending a query to it, so a disabled cube will not be scanned.</p>
 
   </article>
 

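Editorial sketch (not part of the commit): the delegation behaviour that the post describes can be illustrated with a few lines of Python. This is only a minimal model under stated assumptions, not Kylin's implementation: the classes Cube and Hybrid, the row layout with "dt" and "cnt" fields, and the sample values are all hypothetical. It shows a hybrid that owns no rows, forwards a query to each enabled child, and unions what comes back.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass
class Cube:
    """Hypothetical stand-in for a child cube realization."""
    name: str
    rows: List[Row]
    enabled: bool = True

    def query(self, predicate: Callable[[Row], bool]) -> List[Row]:
        # Return only the rows of this cube that satisfy the predicate.
        return [row for row in self.rows if predicate(row)]


@dataclass
class Hybrid:
    """Hypothetical delegator: stores no rows, only references to children."""
    name: str
    children: List[Cube] = field(default_factory=list)

    def query(self, predicate: Callable[[Row], bool]) -> List[Row]:
        results: List[Row] = []
        for cube in self.children:
            if not cube.enabled:
                # Disabled children are skipped (FAQ question 7).
                continue
            # Union semantics: concatenate child results, no join.
            results.extend(cube.query(predicate))
        return results


def total(rows: List[Row]) -> int:
    """Sum the hypothetical 'cnt' measure over a result set."""
    return sum(row["cnt"] for row in rows)


# Date ranges do not overlap: Cube_V1 ends before 2015-09-20 (exclusive),
# Cube_V2 starts at 2015-09-20 (inclusive). Keeping them disjoint is the
# user's job; the hybrid does not check for overlap.
cube_v1 = Cube("Cube_V1", [{"dt": "2015-09-18", "cnt": 3},
                           {"dt": "2015-09-19", "cnt": 5}])
cube_v2 = Cube("Cube_V2", [{"dt": "2015-09-20", "cnt": 7}])
hybrid = Hybrid("my_hybrid", [cube_v1, cube_v2])

all_rows = hybrid.query(lambda row: True)
print(len(all_rows), total(all_rows))
```

Because the results are simply concatenated, overlapping date ranges in the children would be counted twice, which is why the FAQ asks users to keep the ranges disjoint.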
Modified: incubator/kylin/site/feed.xml
URL: http://svn.apache.org/viewvc/incubator/kylin/site/feed.xml?rev=1705409&r1=1705408&r2=1705409&view=diff
==============================================================================
--- incubator/kylin/site/feed.xml (original)
+++ incubator/kylin/site/feed.xml Sat Sep 26 11:02:50 2015
@@ -19,13 +19,13 @@
     <description>Apache Kylin Home</description>
     <link>http://kylin.incubator.apache.org/</link>
     <atom:link href="http://kylin.incubator.apache.org/feed.xml" rel="self" type="application/rss+xml"/>
-    <pubDate>Sat, 26 Sep 2015 03:40:44 -0700</pubDate>
-    <lastBuildDate>Sat, 26 Sep 2015 03:40:44 -0700</lastBuildDate>
+    <pubDate>Sat, 26 Sep 2015 04:01:37 -0700</pubDate>
+    <lastBuildDate>Sat, 26 Sep 2015 04:01:37 -0700</lastBuildDate>
     <generator>Jekyll v2.5.3</generator>
     
       <item>
         <title>Hybrid Model in Apache Kylin 1.0</title>
-        <description>&lt;p&gt;&lt;strong&gt;Apache Kylin v1.0 introduces a new realization “hybrid model” (also called “dynamic model”); This post introduces the concept and how to create it.&lt;/strong&gt;&lt;/p&gt;
+        <description>&lt;p&gt;&lt;strong&gt;Apache Kylin v1.0 introduces a new realization, the “hybrid model” (also called “dynamic model”). This post introduces the concept and explains how to create a hybrid instance.&lt;/strong&gt;&lt;/p&gt;
 
 &lt;h1 id=&quot;problem&quot;&gt;Problem&lt;/h1&gt;
 
@@ -38,10 +38,11 @@
 &lt;ul&gt;
   &lt;li&gt;History source data has been dropped from Hadoop, not possible to build “Cube_V2” from the very beginning;&lt;/li&gt;
   &lt;li&gt;The cube is large, rebuilding takes very long time;&lt;/li&gt;
-  &lt;li&gt;New dimension/metrics is only available or applied since some day, or user feels fine if they were absent for old dates; etc.&lt;/li&gt;
+  &lt;li&gt;The new dimensions/metrics are only available, or only applied, from a certain date onward;&lt;/li&gt;
+  &lt;li&gt;Users accept that results are empty for older dates when a query uses the new dimensions/metrics.&lt;/li&gt;
 &lt;/ul&gt;
 
-&lt;p&gt;For some queries on the common measures/metrics, user expects both “Cube_V1” and “Cube_V2” be scanned to get a full result, such as “select count(*)…”, “select sum(column)…”, etc; Under such a background, the “hybrid model” is introduced.&lt;/p&gt;
+&lt;p&gt;For queries against the common dimensions/metrics, users expect both “Cube_V1” and “Cube_V2” to be scanned so that the result set is complete. The “hybrid model” is introduced to solve this problem.&lt;/p&gt;
 
 &lt;h2 id=&quot;hybrid-model&quot;&gt;Hybrid Model&lt;/h2&gt;
 
@@ -49,11 +50,11 @@
 
 &lt;p&gt;&lt;img src=&quot;/images/blog/hybrid-model.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
 
-&lt;p&gt;Hybrid doesn’t have its real storage; It is just like a virtual database view over tables; It acts as a delegator who forward the requests to its children realizations and then consolidates the results.&lt;/p&gt;
+&lt;p&gt;A hybrid has no storage of its own; it is like a virtual database view over tables. A hybrid instance acts as a delegator that forwards requests to its child realizations and then unions the results they return.&lt;/p&gt;
 
-&lt;h2 id=&quot;how-to-add-a-hybrid-model&quot;&gt;How to add a Hybrid model&lt;/h2&gt;
+&lt;h2 id=&quot;how-to-add-a-hybrid-instance&quot;&gt;How to add a hybrid instance&lt;/h2&gt;
 
-&lt;p&gt;So far there is no UI for creating/editing hybrid model; if have the need, you need manually edit Kylin metadata;&lt;/p&gt;
+&lt;p&gt;So far there is no UI for creating or editing a hybrid; if you need one, you have to edit the Kylin metadata manually:&lt;/p&gt;
 
 &lt;h3 id=&quot;step-1-take-a-backup-of-kylin-metadata-store&quot;&gt;Step 1: Take a backup of kylin metadata store&lt;/h3&gt;
 
@@ -66,20 +67,20 @@ $KYLIN_HOME/bin/metastore.sh backup
 
 &lt;p&gt;A backup folder will be created, assume it is $KYLIN_HOME/metadata_backup/2015-09-25/&lt;/p&gt;
 
-&lt;h3 id=&quot;step-2-create-sub-folder-hybrid-in-the-metadata-folder&quot;&gt;Step 2: Create sub-folder “hybrid” in the metadata folder,&lt;/h3&gt;
+&lt;h3 id=&quot;step-2-create-sub-folder-hybrid&quot;&gt;Step 2: Create sub-folder “hybrid”&lt;/h3&gt;
 
 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;mkdir -p $KYLIN_HOME/metadata_backup/2015-09-25/hybrid
 &lt;/code&gt;&lt;/pre&gt;
 &lt;/div&gt;
 
-&lt;h3 id=&quot;step-3-create-a-hybrid-json-file&quot;&gt;Step 3: Create a hybrid json file:&lt;/h3&gt;
+&lt;h3 id=&quot;step-3-create-a-hybrid-instance-json-file&quot;&gt;Step 3: Create a hybrid instance json file:&lt;/h3&gt;
 
 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;vi $KYLIN_HOME/metadata_backup/2015-09-25/hybrid/my_hybrid.json
 
 &lt;/code&gt;&lt;/pre&gt;
 &lt;/div&gt;
 
-&lt;p&gt;Input content like this:&lt;/p&gt;
+&lt;p&gt;Enter content like the following; the “name” and “uuid” must be unique:&lt;/p&gt;
 
 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
   &lt;/span&gt;&lt;span class=&quot;nt&quot;&gt;&quot;uuid&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;9iiu8590-64b6-4367-8fb5-7500eb95fd9c&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
@@ -100,7 +101,7 @@ $KYLIN_HOME/bin/metastore.sh backup
 &lt;/div&gt;
 &lt;p&gt;Here “Cube_V1” and “Cube_V2” are the cubes that you want to combine.&lt;/p&gt;
 
-&lt;h3 id=&quot;step-4-add-hybrid-model-to-project&quot;&gt;Step 4: Add hybrid model to project&lt;/h3&gt;
+&lt;h3 id=&quot;step-4-add-hybrid-instance-to-project&quot;&gt;Step 4: Add hybrid instance to project&lt;/h3&gt;
 
 &lt;p&gt;Open project json file (for example project “default”) with text editor:&lt;/p&gt;
 
@@ -109,7 +110,7 @@ $KYLIN_HOME/bin/metastore.sh backup
 &lt;/code&gt;&lt;/pre&gt;
 &lt;/div&gt;
 
-&lt;p&gt;In the “realizations” array, add one entry like:&lt;/p&gt;
+&lt;p&gt;In the “realizations” array, add one entry like the following; the type must be “HYBRID”, and “realization” is the name of the hybrid instance:&lt;/p&gt;
 
 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
       &lt;/span&gt;&lt;span class=&quot;nt&quot;&gt;&quot;name&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;my_hybrid&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
@@ -129,7 +130,7 @@ $KYLIN_HOME/bin/metastore.sh backup
 
 &lt;h3 id=&quot;step-6-reload-metadata&quot;&gt;Step 6: Reload metadata&lt;/h3&gt;
 
-&lt;p&gt;Restart Kylin server, or click “Reload metadata” in the “Admin” tab on Kylin web UI to load the changes; Ideally the hybrid will start to work; You can do some verifications.&lt;/p&gt;
+&lt;p&gt;Restart the Kylin server, or click “Reload metadata” in the “Admin” tab of the Kylin web UI, to load the changes. The hybrid should then start to work; you can verify it by running some SQL queries.&lt;/p&gt;
 
 &lt;h2 id=&quot;faq&quot;&gt;FAQ:&lt;/h2&gt;
 
@@ -143,13 +144,16 @@ Hybrid will delegate the query to each o
 No; it depends on user to ensure the cubes in a hybrid don’t have date/time range duplication; For example, the “Cube_V1” is ended at 2015-9-20 (excluding), the “Cube_V2” should start from 2015-9-20 (including);&lt;/p&gt;
 
 &lt;p&gt;&lt;strong&gt;Question 4&lt;/strong&gt;: Will hybrid restrict the children cubes having the same data model?&lt;br /&gt;
-No; To provide as much as flexibility, hybrid doesn’t check whether the children cubes’ fact/lookup tables and join conditions are the matched; But user should understand what they’re doing to avoid unexpected behavior.&lt;/p&gt;
+No; to provide as much flexibility as possible, hybrid doesn’t check whether the child cubes’ fact/lookup tables and join conditions are the same. Users should understand what they’re doing to avoid unexpected behavior.&lt;/p&gt;
 
 &lt;p&gt;&lt;strong&gt;Question 5&lt;/strong&gt;: Can hybrid have another hybrid as child?&lt;br /&gt;
-No; didn’t see the need; so far it assumes all children are Cubes;&lt;/p&gt;
+No; we don’t see the need, and so far hybrid assumes all children are cubes.&lt;/p&gt;
 
 &lt;p&gt;&lt;strong&gt;Question 6&lt;/strong&gt;: Can I use hybrid to join multiple cubes?&lt;br /&gt;
-No; the purpose of hybrid is to consolidate history cube and new cube, something like a “union”, not join relationship;&lt;/p&gt;
+No; the purpose of hybrid is to consolidate a history cube and a new cube, which is like a “union”, not a “join”.&lt;/p&gt;
+
+&lt;p&gt;&lt;strong&gt;Question 7&lt;/strong&gt;: If a child cube is disabled, will it be scanned via the hybrid?&lt;br /&gt;
+No; the hybrid instance checks each child realization’s status before sending a query to it, so a disabled cube will not be scanned.&lt;/p&gt;
 </description>
         <pubDate>Fri, 25 Sep 2015 09:00:00 -0700</pubDate>
         <link>http://kylin.incubator.apache.org/blog/2015/09/25/hybrid-model/</link>