Posted to commits@kylin.apache.org by li...@apache.org on 2020/11/10 14:09:20 UTC
svn commit: r1883250 - in /kylin/site:
cn/docs/tutorial/setup_systemcube.html docs/tutorial/setup_systemcube.html
feed.xml
Author: lidong
Date: Tue Nov 10 14:09:19 2020
New Revision: 1883250
URL: http://svn.apache.org/viewvc?rev=1883250&view=rev
Log:
a little update
Modified:
kylin/site/cn/docs/tutorial/setup_systemcube.html
kylin/site/docs/tutorial/setup_systemcube.html
kylin/site/feed.xml
Modified: kylin/site/cn/docs/tutorial/setup_systemcube.html
URL: http://svn.apache.org/viewvc/kylin/site/cn/docs/tutorial/setup_systemcube.html?rev=1883250&r1=1883249&r2=1883250&view=diff
==============================================================================
--- kylin/site/cn/docs/tutorial/setup_systemcube.html (original)
+++ kylin/site/cn/docs/tutorial/setup_systemcube.html Tue Nov 10 14:09:19 2020
@@ -182,16 +182,26 @@ var _hmt = _hmt || [];
<p>Available since Apache Kylin v2.3.0</p>
</blockquote>
-<h2 id="cube">What is System Cube</h2>
+<p>Main content of this section:</p>
+
+<ul>
+ <li><a href="#什么是系统 Cube">What is System Cube</a></li>
+ <li><a href="#如何建立系统 Cube">How to Set Up System Cube</a></li>
+ <li><a href="#自动创建系统 Cube">Automatically create System Cube</a></li>
+ <li><a href="#系统 Cube 的细节">Details of System Cube</a></li>
+</ul>
+
+<h2 id="span-id-cube-cubespan"><span id="什么是系统 Cube">What is System Cube</span></h2>
<p>To better support self-monitoring, a set of system Cubes is created under the system project, called “KYLIN_SYSTEM”. Currently, there are five Cubes. Three are for query metrics: “METRICS_QUERY”, “METRICS_QUERY_CUBE”, “METRICS_QUERY_RPC”. The other two are for job metrics: “METRICS_JOB”, “METRICS_JOB_EXCEPTION”.</p>
-<h2 id="cube-1">How to Set Up System Cube</h2>
+<h2 id="span-id-cube-cubespan-1"><span id="如何建立系统 Cube">How to Set Up System Cube</span></h2>
-<h3 id="section">Prepare</h3>
-<p>Create a configuration file SCSinkTools.json in the KYLIN_HOME directory.</p>
+<p>This section describes how to enable the system Cube manually. If you want to create the system Cube automatically through a shell script, please refer to <a href="#自动创建系统 Cube">Automatically create System Cube</a>.</p>
-<p>For example:</p>
+<h3 id="section">1. Prepare</h3>
+
+<p>Create a configuration file SCSinkTools.json in the KYLIN_HOME directory. For example:</p>
<div class="highlighter-rouge"><pre class="highlight"><code>[
{
@@ -206,8 +216,8 @@ var _hmt = _hmt || [];
</code></pre>
</div>
-<h3 id="metadata">1. Generate Metadata</h3>
-<p>Run the command below in the KYLIN_HOME folder to generate the related metadata:</p>
+<h3 id="metadata">2. Generate Metadata</h3>
+<p>Run the following command in the KYLIN_HOME folder to generate the related metadata:</p>
<div class="highlighter-rouge"><pre class="highlight"><code>./bin/kylin.sh org.apache.kylin.tool.metrics.systemcube.SCCreator \
-inputConfig SCSinkTools.json \
@@ -219,39 +229,31 @@ var _hmt = _hmt || [];
<p><img src="/images/SystemCube/metadata.png" alt="metadata" /></p>
-<h3 id="section-1">2. Set Up Datasource</h3>
-<p>Run the following command to create the source hive tables:</p>
+<h3 id="section-1">3. Set Up Datasource</h3>
+<p>Run the following command to create the source Hive tables:</p>
<div class="highlighter-rouge"><pre class="highlight"><code>hive -f <output_forder>/create_hive_tables_for_system_cubes.sql
</code></pre>
</div>
-<p>This command creates the related hive tables.</p>
+<p>This command creates the related hive tables. The fact table of each system Cube corresponds to one Hive source table. The Hive source tables record query- or job-related data, which serves the system Cubes.</p>
<p><img src="/images/SystemCube/hive_table.png" alt="hive_table" /></p>
-<h3 id="system-cubes--metadata">3. Upload Metadata for System Cubes</h3>
+<h3 id="cubes--metadata">4. Upload Metadata for System Cubes</h3>
<p>Then we need to upload the metadata to hbase with the following command:</p>
<div class="highlighter-rouge"><pre class="highlight"><code>./bin/metastore.sh restore <output_forder>
</code></pre>
</div>
-<h3 id="metadata-1">4. Reload Metadata</h3>
-<p>Finally, we need to reload the metadata in the Kylin web UI.</p>
-
-<p>Then, a set of system Cubes will be created under the system project, called “KYLIN_SYSTEM”.</p>
-
-<h3 id="cube-build">5. System Cube build</h3>
-<p>Once the system Cubes are created, we need to build them regularly.</p>
+<h3 id="metadata-1">5. Reload Metadata</h3>
+<p>Finally, we need to reload the metadata in the Kylin web UI. Then, a set of system Cubes will be created under the system project, called “KYLIN_SYSTEM”.</p>
-<ol>
- <li>
- <p>Create a shell script that builds the system Cube by calling org.apache.kylin.tool.job.CubeBuildingCLI</p>
+<h3 id="cube">6. Build System Cube</h3>
+<p>Once the system Cubes are created, we need to build them regularly, as follows:</p>
- <p>For example:</p>
- </li>
-</ol>
+<p><strong>Step 1</strong>: Create a shell script that builds the system Cube by calling org.apache.kylin.tool.job.CubeBuildingCLI. For example:</p>
<div class="highlight"><pre><code class="language-groff" data-lang="groff">#!/bin/bash
@@ -270,13 +272,7 @@ ID="$END"
echo "building for ${CUBE}_${ID}" >> ${KYLIN_HOME}/logs/build_trace.log
sh ${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.job.CubeBuildingCLI --cube ${CUBE} --endTime ${END} > ${KYLIN_HOME}/logs/system_cube_${CUBE}_${END}.log 2>&1 &</code></pre></div>
-<ol>
- <li>
- <p>Then run this shell script regularly</p>
-
- <p>For example, add a cron job as follows:</p>
- </li>
-</ol>
+<p><strong>Step 2</strong>: Run this shell script regularly. For example, add a cron job as follows:</p>
<div class="highlight"><pre><code class="language-groff" data-lang="groff">0 */2 * * * sh ${KYLIN_HOME}/bin/system_cube_build.sh KYLIN_HIVE_METRICS_QUERY_QA 3600000 1200000
@@ -288,26 +284,29 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
50 */12 * * * sh ${KYLIN_HOME}/bin/system_cube_build.sh KYLIN_HIVE_METRICS_JOB_EXCEPTION_QA 3600000 12000</code></pre></div>
-<h2 id="cube-2">Automatically create System Cube</h2>
+<h2 id="span-id-cube-cubespan-2"><span id="自动创建系统 Cube">Automatically create System Cube</span></h2>
-<p>Since Kylin 2.6.0, the system-cube.sh script is provided; users can run it to create the system Cube automatically.</p>
+<p>Since Kylin 2.6.0, the system-cube.sh script is provided; users can run it to create the system Cube automatically.</p>
<ul>
<li>
- <p>Create the system Cube: <code class="highlighter-rouge">sh system-cube.sh setup</code></p>
+ <p>Create the system Cube: <code class="highlighter-rouge">sh system-cube.sh setup</code></p>
</li>
<li>
- <p>Build the system Cube: <code class="highlighter-rouge">sh bin/system-cube.sh build</code></p>
+ <p>Build the system Cube: <code class="highlighter-rouge">sh bin/system-cube.sh build</code></p>
</li>
<li>
- <p>Add a scheduled task for the system Cube: <code class="highlighter-rouge">bin/system.sh cron</code></p>
+ <p>Add a scheduled task for the system Cube: <code class="highlighter-rouge">bin/system.sh cron</code></p>
</li>
</ul>
-<h2 id="cube-">Details of System Cube</h2>
+<h2 id="span-id-cube--cube-span"><span id="系统 Cube 的细节">Details of System Cube</span></h2>
+
+<p>Five Hive tables record the metrics data of the Kylin system. The fact table of each system Cube corresponds to one Hive table, so there are five system Cubes in total.</p>
<h3 id="dimension">Common Dimension</h3>
-<p>For these Cubes, admins can query at four time granularities. From higher level to lower, they are as follows:</p>
+
+<p>For these system Cubes, admins can query at four time granularities; these dimensions apply to all five system Cubes. From higher level to lower, they are as follows:</p>
<table>
<tr>
@@ -340,12 +339,16 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
<td>the host of server for query engine</td>
</tr>
<tr>
+ <td>KUSER</td>
+ <td>the user who executes the query</td>
+ </tr>
+ <tr>
<td>PROJECT</td>
- <td></td>
+ <td>the project where the query executes</td>
</tr>
<tr>
<td>REALIZATION</td>
- <td>in Kylin，there are two OLAP realizations: Cube，or Hybrid of Cubes</td>
+ <td>the cube which the query hits. In Kylin, there are two OLAP realizations: Cube, or Hybrid of Cubes</td>
</tr>
<tr>
<td>REALIZATION_TYPE</td>
@@ -353,11 +356,11 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
</tr>
<tr>
<td>QUERY_TYPE</td>
- <td>users can query on different data sources，CACHE，OLAP，LOOKUP_TABLE，HIVE</td>
+ <td>users can query on different data sources: CACHE, OLAP, LOOKUP_TABLE, HIVE</td>
</tr>
<tr>
<td>EXCEPTION</td>
- <td>when doing query，exceptions may happen. It's for classifying different exception types</td>
+ <td>when doing query, exceptions may happen. It's for classifying different exception types</td>
</tr>
</table>
@@ -370,19 +373,19 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
<td></td>
</tr>
<tr>
- <td>MIN，MAX，SUM of QUERY_TIME_COST</td>
+ <td>MIN, MAX, SUM, PERCENTILE_APPROX of QUERY_TIME_COST</td>
<td>the time cost for the whole query</td>
</tr>
<tr>
- <td>MAX，SUM of CALCITE_SIZE_RETURN</td>
+ <td>MAX, SUM of CALCITE_SIZE_RETURN</td>
<td>the row count of the result Calcite returns</td>
</tr>
<tr>
- <td>MAX，SUM of STORAGE_SIZE_RETURN</td>
+ <td>MAX, SUM of STORAGE_SIZE_RETURN</td>
<td>the row count of the input to Calcite</td>
</tr>
<tr>
- <td>MAX，SUM of CALCITE_SIZE_AGGREGATE_FILTER</td>
+ <td>MAX, SUM of CALCITE_SIZE_AGGREGATE_FILTER</td>
<td>the row count of Calcite aggregates and filters</td>
</tr>
<tr>
@@ -404,11 +407,11 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
</tr>
<tr>
<td>PROJECT</td>
- <td></td>
+ <td>the project where the query executes</td>
</tr>
<tr>
<td>REALIZATION</td>
- <td></td>
+ <td>the cube which the query hits</td>
</tr>
<tr>
<td>RPC_SERVER</td>
@@ -416,7 +419,7 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
</tr>
<tr>
<td>EXCEPTION</td>
- <td>the exception of a rpc call. If no exception，"NULL" is used</td>
+ <td>the exception of a rpc call. If no exception, "NULL" is used</td>
</tr>
</table>
@@ -429,28 +432,28 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
<td></td>
</tr>
<tr>
- <td>MAX，SUM of CALL_TIME</td>
+ <td>MAX, SUM, PERCENTILE_APPROX of CALL_TIME</td>
<td>the time cost of a rpc all</td>
</tr>
<tr>
- <td>MAX，SUM of COUNT_SKIP</td>
- <td>based on fuzzy filters or else，a few rows will be skiped. This indicates the skipped row count</td>
+ <td>MAX, SUM of COUNT_SKIP</td>
+ <td>based on fuzzy filters or else, a few rows will be skiped. This indicates the skipped row count</td>
</tr>
<tr>
- <td>MAX，SUM of SIZE_SCAN</td>
+ <td>MAX, SUM of SIZE_SCAN</td>
<td>the row count actually scanned</td>
</tr>
<tr>
- <td>MAX，SUM of SIZE_RETURN</td>
+ <td>MAX, SUM of SIZE_RETURN</td>
<td>the row count actually returned</td>
</tr>
<tr>
- <td>MAX，SUM of SIZE_AGGREGATE</td>
+ <td>MAX, SUM of SIZE_AGGREGATE</td>
<td>the row count actually aggregated</td>
</tr>
<tr>
- <td>MAX，SUM of SIZE_AGGREGATE_FILTER</td>
- <td>the row count actually aggregated and filtered，= SIZE_SCAN - SIZE_RETURN</td>
+ <td>MAX, SUM of SIZE_AGGREGATE_FILTER</td>
+ <td>the row count actually aggregated and filtered, = SIZE_SCAN - SIZE_RETURN</td>
</tr>
</table>
@@ -466,6 +469,10 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
<td></td>
</tr>
<tr>
+ <td>SEGMENT_NAME</td>
+ <td></td>
+ </tr>
+ <tr>
<td>CUBOID_SOURCE</td>
<td>source cuboid parsed based on query and Cube design</td>
</tr>
@@ -482,7 +489,6 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
<td>whether a query on this Cube is successful or not</td>
</tr>
</table>
-
<table>
<tr>
<th colspan="2">Measure</th>
@@ -492,36 +498,40 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
<td></td>
</tr>
<tr>
- <td>MAX，SUM of STORAGE_CALL_COUNT</td>
+ <td>WEIGHT_PER_HIT</td>
+ <td></td>
+ </tr>
+ <tr>
+ <td>MAX, SUM of STORAGE_CALL_COUNT</td>
<td>the number of rpc calls for a query hit on this Cube</td>
</tr>
<tr>
- <td>MAX，SUM of STORAGE_CALL_TIME_SUM</td>
+ <td>MAX, SUM of STORAGE_CALL_TIME_SUM</td>
<td>sum of time cost for the rpc calls of a query</td>
</tr>
<tr>
- <td>MAX，SUM of STORAGE_CALL_TIME_MAX</td>
+ <td>MAX, SUM of STORAGE_CALL_TIME_MAX</td>
<td>max of time cost among the rpc calls of a query</td>
</tr>
<tr>
- <td>MAX，SUM of STORAGE_COUNT_SKIP</td>
+ <td>MAX, SUM of STORAGE_COUNT_SKIP</td>
<td>the sum of row count skipped for the related rpc calls</td>
</tr>
<tr>
- <td>MAX，SUM of STORAGE_SIZE_SCAN</td>
+ <td>MAX, SUM of STORAGE_COUNT_SCAN</td>
<td>the sum of row count scanned for the related rpc calls</td>
</tr>
<tr>
- <td>MAX，SUM of STORAGE_SIZE_RETURN</td>
+ <td>MAX, SUM of STORAGE_COUNT_RETURN</td>
<td>the sum of row count returned for the related rpc calls</td>
</tr>
<tr>
- <td>MAX，SUM of STORAGE_SIZE_AGGREGATE</td>
+ <td>MAX, SUM of STORAGE_COUNT_AGGREGATE</td>
<td>the sum of row count aggregated for the related rpc calls</td>
</tr>
<tr>
- <td>MAX，SUM of STORAGE_SIZE_AGGREGATE_FILTER</td>
- <td>the sum of row count aggregated and filtered for the related rpc calls，= STORAGE_SIZE_SCAN - STORAGE_SIZE_RETURN</td>
+ <td>MAX, SUM of STORAGE_COUNT_AGGREGATE_FILTER</td>
+ <td>the sum of row count aggregated and filtered for the related rpc calls, = STORAGE_SIZE_SCAN - STORAGE_SIZE_RETURN</td>
</tr>
</table>
@@ -538,20 +548,28 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
<th colspan="2">Dimension</th>
</tr>
<tr>
+ <td>HOST</td>
+ <td>the host of server for job engine</td>
+ </tr>
+ <tr>
+ <td>KUSER</td>
+ <td>the user who run the job</td>
+ </tr>
+ <tr>
<td>PROJECT</td>
- <td></td>
+ <td>the project where the job runs</td>
</tr>
<tr>
<td>CUBE_NAME</td>
- <td></td>
+ <td>the cube with which the job is related</td>
</tr>
<tr>
<td>JOB_TYPE</td>
- <td></td>
+ <td>build, merge or optimize</td>
</tr>
<tr>
<td>CUBING_TYPE</td>
- <td>in kylin，there are two cubing algorithms，Layered & Fast(InMemory)</td>
+ <td>in kylin, there are two cubing algorithms, Layered & Fast(InMemory)</td>
</tr>
</table>
@@ -564,25 +582,41 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
<td></td>
</tr>
<tr>
- <td>MIN，MAX，SUM of DURATION</td>
+ <td>MIN, MAX, SUM, PERCENTILE_APPROX of DURATION</td>
<td>the duration from a job start to finish</td>
</tr>
<tr>
- <td>MIN，MAX，SUM of TABLE_SIZE</td>
+ <td>MIN, MAX, SUM of TABLE_SIZE</td>
<td>the size of data source in bytes</td>
</tr>
<tr>
- <td>MIN，MAX，SUM of CUBE_SIZE</td>
+ <td>MIN, MAX, SUM of CUBE_SIZE</td>
<td>the size of created Cube segment in bytes</td>
</tr>
<tr>
- <td>MIN，MAX，SUM of PER_BYTES_TIME_COST</td>
+ <td>MIN, MAX, SUM of PER_BYTES_TIME_COST</td>
<td>= DURATION / TABLE_SIZE</td>
</tr>
<tr>
- <td>MIN，MAX，SUM of WAIT_RESOURCE_TIME</td>
+ <td>MIN, MAX, SUM of WAIT_RESOURCE_TIME</td>
<td>a job may includes serveral MR(map reduce) jobs. Those MR jobs may wait because of lack of Hadoop resources.</td>
</tr>
+ <tr>
+ <td>MAX, SUM of step_duration_distinct_columns</td>
+ <td></td>
+ </tr>
+ <tr>
+ <td>MAX, SUM of step_duration_dictionary</td>
+ <td></td>
+ </tr>
+ <tr>
+ <td>MAX, SUM of step_duration_inmem_cubing</td>
+ <td></td>
+ </tr>
+ <tr>
+ <td>MAX, SUM of step_duration_hfile_convert</td>
+ <td></td>
+ </tr>
</table>
<h3 id="metricsjobexception">METRICS_JOB_EXCEPTION</h3>
@@ -593,24 +627,32 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
<th colspan="2">Dimension</th>
</tr>
<tr>
+ <td>HOST</td>
+ <td>the host of server for job engine</td>
+ </tr>
+ <tr>
+ <td>KUSER</td>
+ <td>the user who run a job</td>
+ </tr>
+ <tr>
<td>PROJECT</td>
- <td></td>
+ <td>the project where the job runs</td>
</tr>
<tr>
<td>CUBE_NAME</td>
- <td></td>
+ <td>the cube with which the job is related</td>
</tr>
<tr>
<td>JOB_TYPE</td>
- <td></td>
+ <td>build, merge or optimize</td>
</tr>
<tr>
<td>CUBING_TYPE</td>
- <td></td>
+ <td>in kylin, there are two cubing algorithms, Layered & Fast(InMemory)</td>
</tr>
<tr>
<td>EXCEPTION</td>
- <td>when running a job，exceptions may happen. It's for classifying different exception types</td>
+ <td>when running a job, exceptions may happen. It's for classifying different exception types</td>
</tr>
</table>
Modified: kylin/site/docs/tutorial/setup_systemcube.html
URL: http://svn.apache.org/viewvc/kylin/site/docs/tutorial/setup_systemcube.html?rev=1883250&r1=1883249&r2=1883250&view=diff
==============================================================================
--- kylin/site/docs/tutorial/setup_systemcube.html (original)
+++ kylin/site/docs/tutorial/setup_systemcube.html Tue Nov 10 14:09:19 2020
@@ -8589,13 +8589,25 @@ var _hmt = _hmt || [];
<p>Available since Apache Kylin v2.3.0</p>
</blockquote>
-<h2 id="what-is-system-cube">What is System Cube</h2>
+<p>Main content of this section:</p>
+
+<ul>
+ <li><a href="#What is System Cube">What is System Cube</a></li>
+ <li><a href="#How to Set Up System Cube">How to Set Up System Cube</a></li>
+ <li><a href="#Automatically create System Cube">Automatically create System Cube</a></li>
+ <li><a href="#Details of System Cube">Details of System Cube</a></li>
+</ul>
+
+<h2 id="span-idwhat-is-system-cubewhat-is-system-cubespan"><span id="What is System Cube">What is System Cube</span></h2>
<p>For better supporting self-monitoring, a set of system Cubes are created under the system project, called “KYLIN_SYSTEM”. Currently, there are five Cubes. Three are for query metrics, “METRICS_QUERY”, “METRICS_QUERY_CUBE”, “METRICS_QUERY_RPC”. And the other two are for job metrics, “METRICS_JOB”, “METRICS_JOB_EXCEPTION”.</p>
-<h2 id="how-to-set-up-system-cube">How to Set Up System Cube</h2>
+<h2 id="span-idhow-to-set-up-system-cubehow-to-set-up-system-cubespan"><span id="How to Set Up System Cube">How to Set Up System Cube</span></h2>
+
+<p>In this section, we will introduce the method of manually enabling the system cube. If you want to automatically enable the system cube through shell scripts, please refer to <a href="#Automatically Create System Cube">Automatically Create System Cube</a>.</p>
+
+<h3 id="prepare">1. Prepare</h3>
-<h3 id="prepare">Prepare</h3>
<p>Create a configuration file SCSinkTools.json in KYLIN_HOME directory.</p>
<p>For example:</p>
@@ -8613,7 +8625,7 @@ var _hmt = _hmt || [];
</code></pre>
</div>
-<h3 id="generate-metadata">1. Generate Metadata</h3>
+<h3 id="generate-metadata">2. Generate Metadata</h3>
<p>Run the following command in KYLIN_HOME folder to generate related metadata:</p>
<div class="highlighter-rouge"><pre class="highlight"><code>./bin/kylin.sh org.apache.kylin.tool.metrics.systemcube.SCCreator \
@@ -8626,7 +8638,7 @@ var _hmt = _hmt || [];
<p><img src="/images/SystemCube/metadata.png" alt="metadata" /></p>
-<h3 id="set-up-datasource">2. Set Up Datasource</h3>
+<h3 id="set-up-datasource">3. Set Up Datasource</h3>
<p>Running the following command to create source hive tables:</p>
<div class="highlighter-rouge"><pre class="highlight"><code>hive -f <output_folder>/create_hive_tables_for_system_cubes.sql
@@ -8637,28 +8649,24 @@ var _hmt = _hmt || [];
<p><img src="/images/SystemCube/hive_table.png" alt="hive_table" /></p>
-<h3 id="upload-metadata-for-system-cubes">3. Upload Metadata for System Cubes</h3>
+<h3 id="upload-metadata-for-system-cubes">4. Upload Metadata for System Cubes</h3>
<p>Then we need to upload metadata to hbase by the following command:</p>
<div class="highlighter-rouge"><pre class="highlight"><code>./bin/metastore.sh restore <output_folder>
</code></pre>
</div>
-<h3 id="reload-metadata">4. Reload Metadata</h3>
+<h3 id="reload-metadata">5. Reload Metadata</h3>
<p>Finally, we need to reload metadata in Kylin web UI.</p>
<p>Then, a set of system Cubes will be created under the system project, called “KYLIN_SYSTEM”.</p>
-<h3 id="system-cube-build">5. System Cube build</h3>
+<h3 id="system-cube-build">6. System Cube build</h3>
<p>When the system Cube is created, we need to build the Cube regularly.</p>
-<ol>
- <li>
- <p>Create a shell script that builds the system Cube by calling org.apache.kylin.tool.job.CubeBuildingCLI</p>
+<p><strong>Step 1</strong>. Create a shell script that builds the system Cube by calling <code class="highlighter-rouge">org.apache.kylin.tool.job.CubeBuildingCLI</code></p>
- <p>For example:</p>
- </li>
-</ol>
+<p>For example:</p>
<div class="highlight"><pre><code class="language-groff" data-lang="groff">#!/bin/bash
@@ -8677,13 +8685,7 @@ ID="$END"
echo "building for ${CUBE}_${ID}" >> ${KYLIN_HOME}/logs/build_trace.log
sh ${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.job.CubeBuildingCLI --cube ${CUBE} --endTime ${END} > ${KYLIN_HOME}/logs/system_cube_${CUBE}_${END}.log 2>&1 &</code></pre></div>
-<ol>
- <li>
- <p>Then run this shell script regularly</p>
-
- <p>For example, add a cron job as follows:</p>
- </li>
-</ol>
+<p><strong>Step 2</strong>. Then run this shell script regularly. For example, add a cron job as follows:</p>
<div class="highlight"><pre><code class="language-groff" data-lang="groff">0 */2 * * * sh ${KYLIN_HOME}/bin/system_cube_build.sh KYLIN_HIVE_METRICS_QUERY_QA 3600000 1200000
@@ -8695,7 +8697,7 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
50 */12 * * * sh ${KYLIN_HOME}/bin/system_cube_build.sh KYLIN_HIVE_METRICS_JOB_EXCEPTION_QA 3600000 12000</code></pre></div>
-<h2 id="automatically-create-system-cube">Automatically create System Cube</h2>
+<h2 id="span-idautomatically-create-system-cubeautomatically-create-system-cubespan"><span id="Automatically create System Cube">Automatically create System Cube</span></h2>
<p>Kylin provides system-cube.sh from v2.6.0, users can automatically create system cube by executing this script.</p>
@@ -8711,7 +8713,7 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
</li>
</ul>
-<h2 id="details-of-system-cube">Details of System Cube</h2>
+<h2 id="span-iddetails-of-system-cubedetails-of-system-cubespan"><span id="Details of System Cube">Details of System Cube</span></h2>
<h3 id="common-dimension">Common Dimension</h3>
<p>For all of these Cube, admins can query at four time granularities. From higher level to lower, it’s as follows:</p>
@@ -8747,12 +8749,16 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
<td>the host of server for query engine</td>
</tr>
<tr>
+ <td>KUSER</td>
+ <td>the user who executes the query</td>
+ </tr>
+ <tr>
<td>PROJECT</td>
- <td></td>
+ <td>the project where the query executes</td>
</tr>
<tr>
<td>REALIZATION</td>
- <td>in Kylin, there are two OLAP realizations: Cube, or Hybrid of Cubes</td>
+ <td>the cube which the query hits. In Kylin, there are two OLAP realizations: Cube, or Hybrid of Cubes</td>
</tr>
<tr>
<td>REALIZATION_TYPE</td>
@@ -8760,7 +8766,7 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
</tr>
<tr>
<td>QUERY_TYPE</td>
- <td>users can query on different data sources, CACHE, OLAP, LOOKUP_TABLE, HIVE</td>
+ <td>users can query on different data sources: CACHE, OLAP, LOOKUP_TABLE, HIVE</td>
</tr>
<tr>
<td>EXCEPTION</td>
@@ -8777,7 +8783,7 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
<td></td>
</tr>
<tr>
- <td>MIN, MAX, SUM of QUERY_TIME_COST</td>
+ <td>MIN, MAX, SUM, PERCENTILE_APPROX of QUERY_TIME_COST</td>
<td>the time cost for the whole query</td>
</tr>
<tr>
@@ -8811,11 +8817,11 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
</tr>
<tr>
<td>PROJECT</td>
- <td></td>
+ <td>the project where the query executes</td>
</tr>
<tr>
<td>REALIZATION</td>
- <td></td>
+ <td>the cube which the query hits</td>
</tr>
<tr>
<td>RPC_SERVER</td>
@@ -8823,7 +8829,7 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
</tr>
<tr>
<td>EXCEPTION</td>
- <td>the exception of a rpc call. If no exception, "NULL" is used</td>
+ <td>the exception of a rpc call. If no exception，"NULL" is used</td>
</tr>
</table>
@@ -8836,12 +8842,12 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
<td></td>
</tr>
<tr>
- <td>MAX, SUM of CALL_TIME</td>
+ <td>MAX, SUM, PERCENTILE_APPROX of CALL_TIME</td>
<td>the time cost of a rpc all</td>
</tr>
<tr>
<td>MAX, SUM of COUNT_SKIP</td>
- <td>based on fuzzy filters or else, a few rows will be skipped. This indicates the skipped row count</td>
+ <td>based on fuzzy filters or else, a few rows will be skiped. This indicates the skipped row count</td>
</tr>
<tr>
<td>MAX, SUM of SIZE_SCAN</td>
@@ -8873,12 +8879,16 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
<td></td>
</tr>
<tr>
+ <td>SEGMENT_NAME</td>
+ <td></td>
+ </tr>
+ <tr>
<td>CUBOID_SOURCE</td>
<td>source cuboid parsed based on query and Cube design</td>
</tr>
<tr>
<td>CUBOID_TARGET</td>
- <td>target cuboid already pre-calculated and served for source cuboid</td>
+ <td>target cuboid already precalculated and served for source cuboid</td>
</tr>
<tr>
<td>IF_MATCH</td>
@@ -8889,7 +8899,6 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
<td>whether a query on this Cube is successful or not</td>
</tr>
</table>
-
<table>
<tr>
<th colspan="2">Measure</th>
@@ -8899,6 +8908,10 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
<td></td>
</tr>
<tr>
+ <td>WEIGHT_PER_HIT</td>
+ <td></td>
+ </tr>
+ <tr>
<td>MAX, SUM of STORAGE_CALL_COUNT</td>
<td>the number of rpc calls for a query hit on this Cube</td>
</tr>
@@ -8915,19 +8928,19 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
<td>the sum of row count skipped for the related rpc calls</td>
</tr>
<tr>
- <td>MAX, SUM of STORAGE_SIZE_SCAN</td>
+ <td>MAX, SUM of STORAGE_COUNT_SCAN</td>
<td>the sum of row count scanned for the related rpc calls</td>
</tr>
<tr>
- <td>MAX, SUM of STORAGE_SIZE_RETURN</td>
+ <td>MAX, SUM of STORAGE_COUNT_RETURN</td>
<td>the sum of row count returned for the related rpc calls</td>
</tr>
<tr>
- <td>MAX, SUM of STORAGE_SIZE_AGGREGATE</td>
+ <td>MAX, SUM of STORAGE_COUNT_AGGREGATE</td>
<td>the sum of row count aggregated for the related rpc calls</td>
</tr>
<tr>
- <td>MAX, SUM of STORAGE_SIZE_AGGREGATE_FILTER</td>
+ <td>MAX, SUM of STORAGE_COUNT_AGGREGATE_FILTER</td>
<td>the sum of row count aggregated and filtered for the related rpc calls, = STORAGE_SIZE_SCAN - STORAGE_SIZE_RETURN</td>
</tr>
</table>
@@ -8945,16 +8958,24 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
<th colspan="2">Dimension</th>
</tr>
<tr>
+ <td>HOST</td>
+ <td>the host of server for job engine</td>
+ </tr>
+ <tr>
+ <td>KUSER</td>
+ <td>the user who run the job</td>
+ </tr>
+ <tr>
<td>PROJECT</td>
- <td></td>
+ <td>the project where the job runs</td>
</tr>
<tr>
<td>CUBE_NAME</td>
- <td></td>
+ <td>the cube with which the job is related</td>
</tr>
<tr>
<td>JOB_TYPE</td>
- <td></td>
+ <td>build, merge or optimize</td>
</tr>
<tr>
<td>CUBING_TYPE</td>
@@ -8971,7 +8992,7 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
<td></td>
</tr>
<tr>
- <td>MIN, MAX, SUM of DURATION</td>
+ <td>MIN, MAX, SUM, PERCENTILE_APPROX of DURATION</td>
<td>the duration from a job start to finish</td>
</tr>
<tr>
@@ -8988,7 +9009,23 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
</tr>
<tr>
<td>MIN, MAX, SUM of WAIT_RESOURCE_TIME</td>
- <td>a job may includes several MR(map reduce) jobs. Those MR jobs may wait because of lack of Hadoop resources.</td>
+ <td>a job may includes serveral MR(map reduce) jobs. Those MR jobs may wait because of lack of Hadoop resources.</td>
+ </tr>
+ <tr>
+ <td>MAX, SUM of step_duration_distinct_columns</td>
+ <td></td>
+ </tr>
+ <tr>
+ <td>MAX, SUM of step_duration_dictionary</td>
+ <td></td>
+ </tr>
+ <tr>
+ <td>MAX, SUM of step_duration_inmem_cubing</td>
+ <td></td>
+ </tr>
+ <tr>
+ <td>MAX, SUM of step_duration_hfile_convert</td>
+ <td></td>
</tr>
</table>
@@ -9000,20 +9037,28 @@ sh ${KYLIN_HOME}/bin/kylin.sh org.apache
<th colspan="2">Dimension</th>
</tr>
<tr>
+ <td>HOST</td>
+ <td>the host of server for job engine</td>
+ </tr>
+ <tr>
+ <td>KUSER</td>
+ <td>the user who run a job</td>
+ </tr>
+ <tr>
<td>PROJECT</td>
- <td></td>
+ <td>the project where the job runs</td>
</tr>
<tr>
<td>CUBE_NAME</td>
- <td></td>
+ <td>the cube with which the job is related</td>
</tr>
<tr>
<td>JOB_TYPE</td>
- <td></td>
+ <td>build, merge or optimize</td>
</tr>
<tr>
<td>CUBING_TYPE</td>
- <td></td>
+ <td>in kylin, there are two cubing algorithms, Layered & Fast(InMemory)</td>
</tr>
<tr>
<td>EXCEPTION</td>
Modified: kylin/site/feed.xml
URL: http://svn.apache.org/viewvc/kylin/site/feed.xml?rev=1883250&r1=1883249&r2=1883250&view=diff
==============================================================================
--- kylin/site/feed.xml (original)
+++ kylin/site/feed.xml Tue Nov 10 14:09:19 2020
@@ -19,8 +19,8 @@
<description>Apache Kylin Home</description>
<link>http://kylin.apache.org/</link>
<atom:link href="http://kylin.apache.org/feed.xml" rel="self" type="application/rss+xml"/>
- <pubDate>Sun, 18 Oct 2020 05:37:07 -0700</pubDate>
- <lastBuildDate>Sun, 18 Oct 2020 05:37:07 -0700</lastBuildDate>
+ <pubDate>Tue, 10 Nov 2020 05:59:14 -0800</pubDate>
+ <lastBuildDate>Tue, 10 Nov 2020 05:59:14 -0800</lastBuildDate>
<generator>Jekyll v2.5.3</generator>
<item>