Posted to commits@kylin.apache.org by li...@apache.org on 2022/04/21 08:37:13 UTC
svn commit: r1900099 [3/4] - in /kylin/site: ./ cn/blog/ cn_blog/2022/04/ cn_blog/2022/04/20/ cn_blog/2022/04/20/kylin4-on-cloud-part1/ cn_blog/2022/04/20/kylin4-on-cloud-part2/ images/blog/kylin4_on_cloud/
Modified: kylin/site/feed.xml
URL: http://svn.apache.org/viewvc/kylin/site/feed.xml?rev=1900099&r1=1900098&r2=1900099&view=diff
==============================================================================
--- kylin/site/feed.xml (original)
+++ kylin/site/feed.xml Thu Apr 21 08:37:12 2022
@@ -19,11 +19,648 @@
<description>Apache Kylin Home</description>
<link>http://kylin.apache.org/</link>
<atom:link href="http://kylin.apache.org/feed.xml" rel="self" type="application/rss+xml"/>
- <pubDate>Thu, 31 Mar 2022 06:59:26 -0700</pubDate>
- <lastBuildDate>Thu, 31 Mar 2022 06:59:26 -0700</lastBuildDate>
+ <pubDate>Thu, 21 Apr 2022 01:27:57 -0700</pubDate>
+ <lastBuildDate>Thu, 21 Apr 2022 01:27:57 -0700</lastBuildDate>
<generator>Jekyll v2.5.3</generator>
<item>
+ <title>Kylin on Cloud: Build a Data Analysis Platform on the Cloud in Two Hours (Part 2)</title>
+ <description><p>This is the second part of <code class="highlighter-rouge">Kylin on Cloud: Build a Data Analysis Platform on the Cloud in Two Hours</code>. For the first part, see <a href="../kylin4-on-cloud-part1/">Kylin on Cloud: Build a Data Analysis Platform on the Cloud in Two Hours (Part 1)</a>.</p>
+
+<h3 id="kylin-">Kylin query cluster</h3>
+
+<h4 id="kylin--1">Start the Kylin query cluster</h4>
+
+<p>1. Starting from the kylin_configs.yaml that was used to launch the build cluster, turn on the MDX switch:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>ENABLE_MDX: &amp;ENABLE_MDX 'true'
+</code></pre>
+</div>
+
+<p>2. Then run the deploy command to start the cluster:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>python deploy.py --type deploy --mode query
+</code></pre>
+</div>
+
+<h4 id="kylin--2">Try out Kylin's query speed</h4>
+
+<p>1. Once the query cluster has started, first run <code class="highlighter-rouge">python deploy.py --type list</code> to list the information of all nodes, then open http://${kylin_node_public_ip}:7070/kylin in a browser to check the Kylin UI:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/14_kylin_web_ui.png" alt="" /></p>
+
+<p>2. On the Insight page, run the same SQL that was run earlier in spark-sql:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>select TAXI_TRIP_RECORDS_VIEW.PICKUP_DATE, NEWYORK_ZONE.BOROUGH, count(*), sum(TAXI_TRIP_RECORDS_VIEW.TRIP_TIME_HOUR), sum(TAXI_TRIP_RECORDS_VIEW.TOTAL_AMOUNT)
+from TAXI_TRIP_RECORDS_VIEW
+left join NEWYORK_ZONE
+on TAXI_TRIP_RECORDS_VIEW.PULOCATIONID = NEWYORK_ZONE.LOCATIONID
+group by TAXI_TRIP_RECORDS_VIEW.PICKUP_DATE, NEWYORK_ZONE.BOROUGH;
+</code></pre>
+</div>
+
+<p><img src="/images/blog/kylin4_on_cloud/15_query_in_kylin.png" alt="" /></p>
+
+<p>As shown, when the query hits the cube, that is, when the result comes directly from the pre-computed data, the result is returned in only about 4 seconds, which saves a great deal of query time.</p>
+
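+<p>For callers that want to run the same query without the web UI, a request to Kylin's REST query endpoint can be prepared roughly as follows. This is a minimal sketch, not part of the original tutorial: the helper name <code class="highlighter-rouge">build_kylin_query</code> is hypothetical, the project name <code class="highlighter-rouge">covid_trip_project</code> and the <code class="highlighter-rouge">ADMIN/KYLIN</code> credentials are the ones this tutorial uses elsewhere and are assumed to apply here, and the request is only built, not sent.</p>
+
```python
import base64
import json
import urllib.request


def build_kylin_query(host, sql, project, user="ADMIN", password="KYLIN", port=7070):
    """Build (but do not send) a POST request for Kylin's SQL query endpoint.

    Assumption: the cluster exposes POST /kylin/api/query with HTTP basic auth.
    """
    # The request body carries the SQL text and the Kylin project to run it in.
    payload = json.dumps({"sql": sql, "project": project}).encode("utf-8")
    # HTTP basic auth: base64 of "user:password".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"http://{host}:{port}/kylin/api/query",
        data=payload,
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical host; replace with the public IP printed by `deploy.py --type list`.
request = build_kylin_query(
    "203.0.113.10",
    "select count(*) from TAXI_TRIP_RECORDS_VIEW",
    "covid_trip_project",
)
print(request.full_url)
```
+
+<p>Passing the returned object to <code class="highlighter-rouge">urllib.request.urlopen</code> would submit the query; timing that call from a script is one way to repeat the spark-sql comparison outside the browser.</p>
+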
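+<h3 id="section">Pre-computation reduces query cost</h3>
+
+<p>In the test comparing the query speed of native Spark SQL and Kylin, we used the New York City taxi order dataset, whose fact table has more than 200 million rows. The comparison shows that in big data analysis scenarios with hundreds of millions of rows, Kylin significantly improves query efficiency: a single build accelerates thousands or tens of thousands of business queries, which greatly reduces query cost.</p>
+
+<h3 id="section-1">Configure the semantic layer</h3>
+
+<h4 id="mdx-for-kylin--dataset">Import a Dataset into MDX for Kylin</h4>
+
+<p>In <code class="highlighter-rouge">MDX for Kylin</code>, you can create a <code class="highlighter-rouge">Dataset</code> based on the Cubes in the connected Kylin, define the relationships between Cubes, and create business metrics. To make the walkthrough easier, you can download a Dataset file from S3 and import it into <code class="highlighter-rouge">MDX for Kylin</code> directly:</p>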
+
+<p>1. Download the Dataset file from S3 to your local machine</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>wget https://s3.cn-north-1.amazonaws.com.cn/public.kyligence.io/kylin/kylin_demo/covid_trip_project_covid_trip_dataset.json
+</code></pre>
+</div>
+
+<p>2. Open the <code class="highlighter-rouge">MDX for Kylin</code> web UI</p>
+
+<p>Open <code class="highlighter-rouge">http://${kylin_node_public_ip}:7080</code> in a browser to reach the <code class="highlighter-rouge">MDX for Kylin</code> page, and log in with the username and password <code class="highlighter-rouge">ADMIN/KYLIN</code>:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/16_mdx_web_ui.png" alt="" /></p>
+
+<p>3. Confirm the Kylin connection</p>
+
+<p><code class="highlighter-rouge">MDX for Kylin</code> is already configured with the information of the Kylin node to connect to. On first login you need to enter the username and password of the Kylin node, which is <code class="highlighter-rouge">ADMIN/KYLIN</code>:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/17_connect_to_kylin.png" alt="" /></p>
+
+<p><img src="/images/blog/kylin4_on_cloud/18_exit_management.png" alt="" /></p>
+
+<p>4. Import the Dataset</p>
+
+<p>After the connection to Kylin succeeds, click the icon in the upper right corner to exit the management page:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/19_kylin_running.png" alt="" /></p>
+
+<p>Switch to the <code class="highlighter-rouge">covid_trip_project</code> project and click <code class="highlighter-rouge">Import Dataset</code> on the Dataset page:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/20_import_dataset.png" alt="" /></p>
+
+<p>Select the file <code class="highlighter-rouge">covid_trip_project_covid_trip_dataset.json</code> that was just downloaded from S3 and import it.</p>
+
+<p><code class="highlighter-rouge">covid_trip_dataset</code> defines the year-to-date, month-to-date, year-over-year growth, and month-over-month growth of each atomic metric; special dimensions and measures such as the time hierarchy and the region hierarchy; and business metrics such as the COVID-19 case fatality rate and the average taxi speed. To create a Dataset manually, see <a href="https://cwiki.apache.org/confluence/display/KYLIN/Create+Dataset+in+MDX+for+Kylin">Create dataset in MDX for Kylin</a>. For the MDX for Kylin manual, see <a href="https://kyligence.github.io/mdx-kylin/">MDX for Kylin user manual</a>.</p>
+
+<h2 id="section-2">Data analysis</h2>
+
+<h3 id="tableau-">Data analysis with Tableau</h3>
+
+<p>We use Tableau on a local Windows machine as an example and connect it to MDX for Kylin for data analysis.</p>
+
+<p>1. Select Tableau's built-in <code class="highlighter-rouge">Microsoft Analysis Service</code> connector to connect to <code class="highlighter-rouge">MDX for Kylin</code> (the <code class="highlighter-rouge">Microsoft Analysis Services</code> driver must be installed beforehand; it can be downloaded from the Tableau website: <a href="https://www.tableau.com/support/drivers?_ga=2.104833284.564621013.1647953885-1839825424.1608198275">Microsoft Analysis Services driver download</a>)</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/21_tableau_connect.png" alt="" /></p>
+
+<p>2. In the settings dialog that pops up, enter the connection address of <code class="highlighter-rouge">MDX for Kylin</code> along with the username and password. The connection address is <code class="highlighter-rouge">http://${kylin_node_public_ip}:7080/mdx/xmla/covid_trip_project</code>:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/22_tableau_server.png" alt="" /></p>
+
+<p>3. Select <code class="highlighter-rouge">covid_trip_dataset</code> as the dataset:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/23_tableau_dataset.png" alt="" /></p>
+
+<p>4. You can now analyze data in a worksheet. Because the business metrics have already been defined once in <code class="highlighter-rouge">MDX for Kylin</code>, you can drag the predefined business metrics straight onto the worksheet when building a report in Tableau.</p>
+
+<p>5. First, analyze the pandemic data by drawing a country-level pandemic map from two metrics, confirmed cases and case fatality rate. Put <code class="highlighter-rouge">COUNTRY_SHORT_NAME</code> from the region hierarchy on the worksheet's columns, put the predefined sum of new confirmed cases <code class="highlighter-rouge">SUM_NEW_POSITIVE_CASES</code> and the case fatality rate metric <code class="highlighter-rouge">CFR_COVID19</code> on the worksheet's rows, then choose to display the result as a map:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/24_tableau_covid19_map.png" alt="" /></p>
+
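+<p>Here, the size of each symbol represents the level of the death count, and the shade of its color represents the level of the case fatality rate. The map shows that the United States and India have relatively many confirmed cases, yet their case fatality rates do not differ noticeably from those of most other countries, while countries with few confirmed cases, such as Peru, Vanuatu, and Mexico, have persistently high case fatality rates. This observation may be a starting point for digging into deeper causes.</p>
+
+<p>Because a region hierarchy has been set up, the country-level map can be drilled down to the province level to view the situation in each region within a country:</p>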
+
+<p><img src="/images/blog/kylin4_on_cloud/25_tableau_province.png" alt="" /></p>
+
+<p>Zoom in on the province-level map to view the situation in the United States:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/26_tableau_us_covid19.png" alt="" /></p>
+
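+<p>The case fatality rate differs little between US states and is around 0.01 everywhere. In terms of confirmed cases, California, Texas, Florida, and New York City are clearly higher; these regions are economically developed and densely populated, and their confirmed COVID-19 cases rise accordingly. Next, we use the New York City taxi dataset together with the course of the pandemic to analyze how taxi travel changed during the pandemic.</p>
+
+<p>6. For the New York City taxi order dataset, we start from the following two business questions:</p>
+
+<ul>
+ <li>Analyze the travel characteristics of each New York City borough by comparing travel metrics such as order count and travel speed</li>
+</ul>
+
+<p>Drag the field <code class="highlighter-rouge">BOROUGH</code> of the lookup table <code class="highlighter-rouge">PICKUP_NEWYORK_ZONE</code> to the worksheet's columns, drag the metrics <code class="highlighter-rouge">ORDER_COUNT</code> and <code class="highlighter-rouge">trip_mean_speed</code> to the worksheet's rows, and display the result as a symbol map, where the shade of the color represents average speed and the size represents order count. Taxi orders departing from Manhattan outnumber those of all other boroughs combined, but have the lowest average speed; Queens comes second; Staten Island has the least taxi activity. Taxis departing from the Bronx average as much as 82 mph, several times the average speed of the other boroughs. These travel characteristics reflect the population density and economic development of each borough.</p>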
+
+<p><img src="/images/blog/kylin4_on_cloud/27_tableau_taxi_1.png" alt="" /></p>
+
+<p>Then replace the field <code class="highlighter-rouge">BOROUGH</code> of the lookup table <code class="highlighter-rouge">PICKUP_NEWYORK_ZONE</code> with <code class="highlighter-rouge">BOROUGH</code> from <code class="highlighter-rouge">DROPOFF_NEWYORK_ZONE</code> to count taxi orders and average speed by drop-off borough:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/27_tableau_taxi_2.png" alt="" /></p>
+
+<p>Compared with the pick-up data, the drop-off data for Brooklyn, Queens, and the Bronx differ noticeably. Proportionally, far more taxi orders arrive in Brooklyn and the Bronx than depart from them, while clearly fewer orders arrive in Queens than depart from it.</p>
+
+<ul>
+ <li>How the taxi travel habits of New York City residents changed before and after the pandemic: do they now prefer short trips or long trips?</li>
+</ul>
+
+<p>Analyze the change in travel habits through the average trip distance: drag the dimension <code class="highlighter-rouge">MONTH_START</code> to the worksheet's rows and the metric <code class="highlighter-rouge">trip_mean_distance</code> to the worksheet's columns:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/28_tableau_taxi_3.png" alt="" /></p>
+
+<p>The bar chart shows that travel habits changed markedly with the pandemic. Starting in March 2020, the average trip distance rose noticeably, in some months by several times, and it became very unstable from month to month after the pandemic began. Based on this, we can combine the monthly pandemic data for a joint analysis: drag <code class="highlighter-rouge">SUM_NEW_POSITIVE_CASES</code> and <code class="highlighter-rouge">MTD_ORDER_COUNT</code> to the worksheet's rows and add the filter condition <code class="highlighter-rouge">PROVINCE_STATE_NAME=New York</code> in the filter:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/29_tableau_taxi_4.png" alt="" /></p>
+
+<p>An interesting pattern emerges. Right after the outbreak, taxi orders dropped sharply while the average trip distance increased, which suggests that people cut many unnecessary short trips or made them with safer means of transport than taxis. Comparing the three curves shows a strong correlation between pandemic severity and travel behavior: when the pandemic was severe, taxi orders fell and the average trip distance rose; when it eased, taxi orders increased and the average trip distance fell back.</p>
+
+<h3 id="excel-">Data analysis with Excel</h3>
+
+<p>With the help of <code class="highlighter-rouge">MDX for Kylin</code>, Excel can also connect to Kylin for big data analysis. In this test, we use Excel on a local Windows machine connected to MDX for Kylin.</p>
+
+<p>1. Open Excel and select Data -&gt; Get Data -&gt; From Database -&gt; From <code class="highlighter-rouge">Analysis Services</code>:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/30_excel_connect.png" alt="" /></p>
+
+<p>2. In the Data Connection Wizard, enter the MDX for Kylin connection information. The server name is <code class="highlighter-rouge">http://${kylin_node_public_ip}:7080/mdx/xmla/covid_trip_project</code>:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/31_excel_server.png" alt="" /></p>
+
+<p><img src="/images/blog/kylin4_on_cloud/32_tableau_dataset.png" alt="" /></p>
+
+<p>3. Then create a PivotTable for the current data connection. The PivotTable fields show that the data Excel obtains from the dataset in <code class="highlighter-rouge">MDX for Kylin</code> is fully consistent with what Tableau sees. Whether analysts work in Tableau or in Excel, they work on the same data model, dimensions, and business metrics, which gives unified semantics.</p>
+
+<p>4. In Tableau we drew a pandemic map and ran a trend analysis on the two datasets <code class="highlighter-rouge">covid19</code> and <code class="highlighter-rouge">newyork_trip_data</code>. In Excel, for the same datasets and scenarios, we can look at more of the detailed data.</p>
+
+<ul>
+ <li>For the pandemic data, select for the PivotTable the region hierarchy field <code class="highlighter-rouge">REGION_HIERARCHY</code>, together with the predefined sum of new cases <code class="highlighter-rouge">SUM_NEW_POSITIVE_CASES</code> and the case fatality rate metric <code class="highlighter-rouge">CFR_COVID19</code>:</li>
+</ul>
+</ul>
+
+<p><img src="/images/blog/kylin4_on_cloud/33_tableau_covid19_1.png" alt="" /></p>
+
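+<p>Because the top level of the region hierarchy is <code class="highlighter-rouge">CONTINENT_NAME</code>, confirmed cases and case fatality rate are shown at the continent level by default. Europe has the most confirmed cases and Africa has the highest case fatality rate. In this PivotTable it is easy to drill down to lower region levels for finer-grained detail, for example to view the data for Asian countries sorted by confirmed cases in descending order:</p>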
+
+<p><img src="/images/blog/kylin4_on_cloud/34_excel_covid20_2.png" alt="" /></p>
+
+<p>The data shows that the three Asian countries with the most confirmed cases are India, Turkey, and Iran.</p>
+
+<ul>
+ <li>For the New York City taxi order data, to answer the question "Did the pandemic have a clear impact on the number of taxi orders?", first look at the year-to-date total and growth rate of taxi orders by year: create a new PivotTable and select the time hierarchy dimension <code class="highlighter-rouge">TIME_HIERARCHY</code>, <code class="highlighter-rouge">YOY_ORDER_COUNT</code>, and <code class="highlighter-rouge">YTD_ORDER_COUNT</code>:</li>
+</ul>
+</ul>
+
+<p><img src="/images/blog/kylin4_on_cloud/35_excel_taxi_1.png" alt="" /></p>
+
+<p>The outbreak in 2020 caused a sharp drop in taxi orders: the 2020 growth rate was -0.7079, a loss of about 70% of orders. The 2021 growth rate was still negative, but the decline was much slower than right after the outbreak in 2020.</p>
+
+<p>Expanding the time hierarchy shows the cumulative order count at the quarter, month, and even day level. Adding <code class="highlighter-rouge">MOM_ORDER_COUNT</code> and <code class="highlighter-rouge">ORDER_COUNT</code> to the PivotTable also shows the monthly order growth rate and the order count at each level of the time hierarchy:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/36_excel_taxi_2.png" alt="" /></p>
+
+<p>In March 2020 the order growth rate was -0.52, already a clear decline. In April it fell further to -0.92, a loss of about 90% of orders. Orders then slowly recovered, but always remained far below the pre-pandemic level.</p>
+
+<h3 id="api--kylin-">Integrate Kylin into a data analysis platform through the API</h3>
+
+<p>Besides commercial BI tools such as Excel and Tableau, many companies build their own data analysis platforms. On such in-house platforms, users can still call APIs to use Kylin + MDX for Kylin as the foundation of the platform and keep metric definitions unified. In this demo, we show how to send queries to MDX for Kylin through Olap4j and get the analysis results. Olap4j is a Java library, similar to a JDBC driver, that can access any OLAP service.</p>
+
+<p>We provide a simple demo that users can run directly for testing. The source code is at <a href="https://github.com/apache/kylin/tree/mdx-query-demo">mdx query demo</a>:</p>
+
+<p>1. Download the jar packages for the demo:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>wget https://s3.cn-north-1.amazonaws.com.cn/public.kyligence.io/kylin/kylin_demo/mdx_query_demo.tgz
+tar -xvf mdx_query_demo.tgz
+cd mdx_query_demo
+</code></pre>
+</div>
+
+<p>2. Run the demo</p>
+
+<p>Before running the demo, make sure Java 8 is installed in the runtime environment:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/37_jdk_8.png" alt="" /></p>
+
+<p>The demo takes two arguments: the IP of the MDX node and the MDX query to run. The port defaults to 7080. The MDX node IP here is the public IP of the Kylin node:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>java -cp olap4j-xmla-1.2.0.jar:olap4j-1.2.0.jar:xercesImpl-2.9.1.jar:mdx-query-demo-0.0.1.jar io.kyligence.mdxquerydemo.MdxQueryDemoApplication "${kylin_node_public_ip}" "${mdx_query}"
+</code></pre>
+</div>
+
+<p>If no MDX statement is passed on the command line, the demo runs the following MDX statement by default, which counts the orders and the average trip distance of each borough by pick-up borough:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>SELECT
+{[Measures].[ORDER_COUNT],
+[Measures].[trip_mean_distance]}
+DIMENSION PROPERTIES [MEMBER_UNIQUE_NAME],[MEMBER_ORDINAL],[MEMBER_CAPTION] ON COLUMNS,
+NON EMPTY [PICKUP_NEWYORK_ZONE].[BOROUGH].[BOROUGH].AllMembers
+DIMENSION PROPERTIES [MEMBER_UNIQUE_NAME],[MEMBER_ORDINAL],[MEMBER_CAPTION] ON ROWS
+FROM [covid_trip_dataset]
+</code></pre>
+</div>
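+
+<p>The <code class="highlighter-rouge">olap4j-xmla</code> jar on the demo's classpath indicates that the query travels over XMLA, a SOAP-based protocol. For platforms not written in Java, the request body can be sketched as follows. This is a minimal sketch under stated assumptions, not the demo's code: the helper name <code class="highlighter-rouge">build_xmla_execute</code> is hypothetical, the envelope follows the generic XMLA <code class="highlighter-rouge">Execute</code> layout, and which <code class="highlighter-rouge">Catalog</code> value MDX for Kylin expects is an assumption.</p>
+
```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
XMLA_NS = "urn:schemas-microsoft-com:xml-analysis"


def build_xmla_execute(mdx, catalog):
    """Return a SOAP envelope that wraps an MDX statement in an XMLA Execute call.

    Assumption: the server accepts the generic XMLA Execute layout used here.
    """
    return (
        f'<soap:Envelope xmlns:soap="{SOAP_NS}">'
        "<soap:Body>"
        f'<Execute xmlns="{XMLA_NS}">'
        # escape() protects characters such as & and < inside the MDX text.
        f"<Command><Statement>{escape(mdx)}</Statement></Command>"
        "<Properties><PropertyList>"
        f"<Catalog>{escape(catalog)}</Catalog>"
        "<Format>Multidimensional</Format>"
        "</PropertyList></Properties>"
        "</Execute>"
        "</soap:Body>"
        "</soap:Envelope>"
    )


# A shortened form of the default query; the catalog name is an assumption.
body = build_xmla_execute(
    "SELECT {[Measures].[ORDER_COUNT]} ON COLUMNS FROM [covid_trip_dataset]",
    "covid_trip_project",
)
# Parsing the result confirms the envelope is well-formed XML.
print(ET.fromstring(body).tag)
```
+
+<p>Such a body would be sent with an HTTP POST to the XMLA address used above for Tableau and Excel, <code class="highlighter-rouge">http://${kylin_node_public_ip}:7080/mdx/xmla/covid_trip_project</code>, with the same username and password; the exact headers the server requires are not covered by this tutorial.</p>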
+
+<p>In this demo we run the default query directly. After it succeeds, the lightly processed query result is printed to the command line:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/38_demo_result.png" alt="" /></p>
+
+<p>The demo returns the requested data. The result shows that Manhattan has the most departing taxi orders, with an average trip distance of only about 2.4 miles, which matches Manhattan's small area and dense population. Orders departing from the Bronx average 33 miles, clearly more than any other borough, possibly because the Bronx is relatively remote.</p>
+
+<p>As with Tableau and Excel, the MDX statements written in the demo can use the metrics defined in Kylin and MDX for Kylin directly. On an in-house data analysis platform, users can analyze the returned results further and generate reports as needed.</p>
+
+<h3 id="section-3">A unified definition of metrics</h3>
+
+<p>Connecting to Kylin + MDX for Kylin in the three different ways above shows that, with Kylin as the multidimensional database and MDX for Kylin as the semantic layer, users work with the same data model and business metrics no matter which analysis method they use, which achieves a unified definition of metrics.</p>
+
+<h2 id="section-4">Destroy the clusters</h2>
+
+<h3 id="section-5">Destroy the query cluster</h3>
+
+<p>Once the analysis above is finished, run the cluster destroy command to destroy the query cluster. To also destroy the RDS metadata database of Kylin and MDX for Kylin, the monitoring node, and the VPC, run the following destroy command:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>python deploy.py --type destroy-all
+</code></pre>
+</div>
+
+<h3 id="aws-">Check AWS resources</h3>
+
+<p>After all cluster resources are destroyed, <code class="highlighter-rouge">CloudFormation</code> no longer contains any Stack related to the deployment tool. To also delete the files and data on S3 that are related to the deployment tool, manually delete the following folders under the S3 working directory:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/39_check_s3_demo.png" alt="" /></p>
+
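+<h2 id="section-6">Summary</h2>
+
+<p>With this tutorial and nothing more than an AWS account, users can use the cloud deployment tool, Kylin's pre-computation and multidimensional models, and the basic metric management of MDX for Kylin to quickly and conveniently build a cloud big data analysis platform based on Kylin + MDX for Kylin, connect various BI tools for technical validation, reduce costs, improve efficiency, and unify metric definitions.</p>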
+
+</description>
+ <pubDate>Wed, 20 Apr 2022 04:00:00 -0700</pubDate>
+ <link>http://kylin.apache.org/cn_blog/2022/04/20/kylin4-on-cloud-part2/</link>
+ <guid isPermaLink="true">http://kylin.apache.org/cn_blog/2022/04/20/kylin4-on-cloud-part2/</guid>
+
+
+ <category>cn_blog</category>
+
+ </item>
+
+ <item>
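+ <title>Kylin on Cloud: Build a Data Analysis Platform on the Cloud in Two Hours (Part 1)</title>
+ <description><h2 id="section">Background</h2>
+
+<p>Apache Kylin is a multidimensional database based on pre-computation and multidimensional models, and it supports a standard SQL query interface. In Kylin, users define table relationships by creating a Model, define dimensions and measures by creating a Cube, and then build the Cube to pre-compute the data that needs to be aggregated. The pre-computed data is stored, so that at query time Kylin can either aggregate further on top of it or return the result directly, which improves query efficiency.</p>
+
+<p>With the release and continued updates of the new Kylin 4.0 architecture, Kylin can be deployed as a cluster in cloud environments without Hadoop. To make it easy to deploy Kylin on the cloud, the Kylin community recently developed a cloud deployment tool: with a single command, users get a complete Kylin cluster and a fast, efficient analysis experience. In January 2022, the Kylin community released MDX for Kylin to strengthen Kylin's business expressiveness as a multidimensional database. MDX for Kylin provides an MDX query interface and lets users create business metrics on top of the multidimensional models already defined in Kylin, turning Kylin's data models into business-friendly language, giving the data business value, and making it easy to connect BI tools such as Excel and Tableau for multidimensional analysis.</p>
+
+<p>With this technical support, users can not only deploy a Kylin cluster on the cloud quickly and conveniently, create multidimensional models, and experience fast responses from pre-computed queries, but also define and manage business metrics with MDX for Kylin, lifting the data warehouse (DW) technical layer up to a business semantic layer.</p>
+
+<p>Users can connect BI tools directly on top of Kylin + MDX for Kylin for multidimensional analysis, or use the combination as the foundation for complex applications such as a metrics platform. Compared with building a metrics platform directly on compute engines such as Spark and Hive, which perform joins and aggregations at query time, Kylin relies on multidimensional models and pre-computation, together with the semantic layer of MDX for Kylin, to provide the key capabilities a metrics platform needs: computation over massive data, very fast query response, a unified multidimensional model, connectivity to multiple BI tools, and basic business metric management.</p>
+
+<p>The rest of this article walks the reader, from a data engineer's point of view, through building a Kylin-based data analysis platform on the cloud (Kylin on Cloud), getting high-performance, low-cost queries over hundreds of millions of rows, managing business metrics with MDX for Kylin, and connecting BI tools directly to produce reports quickly.</p>
+
+<p>Every step in this tutorial is explained in detail, with screenshots and checkpoints to help newcomers get started. Readers only need an AWS account; the whole process is expected to take about 2 hours and cost around ¥100.</p>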
+
+<p><img src="/images/blog/kylin4_on_cloud/0_deploy_kylin.png" alt="" /></p>
+
+<h2 id="section-1">Business scenario</h2>
+
+<p>Since early 2020, COVID-19 has spread rapidly around the world and has greatly affected people's daily lives, especially their travel habits. This analysis combines COVID-19 pandemic data with New York taxi trip data since 2018. By analyzing pandemic metrics and various travel metrics, such as confirmed cases, case fatality rate, taxi order count, and average trip distance, it looks for trends in how the pandemic affected the New York City taxi industry, in order to support decision-making.</p>
+
+<h3 id="section-2">Business questions</h3>
+
+<ul>
+ <li>Joint multi-metric analysis of pandemic severity across countries and regions</li>
+ <li>Comparison of travel metrics across New York City boroughs, such as order count and trip distance</li>
+ <li>Whether the pandemic had a clear impact on the number of taxi orders</li>
+ <li>How travel habits changed after the pandemic: do people prefer short trips or long trips?</li>
+ <li>Whether pandemic severity is strongly correlated with the number of taxi trips</li>
+</ul>
+
+<h3 id="section-3">Datasets</h3>
+
+<h4 id="covid-19-">COVID-19 dataset</h4>
+
+<p>The COVID-19 dataset consists of one fact table, <code class="highlighter-rouge">covid_19_activity</code>, and one dimension table, <code class="highlighter-rouge">lookup_calendar</code>.</p>
+
+<p><code class="highlighter-rouge">covid_19_activity</code> records the daily numbers of confirmed cases and deaths in different regions worldwide. <code class="highlighter-rouge">lookup_calendar</code> is a date dimension table that stores extended time information, such as the start of the year and the start of the month for each date. <code class="highlighter-rouge">covid_19_activity</code> and <code class="highlighter-rouge">lookup_calendar</code> are joined by date.</p>
+
+<p>Details of the COVID-19 dataset:</p>
+
+<table>
+ <tbody>
+ <tr>
+ <td>Data size</td>
+ <td>235 MB</td>
+ </tr>
+ <tr>
+ <td>Fact table rows</td>
+ <td>2,753,688</td>
+ </tr>
+ <tr>
+ <td>Date range</td>
+ <td>2020-01-21~2022-03-07</td>
+ </tr>
+ <tr>
+ <td>Download URL from the dataset provider</td>
+ <td>https://data.world/covid-19-data-resource-hub/covid-19-case-counts/workspace/file?filename=COVID-19+Activity.csv</td>
+ </tr>
+ <tr>
+ <td>S3 path of the dataset</td>
+ <td>s3://public.kyligence.io/kylin/kylin_demo/data/covid19_data/</td>
+ </tr>
+ </tbody>
+</table>
+
+<h4 id="section-4">New York City taxi order dataset</h4>
+
+<p>The New York City taxi order dataset consists of one fact table, <code class="highlighter-rouge">taxi_trip_records_view</code>, and two dimension tables, <code class="highlighter-rouge">newyork_zone</code> and <code class="highlighter-rouge">lookup_calendar</code>.</p>
+
+<p>Each record in <code class="highlighter-rouge">taxi_trip_records_view</code> corresponds to one taxi trip and stores the pick-up location ID, drop-off location ID, trip duration, order amount, trip distance, and so on. <code class="highlighter-rouge">newyork_zone</code> stores information such as the administrative district for each location ID. <code class="highlighter-rouge">taxi_trip_records_view</code> is joined to <code class="highlighter-rouge">newyork_zone</code> through the two columns <code class="highlighter-rouge">PULocationID</code> and <code class="highlighter-rouge">DOLocationID</code>, which yields the pick-up and drop-off borough information. <code class="highlighter-rouge">lookup_calendar</code> is the same dimension table as in the <code class="highlighter-rouge">COVID-19</code> dataset, and <code class="highlighter-rouge">taxi_trip_records_view</code> is joined to <code class="highlighter-rouge">lookup_calendar</code> by date.</p>
+
+<p>Details of the New York City taxi order dataset:</p>
+
+<table>
+ <tbody>
+ <tr>
+ <td>Data size</td>
+ <td>19 GB</td>
+ </tr>
+ <tr>
+ <td>Fact table rows</td>
+ <td>226,849,274</td>
+ </tr>
+ <tr>
+ <td>Date range</td>
+ <td>2018-01-01~2021-07-31</td>
+ </tr>
+ <tr>
+ <td>Download URL from the dataset provider</td>
+ <td>https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page</td>
+ </tr>
+ <tr>
+ <td>S3 path of the dataset</td>
+ <td>s3://public.kyligence.io/kylin/kylin_demo/data/trip_data_2018-2021/</td>
+ </tr>
+ </tbody>
+</table>
+
+<h4 id="er-">ER diagram</h4>
+
+<p>The ER diagram of the COVID-19 dataset and the New York City taxi order dataset is shown below:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/1_table_ER.png" alt="" /></p>
+
+<h3 id="section-5">Metric design</h3>
+
+<p>For the business scenario and business questions to be analyzed, we designed the following atomic metrics and business metrics:</p>
+
+<h6 id="section-6">1. Atomic metrics</h6>
+
+<p>Atomic metrics are the measures created in a Kylin Cube. They are usually aggregations over a single column and are relatively simple.</p>
+
+<ul>
+ <li>COVID-19 cases: sum(covid_19_activity.people_positive_cases_count)</li>
+ <li>COVID-19 deaths: sum(covid_19_activity.people_death_count)</li>
+ <li>New COVID-19 cases: sum(covid_19_activity.people_positive_new_cases_count)</li>
+ <li>New COVID-19 deaths: sum(covid_19_activity.people_death_new_count)</li>
+ <li>Taxi trip distance: sum(taxi_trip_records_view.trip_distance)</li>
+ <li>Taxi order amount: sum(taxi_trip_records_view.total_amount)</li>
+ <li>Taxi trip count: count()</li>
+ <li>Taxi trip duration: sum(taxi_trip_records_view.trip_time_hour)</li>
+</ul>
+
+<h6 id="section-7">2. Business metrics</h6>
+
+<p>Business metrics are composite calculations defined on top of atomic metrics, and they carry a specific business meaning.</p>
+
+<ul>
+ <li>Month-to-date (MTD) and year-to-date (YTD) of each atomic metric</li>
+ <li>Month-over-month (MOM) and year-over-year (YOY) growth rate of each atomic metric</li>
+ <li>COVID-19 case fatality rate: deaths / confirmed cases</li>
+ <li>Average taxi speed: taxi trip distance / taxi trip duration</li>
+ <li>Average taxi trip distance: taxi trip distance / taxi trip count</li>
+</ul>
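+
+<p>The business metrics listed above are simple arithmetic over the atomic metrics. The sketch below restates them in Python for illustration only: the function names are hypothetical, the growth-rate formula <code class="highlighter-rouge">(current - previous) / previous</code> is an assumption about how MOM and YOY are defined in the dataset, and all sample numbers are made up.</p>
+
```python
from itertools import accumulate


def case_fatality_rate(deaths, confirmed_cases):
    """COVID-19 case fatality rate: deaths / confirmed cases."""
    return deaths / confirmed_cases if confirmed_cases else None


def mean_speed(trip_distance, trip_time_hours):
    """Average taxi speed: total distance / total duration in hours."""
    return trip_distance / trip_time_hours if trip_time_hours else None


def mean_distance(trip_distance, trip_count):
    """Average trip distance: total distance / number of trips."""
    return trip_distance / trip_count if trip_count else None


def growth_rate(current, previous):
    """Assumed MOM/YOY definition: relative change against the previous period."""
    return (current - previous) / previous if previous else None


def running_total(values):
    """MTD/YTD style accumulation: running sum from the start of the period."""
    return list(accumulate(values))


# Made-up numbers: 30 orders this period against 100 in the previous one
# gives a growth rate of -0.7, that is, 70% fewer orders.
print(growth_rate(30, 100))
print(running_total([3, 5, 2]))
```
+
+<p>In the tutorial itself these calculations are defined once in MDX for Kylin, so BI tools read the results instead of recomputing them.</p>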
+
+<h2 id="section-8">Overview of the steps</h2>
+
+<p>The main steps for building a cloud data analysis platform based on Apache Kylin and analyzing data with it are shown below:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/2_step_overview.jpg" alt="" /></p>
+
+<h2 id="section-9">Cluster architecture</h2>
+
+<p>The architecture of the Kylin cluster deployed by the cloud deployment tool is shown below:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/3_kylin_cluster.jpg" alt="" /></p>
+
+<h2 id="kylin-on-cloud-">Kylin on Cloud deployment</h2>
+
+<h3 id="section-10">Environment requirements</h3>
+
+<ul>
+ <li>git must be installed on the local machine, to download the deployment tool code;</li>
+ <li>Python 3.6.6 or later must be installed on the local machine, to run the deployment tool.</li>
+</ul>
+
+<h3 id="aws-">AWS permission check and initialization</h3>
+
+<p>Log in to your AWS account and follow the <a href="https://github.com/apache/kylin/blob/kylin4_on_cloud/readme/prerequisites.md">prerequisites document</a> to check user permissions and to create the Access Key, IAM Role, Key Pair, and S3 working directory that the deployment tool needs. All subsequent AWS operations are performed as this account.</p>
+
+<h3 id="section-11">Configure the deployment tool</h3>
+
+<p>1. Run the following command to get the code of the Kylin on AWS deployment tool:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>git clone -b kylin4_on_cloud --single-branch https://github.com/apache/kylin.git <span class="o">&amp;&amp;</span> <span class="nb">cd </span>kylin
+</code></pre>
+</div>
+
+<p>2. Initialize a Python virtual environment on the local machine</p>
+
+<p>Check the Python environment; Python 3.6.6 or later is required:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>python --version
+</code></pre>
+</div>
+
+<p>Initialize the Python virtual environment and install the dependencies:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>bin/init.sh
+<span class="nb">source </span>venv/bin/activate
+</code></pre>
+</div>
+
+<p>3. Edit the configuration file <code class="highlighter-rouge">kylin_configs.yaml</code></p>
+
+<p>Open kylin_configs.yaml in the deployment tool code and replace the configuration items in the file with actual values:</p>
+
+<ul>
+ <li><code class="highlighter-rouge">AWS_REGION</code>: the Region of the EC2 nodes; the default is cn-northwest-1</li>
+ <li><code class="highlighter-rouge">${IAM_ROLE_NAME}</code>: the name of the IAM Role created beforehand, for example kylin_deploy_role</li>
+ <li><code class="highlighter-rouge">${S3_URI}</code>: the S3 working directory used to deploy Kylin, for example s3://kylindemo/kylin_demo_dir/</li>
+ <li><code class="highlighter-rouge">${KEY_PAIR}</code>: the name of the Key Pair created beforehand, for example kylin_deploy_key</li>
+ <li><code class="highlighter-rouge">${Cidr Ip}</code>: the IP address range allowed to access the EC2 instances, for example 10.1.0.0/32; this is usually set to your public IP address so that only you can access the EC2 instances that are created</li>
+</ul>
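+
+<p>Before writing a value for <code class="highlighter-rouge">${Cidr Ip}</code> into the configuration, it can help to confirm that the range is well-formed and covers as few addresses as intended. The following is a minimal sketch using Python's standard <code class="highlighter-rouge">ipaddress</code> module; the helper name <code class="highlighter-rouge">describe_cidr</code> is hypothetical and is not part of the deployment tool.</p>
+
```python
import ipaddress


def describe_cidr(cidr):
    """Parse a CIDR string and report how many addresses it covers.

    strict=True rejects values whose host bits are set, such as 10.1.0.5/24.
    """
    network = ipaddress.ip_network(cidr, strict=True)
    return {"network": str(network), "addresses": network.num_addresses}


# The example value from the list above: a /32 allows exactly one address.
print(describe_cidr("10.1.0.0/32"))
```
+
+<p>A <code class="highlighter-rouge">/32</code> suffix restricts access to a single address, which matches the advice above to allow only your own IP.</p>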
+
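+<p>To separate reads from writes and isolate build resources from query resources, the following steps first start a build cluster, which is used to connect to Glue to create tables, load the data source, and submit build jobs for pre-computation. The build cluster is then destroyed while the metadata is kept, and a query cluster with MDX for Kylin is started, which is used to create business metrics, connect BI tools, run queries, and analyze data. A Kylin on AWS cluster stores metadata in RDS and built data in S3, and it supports loading data sources from AWS Glue. All resources other than the EC2 nodes are persistent and do not disappear when nodes are deleted, so whenever there are no queries or build jobs, the build or query cluster can be destroyed at any time as long as the metadata and the S3 working directory are kept.</p>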
+
+<h3 id="kylin-">Kylin build cluster</h3>
+
+<h4 id="kylin--1">Start the Kylin build cluster</h4>
+
+<p>1. Start the build cluster with the following command. Depending on network conditions, deployment and startup may take 15-30 minutes.</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>python deploy.py --type deploy --mode job
+</code></pre>
+</div>
+
+<p>2. After the build cluster is deployed successfully, the command window shows output like the following:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/4_deploy_cluster_successfully.png" alt="" /></p>
+
+<h4 id="aws--1">Check AWS services</h4>
+
+<p>1. Open the CloudFormation page in the AWS console. The Kylin deployment tool has started 7 stacks in total:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/5_check_aws_stacks.png" alt="" /></p>
+
+<p>2. You can view the details of the EC2 nodes in the AWS console, or list the names, private IPs, and public IPs of all EC2 nodes from the command line with the following command:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>python deploy.py --type list
+</code></pre>
+</div>
+
+<p><img src="/images/blog/kylin4_on_cloud/6_list_cluster_node.png" alt="" /></p>
+
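+<h4 id="spark-sql-">Try out the native query speed of spark-sql</h4>
+
+<p>To get a direct sense of the query performance gain from pre-computation, we first try the native query speed in spark-sql before building the cube:</p>
+
+<p>1. First, log in to the EC2 machine that hosts Kylin through the public IP of the Kylin node, switch to the root user, and source ~/.bash_profile so that the environment variables set beforehand take effect:</p>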
+
+<div class="highlighter-rouge"><pre class="highlight"><code>ssh -i <span class="s2">"</span><span class="k">${</span><span class="nv">KEY_PAIR</span><span class="k">}</span><span class="s2">"</span> ec2-user@<span class="k">${</span><span class="nv">kylin_node_public_ip</span><span class="k">}</span>
+sudo su
+<span class="nb">source</span> ~/.bash_profile
+</code></pre>
+</div>
+
+<p>2. Then go to <code class="highlighter-rouge">$SPARK_HOME</code> and edit the configuration file <code class="highlighter-rouge">conf/spark-defaults.conf</code>, changing <code class="highlighter-rouge">spark_master_node_private_ip</code> to the private IP of the Spark master node:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code><span class="nb">cd</span> <span class="nv">$SPARK_HOME</span>
+vim conf/spark-defaults.conf
+
+<span class="c"># Replace spark_master_node_private_ip with the private IP of the actual Spark master node</span>
+spark.master spark://spark_master_node_private_ip:7077
+</code></pre>
+</div>
+
+<p>The driver and executor resource settings in <code class="highlighter-rouge">spark-defaults.conf</code> are the same as the resource settings of the Kylin query cluster.</p>
+
+<p>3. Create the tables in spark-sql</p>
+
+<p>All data of the tutorial datasets is stored in S3 buckets in the <code class="highlighter-rouge">cn-north-1</code> and <code class="highlighter-rouge">us-east-1</code> regions. If your S3 bucket is in <code class="highlighter-rouge">cn-north-1</code> or <code class="highlighter-rouge">us-east-1</code>, you can run the table-creation SQL directly. Otherwise, run the following script to copy the data to the S3 working directory set in <code class="highlighter-rouge">kylin_configs.yaml</code>, and update the table-creation SQL:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code><span class="c">## AWS CN ç¨æ·</span>
+aws s3 sync s3://public.kyligence.io/kylin/kylin_demo/data/ <span class="k">${</span><span class="nv">S3_DATA_DIR</span><span class="k">}</span> --region cn-north-1
+
+<span class="c">## AWS Global users</span>
+aws s3 sync s3://public.kyligence.io/kylin/kylin_demo/data/ <span class="k">${</span><span class="nv">S3_DATA_DIR</span><span class="k">}</span> --region us-east-1
+
+<span class="c"># Update the table-creation SQL</span>
+sed -i <span class="s2">"s#s3://public.kyligence.io/kylin/kylin_demo/data/#</span><span class="k">${</span><span class="nv">S3_DATA_DIR</span><span class="k">}</span><span class="s2">#g"</span> /home/ec2-user/kylin_demo/create_kylin_demo_table.sql
+</code></pre>
+</div>
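+
+<p>The <code class="highlighter-rouge">sed</code> command above edits the SQL file in place. To preview the effect first, the same substitution can be sketched in Python as follows. The helper name <code class="highlighter-rouge">rewrite_s3_prefix</code>, the sample <code class="highlighter-rouge">LOCATION</code> clause, and the target bucket are hypothetical and only illustrate the replacement.</p>
+
```python
PUBLIC_PREFIX = "s3://public.kyligence.io/kylin/kylin_demo/data/"


def rewrite_s3_prefix(sql_text, new_prefix, old_prefix=PUBLIC_PREFIX):
    """Replace every occurrence of the public data prefix with your own S3 directory."""
    return sql_text.replace(old_prefix, new_prefix)


# Hypothetical line of a table-creation statement, for illustration only.
sample = "LOCATION 's3://public.kyligence.io/kylin/kylin_demo/data/covid19_data/'"
print(rewrite_s3_prefix(sample, "s3://kylindemo/kylin_demo_dir/data/"))
```
+
+<p>As with the <code class="highlighter-rouge">sed</code> version, the new prefix should end with a slash so that the sub-directory names stay intact.</p>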
+
+<p>Run the table-creation SQL:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>bin/spark-sql -f /home/ec2-user/kylin_demo/create_kylin_demo_table.sql
+</code></pre>
+</div>
+
+<p>4. Run the query in spark-sql</p>
+
+<p>Enter spark-sql:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>bin/spark-sql
+</code></pre>
+</div>
+
+<p>Run the query in spark-sql:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code><span class="n">use</span> <span class="n">kylin_demo</span><span class="p">;</span>
+<span class="k">select</span> <span class="n">TAXI_TRIP_RECORDS_VIEW</span><span class="p">.</span><span class="n">PICKUP_DATE</span><span class="p">,</span> <span class="n">NEWYORK_ZONE</span><span class="p">.</span><span class="n">BOROUGH</span><span class="p">,</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">),</span> <span class="k">sum</span><span class="p">(</span><span class="n">TAXI_TRIP_RECORDS_VIEW</span><span class="p">.</span><span class="n">TRIP_TIME_HOUR</span><span class="p">),</span> <span class="k">sum</span><span class="p">(</span><span class="n">TAXI_TRIP_RECORDS_VIEW</span><span class="p">.</span><span class="n">TOTAL_AMOUNT</span><span class="p">)</span>
+<span class="k">from</span> <span class="n">TAXI_TRIP_RECORDS_VIEW</span>
+<span class="k">left</span> <span class="k">join</span> <span class="n">NEWYORK_ZONE</span>
+<span class="k">on</span> <span class="n">TAXI_TRIP_RECORDS_VIEW</span><span class="p">.</span><span class="n">PULOCATIONID</span> <span class="o">=</span> <span class="n">NEWYORK_ZONE</span><span class="p">.</span><span class="n">LOCATIONID</span>
+<span class="k">group</span> <span class="k">by</span> <span class="n">TAXI_TRIP_RECORDS_VIEW</span><span class="p">.</span><span class="n">PICKUP_DATE</span><span class="p">,</span> <span class="n">NEWYORK_ZONE</span><span class="p">.</span><span class="n">BOROUGH</span><span class="p">;</span>
+</code></pre>
+</div>
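+
+<p>To see what this query computes without a cluster, the same join and aggregation can be tried on a few rows in an in-memory SQLite database. This is a minimal sketch: the table and column names come from the query above, while the rows, the column types and the <code class="highlighter-rouge">order by</code> clause are assumptions added for illustration.</p>
+

```python
import sqlite3

# Same shape as the spark-sql query above; "order by" is added only to make
# the toy output deterministic.
QUERY = """
select t.PICKUP_DATE, z.BOROUGH, count(*), sum(t.TRIP_TIME_HOUR), sum(t.TOTAL_AMOUNT)
from TAXI_TRIP_RECORDS_VIEW t
left join NEWYORK_ZONE z on t.PULOCATIONID = z.LOCATIONID
group by t.PICKUP_DATE, z.BOROUGH
order by t.PICKUP_DATE, z.BOROUGH
"""


def run_demo_query():
    """Build two tiny tables with made-up rows and run the aggregation."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table NEWYORK_ZONE (LOCATIONID integer, BOROUGH text)")
    conn.execute(
        "create table TAXI_TRIP_RECORDS_VIEW "
        "(PICKUP_DATE text, PULOCATIONID integer, TRIP_TIME_HOUR real, TOTAL_AMOUNT real)"
    )
    conn.executemany(
        "insert into NEWYORK_ZONE values (?, ?)",
        [(1, "Bronx"), (2, "Queens")],
    )
    conn.executemany(
        "insert into TAXI_TRIP_RECORDS_VIEW values (?, ?, ?, ?)",
        [
            ("2022-01-01", 1, 0.5, 10.0),
            ("2022-01-01", 1, 1.0, 20.0),
            ("2022-01-01", 2, 0.25, 5.0),
            ("2022-01-02", 9, 2.0, 40.0),  # no matching zone: BOROUGH stays NULL
        ],
    )
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_demo_query():
        print(row)
```

+
+<p>Because the join is a <code class="highlighter-rouge">left join</code>, trips whose pickup location has no matching zone are kept and grouped under a NULL borough rather than dropped.</p>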
+
+<p>As shown below, with the same resource configuration as the Kylin query cluster, the direct query in spark-sql takes more than 100 s:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/7_query_in_spark_sql.png" alt="" /></p>
+
+<p>5. After the query succeeds, you must exit spark-sql before continuing with the following steps, so that it does not keep occupying resources.</p>
+
+<h4 id="kylin--2">Import Kylin metadata</h4>
+
+<p>1. Go to <code class="highlighter-rouge">$KYLIN_HOME</code></p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code><span class="nb">cd</span> <span class="nv">$KYLIN_HOME</span>
+</code></pre>
+</div>
+
+<p>2. Import the metadata</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>bin/metastore.sh restore /home/ec2-user/meta_backups/
+</code></pre>
+</div>
+
+<p>3. Reload the metadata</p>
+
+<p>Using the public IP of the EC2 node, open <code class="highlighter-rouge">http://${kylin_node_public_ip}:7070/kylin</code> in a browser to reach the Kylin web UI, and log in with the default username and password ADMIN/KYLIN:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/8_kylin_web_ui.png" alt="" /></p>
+
+<p>Reload the Kylin metadata via System -&gt; Configuration -&gt; Reload Metadata:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/9_reload_kylin_metadata.png" alt="" /></p>
+
+<p>To learn how to manually create the Model and Cube contained in the Kylin metadata, see <a href="https://cwiki.apache.org/confluence/display/KYLIN/Create+Model+and+Cube+in+Kylin">Create model and cube in kylin</a>.</p>
+
+<h4 id="section-12">Run the build</h4>
+
+<p>Submit the cube build jobs. Because no partition column is set in the model, both cubes are fully built here:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/10_full_build_cube.png.png" alt="" /></p>
+
+<p><img src="/images/blog/kylin4_on_cloud/11_kylin_job_complete.png" alt="" /></p>
+
+<h4 id="section-13">Destroy the build cluster</h4>
+
+<p>After the build completes, run the destroy command to destroy the build cluster. By default, the RDS stack, monitor stack and VPC stack are retained:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>python deploy.py --type destroy
+</code></pre>
+</div>
+
+<p>The cluster is destroyed successfully:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/12_destroy_job_cluster.png" alt="" /></p>
+
+<h4 id="aws--2">Check AWS resources</h4>
+
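+<p>After the cluster is destroyed, you can open the <code class="highlighter-rouge">CloudFormation</code> service in the AWS console to confirm that no resources are left behind. Because the metadata RDS, the monitor node and the VPC are retained by default, the CloudFormation page still shows the following three stacks after the cluster is destroyed:</p>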
+
+<p><img src="/images/blog/kylin4_on_cloud/13_check_aws_stacks.png" alt="" /></p>
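+
+<p>The same check can be scripted instead of done in the console. The sketch below is an assumption-laden illustration, not part of the deployment tool: the stack names are hypothetical placeholders, <code class="highlighter-rouge">leftover_stacks</code> is a made-up helper, and <code class="highlighter-rouge">fetch_stack_summaries</code> assumes <code class="highlighter-rouge">boto3</code> is installed with credentials configured. It relies on the <code class="highlighter-rouge">StackName</code> and <code class="highlighter-rouge">StackStatus</code> fields that CloudFormation's <code class="highlighter-rouge">list_stacks</code> call returns for each stack.</p>
+

```python
# Stacks in this status no longer hold any resources.
GONE_STATUSES = {"DELETE_COMPLETE"}


def leftover_stacks(summaries, retained_names):
    """Return names of stacks that still exist but are not expected to be kept.

    `summaries` is a list of dicts with "StackName" and "StackStatus" keys,
    the shape of the "StackSummaries" entries returned by list_stacks.
    """
    live = [s["StackName"] for s in summaries if s["StackStatus"] not in GONE_STATUSES]
    return sorted(name for name in live if name not in retained_names)


def fetch_stack_summaries(region_name):
    """Collect all stack summaries in a region (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the pure helper above works without it

    client = boto3.client("cloudformation", region_name=region_name)
    summaries = []
    for page in client.get_paginator("list_stacks").paginate():
        summaries.extend(page["StackSummaries"])
    return summaries


if __name__ == "__main__":
    # Hypothetical names: replace them with the stack names from your own deployment.
    retained = {"demo-rds-stack", "demo-monitor-stack", "demo-vpc-stack"}
    print(leftover_stacks(fetch_stack_summaries("us-east-1"), retained))
```

+
+<p>An empty list means only the three retained stacks remain; anything else, for example a stack stuck in <code class="highlighter-rouge">DELETE_FAILED</code>, is worth inspecting in the console.</p>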
+
+<p>The resources in these three stacks are used again when the query cluster is started later, which ensures that the query cluster and the build cluster share the same metadata.</p>
+
+<p>The above is part 1 of <code class="highlighter-rouge">Kylin on Cloud: Build a Data Analysis Platform on the Cloud in Two Hours</code>. For part 2, see: <a href="../kylin4-on-cloud-part2/">Kylin on Cloud: Build a Data Analysis Platform on the Cloud in Two Hours (Part 2)</a></p>
+
+</description>
+ <pubDate>Wed, 20 Apr 2022 04:00:00 -0700</pubDate>
+ <link>http://kylin.apache.org/cn_blog/2022/04/20/kylin4-on-cloud-part1/</link>
+ <guid isPermaLink="true">http://kylin.apache.org/cn_blog/2022/04/20/kylin4-on-cloud-part1/</guid>
+
+
+ <category>cn_blog</category>
+
+ </item>
+
+ <item>
         <title>How to query Kylin with Excel? MDX for Kylin!</title>
         <description><h2 id="kylin--mdx">Why does Kylin need MDX?</h2>
@@ -864,46 +1501,46 @@ CELL PROPERTIES VALUE, FORMAT_STRING, LA
</item>
<item>
- <title>å®æï¼Kylin 4 ç°å·²æ¯æ AWS Glue Catalog</title>
- <description><h2 id="emr--kylin--glue-">为ä»ä¹å¨ EMR é¨ç½² Kylin éè¦æ¯æ Glue ï¼</h2>
+        <title>Kylin 4 now supports AWS Glue Catalog</title>
+        <description><h2 id="why-does-installing-kylin-on-emr-need-to-support-aws-glue">Why does Kylin on EMR need to support AWS Glue?</h2>
-<h3 id="aws-glue">ä»ä¹æ¯ AWS Glueï¼</h3>
+<h3 id="what-is-aws-glue">What is AWS Glue?</h3>
-<p>AWS Glue æ¯ä¸é¡¹å®å
¨æ管ç ETLï¼æåã转æ¢åå è½½ï¼æå¡ï¼ä½¿ AWS ç¨æ·è½å¤è½»æ¾èç»æµé«æå°å¯¹æ°æ®è¿è¡åç±»ãæ¸
çåæ©å
ï¼å¹¶å¨åç§æ°æ®åå¨ä¹é´å¯é å°ç§»å¨æ°æ®ãAWS Glue ç±ä¸ä¸ªç§°ä¸º AWS Glue æ°æ®ç®å½çä¸å¤®å
æ°æ®åå¨åºãä¸ä¸ªèªå¨çæ代ç ç ETL å¼æ以åä¸ä¸ªå¤çä¾èµé¡¹è§£æãä½ä¸çæ§åéè¯ççµæ´»è®¡åç¨åºç»æãAWS Glue æ¯æ æå¡å¨æå¡ï¼å æ¤æ é设置æ管çåºç¡è®¾æ½ã</p>
+<p>AWS Glue is a fully managed ETL (Extract, Transform, and Load) service that lets AWS users easily and cost-effectively classify, cleanse, and enrich data, and move it reliably between data stores. AWS Glue consists of a central metastore called the AWS Glue Data Catalog, an ETL engine that automatically generates code, and a flexible scheduler that handles dependency resolution, job monitoring, and retries. AWS Glue is a serverless service, so there is no infrastructure to set up or manage.</p>
-<h3 id="kylin--aws-glue-catalog">Kylin 为ä»ä¹éè¦æ¯æ AWS Glue Catalogï¼</h3>
+<h3 id="why-does-kylin-need-aws-glue-catalog">Why does Kylin need AWS Glue Catalog?</h3>
-<p>ç®å社åºæå¾å¤ Kylin ç¨æ·å¨ä½¿ç¨ AWS EMRï¼ç»ä»¶ä¸»è¦å
æ¬ HadoopãSparkãHiveãPresto çï¼å¦æ没æé
ç½®ä½¿ç¨ AWS Glue data Catalogï¼é£ä¹å¨å个æ°æ®ä»åºç»ä»¶å¦ HiveãSparkãPresto 建çæ°æ®è¡¨ï¼å¨å
¶å®ç»ä»¶ä¸æ¯æ¾ä¸å°çï¼ä¹å°±ä¸è½ä½¿ç¨ï¼å
¬å¸åºå±çæ°æ®ä»åºæ¯æä¾ç»å个ä¸å¡é¨é¨æ¥è¿è¡ä½¿ç¨ï¼ä¸ºäºè§£å³è¿ä¸ªé®é¢ï¼å¨å建 AWS EMR é群æ¶å°±å¯ä»¥ä½¿ç¨ AWS Glue data Catalog æ¥åå¨å
æ°æ®ï¼å¯¹å个ç»ä»¶å
±äº«æ°æ®æºï¼å¯¹å个ä¸å¡é¨é¨è¿è¡å
±äº«æ�
�°æ®æºï¼å°å个ä¸å¡é¨é¨çæ°æ®æ建æä¸ä¸ªå¤§çæ°æ®ç«æ¹ä½ï¼è½å¤å¿«éååºå
¬å¸é«éåå±çä¸å¡éæ±ã<br />
-ç°ä»£å
¬å¸çæ°æ®é½æ¯åºäºäºå¹³å°æ建ï¼å¤§æ°æ®å¢é使ç¨ç AWS EMR æ¥è¿è¡æ°æ®å å·¥ãæ°æ®åæã以å模åè®ç»ï¼éçæ°æ®æ´å¢å¸¦æ¥ææ°æ
¢ãææ°é¾ï¼EMR/Spark/Hive å¾é¾æ»¡è¶³æ°æ®åæå¸ãè¿è¥äººåãéå®çå¿«éæ¥è¯¢æ°æ®çéæ±ï¼äºæ¯ä¸äºç¨æ·éæ©äº Apache Kylin ä½ä¸ºå¼æº OLAP 解å³æ¹æ¡ã<br />
-ä½æ¯æè¿ç¤¾åºç¨æ·èç³»å°æ们ï¼åç¥ Kylin 4 è¿ä¸æ¯æä» Glue 读å表å
æ°æ®ï¼æ以æ们å社åºç¨æ·åä½ä¸èµ·æ£æ¥è¿ééå°çé®é¢å¹¶æç»è§£å³äºé®é¢ï¼ä»èä½¿å¾ Kylin 4 æ¯æäº AWS Glue Catalogï¼è¿æ ·å¸¦æ¥ç好å¤å¨äº HiveãPrestoãSparkãKylin ä¸å¯ä»¥å
±äº«è¡¨åæ°æ®ï¼ä½¿å¾æ¯ä¸ªä¸»é¢é½ä¸²èèµ·æ¥å½¢æä¸ä¸ªå¤§çæ°æ®åæå¹³å°ï¼æç ´å
æ°æ®éç¢ã</p>
+<p>Many users in the Kylin community run AWS EMR for large-scale distributed data processing with Hadoop, Spark, Hive, Presto, and so on. Without AWS Glue Data Catalog, a table created in one of these data warehouse components (such as Hive, Spark, or Presto) cannot be found, and therefore cannot be used, in the others. Because the underlying data warehouse serves many business departments, these users configure AWS Glue Data Catalog as the metadata store when creating the AWS EMR cluster, so that data sources are shared among components and among departments. The data from each department can then be built into one large data cube that responds quickly to business requirements.<br />
+In modern companies, data lives on cloud platforms, and big data teams use AWS EMR for data processing, data analysis, and model training. As data volume explodes, extracting data becomes slower and harder, and EMR with Spark or Hive can no longer meet the fast-query needs of data analysts, O&amp;M personnel, and sales. So some users turn to Apache Kylin as their open-source OLAP solution.<br />
+Recently, community users told us that Kylin 4 could not read table metadata from AWS Glue. We worked with them to investigate and resolve the problem, and Kylin 4 now supports AWS Glue Catalog. Tables and data can therefore be shared among Hive, Presto, Spark, and Kylin, which breaks down the metadata barrier and lets different subject areas be combined into one big data analysis platform.</p>
-<h3 id="apache-kylin--aws-glue-">Apache Kylin æ¯æ AWS Glue åï¼</h3>
+<h3 id="does-kylin-support-aws-glue">Does Kylin support AWS Glue?</h3>
<table>
<thead>
<tr>
      <th>&#160;</th>
- <th>æ¯æ Glue ç Kylin çæ¬</th>
+      <th>Kylin versions that support Glue</th>
<th>Issue Link</th>
</tr>
</thead>
<tbody>
<tr>
<td>Kylin on HBase (Before Kylin 4)</td>
- <td>2.6.6 or higher<br /> 3.1.0 or higher</td>
+ <td>2.6.6 or higher<br />3.1.0 or higher</td>
<td>https://issues.apache.org/jira/browse/KYLIN-4206<br />https://zhuanlan.zhihu.com/p/99481373</td>
</tr>
<tr>
<td>Kylin on Parquet</td>
<td>4.0.1 or higher</td>
- <td>æ¬æã</td>
+ <td>This article.</td>
</tr>
</tbody>
</table>
-<h2 id="section">é¨ç½²ååå¤</h2>
+<h2 id="prerequisites-for-deployment">Prerequisites for deployment</h2>
-<h3 id="section-1">软件信æ¯ä¸è§</h3>
+<h3 id="software-version">Software Version</h3>
<table>
<thead>
@@ -917,27 +1554,27 @@ CELL PROPERTIES VALUE, FORMAT_STRING, LA
<tr>
<td>Apache Kylin</td>
<td>4.0.1 or higher</td>
- <td>å¿
é¡»æ¯ 4.0.1 以åä¸ï¼è¯¦æ
åè <a href="https://cwiki.apache.org/confluence/display/KYLIN/KIP+10+refactor+hive+and+hadoop+dependency">KIP 10 refactor hive and hadoop dependency</a>.</td>
+ <td><a href="https://cwiki.apache.org/confluence/display/KYLIN/KIP+10+refactor+hive+and+hadoop+dependency">KIP 10 refactor hive and hadoop dependency</a>.</td>
</tr>
<tr>
<td>AWS EMR</td>
<td>6.5.0 or higher<br />5.33.1 or higher</td>
- <td>è¦çEMR 6 / EMR 5 çè¾æ°çæ¬ï¼<a href="https://docs.amazonaws.cn/en_us/emr/latest/ReleaseGuide/emr-650-release.html">Amazon EMR release 6.5.0 - Amazon EMR</a>.</td>
+ <td><a href="https://docs.amazonaws.cn/en_us/emr/latest/ReleaseGuide/emr-650-release.html">Amazon EMR release 6.5.0 - Amazon EMR</a>.</td>
</tr>
</tbody>
</table>
-<h3 id="glue-">åå¤ Glue æ°æ®åºå表</h3>
+<h3 id="prepare-aws-glue-database-and-tables">Prepare AWS Glue database and tables</h3>
<p><img src="/images/blog/kylin4_support_aws_glue/1_prepare_aws_glue_table_en.png" alt="" /></p>
<p><img src="/images/blog/kylin4_support_aws_glue/2_prepare_aws_glue_table_en.png" alt="" /></p>
<ul>
- <li>å建 AWS EMR é群ã</li>
+ <li>Create an EMR cluster.</li>
</ul>
-<p>è¿éå¯å¨ä¸ä¸ª EMR çé群ï¼éè¦æ³¨æçæ¯ï¼è¿ééè¿é
ç½® <code class="highlighter-rouge">hive.metastore.client.factory.class</code> å¯å¨äº Glue å¤é¨å
æ°æ®ã以ä¸å½ä»¤å¯ä»¥ä½ä¸ºåèã</p>
+<p>Note: the cluster is started with the hive.metastore.client.factory.class parameter configured so that AWS Glue serves as the external metastore. The command below can be used as a reference.</p>
<div class="highlighter-rouge"><pre class="highlight"><code>aws emr create-cluster --applications <span class="nv">Name</span><span class="o">=</span>Hadoop <span class="nv">Name</span><span class="o">=</span>Hive <span class="nv">Name</span><span class="o">=</span>Spark <span class="nv">Name</span><span class="o">=</span>ZooKeeper <span class="nv">Name</span><span class="o">=</span>Tez <span class="nv">Name</span><span class="o">=</span>Ganglia <span class="se">\</span>
--ec2-attributes <span class="k">${}</span> <span class="se">\</span>
@@ -955,35 +1592,35 @@ CELL PROPERTIES VALUE, FORMAT_STRING, LA
</div>
<ul>
- <li>ç»å½ Master èç¹ï¼å¹¶ä¸æ£æ¥ Hadoop çæ¬ å Hadoop é群æ¯å¦å¯å¨æåã</li>
+ <li>Log in to the Master node. Check the Hadoop version and whether the Hadoop cluster is successfully started.</li>
</ul>
<p><img src="/images/blog/kylin4_support_aws_glue/3_prepare_hadoop_cluster_en.png" alt="" /></p>
<p><img src="/images/blog/kylin4_support_aws_glue/4_prepare_hadoop_cluster_en.png" alt="" /></p>
-<h3 id="optional">è·åç¯å¢ä¿¡æ¯ï¼Optionalï¼</h3>
+<h3 id="optionalget-environmental-information">(Optional) Get environment information</h3>
<blockquote>
- <p>å¦æä½ ä½¿ç¨ RDS æè
å
¶ä»å
æ°æ®åå¨ï¼è¯·é
æ
è·³è¿æ¤æ¥ã</p>
+ <p>If you are using RDS or other metadata storage, you may skip this step.</p>
</blockquote>
-<p>ç±äº Kylin 4.X æ¨èä½¿ç¨ RDBMS ä½ä¸ºå
æ°æ®åå¨ï¼å¤äºæµè¯ç®çï¼è¿éä½¿ç¨ Master èç¹èªå¸¦ç MariaDB ä½ä¸ºå
æ°æ®åå¨ï¼å
³äº MariaDB ç主æºå称ãè´¦å·ãå¯ç çä¿¡æ¯ï¼å¯ä»¥ä» <code class="highlighter-rouge">/etc/hive/conf/hive-site.xml</code> è·åã</p>
+<p>Kylin 4 recommends an RDBMS as its metastore. For testing purposes, this article uses the MariaDB instance that ships with the Master node; its hostname, account, and password can be found in <code class="highlighter-rouge">/etc/hive/conf/hive-site.xml</code>.</p>
<div class="highlighter-rouge"><pre class="highlight"><code>kylin.metadata.url<span class="o">=</span>kylin4_on_cloud@jdbc,url<span class="o">=</span>jdbc:mysql://<span class="k">${</span><span class="nv">HOSTNAME</span><span class="k">}</span>:3306/hue,username<span class="o">=</span>hive,password<span class="o">=</span><span class="k">${</span><span class="nv">PASSWORD</span><span class="k">}</span>,maxActive<span class="o">=</span>10,maxIdle<span class="o">=</span>10,driverClassName<span class="o">=</span>org.mariadb.jdbc.Driver
kylin.env.zookeeper-connect-string<span class="o">=</span><span class="k">${</span><span class="nv">HOSTNAME</span><span class="k">}</span>
</code></pre>
</div>
-<p>è·åè¿äºä¿¡æ¯åï¼å¹¶ä¸æ¿æ¢ä»¥ä¸ Kylin é
置项éé¢çåéï¼å¦ <code class="highlighter-rouge">${PASSWORD}</code>ï¼ä¿åå°æ¬å°ï¼ä¾ä¸ä¸æ¥å¯å¨ Kylin è¿ç¨ä½¿ç¨ã</p>
+<p>Replace the variables in the Kylin configuration above (for example, ${PASSWORD}) with the actual values and save the result locally. It is used in a later step to start the Kylin process.</p>
-<h3 id="spark-sql--aws-glue-">æµè¯ Spark SQL å AWS Glue çè¿éæ§</h3>
+<h3 id="test-the-connectivity-between-spark-sql-and-aws-glue">Test the connectivity between Spark SQL and AWS Glue</h3>
[... 987 lines stripped ...]