Posted to commits@kylin.apache.org by li...@apache.org on 2019/04/19 05:58:35 UTC

svn commit: r1857785 [3/3] - in /kylin/site: ./ blog/ blog/2019/04/19/ blog/2019/04/19/release-v3.0.0-alpha/ cn/blog/2019/04/ cn/blog/2019/04/19/ cn/blog/2019/04/19/release-v3.0.0-alpha/ docs30/tutorial/

Modified: kylin/site/feed.xml
URL: http://svn.apache.org/viewvc/kylin/site/feed.xml?rev=1857785&r1=1857784&r2=1857785&view=diff
==============================================================================
--- kylin/site/feed.xml (original)
+++ kylin/site/feed.xml Fri Apr 19 05:58:35 2019
@@ -19,11 +19,123 @@
     <description>Apache Kylin Home</description>
     <link>http://kylin.apache.org/</link>
     <atom:link href="http://kylin.apache.org/feed.xml" rel="self" type="application/rss+xml"/>
-    <pubDate>Thu, 18 Apr 2019 06:59:32 -0700</pubDate>
-    <lastBuildDate>Thu, 18 Apr 2019 06:59:32 -0700</lastBuildDate>
+    <pubDate>Thu, 18 Apr 2019 22:44:29 -0700</pubDate>
+    <lastBuildDate>Thu, 18 Apr 2019 22:44:29 -0700</lastBuildDate>
     <generator>Jekyll v2.5.3</generator>
     
       <item>
+        <title>Apache Kylin v3.0.0-alpha Release Announcement</title>
+        <description>&lt;p&gt;The Apache Kylin community is pleased to announce the release of Apache Kylin v3.0.0-alpha.&lt;/p&gt;
+
+&lt;p&gt;Apache Kylin is an open source Distributed Analytics Engine designed to provide SQL interface and multi-dimensional analysis (OLAP) on Big Data supporting extremely large datasets.&lt;/p&gt;
+
+&lt;p&gt;This is the first release of the new-generation v3.x line; its main new feature is Real-time OLAP. All of the changes can be found in the &lt;a href=&quot;/docs/release_notes.html&quot;&gt;release notes&lt;/a&gt;. Here we highlight only the main features.&lt;/p&gt;
+
+&lt;h1 id=&quot;important-features&quot;&gt;Important features&lt;/h1&gt;
+
+&lt;h3 id=&quot;kylin-3654---real-time-olap&quot;&gt;KYLIN-3654 - Real-time OLAP&lt;/h3&gt;
+&lt;p&gt;With the newly introduced Kylin real-time receiver and coordinator components, Kylin can achieve a millisecond-level data preparation delay for streaming data from sources like Apache Kafka. This means that from v3.0 onward, Kylin can support sub-second OLAP over historical batch data, near real-time streaming data, and real-time streaming data. Users can rely on one OLAP platform to serve different scenarios. This solution has been deployed and verified by early adopters such as eBay since 2018. To enable it, please refer to &lt;a href=&quot;/docs30/tutorial/realtime_olap.html&quot;&gt;this tutorial&lt;/a&gt;.&lt;/p&gt;
+
+&lt;h3 id=&quot;kylin-3795---submit-spark-jobs-via-apache-livy&quot;&gt;KYLIN-3795 - Submit Spark jobs via Apache Livy&lt;/h3&gt;
+&lt;p&gt;This feature allows the administrator to configure Kylin to integrate with Apache Livy (incubating) for Spark job submission. The Spark job is submitted to the Livy Server through Livy’s REST API instead of starting the Spark driver process locally, which makes the Spark resources easier to manage and monitor and also relieves pressure on the nodes where the Kylin job server is running.&lt;/p&gt;
+
+&lt;h3 id=&quot;kylin-3820---a-curator-based-job-scheduler&quot;&gt;KYLIN-3820 - A curator-based job scheduler&lt;/h3&gt;
+&lt;p&gt;A new job scheduler is added that automatically discovers the Kylin nodes and performs leader election among them (only the leader submits jobs). With this feature, you can easily deploy and scale out Kylin nodes without manually updating the node addresses in &lt;code class=&quot;highlighter-rouge&quot;&gt;kylin.properties&lt;/code&gt; and restarting Kylin for the change to take effect.&lt;/p&gt;
+
+&lt;h1 id=&quot;other-enhancements&quot;&gt;Other enhancements&lt;/h1&gt;
+
+&lt;h3 id=&quot;kylin-3716---fastthreadlocal-replaces-threadlocal&quot;&gt;KYLIN-3716 - FastThreadLocal replaces ThreadLocal&lt;/h3&gt;
+&lt;p&gt;Using FastThreadLocal instead of ThreadLocal can improve Kylin’s overall performance to some extent.&lt;/p&gt;
+
+&lt;h3 id=&quot;kylin-3867---enable-jdbc-to-use-key-store--trust-store-for-https-connection&quot;&gt;KYLIN-3867 - Enable JDBC to use key store &amp;amp; trust store for https connection&lt;/h3&gt;
+&lt;p&gt;By using HTTPS, the authentication information used by JDBC is protected, making Kylin more secure.&lt;/p&gt;
+
+&lt;h3 id=&quot;kylin-3905---enable-shrunken-dictionary-default&quot;&gt;KYLIN-3905 - Enable shrunken dictionary default&lt;/h3&gt;
+&lt;p&gt;The shrunken dictionary is now enabled by default; for precise count distinct on high-cardinality dimensions, it can significantly reduce the build time.&lt;/p&gt;
+
+&lt;h3 id=&quot;kylin-3839---storage-clean-up-after-the-refreshing-and-deleting-a-segment&quot;&gt;KYLIN-3839 - Storage clean up after the refreshing and deleting a segment&lt;/h3&gt;
+&lt;p&gt;Unnecessary data files are now cleaned up in a timely manner after a segment is refreshed or deleted.&lt;/p&gt;
+
+&lt;p&gt;&lt;strong&gt;Download&lt;/strong&gt;&lt;/p&gt;
+
+&lt;p&gt;To download Apache Kylin v3.0.0-alpha source code or binary package, visit the &lt;a href=&quot;http://kylin.apache.org/download&quot;&gt;download&lt;/a&gt; page.&lt;/p&gt;
+
+&lt;p&gt;&lt;strong&gt;Upgrade&lt;/strong&gt;&lt;/p&gt;
+
+&lt;p&gt;Follow the &lt;a href=&quot;/docs/howto/howto_upgrade.html&quot;&gt;upgrade guide&lt;/a&gt;.&lt;/p&gt;
+
+&lt;p&gt;&lt;strong&gt;Feedback&lt;/strong&gt;&lt;/p&gt;
+
+&lt;p&gt;If you run into an issue or have a question, please send mail to the Apache Kylin dev or user mailing list: dev@kylin.apache.org, user@kylin.apache.org. Before sending, please make sure you have subscribed to the mailing list by sending an email to dev-subscribe@kylin.apache.org or user-subscribe@kylin.apache.org.&lt;/p&gt;
+
+&lt;p&gt;&lt;em&gt;Great thanks to everyone who contributed!&lt;/em&gt;&lt;/p&gt;
+</description>
+        <pubDate>Fri, 19 Apr 2019 13:00:00 -0700</pubDate>
+        <link>http://kylin.apache.org/blog/2019/04/19/release-v3.0.0-alpha/</link>
+        <guid isPermaLink="true">http://kylin.apache.org/blog/2019/04/19/release-v3.0.0-alpha/</guid>
+        
+        
+        <category>blog</category>
+        
+      </item>
+    
+      <item>
+        <title>Apache Kylin v3.0.0-alpha 发布</title>
+        <description>&lt;p&gt;近日 Apache Kylin 社区很高兴地宣布,Apache Kylin v3.0.0-alpha 正式发布。&lt;/p&gt;
+
+&lt;p&gt;Apache Kylin 是一个开源的分布式分析引擎,旨在为极大数据集提供 SQL 接口和多维分析(OLAP)的能力。&lt;/p&gt;
+
+&lt;p&gt;这是 Kylin 下一代 v3.x 的第一个发布版本,用于早期预览,主要的功能是实时 (Real-time) OLAP。完整的改动列表请参见&lt;a href=&quot;/docs/release_notes.html&quot;&gt;release notes&lt;/a&gt;;这里挑一些主要改进做说明。&lt;/p&gt;
+
+&lt;h1 id=&quot;section&quot;&gt;重要新功能&lt;/h1&gt;
+
+&lt;h3 id=&quot;kylin-3654----olap&quot;&gt;KYLIN-3654 - 实时 OLAP&lt;/h3&gt;
+&lt;p&gt;随着引入新的 real-time receiver 和 coordinator 组件,Kylin 能够实现毫秒级别的数据准备延迟,数据源来自流式数据如 Apache Kafka。这意味着,从 v3.0 开始,Kylin 既能够支持历史批量数据的 OLAP,也支持对流式数据的准实时(Near real-time)以及完全实时(real-time)分析。用户可以使用一个 OLAP 平台来服务不同的使用场景。此方案已经在早期用户如 eBay 得到部署和验证。关于如何使用此功能,请参考&lt;a href=&quot;/docs30/tutorial/realtime_olap.html&quot;&gt;此教程&lt;/a&gt;。&lt;/p&gt;
+
+&lt;h3 id=&quot;kylin-3795----apache-livy--spark-&quot;&gt;KYLIN-3795 - 通过 Apache Livy 递交 Spark 任务&lt;/h3&gt;
+&lt;p&gt;这个功能允许管理员为 Kylin 配置使用 Apache Livy (incubating) 来完成任务的递交。Spark 作业的提交通过 Livy 的 REST API 来提交,而无需在本地启动 Spark Driver 进程,从而方便对 Spark 资源的管理监控,同时也降低对 Kylin 任务进程所在节点的压力。&lt;/p&gt;
+
+&lt;h3 id=&quot;kylin-3820----curator-&quot;&gt;KYLIN-3820 - 基于 Curator 的任务节点分配和服务发现&lt;/h3&gt;
+&lt;p&gt;新增一种基于Apache Zookeeper 和 Curator作业调度器,可以自动发现 Kylin 节点,并自动分配一个节点来进行任务的管理以及故障恢复。有了这个功能后,管理员可以更加容易地部署和扩展 Kylin 节点,而不再需要在 &lt;code class=&quot;highlighter-rouge&quot;&gt;kylin.properties&lt;/code&gt; 中配置每个 Kylin 节点的地址并重启 Kylin 以使之生效。&lt;/p&gt;
+
+&lt;h1 id=&quot;section-1&quot;&gt;其它改进&lt;/h1&gt;
+
+&lt;h3 id=&quot;kylin-3716---fastthreadlocal--threadlocal&quot;&gt;KYLIN-3716 - FastThreadLocal 替换 ThreadLocal&lt;/h3&gt;
+&lt;p&gt;使用 Netty 中的 FastThreadLocal 替代 JDK 原生的 ThreadLocal,可以一定程度上提升 Kylin 在高并发下的性能。&lt;/p&gt;
+
+&lt;h3 id=&quot;kylin-3867---enable-jdbc-to-use-key-store--trust-store-for-https-connection&quot;&gt;KYLIN-3867 - Enable JDBC to use key store &amp;amp; trust store for https connection&lt;/h3&gt;
+&lt;p&gt;通过使用HTTPS,保护了JDBC使用的身份验证信息,使得Kylin更加安全&lt;/p&gt;
+
+&lt;h3 id=&quot;kylin-3905---enable-shrunken-dictionary-default&quot;&gt;KYLIN-3905 - Enable shrunken dictionary default&lt;/h3&gt;
+&lt;p&gt;默认开启 shrunken dictionary,针对高基维进行精确去重的场景,可以显著减少构建用时。&lt;/p&gt;
+
+&lt;h3 id=&quot;kylin-3839---storage-clean-up-after-the-refreshing-and-deleting-a-segment&quot;&gt;KYLIN-3839 - Storage clean up after the refreshing and deleting a segment&lt;/h3&gt;
+&lt;p&gt;更加及时地清除不必要的数据文件&lt;/p&gt;
+
+&lt;p&gt;&lt;strong&gt;下载&lt;/strong&gt;&lt;/p&gt;
+
+&lt;p&gt;要下载Apache Kylin 源代码或二进制包,请访问&lt;a href=&quot;/download&quot;&gt;下载页面&lt;/a&gt;。&lt;/p&gt;
+
+&lt;p&gt;&lt;strong&gt;升级&lt;/strong&gt;&lt;/p&gt;
+
+&lt;p&gt;参考&lt;a href=&quot;/docs/howto/howto_upgrade.html&quot;&gt;升级指南&lt;/a&gt;.&lt;/p&gt;
+
+&lt;p&gt;&lt;strong&gt;反馈&lt;/strong&gt;&lt;/p&gt;
+
+&lt;p&gt;如果您遇到问题或疑问,请发送邮件至 Apache Kylin dev 或 user 邮件列表:dev@kylin.apache.org,user@kylin.apache.org; 在发送之前,请确保您已通过发送电子邮件至 dev-subscribe@kylin.apache.org 或 user-subscribe@kylin.apache.org 订阅了邮件列表。&lt;/p&gt;
+
+&lt;p&gt;&lt;em&gt;非常感谢所有贡献Apache Kylin的朋友!&lt;/em&gt;&lt;/p&gt;
+</description>
+        <pubDate>Fri, 19 Apr 2019 13:00:00 -0700</pubDate>
+        <link>http://kylin.apache.org/cn/blog/2019/04/19/release-v3.0.0-alpha/</link>
+        <guid isPermaLink="true">http://kylin.apache.org/cn/blog/2019/04/19/release-v3.0.0-alpha/</guid>
+        
+        
+        <category>blog</category>
+        
+      </item>
+    
+      <item>
         <title>Real-time Streaming Design in Apache Kylin</title>
         <description>&lt;h2 id=&quot;why-build-real-time-streaming-in-kylin&quot;&gt;Why Build Real-time Streaming in Kylin&lt;/h2&gt;
 &lt;p&gt;The real-time streaming feature is contributed by eBay big data team in Kylin 3.0, the purpose we build real-time streaming is:&lt;/p&gt;
@@ -854,70 +966,6 @@ Graphic 10 Process of Querying Cube&lt;/
       </item>
     
       <item>
-        <title>Apache Kylin v2.5.0 正式发布</title>
-        <description>&lt;p&gt;近日Apache Kylin 社区很高兴地宣布,Apache Kylin 2.5.0 正式发布。&lt;/p&gt;
-
-&lt;p&gt;Apache Kylin 是一个开源的分布式分析引擎,旨在为极大数据集提供 SQL 接口和多维分析(OLAP)的能力。&lt;/p&gt;
-
-&lt;p&gt;这是继2.4.0 后的一个新功能版本。该版本引入了很多有价值的改进,完整的改动列表请参见&lt;a href=&quot;https://kylin.apache.org/docs/release_notes.html&quot;&gt;release notes&lt;/a&gt;;这里挑一些主要改进做说明:&lt;/p&gt;
-
-&lt;h3 id=&quot;all-in-spark--cubing-&quot;&gt;All-in-Spark 的 Cubing 引擎&lt;/h3&gt;
-&lt;p&gt;Kylin 的 Spark 引擎将使用 Spark 运行 cube 计算中的所有分布式作业,包括获取各个维度的不同值,将 cuboid 文件转换为 HBase HFile,合并 segment,合并词典等。默认的 Spark 配置也经过优化,使得用户可以获得开箱即用的体验。相关开发任务是 KYLIN-3427, KYLIN-3441, KYLIN-3442.&lt;/p&gt;
-
-&lt;p&gt;Spark 任务管理也有所改进:一旦 Spark 任务开始运行,您就可以在Web控制台上获得作业链接;如果您丢弃该作业,Kylin 将立刻终止 Spark 作业以及时释放资源;如果重新启动 Kylin,它可以从上一个作业恢复,而不是重新提交新作业.&lt;/p&gt;
-
-&lt;h3 id=&quot;mysql--kylin-&quot;&gt;MySQL 做 Kylin 元数据的存储&lt;/h3&gt;
-&lt;p&gt;在过去,HBase 是 Kylin 元数据存储的唯一选择。 在某些情况下 HBase不适用,例如使用多个 HBase 集群来为 Kylin 提供跨区域的高可用,这里复制的 HBase 集群是只读的,所以不能做元数据存储。现在我们引入了 MySQL Metastore 以满足这种需求。此功能现在处于测试阶段。更多内容参见 KYLIN-3488。&lt;/p&gt;
-
-&lt;h3 id=&quot;hybrid-model-&quot;&gt;Hybrid model 图形界面&lt;/h3&gt;
-&lt;p&gt;Hybrid 是一种用于组装多个 cube 的高级模型。 它可用于满足 cube 的 schema 要发生改变的情况。这个功能过去没有图形界面,因此只有一小部分用户知道它。现在我们在 Web 界面上开启了它,以便更多用户可以尝试。&lt;/p&gt;
-
-&lt;h3 id=&quot;cube-planner&quot;&gt;默认开启 Cube planner&lt;/h3&gt;
-&lt;p&gt;Cube planner 可以极大地优化 cube 结构,减少构建的 cuboid 数量,从而节省计算/存储资源并提高查询性能。它是在v2.3中引入的,但默认情况下没有开启。为了让更多用户看到并尝试它,我们默认在v2.5中启用它。 算法将在第一次构建 segment 的时候,根据数据统计自动优化 cuboid 集合.&lt;/p&gt;
-
-&lt;h3 id=&quot;segment-&quot;&gt;改进的 Segment 剪枝&lt;/h3&gt;
-&lt;p&gt;Segment(分区)修剪可以有效地减少磁盘和网络I / O,因此大大提高了查询性能。 过去,Kylin 只按分区列 (partition date column) 的值进行 segment 的修剪。 如果查询中没有将分区列作为过滤条件,那么修剪将不起作用,会扫描所有segment。.&lt;br /&gt;
-现在从v2.5开始,Kylin 将在 segment 级别记录每个维度的最小/最大值。 在扫描 segment 之前,会将查询的条件与最小/最大索引进行比较。 如果不匹配,将跳过该 segment。 检查KYLIN-3370了解更多信息。&lt;/p&gt;
-
-&lt;h3 id=&quot;yarn-&quot;&gt;在 YARN 上合并字典&lt;/h3&gt;
-&lt;p&gt;当 segment 合并时,它们的词典也需要合并。在过去,字典合并发生在 Kylin 的 JVM 中,这需要使用大量的本地内存和 CPU 资源。 在极端情况下(如果有几个并发作业),可能会导致 Kylin 进程崩溃。 因此,一些用户不得不为 Kylin 任务节点分配更多内存,或运行多个任务节点以平衡工作负载。&lt;br /&gt;
-现在从v2.5开始,Kylin 将把这项任务提交给 Hadoop MapReduce 和 Spark,这样就可以解决这个瓶颈问题。 查看KYLIN-3471了解更多信息.&lt;/p&gt;
-
-&lt;h3 id=&quot;cube-&quot;&gt;改进使用全局字典的 cube 构建性能&lt;/h3&gt;
-&lt;p&gt;全局字典 (Global Dictionary) 是 bitmap 精确去重计数的必要条件。如果去重列具有非常高的基数,则 GD 可能非常大。在 cube 构建阶段,Kylin 需要通过 GD 将非整数值转换为整数。尽管 GD 已被分成多个切片,可以分开加载到内存,但是由于去重列的值是乱序的。Kylin 需要反复载入和载出(swap in/out)切片,这会导致构建任务非常缓慢。&lt;br /&gt;
-该增强功能引入了一个新步骤,为每个数据块从全局字典中构建一个缩小的字典。 随后每个任务只需要加载缩小的字典,从而避免频繁的载入和载出。性能可以比以前快3倍。查看 KYLIN-3491 了解更多信息.&lt;/p&gt;
-
-&lt;h3 id=&quot;topn-count-distinct--cube-&quot;&gt;改进含 TOPN, COUNT DISTINCT 的 cube 大小的估计&lt;/h3&gt;
-&lt;p&gt;Cube 的大小在构建时是预先估计的,并被后续几个步骤使用,例如决定 MR / Spark 作业的分区数,计算 HBase region 切割等。它的准确与否会对构建性能产生很大影响。 当存在 COUNT DISTINCT,TOPN 的度量时候,因为它们的大小是灵活的,因此估计值可能跟真实值有很大偏差。 在过去,用户需要调整若干个参数以使尺寸估计更接近实际尺寸,这对普通用户有点困难。&lt;br /&gt;
-现在,Kylin 将根据收集的统计信息自动调整大小估计。这可以使估计值与实际大小更接近。查看 KYLIN-3453 了解更多信息。&lt;/p&gt;
-
-&lt;h3 id=&quot;hadoop-30hbase-20&quot;&gt;支持Hadoop 3.0/HBase 2.0&lt;/h3&gt;
-&lt;p&gt;Hadoop 3和 HBase 2开始被许多用户采用。现在 Kylin 提供使用新的 Hadoop 和 HBase API 编译的新二进制包。我们已经在 Hortonworks HDP 3.0 和 Cloudera CDH 6.0 上进行了测试&lt;/p&gt;
-
-&lt;p&gt;&lt;strong&gt;下载&lt;/strong&gt;&lt;/p&gt;
-
-&lt;p&gt;要下载Apache Kylin v2.5.0源代码或二进制包,请访问&lt;a href=&quot;http://kylin.apache.org/download&quot;&gt;下载页面&lt;/a&gt; .&lt;/p&gt;
-
-&lt;p&gt;&lt;strong&gt;升级&lt;/strong&gt;&lt;/p&gt;
-
-&lt;p&gt;参考&lt;a href=&quot;/docs/howto/howto_upgrade.html&quot;&gt;升级指南&lt;/a&gt;.&lt;/p&gt;
-
-&lt;p&gt;&lt;strong&gt;反馈&lt;/strong&gt;&lt;/p&gt;
-
-&lt;p&gt;如果您遇到问题或疑问,请发送邮件至 Apache Kylin dev 或 user 邮件列表:dev@kylin.apache.org,user@kylin.apache.org; 在发送之前,请确保您已通过发送电子邮件至 dev-subscribe@kylin.apache.org 或 user-subscribe@kylin.apache.org订阅了邮件列表。&lt;/p&gt;
-
-&lt;p&gt;&lt;em&gt;非常感谢所有贡献Apache Kylin的朋友!&lt;/em&gt;&lt;/p&gt;
-</description>
-        <pubDate>Thu, 20 Sep 2018 13:00:00 -0700</pubDate>
-        <link>http://kylin.apache.org/cn/blog/2018/09/20/release-v2.5.0/</link>
-        <guid isPermaLink="true">http://kylin.apache.org/cn/blog/2018/09/20/release-v2.5.0/</guid>
-        
-        
-        <category>blog</category>
-        
-      </item>
-    
-      <item>
         <title>Apache Kylin v2.5.0 Release Announcement</title>
         <description>&lt;p&gt;The Apache Kylin community is pleased to announce the release of Apache Kylin v2.5.0.&lt;/p&gt;
 
@@ -990,287 +1038,63 @@ Graphic 10 Process of Querying Cube&lt;/
       </item>
     
       <item>
-        <title>Use Star Schema Benchmark for Apache Kylin</title>
-        <description>&lt;h2 id=&quot;background&quot;&gt;Background&lt;/h2&gt;
-
-&lt;p&gt;For many Apache Kylin users, when deploying Kylin in the production environment, how to measure Kylin’s performance before delivering to the business is a problem. A performance benchmark can help to find the potential performance issues, so you can tune the configuration to improve the overall performance. The tunning may include Kylin’s own Job and Query, concurrent building of Cubes, HBase write and read, MapReduce or Spark parameters and more.&lt;/p&gt;
-
-&lt;h2 id=&quot;ssb-introduction&quot;&gt;SSB Introduction&lt;/h2&gt;
-&lt;p&gt;Kyligence Inc provides an SSB (Star Schema Benchmark) project called &lt;a href=&quot;https://github.com/Kyligence/ssb-kylin&quot;&gt;ssb-kylin&lt;/a&gt; on github, which is modified from the TPC-H benchmark, and specifically targeted to test tools in the star model OLAP scenario.&lt;/p&gt;
-
-&lt;p&gt;The test process generates 5 tables, and the data volume can be adjusted by parameters. The table structure of SSB is shown below:&lt;/p&gt;
-
-&lt;p&gt;&lt;img src=&quot;/images/blog/1. The table structure of SSB.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
-
-&lt;p&gt;The table “lineorder” is the fact table, the other four are dimension tables. Each dimension table is associated with the fact table by the primary key, which is a standard star schema.&lt;/p&gt;
-
-&lt;p&gt;The environment for this test is CDH 5.13.3, which enables authentication and authorization of Kerberos and OpenLDAP, and uses Sentry to provide fine-grained, role-based authorization and multi-tenant management. However, the official “ssb-kylin” does not involve the processing of permissions and authentication, so I have slightly modified it. For details, see my fork &lt;a href=&quot;https://github.com/jiangshouzhuang/ssb-kylin&quot;&gt;jiangshouzhuang/ssb-kylin&lt;/a&gt;.&lt;/p&gt;
-
-&lt;h2 id=&quot;prerequisites&quot;&gt;Prerequisites&lt;/h2&gt;
-
-&lt;p&gt;** Here is a description of the Kylin deployment:**&lt;br /&gt;
-  1. Kylin deploys integrated OpenLDAP user unified authentication management&lt;br /&gt;
-  2. Add Kylin deployment user kylin_manager_user in OpenLDAP (user group is kylin_manager_group)&lt;br /&gt;
-  3. The Kylin version is apache-kylin-2.4.0&lt;br /&gt;
-  4. Kylin Cluster configuration (VM):&lt;br /&gt;
-  Kylin Job 1 node: 16GB, 8Cores&lt;br /&gt;
-  Kylin Query 2 nodes: 32GB, 8Cores&lt;br /&gt;
-&lt;strong&gt;A few points before SSB pressure measurement:&lt;/strong&gt;&lt;br /&gt;
-1 Create a database named ssb in the Hive database.&lt;/p&gt;
-&lt;pre name=&quot;code&quot; class=&quot;java&quot;&gt;
-# Log in to the hive database as a super administrator.  
-Create database SSB;  
-CREATE ROLE ssb_write_role;  
-GRANT ALL ON DATABASE ssb TO ROLE ssb_write_role;  
-GRANT ROLE ssb_write_role TO GROUP ssb_write_group;  
-# Then add kylin_manager_user to kylin_manager_group in OpenLDAP, so kylin_manager_user has access to the ssb database.
-&lt;/pre&gt;
-&lt;p&gt;2 Assign HDFS directory /user/kylin_manager_user read and write permissions to kylin_manager_user user.&lt;br /&gt;
-3 Configure the HADOOP_STREAMING_JAR environment variable under the kylin_manager_user user home directory.&lt;br /&gt;
-&lt;code class=&quot;highlighter-rouge&quot;&gt;
-Export HADOOP_STREAMING_JAR=/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-streaming.jar
-&lt;/code&gt;&lt;/p&gt;
-
-&lt;h2 id=&quot;download-the-ssb-tool-and-compile&quot;&gt;Download the SSB tool and compile&lt;/h2&gt;
-
-&lt;p&gt;You can quickly download and compile the ssb test tool by entering the following command in the linux terminal.&lt;/p&gt;
-
-&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone https://github.com/jiangshouzhuang/ssb-kylin.git
-cd ssb-kylin
-cd ssb-benchmark
-make clean
-make
-&lt;/code&gt;&lt;/pre&gt;
-&lt;/div&gt;
-
-&lt;h2 id=&quot;adjust-the-ssb-parameters&quot;&gt;Adjust the SSB parameters&lt;/h2&gt;
-
-&lt;p&gt;In the ssb-kylin project, there is a ssb.conf file below the bin directory, which defines the base data volume of the fact table and the dimension table. When we generate the amount of test data, we can specify the size of the scale so that the actual data is base * scale.&lt;/p&gt;
-
-&lt;p&gt;Part of the ssb.conf file is:&lt;/p&gt;
-
-&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;  # customer base, default value is 30,000
-  customer_base = 30000
-  # part base, default value is 200,000
-  part_base = 200000
-  # supply base, default value is 2,000
-  supply_base = 2000
-  # date base (days), default value is 2,556
-  date_base = 2556
-  # lineorder base (purchase record), default value is 6,000,000
-  lineorder_base = 6000000
-&lt;/code&gt;&lt;/pre&gt;
-&lt;/div&gt;
-
-&lt;p&gt;Of course, the above base parameters can be adjusted according to their actual needs, I use the default parameters.&lt;br /&gt;
-In the ssb.conf file, there are some parameters as follows.&lt;/p&gt;
-
-&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# manufacturer max. The value range is (1 .. manu_max)
-manu_max = 5
-# category max. The value range is (1 .. cat_max)
-cat_max = 5
-# brand max. The value range is (1 .. brand_max)
-brand_max = 40
-&lt;/code&gt;&lt;/pre&gt;
-&lt;/div&gt;
-
-&lt;p&gt;&lt;strong&gt;The explanation is as follows:&lt;/strong&gt; &lt;br /&gt;
-manu_max, cat_max and brand_max are used to define hierarchical scale. For example, manu_max=10, cat_max=10, and brand_max=10 refer to a total of 10 manufactures, and each manufactures has a maximum of 10 category parts, and each category has up to 10 brands. Therefore, the cardinality of manufacture is 10, the cardinality of category is 100, and the cardinality of brand is 1000.&lt;/p&gt;
-
-&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# customer: num of cities per country, default value is 100
-cust_city_max = 9
-# supplier: num of cities per country, default value is 100
-supp_city_max = 9
-&lt;/code&gt;&lt;/pre&gt;
-&lt;/div&gt;
-
-&lt;p&gt;&lt;strong&gt;The explanation is as follows:&lt;/strong&gt; &lt;br /&gt;
-cust_city_max and supp_city_max are used to define the number of city for each country in customer and supplier tables. If the total number of country is 30, and cust_city_max=100, supp_city_max=10, then the customer table will have 3000 different city, and the supplier table will have 300 different city.&lt;/p&gt;
-
-&lt;p&gt;&lt;strong&gt;Prompt:&lt;/strong&gt;&lt;br /&gt;
-In this pressure test, the resources allocated by Yarn are used to generate test data. If the memory problems are encountered in the process of generating the data, increase the memory size of the Yarn allocation of container.&lt;/p&gt;
-
-&lt;h2 id=&quot;generate-test-data&quot;&gt;Generate test data&lt;/h2&gt;
-
-&lt;p&gt;Before running the &lt;code class=&quot;highlighter-rouge&quot;&gt;ssb-kylin/bin/run.sh&lt;/code&gt; script, explain several points to run.sh:&lt;br /&gt;
-1 configuring HDFS_BASE_DIR as the path to table data, because I give kylin_manager_user the right to read and write to /user/kylin_manager_user directory, so configure here:&lt;/p&gt;
-&lt;pre name=&quot;code&quot; class=&quot;java&quot;&gt;
-HDFS_BASE_DIR=/user/kylin_manager_user/ssb
-&lt;/pre&gt;
-&lt;p&gt;The temporary and actual data will be generated under this directory when you run run.sh.&lt;br /&gt;
-2 configure the LDAP user and password for deploying Kylin, and operate KeyTab files such as HDFS.&lt;/p&gt;
-&lt;pre name=&quot;code&quot; class=&quot;java&quot;&gt;
-KYLIN_INSTALL_USER=kylin_manager_user
-KYLIN_INSTALL_USER_PASSWD=xxxxxxxx
-KYLIN_INSTALL_USER_KEYTAB=/home/${KYLIN_INSTALL_USER}/keytab/${KYLIN_INSTALL_USER}.keytab
-&lt;/pre&gt;
-&lt;p&gt;3 configure the way that beeline accesses the hive database.&lt;/p&gt;
-&lt;pre name=&quot;code&quot; class=&quot;java&quot;&gt;
-BEELINE_URL=jdbc:hive2://hiveserve2_ip:10000
-HIVE_BEELINE_COMMAND=&quot;beeline -u ${BEELINE_URL} -n ${KYLIN_INSTALL_USER} -p
-${KYLIN_INSTALL_USER_PASSWD} -d org.apache.hive.jdbc.HiveDriver&quot;
-&lt;/pre&gt;
-&lt;p&gt;If your CDH or other big data platform is not using beeline, but hive cli, please modify it yourself.&lt;br /&gt;
-Once everything is ready, we start running the program and generate test data:&lt;/p&gt;
-
-&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;cd ssb-kylin
-bin/run.sh --scale 20
-&lt;/code&gt;&lt;/pre&gt;
-&lt;/div&gt;
-
-&lt;p&gt;We set the scale to 20, the program will run for a while, the maximum lineorder table data has more than 100 million. After the program is executed, we look at the tables in the hive database and the amount of data:&lt;/p&gt;
-
-&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;use ssb;
-show tables;
-select count(1) from lineorder;
-select count(1) from p_lineorder;
-&lt;/code&gt;&lt;/pre&gt;
-&lt;/div&gt;
-
-&lt;p&gt;&lt;img src=&quot;/images/blog/2.1 generated tables.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
-
-&lt;p&gt;&lt;img src=&quot;/images/blog/2.2 the volume of data.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
-
-&lt;p&gt;As you can see, a total of five tables and one view were created.&lt;/p&gt;
-
-&lt;h2 id=&quot;load-the-cubes-metadata-and-build-the-cube&quot;&gt;Load the cube’s metadata and build the cube&lt;/h2&gt;
-
-&lt;p&gt;The ssb-kylin project has helped us build the project, model, and cube in advance. Just import the Kylin directly like the learn_kylin example. Cube Metadata’s directory is cubemeta, because our kylin integrates OpenLDAP, there is no ADMIN user, so the owner parameter in cubemeta/cube/ssb.json is set to null.&lt;br /&gt;
-Execute the following command to import cubemeta:&lt;/p&gt;
-
-&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;cd ssb-kylin
-$KYLIN_HOME/bin/metastore.sh restore cubemeta
-&lt;/code&gt;&lt;/pre&gt;
-&lt;/div&gt;
-
-&lt;p&gt;Then log in to Kylin and execute Reload Metadata operation. This creates new project, model and cube in Kylin. Before building cube, first Disable, then Purge, delete old temporary files.&lt;/p&gt;
-
-&lt;p&gt;The results of building with MapReduce are as follows:&lt;/p&gt;
-
-&lt;p&gt;&lt;img src=&quot;/images/blog/3 build with mapReduce.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
-
-&lt;p&gt;Here I test the performance of Spark to build Cube again, disable the previously created Cube, and then Purge. Since the Cube is used by Purge, the useless HBase tables and HDFS files need to be deleted. Here, manually clean up the junk files. First execute the following command:&lt;/p&gt;
-
-&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete false
-&lt;/code&gt;&lt;/pre&gt;
-&lt;/div&gt;
-
-&lt;p&gt;Then check whether the listed HBase table and the HDFS file are useless. After confirming the error, perform the delete operation:&lt;/p&gt;
-
-&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete true
-&lt;/code&gt;&lt;/pre&gt;
-&lt;/div&gt;
-
-&lt;p&gt;When using Spark to build a cube, it consumes a lot of memory. After all, using memory resources improves the speed of cube building. Here I will list some of the parameters of Spark in the kylin.properties configuration file:&lt;/p&gt;
-
-&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kylin.engine.spark-conf.spark.master=yarn
-kylin.engine.spark-conf.spark.submit.deployMode=cluster
-kylin.engine.spark-conf.spark.yarn.queue=root.kylin_manager_group
-# config Dynamic resource allocation
-kylin.engine.spark-conf.spark.dynamicAllocation.enabled=true
-kylin.engine.spark-conf.spark.dynamicAllocation.minExecutors=10
-kylin.engine.spark-conf.spark.dynamicAllocation.maxExecutors=1024
-kylin.engine.spark-conf.spark.dynamicAllocation.executorIdleTimeout=300
-
-kylin.engine.spark-conf.spark.shuffle.service.enabled=true
-kylin.engine.spark-conf.spark.shuffle.service.port=7337
-
-kylin.engine.spark-conf.spark.driver.memory=4G
-kylin.engine.spark-conf.spark.executor.memory=4G 
-kylin.engine.spark-conf.spark.executor.cores=1
-kylin.engine.spark-conf.spark.network.timeout=600
-&lt;/code&gt;&lt;/pre&gt;
-&lt;/div&gt;
-
-&lt;p&gt;The above parameters can meet most of the requirements, so users basically do not need to configure when designing the Cube. Of course, if the situation is special, you can still set Spark-related tuning parameters at the Cube level.&lt;/p&gt;
-
-&lt;p&gt;Before executing Spark to build a Cube, you need to set the Cube Engine value to Spark in Advanced Setting and then execute Build. After the construction is completed, the results are as follows:&lt;/p&gt;
-
-&lt;p&gt;&lt;img src=&quot;/images/blog/4 build completely.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
-
-&lt;p&gt;In contrast, the time for MapReduce and Spark to build Cube is as follows: (Scale=20):&lt;/p&gt;
-
-&lt;p&gt;&lt;img src=&quot;/images/blog/5 the results of comparing Spark and MapReduce.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
-
-&lt;p&gt;You can see that the speed of building is almost 1x faster. In fact, Spark has many other aspects of tuning (performance can be improved by 1-4 times and above), which is not involved here.&lt;/p&gt;
-
-&lt;h2 id=&quot;query&quot;&gt;Query&lt;/h2&gt;
-
-&lt;p&gt;Ssb-kylin provides 13 SSB query SQL lists. The query conditions may vary with the scale factor. You can modify the results according to the actual situation. The following examples show the test results in the case of scale 10 and 20:&lt;br /&gt;
-The query result of Scale=10 is as follows:&lt;/p&gt;
-
-&lt;p&gt;&lt;img src=&quot;/images/blog/6.1 scale 10.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
-
-&lt;p&gt;The query result of Scale=20 is as follows:&lt;/p&gt;
-
-&lt;p&gt;&lt;img src=&quot;/images/blog/6.2 scale 20.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
-
-&lt;p&gt;As can be seen from the results, all the queries are completed within 1 s, which proves Apache Kylin’s subsecond query capability strongly. In addition, the average performance of the query did not decrease significantly as the amount of data doubled, which is also determined by the theory of Cube precomputation.&lt;/p&gt;
-
-&lt;p&gt;Note: For details on each query statement, see the README.md description in the ssb-kylin project.&lt;/p&gt;
+        <title>Apache Kylin v2.5.0 正式发布</title>
+        <description>&lt;p&gt;近日Apache Kylin 社区很高兴地宣布,Apache Kylin 2.5.0 正式发布。&lt;/p&gt;
 
-&lt;p&gt;At this point, the Kylin’s SSB pressure test is completed, but for you who are reading the article, everything is just beginning.&lt;/p&gt;
+&lt;p&gt;Apache Kylin 是一个开源的分布式分析引擎,旨在为极大数据集提供 SQL 接口和多维分析(OLAP)的能力。&lt;/p&gt;
 
-&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;
+&lt;p&gt;这是继2.4.0 后的一个新功能版本。该版本引入了很多有价值的改进,完整的改动列表请参见&lt;a href=&quot;https://kylin.apache.org/docs/release_notes.html&quot;&gt;release notes&lt;/a&gt;;这里挑一些主要改进做说明:&lt;/p&gt;
 
-&lt;ol&gt;
-  &lt;li&gt;蒋守壮.&lt;a href=&quot;https://juejin.im/post/5b46d0606fb9a04fd6593d31&quot;&gt;如何使用 Star Schema Benchmark 压测 Apache Kylin&lt;/a&gt;&lt;/li&gt;
-&lt;/ol&gt;
+&lt;h3 id=&quot;all-in-spark--cubing-&quot;&gt;All-in-Spark 的 Cubing 引擎&lt;/h3&gt;
+&lt;p&gt;Kylin 的 Spark 引擎将使用 Spark 运行 cube 计算中的所有分布式作业,包括获取各个维度的不同值,将 cuboid 文件转换为 HBase HFile,合并 segment,合并词典等。默认的 Spark 配置也经过优化,使得用户可以获得开箱即用的体验。相关开发任务是 KYLIN-3427, KYLIN-3441, KYLIN-3442.&lt;/p&gt;
 
-</description>
-        <pubDate>Mon, 16 Jul 2018 05:28:00 -0700</pubDate>
-        <link>http://kylin.apache.org/blog/2018/07/16/Star-Schema-Benchmark-on-Apache-Kylin/</link>
-        <guid isPermaLink="true">http://kylin.apache.org/blog/2018/07/16/Star-Schema-Benchmark-on-Apache-Kylin/</guid>
-        
-        
-        <category>blog</category>
-        
-      </item>
-    
-      <item>
-        <title>Redash-Kylin plugin from Strikingly</title>
-        <description>&lt;p&gt;At strikingly, we are using Apache Kylin as our OLAP engine. Kylin is very powerful and it supports our big data business well. We’ve chosen Apache Kylin because it fits our demand: it handles a huge amount of data, undertakes multiple concurrent queries and has sub-second response time.&lt;/p&gt;
+&lt;p&gt;Spark 任务管理也有所改进:一旦 Spark 任务开始运行,您就可以在Web控制台上获得作业链接;如果您丢弃该作业,Kylin 将立刻终止 Spark 作业以及时释放资源;如果重新启动 Kylin,它可以从上一个作业恢复,而不是重新提交新作业.&lt;/p&gt;
 
-&lt;p&gt;Although we are mainly using Kylin to provide service to our customers, we’ve decided to reuse the built result for internal purposes too. Kylin supports Business Intelligence tools like Apache Zeppelin and Tableau. With these BI tools we can provide insight and visualization about our data which will help making business decisions.&lt;/p&gt;
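+&lt;h3 id=&quot;mysql--kylin-&quot;&gt;MySQL as Kylin metadata storage&lt;/h3&gt;
+&lt;p&gt;In the past, HBase was the only option for Kylin metadata storage. In some cases HBase is not suitable, for example when multiple HBase clusters are used to give Kylin cross-region high availability: the replicated HBase cluster is read-only, so it cannot serve as the metadata store. We have now introduced the MySQL Metastore to meet this need. This feature is currently in beta. See KYLIN-3488 for more details.&lt;/p&gt;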
 
-&lt;p&gt;Other than those BI tools mentioned above, we’re using another similar application named Redash because:&lt;/p&gt;
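+&lt;h3 id=&quot;hybrid-model-&quot;&gt;Hybrid model web GUI&lt;/h3&gt;
+&lt;p&gt;Hybrid is an advanced model for assembling multiple cubes. It can be used when a cube's schema needs to change. This feature had no GUI in the past, so only a small portion of users knew about it. We have now enabled it in the web interface so that more users can try it.&lt;/p&gt;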
 
-&lt;ol&gt;
-  &lt;li&gt;
-    &lt;p&gt;We’ve already had a deployment of redash for data analyzing upon traditional databases like PostgreSQL, etc&lt;/p&gt;
-  &lt;/li&gt;
-  &lt;li&gt;
-    &lt;p&gt;Redash is open source and easy to deploy, rich in visualization functions and has good integrations with other productivity tools we are using (like Slack).&lt;/p&gt;
-  &lt;/li&gt;
-&lt;/ol&gt;
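+&lt;h3 id=&quot;cube-planner&quot;&gt;Cube planner enabled by default&lt;/h3&gt;
+&lt;p&gt;Cube planner can greatly optimize the cube structure and reduce the number of cuboids built, which saves computing and storage resources and improves query performance. It was introduced in v2.3 but was not enabled by default. To let more users see and try it, we enable it by default in v2.5. The algorithm automatically optimizes the cuboid set based on data statistics the first time a segment is built.&lt;/p&gt;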
 
-&lt;p&gt;Unfortunately, redash doesn’t officially support Kylin as a data source for now. Thus we wrote a simple one to include it. The plugin has already been open sourced under BSD-2 license as a &lt;a href=&quot;https://github.com/strikingly/redash-kylin&quot;&gt;GitHub repository&lt;/a&gt;.&lt;/p&gt;
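+&lt;h3 id=&quot;segment-&quot;&gt;Improved segment pruning&lt;/h3&gt;
+&lt;p&gt;Segment (partition) pruning can effectively reduce disk and network I/O and therefore greatly improves query performance. In the past, Kylin pruned segments only by the value of the partition date column. If a query did not use the partition column as a filter condition, pruning had no effect and all segments were scanned.&lt;br /&gt;
+Starting from v2.5, Kylin records the minimum and maximum value of each dimension at the segment level. Before scanning a segment, it compares the query's conditions against this min/max index. If they do not match, the segment is skipped. See KYLIN-3370 for more information.&lt;/p&gt;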
 
-&lt;p&gt;The redash-kylin plugin is just a single piece of python file which implements redash’s data source protocol. To install, retrieve the &lt;code class=&quot;highlighter-rouge&quot;&gt;kylin.py&lt;/code&gt; file inside &lt;code class=&quot;highlighter-rouge&quot;&gt;redash/query_runner&lt;/code&gt; folder of the plugin’s repository and place it under corresponding folder of redash.&lt;/p&gt;
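+&lt;h3 id=&quot;yarn-&quot;&gt;Merging dictionaries on YARN&lt;/h3&gt;
+&lt;p&gt;When segments are merged, their dictionaries also need to be merged. In the past, dictionary merging happened inside Kylin's JVM, which required a lot of local memory and CPU resources. In extreme cases (with several concurrent jobs), it could crash the Kylin process. As a result, some users had to allocate more memory to the Kylin job node, or run multiple job nodes to balance the workload.&lt;br /&gt;
+Starting from v2.5, Kylin submits this task to Hadoop MapReduce and Spark, which resolves this bottleneck. See KYLIN-3471 for more information.&lt;/p&gt;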
 
-&lt;p&gt;&lt;img src=&quot;/images/blog/redash/redash_1.jpeg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
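+&lt;h3 id=&quot;cube-&quot;&gt;Improved build performance for cubes that use a global dictionary&lt;/h3&gt;
+&lt;p&gt;The Global Dictionary (GD) is required for bitmap-based precise count distinct. If the distinct-count column has very high cardinality, the GD can be very large. During the cube build phase, Kylin needs to convert non-integer values to integers through the GD. Although the GD has been split into multiple slices that can be loaded into memory separately, the values of the distinct-count column are unordered, so Kylin has to swap slices in and out repeatedly, which makes the build job very slow.&lt;br /&gt;
+This enhancement introduces a new step that builds a shrunken dictionary from the global dictionary for each data block. Each task then only needs to load the shrunken dictionary, which avoids the frequent swapping in and out. Performance can be 3 times faster than before. See KYLIN-3491 for more information.&lt;/p&gt;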
 
-&lt;p&gt;Before you can use the plugin, you need to enable it first. Please modify the default enabled plugin list defined in &lt;code class=&quot;highlighter-rouge&quot;&gt;redash/settings.py&lt;/code&gt;:&lt;/p&gt;
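+&lt;h3 id=&quot;topn-count-distinct--cube-&quot;&gt;Improved size estimation for cubes with TOPN and COUNT DISTINCT&lt;/h3&gt;
+&lt;p&gt;The cube size is estimated in advance at build time and is used by several later steps, such as deciding the number of partitions for the MR/Spark job and calculating the HBase region splits. Its accuracy has a big impact on build performance. When COUNT DISTINCT or TOPN measures are present, the estimate may deviate greatly from the real value because the size of these measures is flexible. In the past, users had to tune several parameters to bring the size estimate closer to the actual size, which was difficult for ordinary users.&lt;br /&gt;
+Now Kylin automatically adjusts the size estimate based on the collected statistics, which brings the estimate much closer to the actual size. See KYLIN-3453 for more information.&lt;/p&gt;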
 
-&lt;p&gt;&lt;img src=&quot;/images/blog/redash/redash_2.jpeg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
+&lt;h3 id=&quot;hadoop-30hbase-20&quot;&gt;Support for Hadoop 3.0/HBase 2.0&lt;/h3&gt;
+&lt;p&gt;Hadoop 3 and HBase 2 are starting to be adopted by many users. Kylin now provides new binary packages compiled against the new Hadoop and HBase APIs. We have tested them on Hortonworks HDP 3.0 and Cloudera CDH 6.0.&lt;/p&gt;
 
-&lt;p&gt;Finally, you have to rebuild the docker image of redash (if you are using a docker deployment) and restart both its server and worker. Currently, the redash-kylin plugin only supports the current stable version of redash (3.0.0) and the 2.x versions of Apache Kylin.&lt;/p&gt;
+&lt;p&gt;&lt;strong&gt;Download&lt;/strong&gt;&lt;/p&gt;
 
-&lt;p&gt;Once installed successfully, you’ll be able to find a KylinAPI data source type at the New Data Source page. To use it, just select that source type and fill in required fields. The redash-kylin plugin works by calling Kylin’s HTTP RESTful API, thus you should make sure your redash deployment has an access to your Kylin cluster (either job mode or query mode).&lt;/p&gt;
+&lt;p&gt;To download the Apache Kylin v2.5.0 source code or binary package, visit the &lt;a href=&quot;http://kylin.apache.org/download&quot;&gt;download page&lt;/a&gt;.&lt;/p&gt;
 
-&lt;p&gt;&lt;img src=&quot;/images/blog/redash/redash_3.jpeg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
+&lt;p&gt;&lt;strong&gt;Upgrade&lt;/strong&gt;&lt;/p&gt;
 
-&lt;p&gt;After a data source is set up and the connection has been tested, you should be able to view schemas, run queries and make visualizations from tables in Kylin. Just type the SQL query in and get the result out. For more details about redash’s usage, please refer to &lt;a href=&quot;https://redash.io/help/&quot;&gt;redash’s documentation&lt;/a&gt;.&lt;/p&gt;
+&lt;p&gt;See the &lt;a href=&quot;/docs/howto/howto_upgrade.html&quot;&gt;upgrade guide&lt;/a&gt;.&lt;/p&gt;
 
-&lt;p&gt;&lt;img src=&quot;/images/blog/redash/redash_4.jpeg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
+&lt;p&gt;&lt;strong&gt;Feedback&lt;/strong&gt;&lt;/p&gt;
 
-&lt;p&gt;You can also add multiple data sources by setting different project names or different API URLs. It’s worth mentioning that redash has an experimental function which supports querying former cached query results. Thus, once query results from different Kylin clusters have been imported, you’ll be able to join them together for richer data processing.&lt;/p&gt;
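+&lt;p&gt;If you run into problems or have questions, please send an email to the Apache Kylin dev or user mailing list: dev@kylin.apache.org, user@kylin.apache.org. Before sending, please make sure you have subscribed to the mailing list by sending an email to dev-subscribe@kylin.apache.org or user-subscribe@kylin.apache.org.&lt;/p&gt;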
 
-&lt;p&gt;We hope you have a good time with Redash-Kylin!&lt;/p&gt;
+&lt;p&gt;&lt;em&gt;Many thanks to everyone who has contributed to Apache Kylin!&lt;/em&gt;&lt;/p&gt;
 </description>
-        <pubDate>Tue, 08 May 2018 13:00:00 -0700</pubDate>
-        <link>http://kylin.apache.org/blog/2018/05/08/redash-kylin-plugin-strikingly/</link>
-        <guid isPermaLink="true">http://kylin.apache.org/blog/2018/05/08/redash-kylin-plugin-strikingly/</guid>
+        <pubDate>Thu, 20 Sep 2018 13:00:00 -0700</pubDate>
+        <link>http://kylin.apache.org/cn/blog/2018/09/20/release-v2.5.0/</link>
+        <guid isPermaLink="true">http://kylin.apache.org/cn/blog/2018/09/20/release-v2.5.0/</guid>
         
         
         <category>blog</category>