Posted to commits@kylin.apache.org by bi...@apache.org on 2018/01/26 05:13:04 UTC

[1/5] kylin git commit: added Chinese version of howto_jdbc

Repository: kylin
Updated Branches:
  refs/heads/document 2a282eeb8 -> 770287871


added Chinese version of howto_jdbc

Signed-off-by: Billy Liu <bi...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/kylin/repo
Commit: http://git-wip-us.apache.org/repos/asf/kylin/commit/77028787
Tree: http://git-wip-us.apache.org/repos/asf/kylin/tree/77028787
Diff: http://git-wip-us.apache.org/repos/asf/kylin/diff/77028787

Branch: refs/heads/document
Commit: 77028787153a153116d5aefe7b43342f8ad027ca
Parents: 59d169c
Author: link3280 <49...@qq.com>
Authored: Sun Oct 22 14:09:35 2017 +0800
Committer: Billy Liu <bi...@apache.org>
Committed: Fri Jan 26 13:09:53 2018 +0800

----------------------------------------------------------------------
 website/_docs21/howto/howto_jdbc.cn.md | 92 +++++++++++++++++++++++++++++
 1 file changed, 92 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kylin/blob/77028787/website/_docs21/howto/howto_jdbc.cn.md
----------------------------------------------------------------------
diff --git a/website/_docs21/howto/howto_jdbc.cn.md b/website/_docs21/howto/howto_jdbc.cn.md
new file mode 100644
index 0000000..bd137be
--- /dev/null
+++ b/website/_docs21/howto/howto_jdbc.cn.md
@@ -0,0 +1,92 @@
+---
+layout: docs21
+title:  Kylin JDBC Driver
+categories: help
+permalink: /cn/docs21/howto/howto_jdbc.html
+---
+
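+### Authentication
+
+###### Authentication builds on the Apache Kylin authentication RESTful service. Supported parameters:
+* user : username
+* password : password
+* ssl: true or false. The default is false; if true, all service calls use HTTPS.
+
+### Connection URL format: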
+{% highlight Groff markup %}
+jdbc:kylin://<hostname>:<port>/<kylin_project_name>
+{% endhighlight %}
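+* If "ssl" is true, "port" should be the HTTPS port of the Kylin server.
+* If "port" is not specified, the driver uses the default port: 80 for HTTP, 443 for HTTPS.
+* "kylin_project_name" must be specified, and you need to make sure it exists on the Kylin server.
+
+### 1. Query with Statement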
+{% highlight Groff markup %}
+Driver driver = (Driver) Class.forName("org.apache.kylin.jdbc.Driver").newInstance();
+
+Properties info = new Properties();
+info.put("user", "ADMIN");
+info.put("password", "KYLIN");
+Connection conn = driver.connect("jdbc:kylin://localhost:7070/kylin_project_name", info);
+Statement state = conn.createStatement();
+ResultSet resultSet = state.executeQuery("select * from test_table");
+
+while (resultSet.next()) {
+    assertEquals("foo", resultSet.getString(1));
+    assertEquals("bar", resultSet.getString(2));
+    assertEquals("tool", resultSet.getString(3));
+}
+{% endhighlight %}
+
+### 2. Query with PreparedStatement
+
+###### Supported PreparedStatement parameter setters:
+* setString
+* setInt
+* setShort
+* setLong
+* setFloat
+* setDouble
+* setBoolean
+* setByte
+* setDate
+* setTime
+* setTimestamp
+
+{% highlight Groff markup %}
+Driver driver = (Driver) Class.forName("org.apache.kylin.jdbc.Driver").newInstance();
+Properties info = new Properties();
+info.put("user", "ADMIN");
+info.put("password", "KYLIN");
+Connection conn = driver.connect("jdbc:kylin://localhost:7070/kylin_project_name", info);
+PreparedStatement state = conn.prepareStatement("select * from test_table where id=?");
+state.setInt(1, 10);
+ResultSet resultSet = state.executeQuery();
+
+while (resultSet.next()) {
+    assertEquals("foo", resultSet.getString(1));
+    assertEquals("bar", resultSet.getString(2));
+    assertEquals("tool", resultSet.getString(3));
+}
+{% endhighlight %}
+
+### 3. Get query result metadata
+The Kylin JDBC driver supports the metadata list methods:
+list catalogs, schemas, tables and columns with SQL pattern filters (such as %).
+
+{% highlight Groff markup %}
+Driver driver = (Driver) Class.forName("org.apache.kylin.jdbc.Driver").newInstance();
+Properties info = new Properties();
+info.put("user", "ADMIN");
+info.put("password", "KYLIN");
+Connection conn = driver.connect("jdbc:kylin://localhost:7070/kylin_project_name", info);
+Statement state = conn.createStatement();
+ResultSet resultSet = state.executeQuery("select * from test_table");
+
+ResultSet tables = conn.getMetaData().getTables(null, null, "dummy", null);
+while (tables.next()) {
+    for (int i = 0; i < 10; i++) {
+        assertEquals("dummy", tables.getString(i + 1));
+    }
+}
+{% endhighlight %}
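
The connection rules stated in howto_jdbc.cn.md (the URL format, the `ssl` property, and the default ports 80 and 443) can be sketched as a small helper. This is only an illustration under stated assumptions: `KylinJdbcUrl` and its methods are hypothetical names that are not part of the Kylin driver, the port is assumed to be simply left out of the URL when it is not specified, and the `ssl` flag is assumed to be passed as the string "true" or "false".

```java
import java.util.Properties;

/** Hypothetical helper that mirrors the documented connection rules; not part of the Kylin driver. */
final class KylinJdbcUrl {

    /** Builds jdbc:kylin://<hostname>:<port>/<kylin_project_name>; port may be null. */
    public static String buildUrl(String host, Integer port, String project) {
        if (project == null || project.isEmpty()) {
            // The document states that the project name is mandatory.
            throw new IllegalArgumentException("kylin_project_name must be specified");
        }
        // Assumption: an unspecified port is simply omitted from the URL.
        String hostPort = (port == null) ? host : host + ":" + port;
        return "jdbc:kylin://" + hostPort + "/" + project;
    }

    /** Port the driver is documented to use: the given one, else 80 (HTTP) or 443 (HTTPS). */
    public static int effectivePort(Integer port, boolean ssl) {
        if (port != null) {
            return port;
        }
        return ssl ? 443 : 80;
    }

    /** Collects the three documented connection properties. */
    public static Properties buildInfo(String user, String password, boolean ssl) {
        Properties info = new Properties();
        info.put("user", user);
        info.put("password", password);
        // Assumption: the flag is passed as the string "true" or "false".
        info.put("ssl", Boolean.toString(ssl));
        return info;
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("localhost", 7070, "kylin_project_name"));
        System.out.println(effectivePort(null, true));
    }
}
```

The resulting URL and `Properties` object would then be passed to `driver.connect(url, info)` in the same way as in the examples in the diff.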

[3/5] kylin git commit: added Chinese version of howto_optimize_build

Posted by bi...@apache.org.

added Chinese version of howto_optimize_build

Signed-off-by: Billy Liu <bi...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/kylin/repo
Commit: http://git-wip-us.apache.org/repos/asf/kylin/commit/96854ef8
Tree: http://git-wip-us.apache.org/repos/asf/kylin/tree/96854ef8
Diff: http://git-wip-us.apache.org/repos/asf/kylin/diff/96854ef8

Branch: refs/heads/document
Commit: 96854ef87d2cbdae14aa1aba0e1b7c44020f2a42
Parents: 2a282ee
Author: link3280 <49...@qq.com>
Authored: Sat Oct 21 12:53:35 2017 +0800
Committer: Billy Liu <bi...@apache.org>
Committed: Fri Jan 26 13:09:53 2018 +0800

----------------------------------------------------------------------
 .../_docs21/howto/howto_optimize_build.cn.md    | 166 +++++++++++++++++++
 1 file changed, 166 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kylin/blob/96854ef8/website/_docs21/howto/howto_optimize_build.cn.md
----------------------------------------------------------------------
diff --git a/website/_docs21/howto/howto_optimize_build.cn.md b/website/_docs21/howto/howto_optimize_build.cn.md
new file mode 100644
index 0000000..d454859
--- /dev/null
+++ b/website/_docs21/howto/howto_optimize_build.cn.md
@@ -0,0 +1,166 @@
+---
+layout: docs21
+title:  Optimize Cube Build
+categories: help
+permalink: /cn/docs21/howto/howto_optimize_build.html
+---
+
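+Kylin decomposes a cube build task into several steps that run in sequence; these steps include Hive operations, MapReduce jobs, and other types of jobs. If you have many cube build tasks running every day, you will certainly want to reduce the time they take. The sections below offer optimization tips, following the order of the build steps.
+
+## Create Intermediate Flat Hive Table
+
+This step extracts data from the source Hive tables (together with all joined tables) and inserts it into an intermediate flat table. If the cube is partitioned, Kylin adds a time condition so that only data within the time range is fetched. You can find the related Hive command in this step's log, for example: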
+
+```
+hive -e "USE default;
+DROP TABLE IF EXISTS kylin_intermediate_airline_cube_v3610f668a3cdb437e8373c034430f6c34;
+
+CREATE EXTERNAL TABLE IF NOT EXISTS kylin_intermediate_airline_cube_v3610f668a3cdb437e8373c034430f6c34
+(AIRLINE_FLIGHTDATE date,AIRLINE_YEAR int,AIRLINE_QUARTER int,...,AIRLINE_ARRDELAYMINUTES int)
+STORED AS SEQUENCEFILE
+LOCATION 'hdfs:///kylin/kylin200instance/kylin-0a8d71e8-df77-495f-b501-03c06f785b6c/kylin_intermediate_airline_cube_v3610f668a3cdb437e8373c034430f6c34';
+
+SET dfs.replication=2;
+SET hive.exec.compress.output=true;
+SET hive.auto.convert.join.noconditionaltask=true;
+SET hive.auto.convert.join.noconditionaltask.size=100000000;
+SET mapreduce.job.split.metainfo.maxsize=-1;
+
+INSERT OVERWRITE TABLE kylin_intermediate_airline_cube_v3610f668a3cdb437e8373c034430f6c34 SELECT
+AIRLINE.FLIGHTDATE
+,AIRLINE.YEAR
+,AIRLINE.QUARTER
+,...
+,AIRLINE.ARRDELAYMINUTES
+FROM AIRLINE.AIRLINE as AIRLINE
+WHERE (AIRLINE.FLIGHTDATE >= '1987-10-01' AND AIRLINE.FLIGHTDATE < '2017-01-01');
+"
+```
+
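+While the Hive command runs, Kylin applies the configurations in `conf/kylin_hive_conf.xml`, for instance keeping fewer replicas and enabling Hive's mapper-side join. If needed, you can add other configurations suited to your cluster.
+
+If the cube's partition column ("FLIGHTDATE" in this case) is the same as the Hive table's partition column, filtering on it lets Hive skip the non-matching partitions. It is therefore highly recommended to use the Hive table's partition column (if it is a date column) as the cube's partition column. This is almost required for very large tables; otherwise Hive has to scan all files every time in this step, which takes a very long time.
+
+If Hive's file merge is enabled, you can disable it in `conf/kylin_hive_conf.xml`, because Kylin has its own way of merging files (see the next section):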
+
+    <property>
+        <name>hive.merge.mapfiles</name>
+        <value>false</value>
+        <description>Disable Hive's auto merge</description>
+    </property>
+
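+## Redistribute Intermediate Table
+
+After the previous step, Hive has generated data files in an HDFS folder: some are large, some are small or even empty. This imbalanced file distribution causes data skew in the subsequent MR jobs: some mappers finish quickly while others are very slow. To address this, Kylin adds this step to "redistribute" the data. Here is a sample output: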
+
+```
+total input rows = 159869711
+expected input rows per mapper = 1000000
+num reducers for RedistributeFlatHiveTableStep = 160
+
+```
+
+The command that redistributes the table:
+
+```
+hive -e "USE default;
+SET dfs.replication=2;
+SET hive.exec.compress.output=true;
+SET hive.auto.convert.join.noconditionaltask=true;
+SET hive.auto.convert.join.noconditionaltask.size=100000000;
+SET mapreduce.job.split.metainfo.maxsize=-1;
+set mapreduce.job.reduces=160;
+set hive.merge.mapredfiles=false;
+
+INSERT OVERWRITE TABLE kylin_intermediate_airline_cube_v3610f668a3cdb437e8373c034430f6c34 SELECT * FROM kylin_intermediate_airline_cube_v3610f668a3cdb437e8373c034430f6c34 DISTRIBUTE BY RAND();
+"
+```
+
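+First, Kylin gets the row count of the intermediate table; then, based on the row count, it calculates the number of files needed to redistribute the data. By default, Kylin allocates one file per 1 million rows. In this sample there are about 160 million rows and 160 reducers, and each reducer writes one file. In the subsequent MR steps over this table, Hadoop starts as many mappers as there are files to process the data (usually 1 million rows of data is smaller than an HDFS block). If your daily data volume is not that large, or the Hadoop cluster has enough resources, you may want more concurrency; in that case set `kylin.job.mapreduce.mapper.input.rows` in `conf/kylin.properties` to a smaller value, for example: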
+
+`kylin.job.mapreduce.mapper.input.rows=500000`
+
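+Second, Kylin runs an *"INSERT OVERWRITE TABLE ... DISTRIBUTE BY "* HiveQL statement to distribute the rows among the specified number of reducers.
+
+In most cases, Kylin asks Hive to distribute the rows among the reducers randomly, which yields files of similar size. The distribute clause is "DISTRIBUTE BY RAND()".
+
+If your cube specifies a high-cardinality column, such as "USER_ID", as the "shard by" dimension (on the cube's "Advanced setting" page), Kylin asks Hive to redistribute the data by that column's value, so rows with the same value in that column go to the same file. This is much better than random distribution, because the data is not only redistributed but also pre-categorized at no additional cost, which benefits the subsequent cube build process. In a typical scenario, this optimization can cut the build time by 40%. In this case the distribute clause is "DISTRIBUTE BY USER_ID".
+
+Please note: 1) The "shard by" column should be a high-cardinality dimension column, and it should appear in many cuboids (not just a few). Distributing by such a column spreads the data evenly within each time range; otherwise it causes data skew and lowers build efficiency. Typical good candidates are "USER_ID", "SELLER_ID", "PRODUCT", "CELL_NUMBER" and so on, whose cardinality should be greater than one thousand (well above the number of reducers). 2) "Shard by" also benefits cube storage, but that is beyond the scope of this document.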
+
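+## Extract Fact Table Distinct Columns
+
+In this step Kylin runs an MR job to fetch the distinct values of the dimensions that use dictionary encoding.
+
+In fact this step does more: it also collects cube statistics with HyperLogLog counters to estimate the row count of each cuboid. If you find that the mappers run slowly, this usually means the cube design is too complex; please refer to
+[optimize cube design](howto_optimize_cubes.html) to simplify the cube. If the reducers hit an out-of-memory error, there are far too many cuboid combinations or the YARN memory allocation cannot meet the demand. If this step cannot finish within a reasonable time by any means, you can give up the job and consider redesigning the cube, because continuing will take even longer.
+
+You can lower the sampling percentage (kylin.job.cubing.inmem.sampling.percent) to speed up this step, but this may not help much and it reduces the accuracy of the cube statistics, so we do not recommend it.
+
+## Build Dimension Dictionary
+
+With the distinct values fetched in the previous step, Kylin builds the dictionaries in memory (this will move to a MapReduce job in the next version). This step usually finishes quickly, but if the set of distinct values is large, Kylin may report an error saying that the dictionary does not support such a high cardinality. For UHC columns, please use another encoding method, such as "fixed_length" or "integer".
+
+## Save Cuboid Statistics and Create HTable
+
+Both steps are lightweight and fast.
+
+## Build Base Cuboid
+
+This step builds the base cuboid from the intermediate Hive table; it is the first round of MR in the "by-layer" cubing algorithm. The number of mappers equals the number of reducers in step 2; the number of reducers is estimated from the cube statistics: by default one reducer per 500MB of output. If you observe that the number of reducers is small, you can set "kylin.job.mapreduce.default.reduce.input.mb" in kylin.properties to a smaller value to get more resources, for example: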
+
+`kylin.job.mapreduce.default.reduce.input.mb=200`
+
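+## Build N-Dimension Cuboid
+
+
+These steps are the "by-layer" cubing process; each step takes the output of the previous step as input and removes one dimension to aggregate into a child cuboid. For example, from cuboid ABCD, removing A yields BCD and removing B yields ACD.
+
+Some cuboids can be aggregated from more than one parent; in that case Kylin picks the smallest parent cuboid. For example, AB can be generated from ABC (id: 1110) or ABD (id: 1101); ABD is chosen because its id is smaller than ABC's. Building on this, if D's cardinality is small, the aggregation is cheap. So when designing the rowkey sequence, remember to put low-cardinality dimensions toward the end. This benefits not only the cube build but also cube queries, because aggregation at query time follows the same rule.
+
+Usually the build from N-D to (N/2)-D is slow, because this is the stage where the number of cuboids explodes: N-D has 1 cuboid, (N-1)-D has N cuboids, (N-2)-D has N*(N-1)/2 cuboids, and so on. After the (N/2)-D step, the build gradually speeds up.
+
+## Build Cube
+
+This step uses a different algorithm to build the cube: "by-split" cubing (also called "in-mem" cubing). It computes all cuboids in a single round of MR, but needs more memory than usual. The configuration file "conf/kylin_job_inmem.xml" is made for this step. By default it requests 3GB of memory for each mapper. If your cluster has sufficient memory, you can allocate more memory to the mappers in that file so that they hold as much data in memory as possible and gain better performance, for example: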
+
+    <property>
+        <name>mapreduce.map.memory.mb</name>
+        <value>6144</value>
+        <description></description>
+    </property>
+    
+    <property>
+        <name>mapreduce.map.java.opts</name>
+        <value>-Xmx5632m</value>
+        <description></description>
+    </property>
+
+
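+Please note that Kylin automatically selects the best algorithm based on the data distribution (obtained from the cube statistics); the steps of the algorithm that is not selected are skipped. You do not need to select the algorithm explicitly.
+
+## Convert Cuboid Data to HFile
+
+This step starts an MR job to convert the cuboid files (sequence file format) into HBase's HFile format. Kylin calculates the number of HBase regions from the cube statistics; by default there is one region per 5GB of data. The more regions there are, the more reducers the MR job uses. If you observe that the number of reducers is small and the performance is poor, you can set the following parameters in "conf/kylin.properties" to smaller values, for example: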
+
+```
+kylin.hbase.region.cut=2
+kylin.hbase.hfile.size.gb=1
+```
+
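+If you are not sure what size a region should be, contact your HBase administrator.
+
+## Load HFile to HBase Table
+
+This step uses the HBase API to load the HFiles to the region servers; it is lightweight and fast.
+
+## Update Cube Info
+
+After the data is loaded into HBase, Kylin marks the corresponding cube segment as ready in the metadata.
+
+## Cleanup
+
+Drop the intermediate flat table from Hive. This step does not block anything, because the segment was already marked as ready in the previous step. If this step fails, don't worry; the garbage can be collected later with Kylin's [StorageCleanupJob](howto_cleanup_storage.html).
+
+## Summary
+There are many other ways to improve Kylin's performance. If you have experience to share, you are welcome to discuss it at [dev@kylin.apache.org](mailto:dev@kylin.apache.org).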
\ No newline at end of file
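
The sizing rules quoted in howto_optimize_build.cn.md lend themselves to a quick arithmetic check. The sketch below reproduces three of them: one file (and reducer) per configured number of rows in the redistribute step, one reducer per configured number of megabytes of estimated output in a cubing step, and the number of cuboids in each layer. `BuildEstimates` and its method names are hypothetical. Rounding up is an assumption (Kylin's own rounding may differ), and the cuboid count assumes a full cube in which no combination is pruned.

```java
/** Back-of-the-envelope numbers for the build steps; a sketch, not Kylin's own code. */
final class BuildEstimates {

    /** Files (and reducers) for the redistribute step: one per rowsPerMapper rows, rounded up. */
    public static long redistributeReducers(long totalRows, long rowsPerMapper) {
        if (rowsPerMapper <= 0) {
            throw new IllegalArgumentException("rowsPerMapper must be positive");
        }
        long reducers = (totalRows + rowsPerMapper - 1) / rowsPerMapper; // integer ceiling
        return Math.max(1, reducers);
    }

    /** Reducers for a cubing step: one per mbPerReducer of estimated output, rounded up. */
    public static long cubingReducers(double estimatedOutputMb, double mbPerReducer) {
        if (mbPerReducer <= 0) {
            throw new IllegalArgumentException("mbPerReducer must be positive");
        }
        return Math.max(1, (long) Math.ceil(estimatedOutputMb / mbPerReducer));
    }

    /** Number of cuboids that keep exactly `kept` of `n` dimensions, i.e. C(n, kept). */
    public static long cuboidsAtLayer(int n, int kept) {
        long result = 1;
        for (int i = 1; i <= kept; i++) {
            // After this step result equals C(n - kept + i, i), so the division is exact.
            result = result * (n - kept + i) / i;
        }
        return result;
    }

    public static void main(String[] args) {
        // Figures from the sample output: 159869711 rows, 1000000 rows per mapper.
        System.out.println(redistributeReducers(159869711L, 1000000L));
        // Halving kylin.job.mapreduce.mapper.input.rows roughly doubles the file count.
        System.out.println(redistributeReducers(159869711L, 500000L));
        // Layers of a 4-dimension cube: 4-D, 3-D, 2-D.
        System.out.println(cuboidsAtLayer(4, 4) + " " + cuboidsAtLayer(4, 3) + " " + cuboidsAtLayer(4, 2));
    }
}
```

With the sample figures, the first call agrees with the 160 reducers shown in the log excerpt, and the layer counts grow toward the middle layers, which is why the steps from N-D down to (N/2)-D are described as the slow part.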

[5/5] kylin git commit: added Chinese version of howto_backup_metadata

Posted by bi...@apache.org.

added Chinese version of howto_backup_metadata

Signed-off-by: Billy Liu <bi...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/kylin/repo
Commit: http://git-wip-us.apache.org/repos/asf/kylin/commit/34eabdb5
Tree: http://git-wip-us.apache.org/repos/asf/kylin/tree/34eabdb5
Diff: http://git-wip-us.apache.org/repos/asf/kylin/diff/34eabdb5

Branch: refs/heads/document
Commit: 34eabdb518318767f217bb751f041910c5b8b4a5
Parents: 96854ef
Author: link3280 <49...@qq.com>
Authored: Sat Oct 21 19:49:38 2017 +0800
Committer: Billy Liu <bi...@apache.org>
Committed: Fri Jan 26 13:09:53 2018 +0800

----------------------------------------------------------------------
 .../_docs21/howto/howto_backup_metadata.cn.md   | 59 ++++++++++++++++++++
 1 file changed, 59 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kylin/blob/34eabdb5/website/_docs21/howto/howto_backup_metadata.cn.md
----------------------------------------------------------------------
diff --git a/website/_docs21/howto/howto_backup_metadata.cn.md b/website/_docs21/howto/howto_backup_metadata.cn.md
new file mode 100644
index 0000000..86aa997
--- /dev/null
+++ b/website/_docs21/howto/howto_backup_metadata.cn.md
@@ -0,0 +1,59 @@
+---
+layout: docs21
+title:  Backup Metadata
+categories: howto
+permalink: /cn/docs21/howto/howto_backup_metadata.html
+---
+
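+Kylin organizes all of its metadata (including cube descriptions and instances, projects, inverted index descriptions and instances, jobs, tables and dictionaries) as a hierarchical file system. However, Kylin uses HBase to store it rather than a normal file system. If you check your Kylin configuration file (kylin.properties), you will find a line like this: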
+
+{% highlight Groff markup %}
+## The metadata store in hbase
+kylin.metadata.url=kylin_metadata@hbase
+{% endhighlight %}
+
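+This indicates that the metadata is saved in an htable called "kylin_metadata". You can scan that htable in the hbase shell to view it.
+
+## Back up the Metadata Store with the binary package
+
+Sometimes you need to back up Kylin's Metadata Store from HBase to your disk file system. In that case, assuming you are on the Hadoop CLI (or sandbox) where Kylin is deployed, you can go to KYLIN_HOME and run: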
+
+{% highlight Groff markup %}
+./bin/metastore.sh backup
+{% endhighlight %}
+
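+to dump your metadata to a local folder under KYLIN_HOME/meta_backups. The folder name is built from the current time, in the form KYLIN_HOME/meta_backups/meta_year_month_day_hour_minute_second.
+
+## Restore the Metadata Store with the binary package
+
+If you find that your metadata has been messed up and you want to restore a previous backup:
+
+First, reset the Metadata Store (this cleans everything in Kylin's Metadata Store in HBase, so make sure you have a backup first):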
+
+{% highlight Groff markup %}
+./bin/metastore.sh reset
+{% endhighlight %}
+
+Then upload the backed-up metadata to Kylin's Metadata Store:
+{% highlight Groff markup %}
+./bin/metastore.sh restore $KYLIN_HOME/meta_backups/meta_xxxx_xx_xx_xx_xx_xx
+{% endhighlight %}
+
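+## Back up/restore metadata in a development environment (available since 0.7.3)
+
+When developing or debugging Kylin, the typical setup is a development machine with an IDE plus a backend sandbox. You usually write code and run test cases on the development machine, and it is troublesome to put a binary package into the sandbox every time just to check the metadata. A helper class called SandboxMetastoreCLI lets you download/upload metadata locally on your development machine.
+
+## Clean up unused resources from the Metadata Store (available since 0.7.3)
+As time goes on, some resources such as dictionaries and table snapshots become useless (because their cube segments were dropped or merged), yet they still take up space. You can run commands to find and clean them up:
+
+First, run a check; this is safe because it changes nothing: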
+{% highlight Groff markup %}
+./bin/metastore.sh clean
+{% endhighlight %}
+
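+The resources that would be dropped are listed.
+
+Next, add the "--delete true" parameter to clean up those resources; before doing so, make sure you have backed up the metadata store: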
+{% highlight Groff markup %}
+./bin/metastore.sh clean --delete true
+{% endhighlight %}
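
The backup folder naming rule given in howto_backup_metadata.cn.md (`meta_year_month_day_hour_minute_second` under `KYLIN_HOME/meta_backups`) can be illustrated with a few lines of Java, for example to predict or locate the folder that a restore should point at. `MetaBackupPath` is a hypothetical helper, not something shipped with Kylin, and the zero-padded field widths are an assumption based on the placeholder pattern in the text.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper that formats backup folder names; not part of Kylin. */
final class MetaBackupPath {

    // Assumption: four-digit year and two-digit, zero-padded remaining fields.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyy_MM_dd_HH_mm_ss");

    /** Folder a backup taken at `time` would be written to. */
    public static String forTime(String kylinHome, LocalDateTime time) {
        return kylinHome + "/meta_backups/meta_" + time.format(STAMP);
    }

    /** Restore command for a given backup folder, in the form shown in the document. */
    public static String restoreCommand(String backupDir) {
        return "./bin/metastore.sh restore " + backupDir;
    }

    public static void main(String[] args) {
        String dir = forTime("/path/to/kylin_home", LocalDateTime.now());
        System.out.println(dir);
        System.out.println(restoreCommand(dir));
    }
}
```

Because `restore` follows a `reset` that wipes the store, checking that the computed folder actually exists before running either command is a sensible precaution.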

[2/5] kylin git commit: added Chinese version of howto_build_cube_with_restapi

Posted by bi...@apache.org.

added Chinese version of howto_build_cube_with_restapi

Signed-off-by: Billy Liu <bi...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/kylin/repo
Commit: http://git-wip-us.apache.org/repos/asf/kylin/commit/2c9c574a
Tree: http://git-wip-us.apache.org/repos/asf/kylin/tree/2c9c574a
Diff: http://git-wip-us.apache.org/repos/asf/kylin/diff/2c9c574a

Branch: refs/heads/document
Commit: 2c9c574aa19d3cc7060bddca58fd3ec1a6fd7371
Parents: 34eabdb
Author: link3280 <49...@qq.com>
Authored: Sat Oct 21 19:56:15 2017 +0800
Committer: Billy Liu <bi...@apache.org>
Committed: Fri Jan 26 13:09:53 2018 +0800

----------------------------------------------------------------------
 .../howto/howto_build_cube_with_restapi.cn.md   | 54 ++++++++++++++++++++
 1 file changed, 54 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kylin/blob/2c9c574a/website/_docs21/howto/howto_build_cube_with_restapi.cn.md
----------------------------------------------------------------------
diff --git a/website/_docs21/howto/howto_build_cube_with_restapi.cn.md b/website/_docs21/howto/howto_build_cube_with_restapi.cn.md
new file mode 100644
index 0000000..5bd8348
--- /dev/null
+++ b/website/_docs21/howto/howto_build_cube_with_restapi.cn.md
@@ -0,0 +1,54 @@
+---
+layout: docs21
+title:  Build Cube with API
+categories: help
+permalink: /cn/docs21/howto/howto_build_cube_with_restapi.html
+---
+
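+### 1. Authentication
+*   Kylin currently uses [basic authentication](http://en.wikipedia.org/wiki/Basic_access_authentication).
+*   Add an Authorization header to the first request for authentication.
+*   Or send a dedicated request: POST http://localhost:7070/kylin/api/user/authentication
+*   Once authenticated, the client can send the subsequent requests with the cookie.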
+{% highlight Groff markup %}
+POST http://localhost:7070/kylin/api/user/authentication
+
+Authorization:Basic xxxxJD124xxxGFxxxSDF
+Content-Type: application/json;charset=UTF-8
+{% endhighlight %}
+
+### 2. Get details of the cube
+*   `GET http://localhost:7070/kylin/api/cubes?cubeName={cube_name}&limit=15&offset=0`
+*   In the returned cube details, you can find the date ranges of the cube's segments.
+{% highlight Groff markup %}
+GET http://localhost:7070/kylin/api/cubes?cubeName=test_kylin_cube_with_slr&limit=15&offset=0
+
+Authorization:Basic xxxxJD124xxxGFxxxSDF
+Content-Type: application/json;charset=UTF-8
+{% endhighlight %}
+
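+### 3. Then submit a cube build job
+*   `PUT http://localhost:7070/kylin/api/cubes/{cube_name}/rebuild`
+*   For details of the PUT request body, please refer to the Build Cube API.
+    *   `startTime` and `endTime` should be UTC timestamps.
+    *   `buildType` can be `BUILD`, `MERGE` or `REFRESH`. `BUILD` builds a new segment, `REFRESH` refreshes an existing segment, and `MERGE` merges multiple existing segments into one bigger segment.
+*   This method returns a newly created job instance; its uuid is the job's unique id, used to track the job status.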
+{% highlight Groff markup %}
+PUT http://localhost:7070/kylin/api/cubes/test_kylin_cube_with_slr/rebuild
+
+Authorization:Basic xxxxJD124xxxGFxxxSDF
+Content-Type: application/json;charset=UTF-8
+    
+{
+    "startTime": 0,
+    "endTime": 1388563200000,
+    "buildType": "BUILD"
+}
+{% endhighlight %}
+
+### 4. Track job status
+*   `GET http://localhost:7070/kylin/api/jobs/{job_uuid}`
+*   The returned `job_status` represents the current status of the job.
+
+### 5. If the build job fails, you can resume it
+*   `PUT http://localhost:7070/kylin/api/jobs/{job_uuid}/resume`
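
The pieces of the rebuild request described in howto_build_cube_with_restapi.cn.md (the Basic `Authorization` header, the UTC `startTime`/`endTime` values, and the JSON body) can be assembled as in the sketch below. `CubeBuildRequest` and its methods are hypothetical names, the ADMIN/KYLIN credentials in `main` are placeholders, and the code stops short of sending anything so that it runs without a Kylin server; treating the timestamps as epoch milliseconds is inferred from the sample body.

```java
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Base64;

/** Hypothetical helper that prepares a cube rebuild request; it does not send it. */
final class CubeBuildRequest {

    /** Value of the Authorization header for HTTP basic authentication. */
    public static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    /** Midnight UTC of the given date as epoch milliseconds. */
    public static long utcMillis(LocalDate date) {
        return date.atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
    }

    /** URL of the rebuild endpoint for one cube. */
    public static String rebuildUrl(String baseUrl, String cubeName) {
        return baseUrl + "/kylin/api/cubes/" + cubeName + "/rebuild";
    }

    /** JSON body with the three documented fields. */
    public static String rebuildBody(long startTime, long endTime, String buildType) {
        return "{\"startTime\": " + startTime
                + ", \"endTime\": " + endTime
                + ", \"buildType\": \"" + buildType + "\"}";
    }

    public static void main(String[] args) {
        System.out.println("PUT " + rebuildUrl("http://localhost:7070", "test_kylin_cube_with_slr"));
        System.out.println("Authorization:" + basicAuthHeader("ADMIN", "KYLIN"));
        System.out.println("Content-Type: application/json;charset=UTF-8");
        System.out.println(rebuildBody(0L, utcMillis(LocalDate.of(2014, 1, 1)), "BUILD"));
    }
}
```

Computing the boundaries with `ZoneOffset.UTC` rather than the local time zone keeps segment ranges aligned with the requirement that `startTime` and `endTime` be UTC.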

[4/5] kylin git commit: added Chinese version of howto_cleanup_storage

Posted by bi...@apache.org.

added Chinese version of howto_cleanup_storage

Signed-off-by: Billy Liu <bi...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/kylin/repo
Commit: http://git-wip-us.apache.org/repos/asf/kylin/commit/59d169cb
Tree: http://git-wip-us.apache.org/repos/asf/kylin/tree/59d169cb
Diff: http://git-wip-us.apache.org/repos/asf/kylin/diff/59d169cb

Branch: refs/heads/document
Commit: 59d169cb339dde48db763eb9d9977f7b839e8996
Parents: 2c9c574
Author: link3280 <49...@qq.com>
Authored: Sun Oct 22 14:08:21 2017 +0800
Committer: Billy Liu <bi...@apache.org>
Committed: Fri Jan 26 13:09:53 2018 +0800

----------------------------------------------------------------------
 .../_docs21/howto/howto_cleanup_storage.cn.md   | 21 ++++++++++++++++++++
 1 file changed, 21 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kylin/blob/59d169cb/website/_docs21/howto/howto_cleanup_storage.cn.md
----------------------------------------------------------------------
diff --git a/website/_docs21/howto/howto_cleanup_storage.cn.md b/website/_docs21/howto/howto_cleanup_storage.cn.md
new file mode 100644
index 0000000..13e27e9
--- /dev/null
+++ b/website/_docs21/howto/howto_cleanup_storage.cn.md
@@ -0,0 +1,21 @@
+---
+layout: docs21
+title:  Cleanup Storage
+categories: howto
+permalink: /cn/docs21/howto/howto_cleanup_storage.html
+---
+
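+Kylin generates intermediate files in HDFS during cube builds; in addition, when cubes are purged, dropped or merged, some HBase tables may be left in HBase and will never be queried again. Although Kylin has started to do some automated garbage collection, it may not cover every case; you can run an offline storage cleanup periodically:
+
+Steps:
+1. Check which resources can be cleaned up; this step does not remove anything: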
+{% highlight Groff markup %}
+export KYLIN_HOME=/path/to/kylin_home
+${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete false
+{% endhighlight %}
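+If the command you run contains a (version) placeholder, replace it with the version of the Kylin jar in your installation.
+2. You can spot-check one or two resources to confirm that they are no longer referenced; then add the "--delete true" option to clean them up.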
+{% highlight Groff markup %}
+${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete true
+{% endhighlight %}
+When it finishes, the intermediate files in HDFS and the HTables are removed.
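
The two-phase procedure in howto_cleanup_storage.cn.md (a dry run with `--delete false`, then the real run with `--delete true`) could be wrapped so that both phases share one command builder. The sketch below only assembles the argument list; `StorageCleanupCommand` is a hypothetical name, and the executable path and class name are copied from the commands shown in the diff.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical builder for the cleanup command line; it does not execute anything. */
final class StorageCleanupCommand {

    /** Arguments for kylin.sh with StorageCleanupJob; delete=false is the dry run. */
    public static List<String> build(String kylinHome, boolean delete) {
        return Arrays.asList(
                kylinHome + "/bin/kylin.sh",
                "org.apache.kylin.tool.StorageCleanupJob",
                "--delete",
                Boolean.toString(delete));
    }

    public static void main(String[] args) {
        // Print the dry run first, then the destructive run, in the documented order.
        System.out.println(String.join(" ", build("/path/to/kylin_home", false)));
        System.out.println(String.join(" ", build("/path/to/kylin_home", true)));
    }
}
```

The returned list can be passed to `new ProcessBuilder(command)` if the job is to be launched from Java; running the dry run and reviewing its output before the second call mirrors the spot check recommended in step 2.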