Posted to commits@drill.apache.org by lu...@apache.org on 2021/08/11 12:05:40 UTC

[drill] branch gh-pages updated: gh-pages zh tutorial translation

This is an automated email from the ASF dual-hosted git repository.

luoc pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/drill.git


The following commit(s) were added to refs/heads/gh-pages by this push:
     new e05fc38  gh-pages zh tutorial translation
e05fc38 is described below

commit e05fc388d01a0586312738b2e2ae350a91335b8d
Author: kingswanwho <ji...@u.northwestern.edu>
AuthorDate: Sun Aug 1 21:13:31 2021 +0800

    gh-pages zh tutorial translation
---
 .../050-analyzing-highly-dynamic-datasets.md       |  45 +++--
 _docs/zh/tutorials/060-analyzing-social-media.md   | 225 ++++++++++-----------
 2 files changed, 133 insertions(+), 137 deletions(-)

diff --git a/_docs/zh/tutorials/050-analyzing-highly-dynamic-datasets.md b/_docs/zh/tutorials/050-analyzing-highly-dynamic-datasets.md
index aed2b20..d50fd16 100644
--- a/_docs/zh/tutorials/050-analyzing-highly-dynamic-datasets.md
+++ b/_docs/zh/tutorials/050-analyzing-highly-dynamic-datasets.md
@@ -5,9 +5,9 @@ parent: "教程"
 lang: "zh"
 ---
 
-Today’s data is dynamic and application-driven. The growth of a new era of business applications driven by industry trends such as web, social, mobile, and Internet of Things are generating datasets with new data types and new data models. These applications are iterative, and the associated data models typically are semi-structured, schema-less and constantly evolving. Semi-structured data models can be complex/nested, schema-less, and capable of having varying fields in every single ro [...]
+Today's data is dynamic and application-driven. A new era of business applications, driven by industry trends such as web, social media, mobile, and the Internet of Things, is generating datasets with new data types and new data models. These applications are iterative, and the associated data models are typically semi-structured, schema-less, and constantly evolving. Semi-structured data models can be complex/nested and schema-less, and can carry different fields in every single row, with fields modified frequently to meet business needs.
 
-This tutorial shows you how to natively query dynamic datasets, such as JSON, and derive insights from any type of data in minutes. The dataset used in the example is from the Yelp check-ins dataset, which has the following structure:
+This tutorial shows you how to natively query dynamic datasets, such as JSON, and derive insights from any type of data in minutes. The dataset used in the example comes from the Yelp check-ins dataset and has the following structure:
 
     check-in
     {
@@ -23,32 +23,32 @@ This tutorial shows you how to natively query dynamic datasets, such as JSON, an
         }, # if there was no checkin for a hour-day block it will not be in the dataset
     }
 
-It is worth repeating the comment at the bottom of this snippet:
+Pay particular attention to the comment at the bottom of this snippet:
 
        If there was no checkin for a hour-day block it will not be in the dataset. 
 
-The element names that you see in the `checkin_info` are unknown upfront and can vary for every row. The data, although simple, is highly dynamic data. To analyze the data there is no need to first represent this dataset in a flattened relational structure, as you would using any other SQL on Hadoop technology.
+The element names you see in `checkin_info` are not known upfront and can vary for every row. The data, although simple, is highly dynamic. To analyze it, there is no need to first represent the dataset in a flattened relational structure, as you would with other SQL-on-Hadoop technologies.
 
 ----------
 
-Step 1: First download Drill, if you have not yet done so, onto your machine
+Step 1: If you have not already done so, download Drill onto your machine
 
     http://drill.apache.org/download/
     tar -xvf apache-drill-1.19.0.tar
 
-Install Drill locally on your desktop (embedded mode). You don’t need Hadoop.
+Install Drill locally on your desktop (embedded mode). Hadoop is not required.
 
 ----------
 
-Step 2: Start the Drill shell.
+Step 2: Start the Drill shell (embedded mode).
 
     bin/drill-embedded
 
 ----------
 
-Step 3: Start analyzing the data using SQL
+Step 3: Start analyzing the data using SQL
 
-First, let’s take a look at the dataset:
+First, let's take a look at the dataset:
 
     0: jdbc:drill:zk=local> SELECT * FROM dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_checkin.json` limit 2;
     |------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|------------------------|
@@ -58,13 +58,13 @@ First, let’s take a look at the dataset:
     | {"6-6":2,"6-5":1,"7-6":1,"7-5":1,"8-5":2,"10-5":1,"9-3":1,"12-5":1,"15-3":1,"15-5":1,"15-6":1,"16-3":1,"10-0":1,"15-4":1,"10-4":1,"8-2":1}                                                                                               | checkin    | uGykseHzyS5xAMWoN6YUqA |
     |------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|------------------------|
 
-{% include startnote.html %}This document aligns Drill output for example purposes. Drill output is not aligned in this case.{% include endnote.html %}
+{% include startnote.html %}This document aligns the Drill output for presentation purposes. Drill does not align its output in this case.{% include endnote.html %}
 
-You query the data in JSON files directly. Schema definitions in Hive store are not necessary. The names of the elements within the `checkin_info` column are different between the first and second row.
+You query the data in JSON files directly. No schema definitions in the Hive store are necessary. The element names within the `checkin_info` column differ between the first and second rows.
 
-Drill provides a function called KVGEN (Key Value Generator) which is useful when working with complex data that contains arbitrary maps consisting of dynamic and unknown element names such as checkin_info. KVGEN turns the dynamic map into an array of key-value pairs where keys represent the dynamic element names.
+Drill provides a function called KVGEN (Key Value Generator), which is useful when working with complex data containing arbitrary maps with dynamic and unknown element names, such as `checkin_info`. KVGEN turns the dynamic map into an array of key-value pairs, where the keys represent the dynamic element names.
 
-Let’s apply KVGEN on the `checkin_info` element to generate key-value pairs.
+Let's apply KVGEN to the `checkin_info` element to generate key-value pairs.
 
     0: jdbc:drill:zk=local> SELECT KVGEN(checkin_info) checkins FROM dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_checkin.json` LIMIT 2;
     |------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- [...]
@@ -74,7 +74,7 @@ Let’s apply KVGEN on the `checkin_info` element to generate key-value pairs.
     | [{"key":"6-6","value":2},{"key":"6-5","value":1},{"key":"7-6","value":1},{"key":"7-5","value":1},{"key":"8-5","value":2},{"key":"10-5","value":1},{"key":"9-3","value":1},{"key":"12-5","value":1},{"key":"15-3","value":1},{"key":"15-5","value":1},{"key":"15-6","value":1},{"key":"16-3","value":1},{"key":"10-0","value":1},{"key":"15-4","value":1},{"key":"10-4","value":1},{"key":"8-2","value":1}]                                                                                             [...]
     |------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- [...]
 
-Drill provides another function to operate on complex data called ‘Flatten’ to break the list of key-value pairs resulting from ‘KVGen’ into separate rows to further apply analytic functions on it.
+Drill provides another function for operating on complex data, called FLATTEN, which breaks the list of key-value pairs produced by KVGEN into separate rows so that analytic functions can be applied to them.
 
     0: jdbc:drill:zk=local> SELECT FLATTEN(KVGEN(checkin_info)) checkins FROM dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_checkin.json` LIMIT 20;
     |--------------------------|
@@ -102,11 +102,12 @@ Drill provides another function to operate on complex data called ‘Flatten’
     | {"key":"7-6","value":1}  |
     |--------------------------|
 
-You can get value from the data quickly by applying both KVGEN and FLATTEN functions on the datasets on the fly--no need for time-consuming schema definitions and data storage in intermediate formats.
+By applying the KVGEN and FLATTEN functions to the datasets on the fly, you can extract value from the data quickly, with no time spent on schema definitions or on storing data in intermediate formats.
 
-On the output of flattened data, you use standard SQL functionality such as filters , aggregates, and sort. Let’s see a few examples.
+On the flattened output, you can use standard SQL functionality such as filters, aggregates, and sorting. Let's look at a few examples.
 
-**Get the total number of check-ins recorded in the Yelp dataset**
+
+**Get the total number of check-ins recorded in the Yelp dataset**
 
     0: jdbc:drill:zk=local> SELECT SUM(checkintbl.checkins.`value`) AS TotalCheckins FROM (
     . . . . . . . . . . . >  SELECT FLATTEN(KVGEN(checkin_info)) checkins FROM dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_checkin.json` ) checkintbl
@@ -117,7 +118,7 @@ On the output of flattened data, you use standard SQL functionality such as filt
     | 4713811       |
     |---------------|
 
-**Get the number of check-ins specifically for Sunday midnights**
+**Get the number of check-ins specifically for Sunday midnights**
 
     0: jdbc:drill:zk=local> SELECT SUM(checkintbl.checkins.`value`) AS SundayMidnightCheckins FROM (
     . . . . . . . . . . . >  SELECT FLATTEN(KVGEN(checkin_info)) checkins FROM dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_checkin.json` ) checkintbl WHERE checkintbl.checkins.key='23-0';
@@ -127,7 +128,7 @@ On the output of flattened data, you use standard SQL functionality such as filt
     | 8575                   |
     |------------------------|
   
-**Get the number of check-ins per day of the week**  
+**Get the number of check-ins per day of the week**  
 
     0: jdbc:drill:zk=local> SELECT `right`(checkintbl.checkins.key,1) WeekDay,sum(checkintbl.checkins.`value`) TotalCheckins from (
     . . . . . . . . . . . >  select flatten(kvgen(checkin_info)) checkins FROM dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_checkin.json`  ) checkintbl GROUP BY `right`(checkintbl.checkins.key,1) ORDER BY TotalCheckins;
@@ -143,7 +144,7 @@ On the output of flattened data, you use standard SQL functionality such as filt
     | 5       | 937201        |
     |---------|---------------|
 
-**Get the number of check-ins per hour of the day**
+**Get the number of check-ins per hour of the day**
 
     0: jdbc:drill:zk=local> SELECT SUBSTR(checkintbl.checkins.key,1,strpos(checkintbl.checkins.key,'-')-1) AS HourOfTheDay ,SUM(checkintbl.checkins.`value`) TotalCheckins FROM (
     . . . . . . . . . . . >  SELECT FLATTEN(KVGEN(checkin_info)) checkins FROM dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_checkin.json` ) checkintbl GROUP BY SUBSTR(checkintbl.checkins.key,1,strpos(checkintbl.checkins.key,'-')-1) ORDER BY TotalCheckins;
@@ -178,5 +179,5 @@ On the output of flattened data, you use standard SQL functionality such as filt
 
 ----------
 
-## Summary
-In this tutorial, you surf both structured and semi-structured data without any upfront schema management or ETL.
+## Summary
+In this tutorial, you learned how to work with both structured and semi-structured data without any upfront schema definition or ETL.
diff --git a/_docs/zh/tutorials/060-analyzing-social-media.md b/_docs/zh/tutorials/060-analyzing-social-media.md
index f5ab607..45c8ca5 100644
--- a/_docs/zh/tutorials/060-analyzing-social-media.md
+++ b/_docs/zh/tutorials/060-analyzing-social-media.md
@@ -5,38 +5,37 @@ parent: "教程"
 lang: "zh"
 ---
 
-This tutorial covers how to analyze Twitter data in native JSON format using Apache Drill. First, you configure an environment to stream the Twitter data filtered on keywords and languages using Apache Flume, and then you analyze the data using Drill. Finally, you run interactive reports and analysis using MicroStrategy.
+This tutorial covers how to analyze Twitter data in native JSON format using Apache Drill. First, you configure an environment that uses Apache Flume to stream Twitter data filtered on keywords and languages, and then you analyze the data using Drill. Finally, you run interactive reports and analysis using MicroStrategy.
 
-## Social Media Analysis Prerequisites
+## Social Media Analysis Prerequisites
 
-* Twitter developer account
-* AWS account
-* A MapR node on AWS
-* A MicroStrategy AWS instance
+* Twitter developer account
+* AWS account
+* A MapR node on AWS
+* A MicroStrategy instance on AWS
 
-## Configuring the AWS environment
+## Configuring the AWS environment
 
-Configuring the environment on Amazon Web Services (AWS) consists of these tasks:
+Configuring the environment on Amazon Web Services (AWS) consists of these tasks:
 
-* Create a Twitter Dev account and register a Twitter application  
-* Provision a preconfigured AWS MapR node with Flume and Drill  
-* Provision a MicroStrategy AWS instance  
-* Configure MicroStrategy to run reports and analyses using Drill  
-* Create a Twitter Dev account and register an application
+* Create a Twitter developer account and register a Twitter application  
+* Provision a preconfigured AWS MapR node with Flume and Drill  
+* Provision a MicroStrategy instance on AWS  
+* Configure MicroStrategy to run reports and analyses using Drill  
 
-This tutorial assumes you are familiar with MicroStrategy. For information about using MicroStrategy, see the [MicroStrategy documentation](http://www.microstrategy.com/Strategy/media/downloads/products/cloud/cloud_aws-user-guide.pdf).
+This tutorial assumes you are familiar with MicroStrategy. For information about using MicroStrategy, see the [MicroStrategy documentation](http://www.microstrategy.com/Strategy/media/downloads/products/cloud/cloud_aws-user-guide.pdf).
 
 ----------
 
-## Establishing a Twitter Feed and Flume Credentials
+## Establishing a Twitter Feed and Flume Credentials
 
-The following steps establish a Twitter feed and get Twitter credentials required by Flume to set up Twitter as a data source:
+The following steps establish a Twitter feed and obtain the Twitter credentials that Flume needs to set up Twitter as a data source:
 
-1. Go to dev.twitter.com and sign in with your Twitter account details.  
-2. Click **Manage Your Apps** under Tools in the page footer.  
-3. Click **Create New App** and fill in the form, then create the application.
-4. On the **Keys and Access Tokens** tab, create an access token, and then click **Create My Access Token**. If you have read-only access, you can create the token.
-5. Copy the following credentials for the Twitter App that will be used to configure Flume: 
+1. Go to dev.twitter.com and sign in with your Twitter account details.  
+2. Click **Manage Your Apps** under Tools in the page footer.  
+3. Click **Create New App**, fill in the form, and create the application.  
+4. On the **Keys and Access Tokens** tab, create an access token by clicking **Create My Access Token**. Read-only access is sufficient to create the token.  
+5. Copy the following credentials of the Twitter app for configuring Flume: 
    * Consumer Key
    * Consumer Secret
    * Access Token
@@ -44,166 +43,162 @@ The following steps establish a Twitter feed and get Twitter credentials require
 
 ----------
 
-## Provision Preconfigured MapR Node on AWS
+## Provision a Preconfigured MapR Node on AWS
 
-You need to provision a preconfigured MapR node on AWS named ami-4dedc47d. The AMI is already configured with Flume, Drill, and specific elements to support data streaming from Twitter and Drill query views. The AMI is publicly available under Community AMIs, has a 6GB root drive, and a 100GB data drive. Being a small node, very large volumes of data will significantly decrease the response time to Twitter data queries.
+You need to provision a preconfigured MapR node on AWS, AMI id ami-4dedc47d. The AMI comes configured with Flume, Drill, and the specific elements needed to support streaming data from Twitter and Drill query views. The AMI is publicly available under Community AMIs and has a 6GB root drive and a 100GB data drive. Because this is a small node, very large volumes of data will noticeably slow the response time of Twitter data queries.
 
-1. In AWS, launch an instance.  
-   The AMI image is preconfigured to use a m2.2xlarge instance type with 4 vCPUs and 32GB of memory.  
-2. Select the AMI id ami-4dedc47d.  
-3. Make sure that the instance has been assigned an external IP address; an Elastic IP is preferred, but not essential.  
-4. Verify that a security group is used with open TCP and UDP ports on the node. At this time, all ports are left open on the node.
-5. After provisioning and booting up the instance, reboot the node in the AWS EC2 management interface to finalize the configuration.
+1. In AWS, launch an instance. The AMI image is preconfigured to use an m2.2xlarge instance type with 4 vCPUs and 32GB of memory.
+2. Select the AMI id ami-4dedc47d.
+3. Make sure the instance has been assigned an external IP address; an Elastic IP is preferred, but not essential. 
+4. Verify that a security group with open TCP and UDP ports is applied to the node. At this time, all ports on the node are left open.
+5. After provisioning and booting the instance, reboot the node in the AWS EC2 management interface to finalize the configuration.
 
-The node is now configured with the required Flume and Drill installation. Next, update the Flume configuration files with the required credentials and keywords.
+The node is now configured with the required Flume and Drill installation. Next, update the Flume configuration files with the required credentials and keywords.
 
 ----------
 
-## Update Flume Configuration Files
+## Update Flume Configuration Files
 
-1. Log in as the ec2-user using the AWS credentials.
-2. Switch to the mapr user on the node using `su – mapr.`
-3. Update the Flume configuration files `flume-env.sh` and `flume`.conf in the `<FLUME HOME>/conf` directory using the Twitter app credentials from the first section. See the [sample files](https://github.com/mapr/mapr-demos/tree/master/drill-twitter-MSTR/flume).
-4. Enter the desired keywords, separated by a comma.  
-   Separate multiple keywords using a space.  
-5. Filter tweets for specific languages, if needed, by entering the ISO 639-1 [language codes](http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) separated by a comma. If you need no language filtering, leave the parameter blank.  
-6. Go to the FLUME HOME directory and, as user `mapr`, type screen on the command line as user `mapr`:  
-7. Start Flume by typing the following command:  
+1. Log in as the ec2-user using the AWS credentials.
+2. Switch to the mapr user on the node using `su - mapr`.
+3. Update the Flume configuration files `flume-env.sh` and `flume.conf` in the `<FLUME HOME>/conf` directory with the Twitter app credentials from the first section. See the [sample files](https://github.com/mapr/mapr-demos/tree/master/drill-twitter-MSTR/flume).
+4. Enter the desired keywords, separated by commas; separate the words of a multi-word keyword with a space. 
+5. To filter tweets for specific languages, enter the ISO 639-1 [language codes](http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) separated by commas. If you do not need language filtering, leave the parameter blank.  
+6. As user `mapr`, go to the FLUME HOME directory and type `screen` on the command line.  
+7. Start Flume with the following command:  
 
         ./bin/flume-ng agent --conf ./conf/ -f ./conf/flume.conf -Dflume.root.logger=INFO,console -n TwitterAgent
-8. Enter `CTRL+a` to exit, followed by `d` to detach.  
-   To go back to the screen terminal, simply enter screen –r to reattach.  
-   Twitter data streams into the system.  
-9. Run the following command to verify volumes:
+8. Enter `CTRL+a`, then `d`, to detach and leave Flume running in the background. To get back to the screen terminal, simply enter `screen -r` to reattach. Twitter data now streams into the system. 
+9. Run the following command to see how much disk space the Twitter data occupies:
 
          du –h /mapr/drill_demo/twitter/feed.
 
-You cannot run queries until data appears in the feed directory. Allow 20-30 minutes minimum. 
+You cannot run queries until data appears in the feed directory; allow a minimum of 20-30 minutes.
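+
+Once tweets have landed, a quick sanity check from the Drill shell might look like the following. This is only a sketch: it assumes the `dfs` storage plugin can read the rolled Flume files in the feed directory as JSON (the path matches the `du` command above).
+
+    0: jdbc:drill:zk=local> SELECT COUNT(*) AS tweets
+    . . . . . . . . . . . > FROM dfs.`/mapr/drill_demo/twitter/feed`;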
 
 ----------
 
-## Provision a MicroStrategy AWS Instance
+## Provision a MicroStrategy AWS Instance
 
-MicroStrategy provides an AWS instance of various sizes. It comes with a free 30-day trial for the MicroStrategy instance. AWS charges still apply for the platform and OS.
+MicroStrategy provides AWS instances in various sizes, with a free 30-day trial of the MicroStrategy software. AWS charges still apply for the platform and OS.
 
-To provision the MicroStrategy node in AWS:
+To provision the MicroStrategy node in AWS:
 
-1. On the [MicroStrategy website](http://www.microstrategy.com/us/analytics/analytics-on-aws), click **Get started**.  
-2. Select some number of users, for example, select 25 users.  
-3. Select the AWS region. Using a MapR node and MicroStrategy instance in the same AWS region is highly recommended.
-4. Click **Continue**.  
-5. On the Manual Launch tab, click **Launch with EC2 Console** next to the appropriate region, and select **r3.large instance**.  
-   An EC2 instance of r3.large is sufficient for the 25 user version.  
-6. Click **Configure Instance Details**.
-7. Select an appropriate network setting and zones, ideally within the same zone and network as the MapR node that you provisioned.
-8. Keep the default storage.
-9. Assign a tag to identify the instance.
-10. Select a security group that allows sufficient access to external IPs and open all ports because security is not a concern. 
-11. In the AWS Console, launch an instance, and when the AWS reports that the instance is running, select it, and click **Connect**.
-12. Click **Get Password** to get the OS Administrator password.
+1. On the [MicroStrategy website](http://www.microstrategy.com/us/analytics/analytics-on-aws), click **Get started**.  
+2. Select the number of users, for example, 25 users.
+3. Select the AWS region. Using the MapR node and the MicroStrategy instance in the same AWS region is highly recommended.
+4. Click **Continue**.
+5. On the Manual Launch tab, click **Launch with EC2 Console** next to the appropriate region, and select **r3.large instance**.  
+   An r3.large EC2 instance is sufficient for the 25-user version.
+6. Click **Configure Instance Details**.
+7. Select an appropriate network setting and zone, ideally the same zone and network as the MapR node you provisioned.
+8. Keep the default storage.
+9. Assign a tag to identify the instance.
+10. Select a security group that allows sufficient access from external IPs and open all ports; security is not a concern here.
+11. In the AWS Console, launch the instance; when AWS reports that the instance is running, select it and click **Connect**.
+12. Click **Get Password** to get the OS Administrator password.
 
-{% include startimportant.html %}Make sure that the MicroStrategy instance has a Public IP; elastic IP is preferred but not essential.{% include endimportant.html %}
+{% include startimportant.html %}Make sure that the MicroStrategy instance has a public IP; an Elastic IP is preferred, but not essential.{% include endimportant.html %}
 
-The instance is now accessible with RDP and is using the relevant AWS credentials and security.
+The instance is now accessible via RDP, using the relevant AWS credentials and security settings.
 
 ----------
 
-## Configure MicroStrategy
+## Configure MicroStrategy
 
-You need to configure MicroStrategy to integrate with Drill using the ODBC driver. You install a MicroStrategy package with a number of useful, prebuilt reports for working with Twitter data. You can modify the reports or use the reports as a template to create new and more interesting reports and analysis models.
+You need to configure MicroStrategy to integrate with Drill through the ODBC driver. You then install a MicroStrategy package containing a number of useful prebuilt reports for working with Twitter data. You can modify these reports or use them as templates to create new and more interesting reports and analysis models.
 
-1. Configure a System DSN named `Twitter` with the ODBC administrator. The quick start version of the MapR ODBC driver requires the DSN.  
-2. [Download the quick start version of the MapR ODBC driver for Drill](http://package.mapr.com/tools/MapR-ODBC/MapR_Drill/MapRDrill_odbc_v0.08.1.0618/MapRDrillODBC32.msi).  
-3. [Configure the ODBC driver](http://drill.apache.org/docs/using-microstrategy-analytics-with-apache-drill) for Drill on MicroStrategy Analytics.  
-    The Drill object is part of the package and doesn’t need to be configured.  
-4. Use the AWS Private IP if both the MapR node and the MicroStrategy instance are located in the same region (recommended).
-5. Download the [Drill and Twitter configuration](https://github.com/mapr/mapr-demos/blob/master/drill-twitter-MSTR/MSTR/DrillTwitterProjectPackage.mmp) package for MicroStrategy on the Windows system using Git for Windows or the full GitHub for Windows.
+1. Configure a System DSN named `Twitter` with the ODBC administrator. The quick start version of the MapR ODBC driver requires the DSN.
+2. [Download the quick start version of the MapR ODBC driver for Drill](http://package.mapr.com/tools/MapR-ODBC/MapR_Drill/MapRDrill_odbc_v0.08.1.0618/MapRDrillODBC32.msi).  
+3. [Configure the ODBC driver](http://drill.apache.org/docs/using-microstrategy-analytics-with-apache-drill) for Drill on MicroStrategy Analytics.
+   The Drill object is part of the package and does not need to be configured. 
+4. Use the AWS private IP if the MapR node and the MicroStrategy instance are located in the same region (recommended).
+5. On the Windows system, download the [Drill and Twitter configuration](https://github.com/mapr/mapr-demos/blob/master/drill-twitter-MSTR/MSTR/DrillTwitterProjectPackage.mmp) package for MicroStrategy using Git for Windows or the full GitHub for Windows.
 
 ----------
 
-## Import Reports
+## Import Reports
 
-1. In MicroStrategy Developer, select **Schema > Create New Project** to create a new project with MicroStrategy Developer.  
-2. Click **Create Project** and type a name for the new project.  
-3. Click **OK**.  
-   The Project appears in MicroStrategy Developer.  
-4. Open MicroStrategy Object Manager.  
-5. Connect to the Project Source and login as Administrator.  
+1. In MicroStrategy Developer, select **Schema > Create New Project** to create a new project. 
+2. Click **Create Project** and type a name for the new project. 
+3. Click **OK**.
+   The new project appears in MicroStrategy Developer.
+4. Open MicroStrategy Object Manager.  
+5. Connect to the project source and log in as Administrator.
    ![project sources]({{ site.baseurl }}/images/docs/socialmed1.png)
-6. In MicroStrategy Object Manager, MicroStrategy Analytics Modules, select the project for the package. For example, select **Twitter analysis Apache Drill**.  
+6. In MicroStrategy Object Manager, under MicroStrategy Analytics Modules, select the project for the package. For example, select **Twitter analysis Apache Drill**.    
    ![project sources]({{ site.baseurl }}/images/docs/socialmed2.png)
-7. Select **Tools > Import Configuration Package**.  
-8. Open the configuration package file, and click **Proceed**.  
-   ![project sources]({{ site.baseurl }}/images/docs/socialmed3.png)
-   The package with the reports is available in MicroStrategy.  
+7. Select **Tools > Import Configuration Package**.  
+8. Open the configuration package file and click **Proceed**.  
+   ![project sources]({{ site.baseurl }}/images/docs/socialmed3.png)  
+   The package with the reports is now available in MicroStrategy.  
 
-You can test and modify the reports in MicroStrategy Developer. Configure permissions if necessary.
+You can test and modify the reports in MicroStrategy Developer. Configure permissions if necessary.
 
 ----------
 
-## Update the Schema
+## Update the Schema
 
-1. In MicroStrategy Developer, select **Schema > Update Schema**.  
-2. In Schema Update, select all check boxes, and click **Update**.  
+1. In MicroStrategy Developer, select **Schema > Update Schema**.  
+2. In Schema Update, select all the check boxes and click **Update**.  
    ![project sources]({{ site.baseurl }}/images/docs/socialmed4.png)
 
 ----------
 
-## Create a User and Set the Password
+## Create a User and Set the Password
 
-1. Expand Administration.  
-2. Expand User Manager, and click **Everyone**.  
-3. Right-click to create a new user, or click **Administrator** to edit the password.  
+1. Expand Administration.  
+2. Expand User Manager, and click **Everyone**.  
+3. Right-click to create a new user, or click **Administrator** to edit the password.  
 
 ----------
 
-## About the Reports
+## About the Reports
 
-There are 18 reports in the package. Most reports prompt you to specify date ranges, output limits, and terms as needed. The package contains reports in three main categories:
+The package contains 18 reports. Most reports prompt you to specify date ranges, output limits, and terms as needed. The reports fall into three main categories:
 
-* Volumes: A number of reports that show the total volume of Tweets by different date and time designations.
-* Top List: Displays the top Tweets, Retweets, hashtags and users are displayed.
-* Specific Terms: Tweets and Retweets that can be measured or listed based on terms in the text of the Tweet itself.
+* Volumes: a number of reports that show the total volume of Tweets by different date and time designations.
+* Top List: displays the top Tweets, Retweets, hashtags, and users.
+* Specific Terms: Tweets and Retweets measured or listed based on terms in the text of the Tweet itself.
 
-You can copy and modify the reports or use the reports as a template for querying Twitter data using Drill. 
+You can copy and modify the reports, or use them as templates for querying the Twitter data with Drill. 
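+
+For a sense of what such report queries look like in SQL, here is a hypothetical Drill query that counts tweets per language directly against the feed. It is a sketch, not one of the packaged reports: `lang` is a field in the standard Twitter JSON, and the path follows the `du` command shown earlier.
+
+    0: jdbc:drill:zk=local> SELECT t.lang AS language, COUNT(*) AS tweets
+    . . . . . . . . . . . > FROM dfs.`/mapr/drill_demo/twitter/feed` t
+    . . . . . . . . . . . > GROUP BY t.lang ORDER BY tweets DESC LIMIT 10;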
 
-You can access reports through MicroStrategy Developer or the web interface. MicroStrategy Developer provides a more powerful interface than the web interface to modify reports or add new reports, but requires RDP access to the node.
+你可以通过 MicroStrategy Developer 或 Web 界面访问报告。MicroStrategy Developer 提供了比 Web 界面更强大的接口来修改或添加新报告,但需要对节点进行 RDP 访问。
 
 ----------
 
-## Using the Web Interface
+## Using the Web Interface
 
-1. Using a web browser, enter the URL for the web interface:  
+1. In a web browser, enter the URL of the web interface:  
          http://<MSTR node name or IP address>/MicroStrategy/asp/Main.aspx
-2. Log in as the User you created or as Administrator, using the credentials created initially with Developer.  
-3. On the Welcome MicroStrategy Web User page, choose the project that was used to load the analysis package: **Drill Twitter Analysis**.  
+2. Log in as the user you created or as Administrator, using the credentials created initially with Developer.
+3. On the Welcome MicroStrategy Web User page, choose the project used to load the analysis package: **Drill Twitter Analysis**.  
    ![choose project]({{ site.baseurl }}/images/docs/socialmed5.png)
-4. Select **Shared Reports**.  
-   The folders with the three main categories of the reports appear.
+4. Select **Shared Reports**.  
+   Folders for the three main report categories appear.
    ![project sources]({{ site.baseurl }}/images/docs/socialmed6.png)
-5. Select a report, and respond to any prompts. For example, to run the Top Tweet Languages by Date Range, enter the required Date_Start and Date_End.  
+5. Select a report and respond to any prompts. For example, to run Top Tweet Languages by Date Range, enter the required Date_Start and Date_End. 
    ![project sources]({{ site.baseurl }}/images/docs/socialmed7.png)
-6. Click **Run Report**.  
-   A histogram report appears showing the top tweet languages by date range.
+6. Click **Run Report**.  
+   A histogram report appears, showing the top tweet languages in the specified date range.
    ![project sources]({{ site.baseurl }}/images/docs/socialmed8.png)
-7. To refresh the data or re-enter prompt values, select **Data > Refresh** or **Data > Re-prompt**.
+7. To refresh the data or re-enter prompt values, select **Data > Refresh** or **Data > Re-prompt**.
 
-## Browsing the Apache Drill Twitter Analysis Reports
+## Browsing the Apache Drill Twitter Analysis Reports
 
-The MicroStrategy Developer reports are located in the Public Objects folder of the project you chose for installing the package.  
+The MicroStrategy Developer reports are located in the Public Objects folder of the project you chose when installing the package.  
    ![project sources]({{ site.baseurl }}/images/docs/socialmed9.png)
-Many of the reports require you to respond to prompts to select the desired data. For example, select the Top Hashtags report in the right-hand column. This report requires you to respond to prompts for a Start Date and End Date to specify the date range for data of interest; by default, data for the last two months, ending with the current date is selected. You can also specify the limit for the number of Top Hashtags to be returned; the default is the top 10 hashtags.  
+Many of the reports prompt you to select the desired data. For example, select the Top Hashtags report in the right-hand column. This report prompts for a Start Date and End Date to specify the date range of interest; by default, the last two months of data, ending with the current date, are selected. You can also specify a limit on the number of top hashtags returned; the default is the top 10 hashtags.  
    ![project sources]({{ site.baseurl }}/images/docs/socialmed10.png)
-When you click **Finish** a bar chart report with the hashtag and number of times it appeared in the specified data range appears.  
+When you click **Finish**, a bar chart report appears with each hashtag and the number of times it appeared in the specified date range.  
    ![project sources]({{ site.baseurl }}/images/docs/socialmed11.png)
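+
+Under the hood, a report like Top Hashtags reduces to a Drill query over the nested Twitter JSON. A hypothetical equivalent, assuming each tweet carries the standard `entities.hashtags` array, reuses the FLATTEN pattern from the previous tutorial:
+
+    0: jdbc:drill:zk=local> SELECT h.tag.text AS hashtag, COUNT(*) AS occurrences
+    . . . . . . . . . . . > FROM (SELECT FLATTEN(t.entities.hashtags) tag
+    . . . . . . . . . . . >   FROM dfs.`/mapr/drill_demo/twitter/feed` t) h
+    . . . . . . . . . . . > GROUP BY h.tag.text ORDER BY occurrences DESC LIMIT 10;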
 
-Other reports are available in the bundle. For example, this report shows total tweets by hour:
+Other reports are available in the bundle. For example, this report shows total tweets by hour:
    ![tweets by hour]({{ site.baseurl }}/images/docs/socialmed12.png)
-This report shows top Retweets for a date range with original Tweet date and count in the date range.  
+This report shows the top Retweets for a date range, with the original Tweet date and the count within the range.  
    ![retweets report]({{ site.baseurl }}/images/docs/socialmed13.png)
 
 ----------
 
-## Summary
+## Summary
 
-In this tutorial, you learned how to configure an environment to stream Twitter data using Apache Flume. You then learned how to analyze the data in native JSON format with SQL using Apache Drill, and how to run interactive reports and analysis using MicroStrategy.
+In this tutorial, you learned how to configure an environment to stream Twitter data using Apache Flume, how to analyze the data in native JSON format with SQL using Apache Drill, and how to run interactive reports and analysis using MicroStrategy.