Posted to commits@carbondata.apache.org by ch...@apache.org on 2018/09/07 16:53:55 UTC

[08/39] carbondata-site git commit: Added new page layout & updated as per new md files

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/44eed099/src/main/webapp/performance-tuning.html
----------------------------------------------------------------------
diff --git a/src/main/webapp/performance-tuning.html b/src/main/webapp/performance-tuning.html
new file mode 100644
index 0000000..49b3d3a
--- /dev/null
+++ b/src/main/webapp/performance-tuning.html
@@ -0,0 +1,529 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <link href='images/favicon.ico' rel='shortcut icon' type='image/x-icon'>
+    <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags -->
+    <title>CarbonData</title>
+    <style>
+
+    </style>
+    <!-- Bootstrap -->
+
+    <link rel="stylesheet" href="css/bootstrap.min.css">
+    <link href="css/style.css" rel="stylesheet">
+    <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
+    <!-- WARNING: Respond.js doesn't work if you view the page via file:// -->
+    <!--[if lt IE 9]>
+    <script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
+    <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
+    <![endif]-->
+    <script src="js/jquery.min.js"></script>
+    <script src="js/bootstrap.min.js"></script>
+    <script defer src="https://use.fontawesome.com/releases/v5.0.8/js/all.js"></script>
+
+
+</head>
+<body>
+<header>
+    <nav class="navbar navbar-default navbar-custom cd-navbar-wrapper">
+        <div class="container">
+            <div class="navbar-header">
+                <button aria-controls="navbar" aria-expanded="false" data-target="#navbar" data-toggle="collapse"
+                        class="navbar-toggle collapsed" type="button">
+                    <span class="sr-only">Toggle navigation</span>
+                    <span class="icon-bar"></span>
+                    <span class="icon-bar"></span>
+                    <span class="icon-bar"></span>
+                </button>
+                <a href="index.html" class="logo">
+                    <img src="images/CarbonDataLogo.png" alt="CarbonData logo" title="CarbonData logo"/>
+                </a>
+            </div>
+            <div class="navbar-collapse collapse cd_navcontnt" id="navbar">
+                <ul class="nav navbar-nav navbar-right navlist-custom">
+                    <li><a href="index.html" class="hidden-xs"><i class="fa fa-home" aria-hidden="true"></i> </a>
+                    </li>
+                    <li><a href="index.html" class="hidden-lg hidden-md hidden-sm">Home</a></li>
+                    <li class="dropdown">
+                        <a href="#" class="dropdown-toggle " data-toggle="dropdown" role="button" aria-haspopup="true"
+                           aria-expanded="false"> Download <span class="caret"></span></a>
+                        <ul class="dropdown-menu">
+                            <li>
+                                <a href="https://dist.apache.org/repos/dist/release/carbondata/1.4.1/"
+                                   target="_blank">Apache CarbonData 1.4.1</a></li>
+							<li>
+                                <a href="https://dist.apache.org/repos/dist/release/carbondata/1.4.0/"
+                                   target="_blank">Apache CarbonData 1.4.0</a></li>
+                            <li>
+                                <a href="https://dist.apache.org/repos/dist/release/carbondata/1.3.1/"
+                                   target="_blank">Apache CarbonData 1.3.1</a></li>
+                            <li>
+                                <a href="https://dist.apache.org/repos/dist/release/carbondata/1.3.0/"
+                                   target="_blank">Apache CarbonData 1.3.0</a></li>
+                            <li>
+                                <a href="https://cwiki.apache.org/confluence/display/CARBONDATA/Releases"
+                                   target="_blank">Release Archive</a></li>
+                        </ul>
+                    </li>
+                    <li><a href="documentation.html" class="active">Documentation</a></li>
+                    <li class="dropdown">
+                        <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true"
+                           aria-expanded="false">Community <span class="caret"></span></a>
+                        <ul class="dropdown-menu">
+                            <li>
+                                <a href="https://github.com/apache/carbondata/blob/master/docs/How-to-contribute-to-Apache-CarbonData.md"
+                                   target="_blank">Contributing to CarbonData</a></li>
+                            <li>
+                                <a href="https://github.com/apache/carbondata/blob/master/docs/release-guide.md"
+                                   target="_blank">Release Guide</a></li>
+                            <li>
+                                <a href="https://cwiki.apache.org/confluence/display/CARBONDATA/PMC+and+Committers+member+list"
+                                   target="_blank">Project PMC and Committers</a></li>
+                            <li>
+                                <a href="https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=66850609"
+                                   target="_blank">CarbonData Meetups</a></li>
+                            <li><a href="security.html">Apache CarbonData Security</a></li>
+                            <li><a href="https://issues.apache.org/jira/browse/CARBONDATA" target="_blank">Apache
+                                Jira</a></li>
+                            <li><a href="videogallery.html">CarbonData Videos </a></li>
+                        </ul>
+                    </li>
+                    <li class="dropdown">
+                        <a href="http://www.apache.org/" class="apache_link hidden-xs dropdown-toggle"
+                           data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Apache</a>
+                        <ul class="dropdown-menu">
+                            <li><a href="http://www.apache.org/" target="_blank">Apache Homepage</a></li>
+                            <li><a href="http://www.apache.org/licenses/" target="_blank">License</a></li>
+                            <li><a href="http://www.apache.org/foundation/sponsorship.html"
+                                   target="_blank">Sponsorship</a></li>
+                            <li><a href="http://www.apache.org/foundation/thanks.html" target="_blank">Thanks</a></li>
+                        </ul>
+                    </li>
+
+                    <li class="dropdown">
+                        <a href="http://www.apache.org/" class="hidden-lg hidden-md hidden-sm dropdown-toggle"
+                           data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Apache</a>
+                        <ul class="dropdown-menu">
+                            <li><a href="http://www.apache.org/" target="_blank">Apache Homepage</a></li>
+                            <li><a href="http://www.apache.org/licenses/" target="_blank">License</a></li>
+                            <li><a href="http://www.apache.org/foundation/sponsorship.html"
+                                   target="_blank">Sponsorship</a></li>
+                            <li><a href="http://www.apache.org/foundation/thanks.html" target="_blank">Thanks</a></li>
+                        </ul>
+                    </li>
+
+                    <li>
+                        <a href="#" id="search-icon"><i class="fa fa-search" aria-hidden="true"></i></a>
+
+                    </li>
+
+                </ul>
+            </div><!--/.nav-collapse -->
+            <div id="search-box">
+                <form method="get" action="http://www.google.com/search" target="_blank">
+                    <div class="search-block">
+                        <table border="0" cellpadding="0" width="100%">
+                            <tr>
+                                <td style="width:80%">
+                                    <input type="text" name="q" size=" 5" maxlength="255" value=""
+                                           class="search-input"  placeholder="Search...."    required/>
+                                </td>
+                                <td style="width:20%">
+                                    <input type="submit" value="Search"/></td>
+                            </tr>
+                            <tr>
+                                <td align="left" style="font-size:75%" colspan="2">
+                                    <input type="checkbox" name="sitesearch" value="carbondata.apache.org" checked/>
+                                    <span style=" position: relative; top: -3px;"> Only search for CarbonData</span>
+                                </td>
+                            </tr>
+                        </table>
+                    </div>
+                </form>
+            </div>
+        </div>
+    </nav>
+</header> <!-- end Header part -->
+
+<div class="fixed-padding"></div> <!--  top padding with fixed header  -->
+
+<section><!-- Dashboard nav -->
+    <div class="container-fluid q">
+        <div class="col-sm-12  col-md-12 maindashboard">
+            <div class="verticalnavbar">
+                <nav class="b-sticky-nav">
+                    <div class="nav-scroller">
+                        <div class="nav__inner">
+                            <a class="b-nav__intro nav__item" href="./introduction.html">introduction</a>
+                            <a class="b-nav__quickstart nav__item" href="./quick-start-guide.html">quick start</a>
+                            <a class="b-nav__uses nav__item" href="./usescases.html">use cases</a>
+
+                            <div class="nav__item nav__item__with__subs">
+                                <a class="b-nav__docs nav__item nav__sub__anchor" href="./language-manual.html">Language Reference</a>
+                                <a class="nav__item nav__sub__item" href="./ddl-of-carbondata.html">DDL</a>
+                                <a class="nav__item nav__sub__item" href="./dml-of-carbondata.html">DML</a>
+                                <a class="nav__item nav__sub__item" href="./streaming-guide.html">Streaming</a>
+                                <a class="nav__item nav__sub__item" href="./configuration-parameters.html">Configuration</a>
+                                <a class="nav__item nav__sub__item" href="./datamap-developer-guide.html">Datamaps</a>
+                                <a class="nav__item nav__sub__item" href="./supported-data-types-in-carbondata.html">Data Types</a>
+                            </div>
+
+                            <div class="nav__item nav__item__with__subs">
+                                <a class="b-nav__datamap nav__item nav__sub__anchor" href="./datamap-management.html">DataMaps</a>
+                                <a class="nav__item nav__sub__item" href="./bloomfilter-datamap-guide.html">Bloom Filter</a>
+                                <a class="nav__item nav__sub__item" href="./lucene-datamap-guide.html">Lucene</a>
+                                <a class="nav__item nav__sub__item" href="./preaggregate-datamap-guide.html">Pre-Aggregate</a>
+                                <a class="nav__item nav__sub__item" href="./timeseries-datamap-guide.html">Time Series</a>
+                            </div>
+
+                            <a class="b-nav__s3 nav__item" href="./s3-guide.html">S3 Support</a>
+                            <a class="b-nav__api nav__item" href="./sdk-guide.html">API</a>
+                            <a class="b-nav__perf nav__item" href="./performance-tuning.html">Performance Tuning</a>
+                            <a class="b-nav__faq nav__item" href="./faq.html">FAQ</a>
+                            <a class="b-nav__contri nav__item" href="./how-to-contribute-to-apache-carbondata.html">Contribute</a>
+                            <a class="b-nav__security nav__item" href="./security.html">Security</a>
+                            <a class="b-nav__release nav__item" href="./release-guide.html">Release Guide</a>
+                        </div>
+                    </div>
+                    <div class="navindicator">
+                        <div class="b-nav__intro navindicator__item"></div>
+                        <div class="b-nav__quickstart navindicator__item"></div>
+                        <div class="b-nav__uses navindicator__item"></div>
+                        <div class="b-nav__docs navindicator__item"></div>
+                        <div class="b-nav__datamap navindicator__item"></div>
+                        <div class="b-nav__s3 navindicator__item"></div>
+                        <div class="b-nav__api navindicator__item"></div>
+                        <div class="b-nav__perf navindicator__item"></div>
+                        <div class="b-nav__faq navindicator__item"></div>
+                        <div class="b-nav__contri navindicator__item"></div>
+                        <div class="b-nav__security navindicator__item"></div>
+                    </div>
+                </nav>
+            </div>
+            <div class="mdcontent">
+                <section>
+                    <div style="padding:10px 15px;">
+                        <div id="viewpage" name="viewpage">
+                            <div class="row">
+                                <div class="col-sm-12  col-md-12">
+                                    <div>
+<h1>
+<a id="useful-tips" class="anchor" href="#useful-tips" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Useful Tips</h1>
+<p>This tutorial shows how to create CarbonData tables and tune them for performance.
+The following sections cover these topics:</p>
+<ul>
+<li><a href="#suggestions-to-create-carbondata-table">Suggestions to create CarbonData Table</a></li>
+<li><a href="#configuration-for-optimizing-data-loading-performance-for-massive-data">Configuration for Optimizing Data Loading performance for Massive Data</a></li>
+<li><a href="#configurations-for-optimizing-carbondata-performance">Configurations for Optimizing CarbonData Performance</a></li>
+</ul>
+<h2>
+<a id="suggestions-to-create-carbondata-table" class="anchor" href="#suggestions-to-create-carbondata-table" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Suggestions to Create CarbonData Table</h2>
+<p>As an example, the results of an analysis of table creation are summarized below. The tables analyzed ranged from 10 thousand to 10 billion rows and from 100 to 300 columns.
+The following table describes some of the columns of the table used.</p>
+<ul>
+<li><strong>Table Column Description</strong></li>
+</ul>
+<table>
+<thead>
+<tr>
+<th>Column Name</th>
+<th>Data Type</th>
+<th>Cardinality</th>
+<th>Attribution</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>msisdn</td>
+<td>String</td>
+<td>30 million</td>
+<td>Dimension</td>
+</tr>
+<tr>
+<td>BEGIN_TIME</td>
+<td>BigInt</td>
+<td>10 thousand</td>
+<td>Dimension</td>
+</tr>
+<tr>
+<td>HOST</td>
+<td>String</td>
+<td>1 million</td>
+<td>Dimension</td>
+</tr>
+<tr>
+<td>Dime_1</td>
+<td>String</td>
+<td>1 thousand</td>
+<td>Dimension</td>
+</tr>
+<tr>
+<td>counter_1</td>
+<td>Decimal</td>
+<td>NA</td>
+<td>Measure</td>
+</tr>
+<tr>
+<td>counter_2</td>
+<td>Numeric(20,0)</td>
+<td>NA</td>
+<td>Measure</td>
+</tr>
+<tr>
+<td>...</td>
+<td>...</td>
+<td>NA</td>
+<td>Measure</td>
+</tr>
+<tr>
+<td>counter_100</td>
+<td>Decimal</td>
+<td>NA</td>
+<td>Measure</td>
+</tr>
+</tbody>
+</table>
+<ul>
+<li><strong>Put frequently used filter columns at the beginning of SORT_COLUMNS</strong></li>
+</ul>
+<p>For example, if most queries filter on MSISDN, put MSISDN first in the SORT_COLUMNS property.
+The create table command can be modified as suggested below:</p>
+<pre><code>create table carbondata_table(
+  msisdn String,
+  BEGIN_TIME bigint,
+  HOST String,
+  Dime_1 String,
+  counter_1 Decimal,
+  ...
+  
+  )STORED BY 'carbondata'
+  TBLPROPERTIES ('SORT_COLUMNS'='msisdn, Dime_1')
+</code></pre>
+<p>Queries that filter on MSISDN will now be more efficient.</p>
+<ul>
+<li><strong>Order frequently used columns from low to high cardinality in SORT_COLUMNS</strong></li>
+</ul>
+<p>If a table has multiple columns that are frequently used in filters, list
+them in SORT_COLUMNS in order of cardinality, from low to high. This ordering improves the compression ratio and
+speeds up queries that filter on these columns.</p>
+<p>For example, if MSISDN, HOST and Dime_1 are frequently used columns, the suggested column order is
+Dime_1&gt;HOST&gt;MSISDN, because Dime_1 has the lowest cardinality.
+The create table command can be modified as suggested below:</p>
+<pre><code>create table carbondata_table(
+    msisdn String,
+    BEGIN_TIME bigint,
+    HOST String,
+    Dime_1 String,
+    counter_1 Decimal,
+    ...
+    
+    )STORED BY 'carbondata'
+    TBLPROPERTIES ('SORT_COLUMNS'='Dime_1, HOST, MSISDN')
+</code></pre>
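+<p>The ordering rule above can be sketched in a few lines of Python. This is only an illustration, not part of CarbonData: the helper name <code>sort_columns_clause</code> is hypothetical, and the cardinalities are the ones listed in the column description table.</p>

```python
# Cardinalities copied from the "Table Column Description" table above.
CARDINALITY = {
    "msisdn": 30_000_000,
    "HOST": 1_000_000,
    "Dime_1": 1_000,
}


def sort_columns_clause(filter_columns, cardinality):
    """Return a SORT_COLUMNS value ordered from low to high cardinality.

    Hypothetical helper for illustration; CarbonData does not ship it.
    """
    ordered = sorted(filter_columns, key=lambda name: cardinality[name])
    return ", ".join(ordered)


clause = sort_columns_clause(["msisdn", "HOST", "Dime_1"], CARDINALITY)
print(f"TBLPROPERTIES ('SORT_COLUMNS'='{clause}')")
# prints: TBLPROPERTIES ('SORT_COLUMNS'='Dime_1, HOST, msisdn')
```

+<p>Only columns that are actually used in filters should be passed in; the sketch does not decide which columns belong in SORT_COLUMNS.</p>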
+<ul>
+<li><strong>For measure columns that do not need high accuracy, replace the Numeric(20,0) data type with Double</strong></li>
+</ul>
+<p>For measure columns that do not require high accuracy, replace the Numeric data type with Double to improve query performance.
+The create table command can be modified as below:</p>
+<pre><code>  create table carbondata_table(
+    Dime_1 String,
+    BEGIN_TIME bigint,
+    END_TIME bigint,
+    HOST String,
+    MSISDN String,
+    counter_1 decimal,
+    counter_2 double,
+    ...
+    )STORED BY 'carbondata'
+    TBLPROPERTIES ('SORT_COLUMNS'='Dime_1, HOST, MSISDN')
+</code></pre>
+<p>In a test case, this change reduced query execution time from 15 to 3 seconds, improving performance by nearly 5 times.</p>
+<ul>
+<li><strong>Columns with incremental values should be placed at the end of the dimensions</strong></li>
+</ul>
+<p>Consider a scenario where data is loaded each day and begin_time increases with each load. In this case, put begin_time at the end of the dimensions.
+Incremental values make efficient use of the min/max index. The create table command can be modified as below:</p>
+<pre><code>create table carbondata_table(
+  Dime_1 String,
+  HOST String,
+  MSISDN String,
+  counter_1 double,
+  counter_2 double,
+  BEGIN_TIME bigint,
+  END_TIME bigint,
+  ...
+  counter_100 double
+  )STORED BY 'carbondata'
+  TBLPROPERTIES ('SORT_COLUMNS'='Dime_1, HOST, MSISDN')
+</code></pre>
+<p><strong>NOTE:</strong></p>
+<ul>
+<li>A BloomFilter datamap can be created to improve performance for queries with exact-match (equality or IN) conditions. You can find more information about it in the BloomFilter datamap <a href="https://github.com/apache/carbondata/blob/master/docs/datamap/bloomfilter-datamap-guide.html" target="_blank">document</a>.</li>
+</ul>
+<h2>
+<a id="configuration-for-optimizing-data-loading-performance-for-massive-data" class="anchor" href="#configuration-for-optimizing-data-loading-performance-for-massive-data" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Configuration for Optimizing Data Loading performance for Massive Data</h2>
+<p>CarbonData supports loading large volumes of data. Sorting the data during loading consumes a lot of memory and disk I/O, and
+can sometimes result in an "Out Of Memory" exception.
+If memory is limited, you may prefer a slower data load over a failed one.
+You can tune the following properties in the carbon.properties file to get better performance.</p>
+<table>
+<thead>
+<tr>
+<th>Parameter</th>
+<th>Default Value</th>
+<th>Description/Tuning</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>carbon.number.of.cores.while.loading</td>
+<td>Default: 2. This value should be &gt;= 2.</td>
+<td>Specifies the number of cores used for data processing during data loading in CarbonData.</td>
+</tr>
+<tr>
+<td>carbon.sort.size</td>
+<td>Default: 100000. The value should be &gt;= 100.</td>
+<td>Threshold at which data is written to a local file in the sort step during data loading.</td>
+</tr>
+<tr>
+<td>carbon.sort.file.write.buffer.size</td>
+<td>Default: 50000.</td>
+<td>Buffer size of the DataOutputStream.</td>
+</tr>
+<tr>
+<td>carbon.merge.sort.reader.thread</td>
+<td>Default: 3</td>
+<td>Specifies the number of cores used for temp file merging during data loading in CarbonData.</td>
+</tr>
+<tr>
+<td>carbon.merge.sort.prefetch</td>
+<td>Default: true</td>
+<td>You may want to set this value to false if you do not have enough memory.</td>
+</tr>
+</tbody>
+</table>
+<p>For example, suppose 10 million records are to be loaded into a CarbonData table on a machine with only 16 cores and 64GB of memory.
+With the default configuration, the load always fails in the sort step. Modify carbon.properties as suggested below:</p>
+<pre><code>carbon.merge.sort.reader.thread=1
+carbon.sort.size=5000
+carbon.sort.file.write.buffer.size=5000
+carbon.merge.sort.prefetch=false
+</code></pre>
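+<p>As a rough sketch, the overrides above can be checked against the lower bounds from the parameter table before they are written to carbon.properties. The function <code>render_properties</code> and the <code>MINIMUMS</code> mapping are hypothetical names used for illustration only; CarbonData performs its own validation when it reads the file.</p>

```python
# Lower bounds stated in the parameter table above. Parameters without a
# documented bound are not checked by this sketch.
MINIMUMS = {
    "carbon.number.of.cores.while.loading": 2,
    "carbon.sort.size": 100,
}


def render_properties(overrides):
    """Validate overrides against MINIMUMS and return carbon.properties lines.

    Hypothetical helper for illustration; not part of CarbonData.
    """
    lines = []
    for key, value in overrides.items():
        minimum = MINIMUMS.get(key)
        if minimum is not None and int(value) < minimum:
            raise ValueError(f"{key} must be >= {minimum}, got {value}")
        lines.append(f"{key}={value}")
    return "\n".join(lines)


# The low-memory settings suggested above.
low_memory = {
    "carbon.merge.sort.reader.thread": 1,
    "carbon.sort.size": 5000,
    "carbon.sort.file.write.buffer.size": 5000,
    "carbon.merge.sort.prefetch": "false",
}
print(render_properties(low_memory))
```
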
+<h2>
+<a id="configurations-for-optimizing-carbondata-performance" class="anchor" href="#configurations-for-optimizing-carbondata-performance" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Configurations for Optimizing CarbonData Performance</h2>
+<p>We recently ran performance POCs on CarbonData for the finance and telecommunication fields. They involved detailed queries and aggregation
+scenarios. The configurations found to affect performance are tabulated below:</p>
+<table>
+<thead>
+<tr>
+<th>Parameter</th>
+<th>Location</th>
+<th>Used For</th>
+<th>Description</th>
+<th>Tuning</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>carbon.sort.intermediate.files.limit</td>
+<td>spark/carbonlib/carbon.properties</td>
+<td>Data loading</td>
+<td>During data loading, local temp files are used to sort the data. This number specifies the minimum number of intermediate files after which the merge sort is initiated.</td>
+<td>Increasing this value improves load performance. For example, increasing the value from 20 to 100 raised the data load performance from 35MB/S to more than 50MB/S. Higher values of this parameter consume more memory during the load.</td>
+</tr>
+<tr>
+<td>carbon.number.of.cores.while.loading</td>
+<td>spark/carbonlib/carbon.properties</td>
+<td>Data loading</td>
+<td>Specifies the number of cores used for data processing during data loading in CarbonData.</td>
+<td>If more CPU cores are available, you can increase this value, which improves performance. For example, if we increase the value from 2 to 4, the CSV reading performance can increase by about 1 times.</td>
+</tr>
+<tr>
+<td>carbon.compaction.level.threshold</td>
+<td>spark/carbonlib/carbon.properties</td>
+<td>Data loading and Querying</td>
+<td>For minor compaction, specifies the number of segments to be merged in stage 1 and number of compacted segments to be merged in stage 2.</td>
+<td>Each CarbonData load creates one segment. If every load is small, many small files are generated over time, which hurts query performance. Configuring this parameter merges the small segments into one big segment, which sorts the data and improves performance. For example, in one telecommunication scenario, performance improved about 2 times after minor compaction.</td>
+</tr>
+<tr>
+<td>spark.sql.shuffle.partitions</td>
+<td>spark/conf/spark-defaults.conf</td>
+<td>Querying</td>
+<td>The number of tasks started when Spark shuffles data.</td>
+<td>The value can be 1 to 2 times as much as the executor cores. In an aggregation scenario, reducing the number from 200 to 32 reduced the query time from 17 to 9 seconds.</td>
+</tr>
+<tr>
+<td>spark.executor.instances/spark.executor.cores/spark.executor.memory</td>
+<td>spark/conf/spark-defaults.conf</td>
+<td>Querying</td>
+<td>The number of executors, CPU cores, and memory used for CarbonData query.</td>
+<td>In the bank scenario, we provided 4 CPU cores and 15 GB of memory for each executor, which gave good performance. For these two values, more is not always better; they need to be balanced when resources are limited. For example, in the bank scenario each node had enough CPU (32 cores) but limited memory (64 GB), so we could not assign more CPU cores with less memory. With 4 cores and 12 GB for each executor, GC sometimes occurred during the query and increased the query time from 3 seconds to more than 15 seconds. In this scenario, increase the memory or decrease the CPU cores.</td>
+</tr>
+<tr>
+<td>carbon.detail.batch.size</td>
+<td>spark/carbonlib/carbon.properties</td>
+<td>Querying</td>
+<td>The buffer size to store records, returned from the block scan.</td>
+<td>This parameter is very important in limit scenarios. For example, suppose the query limit is 1000. If this value is set to 3000, the scan returns 3000 records but Spark takes only 1000 rows, so the remaining 2000 are wasted. In one finance test case with a limit of 1000, setting the value to 100 improved performance about 2 times compared with a value of 12000.</td>
+</tr>
+<tr>
+<td>carbon.use.local.dir</td>
+<td>spark/carbonlib/carbon.properties</td>
+<td>Data loading</td>
+<td>Whether to use YARN local directories for multi-table load disk load balance.</td>
+<td>If this is set to true, CarbonData uses YARN local directories for multi-table load disk load balance, which improves data load performance.</td>
+</tr>
+<tr>
+<td>carbon.use.multiple.temp.dir</td>
+<td>spark/carbonlib/carbon.properties</td>
+<td>Data loading</td>
+<td>Whether to use multiple YARN local directories during table data loading for disk load balance</td>
+<td>After enabling 'carbon.use.local.dir', if this is set to true, CarbonData uses all YARN local directories during data load for disk load balance, which improves data load performance. Enable this property if you encounter disk hotspot problems during data loading.</td>
+</tr>
+<tr>
+<td>carbon.sort.temp.compressor</td>
+<td>spark/carbonlib/carbon.properties</td>
+<td>Data loading</td>
+<td>Specifies the name of the compressor used to compress the intermediate sort temp files during the sort procedure in data loading.</td>
+<td>The optional values are 'SNAPPY','GZIP','BZIP2','LZ4','ZSTD' and empty. The default is empty, which means that CarbonData does not compress the sort temp files. This parameter is useful if you encounter a disk bottleneck.</td>
+</tr>
+<tr>
+<td>carbon.load.skewedDataOptimization.enabled</td>
+<td>spark/carbonlib/carbon.properties</td>
+<td>Data loading</td>
+<td>Whether to enable the size-based block allocation strategy for data loading.</td>
+<td>When loading, CarbonData uses a file-size-based block allocation strategy for task distribution. It makes sure that all executors process the same amount of data. This is useful if the size of your input data files varies widely, say from 1MB to 1GB.</td>
+</tr>
+<tr>
+<td>carbon.load.min.size.enabled</td>
+<td>spark/carbonlib/carbon.properties</td>
+<td>Data loading</td>
+<td>Whether to enable the node minimum input data size allocation strategy for data loading.</td>
+<td>When loading, CarbonData uses the node minimum input data size allocation strategy for task distribution. It makes sure each node loads at least the minimum amount of data. This is useful if your input data files are very small, say 1MB to 256MB, and avoids generating a large number of small files.</td>
+</tr>
+</tbody>
+</table>
+<p>Note: If your CarbonData instance is used only for queries, you may set the property 'spark.speculation=true' in the conf directory of Spark.</p>
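+<p>The guidance for spark.sql.shuffle.partitions in the table above (1 to 2 times the executor cores) can be worked through with a small calculation. The sketch below assumes that "executor cores" means the total across all executors, which the table does not state explicitly; the function name and the 4 x 4 cluster size are hypothetical.</p>

```python
def shuffle_partition_range(executor_instances, cores_per_executor):
    """Return (low, high) for spark.sql.shuffle.partitions.

    Assumption: the rule of thumb applies to the total number of executor
    cores, i.e. spark.executor.instances * spark.executor.cores.
    """
    total_cores = executor_instances * cores_per_executor
    return total_cores, 2 * total_cores


# Hypothetical cluster: 4 executors with 4 cores each.
low, high = shuffle_partition_range(executor_instances=4, cores_per_executor=4)
print(f"spark.sql.shuffle.partitions between {low} and {high}")
# prints: spark.sql.shuffle.partitions between 16 and 32
```
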
+<script>
+// Show selected style on nav item
+$(function() { $('.b-nav__perf').addClass('selected'); });
+</script>
+</div>
+</div>
+</div>
+</div>
+<div class="doc-footer">
+    <a href="#top" class="scroll-top">Top</a>
+</div>
+</div>
+</section>
+</div>
+</div>
+</div>
+</section><!-- End systemblock part -->
+<script src="js/custom.js"></script>
+</body>
+</html>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/44eed099/src/main/webapp/preaggregate-datamap-guide.html
----------------------------------------------------------------------
diff --git a/src/main/webapp/preaggregate-datamap-guide.html b/src/main/webapp/preaggregate-datamap-guide.html
index d68764d..9220c84 100644
--- a/src/main/webapp/preaggregate-datamap-guide.html
+++ b/src/main/webapp/preaggregate-datamap-guide.html
@@ -22,6 +22,7 @@
     <![endif]-->
     <script src="js/jquery.min.js"></script>
     <script src="js/bootstrap.min.js"></script>
+    <script defer src="https://use.fontawesome.com/releases/v5.0.8/js/all.js"></script>
 
 
 </head>
@@ -67,7 +68,7 @@
                                    target="_blank">Release Archive</a></li>
                         </ul>
                     </li>
-                    <li><a href="mainpage.html" class="active">Documentation</a></li>
+                    <li><a href="documentation.html" class="active">Documentation</a></li>
                     <li class="dropdown">
                         <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true"
                            aria-expanded="false">Community <span class="caret"></span></a>
@@ -152,7 +153,57 @@
 <section><!-- Dashboard nav -->
     <div class="container-fluid q">
         <div class="col-sm-12  col-md-12 maindashboard">
-            <div class="row">
+            <div class="verticalnavbar">
+                <nav class="b-sticky-nav">
+                    <div class="nav-scroller">
+                        <div class="nav__inner">
+                            <a class="b-nav__intro nav__item" href="./introduction.html">introduction</a>
+                            <a class="b-nav__quickstart nav__item" href="./quick-start-guide.html">quick start</a>
+                            <a class="b-nav__uses nav__item" href="./usescases.html">use cases</a>
+
+                            <div class="nav__item nav__item__with__subs">
+                                <a class="b-nav__docs nav__item nav__sub__anchor" href="./language-manual.html">Language Reference</a>
+                                <a class="nav__item nav__sub__item" href="./ddl-of-carbondata.html">DDL</a>
+                                <a class="nav__item nav__sub__item" href="./dml-of-carbondata.html">DML</a>
+                                <a class="nav__item nav__sub__item" href="./streaming-guide.html">Streaming</a>
+                                <a class="nav__item nav__sub__item" href="./configuration-parameters.html">Configuration</a>
+                                <a class="nav__item nav__sub__item" href="./datamap-developer-guide.html">Datamaps</a>
+                                <a class="nav__item nav__sub__item" href="./supported-data-types-in-carbondata.html">Data Types</a>
+                            </div>
+
+                            <div class="nav__item nav__item__with__subs">
+                                <a class="b-nav__datamap nav__item nav__sub__anchor" href="./datamap-management.html">DataMaps</a>
+                                <a class="nav__item nav__sub__item" href="./bloomfilter-datamap-guide.html">Bloom Filter</a>
+                                <a class="nav__item nav__sub__item" href="./lucene-datamap-guide.html">Lucene</a>
+                                <a class="nav__item nav__sub__item" href="./preaggregate-datamap-guide.html">Pre-Aggregate</a>
+                                <a class="nav__item nav__sub__item" href="./timeseries-datamap-guide.html">Time Series</a>
+                            </div>
+
+                            <a class="b-nav__s3 nav__item" href="./s3-guide.html">S3 Support</a>
+                            <a class="b-nav__api nav__item" href="./sdk-guide.html">API</a>
+                            <a class="b-nav__perf nav__item" href="./performance-tuning.html">Performance Tuning</a>
+                            <a class="b-nav__faq nav__item" href="./faq.html">FAQ</a>
+                            <a class="b-nav__contri nav__item" href="./how-to-contribute-to-apache-carbondata.html">Contribute</a>
+                            <a class="b-nav__security nav__item" href="./security.html">Security</a>
+                            <a class="b-nav__release nav__item" href="./release-guide.html">Release Guide</a>
+                        </div>
+                    </div>
+                    <div class="navindicator">
+                        <div class="b-nav__intro navindicator__item"></div>
+                        <div class="b-nav__quickstart navindicator__item"></div>
+                        <div class="b-nav__uses navindicator__item"></div>
+                        <div class="b-nav__docs navindicator__item"></div>
+                        <div class="b-nav__datamap navindicator__item"></div>
+                        <div class="b-nav__s3 navindicator__item"></div>
+                        <div class="b-nav__api navindicator__item"></div>
+                        <div class="b-nav__perf navindicator__item"></div>
+                        <div class="b-nav__faq navindicator__item"></div>
+                        <div class="b-nav__contri navindicator__item"></div>
+                        <div class="b-nav__security navindicator__item"></div>
+                    </div>
+                </nav>
+            </div>
+            <div class="mdcontent">
                 <section>
                     <div style="padding:10px 15px;">
                         <div id="viewpage" name="viewpage">
@@ -266,7 +317,7 @@ kinds of DataMap:</p>
 a. 'path' is used to specify the store location of the datamap.('path'='/location/').
 b. 'partitioning' when set to false enables user to disable partitioning of the datamap.
 Default value is true for this property.</li>
-<li>timeseries, for timeseries roll-up table. Please refer to <a href="https://github.com/apache/carbondata/blob/master/docs/datamap/timeseries-datamap-guide.html" target=_blank>Timeseries DataMap</a>
+<li>timeseries, for timeseries roll-up table. Please refer to <a href="./timeseries-datamap-guide.html">Timeseries DataMap</a>
 </li>
 </ol>
 <p>DataMap can be dropped using following DDL</p>
@@ -415,6 +466,17 @@ release, user can do as following:</p>
 <li>Create the pre-aggregate table again by <code>CREATE DATAMAP</code> command
 Basically, user can manually trigger the operation by re-building the datamap.</li>
 </ol>
+<script>
+$(function() {
+  // Show selected style on nav item
+  $('.b-nav__datamap').addClass('selected');
+  
+  if (!$('.b-nav__datamap').parent().hasClass('nav__item__with__subs--expanded')) {
+    // Display datamap subnav items
+    $('.b-nav__datamap').parent().toggleClass('nav__item__with__subs--expanded');
+  }
+});
+</script>
 </div>
 </div>
 </div>
@@ -430,4 +492,4 @@ Basically, user can manually trigger the operation by re-building the datamap.</
 </section><!-- End systemblock part -->
 <script src="js/custom.js"></script>
 </body>
-</html>
\ No newline at end of file
+</html>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/44eed099/src/main/webapp/quick-start-guide.html
----------------------------------------------------------------------
diff --git a/src/main/webapp/quick-start-guide.html b/src/main/webapp/quick-start-guide.html
index 89380b4..ea88086 100644
--- a/src/main/webapp/quick-start-guide.html
+++ b/src/main/webapp/quick-start-guide.html
@@ -22,6 +22,7 @@
     <![endif]-->
     <script src="js/jquery.min.js"></script>
     <script src="js/bootstrap.min.js"></script>
+    <script defer src="https://use.fontawesome.com/releases/v5.0.8/js/all.js"></script>
 
 
 </head>
@@ -67,7 +68,7 @@
                                    target="_blank">Release Archive</a></li>
                         </ul>
                     </li>
-                    <li><a href="mainpage.html" class="active">Documentation</a></li>
+                    <li><a href="documentation.html" class="active">Documentation</a></li>
                     <li class="dropdown">
                         <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true"
                            aria-expanded="false">Community <span class="caret"></span></a>
@@ -152,7 +153,57 @@
 <section><!-- Dashboard nav -->
     <div class="container-fluid q">
         <div class="col-sm-12  col-md-12 maindashboard">
-            <div class="row">
+            <div class="verticalnavbar">
+                <nav class="b-sticky-nav">
+                    <div class="nav-scroller">
+                        <div class="nav__inner">
+                            <a class="b-nav__intro nav__item" href="./introduction.html">introduction</a>
+                            <a class="b-nav__quickstart nav__item" href="./quick-start-guide.html">quick start</a>
+                            <a class="b-nav__uses nav__item" href="./usescases.html">use cases</a>
+
+                            <div class="nav__item nav__item__with__subs">
+                                <a class="b-nav__docs nav__item nav__sub__anchor" href="./language-manual.html">Language Reference</a>
+                                <a class="nav__item nav__sub__item" href="./ddl-of-carbondata.html">DDL</a>
+                                <a class="nav__item nav__sub__item" href="./dml-of-carbondata.html">DML</a>
+                                <a class="nav__item nav__sub__item" href="./streaming-guide.html">Streaming</a>
+                                <a class="nav__item nav__sub__item" href="./configuration-parameters.html">Configuration</a>
+                                <a class="nav__item nav__sub__item" href="./datamap-developer-guide.html">Datamaps</a>
+                                <a class="nav__item nav__sub__item" href="./supported-data-types-in-carbondata.html">Data Types</a>
+                            </div>
+
+                            <div class="nav__item nav__item__with__subs">
+                                <a class="b-nav__datamap nav__item nav__sub__anchor" href="./datamap-management.html">DataMaps</a>
+                                <a class="nav__item nav__sub__item" href="./bloomfilter-datamap-guide.html">Bloom Filter</a>
+                                <a class="nav__item nav__sub__item" href="./lucene-datamap-guide.html">Lucene</a>
+                                <a class="nav__item nav__sub__item" href="./preaggregate-datamap-guide.html">Pre-Aggregate</a>
+                                <a class="nav__item nav__sub__item" href="./timeseries-datamap-guide.html">Time Series</a>
+                            </div>
+
+                            <a class="b-nav__s3 nav__item" href="./s3-guide.html">S3 Support</a>
+                            <a class="b-nav__api nav__item" href="./sdk-guide.html">API</a>
+                            <a class="b-nav__perf nav__item" href="./performance-tuning.html">Performance Tuning</a>
+                            <a class="b-nav__faq nav__item" href="./faq.html">FAQ</a>
+                            <a class="b-nav__contri nav__item" href="./how-to-contribute-to-apache-carbondata.html">Contribute</a>
+                            <a class="b-nav__security nav__item" href="./security.html">Security</a>
+                            <a class="b-nav__release nav__item" href="./release-guide.html">Release Guide</a>
+                        </div>
+                    </div>
+                    <div class="navindicator">
+                        <div class="b-nav__intro navindicator__item"></div>
+                        <div class="b-nav__quickstart navindicator__item"></div>
+                        <div class="b-nav__uses navindicator__item"></div>
+                        <div class="b-nav__docs navindicator__item"></div>
+                        <div class="b-nav__datamap navindicator__item"></div>
+                        <div class="b-nav__s3 navindicator__item"></div>
+                        <div class="b-nav__api navindicator__item"></div>
+                        <div class="b-nav__perf navindicator__item"></div>
+                        <div class="b-nav__faq navindicator__item"></div>
+                        <div class="b-nav__contri navindicator__item"></div>
+                        <div class="b-nav__security navindicator__item"></div>
+                    </div>
+                </nav>
+            </div>
+            <div class="mdcontent">
                 <section>
                     <div style="padding:10px 15px;">
                         <div id="viewpage" name="viewpage">
@@ -161,12 +212,12 @@
                                     <div>
 <h1>
 <a id="quick-start" class="anchor" href="#quick-start" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Quick Start</h1>
-<p>This tutorial provides a quick introduction to using CarbonData.</p>
+<p>This tutorial provides a quick introduction to using CarbonData. To follow along with this guide, first download a packaged release of CarbonData from the <a href="https://dist.apache.org/repos/dist/release/carbondata/" target=_blank rel="nofollow">CarbonData website</a>. Alternatively, it can be built by following the <a href="https://github.com/apache/carbondata/tree/master/build" target=_blank>Building CarbonData</a> steps.</p>
 <h2>
 <a id="prerequisites" class="anchor" href="#prerequisites" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Prerequisites</h2>
 <ul>
 <li>
-<p><a href="https://github.com/apache/carbondata/blob/master/build" target=_blank>Installation and building CarbonData</a>.</p>
+<p>Spark 2.2.1 is installed and running. CarbonData supports Spark versions up to 2.2.1. Please follow the steps described on the <a href="https://spark.apache.org/docs/latest" target=_blank rel="nofollow">Spark docs website</a> to install and run Spark.</p>
 </li>
 <li>
 <p>Create a sample.csv file using the following commands. The CSV file is required for loading data into CarbonData.</p>
@@ -181,14 +232,29 @@ EOF
 </li>
 </ul>
 <h2>
-<a id="interactive-analysis-with-spark-shell-version-21" class="anchor" href="#interactive-analysis-with-spark-shell-version-21" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Interactive Analysis with Spark Shell Version 2.1</h2>
+<a id="deployment-modes" class="anchor" href="#deployment-modes" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Deployment modes</h2>
+<p>CarbonData can be integrated with the Spark and Presto execution engines. The documentation below describes how to install and configure CarbonData with each of these execution engines.</p>
+<h3>
+<a id="spark" class="anchor" href="#spark" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Spark</h3>
+<p><a href="#installing-and-configuring-carbondata-to-run-locally-with-spark-shell">Installing and Configuring CarbonData to run locally with Spark Shell</a></p>
+<p><a href="#installing-and-configuring-carbondata-on-standalone-spark-cluster">Installing and Configuring CarbonData on Standalone Spark Cluster</a></p>
+<p><a href="#installing-and-configuring-carbondata-on-spark-on-yarn-cluster">Installing and Configuring CarbonData on Spark on YARN Cluster</a></p>
+<h3>
+<a id="presto" class="anchor" href="#presto" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Presto</h3>
+<p><a href="#installing-and-configuring-carbondata-on-presto">Installing and Configuring CarbonData on Presto</a></p>
+<h2>
+<a id="querying-data" class="anchor" href="#querying-data" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Querying Data</h2>
+<p><a href="#query-execution-using-carbondata-thrift-server">Query Execution using CarbonData Thrift Server</a></p>
+<h2>
+<a id="installing-and-configuring-carbondata-to-run-locally-with-spark-shell" class="anchor" href="#installing-and-configuring-carbondata-to-run-locally-with-spark-shell" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Installing and Configuring CarbonData to run locally with Spark Shell</h2>
 <p>Apache Spark Shell provides a simple way to learn the API, as well as a powerful tool to analyze data interactively. Please visit <a href="http://spark.apache.org/docs/latest/" target=_blank rel="nofollow">Apache Spark Documentation</a> for more details on Spark shell.</p>
 <h4>
 <a id="basics" class="anchor" href="#basics" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Basics</h4>
 <p>Start Spark shell by running the following command in the Spark directory:</p>
 <pre><code>./bin/spark-shell --jars &lt;carbondata assembly jar path&gt;
 </code></pre>
-<p><strong>NOTE</strong>: Assembly jar will be available after <a href="https://github.com/apache/carbondata/blob/master/build/README.md" target=_blank>building CarbonData</a> and can be copied from <code>./assembly/target/scala-2.1x/carbondata_xxx.jar</code></p>
+<p><strong>NOTE</strong>: Use the path of the downloaded CarbonData packaged release. Alternatively, the assembly jar is available after <a href="https://github.com/apache/carbondata/blob/master/build/README.md" target=_blank>building CarbonData</a> and can be copied from <code>./assembly/target/scala-2.1x/carbondata_xxx.jar</code>.</p>
 <p>In this shell, SparkSession is readily available as <code>spark</code> and Spark context is readily available as <code>sc</code>.</p>
 <p>In order to create a CarbonSession we will have to configure it explicitly in the following manner :</p>
 <ul>
@@ -203,7 +269,7 @@ import org.apache.spark.sql.CarbonSession._
 <pre><code>val carbon = SparkSession.builder().config(sc.getConf)
              .getOrCreateCarbonSession("&lt;hdfs store path&gt;")
 </code></pre>
-<p><strong>NOTE</strong>: By default metastore location is pointed to <code>../carbon.metastore</code>, user can provide own metastore location to CarbonSession like <code>SparkSession.builder().config(sc.getConf) .getOrCreateCarbonSession("&lt;hdfs store path&gt;", "&lt;local metastore path&gt;")</code></p>
+<p><strong>NOTE</strong>: By default, the metastore location points to <code>../carbon.metastore</code>. Users can provide their own metastore location to CarbonSession, for example <code>SparkSession.builder().config(sc.getConf) .getOrCreateCarbonSession("&lt;hdfs store path&gt;", "&lt;local metastore path&gt;")</code>.</p>
 <h4>
 <a id="executing-queries" class="anchor" href="#executing-queries" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Executing Queries</h4>
 <h6>
@@ -222,7 +288,7 @@ import org.apache.spark.sql.CarbonSession._
                   INTO TABLE test_table")
 </code></pre>
 <p><strong>NOTE</strong>: Please provide the real file path of <code>sample.csv</code> for the above script.
-If you get "tablestatus.lock" issue, please refer to <a href="troubleshooting.html">troubleshooting</a></p>
+If you get a "tablestatus.lock" issue, please refer to the <a href="faq.html">FAQ</a>.</p>
 <h6>
 <a id="query-data-from-a-table" class="anchor" href="#query-data-from-a-table" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Query Data from a Table</h6>
 <pre><code>scala&gt;carbon.sql("SELECT * FROM test_table").show()
@@ -231,6 +297,398 @@ scala&gt;carbon.sql("SELECT city, avg(age), sum(age)
                   FROM test_table
                   GROUP BY city").show()
 </code></pre>
+<h2>
+<a id="installing-and-configuring-carbondata-on-standalone-spark-cluster" class="anchor" href="#installing-and-configuring-carbondata-on-standalone-spark-cluster" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Installing and Configuring CarbonData on Standalone Spark Cluster</h2>
+<h3>
+<a id="prerequisites-1" class="anchor" href="#prerequisites-1" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Prerequisites</h3>
+<ul>
+<li>Hadoop HDFS and Yarn should be installed and running.</li>
+<li>Spark should be installed and running on all the cluster nodes.</li>
+<li>CarbonData user should have permission to access HDFS.</li>
+</ul>
+<h3>
+<a id="procedure" class="anchor" href="#procedure" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Procedure</h3>
+<ol>
+<li>
+<p><a href="https://github.com/apache/carbondata/blob/master/build/README.md" target=_blank>Build the CarbonData</a> project and get the assembly jar from <code>./assembly/target/scala-2.1x/carbondata_xxx.jar</code>.</p>
+</li>
+<li>
+<p>Copy <code>./assembly/target/scala-2.1x/carbondata_xxx.jar</code> to <code>$SPARK_HOME/carbonlib</code> folder.</p>
+<p><strong>NOTE</strong>: Create the carbonlib folder if it does not exist inside <code>$SPARK_HOME</code> path.</p>
+</li>
+<li>
+<p>Add the carbonlib folder path in the Spark classpath. (Edit <code>$SPARK_HOME/conf/spark-env.sh</code> file and modify the value of <code>SPARK_CLASSPATH</code> by appending <code>$SPARK_HOME/carbonlib/*</code> to the existing value)</p>
+</li>
+<li>
+<p>Copy the <code>./conf/carbon.properties.template</code> file from CarbonData repository to <code>$SPARK_HOME/conf/</code> folder and rename the file to <code>carbon.properties</code>.</p>
+</li>
+<li>
+<p>Repeat Step 2 to Step 5 in all the nodes of the cluster.</p>
+</li>
+<li>
+<p>On the Spark master node, configure the properties mentioned in the following table in the <code>$SPARK_HOME/conf/spark-defaults.conf</code> file.</p>
+</li>
+</ol>
+<table>
+<thead>
+<tr>
+<th>Property</th>
+<th>Value</th>
+<th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>spark.driver.extraJavaOptions</td>
+<td><code>-Dcarbon.properties.filepath=$SPARK_HOME/conf/carbon.properties</code></td>
+<td>A string of extra JVM options to pass to the driver. For instance, GC settings or other logging.</td>
+</tr>
+<tr>
+<td>spark.executor.extraJavaOptions</td>
+<td><code>-Dcarbon.properties.filepath=$SPARK_HOME/conf/carbon.properties</code></td>
+<td>A string of extra JVM options to pass to executors. For instance, GC settings or other logging. <strong>NOTE</strong>: You can enter multiple values separated by spaces.</td>
+</tr>
+</tbody>
+</table>
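+<p>The two entries in the table above can be added with a small script. The following is a minimal sketch and not part of CarbonData: the helper name <code>set_spark_conf</code>, the <code>CONF_FILE</code> variable, and the <code>/opt/spark</code> fallback are assumptions made for illustration. The JVM option is written without spaces around <code>=</code>, because the option string is split on whitespace.</p>

```shell
# Append "key value" to a Spark conf file unless the key is already present.
set_spark_conf() {
  conf_file="$1"; key="$2"; value="$3"
  touch "$conf_file"
  # Exact match on the first field, so dots in the key are not treated as wildcards.
  if awk -v k="$key" '$1 == k { found = 1 } END { exit !found }' "$conf_file"; then
    echo "skip: $key is already set in $conf_file"
  else
    printf '%s %s\n' "$key" "$value" >> "$conf_file"
  fi
}

SPARK_HOME="${SPARK_HOME:-/opt/spark}"   # assumed install path
CONF_FILE="${CONF_FILE:-$(mktemp)}"      # scratch file by default; point it at spark-defaults.conf when ready
opt="-Dcarbon.properties.filepath=$SPARK_HOME/conf/carbon.properties"

set_spark_conf "$CONF_FILE" spark.driver.extraJavaOptions "$opt"
set_spark_conf "$CONF_FILE" spark.executor.extraJavaOptions "$opt"
cat "$CONF_FILE"
```

+<p>The sketch skips a key that is already set rather than overwriting it. In that case the CarbonData option has to be merged into the existing value by hand, separated by a space.</p>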
+<ol start="7">
+<li>Add the following properties in <code>$SPARK_HOME/conf/carbon.properties</code> file:</li>
+</ol>
+<table>
+<thead>
+<tr>
+<th>Property</th>
+<th>Required</th>
+<th>Description</th>
+<th>Example</th>
+<th>Remark</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>carbon.storelocation</td>
+<td>NO</td>
+<td>Location where CarbonData will create the store and write the data in its own format. If not specified, the spark.sql.warehouse.dir path is used.</td>
+<td>hdfs://HOSTNAME:PORT/Opt/CarbonStore</td>
+<td>An HDFS directory is recommended.</td>
+</tr>
+</tbody>
+</table>
+<ol start="8">
+<li>Verify the installation. For example:</li>
+</ol>
+<pre><code>./spark-shell --master spark://HOSTNAME:PORT --total-executor-cores 2 \
+--executor-memory 2G
+</code></pre>
+<p><strong>NOTE</strong>: Make sure you have permissions for CarbonData JARs and files through which driver and executor will start.</p>
+<h2>
+<a id="installing-and-configuring-carbondata-on-spark-on-yarn-cluster" class="anchor" href="#installing-and-configuring-carbondata-on-spark-on-yarn-cluster" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Installing and Configuring CarbonData on Spark on YARN Cluster</h2>
+<p>This section provides the procedure to install CarbonData on "Spark on YARN" cluster.</p>
+<h3>
+<a id="prerequisites-2" class="anchor" href="#prerequisites-2" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Prerequisites</h3>
+<ul>
+<li>Hadoop HDFS and Yarn should be installed and running.</li>
+<li>Spark should be installed and running in all the clients.</li>
+<li>CarbonData user should have permission to access HDFS.</li>
+</ul>
+<h3>
+<a id="procedure-1" class="anchor" href="#procedure-1" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Procedure</h3>
+<p>The following steps are only for driver nodes. (Driver nodes are the nodes that start the Spark context.)</p>
+<ol>
+<li>
+<p><a href="https://github.com/apache/carbondata/blob/master/build/README.md" target=_blank>Build the CarbonData</a> project and get the assembly jar from <code>./assembly/target/scala-2.1x/carbondata_xxx.jar</code> and copy to <code>$SPARK_HOME/carbonlib</code> folder.</p>
+<p><strong>NOTE</strong>: Create the carbonlib folder if it does not exist inside <code>$SPARK_HOME</code> path.</p>
+</li>
+<li>
+<p>Copy the <code>./conf/carbon.properties.template</code> file from CarbonData repository to <code>$SPARK_HOME/conf/</code> folder and rename the file to <code>carbon.properties</code>.</p>
+</li>
+<li>
+<p>Create <code>tar.gz</code> file of carbonlib folder and move it inside the carbonlib folder.</p>
+</li>
+</ol>
+<pre><code>cd $SPARK_HOME
+tar -zcvf carbondata.tar.gz carbonlib/
+mv carbondata.tar.gz carbonlib/
+</code></pre>
+<ol start="4">
+<li>Configure the properties mentioned in the following table in <code>$SPARK_HOME/conf/spark-defaults.conf</code> file.</li>
+</ol>
+<table>
+<thead>
+<tr>
+<th>Property</th>
+<th>Description</th>
+<th>Value</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>spark.master</td>
+<td>Set this value to run Spark on YARN.</td>
+<td>Set <code>yarn-client</code> to run Spark on YARN in client mode.</td>
+</tr>
+<tr>
+<td>spark.yarn.dist.files</td>
+<td>Comma-separated list of files to be placed in the working directory of each executor.</td>
+<td><code>$SPARK_HOME/conf/carbon.properties</code></td>
+</tr>
+<tr>
+<td>spark.yarn.dist.archives</td>
+<td>Comma-separated list of archives to be extracted into the working directory of each executor.</td>
+<td><code>$SPARK_HOME/carbonlib/carbondata.tar.gz</code></td>
+</tr>
+<tr>
+<td>spark.executor.extraJavaOptions</td>
+<td>A string of extra JVM options to pass to executors. For instance, GC settings or other logging. <strong>NOTE</strong>: You can enter multiple values separated by spaces.</td>
+<td><code>-Dcarbon.properties.filepath=carbon.properties</code></td>
+</tr>
+<tr>
+<td>spark.executor.extraClassPath</td>
+<td>Extra classpath entries to prepend to the classpath of executors. <strong>NOTE</strong>: If SPARK_CLASSPATH is defined in spark-env.sh, comment it out and append its value to the parameter spark.driver.extraClassPath below.</td>
+<td><code>carbondata.tar.gz/carbonlib/*</code></td>
+</tr>
+<tr>
+<td>spark.driver.extraClassPath</td>
+<td>Extra classpath entries to prepend to the classpath of the driver. <strong>NOTE</strong>: If SPARK_CLASSPATH is defined in spark-env.sh, comment it out and append its value to this parameter (spark.driver.extraClassPath).</td>
+<td><code>$SPARK_HOME/carbonlib/*</code></td>
+</tr>
+<tr>
+<td>spark.driver.extraJavaOptions</td>
+<td>A string of extra JVM options to pass to the driver. For instance, GC settings or other logging.</td>
+<td><code>-Dcarbon.properties.filepath=$SPARK_HOME/conf/carbon.properties</code></td>
+</tr>
+</tbody>
+</table>
+<ol start="5">
+<li>Add the following properties in <code>$SPARK_HOME/conf/carbon.properties</code>:</li>
+</ol>
+<table>
+<thead>
+<tr>
+<th>Property</th>
+<th>Required</th>
+<th>Description</th>
+<th>Example</th>
+<th>Remark</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>carbon.storelocation</td>
+<td>NO</td>
+<td>Location where CarbonData will create the store and write the data in its own format. If not specified, the spark.sql.warehouse.dir path is used.</td>
+<td>hdfs://HOSTNAME:PORT/Opt/CarbonStore</td>
+<td>An HDFS directory is recommended.</td>
+</tr>
+</tbody>
+</table>
+<ol start="6">
+<li>Verify the installation.</li>
+</ol>
+<pre><code> ./bin/spark-shell --master yarn-client --driver-memory 1g \
+ --executor-cores 2 --executor-memory 2G
+</code></pre>
+<p><strong>NOTE</strong>: Make sure you have permissions for CarbonData JARs and files through which driver and executor will start.</p>
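+<p>Before starting the shell, it can help to confirm that the files created by the procedure above are in place. The following is a minimal sketch under stated assumptions: the function name <code>check_carbon_install</code> and the <code>/opt/spark</code> fallback are hypothetical, and the check only looks at the paths named in this section (it does not inspect file permissions or jar contents).</p>

```shell
# Report whether the files this procedure creates exist under a Spark home.
# Returns 0 when everything is present, 1 otherwise.
check_carbon_install() {
  spark_home="$1"
  missing=0
  for f in "$spark_home/conf/carbon.properties" \
           "$spark_home/carbonlib/carbondata.tar.gz"; do
    if [ -f "$f" ]; then
      echo "ok      $f"
    else
      echo "missing $f"
      missing=1
    fi
  done
  # At least one jar should have been copied into carbonlib.
  if ls "$spark_home"/carbonlib/*.jar >/dev/null 2>&1; then
    echo "ok      assembly jar in $spark_home/carbonlib"
  else
    echo "missing assembly jar in $spark_home/carbonlib"
    missing=1
  fi
  return "$missing"
}

check_carbon_install "${SPARK_HOME:-/opt/spark}" \
  || echo "fix the items marked missing before starting spark-shell"
```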
+<h2>
+<a id="query-execution-using-carbondata-thrift-server" class="anchor" href="#query-execution-using-carbondata-thrift-server" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Query Execution Using CarbonData Thrift Server</h2>
+<h3>
+<a id="starting-carbondata-thrift-server" class="anchor" href="#starting-carbondata-thrift-server" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Starting CarbonData Thrift Server</h3>
+<p>a. cd <code>$SPARK_HOME</code></p>
+<p>b. Run the following command to start the CarbonData thrift server.</p>
+<pre><code>./bin/spark-submit \
+--class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
+$SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR &lt;carbon_store_path&gt;
+</code></pre>
+<table>
+<thead>
+<tr>
+<th>Parameter</th>
+<th>Description</th>
+<th>Example</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>CARBON_ASSEMBLY_JAR</td>
+<td>CarbonData assembly jar name present in the <code>$SPARK_HOME/carbonlib/</code> folder.</td>
+<td>carbondata_2.xx-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar</td>
+</tr>
+<tr>
+<td>carbon_store_path</td>
+<td>This is a parameter to the CarbonThriftServer class. It is the HDFS path where CarbonData files will be kept. It is strongly recommended to use the same value as the carbon.storelocation parameter in carbon.properties. If not specified, the spark.sql.warehouse.dir path is used.</td>
+<td><code>hdfs://&lt;host_name&gt;:port/user/hive/warehouse/carbon.store</code></td>
+</tr>
+</tbody>
+</table>
+<p><strong>NOTE</strong>: From Spark 1.6 onwards, the Thrift server runs in multi-session mode by default, which means each JDBC/ODBC connection owns its own copy of the SQL configuration and temporary function registry. Cached tables are still shared. If you prefer to run the Thrift server in single-session mode and share all SQL configuration and the temporary function registry, set the option <code>spark.sql.hive.thriftServer.singleSession</code> to <code>true</code>. You may either add this option to <code>spark-defaults.conf</code>, or pass it to <code>spark-submit</code> via <code>--conf</code>:</p>
+<pre><code>./bin/spark-submit \
+--conf spark.sql.hive.thriftServer.singleSession=true \
+--class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
+$SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR &lt;carbon_store_path&gt;
+</code></pre>
+<p><strong>But</strong> in single-session mode, if one user changes the database from one connection, the database of the other connections will be changed too.</p>
+<p><strong>Examples</strong></p>
+<ul>
+<li>Start with default memory and executors.</li>
+</ul>
+<pre><code>./bin/spark-submit \
+--class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
+$SPARK_HOME/carbonlib/carbondata_2.xx-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar \
+hdfs://&lt;host_name&gt;:port/user/hive/warehouse/carbon.store
+</code></pre>
+<ul>
+<li>Start with fixed executors and resources.</li>
+</ul>
+<pre><code>./bin/spark-submit \
+--class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
+--num-executors 3 --driver-memory 20g --executor-memory 250g \
+--executor-cores 32 \
+/srv/OSCON/BigData/HACluster/install/spark/sparkJdbc/lib/carbondata_2.xx-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar \
+hdfs://&lt;host_name&gt;:port/user/hive/warehouse/carbon.store
+</code></pre>
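+<p>Because the command spans several lines, it is easy to lose an argument when editing it. The following is a minimal sketch of a dry-run helper that assembles the command line and prints it instead of running it. The function name <code>thrift_server_cmd</code>, the host <code>namenode:8020</code>, and the memory value are hypothetical and only serve the illustration.</p>

```shell
# Print the spark-submit command line for the CarbonData Thrift server (dry run).
# Arguments: <assembly jar path> <carbon store path> [extra spark-submit options...]
thrift_server_cmd() {
  jar="$1"; store="$2"; shift 2
  # Extra options go before --class so that they precede the application jar.
  set -- ./bin/spark-submit "$@" \
      --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
      "$jar" "$store"
  printf '%s\n' "$*"
}

# Hypothetical values; the jar path is left as unexpanded variable names.
thrift_server_cmd '$SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR' \
    hdfs://namenode:8020/user/hive/warehouse/carbon.store \
    --num-executors 3 --executor-memory 2g
```

+<p>Once the printed line looks right, it can be run from <code>$SPARK_HOME</code> as shown in the examples above.</p>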
+<h3>
+<a id="connecting-to-carbondata-thrift-server-using-beeline" class="anchor" href="#connecting-to-carbondata-thrift-server-using-beeline" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Connecting to CarbonData Thrift Server Using Beeline</h3>
+<pre><code>     cd $SPARK_HOME
+     ./sbin/start-thriftserver.sh
+     ./bin/beeline -u jdbc:hive2://&lt;thriftserver_host&gt;:port
+
+     # Example
+     ./bin/beeline -u jdbc:hive2://10.10.10.10:10000
+</code></pre>
+<h2>
+<a id="installing-and-configuring-carbondata-on-presto" class="anchor" href="#installing-and-configuring-carbondata-on-presto" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Installing and Configuring CarbonData on Presto</h2>
+<ul>
+<li>
+<h3>
+<a id="installing-presto" class="anchor" href="#installing-presto" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Installing Presto</h3>
+</li>
+</ul>
+<ol>
+<li>
+<p>Download the 0.187 version of Presto using:
+<code>wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz</code></p>
+</li>
+<li>
+<p>Extract Presto tar file: <code>tar zxvf presto-server-0.187.tar.gz</code>.</p>
+</li>
+<li>
+<p>Download the Presto CLI for the coordinator and name it presto.</p>
+</li>
+</ol>
+<pre><code>  wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+
+  mv presto-cli-0.187-executable.jar presto
+
+  chmod +x presto
+</code></pre>
+<h3>
+<a id="create-configuration-files" class="anchor" href="#create-configuration-files" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Create Configuration Files</h3>
+<ol>
+<li>
+<p>Create <code>etc</code> folder in presto-server-0.187 directory.</p>
+</li>
+<li>
+<p>Create <code>config.properties</code>, <code>jvm.config</code>, <code>log.properties</code>, and <code>node.properties</code> files.</p>
+</li>
+<li>
+<p>Install uuid to generate a node.id.</p>
+<pre><code>sudo apt-get install uuid
+
+uuid
+</code></pre>
+</li>
+</ol>
+<h5>
+<a id="contents-of-your-nodeproperties-file" class="anchor" href="#contents-of-your-nodeproperties-file" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Contents of your node.properties file</h5>
+<pre><code>node.environment=production
+node.id=&lt;generated uuid&gt;
+node.data-dir=/home/ubuntu/data
+</code></pre>
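+<p>The file above can also be generated with the node id filled in. The following is a minimal sketch: the helper names <code>new_uuid</code> and <code>write_node_properties</code> and the <code>ETC_DIR</code> variable are assumptions for illustration. It tries the <code>uuid</code> command installed in the previous step and falls back to <code>uuidgen</code> or the Linux kernel's random UUID source if that command is not available.</p>

```shell
# Produce one UUID using whichever generator is available on this machine.
new_uuid() {
  if command -v uuid >/dev/null 2>&1; then
    uuid
  elif command -v uuidgen >/dev/null 2>&1; then
    uuidgen
  else
    cat /proc/sys/kernel/random/uuid
  fi
}

# Write node.properties into the given etc directory.
# Arguments: <etc directory> <data directory>
write_node_properties() {
  etc_dir="$1"; data_dir="$2"
  mkdir -p "$etc_dir"
  {
    echo "node.environment=production"
    echo "node.id=$(new_uuid)"
    echo "node.data-dir=$data_dir"
  } > "$etc_dir/node.properties"
}

ETC_DIR="${ETC_DIR:-$(mktemp -d)/etc}"   # scratch location by default; use presto-server-0.187/etc for a real node
write_node_properties "$ETC_DIR" /home/ubuntu/data
cat "$ETC_DIR/node.properties"
```

+<p>Run this once per node. Each call generates a new id, so running it again on an existing node replaces that node's <code>node.id</code>.</p>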
+<h5>
+<a id="contents-of-your-jvmconfig-file" class="anchor" href="#contents-of-your-jvmconfig-file" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Contents of your jvm.config file</h5>
+<pre><code>-server
+-Xmx16G
+-XX:+UseG1GC
+-XX:G1HeapRegionSize=32M
+-XX:+UseGCOverheadLimit
+-XX:+ExplicitGCInvokesConcurrent
+-XX:+HeapDumpOnOutOfMemoryError
+-XX:OnOutOfMemoryError=kill -9 %p
+</code></pre>
+<h5>
+<a id="contents-of-your-logproperties-file" class="anchor" href="#contents-of-your-logproperties-file" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Contents of your log.properties file</h5>
+<pre><code>com.facebook.presto=INFO
+</code></pre>
+<p>The default minimum level is <code>INFO</code>. There are four levels: <code>DEBUG</code>, <code>INFO</code>, <code>WARN</code> and <code>ERROR</code>.</p>
+<h3>
+<a id="coordinator-configurations" class="anchor" href="#coordinator-configurations" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Coordinator Configurations</h3>
+<h5>
+<a id="contents-of-your-configproperties" class="anchor" href="#contents-of-your-configproperties" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Contents of your config.properties</h5>
+<pre><code>coordinator=true
+node-scheduler.include-coordinator=false
+http-server.http.port=8086
+query.max-memory=50GB
+query.max-memory-per-node=2GB
+discovery-server.enabled=true
+discovery.uri=&lt;coordinator_ip&gt;:8086
+</code></pre>
+<p>The options <code>coordinator=true</code> and <code>node-scheduler.include-coordinator=false</code> mark this node as the coordinator and tell it not to do any of the computation work itself but to use the workers instead.</p>
+<p><strong>Note</strong>: It is recommended to set <code>query.max-memory-per-node</code> to half of the maximum JVM memory configured in <code>jvm.config</code>. If the workload is highly concurrent, use a lower value for <code>query.max-memory-per-node</code>.</p>
+<p>The two memory properties should also be related as follows:
+if <code>query.max-memory-per-node=30GB</code>,
+then <code>query.max-memory=&lt;30GB * number of nodes&gt;</code>.</p>
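+<p>As a worked example of the two rules above, the following sketch derives both values from the JVM heap size. The 16 GB heap comes from the <code>jvm.config</code> shown earlier; the node count of 3 and all variable names are hypothetical.</p>

```shell
#!/bin/sh
JVM_HEAP_GB=16   # matches -Xmx16G in jvm.config
NUM_NODES=3      # hypothetical cluster size

# Rule 1: per-node query memory is half of the JVM heap.
PER_NODE_GB=$((JVM_HEAP_GB / 2))

# Rule 2: total query memory is the per-node value times the number of nodes.
QUERY_MAX_MEMORY_GB=$((PER_NODE_GB * NUM_NODES))

echo "query.max-memory-per-node=${PER_NODE_GB}GB"
echo "query.max-memory=${QUERY_MAX_MEMORY_GB}GB"
```

+<p>With these inputs the script prints <code>query.max-memory-per-node=8GB</code> and <code>query.max-memory=24GB</code>. For a highly concurrent workload, start from a smaller per-node value.</p>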
+<h3>
+<a id="worker-configurations" class="anchor" href="#worker-configurations" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Worker Configurations</h3>
+<h5>
+<a id="contents-of-your-configproperties-1" class="anchor" href="#contents-of-your-configproperties-1" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Contents of your config.properties</h5>
+<pre><code>coordinator=false
+http-server.http.port=8086
+query.max-memory=50GB
+query.max-memory-per-node=2GB
+discovery.uri=&lt;coordinator_ip&gt;:8086
+</code></pre>
+<p><strong>Note</strong>: The <code>jvm.config</code> and <code>node.properties</code> files are the same on all the nodes (workers and coordinator), except that every node must have a different <code>node.id</code> (generated by the <code>uuid</code> command).</p>
+<h3>
+<a id="catalog-configurations" class="anchor" href="#catalog-configurations" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Catalog Configurations</h3>
+<ol>
+<li>Create a folder named <code>catalog</code> in the <code>etc</code> directory of Presto on all the nodes of the cluster, including the coordinator.</li>
+</ol>
+<h5>
+<a id="configuring-carbondata-in-presto" class="anchor" href="#configuring-carbondata-in-presto" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Configuring Carbondata in Presto</h5>
+<ol>
+<li>Create a file named <code>carbondata.properties</code> in the <code>catalog</code> folder and set the required properties on all the nodes.</li>
+</ol>
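+<p>A minimal sketch of the catalog setup follows. The property names <code>connector.name</code> and <code>carbondata-store</code> are recalled from the CarbonData Presto integration guide and should be checked against the version in use. The store location and the <code>PRESTO_HOME</code> default are purely hypothetical placeholders.</p>

```shell
#!/bin/sh
# Assumption: Presto was unpacked into ./presto-server-0.187.
PRESTO_HOME="${PRESTO_HOME:-./presto-server-0.187}"
CATALOG_DIR="$PRESTO_HOME/etc/catalog"

# Create the catalog folder inside etc.
mkdir -p "$CATALOG_DIR"

# Write the catalog file. The store path is a hypothetical example;
# replace it with the location of your CarbonData store.
cat > "$CATALOG_DIR/carbondata.properties" <<'EOF'
connector.name=carbondata
carbondata-store=hdfs://namenode:9000/carbon/store
EOF
```

+<p>Repeat this on every node so that the coordinator and all workers see the same catalog definition.</p>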
+<h3>
+<a id="add-plugins" class="anchor" href="#add-plugins" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Add Plugins</h3>
+<ol>
+<li>Create a directory named <code>carbondata</code> in the <code>plugin</code> directory of Presto.</li>
+<li>Copy the <code>carbondata</code> jars to the <code>plugin/carbondata</code> directory on all nodes.</li>
+</ol>
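+<p>The two plugin steps could look like the sketch below. <code>CARBON_JARS_DIR</code> is a hypothetical variable for wherever the CarbonData jars are on your machine; the source does not specify that location.</p>

```shell
#!/bin/sh
# Assumption: Presto was unpacked into ./presto-server-0.187.
PRESTO_HOME="${PRESTO_HOME:-./presto-server-0.187}"
# Hypothetical directory holding the CarbonData jars.
CARBON_JARS_DIR="${CARBON_JARS_DIR:-./carbondata-jars}"
PLUGIN_DIR="$PRESTO_HOME/plugin/carbondata"

# Step 1: create the carbondata directory under plugin.
mkdir -p "$PLUGIN_DIR"

# Step 2: copy the jars, if the source directory exists.
if [ -d "$CARBON_JARS_DIR" ]; then
  cp "$CARBON_JARS_DIR"/*.jar "$PLUGIN_DIR/"
else
  echo "CARBON_JARS_DIR not found; set it to the directory with the CarbonData jars" >&2
fi
```

+<p>Run the same steps on every node of the cluster.</p>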
+<h3>
+<a id="start-presto-server-on-all-nodes" class="anchor" href="#start-presto-server-on-all-nodes" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Start Presto Server on all nodes</h3>
+<pre><code>./presto-server-0.187/bin/launcher start
+</code></pre>
+<p>The command above runs Presto as a background process.</p>
+<pre><code>./presto-server-0.187/bin/launcher run
+</code></pre>
+<p>The command above runs Presto in the foreground.</p>
+<h3>
+<a id="start-presto-cli" class="anchor" href="#start-presto-cli" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Start Presto CLI</h3>
+<pre><code>./presto
+</code></pre>
+<p>To connect to the carbondata catalog, use the following command:</p>
+<pre><code>./presto --server &lt;coordinator_ip&gt;:8086 --catalog carbondata --schema &lt;schema_name&gt;
+</code></pre>
+<p>Execute the following command to ensure the workers are connected.</p>
+<pre><code>select * from system.runtime.nodes;
+</code></pre>
+<p>Now you can use the Presto CLI on the coordinator to query data sources in the catalog using the Presto workers.</p>
+<p><strong>Note:</strong> Create tables and load data before executing queries, because carbon tables cannot be created from this interface.</p>
+<script>
+// Show selected style on nav item
+$(function() { $('.b-nav__quickstart').addClass('selected'); });
+</script>
 </div>
 </div>
 </div>
@@ -246,4 +704,4 @@ scala&gt;carbon.sql("SELECT city, avg(age), sum(age)
 </section><!-- End systemblock part -->
 <script src="js/custom.js"></script>
 </body>
-</html>
\ No newline at end of file
+</html>