Posted to commits@carbondata.apache.org by ch...@apache.org on 2017/02/04 02:38:18 UTC

[19/35] incubator-carbondata-site git commit: Updated website for CarbonData release 1.0.0

http://git-wip-us.apache.org/repos/asf/incubator-carbondata-site/blob/0d4cdb1c/content/docs/latest/mainpage.html
----------------------------------------------------------------------
diff --git a/content/docs/latest/mainpage.html b/content/docs/latest/mainpage.html
new file mode 100644
index 0000000..b85e1b2
--- /dev/null
+++ b/content/docs/latest/mainpage.html
@@ -0,0 +1,144 @@
+<!DOCTYPE html>
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <link href='../../images/favicon.ico' rel='shortcut icon' type='image/x-icon'>
+    <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags -->
+    <title>CarbonData</title>
+<style>
+
+</style>
+    <!-- Bootstrap -->
+
+    <link rel="stylesheet" href="../../css/bootstrap.min.css">
+    <link href="../../css/style.css" rel="stylesheet">
+    <link href="../../css/print.css" rel="stylesheet" >
+    <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
+    <!-- WARNING: Respond.js doesn't work if you view the page via file:// -->
+    <!--[if lt IE 9]>
+      <script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
+      <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
+    <![endif]-->
+    <script src="../../js/jquery.min.js"></script>
+    <script src="../../js/bootstrap.min.js"></script>
+
+
+
+  </head>
+  <body>
+    <header>
+     <nav class="navbar navbar-default navbar-custom cd-navbar-wrapper" >
+      <div class="container">
+        <div class="navbar-header">
+          <button aria-controls="navbar" aria-expanded="false" data-target="#navbar" data-toggle="collapse" class="navbar-toggle collapsed" type="button">
+            <span class="sr-only">Toggle navigation</span>
+            <span class="icon-bar"></span>
+            <span class="icon-bar"></span>
+            <span class="icon-bar"></span>
+          </button>
+          <a href="../../index.html" class="logo">
+             <img src="../../images/CarbonDataLogo.png" alt="CarbonData logo" title="CarbonData logo"  />
+          </a>
+        </div>
+        <div class="navbar-collapse collapse cd_navcontnt" id="navbar">
+         <ul class="nav navbar-nav navbar-right navlist-custom">
+              <li><a href="../../index.html" class="hidden-xs"><i class="fa fa-home" aria-hidden="true"></i> </a></li>
+              <li><a href="../../index.html" class="hidden-lg hidden-md hidden-sm">Home</a></li>
+              <li class="dropdown">
+                  <a href="#" class="dropdown-toggle " data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false"> Download <span class="caret"></span></a>
+                  <ul class="dropdown-menu">
+                      <li>
+                          <a href="https://www.apache.org/dyn/closer.lua/incubator/carbondata/1.0.0-incubating"
+                             target="_blank">Apache CarbonData 1.0.0</a></li>
+                      <li>
+                          <a href="https://www.apache.org/dyn/closer.lua/incubator/carbondata/0.2.0-incubating"
+                             target="_blank">Apache CarbonData 0.2.0</a></li>
+                      <li>
+                          <a href="https://www.apache.org/dyn/closer.lua/incubator/carbondata/0.1.1-incubating"
+                             target="_blank">Apache CarbonData 0.1.1</a></li>
+                      <li>
+                          <a href="https://www.apache.org/dyn/closer.lua/incubator/carbondata/0.1.0-incubating"
+                             target="_blank">Apache CarbonData 0.1.0</a></li>
+                      <li>
+                          <a href="https://cwiki.apache.org/confluence/display/CARBONDATA/Releases"
+                             target="_blank">Release Archive</a></li>
+                  </ul>
+                </li>
+
+              <li><a href="mainpage.html?page=userguide" class="">Documentation</a></li>
+              <li class="dropdown">
+                  <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Community <span class="caret"></span></a>
+                  <ul class="dropdown-menu">
+                      <li><a href="https://cwiki.apache.org/confluence/display/CARBONDATA/Contributing+to+CarbonData" target="_blank">Contributing to CarbonData</a></li>
+                      <li><a href="https://cwiki.apache.org/confluence/display/CARBONDATA/Committers" target="_blank">Project Committers</a></li>
+                    <li><a href="../../meetup.html">CarbonData Meetups </a></li>
+                  </ul>
+                </li>
+                <li class="dropdown">
+                  <a href="http://www.apache.org/" class="apache_link hidden-xs dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Apache</a>
+                   <ul class="dropdown-menu">
+                      <li><a href="http://www.apache.org/"  target="_blank">Apache Homepage</a></li>
+                      <li><a href="http://www.apache.org/licenses/"  target="_blank">License</a></li>
+                      <li><a href="http://www.apache.org/foundation/sponsorship.html"  target="_blank">Sponsorship</a></li>
+                      <li><a href="http://www.apache.org/foundation/thanks.html"  target="_blank">Thanks</a></li>
+                    </ul>
+                </li>
+
+                <li class="dropdown">
+                  <a href="http://www.apache.org/" class="hidden-lg hidden-md hidden-sm dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Apache</a>
+                   <ul class="dropdown-menu">
+                      <li><a href="http://www.apache.org/"  target="_blank">Apache Homepage</a></li>
+                      <li><a href="http://www.apache.org/licenses/"  target="_blank">License</a></li>
+                      <li><a href="http://www.apache.org/foundation/sponsorship.html"  target="_blank">Sponsorship</a></li>
+                      <li><a href="http://www.apache.org/foundation/thanks.html"  target="_blank">Thanks</a></li>
+                    </ul>
+                </li>
+
+           </ul>
+        </div><!--/.nav-collapse -->
+      </div>
+    </nav>
+     </header> <!-- end Header part -->
+
+   <div class="fixed-padding"></div> <!--  top padding with fixed header  -->
+
+   <section><!-- Dashboard nav -->
+    <div class="container-fluid q">
+        <div class="col-sm-12  col-md-12 maindashboard">
+              <div class="row">
+                <section>
+                  <div style="padding:10px 15px;">
+                    <div class="doc-header">
+                        <div class="doc-toc">
+                            <a href="mainpage.html?page=userguide" class="icon toc-icon"></a>
+                        </div>
+                       <img src="../../images/format/CarbonData_icon.png" alt="" class="logo-print" >
+                       <span>Version: 1.0.0 | Published: 30-01-2017</span>
+                       <i class="fa fa-print print-icon" aria-hidden="true" onclick="divPrint();"></i>
+                    </div>
+                    <div id="viewpage" name="viewpage">   </div>
+                    <div class="doc-footer">
+                         <a href="#top" class="scroll-top">Top</a>
+                    </div>
+                  </div>
+                </section>
+              </div>
+        </div>
+      </div>
+    </section><!-- End systemblock part -->
+
+  <!-- jQuery (necessary for Bootstrap's JavaScript plugins) -->
+
+    <script src="../../js/custom.js"></script>
+    <script src="../../js/mdNavigation.js" type="text/javascript"></script>
+
+    <script type="text/javascript">
+     <!-- $("#leftmenu").load("table-of-content.html");-->
+    </script>
+
+
+
+  </body>
+  </html>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-carbondata-site/blob/0d4cdb1c/content/docs/latest/overview-of-carbondata.html
----------------------------------------------------------------------
diff --git a/content/docs/latest/overview-of-carbondata.html b/content/docs/latest/overview-of-carbondata.html
new file mode 100644
index 0000000..5f4aff3
--- /dev/null
+++ b/content/docs/latest/overview-of-carbondata.html
@@ -0,0 +1,51 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+-->
+<h1>Overview</h1><p>This tutorial provides a detailed overview of:</p>
+<ul>
+  <li><a href="#introduction">Introduction</a></li>
+  <li><a href="#features">Features</a></li>
+</ul>
+
+<div id="introduction"></div>
+<h2>Introduction</h2><p>CarbonData is a fully indexed, columnar, Hadoop-native data store for processing heavy analytical workloads and detailed queries on big data. CarbonData enables faster interactive queries by using advanced columnar storage, indexing, compression and encoding techniques to improve computing efficiency, which helps speed up queries by an order of magnitude over petabytes of data.</p><p>In customer benchmarks, CarbonData has been shown to manage petabytes of data on extraordinarily low-cost hardware and to answer queries around 10 times faster than current open source solutions (column-oriented SQL-on-Hadoop data stores).</p><p>Some of the salient features of CarbonData are:</p>
+<ul>
+  <li>Low latency for various data access patterns such as sequential, random and OLAP.</li>
+  <li>Fast query on fast data.</li>
+  <li>Space efficiency.</li>
+  <li>A general format available across the Hadoop ecosystem.</li>
+</ul>
+
+<div id="features"></div>
+<h2>Features</h2><p>The CarbonData file format is a columnar store in HDFS. It has many of the features of a modern columnar format, such as splittability, compression schemes and complex data types. In addition, CarbonData has the following unique features:</p>
+<ul>
+  <li><p>Unique Data Organization: Although CarbonData stores data in a columnar format, it differs from traditional columnar formats in that the columns in each row group (Data Block) are sorted independently of the other columns. This arrangement requires CarbonData to store the row-number mapping against each column value, but it makes binary search possible for faster filtering. Because the values are sorted, identical or similar values are stored together, which yields better compression and offsets the storage overhead of the row-number mapping.</p></li>
+  <li><p>Advanced Push Down Optimizations: CarbonData pushes as much of the query processing as possible close to the data to minimize the amount of data being read, processed, converted and transmitted/shuffled. Using projections and filters, it reads only the required columns from the store and only the rows that match the filter conditions provided in the query.</p></li>
+  <li><p>Multi Level Indexing: CarbonData uses multiple indices at various levels to enable faster search and speed up query processing.</p></li>
+  <li><p>Dictionary Encoding: Most databases and big data SQL data stores employ columnar encoding to achieve data compression by storing small integers (surrogate values) instead of full string values. However, almost all existing databases and data stores divide the data into row groups containing anywhere from a few thousand to a million rows and employ dictionary encoding only within each row group. Hence, the same column value can have different surrogate values in different row groups, so the conversion from surrogate value to actual value must be done immediately after the data is read from disk. CarbonData instead employs a global surrogate key, which means that a common dictionary is maintained for the full store on one machine/node. CarbonData can therefore perform all query processing work, such as grouping/aggregation and sorting, on lightweight surrogate values, and needs to convert surrogates to actual values only for the final result. This improves performance in two ways. First, conversion from surrogate values to actual values is done only for the final result rows, which are far fewer than the rows read from the store. Second, all query processing and computation is done on lightweight surrogate values, which require less memory and CPU time than actual values.</p></li>
+  <li><p>Deep Spark Integration: It has built-in Spark integration for Spark 1.6.2 and 2.1, with interfaces for Spark SQL, the DataFrame API and query optimization. It supports bulk data ingestion and allows Spark DataFrames to be saved as CarbonData files.</p></li>
+  <li><p>Update Delete Support: It supports batch updates, such as daily update scenarios for OLAP, and a Base+Delta file-based design.</p></li>
+  <li><p>Bucketing: A technique for distributing data uniformly across files in CarbonData, which improves the performance of join queries. While loading the data, records are placed into buckets based on a hashing algorithm. During the execution of join queries, the records can be fetched from the buckets without the need for shuffling. This feature is used to distribute/organize the table/partition data into multiple files, placing similar records in the same file.</p></li>
+  <li><p>Global Multi Dimensional Keys (MDK) based B+Tree Index for all non-measure columns: Aids in quickly locating the row groups (Data Blocks) that contain the data matching search/filter criteria.</p></li>
+  <li><p>Min-Max Index for all columns: Aids in quickly locating the row groups (Data Blocks) that contain the data matching search/filter criteria.</p></li>
+  <li><p>Data Block level Inverted Index for all columns: Aids in quickly locating the rows that contain the data matching search/filter criteria within a row group (Data Block).</p></li>
+  <li><p>Store data along with index: Significantly accelerates query performance and reduces I/O scans and CPU usage when there are filters in the query. The CarbonData index consists of multiple levels of indices. A processing framework can leverage this index to reduce the tasks it needs to schedule and process. It can also perform skip scans in finer-grained units (called blocklets) during task-side scanning instead of scanning the whole file.</p></li>
+  <li><p>Operable encoded data: It supports efficient compression and global encoding schemes and can query compressed/encoded data directly. The data is converted only just before the results are returned to the user, which is known as "late materialization".</p></li>
+  <li><p>Column group: Allows multiple columns to form a column group that is stored in row format. This reduces the row reconstruction cost at query time.</p></li>
+  <li><p>Support for various use cases with a single data format: Examples are interactive OLAP-style queries, sequential access (big scan) and random access (narrow scan).</p></li>
+</ul>
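
Editor's note (not part of the commit): the "Dictionary Encoding" and "Operable encoded data" items above describe grouping on global surrogate values and converting back to actual values only for the final result. The following is a minimal Python sketch of that idea, not CarbonData's implementation. All function names are hypothetical, and assigning surrogates in sorted order is an assumption made here only to keep the output deterministic.

```python
from collections import Counter


def build_global_dictionary(values):
    """Map each distinct value to one integer surrogate shared by all blocks.

    Assumption of this sketch: surrogates are assigned in sorted value order.
    """
    return {value: surrogate for surrogate, value in enumerate(sorted(set(values)))}


def encode(block, dictionary):
    """Replace every value in a block with its global surrogate."""
    return [dictionary[value] for value in block]


def group_count(encoded_blocks):
    """Aggregate across blocks using only the integer surrogates."""
    counts = Counter()
    for block in encoded_blocks:
        counts.update(block)
    return counts


def materialize(counts, dictionary):
    """Late materialization: decode surrogates only for the final result rows."""
    reverse = {surrogate: value for value, surrogate in dictionary.items()}
    return {reverse[surrogate]: n for surrogate, n in counts.items()}


# Two hypothetical data blocks of the same string column.
blocks = [["paris", "oslo", "paris"], ["oslo", "lima", "paris"]]
dictionary = build_global_dictionary(v for block in blocks for v in block)
encoded = [encode(block, dictionary) for block in blocks]
result = materialize(group_count(encoded), dictionary)
```

Because the dictionary is global, a given surrogate means the same value in every block, so the per-block counts can be merged without decoding. With a per-block dictionary, each block would have to be decoded before its values could be combined.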