Posted to commits@pinot.apache.org by mc...@apache.org on 2018/11/29 00:58:28 UTC

[incubator-pinot] branch master updated: Re-org documentation (#3563)

This is an automated email from the ASF dual-hosted git repository.

mcvsubbu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git


The following commit(s) were added to refs/heads/master by this push:
     new 45063b0  Re-org documentation (#3563)
45063b0 is described below

commit 45063b02473f2ecaa0c4bbf9fa153213ea5d6d0b
Author: Subbu Subramaniam <mc...@users.noreply.github.com>
AuthorDate: Wed Nov 28 16:58:23 2018 -0800

    Re-org documentation (#3563)
    
    * Re-org documentation
    
    Combined the sections on creating pinot segments into one.
    Removed extra pictures from pluggable stream section, referencing the realtime design instead.
    Created a new top level section on customizing pinot
    Other minor edits and warning fixes
    
    * Addressed some of the comments
---
 docs/client_api.rst              |  56 +++++++-------
 docs/conf.py                     |   2 +-
 docs/creating_pinot_segments.rst |  98 ------------------------
 docs/expressions_udf.rst         |   9 ++-
 docs/index.rst                   |  36 +++++----
 docs/intro.rst                   |   4 +-
 docs/llc.rst                     |  26 +++++--
 docs/management_api.rst          |   6 +-
 docs/multitenancy.rst            |  24 +++---
 docs/partition_aware_routing.rst |   6 +-
 docs/pinot_hadoop.rst            | 157 ++++++++++++++++++++++++++++++++-------
 docs/pluggable_streams.rst       |  64 ++++++++--------
 docs/pql_examples.rst            | 140 ++++++++++++++++++++--------------
 docs/reference.rst               |   7 +-
 docs/schema_timespec.rst         |   6 +-
 docs/segment_fetcher.rst         |  16 ++--
 docs/trying_pinot.rst            |  46 +++++++-----
 17 files changed, 382 insertions(+), 321 deletions(-)

diff --git a/docs/client_api.rst b/docs/client_api.rst
index f802d02..6a99d9a 100644
--- a/docs/client_api.rst
+++ b/docs/client_api.rst
@@ -1,12 +1,14 @@
-REST API
-========
+Executing queries via REST API on the Broker
+============================================
 
-The Pinot REST API can be accessed by ``POST`` ing a JSON object containing the parameter ``pql`` to the ``/query`` endpoint on a broker. Depending on the type of query, the results can take different shapes. For example, using curl:
+The Pinot REST API can be accessed by invoking a ``POST`` operation with a JSON body containing the parameter ``pql``
+to the ``/query`` URI endpoint on a broker. Depending on the type of query, the results can take different shapes.
+The examples below use curl.
 
 Aggregation
 -----------
 
-::
+.. code-block:: none
 
   curl -X POST -d '{"pql":"select count(*) from flights"}' http://localhost:8099/query
 
@@ -30,7 +32,7 @@ Aggregation
 Aggregation with grouping
 -------------------------
 
-::
+.. code-block:: none
 
   curl -X POST -d '{"pql":"select count(*) from flights group by Carrier"}' http://localhost:8099/query
 
@@ -68,7 +70,7 @@ Aggregation with grouping
 Selection
 ---------
 
-::
+.. code-block:: none
 
   curl -X POST -d '{"pql":"select * from flights limit 3"}' http://localhost:8099/query
 
@@ -145,12 +147,12 @@ Connections to Pinot are created using the ConnectionFactory class' utility meth
 
 .. code-block:: java
 
- Connection connection = ConnectionFactory.fromZookeeper
-     (some-zookeeper-server:2191/zookeeperPath");
+   Connection connection = ConnectionFactory.fromZookeeper
+     ("some-zookeeper-server:2191/zookeeperPath");
 
- Connection connection = ConnectionFactory.fromProperties("demo.properties");
+   Connection connection = ConnectionFactory.fromProperties("demo.properties");
 
- Connection connection = ConnectionFactory.fromHostList
+   Connection connection = ConnectionFactory.fromHostList
      ("some-server:1234", "some-other-server:1234", ...);
 
 
@@ -158,8 +160,8 @@ Queries can be sent directly to the Pinot cluster using the Connection.execute(j
 
 .. code-block:: java
 
- ResultSetGroup resultSetGroup = connection.execute("select * from foo...");
- Future<ResultSetGroup> futureResultSetGroup = connection.executeAsync
+   ResultSetGroup resultSetGroup = connection.execute("select * from foo...");
+   Future<ResultSetGroup> futureResultSetGroup = connection.executeAsync
      ("select * from foo...");
 
 
@@ -167,38 +169,38 @@ Queries can also use a PreparedStatement to escape query parameters:
 
 .. code-block:: java
 
- PreparedStatement statement = connection.prepareStatement
+   PreparedStatement statement = connection.prepareStatement
      ("select * from foo where a = ?");
- statement.setString(1, "bar");
+   statement.setString(1, "bar");
 
- ResultSetGroup resultSetGroup = statement.execute();
- Future<ResultSetGroup> futureResultSetGroup = statement.executeAsync();
+   ResultSetGroup resultSetGroup = statement.execute();
+   Future<ResultSetGroup> futureResultSetGroup = statement.executeAsync();
 
 
 In the case of a selection query, results can be obtained with the various get methods in the first ResultSet, obtained through the getResultSet(int) method:
 
 .. code-block:: java
 
- ResultSet resultSet = connection.execute
+   ResultSet resultSet = connection.execute
      ("select foo, bar from baz where quux = 'quuux'").getResultSet(0);
 
- for(int i = 0; i < resultSet.getRowCount(); ++i) {
-     System.out.println("foo: " + resultSet.getString(i, 0);
-     System.out.println("bar: " + resultSet.getInt(i, 1);
- }
+   for (int i = 0; i < resultSet.getRowCount(); ++i) {
+     System.out.println("foo: " + resultSet.getString(i, 0));
+     System.out.println("bar: " + resultSet.getInt(i, 1));
+   }
 
- resultSet.close();
+   resultSet.close();
 
 
-In the case where there is an aggregation, each aggregation function is within its own ResultSet:
+In the case of aggregation, each aggregation function is within its own ResultSet:
 
 .. code-block:: java
 
- ResultSetGroup resultSetGroup = connection.execute("select count(*) from foo");
+   ResultSetGroup resultSetGroup = connection.execute("select count(*) from foo");
 
- ResultSet resultSet = resultSetGroup.getResultSet(0);
- System.out.println("Number of records: " + resultSet.getInt(0));
- resultSet.close();
+   ResultSet resultSet = resultSetGroup.getResultSet(0);
+   System.out.println("Number of records: " + resultSet.getInt(0));
+   resultSet.close();
 
 
 There can be more than one ResultSet, each of which can contain multiple results grouped by a group key.
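
A minimal sketch of walking such grouped results follows; the group-key accessors
(``getResultSetCount``, ``getGroupKeyLength``, ``getGroupKeyString``) are assumptions
not shown above, so verify them against the Java client's ``ResultSetGroup`` and
``ResultSet`` interfaces.

.. code-block:: java

   // Sketch only: the group-key accessor names are assumed, not confirmed above.
   ResultSetGroup resultSetGroup =
       connection.execute("select count(*) from flights group by Carrier");

   for (int rs = 0; rs < resultSetGroup.getResultSetCount(); ++rs) {
     ResultSet resultSet = resultSetGroup.getResultSet(rs);
     for (int row = 0; row < resultSet.getRowCount(); ++row) {
       StringBuilder groupKey = new StringBuilder();
       for (int k = 0; k < resultSet.getGroupKeyLength(); ++k) {
         groupKey.append(resultSet.getGroupKeyString(row, k)).append(' ');
       }
       // e.g. "DL  -> 10392"
       System.out.println(groupKey + "-> " + resultSet.getInt(row, 0));
     }
     resultSet.close();
   }
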
diff --git a/docs/conf.py b/docs/conf.py
index a95d37e..1cf24c1 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -48,6 +48,7 @@ extensions = [
     'sphinx.ext.todo',
     'sphinx.ext.mathjax',
     'sphinx.ext.githubpages',
+    'sphinx.ext.intersphinx',
 ]
 
 # Add any paths that contain templates here, relative to this directory.
@@ -293,7 +294,6 @@ texinfo_documents = [
      'Miscellaneous'),
 ]
 
-extensions = ['sphinx.ext.intersphinx']
 
 # Documents to append as an appendix to all manuals.
 #texinfo_appendices = []
diff --git a/docs/creating_pinot_segments.rst b/docs/creating_pinot_segments.rst
deleted file mode 100644
index 5bae71e..0000000
--- a/docs/creating_pinot_segments.rst
+++ /dev/null
@@ -1,98 +0,0 @@
-Creating Pinot segments outside of Hadoop
-=========================================
-
-This document describes steps required for creating Pinot2_0 segments from standard formats like CSV/JSON.
-
-Compiling the code
-------------------
-Follow the steps described in the section on :doc: `Demonstration <trying_pinot>` to build pinot. Locate ``pinot-admin.sh`` in ``pinot-tools/trget/pinot-tools=pkg/bin/pinot-admin.sh``.
-
-
-Data Preparation
-----------------
-
-#.  Create a top level directory containing all the CSV/JSON files that need to be converted.
-#.  The file name extensions are expected to be the same as the format name (_i.e_ ``.csv``, or ``.json``), and are case insensitive.
-    Note that the converter expects the .csv extension even if the data is delimited using tabs or spaces instead.
-#.  Prepare a schema file describing the schema of the input data. This file needs to be in JSON format. An example is provided at the end of the article.
-#.  Specifically for CSV format, an optional csv config file can be provided (also in JSON format). This is used to configure parameters like the delimiter/header for the CSV file etc.  
-        A detailed description of this follows below.  
-
-Creating a segment
-------------------
-Run the pinot-admin command to generate the segments. The command can be invoked as follows. Options within "[ ]" are optional. For -format, the default value is AVRO.
-
-::
-
-    bin/pinot-admin.sh CreateSegment -dataDir <input_data_dir> [-format [CSV/JSON/AVRO]] [-readerConfigFile <csv_config_file>] [-generatorConfigFile <generator_config_file>] -segmentName <segment_name> -schemaFile <input_schema_file> -tableName <table_name> -outDir <output_data_dir> [-overwrite]
-
-CSV Reader Config file
-----------------------
-To configure various parameters for CSV a config file in JSON format can be provided. This file is optional, as are each of its parameters. When not provided, default values used for these parameters are described below:
-
-#.  fileFormat: Specify one of the following. Default is EXCEL.  
-
- ##.  EXCEL
- ##.  MYSQL
- ##.  RFC4180
- ##.  TDF
-
-#.  header: If the input CSV file does not contain a header, it can be specified using this field. Note, if this is specified, then the input file is expected to not contain the header row, or else it will result in parse error. The columns in the header must be delimited by the same delimiter character as the rest of the CSV file.
-#.  delimiter: Use this to specify a delimiter character. The default value is ",".
-#.  dateFormat: If there are columns that are in date format and need to be converted into Epoch (in milliseconds), use this to specify the format. Default is "mm-dd-yyyy".
-#.  dateColumns: If there are multiple date columns, use this to list those columns.
-
-Below is a sample config file.
-
-::
-
-  {
-    "fileFormat" : "EXCEL",
-    "header" : "col1,col2,col3,col4",
-    "delimiter" : "\t",
-    "dateFormat" : "mm-dd-yy"
-    "dateColumns" : ["col1", "col2"]
-  }
-
-Sample JSON schema file:
-
-::
-
-  {
-    "dimensionFieldSpecs" : [
-      {			   
-        "dataType" : "STRING",
-        "delimiter" : null,
-        "singleValueField" : true,
-        "name" : "name"
-      },
-      {
-        "dataType" : "INT",
-        "delimiter" : null,
-        "singleValueField" : true,
-        "name" : "age"
-      }
-    ],
-    "timeFieldSpec" : {
-      "incomingGranularitySpec" : {
-        "timeType" : "DAYS",
-        "dataType" : "LONG",
-        "name" : "incomingName1"
-      },
-      "outgoingGranularitySpec" : {
-        "timeType" : "DAYS",
-        "dataType" : "LONG",
-        "name" : "outgoingName1"
-      }
-    },
-    "metricFieldSpecs" : [
-      {
-        "dataType" : "FLOAT",
-        "delimiter" : null,
-        "singleValueField" : true,
-        "name" : "percent"
-      }
-     ]
-    },
-    "schemaName" : "mySchema",
-  }
diff --git a/docs/expressions_udf.rst b/docs/expressions_udf.rst
index d373424..270be67 100644
--- a/docs/expressions_udf.rst
+++ b/docs/expressions_udf.rst
@@ -7,13 +7,13 @@ The query language for Pinot (:doc:`PQL <reference>`) currently only supports *s
 
 The high level requirement here is to support *expressions* that represent a function on a set of columns in the queries, as opposed to just columns.
 
-::
+.. code-block:: none
 
   select <exp1> from myTable where ... [group by <exp2>]
 
 Where exp1 and exp2 can be of the form:
 
-::
+.. code-block:: none
 
   func1(func2(col1, col2...func3(...)...), coln...)...
 
@@ -44,7 +44,8 @@ Parser
 
 The PQL parser is already capable of parsing expressions in the *selection*, *aggregation* and *group by* sections. Following is a sample query containing expression, and its parse tree shown in the image.
 
-::
+.. code-block:: none
+
   select f1(f2(col1, col2), col3) from myTable where (col4 = 'x') group by f3(col5, f4(col6, col7))
 
 
@@ -112,7 +113,7 @@ We see the following limitations in functionality currently:
 
 #. Nesting of *aggregation* functions is not supported in the expression tree. This is because the number of documents after *aggregation* is reduced. In the expression below, *sum* of *col2* would yield one value, whereas *xform1* one *col1* would yield the same number of documents as in the input.
 
-::
+.. code-block:: none
 
    sum(xform1(col1), sum(col2))
 
diff --git a/docs/index.rst b/docs/index.rst
index 66d6ec2..f1defaa 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -3,12 +3,10 @@
    You can adapt this file completely to your liking, but it should at least
    contain the root `toctree` directive.
 
-Welcome to Pinot's documentation!
-=================================
+############
+Introduction
+############
 
-########
-Contents
-########
 
 .. toctree::
    :maxdepth: 1
@@ -24,8 +22,20 @@ Reference
 .. toctree::
    :maxdepth: 1
 
-   in_production
+
    reference
+   in_production
+
+#################
+Customizing Pinot
+#################
+
+.. toctree::
+   :maxdepth: 1
+
+
+   pluggable_streams
+   segment_fetcher
 
 ################
 Design Documents
@@ -34,19 +44,19 @@ Design Documents
 .. toctree::
    :maxdepth: 1
 
+
    llc
    partition_aware_routing
    expressions_udf
-   multitenancy
    schema_timespec
 
-   
+################
+Design Proposals
+################
 
+.. toctree::
+   :maxdepth: 1
 
-Indices and tables
-==================
 
-* :ref:`genindex`
-* :ref:`modindex`
-* :ref:`search`
+   multitenancy
 
diff --git a/docs/intro.rst b/docs/intro.rst
index b169b4a..9f5cb93 100644
--- a/docs/intro.rst
+++ b/docs/intro.rst
@@ -1,5 +1,5 @@
-Introduction
-============
+About Pinot
+===========
 
 Pinot is a realtime distributed OLAP datastore, which is used at LinkedIn to deliver scalable real time analytics with low latency. It can ingest data
 from offline data sources (such as Hadoop and flat files) as well as streaming events (such as Kafka). Pinot is designed to scale horizontally,
diff --git a/docs/llc.rst b/docs/llc.rst
index 15d2e0a..4c08e70 100644
--- a/docs/llc.rst
+++ b/docs/llc.rst
@@ -1,14 +1,28 @@
-LLC Design
-==========
+Realtime Design
+===============
+
+Pinot consumes rows from streaming data (such as Kafka) and serves queries on the data consumed thus far.
+
+Two modes of consumption are supported in Pinot:
+
+.. _hlc-section:
+
+High Level Consumers
+--------------------
 
-High Level Consumer
--------------------
 .. figure:: High-level-stream.png
 
    High Level Stream Consumer Architecture
 
-Low Level Consumer
-------------------
+
+*TODO*: Add design description of how HLC realtime works
+
+
+.. _llc-section:
+
+Low Level Consumers
+------------------- 
+
 .. figure:: Low-level-stream.png
 
    Low Level Stream Consumer Architecture
diff --git a/docs/management_api.rst b/docs/management_api.rst
index a2c367a..5e4d843 100644
--- a/docs/management_api.rst
+++ b/docs/management_api.rst
@@ -1,5 +1,7 @@
-Management REST API
--------------------
+Managing Pinot via REST API on the Controller
+=============================================
+
+*TODO* : Remove this section altogether and find a place somewhere for a pointer to the management API. Maybe in the 'Running pinot in production' section?
 
 There is a REST API which allows management of tables, tenants, segments and schemas. It can be accessed by going to ``http://[controller host]/help`` which offers a web UI to do these tasks, as well as document the REST API.
 
diff --git a/docs/multitenancy.rst b/docs/multitenancy.rst
index edd4bf8..ffcf157 100644
--- a/docs/multitenancy.rst
+++ b/docs/multitenancy.rst
@@ -87,7 +87,7 @@ Adding Nodes to cluster
 
 Adding a node to the cluster can be done in two ways: manual or automatic. This is controlled by a property set in cluster config called "allowParticipantAutoJoin". If this is set to true, participants can join the cluster when they are started. If not, they need to be pre-registered in Helix via `Helix Admin <http://helix.apache.org/0.6.4-docs/tutorial_admin.html>`_ command addInstance.
 
-::
+.. code-block:: none
 
   {
    "id" : "PinotPerfTestCluster",
@@ -102,7 +102,7 @@ In Pinot 2.0 we will set AUTO_JOIN to true. This means after the SRE's procure t
 
 The znode ``CONFIGS/PARTICIPANT/ServerInstanceName`` looks like below:
 
-::
+.. code-block:: none
 
     {
      "id":"Server_localhost_8098"
@@ -120,7 +120,7 @@ The znode ``CONFIGS/PARTICIPANT/ServerInstanceName`` looks lik below:
 
 And the znode ``CONFIGS/PARTICIPANT/BrokerInstanceName`` looks like below:
 
-::
+.. code-block:: none
 
     {
      "id":"Broker_localhost_8099"
@@ -143,7 +143,7 @@ There is one resource idealstate created for Broker by default called broker_res
 
 *CLUSTERNAME/IDEALSTATES/BrokerResource (Broker IdealState before adding data resource)*
 
-::
+.. code-block:: none
 
   {
    "id" : "brokerResource",
@@ -166,7 +166,7 @@ After adding a resource using the following data resource creation command, a re
 Sample Curl request
 -------------------
 
-::
+.. code-block:: none
 
   curl -i -X POST -H 'Content-Type: application/json' -d '{"requestType":"create", "resourceName":"XLNT","tableName":"T1", "timeColumnName":"daysSinceEpoch", "timeType":"daysSinceEpoch","numberOfDataInstances":4,"numberOfCopies":2,"retentionTimeUnit":"DAYS", "retentionTimeValue":"700","pushFrequency":"daily", "brokerTagName":"XLNT", "numberOfBrokerInstances":1, "segmentAssignmentStrategy":"BalanceNumSegmentAssignmentStrategy", "resourceType":"OFFLINE", "metadata":{}}'
 
@@ -175,7 +175,7 @@ This is how it looks in Helix after running the above command.
 
 The znode ``CONFIGS/PARTICIPANT/Broker_localhost_8099`` looks as follows:
 
-::
+.. code-block:: none
 
     {
      "id":"Broker_localhost_8099"
@@ -193,7 +193,7 @@ The znode ``CONFIGS/PARTICIPANT/Broker_localhost_8099`` looks as follows:
 
 And the znode ``IDEALSTATES/brokerResource`` looks like below after Data resource is created
 
-::
+.. code-block:: none
 
     {
      "id":"brokerResource"
@@ -220,7 +220,7 @@ Server Info in Helix
 
 The znode ``CONFIGS/PARTICIPANT/Server_localhost_8098`` looks as below
 
-::
+.. code-block:: none
 
     {
      "id":"Server_localhost_8098"
@@ -238,7 +238,7 @@ The znode ``CONFIGS/PARTICIPANT/Server_localhost_8098`` looks as below
 
 And the znode ``/IDEALSTATES/XLNT (XLNT Data Resource IdealState)`` looks as below:
 
-::
+.. code-block:: none
 
     {
      "id":"XLNT"
@@ -267,7 +267,7 @@ Add a table to data resource
 
 Sample Curl request
 
-::
+.. code-block:: none
 
     curl -i -X PUT -H 'Content-Type: application/json' -d '{"requestType":"addTableToResource","resourceName":"XLNT","tableName":"T1", "resourceType":"OFFLINE", "metadata":{}}' http://CONTROLLER-HOST:PORT/dataresources
 
@@ -275,7 +275,7 @@ After the table is added, mapping between Resources and Tables are maintained in
 
 The znode ``/PROPERTYSTORE/CONFIGS/RESOURCE/XLNT`` looks like:
 
-::
+.. code-block:: none
 
     {
      "id":"mirrorProfileViewOfflineEvents1_O"
@@ -307,7 +307,7 @@ The znode ``/PROPERTYSTORE/CONFIGS/RESOURCE/XLNT`` like like:
 
 The znode ``/IDEALSTATES/XLNT (XLNT Data Resource IdealState)``
 
-::
+.. code-block:: none
 
     {
      "id":"XLNT_O"
diff --git a/docs/partition_aware_routing.rst b/docs/partition_aware_routing.rst
index 1a6a4d9..be9dcd4 100644
--- a/docs/partition_aware_routing.rst
+++ b/docs/partition_aware_routing.rst
@@ -48,7 +48,7 @@ Prune function: Name of the class that will be used by the broker to prune a seg
 
 For example, let us consider a case where the data is naturally partitioned on time column ‘daysSinceEpoch’. The segment zk metadata will have information like below:
 
-::
+.. code-block:: none
 
   {
     “partitionColumn”	: “daysSinceEpoch”,
@@ -59,7 +59,7 @@ For example, let us consider a case where the data is naturally partitioned on t
 
 Now consider the following query comes in. 
 
-::
+.. code-block:: none
 
   Select count(*) from myTable where daysSinceEpoch between 17100 and 17110
 
@@ -67,7 +67,7 @@ The broker will recognize the range predicate on the partition column, and call
 
 Let’s consider another example where the data is partitioned by memberId, where a hash function was applied on the memberId to compute a partition number.
 
-::
+.. code-block:: none
 
   {
     “partitionColumn”	: “memberId”,
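
To make the pruning described above concrete, here is a purely illustrative sketch
(hypothetical names, not the actual Pinot pruner classes): a segment is queried only
if the partition range recorded in its ZK metadata can overlap the predicate range.

.. code-block:: java

   // Hypothetical sketch of range-based segment pruning; names are illustrative only.
   final class SegmentPartitionRange {
     final long start;  // e.g. 17100, taken from the segment ZK metadata
     final long end;    // e.g. 17110
     SegmentPartitionRange(long start, long end) { this.start = start; this.end = end; }
   }

   final class RangePruner {
     // Keep the segment only if its partition range overlaps the query's range
     // predicate, e.g. "daysSinceEpoch BETWEEN 17100 AND 17110".
     boolean mightMatch(SegmentPartitionRange segment, long low, long high) {
       return segment.end >= low && segment.start <= high;
     }
   }
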
diff --git a/docs/pinot_hadoop.rst b/docs/pinot_hadoop.rst
index 0fa4181..f388e00 100644
--- a/docs/pinot_hadoop.rst
+++ b/docs/pinot_hadoop.rst
@@ -1,39 +1,43 @@
-Creating Pinot segments in Hadoop
-=================================
+Creating Pinot segments
+=======================
 
-Pinot index files can be created offline on Hadoop, then pushed onto a production cluster. Because index generation does not happen on the Pinot nodes serving traffic, this means that these nodes can continue to serve traffic without impacting performance while data is being indexed. The index files are then pushed onto the Pinot cluster, where the files are distributed and loaded by the server nodes with minimal performance impact.
+Pinot segments can be created offline on Hadoop, or via command line from data files. The controller REST endpoint
+can then be used to add the segments to the table to which they belong.
+
+Creating segments using Hadoop
+------------------------------
 
 .. figure:: Pinot-Offline-only-flow.png
 
   Offline Pinot workflow
 
-To create index files offline  a Hadoop workflow can be created to complete the following steps:
+To create Pinot segments on Hadoop, a workflow can be created to complete the following steps:
 
-1. Pre-aggregate, clean up and prepare the data, writing it as Avro format files in a single HDFS directory
-2. Create the index files
-3. Upload the index files to the Pinot cluster
+#. Pre-aggregate, clean up and prepare the data, writing it as Avro format files in a single HDFS directory
+#. Create segments
+#. Upload segments to the Pinot cluster
 
-Step one can be done using your favorite tool (such as Pig, Hive or Spark), while Pinot provides two MapReduce jobs to do step two and three.
+Step one can be done using your favorite tool (such as Pig, Hive or Spark), while Pinot provides two MapReduce jobs to do steps two and three.
 
-Configuration
--------------
+Configuring the job
+^^^^^^^^^^^^^^^^^^^
 
 Create a job properties configuration file, such as one below:
 
-::
+.. code-block:: none
 
   # === Index segment creation job config ===
 
   # path.to.input: Input directory containing Avro files
   path.to.input=/user/pinot/input/data
 
-  # path.to.output: Output directory containing Pinot index segments
+  # path.to.output: Output directory containing Pinot segments
   path.to.output=/user/pinot/output
 
   # path.to.schema: Schema file for the table, stored locally
   path.to.schema=flights-schema.json
 
-  # segment.table.name: Name of the table for which to generate index segments
+  # segment.table.name: Name of the table for which to generate segments
   segment.table.name=flights
 
   # === Segment tar push job config ===
@@ -45,28 +49,129 @@ Create a job properties configuration file, such as one below:
   push.to.port=8888
 
 
-Index file creation
--------------------
+Executing the job
+^^^^^^^^^^^^^^^^^
 
 The Pinot Hadoop module contains a job that you can incorporate into your
-workflow to generate Pinot indices. Note that this will only create data for you. 
-In order to have this data on your cluster, you want to also run the SegmentTarPush
-job, details below. To run SegmentCreation through the command line:
+workflow to generate Pinot segments.
 
-::
+.. code-block:: none
 
   mvn clean install -DskipTests -Pbuild-shaded-jar
   hadoop jar pinot-hadoop-0.016-shaded.jar SegmentCreation job.properties
 
+You can then use the SegmentTarPush job to push segments via the controller REST API.
 
-Index file push
----------------
+.. code-block:: none
 
-This job takes generated Pinot index files from an input directory and pushes
-them to a Pinot controller node.
+  hadoop jar pinot-hadoop-0.016-shaded.jar SegmentTarPush job.properties
 
-::
 
-  mvn clean install -DskipTests -Pbuild-shaded-jar
-  hadoop jar pinot-hadoop-0.016-shaded.jar SegmentTarPush job.properties
+Creating Pinot segments outside of Hadoop
+-----------------------------------------
+
+Here is how you can create Pinot segments from standard formats like CSV/JSON.
+
+#.  Follow the steps described in the section on :ref:`compiling-code-section` to build Pinot. Locate ``pinot-admin.sh`` in ``pinot-tools/target/pinot-tools-pkg/bin/pinot-admin.sh``.
+#.  Create a top level directory containing all the CSV/JSON files that need to be converted into segments.
+#.  The file name extensions are expected to be the same as the format name (*i.e.* ``.csv``, or ``.json``), and are case insensitive.
+    Note that the converter expects the ``.csv`` extension even if the data is delimited using tabs or spaces instead.
+#.  Prepare a schema file describing the schema of the input data. The schema needs to be in JSON format. See the example later in this section.
+#.  Specifically for the CSV format, an optional CSV config file can be provided (also in JSON format). This is used to configure parameters such as the delimiter and header for the CSV file.
+    A detailed description of this follows below.
+
+Run the pinot-admin command to generate the segments. The command can be invoked as follows. Options within "[ ]" are optional. For -format, the default value is AVRO.
+
+.. code-block:: none
+
+    bin/pinot-admin.sh CreateSegment -dataDir <input_data_dir> [-format [CSV/JSON/AVRO]] [-readerConfigFile <csv_config_file>] [-generatorConfigFile <generator_config_file>] -segmentName <segment_name> -schemaFile <input_schema_file> -tableName <table_name> -outDir <output_data_dir> [-overwrite]
+
+
+To configure various parameters for CSV, a config file in JSON format can be provided. This file is optional, as is each of its parameters. When not provided, the default values used for these parameters are as described below:
+
+#.  fileFormat: Specify one of the following. Default is EXCEL.
+
+    *  EXCEL
+    *  MYSQL
+    *  RFC4180
+    *  TDF
+
+#.  header: If the input CSV file does not contain a header, it can be specified using this field. Note that if this is specified, the input file should not contain a header row, or it will result in a parse error. The columns in the header must be delimited by the same delimiter character as the rest of the CSV file.
+#.  delimiter: Use this to specify a delimiter character. The default value is ",".
+#.  dateFormat: If there are columns that are in date format and need to be converted into Epoch (in milliseconds), use this to specify the format. Default is "mm-dd-yyyy".
+#.  dateColumns: If there are multiple date columns, use this to list those columns.
+
+Below is a sample config file.
+
+.. code-block:: none
+
+  {
+    "fileFormat" : "EXCEL",
+    "header" : "col1,col2,col3,col4",
+    "delimiter" : "\t",
+    "dateFormat" : "mm-dd-yy"
+    "dateColumns" : ["col1", "col2"]
+  }
+
+Sample Schema:
+
+.. code-block:: none
+
+  {
+    "dimensionFieldSpecs" : [
+      {			   
+        "dataType" : "STRING",
+        "delimiter" : null,
+        "singleValueField" : true,
+        "name" : "name"
+      },
+      {
+        "dataType" : "INT",
+        "delimiter" : null,
+        "singleValueField" : true,
+        "name" : "age"
+      }
+    ],
+    "timeFieldSpec" : {
+      "incomingGranularitySpec" : {
+        "timeType" : "DAYS",
+        "dataType" : "LONG",
+        "name" : "incomingName1"
+      },
+      "outgoingGranularitySpec" : {
+        "timeType" : "DAYS",
+        "dataType" : "LONG",
+        "name" : "outgoingName1"
+      }
+    },
+    "metricFieldSpecs" : [
+      {
+        "dataType" : "FLOAT",
+        "delimiter" : null,
+        "singleValueField" : true,
+        "name" : "percent"
+      }
+    ],
+    "schemaName" : "mySchema"
+  }
+
+Pushing segments to Pinot
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+You can use curl to push a segment to Pinot:
+
+.. code-block:: none
+
+   curl -X POST -F segment=@<segment-tar-file-path> http://controllerHost:controllerPort/segments
+
 
+Alternatively you can use the pinot-admin.sh utility to upload one or more segments:
+
+.. code-block:: none
+
+  pinot-tools/target/pinot-tools-pkg/bin/pinot-admin.sh UploadSegment -controllerHost <hostname> -controllerPort <port> -segmentDir <segmentDirectoryPath>
+
+The command uploads all the segments found in ``segmentDirectoryPath``. 
+The segments could be either tar-compressed (in which case it is a file under ``segmentDirectoryPath``)
+or uncompressed (in which case it is a directory under ``segmentDirectoryPath``).
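
The same push can also be done programmatically; the sketch below mirrors the curl
command above using only the JDK. The controller address ``localhost:9000`` and the
segment file path are placeholders, and the ``segment`` form field name is taken from
the curl example.

.. code-block:: java

   import java.io.File;
   import java.io.OutputStream;
   import java.net.HttpURLConnection;
   import java.net.URL;
   import java.nio.charset.StandardCharsets;
   import java.nio.file.Files;

   public class SegmentUploader {
     public static void main(String[] args) throws Exception {
       File segmentTar = new File(args[0]);          // path to a segment tar file
       String boundary = "----pinot-segment-upload";

       HttpURLConnection conn = (HttpURLConnection)
           new URL("http://localhost:9000/segments").openConnection();
       conn.setDoOutput(true);
       conn.setRequestMethod("POST");
       conn.setRequestProperty("Content-Type", "multipart/form-data; boundary=" + boundary);

       try (OutputStream out = conn.getOutputStream()) {
         // Form field named "segment", matching `curl -F segment=@<file>` above.
         out.write(("--" + boundary + "\r\n"
             + "Content-Disposition: form-data; name=\"segment\"; filename=\""
             + segmentTar.getName() + "\"\r\n"
             + "Content-Type: application/octet-stream\r\n\r\n")
             .getBytes(StandardCharsets.UTF_8));
         out.write(Files.readAllBytes(segmentTar.toPath()));
         out.write(("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8));
       }
       System.out.println("Controller responded with HTTP " + conn.getResponseCode());
     }
   }
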
diff --git a/docs/pluggable_streams.rst b/docs/pluggable_streams.rst
index 876e31c..4c6b26a 100644
--- a/docs/pluggable_streams.rst
+++ b/docs/pluggable_streams.rst
@@ -16,17 +16,14 @@ Some of the streams for which plug-ins can be added are:
 * `Pulsar <https://pulsar.apache.org/docs/en/client-libraries-java/>`_
 
 
-You may encounter some limitations either in Pinot or in the stream system while developing plug-ins. Please feel free to get in touch with us when you start writing a stream plug-in, and we can help you out. We are open to receiving PRs in order to improve these abstractions if they do not work for a certain stream implementation.
+You may encounter some limitations either in Pinot or in the stream system while developing plug-ins.
+Please feel free to get in touch with us when you start writing a stream plug-in, and we can help you out.
+We are open to receiving PRs in order to improve these abstractions if they do not work for a certain stream implementation.
 
-Pinot Stream Consumers
-----------------------
-Pinot consumes rows from event streams and serves queries on the data consumed. Rows may be consumed either at stream level (also referred to as high level) or at partition level (also referred to as low level).
+Refer to sections :ref:`hlc-section` and :ref:`llc-section` for details on how Pinot consumes streaming data.
 
-.. figure:: pluggable_streams.png
-
-.. figure:: High-level-stream.png
-
-   Stream Level Consumer
+Requirements to support Stream Level (High Level) consumers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 The stream should provide the following guarantees:
 
@@ -36,13 +33,12 @@ The stream should provide the following guarantees:
 * The checkpoints should be recorded only when Pinot makes a call to do so.
 * The consumer should be able to start consumption from one of:
 
-  ** latest avaialble data
-  ** earliest available data
-  ** last saved checkpoint
-
-.. figure:: Low-level-stream.png
+  * latest available data
+  * earliest available data
+  * last saved checkpoint
 
-  Partition Level Consumer
+Requirements to support Partition Level (Low Level) consumers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 While consuming rows at a partition level, the stream should support the following
 properties:
@@ -52,9 +48,10 @@ properties:
 * Refer to a partition as a number not exceeding 32 bits long.
 * Stream should provide the following mechanisms to get an offset for a given partition of the stream:
 
-  ** get the offset of the oldest event available (assuming events are aged out periodically) in the partition.
-  ** get the offset of the most recent event published in the partition
-  ** (optionally) get the offset of an event that was published at a specified time
+  * get the offset of the oldest event available (assuming events are aged out periodically) in the partition.
+  * get the offset of the most recent event published in the partition
+  * (optionally) get the offset of an event that was published at a specified time
+
 * Stream should provide a mechanism to consume a set of events from a partition starting from a specified offset.
 * Events with higher offsets should be more recent (the offsets of events need not be contiguous)
 
@@ -91,24 +88,25 @@ The rest of the configuration properties for your stream should be set with the
 
 All values should be strings. For example:
 
-.::
 
-  "streamType" : "foo",
-  "stream.foo.topic.name" : "SomeTopic",
-  "stream.foo.consumer.type": "lowlevel",
-  "stream.foo.consumer.factory.class.name": "fully.qualified.pkg.ConsumerFactoryClassName",
-  "stream.foo.consumer.prop.auto.offset.reset": "largest",
-  "stream.foo.decoder.class.name" : "fully.qualified.pkg.DecoderClassName",
-  "stream.foo.decoder.prop.a.decoder.property" : "decoderPropValue",
-  "stream.foo.connection.timeout.millis" : "10000", // default 30_000
-  "stream.foo.fetch.timeout.millis" : "10000" // default 5_000
+.. code-block:: none
+
+    "streamType" : "foo",
+    "stream.foo.topic.name" : "SomeTopic",
+    "stream.foo.consumer.type": "lowlevel",
+    "stream.foo.consumer.factory.class.name": "fully.qualified.pkg.ConsumerFactoryClassName",
+    "stream.foo.consumer.prop.auto.offset.reset": "largest",
+    "stream.foo.decoder.class.name" : "fully.qualified.pkg.DecoderClassName",
+    "stream.foo.decoder.prop.a.decoder.property" : "decoderPropValue",
+    "stream.foo.connection.timeout.millis" : "10000", // default 30_000
+    "stream.foo.fetch.timeout.millis" : "10000" // default 5_000
 
 
 You can have additional properties that are specific to your stream. For example:
 
-.::
+.. code-block:: none
 
-"stream.foo.some.buffer.size" : "24g"
+  "stream.foo.some.buffer.size" : "24g"
 
 In addition to these properties, you can define thresholds for the consuming segments:
 
@@ -117,10 +115,10 @@ In addition to these properties, you can define thresholds for the consuming seg
 
 The properties for the thresholds are as follows:
 
-.::
+.. code-block:: none
 
-"realtime.segment.flush.threshold.size" : "100000"
-"realtime.segment.flush.threshold.time" : "6h"
+  "realtime.segment.flush.threshold.size" : "100000"
+  "realtime.segment.flush.threshold.time" : "6h"
 
 
 An example of this implementation can be found in the `KafkaConsumerFactory <com.linkedin.pinot.core.realtime.impl.kafka.KafkaConsumerFactory>`_, which is an implementation for the kafka stream.
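
As a small illustration of the property naming convention above (this is not a Pinot
API, just a plain map mirroring the listed keys), the stream section of a table config
could be assembled like this before being serialized into the table config JSON:

.. code-block:: java

   import java.util.HashMap;
   import java.util.Map;

   public class FooStreamConfigExample {
     public static Map<String, String> fooStreamConfigs() {
       String prefix = "stream.foo.";
       Map<String, String> props = new HashMap<>();
       props.put("streamType", "foo");
       props.put(prefix + "topic.name", "SomeTopic");
       props.put(prefix + "consumer.type", "lowlevel");
       props.put(prefix + "consumer.factory.class.name", "fully.qualified.pkg.ConsumerFactoryClassName");
       props.put(prefix + "decoder.class.name", "fully.qualified.pkg.DecoderClassName");
       // Stream-specific extras keep the same prefix:
       props.put(prefix + "some.buffer.size", "24g");
       // Consuming-segment thresholds are not prefixed with the stream name:
       props.put("realtime.segment.flush.threshold.size", "100000");
       props.put("realtime.segment.flush.threshold.time", "6h");
       return props;
     }
   }
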
diff --git a/docs/pql_examples.rst b/docs/pql_examples.rst
index a160763..cfa0b0f 100644
--- a/docs/pql_examples.rst
+++ b/docs/pql_examples.rst
@@ -77,66 +77,30 @@ Note: results might not be consistent if column ordered by has same value in mul
     ORDER BY bar DESC
     LIMIT 50, 100
 
-Wild-card match
----------------
+Wild-card match (in WHERE clause only)
+--------------------------------------
+
+To count rows where the column ``airlineName`` starts with ``U``:
 
 .. code-block:: sql
 
   SELECT count(*) FROM SomeTable
-    WHERE regexp_like(columnName, '.*regex-here?')
-    GROUP BY someOtherColumn TOP 10
+    WHERE regexp_like(airlineName, '^U.*')
+    GROUP BY airlineName TOP 10
+
+Examples with UDF
+-----------------
 
-Time-Convert UDF
-----------------
+As of now, functions have to be implemented within Pinot. Injecting custom functions is not allowed yet.
+The examples below demonstrate the use of UDFs.
 
 .. code-block:: sql
 
   SELECT count(*) FROM myTable
     GROUP BY timeConvert(timeColumnName, 'SECONDS', 'DAYS')
 
-Differences with SQL
---------------------
-
-* ``JOIN`` is not supported
-* Use ``TOP`` instead of ``LIMIT`` for truncation
-* ``LIMIT n`` has no effect in grouping queries, should use ``TOP n`` instead. If no ``TOP n`` defined, PQL will use ``TOP 10`` as default truncation setting.
-* No need to select the columns to group with.
-
-The following two queries are both supported in PQL, where the non-aggregation columns are ignored.
-
-.. code-block:: sql
-
-  SELECT MIN(foo), MAX(foo), SUM(foo), AVG(foo) FROM mytable
-    GROUP BY bar, baz
-    TOP 50
-
-  SELECT bar, baz, MIN(foo), MAX(foo), SUM(foo), AVG(foo) FROM mytable
-    GROUP BY bar, baz
-    TOP 50
-
-* Always order by the aggregated value
-  The results will always order by the aggregated value itself.
-* Results equivalent to grouping on each aggregation
-  The results for query:
-
-.. code-block:: sql
-
-  SELECT MIN(foo), MAX(foo) FROM myTable
-    GROUP BY bar
-    TOP 50
-
-will be the same as the combining results from the following queries:
-
-.. code-block:: sql
-
-  SELECT MIN(foo) FROM myTable
-    GROUP BY bar
-    TOP 50
-  SELECT MAX(foo) FROM myTable
-    GROUP BY bar
-    TOP 50
-
-where we don't put the results for the same group together.
+  SELECT count(*) FROM myTable
+    GROUP BY div(timeColumnName, 3600)
 
 PQL Specification
 -----------------
@@ -189,10 +153,12 @@ WHERE
 
 Supported predicates are comparisons with a constant using the standard SQL operators (``=``, ``<``, ``<=``, ``>``, ``>=``, ``<>``, '!=') , range comparisons using ``BETWEEN`` (``foo BETWEEN 42 AND 69``), set membership (``foo IN (1, 2, 4, 8)``) and exclusion (``foo NOT IN (1, 2, 4, 8)``). For ``BETWEEN``, the range is inclusive.
 
+Comparison with a regular expression is supported using the ``regexp_like`` function, as in ``WHERE regexp_like(columnName, 'regular expression')``.
+
 GROUP BY
 ^^^^^^^^
 
-The ``GROUP BY`` clause groups aggregation results by a list of columns.
+The ``GROUP BY`` clause groups aggregation results by a list of columns, or by transform functions on columns (see below).
 
 
 ORDER BY
@@ -224,11 +190,71 @@ For example, the following query will calculate the maximum value of column ``fo
 
 Supported transform functions
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+``ADD``
+   Sum of at least two values
+
+``SUB``
+   Difference between two values
+
+``MULT``
+   Product of at least two values
 
-* ``ADD``: sum of at least two values
-* ``SUB``: difference between two values
-* ``MULT``: product of at least two values
-* ``DIV``: quotient of two values
-* ``TIMECONVERT``: takes 3 arguments, converts the value into another time unit. E.g. ``TIMECONVERT(time, 'MILLISECONDS', 'SECONDS')``
-* ``DATETIMECONVERT``: takes 4 arguments, converts the value into another date time format, and buckets time based on the given time granularity. E.g. ``DATETIMECONVERT(date, '1:MILLISECONDS:EPOCH', '1:SECONDS:EPOCH', '15:MINUTES')``
-* ``VALUEIN``: takes at least 2 arguments, where the first argument is a multi-valued column, and the following arguments are constant values. The transform function will filter the value from the multi-valued column with the given constant values. The ``VALUEIN`` transform function is especially useful when the same multi-valued column is both filtering column and grouping column. E.g. ``VALUEIN(mvColumn, 3, 5, 15)``
+``DIV``
+   Quotient of two values
+
+``TIMECONVERT``
+   Takes 3 arguments, converts the value into another time unit. E.g. ``TIMECONVERT(time, 'MILLISECONDS', 'SECONDS')``
+
+``DATETIMECONVERT``
+   Takes 4 arguments, converts the value into another date time format, and buckets time based on the given time granularity.
+   *e.g.* ``DATETIMECONVERT(date, '1:MILLISECONDS:EPOCH', '1:SECONDS:EPOCH', '15:MINUTES')``
+
+``VALUEIN``
+   Takes at least 2 arguments, where the first argument is a multi-valued column, and the following arguments are constant values.
+   The transform function will filter the value from the multi-valued column with the given constant values.
+   The ``VALUEIN`` transform function is especially useful when the same multi-valued column is both filtering column and grouping column.
+   *e.g.* ``VALUEIN(mvColumn, 3, 5, 15)``
+
+
+Differences with SQL
+--------------------
+
+* ``JOIN`` is not supported
+* Use ``TOP`` instead of ``LIMIT`` for truncation
+* ``LIMIT n`` has no effect in grouping queries; use ``TOP n`` instead. If no ``TOP n`` is defined, PQL uses ``TOP 10`` as the default truncation setting.
+* No need to select the columns to group with.
+
+The following two queries are both supported in PQL, where the non-aggregation columns are ignored.
+
+.. code-block:: sql
+
+  SELECT MIN(foo), MAX(foo), SUM(foo), AVG(foo) FROM mytable
+    GROUP BY bar, baz
+    TOP 50
+
+  SELECT bar, baz, MIN(foo), MAX(foo), SUM(foo), AVG(foo) FROM mytable
+    GROUP BY bar, baz
+    TOP 50
+
+* The results will always order by the aggregated value (descending).
+
+The results for the query:
+
+.. code-block:: sql
+
+  SELECT MIN(foo), MAX(foo) FROM myTable
+    GROUP BY bar
+    TOP 50
+
+will be the same as combining the results from the following queries:
+
+.. code-block:: sql
+
+  SELECT MIN(foo) FROM myTable
+    GROUP BY bar
+    TOP 50
+  SELECT MAX(foo) FROM myTable
+    GROUP BY bar
+    TOP 50
+
+where we don't put the results for the same group together.
diff --git a/docs/reference.rst b/docs/reference.rst
index 51881d8..2171119 100644
--- a/docs/reference.rst
+++ b/docs/reference.rst
@@ -1,15 +1,10 @@
 .. _reference:
 
-Pinot Reference Manual
-======================
 .. toctree::
-   :maxdepth: 2
+   :maxdepth: 1
 
    pql_examples
    client_api
    management_api
    pinot_hadoop
-   creating_pinot_segments
-   pluggable_streams
-   segment_fetcher
 
diff --git a/docs/schema_timespec.rst b/docs/schema_timespec.rst
index 1d7e769..531ff28 100644
--- a/docs/schema_timespec.rst
+++ b/docs/schema_timespec.rst
@@ -6,7 +6,7 @@ Problems with current schema design
 
 The pinot schema timespec looks like this:
 
-::
+.. code-block:: none
 
   {
     "timeFieldSpec":
@@ -29,7 +29,7 @@ Changes
 
 We have added a List<DateTimeFieldSpec> _dateTimeFieldSpecs to the pinot schema
 
-::
+.. code-block:: none
 
   {
     “dateTimeFieldSpec”:
@@ -67,7 +67,7 @@ We have added a List<DateTimeFieldSpec> _dateTimeFieldSpecs to the pinot schema
 
 Examples:
 
-::
+.. code-block:: none
 
   “dateTimeFieldSpec”:
   {
diff --git a/docs/segment_fetcher.rst b/docs/segment_fetcher.rst
index 7e57602..97e2339 100644
--- a/docs/segment_fetcher.rst
+++ b/docs/segment_fetcher.rst
@@ -14,15 +14,15 @@ HDFS segment fetcher configs
 -----------------------------
 
 In your Pinot controller/server configuration, you will need to provide the following configs:
-::
+
+.. code-block:: none
 
     pinot.controller.segment.fetcher.hdfs.hadoop.conf.path=<file path to hadoop conf folder>
 
 
 or
 
-
-::
+.. code-block:: none
 
 
     pinot.server.segment.fetcher.hdfs.hadoop.conf.path=<file path to hadoop conf folder>
@@ -30,14 +30,14 @@ or
 
 This path should point to the local folder containing ``core-site.xml`` and ``hdfs-site.xml`` files from your Hadoop installation.
 
-::
+.. code-block:: none
 
     pinot.controller.segment.fetcher.hdfs.hadoop.kerberos.principle=<your kerberos principal>
     pinot.controller.segment.fetcher.hdfs.hadoop.kerberos.keytab=<your kerberos keytab>
 
 or
 
-::
+.. code-block:: none
 
     pinot.server.segment.fetcher.hdfs.hadoop.kerberos.principle=<your kerberos principal>
     pinot.server.segment.fetcher.hdfs.hadoop.kerberos.keytab=<your kerberos keytab>
@@ -54,7 +54,7 @@ To push HDFS segment files to Pinot controller, you just need to ensure you have
 
 For example, the following curl requests to Controller will notify it to download segment files to the proper table:
 
-::
+.. code-block:: none
  
   curl -X POST -H "UPLOAD_TYPE:URI" -H "DOWNLOAD_URI:hdfs://nameservice1/hadoop/path/to/segment/file.gz" -H "content-type:application/json" -d '' localhost:9000/segments
 
@@ -63,13 +63,13 @@ Implement your own segment fetcher for other systems
 
 You can also implement your own segment fetchers for other file systems and load into Pinot system with an external jar. All you need to do is to implement a class that extends the interface of `SegmentFetcher <https://github.com/linkedin/pinot/blob/master/pinot-common/src/main/java/com/linkedin/pinot/common/segment/fetcher/SegmentFetcher.java>`_ and provides config to Pinot Controller and Server as follows:
 
-::
+.. code-block:: none
 
     pinot.controller.segment.fetcher.<protocol>.class=<class path to your implementation>
 
 or
 
-::
+.. code-block:: none
 
     pinot.server.segment.fetcher.<protocol>.class=<class path to your implementation>
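
A rough skeleton of such a plug-in is sketched below. The interface package is taken
from the link above, but the protocol name, class name, and method signatures shown
here are assumptions made for illustration; take the real contract from the
``SegmentFetcher`` source.

.. code-block:: java

   import java.io.File;
   import org.apache.commons.configuration.Configuration;
   import com.linkedin.pinot.common.segment.fetcher.SegmentFetcher;

   // Hypothetical skeleton: method names and signatures are assumed here and must
   // be verified against the SegmentFetcher interface before implementing.
   public class S3SegmentFetcher implements SegmentFetcher {

     private String region;

     public void init(Configuration configs) {
       // Fetcher-specific settings, e.g. pinot.server.segment.fetcher.s3.region
       region = configs.getString("region", "us-east-1");
     }

     public void fetchSegmentToLocal(String uri, File tempFile) throws Exception {
       // Download the remote segment identified by `uri` into `tempFile`;
       // Pinot then untars and loads it locally.
     }
   }
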
 
diff --git a/docs/trying_pinot.rst b/docs/trying_pinot.rst
index ddc1197..1e2e72f 100644
--- a/docs/trying_pinot.rst
+++ b/docs/trying_pinot.rst
@@ -1,25 +1,24 @@
-Running the Pinot Demonstration
-===============================
+Quick Demo
+==========
 
 A quick way to get familiar with Pinot is to run the Pinot examples. The examples can be run either by compiling the
 code or by running the prepackaged Docker images.
 
-Basic Pinot querying
---------------------
-
 To demonstrate Pinot, let's start a simple one node cluster, along with the required Zookeeper. This demo setup also
 creates a table, generates some Pinot segments, then uploads them to the cluster in order to make them queryable.
 
 All of the setup is automated, so the only thing required at the beginning is to start the demonstration cluster.
 
-Running Pinot using Docker
-~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-With Docker installed, the Pinot demonstration can be run using ``docker run -it -p 9000:9000
-linkedin/pinot-quickstart-offline``. This will start a single node Pinot cluster with a preloaded data set.
+.. _compiling-code-section:
 
-Running Pinot by compiling the code
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Compiling the code
+~~~~~~~~~~~~~~~~~~
+
+.. note::
+
+    You can skip this step if you are planning to run the pre-built Docker image. Make sure you have Docker installed.
+    Some of the newer query features may not be available in Docker as of this writing.
 
 One can also run the Pinot demonstration by checking out the code on GitHub, compiling it, and running it. Compiling
 Pinot requires JDK 8 or later and Apache Maven 3.
@@ -27,15 +26,19 @@ Pinot requires JDK 8 or later and Apache Maven 3.
 #. Check out the code from GitHub (https://github.com/apache/incubator-pinot)
 #. With Maven installed, run ``mvn install package -DskipTests`` in the directory in which you checked out Pinot.
 #. Make the generated scripts executable ``cd pinot-distribution/target/pinot-0.016-pkg; chmod +x bin/*.sh``
-#. Run Pinot: ``bin/quick-start-offline.sh``
 
-Trying out the demo
-~~~~~~~~~~~~~~~~~~~
+Trying out Offline quickstart demo
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To run the demo with Docker:
+  ``docker run -it -p 9000:9000 linkedin/pinot-quickstart-offline``
+To run the demo with compiled code:
+  ``bin/quick-start-offline.sh``
 
 Once the Pinot cluster is running, you can query it by going to http://localhost:9000/query/
 
 You can also use the REST API to query Pinot, as well as the Java client. As this is outside of the scope of this
-introduction, the reference documentation to use the Pinot client APIs is in the :ref:`client-api` section.
+introduction, the reference documentation to use the Pinot client APIs is in the :doc:`client_api` section.
 
 Pinot uses PQL, a SQL-like query language, to query data. Here are some sample queries:
 
@@ -58,12 +61,15 @@ Pinot uses PQL, a SQL-like query language, to query data. Here are some sample q
 
 The full reference for the PQL query language is present in the :ref:`pql` section of the Pinot documentation.
 
-Realtime data ingestion
------------------------
+Trying out Realtime quickstart demo
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Pinot can ingest data from streaming sources such as Kafka. 
 
-Pinot can also ingest data from streaming sources, such as Kafka streams. To run the realtime data ingestion demo, you
-can run ``docker run -it -p 9000:9000 linkedin/pinot-quickstart-realtime`` using Docker, or
-``bin/quick-start-realtime.sh`` if using the self-compiled version.
+To run the demo with Docker:
+  ``docker run -it -p 9000:9000 linkedin/pinot-quickstart-realtime``
+To run the demo with compiled code:
+  ``bin/quick-start-realtime.sh``
 
 Once started, the demo will start Kafka, create a Kafka topic, and create a realtime Pinot table. Once created, Pinot
 will start ingesting events from the Kafka topic into the table. The demo also starts a consumer that consumes events


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org