Posted to commits@pinot.apache.org by jl...@apache.org on 2019/03/09 00:39:05 UTC
[incubator-pinot] 01/01: Add more documents to docs folder
This is an automated email from the ASF dual-hosted git repository.
jlli pushed a commit to branch add-doc-experiment-with-pinot
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git
commit 81848aaee209bdbad809759c901c1520b1225e09
Author: jackjlli <jl...@linkedin.com>
AuthorDate: Fri Mar 8 16:38:41 2019 -0800
Add more documents to docs folder
---
docs/getting_started.rst | 7 +++++
docs/img/generate-segment.png | Bin 0 -> 218597 bytes
docs/img/list-schemas.png | Bin 0 -> 8952 bytes
docs/img/query-table.png | Bin 0 -> 35914 bytes
docs/img/upload-segment.png | Bin 0 -> 13944 bytes
docs/management_api.rst | 28 ++++++++++++++---
docs/pinot_hadoop.rst | 68 ++++++++++++++++++++++++++++++++++++++++--
7 files changed, 96 insertions(+), 7 deletions(-)
diff --git a/docs/getting_started.rst b/docs/getting_started.rst
index e3d2416..4b92f96 100644
--- a/docs/getting_started.rst
+++ b/docs/getting_started.rst
@@ -94,3 +94,10 @@ show up in Pinot.
To show new events appearing, one can run :sql:`SELECT * FROM meetupRsvp ORDER BY mtime DESC LIMIT 50` repeatedly, which shows the
last events that were ingested by Pinot.
+Experiment with Pinot
+~~~~~~~~~~~~~~~~~~~~~
+
+Here are instructions on how to add a simple table to the Pinot system you got working in the previous demo steps,
+how to add segments, and how to query the table.
+
+
diff --git a/docs/img/generate-segment.png b/docs/img/generate-segment.png
new file mode 100644
index 0000000..5848781
Binary files /dev/null and b/docs/img/generate-segment.png differ
diff --git a/docs/img/list-schemas.png b/docs/img/list-schemas.png
new file mode 100644
index 0000000..9b00855
Binary files /dev/null and b/docs/img/list-schemas.png differ
diff --git a/docs/img/query-table.png b/docs/img/query-table.png
new file mode 100644
index 0000000..5859f2b
Binary files /dev/null and b/docs/img/query-table.png differ
diff --git a/docs/img/upload-segment.png b/docs/img/upload-segment.png
new file mode 100644
index 0000000..edd348d
Binary files /dev/null and b/docs/img/upload-segment.png differ
diff --git a/docs/management_api.rst b/docs/management_api.rst
index f8483eb..6f9930a 100644
--- a/docs/management_api.rst
+++ b/docs/management_api.rst
@@ -17,11 +17,31 @@
.. under the License.
..
-Managing Pinot via REST API on the Controller
-=============================================
+Managing Pinot
+==============
-*TODO* : Remove this section altogether and find a place somewhere for a pointer to the management API. Maybe in the 'Running pinot in production' section?
+There are two ways to manage a Pinot cluster: the Pinot management console and the ``pinot-admin.sh`` script.
-There is a REST API which allows management of tables, tenants, segments and schemas. It can be accessed by going to ``http://[controller host]/help`` which offers a web UI to do these tasks, as well as document the REST API.
+Pinot Management Console
+------------------------
+
+There is a REST API which allows management of tables, tenants, segments and schemas. It can be accessed by going to
+``http://[controller host]/help`` which offers a web UI to do these tasks, as well as document the REST API.
+
+For example, list all the schemas within the Pinot cluster:
+
+.. figure:: img/list-schemas.png
+
+Upload a Pinot segment:
+
+.. figure:: img/upload-segment.png
+
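
The tasks the web UI exposes can also be scripted against the same REST endpoints. A minimal Python sketch, assuming the controller runs at the quick-start default ``localhost:9000`` (adjust host and port for your deployment); the helper names here are illustrative, not part of Pinot:

```python
import json
import urllib.request

def endpoint(controller, path):
    """Build a controller REST URL, e.g. endpoint(c, "schemas") -> c + "/schemas"."""
    return controller.rstrip("/") + "/" + path.lstrip("/")

def list_schemas(controller="http://localhost:9000"):
    """GET /schemas -- returns the schemas known to the cluster."""
    with urllib.request.urlopen(endpoint(controller, "schemas")) as resp:
        return json.loads(resp.read().decode("utf-8"))
```
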
+
+Pinot-admin.sh
+--------------
The ``pinot-admin.sh`` script can be used instead of the REST API calls to automate the creation of tables and tenants.
+
+For example, create a Pinot segment:
+
+.. figure:: img/generate-segment.png
+
+Query a table:
+
+.. figure:: img/query-table.png
diff --git a/docs/pinot_hadoop.rst b/docs/pinot_hadoop.rst
index 1611040..25f6346 100644
--- a/docs/pinot_hadoop.rst
+++ b/docs/pinot_hadoop.rst
@@ -23,7 +23,8 @@ Creating Pinot segments
=======================
Pinot segments can be created offline on Hadoop, or via command line from data files. Controller REST endpoint
-can then be used to add the segment to the table to which the segment belongs.
+can then be used to add the segment to the table to which the segment belongs. Pinot segments can also be created by
+ingesting data from realtime sources (such as Kafka).
Creating segments using hadoop
------------------------------
@@ -177,8 +178,8 @@ Sample Schema:
"schemaName" : "mySchema",
}
-Pushing segments to Pinot
-^^^^^^^^^^^^^^^^^^^^^^^^^
+Pushing offline segments to Pinot
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You can use curl to push a segment to Pinot:
@@ -196,3 +197,64 @@ Alternatively you can use the pinot-admin.sh utility to upload one or more segme
The command uploads all the segments found in ``segmentDirectoryPath``.
The segments could be either tar-compressed (in which case it is a file under ``segmentDirectoryPath``)
or uncompressed (in which case it is a directory under ``segmentDirectoryPath``).
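
The tar-or-directory convention described above can be sketched as a small helper. This is an illustrative Python sketch, not part of the Pinot tooling; the function name and the ``.tar.gz``/``.tgz`` suffixes it matches are assumptions for the example:

```python
import os

def classify_segments(segment_dir):
    """Split the entries of a segmentDirectoryPath into tar-compressed
    segment files and uncompressed segment directories, mirroring the
    two layouts the upload utility accepts."""
    compressed, uncompressed = [], []
    for name in sorted(os.listdir(segment_dir)):
        path = os.path.join(segment_dir, name)
        if os.path.isdir(path):
            uncompressed.append(name)        # an unpacked segment directory
        elif name.endswith((".tar.gz", ".tgz")):
            compressed.append(name)          # a tar-compressed segment file
    return compressed, uncompressed
```
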
+
+Realtime segment generation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To consume in realtime, we simply need to create a table that uses the same schema and points to the Kafka topic to
+consume from, using a table definition such as this one:
+
+.. code-block:: none
+
+ {
+ "tableName":"flights",
+ "segmentsConfig" : {
+ "retentionTimeUnit":"DAYS",
+ "retentionTimeValue":"7",
+ "segmentPushFrequency":"daily",
+ "segmentPushType":"APPEND",
+ "replication" : "1",
+ "schemaName" : "flights",
+ "timeColumnName" : "daysSinceEpoch",
+ "timeType" : "DAYS",
+ "segmentAssignmentStrategy" : "BalanceNumSegmentAssignmentStrategy"
+ },
+ "tableIndexConfig" : {
+ "invertedIndexColumns" : ["Carrier"],
+ "loadMode" : "HEAP",
+ "lazyLoad" : "false",
+ "streamConfigs": {
+ "streamType": "kafka",
+ "stream.kafka.consumer.type": "highLevel",
+ "stream.kafka.topic.name": "flights-realtime",
+ "stream.kafka.decoder.class.name": "org.apache.pinot.core.realtime.impl.kafka.KafkaJSONMessageDecoder",
+ "stream.kafka.zk.broker.url": "localhost:2181",
+ "stream.kafka.hlc.zk.connect.string": "localhost:2181"
+ }
+ },
+ "tableType":"REALTIME",
+ "tenants" : {
+ "broker":"DefaultTenant_BROKER",
+ "server":"DefaultTenant_SERVER"
+ },
+ "metadata": {
+ }
+ }
+
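
The parts of the definition that drive consumption can be pulled out programmatically. A small Python sketch over a trimmed copy of the config above; the ``kafka_topic`` helper is illustrative, not part of Pinot:

```python
import json

# A trimmed copy of the realtime table definition shown above.
TABLE_DEF = json.loads("""
{
  "tableName": "flights",
  "tableType": "REALTIME",
  "tableIndexConfig": {
    "streamConfigs": {
      "streamType": "kafka",
      "stream.kafka.consumer.type": "highLevel",
      "stream.kafka.topic.name": "flights-realtime"
    }
  }
}
""")

def kafka_topic(table_def):
    """Return the Kafka topic a REALTIME table consumes from, or None."""
    if table_def.get("tableType") != "REALTIME":
        return None
    stream = table_def.get("tableIndexConfig", {}).get("streamConfigs", {})
    return stream.get("stream.kafka.topic.name")
```
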
+First, we'll start a local instance of Kafka and start streaming data into it:
+
+.. code-block:: none
+
+ bin/pinot-admin.sh StartKafka &
+ bin/pinot-admin.sh StreamAvroIntoKafka -avroFile flights-2014.avro -kafkaTopic flights-realtime &
+
+This will stream one event per second from the Avro file to the Kafka topic. Then, we'll create a realtime table, which
+will start consuming from the Kafka topic.
+
+.. code-block:: none
+
+ bin/pinot-admin.sh AddTable -filePath flights-definition-realtime.json
+
+We can then query the table and see the events stream in:
+
+.. code-block:: none
+
+ {"traceInfo":{},"numDocsScanned":17,"aggregationResults":[{"function":"count_star","value":"17"}],"timeUsedMs":27,"segmentStatistics":[],"exceptions":[],"totalDocs":17}
+
+Repeating the query multiple times should show the events slowly being streamed into the table.
\ No newline at end of file
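
The response shown above is plain JSON, so watching the count grow can be scripted. An illustrative Python sketch over the fields from the sample response (the ``count_star`` helper is an assumption for the example):

```python
import json

# The sample broker response shown above.
SAMPLE = ('{"traceInfo":{},"numDocsScanned":17,"aggregationResults":'
          '[{"function":"count_star","value":"17"}],"timeUsedMs":27,'
          '"segmentStatistics":[],"exceptions":[],"totalDocs":17}')

def count_star(response_text):
    """Pull the count_star value out of a broker query response."""
    resp = json.loads(response_text)
    for agg in resp.get("aggregationResults", []):
        if agg.get("function") == "count_star":
            return int(agg["value"])
    return None
```
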