Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2019/03/07 20:56:05 UTC

[GitHub] [incubator-pinot] sunithabeeram commented on a change in pull request #3927: Add doc for Customizing Pinot

sunithabeeram commented on a change in pull request #3927: Add doc for Customizing Pinot
URL: https://github.com/apache/incubator-pinot/pull/3927#discussion_r263562852
 
 

 ##########
 File path: docs/customizations.rst
 ##########
 @@ -18,5 +18,152 @@
 ..
 
 
-Customization points in Pinot
-=============================
\ No newline at end of file
+Customizing Pinot
+===================
+
+There are a lot of places in Pinot which can be customized depending on the infrastructure or the use case. Below is a list of such customization points. 
+
+
+.. image:: img/CustomizingPinot.png
+
+
+1. Generating Pinot segments
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Typically, data files will be available on some offline data storage, such as HDFS, and a Hadoop job can be written to read the data and create the segment. The `SegmentCreationJob <https://github.com/apache/incubator-pinot/blob/master/pinot-hadoop/src/main/java/org/apache/pinot/hadoop/job/SegmentCreationJob.java>`_ class contains a hadoop job for creating segments. This is a map only job, and the mapper can be found in `SegmentCreationMapper <https://github.com/apache/incubator-pinot/blob/master/pinot-hadoop/src/main/java/org/apache/pinot/hadoop/job/mapper/SegmentCreationMapper.java>`_. You can override the SegmentCreationMapper with a custom mapper by overriding the SegmentCreationJob::getMapperClass() method. 
+
+New offline data is typically available in a daily or hourly frequency. You can schedule your jobs to run periodically using either cron or a scheduler such as `Azkaban <https://azkaban.github.io/>`_.    
 
 Review comment:
   We cannot comment on how frequently offline data is available. You can just say that the jobs can be run daily or hourly to push offline data to Pinot.

