Posted to commits@kibble.apache.org by hu...@apache.org on 2018/01/11 16:56:01 UTC

[kibble] branch master updated: Start working on Kibble documentation via RtD

This is an automated email from the ASF dual-hosted git repository.

humbedooh pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kibble.git


The following commit(s) were added to refs/heads/master by this push:
     new dd69711  Start working on Kibble documentation via RtD
dd69711 is described below

commit dd697115c36b6f850aea27569549bf591ea85ded
Author: Daniel Gruno <hu...@apache.org>
AuthorDate: Thu Jan 11 17:55:24 2018 +0100

    Start working on Kibble documentation via RtD
---
 docs/.gitignore          |   1 +
 docs/Makefile            |  20 ++++
 docs/source/conf.py      | 169 +++++++++++++++++++++++++++++++++
 docs/source/index.rst    |  22 +++++
 docs/source/managing.rst |  21 +++++
 docs/source/setup.rst    | 236 +++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 469 insertions(+)

diff --git a/docs/.gitignore b/docs/.gitignore
new file mode 100644
index 0000000..378eac2
--- /dev/null
+++ b/docs/.gitignore
@@ -0,0 +1 @@
+build
diff --git a/docs/Makefile b/docs/Makefile
new file mode 100644
index 0000000..dd5b2a8
--- /dev/null
+++ b/docs/Makefile
@@ -0,0 +1,20 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line.
+SPHINXOPTS    =
+SPHINXBUILD   = sphinx-build
+SPHINXPROJ    = ApacheKibble
+SOURCEDIR     = source
+BUILDDIR      = build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
\ No newline at end of file
diff --git a/docs/source/conf.py b/docs/source/conf.py
new file mode 100644
index 0000000..48d4595
--- /dev/null
+++ b/docs/source/conf.py
@@ -0,0 +1,169 @@
+# -*- coding: utf-8 -*-
+#
+# Apache Kibble documentation build configuration file, created by
+# sphinx-quickstart on Thu Jan 11 06:05:51 2018.
+#
+# This file is execfile()d with the current directory set to its
+# containing dir.
+#
+# Note that not all possible configuration values are present in this
+# autogenerated file.
+#
+# All configuration values have a default; values that are commented out
+# serve to show the default.
+
+# If extensions (or modules to document with autodoc) are in another directory,
+# add these directories to sys.path here. If the directory is relative to the
+# documentation root, use os.path.abspath to make it absolute, like shown here.
+#
+# import os
+# import sys
+# sys.path.insert(0, os.path.abspath('.'))
+
+
+# -- General configuration ------------------------------------------------
+
+# If your documentation needs a minimal Sphinx version, state it here.
+#
+# needs_sphinx = '1.0'
+
+# Add any Sphinx extension module names here, as strings. They can be
+# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
+# ones.
+extensions = ['sphinx.ext.todo',
+    'sphinx.ext.imgmath']
+
+# Add any paths that contain templates here, relative to this directory.
+templates_path = ['_templates']
+
+# The suffix(es) of source filenames.
+# You can specify multiple suffix as a list of string:
+#
+# source_suffix = ['.rst', '.md']
+source_suffix = '.rst'
+
+# The master toctree document.
+master_doc = 'index'
+
+# General information about the project.
+project = u'Apache Kibble'
+copyright = u'2018, The Apache Kibble Community'
+author = u'The Apache Kibble Community'
+
+# The version info for the project you're documenting, acts as replacement for
+# |version| and |release|, also used in various other places throughout the
+# built documents.
+#
+# The short X.Y version.
+version = u'0.1'
+# The full version, including alpha/beta/rc tags.
+release = u'0.1'
+
+# The language for content autogenerated by Sphinx. Refer to documentation
+# for a list of supported languages.
+#
+# This is also used if you do content translation via gettext catalogs.
+# Usually you set "language" from the command line for these cases.
+language = None
+
+# List of patterns, relative to source directory, that match files and
+# directories to ignore when looking for source files.
+# These patterns also affect html_static_path and html_extra_path
+exclude_patterns = []
+
+# The name of the Pygments (syntax highlighting) style to use.
+pygments_style = 'sphinx'
+
+# If true, `todo` and `todoList` produce output, else they produce nothing.
+todo_include_todos = True
+
+
+# -- Options for HTML output ----------------------------------------------
+
+# The theme to use for HTML and HTML Help pages.  See the documentation for
+# a list of builtin themes.
+#
+html_theme = 'sphinx_rtd_theme'
+
+# Theme options are theme-specific and customize the look and feel of a theme
+# further.  For a list of options available for each theme, see the
+# documentation.
+#
+# html_theme_options = {}
+
+# Add any paths that contain custom static files (such as style sheets) here,
+# relative to this directory. They are copied after the builtin static files,
+# so a file named "default.css" will overwrite the builtin "default.css".
+html_static_path = ['_static']
+
+# Custom sidebar templates, must be a dictionary that maps document names
+# to template names.
+#
+# This is required for the alabaster theme
+# refs: http://alabaster.readthedocs.io/en/latest/installation.html#sidebars
+html_sidebars = {
+    '**': [
+        'relations.html',  # needs 'show_related': True theme option to display
+        'searchbox.html',
+    ]
+}
+
+
+# -- Options for HTMLHelp output ------------------------------------------
+
+# Output file base name for HTML help builder.
+htmlhelp_basename = 'ApacheKibbledoc'
+
+
+# -- Options for LaTeX output ---------------------------------------------
+
+latex_elements = {
+    # The paper size ('letterpaper' or 'a4paper').
+    #
+    # 'papersize': 'letterpaper',
+
+    # The font size ('10pt', '11pt' or '12pt').
+    #
+    # 'pointsize': '10pt',
+
+    # Additional stuff for the LaTeX preamble.
+    #
+    # 'preamble': '',
+
+    # Latex figure (float) alignment
+    #
+    # 'figure_align': 'htbp',
+}
+
+# Grouping the document tree into LaTeX files. List of tuples
+# (source start file, target name, title,
+#  author, documentclass [howto, manual, or own class]).
+latex_documents = [
+    (master_doc, 'ApacheKibble.tex', u'Apache Kibble Documentation',
+     u'The Apache Kibble Community', 'manual'),
+]
+
+
+# -- Options for manual page output ---------------------------------------
+
+# One entry per manual page. List of tuples
+# (source start file, name, description, authors, manual section).
+man_pages = [
+    (master_doc, 'apachekibble', u'Apache Kibble Documentation',
+     [author], 1)
+]
+
+
+# -- Options for Texinfo output -------------------------------------------
+
+# Grouping the document tree into Texinfo files. List of tuples
+# (source start file, target name, title, author,
+#  dir menu entry, description, category)
+texinfo_documents = [
+    (master_doc, 'ApacheKibble', u'Apache Kibble Documentation',
+     author, 'ApacheKibble', 'One line description of project.',
+     'Miscellaneous'),
+]
+
+
+
diff --git a/docs/source/index.rst b/docs/source/index.rst
new file mode 100644
index 0000000..204272f
--- /dev/null
+++ b/docs/source/index.rst
@@ -0,0 +1,22 @@
+.. Apache Kibble documentation master file, created by
+   sphinx-quickstart on Thu Jan 11 06:05:51 2018.
+   You can adapt this file completely to your liking, but it should at least
+   contain the root `toctree` directive.
+
+Welcome to Apache Kibble's documentation!
+=========================================
+
+.. toctree::
+   :maxdepth: 3
+   :caption: Contents:
+
+   setup
+   managing
+
+
+Indices and tables
+==================
+
+* :ref:`genindex`
+* :ref:`modindex`
+* :ref:`search`
diff --git a/docs/source/managing.rst b/docs/source/managing.rst
new file mode 100644
index 0000000..65aefb6
--- /dev/null
+++ b/docs/source/managing.rst
@@ -0,0 +1,21 @@
+Managing Apache Kibble
+======================
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Contents:
+
+
+************************
+Creating an Organisation
+************************
+
+TODO
+
+.. _configdatasources:
+
+************************
+Configuring Data Sources
+************************
+
+ALSO TODO
diff --git a/docs/source/setup.rst b/docs/source/setup.rst
new file mode 100644
index 0000000..129b04e
--- /dev/null
+++ b/docs/source/setup.rst
@@ -0,0 +1,236 @@
+Setting up Apache Kibble
+========================
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Contents:
+
+
+****************************
+Understanding the Components
+****************************
+
+Kibble consists of two major components:
+
+The Kibble Server (kibble)
+   This is the main database and UI Server. It serves as the hub for the
+   scanners to connect to, and provides the overall management of
+   sources as well as the visualizations and API end points.
+   
+The Kibble Scanner Applications (kibble-scanners)
+   This is a collection of scanning applications each designed to work
+   with a specific type of resource (a git repo, a mailing list, a JIRA
+   instance etc) and push compiled data objects to the Kibble Server.
+   Some resources only have one scanner plugin, while others may have
+   multiple plugins capable of dealing with specific aspects of a
+   resource.
+
+**********************
+Component Requirements
+**********************
+
+################
+Server Component
+################
+
+As said, the main Kibble Server is a hub for scanners, and as such, is
+only ever needed on one machine. It is recommended that, for large
+instances of Kibble, you place the application on a machine or VM with
+sufficient resources to handle the database load and memory requirements.
+
+As a rule of thumb, the Server does not require a lot of disk space
+(enough to hold the compiled database), but it does require CPU and RAM.
+The scanners require more disk space, but can operate with limited CPU
+and RAM.
+
+As an example, let us examine the Apache Kibble demo instance:
+
+- 100 sources (git repos, mailing lists, bug trackers and so on)
+- 3.5 million source objects currently (commits, emails, tickets etc)
+- 10 concurrent users (actual people using the web UI)
+
+The recommended minimal specs for the Server component on an instance of
+this size would be approximately 4-8GB RAM, 4 cores and at least 10GB
+disk space. As this is a centralized component, you will want to spec
+this to be able to efficiently deal with the entire database in memory
+for best performance.
+
+
+#################
+Scanner Component
+#################
+
+The scanner components can either consist of one instance, or be spread
+out in a clustered setup. Thus, the requirements can be spread out on
+multiple machines or VMs. Each scanner will auto-adjust its scanning speed
+to match the number of CPU cores available to it; a scanner with two
+cores available will run two simultaneous jobs, whereas a scanner with
+eight cores will run eight simultaneous jobs to speed up processing.
+A scanner will typically require somewhere between 512MB and 1GB of memory,
+and thus can safely run on a VM with 2GB memory (or less).
+
+
+********************
+Source Code Location
+********************
+
+.. This needs to change once we have released Kibble
+
+*Apache Kibble does not currently have any releases.*
+*You are however welcome to try out the development version.*
+
+For the time being, we recommend that you use the ``master`` branch for
+testing Kibble. This applies to both scanners and the server.
+
+The Kibble Server can be found via our source repository at
+https://github.com/apache/kibble
+
+The Kibble Scanners can be found at
+https://github.com/apache/kibble-scanners
+
+
+*********************
+Installing the Server
+*********************
+
+###############
+Pre-requisites
+###############
+
+Before you install the Kibble Server, please ensure you have the
+following components installed and set up:
+
+- An ElasticSearch instance, version 5.x or newer (does not have to be
+  on the same machine, but it may help speed up processing)
+- A web server of your choice (Apache HTTP Server, NGINX, lighttpd etc)
+- Python 3.4 or newer with the following libraries installed:
+- - elasticsearch
+- - certifi
+- - yaml
+- - bcrypt
+- Gunicorn for Python 3.x (often called gunicorn3) or mod_wsgi
+
+###########################################
+Configuring and Priming the Kibble Instance
+###########################################
+Once you have the components installed and Kibble downloaded, you will
+need to prime the ElasticSearch instance and create a configuration file.
+
+Assuming you have installed Kibble in /var/www/kibble, you would set it
+up by issuing the following:
+
+- ``cd /var/www/kibble/setup``
+- ``python3 setup.py``
+- Enter the configuration parameters the setup process asks for
+
+This will set up the database, the configuration file, and create your
+initial administrator account for the UI. You can later on do additional
+configuration of the data server by editing the ``api/yaml/kibble.yaml``
+file.
+
+#####################
+Setting up the Web UI
+#####################
+
+Once you have finished the initial setup, you will need to enable the
+web UI. Kibble is built as a WSGI application, and as such you can
+use mod_wsgi for Apache, or proxy to Gunicorn. In this example, we will
+be using the Apache HTTP Server and proxy to Gunicorn:
+
+- Set up a virtual host in Apache:
+
+::
+
+   <VirtualHost *:80>
+      # Set this to your domain, or add kibble.localhost to /etc/hosts
+      ServerName kibble.localhost
+      DocumentRoot /var/www/kibble/ui/
+      # Proxy to gunicorn for /api/ below:
+      ProxyPass /api/ http://localhost:8000/api/
+   </VirtualHost>
+
+- Launch gunicorn as a daemon on port 8000:
+
+::
+
+   cd /var/www/kibble
+   gunicorn -w 10 -b 127.0.0.1:8000 handler:application -t 120 -D
+
+Once httpd is (re)started, you should be able to browse to your new
+Kibble instance.
+
+
+*******************
+Installing Scanners
+*******************
+
+##############
+Pre-requisites
+##############
+
+.. _cloc: https://github.com/AlDanial/cloc
+
+The Kibble Scanners rely on the following packages:
+
+- Python >= 3.4 with the following packages:
+- - python3-yaml
+- - python3-elasticsearch
+
+The scanners require the following optional components if you wish to enable
+git repository analysis:
+
+- git binaries (GPL License)
+- cloc_ version 1.70 or later (GPL License)
+
+
+###########################
+Configuring a Scanner Node
+###########################
+
+First, check out the scanner source:
+
+``git clone https://github.com/apache/kibble-scanners.git``
+
+Then edit the ``conf/config.yaml`` file to match both your ElasticSearch
+database and the file layout you wish to use on the scanner machine.
+Remember that the scanner must have enough disk space to fully store
+any resources you may be scanning. If you are scanning a large git repository,
+the scanner should have sufficient disk space to store it locally.
+
+If you plan to make use of the optional text analysis features of
+Kibble, you should also configure the API service you will be using
+(Watson/Azure/picoAPI etc).
+
+
+##############################
+Balancing Load Across Machines
+##############################
+
+If you wish to spread out the analysis load over several machines/VMs,
+you can do so by specifying a ``scanner.balance`` on each node. The balance
+directive uses the syntax X/Y, where Y is the total number of nodes in
+your scanner cluster, and X is the ID of the current scanner. Thus, if
+you have decided to use four machines for scanning, the first would have
+a balance of 1/4, the next would be 2/4, then 3/4 and finally 4/4 on the
+last machine. This will balance the load and storage requirements evenly
+across all machines.
+
+
+
+**************
+Running a Scan
+**************
+
+Once you have both scanners and the data server set up, you can begin
+scanning resources for data. Please refer to :ref:`configdatasources`
+for how to set up various resources for scanning via the Web UI.
+
+Scans can be initiated manually, but you may want to set up a cron job to
+handle daily scans of resources. To start a scan on a scanner machine,
+run the following: ``python3 src/kibble-scanner.py``
+
+This will load all plugins and use them in a sensible order on each
+resource that matches the appropriate type. The collected data will be
+pushed to the main data server and be available for visualizations
+instantly.
+
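
Note on the ``scanner.balance`` directive documented in docs/source/setup.rst:
the X/Y syntax can be read as "this node is number X out of Y nodes". The
following is only a minimal sketch of how such a split could work, assuming
each source has a stable string identifier and that work is divided by hashing
that identifier. The function names (parse_balance, owner_of, is_mine) and the
hashing scheme are hypothetical and are not taken from the kibble-scanners
code, which may divide work differently.

```python
import hashlib


def parse_balance(spec):
    """Parse an 'X/Y' balance string into (node_id, total_nodes)."""
    node_id, total = (int(part) for part in spec.split("/"))
    if total < 1 or not 1 <= node_id <= total:
        raise ValueError("balance must be X/Y with 1 <= X <= Y, got %r" % spec)
    return node_id, total


def owner_of(source_id, total):
    """Map a source identifier to a 1-based node ID using a stable hash.

    Assumption: a SHA-1 digest of the identifier, taken modulo the node
    count, is used so that every node computes the same answer.
    """
    digest = hashlib.sha1(source_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % total + 1


def is_mine(source_id, spec):
    """Return True if the node configured with `spec` should scan the source."""
    node_id, total = parse_balance(spec)
    return owner_of(source_id, total) == node_id
```

With a scheme like this, a node configured with 2/4 would call
is_mine(source_id, "2/4") for each source and skip those that return False.
Every source is claimed by exactly one node, provided all nodes use the same Y.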
