Posted to commits@airflow.apache.org by cr...@apache.org on 2018/01/03 17:48:07 UTC

[07/35] incubator-airflow-site git commit: 1.9.0

http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/28a3eb60/code.html
----------------------------------------------------------------------
diff --git a/code.html b/code.html
index e37b111..65e03ee 100644
--- a/code.html
+++ b/code.html
@@ -13,6 +13,8 @@
 
   
   
+  
+  
 
   
 
@@ -80,7 +82,10 @@
           
             
             
-                <ul class="current">
+              
+            
+            
+              <ul class="current">
 <li class="toctree-l1"><a class="reference internal" href="project.html">Project</a></li>
 <li class="toctree-l1"><a class="reference internal" href="license.html">License</a></li>
 <li class="toctree-l1"><a class="reference internal" href="start.html">Quick Start</a></li>
@@ -207,82 +212,77 @@ method at a specified <code class="docutils literal"><span class="pre">poke_inte
 <h3>BaseOperator<a class="headerlink" href="#baseoperator" title="Permalink to this headline">¶</a></h3>
 <p>All operators are derived from <code class="docutils literal"><span class="pre">BaseOperator</span></code> and acquire much
 functionality through inheritance. Since this is the core of the engine,
-it&#8217;s worth taking the time to understand the parameters of <code class="docutils literal"><span class="pre">BaseOperator</span></code>
+it’s worth taking the time to understand the parameters of <code class="docutils literal"><span class="pre">BaseOperator</span></code>
 to understand the primitive features that can be leveraged in your
 DAGs.</p>
 <dl class="class">
 <dt id="airflow.models.BaseOperator">
-<em class="property">class </em><code class="descclassname">airflow.models.</code><code class="descname">BaseOperator</code><span class="sig-paren">(</span><em>task_id</em>, <em>owner='Airflow'</em>, <em>email=None</em>, <em>email_on_retry=True</em>, <em>email_on_failure=True</em>, <em>retries=0</em>, <em>retry_delay=datetime.timedelta(0</em>, <em>300)</em>, <em>retry_exponential_backoff=False</em>, <em>max_retry_delay=None</em>, <em>start_date=None</em>, <em>end_date=None</em>, <em>schedule_interval=None</em>, <em>depends_on_past=False</em>, <em>wait_for_downstream=False</em>, <em>dag=None</em>, <em>params=None</em>, <em>default_args=None</em>, <em>adhoc=False</em>, <em>priority_weight=1</em>, <em>queue='default'</em>, <em>pool=None</em>, <em>sla=None</em>, <em>execution_timeout=None</em>, <em>on_failure_callback=None</em>, <em>on_success_callback=None</em>, <em>on_retry_callback=None</em>, <em>trigger_rule=u'all_success'</em>, <em>resources=None</em>, <em>run_as_user=None</em>, <e
 m>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/airflow/models.html#BaseOperator"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.models.BaseOperator" title="Permalink to this definition">¶</a></dt>
+<em class="property">class </em><code class="descclassname">airflow.models.</code><code class="descname">BaseOperator</code><span class="sig-paren">(</span><em>task_id</em>, <em>owner='Airflow'</em>, <em>email=None</em>, <em>email_on_retry=True</em>, <em>email_on_failure=True</em>, <em>retries=0</em>, <em>retry_delay=datetime.timedelta(0</em>, <em>300)</em>, <em>retry_exponential_backoff=False</em>, <em>max_retry_delay=None</em>, <em>start_date=None</em>, <em>end_date=None</em>, <em>schedule_interval=None</em>, <em>depends_on_past=False</em>, <em>wait_for_downstream=False</em>, <em>dag=None</em>, <em>params=None</em>, <em>default_args=None</em>, <em>adhoc=False</em>, <em>priority_weight=1</em>, <em>queue='default'</em>, <em>pool=None</em>, <em>sla=None</em>, <em>execution_timeout=None</em>, <em>on_failure_callback=None</em>, <em>on_success_callback=None</em>, <em>on_retry_callback=None</em>, <em>trigger_rule=u'all_success'</em>, <em>resources=None</em>, <em>run_as_user=None</em>, <e
 m>task_concurrency=None</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/airflow/models.html#BaseOperator"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.models.BaseOperator" title="Permalink to this definition">¶</a></dt>
 <dd><p>Abstract base class for all operators. Since operators create objects that
 become nodes in the DAG, BaseOperator contains many recursive methods for
 dag crawling behavior. To derive this class, you are expected to override
-the constructor as well as the &#8216;execute&#8217; method.</p>
-<p>Operators derived from this task should perform or trigger certain tasks
+the constructor as well as the ‘execute’ method.</p>
+<p>Operators derived from this class should perform or trigger certain tasks
 synchronously (wait for completion). Examples of operators could be an
 operator that runs a Pig job (PigOperator), a sensor operator that
 waits for a partition to land in Hive (HiveSensorOperator), or one that
 moves data from Hive to MySQL (Hive2MySqlOperator). Instances of these
 operators (tasks) target specific operations, running specific scripts,
 functions or data transfers.</p>
-<p>This class is abstract and shouldn&#8217;t be instantiated. Instantiating a
+<p>This class is abstract and shouldn’t be instantiated. Instantiating a
 class derived from this one results in the creation of a task object,
 which ultimately becomes a node in DAG objects. Task dependencies should
 be set by using the set_upstream and/or set_downstream methods.</p>
-<p>Note that this class is derived from SQLAlchemy&#8217;s Base class, which
-allows us to push metadata regarding tasks to the database. Deriving this
-classes needs to implement the polymorphic specificities documented in
-SQLAlchemy. This should become clear while reading the code for other
-operators.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name" />
 <col class="field-body" />
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
-<li><strong>task_id</strong> (<em>string</em>) &#8211; a unique, meaningful id for the task</li>
-<li><strong>owner</strong> (<em>string</em>) &#8211; the owner of the task, using the unix username is recommended</li>
-<li><strong>retries</strong> (<em>int</em>) &#8211; the number of retries that should be performed before
+<li><strong>task_id</strong> (<em>string</em>) – a unique, meaningful id for the task</li>
+<li><strong>owner</strong> (<em>string</em>) – the owner of the task, using the unix username is recommended</li>
+<li><strong>retries</strong> (<em>int</em>) – the number of retries that should be performed before
 failing the task</li>
-<li><strong>retry_delay</strong> (<em>timedelta</em>) &#8211; delay between retries</li>
-<li><strong>retry_exponential_backoff</strong> (<em>bool</em>) &#8211; allow progressive longer waits between
+<li><strong>retry_delay</strong> (<em>timedelta</em>) – delay between retries</li>
+<li><strong>retry_exponential_backoff</strong> (<em>bool</em>) – allow progressive longer waits between
 retries by using exponential backoff algorithm on retry delay (delay
 will be converted into seconds)</li>
-<li><strong>max_retry_delay</strong> (<em>timedelta</em>) &#8211; maximum delay interval between retries</li>
-<li><strong>start_date</strong> (<em>datetime</em>) &#8211; The <code class="docutils literal"><span class="pre">start_date</span></code> for the task, determines
+<li><strong>max_retry_delay</strong> (<em>timedelta</em>) – maximum delay interval between retries</li>
+<li><strong>start_date</strong> (<em>datetime</em>) – The <code class="docutils literal"><span class="pre">start_date</span></code> for the task, determines
 the <code class="docutils literal"><span class="pre">execution_date</span></code> for the first task instance. The best practice
 is to have the start_date rounded
-to your DAG&#8217;s <code class="docutils literal"><span class="pre">schedule_interval</span></code>. Daily jobs have their start_date
+to your DAG’s <code class="docutils literal"><span class="pre">schedule_interval</span></code>. Daily jobs have their start_date
 some day at 00:00:00, hourly jobs have their start_date at 00:00
 of a specific hour. Note that Airflow simply looks at the latest
 <code class="docutils literal"><span class="pre">execution_date</span></code> and adds the <code class="docutils literal"><span class="pre">schedule_interval</span></code> to determine
 the next <code class="docutils literal"><span class="pre">execution_date</span></code>. It is also very important
-to note that different tasks&#8217; dependencies
+to note that different tasks’ dependencies
 need to line up in time. If task A depends on task B and their
-start_date are offset in a way that their execution_date don&#8217;t line
-up, A&#8217;s dependencies will never be met. If you are looking to delay
+start_date are offset in a way that their execution_date don’t line
+up, A’s dependencies will never be met. If you are looking to delay
 a task, for example running a daily task at 2AM, look into the
 <code class="docutils literal"><span class="pre">TimeSensor</span></code> and <code class="docutils literal"><span class="pre">TimeDeltaSensor</span></code>. We advise against using
 dynamic <code class="docutils literal"><span class="pre">start_date</span></code> and recommend using fixed ones. Read the
 FAQ entry about start_date for more information.</li>
-<li><strong>end_date</strong> (<em>datetime</em>) &#8211; if specified, the scheduler won&#8217;t go beyond this date</li>
-<li><strong>depends_on_past</strong> (<em>bool</em>) &#8211; when set to true, task instances will run
-sequentially while relying on the previous task&#8217;s schedule to
+<li><strong>end_date</strong> (<em>datetime</em>) – if specified, the scheduler won’t go beyond this date</li>
+<li><strong>depends_on_past</strong> (<em>bool</em>) – when set to true, task instances will run
+sequentially while relying on the previous task’s schedule to
 succeed. The task instance for the start_date is allowed to run.</li>
-<li><strong>wait_for_downstream</strong> (<em>bool</em>) &#8211; when set to true, an instance of task
+<li><strong>wait_for_downstream</strong> (<em>bool</em>) – when set to true, an instance of task
 X will wait for tasks immediately downstream of the previous instance
 of task X to finish successfully before it runs. This is useful if the
 different instances of a task X alter the same asset, and this asset
 is used by tasks downstream of task X. Note that depends_on_past
 is forced to True wherever wait_for_downstream is used.</li>
-<li><strong>queue</strong> (<em>str</em>) &#8211; which queue to target when running this job. Not
+<li><strong>queue</strong> (<em>str</em>) – which queue to target when running this job. Not
 all executors implement queue management, the CeleryExecutor
 does support targeting specific queues.</li>
-<li><strong>dag</strong> (<a class="reference internal" href="#airflow.models.DAG" title="airflow.models.DAG"><em>DAG</em></a>) &#8211; a reference to the dag the task is attached to (if any)</li>
-<li><strong>priority_weight</strong> (<em>int</em>) &#8211; priority weight of this task against other task.
+<li><strong>dag</strong> (<a class="reference internal" href="#airflow.models.DAG" title="airflow.models.DAG"><em>DAG</em></a>) – a reference to the dag the task is attached to (if any)</li>
+<li><strong>priority_weight</strong> (<em>int</em>) – priority weight of this task against other tasks.
 This allows the executor to trigger higher priority tasks before
 others when things get backed up.</li>
-<li><strong>pool</strong> (<em>str</em>) &#8211; the slot pool this task should run in, slot pools are a
+<li><strong>pool</strong> (<em>str</em>) – the slot pool this task should run in, slot pools are a
 way to limit concurrency for certain tasks</li>
-<li><strong>sla</strong> (<em>datetime.timedelta</em>) &#8211; time by which the job is expected to succeed. Note that
+<li><strong>sla</strong> (<em>datetime.timedelta</em>) – time by which the job is expected to succeed. Note that
 this represents the <code class="docutils literal"><span class="pre">timedelta</span></code> after the period is closed. For
 example if you set an SLA of 1 hour, the scheduler would send an email
 soon after 1:00AM on the <code class="docutils literal"><span class="pre">2016-01-02</span></code> if the <code class="docutils literal"><span class="pre">2016-01-01</span></code> instance
@@ -293,27 +293,29 @@ emails for sla misses. SLA misses are also recorded in the database
 for future reference. All tasks that share the same SLA time
 get bundled in a single email, sent soon after that time. SLA
 notifications are sent once and only once for each task instance.</li>
-<li><strong>execution_timeout</strong> (<em>datetime.timedelta</em>) &#8211; max time allowed for the execution of
+<li><strong>execution_timeout</strong> (<em>datetime.timedelta</em>) – max time allowed for the execution of
 this task instance, if it goes beyond it will raise and fail.</li>
-<li><strong>on_failure_callback</strong> (<em>callable</em>) &#8211; a function to be called when a task instance
+<li><strong>on_failure_callback</strong> (<em>callable</em>) – a function to be called when a task instance
 of this task fails. A context dictionary is passed as a single
 parameter to this function. Context contains references to objects
 related to the task instance and is documented under the macros
 section of the API.</li>
-<li><strong>on_retry_callback</strong> &#8211; much like the <code class="docutils literal"><span class="pre">on_failure_callback</span></code> excepts
+<li><strong>on_retry_callback</strong> – much like the <code class="docutils literal"><span class="pre">on_failure_callback</span></code> except
 that it is executed when retries occur.</li>
-<li><strong>on_success_callback</strong> (<em>callable</em>) &#8211; much like the <code class="docutils literal"><span class="pre">on_failure_callback</span></code> excepts
+<li><strong>on_success_callback</strong> (<em>callable</em>) – much like the <code class="docutils literal"><span class="pre">on_failure_callback</span></code> except
 that it is executed when the task succeeds.</li>
-<li><strong>trigger_rule</strong> (<em>str</em>) &#8211; defines the rule by which dependencies are applied
+<li><strong>trigger_rule</strong> (<em>str</em>) – defines the rule by which dependencies are applied
 for the task to get triggered. Options are:
 <code class="docutils literal"><span class="pre">{</span> <span class="pre">all_success</span> <span class="pre">|</span> <span class="pre">all_failed</span> <span class="pre">|</span> <span class="pre">all_done</span> <span class="pre">|</span> <span class="pre">one_success</span> <span class="pre">|</span>
 <span class="pre">one_failed</span> <span class="pre">|</span> <span class="pre">dummy}</span></code>
 default is <code class="docutils literal"><span class="pre">all_success</span></code>. Options can be set as string or
 using the constants defined in the static class
 <code class="docutils literal"><span class="pre">airflow.utils.TriggerRule</span></code></li>
-<li><strong>resources</strong> (<em>dict</em>) &#8211; A map of resource parameter names (the argument names of the
+<li><strong>resources</strong> (<em>dict</em>) – A map of resource parameter names (the argument names of the
 Resources constructor) to their values.</li>
-<li><strong>run_as_user</strong> (<em>str</em>) &#8211; unix username to impersonate while running the task</li>
+<li><strong>run_as_user</strong> (<em>str</em>) – unix username to impersonate while running the task</li>
+<li><strong>task_concurrency</strong> (<em>int</em>) – When set, a task will be able to limit the concurrent
+runs across execution_dates</li>
 </ul>
 </td>
 </tr>
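
A minimal sketch of how the BaseOperator arguments above are usually supplied through default_args in an Airflow 1.x DAG file; the dag_id, dates and callback below are illustrative assumptions rather than part of the documented API:

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.dummy_operator import DummyOperator

    def notify_failure(context):
        # the context dict carries the task instance and related objects
        print('Task failed: %s' % context['task_instance'])

    default_args = {
        'owner': 'airflow',
        'start_date': datetime(2017, 1, 1),   # fixed date, rounded to the schedule_interval
        'retries': 3,
        'retry_delay': timedelta(minutes=5),
        'retry_exponential_backoff': True,
        'sla': timedelta(hours=1),            # expected to succeed within 1h after the period closes
        'on_failure_callback': notify_failure,
    }

    dag = DAG('baseoperator_defaults_example', default_args=default_args,
              schedule_interval='@daily')

    # every BaseOperator subclass in this DAG inherits the arguments above
    start = DummyOperator(task_id='start', dag=dag)
    finish = DummyOperator(task_id='finish', priority_weight=10, dag=dag)
    start.set_downstream(finish)
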
@@ -340,10 +342,10 @@ attributes.</p>
 <col class="field-body" />
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
-<li><strong>soft_fail</strong> (<em>bool</em>) &#8211; Set to true to mark the task as SKIPPED on failure</li>
-<li><strong>poke_interval</strong> (<em>int</em>) &#8211; Time in seconds that the job should wait in
+<li><strong>soft_fail</strong> (<em>bool</em>) – Set to true to mark the task as SKIPPED on failure</li>
+<li><strong>poke_interval</strong> (<em>int</em>) – Time in seconds that the job should wait in
 between each try</li>
-<li><strong>timeout</strong> (<em>int</em>) &#8211; Time, in seconds before the task times out and fails.</li>
+<li><strong>timeout</strong> (<em>int</em>) – Time, in seconds, before the task times out and fails.</li>
 </ul>
 </td>
 </tr>
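
As a sketch of how soft_fail, poke_interval and timeout are used, here is a hypothetical sensor subclassing BaseSensorOperator; the class, file path and DAG below are invented for illustration, and import paths assume the Airflow 1.x layout:

    import os
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.sensors import BaseSensorOperator
    from airflow.utils.decorators import apply_defaults

    class FileExistsSensor(BaseSensorOperator):
        """Succeeds once a local file shows up (illustrative only)."""

        @apply_defaults
        def __init__(self, filepath, *args, **kwargs):
            super(FileExistsSensor, self).__init__(*args, **kwargs)
            self.filepath = filepath

        def poke(self, context):
            # called every poke_interval seconds until True is returned or timeout is hit
            return os.path.isfile(self.filepath)

    dag = DAG('sensor_example', start_date=datetime(2017, 1, 1), schedule_interval='@daily')

    wait_for_file = FileExistsSensor(
        task_id='wait_for_file',
        filepath='/tmp/data_ready.flag',
        poke_interval=60,        # seconds between pokes
        timeout=60 * 60,         # give up after one hour
        soft_fail=True,          # mark the task SKIPPED instead of FAILED on timeout
        dag=dag,
    )
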
@@ -374,11 +376,11 @@ required to support attribute-based usage:</p>
 <col class="field-body" />
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
-<li><strong>bash_command</strong> (<em>string</em>) &#8211; The command, set of commands or reference to a
-bash script (must be &#8216;.sh&#8217;) to be executed.</li>
-<li><strong>xcom_push</strong> (<em>bool</em>) &#8211; If xcom_push is True, the last line written to stdout
+<li><strong>bash_command</strong> (<em>string</em>) – The command, set of commands or reference to a
+bash script (must be ‘.sh’) to be executed.</li>
+<li><strong>xcom_push</strong> (<em>bool</em>) – If xcom_push is True, the last line written to stdout
 will also be pushed to an XCom when the bash command completes.</li>
-<li><strong>env</strong> (<em>dict</em>) &#8211; If env is not None, it must be a mapping that defines the
+<li><strong>env</strong> (<em>dict</em>) – If env is not None, it must be a mapping that defines the
 environment variables for the new process; these are used instead
 of inheriting the current process environment, which is the default
 behavior. (templated)</li>
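
A short example of the bash_command, env and xcom_push parameters described above; the DAG id, command and environment values are made up:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator

    dag = DAG('bash_example', start_date=datetime(2017, 1, 1), schedule_interval='@daily')

    print_date = BashOperator(
        task_id='print_exec_date',
        bash_command='echo "run for {{ ds }}"',   # bash_command is templated
        env={'DATA_DIR': '/tmp/data'},            # replaces the inherited environment
        xcom_push=True,                           # last line of stdout is pushed to XCom
        dag=dag,
    )
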
@@ -399,15 +401,15 @@ which will be cleaned afterwards</p>
 <dl class="class">
 <dt id="airflow.operators.BranchPythonOperator">
 <em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">BranchPythonOperator</code><span class="sig-paren">(</span><em>python_callable</em>, <em>op_args=None</em>, <em>op_kwargs=None</em>, <em>provide_context=False</em>, <em>templates_dict=None</em>, <em>templates_exts=None</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/python_operator.html#BranchPythonOperator"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.BranchPythonOperator" title="Permalink to this definition">¶</a></dt>
-<dd><p>Bases: <code class="xref py py-class docutils literal"><span class="pre">python_operator.PythonOperator</span></code></p>
-<p>Allows a workflow to &#8220;branch&#8221; or follow a single path following the
+<dd><p>Bases: <code class="xref py py-class docutils literal"><span class="pre">python_operator.PythonOperator</span></code>, <code class="xref py py-class docutils literal"><span class="pre">airflow.models.SkipMixin</span></code></p>
+<p>Allows a workflow to “branch” or follow a single path following the
 execution of this task.</p>
 <p>It derives the PythonOperator and expects a Python function that returns
 the task_id to follow. The task_id returned should point to a task
-directly downstream from {self}. All other &#8220;branches&#8221; or
+directly downstream from {self}. All other “branches” or
 directly downstream tasks are marked with a state of <code class="docutils literal"><span class="pre">skipped</span></code> so that
-these paths can&#8217;t move forward. The <code class="docutils literal"><span class="pre">skipped</span></code> states are propageted
-downstream to allow for the DAG state to fill up and the DAG run&#8217;s state
+these paths can’t move forward. The <code class="docutils literal"><span class="pre">skipped</span></code> states are propagated
+downstream to allow for the DAG state to fill up and the DAG run’s state
 to be inferred.</p>
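
A minimal sketch of the branching pattern described above, assuming Airflow 1.x imports; the branch logic and task ids are illustrative:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.dummy_operator import DummyOperator
    from airflow.operators.python_operator import BranchPythonOperator

    dag = DAG('branch_example', start_date=datetime(2017, 1, 1), schedule_interval='@daily')

    def choose_branch(**kwargs):
        # return the task_id of the single downstream branch that should run
        return 'weekday_path' if kwargs['execution_date'].weekday() < 5 else 'weekend_path'

    branch = BranchPythonOperator(
        task_id='branch',
        python_callable=choose_branch,
        provide_context=True,
        dag=dag,
    )
    weekday = DummyOperator(task_id='weekday_path', dag=dag)
    weekend = DummyOperator(task_id='weekend_path', dag=dag)
    branch.set_downstream([weekday, weekend])   # the unchosen branch is marked skipped
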
 <p>Note that using tasks with <code class="docutils literal"><span class="pre">depends_on_past=True</span></code> downstream from
 <code class="docutils literal"><span class="pre">BranchPythonOperator</span></code> is logically unsound as <code class="docutils literal"><span class="pre">skipped</span></code> status
@@ -426,8 +428,8 @@ will invariably lead to block tasks that depend on their past successes.
 <col class="field-body" />
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
-<li><strong>trigger_dag_id</strong> (<em>str</em>) &#8211; the dag_id to trigger</li>
-<li><strong>python_callable</strong> (<em>python callable</em>) &#8211; a reference to a python function that will be
+<li><strong>trigger_dag_id</strong> (<em>str</em>) – the dag_id to trigger</li>
+<li><strong>python_callable</strong> (<em>python callable</em>) – a reference to a python function that will be
 called while passing it the <code class="docutils literal"><span class="pre">context</span></code> object and a placeholder
 object <code class="docutils literal"><span class="pre">obj</span></code> for your callable to fill and return if you want
 a DagRun created. This <code class="docutils literal"><span class="pre">obj</span></code> object contains a <code class="docutils literal"><span class="pre">run_id</span></code> and
@@ -461,13 +463,13 @@ DAG.</p>
 <col class="field-body" />
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
-<li><strong>to</strong> (<em>list</em><em> or </em><em>string</em><em> (</em><em>comma</em><em> or </em><em>semicolon delimited</em><em>)</em><em></em>) &#8211; list of emails to send the email to</li>
-<li><strong>subject</strong> (<em>string</em>) &#8211; subject line for the email (templated)</li>
-<li><strong>html_content</strong> (<em>string</em>) &#8211; content of the email (templated), html markup
+<li><strong>to</strong> (<a class="reference internal" href="integration.html#airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook.list" title="airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook.list"><em>list</em></a><em> or </em><em>string</em><em> (</em><em>comma</em><em> or </em><em>semicolon delimited</em><em>)</em>) – list of emails to send the email to</li>
+<li><strong>subject</strong> (<em>string</em>) – subject line for the email (templated)</li>
+<li><strong>html_content</strong> (<em>string</em>) – content of the email (templated), html markup
 is allowed</li>
-<li><strong>files</strong> (<em>list</em>) &#8211; file names to attach in email</li>
-<li><strong>cc</strong> (<em>list</em><em> or </em><em>string</em><em> (</em><em>comma</em><em> or </em><em>semicolon delimited</em><em>)</em><em></em>) &#8211; list of recipients to be added in CC field</li>
-<li><strong>bcc</strong> (<em>list</em><em> or </em><em>string</em><em> (</em><em>comma</em><em> or </em><em>semicolon delimited</em><em>)</em><em></em>) &#8211; list of recipients to be added in BCC field</li>
+<li><strong>files</strong> (<a class="reference internal" href="integration.html#airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook.list" title="airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook.list"><em>list</em></a>) – file names to attach in email</li>
+<li><strong>cc</strong> (<a class="reference internal" href="integration.html#airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook.list" title="airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook.list"><em>list</em></a><em> or </em><em>string</em><em> (</em><em>comma</em><em> or </em><em>semicolon delimited</em><em>)</em>) – list of recipients to be added in CC field</li>
+<li><strong>bcc</strong> (<a class="reference internal" href="integration.html#airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook.list" title="airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook.list"><em>list</em></a><em> or </em><em>string</em><em> (</em><em>comma</em><em> or </em><em>semicolon delimited</em><em>)</em>) – list of recipients to be added in BCC field</li>
 </ul>
 </td>
 </tr>
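
For reference, a hypothetical EmailOperator task showing the to/cc/subject/html_content/files parameters; addresses and file path are placeholders:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.email_operator import EmailOperator

    dag = DAG('email_example', start_date=datetime(2017, 1, 1), schedule_interval='@daily')

    send_report = EmailOperator(
        task_id='send_report',
        to=['data-team@example.com', 'oncall@example.com'],
        cc='manager@example.com',                        # comma/semicolon string also accepted
        subject='Nightly report for {{ ds }}',           # templated
        html_content='<h3>Run {{ ds }} finished.</h3>',  # html markup is allowed
        files=['/tmp/report.csv'],
        dag=dag,
    )
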
@@ -485,18 +487,18 @@ is allowed</li>
 <col class="field-body" />
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
-<li><strong>external_dag_id</strong> (<em>string</em>) &#8211; The dag_id that contains the task you want to
+<li><strong>external_dag_id</strong> (<em>string</em>) – The dag_id that contains the task you want to
 wait for</li>
-<li><strong>external_task_id</strong> (<em>string</em>) &#8211; The task_id that contains the task you want to
+<li><strong>external_task_id</strong> (<em>string</em>) – The task_id that contains the task you want to
 wait for</li>
-<li><strong>allowed_states</strong> (<em>list</em>) &#8211; list of allowed states, default is <code class="docutils literal"><span class="pre">['success']</span></code></li>
-<li><strong>execution_delta</strong> (<em>datetime.timedelta</em>) &#8211; time difference with the previous execution to
+<li><strong>allowed_states</strong> (<a class="reference internal" href="integration.html#airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook.list" title="airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook.list"><em>list</em></a>) – list of allowed states, default is <code class="docutils literal"><span class="pre">['success']</span></code></li>
+<li><strong>execution_delta</strong> (<em>datetime.timedelta</em>) – time difference with the previous execution to
 look at, the default is the same execution_date as the current task.
 For yesterday, use [positive!] datetime.timedelta(days=1). Either
 execution_delta or execution_date_fn can be passed to
 ExternalTaskSensor, but not both.</li>
-<li><strong>execution_date_fn</strong> (<em>callable</em>) &#8211; function that receives the current execution date
-and returns the desired execution date to query. Either execution_delta
+<li><strong>execution_date_fn</strong> (<em>callable</em>) – function that receives the current execution date
+and returns the desired execution dates to query. Either execution_delta
 or execution_date_fn can be passed to ExternalTaskSensor, but not both.</li>
 </ul>
 </td>
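
A sketch of waiting on a task in another DAG with ExternalTaskSensor; the upstream dag_id and task_id are invented, and the one-day execution_delta assumes the upstream run is a day behind this one:

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.sensors import ExternalTaskSensor

    dag = DAG('downstream_dag', start_date=datetime(2017, 1, 1), schedule_interval='@daily')

    wait_for_upstream = ExternalTaskSensor(
        task_id='wait_for_upstream_load',
        external_dag_id='upstream_dag',        # hypothetical DAG to wait on
        external_task_id='load_table',         # hypothetical task in that DAG
        execution_delta=timedelta(days=1),     # positive delta looks at yesterday's run
        allowed_states=['success'],
        poke_interval=300,
        dag=dag,
    )
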
@@ -513,17 +515,17 @@ or execution_date_fn can be passed to ExternalTaskSensor, but not both.</li>
 provide the required methods in their respective hooks. The source hook
 needs to expose a <cite>get_records</cite> method, and the destination an
 <cite>insert_rows</cite> method.</p>
-<p>This is mean to be used on small-ish datasets that fit in memory.</p>
+<p>This is meant to be used on small-ish datasets that fit in memory.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name" />
 <col class="field-body" />
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
-<li><strong>sql</strong> (<em>str</em>) &#8211; SQL query to execute against the source database</li>
-<li><strong>destination_table</strong> (<em>str</em>) &#8211; target table</li>
-<li><strong>source_conn_id</strong> (<em>str</em>) &#8211; source connection</li>
-<li><strong>destination_conn_id</strong> (<em>str</em>) &#8211; source connection</li>
-<li><strong>preoperator</strong> (<em>str</em><em> or </em><em>list of str</em>) &#8211; sql statement or list of statements to be
+<li><strong>sql</strong> (<em>str</em>) – SQL query to execute against the source database</li>
+<li><strong>destination_table</strong> (<em>str</em>) – target table</li>
+<li><strong>source_conn_id</strong> (<em>str</em>) – source connection</li>
+<li><strong>destination_conn_id</strong> (<em>str</em>) – destination connection</li>
+<li><strong>preoperator</strong> (<em>str</em><em> or </em><em>list of str</em>) – sql statement or list of statements to be
 executed prior to loading the data</li>
 </ul>
 </td>
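
A hedged example of a GenericTransfer between two hypothetical connections; the preoperator keeps the load idempotent by clearing the target first:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.generic_transfer import GenericTransfer

    dag = DAG('transfer_example', start_date=datetime(2017, 1, 1), schedule_interval='@daily')

    copy_active_users = GenericTransfer(
        task_id='copy_active_users',
        sql='SELECT * FROM users WHERE active = 1',          # runs against the source connection
        destination_table='reporting.active_users',
        source_conn_id='source_db',                          # hypothetical connection ids
        destination_conn_id='reporting_db',
        preoperator='DELETE FROM reporting.active_users',    # executed on the destination first
        dag=dag,
    )
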
@@ -540,44 +542,99 @@ executed prior to loading the data</li>
 <dl class="staticmethod">
 <dt id="airflow.operators.HdfsSensor.filter_for_filesize">
 <em class="property">static </em><code class="descname">filter_for_filesize</code><span class="sig-paren">(</span><em>result</em>, <em>size=None</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/sensors.html#HdfsSensor.filter_for_filesize"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.HdfsSensor.filter_for_filesize" title="Permalink to this definition">¶</a></dt>
-<dd><p>Will test the filepath result and test if its size is at least self.filesize
-:param result: a list of dicts returned by Snakebite ls
-:param size: the file size in MB a file should be at least to trigger True
-:return: (bool) depending on the matching criteria</p>
+<dd><p>Will test the filepath result and check whether its size is at least self.filesize</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>result</strong> – a list of dicts returned by Snakebite ls</li>
+<li><strong>size</strong> – the file size in MB a file should be at least to trigger True</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">(bool) depending on the matching criteria</p>
+</td>
+</tr>
+</tbody>
+</table>
 </dd></dl>
 
 <dl class="staticmethod">
 <dt id="airflow.operators.HdfsSensor.filter_for_ignored_ext">
 <em class="property">static </em><code class="descname">filter_for_ignored_ext</code><span class="sig-paren">(</span><em>result</em>, <em>ignored_ext</em>, <em>ignore_copying</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/sensors.html#HdfsSensor.filter_for_ignored_ext"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.HdfsSensor.filter_for_ignored_ext" title="Permalink to this definition">¶</a></dt>
-<dd><p>Will filter if instructed to do so the result to remove matching criteria
-:param result: (list) of dicts returned by Snakebite ls
-:param ignored_ext: (list) of ignored extentions
-:param ignore_copying: (bool) shall we ignore ?
-:return:</p>
+<dd><p>Will filter the result, if instructed to do so, to remove entries matching the criteria</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>result</strong> – (list) of dicts returned by Snakebite ls</li>
+<li><strong>ignored_ext</strong> – (list) of ignored extensions</li>
+<li><strong>ignore_copying</strong> – (bool) whether files that are still being copied should be ignored</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">(list) of dicts which were not removed</p>
+</td>
+</tr>
+</tbody>
+</table>
 </dd></dl>
 
 </dd></dl>
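
The excerpt above only documents the filter helpers, so the constructor arguments below (filepath, hdfs_conn_id) are assumptions based on the 1.x HdfsSensor; the path is a placeholder:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.sensors import HdfsSensor

    dag = DAG('hdfs_example', start_date=datetime(2017, 1, 1), schedule_interval='@daily')

    wait_for_success_flag = HdfsSensor(
        task_id='wait_for_success_flag',
        filepath='/data/events/{{ ds }}/_SUCCESS',   # hypothetical HDFS path
        hdfs_conn_id='hdfs_default',                 # assumed default connection id
        poke_interval=120,
        timeout=6 * 60 * 60,
        dag=dag,
    )
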
 
 <dl class="class">
+<dt id="airflow.operators.HiveOperator">
+<em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">HiveOperator</code><span class="sig-paren">(</span><em>hql</em>, <em>hive_cli_conn_id='hive_cli_default'</em>, <em>schema='default'</em>, <em>hiveconf_jinja_translate=False</em>, <em>script_begin_tag=None</em>, <em>run_as_owner=False</em>, <em>mapred_queue=None</em>, <em>mapred_queue_priority=None</em>, <em>mapred_job_name=None</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/hive_operator.html#HiveOperator"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.HiveOperator" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#airflow.models.BaseOperator" title="airflow.models.BaseOperator"><code class="xref py py-class docutils literal"><span class="pre">airflow.models.BaseOperator</span></code></a></p>
+<p>Executes hql code in a specific Hive database.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>hql</strong> (<em>string</em>) – the hql to be executed</li>
+<li><strong>hive_cli_conn_id</strong> (<em>string</em>) – reference to the Hive database</li>
+<li><strong>hiveconf_jinja_translate</strong> (<em>boolean</em>) – when True, hiveconf-type templating
+${var} gets translated into jinja-type templating {{ var }}. Note that
+you may want to use this along with the
+<code class="docutils literal"><span class="pre">DAG(user_defined_macros=myargs)</span></code> parameter. View the DAG
+object documentation for more details.</li>
+<li><strong>script_begin_tag</strong> (<em>str</em>) – If defined, the operator will get rid of the
+part of the script before the first occurrence of <cite>script_begin_tag</cite></li>
+<li><strong>mapred_queue</strong> (<em>string</em>) – queue used by the Hadoop CapacityScheduler</li>
+<li><strong>mapred_queue_priority</strong> (<em>string</em>) – priority within CapacityScheduler queue.
+Possible settings include: VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW</li>
+<li><strong>mapred_job_name</strong> (<em>string</em>) – This name will appear in the jobtracker.
+This can make monitoring easier.</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
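
A minimal HiveOperator sketch using the parameters documented above; the database, table and queue names are illustrative:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.hive_operator import HiveOperator

    dag = DAG('hive_example', start_date=datetime(2017, 1, 1), schedule_interval='@daily')

    daily_rollup = HiveOperator(
        task_id='daily_rollup',
        hql="""
            INSERT OVERWRITE TABLE rollup.daily_events PARTITION (ds='{{ ds }}')
            SELECT event_type, count(*) FROM raw.events
            WHERE ds = '{{ ds }}' GROUP BY event_type
        """,
        hive_cli_conn_id='hive_cli_default',
        mapred_queue='etl',                     # hypothetical CapacityScheduler queue
        mapred_job_name='airflow_daily_rollup',
        dag=dag,
    )
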
+
+<dl class="class">
 <dt id="airflow.operators.HivePartitionSensor">
 <em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">HivePartitionSensor</code><span class="sig-paren">(</span><em>table</em>, <em>partition=&quot;ds='{{ ds }}'&quot;</em>, <em>metastore_conn_id='metastore_default'</em>, <em>schema='default'</em>, <em>poke_interval=180</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/sensors.html#HivePartitionSensor"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.HivePartitionSensor" title="Permalink to this definition">¶</a></dt>
 <dd><p>Bases: <a class="reference internal" href="#airflow.operators.sensors.BaseSensorOperator" title="airflow.operators.sensors.BaseSensorOperator"><code class="xref py py-class docutils literal"><span class="pre">sensors.BaseSensorOperator</span></code></a></p>
 <p>Waits for a partition to show up in Hive.</p>
 <p>Note: Because <code class="docutils literal"><span class="pre">partition</span></code> supports general logical operators, it
 can be inefficient. Consider using NamedHivePartitionSensor instead if
-you don&#8217;t need the full flexibility of HivePartitionSensor.</p>
+you don’t need the full flexibility of HivePartitionSensor.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name" />
 <col class="field-body" />
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
-<li><strong>table</strong> (<em>string</em>) &#8211; The name of the table to wait for, supports the dot
+<li><strong>table</strong> (<em>string</em>) – The name of the table to wait for, supports the dot
 notation (my_database.my_table)</li>
-<li><strong>partition</strong> (<em>string</em>) &#8211; The partition clause to wait for. This is passed as
+<li><strong>partition</strong> (<em>string</em>) – The partition clause to wait for. This is passed as
 is to the metastore Thrift client <code class="docutils literal"><span class="pre">get_partitions_by_filter</span></code> method,
 and apparently supports SQL like notation as in <code class="docutils literal"><span class="pre">ds='2015-01-01'</span>
 <span class="pre">AND</span> <span class="pre">type='value'</span></code> and comparison operators as in <code class="docutils literal"><span class="pre">&quot;ds&gt;=2015-01-01&quot;</span></code></li>
-<li><strong>metastore_conn_id</strong> (<em>str</em>) &#8211; reference to the metastore thrift service
+<li><strong>metastore_conn_id</strong> (<em>str</em>) – reference to the metastore thrift service
 connection id</li>
 </ul>
 </td>
@@ -587,6 +644,89 @@ connection id</li>
 </dd></dl>
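
For illustration, waiting on a daily partition with HivePartitionSensor; the table name is hypothetical and the partition clause uses the documented default form:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.sensors import HivePartitionSensor

    dag = DAG('partition_wait_example', start_date=datetime(2017, 1, 1),
              schedule_interval='@daily')

    wait_for_partition = HivePartitionSensor(
        task_id='wait_for_events_partition',
        table='raw.events',           # hypothetical my_database.my_table
        partition="ds='{{ ds }}'",    # general filters such as "ds>='2015-01-01'" also work
        metastore_conn_id='metastore_default',
        poke_interval=180,
        dag=dag,
    )
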
 
 <dl class="class">
+<dt id="airflow.operators.HiveToDruidTransfer">
+<em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">HiveToDruidTransfer</code><span class="sig-paren">(</span><em>sql</em>, <em>druid_datasource</em>, <em>ts_dim</em>, <em>metric_spec=None</em>, <em>hive_cli_conn_id='hive_cli_default'</em>, <em>druid_ingest_conn_id='druid_ingest_default'</em>, <em>metastore_conn_id='metastore_default'</em>, <em>hadoop_dependency_coordinates=None</em>, <em>intervals=None</em>, <em>num_shards=-1</em>, <em>target_partition_size=-1</em>, <em>query_granularity='NONE'</em>, <em>segment_granularity='DAY'</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/hive_to_druid.html#HiveToDruidTransfer"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.HiveToDruidTransfer" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#airflow.models.BaseOperator" title="airflow.models.BaseOperator"><code class="xref py py-class docutils literal"><span class="pre">airflow.models.BaseOperator</span></code></a></p>
+<p>Moves data from Hive to Druid. Note that for now the data is loaded
+into memory before being pushed to Druid, so this operator should
+be used for smallish amounts of data.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>sql</strong> (<em>str</em>) – SQL query to execute against the Druid database</li>
+<li><strong>druid_datasource</strong> (<em>str</em>) – the datasource you want to ingest into in druid</li>
+<li><strong>ts_dim</strong> (<em>str</em>) – the timestamp dimension</li>
+<li><strong>metric_spec</strong> (<a class="reference internal" href="integration.html#airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook.list" title="airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook.list"><em>list</em></a>) – the metrics you want to define for your data</li>
+<li><strong>hive_cli_conn_id</strong> (<em>str</em>) – the hive connection id</li>
+<li><strong>druid_ingest_conn_id</strong> (<em>str</em>) – the druid ingest connection id</li>
+<li><strong>metastore_conn_id</strong> (<em>str</em>) – the metastore connection id</li>
+<li><strong>hadoop_dependency_coordinates</strong> (<em>list of str</em>) – list of coordinates to squeeze
+into the ingest json</li>
+<li><strong>intervals</strong> (<a class="reference internal" href="integration.html#airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook.list" title="airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook.list"><em>list</em></a>) – list of time intervals that defines segments, this
+is passed as is to the json object</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+<dl class="method">
+<dt id="airflow.operators.HiveToDruidTransfer.construct_ingest_query">
+<code class="descname">construct_ingest_query</code><span class="sig-paren">(</span><em>static_path</em>, <em>columns</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/hive_to_druid.html#HiveToDruidTransfer.construct_ingest_query"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.HiveToDruidTransfer.construct_ingest_query" title="Permalink to this definition">¶</a></dt>
+<dd><p>Builds an ingest query for an HDFS TSV load.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>static_path</strong> (<em>str</em>) – The path on hdfs where the data is</li>
+<li><strong>columns</strong> (<a class="reference internal" href="integration.html#airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook.list" title="airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook.list"><em>list</em></a>) – List of all the columns that are available</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+</dd></dl>
+
+<dl class="class">
+<dt id="airflow.operators.HiveToMySqlTransfer">
+<em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">HiveToMySqlTransfer</code><span class="sig-paren">(</span><em>sql</em>, <em>mysql_table</em>, <em>hiveserver2_conn_id='hiveserver2_default'</em>, <em>mysql_conn_id='mysql_default'</em>, <em>mysql_preoperator=None</em>, <em>mysql_postoperator=None</em>, <em>bulk_load=False</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/hive_to_mysql.html#HiveToMySqlTransfer"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.HiveToMySqlTransfer" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#airflow.models.BaseOperator" title="airflow.models.BaseOperator"><code class="xref py py-class docutils literal"><span class="pre">airflow.models.BaseOperator</span></code></a></p>
+<p>Moves data from Hive to MySQL. Note that for now the data is loaded
+into memory before being pushed to MySQL, so this operator should
+be used for smallish amounts of data.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>sql</strong> (<em>str</em>) – SQL query to execute against the MySQL database</li>
+<li><strong>mysql_table</strong> (<em>str</em>) – target MySQL table, use dot notation to target a
+specific database</li>
+<li><strong>mysql_conn_id</strong> (<em>str</em>) – destination mysql connection</li>
+<li><strong>hiveserver2_conn_id</strong> (<em>str</em>) – source hive connection</li>
+<li><strong>mysql_preoperator</strong> (<em>str</em>) – sql statement to run against mysql prior to
+import, typically used to truncate or delete in place of the data
+coming in, allowing the task to be idempotent (running the task
+twice won’t double load data)</li>
+<li><strong>mysql_postoperator</strong> (<em>str</em>) – sql statement to run against mysql after the
+import, typically used to move data from staging to production
+and issue cleanup commands.</li>
+<li><strong>bulk_load</strong> (<em>bool</em>) – flag to use bulk_load option.  This loads mysql directly
+from a tab-delimited text file using the LOAD DATA LOCAL INFILE command.
+This option requires an extra connection parameter for the
+destination MySQL connection: {‘local_infile’: true}.</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
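
A hedged sketch of exporting a Hive query result into MySQL with HiveToMySqlTransfer; table names are placeholders and the preoperator keeps the task idempotent:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.hive_to_mysql import HiveToMySqlTransfer

    dag = DAG('hive_to_mysql_example', start_date=datetime(2017, 1, 1),
              schedule_interval='@daily')

    export_daily_rollup = HiveToMySqlTransfer(
        task_id='export_daily_rollup',
        sql="SELECT '{{ ds }}' AS ds, event_type, cnt "
            "FROM rollup.daily_events WHERE ds = '{{ ds }}'",
        mysql_table='reporting.daily_events',
        hiveserver2_conn_id='hiveserver2_default',
        mysql_conn_id='mysql_default',
        mysql_preoperator="DELETE FROM reporting.daily_events WHERE ds = '{{ ds }}'",
        dag=dag,
    )
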
+
+<dl class="class">
 <dt id="airflow.operators.SimpleHttpOperator">
 <em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">SimpleHttpOperator</code><span class="sig-paren">(</span><em>endpoint</em>, <em>method='POST'</em>, <em>data=None</em>, <em>headers=None</em>, <em>response_check=None</em>, <em>extra_options=None</em>, <em>xcom_push=False</em>, <em>http_conn_id='http_default'</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/http_operator.html#SimpleHttpOperator"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.SimpleHttpOperator" title="Permalink to this definition">¶</a></dt>
 <dd><p>Bases: <a class="reference internal" href="#airflow.models.BaseOperator" title="airflow.models.BaseOperator"><code class="xref py py-class docutils literal"><span class="pre">airflow.models.BaseOperator</span></code></a></p>
@@ -596,18 +736,18 @@ connection id</li>
 <col class="field-body" />
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
-<li><strong>http_conn_id</strong> (<em>string</em>) &#8211; The connection to run the sensor against</li>
-<li><strong>endpoint</strong> (<em>string</em>) &#8211; The relative part of the full url</li>
-<li><strong>method</strong> (<em>string</em>) &#8211; The HTTP method to use, default = &#8220;POST&#8221;</li>
+<li><strong>http_conn_id</strong> (<em>string</em>) – The connection to run the sensor against</li>
+<li><strong>endpoint</strong> (<em>string</em>) – The relative part of the full url</li>
+<li><strong>method</strong> (<em>string</em>) – The HTTP method to use, default = “POST”</li>
 <li><strong>data</strong> (<em>For POST/PUT</em><em>, </em><em>depends on the content-type parameter</em><em>,
-</em><em>for GET a dictionary of key/value string pairs</em>) &#8211; The data to pass. POST-data in POST/PUT and params
+</em><em>for GET a dictionary of key/value string pairs</em>) – The data to pass. POST-data in POST/PUT and params
 in the URL for a GET request.</li>
-<li><strong>headers</strong> (<em>a dictionary of string key/value pairs</em>) &#8211; The HTTP headers to be added to the GET request</li>
-<li><strong>response_check</strong> (<em>A lambda</em><em> or </em><em>defined function.</em>) &#8211; A check against the &#8216;requests&#8217; response object.
-Returns True for &#8216;pass&#8217; and False otherwise.</li>
+<li><strong>headers</strong> (<em>a dictionary of string key/value pairs</em>) – The HTTP headers to be added to the GET request</li>
+<li><strong>response_check</strong> (<em>A lambda</em><em> or </em><em>defined function.</em>) – A check against the ‘requests’ response object.
+Returns True for ‘pass’ and False otherwise.</li>
 <li><strong>extra_options</strong> (<em>A dictionary of options</em><em>, </em><em>where key is string and value
-depends on the option that's being modified.</em>) &#8211; Extra options for the &#8216;requests&#8217; library, see the
-&#8216;requests&#8217; documentation (options to modify timeout, ssl, etc.)</li>
+depends on the option that's being modified.</em>) – Extra options for the ‘requests’ library, see the
+‘requests’ documentation (options to modify timeout, ssl, etc.)</li>
 </ul>
 </td>
 </tr>
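
An illustrative SimpleHttpOperator call; the endpoint and payload are invented, and http_conn_id is expected to point at the target base URL:

    import json
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.http_operator import SimpleHttpOperator

    dag = DAG('http_example', start_date=datetime(2017, 1, 1), schedule_interval='@daily')

    post_run_event = SimpleHttpOperator(
        task_id='post_run_event',
        http_conn_id='http_default',
        endpoint='api/events',                     # hypothetical relative endpoint
        method='POST',
        data=json.dumps({'run_date': '{{ ds }}'}),
        headers={'Content-Type': 'application/json'},
        response_check=lambda response: response.status_code == 200,
        dag=dag,
    )
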
@@ -617,7 +757,7 @@ depends on the option that's being modified.</em>) &#8211; Extra options for the
 
 <dl class="class">
 <dt id="airflow.operators.HttpSensor">
-<em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">HttpSensor</code><span class="sig-paren">(</span><em>endpoint</em>, <em>http_conn_id='http_default'</em>, <em>params=None</em>, <em>headers=None</em>, <em>response_check=None</em>, <em>extra_options=None</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/sensors.html#HttpSensor"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.HttpSensor" title="Permalink to this definition">¶</a></dt>
+<em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">HttpSensor</code><span class="sig-paren">(</span><em>endpoint</em>, <em>http_conn_id='http_default'</em>, <em>method='GET'</em>, <em>request_params=None</em>, <em>headers=None</em>, <em>response_check=None</em>, <em>extra_options=None</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/sensors.html#HttpSensor"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.HttpSensor" title="Permalink to this definition">¶</a></dt>
 <dd><p>Bases: <a class="reference internal" href="#airflow.operators.sensors.BaseSensorOperator" title="airflow.operators.sensors.BaseSensorOperator"><code class="xref py py-class docutils literal"><span class="pre">sensors.BaseSensorOperator</span></code></a></p>
 <dl class="docutils">
 <dt>Executes an HTTP GET statement and returns False on failure:</dt>
@@ -628,15 +768,16 @@ depends on the option that's being modified.</em>) &#8211; Extra options for the
 <col class="field-body" />
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
-<li><strong>http_conn_id</strong> (<em>string</em>) &#8211; The connection to run the sensor against</li>
-<li><strong>endpoint</strong> (<em>string</em>) &#8211; The relative part of the full url</li>
-<li><strong>params</strong> (<em>a dictionary of string key/value pairs</em>) &#8211; The parameters to be added to the GET url</li>
-<li><strong>headers</strong> (<em>a dictionary of string key/value pairs</em>) &#8211; The HTTP headers to be added to the GET request</li>
-<li><strong>response_check</strong> (<em>A lambda</em><em> or </em><em>defined function.</em>) &#8211; A check against the &#8216;requests&#8217; response object.
-Returns True for &#8216;pass&#8217; and False otherwise.</li>
+<li><strong>http_conn_id</strong> (<em>string</em>) – The connection to run the sensor against</li>
+<li><strong>method</strong> (<em>string</em>) – The HTTP request method to use</li>
+<li><strong>endpoint</strong> (<em>string</em>) – The relative part of the full url</li>
+<li><strong>request_params</strong> (<em>a dictionary of string key/value pairs</em>) – The parameters to be added to the GET url</li>
+<li><strong>headers</strong> (<em>a dictionary of string key/value pairs</em>) – The HTTP headers to be added to the GET request</li>
+<li><strong>response_check</strong> (<em>A lambda</em><em> or </em><em>defined function.</em>) – A check against the ‘requests’ response object.
+Returns True for ‘pass’ and False otherwise.</li>
 <li><strong>extra_options</strong> (<em>A dictionary of options</em><em>, </em><em>where key is string and value
-depends on the option that's being modified.</em>) &#8211; Extra options for the &#8216;requests&#8217; library, see the
-&#8216;requests&#8217; documentation (options to modify timeout, ssl, etc.)</li>
+depends on the option that's being modified.</em>) – Extra options for the ‘requests’ library, see the
+‘requests’ documentation (options to modify timeout, ssl, etc.)</li>
 </ul>
 </td>
 </tr>
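
A corresponding HttpSensor sketch polling a hypothetical status endpoint until the response body indicates readiness:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.sensors import HttpSensor

    dag = DAG('http_sensor_example', start_date=datetime(2017, 1, 1),
              schedule_interval='@daily')

    wait_for_export = HttpSensor(
        task_id='wait_for_export_ready',
        http_conn_id='http_default',
        method='GET',
        endpoint='exports/latest/status',          # hypothetical endpoint
        request_params={'format': 'json'},
        response_check=lambda response: 'ready' in response.text,
        poke_interval=120,
        timeout=60 * 60,
        dag=dag,
    )
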
@@ -651,20 +792,20 @@ depends on the option that's being modified.</em>) &#8211; Extra options for the
 <p>An alternative to the HivePartitionSensor that talks directly to the
 MySQL db. This was created as a result of observing suboptimal
 queries generated by the Metastore thrift service when hitting
-subpartitioned tables. The Thrift service&#8217;s queries were written in a
-way that wouldn&#8217;t leverage the indexes.</p>
+subpartitioned tables. The Thrift service’s queries were written in a
+way that wouldn’t leverage the indexes.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name" />
 <col class="field-body" />
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
-<li><strong>schema</strong> (<em>str</em>) &#8211; the schema</li>
-<li><strong>table</strong> (<em>str</em>) &#8211; the table</li>
-<li><strong>partition_name</strong> (<em>str</em>) &#8211; the partition name, as defined in the PARTITIONS
+<li><strong>schema</strong> (<em>str</em>) – the schema</li>
+<li><strong>table</strong> (<em>str</em>) – the table</li>
+<li><strong>partition_name</strong> (<em>str</em>) – the partition name, as defined in the PARTITIONS
 table of the Metastore. Order of the fields does matter.
 Examples: <code class="docutils literal"><span class="pre">ds=2016-01-01</span></code> or
 <code class="docutils literal"><span class="pre">ds=2016-01-01/sub=foo</span></code> for a sub partitioned table</li>
-<li><strong>mysql_conn_id</strong> (<em>str</em>) &#8211; a reference to the MySQL conn_id for the metastore</li>
+<li><strong>mysql_conn_id</strong> (<em>str</em>) – a reference to the MySQL conn_id for the metastore</li>
 </ul>
 </td>
 </tr>
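
The constructor is not shown in this excerpt, so the call below assumes the 1.x signature (schema, table, partition_name, mysql_conn_id); names are illustrative:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.sensors import MetastorePartitionSensor

    dag = DAG('metastore_partition_example', start_date=datetime(2017, 1, 1),
              schedule_interval='@daily')

    wait_for_partition = MetastorePartitionSensor(
        task_id='wait_for_events_partition',
        schema='raw',
        table='events',
        partition_name='ds={{ ds }}',       # matches the PARTITIONS table format, e.g. ds=2016-01-01
        mysql_conn_id='metastore_mysql',    # assumed conn id for the metastore backend DB
        dag=dag,
    )
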
@@ -674,7 +815,7 @@ Examples: <code class="docutils literal"><span class="pre">ds=2016-01-01</span><
 
 <dl class="class">
 <dt id="airflow.operators.MySqlOperator">
-<em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">MySqlOperator</code><span class="sig-paren">(</span><em>sql</em>, <em>mysql_conn_id='mysql_default'</em>, <em>parameters=None</em>, <em>autocommit=False</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/mysql_operator.html#MySqlOperator"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.MySqlOperator" title="Permalink to this definition">¶</a></dt>
+<em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">MySqlOperator</code><span class="sig-paren">(</span><em>sql</em>, <em>mysql_conn_id='mysql_default'</em>, <em>parameters=None</em>, <em>autocommit=False</em>, <em>database=None</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/mysql_operator.html#MySqlOperator"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.MySqlOperator" title="Permalink to this definition">¶</a></dt>
 <dd><p>Bases: <a class="reference internal" href="#airflow.models.BaseOperator" title="airflow.models.BaseOperator"><code class="xref py py-class docutils literal"><span class="pre">airflow.models.BaseOperator</span></code></a></p>
 <p>Executes sql code in a specific MySQL database</p>
 <table class="docutils field-list" frame="void" rules="none">
@@ -682,10 +823,50 @@ Examples: <code class="docutils literal"><span class="pre">ds=2016-01-01</span><
 <col class="field-body" />
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
-<li><strong>mysql_conn_id</strong> (<em>string</em>) &#8211; reference to a specific mysql database</li>
+<li><strong>mysql_conn_id</strong> (<em>string</em>) – reference to a specific mysql database</li>
 <li><strong>sql</strong> (<em>Can receive a str representing a sql statement</em><em>,
-</em><em>a list of str</em><em> (</em><em>sql statements</em><em>)</em><em></em><em>, or </em><em>reference to a template file.
-Template reference are recognized by str ending in '.sql'</em>) &#8211; the sql code to be executed</li>
+</em><em>a list of str</em><em> (</em><em>sql statements</em><em>)</em><em>, or </em><em>reference to a template file.
+Template references are recognized by str ending in '.sql'</em>) – the sql code to be executed</li>
+<li><strong>database</strong> (<em>string</em>) – name of the database which overrides the one defined in the connection</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
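
A small MySqlOperator example using the database override documented above; the schema and retention query are assumptions:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.mysql_operator import MySqlOperator

    dag = DAG('mysql_example', start_date=datetime(2017, 1, 1), schedule_interval='@daily')

    purge_old_rows = MySqlOperator(
        task_id='purge_old_rows',
        mysql_conn_id='mysql_default',
        database='reporting',               # overrides the schema defined on the connection
        sql="DELETE FROM daily_events WHERE ds < '{{ macros.ds_add(ds, -90) }}'",
        dag=dag,
    )
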
+
+<dl class="class">
+<dt id="airflow.operators.MySqlToHiveTransfer">
+<em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">MySqlToHiveTransfer</code><span class="sig-paren">(</span><em>sql</em>, <em>hive_table</em>, <em>create=True</em>, <em>recreate=False</em>, <em>partition=None</em>, <em>delimiter=u'x01'</em>, <em>mysql_conn_id='mysql_default'</em>, <em>hive_cli_conn_id='hive_cli_default'</em>, <em>tblproperties=None</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/mysql_to_hive.html#MySqlToHiveTransfer"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.MySqlToHiveTransfer" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#airflow.models.BaseOperator" title="airflow.models.BaseOperator"><code class="xref py py-class docutils literal"><span class="pre">airflow.models.BaseOperator</span></code></a></p>
+<p>Moves data from MySQL to Hive. The operator runs your query against
+MySQL, stores the result in a local file, and then loads it into a Hive table.
+If the <code class="docutils literal"><span class="pre">create</span></code> or <code class="docutils literal"><span class="pre">recreate</span></code> arguments are set to <code class="docutils literal"><span class="pre">True</span></code>,
+<code class="docutils literal"><span class="pre">CREATE</span> <span class="pre">TABLE</span></code> and <code class="docutils literal"><span class="pre">DROP</span> <span class="pre">TABLE</span></code> statements are generated.
+Hive data types are inferred from the cursor’s metadata. Note that the
+table generated in Hive uses <code class="docutils literal"><span class="pre">STORED</span> <span class="pre">AS</span> <span class="pre">textfile</span></code>
+which isn’t the most efficient serialization format. If a
+large amount of data is loaded and/or if the table gets
+queried considerably, you may want to use this operator only to
+stage the data into a temporary table before loading it into its
+final destination using a <code class="docutils literal"><span class="pre">HiveOperator</span></code>.</p>
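+<p>For example, the staging pattern described above might be sketched as follows (the connection ids are the documented defaults; the query, table and partition are made up). The individual parameters are described in the table below.</p>
<div class="highlight-python"><div class="highlight"><pre>
from datetime import datetime

from airflow import DAG
from airflow.operators import MySqlToHiveTransfer

dag = DAG('example_mysql_to_hive', start_date=datetime(2017, 1, 1), schedule_interval='@daily')

# sql, hive_table and partition are templated, so {{ ds }} is rendered per run.
stage_orders = MySqlToHiveTransfer(
    task_id='stage_orders',
    sql="SELECT * FROM orders WHERE order_date = '{{ ds }}'",
    hive_table='staging.orders',
    partition={'ds': '{{ ds }}'},
    mysql_conn_id='mysql_default',
    hive_cli_conn_id='hive_cli_default',
    dag=dag)
</pre></div></div>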
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>sql</strong> (<em>str</em>) – SQL query to execute against the MySQL database</li>
+<li><strong>hive_table</strong> (<em>str</em>) – target Hive table, use dot notation to target a
+specific database</li>
+<li><strong>create</strong> (<em>bool</em>) – whether to create the table if it doesn’t exist</li>
+<li><strong>recreate</strong> (<em>bool</em>) – whether to drop and recreate the table at every
+execution</li>
+<li><strong>partition</strong> (<em>dict</em>) – target partition as a dict of partition columns
+and values</li>
+<li><strong>delimiter</strong> (<em>str</em>) – field delimiter in the file</li>
+<li><strong>mysql_conn_id</strong> (<em>str</em>) – source mysql connection</li>
+<li><strong>hive_cli_conn_id</strong> (<em>str</em>) – destination hive connection</li>
+<li><strong>tblproperties</strong> (<em>dict</em>) – TBLPROPERTIES of the hive table being created</li>
 </ul>
 </td>
 </tr>
@@ -703,14 +884,14 @@ Template reference are recognized by str ending in '.sql'</em>) &#8211; the sql
 <col class="field-body" />
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
-<li><strong>partition_names</strong> (<em>list of strings</em>) &#8211; List of fully qualified names of the
+<li><strong>partition_names</strong> (<em>list of strings</em>) – List of fully qualified names of the
 partitions to wait for. A fully qualified name is of the
 form <code class="docutils literal"><span class="pre">schema.table/pk1=pv1/pk2=pv2</span></code>, for example,
 default.users/ds=2016-01-01. This is passed as is to the metastore
 Thrift client <code class="docutils literal"><span class="pre">get_partitions_by_name</span></code> method. Note that
 you cannot use logical or comparison operators as in
 HivePartitionSensor.</li>
-<li><strong>metastore_conn_id</strong> (<em>str</em>) &#8211; reference to the metastore thrift service
+<li><strong>metastore_conn_id</strong> (<em>str</em>) – reference to the metastore thrift service
 connection id</li>
 </ul>
 </td>
@@ -720,6 +901,28 @@ connection id</li>
 </dd></dl>
 
 <dl class="class">
+<dt id="airflow.operators.PostgresOperator">
+<em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">PostgresOperator</code><span class="sig-paren">(</span><em>sql</em>, <em>postgres_conn_id='postgres_default'</em>, <em>autocommit=False</em>, <em>parameters=None</em>, <em>database=None</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/postgres_operator.html#PostgresOperator"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.PostgresOperator" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#airflow.models.BaseOperator" title="airflow.models.BaseOperator"><code class="xref py py-class docutils literal"><span class="pre">airflow.models.BaseOperator</span></code></a></p>
+<p>Executes sql code in a specific Postgres database</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>postgres_conn_id</strong> (<em>string</em>) – reference to a specific postgres database</li>
+<li><strong>sql</strong> (<em>Can receive a str representing a sql statement</em><em>,
+</em><em>a list of str</em><em> (</em><em>sql statements</em><em>)</em><em>, or </em><em>reference to a template file.
+Template references are recognized by str ending in '.sql'</em>) – the sql code to be executed</li>
+<li><strong>database</strong> (<em>string</em>) – name of the database which overrides the one defined in the connection</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
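+<p>A minimal sketch, assuming the default <code class="docutils literal"><span class="pre">postgres_default</span></code> connection and a hypothetical <code class="docutils literal"><span class="pre">staging_events</span></code> table; note that a list of statements is accepted and executed in order.</p>
<div class="highlight-python"><div class="highlight"><pre>
from datetime import datetime

from airflow import DAG
from airflow.operators import PostgresOperator

dag = DAG('example_postgres', start_date=datetime(2017, 1, 1), schedule_interval='@daily')

# Each statement in the list runs in order against the target database.
rotate_staging = PostgresOperator(
    task_id='rotate_staging',
    sql=[
        "DELETE FROM staging_events WHERE ds = '{{ ds }}'",
        "ANALYZE staging_events",
    ],
    postgres_conn_id='postgres_default',
    database='analytics',  # hypothetical database name
    dag=dag)
</pre></div></div>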
+
+<dl class="class">
 <dt id="airflow.operators.PrestoCheckOperator">
 <em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">PrestoCheckOperator</code><span class="sig-paren">(</span><em>sql</em>, <em>presto_conn_id='presto_default'</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/presto_check_operator.html#PrestoCheckOperator"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.PrestoCheckOperator" title="Permalink to this definition">¶</a></dt>
 <dd><p>Bases: <code class="xref py py-class docutils literal"><span class="pre">airflow.operators.check_operator.CheckOperator</span></code></p>
@@ -738,8 +941,8 @@ values return <code class="docutils literal"><span class="pre">False</span></cod
 <p>Given a query like <code class="docutils literal"><span class="pre">SELECT</span> <span class="pre">COUNT(*)</span> <span class="pre">FROM</span> <span class="pre">foo</span></code>, it will fail only if
 the count <code class="docutils literal"><span class="pre">==</span> <span class="pre">0</span></code>. You can craft a much more complex query that could,
 for instance, check that the table has the same number of rows as
-the source table upstream, or that the count of today&#8217;s partition is
-greater than yesterday&#8217;s partition, or that a set of metrics are less
+the source table upstream, or that the count of today’s partition is
+greater than yesterday’s partition, or that a set of metrics are less
 than 3 standard deviations from the 7 day average.</p>
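<p>A sketch of the simple row-count check described above, assuming a hypothetical partitioned <code class="docutils literal"><span class="pre">events</span></code> table and the default Presto connection:</p>
<div class="highlight-python"><div class="highlight"><pre>
from datetime import datetime

from airflow import DAG
from airflow.operators import PrestoCheckOperator

dag = DAG('example_presto_check', start_date=datetime(2017, 1, 1), schedule_interval='@daily')

# Fails if the day's partition is empty, blocking whatever is placed downstream.
check_events_not_empty = PrestoCheckOperator(
    task_id='check_events_not_empty',
    sql="SELECT COUNT(*) FROM events WHERE ds = '{{ ds }}'",
    presto_conn_id='presto_default',
    dag=dag)
</pre></div></div>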
 <p>This operator can be used as a data quality check in your pipeline, and
 depending on where you put it in your DAG, you have the choice to
@@ -751,8 +954,8 @@ without stopping the progress of the DAG.</p>
 <col class="field-body" />
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
-<li><strong>sql</strong> (<em>string</em>) &#8211; the sql to be executed</li>
-<li><strong>presto_conn_id</strong> (<em>string</em>) &#8211; reference to the Presto database</li>
+<li><strong>sql</strong> (<em>string</em>) – the sql to be executed</li>
+<li><strong>presto_conn_id</strong> (<em>string</em>) – reference to the Presto database</li>
 </ul>
 </td>
 </tr>
@@ -771,11 +974,11 @@ a certain tolerance of the ones from days_back before.</p>
 <col class="field-body" />
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
-<li><strong>table</strong> (<em>str</em>) &#8211; the table name</li>
-<li><strong>days_back</strong> (<em>int</em>) &#8211; number of days between ds and the ds we want to check
+<li><strong>table</strong> (<em>str</em>) – the table name</li>
+<li><strong>days_back</strong> (<em>int</em>) – number of days between ds and the ds we want to check
 against. Defaults to 7 days</li>
-<li><strong>metrics_threshold</strong> (<em>dict</em>) &#8211; a dictionary of ratios indexed by metrics</li>
-<li><strong>presto_conn_id</strong> (<em>string</em>) &#8211; reference to the Presto database</li>
+<li><strong>metrics_threshold</strong> (<em>dict</em>) – a dictionary of ratios indexed by metrics</li>
+<li><strong>presto_conn_id</strong> (<em>string</em>) – reference to the Presto database</li>
 </ul>
 </td>
 </tr>
@@ -793,8 +996,8 @@ against. Defaults to 7 days</li>
 <col class="field-body" />
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
-<li><strong>sql</strong> (<em>string</em>) &#8211; the sql to be executed</li>
-<li><strong>presto_conn_id</strong> (<em>string</em>) &#8211; reference to the Presto database</li>
+<li><strong>sql</strong> (<em>string</em>) – the sql to be executed</li>
+<li><strong>presto_conn_id</strong> (<em>string</em>) – reference to the Presto database</li>
 </ul>
 </td>
 </tr>
@@ -812,21 +1015,21 @@ against. Defaults to 7 days</li>
 <col class="field-body" />
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
-<li><strong>python_callable</strong> (<em>python callable</em>) &#8211; A reference to an object that is callable</li>
-<li><strong>op_kwargs</strong> (<em>dict</em>) &#8211; a dictionary of keyword arguments that will get unpacked
+<li><strong>python_callable</strong> (<em>python callable</em>) – A reference to an object that is callable</li>
+<li><strong>op_kwargs</strong> (<em>dict</em>) – a dictionary of keyword arguments that will get unpacked
 in your function</li>
-<li><strong>op_args</strong> (<em>list</em>) &#8211; a list of positional arguments that will get unpacked when
+<li><strong>op_args</strong> (<em>list</em>) – a list of positional arguments that will get unpacked when
 calling your callable</li>
-<li><strong>provide_context</strong> (<em>bool</em>) &#8211; if set to true, Airflow will pass a set of
+<li><strong>provide_context</strong> (<em>bool</em>) – if set to true, Airflow will pass a set of
 keyword arguments that can be used in your function. This set of
 kwargs correspond exactly to what you can use in your jinja
 templates. For this to work, you need to define <cite>**kwargs</cite> in your
 function header.</li>
-<li><strong>templates_dict</strong> (<em>dict of str</em>) &#8211; a dictionary where the values are templates that
+<li><strong>templates_dict</strong> (<em>dict of str</em>) – a dictionary where the values are templates that
 will get templated by the Airflow engine sometime between
 <code class="docutils literal"><span class="pre">__init__</span></code> and <code class="docutils literal"><span class="pre">execute</span></code> takes place and are made available
-in your callable&#8217;s context after the template has been applied</li>
-<li><strong>templates_exts</strong> &#8211; a list of file extensions to resolve while
+in your callable’s context after the template has been applied</li>
+<li><strong>templates_exts</strong> – a list of file extensions to resolve while
 processing templated fields, for examples <code class="docutils literal"><span class="pre">['.sql',</span> <span class="pre">'.hql']</span></code></li>
 </ul>
 </td>
@@ -837,7 +1040,7 @@ processing templated fields, for examples <code class="docutils literal"><span c
 
 <dl class="class">
 <dt id="airflow.operators.S3KeySensor">
-<em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">S3KeySensor</code><span class="sig-paren">(</span><em>bucket_key</em>, <em>bucket_name=None</em>, <em>wildcard_match=False</em>, <em>s3_conn_id='s3_default'</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/sensors.html#S3KeySensor"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.S3KeySensor" title="Permalink to this definition">¶</a></dt>
+<em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">S3KeySensor</code><span class="sig-paren">(</span><em>bucket_key</em>, <em>bucket_name=None</em>, <em>wildcard_match=False</em>, <em>aws_conn_id='aws_default'</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/sensors.html#S3KeySensor"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.S3KeySensor" title="Permalink to this definition">¶</a></dt>
 <dd><p>Bases: <a class="reference internal" href="#airflow.operators.sensors.BaseSensorOperator" title="airflow.operators.sensors.BaseSensorOperator"><code class="xref py py-class docutils literal"><span class="pre">sensors.BaseSensorOperator</span></code></a></p>
 <p>Waits for a key (a file-like instance on S3) to be present in a S3 bucket.
+S3 being a key/value store, it does not support folders. The path is just a key to
@@ -847,12 +1050,61 @@ a resource.</p>
 <col class="field-body" />
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
-<li><strong>bucket_key</strong> (<em>str</em>) &#8211; The key being waited on. Supports full s3:// style url
+<li><strong>bucket_key</strong> (<em>str</em>) – The key being waited on. Supports full s3:// style url
 or relative path from root level.</li>
-<li><strong>bucket_name</strong> (<em>str</em>) &#8211; Name of the S3 bucket</li>
-<li><strong>wildcard_match</strong> (<em>bool</em>) &#8211; whether the bucket_key should be interpreted as a
+<li><strong>bucket_name</strong> (<em>str</em>) – Name of the S3 bucket</li>
+<li><strong>wildcard_match</strong> (<em>bool</em>) – whether the bucket_key should be interpreted as a
 Unix wildcard pattern</li>
-<li><strong>s3_conn_id</strong> (<em>str</em>) &#8211; a reference to the s3 connection</li>
+<li><strong>aws_conn_id</strong> (<em>str</em>) – a reference to the s3 connection</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
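+<p>A minimal sketch, assuming the default <code class="docutils literal"><span class="pre">aws_default</span></code> connection and a made-up bucket and key prefix; <code class="docutils literal"><span class="pre">poke_interval</span></code> and <code class="docutils literal"><span class="pre">timeout</span></code> come from the sensor base class.</p>
<div class="highlight-python"><div class="highlight"><pre>
from datetime import datetime

from airflow import DAG
from airflow.operators import S3KeySensor

dag = DAG('example_s3_sensor', start_date=datetime(2017, 1, 1), schedule_interval='@daily')

# bucket_key is templated and, with wildcard_match=True, treated as a Unix glob.
wait_for_export = S3KeySensor(
    task_id='wait_for_export',
    bucket_key='s3://my-company-bucket/exports/{{ ds }}/*.csv',
    wildcard_match=True,
    aws_conn_id='aws_default',
    poke_interval=300,
    timeout=6 * 60 * 60,
    dag=dag)
</pre></div></div>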
+
+<dl class="class">
+<dt id="airflow.operators.S3ToHiveTransfer">
+<em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">S3ToHiveTransfer</code><span class="sig-paren">(</span><em>s3_key</em>, <em>field_dict</em>, <em>hive_table</em>, <em>delimiter='</em>, <em>'</em>, <em>create=True</em>, <em>recreate=False</em>, <em>partition=None</em>, <em>headers=False</em>, <em>check_headers=False</em>, <em>wildcard_match=False</em>, <em>aws_conn_id='aws_default'</em>, <em>hive_cli_conn_id='hive_cli_default'</em>, <em>input_compressed=False</em>, <em>tblproperties=None</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/s3_to_hive_operator.html#S3ToHiveTransfer"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.S3ToHiveTransfer" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#airflow.models.BaseOperator" title="airflow.models.BaseOperator"><code class="xref py py-class docutils literal"><span class="pre">airflow.models.BaseOperator</span></code></a></p>
+<p>Moves data from S3 to Hive. The operator downloads a file from S3,
+stores the file locally before loading it into a Hive table.
+If the <code class="docutils literal"><span class="pre">create</span></code> or <code class="docutils literal"><span class="pre">recreate</span></code> arguments are set to <code class="docutils literal"><span class="pre">True</span></code>,
+a <code class="docutils literal"><span class="pre">CREATE</span> <span class="pre">TABLE</span></code> and <code class="docutils literal"><span class="pre">DROP</span> <span class="pre">TABLE</span></code> statements are generated.
+Hive data types are inferred from the cursor’s metadata from.</p>
+<p>Note that the table generated in Hive uses <code class="docutils literal"><span class="pre">STORED</span> <span class="pre">AS</span> <span class="pre">textfile</span></code>
+which isn’t the most efficient serialization format. If a
+large amount of data is loaded and/or if the table gets
+queried considerably, you may want to use this operator only to
+stage the data into a temporary table before loading it into its
+final destination using a <code class="docutils literal"><span class="pre">HiveOperator</span></code>.</p>
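+<p>For example, loading a hypothetical daily CSV export into a partitioned staging table might be sketched as follows (the connection ids are the documented defaults; the key, columns and table are made up). The parameters are described in the table below.</p>
<div class="highlight-python"><div class="highlight"><pre>
from datetime import datetime

from airflow import DAG
from airflow.operators import S3ToHiveTransfer

dag = DAG('example_s3_to_hive', start_date=datetime(2017, 1, 1), schedule_interval='@daily')

# field_dict maps column names to Hive types; its order should match the file
# (use collections.OrderedDict if dict ordering is not guaranteed on your Python).
load_events = S3ToHiveTransfer(
    task_id='load_events',
    s3_key='s3://my-company-bucket/exports/{{ ds }}/events.csv',
    field_dict={'event_id': 'BIGINT', 'event_type': 'STRING', 'ts': 'STRING'},
    hive_table='staging.events',
    partition={'ds': '{{ ds }}'},
    headers=True,
    check_headers=True,
    delimiter=',',
    aws_conn_id='aws_default',
    hive_cli_conn_id='hive_cli_default',
    dag=dag)
</pre></div></div>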
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>s3_key</strong> (<em>str</em>) – The key to be retrieved from S3</li>
+<li><strong>field_dict</strong> (<em>dict</em>) – A dictionary of the field names in the file
+as keys and their Hive types as values</li>
+<li><strong>hive_table</strong> (<em>str</em>) – target Hive table, use dot notation to target a
+specific database</li>
+<li><strong>create</strong> (<em>bool</em>) – whether to create the table if it doesn’t exist</li>
+<li><strong>recreate</strong> (<em>bool</em>) – whether to drop and recreate the table at every
+execution</li>
+<li><strong>partition</strong> (<em>dict</em>) – target partition as a dict of partition columns
+and values</li>
+<li><strong>headers</strong> (<em>bool</em>) – whether the file contains column names on the first
+line</li>
+<li><strong>check_headers</strong> (<em>bool</em>) – whether the column names on the first line should be
+checked against the keys of field_dict</li>
+<li><strong>wildcard_match</strong> (<em>bool</em>) – whether the s3_key should be interpreted as a Unix
+wildcard pattern</li>
+<li><strong>delimiter</strong> (<em>str</em>) – field delimiter in the file</li>
+<li><strong>aws_conn_id</strong> (<em>str</em>) – source s3 connection</li>
+<li><strong>hive_cli_conn_id</strong> (<em>str</em>) – destination hive connection</li>
+<li><strong>input_compressed</strong> (<em>bool</em>) – Boolean to determine if file decompression is
+required to process headers</li>
+<li><strong>tblproperties</strong> (<em>dict</em>) – TBLPROPERTIES of the hive table being created</li>
 </ul>
 </td>
 </tr>
@@ -863,29 +1115,89 @@ Unix wildcard pattern</li>
 <dl class="class">
 <dt id="airflow.operators.ShortCircuitOperator">
 <em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">ShortCircuitOperator</code><span class="sig-paren">(</span><em>python_callable</em>, <em>op_args=None</em>, <em>op_kwargs=None</em>, <em>provide_context=False</em>, <em>templates_dict=None</em>, <em>templates_exts=None</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/python_operator.html#ShortCircuitOperator"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.ShortCircuitOperator" title="Permalink to this definition">¶</a></dt>
-<dd><p>Bases: <code class="xref py py-class docutils literal"><span class="pre">python_operator.PythonOperator</span></code></p>
+<dd><p>Bases: <code class="xref py py-class docutils literal"><span class="pre">python_operator.PythonOperator</span></code>, <code class="xref py py-class docutils literal"><span class="pre">airflow.models.SkipMixin</span></code></p>
 <p>Allows a workflow to continue only if a condition is met. Otherwise, the
-workflow &#8220;short-circuits&#8221; and downstream tasks are skipped.</p>
+workflow “short-circuits” and downstream tasks are skipped.</p>
 <p>The ShortCircuitOperator is derived from the PythonOperator. It evaluates a
 condition and short-circuits the workflow if the condition is False. Any
-downstream tasks are marked with a state of &#8220;skipped&#8221;. If the condition is
+downstream tasks are marked with a state of “skipped”. If the condition is
 True, downstream tasks proceed as normal.</p>
 <p>The condition is determined by the result of <cite>python_callable</cite>.</p>
 </dd></dl>
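<p>A minimal sketch: the callable below is hypothetical and simply skips the downstream task on weekends.</p>
<div class="highlight-python"><div class="highlight"><pre>
from datetime import datetime

from airflow import DAG
from airflow.operators import DummyOperator, ShortCircuitOperator

dag = DAG('example_short_circuit', start_date=datetime(2017, 1, 1), schedule_interval='@daily')

def is_weekday(**kwargs):
    # Returning False short-circuits the DAG: downstream tasks are skipped.
    return datetime.strptime(kwargs['ds'], '%Y-%m-%d').weekday() not in (5, 6)

only_on_weekdays = ShortCircuitOperator(
    task_id='only_on_weekdays',
    python_callable=is_weekday,
    provide_context=True,
    dag=dag)

run_report = DummyOperator(task_id='run_report', dag=dag)
only_on_weekdays >> run_report
</pre></div></div>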
 
 <dl class="class">
+<dt id="airflow.operators.SlackAPIOperator">
+<em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">SlackAPIOperator</code><span class="sig-paren">(</span><em>token='unset'</em>, <em>method='unset'</em>, <em>api_params=None</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/slack_operator.html#SlackAPIOperator"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.SlackAPIOperator" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#airflow.models.BaseOperator" title="airflow.models.BaseOperator"><code class="xref py py-class docutils literal"><span class="pre">airflow.models.BaseOperator</span></code></a></p>
+<p>Base Slack Operator.
+The SlackAPIPostOperator is derived from this operator.
+In the future, additional Slack API operators will be derived from this class as well.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>token</strong> (<em>string</em>) – Slack API token (<a class="reference external" href="https://api.slack.com/web">https://api.slack.com/web</a>)</li>
+<li><strong>method</strong> (<em>string</em>) – The Slack API Method to Call (<a class="reference external" href="https://api.slack.com/methods">https://api.slack.com/methods</a>)</li>
+<li><strong>api_params</strong> (<em>dict</em>) – API Method call parameters (<a class="reference external" href="https://api.slack.com/methods">https://api.slack.com/methods</a>)</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+<dl class="method">
+<dt id="airflow.operators.SlackAPIOperator.construct_api_call_params">
+<code class="descname">construct_api_call_params</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="reference internal" href="_modules/slack_operator.html#SlackAPIOperator.construct_api_call_params"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.SlackAPIOperator.construct_api_call_params" title="Permalink to this definition">¶</a></dt>
+<dd><p>Used by the execute function. Allows templating on the source fields of the api_call_params dict before construction.</p>
+<p>Override in child classes.
+Each SlackAPIOperator child class is responsible for having a construct_api_call_params function
+which sets self.api_call_params with a dict of API call parameters (<a class="reference external" href="https://api.slack.com/methods">https://api.slack.com/methods</a>)</p>
+</dd></dl>
+
+<dl class="method">
+<dt id="airflow.operators.SlackAPIOperator.execute">
+<code class="descname">execute</code><span class="sig-paren">(</span><em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/slack_operator.html#SlackAPIOperator.execute"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.SlackAPIOperator.execute" title="Permalink to this definition">¶</a></dt>
+<dd><p>SlackAPIOperator calls will not fail even if the call is unsuccessful;
+they should not prevent a DAG from completing successfully.</p>
+</dd></dl>
+
+</dd></dl>
+
+<dl class="class">
+<dt id="airflow.operators.SlackAPIPostOperator">
+<em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">SlackAPIPostOperator</code><span class="sig-paren">(</span><em>channel='#general'</em>, <em>username='Airflow'</em>, <em>text='No message has been set.nHere is a cat video insteadnhttps://www.youtube.com/watch?v=J---aiyznGQ'</em>, <em>icon_url='https://raw.githubusercontent.com/airbnb/airflow/master/airflow/www/static/pin_100.png'</em>, <em>attachments=None</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/slack_operator.html#SlackAPIPostOperator"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.SlackAPIPostOperator" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <code class="xref py py-class docutils literal"><span class="pre">slack_operator.SlackAPIOperator</span></code></p>
+<p>Posts messages to a slack channel</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>channel</strong> (<em>string</em>) – channel in which to post the message on Slack, by name (#general) or ID (C12318391)</li>
+<li><strong>username</strong> (<em>string</em>) – Username that airflow will be posting to Slack as</li>
+<li><strong>text</strong> (<em>string</em>) – message to send to slack</li>
+<li><strong>icon_url</strong> (<em>string</em>) – url to icon used for this message</li>
+<li><strong>attachments</strong> (<em>array of hashes</em>) – extra formatting details - see <a class="reference external" href="https://api.slack.com/docs/attachments">https://api.slack.com/docs/attachments</a></li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
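+<p>A minimal sketch, assuming the Slack API token is stored in an Airflow Variable named <code class="docutils literal"><span class="pre">slack_api_token</span></code> and that the target channel exists:</p>
<div class="highlight-python"><div class="highlight"><pre>
from datetime import datetime

from airflow import DAG
from airflow.models import Variable
from airflow.operators import SlackAPIPostOperator

dag = DAG('example_slack', start_date=datetime(2017, 1, 1), schedule_interval='@daily')

notify_slack = SlackAPIPostOperator(
    task_id='notify_slack',
    token=Variable.get('slack_api_token'),  # assumes such a Variable has been created
    channel='#data-pipeline',               # made-up channel
    username='Airflow',
    text='Daily pipeline finished.',
    dag=dag)
</pre></div></div>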
+
+<dl class="class">
 <dt id="airflow.operators.SqlSensor">
 <em class="property">class </em><code class="descclassname">airflow.operators.</code><code class="descname">SqlSensor</code><span class="sig-paren">(</span><em>conn_id</em>, <em>sql</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/sensors.html#SqlSensor"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflow.operators.SqlSensor" title="Permalink to this definition">¶</a></dt>
 <dd><p>Bases: <a class="reference internal" href="#airflow.operators.sensors.BaseSensorOperator" title="airflow.operators.sensors.BaseSensorOperator"><code class="xref py py-class docutils literal"><span class="pre">sensors.BaseSensorOperator</span></code></a></p>
-<p>Runs a sql statement until a criteria is met. It will keep trying until
-sql returns no row, or if the first cell in (0, &#8216;0&#8217;, &#8216;&#8217;).</p>
+<p>Runs a sql statement until a criterion is met. It will keep trying while
+the sql returns no row, or while the first cell is in (0, ‘0’, ‘’).</p>
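+<p>A minimal sketch, assuming a <code class="docutils literal"><span class="pre">mysql_default</span></code> connection and a hypothetical <code class="docutils literal"><span class="pre">load_flags</span></code> table that an upstream system populates:</p>
<div class="highlight-python"><div class="highlight"><pre>
from datetime import datetime

from airflow import DAG
from airflow.operators import SqlSensor

dag = DAG('example_sql_sensor', start_date=datetime(2017, 1, 1), schedule_interval='@daily')

# Keeps poking until the query returns a row whose first cell is truthy.
wait_for_load_flag = SqlSensor(
    task_id='wait_for_load_flag',
    conn_id='mysql_default',
    sql="SELECT COUNT(*) FROM load_flags WHERE ds = '{{ ds }}' AND ready = 1",
    poke_interval=300,
    dag=dag)
</pre></div></div>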
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name" />
 <col class="field-body" />
 <tbody valign="top">
 <tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
-<li><strong>conn_id</strong> (<em>string</em>) &#8211; The connection to run the sensor against</li>
-<li><strong>sql</strong> &#8211; The sql to run. To pass, it needs to return at least one cell
+<li><strong>conn_id</strong> (<em>string</em>) – The connection to run the sensor against</li>
+<li><strong>sql</strong> – The sql to run. To pass, it needs to return at least one cell
 that contains a non-zero / non-empty string value.</li>
 </ul>
 </td>
@@ -903,7 +1215,7 @@ that contains a non-zero / empty string value.</li>
 <col class="field-name" />
 <col class="field-body" />
 <tbody valign="top">
-<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>target_time</strong> (<em>datetime.time</em>) &#8211; time after which the job succeeds</td>
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>target_time</strong> (<em>datetime.time</em>) – time after which the job succeeds</td>
 </tr>
 </tbody>
 </table>
@@ -916,6 +1228,60 @@ that contains a non-zero / empty string value.</li>
 <p>Waits for a file or folder to land in HDFS</p>
 </dd></dl>
 
+<dl class="class">
+<dt id="airflow.operators.docker_operator.DockerOperator">
+<em class="property">class </em><code class="descclassname">airflow.operators.docker_operator.</code><code class="descname">DockerOperator</code><span class="sig-paren">(</span><em>image</em>, <em>api_version=None</em>, <em>command=None</em>, <em>cpus=1.0</em>, <em>docker_url='unix://var/run/docker.sock'</em>, <em>environment=None</em>, <em>force_pull=False</em>, <em>mem_limit=None</em>, <em>network_mode=None</em>, <em>tls_ca_cert=None</em>, <em>tls_client_cert=None</em>, <em>tls_client_key=None</em>, <em>tls_hostname=None</em>, <em>tls_ssl_version=None</em>, <em>tmp_dir='/tmp/airflow'</em>, <em>user=None</em>, <em>volumes=None</em>, <em>working_dir=None</em>, <em>xcom_push=False</em>, <em>xcom_all=False</em>, <em>docker_conn_id=None</em>, <em>*args</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/airflow/operators/docker_operator.html#DockerOperator"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#airflo
 w.operators.docker_operator.DockerOperator" title="Permalink to this definition">¶</a></dt>
+<dd><p>Execute a command inside a docker container.</p>
+<p>A temporary directory is created on the host and mounted into a container to allow storing files
+that together exceed the default disk size of 10GB in a container. The path to the mounted
+directory can be accessed via the environment variable <code class="docutils literal"><span class="pre">AIRFLOW_TMP_DIR</span></code>.</p>
+<p>If a login to a private registry is required prior to pulling the image, a
+Docker connection needs to be configured in Airflow and its connection ID
+provided with the parameter <code class="docutils literal"><span class="pre">docker_conn_id</span></code>.</p>
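+<p>A minimal sketch, using a made-up image and command; <code class="docutils literal"><span class="pre">docker_conn_id</span></code> is only needed when the image lives in a private registry. The parameters are described in the table below.</p>
<div class="highlight-python"><div class="highlight"><pre>
from datetime import datetime

from airflow import DAG
from airflow.operators.docker_operator import DockerOperator

dag = DAG('example_docker', start_date=datetime(2017, 1, 1), schedule_interval='@daily')

# The command is templated; the job can write its output under $AIRFLOW_TMP_DIR.
containerized_etl = DockerOperator(
    task_id='containerized_etl',
    image='my-registry.example.com/etl:latest',
    api_version='auto',
    command='python /app/etl.py --date {{ ds }}',
    environment={'TARGET_ENV': 'prod'},
    docker_url='unix://var/run/docker.sock',
    docker_conn_id='docker_registry',  # hypothetical connection to the private registry
    dag=dag)
</pre></div></div>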
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>image</strong> (<em>str</em>) – Docker image from which to create the container.</li>
+<li><strong>api_version</strong> (<em>str</em>) – Remote API version. Set to <code class="docutils literal"><span class="pre">auto</span></code> to automatically
+detect the server’s version.</li>
+<li><strong>command</strong> (<em>str</em><em> or </em><em>list</em>) – Command to be run in the container.</li>
+<li><strong>cpus</strong> (<em>float</em>) – Number of CPUs to assign to the container.
+This value gets multiplied by 1024. See
+<a class="reference external" href="https://docs.docker.com/engine/reference/run/#cpu-share-constraint">https://docs.docker.com/engine/reference/run/#cpu-share-constraint</a></li>
+<li><strong>docker_url</strong> (<em>str</em>) – URL of the host running the docker daemon.
+Default is unix://var/run/docker.sock</li>
+<li><strong>environment</strong> (<em>dict</em>) – Environment variables to set in the container.</li>
+<li><strong>force_pull</strong> (<em>bool</em>) – Pull the docker image on every run. Default is false.</li>
+<li><strong>mem_limit</strong> (<em>float</em><em> or </em><em>str</em>) – Maximum amount of memory the container can use. Either a float value, which
+represents the limit in bytes, or a string like <code class="docutils literal"><span class="pre">128m</span></code> or <code class="docutils literal"><span class="pre">1g</span></code>.</li>
+<li><strong>network_mode</strong> (<em>str</em>) – Network mode for the container.</li>
+<li><strong

<TRUNCATED>