Posted to commits@arrow.apache.org by uw...@apache.org on 2017/11/19 14:56:15 UTC

[02/19] arrow-site git commit: API doc update

http://git-wip-us.apache.org/repos/asf/arrow-site/blob/35611f84/docs/python/pandas.html
----------------------------------------------------------------------
diff --git a/docs/python/pandas.html b/docs/python/pandas.html
index bf84a76..65d6221 100644
--- a/docs/python/pandas.html
+++ b/docs/python/pandas.html
@@ -158,7 +158,7 @@ supports flat columns, the Table also provides nested columns, thus it can
 represent more data than a DataFrame, so a full conversion is not always possible.</p>
 <p>Conversion from a Table to a DataFrame is done by calling
 <code class="xref py py-meth docutils literal"><span class="pre">pyarrow.table.Table.to_pandas()</span></code>. The inverse is then achieved by using
-<code class="xref py py-meth docutils literal"><span class="pre">pyarrow.Table.from_pandas()</span></code>. This conversion routine provides the
+<a class="reference internal" href="generated/pyarrow.Table.html#pyarrow.Table.from_pandas" title="pyarrow.Table.from_pandas"><code class="xref py py-meth docutils literal"><span class="pre">pyarrow.Table.from_pandas()</span></code></a>. This conversion routine provides the
 convience parameter <code class="docutils literal"><span class="pre">timestamps_to_ms</span></code>. Although Arrow supports timestamps of
 different resolutions, pandas only supports nanosecond timestamps and most
 other systems (e.g. Parquet) only work on millisecond timestamps. This parameter
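
Note on the hunk above: its context lines contrast pandas' nanosecond timestamps with the millisecond timestamps used by systems such as Parquet. The sketch below only illustrates what that coarsening loses. It is not pyarrow's implementation, and `ns_to_ms` is a hypothetical helper name introduced for this note.

```python
from datetime import datetime, timezone

NS_PER_MS = 1_000_000  # nanoseconds in one millisecond


def ns_to_ms(ts_ns: int) -> int:
    """Coarsen a nanosecond epoch timestamp to millisecond resolution.

    Hypothetical helper for illustration only; it is not part of pyarrow.
    Floor division drops the sub-millisecond digits.
    """
    return ts_ns // NS_PER_MS


# 2017-11-19 14:56:15.123456789 UTC as nanoseconds since the epoch.
seconds = int(datetime(2017, 11, 19, 14, 56, 15, tzinfo=timezone.utc).timestamp())
ts_ns = seconds * 1_000_000_000 + 123_456_789

ts_ms = ns_to_ms(ts_ns)
lost_ns = ts_ns - ts_ms * NS_PER_MS  # part the coarser unit cannot represent

print(ts_ms % 1000)  # millisecond component that survives the conversion
print(lost_ns)       # sub-millisecond remainder that is dropped
```

The conversion is lossy in one direction only: a millisecond value converts back to nanoseconds exactly, but the dropped remainder cannot be recovered.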

http://git-wip-us.apache.org/repos/asf/arrow-site/blob/35611f84/docs/python/parquet.html
----------------------------------------------------------------------
diff --git a/docs/python/parquet.html b/docs/python/parquet.html
index d0be86a..313b7b7 100644
--- a/docs/python/parquet.html
+++ b/docs/python/parquet.html
@@ -164,19 +164,6 @@ and writing Parquet files with pandas as well.</p>
 <p>If you installed <code class="docutils literal"><span class="pre">pyarrow</span></code> with pip or conda, it should be built with Parquet
 support bundled:</p>
 <div class="highlight-ipython"><div class="highlight"><pre><span></span><span class="gp">In [1]: </span><span class="kn">import</span> <span class="nn">pyarrow.parquet</span> <span class="kn">as</span> <span class="nn">pq</span>
-<span class="gt">---------------------------------------------------------------------------</span>
-<span class="ne">ImportError</span><span class="g g-Whitespace">                               </span>Traceback (most recent call last)
-<span class="nn">&lt;ipython-input-1-dc8a4f7832af&gt;</span> in <span class="ni">&lt;module&gt;</span><span class="nt">()</span>
-<span class="ne">----&gt; </span><span class="mi">1</span> <span class="kn">import</span> <span class="nn">pyarrow.parquet</span> <span class="kn">as</span> <span class="nn">pq</span>
-
-<span class="nn">~apache-arrow/arrow/python/pyarrow/__init__.py</span> in <span class="ni">&lt;module&gt;</span><span class="nt">()</span>
-<span class="g g-Whitespace">     </span><span class="mi">30</span> 
-<span class="g g-Whitespace">     </span><span class="mi">31</span> 
-<span class="ne">---&gt; </span><span class="mi">32</span> <span class="kn">from</span> <span class="nn">pyarrow.lib</span> <span class="kn">import</span> <span class="n">cpu_count</span><span class="p">,</span> <span class="n">set_cpu_count</span>
-<span class="g g-Whitespace">     </span><span class="mi">33</span> <span class="kn">from</span> <span class="nn">pyarrow.lib</span> <span class="kn">import</span> <span class="p">(</span><span class="n">null</span><span class="p">,</span> <span class="n">bool_</span><span class="p">,</span>
-<span class="g g-Whitespace">     </span><span class="mi">34</span>                          <span class="n">int8</span><span class="p">,</span> <span class="n">int16</span><span class="p">,</span> <span class="n">int32</span><span class="p">,</span> <span class="n">int64</span><span class="p">,</span>
-
-<span class="ne">ImportError</span>: libarrow.so.0: cannot open shared object file: No such file or directory
 </pre></div>
 </div>
 <p>If you are building <code class="docutils literal"><span class="pre">pyarrow</span></code> from source, you must also build <a class="reference external" href="http://github.com/apache/parquet-cpp">parquet-cpp</a> and enable the Parquet extensions when
@@ -185,7 +172,7 @@ details.</p>
 </div>
 <div class="section" id="reading-and-writing-single-files">
 <h2>Reading and Writing Single Files<a class="headerlink" href="#reading-and-writing-single-files" title="Permalink to this headline">¶</a></h2>
-<p>The functions <code class="xref py py-func docutils literal"><span class="pre">read_table()</span></code> and <code class="xref py py-func docutils literal"><span class="pre">write_table()</span></code>
+<p>The functions <a class="reference internal" href="generated/pyarrow.parquet.read_table.html#pyarrow.parquet.read_table" title="pyarrow.parquet.read_table"><code class="xref py py-func docutils literal"><span class="pre">read_table()</span></code></a> and <a class="reference internal" href="generated/pyarrow.parquet.write_table.html#pyarrow.parquet.write_table" title="pyarrow.parquet.write_table"><code class="xref py py-func docutils literal"><span class="pre">write_table()</span></code></a>
 read and write the <a class="reference internal" href="data.html#data-table"><span class="std std-ref">pyarrow.Table</span></a> objects, respectively.</p>
 <p>Let’s look at a simple table:</p>
 <div class="highlight-ipython"><div class="highlight"><pre><span></span><span class="gp">In [2]: </span><span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
@@ -193,19 +180,6 @@ read and write the <a class="reference internal" href="data.html#data-table"><sp
 <span class="gp">In [3]: </span><span class="kn">import</span> <span class="nn">pandas</span> <span class="kn">as</span> <span class="nn">pd</span>
 
 <span class="gp">In [4]: </span><span class="kn">import</span> <span class="nn">pyarrow</span> <span class="kn">as</span> <span class="nn">pa</span>
-<span class="gt">---------------------------------------------------------------------------</span>
-<span class="ne">ImportError</span><span class="g g-Whitespace">                               </span>Traceback (most recent call last)
-<span class="nn">&lt;ipython-input-4-852643f3aad4&gt;</span> in <span class="ni">&lt;module&gt;</span><span class="nt">()</span>
-<span class="ne">----&gt; </span><span class="mi">1</span> <span class="kn">import</span> <span class="nn">pyarrow</span> <span class="kn">as</span> <span class="nn">pa</span>
-
-<span class="nn">~apache-arrow/arrow/python/pyarrow/__init__.py</span> in <span class="ni">&lt;module&gt;</span><span class="nt">()</span>
-<span class="g g-Whitespace">     </span><span class="mi">30</span> 
-<span class="g g-Whitespace">     </span><span class="mi">31</span> 
-<span class="ne">---&gt; </span><span class="mi">32</span> <span class="kn">from</span> <span class="nn">pyarrow.lib</span> <span class="kn">import</span> <span class="n">cpu_count</span><span class="p">,</span> <span class="n">set_cpu_count</span>
-<span class="g g-Whitespace">     </span><span class="mi">33</span> <span class="kn">from</span> <span class="nn">pyarrow.lib</span> <span class="kn">import</span> <span class="p">(</span><span class="n">null</span><span class="p">,</span> <span class="n">bool_</span><span class="p">,</span>
-<span class="g g-Whitespace">     </span><span class="mi">34</span>                          <span class="n">int8</span><span class="p">,</span> <span class="n">int16</span><span class="p">,</span> <span class="n">int32</span><span class="p">,</span> <span class="n">int64</span><span class="p">,</span>
-
-<span class="ne">ImportError</span>: libarrow.so.0: cannot open shared object file: No such file or directory
 
 <span class="gp">In [5]: </span><span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">({</span><span class="s1">&#39;one&#39;</span><span class="p">:</span> <span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="mf">2.5</span><span class="p">],</span>
 <span class="gp">   ...: </span>                   <span class="s1">&#39;two&#39;</span><span class="p">:</span> <span class="p">[</span><span class="s1">&#39;foo&#39;</span><span class="p">,</span> <span class="s1">&#39;bar&#39;</span><span class="p">,</span> <span class="s1">&#39;baz&#39;</span><span class="p">],</span>
@@ -213,68 +187,45 @@ read and write the <a class="reference internal" href="data.html#data-table"><sp
 <span class="gp">   ...: </span>
 
 <span class="gp">In [6]: </span><span class="n">table</span> <span class="o">=</span> <span class="n">pa</span><span class="o">.</span><span class="n">Table</span><span class="o">.</span><span class="n">from_pandas</span><span class="p">(</span><span class="n">df</span><span class="p">)</span>
-<span class="gt">---------------------------------------------------------------------------</span>
-<span class="ne">NameError</span><span class="g g-Whitespace">                                 </span>Traceback (most recent call last)
-<span class="nn">&lt;ipython-input-6-0c992c881c53&gt;</span> in <span class="ni">&lt;module&gt;</span><span class="nt">()</span>
-<span class="ne">----&gt; </span><span class="mi">1</span> <span class="n">table</span> <span class="o">=</span> <span class="n">pa</span><span class="o">.</span><span class="n">Table</span><span class="o">.</span><span class="n">from_pandas</span><span class="p">(</span><span class="n">df</span><span class="p">)</span>
-
-<span class="ne">NameError</span>: name &#39;pa&#39; is not defined
 </pre></div>
 </div>
 <p>We write this to Parquet format with <code class="docutils literal"><span class="pre">write_table</span></code>:</p>
 <div class="highlight-ipython"><div class="highlight"><pre><span></span><span class="gp">In [7]: </span><span class="kn">import</span> <span class="nn">pyarrow.parquet</span> <span class="kn">as</span> <span class="nn">pq</span>
-<span class="gt">---------------------------------------------------------------------------</span>
-<span class="ne">ImportError</span><span class="g g-Whitespace">                               </span>Traceback (most recent call last)
-<span class="nn">&lt;ipython-input-7-dc8a4f7832af&gt;</span> in <span class="ni">&lt;module&gt;</span><span class="nt">()</span>
-<span class="ne">----&gt; </span><span class="mi">1</span> <span class="kn">import</span> <span class="nn">pyarrow.parquet</span> <span class="kn">as</span> <span class="nn">pq</span>
-
-<span class="nn">~apache-arrow/arrow/python/pyarrow/__init__.py</span> in <span class="ni">&lt;module&gt;</span><span class="nt">()</span>
-<span class="g g-Whitespace">     </span><span class="mi">30</span> 
-<span class="g g-Whitespace">     </span><span class="mi">31</span> 
-<span class="ne">---&gt; </span><span class="mi">32</span> <span class="kn">from</span> <span class="nn">pyarrow.lib</span> <span class="kn">import</span> <span class="n">cpu_count</span><span class="p">,</span> <span class="n">set_cpu_count</span>
-<span class="g g-Whitespace">     </span><span class="mi">33</span> <span class="kn">from</span> <span class="nn">pyarrow.lib</span> <span class="kn">import</span> <span class="p">(</span><span class="n">null</span><span class="p">,</span> <span class="n">bool_</span><span class="p">,</span>
-<span class="g g-Whitespace">     </span><span class="mi">34</span>                          <span class="n">int8</span><span class="p">,</span> <span class="n">int16</span><span class="p">,</span> <span class="n">int32</span><span class="p">,</span> <span class="n">int64</span><span class="p">,</span>
-
-<span class="ne">ImportError</span>: libarrow.so.0: cannot open shared object file: No such file or directory
 
 <span class="gp">In [8]: </span><span class="n">pq</span><span class="o">.</span><span class="n">write_table</span><span class="p">(</span><span class="n">table</span><span class="p">,</span> <span class="s1">&#39;example.parquet&#39;</span><span class="p">)</span>
-<span class="go">---------------------------------------------------------------------------</span>
-<span class="go">NameError                                 Traceback (most recent call last)</span>
-<span class="go">&lt;ipython-input-8-26f49370facd&gt; in &lt;module&gt;()</span>
-<span class="go">----&gt; 1 pq.write_table(table, &#39;example.parquet&#39;)</span>
-
-<span class="go">NameError: name &#39;pq&#39; is not defined</span>
 </pre></div>
 </div>
 <p>This creates a single Parquet file. In practice, a Parquet dataset may consist
 of many files in many directories. We can read a single file back with
 <code class="docutils literal"><span class="pre">read_table</span></code>:</p>
 <div class="highlight-ipython"><div class="highlight"><pre><span></span><span class="gp">In [9]: </span><span class="n">table2</span> <span class="o">=</span> <span class="n">pq</span><span class="o">.</span><span class="n">read_table</span><span class="p">(</span><span class="s1">&#39;example.parquet&#39;</span><span class="p">)</span>
-<span class="gt">---------------------------------------------------------------------------</span>
-<span class="ne">NameError</span><span class="g g-Whitespace">                                 </span>Traceback (most recent call last)
-<span class="nn">&lt;ipython-input-9-4549b0e97541&gt;</span> in <span class="ni">&lt;module&gt;</span><span class="nt">()</span>
-<span class="ne">----&gt; </span><span class="mi">1</span> <span class="n">table2</span> <span class="o">=</span> <span class="n">pq</span><span class="o">.</span><span class="n">read_table</span><span class="p">(</span><span class="s1">&#39;example.parquet&#39;</span><span class="p">)</span>
-
-<span class="ne">NameError</span>: name &#39;pq&#39; is not defined
 
 <span class="gp">In [10]: </span><span class="n">table2</span><span class="o">.</span><span class="n">to_pandas</span><span class="p">()</span>
-<span class="go">---------------------------------------------------------------------------</span>
-<span class="go">NameError                                 Traceback (most recent call last)</span>
-<span class="go">&lt;ipython-input-10-a643630fbee7&gt; in &lt;module&gt;()</span>
-<span class="go">----&gt; 1 table2.to_pandas()</span>
-
-<span class="go">NameError: name &#39;table2&#39; is not defined</span>
+<span class="gh">Out[10]: </span><span class="go"></span>
+<span class="go">   one  three  two</span>
+<span class="go">0 -1.0   True  foo</span>
+<span class="go">1  NaN  False  bar</span>
+<span class="go">2  2.5   True  baz</span>
 </pre></div>
 </div>
 <p>You can pass a subset of columns to read, which can be much faster than reading
 the whole file (due to the columnar layout):</p>
 <div class="highlight-ipython"><div class="highlight"><pre><span></span><span class="gp">In [11]: </span><span class="n">pq</span><span class="o">.</span><span class="n">read_table</span><span class="p">(</span><span class="s1">&#39;example.parquet&#39;</span><span class="p">,</span> <span class="n">columns</span><span class="o">=</span><span class="p">[</span><span class="s1">&#39;one&#39;</span><span class="p">,</span> <span class="s1">&#39;three&#39;</span><span class="p">])</span>
-<span class="gt">---------------------------------------------------------------------------</span>
-<span class="ne">NameError</span><span class="g g-Whitespace">                                 </span>Traceback (most recent call last)
-<span class="nn">&lt;ipython-input-11-09e0d0d1ccc2&gt;</span> in <span class="ni">&lt;module&gt;</span><span class="nt">()</span>
-<span class="ne">----&gt; </span><span class="mi">1</span> <span class="n">pq</span><span class="o">.</span><span class="n">read_table</span><span class="p">(</span><span class="s1">&#39;example.parquet&#39;</span><span class="p">,</span> <span class="n">columns</span><span class="o">=</span><span class="p">[</span><span class="s1">&#39;one&#39;</span><span class="p">,</span> <span class="s1">&#39;three&#39;</span><span class="p">])</span>
-
-<span class="ne">NameError</span>: name &#39;pq&#39; is not defined
+<span class="gh">Out[11]: </span><span class="go"></span>
+<span class="go">pyarrow.Table</span>
+<span class="go">one: double</span>
+<span class="go">three: bool</span>
+<span class="go">metadata</span>
+<span class="gt">--------</span>
+<span class="p">{</span><span class="sa">b</span><span class="s1">&#39;pandas&#39;</span><span class="p">:</span> <span class="sa">b</span><span class="s1">&#39;{&quot;index_columns&quot;: [&quot;__index_level_0__&quot;], &quot;column_indexes&quot;: [{&quot;na&#39;</span>
+            <span class="sa">b</span><span class="s1">&#39;me&quot;: null, &quot;pandas_type&quot;: &quot;string&quot;, &quot;numpy_type&quot;: &quot;object&quot;, &quot;met&#39;</span>
+            <span class="sa">b</span><span class="s1">&#39;adata&quot;: null}], &quot;columns&quot;: [{&quot;name&quot;: &quot;one&quot;, &quot;pandas_type&quot;: &quot;floa&#39;</span>
+            <span class="sa">b</span><span class="s1">&#39;t64&quot;, &quot;numpy_type&quot;: &quot;float64&quot;, &quot;metadata&quot;: null}, {&quot;name&quot;: &quot;thre&#39;</span>
+            <span class="sa">b</span><span class="s1">&#39;e&quot;, &quot;pandas_type&quot;: &quot;bool&quot;, &quot;numpy_type&quot;: &quot;bool&quot;, &quot;metadata&quot;: nul&#39;</span>
+            <span class="sa">b</span><span class="s1">&#39;l}, {&quot;name&quot;: &quot;two&quot;, &quot;pandas_type&quot;: &quot;unicode&quot;, &quot;numpy_type&quot;: &quot;obj&#39;</span>
+            <span class="sa">b</span><span class="s1">&#39;ect&quot;, &quot;metadata&quot;: null}, {&quot;name&quot;: null, &quot;pandas_type&quot;: &quot;int64&quot;, &#39;</span>
+            <span class="sa">b</span><span class="s1">&#39;&quot;numpy_type&quot;: &quot;int64&quot;, &quot;metadata&quot;: null}], &quot;pandas_version&quot;: &quot;0.&#39;</span>
+            <span class="sa">b</span><span class="s1">&#39;21.0&quot;}&#39;</span><span class="p">}</span>
 </pre></div>
 </div>
 <p>We need not use a string to specify the origin of the file. It can be any of:</p>
@@ -289,30 +240,26 @@ maps) will perform the best.</p>
 </div>
 <div class="section" id="finer-grained-reading-and-writing">
 <h2>Finer-grained Reading and Writing<a class="headerlink" href="#finer-grained-reading-and-writing" title="Permalink to this headline">¶</a></h2>
-<p><code class="docutils literal"><span class="pre">read_table</span></code> uses the <code class="xref py py-class docutils literal"><span class="pre">ParquetFile</span></code> class, which has other features:</p>
+<p><code class="docutils literal"><span class="pre">read_table</span></code> uses the <a class="reference internal" href="generated/pyarrow.parquet.ParquetFile.html#pyarrow.parquet.ParquetFile" title="pyarrow.parquet.ParquetFile"><code class="xref py py-class docutils literal"><span class="pre">ParquetFile</span></code></a> class, which has other features:</p>
 <div class="highlight-ipython"><div class="highlight"><pre><span></span><span class="gp">In [12]: </span><span class="n">parquet_file</span> <span class="o">=</span> <span class="n">pq</span><span class="o">.</span><span class="n">ParquetFile</span><span class="p">(</span><span class="s1">&#39;example.parquet&#39;</span><span class="p">)</span>
-<span class="gt">---------------------------------------------------------------------------</span>
-<span class="ne">NameError</span><span class="g g-Whitespace">                                 </span>Traceback (most recent call last)
-<span class="nn">&lt;ipython-input-12-7695c1e2d638&gt;</span> in <span class="ni">&lt;module&gt;</span><span class="nt">()</span>
-<span class="ne">----&gt; </span><span class="mi">1</span> <span class="n">parquet_file</span> <span class="o">=</span> <span class="n">pq</span><span class="o">.</span><span class="n">ParquetFile</span><span class="p">(</span><span class="s1">&#39;example.parquet&#39;</span><span class="p">)</span>
-
-<span class="ne">NameError</span>: name &#39;pq&#39; is not defined
 
 <span class="gp">In [13]: </span><span class="n">parquet_file</span><span class="o">.</span><span class="n">metadata</span>
-<span class="go">---------------------------------------------------------------------------</span>
-<span class="go">NameError                                 Traceback (most recent call last)</span>
-<span class="go">&lt;ipython-input-13-86e921dc15b9&gt; in &lt;module&gt;()</span>
-<span class="go">----&gt; 1 parquet_file.metadata</span>
-
-<span class="go">NameError: name &#39;parquet_file&#39; is not defined</span>
+<span class="gh">Out[13]: </span><span class="go"></span>
+<span class="go">&lt;pyarrow._parquet.FileMetaData object at 0x7f7f6480cef8&gt;</span>
+<span class="go">  created_by: parquet-cpp version 1.3.2-SNAPSHOT</span>
+<span class="go">  num_columns: 4</span>
+<span class="go">  num_rows: 3</span>
+<span class="go">  num_row_groups: 1</span>
+<span class="go">  format_version: 1.0</span>
+<span class="go">  serialized_size: 938</span>
 
 <span class="gp">In [14]: </span><span class="n">parquet_file</span><span class="o">.</span><span class="n">schema</span>
-<span class="go">---------------------------------------------------------------------------</span>
-<span class="go">NameError                                 Traceback (most recent call last)</span>
-<span class="go">&lt;ipython-input-14-e603c6d2ddb9&gt; in &lt;module&gt;()</span>
-<span class="go">----&gt; 1 parquet_file.schema</span>
-
-<span class="go">NameError: name &#39;parquet_file&#39; is not defined</span>
+<span class="go">Out[14]: </span>
+<span class="go">&lt;pyarrow._parquet.ParquetSchema object at 0x7f7f650b74c8&gt;</span>
+<span class="go">one: DOUBLE</span>
+<span class="go">three: BOOLEAN</span>
+<span class="go">two: BYTE_ARRAY UTF8</span>
+<span class="go">__index_level_0__: INT64</span>
 </pre></div>
 </div>
 <p>As you can learn more in the <a class="reference external" href="https://github.com/apache/parquet-format">Apache Parquet format</a>, a Parquet file consists of
@@ -320,67 +267,42 @@ multiple row groups. <code class="docutils literal"><span class="pre">read_table
 concatenate them into a single table. You can read individual row groups with
 <code class="docutils literal"><span class="pre">read_row_group</span></code>:</p>
 <div class="highlight-ipython"><div class="highlight"><pre><span></span><span class="gp">In [15]: </span><span class="n">parquet_file</span><span class="o">.</span><span class="n">num_row_groups</span>
-<span class="gt">---------------------------------------------------------------------------</span>
-<span class="ne">NameError</span><span class="g g-Whitespace">                                 </span>Traceback (most recent call last)
-<span class="nn">&lt;ipython-input-15-0f5225406313&gt;</span> in <span class="ni">&lt;module&gt;</span><span class="nt">()</span>
-<span class="ne">----&gt; </span><span class="mi">1</span> <span class="n">parquet_file</span><span class="o">.</span><span class="n">num_row_groups</span>
-
-<span class="ne">NameError</span>: name &#39;parquet_file&#39; is not defined
+<span class="gh">Out[15]: </span><span class="go">1</span>
 
 <span class="gp">In [16]: </span><span class="n">parquet_file</span><span class="o">.</span><span class="n">read_row_group</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
-<span class="go">---------------------------------------------------------------------------</span>
-<span class="go">NameError                                 Traceback (most recent call last)</span>
-<span class="go">&lt;ipython-input-16-f2750315d6f8&gt; in &lt;module&gt;()</span>
-<span class="go">----&gt; 1 parquet_file.read_row_group(0)</span>
-
-<span class="go">NameError: name &#39;parquet_file&#39; is not defined</span>
+<span class="go">Out[16]: </span>
+<span class="go">pyarrow.Table</span>
+<span class="go">one: double</span>
+<span class="go">three: bool</span>
+<span class="go">two: string</span>
+<span class="go">__index_level_0__: int64</span>
+<span class="go">metadata</span>
+<span class="gt">--------</span>
+<span class="p">{</span><span class="sa">b</span><span class="s1">&#39;pandas&#39;</span><span class="p">:</span> <span class="sa">b</span><span class="s1">&#39;{&quot;index_columns&quot;: [&quot;__index_level_0__&quot;], &quot;column_indexes&quot;: [{&quot;na&#39;</span>
+            <span class="sa">b</span><span class="s1">&#39;me&quot;: null, &quot;pandas_type&quot;: &quot;string&quot;, &quot;numpy_type&quot;: &quot;object&quot;, &quot;met&#39;</span>
+            <span class="sa">b</span><span class="s1">&#39;adata&quot;: null}], &quot;columns&quot;: [{&quot;name&quot;: &quot;one&quot;, &quot;pandas_type&quot;: &quot;floa&#39;</span>
+            <span class="sa">b</span><span class="s1">&#39;t64&quot;, &quot;numpy_type&quot;: &quot;float64&quot;, &quot;metadata&quot;: null}, {&quot;name&quot;: &quot;thre&#39;</span>
+            <span class="sa">b</span><span class="s1">&#39;e&quot;, &quot;pandas_type&quot;: &quot;bool&quot;, &quot;numpy_type&quot;: &quot;bool&quot;, &quot;metadata&quot;: nul&#39;</span>
+            <span class="sa">b</span><span class="s1">&#39;l}, {&quot;name&quot;: &quot;two&quot;, &quot;pandas_type&quot;: &quot;unicode&quot;, &quot;numpy_type&quot;: &quot;obj&#39;</span>
+            <span class="sa">b</span><span class="s1">&#39;ect&quot;, &quot;metadata&quot;: null}, {&quot;name&quot;: null, &quot;pandas_type&quot;: &quot;int64&quot;, &#39;</span>
+            <span class="sa">b</span><span class="s1">&#39;&quot;numpy_type&quot;: &quot;int64&quot;, &quot;metadata&quot;: null}], &quot;pandas_version&quot;: &quot;0.&#39;</span>
+            <span class="sa">b</span><span class="s1">&#39;21.0&quot;}&#39;</span><span class="p">}</span>
 </pre></div>
 </div>
 <p>We can similarly write a Parquet file with multiple row groups by using
 <code class="docutils literal"><span class="pre">ParquetWriter</span></code>:</p>
 <div class="highlight-ipython"><div class="highlight"><pre><span></span><span class="gp">In [17]: </span><span class="n">writer</span> <span class="o">=</span> <span class="n">pq</span><span class="o">.</span><span class="n">ParquetWriter</span><span class="p">(</span><span class="s1">&#39;example2.parquet&#39;</span><span class="p">,</span> <span class="n">table</span><span class="o">.</span><span class="n">schema</span><span class="p">)</span>
-<span class="gt">---------------------------------------------------------------------------</span>
-<span class="ne">NameError</span><span class="g g-Whitespace">                                 </span>Traceback (most recent call last)
-<span class="nn">&lt;ipython-input-17-2128d611204d&gt;</span> in <span class="ni">&lt;module&gt;</span><span class="nt">()</span>
-<span class="ne">----&gt; </span><span class="mi">1</span> <span class="n">writer</span> <span class="o">=</span> <span class="n">pq</span><span class="o">.</span><span class="n">ParquetWriter</span><span class="p">(</span><span class="s1">&#39;example2.parquet&#39;</span><span class="p">,</span> <span class="n">table</span><span class="o">.</span><span class="n">schema</span><span class="p">)</span>
-
-<span class="ne">NameError</span>: name &#39;pq&#39; is not defined
 
 <span class="gp">In [18]: </span><span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">3</span><span class="p">):</span>
 <span class="gp">   ....: </span>    <span class="n">writer</span><span class="o">.</span><span class="n">write_table</span><span class="p">(</span><span class="n">table</span><span class="p">)</span>
 <span class="gp">   ....: </span>
-<span class="go">---------------------------------------------------------------------------</span>
-<span class="go">NameError                                 Traceback (most recent call last)</span>
-<span class="go">&lt;ipython-input-18-7de1fedb098e&gt; in &lt;module&gt;()</span>
-<span class="go">      1 for i in range(3):</span>
-<span class="go">----&gt; 2     writer.write_table(table)</span>
-<span class="go">      3 </span>
-
-<span class="go">NameError: name &#39;writer&#39; is not defined</span>
 
 <span class="gp">In [19]: </span><span class="n">writer</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
-<span class="go">---------------------------------------------------------------------------</span>
-<span class="go">NameError                                 Traceback (most recent call last)</span>
-<span class="go">&lt;ipython-input-19-5f6d4868f1d2&gt; in &lt;module&gt;()</span>
-<span class="go">----&gt; 1 writer.close()</span>
-
-<span class="go">NameError: name &#39;writer&#39; is not defined</span>
 
 <span class="gp">In [20]: </span><span class="n">pf2</span> <span class="o">=</span> <span class="n">pq</span><span class="o">.</span><span class="n">ParquetFile</span><span class="p">(</span><span class="s1">&#39;example2.parquet&#39;</span><span class="p">)</span>
-<span class="go">---------------------------------------------------------------------------</span>
-<span class="go">NameError                                 Traceback (most recent call last)</span>
-<span class="go">&lt;ipython-input-20-3a76d54b8afe&gt; in &lt;module&gt;()</span>
-<span class="go">----&gt; 1 pf2 = pq.ParquetFile(&#39;example2.parquet&#39;)</span>
-
-<span class="go">NameError: name &#39;pq&#39; is not defined</span>
 
 <span class="gp">In [21]: </span><span class="n">pf2</span><span class="o">.</span><span class="n">num_row_groups</span>
-<span class="go">---------------------------------------------------------------------------</span>
-<span class="go">NameError                                 Traceback (most recent call last)</span>
-<span class="go">&lt;ipython-input-21-90d9e5dc197c&gt; in &lt;module&gt;()</span>
-<span class="go">----&gt; 1 pf2.num_row_groups</span>
-
-<span class="go">NameError: name &#39;pf2&#39; is not defined</span>
+<span class="gh">Out[21]: </span><span class="go">3</span>
 </pre></div>
 </div>
 </div>
@@ -437,7 +359,7 @@ number of ways:</p>
   ...
 </pre></div>
 </div>
-<p>The <code class="xref py py-class docutils literal"><span class="pre">ParquetDataset</span></code> class accepts either a directory name or a list
+<p>The <a class="reference internal" href="generated/pyarrow.parquet.ParquetDataset.html#pyarrow.parquet.ParquetDataset" title="pyarrow.parquet.ParquetDataset"><code class="xref py py-class docutils literal"><span class="pre">ParquetDataset</span></code></a> class accepts either a directory name or a list
 or file paths, and can discover and infer some common partition structures,
 such as those produced by Hive:</p>
 <div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">dataset</span> <span class="o">=</span> <span class="n">pq</span><span class="o">.</span><span class="n">ParquetDataset</span><span class="p">(</span><span class="s1">&#39;dataset_name/&#39;</span><span class="p">)</span>

http://git-wip-us.apache.org/repos/asf/arrow-site/blob/35611f84/docs/python/plasma.html
----------------------------------------------------------------------
diff --git a/docs/python/plasma.html b/docs/python/plasma.html
index e39a4c3..fa68418 100644
--- a/docs/python/plasma.html
+++ b/docs/python/plasma.html
@@ -27,6 +27,7 @@
     <link rel="index" title="Index" href="genindex.html" />
     <link rel="search" title="Search" href="search.html" />
     <link rel="next" title="Using PyArrow with pandas" href="pandas.html" />
+    <link rel="prev" title="pyarrow.HdfsFile" href="generated/pyarrow.HdfsFile.html" />
 <meta charset='utf-8'>
 <meta http-equiv='X-UA-Compatible' content='IE=edge,chrome=1'>
 <meta name='viewport' content='width=device-width, initial-scale=1.0, maximum-scale=1'>
@@ -117,6 +118,10 @@
               
                 
   <li>
+    <a href="generated/pyarrow.HdfsFile.html" title="Previous Chapter: pyarrow.HdfsFile"><span class="glyphicon glyphicon-chevron-left visible-sm"></span><span class="hidden-sm hidden-tablet">&laquo; pyarrow.HdfsFile</span>
+    </a>
+  </li>
+  <li>
     <a href="pandas.html" title="Next Chapter: Using PyArrow with pandas"><span class="glyphicon glyphicon-chevron-right visible-sm"></span><span class="hidden-sm hidden-tablet">Using PyArrow... &raquo;</span>
     </a>
   </li>