Posted to commits@arrow.apache.org by uw...@apache.org on 2017/11/19 14:56:15 UTC
[02/19] arrow-site git commit: API doc update
http://git-wip-us.apache.org/repos/asf/arrow-site/blob/35611f84/docs/python/pandas.html
----------------------------------------------------------------------
diff --git a/docs/python/pandas.html b/docs/python/pandas.html
index bf84a76..65d6221 100644
--- a/docs/python/pandas.html
+++ b/docs/python/pandas.html
@@ -158,7 +158,7 @@ supports flat columns, the Table also provides nested columns, thus it can
represent more data than a DataFrame, so a full conversion is not always possible.</p>
<p>Conversion from a Table to a DataFrame is done by calling
<code class="xref py py-meth docutils literal"><span class="pre">pyarrow.table.Table.to_pandas()</span></code>. The inverse is then achieved by using
-<code class="xref py py-meth docutils literal"><span class="pre">pyarrow.Table.from_pandas()</span></code>. This conversion routine provides the
+<a class="reference internal" href="generated/pyarrow.Table.html#pyarrow.Table.from_pandas" title="pyarrow.Table.from_pandas"><code class="xref py py-meth docutils literal"><span class="pre">pyarrow.Table.from_pandas()</span></code></a>. This conversion routine provides the
convience parameter <code class="docutils literal"><span class="pre">timestamps_to_ms</span></code>. Although Arrow supports timestamps of
different resolutions, pandas only supports nanosecond timestamps and most
other systems (e.g. Parquet) only work on millisecond timestamps. This parameter
http://git-wip-us.apache.org/repos/asf/arrow-site/blob/35611f84/docs/python/parquet.html
----------------------------------------------------------------------
diff --git a/docs/python/parquet.html b/docs/python/parquet.html
index d0be86a..313b7b7 100644
--- a/docs/python/parquet.html
+++ b/docs/python/parquet.html
@@ -164,19 +164,6 @@ and writing Parquet files with pandas as well.</p>
<p>If you installed <code class="docutils literal"><span class="pre">pyarrow</span></code> with pip or conda, it should be built with Parquet
support bundled:</p>
<div class="highlight-ipython"><div class="highlight"><pre><span></span><span class="gp">In [1]: </span><span class="kn">import</span> <span class="nn">pyarrow.parquet</span> <span class="kn">as</span> <span class="nn">pq</span>
-<span class="gt">---------------------------------------------------------------------------</span>
-<span class="ne">ImportError</span><span class="g g-Whitespace"> </span>Traceback (most recent call last)
-<span class="nn"><ipython-input-1-dc8a4f7832af></span> in <span class="ni"><module></span><span class="nt">()</span>
-<span class="ne">----> </span><span class="mi">1</span> <span class="kn">import</span> <span class="nn">pyarrow.parquet</span> <span class="kn">as</span> <span class="nn">pq</span>
-
-<span class="nn">~apache-arrow/arrow/python/pyarrow/__init__.py</span> in <span class="ni"><module></span><span class="nt">()</span>
-<span class="g g-Whitespace"> </span><span class="mi">30</span>
-<span class="g g-Whitespace"> </span><span class="mi">31</span>
-<span class="ne">---> </span><span class="mi">32</span> <span class="kn">from</span> <span class="nn">pyarrow.lib</span> <span class="kn">import</span> <span class="n">cpu_count</span><span class="p">,</span> <span class="n">set_cpu_count</span>
-<span class="g g-Whitespace"> </span><span class="mi">33</span> <span class="kn">from</span> <span class="nn">pyarrow.lib</span> <span class="kn">import</span> <span class="p">(</span><span class="n">null</span><span class="p">,</span> <span class="n">bool_</span><span class="p">,</span>
-<span class="g g-Whitespace"> </span><span class="mi">34</span> <span class="n">int8</span><span class="p">,</span> <span class="n">int16</span><span class="p">,</span> <span class="n">int32</span><span class="p">,</span> <span class="n">int64</span><span class="p">,</span>
-
-<span class="ne">ImportError</span>: libarrow.so.0: cannot open shared object file: No such file or directory
</pre></div>
</div>
<p>If you are building <code class="docutils literal"><span class="pre">pyarrow</span></code> from source, you must also build <a class="reference external" href="http://github.com/apache/parquet-cpp">parquet-cpp</a> and enable the Parquet extensions when
@@ -185,7 +172,7 @@ details.</p>
</div>
<div class="section" id="reading-and-writing-single-files">
<h2>Reading and Writing Single Files<a class="headerlink" href="#reading-and-writing-single-files" title="Permalink to this headline">¶</a></h2>
-<p>The functions <code class="xref py py-func docutils literal"><span class="pre">read_table()</span></code> and <code class="xref py py-func docutils literal"><span class="pre">write_table()</span></code>
+<p>The functions <a class="reference internal" href="generated/pyarrow.parquet.read_table.html#pyarrow.parquet.read_table" title="pyarrow.parquet.read_table"><code class="xref py py-func docutils literal"><span class="pre">read_table()</span></code></a> and <a class="reference internal" href="generated/pyarrow.parquet.write_table.html#pyarrow.parquet.write_table" title="pyarrow.parquet.write_table"><code class="xref py py-func docutils literal"><span class="pre">write_table()</span></code></a>
read and write the <a class="reference internal" href="data.html#data-table"><span class="std std-ref">pyarrow.Table</span></a> objects, respectively.</p>
<p>Let’s look at a simple table:</p>
<div class="highlight-ipython"><div class="highlight"><pre><span></span><span class="gp">In [2]: </span><span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
@@ -193,19 +180,6 @@ read and write the <a class="reference internal" href="data.html#data-table"><sp
<span class="gp">In [3]: </span><span class="kn">import</span> <span class="nn">pandas</span> <span class="kn">as</span> <span class="nn">pd</span>
<span class="gp">In [4]: </span><span class="kn">import</span> <span class="nn">pyarrow</span> <span class="kn">as</span> <span class="nn">pa</span>
-<span class="gt">---------------------------------------------------------------------------</span>
-<span class="ne">ImportError</span><span class="g g-Whitespace"> </span>Traceback (most recent call last)
-<span class="nn"><ipython-input-4-852643f3aad4></span> in <span class="ni"><module></span><span class="nt">()</span>
-<span class="ne">----> </span><span class="mi">1</span> <span class="kn">import</span> <span class="nn">pyarrow</span> <span class="kn">as</span> <span class="nn">pa</span>
-
-<span class="nn">~apache-arrow/arrow/python/pyarrow/__init__.py</span> in <span class="ni"><module></span><span class="nt">()</span>
-<span class="g g-Whitespace"> </span><span class="mi">30</span>
-<span class="g g-Whitespace"> </span><span class="mi">31</span>
-<span class="ne">---> </span><span class="mi">32</span> <span class="kn">from</span> <span class="nn">pyarrow.lib</span> <span class="kn">import</span> <span class="n">cpu_count</span><span class="p">,</span> <span class="n">set_cpu_count</span>
-<span class="g g-Whitespace"> </span><span class="mi">33</span> <span class="kn">from</span> <span class="nn">pyarrow.lib</span> <span class="kn">import</span> <span class="p">(</span><span class="n">null</span><span class="p">,</span> <span class="n">bool_</span><span class="p">,</span>
-<span class="g g-Whitespace"> </span><span class="mi">34</span> <span class="n">int8</span><span class="p">,</span> <span class="n">int16</span><span class="p">,</span> <span class="n">int32</span><span class="p">,</span> <span class="n">int64</span><span class="p">,</span>
-
-<span class="ne">ImportError</span>: libarrow.so.0: cannot open shared object file: No such file or directory
<span class="gp">In [5]: </span><span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">({</span><span class="s1">'one'</span><span class="p">:</span> <span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="mf">2.5</span><span class="p">],</span>
<span class="gp"> ...: </span> <span class="s1">'two'</span><span class="p">:</span> <span class="p">[</span><span class="s1">'foo'</span><span class="p">,</span> <span class="s1">'bar'</span><span class="p">,</span> <span class="s1">'baz'</span><span class="p">],</span>
@@ -213,68 +187,45 @@ read and write the <a class="reference internal" href="data.html#data-table"><sp
<span class="gp"> ...: </span>
<span class="gp">In [6]: </span><span class="n">table</span> <span class="o">=</span> <span class="n">pa</span><span class="o">.</span><span class="n">Table</span><span class="o">.</span><span class="n">from_pandas</span><span class="p">(</span><span class="n">df</span><span class="p">)</span>
-<span class="gt">---------------------------------------------------------------------------</span>
-<span class="ne">NameError</span><span class="g g-Whitespace"> </span>Traceback (most recent call last)
-<span class="nn"><ipython-input-6-0c992c881c53></span> in <span class="ni"><module></span><span class="nt">()</span>
-<span class="ne">----> </span><span class="mi">1</span> <span class="n">table</span> <span class="o">=</span> <span class="n">pa</span><span class="o">.</span><span class="n">Table</span><span class="o">.</span><span class="n">from_pandas</span><span class="p">(</span><span class="n">df</span><span class="p">)</span>
-
-<span class="ne">NameError</span>: name 'pa' is not defined
</pre></div>
</div>
<p>We write this to Parquet format with <code class="docutils literal"><span class="pre">write_table</span></code>:</p>
<div class="highlight-ipython"><div class="highlight"><pre><span></span><span class="gp">In [7]: </span><span class="kn">import</span> <span class="nn">pyarrow.parquet</span> <span class="kn">as</span> <span class="nn">pq</span>
-<span class="gt">---------------------------------------------------------------------------</span>
-<span class="ne">ImportError</span><span class="g g-Whitespace"> </span>Traceback (most recent call last)
-<span class="nn"><ipython-input-7-dc8a4f7832af></span> in <span class="ni"><module></span><span class="nt">()</span>
-<span class="ne">----> </span><span class="mi">1</span> <span class="kn">import</span> <span class="nn">pyarrow.parquet</span> <span class="kn">as</span> <span class="nn">pq</span>
-
-<span class="nn">~apache-arrow/arrow/python/pyarrow/__init__.py</span> in <span class="ni"><module></span><span class="nt">()</span>
-<span class="g g-Whitespace"> </span><span class="mi">30</span>
-<span class="g g-Whitespace"> </span><span class="mi">31</span>
-<span class="ne">---> </span><span class="mi">32</span> <span class="kn">from</span> <span class="nn">pyarrow.lib</span> <span class="kn">import</span> <span class="n">cpu_count</span><span class="p">,</span> <span class="n">set_cpu_count</span>
-<span class="g g-Whitespace"> </span><span class="mi">33</span> <span class="kn">from</span> <span class="nn">pyarrow.lib</span> <span class="kn">import</span> <span class="p">(</span><span class="n">null</span><span class="p">,</span> <span class="n">bool_</span><span class="p">,</span>
-<span class="g g-Whitespace"> </span><span class="mi">34</span> <span class="n">int8</span><span class="p">,</span> <span class="n">int16</span><span class="p">,</span> <span class="n">int32</span><span class="p">,</span> <span class="n">int64</span><span class="p">,</span>
-
-<span class="ne">ImportError</span>: libarrow.so.0: cannot open shared object file: No such file or directory
<span class="gp">In [8]: </span><span class="n">pq</span><span class="o">.</span><span class="n">write_table</span><span class="p">(</span><span class="n">table</span><span class="p">,</span> <span class="s1">'example.parquet'</span><span class="p">)</span>
-<span class="go">