Posted to commits@arrow.apache.org by we...@apache.org on 2017/05/08 04:52:47 UTC
[02/27] arrow-site git commit: Update Python documentation
http://git-wip-us.apache.org/repos/asf/arrow-site/blob/6360599f/docs/python/install.html
----------------------------------------------------------------------
diff --git a/docs/python/install.html b/docs/python/install.html
index e38cfe2..dc4651c 100644
--- a/docs/python/install.html
+++ b/docs/python/install.html
@@ -1,176 +1,93 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-<!DOCTYPE html>
-<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
-<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
-<head>
- <meta charset="utf-8">
-
- <meta name="viewport" content="width=device-width, initial-scale=1.0">
-
- <title>Install PyArrow — pyarrow documentation</title>
-
-
-
-
-
-
-
-
-
-
-
+<html xmlns="http://www.w3.org/1999/xhtml">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
-
-
-
-
-
- <link rel="stylesheet" href="_static/css/theme.css" type="text/css" />
-
-
-
-
-
- <link rel="index" title="Index"
- href="genindex.html"/>
- <link rel="search" title="Search" href="search.html"/>
- <link rel="top" title="pyarrow documentation" href="index.html"/>
- <link rel="next" title="Pandas Interface" href="pandas.html"/>
- <link rel="prev" title="Apache Arrow (Python)" href="index.html"/>
-
-
- <script src="_static/js/modernizr.min.js"></script>
-
-</head>
-
-<body class="wy-body-for-nav" role="document">
-
-
- <div class="wy-grid-for-nav">
-
+ <title>Install PyArrow — pyarrow documentation</title>
- <nav data-toggle="wy-nav-shift" class="wy-nav-side">
- <div class="wy-side-scroll">
- <div class="wy-side-nav-search">
-
-
-
- <a href="index.html" class="icon icon-home"> pyarrow
-
-
-
- </a>
-
-
-
-
-
-
-
-<div role="search">
- <form id="rtd-search-form" class="wy-form" action="search.html" method="get">
- <input type="text" name="q" placeholder="Search docs" />
- <input type="hidden" name="check_keywords" value="yes" />
- <input type="hidden" name="area" value="default" />
- </form>
-</div>
-
-
- </div>
-
- <div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation">
-
-
-
-
-
-
- <p class="caption"><span class="caption-text">Getting Started</span></p>
-<ul class="current">
-<li class="toctree-l1 current"><a class="current reference internal" href="#">Install PyArrow</a><ul>
-<li class="toctree-l2"><a class="reference internal" href="#conda">Conda</a></li>
-<li class="toctree-l2"><a class="reference internal" href="#pip">Pip</a></li>
-<li class="toctree-l2"><a class="reference internal" href="#building-from-source">Building from source</a><ul>
-<li class="toctree-l3"><a class="reference internal" href="#system-requirements">System requirements</a></li>
-<li class="toctree-l3"><a class="reference internal" href="#python-requirements">Python requirements</a></li>
-<li class="toctree-l3"><a class="reference internal" href="#installing-arrow-c-library">Installing Arrow C++ library</a></li>
-<li class="toctree-l3"><a class="reference internal" href="#id1">Install <cite>pyarrow</cite></a></li>
-</ul>
-</li>
+ <link rel="stylesheet" href="_static/sphinxdoc.css" type="text/css" />
+ <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
+
+ <script type="text/javascript">
+ var DOCUMENTATION_OPTIONS = {
+ URL_ROOT: './',
+ VERSION: '',
+ COLLAPSE_INDEX: false,
+ FILE_SUFFIX: '.html',
+ HAS_SOURCE: true,
+ SOURCELINK_SUFFIX: '.txt'
+ };
+ </script>
+ <script type="text/javascript" src="_static/jquery.js"></script>
+ <script type="text/javascript" src="_static/underscore.js"></script>
+ <script type="text/javascript" src="_static/doctools.js"></script>
+ <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+ <link rel="index" title="Index" href="genindex.html" />
+ <link rel="search" title="Search" href="search.html" />
+ <link rel="next" title="Development" href="development.html" />
+ <link rel="prev" title="Apache Arrow (Python)" href="index.html" />
+ </head>
+ <body role="document">
+ <div class="related" role="navigation" aria-label="related navigation">
+ <h3>Navigation</h3>
+ <ul>
+ <li class="right" style="margin-right: 10px">
+ <a href="genindex.html" title="General Index"
+ accesskey="I">index</a></li>
+ <li class="right" >
+ <a href="development.html" title="Development"
+ accesskey="N">next</a> |</li>
+ <li class="right" >
+ <a href="index.html" title="Apache Arrow (Python)"
+ accesskey="P">previous</a> |</li>
+ <li class="nav-item nav-item-0"><a href="index.html">pyarrow documentation</a> »</li>
+ </ul>
+ </div>
+ <div class="sphinxsidebar" role="navigation" aria-label="main navigation">
+ <div class="sphinxsidebarwrapper">
+ <h3><a href="index.html">Table Of Contents</a></h3>
+ <ul>
+<li><a class="reference internal" href="#">Install PyArrow</a><ul>
+<li><a class="reference internal" href="#conda">Conda</a></li>
+<li><a class="reference internal" href="#pip">Pip</a></li>
+<li><a class="reference internal" href="#installing-from-source">Installing from source</a></li>
</ul>
</li>
-<li class="toctree-l1"><a class="reference internal" href="pandas.html">Pandas Interface</a></li>
-<li class="toctree-l1"><a class="reference internal" href="filesystems.html">File interfaces and Memory Maps</a></li>
-<li class="toctree-l1"><a class="reference internal" href="parquet.html">Reading/Writing Parquet files</a></li>
-<li class="toctree-l1"><a class="reference internal" href="api.html">API Reference</a></li>
-<li class="toctree-l1"><a class="reference internal" href="getting_involved.html">Getting Involved</a></li>
-</ul>
-<p class="caption"><span class="caption-text">Additional Features</span></p>
-<ul>
-<li class="toctree-l1"><a class="reference internal" href="jemalloc.html">jemalloc MemoryPool</a></li>
</ul>
-
-
+ <h4>Previous topic</h4>
+ <p class="topless"><a href="index.html"
+ title="previous chapter">Apache Arrow (Python)</a></p>
+ <h4>Next topic</h4>
+ <p class="topless"><a href="development.html"
+ title="next chapter">Development</a></p>
+ <div role="note" aria-label="source link">
+ <h3>This Page</h3>
+ <ul class="this-page-menu">
+ <li><a href="_sources/install.rst.txt"
+ rel="nofollow">Show Source</a></li>
+ </ul>
+ </div>
+<div id="searchbox" style="display: none" role="search">
+ <h3>Quick search</h3>
+ <form class="search" action="search.html" method="get">
+ <div><input type="text" name="q" /></div>
+ <div><input type="submit" value="Go" /></div>
+ <input type="hidden" name="check_keywords" value="yes" />
+ <input type="hidden" name="area" value="default" />
+ </form>
+</div>
+<script type="text/javascript">$('#searchbox').show(0);</script>
</div>
</div>
- </nav>
-
- <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">
-
-
- <nav class="wy-nav-top" role="navigation" aria-label="top navigation">
-
- <i data-toggle="wy-nav-top" class="fa fa-bars"></i>
- <a href="index.html">pyarrow</a>
-
- </nav>
-
-
-
- <div class="wy-nav-content">
- <div class="rst-content">
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-<div role="navigation" aria-label="breadcrumbs navigation">
-
- <ul class="wy-breadcrumbs">
-
- <li><a href="index.html">Docs</a> »</li>
-
- <li>Install PyArrow</li>
-
-
- <li class="wy-breadcrumbs-aside">
-
-
- <a href="_sources/install.rst.txt" rel="nofollow"> View page source</a>
-
-
- </li>
-
- </ul>
-
-
- <hr/>
-</div>
- <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
- <div itemprop="articleBody">
+ <div class="document">
+ <div class="documentwrapper">
+ <div class="bodywrapper">
+ <div class="body" role="main">
<div class="section" id="install-pyarrow">
<h1>Install PyArrow<a class="headerlink" href="#install-pyarrow" title="Permalink to this headline">¶</a></h1>
@@ -189,182 +106,43 @@
</div>
<div class="admonition note">
<p class="first admonition-title">Note</p>
-<p class="last">Currently there are only binary artifcats available for Linux and MacOS.
+<p class="last">Currently there are only binary artifacts available for Linux and MacOS.
Otherwise this will only pull the python sources and assumes an existing
-installation of the C++ part of Arrow.
-To retrieve the binary artifacts, you’ll need a recent <code class="docutils literal"><span class="pre">pip</span></code> version that
-supports features like the <code class="docutils literal"><span class="pre">manylinux1</span></code> tag.</p>
-</div>
-</div>
-<div class="section" id="building-from-source">
-<h2>Building from source<a class="headerlink" href="#building-from-source" title="Permalink to this headline">¶</a></h2>
-<p>First, clone the master git repository:</p>
-<div class="highlight-bash"><div class="highlight"><pre><span></span>git clone https://github.com/apache/arrow.git arrow
-</pre></div>
-</div>
-<div class="section" id="system-requirements">
-<h3>System requirements<a class="headerlink" href="#system-requirements" title="Permalink to this headline">¶</a></h3>
-<p>Building pyarrow requires:</p>
-<ul class="simple">
-<li>A C++11 compiler<ul>
-<li>Linux: gcc >= 4.8 or clang >= 3.5</li>
-<li>OS X: XCode 6.4 or higher preferred</li>
-</ul>
-</li>
-<li><a class="reference external" href="https://cmake.org/">CMake</a></li>
-</ul>
-</div>
-<div class="section" id="python-requirements">
-<h3>Python requirements<a class="headerlink" href="#python-requirements" title="Permalink to this headline">¶</a></h3>
-<p>You will need Python (CPython) 2.7, 3.4, or 3.5 installed. Earlier releases and
-are not being targeted.</p>
-<div class="admonition note">
-<p class="first admonition-title">Note</p>
-<p class="last">This library targets CPython only due to an emphasis on interoperability with
-pandas and NumPy, which are only available for CPython.</p>
-</div>
-<p>The build requires NumPy, Cython, and a few other Python dependencies:</p>
-<div class="highlight-bash"><div class="highlight"><pre><span></span>pip install cython
-<span class="nb">cd</span> arrow/python
-pip install -r requirements.txt
-</pre></div>
-</div>
-</div>
-<div class="section" id="installing-arrow-c-library">
-<h3>Installing Arrow C++ library<a class="headerlink" href="#installing-arrow-c-library" title="Permalink to this headline">¶</a></h3>
-<p>First, you should choose an installation location for Arrow C++. In the future
-using the default system install location will work, but for now we are being
-explicit:</p>
-<div class="highlight-bash"><div class="highlight"><pre><span></span><span class="nb">export</span> <span class="nv">ARROW_HOME</span><span class="o">=</span>$HOME/local
-</pre></div>
-</div>
-<p>Now, we build Arrow:</p>
-<div class="highlight-bash"><div class="highlight"><pre><span></span><span class="nb">cd</span> arrow/cpp
-
-mkdir dev-build
-<span class="nb">cd</span> dev-build
-
-cmake -DCMAKE_INSTALL_PREFIX<span class="o">=</span><span class="nv">$A</span>RROW_HOME ..
-
-make
-
-<span class="c1"># Use sudo here if $ARROW_HOME requires it</span>
-make install
-</pre></div>
-</div>
-<p>To get the optional Parquet support, you should also build and install
-<a class="reference external" href="https://github.com/apache/parquet-cpp/blob/master/README.md">parquet-cpp</a>.</p>
-</div>
-<div class="section" id="id1">
-<h3>Install <cite>pyarrow</cite><a class="headerlink" href="#id1" title="Permalink to this headline">¶</a></h3>
-<div class="highlight-bash"><div class="highlight"><pre><span></span><span class="nb">cd</span> arrow/python
-
-<span class="c1"># --with-parquet enables the Apache Parquet support in PyArrow</span>
-<span class="c1"># --with-jemalloc enables the jemalloc allocator support in PyArrow</span>
-<span class="c1"># --build-type=release disables debugging information and turns on</span>
-<span class="c1"># compiler optimizations for native code</span>
-python setup.py build_ext --with-parquet --with-jemalloc --build-type<span class="o">=</span>release install
-python setup.py install
-</pre></div>
-</div>
-<div class="admonition warning">
-<p class="first admonition-title">Warning</p>
-<p class="last">On XCode 6 and prior there are some known OS X <cite>@rpath</cite> issues. If you are
-unable to import pyarrow, upgrading XCode may be the solution.</p>
-</div>
-<div class="admonition note">
-<p class="first admonition-title">Note</p>
-<p class="last">In development installations, you will also need to set a correct
-<code class="docutils literal"><span class="pre">LD_LIBRARY_PATH</span></code>. This is most probably done with
-<code class="docutils literal"><span class="pre">export</span> <span class="pre">LD_LIBRARY_PATH=$ARROW_HOME/lib:$LD_LIBRARY_PATH</span></code>.</p>
-</div>
-<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">In</span> <span class="p">[</span><span class="mi">1</span><span class="p">]:</span> <span class="kn">import</span> <span class="nn">pyarrow</span>
-
-<span class="n">In</span> <span class="p">[</span><span class="mi">2</span><span class="p">]:</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">])</span>
-<span class="n">Out</span><span class="p">[</span><span class="mi">2</span><span class="p">]:</span>
-<span class="o"><</span><span class="n">pyarrow</span><span class="o">.</span><span class="n">array</span><span class="o">.</span><span class="n">Int64Array</span> <span class="nb">object</span> <span class="n">at</span> <span class="mh">0x7f899f3e60e8</span><span class="o">></span>
-<span class="p">[</span>
- <span class="mi">1</span><span class="p">,</span>
- <span class="mi">2</span><span class="p">,</span>
- <span class="mi">3</span>
-<span class="p">]</span>
-</pre></div>
+installation of the C++ part of Arrow. To retrieve the binary artifacts,
+you’ll need a recent <code class="docutils literal"><span class="pre">pip</span></code> version that supports features like the
+<code class="docutils literal"><span class="pre">manylinux1</span></code> tag.</p>
</div>
</div>
+<div class="section" id="installing-from-source">
+<h2>Installing from source<a class="headerlink" href="#installing-from-source" title="Permalink to this headline">¶</a></h2>
+<p>See <a class="reference internal" href="development.html#development"><span class="std std-ref">Development</span></a>.</p>
</div>
</div>
- </div>
- <div class="articleComments">
-
- </div>
</div>
- <footer>
-
- <div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
-
- <a href="pandas.html" class="btn btn-neutral float-right" title="Pandas Interface" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right"></span></a>
-
-
- <a href="index.html" class="btn btn-neutral" title="Apache Arrow (Python)" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left"></span> Previous</a>
-
- </div>
-
-
- <hr/>
-
- <div role="contentinfo">
- <p>
- © Copyright 2016 Apache Software Foundation.
-
- </p>
- </div>
- Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/snide/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>.
-
-</footer>
-
</div>
</div>
-
- </section>
-
- </div>
-
-
-
-
-
- <script type="text/javascript">
- var DOCUMENTATION_OPTIONS = {
- URL_ROOT:'./',
- VERSION:'',
- COLLAPSE_INDEX:false,
- FILE_SUFFIX:'.html',
- HAS_SOURCE: true,
- SOURCELINK_SUFFIX: '.txt'
- };
- </script>
- <script type="text/javascript" src="_static/jquery.js"></script>
- <script type="text/javascript" src="_static/underscore.js"></script>
- <script type="text/javascript" src="_static/doctools.js"></script>
- <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-
-
-
-
-
- <script type="text/javascript" src="_static/js/theme.js"></script>
-
-
-
-
- <script type="text/javascript">
- jQuery(function () {
- SphinxRtdTheme.StickyNav.enable();
- });
- </script>
-
-
-</body>
+ <div class="clearer"></div>
+ </div>
+ <div class="related" role="navigation" aria-label="related navigation">
+ <h3>Navigation</h3>
+ <ul>
+ <li class="right" style="margin-right: 10px">
+ <a href="genindex.html" title="General Index"
+ >index</a></li>
+ <li class="right" >
+ <a href="development.html" title="Development"
+ >next</a> |</li>
+ <li class="right" >
+ <a href="index.html" title="Apache Arrow (Python)"
+ >previous</a> |</li>
+ <li class="nav-item nav-item-0"><a href="index.html">pyarrow documentation</a> »</li>
+ </ul>
+ </div>
+ <div class="footer" role="contentinfo">
+ © Copyright 2016-2017 Apache Software Foundation.
+ Created using <a href="http://sphinx-doc.org/">Sphinx</a> 1.5.5.
+ </div>
+ </body>
</html>
\ No newline at end of file
http://git-wip-us.apache.org/repos/asf/arrow-site/blob/6360599f/docs/python/ipc.html
----------------------------------------------------------------------
diff --git a/docs/python/ipc.html b/docs/python/ipc.html
new file mode 100644
index 0000000..4fc1da8
Binary files /dev/null and b/docs/python/ipc.html differ
http://git-wip-us.apache.org/repos/asf/arrow-site/blob/6360599f/docs/python/jemalloc.html
----------------------------------------------------------------------
diff --git a/docs/python/jemalloc.html b/docs/python/jemalloc.html
index 26c85bd..af4712d 100644
--- a/docs/python/jemalloc.html
+++ b/docs/python/jemalloc.html
@@ -1,168 +1,80 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-<!DOCTYPE html>
-<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
-<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
-<head>
- <meta charset="utf-8">
-
- <meta name="viewport" content="width=device-width, initial-scale=1.0">
-
- <title>jemalloc MemoryPool — pyarrow documentation</title>
-
-
-
-
-
-
-
-
-
-
-
+<html xmlns="http://www.w3.org/1999/xhtml">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
-
-
-
-
-
- <link rel="stylesheet" href="_static/css/theme.css" type="text/css" />
-
-
-
-
-
- <link rel="index" title="Index"
- href="genindex.html"/>
- <link rel="search" title="Search" href="search.html"/>
- <link rel="top" title="pyarrow documentation" href="index.html"/>
- <link rel="prev" title="Getting Involved" href="getting_involved.html"/>
-
-
- <script src="_static/js/modernizr.min.js"></script>
-
-</head>
-
-<body class="wy-body-for-nav" role="document">
-
-
- <div class="wy-grid-for-nav">
-
+ <title>jemalloc MemoryPool — pyarrow documentation</title>
- <nav data-toggle="wy-nav-shift" class="wy-nav-side">
- <div class="wy-side-scroll">
- <div class="wy-side-nav-search">
-
-
-
- <a href="index.html" class="icon icon-home"> pyarrow
-
-
-
- </a>
-
-
-
-
-
-
-
-<div role="search">
- <form id="rtd-search-form" class="wy-form" action="search.html" method="get">
- <input type="text" name="q" placeholder="Search docs" />
- <input type="hidden" name="check_keywords" value="yes" />
- <input type="hidden" name="area" value="default" />
- </form>
+ <link rel="stylesheet" href="_static/sphinxdoc.css" type="text/css" />
+ <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
+
+ <script type="text/javascript">
+ var DOCUMENTATION_OPTIONS = {
+ URL_ROOT: './',
+ VERSION: '',
+ COLLAPSE_INDEX: false,
+ FILE_SUFFIX: '.html',
+ HAS_SOURCE: true,
+ SOURCELINK_SUFFIX: '.txt'
+ };
+ </script>
+ <script type="text/javascript" src="_static/jquery.js"></script>
+ <script type="text/javascript" src="_static/underscore.js"></script>
+ <script type="text/javascript" src="_static/doctools.js"></script>
+ <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+ <link rel="index" title="Index" href="genindex.html" />
+ <link rel="search" title="Search" href="search.html" />
+ <link rel="prev" title="Getting Involved" href="getting_involved.html" />
+ </head>
+ <body role="document">
+ <div class="related" role="navigation" aria-label="related navigation">
+ <h3>Navigation</h3>
+ <ul>
+ <li class="right" style="margin-right: 10px">
+ <a href="genindex.html" title="General Index"
+ accesskey="I">index</a></li>
+ <li class="right" >
+ <a href="getting_involved.html" title="Getting Involved"
+ accesskey="P">previous</a> |</li>
+ <li class="nav-item nav-item-0"><a href="index.html">pyarrow documentation</a> »</li>
+ </ul>
+ </div>
+ <div class="sphinxsidebar" role="navigation" aria-label="main navigation">
+ <div class="sphinxsidebarwrapper">
+ <h4>Previous topic</h4>
+ <p class="topless"><a href="getting_involved.html"
+ title="previous chapter">Getting Involved</a></p>
+ <div role="note" aria-label="source link">
+ <h3>This Page</h3>
+ <ul class="this-page-menu">
+ <li><a href="_sources/jemalloc.rst.txt"
+ rel="nofollow">Show Source</a></li>
+ </ul>
+ </div>
+<div id="searchbox" style="display: none" role="search">
+ <h3>Quick search</h3>
+ <form class="search" action="search.html" method="get">
+ <div><input type="text" name="q" /></div>
+ <div><input type="submit" value="Go" /></div>
+ <input type="hidden" name="check_keywords" value="yes" />
+ <input type="hidden" name="area" value="default" />
+ </form>
</div>
-
-
- </div>
-
- <div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation">
-
-
-
-
-
-
- <p class="caption"><span class="caption-text">Getting Started</span></p>
-<ul>
-<li class="toctree-l1"><a class="reference internal" href="install.html">Install PyArrow</a></li>
-<li class="toctree-l1"><a class="reference internal" href="pandas.html">Pandas Interface</a></li>
-<li class="toctree-l1"><a class="reference internal" href="filesystems.html">File interfaces and Memory Maps</a></li>
-<li class="toctree-l1"><a class="reference internal" href="parquet.html">Reading/Writing Parquet files</a></li>
-<li class="toctree-l1"><a class="reference internal" href="api.html">API Reference</a></li>
-<li class="toctree-l1"><a class="reference internal" href="getting_involved.html">Getting Involved</a></li>
-</ul>
-<p class="caption"><span class="caption-text">Additional Features</span></p>
-<ul class="current">
-<li class="toctree-l1 current"><a class="current reference internal" href="#">jemalloc MemoryPool</a></li>
-</ul>
-
-
-
+<script type="text/javascript">$('#searchbox').show(0);</script>
</div>
</div>
- </nav>
-
- <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">
-
-
- <nav class="wy-nav-top" role="navigation" aria-label="top navigation">
-
- <i data-toggle="wy-nav-top" class="fa fa-bars"></i>
- <a href="index.html">pyarrow</a>
-
- </nav>
-
-
-
- <div class="wy-nav-content">
- <div class="rst-content">
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-<div role="navigation" aria-label="breadcrumbs navigation">
-
- <ul class="wy-breadcrumbs">
-
- <li><a href="index.html">Docs</a> »</li>
-
- <li>jemalloc MemoryPool</li>
-
-
- <li class="wy-breadcrumbs-aside">
-
-
- <a href="_sources/jemalloc.rst.txt" rel="nofollow"> View page source</a>
-
-
- </li>
-
- </ul>
-
-
- <hr/>
-</div>
- <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
- <div itemprop="articleBody">
+ <div class="document">
+ <div class="documentwrapper">
+ <div class="bodywrapper">
+ <div class="body" role="main">
<div class="section" id="jemalloc-memorypool">
<h1>jemalloc MemoryPool<a class="headerlink" href="#jemalloc-memorypool" title="Permalink to this headline">¶</a></h1>
-<p>Arrow’s default <code class="xref py py-class docutils literal"><span class="pre">MemoryPool</span></code> uses the system’s allocator
+<p>Arrow’s default <a class="reference internal" href="generated/pyarrow.MemoryPool.html#pyarrow.MemoryPool" title="pyarrow.MemoryPool"><code class="xref py py-class docutils literal"><span class="pre">MemoryPool</span></code></a> uses the system’s allocator
through the POSIX APIs. Although this already provides aligned allocation, the
POSIX interface doesn’t support aligned reallocation. The default reallocation
strategy is to allocate a new region, copy over the old data and free the
@@ -170,10 +82,9 @@ previous region. Using <a class="reference external" href="http://jemalloc.net/"
the existing memory allocation to the requested size. While this may still be
linear in the size of allocated memory, it is magnitudes faster as only the page
mapping in the kernel is touched, not the actual data.</p>
-<p>The <code class="xref py py-mod docutils literal"><span class="pre">jemalloc</span></code> allocator is not enabled by default to allow the
-use of the system allocator and/or other allocators like <code class="docutils literal"><span class="pre">tcmalloc</span></code>. You can
-either explicitly make it the default allocator or pass it only to single
-operations.</p>
+<p>The jemalloc-based allocator is not enabled by default to allow the use of the
+system allocator and/or other allocators like <code class="docutils literal"><span class="pre">tcmalloc</span></code>. You can either
+explicitly make it the default allocator or pass it only to single operations.</p>
<div class="code python highlight-default"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">pyarrow</span> <span class="k">as</span> <span class="nn">pa</span>
<span class="n">jemalloc_pool</span> <span class="o">=</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">jemalloc_memory_pool</span><span class="p">()</span>
@@ -191,74 +102,26 @@ operations.</p>
</div>
- </div>
- <div class="articleComments">
-
- </div>
</div>
- <footer>
-
- <div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
-
-
- <a href="getting_involved.html" class="btn btn-neutral" title="Getting Involved" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left"></span> Previous</a>
-
- </div>
-
-
- <hr/>
-
- <div role="contentinfo">
- <p>
- © Copyright 2016 Apache Software Foundation.
-
- </p>
- </div>
- Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/snide/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>.
-
-</footer>
-
</div>
</div>
-
- </section>
-
- </div>
-
-
-
-
-
- <script type="text/javascript">
- var DOCUMENTATION_OPTIONS = {
- URL_ROOT:'./',
- VERSION:'',
- COLLAPSE_INDEX:false,
- FILE_SUFFIX:'.html',
- HAS_SOURCE: true,
- SOURCELINK_SUFFIX: '.txt'
- };
- </script>
- <script type="text/javascript" src="_static/jquery.js"></script>
- <script type="text/javascript" src="_static/underscore.js"></script>
- <script type="text/javascript" src="_static/doctools.js"></script>
- <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-
-
-
-
-
- <script type="text/javascript" src="_static/js/theme.js"></script>
-
-
-
-
- <script type="text/javascript">
- jQuery(function () {
- SphinxRtdTheme.StickyNav.enable();
- });
- </script>
-
-
-</body>
+ <div class="clearer"></div>
+ </div>
+ <div class="related" role="navigation" aria-label="related navigation">
+ <h3>Navigation</h3>
+ <ul>
+ <li class="right" style="margin-right: 10px">
+ <a href="genindex.html" title="General Index"
+ >index</a></li>
+ <li class="right" >
+ <a href="getting_involved.html" title="Getting Involved"
+ >previous</a> |</li>
+ <li class="nav-item nav-item-0"><a href="index.html">pyarrow documentation</a> »</li>
+ </ul>
+ </div>
+ <div class="footer" role="contentinfo">
+ © Copyright 2016-2017 Apache Software Foundation.
+ Created using <a href="http://sphinx-doc.org/">Sphinx</a> 1.5.5.
+ </div>
+ </body>
</html>
\ No newline at end of file
http://git-wip-us.apache.org/repos/asf/arrow-site/blob/6360599f/docs/python/memory.html
----------------------------------------------------------------------
diff --git a/docs/python/memory.html b/docs/python/memory.html
new file mode 100644
index 0000000..f9f8cd2
Binary files /dev/null and b/docs/python/memory.html differ
http://git-wip-us.apache.org/repos/asf/arrow-site/blob/6360599f/docs/python/objects.inv
----------------------------------------------------------------------
diff --git a/docs/python/objects.inv b/docs/python/objects.inv
index 8a6ec1f..5f1059a 100644
Binary files a/docs/python/objects.inv and b/docs/python/objects.inv differ
http://git-wip-us.apache.org/repos/asf/arrow-site/blob/6360599f/docs/python/pandas.html
----------------------------------------------------------------------
diff --git a/docs/python/pandas.html b/docs/python/pandas.html
index f5c23cc..d519ecb 100644
--- a/docs/python/pandas.html
+++ b/docs/python/pandas.html
@@ -1,228 +1,151 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-<!DOCTYPE html>
-<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
-<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
-<head>
- <meta charset="utf-8">
-
- <meta name="viewport" content="width=device-width, initial-scale=1.0">
-
- <title>Pandas Interface — pyarrow documentation</title>
-
-
-
-
-
-
-
-
-
-
-
+<html xmlns="http://www.w3.org/1999/xhtml">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
-
-
-
-
-
- <link rel="stylesheet" href="_static/css/theme.css" type="text/css" />
-
-
-
-
-
- <link rel="index" title="Index"
- href="genindex.html"/>
- <link rel="search" title="Search" href="search.html"/>
- <link rel="top" title="pyarrow documentation" href="index.html"/>
- <link rel="next" title="File interfaces and Memory Maps" href="filesystems.html"/>
- <link rel="prev" title="Install PyArrow" href="install.html"/>
-
-
- <script src="_static/js/modernizr.min.js"></script>
-
-</head>
-
-<body class="wy-body-for-nav" role="document">
-
-
- <div class="wy-grid-for-nav">
-
+ <title>Using PyArrow with pandas — pyarrow documentation</title>
- <nav data-toggle="wy-nav-shift" class="wy-nav-side">
- <div class="wy-side-scroll">
- <div class="wy-side-nav-search">
-
-
-
- <a href="index.html" class="icon icon-home"> pyarrow
-
-
-
- </a>
-
-
-
-
-
-
-
-<div role="search">
- <form id="rtd-search-form" class="wy-form" action="search.html" method="get">
- <input type="text" name="q" placeholder="Search docs" />
- <input type="hidden" name="check_keywords" value="yes" />
- <input type="hidden" name="area" value="default" />
- </form>
-</div>
-
-
- </div>
-
- <div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation">
-
-
-
-
-
-
- <p class="caption"><span class="caption-text">Getting Started</span></p>
-<ul class="current">
-<li class="toctree-l1"><a class="reference internal" href="install.html">Install PyArrow</a></li>
-<li class="toctree-l1 current"><a class="current reference internal" href="#">Pandas Interface</a><ul>
-<li class="toctree-l2"><a class="reference internal" href="#dataframes">DataFrames</a></li>
-<li class="toctree-l2"><a class="reference internal" href="#series">Series</a></li>
-<li class="toctree-l2"><a class="reference internal" href="#type-differences">Type differences</a><ul>
-<li class="toctree-l3"><a class="reference internal" href="#pandas-arrow-conversion">Pandas -> Arrow Conversion</a></li>
-<li class="toctree-l3"><a class="reference internal" href="#arrow-pandas-conversion">Arrow -> Pandas Conversion</a></li>
+ <link rel="stylesheet" href="_static/sphinxdoc.css" type="text/css" />
+ <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
+
+ <script type="text/javascript">
+ var DOCUMENTATION_OPTIONS = {
+ URL_ROOT: './',
+ VERSION: '',
+ COLLAPSE_INDEX: false,
+ FILE_SUFFIX: '.html',
+ HAS_SOURCE: true,
+ SOURCELINK_SUFFIX: '.txt'
+ };
+ </script>
+ <script type="text/javascript" src="_static/jquery.js"></script>
+ <script type="text/javascript" src="_static/underscore.js"></script>
+ <script type="text/javascript" src="_static/doctools.js"></script>
+ <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+ <link rel="index" title="Index" href="genindex.html" />
+ <link rel="search" title="Search" href="search.html" />
+ <link rel="next" title="Reading and Writing the Apache Parquet Format" href="parquet.html" />
+ <link rel="prev" title="Filesystem Interfaces" href="filesystems.html" />
+ </head>
+ <body role="document">
+ <div class="related" role="navigation" aria-label="related navigation">
+ <h3>Navigation</h3>
+ <ul>
+ <li class="right" style="margin-right: 10px">
+ <a href="genindex.html" title="General Index"
+ accesskey="I">index</a></li>
+ <li class="right" >
+ <a href="parquet.html" title="Reading and Writing the Apache Parquet Format"
+ accesskey="N">next</a> |</li>
+ <li class="right" >
+ <a href="filesystems.html" title="Filesystem Interfaces"
+ accesskey="P">previous</a> |</li>
+ <li class="nav-item nav-item-0"><a href="index.html">pyarrow documentation</a> »</li>
+ </ul>
+ </div>
+ <div class="sphinxsidebar" role="navigation" aria-label="main navigation">
+ <div class="sphinxsidebarwrapper">
+ <h3><a href="index.html">Table Of Contents</a></h3>
+ <ul>
+<li><a class="reference internal" href="#">Using PyArrow with pandas</a><ul>
+<li><a class="reference internal" href="#dataframes">DataFrames</a></li>
+<li><a class="reference internal" href="#series">Series</a></li>
+<li><a class="reference internal" href="#type-differences">Type differences</a><ul>
+<li><a class="reference internal" href="#pandas-arrow-conversion">pandas -> Arrow Conversion</a></li>
+<li><a class="reference internal" href="#arrow-pandas-conversion">Arrow -> pandas Conversion</a></li>
</ul>
</li>
</ul>
</li>
-<li class="toctree-l1"><a class="reference internal" href="filesystems.html">File interfaces and Memory Maps</a></li>
-<li class="toctree-l1"><a class="reference internal" href="parquet.html">Reading/Writing Parquet files</a></li>
-<li class="toctree-l1"><a class="reference internal" href="api.html">API Reference</a></li>
-<li class="toctree-l1"><a class="reference internal" href="getting_involved.html">Getting Involved</a></li>
-</ul>
-<p class="caption"><span class="caption-text">Additional Features</span></p>
-<ul>
-<li class="toctree-l1"><a class="reference internal" href="jemalloc.html">jemalloc MemoryPool</a></li>
</ul>
-
-
+ <h4>Previous topic</h4>
+ <p class="topless"><a href="filesystems.html"
+ title="previous chapter">Filesystem Interfaces</a></p>
+ <h4>Next topic</h4>
+ <p class="topless"><a href="parquet.html"
+ title="next chapter">Reading and Writing the Apache Parquet Format</a></p>
+ <div role="note" aria-label="source link">
+ <h3>This Page</h3>
+ <ul class="this-page-menu">
+ <li><a href="_sources/pandas.rst.txt"
+ rel="nofollow">Show Source</a></li>
+ </ul>
+ </div>
+<div id="searchbox" style="display: none" role="search">
+ <h3>Quick search</h3>
+ <form class="search" action="search.html" method="get">
+ <div><input type="text" name="q" /></div>
+ <div><input type="submit" value="Go" /></div>
+ <input type="hidden" name="check_keywords" value="yes" />
+ <input type="hidden" name="area" value="default" />
+ </form>
+</div>
+<script type="text/javascript">$('#searchbox').show(0);</script>
</div>
</div>
- </nav>
-
- <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">
-
-
- <nav class="wy-nav-top" role="navigation" aria-label="top navigation">
-
- <i data-toggle="wy-nav-top" class="fa fa-bars"></i>
- <a href="index.html">pyarrow</a>
-
- </nav>
-
-
-
- <div class="wy-nav-content">
- <div class="rst-content">
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-<div role="navigation" aria-label="breadcrumbs navigation">
-
- <ul class="wy-breadcrumbs">
-
- <li><a href="index.html">Docs</a> »</li>
-
- <li>Pandas Interface</li>
-
-
- <li class="wy-breadcrumbs-aside">
-
+ <div class="document">
+ <div class="documentwrapper">
+ <div class="bodywrapper">
+ <div class="body" role="main">
- <a href="_sources/pandas.rst.txt" rel="nofollow"> View page source</a>
-
-
- </li>
-
- </ul>
-
-
- <hr/>
-</div>
- <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
- <div itemprop="articleBody">
-
- <div class="section" id="pandas-interface">
-<h1>Pandas Interface<a class="headerlink" href="#pandas-interface" title="Permalink to this headline">¶</a></h1>
-<p>To interface with Pandas, PyArrow provides various conversion routines to
-consume Pandas structures and convert back to them.</p>
+ <div class="section" id="using-pyarrow-with-pandas">
+<h1>Using PyArrow with pandas<a class="headerlink" href="#using-pyarrow-with-pandas" title="Permalink to this headline">¶</a></h1>
+<p>To interface with pandas, PyArrow provides various conversion routines to
+consume pandas structures and convert back to them.</p>
<div class="section" id="dataframes">
<h2>DataFrames<a class="headerlink" href="#dataframes" title="Permalink to this headline">¶</a></h2>
-<p>The equivalent to a Pandas DataFrame in Arrow is a <code class="xref py py-class docutils literal"><span class="pre">pyarrow.table.Table</span></code>.
-Both consist of a set of named columns of equal length. While Pandas only
+<p>The equivalent to a pandas DataFrame in Arrow is a <code class="xref py py-class docutils literal"><span class="pre">pyarrow.table.Table</span></code>.
+Both consist of a set of named columns of equal length. While pandas only
supports flat columns, the Table also provides nested columns. It can therefore
represent more data than a DataFrame, so a full conversion is not always possible.</p>
<p>Conversion from a Table to a DataFrame is done by calling
<code class="xref py py-meth docutils literal"><span class="pre">pyarrow.table.Table.to_pandas()</span></code>. The inverse is then achieved by using
<code class="xref py py-meth docutils literal"><span class="pre">pyarrow.Table.from_pandas()</span></code>. This conversion routine provides the
convenience parameter <code class="docutils literal"><span class="pre">timestamps_to_ms</span></code>. Although Arrow supports timestamps of
-different resolutions, Pandas only supports nanosecond timestamps and most
+different resolutions, pandas only supports nanosecond timestamps and most
other systems (e.g. Parquet) only work on millisecond timestamps. This parameter
-can be used to already do the time conversion during the Pandas to Arrow
+can be used to perform the time conversion during the pandas to Arrow
conversion.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">pyarrow</span> <span class="kn">as</span> <span class="nn">pa</span>
<span class="kn">import</span> <span class="nn">pandas</span> <span class="kn">as</span> <span class="nn">pd</span>
<span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">({</span><span class="s2">"a"</span><span class="p">:</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">]})</span>
-<span class="c1"># Convert from Pandas to Arrow</span>
+<span class="c1"># Convert from pandas to Arrow</span>
<span class="n">table</span> <span class="o">=</span> <span class="n">pa</span><span class="o">.</span><span class="n">Table</span><span class="o">.</span><span class="n">from_pandas</span><span class="p">(</span><span class="n">df</span><span class="p">)</span>
-<span class="c1"># Convert back to Pandas</span>
+<span class="c1"># Convert back to pandas</span>
<span class="n">df_new</span> <span class="o">=</span> <span class="n">table</span><span class="o">.</span><span class="n">to_pandas</span><span class="p">()</span>
</pre></div>
</div>
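<p>The resolution difference described above comes down to integer arithmetic on
epoch values. The following is a minimal conceptual sketch in plain Python, not
PyArrow code: the helper name <code class="docutils literal"><span class="pre">ns_to_ms</span></code> is hypothetical, and it assumes the
coarser value is obtained by truncating integer division, which may differ from
what the library does internally.</p>

```python
# Conceptual sketch only: plain integers stand in for timestamp values.
NANOS_PER_MILLI = 1_000_000


def ns_to_ms(ts_ns):
    """Truncate a nanosecond epoch timestamp to millisecond resolution."""
    return ts_ns // NANOS_PER_MILLI


ts_ns = 1_494_219_167_123_456_789          # nanoseconds since the Unix epoch
ts_ms = ns_to_ms(ts_ns)                    # milliseconds since the Unix epoch
lost_ns = ts_ns - ts_ms * NANOS_PER_MILLI  # sub-millisecond part that is dropped
print(ts_ms, lost_ns)
```

<p>Any sub-millisecond component is lost in this direction, so a round trip back
to nanoseconds does not necessarily reproduce the original value.</p>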
</div>
<div class="section" id="series">
<h2>Series<a class="headerlink" href="#series" title="Permalink to this headline">¶</a></h2>
-<p>In Arrow, the most similar structure to a Pandas Series is an Array.
+<p>In Arrow, the most similar structure to a pandas Series is an Array.
It is a vector that contains data of a single type in linear memory. You can
-convert a Pandas Series to an Arrow Array using <code class="xref py py-meth docutils literal"><span class="pre">pyarrow.array.from_pandas_series()</span></code>.
+convert a pandas Series to an Arrow Array using <code class="xref py py-meth docutils literal"><span class="pre">pyarrow.array.from_pandas_series()</span></code>.
As Arrow Arrays are always nullable, you can supply an optional mask using
the <code class="docutils literal"><span class="pre">mask</span></code> parameter to mark all null-entries.</p>
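<p>The idea behind a null mask can be sketched without PyArrow. In the
illustration below, the function <code class="docutils literal"><span class="pre">apply_null_mask</span></code> is hypothetical, and it
assumes that a <code class="docutils literal"><span class="pre">True</span></code> entry in the mask marks the value at that position as
null.</p>

```python
# Conceptual sketch only: Python lists stand in for a Series and its mask.
def apply_null_mask(values, mask):
    """Return a copy of values with None wherever the mask entry is True."""
    if len(values) != len(mask):
        raise ValueError("values and mask must have the same length")
    return [None if is_null else value for value, is_null in zip(values, mask)]


values = [1, 2, 3]
mask = [False, True, False]   # True marks a null entry (assumed convention)
masked = apply_null_mask(values, mask)
print(masked)
```

<p>The mask is kept separate from the values, which is why a column of any type
can carry nulls without reserving a sentinel value such as NaN.</p>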
</div>
<div class="section" id="type-differences">
<h2>Type differences<a class="headerlink" href="#type-differences" title="Permalink to this headline">¶</a></h2>
-<p>With the current design of Pandas and Arrow, it is not possible to convert all
-column types unmodified. One of the main issues here is that Pandas has no
+<p>With the current design of pandas and Arrow, it is not possible to convert all
+column types unmodified. One of the main issues here is that pandas has no
support for nullable columns of arbitrary type. Also <code class="docutils literal"><span class="pre">datetime64</span></code> is currently
fixed to nanosecond resolution. On the other hand, Arrow might still be missing
support for some types.</p>
<div class="section" id="pandas-arrow-conversion">
-<h3>Pandas -> Arrow Conversion<a class="headerlink" href="#pandas-arrow-conversion" title="Permalink to this headline">¶</a></h3>
+<h3>pandas -> Arrow Conversion<a class="headerlink" href="#pandas-arrow-conversion" title="Permalink to this headline">¶</a></h3>
<table border="1" class="docutils">
<colgroup>
<col width="48%" />
<col width="52%" />
</colgroup>
<thead valign="bottom">
-<tr class="row-odd"><th class="head">Source Type (Pandas)</th>
+<tr class="row-odd"><th class="head">Source Type (pandas)</th>
<th class="head">Destination Type (Arrow)</th>
</tr>
</thead>
@@ -255,7 +178,7 @@ support for some types.</p>
</table>
</div>
<div class="section" id="arrow-pandas-conversion">
-<h3>Arrow -> Pandas Conversion<a class="headerlink" href="#arrow-pandas-conversion" title="Permalink to this headline">¶</a></h3>
+<h3>Arrow -> pandas Conversion<a class="headerlink" href="#arrow-pandas-conversion" title="Permalink to this headline">¶</a></h3>
<table border="1" class="docutils">
<colgroup>
<col width="40%" />
@@ -263,7 +186,7 @@ support for some types.</p>
</colgroup>
<thead valign="bottom">
<tr class="row-odd"><th class="head">Source Type (Arrow)</th>
-<th class="head">Destination Type (Pandas)</th>
+<th class="head">Destination Type (pandas)</th>
</tr>
</thead>
<tbody valign="top">
@@ -304,76 +227,29 @@ support for some types.</p>
</div>
- </div>
- <div class="articleComments">
-
- </div>
</div>
- <footer>
-
- <div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
-
- <a href="filesystems.html" class="btn btn-neutral float-right" title="File interfaces and Memory Maps" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right"></span></a>
-
-
- <a href="install.html" class="btn btn-neutral" title="Install PyArrow" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left"></span> Previous</a>
-
- </div>
-
-
- <hr/>
-
- <div role="contentinfo">
- <p>
- © Copyright 2016 Apache Software Foundation.
-
- </p>
- </div>
- Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/snide/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>.
-
-</footer>
-
</div>
</div>
-
- </section>
-
- </div>
-
-
-
-
-
- <script type="text/javascript">
- var DOCUMENTATION_OPTIONS = {
- URL_ROOT:'./',
- VERSION:'',
- COLLAPSE_INDEX:false,
- FILE_SUFFIX:'.html',
- HAS_SOURCE: true,
- SOURCELINK_SUFFIX: '.txt'
- };
- </script>
- <script type="text/javascript" src="_static/jquery.js"></script>
- <script type="text/javascript" src="_static/underscore.js"></script>
- <script type="text/javascript" src="_static/doctools.js"></script>
- <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-
-
-
-
-
- <script type="text/javascript" src="_static/js/theme.js"></script>
-
-
-
-
- <script type="text/javascript">
- jQuery(function () {
- SphinxRtdTheme.StickyNav.enable();
- });
- </script>
-
-
-</body>
+ <div class="clearer"></div>
+ </div>
+ <div class="related" role="navigation" aria-label="related navigation">
+ <h3>Navigation</h3>
+ <ul>
+ <li class="right" style="margin-right: 10px">
+ <a href="genindex.html" title="General Index"
+ >index</a></li>
+ <li class="right" >
+ <a href="parquet.html" title="Reading and Writing the Apache Parquet Format"
+ >next</a> |</li>
+ <li class="right" >
+ <a href="filesystems.html" title="Filesystem Interfaces"
+ >previous</a> |</li>
+ <li class="nav-item nav-item-0"><a href="index.html">pyarrow documentation</a> »</li>
+ </ul>
+ </div>
+ <div class="footer" role="contentinfo">
+ © Copyright 2016-2017 Apache Software Foundation.
+ Created using <a href="http://sphinx-doc.org/">Sphinx</a> 1.5.5.
+ </div>
+ </body>
</html>
\ No newline at end of file
http://git-wip-us.apache.org/repos/asf/arrow-site/blob/6360599f/docs/python/parquet.html
----------------------------------------------------------------------
diff --git a/docs/python/parquet.html b/docs/python/parquet.html
index e958986..b5263c5 100644
--- a/docs/python/parquet.html
+++ b/docs/python/parquet.html
@@ -1,305 +1,334 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-<!DOCTYPE html>
-<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
-<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
-<head>
- <meta charset="utf-8">
-
- <meta name="viewport" content="width=device-width, initial-scale=1.0">
-
- <title>Reading/Writing Parquet files — pyarrow documentation</title>
-
-
-
-
-
-
-
-
-
-
-
+<html xmlns="http://www.w3.org/1999/xhtml">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
-
-
-
-
-
- <link rel="stylesheet" href="_static/css/theme.css" type="text/css" />
-
-
-
-
-
- <link rel="index" title="Index"
- href="genindex.html"/>
- <link rel="search" title="Search" href="search.html"/>
- <link rel="top" title="pyarrow documentation" href="index.html"/>
- <link rel="next" title="API Reference" href="api.html"/>
- <link rel="prev" title="File interfaces and Memory Maps" href="filesystems.html"/>
-
-
- <script src="_static/js/modernizr.min.js"></script>
-
-</head>
-
-<body class="wy-body-for-nav" role="document">
-
-
- <div class="wy-grid-for-nav">
-
+ <title>Reading and Writing the Apache Parquet Format — pyarrow documentation</title>
- <nav data-toggle="wy-nav-shift" class="wy-nav-side">
- <div class="wy-side-scroll">
- <div class="wy-side-nav-search">
-
-
-
- <a href="index.html" class="icon icon-home"> pyarrow
-
-
-
- </a>
-
-
-
-
-
-
-
-<div role="search">
- <form id="rtd-search-form" class="wy-form" action="search.html" method="get">
- <input type="text" name="q" placeholder="Search docs" />
- <input type="hidden" name="check_keywords" value="yes" />
- <input type="hidden" name="area" value="default" />
- </form>
-</div>
-
-
- </div>
-
- <div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation">
-
-
-
-
-
-
- <p class="caption"><span class="caption-text">Getting Started</span></p>
-<ul class="current">
-<li class="toctree-l1"><a class="reference internal" href="install.html">Install PyArrow</a></li>
-<li class="toctree-l1"><a class="reference internal" href="pandas.html">Pandas Interface</a></li>
-<li class="toctree-l1"><a class="reference internal" href="filesystems.html">File interfaces and Memory Maps</a></li>
-<li class="toctree-l1 current"><a class="current reference internal" href="#">Reading/Writing Parquet files</a><ul>
-<li class="toctree-l2"><a class="reference internal" href="#reading-parquet">Reading Parquet</a></li>
-<li class="toctree-l2"><a class="reference internal" href="#writing-parquet">Writing Parquet</a></li>
+ <link rel="stylesheet" href="_static/sphinxdoc.css" type="text/css" />
+ <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
+
+ <script type="text/javascript">
+ var DOCUMENTATION_OPTIONS = {
+ URL_ROOT: './',
+ VERSION: '',
+ COLLAPSE_INDEX: false,
+ FILE_SUFFIX: '.html',
+ HAS_SOURCE: true,
+ SOURCELINK_SUFFIX: '.txt'
+ };
+ </script>
+ <script type="text/javascript" src="_static/jquery.js"></script>
+ <script type="text/javascript" src="_static/underscore.js"></script>
+ <script type="text/javascript" src="_static/doctools.js"></script>
+ <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+ <link rel="index" title="Index" href="genindex.html" />
+ <link rel="search" title="Search" href="search.html" />
+ <link rel="next" title="API Reference" href="api.html" />
+ <link rel="prev" title="Using PyArrow with pandas" href="pandas.html" />
+ </head>
+ <body role="document">
+ <div class="related" role="navigation" aria-label="related navigation">
+ <h3>Navigation</h3>
+ <ul>
+ <li class="right" style="margin-right: 10px">
+ <a href="genindex.html" title="General Index"
+ accesskey="I">index</a></li>
+ <li class="right" >
+ <a href="api.html" title="API Reference"
+ accesskey="N">next</a> |</li>
+ <li class="right" >
+ <a href="pandas.html" title="Using PyArrow with pandas"
+ accesskey="P">previous</a> |</li>
+ <li class="nav-item nav-item-0"><a href="index.html">pyarrow documentation</a> »</li>
+ </ul>
+ </div>
+ <div class="sphinxsidebar" role="navigation" aria-label="main navigation">
+ <div class="sphinxsidebarwrapper">
+ <h3><a href="index.html">Table Of Contents</a></h3>
+ <ul>
+<li><a class="reference internal" href="#">Reading and Writing the Apache Parquet Format</a><ul>
+<li><a class="reference internal" href="#obtaining-pyarrow-with-parquet-support">Obtaining PyArrow with Parquet Support</a></li>
+<li><a class="reference internal" href="#reading-and-writing-single-files">Reading and Writing Single Files</a></li>
+<li><a class="reference internal" href="#finer-grained-reading-and-writing">Finer-grained Reading and Writing</a></li>
+<li><a class="reference internal" href="#compression-encoding-and-file-compatibility">Compression, Encoding, and File Compatibility</a></li>
+<li><a class="reference internal" href="#reading-multiples-files-and-partitioned-datasets">Reading Multiple Files and Partitioned Datasets</a></li>
+<li><a class="reference internal" href="#multithreaded-reads">Multithreaded Reads</a></li>
</ul>
</li>
-<li class="toctree-l1"><a class="reference internal" href="api.html">API Reference</a></li>
-<li class="toctree-l1"><a class="reference internal" href="getting_involved.html">Getting Involved</a></li>
-</ul>
-<p class="caption"><span class="caption-text">Additional Features</span></p>
-<ul>
-<li class="toctree-l1"><a class="reference internal" href="jemalloc.html">jemalloc MemoryPool</a></li>
</ul>
-
-
+ <h4>Previous topic</h4>
+ <p class="topless"><a href="pandas.html"
+ title="previous chapter">Using PyArrow with pandas</a></p>
+ <h4>Next topic</h4>
+ <p class="topless"><a href="api.html"
+ title="next chapter">API Reference</a></p>
+ <div role="note" aria-label="source link">
+ <h3>This Page</h3>
+ <ul class="this-page-menu">
+ <li><a href="_sources/parquet.rst.txt"
+ rel="nofollow">Show Source</a></li>
+ </ul>
+ </div>
+<div id="searchbox" style="display: none" role="search">
+ <h3>Quick search</h3>
+ <form class="search" action="search.html" method="get">
+ <div><input type="text" name="q" /></div>
+ <div><input type="submit" value="Go" /></div>
+ <input type="hidden" name="check_keywords" value="yes" />
+ <input type="hidden" name="area" value="default" />
+ </form>
+</div>
+<script type="text/javascript">$('#searchbox').show(0);</script>
</div>
</div>
- </nav>
-
- <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">
-
-
- <nav class="wy-nav-top" role="navigation" aria-label="top navigation">
-
- <i data-toggle="wy-nav-top" class="fa fa-bars"></i>
- <a href="index.html">pyarrow</a>
-
- </nav>
-
-
-
- <div class="wy-nav-content">
- <div class="rst-content">
-
-
-
-
-
-
-
-
+ <div class="document">
+ <div class="documentwrapper">
+ <div class="bodywrapper">
+ <div class="body" role="main">
+
+ <div class="section" id="reading-and-writing-the-apache-parquet-format">
+<span id="parquet"></span><h1>Reading and Writing the Apache Parquet Format<a class="headerlink" href="#reading-and-writing-the-apache-parquet-format" title="Permalink to this headline">¶</a></h1>
+<p>The <a class="reference external" href="http://parquet.apache.org/">Apache Parquet</a> project provides a
+standardized open-source columnar storage format for use in data analysis
+systems. It was created originally for use in <a class="reference external" href="http://hadoop.apache.org/">Apache Hadoop</a> with systems like <a class="reference external" href="http://drill.apache.org">Apache Drill</a>, <a class="reference external" href="http://hive.apache.org">Apache Hive</a>, <a class="reference external" href="http://impala.apache.org">Apache
+Impala (incubating)</a>, and <a class="reference external" href="http://spark.apache.org">Apache Spark</a> adopting it as a shared standard for high
+performance data IO.</p>
+<p>Apache Arrow is an ideal in-memory transport layer for data that is being read
+or written with Parquet files. We have been concurrently developing the <a class="reference external" href="http://github.com/apache/parquet-cpp">C++
+implementation of Apache Parquet</a>,
+which includes a native, multithreaded C++ adapter to and from in-memory Arrow
+data. PyArrow includes Python bindings to this code, which thus enables reading
+and writing Parquet files with pandas as well.</p>
+<div class="section" id="obtaining-pyarrow-with-parquet-support">
+<h2>Obtaining PyArrow with Parquet Support<a class="headerlink" href="#obtaining-pyarrow-with-parquet-support" title="Permalink to this headline">¶</a></h2>
+<p>If you installed <code class="docutils literal"><span class="pre">pyarrow</span></code> with pip or conda, it should be built with Parquet
+support bundled:</p>
+<div class="highlight-ipython"><div class="highlight"><pre><span></span><span class="gp">In [1]: </span><span class="kn">import</span> <span class="nn">pyarrow.parquet</span> <span class="kn">as</span> <span class="nn">pq</span>
+</pre></div>
+</div>
+<p>If you are building <code class="docutils literal"><span class="pre">pyarrow</span></code> from source, you must also build <a class="reference external" href="http://github.com/apache/parquet-cpp">parquet-cpp</a> and enable the Parquet extensions when
+building <code class="docutils literal"><span class="pre">pyarrow</span></code>. See the <a class="reference internal" href="development.html#development"><span class="std std-ref">Development</span></a> page for more
+details.</p>
+</div>
+<div class="section" id="reading-and-writing-single-files">
+<h2>Reading and Writing Single Files<a class="headerlink" href="#reading-and-writing-single-files" title="Permalink to this headline">¶</a></h2>
+<p>The functions <a class="reference internal" href="generated/pyarrow.parquet.read_table.html#pyarrow.parquet.read_table" title="pyarrow.parquet.read_table"><code class="xref py py-func docutils literal"><span class="pre">read_table()</span></code></a> and <a class="reference internal" href="generated/pyarrow.parquet.write_table.html#pyarrow.parquet.write_table" title="pyarrow.parquet.write_table"><code class="xref py py-func docutils literal"><span class="pre">write_table()</span></code></a>
+read and write <a class="reference internal" href="data.html#data-table"><span class="std std-ref">pyarrow.Table</span></a> objects, respectively.</p>
+<p>Let’s look at a simple table:</p>
+<div class="highlight-ipython"><div class="highlight"><pre><span></span><span class="gp">In [2]: </span><span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
+<span class="gp">In [3]: </span><span class="kn">import</span> <span class="nn">pandas</span> <span class="kn">as</span> <span class="nn">pd</span>
+<span class="gp">In [4]: </span><span class="kn">import</span> <span class="nn">pyarrow</span> <span class="kn">as</span> <span class="nn">pa</span>
+<span class="gp">In [5]: </span><span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">({</span><span class="s1">'one'</span><span class="p">:</span> <span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="mf">2.5</span><span class="p">],</span>
+<span class="gp"> ...: </span> <span class="s1">'two'</span><span class="p">:</span> <span class="p">[</span><span class="s1">'foo'</span><span class="p">,</span> <span class="s1">'bar'</span><span class="p">,</span> <span class="s1">'baz'</span><span class="p">],</span>
+<span class="gp"> ...: </span> <span class="s1">'three'</span><span class="p">:</span> <span class="p">[</span><span class="bp">True</span><span class="p">,</span> <span class="bp">False</span><span class="p">,</span> <span class="bp">True</span><span class="p">]})</span>
+<span class="gp"> ...: </span>
+<span class="gp">In [6]: </span><span class="n">table</span> <span class="o">=</span> <span class="n">pa</span><span class="o">.</span><span class="n">Table</span><span class="o">.</span><span class="n">from_pandas</span><span class="p">(</span><span class="n">df</span><span class="p">)</span>
+</pre></div>
+</div>
+<p>We write this to Parquet format with <code class="docutils literal"><span class="pre">write_table</span></code>:</p>
+<div class="highlight-ipython"><div class="highlight"><pre><span></span><span class="gp">In [7]: </span><span class="kn">import</span> <span class="nn">pyarrow.parquet</span> <span class="kn">as</span> <span class="nn">pq</span>
+<span class="gp">In [8]: </span><span class="n">pq</span><span class="o">.</span><span class="n">write_table</span><span class="p">(</span><span class="n">table</span><span class="p">,</span> <span class="s1">'example.parquet'</span><span class="p">)</span>
+</pre></div>
+</div>
+<p>This creates a single Parquet file. In practice, a Parquet dataset may consist
+of many files in many directories. We can read a single file back with
+<code class="docutils literal"><span class="pre">read_table</span></code>:</p>
+<div class="highlight-ipython"><div class="highlight"><pre><span></span><span class="gp">In [9]: </span><span class="n">table2</span> <span class="o">=</span> <span class="n">pq</span><span class="o">.</span><span class="n">read_table</span><span class="p">(</span><span class="s1">'example.parquet'</span><span class="p">)</span>
+
+<span class="gp">In [10]: </span><span class="n">table2</span><span class="o">.</span><span class="n">to_pandas</span><span class="p">()</span>
+<span class="gh">Out[10]: </span><span class="go"></span>
+<span class="go"> one three two</span>
+<span class="go">0 -1.0 True foo</span>
+<span class="go">1 NaN False bar</span>
+<span class="go">2 2.5 True baz</span>
+</pre></div>
+</div>
+<p>You can pass a subset of columns to read, which can be much faster than reading
+the whole file (due to the columnar layout):</p>
+<div class="highlight-ipython"><div class="highlight"><pre><span></span><span class="gp">In [11]: </span><span class="n">pq</span><span class="o">.</span><span class="n">read_table</span><span class="p">(</span><span class="s1">'example.parquet'</span><span class="p">,</span> <span class="n">columns</span><span class="o">=</span><span class="p">[</span><span class="s1">'one'</span><span class="p">,</span> <span class="s1">'three'</span><span class="p">])</span>
+<span class="gh">Out[11]: </span><span class="go"></span>
+<span class="go">pyarrow.Table</span>
+<span class="go">one: double</span>
+<span class="go">three: bool</span>
+<span class="go">-- metadata --</span>
+</pre></div>
+</div>
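<p>Why a column subset is cheaper to read can be illustrated with a toy columnar
store. This is a conceptual sketch in plain Python, not the Parquet reader: the
dictionary layout and the function <code class="docutils literal"><span class="pre">read_columns</span></code> are hypothetical, and
<code class="docutils literal"><span class="pre">None</span></code> stands in for the missing value.</p>

```python
# Conceptual sketch only: each column is stored as its own contiguous list.
columnar_store = {
    'one': [-1.0, None, 2.5],
    'two': ['foo', 'bar', 'baz'],
    'three': [True, False, True],
}


def read_columns(store, columns):
    """Fetch only the requested columns; other columns are never touched."""
    return {name: store[name] for name in columns}


subset = read_columns(columnar_store, ['one', 'three'])
print(subset)
```

<p>In a row-oriented layout every record would have to be visited to extract
the same two fields, whereas here the <code class="docutils literal"><span class="pre">two</span></code> column is skipped entirely.</p>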
+<p>We need not use a string to specify the origin of the file. It can be any of:</p>
+<ul class="simple">
+<li>A file path as a string</li>
+<li>A <a class="reference internal" href="memory.html#io-native-file"><span class="std std-ref">NativeFile</span></a> from PyArrow</li>
+<li>A Python file object</li>
+</ul>
+<p>In general, a Python file object will have the worst read performance, while a
+string file path or an instance of <code class="xref py py-class docutils literal"><span class="pre">NativeFile</span></code> (especially memory
+maps) will perform the best.</p>
+</div>
+<div class="section" id="finer-grained-reading-and-writing">
+<h2>Finer-grained Reading and Writing<a class="headerlink" href="#finer-grained-reading-and-writing" title="Permalink to this headline">¶</a></h2>
+<p><code class="docutils literal"><span class="pre">read_table</span></code> uses the <a class="reference internal" href="generated/pyarrow.parquet.ParquetFile.html#pyarrow.parquet.ParquetFile" title="pyarrow.parquet.ParquetFile"><code class="xref py py-class docutils literal"><span class="pre">ParquetFile</span></code></a> class, which has other features:</p>
+<div class="highlight-ipython"><div class="highlight"><pre><span></span><span class="gp">In [12]: </span><span class="n">parquet_file</span> <span class="o">=</span> <span class="n">pq</span><span class="o">.</span><span class="n">ParquetFile</span><span class="p">(</span><span class="s1">'example.parquet'</span><span class="p">)</span>
+
+<span class="gp">In [13]: </span><span class="n">parquet_file</span><span class="o">.</span><span class="n">metadata</span>
+<span class="gh">Out[13]: </span><span class="go"></span>
+<span class="go"><pyarrow._parquet.FileMetaData object at 0x2b0bce0dd9a8></span>
+<span class="go"> created_by: parquet-cpp version 1.0.0</span>
+<span class="go"> num_columns: 3</span>
+<span class="go"> num_rows: 3</span>
+<span class="go"> num_row_groups: 1</span>
+<span class="go"> format_version: 1.0</span>
+<span class="go"> serialized_size: 252</span>
+
+<span class="gp">In [14]: </span><span class="n">parquet_file</span><span class="o">.</span><span class="n">schema</span>
+<span class="go">