You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/08/09 12:54:57 UTC

[GitHub] [arrow-cookbook] lidavidm commented on a change in pull request #22: Initial C++ cookbook

lidavidm commented on a change in pull request #22:
URL: https://github.com/apache/arrow-cookbook/pull/22#discussion_r685159995



##########
File path: cpp/CONTRIBUTING.md
##########
@@ -0,0 +1,184 @@
+Bulding the C++ Cookbook
+========================
+
+The C++ cookbook combines output from a set of C++ test programs with
+an RST document tree rendered with sphinx.
+
+Running ``make py`` from the cookbook root directory (the one where
+the ``README.rst`` exists) will install all necessary dependencies,
+run the tests to generate the output, and will compile the cookbook
+to HTML.
+
+You will see the compiled result inside the ``build/cpp`` directory.
+
+The above process requires conda to be installed and is primarily
+intended for build systems.  See below for more information on setting
+up a development environment for developing recipes.
+
+Developing C++ Recipes
+======================
+
+Every recipe is a combination of prose written in **reStructuredText**
+format using the `Sphinx <https://www.sphinx-doc.org/>`_ documentation
+system and a snippet of a googletest test.
+
+New recipes can be added to one of the existing ``.rst`` files if
+they suit that section or you can create new sections by adding
+additional ``.rst`` files in the ``source`` directory. You just
+need to remember to add them to the ``index.rst`` file in the
+``toctree`` for them to become visible.
+
+Referencing a C++ Snippet
+-------------------------
+
+Most recipes will reference a snippet of C++ code.  For simplicity
+a custom `recipe` directive htat can be used like so:
+
+```
+.. recipe:: ../code/creating_arrow_objects.cc CreatingArrays
+  :caption: Creating an array from C++ primitives
+  :dedent: 4
+```
+
+Each `recipe` directive has two requried arguments.  The first is
+a path to the file containing the source file containing the snippet
+and the second is the name of the snippet and must correspond to a
+set of CreateRecipe/EndRecipe calls in the source file.
+
+The directive will generate two code blocks in the cookbook.  The first
+code block will contain the source code itself and will be annotated
+with any (optional) caption specified on the recipe directive.  The
+second block will contain the test output.
+
+The optional `dedent` argument should be used to remove leading white
+space from your source code.
+
+Writing a C++ Snippet
+---------------------
+
+Each snippet source file contains a set of
+[googletest](https://github.com/google/googletest) tests.  Feel free to
+use any googletest features needed to help setup and verify your test.
+To reference a snippet you need to surround it in `BeginRecipe` and
+`EndRecipe` calls.  For example:
+
+```
+StartRecipe("CreatingArrays");
+arrow::Int32Builder builder;
+ASSERT_OK(builder.Append(1));
+ASSERT_OK(builder.Append(2));
+ASSERT_OK(builder.Append(3));
+ASSERT_OK_AND_ASSIGN(shared_ptr<arrow::Array> arr, builder.Finish())
+rout << arr->ToString() << endl;
+EndRecipe("CreatingArrays");
+```
+
+The variable `rout` is set to a `std::ostream` instance that is used to
+capture test output.  Anything output to `rout` will show up in the recipe
+output block when the recipe is rendered into the cookbook.
+
+Referencing Arrow C++ Documentation
++++++++++++++++++++++++++++++++++++
+
+The Arrow project has its own documentation for the C++ implementation that
+is hosted at https://arrow.apache.org/docs/cpp/index.html.  Fortunately,
+this documentation is also built with Sphinx and so we can use the extension
+`intersphinx` to reference sections of this documentation.  To do so simply
+write a standard Sphinx reference like so:
+
+```
+Typed subclasses of :cpp:class:`arrow::ArrayBuilder` make it easy
+to efficiently create Arrow arrays from existing C++ data
+```
+
+A helpful command is
+`python -msphinx.ext.intersphinx https://arrow.apache.org/docs/objects.inv`
+which will list all of the possible targets to link to.
+
+Development Workflow
+====================
+
+Running `make` at the top level can be rather slow as it will rebuild the
+entire environment each time.  It is primarily intended for use in CI and
+requires you to have conda installed.
+
+For recipe development you are encouraged to create your own out-of-source
+cmake build.  For example:
+
+```
+mkdir cpp/code/build
+cd cpp/code/build
+cmake ../code -DCMAKE_BUILD_TYPE=Debug
+cmake --build .
+ctest
+```
+
+Then you can rerun all of the tests with `ctest` and you can rebuild and
+rerun individual tests much more quickly with something like
+`cmake --build . --target creating_arrow_objects && ctest creating_arrow_objects`.
+Everytime the cmake build is run it will update the recipe output file
+referenced by the sphinx build so after rerunning a test you can visualize the
+output by running `make html` in the `cpp` directory.
+
+Using Conda
++++++++++++
+
+If you are using conda then there is file `cpp/requirements.yml` which can be
+used to create an environment for recipe development with the command:
+
+```
+conda env create -n cookbook-cpp --file cpp/requirements.yml
+```
+
+Development Philosophy
+======================
+
+Everything is the Cookbook
+++++++++++++++++++++++++++
+
+The entire document should serve as an example of how to use Arrow C++, not just the
+referenced snippets.  This means that the below style rules and guidelines apply to
+source code that is not referenced by the cookbook itself.
+
+Style
++++++
+
+This cookbook follows the same style rules as Arrow C++ which is the Google style
+guide with a few exceptions described
+[here](https://arrow.apache.org/docs/developers/cpp/development.html#code-style-linting-and-ci)
+
+Simple
+++++++
+
+The examples should be as simple as possible.  If complex code (e.g. templates) can be
+used to do something more efficiently then there should be a simple, inefficient version
+alongside the more complex version.
+
+Do not use `auto` in any of the templates unless you must (e.g. lambdas).  Cookbook
+viewers will be using a browser, not an IDE, and it is not always simple to determine
+the inferred type.
+
+For conciseness, `using namespace std;` should be used.

Review comment:
       I'm not sure the conciseness buys us much compared with potential name clashes/confusion over where symbols come from. Also, I don't see this rule in the Arrow source and in fact the only instance of this declaration is in vendored third-party code.

##########
File path: cpp/CONTRIBUTING.md
##########
@@ -0,0 +1,184 @@
+Bulding the C++ Cookbook
+========================
+
+The C++ cookbook combines output from a set of C++ test programs with
+an RST document tree rendered with sphinx.
+
+Running ``make py`` from the cookbook root directory (the one where
+the ``README.rst`` exists) will install all necessary dependencies,
+run the tests to generate the output, and will compile the cookbook
+to HTML.
+
+You will see the compiled result inside the ``build/cpp`` directory.
+
+The above process requires conda to be installed and is primarily
+intended for build systems.  See below for more information on setting
+up a development environment for developing recipes.
+
+Developing C++ Recipes
+======================
+
+Every recipe is a combination of prose written in **reStructuredText**
+format using the `Sphinx <https://www.sphinx-doc.org/>`_ documentation
+system and a snippet of a googletest test.
+
+New recipes can be added to one of the existing ``.rst`` files if
+they suit that section or you can create new sections by adding
+additional ``.rst`` files in the ``source`` directory. You just
+need to remember to add them to the ``index.rst`` file in the
+``toctree`` for them to become visible.
+
+Referencing a C++ Snippet
+-------------------------
+
+Most recipes will reference a snippet of C++ code.  For simplicity
+a custom `recipe` directive htat can be used like so:
+
+```
+.. recipe:: ../code/creating_arrow_objects.cc CreatingArrays
+  :caption: Creating an array from C++ primitives
+  :dedent: 4
+```
+
+Each `recipe` directive has two requried arguments.  The first is
+a path to the file containing the source file containing the snippet
+and the second is the name of the snippet and must correspond to a
+set of CreateRecipe/EndRecipe calls in the source file.
+
+The directive will generate two code blocks in the cookbook.  The first
+code block will contain the source code itself and will be annotated
+with any (optional) caption specified on the recipe directive.  The
+second block will contain the test output.
+
+The optional `dedent` argument should be used to remove leading white
+space from your source code.
+
+Writing a C++ Snippet
+---------------------
+
+Each snippet source file contains a set of
+[googletest](https://github.com/google/googletest) tests.  Feel free to
+use any googletest features needed to help setup and verify your test.
+To reference a snippet you need to surround it in `BeginRecipe` and
+`EndRecipe` calls.  For example:
+
+```
+StartRecipe("CreatingArrays");
+arrow::Int32Builder builder;
+ASSERT_OK(builder.Append(1));

Review comment:
       We should note that macros like ASSERT_OK come from Arrow (and ideally link to documentation for them).

##########
File path: cpp/CONTRIBUTING.md
##########
@@ -0,0 +1,184 @@
+Bulding the C++ Cookbook
+========================
+
+The C++ cookbook combines output from a set of C++ test programs with
+an RST document tree rendered with sphinx.
+
+Running ``make py`` from the cookbook root directory (the one where
+the ``README.rst`` exists) will install all necessary dependencies,
+run the tests to generate the output, and will compile the cookbook
+to HTML.
+
+You will see the compiled result inside the ``build/cpp`` directory.
+
+The above process requires conda to be installed and is primarily
+intended for build systems.  See below for more information on setting
+up a development environment for developing recipes.
+
+Developing C++ Recipes
+======================
+
+Every recipe is a combination of prose written in **reStructuredText**
+format using the `Sphinx <https://www.sphinx-doc.org/>`_ documentation
+system and a snippet of a googletest test.
+
+New recipes can be added to one of the existing ``.rst`` files if
+they suit that section or you can create new sections by adding
+additional ``.rst`` files in the ``source`` directory. You just
+need to remember to add them to the ``index.rst`` file in the
+``toctree`` for them to become visible.
+
+Referencing a C++ Snippet
+-------------------------
+
+Most recipes will reference a snippet of C++ code.  For simplicity
+a custom `recipe` directive htat can be used like so:

Review comment:
       ```suggestion
   a custom `recipe` directive that can be used like so:
   ```

##########
File path: build/index.html
##########
@@ -40,6 +40,7 @@ <h1>Apache Arrow Cookbook</h1>
 		<ul>
 			<li><a href="py/index.html">Python Cookbook</a></li>
 			<li><a href="r/index.html">R Cookbook</a></li>
+			<li><a href="cpp/index.html">C++ Cookbook</a></li>

Review comment:
       nit: why not alphabetically order?

##########
File path: cpp/code/CMakeLists.txt
##########
@@ -0,0 +1,47 @@
+cmake_minimum_required(VERSION 3.19)
+project(arrow-cookbook)
+
+set(CMAKE_CXX_STANDARD 17)
+
+# Add googletest
+include(FetchContent)
+FetchContent_Declare(
+  googletest
+  GIT_REPOSITORY https://github.com/google/googletest.git
+  GIT_TAG        e2239ee6043f73722e7aa812a459f54a28552929 # release-1.11.0
+)
+# For Windows: Prevent overriding the parent project's compiler/linker settings
+set(gtest_force_shared_crt ON CACHE BOOL "" FORCE)
+FetchContent_MakeAvailable(googletest)
+
+# Add Arrow
+find_package(Arrow REQUIRED)
+
+# Create test targets
+enable_testing()
+include(GoogleTest)
+add_executable(
+        creating_arrow_objects

Review comment:
       Do we want to set up cmake-format (eventually)? At least, there's inconsistent indentation in this file.

##########
File path: cpp/source/conf.py
##########
@@ -0,0 +1,72 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Configuration file for the Sphinx documentation builder.
+#
+# This file only contains a selection of the most common options. For a full
+# list see the documentation:
+# https://www.sphinx-doc.org/en/master/usage/configuration.html
+
+# -- Path setup --------------------------------------------------------------
+
+# If extensions (or modules to document with autodoc) are in another directory,
+# add these directories to sys.path here. If the directory is relative to the
+# documentation root, use os.path.abspath to make it absolute, like shown here.
+#
+import os
+import sys
+sys.path.append(os.path.abspath("../ext"))
+
+
+# -- Project information -----------------------------------------------------
+
+project = 'Apache Arrow C++ Cookbook'
+copyright = '2021, Apache Software Foundation'
+author = 'The Apache Software Foundation'
+
+
+# -- General configuration ---------------------------------------------------
+
+# Add any Sphinx extension module names here, as strings. They can be
+# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
+# ones.
+extensions = [
+    "sphinx.ext.intersphinx", "recipeext"
+]
+
+# Add any paths that contain templates here, relative to this directory.
+templates_path = ['_templates']
+
+# List of patterns, relative to source directory, that match files and
+# directories to ignore when looking for source files.
+# This pattern also affects html_static_path and html_extra_path.
+exclude_patterns = []
+
+
+# -- Options for HTML output -------------------------------------------------
+
+# The theme to use for HTML and HTML Help pages.  See the documentation for
+# a list of builtin themes.
+#
+html_theme = 'alabaster'

Review comment:
       Should we (eventually) use the same style as the Arrow docs?

##########
File path: cpp/code/creating_arrow_objects.cc
##########
@@ -0,0 +1,57 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <arrow/api.h>
+#include <gtest/gtest.h>
+
+#include "common.h"
+
+using namespace std;
+
+TEST(CreatingArrowObjects, CreateArrays) {
+  StartRecipe("CreatingArrays");
+  arrow::Int32Builder builder;
+  ASSERT_OK(builder.Append(1));
+  ASSERT_OK(builder.Append(2));
+  ASSERT_OK(builder.Append(3));
+  ASSERT_OK_AND_ASSIGN(shared_ptr<arrow::Array> arr, builder.Finish())

Review comment:
       ```suggestion
     ASSERT_OK_AND_ASSIGN(shared_ptr<arrow::Array> arr, builder.Finish());
   ```

##########
File path: cpp/CONTRIBUTING.md
##########
@@ -0,0 +1,184 @@
+Bulding the C++ Cookbook
+========================
+
+The C++ cookbook combines output from a set of C++ test programs with
+an RST document tree rendered with sphinx.
+
+Running ``make py`` from the cookbook root directory (the one where
+the ``README.rst`` exists) will install all necessary dependencies,
+run the tests to generate the output, and will compile the cookbook
+to HTML.
+
+You will see the compiled result inside the ``build/cpp`` directory.
+
+The above process requires conda to be installed and is primarily
+intended for build systems.  See below for more information on setting
+up a development environment for developing recipes.
+
+Developing C++ Recipes
+======================
+
+Every recipe is a combination of prose written in **reStructuredText**
+format using the `Sphinx <https://www.sphinx-doc.org/>`_ documentation
+system and a snippet of a googletest test.
+
+New recipes can be added to one of the existing ``.rst`` files if
+they suit that section or you can create new sections by adding
+additional ``.rst`` files in the ``source`` directory. You just
+need to remember to add them to the ``index.rst`` file in the
+``toctree`` for them to become visible.
+
+Referencing a C++ Snippet
+-------------------------
+
+Most recipes will reference a snippet of C++ code.  For simplicity
+a custom `recipe` directive htat can be used like so:
+
+```
+.. recipe:: ../code/creating_arrow_objects.cc CreatingArrays
+  :caption: Creating an array from C++ primitives
+  :dedent: 4
+```
+
+Each `recipe` directive has two requried arguments.  The first is
+a path to the file containing the source file containing the snippet
+and the second is the name of the snippet and must correspond to a
+set of CreateRecipe/EndRecipe calls in the source file.
+
+The directive will generate two code blocks in the cookbook.  The first
+code block will contain the source code itself and will be annotated
+with any (optional) caption specified on the recipe directive.  The
+second block will contain the test output.
+
+The optional `dedent` argument should be used to remove leading white
+space from your source code.
+
+Writing a C++ Snippet
+---------------------
+
+Each snippet source file contains a set of
+[googletest](https://github.com/google/googletest) tests.  Feel free to
+use any googletest features needed to help setup and verify your test.
+To reference a snippet you need to surround it in `BeginRecipe` and
+`EndRecipe` calls.  For example:
+
+```
+StartRecipe("CreatingArrays");
+arrow::Int32Builder builder;
+ASSERT_OK(builder.Append(1));
+ASSERT_OK(builder.Append(2));
+ASSERT_OK(builder.Append(3));
+ASSERT_OK_AND_ASSIGN(shared_ptr<arrow::Array> arr, builder.Finish())
+rout << arr->ToString() << endl;
+EndRecipe("CreatingArrays");
+```
+
+The variable `rout` is set to a `std::ostream` instance that is used to
+capture test output.  Anything output to `rout` will show up in the recipe
+output block when the recipe is rendered into the cookbook.
+
+Referencing Arrow C++ Documentation
++++++++++++++++++++++++++++++++++++
+
+The Arrow project has its own documentation for the C++ implementation that
+is hosted at https://arrow.apache.org/docs/cpp/index.html.  Fortunately,
+this documentation is also built with Sphinx and so we can use the extension
+`intersphinx` to reference sections of this documentation.  To do so simply
+write a standard Sphinx reference like so:
+
+```
+Typed subclasses of :cpp:class:`arrow::ArrayBuilder` make it easy
+to efficiently create Arrow arrays from existing C++ data
+```
+
+A helpful command is
+`python -msphinx.ext.intersphinx https://arrow.apache.org/docs/objects.inv`
+which will list all of the possible targets to link to.
+
+Development Workflow
+====================
+
+Running `make` at the top level can be rather slow as it will rebuild the
+entire environment each time.  It is primarily intended for use in CI and
+requires you to have conda installed.
+
+For recipe development you are encouraged to create your own out-of-source
+cmake build.  For example:
+
+```
+mkdir cpp/code/build
+cd cpp/code/build
+cmake ../code -DCMAKE_BUILD_TYPE=Debug
+cmake --build .
+ctest
+```
+
+Then you can rerun all of the tests with `ctest` and you can rebuild and
+rerun individual tests much more quickly with something like
+`cmake --build . --target creating_arrow_objects && ctest creating_arrow_objects`.
+Everytime the cmake build is run it will update the recipe output file
+referenced by the sphinx build so after rerunning a test you can visualize the
+output by running `make html` in the `cpp` directory.
+
+Using Conda
++++++++++++
+
+If you are using conda then there is file `cpp/requirements.yml` which can be
+used to create an environment for recipe development with the command:
+
+```
+conda env create -n cookbook-cpp --file cpp/requirements.yml
+```
+
+Development Philosophy
+======================
+
+Everything is the Cookbook
+++++++++++++++++++++++++++
+
+The entire document should serve as an example of how to use Arrow C++, not just the
+referenced snippets.  This means that the below style rules and guidelines apply to
+source code that is not referenced by the cookbook itself.
+
+Style
++++++
+
+This cookbook follows the same style rules as Arrow C++ which is the Google style
+guide with a few exceptions described
+[here](https://arrow.apache.org/docs/developers/cpp/development.html#code-style-linting-and-ci)
+
+Simple
+++++++
+
+The examples should be as simple as possible.  If complex code (e.g. templates) can be
+used to do something more efficiently then there should be a simple, inefficient version
+alongside the more complex version.
+
+Do not use `auto` in any of the templates unless you must (e.g. lambdas).  Cookbook
+viewers will be using a browser, not an IDE, and it is not always simple to determine
+the inferred type.
+
+For conciseness, `using namespace std;` should be used.

Review comment:
       It also explicitly contradicts the C++ style guide: https://google.github.io/styleguide/cppguide.html#Namespaces

##########
File path: cpp/code/common.h
##########
@@ -0,0 +1,52 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_COOKBOOK_COMMON_H
+#define ARROW_COOKBOOK_COMMON_H
+
+#include <sstream>
+#include <string>
+
+#include <arrow/result.h>
+#include <arrow/status.h>

Review comment:
       FWIW, a Conda-installed copy of Arrow/C++ seems to contain arrow/testing/gtest_util.h so those could be used.

##########
File path: cpp/code/main.cc
##########
@@ -0,0 +1,42 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <iostream>
+
+#include <filesystem>
+
+#include <arrow/status.h>
+#include "gtest/gtest.h"
+
+#include "common.h"
+
+int main(int argc, char** argv) {
+  testing::InitGoogleTest(&argc, argv);
+  int retval = RUN_ALL_TESTS();
+  if (retval == 0 && HasRecipeOutput()) {
+    arrow::Status st = DumpRecipeOutput("recipes_out.arrow");
+    if (!st.ok()) {
+      std::cerr << "Tests ran successfully but failed to dump recipe output: " << st
+                << std::endl;
+      // return -1;

Review comment:
       Was this meant to be commented out?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org