Posted to commits@arrow.apache.org by al...@apache.org on 2022/02/15 18:21:36 UTC

[arrow-datafusion] branch master updated: Add section on testing to the Developers doc (#1471)

This is an automated email from the ASF dual-hosted git repository.

alamb pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow-datafusion.git


The following commit(s) were added to refs/heads/master by this push:
     new 43ed320  Add section on testing to the Developers doc (#1471)
43ed320 is described below

commit 43ed320acc8836bf761f2b591ad42df4e0ac5647
Author: Andrew Lamb <an...@nerdnetworks.org>
AuthorDate: Tue Feb 15 13:17:41 2022 -0500

    Add section on testing to the Developers doc (#1471)
    
    * Add section on testing to the Developers doc
    
    * Improvements, document integration test runner
    
    * prettier
---
 DEVELOPERS.md | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/DEVELOPERS.md b/DEVELOPERS.md
index 9a75486..397bfdf 100644
--- a/DEVELOPERS.md
+++ b/DEVELOPERS.md
@@ -40,6 +40,79 @@ Testing setup:
 - `export PARQUET_TEST_DATA=$(pwd)/parquet-testing/data/`
 - `export ARROW_TEST_DATA=$(pwd)/testing/data/`
 
+## Test Organization
+
+DataFusion has several levels of tests in its [Test
+Pyramid](https://martinfowler.com/articles/practical-test-pyramid.html)
+and tries to follow [Test Organization](https://doc.rust-lang.org/book/ch11-03-test-organization.html) in The Book (The Rust Programming Language).
+
+This section highlights the most important test modules.
+
+### Unit tests
+
+Tests for the code in an individual module are defined in the same source file, in a `test` module, following Rust convention.
+
+### Rust Integration Tests
+
+There are several tests of the public interface of the DataFusion library in the [tests](https://github.com/apache/arrow-datafusion/blob/master/datafusion/tests) directory.
+
+You can run these tests individually using a command such as:
+
+```shell
+cargo test -p datafusion --test sql_integration
+```
+
+One very important test is the [sql_integration](https://github.com/apache/arrow-datafusion/blob/master/datafusion/tests/sql_integration.rs) test, which validates DataFusion's ability to run a large assortment of SQL queries against an assortment of data setups.
+
+### SQL / Postgres Integration Tests
+
+The [integration-tests](https://github.com/apache/arrow-datafusion/blob/master/datafusion/integration-tests) directory contains a harness that runs certain queries against both Postgres and DataFusion and compares the results.
+
+#### Set up environment
+
+```shell
+export POSTGRES_DB=postgres
+export POSTGRES_USER=postgres
+export POSTGRES_HOST=localhost
+export POSTGRES_PORT=5432
+```
+
+#### Install dependencies
+
+```shell
+# Install dependencies
+python -m pip install --upgrade pip setuptools wheel
+python -m pip install -r integration-tests/requirements.txt
+
+# Set the environment variables inline and run the tests
+POSTGRES_DB=postgres POSTGRES_USER=postgres POSTGRES_HOST=localhost POSTGRES_PORT=5432 python -m pytest -v integration-tests/test_psql_parity.py
+
+# Create the test table, then load the sample CSV data into it
+psql -d "$POSTGRES_DB" -h "$POSTGRES_HOST" -p "$POSTGRES_PORT" -U "$POSTGRES_USER" -c 'CREATE TABLE IF NOT EXISTS test (
+  c1 character varying NOT NULL,
+  c2 integer NOT NULL,
+  c3 smallint NOT NULL,
+  c4 smallint NOT NULL,
+  c5 integer NOT NULL,
+  c6 bigint NOT NULL,
+  c7 smallint NOT NULL,
+  c8 integer NOT NULL,
+  c9 bigint NOT NULL,
+  c10 character varying NOT NULL,
+  c11 double precision NOT NULL,
+  c12 double precision NOT NULL,
+  c13 character varying NOT NULL
+);'
+
+psql -d "$POSTGRES_DB" -h "$POSTGRES_HOST" -p "$POSTGRES_PORT" -U "$POSTGRES_USER" -c "\copy test FROM '$(pwd)/testing/data/csv/aggregate_test_100.csv' WITH (FORMAT csv, HEADER true);"
+```
+
+#### Invoke the test runner
+
+```shell
+python -m pytest -v integration-tests/test_psql_parity.py
+```
+
 ## How to add a new scalar function
 
 Below is a checklist of what you need to do to add a new scalar function to DataFusion: