Posted to commits@arrow.apache.org by al...@apache.org on 2022/02/15 18:21:36 UTC
[arrow-datafusion] branch master updated: Add section on testing to the Developers doc (#1471)
This is an automated email from the ASF dual-hosted git repository.
alamb pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow-datafusion.git
The following commit(s) were added to refs/heads/master by this push:
new 43ed320 Add section on testing to the Developers doc (#1471)
43ed320 is described below
commit 43ed320acc8836bf761f2b591ad42df4e0ac5647
Author: Andrew Lamb <an...@nerdnetworks.org>
AuthorDate: Tue Feb 15 13:17:41 2022 -0500
Add section on testing to the Developers doc (#1471)
* Add section on testing to the Developers doc
* Improvements, document integration test runner
* prettier
---
DEVELOPERS.md | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 73 insertions(+)
diff --git a/DEVELOPERS.md b/DEVELOPERS.md
index 9a75486..397bfdf 100644
--- a/DEVELOPERS.md
+++ b/DEVELOPERS.md
@@ -40,6 +40,79 @@ Testing setup:
- `export PARQUET_TEST_DATA=$(pwd)/parquet-testing/data/`
- `export ARROW_TEST_DATA=$(pwd)/testing/data/`
+## Test Organization
+
+DataFusion has several levels of tests in its [Test
+Pyramid](https://martinfowler.com/articles/practical-test-pyramid.html)
+and tries to follow [Test Organization](https://doc.rust-lang.org/book/ch11-03-test-organization.html) in The Book.
+
+This section highlights the most important test modules.
+
+### Unit tests
+
+Tests for the code in an individual module are defined in the same source file, in a `test` module, following Rust convention.
+
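
As a minimal sketch of this convention (not code from DataFusion): the function `normalize_identifier` is a hypothetical helper invented for illustration, and the module is named `tests` here, as in The Book's examples. The test module sits in the same file as the code it exercises and is compiled only when running `cargo test`.

```rust
/// Hypothetical helper, used only for illustration: trims surrounding
/// whitespace and lower-cases an identifier.
pub fn normalize_identifier(name: &str) -> String {
    name.trim().to_lowercase()
}

// Unit tests live next to the code they exercise. `#[cfg(test)]` means
// this module is compiled only for `cargo test`, not for normal builds.
#[cfg(test)]
mod tests {
    // Bring the items of the parent module into scope.
    use super::*;

    #[test]
    fn trims_and_lowercases() {
        assert_eq!(normalize_identifier("  C1 "), "c1");
    }

    #[test]
    fn leaves_normalized_input_unchanged() {
        assert_eq!(normalize_identifier("c2"), "c2");
    }
}
```

Because the test module is a child of the module under test, it can also reach private items through `use super::*`, which is the main reason unit tests are kept in the same file.
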
+### Rust Integration Tests
+
+There are several tests of the public interface of the DataFusion library in the [tests](https://github.com/apache/arrow-datafusion/blob/master/datafusion/tests) directory.
+
+You can run these tests individually using a command such as:
+
+```shell
+cargo test -p datafusion --test sql_integration
+```
+
+One very important test is the [sql_integration](https://github.com/apache/arrow-datafusion/blob/master/datafusion/tests/sql_integration.rs) test, which validates DataFusion's ability to run a large assortment of SQL queries against an assortment of data setups.
+
+### SQL / Postgres Integration Tests
+
+The [integration-tests](https://github.com/apache/arrow-datafusion/blob/master/datafusion/integration-tests) directory contains a harness that runs certain queries against both Postgres and DataFusion and compares the results.
+
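
The idea of such a parity check can be sketched as follows. This is not the harness's code (the harness is driven by `pytest` and its comparison logic is not shown in this document); the `Rows` representation, the `rows_match` function, and the choice to ignore row order are all assumptions made for this illustration.

```rust
/// A result set as rows of stringified values
/// (an assumption made for this sketch only).
pub type Rows = Vec<Vec<String>>;

/// Hypothetical comparison: returns true when both engines produced the
/// same rows. Row order is ignored, because a query without ORDER BY may
/// legitimately return its rows in a different order on each engine.
pub fn rows_match(expected: &Rows, actual: &Rows) -> bool {
    // Sort copies so the callers' data is left untouched.
    let mut expected = expected.clone();
    let mut actual = actual.clone();
    expected.sort();
    actual.sort();
    expected == actual
}
```

A real comparison would probably also need a tolerance for floating-point columns, which this sketch leaves out.
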
+#### Set up environment
+
+```shell
+export POSTGRES_DB=postgres
+export POSTGRES_USER=postgres
+export POSTGRES_HOST=localhost
+export POSTGRES_PORT=5432
+```
+
+#### Install dependencies
+
+```shell
+# Install dependencies
+python -m pip install --upgrade pip setuptools wheel
+python -m pip install -r integration-tests/requirements.txt
+
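+# Run the tests, passing the connection settings inline
+# (create and load the `test` table below first)
+POSTGRES_DB=postgres POSTGRES_USER=postgres POSTGRES_HOST=localhost POSTGRES_PORT=5432 python -m pytest -v integration-tests/test_psql_parity.py
+
+# Create the `test` table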
+psql -d "$POSTGRES_DB" -h "$POSTGRES_HOST" -p "$POSTGRES_PORT" -U "$POSTGRES_USER" -c 'CREATE TABLE IF NOT EXISTS test (
+ c1 character varying NOT NULL,
+ c2 integer NOT NULL,
+ c3 smallint NOT NULL,
+ c4 smallint NOT NULL,
+ c5 integer NOT NULL,
+ c6 bigint NOT NULL,
+ c7 smallint NOT NULL,
+ c8 integer NOT NULL,
+ c9 bigint NOT NULL,
+ c10 character varying NOT NULL,
+ c11 double precision NOT NULL,
+ c12 double precision NOT NULL,
+ c13 character varying NOT NULL
+);'
+
+# Load the sample data into the `test` table
+psql -d "$POSTGRES_DB" -h "$POSTGRES_HOST" -p "$POSTGRES_PORT" -U "$POSTGRES_USER" -c "\copy test FROM '$(pwd)/testing/data/csv/aggregate_test_100.csv' WITH (FORMAT csv, HEADER true);"
+```
+
+#### Invoke the test runner
+
+```shell
+python -m pytest -v integration-tests/test_psql_parity.py
+```
+
## How to add a new scalar function
Below is a checklist of what you need to do to add a new scalar function to DataFusion: