Posted to dev@drill.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2019/07/15 02:19:00 UTC

[jira] [Created] (DRILL-7324) Many vector-validity errors from unit tests

Paul Rogers created DRILL-7324:
----------------------------------

             Summary: Many vector-validity errors from unit tests
                 Key: DRILL-7324
                 URL: https://issues.apache.org/jira/browse/DRILL-7324
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.16.0
            Reporter: Paul Rogers


Drill's value vectors contain many counts that must be kept in sync. Drill provides a utility, {{BatchValidator}}, to check (a subset of) these values for consistency.

The {{IteratorValidatorBatchIterator}} class is used in tests to validate the state of each operator (AKA "record batch") as Drill runs the Volcano iterator. This class can also validate vectors by setting the {{VALIDATE_VECTORS}} constant to {{true}}.

With {{VALIDATE_VECTORS}} set to {{true}}, the unit tests were run, and many failed. Examples:

{noformat}
[INFO] Running org.apache.drill.TestUnionDistinct
18:44:26.742 [22d42585-74c2-d418-6f59-9b1870d04770:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from LimitRecordBatch
key - NullableBitVector: Row count = 0, but value count = 2
18:44:26.745 [22d42585-74c2-d418-6f59-9b1870d04770:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from LimitRecordBatch
key - NullableBitVector: Row count = 0, but value count = 2

[INFO] Running org.apache.drill.TestUnionDistinct
18:44:48.302 [22d4256e-c90b-847c-5104-02d6cdf5223e:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from LimitRecordBatch
key - NullableBitVector: Row count = 0, but value count = 2
18:44:48.703 [22d4256e-ccf3-2af6-f56a-140e9c3e55bb:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from FilterRecordBatch
n_nationkey - IntVector: Row count = 2, but value count = 25
n_regionkey - IntVector: Row count = 2, but value count = 25
18:44:48.731 [22d4256e-ccf3-2af6-f56a-140e9c3e55bb:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from FilterRecordBatch
n_nationkey - IntVector: Row count = 4, but value count = 25
n_regionkey - IntVector: Row count = 4, but value count = 25
18:44:49.039 [22d4256f-6b39-d2ab-d145-4f2b0db315a3:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from FilterRecordBatch
n_nationkey - IntVector: Row count = 2, but value count = 25
18:44:49.363 [22d4256e-3d91-850f-9ab4-5939219ac0d0:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from FilterRecordBatch
c_custkey - IntVector: Row count = 4, but value count = 1500
18:44:49.597 [22d4256d-c113-ae5c-6f31-4dd1ec091365:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from FilterRecordBatch
n_nationkey - IntVector: Row count = 5, but value count = 25
n_regionkey - IntVector: Row count = 5, but value count = 25
18:44:49.610 [22d4256d-c113-ae5c-6f31-4dd1ec091365:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from FilterRecordBatch
r_regionkey - IntVector: Row count = 1, but value count = 5
18:44:53.029 [22d4256a-8b70-5f3b-f79b-806e194c5ed2:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from LimitRecordBatch
n_nationkey - IntVector: Row count = 0, but value count = 25
n_name - VarCharVector: Row count = 0, but value count = 25
n_regionkey - IntVector: Row count = 0, but value count = 25
18:44:53.033 [22d4256a-8b70-5f3b-f79b-806e194c5ed2:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from LimitRecordBatch
n_regionkey - IntVector: Row count = 5, but value count = 25
18:44:53.331 [22d4256a-526c-7815-c216-8e45752a4a6c:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from LimitRecordBatch
n_nationkey - IntVector: Row count = 5, but value count = 25
n_name - VarCharVector: Row count = 5, but value count = 25
n_regionkey - IntVector: Row count = 5, but value count = 25
18:44:53.337 [22d4256a-526c-7815-c216-8e45752a4a6c:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from LimitRecordBatch
n_regionkey - IntVector: Row count = 0, but value count = 25
18:44:53.646 [22d42569-c293-ced0-c3d0-e9153cc4a70a:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from LimitRecordBatch
key - NullableBitVector: Row count = 0, but value count = 2

Running org.apache.drill.TestTpchSingleMode
18:45:01.299 [22d42563-0ed6-1501-86a1-4cb375a9cad4:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from FilterRecordBatch

Running org.apache.drill.TestMergeFilterPlan
18:45:03.738 [22d4255f-b322-fd56-2f93-34b7f5c709c1:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from FilterRecordBatch
o_orderkey - IntVector: Row count = 561, but value count = 15000
o_orderdate - DateVector: Row count = 561, but value count = 15000
o_orderpriority - VarCharVector: Row count = 561, but value count = 15000
18:45:03.828 [22d4255f-b322-fd56-2f93-34b7f5c709c1:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from FilterRecordBatch
l_orderkey - IntVector: Row count = 20580, but value count = 32767
l_commitdate - DateVector: Row count = 20580, but value count = 32767
l_receiptdate - DateVector: Row count = 20580, but value count = 32767
18:45:03.990 [22d4255f-b322-fd56-2f93-34b7f5c709c1:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from FilterRecordBatch
l_orderkey - IntVector: Row count = 17317, but value count = 27408
l_commitdate - DateVector: Row count = 17317, but value count = 27408
l_receiptdate - DateVector: Row count = 17317, but value count = 27408
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.041 s - in org.apache.drill.TestMergeFilterPlan
18:45:04.929 [22d4255f-040c-f4c9-7d23-b90702db4a1e:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from FilterRecordBatch
o_orderkey - IntVector: Row count = 2287, but value count = 15000
o_custkey - IntVector: Row count = 2287, but value count = 15000
o_orderdate - DateVector: Row count = 2287, but value count = 15000
18:45:04.944 [22d4255f-040c-f4c9-7d23-b90702db4a1e:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from FilterRecordBatch
r_regionkey - IntVector: Row count = 1, but value count = 5
r_name - VarCharVector: Row count = 1, but value count = 5
[INFO] Running org.apache.drill.TestSelectWithOption
18:45:06.120 [22d4255e-5f13-aabb-40bb-bd09dc3d35e1:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from FilterRecordBatch
l_quantity - Float8Vector: Row count = 594, but value count = 32767
l_extendedprice - Float8Vector: Row count = 594, but value count = 32767
l_discount - Float8Vector: Row count = 594, but value count = 32767
l_shipdate - DateVector: Row count = 594, but value count = 32767
18:45:06.156 [22d4255e-5f13-aabb-40bb-bd09dc3d35e1:frag:0:0] ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from FilterRecordBatch
l_quantity - Float8Vector: Row count = 543, but value count = 27408
l_extendedprice - Float8Vector: Row count = 543, but value count = 27408
l_discount - Float8Vector: Row count = 543, but value count = 27408
l_shipdate - DateVector: Row count = 543, but value count = 27408
{noformat}

And many, many more.
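
Every message above has the same shape: a vector's value count disagrees with the row count of the batch that holds it. The following is a minimal sketch of that kind of check, not the actual {{BatchValidator}} code. {{FakeVector}} and {{validate}} are hypothetical stand-ins invented for illustration and do not use Drill's vector API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class VectorCountCheck {

  // Hypothetical stand-in for a value vector: just a type name and a value count.
  // This is NOT Drill's ValueVector interface.
  static final class FakeVector {
    final String typeName;
    final int valueCount;

    FakeVector(String typeName, int valueCount) {
      this.typeName = typeName;
      this.valueCount = valueCount;
    }
  }

  // Returns one message per vector whose value count differs from the batch row count.
  static List<String> validate(int rowCount, Map<String, FakeVector> vectors) {
    List<String> errors = new ArrayList<>();
    for (Map.Entry<String, FakeVector> entry : vectors.entrySet()) {
      FakeVector vector = entry.getValue();
      if (vector.valueCount != rowCount) {
        errors.add(String.format("%s - %s: Row count = %d, but value count = %d",
            entry.getKey(), vector.typeName, rowCount, vector.valueCount));
      }
    }
    return errors;
  }

  public static void main(String[] args) {
    // Mimics a batch that claims 2 rows while its vectors still hold 25 values.
    Map<String, FakeVector> batch = new LinkedHashMap<>();
    batch.put("n_nationkey", new FakeVector("IntVector", 25));
    batch.put("n_regionkey", new FakeVector("IntVector", 25));
    for (String error : validate(2, batch)) {
      System.out.println(error);
    }
  }
}
```

The message format is modeled on the log lines above. A real check would also have to cover offsets, null bits, and nested vectors, which this sketch ignores.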

The problem with these errors is that they make operators very fragile: once we accept invalid vectors, it is very hard to detect when an operator makes vectors even more invalid. It is also hard to reason about the code if the inputs (or outputs) can be corrupt in normal operation.

Suggestions:

1. Extend {{BatchValidator}} to cover the vector types it does not yet check (maps and repeated maps).
2. Work step-by-step through tests.
3. Identify operators that corrupt vectors.
4. Fix the source of corruption and retest.
5. Continue until no vector corruption errors occur.
6. Change the {{IteratorValidatorBatchIterator}} to check vectors by default, and to throw a fatal error if corruption is found.
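
As a rough illustration of suggestion 6, the sketch below shows a validation switch that is on by default and throws instead of only logging. It is an assumption about how such a mode might look, not the current behavior of {{IteratorValidatorBatchIterator}}. The class name, the {{checkBatch}} method, and the choice of {{IllegalStateException}} are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class FailFastValidation {

  // Hypothetical switch, similar in spirit to VALIDATE_VECTORS but enabled by default.
  static final boolean VALIDATE_VECTORS = true;

  // Throws if validation is enabled and any vector errors were reported for the operator.
  static void checkBatch(String operatorName, List<String> vectorErrors) {
    if (!VALIDATE_VECTORS || vectorErrors.isEmpty()) {
      return;
    }
    throw new IllegalStateException(
        "Found one or more vector errors from " + operatorName + "\n"
            + String.join("\n", vectorErrors));
  }

  public static void main(String[] args) {
    // A clean batch passes silently.
    checkBatch("LimitRecordBatch", Collections.<String>emptyList());

    // A corrupt batch stops execution at the operator that produced it.
    try {
      checkBatch("FilterRecordBatch",
          Arrays.asList("r_regionkey - IntVector: Row count = 1, but value count = 5"));
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Failing at the first corrupt batch points directly at the offending operator, which is what steps 2 through 5 rely on.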


