Posted to issues@drill.apache.org by "Vitalii Diravka (Jira)" <ji...@apache.org> on 2021/11/18 22:36:00 UTC
[jira] [Assigned] (DRILL-5612) Random failure in TestMergeJoinWithSchemaChanges
[ https://issues.apache.org/jira/browse/DRILL-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vitalii Diravka reassigned DRILL-5612:
--------------------------------------
Assignee: Vitalii Diravka
> Random failure in TestMergeJoinWithSchemaChanges
> ------------------------------------------------
>
> Key: DRILL-5612
> URL: https://issues.apache.org/jira/browse/DRILL-5612
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.11.0
> Reporter: Paul Rogers
> Assignee: Vitalii Diravka
> Priority: Major
> Attachments: image-2021-11-16-02-35-25-690.png
>
>
> The unit test {{org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges#testMissingAndNewColumns}} fails at random, perhaps because the order in which the readers see the input files changes between runs.
> The test builds a number of input files, then executes queries against them. On most runs, the output is fine:
> {code}
> Running org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges#testMissingAndNewColumns
> /home/.../target/1498606483211-0/mergejoin-schemachanges-left
> /home/.../target/1498606483211-1/mergejoin-schemachanges-right
> {code}
> But, on occasion, the query fails:
> {code}
> org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges
> testMissingAndNewColumns(org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges) Time elapsed: 0.569 sec <<< ERROR!
> ...: UNSUPPORTED_OPERATION ERROR: Sort doesn't currently support sorts with changing schemas
> Fragment 0:0
> (org.apache.drill.exec.exception.SchemaChangeException) Sort currently only supports a single schema.
> org.apache.drill.exec.physical.impl.sort.SortRecordBatchBuilder.build():152
> org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext():476
> ...
> {code}
> The exception is thrown by the check at the top of {{SortRecordBatchBuilder.build()}}:
> {code}
> public void build(VectorContainer outputContainer) throws SchemaChangeException {
>   outputContainer.clear();
>   if (batches.keySet().size() > 1) {
>     throw new SchemaChangeException("Sort currently only supports a single schema.");
>   }
>   ...
> {code}
> The above code has not changed in quite some time. The failure is in the "legacy" external sort.
> Although the external sort does support schema changes, it only does so in the form of a union vector, which must be enabled. (Other tests validate that schema changes work.)
> What is likely happening is that the outcome depends on how the files are distributed across sorts: sometimes a single sort sees two files with differing schemas and fails, while at other times multiple threads run so that a single sort sees only one file and the query succeeds. This speculation can be verified by checking a log file (none is available for the failed test run) to see whether the scan under the sort read more than one file.
> Or perhaps the order of the JSON files matters. File order may vary across machines, since listing a directory on Linux does not guarantee any particular order.
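The first hypothesis in the description can be illustrated with a small stand-alone sketch. It is not Drill code: the class SingleSchemaSortSketch, its methods, and the column names a, b, c are all hypothetical, and a "batch" is modeled as nothing more than its list of column names. The sketch only mirrors the quoted check (more than one distinct schema key is rejected) to show why the same two input files might pass or fail depending on whether one sort receives both of them.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the single-schema check; NOT the Drill implementation.
class SingleSchemaSortSketch {

  // Batches grouped by schema; a schema is modeled as an ordered list of column names.
  private final Map<List<String>, Integer> batchesBySchema = new LinkedHashMap<>();

  // Record one incoming batch under its schema.
  void add(List<String> schema) {
    batchesBySchema.merge(schema, 1, Integer::sum);
  }

  // Mirrors the quoted check: more than one distinct schema is rejected.
  void build() {
    if (batchesBySchema.keySet().size() > 1) {
      throw new IllegalStateException("Sort currently only supports a single schema.");
    }
  }

  // Feeds the given batches to one sort and reports whether build() accepted them.
  static boolean buildSucceeds(List<List<String>> batches) {
    SingleSchemaSortSketch sort = new SingleSchemaSortSketch();
    for (List<String> schema : batches) {
      sort.add(schema);
    }
    try {
      sort.build();
      return true;
    } catch (IllegalStateException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Assumed schemas for two input files; the real test files may differ.
    List<String> fileOne = Arrays.asList("a", "b");
    List<String> fileTwo = Arrays.asList("a", "b", "c");

    // One sort receives batches from both files: two schema keys, so it is rejected.
    System.out.println("one sort, both files: " + buildSucceeds(Arrays.asList(fileOne, fileTwo)));

    // Each sort receives batches from a single file: one schema key each, so both pass.
    System.out.println("sort over file one:   " + buildSucceeds(Arrays.asList(fileOne, fileOne)));
    System.out.println("sort over file two:   " + buildSucceeds(Arrays.asList(fileTwo)));
  }
}
```

If this model is close to what happens in the test, the failure would depend only on how files are assigned to sorts, which is consistent with the suggestion to check the log for a scan that read more than one file.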
--
This message was sent by Atlassian Jira
(v8.20.1#820001)