Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/06/10 18:07:45 UTC

[GitHub] [arrow-datafusion] alamb opened a new issue, #2719: Test (path normalization) failures while verifying release candidate 9.0.0 RC1

alamb opened a new issue, #2719:
URL: https://github.com/apache/arrow-datafusion/issues/2719

   **Describe the bug**
   The `verify-release-candidate` script failed for me locally while verifying release candidate 9.0.0 RC1.
   
   **To Reproduce**
   Run the release verification script with 9.0.0 RC1:
   
   ```shell
   ./dev/release/verify-release-candidate.sh 9.0.0 1
   ```
   
   It eventually fails with the following message:
   
   ```
   failures:
   
   ---- sql::explain_analyze::csv_explain stdout ----
   thread 'sql::explain_analyze::csv_explain' panicked at 'assertion failed: `(left == right)`
     left: `[["logical_plan", "Projection: #aggregate_test_100.c1\n  Filter: #aggregate_test_100.c2 > Int64(10)\n    TableScan: aggregate_test_100 projection=Some([c1, c2]), partial_filters=[#aggregate_test_100.c2 > Int64(10)]"], ["physical_plan", "ProjectionExec: expr=[c1@0 as c1]\n  CoalesceBatchesExec: target_batch_size=4096\n    FilterExec: CAST(c2@1 AS Int64) > 10\n      RepartitionExec: partitioning=RoundRobinBatch(NUM_CORES)\n        CsvExec: files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, limit=None, projection=[c1, c2]\n"]]`,
    right: `[["logical_plan", "Projection: #aggregate_test_100.c1\n  Filter: #aggregate_test_100.c2 > Int64(10)\n    TableScan: aggregate_test_100 projection=Some([c1, c2]), partial_filters=[#aggregate_test_100.c2 > Int64(10)]"], ["physical_plan", "ProjectionExec: expr=[c1@0 as c1]\n  CoalesceBatchesExec: target_batch_size=4096\n    FilterExec: CAST(c2@1 AS Int64) > 10\n      RepartitionExec: partitioning=RoundRobinBatch(NUM_CORES)\n        CsvExec: files=[/privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, limit=None, projection=[c1, c2]\n"]]`', datafusion/core/tests/sql/explain_analyze.rs:766:5
   
   ---- sql::explain_analyze::test_physical_plan_display_indent stdout ----
   thread 'sql::explain_analyze::test_physical_plan_display_indent' panicked at 'assertion failed: `(left == right)`
     left: `["GlobalLimitExec: skip=None, fetch=10", "  SortExec: [the_min@2 DESC]", "    CoalescePartitionsExec", "      ProjectionExec: expr=[c1@0 as c1, MAX(aggregate_test_100.c12)@1 as MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)@2 as the_min]", "        AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1], aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]", "          CoalesceBatchesExec: target_batch_size=4096", "            RepartitionExec: partitioning=Hash([Column { name: \"c1\", index: 0 }], 9000)", "              AggregateExec: mode=Partial, gby=[c1@0 as c1], aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]", "                CoalesceBatchesExec: target_batch_size=4096", "                  FilterExec: c12@1 < CAST(10 AS Float64)", "                    RepartitionExec: partitioning=RoundRobinBatch(9000)", "                      CsvExec: files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, limit=None, projection=[c1, c12]"]`,
    right: `["GlobalLimitExec: skip=None, fetch=10", "  SortExec: [the_min@2 DESC]", "    CoalescePartitionsExec", "      ProjectionExec: expr=[c1@0 as c1, MAX(aggregate_test_100.c12)@1 as MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)@2 as the_min]", "        AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1], aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]", "          CoalesceBatchesExec: target_batch_size=4096", "            RepartitionExec: partitioning=Hash([Column { name: \"c1\", index: 0 }], 9000)", "              AggregateExec: mode=Partial, gby=[c1@0 as c1], aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]", "                CoalesceBatchesExec: target_batch_size=4096", "                  FilterExec: c12@1 < CAST(10 AS Float64)", "                    RepartitionExec: partitioning=RoundRobinBatch(9000)", "                      CsvExec: files=[/privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, limit=None, projection=[c1, c12]"]`: expected:
   [
       "GlobalLimitExec: skip=None, fetch=10",
       "  SortExec: [the_min@2 DESC]",
       "    CoalescePartitionsExec",
       "      ProjectionExec: expr=[c1@0 as c1, MAX(aggregate_test_100.c12)@1 as MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)@2 as the_min]",
       "        AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1], aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]",
       "          CoalesceBatchesExec: target_batch_size=4096",
       "            RepartitionExec: partitioning=Hash([Column { name: \"c1\", index: 0 }], 9000)",
       "              AggregateExec: mode=Partial, gby=[c1@0 as c1], aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]",
       "                CoalesceBatchesExec: target_batch_size=4096",
       "                  FilterExec: c12@1 < CAST(10 AS Float64)",
       "                    RepartitionExec: partitioning=RoundRobinBatch(9000)",
       "                      CsvExec: files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, limit=None, projection=[c1, c12]",
   ]
   actual:
   
   [
       "GlobalLimitExec: skip=None, fetch=10",
       "  SortExec: [the_min@2 DESC]",
       "    CoalescePartitionsExec",
       "      ProjectionExec: expr=[c1@0 as c1, MAX(aggregate_test_100.c12)@1 as MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)@2 as the_min]",
       "        AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1], aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]",
       "          CoalesceBatchesExec: target_batch_size=4096",
       "            RepartitionExec: partitioning=Hash([Column { name: \"c1\", index: 0 }], 9000)",
       "              AggregateExec: mode=Partial, gby=[c1@0 as c1], aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]",
       "                CoalesceBatchesExec: target_batch_size=4096",
       "                  FilterExec: c12@1 < CAST(10 AS Float64)",
       "                    RepartitionExec: partitioning=RoundRobinBatch(9000)",
       "                      CsvExec: files=[/privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, limit=None, projection=[c1, c12]",
   ]
   ', datafusion/core/tests/sql/explain_analyze.rs:680:5
   
   ---- sql::explain_analyze::test_physical_plan_display_indent_multi_children stdout ----
   thread 'sql::explain_analyze::test_physical_plan_display_indent_multi_children' panicked at 'assertion failed: `(left == right)`
     left: `["ProjectionExec: expr=[c1@0 as c1]", "  CoalesceBatchesExec: target_batch_size=4096", "    HashJoinExec: mode=Partitioned, join_type=Inner, on=[(Column { name: \"c1\", index: 0 }, Column { name: \"c2\", index: 0 })]", "      CoalesceBatchesExec: target_batch_size=4096", "        RepartitionExec: partitioning=Hash([Column { name: \"c1\", index: 0 }], 9000)", "          ProjectionExec: expr=[c1@0 as c1]", "            ProjectionExec: expr=[c1@0 as c1]", "              RepartitionExec: partitioning=RoundRobinBatch(9000)", "                CsvExec: files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, limit=None, projection=[c1]", "      CoalesceBatchesExec: target_batch_size=4096", "        RepartitionExec: partitioning=Hash([Column { name: \"c2\", index: 0 }], 9000)", "          ProjectionExec: expr=[c2@0 as c2]", "            ProjectionExec: expr=[c1@0 as c2]", "              RepartitionExec: partitioning=RoundRobinBatch(9000)", "                CsvExec: files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, limit=None, projection=[c1]"]`,
    right: `["ProjectionExec: expr=[c1@0 as c1]", "  CoalesceBatchesExec: target_batch_size=4096", "    HashJoinExec: mode=Partitioned, join_type=Inner, on=[(Column { name: \"c1\", index: 0 }, Column { name: \"c2\", index: 0 })]", "      CoalesceBatchesExec: target_batch_size=4096", "        RepartitionExec: partitioning=Hash([Column { name: \"c1\", index: 0 }], 9000)", "          ProjectionExec: expr=[c1@0 as c1]", "            ProjectionExec: expr=[c1@0 as c1]", "              RepartitionExec: partitioning=RoundRobinBatch(9000)", "                CsvExec: files=[/privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, limit=None, projection=[c1]", "      CoalesceBatchesExec: target_batch_size=4096", "        RepartitionExec: partitioning=Hash([Column { name: \"c2\", index: 0 }], 9000)", "          ProjectionExec: expr=[c2@0 as c2]", "            ProjectionExec: expr=[c1@0 as c2]", "              RepartitionExec: partitioning=RoundRobinBatch(9000)", "                CsvExec: files=[/privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, limit=None, projection=[c1]"]`: expected:
   [
       "ProjectionExec: expr=[c1@0 as c1]",
       "  CoalesceBatchesExec: target_batch_size=4096",
       "    HashJoinExec: mode=Partitioned, join_type=Inner, on=[(Column { name: \"c1\", index: 0 }, Column { name: \"c2\", index: 0 })]",
       "      CoalesceBatchesExec: target_batch_size=4096",
       "        RepartitionExec: partitioning=Hash([Column { name: \"c1\", index: 0 }], 9000)",
       "          ProjectionExec: expr=[c1@0 as c1]",
       "            ProjectionExec: expr=[c1@0 as c1]",
       "              RepartitionExec: partitioning=RoundRobinBatch(9000)",
       "                CsvExec: files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, limit=None, projection=[c1]",
       "      CoalesceBatchesExec: target_batch_size=4096",
       "        RepartitionExec: partitioning=Hash([Column { name: \"c2\", index: 0 }], 9000)",
       "          ProjectionExec: expr=[c2@0 as c2]",
       "            ProjectionExec: expr=[c1@0 as c2]",
       "              RepartitionExec: partitioning=RoundRobinBatch(9000)",
       "                CsvExec: files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, limit=None, projection=[c1]",
   ]
   actual:
   
   [
       "ProjectionExec: expr=[c1@0 as c1]",
       "  CoalesceBatchesExec: target_batch_size=4096",
       "    HashJoinExec: mode=Partitioned, join_type=Inner, on=[(Column { name: \"c1\", index: 0 }, Column { name: \"c2\", index: 0 })]",
       "      CoalesceBatchesExec: target_batch_size=4096",
       "        RepartitionExec: partitioning=Hash([Column { name: \"c1\", index: 0 }], 9000)",
       "          ProjectionExec: expr=[c1@0 as c1]",
       "            ProjectionExec: expr=[c1@0 as c1]",
       "              RepartitionExec: partitioning=RoundRobinBatch(9000)",
       "                CsvExec: files=[/privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, limit=None, projection=[c1]",
       "      CoalesceBatchesExec: target_batch_size=4096",
       "        RepartitionExec: partitioning=Hash([Column { name: \"c2\", index: 0 }], 9000)",
       "          ProjectionExec: expr=[c2@0 as c2]",
       "            ProjectionExec: expr=[c1@0 as c2]",
       "              RepartitionExec: partitioning=RoundRobinBatch(9000)",
       "                CsvExec: files=[/privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, limit=None, projection=[c1]",
   ]
   ', datafusion/core/tests/sql/explain_analyze.rs:731:5
   
   
   failures:
       sql::explain_analyze::csv_explain
       sql::explain_analyze::test_physical_plan_display_indent
       sql::explain_analyze::test_physical_plan_display_indent_multi_children
   
   test result: FAILED. 362 passed; 3 failed; 2 ignored; 0 measured; 0 filtered out; finished in 3.11s
   
   error: test failed, to rerun pass '-p datafusion --test sql_integration'
   + cleanup
   + '[' no = yes ']'
   + echo 'Failed to verify release candidate. See /var/folders/s3/h5hgj43j0bv83shtmz_t_w400000gn/T/arrow-9.0.0.XXXXX.KsfEL7Og for details.'
   Failed to verify release candidate. See /var/folders/s3/h5hgj43j0bv83shtmz_t_w400000gn/T/arrow-9.0.0.XXXXX.KsfEL7Og for details.
   ```
   
   **Expected behavior**
   The verification should pass.
   
   **Additional context**
   Mailing list thread: https://lists.apache.org/thread/7mg9kwlfyrxm5fx96w8q0c436by93567


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow-datafusion] andygrove commented on issue #2719: Test (path normalization) failures while verifying release candidate 9.0.0 RC1

Posted by GitBox <gi...@apache.org>.

andygrove commented on issue #2719:
URL: https://github.com/apache/arrow-datafusion/issues/2719#issuecomment-1152791833

   I ran into this a while back and worked around it by removing the trailing slash from my `ARROW_TEST_DATA` env var, but we should really fix this. I will take a look in the next few days if nobody else picks this up.
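
The trailing-slash workaround can be illustrated with a minimal sketch. It assumes the tests normalize plan text by plain string replacement of the data directory; `trimmed_dir` and the `/data` paths are hypothetical and are not taken from the DataFusion code base.

```rust
/// Hypothetical helper: drop trailing slashes from a directory string so
/// that "/data/" and "/data" are treated as the same directory.
fn trimmed_dir(raw: &str) -> &str {
    raw.trim_end_matches('/')
}

fn main() {
    let plan = "CsvExec: files=[/data/csv/aggregate_test_100.csv]";

    // Untrimmed: the slash that follows the directory is swallowed by the
    // replacement, so the token runs straight into "csv".
    println!("{}", plan.replace("/data/", "ARROW_TEST_DATA"));

    // Trimmed: the separator after the token is preserved.
    println!("{}", plan.replace(trimmed_dir("/data/"), "ARROW_TEST_DATA"));
}
```

Trimming inside the test helper, rather than asking every contributor to set the variable a particular way, would make the comparison independent of how `ARROW_TEST_DATA` is written.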



[GitHub] [arrow-datafusion] alamb closed issue #2719: Test (path normalization) failures while verifying release candidate 9.0.0 RC1

Posted by GitBox <gi...@apache.org>.

alamb closed issue #2719: Test (path normalization) failures while verifying release candidate 9.0.0 RC1
URL: https://github.com/apache/arrow-datafusion/issues/2719



[GitHub] [arrow-datafusion] alamb commented on issue #2719: Test (path normalization) failures while verifying release candidate 9.0.0 RC1

Posted by GitBox <gi...@apache.org>.

alamb commented on issue #2719:
URL: https://github.com/apache/arrow-datafusion/issues/2719#issuecomment-1152613288

   The difference appears to be in the normalized path, rather than the actual structure:
   
   ```
   CsvExec: files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], ...
   ```
   vs
   ```
   CsvExec: files=[/privateARROW_TEST_DATA/csv/aggregate_test_100.csv], ...
   ```
   
   So I think this is a test bug rather than a code bug.
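
A stray `/private` prefix like the one above is what plain string replacement produces when the plan contains a longer spelling of the directory than the one being searched for. On macOS, `/var` is a symlink to `/private/var`, which is one plausible source of such a mismatch, although the thread does not confirm it. The sketch below assumes that replacement-based normalization; `normalize_plan` and the `/var/tmp/data` paths are hypothetical.

```rust
/// Hypothetical helper: replace every known spelling of the data directory
/// in `plan` with the token `ARROW_TEST_DATA`. Longer spellings are tried
/// first so that a shorter one cannot match inside a longer one and leave
/// a prefix such as "/private" behind.
fn normalize_plan(plan: &str, spellings: &[&str]) -> String {
    let mut sorted: Vec<&str> = spellings.to_vec();
    sorted.sort_by_key(|s| std::cmp::Reverse(s.len()));
    let mut out = plan.to_string();
    for s in sorted {
        out = out.replace(s, "ARROW_TEST_DATA");
    }
    out
}

fn main() {
    // The plan reports the resolved path; the environment holds the short one.
    let plan = "CsvExec: files=[/private/var/tmp/data/csv/aggregate_test_100.csv]";

    // Only the short spelling is known: "/private" survives the replacement.
    println!("{}", normalize_plan(plan, &["/var/tmp/data"]));

    // Both spellings are known: the whole directory is replaced.
    println!(
        "{}",
        normalize_plan(plan, &["/var/tmp/data", "/private/var/tmp/data"])
    );
}
```

In a real test the second spelling could be obtained with `std::fs::canonicalize`, which resolves symlinks; the sketch passes both strings explicitly so that it runs without touching the filesystem.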

