You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Caizhi Weng (JIRA)" <ji...@apache.org> on 2018/08/24 03:55:00 UTC

[jira] [Reopened] (CALCITE-2484) Dynamic table tests give wrong results when running tests concurrently.

     [ https://issues.apache.org/jira/browse/CALCITE-2484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Caizhi Weng reopened CALCITE-2484:
----------------------------------

Sorry for reopening the issue.

After cherry-picking the commit to my own repository, I didn't experience the test failure on the cluster of my company yesterday.

But this morning I discover that it's just because the "timing" of the tests are changed. To trigger this bug, two test methods using the same table name must run concurrently, so the "timing" condition is quite strict to trigger this bug in the current test set.

I carefully read the changes in CALCITE 2435 and I discover that, although {{cache}} no longer exists in {{SqlTestFactory}}, {{catalogReader}} is now a method in {{SqlTestFactory}}. As the test cases still use the same instance of the factory, they still have access to the same catalog reader. So the problem still exists.

I construct two mock test classes to trigger this bug stably. The two test classes are the same except for their names. One of the test class is listed as follows:
{code:java}
public class MockSqlValidatorTest1 extends SqlValidatorTestCase {
  // Member definition omitted.

  @Test
  public void testDynamicStar1() {
    final String sql = "select newid from (\n"
        + "  select *, NATION.N_NATION + 100 as newid\n"
        + "  from \"DYNAMIC\".NATION, \"DYNAMIC\".CUSTOMER)";
    sql(sql).type("RecordType(ANY NEWID) NOT NULL");
  }

  @Test
  public void testDynamicStar2() {
    final String sql = "select newid from (\n"
        + "  select *, NATION.N_NATION + 100 as newid\n"
        + "  from \"DYNAMIC\".NATION, \"DYNAMIC\".CUSTOMER)";
    sql(sql).type("RecordType(ANY NEWID) NOT NULL");
  }

  // 296 more test cases

  @Test
  public void testDynamicStar299() {
    final String sql = "select newid from (\n"
        + "  select *, NATION.N_NATION + 100 as newid\n"
        + "  from \"DYNAMIC\".NATION, \"DYNAMIC\".CUSTOMER)";
    sql(sql).type("RecordType(ANY NEWID) NOT NULL");
  }

  @Test
  public void testDynamicStar300() {
    final String sql = "select newid from (\n"
        + "  select *, NATION.N_NATION + 100 as newid\n"
        + "  from \"DYNAMIC\".NATION, \"DYNAMIC\".CUSTOMER)";
    sql(sql).type("RecordType(ANY NEWID) NOT NULL");
  }
{code}
You can check these two test classes [here|https://paste.ubuntu.com/p/rcWYjMzMjf/] and [here|https://paste.ubuntu.com/p/Tb2VTz74Xv/] so you can try them out.

If we use
{code}
mvn -T 4 -Dtest=MockSqlValidatorTest* test -pl core
{code}
to run these two test classes concurrently, the bug will occur.

But if we use
{code}
mvn -Dsurefire.parallel= -Dtest=MockSqlValidatorTest* test -pl core
{code}
to let them run one by one, the bug will not occur.

So we still need to remove the single {{INSTANCE}} of {{SqlTestFactory}}. Each test class should create and use their own instance of {{SqlTestFactory}} when they want to use it.

> Dynamic table tests give wrong results when running tests concurrently.
> -----------------------------------------------------------------------
>
>                 Key: CALCITE-2484
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2484
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.17.0
>            Reporter: Caizhi Weng
>            Assignee: Julian Hyde
>            Priority: Major
>             Fix For: 1.18.0
>
>
> When running tests on the cluster of my company, I always experience the failure of the test {{SqlValidatorTest::testDynamicStar2()}}. After investigation, I discover that it is triggered by the cache in {{DefaultSqlTestFactory}} introduced in [this commit|https://github.com/apache/calcite/commit/39c22f0c8b7b5b46a152f432e8708ce73ace1ef7].
> The failure of the test case is because:
> # In the current implementation, when a test class wants to use a {{DefaultSqlTestFactory}}, it will use {{DefaultSqlTestFactory.INSTANCE}}, so every class using this factory actually shares the same factory instance.
> # {{cache}} is a private member of {{DefaultSqlTestFactory}}, so every class actually shares the same {{cache}}.
> # The key of {{cache}} is of {{SqlTestFactory}} type. As every class shares the same factory instance, every class actually shares the same cache key.
> # As catalog reader is stored in cache, root schema is stored in catalog reader, table is stored in root schema, and row type is stored in table, every class actually has access to the same row type instance.
> # As dynamic table will modify row type if a column name it wants to use doesn't exist, two test cases running concurrently and using the same table name may read and modify the same row type instance, causing the result of the test to be incorrect, thus the failure of the test.
> What we need to do is to remove {{DefaultSqlTestFactory.INSTANCE}}, and let every test class use a new instance of the factory, so that we can solve the concurrent modification problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)