Posted to issues@drill.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2017/01/18 19:59:26 UTC

[jira] [Created] (DRILL-5204) Extend mock data source to use table specs from SQL

Paul Rogers created DRILL-5204:
----------------------------------

             Summary: Extend mock data source to use table specs from SQL
                 Key: DRILL-5204
                 URL: https://issues.apache.org/jira/browse/DRILL-5204
             Project: Apache Drill
          Issue Type: Improvement
          Components: Tools, Build & Test
    Affects Versions: 1.9.0
            Reporter: Paul Rogers
            Assignee: Paul Rogers
            Priority: Minor


DRILL-5152 provided a simple way to generate mock data from SQL:

{code}
SELECT colName_type FROM `mock`.`tableName_size` ...
{code}

That change encodes column types and record counts directly in the SQL column and table names, which is handy for many simple cases.
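As a rough illustration of the naming convention in the query above, the sketch below splits an identifier of the form name_suffix into its two parts. The helper name split_mock_name is hypothetical, and splitting at the last underscore is an assumption. This issue does not spell out the actual suffix grammar for types and sizes, so the suffix is returned uninterpreted.

```python
def split_mock_name(identifier):
    """Split 'name_suffix' at the last underscore into (name, suffix).

    Assumption: the encoded part follows the final underscore. The real
    mock data source may apply stricter rules; this is only a sketch.
    """
    name, sep, suffix = identifier.rpartition("_")
    if not sep or not name or not suffix:
        raise ValueError(f"expected 'name_suffix', got {identifier!r}")
    return name, suffix


# Placeholder identifiers taken from the query template above.
print(split_mock_name("colName_type"))    # ('colName', 'type')
print(split_mock_name("tableName_size"))  # ('tableName', 'size')
```

Splitting at the last underscore, rather than the first, leaves room for column or table names that themselves contain underscores.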

The original mock data source has another feature: it lets you create multiple mock blocks of data that can be read in multiple threads. Later additions made it easy to repeat a column definition (to generate, say, a table with 1000 columns), to choose the data generator class, etc. All of this was available only when writing physical plans by hand and encoding the definition in the sub scan for the mock data source.

This enhancement extends the SQL feature so that the definitions can appear in a JSON file that is easily referenced from SQL. The JSON file must be somewhere on the class path (typically in a resources directory). Then:

{code}
SELECT red, blue, green FROM `mock`.`foo/colors.json` ...
{code}

This query is interpreted to mean, "the file colors.json defines a mock data source, perhaps with repeated columns, perhaps with multiple fragments. From that mock data source, select the three columns red, blue and green."

With this change, tests can include quite sophisticated mock data sources, simplifying debugging of plans with multiple fragments and/or more complex table structures.


