Posted to issues@drill.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2019/06/18 19:18:00 UTC

[jira] [Created] (DRILL-7298) Revise log regex plugin to work with table functions

Paul Rogers created DRILL-7298:
----------------------------------

             Summary: Revise log regex plugin to work with table functions
                 Key: DRILL-7298
                 URL: https://issues.apache.org/jira/browse/DRILL-7298
             Project: Apache Drill
          Issue Type: Improvement
    Affects Versions: 1.16.0
            Reporter: Paul Rogers


See the discussion of table properties in the [PR for DRILL-7293|https://github.com/apache/drill/pull/1807]. The logRegex plugin contains a list of {{LogFormatField}} objects:

{code:java}
  private List<LogFormatField> schema;
{code}

As it turns out, such a list cannot be used with table properties. This ticket asks to find a solution, perhaps using the suggestions from the PR.

The log format plugin allows users to read any text file that can be described with a regex. The plugin lets the user provide the regex and a list of fields that match the groups within the regex. These fields are described with the {{schema}} list. Each schema entry defines a name, type and parse pattern.

Because of the versatility of logRegex, it would be great to be able to specify the pattern and fields in a table function so that users do not have to create a new plugin config each time they want to query a new kind of file. DRILL-7293 allows the user to specify the regex and schema using the recently added schema provisioning system. Still, it would be handy to use table functions.

The required changes are to use types that the table functions can handle, which limits choices to strings and numbers. For ad-hoc query use, it might be fine to just list field names. Or, perhaps, if no field names are provided, use the {{columns}} array as in CSV. For ad-hoc use, type conversions can be expressed as casts rather than as types in the table functions.
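
As a rough illustration of the ad-hoc idea above, the sketch below takes a regex plus an optional comma-separated string of field names (a plain string, so a type a table function could carry) and falls back to a single {{columns}} array when no names are given. The class {{LogLineSketch}}, its methods, the comma-separated encoding, and the {{field_N}} fallback for unnamed groups are all assumptions made for illustration; none of them is taken from Drill's code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch only: these names are NOT part of Drill's API.
class LogLineSketch {

  // Splits a comma-separated name list; null or blank input yields an empty list.
  static List<String> splitNames(String fieldNames) {
    List<String> names = new ArrayList<>();
    if (fieldNames == null || fieldNames.trim().isEmpty()) {
      return names;
    }
    for (String name : fieldNames.split(",")) {
      String trimmed = name.trim();
      if (!trimmed.isEmpty()) {
        names.add(trimmed);
      }
    }
    return names;
  }

  // Maps the regex groups of one line to a row. Returns null if the line
  // does not match. All values stay strings; casts would happen in the query.
  static Map<String, Object> parseLine(String regex, String fieldNames, String line) {
    Matcher m = Pattern.compile(regex).matcher(line);
    if (!m.matches()) {
      return null;
    }
    List<String> names = splitNames(fieldNames);
    Map<String, Object> row = new LinkedHashMap<>();
    if (names.isEmpty()) {
      // No names provided: expose every group through one "columns" array.
      List<String> columns = new ArrayList<>();
      for (int i = 1; i <= m.groupCount(); i++) {
        columns.add(m.group(i));
      }
      row.put("columns", columns);
      return row;
    }
    for (int i = 1; i <= m.groupCount(); i++) {
      // Assumed fallback name for groups beyond the supplied name list.
      String name = i <= names.size() ? names.get(i - 1) : "field_" + (i - 1);
      row.put(name, m.group(i));
    }
    return row;
  }

  public static void main(String[] args) {
    String regex = "(\\d{4})-(\\d{2}) (\\w+)";
    System.out.println(parseLine(regex, "year, month, level", "2019-06 ERROR"));
    System.out.println(parseLine(regex, null, "2019-06 ERROR"));
  }
}
```

Keeping every value as a string matches the suggestion that, for ad-hoc use, type conversions be written as casts in the query rather than declared in the table function.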

h4. Backward Compatibility

Care must be taken when changing the config structure of an existing plugin. In the past, Drill would refuse to start if the JSON configs stored in ZK did not match the schema that Jackson expects based on the config class. Any fix for this problem *must* ensure that existing configs do not cause Drill startup to fail. Ideally, configs would be automatically upgraded so that users don't have to take any manual steps when upgrading Drill with the features requested here.
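
One possible way to honor this constraint, sketched below in plain Java without Jackson, is to keep accepting the existing {{schema}} list and consult a new string-valued property only when that list is absent. The class {{SchemaResolveSketch}}, the {{Field}} holder, the {{fieldNames}} property, and the {{VARCHAR}} default type are hypothetical; this is an illustration of the idea, not a proposed implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: these names are NOT part of Drill's API.
class SchemaResolveSketch {

  // Stand-in for a schema entry: name, type and parse pattern.
  static final class Field {
    final String name;
    final String type;
    final String format;

    Field(String name, String type, String format) {
      this.name = name;
      this.type = type;
      this.format = format;
    }
  }

  // Prefers the legacy list so stored configs keep their meaning; otherwise
  // builds fields from a comma-separated name string (assumed new property).
  static List<Field> resolve(List<Field> legacySchema, String fieldNames) {
    if (legacySchema != null && !legacySchema.isEmpty()) {
      return legacySchema;
    }
    List<Field> fields = new ArrayList<>();
    if (fieldNames == null || fieldNames.trim().isEmpty()) {
      return fields;
    }
    for (String name : fieldNames.split(",")) {
      String trimmed = name.trim();
      if (!trimmed.isEmpty()) {
        // Assumed default: untyped string field with no parse pattern.
        fields.add(new Field(trimmed, "VARCHAR", null));
      }
    }
    return fields;
  }

  public static void main(String[] args) {
    List<Field> legacy = new ArrayList<>();
    legacy.add(new Field("ts", "TIMESTAMP", "yyyy-MM-dd"));
    System.out.println(resolve(legacy, "a, b").size());
    System.out.println(resolve(null, "a, b").size());
  }
}
```

Because the legacy list is read as-is and never rewritten, an existing config would keep its types and parse patterns; whether and how stored configs should then be upgraded automatically is left open here.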



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)