Posted to dev@drill.apache.org by "Charles Givre (Jira)" <ji...@apache.org> on 2020/03/30 13:42:00 UTC
[jira] [Resolved] (DRILL-4955) Log Parser for Drill
[ https://issues.apache.org/jira/browse/DRILL-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Charles Givre resolved DRILL-4955.
----------------------------------
Resolution: Resolved
> Log Parser for Drill
> --------------------
>
> Key: DRILL-4955
> URL: https://issues.apache.org/jira/browse/DRILL-4955
> Project: Apache Drill
> Issue Type: New Feature
> Components: Storage - Text & CSV
> Affects Versions: 1.9.0
> Reporter: Charles Givre
> Priority: Major
> Labels: features
> Fix For: Future
>
>
> I've been experimenting with a generic log parser for Drill. The motivating case is getting Drill to ingest log files such as this MySQL log:
> {code}
> 070823 21:00:32 1 Connect root@localhost on test1
> 070823 21:00:48 1 Query show tables
> 070823 21:00:56 1 Query select * from category
> 070917 16:29:01 21 Query select * from location
> 070917 16:29:12 21 Query select * from location where id = 1 LIMIT 1
> {code}
> You could probably parse this with string manipulation functions such as split and substring, but you'd end up with ugly and very complex queries.
> The extension I've built lets you supply Drill with a regex describing the line format and a list of field names, one per capture group, as shown below.
> {code}
> "log": {
>   "type": "log",
>   "extensions": [
>     "log"
>   ],
>   "fieldNames": [
>     "date",
>     "time",
>     "pid",
>     "action",
>     "query"
>   ],
>   "pattern": "(\\d{6})\\s(\\d{2}:\\d{2}:\\d{2})\\s+(\\d+)\\s(\\w+)\\s+(.+)"
> }
> {code}
> You can then query log files in this format in Drill. I'd like to submit this for inclusion in Drill if there is interest.
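As a rough illustration of the configuration described in the issue, the following Python sketch applies the same regex to the sample MySQL log lines and pairs each capture group with the corresponding entry in fieldNames. It is not the plugin's actual code (Drill is written in Java). The function name parse_line and the choice to return None for lines that do not match are assumptions made for this example only.

```python
import re

# Regex and field names copied from the plugin configuration in the issue,
# with the JSON string escaping removed.
PATTERN = re.compile(r"(\d{6})\s(\d{2}:\d{2}:\d{2})\s+(\d+)\s(\w+)\s+(.+)")
FIELD_NAMES = ["date", "time", "pid", "action", "query"]

# Sample MySQL log lines from the issue description.
SAMPLE_LOG = """\
070823 21:00:32 1 Connect root@localhost on test1
070823 21:00:48 1 Query show tables
070823 21:00:56 1 Query select * from category
070917 16:29:01 21 Query select * from location
070917 16:29:12 21 Query select * from location where id = 1 LIMIT 1
"""


def parse_line(line):
    """Map each capture group to its field name.

    Hypothetical helper: returning None for a non-matching line is an
    assumption of this sketch, not documented plugin behavior.
    """
    match = PATTERN.match(line)
    if match is None:
        return None
    return dict(zip(FIELD_NAMES, match.groups()))


rows = [parse_line(line) for line in SAMPLE_LOG.splitlines()]
for row in rows:
    print(row)
```

Each matched line becomes one record whose keys are the configured field names. All values are captured as strings, so any conversion (for example, pid to an integer) would have to happen in a later step.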