Posted to dev@pig.apache.org by "Niels Basjes (JIRA)" <ji...@apache.org> on 2015/07/23 23:06:04 UTC

[jira] [Updated] (PIG-4639) Add better parser for Apache HTTPD access log.

     [ https://issues.apache.org/jira/browse/PIG-4639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated PIG-4639:
------------------------------
    Issue Type: New Feature  (was: Improvement)

> Add better parser for Apache HTTPD access log.
> ----------------------------------------------
>
>                 Key: PIG-4639
>                 URL: https://issues.apache.org/jira/browse/PIG-4639
>             Project: Pig
>          Issue Type: New Feature
>          Components: piggybank
>    Affects Versions: 0.15.0
>            Reporter: Niels Basjes
>            Assignee: Niels Basjes
>             Fix For: 0.16.0
>
>
> Currently piggybank has two parsers for Apache log files, and they only support the 'combined' and 'common' LogFormats. Both also parse only the basic fields.
> This is a proposed patch to add the existing https://github.com/nielsbasjes/logparser (Apache 2.0 license) as an 'out of the box' parser to piggybank. It supports (almost) all LogFormat specifiers and so adds parsing capabilities for (almost) all custom LogFormats used in production scenarios.
> This parser also goes much deeper: it can extract things like the value of a cookie or the value of a query string parameter.
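
To make "goes much deeper" concrete, here is a minimal Python sketch of the kind of extraction described above: pulling one query string parameter out of a 'common' format line, and one cookie out of a Cookie header value. It uses only the Python standard library and is not the logparser API; the regex, the helper names (query_param, cookie_value) and the sample data are assumptions made up for illustration.

```python
import re
from http.cookies import SimpleCookie
from urllib.parse import parse_qs, urlsplit

# Regex for Apache's 'common' LogFormat: %h %l %u %t "%r" %>s %b
# (hand-written for this sketch; not taken from logparser)
COMMON_LOG = re.compile(
    r'(?P<host>\S+) (?P<logname>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def query_param(line, name):
    """Return the first value of query string parameter `name`, or None."""
    match = COMMON_LOG.match(line)
    if match is None:
        return None  # line is not in the 'common' format
    query = urlsplit(match.group("uri")).query
    values = parse_qs(query).get(name)
    return values[0] if values else None


def cookie_value(cookie_header, name):
    """Return the value of cookie `name` from a Cookie header, or None."""
    cookies = SimpleCookie()
    cookies.load(cookie_header)
    return cookies[name].value if name in cookies else None


# Made-up sample data, for illustration only.
sample_line = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
               '"GET /search?q=pig&page=2 HTTP/1.0" 200 2326')
sample_cookies = 'session=abc123; theme=dark'

if __name__ == "__main__":
    print(query_param(sample_line, "q"))
    print(cookie_value(sample_cookies, "session"))
```

A fixed regex like this only handles one LogFormat, which is the limitation the issue describes for the existing piggybank loaders; supporting arbitrary custom LogFormat strings requires deriving the parser from the format specification itself.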



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)