Posted to user@flink.apache.org by Aarti Gupta <aa...@qualys.com> on 2018/08/30 11:45:22 UTC

Grok and Flink

Hi,

We are using the Grok filter in Logstash to parse and enrich our data. Grok
provides built-in parsing for common log sources such as Apache, which
allows us to add structure to unstructured data.

After the data has been parsed in Logstash, we then stream the data over
Kafka to Flink for further CEP processing.
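
For context, the Flink side currently consumes the parsed events from Kafka
roughly like this (a minimal sketch assuming the Kafka 0.11 connector; the
broker address, group id, and topic name are placeholders):

    import java.util.Properties;

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;

    public class LogPipeline {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            // Placeholder connection settings -- adjust to your cluster.
            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092");
            props.setProperty("group.id", "flink-cep");

            // One parsed event per Kafka record, as produced by Logstash.
            DataStream<String> events = env.addSource(new FlinkKafkaConsumer011<>(
                    "parsed-logs", new SimpleStringSchema(), props));

            events.print();
            env.execute("Log pipeline");
        }
    }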

We are looking to see if we can get rid of the Logstash piece and do all of
the data enrichment and parsing in Flink.

Our question: does Flink have a built-in library, similar to Grok, that
provides out-of-the-box parsing for common log formats?

Thanks in advance,
Aarti

-- 
Aarti Gupta <https://www.linkedin.com/company/qualys>
Director, Engineering, Correlation


aagupta@qualys.com


Qualys, Inc. – Blog <https://qualys.com/blog> | Community
<https://community.qualys.com> | Twitter <https://twitter.com/qualys>



Re: Grok and Flink

Posted by Aarti Gupta <aa...@qualys.com>.
Interesting, thanks Lehuede. Will take a look.

--Aarti

On Thu, Aug 30, 2018 at 5:59 PM, Lehuede sebastien <le...@gmail.com>
wrote:

> Hi,
>
> To parse my logs and reuse all my Grok patterns, I use the Java Grok API
> directly in my DataStream. Please see:
> https://github.com/thekrakken/java-grok
>
> With that you should be able to get rid of the full Logstash piece and use
> only the Grok part.
>
> Another option: if you have logs/events in CEF format, you can simply use
> 'split' in a flatMap function.
>
> Hope this helps.
>
> Regards,
> Sebastien.
>



-- 
Aarti Gupta <https://www.linkedin.com/company/qualys>
Director, Engineering, Correlation


aagupta@qualys.com


Qualys, Inc. – Blog <https://qualys.com/blog> | Community
<https://community.qualys.com> | Twitter <https://twitter.com/qualys>



Re: Grok and Flink

Posted by Lehuede sebastien <le...@gmail.com>.
Hi,

To parse my logs and reuse all my Grok patterns, I use the Java Grok API
directly in my DataStream. Please see:
https://github.com/thekrakken/java-grok

With that you should be able to get rid of the full Logstash piece and use
only the Grok part.
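
A minimal sketch of how that can look (assuming the current java-grok API
under the io.krakens.grok.api package, and %{COMBINEDAPACHELOG} as an
example pattern):

    import java.util.Map;

    import io.krakens.grok.api.Grok;
    import io.krakens.grok.api.GrokCompiler;
    import io.krakens.grok.api.Match;
    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;

    // Parses one Apache access-log line per record into named fields.
    public class GrokParser extends RichMapFunction<String, Map<String, Object>> {

        private transient Grok grok;

        @Override
        public void open(Configuration parameters) {
            // Build Grok in open(): the compiled pattern is not serializable,
            // so it must be created on the task manager, not on the client.
            GrokCompiler compiler = GrokCompiler.newInstance();
            compiler.registerDefaultPatterns();
            grok = compiler.compile("%{COMBINEDAPACHELOG}");
        }

        @Override
        public Map<String, Object> map(String line) {
            Match match = grok.match(line);
            return match.capture(); // pattern field name -> extracted value
        }
    }

Then plug it into the stream with something like this, where 'lines' is
your DataStream<String> of raw log lines:

    DataStream<Map<String, Object>> parsed = lines.map(new GrokParser());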

Another option: if you have logs/events in CEF format, you can simply use
'split' in a flatMap function.
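
A minimal sketch of that approach (the header field names follow the CEF
spec; escaped pipes inside values are ignored for brevity):

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.flink.api.common.functions.FlatMapFunction;
    import org.apache.flink.util.Collector;

    // Splits a CEF line into its seven header fields plus the extension blob.
    // Note: escaped pipes ("\|") inside field values are not handled here.
    public class CefSplitter implements FlatMapFunction<String, Map<String, String>> {

        private static final String[] HEADER = {
            "version", "deviceVendor", "deviceProduct", "deviceVersion",
            "signatureId", "name", "severity", "extension"
        };

        @Override
        public void flatMap(String line, Collector<Map<String, String>> out) {
            // Limit 8 keeps any pipes inside the extension part intact.
            String[] parts = line.split("\\|", 8);
            if (parts.length < 8 || !parts[0].startsWith("CEF:")) {
                return; // drop malformed lines
            }
            Map<String, String> event = new HashMap<>();
            event.put(HEADER[0], parts[0].substring(4)); // strip "CEF:" prefix
            for (int i = 1; i < HEADER.length; i++) {
                event.put(HEADER[i], parts[i]);
            }
            out.collect(event);
        }
    }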

Hope this helps.

Regards,
Sebastien.