Posted to dev@drill.apache.org by Charles Givre <cg...@gmail.com> on 2019/09/22 14:55:01 UTC

[DISCUSS]: PCAP Reader Improvements

Hello all, 
I'm contemplating some improvements to Drill's PCAP reader.  Specifically, I'd like Drill to be able to parse some of the actual packet data.  I was thinking of using Kaitai Struct to do so, since it already has parsers for common packet formats.  An example of this is the DNS parser (https://formats.kaitai.io/dns_packet/java.html).

I was thinking of doing the following:
1.  Converting the PCAP plugin to use the EVF.
2.  Including a config option to turn the parsing on/off.
3.  Having the appropriate parser read and parse the data and store it into a Drill map. 
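
To make steps 2 and 3 concrete, here is a minimal sketch, assuming a hand-rolled parser as a stand-in for whatever Kaitai Struct would generate. The class, method, and field names are all hypothetical and are not part of Drill or Kaitai Struct. It reads only the first eight bytes of the 12-byte DNS header, stores the values in a map, and returns an empty map when the (assumed) option is off.

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from Drill or Kaitai Struct.
class DnsHeaderSketch {

    // parsePayload stands in for the proposed on/off config option (step 2).
    static Map<String, Object> readPayload(byte[] payload, boolean parsePayload) {
        Map<String, Object> packetData = new LinkedHashMap<>();
        if (!parsePayload || payload.length < 12) {
            return packetData; // parsing disabled, or too short to hold a DNS header
        }
        // ByteBuffer is big-endian by default, which matches network byte order.
        ByteBuffer buf = ByteBuffer.wrap(payload);
        packetData.put("transaction_id", buf.getShort() & 0xFFFF);
        int flags = buf.getShort() & 0xFFFF;
        packetData.put("is_response", (flags & 0x8000) != 0); // QR is the top flag bit
        packetData.put("question_count", buf.getShort() & 0xFFFF);
        packetData.put("answer_count", buf.getShort() & 0xFFFF);
        return packetData;
    }

    public static void main(String[] args) {
        byte[] query = {
            0x12, 0x34, // transaction ID
            0x01, 0x00, // flags: standard query, recursion desired
            0x00, 0x01, // one question
            0x00, 0x00, // no answers
            0x00, 0x00, // no authority records
            0x00, 0x00  // no additional records
        };
        System.out.println(readPayload(query, true));  // parsed header fields
        System.out.println(readPayload(query, false)); // empty map, parsing is off
    }
}
```

In the real plugin the map would be a Drill map column rather than a java.util.Map, and the dissector would come from a generated parser.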

Does anyone have any comments or thoughts on the matter?
Thanks,
-- C


Re: [DISCUSS]: PCAP Reader Improvements

Posted by Ted Dunning <te...@gmail.com>.
Another thought is to have a separate (optional) map field for each
possible protocol.

Thus, you would have one map for DNS, another for ICMP, and
so on. This would allow each map to have a fixed format.
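
A rough sketch of that layout, assuming plain Java collections in place of Drill's vectors. The protocol names are the ones mentioned above; the field names and the class are invented for illustration. Each protocol has a fixed field list, and in any given row only the map for that packet's protocol is populated.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: field names are invented and no Drill classes are used.
class ProtocolMapsSketch {

    // Each protocol gets its own map column with a field list fixed in advance.
    static final Map<String, List<String>> FIXED_FIELDS = Map.of(
        "dns", List.of("transaction_id", "question_count"),
        "icmp", List.of("type", "code"));

    // Builds one row: the map for the packet's protocol is filled in,
    // and every other protocol map is left null.
    static Map<String, Map<String, Object>> toRow(String protocol, Map<String, Object> parsed) {
        Map<String, Map<String, Object>> row = new LinkedHashMap<>();
        for (String name : List.of("dns", "icmp")) {
            if (!name.equals(protocol)) {
                row.put(name, null);
                continue;
            }
            Map<String, Object> fixed = new LinkedHashMap<>();
            for (String field : FIXED_FIELDS.get(name)) {
                // Fields absent from the packet become null; extra fields are dropped.
                fixed.put(field, parsed.get(field));
            }
            row.put(name, fixed);
        }
        return row;
    }

    public static void main(String[] args) {
        System.out.println(toRow("icmp", Map.of("type", 8, "code", 0, "checksum", 1234)));
    }
}
```

The trade-off is that the schema never changes from packet to packet, at the cost of one mostly-null column per supported protocol.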



On Sun, Sep 22, 2019 at 9:46 AM Charles Givre <cg...@gmail.com> wrote:

> Hi Ted,
> EVF = Enhanced Vector Framework. Complete tutorial here:
> https://github.com/paul-rogers/drill/wiki/Developer%27s-Guide-to-the-Enhanced-Vector-Framework#basics-tutorial
> Basically, what I was thinking is that we can use the EVF to define the
> schema for known columns (i.e., the level 1 & 2 headers).  EVF handles
> projection pushdown, so we could eliminate a lot of that logic in the
> plugin.  EVF also allows dynamic schema discovery, so we could create a
> map called packet_data or whatever, and it would be populated with
> whatever fields exist in the packet.  We would need to write or otherwise
> obtain protocol dissectors for the different protocols, but I'm going to
> start with DNS since I need that for work.  I'm pretty sure that the EVF
> allows for variant maps, so if you have a DNS packet and an ICMP packet,
> you'd get different fields in the map.
> -- C

Re: [DISCUSS]: PCAP Reader Improvements

Posted by Charles Givre <cg...@gmail.com>.
Hi Ted, 
EVF = Enhanced Vector Framework. Complete tutorial here: https://github.com/paul-rogers/drill/wiki/Developer%27s-Guide-to-the-Enhanced-Vector-Framework#basics-tutorial
Basically, what I was thinking is that we can use the EVF to define the schema for known columns (i.e., the level 1 & 2 headers).  EVF handles projection pushdown, so we could eliminate a lot of that logic in the plugin.  EVF also allows dynamic schema discovery, so we could create a map called packet_data or whatever, and it would be populated with whatever fields exist in the packet.  We would need to write or otherwise obtain protocol dissectors for the different protocols, but I'm going to start with DNS since I need that for work.  I'm pretty sure that the EVF allows for variant maps, so if you have a DNS packet and an ICMP packet, you'd get different fields in the map.
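
As a rough model of the idea only (this does not use the real EVF classes, and every name in it is hypothetical), the sketch below keeps a fixed list of known columns, materializes only the columns the query projects, and fills a packet_data map with whatever fields a dissector produced for that packet.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the idea; it does not call any EVF or Drill API.
class PacketRowSketch {

    // Columns whose names are known before any packet is read (illustrative names).
    static final List<String> KNOWN_COLUMNS = List.of("src_ip", "dst_ip", "protocol");

    // Builds one output row, materializing only the projected columns.
    static Map<String, Object> buildRow(Map<String, Object> headers,
                                        Map<String, Object> dissected,
                                        Set<String> projected) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (String column : KNOWN_COLUMNS) {
            if (projected.contains(column)) {
                row.put(column, headers.get(column));
            }
        }
        // The map's fields are whatever the dissector found, so they vary per packet.
        if (projected.contains("packet_data")) {
            row.put("packet_data", new LinkedHashMap<>(dissected));
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> headers =
            Map.of("src_ip", "10.0.0.1", "dst_ip", "10.0.0.2", "protocol", "UDP");
        Set<String> projected = Set.of("src_ip", "packet_data");
        // A DNS packet and an ICMP packet yield different fields inside packet_data.
        System.out.println(buildRow(headers, Map.of("question_count", 1), projected));
        System.out.println(buildRow(headers, Map.of("type", 8), projected));
    }
}
```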
-- C




> On Sep 22, 2019, at 11:30 AM, Ted Dunning <te...@gmail.com> wrote:
> 
> This sounds amazing.
> 
> Some questions.
> 
> What is EVF?
> 
> How can you deal with the problem of variant maps?


Re: [DISCUSS]: PCAP Reader Improvements

Posted by Ted Dunning <te...@gmail.com>.
This sounds amazing.

Some questions.

What is EVF?

How can you deal with the problem of variant maps?

On Sun, Sep 22, 2019, 7:55 AM Charles Givre <cg...@gmail.com> wrote:

> Hello all,
> I'm contemplating some improvements to Drill's PCAP reader.  Specifically,
> I'd like Drill to be able to parse some of the actual packet data.  I was
> thinking of using Kaitai Struct to do so, since it already has parsers for
> common packet formats.  An example of this is the DNS parser
> (https://formats.kaitai.io/dns_packet/java.html).
>
> I was thinking of doing the following:
> 1.  Converting the PCAP plugin to use the EVF.
> 2.  Including a config option to turn the parsing on/off.
> 3.  Having the appropriate parser read and parse the data and store it
> into a Drill map.
>
> Does anyone have any comments or thoughts on the matter?
> Thanks,
> -- C
>
>