Posted to issues@drill.apache.org by "Charles Givre (Jira)" <ji...@apache.org> on 2019/11/12 14:05:00 UTC

[jira] [Created] (DRILL-7443) Enable PCAP Plugin to Reassemble TCP Streams

Charles Givre created DRILL-7443:
------------------------------------

             Summary: Enable PCAP Plugin to Reassemble TCP Streams
                 Key: DRILL-7443
                 URL: https://issues.apache.org/jira/browse/DRILL-7443
             Project: Apache Drill
          Issue Type: Improvement
          Components: Storage - Other
    Affects Versions: 1.16.0
            Reporter: Charles Givre
            Assignee: Charles Givre


One common task in network forensics is reassembling TCP streams from captured network data.  This PR adds this capability to Drill.

h2. Usage

To enable TCP stream reassembly (sessionization), set the {{sessionizeTCPStreams}} option to {{true}} in the configuration for the PCAP reader.

The option can also be set at query time by using the {{table()}} function:

{{SELECT * FROM table(dfs.test.`attack-trace.pcap` (type => 'pcap', sessionizeTCPStreams => true))}}

h3. Results
*When this option is enabled, Drill will ignore all packets that are not TCP packets.*
Executing a query with this option enabled changes the results Drill returns from PCAP files.

You will get the following columns:
* session_start_time:  The start time of the session
* session_end_time:  The ending time of the session
* session_duration:  The duration of the session, returned as a Drill PERIOD datatype
* total_packet_count:  The number of packets in the session
* connection_time:  The amount of time it took to complete the TCP handshake.  Useful for network diagnostics.
* src_ip:  The IP address of the initiating machine
* dst_ip:  The IP address of the remote machine
* src_port:  The port of the originating machine
* dst_port:  The port of the remote machine
* src_mac_address:  The MAC address of the originating machine
* dst_mac_address:  The MAC address of the remote machine
* tcp_session:  The hash that identifies the TCP session (Long)
* is_corrupt:  True if the session contains corrupted packets, false otherwise
* data_from_originator:  The data sent from the originator
* data_from_remote:  The data sent from the remote machine
* data_volume_from_remote:  The number of bytes sent from the remote machine
* data_volume_from_origin:  The number of bytes sent from the originating machine
* packet_count_from_origin:  The number of packets sent from the originating machine
* packet_count_from_remote:  The number of packets sent from the remote machine
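
To make the per-session columns above more concrete, the following is a minimal Python sketch of how such values could be derived from individual packets. It is not the Drill implementation. The {{Packet}} class, {{session_key}} and {{sessionize}} are hypothetical names introduced for illustration, and the sketch makes simplifying assumptions: packets are ordered by timestamp rather than by TCP sequence number, the sender of the earliest packet is treated as the originator, and handshake timing, MAC addresses and corruption checks are left out.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Packet:
    """A simplified TCP packet (hypothetical structure, not Drill's)."""
    timestamp: float  # seconds
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    payload: bytes


def session_key(p):
    """Direction-independent key so both directions land in one session."""
    return tuple(sorted([(p.src_ip, p.src_port), (p.dst_ip, p.dst_port)]))


def sessionize(packets):
    """Group packets by session and compute per-session summary fields."""
    groups = defaultdict(list)
    for p in packets:
        groups[session_key(p)].append(p)

    sessions = []
    for pkts in groups.values():
        # Assumption: timestamp order stands in for sequence-number order.
        pkts.sort(key=lambda p: p.timestamp)
        first, last = pkts[0], pkts[-1]
        # Assumption: whoever sent the earliest packet is the originator.
        origin = (first.src_ip, first.src_port)
        from_origin = [p for p in pkts if (p.src_ip, p.src_port) == origin]
        from_remote = [p for p in pkts if (p.src_ip, p.src_port) != origin]
        data_origin = b"".join(p.payload for p in from_origin)
        data_remote = b"".join(p.payload for p in from_remote)
        sessions.append({
            "session_start_time": first.timestamp,
            "session_end_time": last.timestamp,
            "session_duration": last.timestamp - first.timestamp,
            "total_packet_count": len(pkts),
            "src_ip": first.src_ip,
            "dst_ip": first.dst_ip,
            "src_port": first.src_port,
            "dst_port": first.dst_port,
            "data_from_originator": data_origin,
            "data_from_remote": data_remote,
            "data_volume_from_origin": len(data_origin),
            "data_volume_from_remote": len(data_remote),
            "packet_count_from_origin": len(from_origin),
            "packet_count_from_remote": len(from_remote),
        })
    # Return sessions in start-time order for deterministic output.
    sessions.sort(key=lambda s: s["session_start_time"])
    return sessions
```

Calling {{sessionize()}} on a list of {{Packet}} objects yields one dictionary per session, mirroring the one-row-per-session shape of the columns listed above.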