Posted to issues@drill.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/11/12 14:37:00 UTC

[jira] [Commented] (DRILL-7443) Enable PCAP Plugin to Reassemble TCP Streams

    [ https://issues.apache.org/jira/browse/DRILL-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972458#comment-16972458 ] 

ASF GitHub Bot commented on DRILL-7443:
---------------------------------------

cgivre commented on pull request #1898: DRILL-7443: Enable PCAP Plugin to Reassemble TCP Streams
URL: https://github.com/apache/drill/pull/1898
 
 
   One common task in network forensics is reassembling TCP streams from captured network data.  This PR adds this capability to Drill.
   
   ## Usage
   
   To enable TCP re-sessionization, set `sessionizeTCPStreams` to `true` in the configuration for the PCAP reader.
   
   This can also be done at query time by using the `table()` function.
   ```
   SELECT * FROM table(dfs.test.`attack-trace.pcap` (type => 'pcap', sessionizeTCPStreams => true))
   ```
   ## Results
   **When this option is enabled, Drill will ignore all packets that are not TCP packets.**
   Executing a query with this option enabled changes the results Drill returns from PCAP files.
   
   You will get the following columns:
   * `session_start_time`:  The start time of the session
   * `session_end_time`:  The ending time of the session
   * `session_duration`:  The duration of the session. This will be a Drill PERIOD datatype.
   * `total_packet_count`:  The number of packets in the session
   * `connection_time`:  The amount of time it took for the TCP handshake to be completed.  Useful for network diagnostics.
   * `src_ip`:  The IP address of the initiating machine
   * `dst_ip`:  The IP address of the remote machine
   * `src_port`:  The port of the originating machine
   * `dst_port`:  The port of the remote machine
   * `src_mac_address`:  The MAC address of the originating machine
   * `dst_mac_address`:  The MAC address of the remote machine
   * `tcp_session`:  This is the session hash for the TCP session.  (Long)
   * `is_corrupt`:  True if the session contains corrupted packets, false otherwise
   * `data_from_originator`:  The data sent from the originator
   * `data_from_remote`:  The data sent from the remote machine
   * `data_volume_from_remote`: The number of bytes sent from the remote host
   * `data_volume_from_origin`:  The number of bytes sent from the originating machine
   * `packet_count_from_origin`:  The number of packets sent from the originating machine
   * `packet_count_from_remote`:  The number of packets sent from the remote machine
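   
   As a rough sketch of how these columns might be used, the query below lists the ten largest sessions by packet count. It reuses the `dfs.test` workspace and `attack-trace.pcap` file from the usage example above, and it assumes that `is_corrupt` is returned as a boolean; adjust the filter if your build returns a different type.
   
   ```sql
   -- Illustrative only: the workspace and file name come from the usage example above.
   -- Assumes is_corrupt is a BOOLEAN column.
   SELECT src_ip,
          dst_ip,
          dst_port,
          session_duration,
          total_packet_count,
          data_volume_from_origin,
          data_volume_from_remote
   FROM table(dfs.test.`attack-trace.pcap` (type => 'pcap', sessionizeTCPStreams => true))
   WHERE is_corrupt = false
   ORDER BY total_packet_count DESC
   LIMIT 10
   ```
   
   Selecting named columns rather than `*` avoids pulling the potentially large `data_from_originator` and `data_from_remote` payloads when only session metadata is needed.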
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Enable PCAP Plugin to Reassemble TCP Streams
> --------------------------------------------
>
>                 Key: DRILL-7443
>                 URL: https://issues.apache.org/jira/browse/DRILL-7443
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Other
>    Affects Versions: 1.16.0
>            Reporter: Charles Givre
>            Assignee: Charles Givre
>            Priority: Major
>
> One common task in network forensics is reassembling TCP streams from captured network data.  This PR adds this capability to Drill.
> h2. Usage
> To enable TCP re-sessionization, in the configuration for the PCAP reader, simply set the variable: {{sessionizeTCPStreams}} to {{true}}.
> This can also be accomplished at query time by using the {{table()}} method.
> {{SELECT * FROM table(dfs.test.`attack-trace.pcap` (type => 'pcap', sessionizeTCPStreams=> true))}}
> h3. Results
> *When this option is enabled, Drill will ignore all packets that are not TCP packets.*
> Executing a query with this option enabled changes the results Drill returns from PCAP files.
> You will get the following columns:
> * session_start_time:  The start time of the session
> * session_end_time:  The ending time of the session
> * session_duration:  The duration of the session. This will be a Drill PERIOD datatype.
> * total_packet_count:  The number of packets in the session
> * connection_time:  The amount of time it took for the TCP handshake to be completed.  Useful for network diagnostics
> * src_ip:  The IP address of the initiating machine
> * dst_ip:  The IP address of the remote machine
> * src_port:  The port of the originating machine
> * dst_port:  The port of the remote machine
> * src_mac_address:  The MAC address of the originating machine
> * dst_mac_address:  The MAC address of the remote machine
> * tcp_session:  This is the session hash for the TCP session.  (Long)
> * is_corrupt:  True if the session contains corrupted packets, false otherwise
> * data_from_originator:  The data sent from the originator
> * data_from_remote:  The data sent from the remote machine
> * data_volume_from_remote: The number of bytes sent from the remote host
> * data_volume_from_origin:  The number of bytes sent from the originating machine
> * packet_count_from_origin:  The number of packets sent from the originating machine
> * packet_count_from_remote:  The number of packets sent from the remote machine
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)