Posted to issues@metron.apache.org by "Casey Stella (JIRA)" <ji...@apache.org> on 2016/04/26 15:26:12 UTC

[jira] [Created] (METRON-119) Move the PCAP topology from HBase

Casey Stella created METRON-119:
-----------------------------------

             Summary: Move the PCAP topology from HBase
                 Key: METRON-119
                 URL: https://issues.apache.org/jira/browse/METRON-119
             Project: Metron
          Issue Type: Improvement
            Reporter: Casey Stella
            Assignee: Casey Stella


As it stands, the existing approach to handling PCAP data has trouble with high-volume packet capture data.  With the advent of a DPDK plugin for capturing packet data, we are going to hit limits on consumption throughput if we continue to try to push packet data into HBase at line speed.

Furthermore, storing PCAP data in HBase limits the range of filter queries that we can perform (i.e. only those expressible within the key).  As of now, we require all fields to be present (source IP/port, destination IP/port and protocol), rather than allowing any wildcards.

To address these issues, we should create a higher-performance topology which reads the raw packet and timestamp from Kafka (as placed onto Kafka by the packet capture sensor), attaches the appropriate header to the packet, and appends it to a sequence file in HDFS.  The sequence file will be rolled based on number of packets or time (e.g. 1 hour's worth of packets in a given sequence file).
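
As a rough illustration of the write path described above, here is a minimal sketch in Python. It assumes that "the appropriate header" means the standard 16-byte libpcap per-packet record header and that the Kafka timestamp is in microseconds; the issue states neither. The names pcap_record and RollPolicy are hypothetical, and the actual sequence file and HDFS I/O is left out.

```python
import struct


def pcap_record(ts_micros: int, packet: bytes) -> bytes:
    """Prefix a raw packet with a libpcap per-packet record header.

    Assumption: the timestamp from Kafka is in microseconds since the epoch.
    Header layout (little-endian): ts_sec, ts_usec, incl_len, orig_len.
    """
    ts_sec, ts_usec = divmod(ts_micros, 1_000_000)
    header = struct.pack("<IIII", ts_sec, ts_usec, len(packet), len(packet))
    return header + packet


class RollPolicy:
    """Hypothetical policy: roll the file on packet count or on elapsed time."""

    def __init__(self, max_packets: int, max_age_micros: int):
        self.max_packets = max_packets
        self.max_age_micros = max_age_micros
        self.reset()

    def reset(self) -> None:
        """Call after a new file has been opened."""
        self.count = 0
        self.first_ts = None

    def record(self, ts_micros: int) -> None:
        """Call after a packet has been appended to the current file."""
        if self.first_ts is None:
            self.first_ts = ts_micros
        self.count += 1

    def should_roll(self, ts_micros: int) -> bool:
        """True if the current file should be closed before writing this packet."""
        if self.first_ts is None:
            return False  # nothing written yet, so nothing to roll
        too_many = self.count >= self.max_packets
        too_old = ts_micros - self.first_ts >= self.max_age_micros
        return too_many or too_old
```

A writer would call should_roll before each append, open a new file and call reset when it returns True, then append the output of pcap_record and call record. Using the packet timestamp rather than wall-clock time for the age check is a design choice made here for the sketch, not something the issue specifies.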

On the query side, we should adjust the middle-tier service layer to start a MapReduce (MR) job on the appropriate set of sequence files to select the matching packets.  NOTE: the UI modifications to make this reasonable for the end-user will need to be done in a follow-on JIRA.
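
The filtering such a job would perform can be sketched as follows, again in Python and purely for illustration. The names FiveTuple, PacketFilter and select_packets are hypothetical, and the sketch assumes each record has already been parsed into a timestamp, a five-tuple and a payload. A field left as None acts as a wildcard, which is the kind of query the key-based HBase lookup cannot express.

```python
from dataclasses import dataclass, fields
from typing import Iterable, Iterator, Optional, Tuple


@dataclass(frozen=True)
class FiveTuple:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: int


@dataclass(frozen=True)
class PacketFilter:
    """Each field is optional; None means "match anything" (a wildcard)."""
    src_ip: Optional[str] = None
    src_port: Optional[int] = None
    dst_ip: Optional[str] = None
    dst_port: Optional[int] = None
    protocol: Optional[int] = None

    def matches(self, key: FiveTuple) -> bool:
        for f in fields(self):
            wanted = getattr(self, f.name)
            if wanted is not None and wanted != getattr(key, f.name):
                return False
        return True


def select_packets(
    records: Iterable[Tuple[int, FiveTuple, bytes]],
    flt: PacketFilter,
    start_micros: int,
    end_micros: int,
) -> Iterator[Tuple[int, bytes]]:
    """Map-side filter: keep packets in [start, end) that match the filter."""
    for ts, key, payload in records:
        if start_micros <= ts < end_micros and flt.matches(key):
            yield ts, payload
```

For example, PacketFilter(dst_port=80) selects all traffic to port 80 regardless of source, something that is not possible when every field of the key must be supplied.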



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)