Posted to issues@metron.apache.org by "Casey Stella (JIRA)" <ji...@apache.org> on 2016/04/26 15:26:12 UTC
[jira] [Created] (METRON-119) Move the PCAP topology from HBase
Casey Stella created METRON-119:
-----------------------------------
Summary: Move the PCAP topology from HBase
Key: METRON-119
URL: https://issues.apache.org/jira/browse/METRON-119
Project: Metron
Issue Type: Improvement
Reporter: Casey Stella
Assignee: Casey Stella
As it stands, the existing approach to PCAP data has trouble handling high-volume packet capture data. With the advent of a DPDK plugin for capturing packet data, we are going to hit limits on consumption throughput if we continue to try to push packet data into HBase at line speed.
Furthermore, storing PCAP data in HBase limits the range of filter queries that we can perform (i.e., only those expressible within the key). As of now, we require all fields to be present (source IP/port, destination IP/port, and protocol), rather than allowing any wildcards.
To address these issues, we should create a higher-performance topology which reads the raw packet and timestamp from Kafka (as placed onto Kafka by the packet capture sensor), attaches the appropriate header, and appends the packet to a sequence file in HDFS. The sequence file will be rolled based on number of packets or time (e.g., 1 hour's worth of packets in a given sequence file).
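As a rough illustration of the rolling rule described above, here is a minimal Python sketch. It is not Metron code: the class name `RollingPacketWriter`, its parameters, and the in-memory lists standing in for HDFS sequence files are all assumptions made for the example. A new "file" is opened when the current one reaches a packet count limit or when the incoming packet's timestamp is far enough past the timestamp at which the file was opened.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class RollingPacketWriter:
    """Hypothetical stand-in for a sequence-file writer that rolls by count or time."""

    max_packets: int      # roll once the current file holds this many packets
    max_seconds: float    # roll once the current file spans this many seconds
    # Each inner list stands in for one sequence file of (timestamp, packet) records.
    files: List[List[Tuple[float, bytes]]] = field(default_factory=list)
    _opened_at: float = 0.0

    def _should_roll(self, ts: float) -> bool:
        current = self.files[-1]
        too_many = len(current) >= self.max_packets
        too_old = ts - self._opened_at >= self.max_seconds
        return too_many or too_old

    def append(self, ts: float, packet: bytes) -> None:
        # Open the first file, or roll to a new one when a limit is hit.
        if not self.files or self._should_roll(ts):
            self.files.append([])
            self._opened_at = ts
        self.files[-1].append((ts, packet))


writer = RollingPacketWriter(max_packets=2, max_seconds=3600)
writer.append(0.0, b"a")
writer.append(1.0, b"b")
writer.append(2.0, b"c")      # count limit reached, so this starts a second file
writer.append(3700.0, b"d")   # more than an hour after the second file opened
```

In this sketch the time check uses packet timestamps rather than wall-clock time; whether the real topology should roll on event time or processing time is a design choice the issue leaves open.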
On the query side, we should adjust the middle-tier service layer to start a MapReduce (MR) job on the appropriate set of sequence files to select the matching packets. NOTE: the UI modifications to make this reasonable for the end-user will need to be done in a follow-on JIRA.
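To make the wildcard idea concrete, the following Python sketch shows the kind of per-record predicate a filtering job might apply. It is only an illustration under stated assumptions: `FlowKey`, `FlowFilter`, and `filter_packets` are hypothetical names, the flow fields are assumed to be already parsed out of each packet, and a plain Python generator stands in for the map phase of the job. A field left as `None` in the filter acts as a wildcard.

```python
from typing import Iterable, Iterator, NamedTuple, Optional, Tuple


class FlowKey(NamedTuple):
    """Hypothetical parsed flow fields for one packet."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: int


class FlowFilter(NamedTuple):
    """Query filter; any field left as None matches every value."""
    src_ip: Optional[str] = None
    src_port: Optional[int] = None
    dst_ip: Optional[str] = None
    dst_port: Optional[int] = None
    protocol: Optional[int] = None

    def matches(self, key: FlowKey) -> bool:
        # Fields are compared position by position; None is a wildcard.
        return all(want is None or want == got for want, got in zip(self, key))


def filter_packets(
    records: Iterable[Tuple[FlowKey, bytes]], flt: FlowFilter
) -> Iterator[bytes]:
    """Map-style pass: emit only the packets whose flow key matches the filter."""
    for key, packet in records:
        if flt.matches(key):
            yield packet


records = [
    (FlowKey("10.0.0.1", 51000, "10.0.0.2", 443, 6), b"pkt1"),
    (FlowKey("10.0.0.3", 51001, "10.0.0.2", 80, 6), b"pkt2"),
    (FlowKey("10.0.0.1", 51002, "10.0.0.9", 443, 17), b"pkt3"),
]
https_packets = list(filter_packets(records, FlowFilter(dst_port=443)))
from_host = list(filter_packets(records, FlowFilter(src_ip="10.0.0.1", protocol=6)))
```

Because the predicate is evaluated per record rather than encoded in a storage key, partially specified queries such as "any traffic to port 443" become possible, at the cost of scanning the selected sequence files.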
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)