Posted to user@metron.apache.org by Matt Foley <mf...@hortonworks.com> on 2016/10/07 22:49:23 UTC

[CALL FOR TEST DATA] Request help identifying public domain or opensource test data sets for Metron testing

Hi all,

Enhanced testing of Metron, especially performance testing, would be aided by data sets of realistic size that exercise one or more of the various parts of Metron:

  *   each Parser (bro, yaf, snort, squid, ...)
  *   each Enhancer (geo, user, assets, ...)
  *   each Threat Intel module (Soltra, Hail a TAXII, ...)

Data sets must meet the following criteria:

  *   opensource or public domain
  *   suitably scrubbed, containing no Personally Identifiable Information
  *   unencumbered by company sensitivity, security, or IP concerns.

They may take the form of raw PCAP streams, or they may be already parsed or otherwise pre-processed.

If you know of opensource or public domain data sets of this kind, please respond with the URL, in this email thread or to the Jira ticket METRON-491<https://issues.apache.org/jira/browse/METRON-491>.

If you have an appropriate data set that your company would be willing to contribute, please also respond and we will help in any way we can.


Thanks,

--Matt

Re: [CALL FOR TEST DATA] Request help identifying public domain or opensource test data sets for Metron testing

Posted by Dima Kovalyov <Di...@sstech.us>.
Hello Matt,

We (the Sstech team) currently have parsers and data generators for BlueCoat, Unix, MS Exchange, and MS Windows, and we would gladly contribute them.

Can you please share the procedure for submitting these pieces?
Thank you.

- Dima
