Posted to issues@metron.apache.org by "Jon Zeolla (JIRA)" <ji...@apache.org> on 2017/11/07 12:04:00 UTC

[jira] [Assigned] (METRON-1052) Add forensic similarity hash functions to Stellar

     [ https://issues.apache.org/jira/browse/METRON-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jon Zeolla reassigned METRON-1052:
----------------------------------

    Assignee: Casey Stella

> Add forensic similarity hash functions to Stellar
> -------------------------------------------------
>
>                 Key: METRON-1052
>                 URL: https://issues.apache.org/jira/browse/METRON-1052
>             Project: Metron
>          Issue Type: Improvement
>            Reporter: Jon Zeolla
>            Assignee: Casey Stella
>
> This is a follow-on to METRON-539.  Currently we have Stellar functions to perform cryptographic hashing operations.  It would be useful to expand this to support forensic similarity hash functions so we could compare the similarity of inputs.  I can see two main components of this, and one secondary/lower priority thought:
> (1) Support for locality-sensitive hashing (LSH) and/or context-triggered piecewise hashing (CTPH) functions (aka forensic similarity hash functions) such as sdhash or spamsum/ssdeep.  I quickly found some Java code examples[1][2] with compatible licenses, in case that is appealing.
> (2) An approximate string matching function to establish a similarity rating between n hashes.  ssdeep currently has this via its -x and -k options, and there are some other thoughts[3] on how best to do this, but I'm aware there are numerous ways we may want to consider comparing strings for similarity (Damerau-Levenshtein distance, longest common subsequence, etc.).
> (3) Similar to (2), I could see some applicability as a streaming enrichment, but as a native feature this would be a much lower priority and potentially a separate PR.
> 1:  https://github.com/pcbje/autopsy-ahbm/blob/master/src/com/pcbje/ahbm/Sdhash.java
> 2:  https://github.com/tdebatty/java-spamsum
> 3:  https://www.virusbulletin.com/virusbulletin/2015/11/optimizing-ssdeep-use-scale
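
For illustration only, item (2) could be approached along the lines of the sketch below.  It is written in Python rather than as a Stellar function, the names osa_distance and similarity are hypothetical, and it uses the optimal string alignment variant of Damerau-Levenshtein distance normalized by the longer input.  This is not how ssdeep scores matches; it only shows one way to turn an edit distance between two hash strings into a 0.0-1.0 similarity rating.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein).

    Counts insertions, deletions, substitutions, and transpositions of
    adjacent characters, where no substring is edited more than once.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters.
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[len(a)][len(b)]


def similarity(a: str, b: str) -> float:
    """Hypothetical rating: 1.0 for identical strings, 0.0 for maximally distant."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - osa_distance(a, b) / longest
```

A rating across n hashes could then be built by calling similarity on each pair and keeping the pairs above a chosen threshold; the threshold and the choice of distance metric are open design questions for the ticket.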



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)