Posted to issues@metron.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/05/02 22:13:00 UTC

[jira] [Commented] (METRON-1533) Create KAFKA_FIND Stellar Function

    [ https://issues.apache.org/jira/browse/METRON-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461672#comment-16461672 ] 

ASF GitHub Bot commented on METRON-1533:
----------------------------------------

Github user merrimanr commented on the issue:

    https://github.com/apache/metron/pull/1000
  
    I tested this in full dev and the results were somewhat inconsistent.  I listened on the enrichments topic with the kafka-console-consumer tool in one window:
    ```
    /usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh -z node1:2181 --topic enrichments
    ```
    While repeatedly running this command in another:
    ```
    KAFKA_FIND('enrichments', m -> MAP_GET('source.type', m) == 'snort')
    ```
    About 25-50% of the time the Stellar shell returned `[]`; the rest of the time it returned a snort message as expected.
    
    How long does this command listen before it times out, or is the limit based on the number of messages read?  Sometimes it returned an empty array immediately.  Is this configurable?
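
To make the two possibilities in that question concrete, here is a minimal Python sketch of a find that can stop on a time limit or on a message-count limit. This is not the KAFKA_FIND implementation, which this comment does not describe. `find_messages`, `timeout_s`, and `max_messages` are hypothetical names, and an in-memory list stands in for the Kafka topic.

```python
import time
from typing import Any, Callable, Dict, Iterable, List, Optional

Message = Dict[str, Any]


def find_messages(
    stream: Iterable[Message],
    predicate: Callable[[Message], bool],
    count: int = 1,
    timeout_s: Optional[float] = None,     # hypothetical time limit
    max_messages: Optional[int] = None,    # hypothetical read limit
) -> List[Message]:
    """Return up to `count` messages that satisfy `predicate`.

    Stops early when the (assumed) time limit or read limit is reached,
    so an empty list is a valid result.
    """
    deadline = None if timeout_s is None else time.monotonic() + timeout_s
    found: List[Message] = []
    seen = 0
    for message in stream:
        # Stopping rule 1: give up once the time limit has passed.
        if deadline is not None and time.monotonic() >= deadline:
            break
        seen += 1
        if predicate(message):
            found.append(message)
            if len(found) >= count:
                break
        # Stopping rule 2: give up after reading a fixed number of messages.
        if max_messages is not None and seen >= max_messages:
            break
    return found


# Stand-in for the 'enrichments' topic: two bro messages, then one snort.
stream = [
    {"source.type": "bro"},
    {"source.type": "bro"},
    {"source.type": "snort"},
]
is_snort = lambda m: m.get("source.type") == "snort"

print(find_messages(stream, is_snort))                  # [{'source.type': 'snort'}]
print(find_messages(stream, is_snort, max_messages=2))  # []
```

Under either stopping rule, a call can legitimately return `[]` when no snort message shows up inside the window, which would be consistent with the intermittent empty results described above.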


> Create KAFKA_FIND Stellar Function
> ----------------------------------
>
>                 Key: METRON-1533
>                 URL: https://issues.apache.org/jira/browse/METRON-1533
>             Project: Metron
>          Issue Type: Improvement
>            Reporter: Nick Allen
>            Assignee: Nick Allen
>            Priority: Minor
>
> When creating enrichments, I often find that I want to validate that the enrichment I just created was successful on the live, incoming stream of telemetry. My workflow looks something like this.
> 1. Create and test the enrichment.
> {code:java}
> [Stellar]>>> ip_src_addr := "72.34.49.86"
> 72.34.49.86
> [Stellar]>>> geo := GEO_GET(ip_src_addr)
> {country=US, dmaCode=803, city=Los Angeles, postalCode=90014, latitude=34.0438, location_point=34.0438,-118.2512, locID=5368361, longitude=-118.2512}
> {code}
> 2. That looks good to me. Now let's add that to my Bro telemetry.
> {code:java}
> [Stellar]>>> conf := SHELL_EDIT(conf)
> {
>   "enrichment" : {
>     "fieldMap": {
>       "stellar": {
>         "config": [
>            "geo := GEO_GET(ip_src_addr)"
>         ]
>       }
>     }
>   },
>   "threatIntel": {
>   }
> }
> [Stellar]>>> CONFIG_PUT("ENRICHMENTS", conf, "bro")
> {code}
>  
> 3. It looks like that worked, but did it really work?
> At this point, I would run KAFKA_GET as many times as it takes to retrieve a Bro message. You have to get lucky twice: first that the enrichment worked, and second that you pull down a Bro message (as opposed to one from a different sensor).
>  
> I would rather have a function that lets me pull back only the messages that I care about. In this case, I could retrieve only Bro messages.
> {code:java}
> KAFKA_FIND('indexing', m -> MAP_GET('source.type', m) == 'bro')
> {code}
> Or I could look for messages that contain geolocation data.
> {code:java}
> KAFKA_FIND('indexing', m -> MAP_EXISTS('geo.city', m))
> {code}


