Posted to commits@metron.apache.org by mm...@apache.org on 2019/11/04 19:44:01 UTC
[metron] branch master updated: METRON-2293 Fix some inaccuracies in the MaaS README (mmiklavc) closes apache/metron#1536
This is an automated email from the ASF dual-hosted git repository.
mmiklavcic pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/metron.git
The following commit(s) were added to refs/heads/master by this push:
new 6c064b7 METRON-2293 Fix some inaccuracies in the MaaS README (mmiklavc) closes apache/metron#1536
6c064b7 is described below
commit 6c064b76e2252776cde64280b9ee15091339cf5a
Author: mmiklavc <mi...@gmail.com>
AuthorDate: Mon Nov 4 12:43:35 2019 -0700
METRON-2293 Fix some inaccuracies in the MaaS README (mmiklavc) closes apache/metron#1536
---
metron-analytics/metron-maas-service/README.md | 209 +++++++++++++++++++++++--
1 file changed, 199 insertions(+), 10 deletions(-)
diff --git a/metron-analytics/metron-maas-service/README.md b/metron-analytics/metron-maas-service/README.md
index bd1d76c..93c798a 100644
--- a/metron-analytics/metron-maas-service/README.md
+++ b/metron-analytics/metron-maas-service/README.md
@@ -135,10 +135,9 @@ Let's augment the `squid` proxy sensor to use a model that will determine if the
## Install Prerequisites and Mock DGA Service
Now let's install some prerequisites:
-* Flask via `yum install python-flask`
-* Jinja2 via `yum install python-jinja2`
-* Squid client via `yum install squid`
-* ES Head plugin via `/usr/share/elasticsearch/bin/plugin install mobz/elasticsearch-head`
+* Flask via `yum -y install python-flask`
+* Jinja2 via `yum -y install python-jinja2`
+* Squid client via `yum -y install squid`
Start Squid via `service squid start`
@@ -154,13 +153,13 @@ The following presumes that you are a logged in as a user who has a
home directory in HDFS under `/user/$USER`. If you do not, please create one
and ensure the permissions are set appropriate:
```
-su - hdfs -c "hadoop fs -mkdir /user/$USER"
-su - hdfs -c "hadoop fs -chown $USER:$USER /user/$USER"
+su - hdfs -c "hdfs dfs -mkdir /user/$USER"
+su - hdfs -c "hdfs dfs -chown $USER:$USER /user/$USER"
```
-Or, in the common case for the `metron` user:
+Or, in the common case for the `metron` user (if the user does not already exist):
```
-su - hdfs -c "hadoop fs -mkdir /user/metron"
-su - hdfs -c "hadoop fs -chown metron:metron /user/metron"
+su - hdfs -c "hdfs dfs -mkdir /user/metron"
+su - hdfs -c "hdfs dfs -chown metron:metron /user/metron"
```
Now let's start MaaS and deploy the Mock DGA Service:
@@ -173,6 +172,10 @@ Now let's start MaaS and deploy the Mock DGA Service:
## Adjust Configurations for Squid to Call Model
Now that we have a deployed model, let's adjust the configurations for the Squid topology to annotate the messages with the output of the model.
+* First pull down the latest configuration from Zookeeper
+```
+$METRON_HOME/bin/zk_load_configs.sh -m PULL -o ${METRON_HOME}/config/zookeeper -z $ZOOKEEPER -f
+```
* Edit the squid parser configuration at `$METRON_HOME/config/zookeeper/parsers/squid.json` in your favorite text editor and add a new FieldTransformation to indicate a threat alert based on the model (note the addition of `is_malicious` and `is_alert`):
```
{
@@ -217,8 +220,185 @@ Now that we have a deployed model, let's adjust the configurations for the Squid
}
}
```
+* Setup an indexing configuration here `${METRON_HOME}/config/zookeeper/indexing/squid.json` with the following contents:
+```
+{
+ "hdfs" : {
+ "index": "squid",
+ "batchSize": 5,
+ "enabled" : true
+ },
+ "elasticsearch" : {
+ "index": "squid",
+ "batchSize": 5,
+ "enabled" : true
+ },
+ "solr" : {
+ "index": "squid",
+ "batchSize": 5,
+ "enabled" : true
+ }
+}
+```
* Upload new configs via `$METRON_HOME/bin/zk_load_configs.sh --mode PUSH -i $METRON_HOME/config/zookeeper -z node1:2181`
* Make the Squid topic in kafka via `/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper node1:2181 --create --topic squid --partitions 1 --replication-factor 1`
+* Setup your squid indexing template for Elasticsearch (if using Elasticsearch)
+```
+curl -XPUT 'http://node1:9200/_template/squid_index' -d '
+{
+ "template": "squid_index*",
+ "mappings": {
+ "squid_doc": {
+ "dynamic_templates": [
+ {
+ "geo_location_point": {
+ "match": "enrichments:geo:*:location_point",
+ "match_mapping_type": "*",
+ "mapping": {
+ "type": "geo_point"
+ }
+ }
+ },
+ {
+ "geo_country": {
+ "match": "enrichments:geo:*:country",
+ "match_mapping_type": "*",
+ "mapping": {
+ "type": "keyword"
+ }
+ }
+ },
+ {
+ "geo_city": {
+ "match": "enrichments:geo:*:city",
+ "match_mapping_type": "*",
+ "mapping": {
+ "type": "keyword"
+ }
+ }
+ },
+ {
+ "geo_location_id": {
+ "match": "enrichments:geo:*:locID",
+ "match_mapping_type": "*",
+ "mapping": {
+ "type": "keyword"
+ }
+ }
+ },
+ {
+ "geo_dma_code": {
+ "match": "enrichments:geo:*:dmaCode",
+ "match_mapping_type": "*",
+ "mapping": {
+ "type": "keyword"
+ }
+ }
+ },
+ {
+ "geo_postal_code": {
+ "match": "enrichments:geo:*:postalCode",
+ "match_mapping_type": "*",
+ "mapping": {
+ "type": "keyword"
+ }
+ }
+ },
+ {
+ "geo_latitude": {
+ "match": "enrichments:geo:*:latitude",
+ "match_mapping_type": "*",
+ "mapping": {
+ "type": "float"
+ }
+ }
+ },
+ {
+ "geo_longitude": {
+ "match": "enrichments:geo:*:longitude",
+ "match_mapping_type": "*",
+ "mapping": {
+ "type": "float"
+ }
+ }
+ },
+ {
+ "timestamps": {
+ "match": "*:ts",
+ "match_mapping_type": "*",
+ "mapping": {
+ "type": "date",
+ "format": "epoch_millis"
+ }
+ }
+ },
+ {
+ "threat_triage_score": {
+ "mapping": {
+ "type": "float"
+ },
+ "match": "threat:triage:*score",
+ "match_mapping_type": "*"
+ }
+ },
+ {
+ "threat_triage_reason": {
+ "mapping": {
+ "type": "text",
+ "fielddata": "true"
+ },
+ "match": "threat:triage:rules:*:reason",
+ "match_mapping_type": "*"
+ }
+ },
+ {
+ "threat_triage_name": {
+ "mapping": {
+ "type": "text",
+ "fielddata": "true"
+ },
+ "match": "threat:triage:rules:*:name",
+ "match_mapping_type": "*"
+ }
+ }
+ ],
+ "properties": {
+ "timestamp": {
+ "type": "date",
+ "format": "epoch_millis"
+ },
+ "source:type": {
+ "type": "keyword"
+ },
+ "ip_dst_addr": {
+ "type": "ip"
+ },
+ "ip_dst_port": {
+ "type": "integer"
+ },
+ "ip_src_addr": {
+ "type": "ip"
+ },
+ "ip_src_port": {
+ "type": "integer"
+ },
+ "alert": {
+ "type": "nested"
+ },
+ "metron_alert" : {
+ "type" : "nested"
+ },
+ "guid": {
+ "type": "keyword"
+ }
+ }
+ }
+ }
+}
+'
+# Verify the template installs as expected
+curl -XGET 'http://node1:9200/_template/squid_index?pretty'
+```
## Start Topologies and Send Data
Now we need to start the topologies and send some data:
@@ -226,7 +406,16 @@ Now we need to start the topologies and send some data:
* Generate some data via the squid client:
* Generate a legit example: `squidclient http://yahoo.com`
* Generate a malicious example: `squidclient http://cnn.com`
+* (Optional) In another terminal, tail the "enrichments" Kafka topic. You should be able to see the new data output there after the next couple steps.
+```
+${HDP_HOME}/kafka-broker/bin/kafka-console-consumer.sh --bootstrap-server $BROKERLIST --topic enrichments
+```
* Send the data to kafka via `cat /var/log/squid/access.log | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic squid`
-* Browse the data in elasticsearch via the ES Head plugin @ [http://node1:9200/_plugin/head/](http://node1:9200/_plugin/head/) and verify that in the squid index you have two documents
+* If you setup the optional Kafka consumer in another terminal, you should see a couple records output as follows. Notice that one has `"is_malicious":"legit"` while the other has `"is_malicious":"malicious"`:
+```
+{"is_malicious":"legit","full_hostname":"yahoo.com","code":301,"method":"GET","url":"http:\/\/yahoo.com\/","source.type":"squid","elapsed":192,"ip_dst_addr":"72.30.35.10","original_string":"1571163620.277 192 127.0.0.1 TCP_MISS\/301 366 GET http:\/\/yahoo.com\/ - DIRECT\/72.30.35.10 text\/html","bytes":366,"domain_without_subdomains":"yahoo.com","action":"TCP_MISS","guid":"9d19f502-0770-4ca1-9eeb-d0bcbb0942c7","ip_src_addr":"127.0.0.1","timestamp":1571163620277}
+{"is_malicious":"malicious","full_hostname":"cnn.com","code":301,"method":"GET","is_alert":"true","url":"http:\/\/cnn.com\/","source.type":"squid","elapsed":106,"ip_dst_addr":"151.101.1.67","original_string":"1571163632.536 106 127.0.0.1 TCP_MISS\/301 539 GET http:\/\/cnn.com\/ - DIRECT\/151.101.1.67 -","bytes":539,"domain_without_subdomains":"cnn.com","action":"TCP_MISS","guid":"69254e10-bed9-4743-ba8b-d3c01c29430d","ip_src_addr":"127.0.0.1","timestamp":1571163632536}
+```
+* Browse the data in the Alerts UI @ [http://node1:4201/alerts-list](http://node1:4201/alerts-list) and verify that in the squid index you have two documents.
* One from `yahoo.com` which does not have `is_alert` set and does have `is_malicious` set to `legit`
* One from `cnn.com` which does have `is_alert` set to `true`, `is_malicious` set to `malicious` and `threat:triage:level` set to 100
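
The expected outcome described in the last two bullets can also be checked against the raw consumer output. The following is a minimal Python sketch, not part of the commit or the README: it parses JSON lines like the two sample records shown above and reports the model verdict for each. The record strings are abbreviated copies of that output (most fields omitted), and the helper name `summarize` is hypothetical.

```python
import json

# Abbreviated copies of the two sample records from the "enrichments" topic
# output shown above; most fields are omitted for brevity.
SAMPLE_RECORDS = [
    '{"is_malicious":"legit","full_hostname":"yahoo.com","source.type":"squid"}',
    '{"is_malicious":"malicious","full_hostname":"cnn.com","is_alert":"true","source.type":"squid"}',
]


def summarize(line):
    """Return (hostname, verdict, is_alert) for one JSON record.

    `is_alert` is absent on non-alerting records and the string "true"
    on alerting ones, so a missing key is treated as False.
    """
    record = json.loads(line)
    return (
        record.get("full_hostname"),
        record.get("is_malicious"),
        record.get("is_alert") == "true",
    )


if __name__ == "__main__":
    for line in SAMPLE_RECORDS:
        host, verdict, is_alert = summarize(line)
        print(f"{host}: is_malicious={verdict} is_alert={is_alert}")
```

In practice the same function could be applied to each line captured from the console consumer instead of the hard-coded samples.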