Posted to dev@metron.apache.org by JonZeolla <gi...@git.apache.org> on 2017/04/01 18:12:38 UTC

[GitHub] incubator-metron pull request #503: METRON-815 sensor-stubs sometimes send m...

GitHub user JonZeolla opened a pull request:

    https://github.com/apache/incubator-metron/pull/503

    METRON-815 sensor-stubs sometimes send malformed bro timestamps

    ## Contributor Comments
    The bro sensor-stub sends a malformed timestamp when the input timestamp it transforms does not have exactly 6 decimal places.  For instance:
    
    ```
    [vagrant@node1 bin]$ SEARCH="\"ts\"\:[0-9]\+.[0-9]\{6\}"
    [vagrant@node1 bin]$ REPLACE="\"ts\"\:`date +%s`.000000"
    [vagrant@node1 bin]$ cat /opt/sensor-stubs/data/bro.out | sed -e "s/$SEARCH/$REPLACE/g"
    ...
    {"dns": {"ts":1491064638.000000.38621,"uid":"CQ5vBa2GcEToa4NKt5","id.orig_h":"192.168.66.1","id.orig_p":5353,"id.resp_h":"224.0.0.251","id.resp_p":5353,"proto":"udp","trans_id":0,"query":"_googlecast._tcp.local","qclass":1,"qclass_name":"C_INTERNET","qtype":12,"qtype_name":"PTR","AA":false,"TC":false,"RD":false,"RA":false,"Z":0,"rejected":false}}
    ```
    
    You can reproduce this in your environment by spinning up quick-dev or full-dev and by monitoring the bro kafka topic via `/usr/hdp/2.5.3.0-37/kafka/bin/kafka-console-consumer.sh --zookeeper node1:2181 --topic bro`.  You can also investigate errors in the [storm topology](http://node1:8744/) "bro" under parserBolt.
    
    This PR changes SEARCH and REPLACE so that only the integer-seconds portion of the timestamp is replaced with the current epoch second; the decimal places are left untouched, whatever their length.
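    
    A minimal sketch of the before/after behavior, assuming GNU sed (the `\+` operator is a GNU extension). The sample record and the fixed `NOW` value are hypothetical stand-ins so the output is repeatable; the SEARCH/REPLACE strings are the ones from this PR. With only five decimals, the unescaped `.` in the old pattern matches a digit, so the match stops inside the integer seconds and the original fraction is left dangling:
    
    ```shell
    # Hypothetical sample record with a 5-decimal timestamp; only "ts" matters here.
    LINE='{"dns": {"ts":1216706983.38621,"uid":"CQ5vBa2GcEToa4NKt5"}}'
    NOW=1491064638   # fixed stand-in for `date +%s` (assumption, for repeatability)
    
    # Old pattern: "." is unescaped (any character) and exactly six digits must follow.
    OLD_SEARCH="\"ts\"\:[0-9]\+.[0-9]\{6\}"
    OLD_REPLACE="\"ts\"\:${NOW}.000000"
    OLD_OUT=$(printf '%s\n' "$LINE" | sed -e "s/$OLD_SEARCH/$OLD_REPLACE/g")
    
    # New pattern: match only the integer seconds and the literal dot.
    NEW_SEARCH="\"ts\"\:[0-9]\+\."
    NEW_REPLACE="\"ts\"\:${NOW}\."
    NEW_OUT=$(printf '%s\n' "$LINE" | sed -e "s/$NEW_SEARCH/$NEW_REPLACE/g")
    
    echo "old: $OLD_OUT"
    echo "new: $NEW_OUT"
    ```
    
    The old pattern should yield `"ts":1491064638.000000.38621` (two decimal points, so the record is no longer valid JSON), while the new one yields `"ts":1491064638.38621`.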
    
    ## Pull Request Checklist
    
    Thank you for submitting a contribution to Apache Metron (Incubating).  
    Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions.  
    Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides.  
    
    
    In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:
    
    ### For all changes:
    - [X] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). 
    - [X] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
    - [X] Has your PR been rebased against the latest commit within the target branch (typically master)?
    
    
    ### For code changes:
    - [X] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
    - [X] Have you included steps or a guide to how the change may be verified and tested manually?
    - [X] Have you ensured that the full suite of tests and checks have been executed in the root incubating-metron folder via:
      ```
      mvn -q clean integration-test install && build_utils/verify_licenses.sh 
      ```
    
    - [N/A] Have you written or updated unit tests and or integration tests to verify your changes?
    - [N/A] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? 
    - [X] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?
    
    #### Note:
    Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
    It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JonZeolla/incubator-metron METRON-815

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-metron/pull/503.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #503
    
----
commit e034f569b88f1bb4751660750bc8d31ac0a9dc34
Author: Jon Zeolla <ze...@gmail.com>
Date:   2017-04-01T17:05:59Z

    METRON-815 sensor-stubs sometimes send malformed bro timestamps

commit 6a36dc51ac7d7dac8b5962ad9e8010719c3a91eb
Author: Jon Zeolla <ze...@gmail.com>
Date:   2017-04-01T17:58:51Z

    Even better, keep the variable length timestamp

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-metron issue #503: METRON-815 sensor-stubs sometimes send malforme...

Posted by anandsubbu <gi...@git.apache.org>.
Github user anandsubbu commented on the issue:

    https://github.com/apache/incubator-metron/pull/503
  
    +1 (non-binding)
    
    Verified with the modified SEARCH/REPLACE pattern. No more parse exceptions seen. Thank you, @JonZeolla !



Re: [GitHub] incubator-metron pull request #503: METRON-815 sensor-stubs sometimes send m...

Posted by "Zeolla@GMail.com" <ze...@gmail.com>.
Bro timestamps are often out of order depending on the log because some
lines are written when the connection ends and others are written when an
event within a connection occurs.  As such, timestamps can be confusing to
look at initially, but it is very normal for them not to be in order.
Also, we are already breaking any sort ordering by randomly selecting logs
from bro.out and replacing the timestamps with the current timestamp, so
I'm not concerned with my changes causing any more of a headache than
flattening the decimal places with 0s.

Jon

On Sun, Apr 2, 2017, 11:50 PM mattf-horton <gi...@git.apache.org> wrote:

> Github user mattf-horton commented on a diff in the pull request:
>
>
> https://github.com/apache/incubator-metron/pull/503#discussion_r109336484
>
>     --- Diff:
> metron-deployment/roles/sensor-stubs/templates/start-bro-stub ---
>     @@ -47,8 +47,8 @@ TOPIC="bro"
>      while true; do
>
>        # transform the bro timestamp and push to kafka
>     -  SEARCH="\"ts\"\:[0-9]\+.[0-9]\{6\}"
>     -  REPLACE="\"ts\"\:`date +%s`.000000"
>     +  SEARCH="\"ts\"\:[0-9]\+\."
>     +  REPLACE="\"ts\"\:`date +%s`\."
>     --- End diff --
>
>     @JonZeolla , good catch.  Leaving the fractional portion of the
> timestamp the same as it is, is appealing.  However, since the granularity
> of `date +%s` is only seconds, and we might transform a bunch of timestamps
> in one second of wallclock realtime, this may result in apparently
> out-of-order timestamps, no?  Eg, if we start with data whose first three
> records have timestamps:
>     1491190032.222222 1491190032.777777 1491190033.111111
>     The transformed data will have timestamps
>     1491190442.222222 1491190442.777777 1491190442.111111
>     with later ones being (at least potentially) out of order.  The
> original code would have generated
>     1491190442.000000 1491190442.000000 1491190442.000000
>     which is rather monotone, but at least not out of order.
>
>     Is this okay, or potentially bad?
>     Perhaps it would be better to just change the `.[0-9]\{6\}` to
> `\.[0-9]\+` in line 50, and leave line 51 unchanged?
>     (I'm asking, I don't know.  Maybe bro data can naturally be out of
> order?)
>
>
>
>
>
-- 

Jon

[GitHub] incubator-metron pull request #503: METRON-815 sensor-stubs sometimes send m...

Posted by mattf-horton <gi...@git.apache.org>.
Github user mattf-horton commented on a diff in the pull request:

    https://github.com/apache/incubator-metron/pull/503#discussion_r109336484
  
    --- Diff: metron-deployment/roles/sensor-stubs/templates/start-bro-stub ---
    @@ -47,8 +47,8 @@ TOPIC="bro"
     while true; do
       
       # transform the bro timestamp and push to kafka
    -  SEARCH="\"ts\"\:[0-9]\+.[0-9]\{6\}"
    -  REPLACE="\"ts\"\:`date +%s`.000000"
    +  SEARCH="\"ts\"\:[0-9]\+\."
    +  REPLACE="\"ts\"\:`date +%s`\."
    --- End diff --
    
    @JonZeolla , good catch.  Leaving the fractional portion of the timestamp as it is, is appealing.  However, since the granularity of `date +%s` is only seconds, and we might transform a bunch of timestamps within one second of wall-clock time, this may result in apparently out-of-order timestamps, no?  E.g., if we start with data whose first three records have timestamps:
    1491190032.222222 1491190032.777777 1491190033.111111
    The transformed data will have timestamps
    1491190442.222222 1491190442.777777 1491190442.111111
    with later ones being (at least potentially) out of order.  The original code would have generated
    1491190442.000000 1491190442.000000 1491190442.000000
    which is rather monotone, but at least not out of order.
    
    Is this okay, or potentially bad?
    Perhaps it would be better to just change the `.[0-9]\{6\}` to `\.[0-9]\+` in line 50, and leave line 51 unchanged?
    (I'm asking, I don't know.  Maybe bro data can naturally be out of order?)
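    
    To make the two options concrete, here is a rough sketch that runs both over the three example timestamps above, assuming GNU sed. The `records` helper, the one-field JSON records, and the fixed `NOW` value are hypothetical stand-ins, not part of the stub script:
    
    ```shell
    NOW=1491190442   # fixed stand-in for `date +%s`, matching the example above
    
    # Hypothetical helper: emit one minimal record per example timestamp.
    records() {
      printf '{"ts":%s}\n' 1491190032.222222 1491190032.777777 1491190033.111111
    }
    
    # Option in the diff: replace only the integer seconds, keep the fraction.
    KEEP_SEARCH="\"ts\"\:[0-9]\+\."
    KEEP_REPLACE="\"ts\"\:${NOW}\."
    KEEP_OUT=$(records | sed -e "s/$KEEP_SEARCH/$KEEP_REPLACE/g")
    
    # Alternative: match a variable-length fraction and flatten it to zeros.
    FLAT_SEARCH="\"ts\"\:[0-9]\+\.[0-9]\+"
    FLAT_REPLACE="\"ts\"\:${NOW}.000000"
    FLAT_OUT=$(records | sed -e "s/$FLAT_SEARCH/$FLAT_REPLACE/g")
    
    echo "keep fraction:"; echo "$KEEP_OUT"
    echo "flatten fraction:"; echo "$FLAT_OUT"
    ```
    
    The first option should give fractions `.222222`, `.777777`, `.111111` on the same second (the third sorts before the first two); the second gives `.000000` three times.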
    




[GitHub] incubator-metron pull request #503: METRON-815 sensor-stubs sometimes send m...

Posted by JonZeolla <gi...@git.apache.org>.
Github user JonZeolla commented on a diff in the pull request:

    https://github.com/apache/incubator-metron/pull/503#discussion_r109411418
  
    --- Diff: metron-deployment/roles/sensor-stubs/templates/start-bro-stub ---
    @@ -47,8 +47,8 @@ TOPIC="bro"
     while true; do
       
       # transform the bro timestamp and push to kafka
    -  SEARCH="\"ts\"\:[0-9]\+.[0-9]\{6\}"
    -  REPLACE="\"ts\"\:`date +%s`.000000"
    +  SEARCH="\"ts\"\:[0-9]\+\."
    +  REPLACE="\"ts\"\:`date +%s`\."
    --- End diff --
    
    Bro timestamps are often out of order depending on the log because some lines are written when the connection ends and others are written when an event within a connection occurs. As such, timestamps can be confusing to look at initially, but it is very normal for them not to be in order. Also, we are already breaking any sort ordering by randomly selecting logs from bro.out and replacing the timestamps with the current timestamp, so I'm not concerned with my changes causing any more of a headache than flattening the decimal places with 0s.



[GitHub] incubator-metron pull request #503: METRON-815 sensor-stubs sometimes send m...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-metron/pull/503



[GitHub] incubator-metron pull request #503: METRON-815 sensor-stubs sometimes send m...

Posted by mattf-horton <gi...@git.apache.org>.
Github user mattf-horton commented on a diff in the pull request:

    https://github.com/apache/incubator-metron/pull/503#discussion_r109444746
  
    --- Diff: metron-deployment/roles/sensor-stubs/templates/start-bro-stub ---
    @@ -47,8 +47,8 @@ TOPIC="bro"
     while true; do
       
       # transform the bro timestamp and push to kafka
    -  SEARCH="\"ts\"\:[0-9]\+.[0-9]\{6\}"
    -  REPLACE="\"ts\"\:`date +%s`.000000"
    +  SEARCH="\"ts\"\:[0-9]\+\."
    +  REPLACE="\"ts\"\:`date +%s`\."
    --- End diff --
    
    Okay, +1 then.  Thanks for the information.

