Posted to issues@metron.apache.org by "Mohan (JIRA)" <ji...@apache.org> on 2017/08/25 11:58:00 UTC

[jira] [Created] (METRON-1133) Entity value for a profiled data written wrongly to Hbase

Mohan created METRON-1133:
-----------------------------

             Summary: Entity value for a profiled data written wrongly to Hbase 
                 Key: METRON-1133
                 URL: https://issues.apache.org/jira/browse/METRON-1133
             Project: Metron
          Issue Type: Bug
    Affects Versions: 0.4.0
            Reporter: Mohan


I have created a profile using the Profiler's "Group By" functionality. It operates over any incoming telemetry that has an `ip_src_addr` and a `timestamp` field.
It produces a profile that segments the data by day of week, using a 'groupBy' expression to extract the day of week from the telemetry's `timestamp` field.

My Kafka messages are:


{code:java}
{     "ip_src_addr": "10.0.0.1",     "protocol": "HTTPS",     "length": "10",     "bytes_in": 234,     "timestamp": "1503657089000"  }
{     "ip_src_addr": "10.0.0.2",     "protocol": "HTTP",     "length": "20",     "bytes_in": 390,     "timestamp": "1503657089000"   }
{     "ip_src_addr": "10.0.0.3",     "protocol": "DNS",     "length": "30",     "bytes_in": 560,     "timestamp": "1503657089000"    }
{code}


My profile config looks like this:


{code:java}
{
"profiles": [
 {
    "profile": "calender-effects",
    "onlyif": "exists(ip_src_addr) and exists(timestamp)",
    "foreach": "ip_src_addr",
    "init":{ "count": 0 },
    "update":{ "count": "count + 1" },
    "result": "count",
    "groupBy": ["DAY_OF_WEEK(start)"]
 }]
}
{code}


After pushing each of the above messages 8 times, I scanned the profiler table in HBase:

{code:java}
hbase(main):003:0> scan 'profiler'
ROW                                                                                 COLUMN+CELL
 \x00\x00\x03Pcalender-effects10.0.0.16\x00\x00\x00\x00\x00\xBF3.                   column=P:value, timestamp=1503657430993, value=\x02\x10
 \x00\x00\x03Pcalender-effects10.0.0.26\x00\x00\x00\x00\x00\xBF3.                   column=P:value, timestamp=1503657430993, value=\x02\x10
 \x00\x00\x03Pcalender-effects10.0.0.36\x00\x00\x00\x00\x00\xBF3.                   column=P:value, timestamp=1503657430993, value=\x02\x10
{code}


I see that an extra digit '6' is appended to each entity value.

When retrieving profile data using the Stellar shell, I wasn't able to retrieve data from the same day of week to account for any calendar effects.
The following example attempts to retrieve profile data over the past 10 days and returns an empty result:

{code:java}
[Stellar]>>> PROFILE_GET( "calender-effects", "10.0.0.1", PROFILE_FIXED(10, "DAYS",{'profiler.client.period.duration' : '2', 'profiler.client.period.duration.units' : 'MINUTES'}), [] )
[]
{code}


I was able to retrieve the data by changing the entity value to "10.0.0.16" instead of "10.0.0.1":

{code:java}
[Stellar]>>> PROFILE_GET( "calender-effects", "10.0.0.16", PROFILE_FIXED(10, "DAYS",{'profiler.client.period.duration' : '2', 'profiler.client.period.duration.units' : 'MINUTES'}), [] )
[8]
{code}
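
One possible reading of this result, sketched in Python: if the row key were built by joining the profile name, the entity, and the group values with no separator, then entity "10.0.0.1" with group "6" and entity "10.0.0.16" with no groups would produce the same key text. This is only an assumption based on the scan output above. `build_row_key` is a hypothetical helper for illustration, not Metron's actual row key code, and it ignores the salt and period bytes that also appear in the real keys.

```python
from typing import List


def build_row_key(profile: str, entity: str, groups: List[str]) -> str:
    """Hypothetical layout: fields joined with no delimiter between them."""
    return profile + entity + "".join(groups)


# Entity "10.0.0.1" written with the day-of-week group "6" ...
key_grouped = build_row_key("calender-effects", "10.0.0.1", ["6"])
# ... versus entity "10.0.0.16" looked up with an empty group list.
key_ungrouped = build_row_key("calender-effects", "10.0.0.16", [])

print(key_grouped)
print(key_grouped == key_ungrouped)
```

If that assumption holds, the two lookups are indistinguishable at the key level, which would match the query with "10.0.0.16" and an empty group list returning the data.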


Retrieving profile data over the past 10 days for Fridays only also fails:

{code:java}
[Stellar]>>> PROFILE_GET( "calender-effects", "10.0.0.16", PROFILE_FIXED(10, "DAYS",{'profiler.client.period.duration' : '2', 'profiler.client.period.duration.units' : 'MINUTES'}), [friday] )
[]
{code}

So where is this value '6' getting appended to the entity value? It looks to me like the group-by value, i.e. the day of the week, is getting appended to the entity value.
To confirm this, I changed the timestamp value in the messages to "timestamp": "1503583870672", and I saw that the value '5' got appended to the entity value instead.
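
As a cross-check of the two digits, here is a minimal Python sketch that converts both timestamps to a day-of-week number. It assumes the numbering follows the `java.util.Calendar` convention (Sunday = 1 through Saturday = 7) and that the conversion happens in UTC. Both are assumptions, and the function name is illustrative rather than part of Metron.

```python
from datetime import datetime, timezone


def day_of_week_sunday_first(epoch_millis: int) -> int:
    """Return the day of week in UTC, numbered Sunday = 1 ... Saturday = 7."""
    dt = datetime.fromtimestamp(epoch_millis / 1000, tz=timezone.utc)
    # isoweekday() numbers Monday = 1 ... Sunday = 7; shift so Sunday = 1.
    return dt.isoweekday() % 7 + 1


for ts in (1503657089000, 1503583870672):
    print(ts, day_of_week_sunday_first(ts))
```

Under these assumptions the first timestamp falls on a Friday (6) and the second on a Thursday (5), which matches the digits seen after the entity values.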





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)