Posted to user@metron.apache.org by shoggi <sh...@gmail.com> on 2017/03/04 22:38:33 UTC

Stellar is unable to query base data

Hey all

Very strange, I had a few profilers working and wanted to show someone
(left system alone for a few days) & now can't query data anymore. I went
so far to reboot the system, deleted the profiler table in hbase and loaded
new data.

I see the data in HBase but Stellar does not let me query it anymore. The
queries return empty as if the data does not exist, but it's definitely there.
The timeframe cannot be an issue; I tried a very wide Stellar query
and, as mentioned, loaded fresh data.

Any troubleshooting hints? This bugs me, as I have not touched the system &
even restarted it to get rid of any possible stale connections.

[Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"MINUTES")
[]

[Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"HOURS")
[]

The HBase data is there:

\xFF\xFF\xFFkurl-bytesgoogle.com\x00\x00\x00\x00\x0 column=P:value,
timestamp=1488664729500,
value=\x01\x00org.apache.metron.statistics.OnlineStatisticsProvide\xF2\x01\x00\x00\x00\x1C\x00\x00\x00\x01@b
\x
 1z\x96F
C0\x00\x00\x00\x00\x00\x00\x00\x00\x01@
\x82H\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x01@
\x82H\x00\x00\x00\x00\x00A\x14\xE3D\x0
                                                     0\x00\x00\x00@
\x19|\x87\xD0\xEA\xAA\xFB@\x82H\x00\x00\x00\x00\x00@
\x82H\x00\x00\x00\x00\x00@
\x82H\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x

 00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00

Thanks
shoggi

Re: Stellar is unable to query base data

Posted by Nick Allen <ni...@nickallen.org>.
The tick time (aka period duration) defines how frequently the profile is
flushed.  In other words, how frequently the computed values are stored.

If the Profiler client (aka PROFILE_GET) is looking for profiles that were
written with a 15 minute tick time, then it can only 'see' profiles that
were created with a 15 minute tick time.  If your profile was written with
a 1 minute tick time then PROFILE_GET will not return any results.

All that to say, you need to make sure that the client settings for tick
time agree with whatever tick time was defined when you created the profile.
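To make the mismatch concrete, here is a small illustrative sketch. This is not Metron's actual row-key code; the function names and key format are invented for illustration. The idea is that stored profile values are keyed by a period id derived from the period duration, so a client reading with a different duration computes different keys and finds nothing.

```python
# Illustrative sketch only: NOT Metron's actual row-key logic.
# Stored values are keyed (in part) by a period id derived from the
# period duration, so writer and reader must agree on that duration.

def period_id(timestamp_ms, duration_ms):
    """Map a timestamp to the profile period that contains it."""
    return timestamp_ms // duration_ms

def row_key(profile, entity, timestamp_ms, duration_ms):
    """Build a simplified, hypothetical row key for a profile measurement."""
    return f"{profile}:{entity}:{period_id(timestamp_ms, duration_ms)}"

ts = 1488664729500  # the timestamp from the HBase scan in this thread

# The profiler flushed with a 15 minute period duration...
write_key = row_key("url-bytes", "google.com", ts, 15 * 60 * 1000)

# ...but a client reading with a 1 minute duration computes a different key,
# so its lookup misses and PROFILE_GET returns [].
read_key = row_key("url-bytes", "google.com", ts, 1 * 60 * 1000)

print(write_key == read_key)  # False
```

The same idea explains why simply widening the query window does not help: every candidate key the client computes is derived from the wrong period duration.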




On Sun, Mar 5, 2017 at 5:48 PM, shoggi <sh...@gmail.com> wrote:

> The quorum and kafka config was ok, the host is actually called node1
> (same system). The variables were set like that because I wanted to see if
> I can set it to another value. Anyway, changed everything back and did
> another round of this:
>   - killed the topology
>   - created an empty profiler config
>   - restarted system
>   - added profiler configuration again (started with just one profile)
>   - data gets added to hbase, I get the error as shown previously, every
> couple of flush cycles
>   - still no luck querying hbase out from stellar or via the enrichment
> parser. No errors anywhere but the profiler NPE's
>
> you mentioned tick time.. is that something I can tune?
>
>
> 2017-03-05 23:25:06.583 o.a.m.p.b.ProfileBuilderBolt [INFO] Flushing
> profile: profile=url-length, entity=google.ch
> 2017-03-05 23:25:06.584 o.a.m.p.b.ProfileBuilderBolt [ERROR] Unexpected
> failure: message='null', tuple='source: __system:-1, stream: __tick, id:
> {}, [60]'
> java.lang.NullPointerException
> at org.apache.metron.profiler.stellar.DefaultStellarExecutor.execute(
> DefaultStellarExecutor.java:117) ~[stormjar.jar:?]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.executeResult(ProfileBuilderBolt.java:316)
> ~[stormjar.jar:?]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.lambda$flush$4(ProfileBuilderBolt.java:245)
> ~[stormjar.jar:?]
> at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:114)
> ~[?:1.8.0_77]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.flush(ProfileBuilderBolt.java:237)
> ~[stormjar.jar:?]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.
> doExecute(ProfileBuilderBolt.java:164) ~[stormjar.jar:?]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.
> execute(ProfileBuilderBolt.java:144) [stormjar.jar:?]
> at org.apache.storm.daemon.executor$fn__6571$tuple_action_fn__6573.invoke(executor.clj:734)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.daemon.executor$mk_task_receiver$fn__6492.invoke(executor.clj:469)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.disruptor$clojure_handler$reify__6005.onEvent(disruptor.clj:40)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.daemon.executor$fn__6571$fn__6584$fn__6637.invoke(executor.clj:853)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
> 2017-03-05 23:25:06.585 o.a.s.d.executor [ERROR]
> java.lang.NullPointerException
> at org.apache.metron.profiler.stellar.DefaultStellarExecutor.execute(
> DefaultStellarExecutor.java:117) ~[stormjar.jar:?]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.executeResult(ProfileBuilderBolt.java:316)
> ~[stormjar.jar:?]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.lambda$flush$4(ProfileBuilderBolt.java:245)
> ~[stormjar.jar:?]
> at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:114)
> ~[?:1.8.0_77]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.flush(ProfileBuilderBolt.java:237)
> ~[stormjar.jar:?]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.
> doExecute(ProfileBuilderBolt.java:164) ~[stormjar.jar:?]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.
> execute(ProfileBuilderBolt.java:144) [stormjar.jar:?]
> at org.apache.storm.daemon.executor$fn__6571$tuple_action_fn__6573.invoke(executor.clj:734)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.daemon.executor$mk_task_receiver$fn__6492.invoke(executor.clj:469)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.disruptor$clojure_handler$reify__6005.onEvent(disruptor.clj:40)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.daemon.executor$fn__6571$fn__6584$fn__6637.invoke(executor.clj:853)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
> 2017-03-05 23:25:08.628 o.a.s.k.ZkCoordinator [INFO] Task [1/1] Refreshing
> partition manager connections
>
> On Sun, Mar 5, 2017 at 6:24 PM, Casey Stella <ce...@gmail.com> wrote:
>
>> Ok, so a couple of things I see here that you might try:
>>
>>
>>    - You should set kafka.zk and kafka.broker in profiler.properties to
>>    your real zookeeper quorum and kafka broker respectively
>>
>> In your profiler.json, instead of:
>>  {
>>       "profile": "url-bytes",
>>       "foreach": "if exists(domain_without_subdomains) then
>> domain_without_subdomains else 'n/a'",
>>       "onlyif": "exists(domain_without_subdomains) && source.type ==
>> 'squid'",
>>       "update": { "n": "STATS_ADD(m, bytes)" },
>>       "result": "n"
>>     },
>> {
>>       "profile": "content-type",
>>       "foreach": "if exists(domain_content) then domain_content else
>> 'n/a'",
>>       "onlyif": "exists(domain_content) && source.type == 'squid'",
>>       "update": { "o": "STATS_ADD(m, bytes)" },
>>       "result": "o"
>>     }
>> You might want (note the change on the update statements)
>>  {
>>       "profile": "url-bytes",
>>       "foreach": "if exists(domain_without_subdomains) then
>> domain_without_subdomains else 'n/a'",
>>       "onlyif": "exists(domain_without_subdomains) && source.type ==
>> 'squid'",
>>       "update": { "n": "STATS_ADD(n, bytes)" },
>>       "result": "n"
>>     },
>> {
>>       "profile": "content-type",
>>       "foreach": "if exists(domain_content) then domain_content else
>> 'n/a'",
>>       "onlyif": "exists(domain_content) && source.type == 'squid'",
>>       "update": { "o": "STATS_ADD(o, bytes)" },
>>       "result": "o"
>>     }
>>
>> Try restarting the profiler topology and if you could look at the storm
>> logs and see if you see any issues show up in the logs for the profiler.
>>
>> On Sun, Mar 5, 2017 at 7:11 AM, shoggi <sh...@gmail.com> wrote:
>>
>>> Here is my config:
>>>
>>> # global config
>>> {
>>> "es.clustername": "metron",
>>> "es.ip": "172.16.16.2",
>>> "es.port": "9300",
>>> "es.date.format": "yyyy.MM.dd.HH"
>>> }
>>>
>>> # profiler config
>>> {
>>>   "profiles": [
>>>     {
>>>       "profile": "url-length",
>>>       "foreach": "if exists(domain_without_subdomains) then
>>> domain_without_subdomains else 'n/a'",
>>>       "onlyif": "exists(domain_without_subdomains) && source.type ==
>>> 'squid'",
>>>       "update": { "m": "STATS_ADD(m, LENGTH(url))" },
>>>       "result": "m"
>>>     },
>>>     {
>>>       "profile": "url-bytes",
>>>       "foreach": "if exists(domain_without_subdomains) then
>>> domain_without_subdomains else 'n/a'",
>>>       "onlyif": "exists(domain_without_subdomains) && source.type ==
>>> 'squid'",
>>>       "update": { "n": "STATS_ADD(m, bytes)" },
>>>       "result": "n"
>>>     },
>>>     {
>>>       "profile": "content-type",
>>>       "foreach": "if exists(domain_content) then domain_content else
>>> 'n/a'",
>>>       "onlyif": "exists(domain_content) && source.type == 'squid'",
>>>       "update": { "o": "STATS_ADD(m, bytes)" },
>>>       "result": "o"
>>>     }
>>>   ]
>>> }
>>>
>>> # profiler properties
>>> ##### Storm #####
>>>
>>> profiler.workers=1
>>> profiler.executors=0
>>> profiler.input.topic=indexing
>>> profiler.period.duration=15
>>> profiler.period.duration.units=MINUTES
>>> profiler.ttl=30
>>> profiler.ttl.units=MINUTES
>>> profiler.hbase.salt.divisor=1000
>>> profiler.hbase.table=profiler
>>> profiler.hbase.column.family=P
>>> profiler.hbase.batch=10
>>> profiler.hbase.flush.interval.seconds=30
>>>
>>> ##### Kafka #####
>>>
>>> kafka.zk=node1:2181
>>> kafka.broker=node1:6667
>>> kafka.start=WHERE_I_LEFT_OFF
>>>
>>> On Sun, Mar 5, 2017 at 2:37 AM, Casey Stella <ce...@gmail.com> wrote:
>>>
>>>> Sorry you are having issues! :(. Sometimes this is due to a mismatch in
>>>> the tick time in the profiler between write and read.
>>>>
>>>> What's in your global config (METRON_HOME/config/zookeeper/global.json),
>>>> profiler config (METRON_HOME/config/zookeeper/profiler.json) and
>>>> profiler topology properties (METRON_HOME/config/profiler.properties)?
>>>>
>>>>
>>>>
>>>> On Sat, Mar 4, 2017 at 17:38 shoggi <sh...@gmail.com> wrote:
>>>>
>>>>> Hey all
>>>>>
>>>>> Very strange, I had a few profilers working and wanted to show someone
>>>>> (left system alone for a few days) & now can't query data anymore. I went
>>>>> so far to reboot the system, deleted the profiler table in hbase and loaded
>>>>> new data.
>>>>>
>>>>> I see the data in HBase but Stellar does not let me query it anymore.
>>>>> The queries return empty as if the data does not exist, but it's definitely
>>>>> there. The timeframe cannot be an issue; I tried a very wide Stellar
>>>>> query and, as mentioned, loaded fresh data.
>>>>>
>>>>> Any troubleshooting hints? This bugs me, as I have not touched the
>>>>> system & even restarted it to get rid of any possible stale connections.
>>>>>
>>>>> [Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"MINUTES")
>>>>> []
>>>>>
>>>>> [Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"HOURS")
>>>>> []
>>>>>
>>>>> The HBase data is there:
>>>>>
>>>>> \xFF\xFF\xFFkurl-bytesgoogle.com\x00\x00\x00\x00\x0 column=P:value,
>>>>> timestamp=1488664729500, value=\x01\x00org.apache.metro
>>>>> n.statistics.OnlineStatisticsProvide\xF2\x01\x00\x00\x00\x1C
>>>>> \x00\x00\x00\x01@b\x
>>>>>  1z\x96F
>>>>> C0\x00\x00\x00\x00\x00\x00\x00\x00\x01@\x82H\x00\x00\x00\x00
>>>>> \x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x01@\x82H\x
>>>>> 00\x00\x00\x00\x00A\x14\xE3D\x0
>>>>>                                                      0\x00\x00\x00@
>>>>> \x19|\x87\xD0\xEA\xAA\xFB@\x82H\x00\x00\x00\x00\x00@
>>>>> \x82H\x00\x00\x00\x00\x00@\x82H\x00\x00\x00\x00\x00\x
>>>>> 00\x00\x00\x00\x00\x00\x00\x00\x
>>>>>
>>>>>  00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
>>>>>
>>>>> Thanks
>>>>> shoggi
>>>>>
>>>>
>>>
>>
>

Re: Stellar is unable to query base data

Posted by Casey Stella <ce...@gmail.com>.
Sorry we couldn't get you to a working state, but thanks so much for the
kind words!
On Mon, Mar 6, 2017 at 20:53 shoggi <sh...@gmail.com> wrote:

> Hi all
>
> I did add the 2 additional lines in the global.json file but also that did
> not help.
>
> I completely understand the issues of running an outdated version, and I am
> very thankful for all the help provided by all of you.
> Unfortunately I could not upgrade my 0.3.0 bare metal system as my time
> evaluating the system has run short. I wanted to use the time I had left to
> collect as much information as possible.
>
> I definitely like what I have seen and I absolutely believe this to be a
> great tool as it provides great flexibility in creating customized
> analytics & detection capabilities. Just thinking about the enrichment part
> and how I can tie in virtually anything I want, is exactly what I want from
> such a platform.
>
> You guys do an amazing job of helping people who want to get to know Metron
> better. Undoubtedly it is very hard to work on the code while also finding
> the time to answer the many questions.
>
> If I may add two recommendations from my end.
>
> 1) Current and past threads point out the need not to lose touch with the
> 'window shoppers' (not sure if that term makes sense to you). Invest some
> time to get the installers right. People with various backgrounds are
> looking at your project right now. The simpler it is for them to get a
> system running, the more likely they are to get excited about it. It took me
> more than a month to go through an installation: getting to know the various
> components, getting my own feeds prepared and ingested, including various
> data enrichments, and creating relevant dashboards. You cannot expect this
> from someone who just wants to look at metron. True, it is still early days
> but the more you automate that 'acquisition' part, the more time you will
> have on your hands doing development work. It is easier said than done but
> you have here lots of people in the group who are willing to test &
> contribute. Use them..! Oh yeah.. I really like your youtube videos but you
> definitely need to promote those better. Add some proper titles so that
> they can be searched & found and add a short summary of what the video is
> about in the description field. It can be something very simple such as
> follows (Video from the 23rd of September):
> Topics covered:
> a) PCAP CLI
> b) Stellar introduction
> c) Ambari
> d) Profiler
>
> 2) Documentation is painful but rewarding. Ask the community to help; you
> might even find someone with the passion to handle this for you. I did
> write my own installation manual for a bare metal rig, only to find out
> that others such as Dima did the same (and better). It might be helpful to
> have a dedicated person or a group of people to write metron documentation.
> Some things are for a developer just known facts. A newbie on the other
> hand can be easily deterred if not guided through properly. On that note..
> Apart from everything else I had to find out, one experience stuck with me.
> When I for example wanted to join a variable and a string with Stellar, I
> had to look at the source code to find the proper syntax. It never occurred
> to me to use square brackets and the short help only mentions to use a
> list. For the person who coded the function, it is crystal clear. Others
> might get to it eventually and then there are the rest who do not want to
> find out, it just needs to be clear. Also here, it is easier said than done
> but I strongly believe that you can gain a lot from having someone oversee
> your docs and help get more people excited about Metron.
>
> Keep up with the great work !
>
> Regards
> Shoggi
>
> On Mon, Mar 6, 2017 at 8:12 PM, Michael Miklavcic <
> michael.miklavcic@gmail.com> wrote:
>
> Hi Shoggi,
>
> In addition to Nick's and Casey's comments, I noticed your global.json
> does not specify a profiler period. Try adding the following:
> "profiler.client.period.duration" : "15",
> "profiler.client.period.duration.units" : "MINUTES"
>
> This period duration should match the duration you've specified in the
> profiler.properties file:
> profiler.period.duration=15
> profiler.period.duration.units=MINUTES
>
> If you want to use a different period duration, you should change the
> value to match in *both* locations.
>
> Best,
> Mike
>
>
> On Sun, Mar 5, 2017 at 6:09 PM, Nick Allen <ni...@nickallen.org> wrote:
>
> What version of Metron are you using?  Based on what I am seeing in the
> stack trace it seems to be a few versions ago.  Any chance you'd be willing
> to try something newer like 0.3.1 RC5? It would be easier to help
> troubleshoot that way.
>
> On Sun, Mar 5, 2017 at 5:48 PM, shoggi <sh...@gmail.com> wrote:
>
> The quorum and kafka config was ok, the host is actually called node1
> (same system). The variables were set like that because I wanted to see if
> I can set it to another value. Anyway, changed everything back and did
> another round of this:
>   - killed the topology
>   - created an empty profiler config
>   - restarted system
>   - added profiler configuration again (started with just one profile)
>   - data gets added to hbase, I get the error as shown previously, every
> couple of flush cycles
>   - still no luck querying hbase out from stellar or via the enrichment
> parser. No errors anywhere but the profiler NPE's
>
> you mentioned tick time.. is that something I can tune?
>
>
> 2017-03-05 23:25:06.583 o.a.m.p.b.ProfileBuilderBolt [INFO] Flushing
> profile: profile=url-length, entity=google.ch
> 2017-03-05 23:25:06.584 o.a.m.p.b.ProfileBuilderBolt [ERROR] Unexpected
> failure: message='null', tuple='source: __system:-1, stream: __tick, id:
> {}, [60]'
> java.lang.NullPointerException
> at
> org.apache.metron.profiler.stellar.DefaultStellarExecutor.execute(DefaultStellarExecutor.java:117)
> ~[stormjar.jar:?]
> at
> org.apache.metron.profiler.bolt.ProfileBuilderBolt.executeResult(ProfileBuilderBolt.java:316)
> ~[stormjar.jar:?]
> at
> org.apache.metron.profiler.bolt.ProfileBuilderBolt.lambda$flush$4(ProfileBuilderBolt.java:245)
> ~[stormjar.jar:?]
> at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:114)
> ~[?:1.8.0_77]
> at
> org.apache.metron.profiler.bolt.ProfileBuilderBolt.flush(ProfileBuilderBolt.java:237)
> ~[stormjar.jar:?]
> at
> org.apache.metron.profiler.bolt.ProfileBuilderBolt.doExecute(ProfileBuilderBolt.java:164)
> ~[stormjar.jar:?]
> at
> org.apache.metron.profiler.bolt.ProfileBuilderBolt.execute(ProfileBuilderBolt.java:144)
> [stormjar.jar:?]
> at
> org.apache.storm.daemon.executor$fn__6571$tuple_action_fn__6573.invoke(executor.clj:734)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at
> org.apache.storm.daemon.executor$mk_task_receiver$fn__6492.invoke(executor.clj:469)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at
> org.apache.storm.disruptor$clojure_handler$reify__6005.onEvent(disruptor.clj:40)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at
> org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at
> org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at
> org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at
> org.apache.storm.daemon.executor$fn__6571$fn__6584$fn__6637.invoke(executor.clj:853)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
> 2017-03-05 23:25:06.585 o.a.s.d.executor [ERROR]
> java.lang.NullPointerException
> at
> org.apache.metron.profiler.stellar.DefaultStellarExecutor.execute(DefaultStellarExecutor.java:117)
> ~[stormjar.jar:?]
> at
> org.apache.metron.profiler.bolt.ProfileBuilderBolt.executeResult(ProfileBuilderBolt.java:316)
> ~[stormjar.jar:?]
> at
> org.apache.metron.profiler.bolt.ProfileBuilderBolt.lambda$flush$4(ProfileBuilderBolt.java:245)
> ~[stormjar.jar:?]
> at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:114)
> ~[?:1.8.0_77]
> at
> org.apache.metron.profiler.bolt.ProfileBuilderBolt.flush(ProfileBuilderBolt.java:237)
> ~[stormjar.jar:?]
> at
> org.apache.metron.profiler.bolt.ProfileBuilderBolt.doExecute(ProfileBuilderBolt.java:164)
> ~[stormjar.jar:?]
> at
> org.apache.metron.profiler.bolt.ProfileBuilderBolt.execute(ProfileBuilderBolt.java:144)
> [stormjar.jar:?]
> at
> org.apache.storm.daemon.executor$fn__6571$tuple_action_fn__6573.invoke(executor.clj:734)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at
> org.apache.storm.daemon.executor$mk_task_receiver$fn__6492.invoke(executor.clj:469)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at
> org.apache.storm.disruptor$clojure_handler$reify__6005.onEvent(disruptor.clj:40)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at
> org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at
> org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at
> org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at
> org.apache.storm.daemon.executor$fn__6571$fn__6584$fn__6637.invoke(executor.clj:853)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
> 2017-03-05 23:25:08.628 o.a.s.k.ZkCoordinator [INFO] Task [1/1] Refreshing
> partition manager connections
>
> On Sun, Mar 5, 2017 at 6:24 PM, Casey Stella <ce...@gmail.com> wrote:
>
> Ok, so a couple of things I see here that you might try:
>
>
>    - You should set kafka.zk and kafka.broker in profiler.properties to
>    your real zookeeper quorum and kafka broker respectively
>
> In your profiler.json, instead of:
>  {
>       "profile": "url-bytes",
>       "foreach": "if exists(domain_without_subdomains) then
> domain_without_subdomains else 'n/a'",
>       "onlyif": "exists(domain_without_subdomains) && source.type ==
> 'squid'",
>       "update": { "n": "STATS_ADD(m, bytes)" },
>       "result": "n"
>     },
> {
>       "profile": "content-type",
>       "foreach": "if exists(domain_content) then domain_content else
> 'n/a'",
>       "onlyif": "exists(domain_content) && source.type == 'squid'",
>       "update": { "o": "STATS_ADD(m, bytes)" },
>       "result": "o"
>     }
> You might want (note the change on the update statements)
>  {
>       "profile": "url-bytes",
>       "foreach": "if exists(domain_without_subdomains) then
> domain_without_subdomains else 'n/a'",
>       "onlyif": "exists(domain_without_subdomains) && source.type ==
> 'squid'",
>       "update": { "n": "STATS_ADD(n, bytes)" },
>       "result": "n"
>     },
> {
>       "profile": "content-type",
>       "foreach": "if exists(domain_content) then domain_content else
> 'n/a'",
>       "onlyif": "exists(domain_content) && source.type == 'squid'",
>       "update": { "o": "STATS_ADD(o, bytes)" },
>       "result": "o"
>     }
>
> Try restarting the profiler topology and if you could look at the storm
> logs and see if you see any issues show up in the logs for the profiler.
>
> On Sun, Mar 5, 2017 at 7:11 AM, shoggi <sh...@gmail.com> wrote:
>
> Here is my config:
>
> # global config
> {
> "es.clustername": "metron",
> "es.ip": "172.16.16.2",
> "es.port": "9300",
> "es.date.format": "yyyy.MM.dd.HH"
> }
>
> # profiler config
> {
>   "profiles": [
>     {
>       "profile": "url-length",
>       "foreach": "if exists(domain_without_subdomains) then
> domain_without_subdomains else 'n/a'",
>       "onlyif": "exists(domain_without_subdomains) && source.type ==
> 'squid'",
>       "update": { "m": "STATS_ADD(m, LENGTH(url))" },
>       "result": "m"
>     },
>     {
>       "profile": "url-bytes",
>       "foreach": "if exists(domain_without_subdomains) then
> domain_without_subdomains else 'n/a'",
>       "onlyif": "exists(domain_without_subdomains) && source.type ==
> 'squid'",
>       "update": { "n": "STATS_ADD(m, bytes)" },
>       "result": "n"
>     },
>     {
>       "profile": "content-type",
>       "foreach": "if exists(domain_content) then domain_content else
> 'n/a'",
>       "onlyif": "exists(domain_content) && source.type == 'squid'",
>       "update": { "o": "STATS_ADD(m, bytes)" },
>       "result": "o"
>     }
>   ]
> }
>
> # profiler properties
> ##### Storm #####
>
> profiler.workers=1
> profiler.executors=0
> profiler.input.topic=indexing
> profiler.period.duration=15
> profiler.period.duration.units=MINUTES
> profiler.ttl=30
> profiler.ttl.units=MINUTES
> profiler.hbase.salt.divisor=1000
> profiler.hbase.table=profiler
> profiler.hbase.column.family=P
> profiler.hbase.batch=10
> profiler.hbase.flush.interval.seconds=30
>
> ##### Kafka #####
>
> kafka.zk=node1:2181
> kafka.broker=node1:6667
> kafka.start=WHERE_I_LEFT_OFF
>
> On Sun, Mar 5, 2017 at 2:37 AM, Casey Stella <ce...@gmail.com> wrote:
>
> Sorry you are having issues! :(. Sometimes this is due to a mismatch in
> the tick time in the profiler between write and read.
>
> What's in your global config (METRON_HOME/config/zookeeper/global.json),
> profiler config (METRON_HOME/config/zookeeper/profiler.json) and profiler
> topology properties (METRON_HOME/config/profiler.properties)?
>
>
>
> On Sat, Mar 4, 2017 at 17:38 shoggi <sh...@gmail.com> wrote:
>
> Hey all
>
> Very strange, I had a few profilers working and wanted to show someone
> (left system alone for a few days) & now can't query data anymore. I went
> so far to reboot the system, deleted the profiler table in hbase and loaded
> new data.
>
> I see the data in HBase but Stellar does not let me query it anymore. The
> queries return empty as if the data does not exist, but it's definitely there.
> The timeframe cannot be an issue; I tried a very wide Stellar query
> and, as mentioned, loaded fresh data.
>
> Any troubleshooting hints? This bugs me, as I have not touched the system
> & even restarted it to get rid of any possible stale connections.
>
> [Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"MINUTES")
> []
>
> [Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"HOURS")
> []
>
> The HBase data is there:
>
> \xFF\xFF\xFFkurl-bytesgoogle.com\x00\x00\x00\x00\x0 column=P:value,
> timestamp=1488664729500,
> value=\x01\x00org.apache.metron.statistics.OnlineStatisticsProvide\xF2\x01\x00\x00\x00\x1C\x00\x00\x00\x01@b
> \x
>  1z\x96F
> C0\x00\x00\x00\x00\x00\x00\x00\x00\x01@
> \x82H\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x01@
> \x82H\x00\x00\x00\x00\x00A\x14\xE3D\x0
>                                                      0\x00\x00\x00@
> \x19|\x87\xD0\xEA\xAA\xFB@\x82H\x00\x00\x00\x00\x00@
> \x82H\x00\x00\x00\x00\x00@
> \x82H\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x
>
>  00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
>
> Thanks
> shoggi
>
>
>
>
>
>
>
>

Re: Stellar is unable to query base data

Posted by shoggi <sh...@gmail.com>.
Hi all

I did add the 2 additional lines in the global.json file but also that did
not help.

I completely understand the issues of running an outdated version, and I am
very thankful for all the help provided by all of you.
Unfortunately I could not upgrade my 0.3.0 bare metal system as my time
evaluating the system has run short. I wanted to use the time I had left to
collect as much information as possible.

I definitely like what I have seen and I absolutely believe this to be a
great tool as it provides great flexibility in creating customized
analytics & detection capabilities. Just thinking about the enrichment part
and how I can tie in virtually anything I want, is exactly what I want from
such a platform.

You guys do an amazing job of helping people who want to get to know Metron
better. Undoubtedly it is very hard to work on the code while also finding
the time to answer the many questions.

If I may add two recommendations from my end.

1) Current and past threads point out the need not to lose touch with the
'window shoppers' (not sure if that term makes sense to you). Invest some time
to get the installers right. People with various backgrounds are looking at
your project right now. The simpler it is for them to get a system running,
the more likely they are to get excited about it. It took me more than a
month to go through an installation: getting to know the various components,
getting my own feeds prepared and ingested, including various data
enrichments, and creating relevant dashboards. You cannot expect this
from someone who just wants to look at metron. True, it is still early days
but the more you automate that 'acquisition' part, the more time you will
have on your hands doing development work. It is easier said than done but
you have here lots of people in the group who are willing to test &
contribute. Use them..! Oh yeah.. I really like your youtube videos but you
definitely need to promote those better. Add some proper titles so that
they can be searched & found and add a short summary of what the video is
about in the description field. It can be something very simple such as
follows (Video from the 23rd of September):
Topics covered:
a) PCAP CLI
b) Stellar introduction
c) Ambari
d) Profiler

2) Documentation is painful but rewarding. Ask the community to help; you
might even find someone with the passion to handle this for you. I did
write my own installation manual for a bare metal rig, only to find out
that others such as Dima did the same (and better). It might be helpful to
have a dedicated person or a group of people to write metron documentation.
Some things are for a developer just known facts. A newbie on the other
hand can be easily deterred if not guided through properly. On that note..
Apart of everything else I had to find out, one experience stuck with me.
When I for example wanted to join a variable and a string with Stellar, I
had to look at the source code to find the proper syntax. It never occurred
to me to use square brackets and the short help only mentions to use a
list. For the person who coded the function, it is crystal clear. Others
might get to it eventually and then there are the rest who do not want to
find out, it just needs to be clear. Also here, it is easier said than done
but I strongly believe that you can gain lots of having someone oversee
your docs and help getting more people excited about metron.
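[Editor's note: the square-bracket syntax referred to above appears to be Stellar's
JOIN function, which takes a list plus a delimiter. A minimal REPL sketch, with
made-up inputs:]

```text
[Stellar]>>> JOIN(['google', 'com'], '.')
google.com
```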

Keep up the great work!

Regards
Shoggi

On Mon, Mar 6, 2017 at 8:12 PM, Michael Miklavcic <
michael.miklavcic@gmail.com> wrote:

> Hi Shoggi,
>
> In addition to Nick's and Casey's comments, I noticed your global.json
> does not specify a profiler period. Try adding the following:
> "profiler.client.period.duration" : "15",
> "profiler.client.period.duration.units" : "MINUTES"
>
> This period duration should match the duration you've specified in the
> profiler.properties file:
> profiler.period.duration=15
> profiler.period.duration.units=MINUTES
>
> If you want to use a different period duration, you should change the
> value to match in *both* locations.
>
> Best,
> Mike
>
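[Editor's note: in concrete terms, Mike's advice means the two files end up
carrying the same period. A sketch of the matched pair, using paths and values
taken from elsewhere in this thread; the two profiler.client.* keys are the
additions to the global config:]

```text
# METRON_HOME/config/zookeeper/global.json
{
  "es.clustername": "metron",
  "es.ip": "172.16.16.2",
  "es.port": "9300",
  "es.date.format": "yyyy.MM.dd.HH",
  "profiler.client.period.duration": "15",
  "profiler.client.period.duration.units": "MINUTES"
}

# METRON_HOME/config/profiler.properties (excerpt) -- must agree
profiler.period.duration=15
profiler.period.duration.units=MINUTES
```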
>
> On Sun, Mar 5, 2017 at 6:09 PM, Nick Allen <ni...@nickallen.org> wrote:
>
>> What version of Metron are you using?  Based on what I am seeing in the
>> stack trace it seems to be a few versions ago.  Any chance you'd be willing
>> to try something newer like 0.3.1 RC5? It would be easier to help
>> troubleshoot that way.
>>
>> On Sun, Mar 5, 2017 at 5:48 PM, shoggi <sh...@gmail.com> wrote:
>>
>>> The quorum and kafka config was ok, the host is actually called node1
>>> (same system). The variables were set like that because I wanted to see if
>>> I can set it to another value. Anyway, changed everything back and did
>>> another of this:
>>>   - killed the topology
>>>   - created an empty profiler config
>>>   - restarted system
>>>   - added profiler configuration again (started with just one profile)
>>>   - data gets added to hbase, I get the error as shown previously, every
>>> couple of flush cycles
>>>   - still no luck querying hbase out from stellar or via the enrichment
>>> parser. No errors anywhere but the profiler NPE's
>>>
>>> you mentioned tick time.. is that something I can tune?
>>>
>>>
>>> 2017-03-05 23:25:06.583 o.a.m.p.b.ProfileBuilderBolt [INFO] Flushing
>>> profile: profile=url-length, entity=google.ch
>>> 2017-03-05 23:25:06.584 o.a.m.p.b.ProfileBuilderBolt [ERROR] Unexpected
>>> failure: message='null', tuple='source: __system:-1, stream: __tick, id:
>>> {}, [60]'
>>> java.lang.NullPointerException
>>> at org.apache.metron.profiler.stellar.DefaultStellarExecutor.ex
>>> ecute(DefaultStellarExecutor.java:117) ~[stormjar.jar:?]
>>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.executeRe
>>> sult(ProfileBuilderBolt.java:316) ~[stormjar.jar:?]
>>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.lambda$fl
>>> ush$4(ProfileBuilderBolt.java:245) ~[stormjar.jar:?]
>>> at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:114)
>>> ~[?:1.8.0_77]
>>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.flush(ProfileBuilderBolt.java:237)
>>> ~[stormjar.jar:?]
>>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.doExecute(ProfileBuilderBolt.java:164)
>>> ~[stormjar.jar:?]
>>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.execute(ProfileBuilderBolt.java:144)
>>> [stormjar.jar:?]
>>> at org.apache.storm.daemon.executor$fn__6571$tuple_action_fn__6573.invoke(executor.clj:734)
>>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>>> at org.apache.storm.daemon.executor$mk_task_receiver$fn__6492.invoke(executor.clj:469)
>>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>>> at org.apache.storm.disruptor$clojure_handler$reify__6005.onEvent(disruptor.clj:40)
>>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>>> at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451)
>>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>>> at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430)
>>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>>> at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
>>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>>> at org.apache.storm.daemon.executor$fn__6571$fn__6584$fn__6637.invoke(executor.clj:853)
>>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>>> at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484)
>>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>>> at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
>>> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
>>> 2017-03-05 23:25:06.585 o.a.s.d.executor [ERROR]
>>> java.lang.NullPointerException
>>> at org.apache.metron.profiler.stellar.DefaultStellarExecutor.ex
>>> ecute(DefaultStellarExecutor.java:117) ~[stormjar.jar:?]
>>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.executeRe
>>> sult(ProfileBuilderBolt.java:316) ~[stormjar.jar:?]
>>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.lambda$fl
>>> ush$4(ProfileBuilderBolt.java:245) ~[stormjar.jar:?]
>>> at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:114)
>>> ~[?:1.8.0_77]
>>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.flush(ProfileBuilderBolt.java:237)
>>> ~[stormjar.jar:?]
>>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.doExecute(ProfileBuilderBolt.java:164)
>>> ~[stormjar.jar:?]
>>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.execute(ProfileBuilderBolt.java:144)
>>> [stormjar.jar:?]
>>> at org.apache.storm.daemon.executor$fn__6571$tuple_action_fn__6573.invoke(executor.clj:734)
>>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>>> at org.apache.storm.daemon.executor$mk_task_receiver$fn__6492.invoke(executor.clj:469)
>>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>>> at org.apache.storm.disruptor$clojure_handler$reify__6005.onEvent(disruptor.clj:40)
>>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>>> at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451)
>>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>>> at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430)
>>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>>> at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
>>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>>> at org.apache.storm.daemon.executor$fn__6571$fn__6584$fn__6637.invoke(executor.clj:853)
>>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>>> at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484)
>>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>>> at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
>>> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
>>> 2017-03-05 23:25:08.628 o.a.s.k.ZkCoordinator [INFO] Task [1/1]
>>> Refreshing partition manager connections
>>>
>>> On Sun, Mar 5, 2017 at 6:24 PM, Casey Stella <ce...@gmail.com> wrote:
>>>
>>>> Ok, so a couple of things I see here that you might try:
>>>>
>>>>
>>>>    - You should set kafka.zk and kafka.broker in profiler.properties
>>>>    to your real zookeeper quorum and kafka broker respectively
>>>>
>>>> In your profiler.json, instead of:
>>>>  {
>>>>       "profile": "url-bytes",
>>>>       "foreach": "if exists(domain_without_subdomains) then
>>>> domain_without_subdomains else 'n/a'",
>>>>       "onlyif": "exists(domain_without_subdomains) && source.type ==
>>>> 'squid'",
>>>>       "update": { "n": "STATS_ADD(m, bytes)" },
>>>>       "result": "n"
>>>>     },
>>>> {
>>>>       "profile": "content-type",
>>>>       "foreach": "if exists(domain_content) then domain_content else
>>>> 'n/a'",
>>>>       "onlyif": "exists(domain_content) && source.type == 'squid'",
>>>>       "update": { "o": "STATS_ADD(m, bytes)" },
>>>>       "result": "o"
>>>>     }
>>>> You might want (note the change on the update statements)
>>>>  {
>>>>       "profile": "url-bytes",
>>>>       "foreach": "if exists(domain_without_subdomains) then
>>>> domain_without_subdomains else 'n/a'",
>>>>       "onlyif": "exists(domain_without_subdomains) && source.type ==
>>>> 'squid'",
>>>>       "update": { "n": "STATS_ADD(n, bytes)" },
>>>>       "result": "n"
>>>>     },
>>>> {
>>>>       "profile": "content-type",
>>>>       "foreach": "if exists(domain_content) then domain_content else
>>>> 'n/a'",
>>>>       "onlyif": "exists(domain_content) && source.type == 'squid'",
>>>>       "update": { "o": "STATS_ADD(o, bytes)" },
>>>>       "result": "o"
>>>>     }
>>>>
>>>> Try restarting the profiler topology and if you could look at the storm
>>>> logs and see if you see any issues show up in the logs for the profiler.
>>>>
>>>> On Sun, Mar 5, 2017 at 7:11 AM, shoggi <sh...@gmail.com> wrote:
>>>>
>>>>> Here is my config:
>>>>>
>>>>> # global config
>>>>> {
>>>>> "es.clustername": "metron",
>>>>> "es.ip": "172.16.16.2",
>>>>> "es.port": "9300",
>>>>> "es.date.format": "yyyy.MM.dd.HH"
>>>>> }
>>>>>
>>>>> # profiler config
>>>>> {
>>>>>   "profiles": [
>>>>>     {
>>>>>       "profile": "url-length",
>>>>>       "foreach": "if exists(domain_without_subdomains) then
>>>>> domain_without_subdomains else 'n/a'",
>>>>>       "onlyif": "exists(domain_without_subdomains) && source.type ==
>>>>> 'squid'",
>>>>>       "update": { "m": "STATS_ADD(m, LENGTH(url))" },
>>>>>       "result": "m"
>>>>>     },
>>>>>     {
>>>>>       "profile": "url-bytes",
>>>>>       "foreach": "if exists(domain_without_subdomains) then
>>>>> domain_without_subdomains else 'n/a'",
>>>>>       "onlyif": "exists(domain_without_subdomains) && source.type ==
>>>>> 'squid'",
>>>>>       "update": { "n": "STATS_ADD(m, bytes)" },
>>>>>       "result": "n"
>>>>>     },
>>>>>     {
>>>>>       "profile": "content-type",
>>>>>       "foreach": "if exists(domain_content) then domain_content else
>>>>> 'n/a'",
>>>>>       "onlyif": "exists(domain_content) && source.type == 'squid'",
>>>>>       "update": { "o": "STATS_ADD(m, bytes)" },
>>>>>       "result": "o"
>>>>>     }
>>>>>   ]
>>>>> }
>>>>>
>>>>> # profiler properties
>>>>> ##### Storm #####
>>>>>
>>>>> profiler.workers=1
>>>>> profiler.executors=0
>>>>> profiler.input.topic=indexing
>>>>> profiler.period.duration=15
>>>>> profiler.period.duration.units=MINUTES
>>>>> profiler.ttl=30
>>>>> profiler.ttl.units=MINUTES
>>>>> profiler.hbase.salt.divisor=1000
>>>>> profiler.hbase.table=profiler
>>>>> profiler.hbase.column.family=P
>>>>> profiler.hbase.batch=10
>>>>> profiler.hbase.flush.interval.seconds=30
>>>>>
>>>>> ##### Kafka #####
>>>>>
>>>>> kafka.zk=node1:2181
>>>>> kafka.broker=node1:6667
>>>>> kafka.start=WHERE_I_LEFT_OFF
>>>>>
>>>>> On Sun, Mar 5, 2017 at 2:37 AM, Casey Stella <ce...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Sorry you are having issues! :(. Sometimes this is due to a mismatch
>>>>>> in the tick time in the profiler between write and read.
>>>>>>
>>>>>> What's in your global config (METRON_HOME/config/zookeeper/global.json),
>>>>>> profiler config (METRON_HOME/config/zookeeper/profiler.json) and
>>>>>> profiler topology properties (METRON_HOME/config/profiler.p
>>>>>> roperties)?
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, Mar 4, 2017 at 17:38 shoggi <sh...@gmail.com> wrote:
>>>>>>
>>>>>>> Hey all
>>>>>>>
>>>>>>> Very strange, I had a few profilers working and wanted to show
>>>>>>> someone (left system alone for a few days) & now can't query data anymore.
>>>>>>> I went so far to reboot the system, deleted the profiler table in hbase and
>>>>>>> loaded new data.
>>>>>>>
>>>>>>> I see the data in hbase but stellar does not let me query it
>>>>>>> anymore. The queries return empty as if the data does not exist,
>>>>>>> but it's definitely there. The timeframe cannot be an issue; I
>>>>>>> tried a very wide stellar query and, as mentioned, loaded fresh
>>>>>>> data.
>>>>>>>
>>>>>>> Any troubleshooting hints? This bugs me, as I have not touched the
>>>>>>> system & even restarted it to get rid of any possible stale connections.
>>>>>>>
>>>>>>> [Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"MINUTES")
>>>>>>> []
>>>>>>>
>>>>>>> [Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"HOURS")
>>>>>>> []
>>>>>>>
>>>>>>> Base data is there:
>>>>>>>
>>>>>>> \xFF\xFF\xFFkurl-bytesgoogle.com\x00\x00\x00\x00\x0 column=P:value,
>>>>>>> timestamp=1488664729500, value=\x01\x00org.apache.metro
>>>>>>> n.statistics.OnlineStatisticsProvide\xF2\x01\x00\x00\x00\x1C
>>>>>>> \x00\x00\x00\x01@b\x
>>>>>>>  1z\x96F
>>>>>>> C0\x00\x00\x00\x00\x00\x00\x00\x00\x01@\x82H\x00\x00\x00\x00
>>>>>>> \x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x01@\x82H\x
>>>>>>> 00\x00\x00\x00\x00A\x14\xE3D\x0
>>>>>>>                                                      0\x00\x00\x00@
>>>>>>> \x19|\x87\xD0\xEA\xAA\xFB@\x82H\x00\x00\x00\x00\x00@
>>>>>>> \x82H\x00\x00\x00\x00\x00@\x82H\x00\x00\x00\x00\x00\x
>>>>>>> 00\x00\x00\x00\x00\x00\x00\x00\x
>>>>>>>
>>>>>>>  00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
>>>>>>>
>>>>>>> Thanks
>>>>>>> shoggi
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
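[Editor's note: one root cause surfaced in this thread is an "update" expression
that accumulates into one variable while STATS_ADD reads another (Casey's fix
above). That class of mistake is easy to lint for before deploying a profile.
A throwaway sketch, not Metron code; the function name and the regex are the
editor's own:]

```python
# Sanity-check a profiler.json: flag profiles whose update expression
# assigns one state variable but has STATS_ADD read a different one.
import json
import re

def check_profiles(config_text):
    """Return a list of warnings for mismatched STATS_ADD variables."""
    warnings = []
    for profile in json.loads(config_text)["profiles"]:
        for var, expr in profile.get("update", {}).items():
            # STATS_ADD(x, ...) should normally read the same variable x
            # the update assigns; otherwise the state never accumulates.
            m = re.match(r"\s*STATS_ADD\(\s*(\w+)\s*,", expr)
            if m and m.group(1) != var:
                warnings.append(
                    f"profile '{profile['profile']}': update assigns "
                    f"'{var}' but STATS_ADD reads '{m.group(1)}'")
    return warnings

# The buggy profile from this thread: assigns "n", reads "m".
config = '''{"profiles": [
  {"profile": "url-bytes",
   "update": {"n": "STATS_ADD(m, bytes)"},
   "result": "n"}
]}'''
print(check_profiles(config))
```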

Re: Stellar is unable to query base data

Posted by Michael Miklavcic <mi...@gmail.com>.
Hi Shoggi,

In addition to Nick's and Casey's comments, I noticed your global.json does
not specify a profiler period. Try adding the following:
"profiler.client.period.duration" : "15",
"profiler.client.period.duration.units" : "MINUTES"

This period duration should match the duration you've specified in the
profiler.properties file:
profiler.period.duration=15
profiler.period.duration.units=MINUTES

If you want to use a different period duration, you should change the value
to match in *both* locations.

Best,
Mike


On Sun, Mar 5, 2017 at 6:09 PM, Nick Allen <ni...@nickallen.org> wrote:

> What version of Metron are you using?  Based on what I am seeing in the
> stack trace it seems to be a few versions ago.  Any chance you'd be willing
> to try something newer like 0.3.1 RC5? It would be easier to help
> troubleshoot that way.
>
> On Sun, Mar 5, 2017 at 5:48 PM, shoggi <sh...@gmail.com> wrote:
>
>> The quorum and kafka config was ok, the host is actually called node1
>> (same system). The variables were set like that because I wanted to see if
>> I can set it to another value. Anyway, changed everything back and did
>> another of this:
>>   - killed the topology
>>   - created an empty profiler config
>>   - restarted system
>>   - added profiler configuration again (started with just one profile)
>>   - data gets added to hbase, I get the error as shown previously, every
>> couple of flush cycles
>>   - still no luck querying hbase out from stellar or via the enrichment
>> parser. No errors anywhere but the profiler NPE's
>>
>> you mentioned tick time.. is that something I can tune?
>>
>>
>> 2017-03-05 23:25:06.583 o.a.m.p.b.ProfileBuilderBolt [INFO] Flushing
>> profile: profile=url-length, entity=google.ch
>> 2017-03-05 23:25:06.584 o.a.m.p.b.ProfileBuilderBolt [ERROR] Unexpected
>> failure: message='null', tuple='source: __system:-1, stream: __tick, id:
>> {}, [60]'
>> java.lang.NullPointerException
>> at org.apache.metron.profiler.stellar.DefaultStellarExecutor.ex
>> ecute(DefaultStellarExecutor.java:117) ~[stormjar.jar:?]
>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.executeRe
>> sult(ProfileBuilderBolt.java:316) ~[stormjar.jar:?]
>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.lambda$
>> flush$4(ProfileBuilderBolt.java:245) ~[stormjar.jar:?]
>> at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:114)
>> ~[?:1.8.0_77]
>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.flush(ProfileBuilderBolt.java:237)
>> ~[stormjar.jar:?]
>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.doExecute(ProfileBuilderBolt.java:164)
>> ~[stormjar.jar:?]
>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.execute(ProfileBuilderBolt.java:144)
>> [stormjar.jar:?]
>> at org.apache.storm.daemon.executor$fn__6571$tuple_action_fn__6573.invoke(executor.clj:734)
>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>> at org.apache.storm.daemon.executor$mk_task_receiver$fn__6492.invoke(executor.clj:469)
>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>> at org.apache.storm.disruptor$clojure_handler$reify__6005.onEvent(disruptor.clj:40)
>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>> at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451)
>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>> at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430)
>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>> at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>> at org.apache.storm.daemon.executor$fn__6571$fn__6584$fn__6637.invoke(executor.clj:853)
>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>> at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484)
>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>> at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
>> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
>> 2017-03-05 23:25:06.585 o.a.s.d.executor [ERROR]
>> java.lang.NullPointerException
>> at org.apache.metron.profiler.stellar.DefaultStellarExecutor.ex
>> ecute(DefaultStellarExecutor.java:117) ~[stormjar.jar:?]
>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.executeRe
>> sult(ProfileBuilderBolt.java:316) ~[stormjar.jar:?]
>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.lambda$
>> flush$4(ProfileBuilderBolt.java:245) ~[stormjar.jar:?]
>> at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:114)
>> ~[?:1.8.0_77]
>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.flush(ProfileBuilderBolt.java:237)
>> ~[stormjar.jar:?]
>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.doExecute(ProfileBuilderBolt.java:164)
>> ~[stormjar.jar:?]
>> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.execute(ProfileBuilderBolt.java:144)
>> [stormjar.jar:?]
>> at org.apache.storm.daemon.executor$fn__6571$tuple_action_fn__6573.invoke(executor.clj:734)
>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>> at org.apache.storm.daemon.executor$mk_task_receiver$fn__6492.invoke(executor.clj:469)
>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>> at org.apache.storm.disruptor$clojure_handler$reify__6005.onEvent(disruptor.clj:40)
>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>> at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451)
>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>> at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430)
>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>> at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>> at org.apache.storm.daemon.executor$fn__6571$fn__6584$fn__6637.invoke(executor.clj:853)
>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>> at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484)
>> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
>> at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
>> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
>> 2017-03-05 23:25:08.628 o.a.s.k.ZkCoordinator [INFO] Task [1/1]
>> Refreshing partition manager connections
>>
>> On Sun, Mar 5, 2017 at 6:24 PM, Casey Stella <ce...@gmail.com> wrote:
>>
>>> Ok, so a couple of things I see here that you might try:
>>>
>>>
>>>    - You should set kafka.zk and kafka.broker in profiler.properties to
>>>    your real zookeeper quorum and kafka broker respectively
>>>
>>> In your profiler.json, instead of:
>>>  {
>>>       "profile": "url-bytes",
>>>       "foreach": "if exists(domain_without_subdomains) then
>>> domain_without_subdomains else 'n/a'",
>>>       "onlyif": "exists(domain_without_subdomains) && source.type ==
>>> 'squid'",
>>>       "update": { "n": "STATS_ADD(m, bytes)" },
>>>       "result": "n"
>>>     },
>>> {
>>>       "profile": "content-type",
>>>       "foreach": "if exists(domain_content) then domain_content else
>>> 'n/a'",
>>>       "onlyif": "exists(domain_content) && source.type == 'squid'",
>>>       "update": { "o": "STATS_ADD(m, bytes)" },
>>>       "result": "o"
>>>     }
>>> You might want (note the change on the update statements)
>>>  {
>>>       "profile": "url-bytes",
>>>       "foreach": "if exists(domain_without_subdomains) then
>>> domain_without_subdomains else 'n/a'",
>>>       "onlyif": "exists(domain_without_subdomains) && source.type ==
>>> 'squid'",
>>>       "update": { "n": "STATS_ADD(n, bytes)" },
>>>       "result": "n"
>>>     },
>>> {
>>>       "profile": "content-type",
>>>       "foreach": "if exists(domain_content) then domain_content else
>>> 'n/a'",
>>>       "onlyif": "exists(domain_content) && source.type == 'squid'",
>>>       "update": { "o": "STATS_ADD(o, bytes)" },
>>>       "result": "o"
>>>     }
>>>
>>> Try restarting the profiler topology and if you could look at the storm
>>> logs and see if you see any issues show up in the logs for the profiler.
>>>
>>> On Sun, Mar 5, 2017 at 7:11 AM, shoggi <sh...@gmail.com> wrote:
>>>
>>>> Here is my config:
>>>>
>>>> # global config
>>>> {
>>>> "es.clustername": "metron",
>>>> "es.ip": "172.16.16.2",
>>>> "es.port": "9300",
>>>> "es.date.format": "yyyy.MM.dd.HH"
>>>> }
>>>>
>>>> # profiler config
>>>> {
>>>>   "profiles": [
>>>>     {
>>>>       "profile": "url-length",
>>>>       "foreach": "if exists(domain_without_subdomains) then
>>>> domain_without_subdomains else 'n/a'",
>>>>       "onlyif": "exists(domain_without_subdomains) && source.type ==
>>>> 'squid'",
>>>>       "update": { "m": "STATS_ADD(m, LENGTH(url))" },
>>>>       "result": "m"
>>>>     },
>>>>     {
>>>>       "profile": "url-bytes",
>>>>       "foreach": "if exists(domain_without_subdomains) then
>>>> domain_without_subdomains else 'n/a'",
>>>>       "onlyif": "exists(domain_without_subdomains) && source.type ==
>>>> 'squid'",
>>>>       "update": { "n": "STATS_ADD(m, bytes)" },
>>>>       "result": "n"
>>>>     },
>>>>     {
>>>>       "profile": "content-type",
>>>>       "foreach": "if exists(domain_content) then domain_content else
>>>> 'n/a'",
>>>>       "onlyif": "exists(domain_content) && source.type == 'squid'",
>>>>       "update": { "o": "STATS_ADD(m, bytes)" },
>>>>       "result": "o"
>>>>     }
>>>>   ]
>>>> }
>>>>
>>>> # profiler properties
>>>> ##### Storm #####
>>>>
>>>> profiler.workers=1
>>>> profiler.executors=0
>>>> profiler.input.topic=indexing
>>>> profiler.period.duration=15
>>>> profiler.period.duration.units=MINUTES
>>>> profiler.ttl=30
>>>> profiler.ttl.units=MINUTES
>>>> profiler.hbase.salt.divisor=1000
>>>> profiler.hbase.table=profiler
>>>> profiler.hbase.column.family=P
>>>> profiler.hbase.batch=10
>>>> profiler.hbase.flush.interval.seconds=30
>>>>
>>>> ##### Kafka #####
>>>>
>>>> kafka.zk=node1:2181
>>>> kafka.broker=node1:6667
>>>> kafka.start=WHERE_I_LEFT_OFF
>>>>
>>>> On Sun, Mar 5, 2017 at 2:37 AM, Casey Stella <ce...@gmail.com>
>>>> wrote:
>>>>
>>>>> Sorry you are having issues! :(. Sometimes this is due to a mismatch
>>>>> in the tick time in the profiler between write and read.
>>>>>
>>>>> What's in your global config (METRON_HOME/config/zookeeper/global.json),
>>>>> profiler config (METRON_HOME/config/zookeeper/profiler.json) and
>>>>> profiler topology properties (METRON_HOME/config/profiler.properties)?
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Mar 4, 2017 at 17:38 shoggi <sh...@gmail.com> wrote:
>>>>>
>>>>>> Hey all
>>>>>>
>>>>>> Very strange, I had a few profilers working and wanted to show
>>>>>> someone (left system alone for a few days) & now can't query data anymore.
>>>>>> I went so far to reboot the system, deleted the profiler table in hbase and
>>>>>> loaded new data.
>>>>>>
>>>>>> I see the data in base but stellar does not let me query it anymore.
>>>>>> The queries return empty as if data does not exist but it's definitely
>>>>>> there. The timeframe can not be an issue, tired to use a very wide stellar
>>>>>> query and as mentioned, loaded fresh data.
>>>>>>
>>>>>> Any troubleshooting hints? This bugs me, as I have not touched the
>>>>>> system & even restarted it to get rid of any possible stale connections.
>>>>>>
>>>>>> [Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"MINUTES")
>>>>>> []
>>>>>>
>>>>>> [Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"HOURS")
>>>>>> []
>>>>>>
>>>>>> Base data is there:
>>>>>>
>>>>>> \xFF\xFF\xFFkurl-bytesgoogle.com\x00\x00\x00\x00\x0 column=P:value,
>>>>>> timestamp=1488664729500, value=\x01\x00org.apache.metro
>>>>>> n.statistics.OnlineStatisticsProvide\xF2\x01\x00\x00\x00\x1C
>>>>>> \x00\x00\x00\x01@b\x
>>>>>>  1z\x96F
>>>>>> C0\x00\x00\x00\x00\x00\x00\x00\x00\x01@\x82H\x00\x00\x00\x00
>>>>>> \x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x01@\x82H\x
>>>>>> 00\x00\x00\x00\x00A\x14\xE3D\x0
>>>>>>                                                      0\x00\x00\x00@
>>>>>> \x19|\x87\xD0\xEA\xAA\xFB@\x82H\x00\x00\x00\x00\x00@
>>>>>> \x82H\x00\x00\x00\x00\x00@\x82H\x00\x00\x00\x00\x00\x
>>>>>> 00\x00\x00\x00\x00\x00\x00\x00\x
>>>>>>
>>>>>>  00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
>>>>>>
>>>>>> Thanks
>>>>>> shoggi
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Stellar is unable to query base data

Posted by Nick Allen <ni...@nickallen.org>.
What version of Metron are you using?  Based on what I am seeing in the
stack trace it seems to be a few versions ago.  Any chance you'd be willing
to try something newer like 0.3.1 RC5? It would be easier to help
troubleshoot that way.

On Sun, Mar 5, 2017 at 5:48 PM, shoggi <sh...@gmail.com> wrote:

> The quorum and kafka config was ok, the host is actually called node1
> (same system). The variables were set like that because I wanted to see if
> I can set it to another value. Anyway, changed everything back and did
> another of this:
>   - killed the topology
>   - created an empty profiler config
>   - restarted system
>   - added profiler configuration again (started with just one profile)
>   - data gets added to hbase, I get the error as shown previously, every
> couple of flush cycles
>   - still no luck querying hbase out from stellar or via the enrichment
> parser. No errors anywhere but the profiler NPE's
>
> you mentioned tick time.. is that something I can tune?
>
>
> 2017-03-05 23:25:06.583 o.a.m.p.b.ProfileBuilderBolt [INFO] Flushing
> profile: profile=url-length, entity=google.ch
> 2017-03-05 23:25:06.584 o.a.m.p.b.ProfileBuilderBolt [ERROR] Unexpected
> failure: message='null', tuple='source: __system:-1, stream: __tick, id:
> {}, [60]'
> java.lang.NullPointerException
> at org.apache.metron.profiler.stellar.DefaultStellarExecutor.execute(
> DefaultStellarExecutor.java:117) ~[stormjar.jar:?]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.executeResult(ProfileBuilderBolt.java:316)
> ~[stormjar.jar:?]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.lambda$flush$4(ProfileBuilderBolt.java:245)
> ~[stormjar.jar:?]
> at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:114)
> ~[?:1.8.0_77]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.flush(ProfileBuilderBolt.java:237)
> ~[stormjar.jar:?]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.
> doExecute(ProfileBuilderBolt.java:164) ~[stormjar.jar:?]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.
> execute(ProfileBuilderBolt.java:144) [stormjar.jar:?]
> at org.apache.storm.daemon.executor$fn__6571$tuple_action_fn__6573.invoke(executor.clj:734)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.daemon.executor$mk_task_receiver$fn__6492.invoke(executor.clj:469)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.disruptor$clojure_handler$reify__6005.onEvent(disruptor.clj:40)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.daemon.executor$fn__6571$fn__6584$fn__6637.invoke(executor.clj:853)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
> 2017-03-05 23:25:06.585 o.a.s.d.executor [ERROR]
> java.lang.NullPointerException
> at org.apache.metron.profiler.stellar.DefaultStellarExecutor.execute(
> DefaultStellarExecutor.java:117) ~[stormjar.jar:?]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.executeResult(ProfileBuilderBolt.java:316)
> ~[stormjar.jar:?]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.lambda$flush$4(ProfileBuilderBolt.java:245)
> ~[stormjar.jar:?]
> at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:114)
> ~[?:1.8.0_77]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.flush(ProfileBuilderBolt.java:237)
> ~[stormjar.jar:?]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.
> doExecute(ProfileBuilderBolt.java:164) ~[stormjar.jar:?]
> at org.apache.metron.profiler.bolt.ProfileBuilderBolt.
> execute(ProfileBuilderBolt.java:144) [stormjar.jar:?]
> at org.apache.storm.daemon.executor$fn__6571$tuple_action_fn__6573.invoke(executor.clj:734)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.daemon.executor$mk_task_receiver$fn__6492.invoke(executor.clj:469)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.disruptor$clojure_handler$reify__6005.onEvent(disruptor.clj:40)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.daemon.executor$fn__6571$fn__6584$fn__6637.invoke(executor.clj:853)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484)
> [storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
> at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
> 2017-03-05 23:25:08.628 o.a.s.k.ZkCoordinator [INFO] Task [1/1] Refreshing
> partition manager connections
>
> On Sun, Mar 5, 2017 at 6:24 PM, Casey Stella <ce...@gmail.com> wrote:
>
>> Ok, so a couple of things I see here that you might try:
>>
>>
>>    - You should set kafka.zk and kafka.broker in profiler.properties to
>>    your real zookeeper quorum and kafka broker respectively
>>
>> In your profiler.json, instead of:
>>  {
>>       "profile": "url-bytes",
>>       "foreach": "if exists(domain_without_subdomains) then
>> domain_without_subdomains else 'n/a'",
>>       "onlyif": "exists(domain_without_subdomains) && source.type ==
>> 'squid'",
>>       "update": { "n": "STATS_ADD(m, bytes)" },
>>       "result": "n"
>>     },
>> {
>>       "profile": "content-type",
>>       "foreach": "if exists(domain_content) then domain_content else
>> 'n/a'",
>>       "onlyif": "exists(domain_content) && source.type == 'squid'",
>>       "update": { "o": "STATS_ADD(m, bytes)" },
>>       "result": "o"
>>     }
>> You might want (note the change on the update statements)
>>  {
>>       "profile": "url-bytes",
>>       "foreach": "if exists(domain_without_subdomains) then
>> domain_without_subdomains else 'n/a'",
>>       "onlyif": "exists(domain_without_subdomains) && source.type ==
>> 'squid'",
>>       "update": { "n": "STATS_ADD(n, bytes)" },
>>       "result": "n"
>>     },
>> {
>>       "profile": "content-type",
>>       "foreach": "if exists(domain_content) then domain_content else
>> 'n/a'",
>>       "onlyif": "exists(domain_content) && source.type == 'squid'",
>>       "update": { "o": "STATS_ADD(o, bytes)" },
>>       "result": "o"
>>     }
>>
>> Try restarting the profiler topology and, if you could, look at the storm
>> logs to see whether any issues show up for the profiler.
>>
>> On Sun, Mar 5, 2017 at 7:11 AM, shoggi <sh...@gmail.com> wrote:
>>
>>> Here is my config:
>>>
>>> # global config
>>> {
>>> "es.clustername": "metron",
>>> "es.ip": "172.16.16.2",
>>> "es.port": "9300",
>>> "es.date.format": "yyyy.MM.dd.HH"
>>> }
>>>
>>> # profiler config
>>> {
>>>   "profiles": [
>>>     {
>>>       "profile": "url-length",
>>>       "foreach": "if exists(domain_without_subdomains) then
>>> domain_without_subdomains else 'n/a'",
>>>       "onlyif": "exists(domain_without_subdomains) && source.type ==
>>> 'squid'",
>>>       "update": { "m": "STATS_ADD(m, LENGTH(url))" },
>>>       "result": "m"
>>>     },
>>>     {
>>>       "profile": "url-bytes",
>>>       "foreach": "if exists(domain_without_subdomains) then
>>> domain_without_subdomains else 'n/a'",
>>>       "onlyif": "exists(domain_without_subdomains) && source.type ==
>>> 'squid'",
>>>       "update": { "n": "STATS_ADD(m, bytes)" },
>>>       "result": "n"
>>>     },
>>>     {
>>>       "profile": "content-type",
>>>       "foreach": "if exists(domain_content) then domain_content else
>>> 'n/a'",
>>>       "onlyif": "exists(domain_content) && source.type == 'squid'",
>>>       "update": { "o": "STATS_ADD(m, bytes)" },
>>>       "result": "o"
>>>     }
>>>   ]
>>> }
>>>
>>> # profiler properties
>>> ##### Storm #####
>>>
>>> profiler.workers=1
>>> profiler.executors=0
>>> profiler.input.topic=indexing
>>> profiler.period.duration=15
>>> profiler.period.duration.units=MINUTES
>>> profiler.ttl=30
>>> profiler.ttl.units=MINUTES
>>> profiler.hbase.salt.divisor=1000
>>> profiler.hbase.table=profiler
>>> profiler.hbase.column.family=P
>>> profiler.hbase.batch=10
>>> profiler.hbase.flush.interval.seconds=30
>>>
>>> ##### Kafka #####
>>>
>>> kafka.zk=node1:2181
>>> kafka.broker=node1:6667
>>> kafka.start=WHERE_I_LEFT_OFF
>>>
>>> On Sun, Mar 5, 2017 at 2:37 AM, Casey Stella <ce...@gmail.com> wrote:
>>>
>>>> Sorry you are having issues! :(. Sometimes this is due to a mismatch in
>>>> the tick time in the profiler between write and read.
>>>>
>>>> What's in your global config (METRON_HOME/config/zookeeper/global.json),
>>>> profiler config (METRON_HOME/config/zookeeper/profiler.json) and
>>>> profiler topology properties (METRON_HOME/config/profiler.properties)?
>>>>
>>>>
>>>>
>>>> On Sat, Mar 4, 2017 at 17:38 shoggi <sh...@gmail.com> wrote:
>>>>
>>>>> Hey all
>>>>>
>>>>> Very strange, I had a few profilers working and wanted to show someone
>>>>> (left system alone for a few days) & now can't query data anymore. I went
>>>>> so far to reboot the system, deleted the profiler table in hbase and loaded
>>>>> new data.
>>>>>
>>>>> I see the data in hbase but stellar does not let me query it anymore.
>>>>> The queries return empty as if the data does not exist, but it's definitely
>>>>> there. The timeframe cannot be an issue; I tried a very wide stellar
>>>>> query and, as mentioned, loaded fresh data.
>>>>>
>>>>> Any troubleshooting hints? This bugs me, as I have not touched the
>>>>> system & even restarted it to get rid of any possible stale connections.
>>>>>
>>>>> [Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"MINUTES")
>>>>> []
>>>>>
>>>>> [Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"HOURS")
>>>>> []
>>>>>
>>>>> Base data is there:
>>>>>
>>>>> \xFF\xFF\xFFkurl-bytesgoogle.com\x00\x00\x00\x00\x0 column=P:value,
>>>>> timestamp=1488664729500, value=\x01\x00org.apache.metro
>>>>> n.statistics.OnlineStatisticsProvide\xF2\x01\x00\x00\x00\x1C
>>>>> \x00\x00\x00\x01@b\x
>>>>>  1z\x96F
>>>>> C0\x00\x00\x00\x00\x00\x00\x00\x00\x01@\x82H\x00\x00\x00\x00
>>>>> \x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x01@\x82H\x
>>>>> 00\x00\x00\x00\x00A\x14\xE3D\x0
>>>>>                                                      0\x00\x00\x00@
>>>>> \x19|\x87\xD0\xEA\xAA\xFB@\x82H\x00\x00\x00\x00\x00@
>>>>> \x82H\x00\x00\x00\x00\x00@\x82H\x00\x00\x00\x00\x00\x
>>>>> 00\x00\x00\x00\x00\x00\x00\x00\x
>>>>>
>>>>>  00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
>>>>>
>>>>> Thanks
>>>>> shoggi
>>>>>
>>>>
>>>
>>
>

Re: Stellar is unable to query base data

Posted by shoggi <sh...@gmail.com>.
The quorum and kafka config was ok, the host is actually called node1 (same
system). The variables were set like that because I wanted to see if I could
set them to another value. Anyway, I changed everything back and did another
round of this:
  - killed the topology
  - created an empty profiler config
  - restarted system
  - added profiler configuration again (started with just one profile)
  - data gets added to hbase, I get the error as shown previously, every
couple of flush cycles
  - still no luck querying hbase from stellar or via the enrichment
parser. No errors anywhere except the profiler NPEs

You mentioned tick time... is that something I can tune?


2017-03-05 23:25:06.583 o.a.m.p.b.ProfileBuilderBolt [INFO] Flushing
profile: profile=url-length, entity=google.ch
2017-03-05 23:25:06.584 o.a.m.p.b.ProfileBuilderBolt [ERROR] Unexpected
failure: message='null', tuple='source: __system:-1, stream: __tick, id:
{}, [60]'
java.lang.NullPointerException
at
org.apache.metron.profiler.stellar.DefaultStellarExecutor.execute(DefaultStellarExecutor.java:117)
~[stormjar.jar:?]
at
org.apache.metron.profiler.bolt.ProfileBuilderBolt.executeResult(ProfileBuilderBolt.java:316)
~[stormjar.jar:?]
at
org.apache.metron.profiler.bolt.ProfileBuilderBolt.lambda$flush$4(ProfileBuilderBolt.java:245)
~[stormjar.jar:?]
at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:114)
~[?:1.8.0_77]
at
org.apache.metron.profiler.bolt.ProfileBuilderBolt.flush(ProfileBuilderBolt.java:237)
~[stormjar.jar:?]
at
org.apache.metron.profiler.bolt.ProfileBuilderBolt.doExecute(ProfileBuilderBolt.java:164)
~[stormjar.jar:?]
at
org.apache.metron.profiler.bolt.ProfileBuilderBolt.execute(ProfileBuilderBolt.java:144)
[stormjar.jar:?]
at
org.apache.storm.daemon.executor$fn__6571$tuple_action_fn__6573.invoke(executor.clj:734)
[storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
at
org.apache.storm.daemon.executor$mk_task_receiver$fn__6492.invoke(executor.clj:469)
[storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
at
org.apache.storm.disruptor$clojure_handler$reify__6005.onEvent(disruptor.clj:40)
[storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
at
org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451)
[storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
at
org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430)
[storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
at
org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
[storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
at
org.apache.storm.daemon.executor$fn__6571$fn__6584$fn__6637.invoke(executor.clj:853)
[storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484)
[storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
2017-03-05 23:25:06.585 o.a.s.d.executor [ERROR]
java.lang.NullPointerException
at
org.apache.metron.profiler.stellar.DefaultStellarExecutor.execute(DefaultStellarExecutor.java:117)
~[stormjar.jar:?]
at
org.apache.metron.profiler.bolt.ProfileBuilderBolt.executeResult(ProfileBuilderBolt.java:316)
~[stormjar.jar:?]
at
org.apache.metron.profiler.bolt.ProfileBuilderBolt.lambda$flush$4(ProfileBuilderBolt.java:245)
~[stormjar.jar:?]
at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:114)
~[?:1.8.0_77]
at
org.apache.metron.profiler.bolt.ProfileBuilderBolt.flush(ProfileBuilderBolt.java:237)
~[stormjar.jar:?]
at
org.apache.metron.profiler.bolt.ProfileBuilderBolt.doExecute(ProfileBuilderBolt.java:164)
~[stormjar.jar:?]
at
org.apache.metron.profiler.bolt.ProfileBuilderBolt.execute(ProfileBuilderBolt.java:144)
[stormjar.jar:?]
at
org.apache.storm.daemon.executor$fn__6571$tuple_action_fn__6573.invoke(executor.clj:734)
[storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
at
org.apache.storm.daemon.executor$mk_task_receiver$fn__6492.invoke(executor.clj:469)
[storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
at
org.apache.storm.disruptor$clojure_handler$reify__6005.onEvent(disruptor.clj:40)
[storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
at
org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451)
[storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
at
org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430)
[storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
at
org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
[storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
at
org.apache.storm.daemon.executor$fn__6571$fn__6584$fn__6637.invoke(executor.clj:853)
[storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484)
[storm-core-1.0.1.2.5.0.0-1245.jar:1.0.1.2.5.0.0-1245]
at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
2017-03-05 23:25:08.628 o.a.s.k.ZkCoordinator [INFO] Task [1/1] Refreshing
partition manager connections

On Sun, Mar 5, 2017 at 6:24 PM, Casey Stella <ce...@gmail.com> wrote:

> Ok, so a couple of things I see here that you might try:
>
>
>    - You should set kafka.zk and kafka.broker in profiler.properties to
>    your real zookeeper quorum and kafka broker respectively
>
> In your profiler.json, instead of:
>  {
>       "profile": "url-bytes",
>       "foreach": "if exists(domain_without_subdomains) then
> domain_without_subdomains else 'n/a'",
>       "onlyif": "exists(domain_without_subdomains) && source.type ==
> 'squid'",
>       "update": { "n": "STATS_ADD(m, bytes)" },
>       "result": "n"
>     },
> {
>       "profile": "content-type",
>       "foreach": "if exists(domain_content) then domain_content else
> 'n/a'",
>       "onlyif": "exists(domain_content) && source.type == 'squid'",
>       "update": { "o": "STATS_ADD(m, bytes)" },
>       "result": "o"
>     }
> You might want (note the change on the update statements)
>  {
>       "profile": "url-bytes",
>       "foreach": "if exists(domain_without_subdomains) then
> domain_without_subdomains else 'n/a'",
>       "onlyif": "exists(domain_without_subdomains) && source.type ==
> 'squid'",
>       "update": { "n": "STATS_ADD(n, bytes)" },
>       "result": "n"
>     },
> {
>       "profile": "content-type",
>       "foreach": "if exists(domain_content) then domain_content else
> 'n/a'",
>       "onlyif": "exists(domain_content) && source.type == 'squid'",
>       "update": { "o": "STATS_ADD(o, bytes)" },
>       "result": "o"
>     }
>
> Try restarting the profiler topology and, if you could, look at the storm
> logs to see whether any issues show up for the profiler.
>
> On Sun, Mar 5, 2017 at 7:11 AM, shoggi <sh...@gmail.com> wrote:
>
>> Here is my config:
>>
>> # global config
>> {
>> "es.clustername": "metron",
>> "es.ip": "172.16.16.2",
>> "es.port": "9300",
>> "es.date.format": "yyyy.MM.dd.HH"
>> }
>>
>> # profiler config
>> {
>>   "profiles": [
>>     {
>>       "profile": "url-length",
>>       "foreach": "if exists(domain_without_subdomains) then
>> domain_without_subdomains else 'n/a'",
>>       "onlyif": "exists(domain_without_subdomains) && source.type ==
>> 'squid'",
>>       "update": { "m": "STATS_ADD(m, LENGTH(url))" },
>>       "result": "m"
>>     },
>>     {
>>       "profile": "url-bytes",
>>       "foreach": "if exists(domain_without_subdomains) then
>> domain_without_subdomains else 'n/a'",
>>       "onlyif": "exists(domain_without_subdomains) && source.type ==
>> 'squid'",
>>       "update": { "n": "STATS_ADD(m, bytes)" },
>>       "result": "n"
>>     },
>>     {
>>       "profile": "content-type",
>>       "foreach": "if exists(domain_content) then domain_content else
>> 'n/a'",
>>       "onlyif": "exists(domain_content) && source.type == 'squid'",
>>       "update": { "o": "STATS_ADD(m, bytes)" },
>>       "result": "o"
>>     }
>>   ]
>> }
>>
>> # profiler properties
>> ##### Storm #####
>>
>> profiler.workers=1
>> profiler.executors=0
>> profiler.input.topic=indexing
>> profiler.period.duration=15
>> profiler.period.duration.units=MINUTES
>> profiler.ttl=30
>> profiler.ttl.units=MINUTES
>> profiler.hbase.salt.divisor=1000
>> profiler.hbase.table=profiler
>> profiler.hbase.column.family=P
>> profiler.hbase.batch=10
>> profiler.hbase.flush.interval.seconds=30
>>
>> ##### Kafka #####
>>
>> kafka.zk=node1:2181
>> kafka.broker=node1:6667
>> kafka.start=WHERE_I_LEFT_OFF
>>
>> On Sun, Mar 5, 2017 at 2:37 AM, Casey Stella <ce...@gmail.com> wrote:
>>
>>> Sorry you are having issues! :(. Sometimes this is due to a mismatch in
>>> the tick time in the profiler between write and read.
>>>
>>> What's in your global config (METRON_HOME/config/zookeeper/global.json),
>>> profiler config (METRON_HOME/config/zookeeper/profiler.json) and
>>> profiler topology properties (METRON_HOME/config/profiler.properties)?
>>>
>>>
>>>
>>> On Sat, Mar 4, 2017 at 17:38 shoggi <sh...@gmail.com> wrote:
>>>
>>>> Hey all
>>>>
>>>> Very strange, I had a few profilers working and wanted to show someone
>>>> (left system alone for a few days) & now can't query data anymore. I went
>>>> so far to reboot the system, deleted the profiler table in hbase and loaded
>>>> new data.
>>>>
>>>> I see the data in hbase but stellar does not let me query it anymore.
>>>> The queries return empty as if the data does not exist, but it's definitely
>>>> there. The timeframe cannot be an issue; I tried a very wide stellar
>>>> query and, as mentioned, loaded fresh data.
>>>>
>>>> Any troubleshooting hints? This bugs me, as I have not touched the
>>>> system & even restarted it to get rid of any possible stale connections.
>>>>
>>>> [Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"MINUTES")
>>>> []
>>>>
>>>> [Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"HOURS")
>>>> []
>>>>
>>>> Base data is there:
>>>>
>>>> \xFF\xFF\xFFkurl-bytesgoogle.com\x00\x00\x00\x00\x0 column=P:value,
>>>> timestamp=1488664729500, value=\x01\x00org.apache.metro
>>>> n.statistics.OnlineStatisticsProvide\xF2\x01\x00\x00\x00\x1C
>>>> \x00\x00\x00\x01@b\x
>>>>  1z\x96F
>>>> C0\x00\x00\x00\x00\x00\x00\x00\x00\x01@\x82H\x00\x00\x00\x00
>>>> \x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x01@\x82H\x
>>>> 00\x00\x00\x00\x00A\x14\xE3D\x0
>>>>                                                      0\x00\x00\x00@
>>>> \x19|\x87\xD0\xEA\xAA\xFB@\x82H\x00\x00\x00\x00\x00@
>>>> \x82H\x00\x00\x00\x00\x00@\x82H\x00\x00\x00\x00\x00\x
>>>> 00\x00\x00\x00\x00\x00\x00\x00\x
>>>>
>>>>  00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
>>>>
>>>> Thanks
>>>> shoggi
>>>>
>>>
>>
>

Re: Stellar is unable to query base data

Posted by Casey Stella <ce...@gmail.com>.
Ok, so a couple of things I see here that you might try:


   - You should set kafka.zk and kafka.broker in profiler.properties to
   your real zookeeper quorum and kafka broker respectively

In your profiler.json, instead of:
 {
      "profile": "url-bytes",
      "foreach": "if exists(domain_without_subdomains) then
domain_without_subdomains else 'n/a'",
      "onlyif": "exists(domain_without_subdomains) && source.type ==
'squid'",
      "update": { "n": "STATS_ADD(m, bytes)" },
      "result": "n"
    },
{
      "profile": "content-type",
      "foreach": "if exists(domain_content) then domain_content else 'n/a'",
      "onlyif": "exists(domain_content) && source.type == 'squid'",
      "update": { "o": "STATS_ADD(m, bytes)" },
      "result": "o"
    }
You might want (note the change on the update statements)
 {
      "profile": "url-bytes",
      "foreach": "if exists(domain_without_subdomains) then
domain_without_subdomains else 'n/a'",
      "onlyif": "exists(domain_without_subdomains) && source.type ==
'squid'",
      "update": { "n": "STATS_ADD(n, bytes)" },
      "result": "n"
    },
{
      "profile": "content-type",
      "foreach": "if exists(domain_content) then domain_content else 'n/a'",
      "onlyif": "exists(domain_content) && source.type == 'squid'",
      "update": { "o": "STATS_ADD(o, bytes)" },
      "result": "o"
    }

Try restarting the profiler topology and, if you could, look at the storm
logs to see whether any issues show up for the profiler.
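
To see why the variable matters, here is a toy sketch (plain Python, not
Metron code): the profiler keeps a map of named accumulators per entity, and
STATS_ADD reads its first argument from that map. If the update expression
reads "m" but only ever assigns "n", every message starts from an empty
accumulator and nothing actually aggregates:

```python
class OnlineStats:
    """Toy stand-in for Metron's OnlineStatisticsProvider (illustration only)."""
    def __init__(self):
        self.count = 0
        self.total = 0.0

    def add(self, value):
        self.count += 1
        self.total += value
        return self


def apply_update(state, target, source, value):
    """Model of one profiler update step: target = STATS_ADD(source, value)."""
    acc = state.get(source) or OnlineStats()  # unknown variable -> fresh stats
    state[target] = acc.add(value)
    return state


# Buggy config: "n": "STATS_ADD(m, bytes)" -- "m" is never assigned,
# so "n" only ever reflects the last message.
buggy = {}
for b in [100, 200, 300]:
    apply_update(buggy, "n", "m", b)
print(buggy["n"].count)  # 1, not 3

# Fixed config: "n": "STATS_ADD(n, bytes)" -- the accumulator carries over.
fixed = {}
for b in [100, 200, 300]:
    apply_update(fixed, "n", "n", b)
print(fixed["n"].count)  # 3
```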

On Sun, Mar 5, 2017 at 7:11 AM, shoggi <sh...@gmail.com> wrote:

> Here is my config:
>
> # global config
> {
> "es.clustername": "metron",
> "es.ip": "172.16.16.2",
> "es.port": "9300",
> "es.date.format": "yyyy.MM.dd.HH"
> }
>
> # profiler config
> {
>   "profiles": [
>     {
>       "profile": "url-length",
>       "foreach": "if exists(domain_without_subdomains) then
> domain_without_subdomains else 'n/a'",
>       "onlyif": "exists(domain_without_subdomains) && source.type ==
> 'squid'",
>       "update": { "m": "STATS_ADD(m, LENGTH(url))" },
>       "result": "m"
>     },
>     {
>       "profile": "url-bytes",
>       "foreach": "if exists(domain_without_subdomains) then
> domain_without_subdomains else 'n/a'",
>       "onlyif": "exists(domain_without_subdomains) && source.type ==
> 'squid'",
>       "update": { "n": "STATS_ADD(m, bytes)" },
>       "result": "n"
>     },
>     {
>       "profile": "content-type",
>       "foreach": "if exists(domain_content) then domain_content else
> 'n/a'",
>       "onlyif": "exists(domain_content) && source.type == 'squid'",
>       "update": { "o": "STATS_ADD(m, bytes)" },
>       "result": "o"
>     }
>   ]
> }
>
> # profiler properties
> ##### Storm #####
>
> profiler.workers=1
> profiler.executors=0
> profiler.input.topic=indexing
> profiler.period.duration=15
> profiler.period.duration.units=MINUTES
> profiler.ttl=30
> profiler.ttl.units=MINUTES
> profiler.hbase.salt.divisor=1000
> profiler.hbase.table=profiler
> profiler.hbase.column.family=P
> profiler.hbase.batch=10
> profiler.hbase.flush.interval.seconds=30
>
> ##### Kafka #####
>
> kafka.zk=node1:2181
> kafka.broker=node1:6667
> kafka.start=WHERE_I_LEFT_OFF
>
> On Sun, Mar 5, 2017 at 2:37 AM, Casey Stella <ce...@gmail.com> wrote:
>
>> Sorry you are having issues! :(. Sometimes this is due to a mismatch in
>> the tick time in the profiler between write and read.
>>
>> What's in your global config (METRON_HOME/config/zookeeper/global.json),
>> profiler config (METRON_HOME/config/zookeeper/profiler.json) and
>> profiler topology properties (METRON_HOME/config/profiler.properties)?
>>
>>
>>
>> On Sat, Mar 4, 2017 at 17:38 shoggi <sh...@gmail.com> wrote:
>>
>>> Hey all
>>>
>>> Very strange, I had a few profilers working and wanted to show someone
>>> (left system alone for a few days) & now can't query data anymore. I went
>>> so far to reboot the system, deleted the profiler table in hbase and loaded
>>> new data.
>>>
>>> I see the data in hbase but stellar does not let me query it anymore. The
>>> queries return empty as if the data does not exist, but it's definitely there.
>>> The timeframe cannot be an issue; I tried a very wide stellar query
>>> and, as mentioned, loaded fresh data.
>>>
>>> Any troubleshooting hints? This bugs me, as I have not touched the
>>> system & even restarted it to get rid of any possible stale connections.
>>>
>>> [Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"MINUTES")
>>> []
>>>
>>> [Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"HOURS")
>>> []
>>>
>>> Base data is there:
>>>
>>> \xFF\xFF\xFFkurl-bytesgoogle.com\x00\x00\x00\x00\x0 column=P:value,
>>> timestamp=1488664729500, value=\x01\x00org.apache.metro
>>> n.statistics.OnlineStatisticsProvide\xF2\x01\x00\x00\x00\
>>> x1C\x00\x00\x00\x01@b\x
>>>  1z\x96F
>>> C0\x00\x00\x00\x00\x00\x00\x00\x00\x01@\x82H\x00\x00\x00\x00
>>> \x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x01@\x82H\
>>> x00\x00\x00\x00\x00A\x14\xE3D\x0
>>>                                                      0\x00\x00\x00@
>>> \x19|\x87\xD0\xEA\xAA\xFB@\x82H\x00\x00\x00\x00\x00@
>>> \x82H\x00\x00\x00\x00\x00@\x82H\x00\x00\x00\x00\x00\x
>>> 00\x00\x00\x00\x00\x00\x00\x00\x
>>>
>>>  00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
>>>
>>> Thanks
>>> shoggi
>>>
>>
>

Re: Stellar is unable to query base data

Posted by shoggi <sh...@gmail.com>.
Here is my config:

# global config
{
"es.clustername": "metron",
"es.ip": "172.16.16.2",
"es.port": "9300",
"es.date.format": "yyyy.MM.dd.HH"
}

# profiler config
{
  "profiles": [
    {
      "profile": "url-length",
      "foreach": "if exists(domain_without_subdomains) then
domain_without_subdomains else 'n/a'",
      "onlyif": "exists(domain_without_subdomains) && source.type ==
'squid'",
      "update": { "m": "STATS_ADD(m, LENGTH(url))" },
      "result": "m"
    },
    {
      "profile": "url-bytes",
      "foreach": "if exists(domain_without_subdomains) then
domain_without_subdomains else 'n/a'",
      "onlyif": "exists(domain_without_subdomains) && source.type ==
'squid'",
      "update": { "n": "STATS_ADD(m, bytes)" },
      "result": "n"
    },
    {
      "profile": "content-type",
      "foreach": "if exists(domain_content) then domain_content else 'n/a'",
      "onlyif": "exists(domain_content) && source.type == 'squid'",
      "update": { "o": "STATS_ADD(m, bytes)" },
      "result": "o"
    }
  ]
}

# profiler properties
##### Storm #####

profiler.workers=1
profiler.executors=0
profiler.input.topic=indexing
profiler.period.duration=15
profiler.period.duration.units=MINUTES
profiler.ttl=30
profiler.ttl.units=MINUTES
profiler.hbase.salt.divisor=1000
profiler.hbase.table=profiler
profiler.hbase.column.family=P
profiler.hbase.batch=10
profiler.hbase.flush.interval.seconds=30

##### Kafka #####

kafka.zk=node1:2181
kafka.broker=node1:6667
kafka.start=WHERE_I_LEFT_OFF

On Sun, Mar 5, 2017 at 2:37 AM, Casey Stella <ce...@gmail.com> wrote:

> Sorry you are having issues! :(. Sometimes this is due to a mismatch in
> the tick time in the profiler between write and read.
>
> What's in your global config (METRON_HOME/config/zookeeper/global.json),
> profiler config (METRON_HOME/config/zookeeper/profiler.json) and profiler
> topology properties (METRON_HOME/config/profiler.properties)?
>
>
>
> On Sat, Mar 4, 2017 at 17:38 shoggi <sh...@gmail.com> wrote:
>
>> Hey all
>>
>> Very strange, I had a few profilers working and wanted to show someone
>> (left system alone for a few days) & now can't query data anymore. I went
>> so far to reboot the system, deleted the profiler table in hbase and loaded
>> new data.
>>
>> I see the data in hbase but stellar does not let me query it anymore. The
>> queries return empty as if the data does not exist, but it's definitely there.
>> The timeframe cannot be an issue; I tried a very wide stellar query
>> and, as mentioned, loaded fresh data.
>>
>> Any troubleshooting hints? This bugs me, as I have not touched the system
>> & even restarted it to get rid of any possible stale connections.
>>
>> [Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"MINUTES")
>> []
>>
>> [Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"HOURS")
>> []
>>
>> Base data is there:
>>
>> \xFF\xFF\xFFkurl-bytesgoogle.com\x00\x00\x00\x00\x0 column=P:value,
>> timestamp=1488664729500, value=\x01\x00org.apache.metron.statistics.
>> OnlineStatisticsProvide\xF2\x01\x00\x00\x00\x1C\x00\x00\x00\x01@b\x
>>  1z\x96F
>> C0\x00\x00\x00\x00\x00\x00\x00\x00\x01@\x82H\x00\x00\x00\
>> x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x01@\
>> x82H\x00\x00\x00\x00\x00A\x14\xE3D\x0
>>                                                      0\x00\x00\x00@
>> \x19|\x87\xD0\xEA\xAA\xFB@\x82H\x00\x00\x00\x00\x00@
>> \x82H\x00\x00\x00\x00\x00@\x82H\x00\x00\x00\x00\x00\
>> x00\x00\x00\x00\x00\x00\x00\x00\x
>>
>>  00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
>>
>> Thanks
>> shoggi
>>
>

Re: Stellar is unable to query base data

Posted by Casey Stella <ce...@gmail.com>.
Sorry you are having issues! :(. Sometimes this is due to a mismatch in the
tick time in the profiler between write and read.

What's in your global config (METRON_HOME/config/zookeeper/global.json),
profiler config (METRON_HOME/config/zookeeper/profiler.json) and profiler
topology properties (METRON_HOME/config/profiler.properties)?
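
For reference, the "tick time" has to agree on both sides. The write side is
profiler.period.duration in profiler.properties; the read side (used by
PROFILE_GET) is configured in global.json. A sketch of settings that agree
(the profiler.client.* property names are as I recall them from the profiler
client docs; please verify against your version):

```
# profiler.properties (write side)
profiler.period.duration=15
profiler.period.duration.units=MINUTES

# global.json (read side, used by PROFILE_GET) must match:
#   "profiler.client.period.duration": "15",
#   "profiler.client.period.duration.units": "MINUTES"
```

If the client assumes a different period than the one the data was written
with, it computes different row keys and the query comes back empty even
though the data is in hbase.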


On Sat, Mar 4, 2017 at 17:38 shoggi <sh...@gmail.com> wrote:

> Hey all
>
> Very strange, I had a few profilers working and wanted to show someone
> (left system alone for a few days) & now can't query data anymore. I went
> so far to reboot the system, deleted the profiler table in hbase and loaded
> new data.
>
> I see the data in hbase but stellar does not let me query it anymore. The
> queries return empty as if the data does not exist, but it's definitely there.
> The timeframe cannot be an issue; I tried a very wide stellar query
> and, as mentioned, loaded fresh data.
>
> Any troubleshooting hints? This bugs me, as I have not touched the system
> & even restarted it to get rid of any possible stale connections.
>
> [Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"MINUTES")
> []
>
> [Stellar]>>> PROFILE_GET( "url-bytes","google.com",60,"HOURS")
> []
>
> Base data is there:
>
> \xFF\xFF\xFFkurl-bytesgoogle.com\x00\x00\x00\x00\x0 column=P:value,
> timestamp=1488664729500,
> value=\x01\x00org.apache.metron.statistics.OnlineStatisticsProvide\xF2\x01\x00\x00\x00\x1C\x00\x00\x00\x01@b
> \x
>  1z\x96F
> C0\x00\x00\x00\x00\x00\x00\x00\x00\x01@
> \x82H\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x01@
> \x82H\x00\x00\x00\x00\x00A\x14\xE3D\x0
>                                                      0\x00\x00\x00@
> \x19|\x87\xD0\xEA\xAA\xFB@\x82H\x00\x00\x00\x00\x00@
> \x82H\x00\x00\x00\x00\x00@
> \x82H\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x
>
>  00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
>
> Thanks
> shoggi
>