Posted to users@zeppelin.apache.org by ashish rawat <dc...@gmail.com> on 2016/04/19 18:53:29 UTC
Elastic Search Interpreter limitation?
Hi,
I am trying to use the filters aggregation of Elasticsearch:
https://www.elastic.co/guide/en/elasticsearch/reference/2.2/search-aggregations-bucket-filters-aggregation.html
As documented on the Elastic page, I made the following query through
Zeppelin:
{
  "aggs" : {
    "messages" : {
      "filters" : {
        "filters" : {
          "error" : { "term" : { "logLevel" : "error" }},
          "trace" : { "term" : { "logLevel" : "trace" }}
        }
      },
      "aggs" : {
        "messages_over_time" : {
          "date_histogram" : {
            "field" : "timestamp",
            "interval" : "day",
            "format" : "yyyy-MM-dd"
          }
        }
      }
    }
  }
}
but the response only contained the fields 'key' and 'doc_count', whereas
if I run the same query through Elasticsearch's REST interface, I get the
following result:
"aggregations": {
  "messages": {
    "buckets": {
      "error": {
        "doc_count": 57,
        "messages_over_time": {
          "buckets": [
            { "key_as_string": "2016-03-21", "key": 1458518400000, "doc_count": 1 },
            { "key_as_string": "2016-03-22", "key": 1458604800000, "doc_count": 0 },
            { "key_as_string": "2016-03-23", "key": 1458691200000, "doc_count": 0 },
            { "key_as_string": "2016-03-24", "key": 1458777600000, "doc_count": 0 },
            { "key_as_string": "2016-03-25", "key": 1458864000000, "doc_count": 0 },
            { "key_as_string": "2016-03-26", "key": 1458950400000, "doc_count": 0 },
            { "key_as_string": "2016-03-27", "key": 1459036800000, "doc_count": 0 },
            { "key_as_string": "2016-03-28", "key": 1459123200000, "doc_count": 0 },
            { "key_as_string": "2016-03-29", "key": 1459209600000, "doc_count": 0 },
            { "key_as_string": "2016-03-30", "key": 1459296000000, "doc_count": 0 },
            { "key_as_string": "2016-03-31", "key": 1459382400000, "doc_count": 0 },
            { "key_as_string": "2016-04-01", "key": 1459468800000, "doc_count": 8 },
            { "key_as_string": "2016-04-02", "key": 1459555200000, "doc_count": 0 },
            { "key_as_string": "2016-04-03", "key": 1459641600000, "doc_count": 0 },
            { "key_as_string": "2016-04-04", "key": 1459728000000, "doc_count": 48 }
          ]
        }
      },
      "trace": {
        "doc_count": 372,
        "messages_over_time": {
          "buckets": [
            { "key_as_string": "2016-04-04", "key": 1459728000000, "doc_count": 372 }
          ]
        }
      }
    }
  }
}
As expected, it has the time series of the 'error' and 'trace' messages.
Is there any limitation in the Elasticsearch interpreter that does not
allow parsing of complex responses?
Regards,
Ashish
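[Editor's note] To make the gap concrete: rendering this response as a table requires recursively flattening the nested buckets (named 'filters' buckets containing 'date_histogram' buckets) into rows, which is what the interpreter apparently does not do for nested aggregations. The sketch below illustrates that flattening in Python on a trimmed copy of the response above; the actual interpreter is written in Java, and `flatten_buckets` is a hypothetical helper, not code from Zeppelin.

```python
def flatten_buckets(agg, path=()):
    """Recursively walk an Elasticsearch aggregation result and yield
    one flat row per leaf bucket, keyed by the path of bucket keys."""
    buckets = agg.get("buckets")
    if buckets is None:
        return
    # A 'filters' aggregation returns a dict of named buckets;
    # a 'date_histogram' returns a list of keyed buckets.
    items = buckets.items() if isinstance(buckets, dict) else (
        (b.get("key_as_string", b.get("key")), b) for b in buckets)
    for key, bucket in items:
        sub_aggs = {k: v for k, v in bucket.items()
                    if isinstance(v, dict) and "buckets" in v}
        if sub_aggs:
            for sub in sub_aggs.values():
                yield from flatten_buckets(sub, path + (key,))
        else:
            yield {"path": path + (key,), "doc_count": bucket["doc_count"]}

# Trimmed copy of the response shown in the thread.
response = {
    "aggregations": {
        "messages": {
            "buckets": {
                "error": {
                    "doc_count": 57,
                    "messages_over_time": {
                        "buckets": [
                            {"key_as_string": "2016-03-21", "key": 1458518400000, "doc_count": 1},
                            {"key_as_string": "2016-04-04", "key": 1459728000000, "doc_count": 48},
                        ]
                    },
                },
                "trace": {
                    "doc_count": 372,
                    "messages_over_time": {
                        "buckets": [
                            {"key_as_string": "2016-04-04", "key": 1459728000000, "doc_count": 372},
                        ]
                    },
                },
            }
        }
    }
}

rows = list(flatten_buckets(response["aggregations"]["messages"]))
for row in rows:
    print(row)
```

Each emitted row carries the full bucket path (e.g. `("error", "2016-03-21")`), which is the extra context the flat 'key'/'doc_count' output loses.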
Re: Elastic Search Interpreter limitation?
Posted by ashish rawat <dc...@gmail.com>.
Hi Bruno,
I believe I have found the issue. There is indeed a dependency on the
_source field, at line 461:
final String json = hit.getSourceAsString();
Regards,
Ashish
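[Editor's note] Since `getSourceAsString()` returns null when a hit carries no `_source` section, a fix would need to fall back to the hit's `fields` section. The sketch below shows that fallback logic in Python for illustration only; the real fix would live in the Java interpreter, and `hit_to_json` is a hypothetical helper.

```python
import json

def hit_to_json(hit):
    """Return a JSON document for a search hit, falling back to the
    'fields' section when '_source' is absent (queries using the
    'fields' option return values there instead of '_source')."""
    if "_source" in hit:
        return json.dumps(hit["_source"])
    if "fields" in hit:
        # 'fields' values come back as arrays; unwrap single values.
        unwrapped = {k: v[0] if isinstance(v, list) and len(v) == 1 else v
                     for k, v in hit["fields"].items()}
        return json.dumps(unwrapped)
    return "{}"

source_hit = {"_index": "logs", "_source": {"log": "module started", "logLevel": "info"}}
fields_hit = {"_index": "logs", "fields": {"logLevel": ["error"]}}
print(hit_to_json(source_hit))
print(hit_to_json(fields_hit))
```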
On Wed, Apr 20, 2016 at 12:05 AM, ashish rawat <dc...@gmail.com> wrote:
> Hi Bruno,
>
> I am encountering another issue, which might also be related to the
> interpreter.
>
> When using the "fields" attribute in the query to select the exact fields
> to return, I get an "Error: String is null" through Zeppelin, while the
> same query works through the REST interface.
>
> I noticed that a normal query, of the form
>
> {
> "query": {"regexp":{"log":"module"}}
> }
>
> returns results in the following format:
> "hits": {....
> "hits": [
> {....
> "_source": {
>
>
> while a query with "fields" returns results in the format:
> "hits": {....
> "hits": [
> {....
> "fields": {
>
> Could this be the issue? I had a quick scan over the
> ElasticsearchInterpreter.buildSearchHitsResponseMessage, but couldn't find
> any dependency on "_source" to validate my assumption.
>
> Do you think this could be an interpreter issue?
>
> Regards,
> Ashish
>
> On Tue, Apr 19, 2016 at 10:54 PM, ashish rawat <dc...@gmail.com>
> wrote:
>
>> Thanks Bruno for the prompt reply. Do you know of any indirect way of
>> achieving the same, i.e. time series of all values of a field (e.g. logLevel,
>> httpMethod)?
>>
>> Regards,
>> Ashish
>>
>> On Tue, Apr 19, 2016 at 10:38 PM, Bruno Bonnin <bb...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> You are right, there are some limitations with the Elasticsearch
>>> interpreter.
>>> I have developed it and I am going to check how I can change the
>>> component to take into account this kind of more complex request.
>>>
>>> Regards,
>>> Bruno
>>>
>>> 2016-04-19 18:53 GMT+02:00 ashish rawat <dc...@gmail.com>:
>>>
>>>> [original question trimmed; see the first message in this thread]
>>>
>>
>
Re: Elastic Search Interpreter limitation?
Posted by ashish rawat <dc...@gmail.com>.
Hi Bruno,
I am encountering another issue, which might also be related to the
interpreter.
When using the "fields" attribute in the query to select the exact fields
to return, I get an "Error: String is null" through Zeppelin, while the
same query works through the REST interface.
I noticed that a normal query, of the form
{
"query": {"regexp":{"log":"module"}}
}
returns results in the following format:
"hits": {....
"hits": [
{....
"_source": {
while a query with "fields" returns results in the format:
"hits": {....
"hits": [
{....
"fields": {
Could this be the issue? I had a quick scan over the
ElasticsearchInterpreter.buildSearchHitsResponseMessage, but couldn't find
any dependency on "_source" to validate my assumption.
Do you think this could be an interpreter issue?
Regards,
Ashish
On Tue, Apr 19, 2016 at 10:54 PM, ashish rawat <dc...@gmail.com> wrote:
> Thanks, Bruno, for the prompt reply. Do you know of any indirect way of
> achieving the same, i.e. time series of all values of a field (e.g. logLevel,
> httpMethod)?
>
> Regards,
> Ashish
>
> On Tue, Apr 19, 2016 at 10:38 PM, Bruno Bonnin <bb...@gmail.com> wrote:
>
>> Hello,
>>
>> You are right, there are some limitations with the Elasticsearch
>> interpreter.
>> I have developed it and I am going to check how I can change the
>> component to take into account this kind of more complex request.
>>
>> Regards,
>> Bruno
>>
>> 2016-04-19 18:53 GMT+02:00 ashish rawat <dc...@gmail.com>:
>>
>>> [original question trimmed; see the first message in this thread]
>>
>
Re: Elastic Search Interpreter limitation?
Posted by ashish rawat <dc...@gmail.com>.
Thanks, Bruno, for the prompt reply. Do you know of any indirect way of
achieving the same, i.e. time series of all values of a field (e.g. logLevel,
httpMethod)?
Regards,
Ashish
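[Editor's note] For "all values of a field" without naming each one, the standard approach is a 'terms' aggregation on the field with the same 'date_histogram' sub-aggregation, which buckets every observed value automatically. Sketched below as a Python dict for illustration; note this still produces nested buckets, so it may hit the same interpreter limitation discussed in this thread. The aggregation names ("levels", "over_time") are made up; the field names come from the thread.

```python
import json

# A 'terms' aggregation buckets every value of logLevel automatically,
# avoiding a hand-written 'filters' clause per value.
query = {
    "aggs": {
        "levels": {
            "terms": {"field": "logLevel"},
            "aggs": {
                "over_time": {
                    "date_histogram": {
                        "field": "timestamp",
                        "interval": "day",
                        "format": "yyyy-MM-dd",
                    }
                }
            },
        }
    }
}

print(json.dumps(query, indent=2))
```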
On Tue, Apr 19, 2016 at 10:38 PM, Bruno Bonnin <bb...@gmail.com> wrote:
> Hello,
>
> You are right, there are some limitations with the Elasticsearch
> interpreter.
> I have developed it and I am going to check how I can change the component
> to take into account this kind of more complex request.
>
> Regards,
> Bruno
>
> 2016-04-19 18:53 GMT+02:00 ashish rawat <dc...@gmail.com>:
>
>> [original question trimmed; see the first message in this thread]
>
Re: Elastic Search Interpreter limitation?
Posted by Bruno Bonnin <bb...@gmail.com>.
Hello,
You are right, there are some limitations with the Elasticsearch
interpreter.
I have developed it and I am going to check how I can change the component
to take into account this kind of more complex request.
Regards,
Bruno
2016-04-19 18:53 GMT+02:00 ashish rawat <dc...@gmail.com>:

> [original question trimmed; see the first message in this thread]