
Posted to dev@apisix.apache.org by JinChao Shuai <sh...@apache.org> on 2021/11/29 07:58:57 UTC

[Proposal] support splunk hec logging plugin

Hi folks,

Splunk is a search engine for machine data. It ranks second in the latest
DB-Engines search engine ranking [1] and can be used to collect, index, and
search data from many kinds of applications. Like Elasticsearch, Splunk is
near-real-time and can return search results as a continuous stream. I
propose that Apache APISIX provide a plugin that pushes request logs to
Splunk. This would extend APISIX's observability and lower the cost for
Splunk users of adopting APISIX as their gateway.

The following are the design and technical details:

1. Name

splunk-hec-logging

2. Configuration

{
    "endpoint":{
        "uri":"https://hec-splunk.company.com/services/collector",
        "token":"BD274822-96AA-4DA6-90EC-18940FB2414C",
        "channel":"FE0ECFAD-13D5-401B-847D-77833BD77131",
        "ssl":true
    },
    "inactive_timeout":10,
    "max_retry_count":0,
    "buffer_duration":60,
    "retry_delay":1,
    "batch_max_size":1
}

- `endpoint`            Splunk HTTP Event Collector (HEC) endpoint settings
- `endpoint.uri`        request URI of the HEC endpoint
- `endpoint.token`      access token for the HEC endpoint [2]
- `endpoint.channel`    channel identifier (GUID) for the HEC endpoint [3]
- `endpoint.ssl`        whether to enable SSL verification
- `max_retry_count`     maximum number of retries before a batch is removed
from the processing pipeline
- `retry_delay`         number of seconds to wait before retrying after a
failed execution
- `buffer_duration`     maximum age, in seconds, of the oldest entry in a
batch before the batch must be processed
- `inactive_timeout`    maximum time, in seconds, the buffer may stay
inactive before it is flushed
- `batch_max_size`      maximum number of entries in each batch
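
To make the shape of the configuration concrete, here is a minimal Python
sketch that parses and sanity-checks a dict shaped like the JSON above. It
is not the plugin schema itself: the class and function names are
hypothetical, and using the example values as defaults is only an
assumption for illustration.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    uri: str
    token: str
    channel: str = ""   # assumption: optional unless indexer acknowledgment is on
    ssl: bool = True     # assumption: verification enabled by default


@dataclass
class SplunkHecLoggingConf:
    endpoint: Endpoint
    # Defaults below simply mirror the example values (an assumption).
    inactive_timeout: int = 10
    max_retry_count: int = 0
    buffer_duration: int = 60
    retry_delay: int = 1
    batch_max_size: int = 1


def parse_conf(raw: dict) -> SplunkHecLoggingConf:
    """Build a typed config from a dict shaped like the JSON example."""
    raw = dict(raw)  # shallow copy so the caller's dict is left untouched
    # Unknown keys raise TypeError here, which is acceptable for a sketch.
    endpoint = Endpoint(**raw.pop("endpoint"))
    conf = SplunkHecLoggingConf(endpoint=endpoint, **raw)
    if not conf.endpoint.uri.startswith(("http://", "https://")):
        raise ValueError("endpoint.uri must be an http(s) URL")
    if conf.batch_max_size < 1:
        raise ValueError("batch_max_size must be at least 1")
    if conf.max_retry_count < 0 or conf.retry_delay < 0:
        raise ValueError("retry settings must not be negative")
    return conf
```

Calling `parse_conf` with only `endpoint.uri` and `endpoint.token` set
falls back to the assumed defaults for every other field.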

3. Details

3.1. Configuration process

1. Add and set up an HTTP Event Collector (HEC) input through the Splunk
console and get the access token.
2. If indexer acknowledgment is enabled for HEC, you must specify a channel
and obtain the channel ID.
3. Set the HEC request URI, access token, and channel ID in the plugin
configuration.
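
As a rough illustration of how these three values end up in an HEC
request, the Python sketch below builds (but does not send) a POST request.
The `Authorization: Splunk <token>` and `X-Splunk-Request-Channel` headers
and the `time`/`event` keys follow Splunk's HEC documentation; the function
names are hypothetical, and the fields inside `log_entry` are placeholders
because the exact request data has not been settled in this thread.

```python
import json
import urllib.request


def to_hec_event(log_entry: dict, timestamp: float) -> dict:
    # "time" and "event" are HEC keys; log_entry contents are placeholders.
    return {"time": timestamp, "event": log_entry}


def build_hec_request(endpoint: dict, events: list) -> urllib.request.Request:
    """Build a POST request carrying one or more HEC events."""
    headers = {
        "Authorization": "Splunk " + endpoint["token"],
        "Content-Type": "application/json",
    }
    # The channel header is only added when a channel ID is configured.
    if endpoint.get("channel"):
        headers["X-Splunk-Request-Channel"] = endpoint["channel"]
    # HEC accepts several event objects concatenated in one request body.
    body = "".join(json.dumps(e) for e in events).encode("utf-8")
    return urllib.request.Request(
        endpoint["uri"], data=body, headers=headers, method="POST"
    )
```

Passing the result to `urllib.request.urlopen` would actually send it; that
step is left out here so the sketch has no network dependency.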

3.2. HTTP Request process

1. In the APISIX log phase, collect the request information and assemble it
into the HEC event format [4].
2. Add the assembled request data to the batch queue.
3. When a batch queue threshold is reached, submit the queued request data
to Splunk's HEC in a batch.
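
The batching step above can be sketched in Python as follows. This is an
illustration of how `batch_max_size`, `buffer_duration`, and
`inactive_timeout` could interact, not the APISIX batch processor itself;
the `BatchQueue` class, its `tick` method, and the injectable `clock` are
assumptions made for the example.

```python
import time


class BatchQueue:
    """Buffers entries and hands them to `flush` when a threshold is hit."""

    def __init__(self, flush, batch_max_size=1, buffer_duration=60,
                 inactive_timeout=10, clock=time.monotonic):
        self.flush = flush
        self.batch_max_size = batch_max_size
        self.buffer_duration = buffer_duration
        self.inactive_timeout = inactive_timeout
        self.clock = clock
        self.entries = []
        self.first_at = None  # when the oldest buffered entry arrived
        self.last_at = None   # when the newest buffered entry arrived

    def push(self, entry):
        now = self.clock()
        if not self.entries:
            self.first_at = now
        self.entries.append(entry)
        self.last_at = now
        # Size threshold: flush as soon as the batch is full.
        if len(self.entries) >= self.batch_max_size:
            self._flush()

    def tick(self):
        """Call periodically (for example from a timer) to apply time limits."""
        if not self.entries:
            return
        now = self.clock()
        too_old = now - self.first_at >= self.buffer_duration
        idle = now - self.last_at >= self.inactive_timeout
        if too_old or idle:
            self._flush()

    def _flush(self):
        batch, self.entries = self.entries, []
        self.flush(batch)
```

With `batch_max_size` set to 1, as in the example configuration, every
entry is flushed immediately and the two time limits never come into play.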

[1] https://db-engines.com/en/ranking/search+engine
[2] https://docs.splunk.com/Documentation/Splunk/latest/Data/UsetheHTTPEventCollector#Create_an_Event_Collector_token_on_Splunk_Enterprise
[3] https://docs.splunk.com/Documentation/Splunk/latest/Data/AboutHECIDXAck#About_channels_and_sending_data
[4] https://docs.splunk.com/Documentation/Splunk/latest/Data/FormateventsforHTTPEventCollector#Event_metadata

-- 
Thanks,
Janko

Re: [Proposal] support splunk hec logging plugin

Posted by JinChao Shuai <sh...@apache.org>.

Okay, I will add the relevant information later.

Zexuan Luo <sp...@apache.org> wrote on Mon, Nov 29, 2021 at 5:50 PM:

> Could you list the request data which needs to be submitted in detail?
> I followed the link you gave but it only lists the metadata.
>
> We can use the more common `ssl_verify` field to configure TLS
> verification instead of the `ssl` field. Also, a timeout field is
> required in the endpoint.


-- 
Thanks,
Janko

Re: [Proposal] support splunk hec logging plugin

Posted by Zexuan Luo <sp...@apache.org>.

Could you list the request data which needs to be submitted in detail?
I followed the link you gave but it only lists the metadata.

We can use the more common `ssl_verify` field to configure TLS
verification instead of the `ssl` field. Also, a timeout field is
required in the endpoint.


Re: [Proposal] support splunk hec logging plugin

Posted by JinChao Shuai <sh...@apache.org>.

APISIX has a retry mechanism in the batch queue. The number of retries can
be set with `max_retry_count`. If Splunk is completely down and the retries
are exhausted, the messages will be discarded.
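
As a sketch of that behavior (an illustration only, not the batch
processor's actual code), the Python function below tries a batch once,
retries up to `max_retry_count` times with `retry_delay` seconds between
attempts, and then gives up. The name `send_with_retry` and the injectable
`sleep` are assumptions for the example.

```python
import time


def send_with_retry(send, batch, max_retry_count=0, retry_delay=1,
                    sleep=time.sleep):
    """Call send(batch) once, then retry up to max_retry_count times.

    Returns True on success and False if the batch was dropped.
    """
    for attempt in range(max_retry_count + 1):
        try:
            send(batch)
            return True
        except Exception:
            # Wait before the next attempt, but not after the last one.
            if attempt < max_retry_count:
                sleep(retry_delay)
    # Retries exhausted: the batch is discarded rather than kept in memory.
    return False
```

Because a failed batch is dropped after a bounded number of attempts, the
backlog held in memory stays bounded as well, at the cost of losing those
log entries.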

Chao Zhang <zc...@gmail.com> wrote on Tue, Nov 30, 2021 at 8:26 PM:

> How does this plugin handle the situation where Splunk has crashed and a
> large backlog of log messages builds up in memory?
> --
> Best regards
> Chao Zhang
>
> https://github.com/tokers
>


-- 
Thanks,
Janko

Re: [Proposal] support splunk hec logging plugin

Posted by Chao Zhang <zc...@gmail.com>.

How does this plugin handle the situation where Splunk has crashed and a
large backlog of log messages builds up in memory?

-- 
Best regards
Chao Zhang

https://github.com/tokers