Posted to user@metron.apache.org by Ali Nazemian <al...@gmail.com> on 2017/07/14 09:31:34 UTC

How to change Elasticsearch indexing policy

Hi,

I am investigating different tuning options, and I was wondering how I can
change the Elasticsearch indexing policy. Currently, by default, events are
stored in a separate index for each hour. How can I change this behaviour?
Is it hard-coded, or can I change it through configuration?

Cheers,
Ali

Re: How to change Elasticsearch indexing policy

Posted by Ali Nazemian <al...@gmail.com>.
Hi Simon,

That's true for very large clusters. However, always having lots of shards
and indices is not optimal for a small cluster. First, I don't think an
hourly index can be the one recipe for every cluster without considering the
ingestion rate, cluster size and the type of query. Second, a query result
is blocked by the slowest shard/index. For example, if I want a
near-real-time query result for the last 30 days, it doesn't matter how
fast Elasticsearch can respond for 719 of the 720 hourly indices; the query
result is still blocked by the result of the last index. In addition, most
of the frequent queries target the most recent index, so contention between
indexing and querying on that index cannot be avoided this way. Generally,
the best indexing strategy depends on the use cases and queries, so it would
be nice to be able to customise it.
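
As a rough illustration of the index counts involved, the minimal sketch
below enumerates the index names that a 30-day query window would touch
under hourly versus daily naming. The `<sensor>_index_<date>` naming scheme,
the "bro" sensor name and the helper `indices_in_window` are assumptions
made for this example, not something confirmed in this thread.

```python
from datetime import datetime, timedelta

def indices_in_window(end, days, strftime_pattern, prefix="bro_index_"):
    """Return the distinct index names covering the `days` days before `end`.

    The `<sensor>_index_<date>` naming and the "bro" sensor are assumptions
    used for illustration only.
    """
    start = end - timedelta(days=days)
    total_hours = days * 24
    names = (
        prefix + (start + timedelta(hours=h)).strftime(strftime_pattern)
        for h in range(total_hours)
    )
    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys(names))

end = datetime(2017, 7, 14, 0, 0)  # window aligned to midnight for simplicity
hourly = indices_in_window(end, 30, "%Y.%m.%d.%H")
daily = indices_in_window(end, 30, "%Y.%m.%d")

print(len(hourly), hourly[-1])
print(len(daily), daily[-1])
```

With a midnight-aligned window this gives 720 hourly indices against 30
daily ones; in both cases the most recent index is the one still receiving
writes.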

Could you please point me to the corresponding configuration parameter in
Metron?

Cheers,
Ali

On Fri, Jul 14, 2017 at 7:55 PM, Simon Elliston Ball <
simon@simonellistonball.com> wrote:

> You could change the index date format. One word of caution here though;
> the last time I saw this done it caused huge problems with locking on
> ingest against people running queries on the current day’s data and tended
> to knock recent relevant indexes out of disk cache at the OS level. It
> might look like it will help a bit for ingest initially, but with load on
> the end user side, it’s probably going to kill your disks at any reasonable
> scale.
>
> Simon


-- 
A.Nazemian

Re: How to change Elasticsearch indexing policy

Posted by Simon Elliston Ball <si...@simonellistonball.com>.
You could change the index date format. One word of caution here though: the last time I saw this done it caused huge problems with locking on ingest against people running queries on the current day’s data, and it tended to knock recent relevant indexes out of disk cache at the OS level. It might look like it will help a bit for ingest initially, but with load on the end user side, it’s probably going to kill your disks at any reasonable scale.
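
A minimal sketch of how such a date pattern could translate into index
names follows. It assumes the setting in question is the `es.date.format`
key in Metron's global configuration, taking a Java SimpleDateFormat-style
pattern; please verify the key name and default against your Metron
version. The helper `index_name`, the "bro" sensor and the
`<sensor>_index_<date>` naming are hypothetical and for illustration only.

```python
import json
from datetime import datetime

# Assumption: only these four Java date-pattern tokens appear in the pattern.
_JAVA_TO_STRFTIME = {"yyyy": "%Y", "MM": "%m", "dd": "%d", "HH": "%H"}

def index_name(sensor, java_pattern, when):
    """Build an index name from a (restricted) Java-style date pattern."""
    fmt = java_pattern
    for java_token, py_token in _JAVA_TO_STRFTIME.items():
        fmt = fmt.replace(java_token, py_token)
    return f"{sensor}_index_{when.strftime(fmt)}"

when = datetime(2017, 7, 14, 9, 31, 34)

# Hourly pattern: a new index every hour.
hourly_name = index_name("bro", "yyyy.MM.dd.HH", when)
# Daily pattern: one index per day.
daily_name = index_name("bro", "yyyy.MM.dd", when)

# Hypothetical global-config fragment selecting daily indices
# (key name assumed, not confirmed in this thread).
global_config_fragment = {"es.date.format": "yyyy.MM.dd"}

print(hourly_name)
print(daily_name)
print(json.dumps(global_config_fragment))
```

Dropping the trailing `.HH` from the pattern is what moves from one index
per hour to one index per day, which is also what concentrates ingest and
query load on a single, larger index.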

Simon
