Posted to user@flume.apache.org by Dibyajyoti Ghosh <di...@gmail.com> on 2013/10/04 21:14:29 UTC
ElasticSearchSink - A couple of feature requests
Hi all,
This is a repost from dev@flume.apache.org. I was not sure whether the Flume
developers received the email, so please pardon the repost if it feels like I
am spamming the mailing list.
I have a couple of feature requests for ElasticSearchSink and didn't find
open JIRA tickets for these requirements.
I have already modified ElasticSearchSink locally for the smaller of the two
feature requests, and the larger one is in progress. I wanted to discuss the
features with you before creating the JIRA tickets, so here is a brief
summary of the improvements I have in mind.
DETAILS>>>
Flume version:
Flume 1.4.0-cdh4.4.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: 154d35659212f07edc896b414a43996fb8121773
Compiled by jenkins on Tue Sep 3 20:53:28 PDT 2013
From source with checksum f95b4a7f48080f876d6482bb88bcc342
And ElasticSearch v0.90.1.
*Improvement request #1 - HDFS file suffix style index suffix in
ElasticSearchSink:*
*agent.sinks.myESsink.indexName = myIndex*
ElasticSearchSink uses the provided index name as an index prefix and appends
"YYYY-MM-DD" to generate the actual index in ES. While this is convenient for
my testing purposes, it doesn't allow creating indexes monthly or yearly or,
more generally, based on some pattern provided in the Flume config, similar
to the HDFS fileSuffix, e.g.
*agent.sinks.myESsink.indexSuffix = "YYYY"* will create indexes such as
myIndex-2013 / myIndex-2014 etc. When the suffix is not provided, the sink
would either create the index with just the index name or default back to
'YYYY-MM-DD'.
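A minimal sketch of how such a suffix could be resolved, written in Python
only to illustrate the logic (the sink itself is Java). The function name
resolve_index_name and the token table are hypothetical, and the sketch
assumes the first of the two fallbacks above: with no suffix configured, the
bare index name is used.

```python
from datetime import datetime, timezone

# Hypothetical token table: maps the tokens used in this message
# (YYYY, MM, DD) to the equivalent strftime codes.
_TOKENS = (("YYYY", "%Y"), ("MM", "%m"), ("DD", "%d"))


def resolve_index_name(index_name, index_suffix=None, when=None):
    """Return index_name, followed by '-' and the formatted date if a suffix is set."""
    if not index_suffix:
        # Assumed fallback: no suffix configured means the bare index name.
        return index_name
    when = when or datetime.now(timezone.utc)
    fmt = index_suffix
    for token, code in _TOKENS:
        fmt = fmt.replace(token, code)
    return f"{index_name}-{when.strftime(fmt)}"
```

For an event dated 2013-10-04, a suffix of "YYYY" gives myIndex-2013 and a
suffix of "YYYY-MM" gives myIndex-2013-10.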
*Improvement request #2 - ElasticSearchSink ttl field modification to mimic
actual ES:*
*agent.sinks.myESsink.ttl = <some integer value> (current specification)*
The second one is comparatively trivial but good to have. The current
ElasticSearch TTL defaults to 5 days and works with integers only, which are
treated as days.
It would be good to have a qualifier like "d" / "s" / "m" / "w" / "h" to
mimic the TTL configuration in the ElasticSearch mapping.
*agent.sinks.myESsink.ttl = "3w" / 3 (requested specification)*
For the ttl, I have already made changes in my local Flume git repo and am
currently testing them. The change doesn't break the existing way of
specifying the TTL field; it only extends it to allow "1d" / "2w" style TTL
specifications.
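A rough sketch of the parsing rule described above, again in Python purely
for illustration and not the actual patch. The name parse_ttl_ms is
hypothetical, "m" is assumed to mean minutes, and a bare integer is treated
as days to keep the existing behaviour.

```python
import re

# Milliseconds per qualifier; "m" is assumed to mean minutes.
_UNIT_MS = {
    "s": 1_000,
    "m": 60_000,
    "h": 3_600_000,
    "d": 86_400_000,
    "w": 604_800_000,
}
_TTL_RE = re.compile(r"([0-9]+)([smhdw]?)")


def parse_ttl_ms(value):
    """Convert a ttl such as 3, "3", "1d" or "2w" to milliseconds."""
    text = str(value).strip().strip('"')
    match = _TTL_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid ttl: {value!r}")
    number, unit = match.groups()
    # A missing qualifier falls back to days, as in the current specification.
    return int(number) * _UNIT_MS[unit or "d"]
```

With this rule, ttl = 3 and ttl = "3d" resolve to the same value, so existing
configurations keep working.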
<<<DETAILS
Kindly suggest what I should do to get these changes incorporated into
future release(s) of Flume.
Best and thanks,
- Dib
Re: ElasticSearchSink - A couple of feature requests
Posted by Dibyajyoti Ghosh <di...@gmail.com>.
Hi all,
Can any of the Flume JIRA admins please assign the
https://issues.apache.org/jira/browse/FLUME-2206 ticket to me? I am testing
the changes locally and have a patch I would like to submit for review.
Thanks,
- Dib
On Fri, Oct 4, 2013 at 1:55 PM, Dibyajyoti Ghosh
<di...@gmail.com> wrote:
> Thanks Hari.
>
> I am creating JIRA tickets for the improvements.
>
> Best,
> - Dib
>
>
> On Fri, Oct 4, 2013 at 1:45 PM, Hari Shreedharan <
> hshreedharan@cloudera.com> wrote:
>
>> Hi,
>>
>> I am not too familiar with ElasticSearch. If you want to file a jira,
>> someone might pick it up when they have time.
>>
>>
>> Thanks,
>> Hari
>>
>> On Friday, October 4, 2013 at 12:14 PM, Dibyajyoti Ghosh wrote:
>>
>> Hi all,
>>
>> This is a repost from dev@flume.apache.org. I was not sure if flume
>> developers got the email thus pardon my repost if it feels like I am
>> spamming the mailing list.
>>
>> I have a couple of feature requests for ElasticSearchSink and didn't find
>> open JIRA tickets for these requirements.
>>
>> I have already modified ElasticSearchSink locally for the smaller of the
>> feature request and the longer one is in progress. I wanted to discuss the
>> features first with you first before creating the JIRA tickets so here is a
>> brief summary of the improvements I have in mind.
>>
>>
>> DETAILS>>>
>>
>> Flume version:
>>
>> Flume 1.4.0-cdh4.4.0
>> Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
>> Revision: 154d35659212f07edc896b414a43996fb8121773
>> Compiled by jenkins on Tue Sep 3 20:53:28 PDT 2013
>> From source with checksum f95b4a7f48080f876d6482bb88bcc342
>>
>> And ElasticSearch v0.90.1.
>> *
>> *
>> *Improvement request #1 - HDFS file suffix style index suffix in
>> ElasticSearchSink:**
>> *
>> *
>> *
>> *agent.sinks.myESsink.indexName = myIndex **
>> *
>> *
>> *
>> ElasticSearchSink uses the provided index name as index prefix and
>> appends "YYYY-MM-DD" to generate the actual index in ES which being
>> convenient for my testing purposes, doesn't allow creating index monthly /
>> yearly or more generally speaking based on some regex provided in flume
>> config similar to HDFS fileSuffix .e.g.
>> *
>> *
>> *agent.sinks.myESsink.indexSuffix = "YYYY"* will create index as
>> myIndex-2013 / myIndex-2014 etc and when not provided will create index
>> with just the index name or can default back to 'YYYY-MM-DD'.
>>
>> *Improvement request #2 - ElasticSearchSink ttl field modification to
>> mimic actual ES:*
>>
>> *agent.sinks.myESsink.ttl = <some integer value> (current specification)*
>>
>> The second one is comparatively trivial but good to have. Current ElasticSearch
>> TTL defaults to 5 days and works with integers only again which is treated
>> as days.
>>
>> It will be good to have a qualifier like "d" / "s" / "m" / "w" / "h" to
>> mimic the TTL configuration in ElasticSearch mapping.
>>
>> *agent.sinks.myESsink.ttl = "3w" / 3 (requested specification)*
>>
>> For the ttl I have already made changes in my local flume git repo and
>> currently testing it. The change doesn't break existing way of specifying
>> TTL field only extends it to allow "1d" / "2w" style TTL specification.
>>
>> <<<DETAILS
>>
>> Kindly suggest what should I do to make these changes incorporated in the
>> future release(s) of Flume.
>>
>> Best and thanks,
>> - Dib
>>
>>
>>
>
Re: ElasticSearchSink - A couple of feature requests
Posted by Dibyajyoti Ghosh <di...@gmail.com>.
Thanks Hari.
I am creating JIRA tickets for the improvements.
Best,
- Dib
On Fri, Oct 4, 2013 at 1:45 PM, Hari Shreedharan
<hs...@cloudera.com> wrote:
> Hi,
>
> I am not too familiar with ElasticSearch. If you want to file a jira,
> someone might pick it up when they have time.
>
>
> Thanks,
> Hari
>
> On Friday, October 4, 2013 at 12:14 PM, Dibyajyoti Ghosh wrote:
>
> Hi all,
>
> This is a repost from dev@flume.apache.org. I was not sure if flume
> developers got the email thus pardon my repost if it feels like I am
> spamming the mailing list.
>
> I have a couple of feature requests for ElasticSearchSink and didn't find
> open JIRA tickets for these requirements.
>
> I have already modified ElasticSearchSink locally for the smaller of the
> feature request and the longer one is in progress. I wanted to discuss the
> features first with you first before creating the JIRA tickets so here is a
> brief summary of the improvements I have in mind.
>
>
> DETAILS>>>
>
> Flume version:
>
> Flume 1.4.0-cdh4.4.0
> Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
> Revision: 154d35659212f07edc896b414a43996fb8121773
> Compiled by jenkins on Tue Sep 3 20:53:28 PDT 2013
> From source with checksum f95b4a7f48080f876d6482bb88bcc342
>
> And ElasticSearch v0.90.1.
> *
> *
> *Improvement request #1 - HDFS file suffix style index suffix in
> ElasticSearchSink:**
> *
> *
> *
> *agent.sinks.myESsink.indexName = myIndex **
> *
> *
> *
> ElasticSearchSink uses the provided index name as index prefix and appends
> "YYYY-MM-DD" to generate the actual index in ES which being convenient for
> my testing purposes, doesn't allow creating index monthly / yearly or more
> generally speaking based on some regex provided in flume config similar to
> HDFS fileSuffix .e.g.
> *
> *
> *agent.sinks.myESsink.indexSuffix = "YYYY"* will create index as
> myIndex-2013 / myIndex-2014 etc and when not provided will create index
> with just the index name or can default back to 'YYYY-MM-DD'.
>
> *Improvement request #2 - ElasticSearchSink ttl field modification to
> mimic actual ES:*
>
> *agent.sinks.myESsink.ttl = <some integer value> (current specification)*
>
> The second one is comparatively trivial but good to have. Current ElasticSearch
> TTL defaults to 5 days and works with integers only again which is treated
> as days.
>
> It will be good to have a qualifier like "d" / "s" / "m" / "w" / "h" to
> mimic the TTL configuration in ElasticSearch mapping.
>
> *agent.sinks.myESsink.ttl = "3w" / 3 (requested specification)*
>
> For the ttl I have already made changes in my local flume git repo and
> currently testing it. The change doesn't break existing way of specifying
> TTL field only extends it to allow "1d" / "2w" style TTL specification.
>
> <<<DETAILS
>
> Kindly suggest what should I do to make these changes incorporated in the
> future release(s) of Flume.
>
> Best and thanks,
> - Dib
>
>
>
Re: ElasticSearchSink - A couple of feature requests
Posted by Hari Shreedharan <hs...@cloudera.com>.
Hi,
I am not too familiar with ElasticSearch. If you want to file a jira, someone might pick it up when they have time.
Thanks,
Hari
On Friday, October 4, 2013 at 12:14 PM, Dibyajyoti Ghosh wrote:
> Hi all,
>
> This is a repost from dev@flume.apache.org (mailto:dev@flume.apache.org). I was not sure if flume developers got the email thus pardon my repost if it feels like I am spamming the mailing list.
>
> I have a couple of feature requests for ElasticSearchSink and didn't find open JIRA tickets for these requirements.
>
> I have already modified ElasticSearchSink locally for the smaller of the feature request and the longer one is in progress. I wanted to discuss the features first with you first before creating the JIRA tickets so here is a brief summary of the improvements I have in mind.
>
>
> DETAILS>>>
>
> Flume version:
>
> Flume 1.4.0-cdh4.4.0
> Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
> Revision: 154d35659212f07edc896b414a43996fb8121773
> Compiled by jenkins on Tue Sep 3 20:53:28 PDT 2013
> From source with checksum f95b4a7f48080f876d6482bb88bcc342
>
>
> And ElasticSearch v0.90.1.
>
> Improvement request #1 - HDFS file suffix style index suffix in ElasticSearchSink:
>
> agent.sinks.myESsink.indexName = myIndex
>
> ElasticSearchSink uses the provided index name as index prefix and appends "YYYY-MM-DD" to generate the actual index in ES which being convenient for my testing purposes, doesn't allow creating index monthly / yearly or more generally speaking based on some regex provided in flume config similar to HDFS fileSuffix .e.g.
>
> agent.sinks.myESsink.indexSuffix = "YYYY" will create index as myIndex-2013 / myIndex-2014 etc and when not provided will create index with just the index name or can default back to 'YYYY-MM-DD'.
>
> Improvement request #2 - ElasticSearchSink ttl field modification to mimic actual ES:
>
> agent.sinks.myESsink.ttl = <some integer value> (current specification)
>
> The second one is comparatively trivial but good to have. Current ElasticSearch TTL defaults to 5 days and works with integers only again which is treated as days.
>
> It will be good to have a qualifier like "d" / "s" / "m" / "w" / "h" to mimic the TTL configuration in ElasticSearch mapping.
>
> agent.sinks.myESsink.ttl = "3w" / 3 (requested specification)
>
> For the ttl I have already made changes in my local flume git repo and currently testing it. The change doesn't break existing way of specifying TTL field only extends it to allow "1d" / "2w" style TTL specification.
>
> <<<DETAILS
>
> Kindly suggest what should I do to make these changes incorporated in the future release(s) of Flume.
>
> Best and thanks,
> - Dib