Posted to user@flume.apache.org by "Jakati, Pavan" <pa...@cgi.com> on 2015/07/17 15:40:16 UTC
How to read content from files which are in nested directory
Hi Folks,
I am using Apache Flume to read logs that are stored in per-date directories, as shown below. How do I read these logs using the spooldir source?
Directory structure:
flume]# ll /ADAPTORS/MAIL_CONNECT/logs/
total 0
drwxr-xr-x 2 root root 29 Jun 3 18:04 06-03-15
drwxr-xr-x 2 root root 56 Jun 4 18:16 06-04-15
drwxr-xr-x 2 root root 29 Jun 5 14:03 06-05-15
drwxr-xr-x 2 root root 29 Jun 8 12:43 06-08-15
drwxr-xr-x 2 root root 29 Jun 9 14:47 06-09-15
drwxr-xr-x 2 root root 29 Jun 10 18:49 06-10-15
drwxr-xr-x 2 root root 29 Jun 11 17:22 06-11-15
drwxr-xr-x 2 root root 29 Jun 12 11:37 06-12-15
drwxr-xr-x 2 root root 29 Jun 15 11:39 06-15-15
drwxr-xr-x 2 root root 29 Jun 24 10:45 06-23-15
drwxr-xr-x 2 root root 29 Jun 24 10:35 06-24-15
drwxr-xr-x 2 root root 29 Jun 25 17:28 06-25-15
My current configuration:
agent1.sources.source1.type = spooldir
agent1.sources.source1.channels = channel1
agent1.sources.source1.spoolDir = /ADAPTORS/MAIL_CONNECT/logs/07-17-15
agent1.sources.source1.fileHeader = true
Please suggest whether I need to use some other source type.
Can I use a script inside the conf file to change the directory dynamically? Thanks.
Regards,
PavaN
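The configuration above hard-codes one date directory. One possible workaround, sketched here in Python under stated assumptions, is to regenerate those four lines from outside Flume whenever the date changes. The function names `spool_dir_for` and `render_source_config` are hypothetical; the directory format MM-DD-YY is inferred from the listing above. Whether a running agent picks up the rewritten file without a restart should be verified for the Flume version in use.

```python
from datetime import date

# Root of the per-date directories, taken from the listing in the post.
LOGS_ROOT = "/ADAPTORS/MAIL_CONNECT/logs"


def spool_dir_for(day, logs_root=LOGS_ROOT):
    """Return the per-date directory path, named MM-DD-YY as in the listing."""
    return f"{logs_root}/{day.strftime('%m-%d-%y')}"


def render_source_config(day):
    """Render the source1 properties with spoolDir pointing at `day`."""
    lines = [
        "agent1.sources.source1.type = spooldir",
        "agent1.sources.source1.channels = channel1",
        f"agent1.sources.source1.spoolDir = {spool_dir_for(day)}",
        "agent1.sources.source1.fileHeader = true",
    ]
    return "\n".join(lines) + "\n"
```

Calling `render_source_config(date(2015, 7, 17))` yields the same `spoolDir` line as in the post; writing the result into the agent's conf file is left to the caller.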
RE: How to read content from files which are in nested directory
Posted by "Jakati, Pavan" <pa...@cgi.com>.
Hi Ashish,
Thanks for the help. Could you please describe the procedure for applying the patch to my system? Thanks.
Regards,
PavaN
-----Original Message-----
From: Ashish [mailto:paliwalashish@gmail.com]
Sent: Saturday, July 18, 2015 12:44 AM
To: user@flume.apache.org
Subject: Re: How to read content from files which are in nested directory
Another way to achieve the same result is to use this patch:
https://issues.apache.org/jira/browse/FLUME-1899
Re: How to read content from files which are in nested directory
Posted by Ashish <pa...@gmail.com>.
Did you apply the patch? If so, you probably need to debug
what's going on. Start by checking the logs.
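As a debugging aid alongside the agent logs, it can help to list which files a recursive search would be expected to find. The following is a minimal Python sketch, not part of Flume or of the patch; the function name `pending_files` is hypothetical, and it assumes the source marks finished files with the default `.COMPLETED` suffix.

```python
import os


def pending_files(spool_root, completed_suffix=".COMPLETED"):
    """List files under spool_root, at any depth, not yet marked completed."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(spool_root):
        for name in filenames:
            # Files already renamed with the completed suffix were consumed.
            if not name.endswith(completed_suffix):
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

If `pending_files("/ADAPTORS/MAIL_CONNECT/logs")` keeps returning files in the date directories while the agent is running, the source is probably not descending into them.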
On Wed, Jul 22, 2015 at 4:06 AM, Jakati, Pavan <pa...@cgi.com> wrote:
> Hi Ashish ,
>
> I set recursiveDirectorySearch to true in my conf, yet I am still unable to read content under the subdirectories.
>
> Regards,
> PavaN
--
thanks
ashish
Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal
RE: How to read content from files which are in nested directory
Posted by "Jakati, Pavan" <pa...@cgi.com>.
Hi Ashish,
I set recursiveDirectorySearch to true in my conf, yet I am still unable to read content under the subdirectories.
Regards,
PavaN
-----Original Message-----
From: Jakati, Pavan
Sent: Monday, July 20, 2015 6:08 PM
To: user@flume.apache.org
Subject: RE: How to read content from files which are in nested directory
Hi Ashish,
Thanks for the help. Could you please describe the procedure for applying the patch to my system? Thanks.
Regards,
PavaN
Re: How to read content from files which are in nested directory
Posted by Ashish <pa...@gmail.com>.
Another way to achieve the same result is to use this patch:
https://issues.apache.org/jira/browse/FLUME-1899
On Fri, Jul 17, 2015 at 7:14 AM, Johny Rufus <jr...@cloudera.com> wrote:
> The spooling directory source currently supports reading only from a flat
> directory and won't read files from subdirectories.
> You could write an external script that moves all the files from the
> date directories into a common directory that the spooling source points
> to (if this fits your use case).
>
> Thanks,
> Rufus
--
thanks
ashish
Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal
RE: How to read content from files which are in nested directory
Posted by "Jakati, Pavan" <pa...@cgi.com>.
Hi Johny,
Thanks for the update. Could you please let me know whether I can add the script to my Flume conf and have it perform the required actions? I do not wish to run a script outside Flume.
Regards,
PavaN
From: Johny Rufus [mailto:jrufus@cloudera.com]
Sent: Friday, July 17, 2015 7:45 PM
To: user@flume.apache.org
Subject: Re: How to read content from files which are in nested directory
The spooling directory source currently supports reading only from a flat directory and won't read files from subdirectories.
You could write an external script that moves all the files from the date directories into a common directory that the spooling source points to (if this fits your use case).
Thanks,
Rufus
Re: How to read content from files which are in nested directory
Posted by Johny Rufus <jr...@cloudera.com>.
The spooling directory source currently supports reading only from a flat
directory and won't read files from subdirectories.
You could write an external script that moves all the files from the
date directories into a common directory that the spooling source points
to (if this fits your use case).
Thanks,
Rufus
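The external script suggested above could look roughly like the following Python sketch. It is an illustration under assumptions, not a tested production tool: the function name `flatten_into_spool`, the `min_age_seconds` threshold, and the `<date>_<name>` naming scheme are all hypothetical. It relies on the spooling directory source's documented expectations that files are no longer modified once they are in the spool directory and that file names are unique, so it skips recently modified files and prefixes each name with its date directory.

```python
import shutil
import time
from pathlib import Path


def flatten_into_spool(logs_root, spool_dir, min_age_seconds=60.0, now=None):
    """Move settled files from each sub-directory of logs_root into spool_dir.

    Returns the list of destination paths that were created.
    """
    logs_root = Path(logs_root)
    spool_dir = Path(spool_dir)
    spool_dir.mkdir(parents=True, exist_ok=True)
    now = time.time() if now is None else now
    moved = []
    for date_dir in sorted(p for p in logs_root.iterdir() if p.is_dir()):
        # Skip the spool directory itself if it lives under logs_root.
        if date_dir.resolve() == spool_dir.resolve():
            continue
        for src in sorted(p for p in date_dir.iterdir() if p.is_file()):
            # Recently modified files may still be open for writing.
            if now - src.stat().st_mtime < min_age_seconds:
                continue
            # Prefix with the date directory name so names stay unique.
            dest = spool_dir / f"{date_dir.name}_{src.name}"
            if dest.exists():
                continue
            # Assumes both directories share a filesystem, so that the
            # move is a rename and the file appears in one step.
            shutil.move(str(src), str(dest))
            moved.append(dest)
    return moved
```

Run periodically (for example from cron) as `flatten_into_spool("/ADAPTORS/MAIL_CONNECT/logs", "/ADAPTORS/MAIL_CONNECT/spool")`, where the spool path is a made-up example, with `spoolDir` in the Flume configuration pointing at that same directory.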