Posted to user@flume.apache.org by "Jakati, Pavan" <pa...@cgi.com> on 2015/07/17 15:40:16 UTC

How to read content from files which are in nested directory

Hi Folks,

I am using Apache Flume to read logs that are stored in the following layout, with each day's logs under its own date directory. How do I read these logs using the spooldir source?


Directory structure:


flume]# ll /ADAPTORS/MAIL_CONNECT/logs/
total 0
drwxr-xr-x 2 root root 29 Jun  3 18:04 06-03-15
drwxr-xr-x 2 root root 56 Jun  4 18:16 06-04-15
drwxr-xr-x 2 root root 29 Jun  5 14:03 06-05-15
drwxr-xr-x 2 root root 29 Jun  8 12:43 06-08-15
drwxr-xr-x 2 root root 29 Jun  9 14:47 06-09-15
drwxr-xr-x 2 root root 29 Jun 10 18:49 06-10-15
drwxr-xr-x 2 root root 29 Jun 11 17:22 06-11-15
drwxr-xr-x 2 root root 29 Jun 12 11:37 06-12-15
drwxr-xr-x 2 root root 29 Jun 15 11:39 06-15-15
drwxr-xr-x 2 root root 29 Jun 24 10:45 06-23-15
drwxr-xr-x 2 root root 29 Jun 24 10:35 06-24-15
drwxr-xr-x 2 root root 29 Jun 25 17:28 06-25-15


My current configuration:
agent1.sources.source1.type = spooldir
agent1.sources.source1.channels = channel1
agent1.sources.source1.spoolDir = /ADAPTORS/MAIL_CONNECT/logs/07-17-15
agent1.sources.source1.fileHeader = true


Please suggest whether I need to use some other source type.

Can I use a script inside the conf file to change the directory dynamically? Thanks.


Regards,
PavaN


RE: How to read content from files which are in nested directory

Posted by "Jakati, Pavan" <pa...@cgi.com>.
Hi Ashish 

Thanks for the help. Could you please help with the procedure to update or pull the patch into my system? Thanks.

Regards,
PavaN


-----Original Message-----
From: Ashish [mailto:paliwalashish@gmail.com] 
Sent: Saturday, July 18, 2015 12:44 AM
To: user@flume.apache.org
Subject: Re: How to read content from files which are in nested directory

One more way to achieve the same is using this patch
https://issues.apache.org/jira/browse/FLUME-1899

On Fri, Jul 17, 2015 at 7:14 AM, Johny Rufus <jr...@cloudera.com> wrote:
> As of now, the spooling directory source only supports reading from a
> flat directory and won't read files from subdirectories.
> You could write an external script that moves all the files from all
> the date directories into a common directory that the spooling source
> points to (if this fits your use case).
>
> Thanks,
> Rufus



--
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal

Re: How to read content from files which are in nested directory

Posted by Ashish <pa...@gmail.com>.
Did you apply the patch? If so, then you probably need to debug
what's going on. Start with some logs.

On Wed, Jul 22, 2015 at 4:06 AM, Jakati, Pavan <pa...@cgi.com> wrote:
> Hi Ashish ,
>
> I set recursiveDirectorySearch to true in my conf, yet I am still unable to read content from the subdirectories.
>
> Regards,
> PavaN
>
>
> -----Original Message-----
> From: Jakati, Pavan
> Sent: Monday, July 20, 2015 6:08 PM
> To: user@flume.apache.org
> Subject: RE: How to read content from files which are in nested directory
>
> Hi Ashish
>
> Thanks for the help. Could you please help with the procedure to update or pull the patch into my system? Thanks.
>
> Regards,
> PavaN



-- 
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal

RE: How to read content from files which are in nested directory

Posted by "Jakati, Pavan" <pa...@cgi.com>.
Hi Ashish ,

I set recursiveDirectorySearch to true in my conf, yet I am still unable to read content from the subdirectories.

Regards,
PavaN



Re: How to read content from files which are in nested directory

Posted by Ashish <pa...@gmail.com>.
One more way to achieve the same is using this patch
https://issues.apache.org/jira/browse/FLUME-1899

On Fri, Jul 17, 2015 at 7:14 AM, Johny Rufus <jr...@cloudera.com> wrote:
> As of now, the spooling directory source only supports reading from a flat
> directory and won't read files from subdirectories.
> You could write an external script that moves all the files from all the
> date directories into a common directory that the spooling source points to
> (if this fits your use case).
>
> Thanks,
> Rufus



-- 
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal

RE: How to read content from files which are in nested directory

Posted by "Jakati, Pavan" <pa...@cgi.com>.
Hi Johny


Thanks for the update. Could you please let me know whether I can add the script to my Flume conf and perform the required actions from there? I do not wish to run a script outside Flume.

Regards,
PavaN

From: Johny Rufus [mailto:jrufus@cloudera.com]
Sent: Friday, July 17, 2015 7:45 PM
To: user@flume.apache.org
Subject: Re: How to read content from files which are in nested directory

As of now, the spooling directory source only supports reading from a flat directory and won't read files from subdirectories.
You could write an external script that moves all the files from all the date directories into a common directory that the spooling source points to (if this fits your use case).

Thanks,
Rufus




Re: How to read content from files which are in nested directory

Posted by Johny Rufus <jr...@cloudera.com>.
As of now, the spooling directory source only supports reading from a flat
directory and won't read files from subdirectories.
You could write an external script that moves all the files from all the
date directories into a common directory that the spooling source points to
(if this fits your use case).

Thanks,
Rufus
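
For readers who want a starting point, the external script suggested above might look roughly like the Python sketch below. It is only an illustration, not something posted in this thread. The function name flatten_logs, the min_age_seconds parameter, and the idea of prefixing each file with its date directory name are all assumptions made for the example. The prefix is there because the spooling directory source expects file names in the spool directory to be unique, and the age check is there because files must not change once they are placed in it. The sketch also assumes that the spool directory is on the same filesystem as the logs, so that the rename is a single step, and that it is not itself one of the date directories.

```python
import time
from pathlib import Path


def flatten_logs(source_root, spool_dir, min_age_seconds=60.0, now=None):
    """Move files from each subdirectory of source_root into spool_dir.

    Hypothetical helper: returns the list of destination paths it created.
    """
    source_root = Path(source_root)
    spool_dir = Path(spool_dir)
    spool_dir.mkdir(parents=True, exist_ok=True)
    now = time.time() if now is None else now

    moved = []
    for date_dir in sorted(p for p in source_root.iterdir() if p.is_dir()):
        # Never treat the spool directory itself as a date directory.
        if date_dir.resolve() == spool_dir.resolve():
            continue
        for log_file in sorted(p for p in date_dir.iterdir() if p.is_file()):
            # Assumption: a file untouched for min_age_seconds is complete.
            if now - log_file.stat().st_mtime < min_age_seconds:
                continue
            # Prefix with the date directory name so names stay unique.
            target = spool_dir / f"{date_dir.name}_{log_file.name}"
            if target.exists():
                continue
            # rename() raises OSError across filesystems instead of copying,
            # so a half-copied file never appears in the spool directory.
            log_file.rename(target)
            moved.append(target)
    return moved
```

Such a script would run outside Flume, for example from cron, with spoolDir in the agent configuration pointing at the flat directory rather than at a single date directory.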
