Posted to dev@chukwa.apache.org by "Ari Rabkin (JIRA)" <ji...@apache.org> on 2009/04/24 00:26:30 UTC

[jira] Created: (CHUKWA-185) ability to tail a whole directory

ability to tail a whole directory
---------------------------------

                 Key: CHUKWA-185
                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
             Project: Hadoop Chukwa
          Issue Type: Bug
          Components: data collection
    Affects Versions: 0.1.2
            Reporter: Ari Rabkin
            Assignee: Ari Rabkin


Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CHUKWA-185) ability to tail a whole directory

Posted by "Jerome Boulon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719664#action_12719664 ] 

Jerome Boulon commented on CHUKWA-185:
--------------------------------------


Ari, it would be good to have better control over TerminatorThread: maybe a pool of TerminatorThreads instead of creating a new one every time. A simpler solution would be to limit the number of "running" TerminatorThread instances.
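
One way to read the pooling suggestion is a fixed-size executor that queues shutdown work instead of spawning a thread per request. This is only a sketch: the interface of the real TerminatorThread is not shown in this thread, so shutdown work is modeled as a plain Runnable, and the names TerminatorPool, submitShutdown and drain are made up for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: caps how many adaptor-shutdown tasks run at once.
// Not taken from the Chukwa code base.
class TerminatorPool {
    private final ExecutorService workers;

    TerminatorPool(int maxConcurrent) {
        // A fixed pool reuses its threads and queues extra tasks rather
        // than creating a new thread for every shutdown request.
        this.workers = Executors.newFixedThreadPool(maxConcurrent);
    }

    // Queue one shutdown task; it runs when a pool thread is free.
    void submitShutdown(Runnable shutdownTask) {
        workers.execute(shutdownTask);
    }

    // Stop accepting work and wait for queued shutdowns to finish.
    // Returns true if everything completed within the timeout.
    boolean drain(long timeoutMillis) {
        workers.shutdown();
        try {
            return workers.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With a pool size of 1 this also covers the simpler "limit the number of running instances" variant.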

Also, I'm not sure the solution can be that simple.

If the agent crashes, it shouldn't resend something that has already been sent.
Here is what I was thinking of:
- make the timeWindow mandatory, could default to XX minutes
- keep track of all files that are in the processing window (file.lastModifiedDate > now - timeWindow) in a simple text file (the tracking file)
- when a file's last modified date falls outside the timeWindow, then:
--->  do a shutdown on the adaptor for this file's entry
--->  delete the file's entry from the tracking file
- keep the tracking file in a chukwa directory and reload it at agent re-start to avoid sending the same file twice

How do you stop tailing a file? We cannot assume that we can delete a file so we need to have that built in. My proposal is to use the last modified date and the timeWindow to automatically remove adaptors.
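
The tracking-file idea above could look roughly like the following. It is a sketch under assumptions, not code from the patch: the class name TrackingWindow, the method names, and the "lastModified<TAB>path" line format are all invented here. File I/O is left out; the lines returned by toLines() would be written to the tracking file and fed back through fromLines() at agent restart.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the proposed tracking file; names are illustrative.
class TrackingWindow {
    private final long timeWindowMillis;
    // path -> last modified time (ms since epoch), in insertion order
    private final Map<String, Long> tracked = new LinkedHashMap<>();

    TrackingWindow(long timeWindowMillis) {
        this.timeWindowMillis = timeWindowMillis;
    }

    // Record (or refresh) a file only if lastModified > now - timeWindow.
    // Returns true if the file is now tracked.
    boolean observe(String path, long lastModified, long now) {
        if (lastModified > now - timeWindowMillis) {
            tracked.put(path, lastModified);
            return true;
        }
        return false;
    }

    // Drop entries that have fallen out of the window and return them, so
    // the caller can shut down the adaptor for each one.
    List<String> expire(long now) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = tracked.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            if (e.getValue() <= now - timeWindowMillis) {
                expired.add(e.getKey());
                it.remove();
            }
        }
        return expired;
    }

    // One "lastModified<TAB>path" line per tracked file.
    List<String> toLines() {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Long> e : tracked.entrySet()) {
            lines.add(e.getValue() + "\t" + e.getKey());
        }
        return lines;
    }

    // Rebuild state from previously saved lines, e.g. at agent restart.
    static TrackingWindow fromLines(long timeWindowMillis, List<String> lines) {
        TrackingWindow w = new TrackingWindow(timeWindowMillis);
        for (String line : lines) {
            int tab = line.indexOf('\t');
            w.tracked.put(line.substring(tab + 1), Long.parseLong(line.substring(0, tab)));
        }
        return w;
    }

    boolean isTracked(String path) {
        return tracked.containsKey(path);
    }
}
```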


> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: data collection
>    Affects Versions: 0.1.2, 0.2.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>             Fix For: 0.3.0
>
>         Attachments: CHUKWA-185.patch
>
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Commented: (CHUKWA-185) ability to tail a whole directory

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720407#action_12720407 ] 

Eric Yang commented on CHUKWA-185:
----------------------------------

My only concern is that the wait time between scans is 10 seconds. This is a bit short for a directory structure two levels deep. For a small directory structure, this works fine. +1

> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: data collection
>    Affects Versions: 0.1.2, 0.2.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>             Fix For: 0.3.0
>
>         Attachments: CHUKWA-185.patch, CHUKWA-185.patch
>
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Updated: (CHUKWA-185) ability to tail a whole directory

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ari Rabkin updated CHUKWA-185:
------------------------------

    Status: Open  (was: Patch Available)

Will revise as per Jerome's comments.

> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: data collection
>    Affects Versions: 0.1.2, 0.2.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>             Fix For: 0.3.0
>
>         Attachments: CHUKWA-185.patch
>
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Commented: (CHUKWA-185) ability to tail a whole directory

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719671#action_12719671 ] 

Ari Rabkin commented on CHUKWA-185:
-----------------------------------

I had figured we'd solve the expiration problem in FTA, rather than here.  We already have CHUKWA-204 open for this. However, my patch does need to do something with time windows to make sure it doesn't restart the adaptors after they stop themselves.  I'll fix that in the next version.

I'm happy to revisit the TerminatorThread mechanism, but again, that's an FTA problem, and there's no need to solve that problem and this one at the same time.

As to duplicate data: once DirTailer starts a FileTailer, that FileTailer gets checkpointed in the usual way, and exactly one tailer can get started for each file. This patch shouldn't create any duplicate-data issues we didn't already have.


> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: data collection
>    Affects Versions: 0.1.2, 0.2.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>             Fix For: 0.3.0
>
>         Attachments: CHUKWA-185.patch
>
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Commented: (CHUKWA-185) ability to tail a whole directory

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719007#action_12719007 ] 

Ari Rabkin commented on CHUKWA-185:
-----------------------------------

Proposal:  
DirTailingAdaptor should take two parameters: a directory and an optional date.
It will periodically scan the directory and all subdirs, and start a FileTailingAdaptor on any file modified since that date if none is already running. The date defaults to the epoch.

Combined with CHUKWA-204, this will prevent adaptor count from rising without bound, while still making it easy to snarf a whole directory tree.
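
The scan step of this proposal can be sketched with plain java.io.File calls. The class and method names (ModifiedSinceScan, scan) are assumptions for illustration, and the sketch only collects candidate files; starting adaptors and checking whether one is already running are left out. It also does not guard against symlink loops.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the scan step only; it does not start real adaptors.
class ModifiedSinceScan {
    // Returns every regular file under root (recursively) whose last-modified
    // time is at or after cutoffMillis. A cutoff of 0 (the epoch) matches all.
    static List<File> scan(File root, long cutoffMillis) {
        List<File> matches = new ArrayList<>();
        walk(root, cutoffMillis, matches);
        return matches;
    }

    private static void walk(File dir, long cutoffMillis, List<File> out) {
        File[] children = dir.listFiles();
        if (children == null) {
            return; // not a directory, or not readable
        }
        for (File child : children) {
            if (child.isDirectory()) {
                walk(child, cutoffMillis, out);
            } else if (child.lastModified() >= cutoffMillis) {
                out.add(child);
            }
        }
    }
}
```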

I assume this is going into trunk, not 0.2?

> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: Bug
>          Components: data collection
>    Affects Versions: 0.1.2
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Updated: (CHUKWA-185) ability to tail a whole directory

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ari Rabkin updated CHUKWA-185:
------------------------------

    Attachment: CHUKWA-185.patch

Revised patch, demonstrating correct handling of old files.

> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: data collection
>    Affects Versions: 0.1.2, 0.2.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>             Fix For: 0.3.0
>
>         Attachments: CHUKWA-185.patch, CHUKWA-185.patch
>
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Commented: (CHUKWA-185) ability to tail a whole directory

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723124#action_12723124 ] 

Hudson commented on CHUKWA-185:
-------------------------------

Integrated in Chukwa-trunk #61 (See [http://hudson.zones.apache.org/hudson/job/Chukwa-trunk/61/])
    

> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: data collection
>    Affects Versions: 0.1.2, 0.2.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>             Fix For: 0.3.0
>
>         Attachments: CHUKWA-185.patch, CHUKWA-185.patch
>
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Commented: (CHUKWA-185) ability to tail a whole directory

Posted by "Mac Yang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710745#action_12710745 ] 

Mac Yang commented on CHUKWA-185:
---------------------------------

+1
this feature will make it easier to collect job history, job conf and task syslog

> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: Bug
>          Components: data collection
>    Affects Versions: 0.1.2
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Updated: (CHUKWA-185) ability to tail a whole directory

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ari Rabkin updated CHUKWA-185:
------------------------------

    Affects Version/s: 0.2.0
               Status: Patch Available  (was: Open)

> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: Bug
>          Components: data collection
>    Affects Versions: 0.1.2, 0.2.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>         Attachments: CHUKWA-185.patch
>
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Commented: (CHUKWA-185) ability to tail a whole directory

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710653#action_12710653 ] 

Ari Rabkin commented on CHUKWA-185:
-----------------------------------

The approach I had in mind was the following -- 
Define a "DirTailingAdaptor", that takes as parameters a directory, and enough options to create a FileTailingAdaptor.  (probably a class name and a data type)

That adaptor should scan the directory; if it sees a new file, it should start a tailing adaptor on it.
Keep a list of currently running adaptors in the directory.

For now, we can punt on expiring the adaptors -- CHUKWA-204 will solve that problem. 
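
The bookkeeping in this approach might look like the sketch below. Everything here is an assumption for illustration: the class name DirTailerSketch, the onScan method, and the callback standing in for "start a FileTailingAdaptor" are not from the patch. The set of tailed paths is what guarantees at most one tailer per file.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical sketch: remembers which files already have a tailer.
class DirTailerSketch {
    private final Set<String> tailed = new HashSet<>();
    // Stand-in for whatever actually creates a tailing adaptor.
    private final Consumer<String> startTailer;

    DirTailerSketch(Consumer<String> startTailer) {
        this.startTailer = startTailer;
    }

    // Called once per scan with the paths currently seen in the directory.
    // Starts a tailer for each path not seen before and returns those paths.
    List<String> onScan(List<String> pathsSeen) {
        List<String> started = new ArrayList<>();
        for (String path : pathsSeen) {
            if (tailed.add(path)) { // add() returns false if already tailed
                startTailer.accept(path);
                started.add(path);
            }
        }
        return started;
    }
}
```

Since expiry is deferred, the set in this sketch only ever grows.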

> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: Bug
>          Components: data collection
>    Affects Versions: 0.1.2
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Commented: (CHUKWA-185) ability to tail a whole directory

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720448#action_12720448 ] 

Ari Rabkin commented on CHUKWA-185:
-----------------------------------

I'm open to suggestions on picking the interval between scans. It's configurable, of course. But perhaps the thing to do is to have it scale with the duration of a scan, so that small dirs are scanned more frequently. If that sounds good, I'll open a separate JIRA for it.
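
A scaled interval could be as simple as a clamped multiple of the last scan's duration. This is a sketch of one possible policy, not anything the patch does; the class name AdaptiveScanInterval, the multiplier, and the minimum and maximum bounds are all assumptions.

```java
// Hypothetical sketch: next delay grows with how long the last scan took.
class AdaptiveScanInterval {
    private final long minDelayMillis;
    private final long maxDelayMillis;
    private final double factor;

    AdaptiveScanInterval(long minDelayMillis, long maxDelayMillis, double factor) {
        this.minDelayMillis = minDelayMillis;
        this.maxDelayMillis = maxDelayMillis;
        this.factor = factor;
    }

    // Delay before the next scan: factor times the last scan's duration,
    // clamped to [minDelayMillis, maxDelayMillis].
    long nextDelay(long lastScanDurationMillis) {
        long scaled = (long) (lastScanDurationMillis * factor);
        return Math.max(minDelayMillis, Math.min(maxDelayMillis, scaled));
    }
}
```

With a 10-second minimum and a factor of 20, a scan that takes 50 ms is repeated after the 10-second minimum, while a scan that takes 5 seconds is repeated after 100 seconds.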

> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: data collection
>    Affects Versions: 0.1.2, 0.2.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>             Fix For: 0.3.0
>
>         Attachments: CHUKWA-185.patch, CHUKWA-185.patch
>
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Commented: (CHUKWA-185) ability to tail a whole directory

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719758#action_12719758 ] 

Ari Rabkin commented on CHUKWA-185:
-----------------------------------

What I was planning to do was this.  DTA takes a time cut-off, and will not stream files last modified before the cutoff.  So if you specify the epoch, you get everything.  Whenever DTA does a scan of the directory, it updates that cutoff to the time when the scan started.  So for a file that isn't being modified, DTA will start tailing it at most once.  
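
The moving cutoff can be sketched as follows. The class name CutoffScan and the map of path to last-modified time are assumptions made so the logic can be shown without touching the file system. Note that a file modified again after the cutoff would be returned again by this sketch alone, so it would still need the "only if none is running" check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the cutoff rule; not code from the patch.
class CutoffScan {
    private long cutoffMillis;

    // An initial cutoff of 0 (the epoch) means "stream everything".
    CutoffScan(long initialCutoffMillis) {
        this.cutoffMillis = initialCutoffMillis;
    }

    // Returns the paths last modified at or after the current cutoff, then
    // moves the cutoff up to the time this scan started.
    List<String> scan(Map<String, Long> lastModifiedByPath, long scanStartMillis) {
        List<String> toTail = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastModifiedByPath.entrySet()) {
            if (e.getValue() >= cutoffMillis) {
                toTail.add(e.getKey());
            }
        }
        cutoffMillis = scanStartMillis;
        return toTail;
    }
}
```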

Time windowing for shutdown should be addressed by CHUKWA-204.

It might be reasonable to build a command line tool or script that stops all FTAs in a given subdirectory.  There's no need for that to be coupled to this patch in any way.  But I think we should have real use cases before we hack on it.

> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: data collection
>    Affects Versions: 0.1.2, 0.2.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>             Fix For: 0.3.0
>
>         Attachments: CHUKWA-185.patch
>
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Updated: (CHUKWA-185) ability to tail a whole directory

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ari Rabkin updated CHUKWA-185:
------------------------------

    Issue Type: New Feature  (was: Bug)

> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: data collection
>    Affects Versions: 0.1.2, 0.2.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>             Fix For: 0.3.0
>
>         Attachments: CHUKWA-185.patch
>
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Commented: (CHUKWA-185) ability to tail a whole directory

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719008#action_12719008 ] 

Eric Yang commented on CHUKWA-185:
----------------------------------

Trunk is ideal for testing this feature.

> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: Bug
>          Components: data collection
>    Affects Versions: 0.1.2
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Commented: (CHUKWA-185) ability to tail a whole directory

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719735#action_12719735 ] 

Eric Yang commented on CHUKWA-185:
----------------------------------

If we go with the time window approach, DTA will only work on files that have active updates. What if the user wants to stream files that were previously archived and are no longer receiving updates? This is not among the previously identified use cases, but it may make sense to include it.

If we specify a start time (as a processed-time flag) and a time window size, the system could process data in a queue and try to close the gap between past and present. The processed-time flag could also be used as an indicator for resuming after an agent crash.
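
One possible reading of the processed-time flag is a cursor that hands out fixed-size time slices until it reaches the present. The sketch below is an interpretation, not a design from this thread: BackfillCursor, peekWindow and markProcessed are invented names. The flag only advances when a slice is marked as processed, so after a crash the same slice is handed out again.

```java
// Hypothetical sketch: walks from a start time toward "now" in fixed windows.
class BackfillCursor {
    private long processedUpTo; // the processed-time flag
    private final long windowMillis;

    BackfillCursor(long startMillis, long windowMillis) {
        this.processedUpTo = startMillis;
        this.windowMillis = windowMillis;
    }

    // Returns {from, to} for the next slice to process, or null once the
    // cursor has caught up with now. Does not advance the flag.
    long[] peekWindow(long now) {
        if (processedUpTo >= now) {
            return null;
        }
        long from = processedUpTo;
        long to = Math.min(from + windowMillis, now);
        return new long[] {from, to};
    }

    // Advance the flag after a slice has been fully handled. Persisting
    // this value is what would allow resuming after a crash.
    void markProcessed(long upTo) {
        processedUpTo = Math.max(processedUpTo, upTo);
    }

    long processedUpTo() {
        return processedUpTo;
    }
}
```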


> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: data collection
>    Affects Versions: 0.1.2, 0.2.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>             Fix For: 0.3.0
>
>         Attachments: CHUKWA-185.patch
>
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Updated: (CHUKWA-185) ability to tail a whole directory

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ari Rabkin updated CHUKWA-185:
------------------------------

    Attachment: CHUKWA-185.patch

Ideally, we'd add some more test coverage to make sure that the created adaptors have the right classes and params, and that the recursion works properly.

> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: Bug
>          Components: data collection
>    Affects Versions: 0.1.2, 0.2.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>             Fix For: 0.3.0
>
>         Attachments: CHUKWA-185.patch
>
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Updated: (CHUKWA-185) ability to tail a whole directory

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ari Rabkin updated CHUKWA-185:
------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this to trunk.

> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: data collection
>    Affects Versions: 0.1.2, 0.2.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>             Fix For: 0.3.0
>
>         Attachments: CHUKWA-185.patch, CHUKWA-185.patch
>
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Commented: (CHUKWA-185) ability to tail a whole directory

Posted by "Jerome Boulon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719744#action_12719744 ] 

Jerome Boulon commented on CHUKWA-185:
--------------------------------------

>> What if the user want to stream over files that were previously archived and no longer receiving updates
That one could be addressed by the backfilling tool or DirTailingAdaptor could be started with a "--force" flag

> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: data collection
>    Affects Versions: 0.1.2, 0.2.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>             Fix For: 0.3.0
>
>         Attachments: CHUKWA-185.patch
>
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Commented: (CHUKWA-185) ability to tail a whole directory

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719835#action_12719835 ] 

Ari Rabkin commented on CHUKWA-185:
-----------------------------------

I had misunderstood the intent of CHUKWA-204. Shutoff for file tailers is now CHUKWA-295.

> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: data collection
>    Affects Versions: 0.1.2, 0.2.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>             Fix For: 0.3.0
>
>         Attachments: CHUKWA-185.patch
>
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Updated: (CHUKWA-185) ability to tail a whole directory

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ari Rabkin updated CHUKWA-185:
------------------------------

    Status: Patch Available  (was: Open)

> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: data collection
>    Affects Versions: 0.1.2, 0.2.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>             Fix For: 0.3.0
>
>         Attachments: CHUKWA-185.patch, CHUKWA-185.patch
>
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Updated: (CHUKWA-185) ability to tail a whole directory

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ari Rabkin updated CHUKWA-185:
------------------------------

    Fix Version/s: 0.3.0

> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: Bug
>          Components: data collection
>    Affects Versions: 0.1.2, 0.2.0
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>             Fix For: 0.3.0
>
>         Attachments: CHUKWA-185.patch
>
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.



[jira] Commented: (CHUKWA-185) ability to tail a whole directory

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719399#action_12719399 ] 

Ari Rabkin commented on CHUKWA-185:
-----------------------------------

I have some code, but a couple concerns.

What should we do if a user tries to tail "/"?  Creating millions of adaptors is *probably* the Wrong Thing.  But I'm okay saying "this is the user's problem".  Another approach is to only gradually create the FileTailingAdaptors that do the real tailing, so that the user can kill it if it goes out of control.

When the DirTailer is stopped, should that stop tailing all the files in the directory, or just stop scanning for new ones?

Is DirTailer responsible for shutting off the FileTailers after a set period, or is that the responsibility of the Tailers themselves?
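
For the "gradually create the FileTailingAdaptors" option, one sketch is a backlog that releases a bounded batch per round, so a runaway scan of "/" shows up as a growing backlog rather than millions of adaptors at once. GradualStarter, its method names, and the per-round limit are assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: releases at most maxPerRound new files per round.
class GradualStarter {
    private final Deque<String> pending = new ArrayDeque<>();
    private final int maxPerRound;

    GradualStarter(int maxPerRound) {
        this.maxPerRound = maxPerRound;
    }

    // Queue newly discovered files instead of tailing them immediately.
    void enqueue(List<String> paths) {
        pending.addAll(paths);
    }

    // Hand back the next batch of files to start tailers on, oldest first.
    List<String> nextBatch() {
        List<String> batch = new ArrayList<>();
        while (batch.size() < maxPerRound && !pending.isEmpty()) {
            batch.add(pending.pollFirst());
        }
        return batch;
    }

    // Number of discovered files still waiting for a tailer.
    int backlog() {
        return pending.size();
    }
}
```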

> ability to tail a whole directory
> ---------------------------------
>
>                 Key: CHUKWA-185
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-185
>             Project: Hadoop Chukwa
>          Issue Type: Bug
>          Components: data collection
>    Affects Versions: 0.1.2
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>
> Right now, FileTailingAdaptors watch particular files.   It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.
