Posted to issues@flume.apache.org by "taoyang (Jira)" <ji...@apache.org> on 2020/01/19 11:05:00 UTC

[jira] [Commented] (FLUME-3352) Unnecessary canary test will block on readonly spooldir while another trackerdir is set.

    [ https://issues.apache.org/jira/browse/FLUME-3352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17018875#comment-17018875 ] 

taoyang commented on FLUME-3352:
--------------------------------

A fix and two unit tests were added to verify its correctness.

Description of the two unit tests:
 *testRenameTrackingPolicyOnReadonlySpoolDirectory*
 Added to exercise the canary mechanism (RenameTrackingPolicy is enabled, so the spooldir must have write access).
 It passes both before and after the fix.

*testTrackerDirTrackingPolicyOnReadonlySpoolDirectory*
 Added to verify the current fix (TrackerDirTrackingPolicy is enabled and the trackerdir is different from the spooldir, so write permission on the spooldir is not needed and the source still works well with the trackerdir).
 It fails before the fix and passes after the fix.

> Unnecessary canary test will block on readonly spooldir while another trackerdir is set.
> ----------------------------------------------------------------------------------------
>
>                 Key: FLUME-3352
>                 URL: https://issues.apache.org/jira/browse/FLUME-3352
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: 1.9.0
>            Reporter: taoyang
>            Priority: Blocker
>              Labels: patch, pull-request-available
>             Fix For: 1.10.0
>
>         Attachments: FLUME-3352-0.patch
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> *Phenomenon*
> In many cases, we have only read permission on spoolDir and write permission on trackerDir.
> However, whenever Flume starts a spooldir source, it always tries to create a '.canary' file in the spooling directory.
> This makes processing fail unnecessarily.
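
The canary check described above can be sketched roughly as follows. This is a minimal illustration, not Flume's actual code: the class name CanaryProbeSketch, the method canReadAndModify, and the "perm-check-" prefix are hypothetical. Only the use of File.createTempFile is taken from the stack trace in this report.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.List;

// Hypothetical sketch of a canary probe; names are illustrative only.
final class CanaryProbeSketch {

    /**
     * Returns true if a temporary ".canary" file can be created, written,
     * read back and deleted inside the given directory.
     */
    static boolean canReadAndModify(File dir) {
        try {
            // Throws IOException when the directory is missing or not writable.
            File canary = File.createTempFile("perm-check-", ".canary", dir);
            Files.write(canary.toPath(), "probe\n".getBytes(StandardCharsets.UTF_8));
            List<String> lines = Files.readAllLines(canary.toPath(), StandardCharsets.UTF_8);
            boolean deleted = canary.delete();
            return !lines.isEmpty() && deleted;
        } catch (IOException e) {
            return false;
        }
    }
}
```

On a spooldir that is read-only for the Flume user, a probe of this kind fails at the createTempFile step, which matches the "Permission denied" cause in the log further down.
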
> *Reproduce*
> (In production the spooldir is usually mounted read-only from a NAS; here we create a read-only directory instead.)
> First, as the root user, create the spoolDir and a file in it; by default both are read-only for other users.
> {code:java}
> su root
> mkdir /home/hadoop/testspooldir
> echo 'foo' > /home/hadoop/testspooldir/bar{code}
> Then switch to the user that runs Flume (hadoop) and make sure it has read permission on the spooldir.
> {code:java}
> su hadoop
> mkdir /home/hadoop/testtrackerdir
> ll /home/hadoop/testspooldir
> >> total 4
> >> -rw-r--r-- 1 root root 4 Jan 16 19:15 bar{code}
> Now create example.conf:
> {code:java}
> a1.sources = r1
> a1.sinks = k1
> a1.channels = c1
> a1.sources.r1.type = spooldir
> a1.sources.r1.deletePolicy = never
> a1.sources.r1.spoolDir = /home/hadoop/testspooldir
> a1.sources.r1.trackerDir = /home/hadoop/testtrackerdir
> a1.sources.r1.trackingPolicy = tracker_dir
> a1.sinks.k1.type = logger
> a1.channels.c1.type = memory
> a1.channels.c1.capacity = 1000
> a1.channels.c1.transactionCapacity = 100
> a1.sources.r1.channels = c1
> a1.sinks.k1.channel = c1
> {code}
> and start Flume with it:
> {code:java}
> bin/flume-ng agent --conf conf --conf-file conf/example.conf --name a1 -Dflume.root.logger=INFO,console{code}
> Then a FlumeException caused by an IOException (Permission denied) is thrown:
> {code:java}
> 2020-01-16 19:16:12,777 (lifecycleSupervisor-1-0) [ERROR - org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)] Unable to start EventDrivenSourceRunner: { source:Spool Directory source r1: { spoolDir: /home/hadoop/testspooldir } } - Exception follows.
> org.apache.flume.FlumeException: Unable to read and modify files in the spooling directory: /home/hadoop/testspooldir
>     at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.<init>(ReliableSpoolingFileEventReader.java:195)
>     at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.<init>(ReliableSpoolingFileEventReader.java:89)
>     at org.apache.flume.client.avro.ReliableSpoolingFileEventReader$Builder.build(ReliableSpoolingFileEventReader.java:882)
>     at org.apache.flume.source.SpoolDirectorySource.start(SpoolDirectorySource.java:111)
>     at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
>     at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:249)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Permission denied
>     at java.io.UnixFileSystem.createFileExclusively(Native Method)
>     at java.io.File.createTempFile(File.java:2024)
>     at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.<init>(ReliableSpoolingFileEventReader.java:185)
>     ... 12 more{code}
> *Fix*
> We only run the canary test under the condition where it is actually necessary.
> A PR/patch will be submitted.
> Alternatively, would it be better to keep the canary test and log a warning instead of throwing an exception?
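
One way to express "only where it is necessary" is a small predicate that decides whether the spooldir itself must be writable. The sketch below is an assumption for illustration and is not the submitted patch: the class CanaryDecisionSketch, the method needsWritableSpoolDir, the deleteAfterRead flag, and the exact rule set are all hypothetical.

```java
import java.io.File;
import java.nio.file.Path;

// Hypothetical sketch: when does the spooling directory need write access?
final class CanaryDecisionSketch {

    // Illustrative stand-in for the two tracking policies named in this issue.
    enum TrackingPolicy { RENAME, TRACKER_DIR }

    static boolean needsWritableSpoolDir(TrackingPolicy policy,
                                         boolean deleteAfterRead,
                                         File spoolDir,
                                         File trackerDir) {
        // Assumption: renaming completed files happens inside the spooldir.
        if (policy == TrackingPolicy.RENAME) {
            return true;
        }
        // Assumption: deleting consumed files needs write access to the spooldir.
        if (deleteAfterRead) {
            return true;
        }
        // With tracker_dir, write access is only needed when the tracker
        // directory lives inside the spooling directory.
        Path spool = spoolDir.toPath().toAbsolutePath().normalize();
        Path tracker = trackerDir.toPath().toAbsolutePath().normalize();
        return tracker.startsWith(spool);
    }
}
```

With the example.conf from this report (tracker_dir policy, deletePolicy never, trackerDir outside spoolDir) such a predicate returns false, so the canary test would be skipped and the read-only spooldir would no longer block startup.
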



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@flume.apache.org
For additional commands, e-mail: issues-help@flume.apache.org